Decided to learn some automation with Ansible for my Arch machine-learning setup, expecting to break things rather often. Crude, ongoing WIP :) Aiming for it to be useful for others too, so I'm adding some docs in case anyone needs a place to start for their own project.
Current state 14-06-2024:
- Roles for setting up a vanilla arch system with adding various tools and custom configurations
- currently mamba is the environment manager of choice, simply because ML dependencies are cumbersome and I want my friends to be able to install things without sudo
- might add containers later, but right now I'm in no mood to struggle with the docker vs wireguard vs kvm circus
- Adding new users to the system with everything configured, to allow them to play around with ML
This ansible playbook automates my personal machine learning setup on Arch Linux.
I have a workstation with an RTX A4000 Nvidia GPU and want to use it as a major playground, especially for local AI models. Furthermore, sharing is caring, so I want to be able to give access to some friends. The workstation is only accessible through Wireguard; if you plan to adapt any of this, keep that in mind, especially when it comes to the further-down details on configuring exposure of open-webUI etc.
This playbook sets up and configures a workstation with a vanilla arch install. The setup is designed with testing in mind, around my approach of deploying step by step in a VM first and then transferring it for further testing on the workstation.
Currently, there are two groups: *[machines]* and *[vms]*.
Things are identical for both groups, except for sections that require a GPU or a lot of disk space; those get skipped when [vms] is specified. I tend to work with dummy roles instead in that case.
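For reference, the group-based skip looks roughly like this inside a role's tasks (a sketch; the task and package name are illustrative, not the actual role contents):

```yaml
# Sketch: a GPU/disk-heavy task that only runs for the [machines] group.
# Hosts in [vms] simply skip it.
- name: Install CUDA toolkit (GPU machines only)
  community.general.pacman:
    name: cuda
    state: present
  when: inventory_hostname in groups['machines']
```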
Remote graphical access is currently realized via waypipe. See the Examples section at the bottom.
The reasons I decided to go for mamba are:
- obviously keeping things separated
- enabling my friends to install what they need without needing sudo or spamming the system
- current ML projects are often peculiar in terms of needs and dependencies; the approach with the least headache seems to be to create an environment via mamba, install pip + your needed Python version from _within_, and THEN install all further project-specific requirements via pip, again from within the env. Sucks, but of all the approaches I've worked with, this is the least painful and most reproducible so far
- currently in no mood to struggle with the docker vs wireguard vs kvm circus I ran into when initially planning to do this via containers
- also, afaik accessing the GPU from containers is non-trivial, so maybe something for the future; too much overkill for this setup
- might still do so in future tho
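For the record, the mamba workflow described above boils down to something like this (env name, Python version and requirements file are just examples):

```shell
# create the env and pull pip + the Python version the project needs into it
mamba create -n my-ml-env python=3.11 pip -y
mamba activate my-ml-env
# only now, from within the env, install the project-specific requirements
pip install -r requirements.txt
```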
So far the playbook has the following structure:

```yaml
roles:
  - role: install
  - role: setup
  - role: updates
  - role: dummy-env
  - role: open-webui
    when: inventory_hostname in groups['machines']
  - role: setup-new-user
```
- Sets up and configures yay
- Installs all packages listed in `config.yml` according to their sections
- Configures some services, tools and working environment
Updates can be automatically launched as post-tasks too (see site.yml) but I currently prefer it this way whilst actively developing
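The post-task variant mentioned above would look roughly like this in site.yml (a sketch, assuming the community.general collection is available):

```yaml
# site.yml (sketch): running updates as a post-task instead of a separate role
post_tasks:
  - name: Upgrade all packages
    community.general.pacman:
      update_cache: true
      upgrade: true
```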
Excluded by default. I used this for figuring out how to make mamba run properly and kept it as a convenient role for further experiments
Installs open-webUI in a dedicated mamba env. Given the space requirements, it is only executed when "host=machines" is set. #TODO: as it's technically groups, the naming and way of calling is more than confusing, I hope to be able to find a better way #TODO: add further roles for various ML applications
Sets up a new user without sudo access and baseline configuration of some tools.
An initial password is generated via `mkpasswd --method=sha-512` and the hashed value is added to the according section in vars/main.yml.
Ask your friend for their pub key and add it to files. Refer to it from within main.yml. #TODO: Find a way to put this into vars/main.yml too
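Concretely, the flow is something like this (the key names below are illustrative; check vars/main.yml for the actual ones):

```yaml
# vars/main.yml (sketch; key names are illustrative)
# generate the hash with: mkpasswd --method=sha-512
new_user_name: alice
new_user_password_hash: "$6$<paste the mkpasswd output here>"
new_user_pubkey: files/alice.pub   # the pub key your friend sent you
```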
- ansible installed on your host system (at the time of writing I am running ansible 2.17.0)
- vanilla arch installation
- a user with sudo
- working ssh between your host and the machine-to-be-deployed
- UI - I chose KDE via the installer for convenience, so it's not in the roles
```shell
git clone https://github.com/nsultova/rechenknecht-dev.git
cd rechenknecht-configure
```
- ensure you have your vanilla arch up and running and a working ssh connection
- add your credentials as needed in `inventory`
- execute via `ansible-playbook -K site.yml -i inventory -e "host=machines"`
- after initial setup, reboot
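For completeness, an inventory matching the two groups might look like this (hostnames, IPs and user are placeholders, not the actual values):

```ini
[machines]
workstation ansible_host=10.0.0.2 ansible_user=youruser

[vms]
testvm ansible_host=192.168.122.10 ansible_user=youruser
```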
- Most things have been tested on a VM; you can set up a vanilla arch VM and run everything via `ansible-playbook -K site.yml -i inventory -e "host=vms"`
- sections that specifically depend on a GPU or big disk requirements are skipped
- don't forget to configure your ssh access and add your VM's credentials in `inventory`
Access open-WebUI:
- log in via ssh on the workstation (again, there's a wireguard in between, so you need to do your own configuration here)
- launch a tmux session (bc you don't want your session to die together with your ssh connection)
- activate the custom mamba environment (from within the tmux session): `mamba activate open-webui-env`
- cd into the open-webUI folder and start the server using: `open-webui serve`
- the server should now be listening on localhost:8080
- from your host, connect via: `waypipe -c lz4=9 ssh [email protected] firefox`
- now the workstation's firefox should pop up on your machine
- access open-webUI via http://127.0.0.1:8080
- ..and you're good to go :)