# The Ansible Edda

Ansible playbooks for provisioning The Nine Worlds.

## Secrets vault

- Encrypt with: ```ansible-vault encrypt vault.yml```
- Decrypt with: ```ansible-vault decrypt vault.yml```
- Encrypt all `vault.yml` in a directory with: ```ansible-vault encrypt directory/**/vault.yml```
- Decrypt all `vault.yml` in a directory with: ```ansible-vault decrypt directory/**/vault.yml```
- Run a playbook with: ```ansible-playbook --vault-id @prompt playbook.yml```

## The Nine Worlds

The main entrypoint for The Nine Worlds is [`main.yml`](main.yml).

### Keyring integration

Keyring integration requires `python3-keyring` to be installed.

To set the keyring password, run:

``` sh
./vault-keyring-client.py --set [--vault-id ]
```

If `--vault-id` is not specified, the password will be stored under `ansible`.

To use the password from the keyring, invoke playbooks with:

``` sh
ansible-playbook --vault-id @vault-keyring-client.py ...
```

### Production and testing

The inventory files are split into [`production`](production) and [`testing`](testing).

To run the `main.yml` playbook on production hosts:

``` sh
ansible-playbook main.yml -i production
```

To run the `main.yml` playbook on testing hosts:

``` sh
ansible-playbook main.yml -i testing
```

### Testing virtual machines

The script for starting, stopping, and reverting the testing virtual machines is located in `scripts/testing/vmgr.py`.

### Playbooks

The Ansible Edda playbook is composed of smaller [`playbooks`](playbooks).

To run a single playbook, invoke the relevant playbook directly from the playbook directory. For example, to run the [`system`](system) playbook, run:

``` sh
ansible-playbook playbooks/system.yml
```

Alternatively, select it by its tag from the main playbook:

``` sh
ansible-playbook main.yml --tags "system"
```

### Roles

Playbooks are composed of roles defined in the `roles` directory, [`playbooks/roles`](playbooks/roles).

To play only a specific role, e.g.
`system/base` in the playbook `system`, run:

``` sh
ansible-playbook playbooks/system.yml --tags "system:base"
```

Or from the main playbook:

``` sh
ansible-playbook main.yml --tags "system:base"
```

### Role sub-tasks

Some roles are split into smaller groups of tasks. This can be checked by looking at the `tasks/main.yml` file of a role, e.g. [`playbooks/roles/system/base/tasks/main.yml`](playbooks/roles/system/base/tasks/main.yml).

To play only a particular group within a role, e.g. `sshd` in `base` of `system`, run:

``` sh
ansible-playbook playbooks/system.yml --tags "system:base:sshd"
```

Or from the main playbook:

``` sh
ansible-playbook main.yml --tags "system:base:sshd"
```

## Testing backups

Before testing the backups, you may want to shut `yggdrasil` down for extra confidence that it is not being accessed or modified during this process. It is easy to access `yggdrasil` by accident if `/etc/hosts` is not modified in the test VM, and that step is easy to forget.

1. Create `baldur` by running:

   ```sh
   python scripts/scaleway/baldur.py create --volume-size
   ```

   Pick a volume size that's larger than what `yggdrasil` estimates for `rpool/var/lib/yggdrasil/data`.

2. Provision `baldur` by running:

   ```sh
   ansible-playbook --vault-id @vault-keyring-client.py -i inventory/baldur_production playbooks/baldur.yml
   ```

3. Restore all the backups by connecting to `baldur` over SSH and running (as root):

   ```sh
   /usr/local/sbin/restic-batch --config-dir /etc/restic-batch.d restore
   ```

4. Start all the pod services with:

   ```sh
   ansible-playbook --vault-id @vault-keyring-client.py -i inventory/baldur_production playbooks/services_start.yml
   ```

   Give them some time to download all the images and start.

5. Once the CPU returns to idle, check the state of all the pod services and their `veth` interfaces. Pods sometimes fail to start (presumably due to limited CPU and RAM); restart any affected pod if necessary.

6. Boot into a test VM.
   Ideally, use one installed onto a virtual disk, since the live system might not have enough space. A VM is used to make sure that none of the services on the host workstation connect to `baldur` by accident.

7. Modify `/etc/hosts` in the VM to point at `baldur` for all relevant domains.

8. Test each service manually, one by one. Use the Flagfox add-on to verify that you are indeed connecting to `baldur`.

9. Stop all the pod services with:

   ```sh
   ansible-playbook --vault-id @vault-keyring-client.py -i inventory/baldur_production playbooks/services_stop.yml
   ```

10. Destroy `baldur` by running:

    ```sh
    python scripts/scaleway/baldur.py delete
    ```
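
Steps 7 and 8 of the procedure above depend on the test VM resolving every relevant domain to `baldur`. The following is a minimal sketch of how the `/etc/hosts` entries could be generated. The address `203.0.113.10` (a documentation-only IP) and the `example.org` domain names are placeholders, and `hosts_entries` is a hypothetical helper that is not part of this repository; substitute the real address of your `baldur` instance and the domains you actually serve.

```sh
#!/bin/sh
# Placeholder values (assumptions): replace with baldur's real address
# and the domains served by your deployment.
BALDUR_IP="203.0.113.10"
DOMAINS="example.org cloud.example.org git.example.org"

# Print one /etc/hosts line per domain, each pointing at baldur.
hosts_entries() {
    for domain in $DOMAINS; do
        printf '%s\t%s\n' "$BALDUR_IP" "$domain"
    done
}

# Show the entries so they can be reviewed before touching /etc/hosts.
hosts_entries
```

After reviewing the output, the entries could be appended inside the test VM with `hosts_entries | sudo tee -a /etc/hosts`. On a glibc-based system, `getent hosts example.org` should then print the `baldur` address, which is a quick check to run before testing the services in a browser.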