diff --git a/README.md b/README.md
index b4fb445..c8ca1d5 100644
--- a/README.md
+++ b/README.md
@@ -108,42 +108,56 @@ Before testing the backups, you may want to shut `yggdrasil` down for extra conf
 not being accessed/modified during this process. It is easy to access `yggdrasil` by accident if
 `/etc/hosts` is not modified in the test VM, something that is easy to forget.
 
+### Baldur on Scaleway
+
 1. Create `baldur` by running:
    ```sh
    python scripts/scaleway/baldur.py create --volume-size
    ```
    Pick a volume size that's larger than what `yggdrasil` estimates for
    `rpool/var/lib/yggdrasil/data`.
-2. Provision `baldur` by running
+2. When done destroy `baldur` by running:
+   ```sh
+   python scripts/scaleway/baldur.py delete
+   ```
+
+### Baldur on Yggdrasil
+
+1. Create a VM on `yggdrasil`.
+   - Install the OS on a zvol on `rpool`.
+   - Prepare a zvol on `hpool` of size that's larger than what `yggdrasil` estimates for
+     `rpool/var/lib/yggdrasil/data` and mount at `/var/lib/baldur/data`.
+   - Create non-root user `wojtek` with `sudo` privileges.
+2. Configure SSH to use `yggdrasil` as a jump server.
+
+### Test
+
+1. Provision `baldur` by running
    ```sh
    ansible-playbook --vault-id @vault-keyring-client.py -i inventory/baldur_production playbooks/baldur.yml
    ```
-3. Restore all the backups by ssh'ing into `baldur` and running (as root):
+2. Restore all the backups by ssh'ing into `baldur` and running (as root):
    ```sh
    /usr/local/sbin/restic-batch --config-dir /etc/restic-batch.d restore
    ```
-4. Start all the pod services with:
+3. Start all the pod services with:
    ```sh
    ansible-playbook --vault-id @vault-keyring-client.py -i inventory/baldur_production playbooks/services_start.yml
    ```
    Give them some time to download all the images and start.
-5. Once the CPU returns to idling check the state of all the pod services and their `veth`
+4. Once the CPU returns to idling check the state of all the pod services and their `veth`
    interfaces. If necessary restart the affected pod.
    Sometimes they fail to start (presumably due to issues related to limited CPU and RAM).
-6. Boot into a test VM. Ideally, one installed onto a virtual disk since the live system might not
+5. Boot into a test VM. Ideally, one installed onto a virtual disk since the live system might not
    have enough space. A VM is used to make sure that none of the services on the host workstation
    connect to `baldur` by accident.
-7. Modify `/etc/hosts` in the VM to point at `baldur` for all relevant domains.
-8. Test each service manually one by one. Use the Flagfox add-on to verify that you are indeed
+6. Modify `/etc/hosts` in the VM to point at `baldur` for all relevant domains.
+7. Test each service manually one by one. Use the Flagfox add-on to verify that you are indeed
    connecting to `baldur`.
-9. Stop all the pod services with:
+8. Stop all the pod services with:
    ```sh
    ansible-playbook --vault-id @vault-keyring-client.py -i inventory/baldur_production playbooks/services_stop.yml
    ```
-10. Destroy `baldur` by running:
-    ```sh
-    python scripts/scaleway/baldur.py delete
-    ```
 
 ## Music organisation