Tutorial: Backing up Kubernetes Clusters with K8up

23. Jun 2020

One of the most common questions we get from companies moving to Kubernetes has always had to do with backups: how can we ensure that the information in our pods and services can be quickly and safely restored in case of problems?
This situation is so common that we at VSHN decided to tackle it with our own Kubernetes operator for backups, which we called K8up.
Note: This tutorial is available in three versions, each in its own branch of the GitHub repository bundled with this text.

1. What is K8up?

K8up (pronounced „/keɪtæpp/“ or simply „ketchup“) is a Kubernetes operator distributed via a Helm chart, compatible with OpenShift and plain Kubernetes. It allows cluster operators to:

  • Back up all PVCs marked as ReadWriteMany or with a specific annotation.
  • Perform individual, on-demand backups.
  • Schedule backups to be executed on a regular basis.
  • Schedule archivals (for example to AWS Glacier), usually executed in longer intervals.
  • Perform „Application Aware“ backups, containing the output of any tool capable of writing to stdout.
  • Check the backup repository for its integrity.
  • Prune old backups from a repository.
  • Store backups in Amazon S3 buckets and in Minio (which we’ll use in this tutorial), since K8up is built on top of Restic.

K8up is written in Go and is an open source project hosted on GitHub.

2. Introduction

This tutorial will show you how to back up a small Minikube cluster running on your laptop. We are going to deploy Minio, MariaDB and WordPress on this cluster, and create a blog post in our new website. Later we’re going to „deface“ it, so that we can safely restore it afterwards. Through this process, you are going to learn more about K8up and its capabilities.
Note: All the scripts and YAML files are available on GitHub: github.com/vshn/k8up-tutorial.

2.1 Requirements

This tutorial has been tested on both Linux (Ubuntu 18.04) and macOS (10.15 Catalina.) Please install the following software packages before starting:

  • Make sure PyYAML 5.1 or later is installed: pip install PyYAML==5.1
  • The kubectl command.
  • The Restic backup application.
  • The latest version of Minikube (1.9 at the time of this writing.)
  • Helm, required to install K8up in your cluster.
  • k9s to display the contents of our clusters on the terminal.
  • jq, a lightweight and flexible command-line JSON processor.
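
Before starting, it can help to confirm that the command-line tools listed above are available on your PATH. The following is a minimal sketch and not part of the tutorial repository; the function name missing_tools is hypothetical, and the list of tools simply mirrors the requirements above.

```shell
# Print the name of every command given as an argument that cannot be
# found on the PATH. Prints nothing when all of them are available.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Check the command-line tools listed in the requirements above.
missing=$(missing_tools kubectl restic minikube helm k9s jq)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All required tools are installed."
fi
```

If any tool is reported as missing, install it before moving on to the next section.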

3. Tutorial

This tutorial consists of six steps to be executed in sequence:

  1. Setting up the cluster.
  2. Creating a blog.
  3. Backing up the blog.
  4. Restoring the contents of the backup.
  5. Scheduling regular backups.
  6. Cleaning up.

Let’s get started!

3.1 Setting up the cluster

Note: The operations of this step can be executed at once using the scripts/1_setup.sh script.

  1. Start your minikube instance with a configuration slightly more powerful than the default one:
    • minikube start --memory 4096 --disk-size 60g --cpus 4
      Note: On some laptops, running Minikube on battery power severely undermines its performance, and pods can take really long to start. Make sure to be plugged in to power before starting this tutorial.
  2. Copy all required secrets and passwords into the cluster:
    • kubectl apply -k secrets
  3. Install and run Minio in your cluster:
    • kubectl apply -k minio
  4. Install MariaDB in your cluster:
    • kubectl apply -k mariadb
  5. Install WordPress:
    • kubectl apply -k wordpress
  6. Install K8up in Minikube:
    • helm repo add appuio https://charts.appuio.ch
    • helm repo update
    • helm install appuio/k8up --generate-name --set k8up.backupImage.tag=v0.1.8-root

After finishing all these steps, check that everything is running. The easiest way is to launch k9s and leave it running in its own terminal window; of course, you can also use the usual kubectl get pods command.
Tip: In k9s you can easily delete a pod by going to the „Pods“ view (type :, write pods at the prompt and hit Enter), selecting the pod to delete with the arrow keys, and hitting the CTRL+D key shortcut.
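
If you prefer a quick check on the command line, the sketch below filters the output of kubectl get pods and prints only the pods that are not yet running. It assumes the default column layout of kubectl get pods (NAME, READY, STATUS, RESTARTS, AGE); the function name pods_not_running is hypothetical.

```shell
# Read the output of `kubectl get pods --no-headers` on standard input and
# print the name of every pod whose STATUS column (the third one) is
# neither "Running" nor "Completed".
pods_not_running() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Usage, against a running cluster:
#   kubectl get pods --no-headers | pods_not_running
```

An empty result means that every pod has started; note that a pod in the Running state may still need a few seconds before it is ready to serve requests.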

3.2 Viewing Minio and WordPress in a browser

Note: The operations of this step can be executed at once using the scripts/2_browser.sh script.

  1. Open WordPress in your default browser using the minikube service wordpress command. You should see the WordPress installation wizard appear in your browser window.
  2. Open Minio in your default browser with the minikube service minio command.
    • You can log in to Minio with these credentials: access key minio, secret key minio123.

3.2.1 Setting up the new blog

Follow these instructions in the WordPress installation wizard to create your blog:

  1. Select your language from the list and click the Continue button.
  2. Fill in the form to create a new blog.
  3. Create a user admin.
  4. Copy the random password shown, or use your own password.
  5. Click the Install WordPress button.
  6. Log in to the WordPress console using the user and password.
    • Create one or many new blog posts, for example using pictures from Unsplash.
  7. Enter some text or generate some random text using a Lorem ipsum generator.
  8. Click on the „Document“ tab.
  9. Add the image as „Featured image“.
  10. Click „Publish“ and see the new blog post on the site.

3.3 Backing up the blog

Note: The operations of this step can be executed at once using the scripts/3_backup.sh script.
To trigger a backup, use the command kubectl apply -f k8up/backup.yaml. You can see the job in the „Jobs“ section of k9s.
Running the kubectl logs command on a backup pod shows the following information:

$ kubectl logs backupjob-1564752600-6rcb4
No repository available, initialising...
created restic repository edaea22006 at s3:http://minio:9000/backups
Please note that knowledge of your password is required to access
the repository. Losing your password means that your data is
irrecoverably lost.
Removing locks...
created new cache in /root/.cache/restic
successfully removed locks
Listing all pods with annotation appuio.ch/backupcommand in namespace default
Adding default/mariadb-9588f5d7d-xmbc7 to backuplist
Listing snapshots
snapshots command:
0 Snapshots
backing up via mariadb stdin...
Backup command: /bin/bash, -c, mysqldump -uroot -p"${MARIADB_ROOT_PASSWORD}" --all-databases
done: 0.00%
backup finished! new files: 1 changed files: 0 bytes added: 4184711
Listing snapshots
snapshots command:
1 Snapshots
sending webhook Listing snapshots
snapshots command:
1 Snapshots
backing up...
Starting backup for folder wordpress-pvc
done: 0.00%
backup finished! new files: 1932 changed files: 0 bytes added: 44716176
Listing snapshots
snapshots command:
2 Snapshots
sending webhook Listing snapshots
snapshots command:
2 Snapshots
Removing locks...
successfully removed locks
Listing snapshots
snapshots command:
2 Snapshots

If you look at the Minio browser window, there should now be a set of folders that appeared out of nowhere. That’s your backup in Restic format!
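
To get a quick idea of how much data a backup job has written, you can add up the „bytes added“ figures of its log. The sketch below assumes the log format shown above; the function name total_bytes_added is hypothetical.

```shell
# Read a K8up backup log on standard input and print the sum of the
# "bytes added" figures found on every "backup finished!" line.
total_bytes_added() {
  awk '/backup finished!/ {
         for (i = 1; i < NF; i++)
           if ($i == "bytes" && $(i + 1) == "added:") total += $(i + 2)
       }
       END { printf "%d\n", total }'
}

# Usage, with the name of your own backup pod:
#   kubectl logs backupjob-1564752600-6rcb4 | total_bytes_added
```

For the log shown above, this adds the 4184711 bytes of the MariaDB dump to the 44716176 bytes of the WordPress PVC.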

3.3.1 How does K8up work?

K8up runs Restic in the background to perform its job. It will automatically back up the following:

  1. All PVCs in the cluster with the ReadWriteMany attribute.
  2. All PVCs in the cluster with the k8up.syn.tools/backup: "true" annotation.

The PVC definition below shows how to add the required annotation for K8up to do its job.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wordpress-pvc
  labels:
    app: wordpress
  annotations:
    k8up.syn.tools/backup: "true"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

As with any other Kubernetes object, every K8up action (backups, restores, archival, etc.) is described in a YAML file. The most important part of the YAML files used by K8up is the backend object:

backend:
  repoPasswordSecretRef:
    name: backup-repo
    key: password
  s3:
    endpoint: http://minio:9000
    bucket: backups
    accessKeyIDSecretRef:
      name: minio-credentials
      key: username
    secretAccessKeySecretRef:
      name: minio-credentials
      key: password

This object specifies two major keys:

  • repoPasswordSecretRef contains the reference to the secret that contains the Restic password. This is used to open, read and write to the backup repository.
  • s3 specifies the location and credentials of the storage where the Restic backup is located. The only valid option at this moment is an AWS S3 compatible location, such as a Minio server in our case.
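
The backend object refers to two secrets by name. In this tutorial they are created by the kubectl apply -k secrets command, so you do not need to create them yourself; the sketch below only illustrates what such secrets could look like. The values are assumptions, taken from the credentials used elsewhere in this text, and the function name backend_secrets is hypothetical.

```shell
# Print two Secret manifests matching the names and keys referenced by the
# backend object. The stringData field accepts plain-text values, which
# Kubernetes encodes when it stores the secret.
# The values below are assumptions based on the credentials in this text.
backend_secrets() {
  cat <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: backup-repo
stringData:
  password: "p@ssw0rd"
---
apiVersion: v1
kind: Secret
metadata:
  name: minio-credentials
stringData:
  username: "minio"
  password: "minio123"
EOF
}

# Usage, against a running cluster:
#   backend_secrets | kubectl apply -f -
```

Keeping the credentials in secrets means that the backup, restore and schedule objects only ever mention names and keys, never the passwords themselves.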

3.4 Restoring a backup

Note: The operations of this step can be executed at once using the scripts/4_restore.sh script.
Let’s pretend now that an attacker has gained access to your blog: we will remove all blog posts and images from the WordPress installation and empty the trash.

Oh noes! But don’t worry: thanks to K8up you can bring your old blog back in a few minutes.
There are many ways to restore Restic backups, for example locally (useful for debugging or inspection) and remotely (on PVCs or S3 buckets, for example.)

3.4.1 Restoring locally

To restore using Restic, set these variables (in a Unix-based system; for Windows, the commands are different):

export KUBECONFIG=""
export RESTIC_REPOSITORY=s3:$(minikube service minio --url)/backups/
export RESTIC_PASSWORD=p@ssw0rd
export AWS_ACCESS_KEY_ID=minio
export AWS_SECRET_ACCESS_KEY=minio123

Note: You can create these variables simply by running source scripts/environment.sh.
With these variables in your environment, run the command restic snapshots to see the list of backups, and restic restore XXXXX --target ~/restore to trigger a restore, where XXXXX is one of the IDs appearing in the results of the snapshots command.
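
If you only want the most recent backup, Restic also accepts the keyword latest instead of a snapshot ID. The sketch below wraps that call in a small function; the function name restore_latest and the target directory are hypothetical, and the path /default-mariadb is the one used for the MariaDB backup later in this tutorial.

```shell
# Restore the most recent snapshot that contains the given path into a
# target directory. "latest" is Restic's keyword for the newest snapshot,
# and --path narrows the choice to snapshots of that path.
restore_latest() {
  snapshot_path=$1
  target_dir=$2
  restic restore latest --path "$snapshot_path" --target "$target_dir"
}

# Usage, with the environment variables above already exported:
#   restore_latest /default-mariadb ~/restore
```

This saves the copy and paste of a snapshot ID when you just want to inspect the newest state of a backup.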

3.4.2 Restoring the WordPress PVC

K8up is able to restore data directly on specified PVCs. This requires some manual steps.

  • Using the steps in the previous section, „Restoring locally“, check the ID of the snapshot you would like to restore:
$ source scripts/environment.sh
$ restic snapshots
$ restic snapshots XXXXXXXX --json | jq -r '.[0].id'
  • Use that long ID in your restore YAML file k8up/restore/wordpress.yaml:
    • Make sure the restoreMethod:folder:claimName: value corresponds to the Paths value of the snapshot you want to restore.
    • Replace the snapshot key with the long ID you just found:
apiVersion: backup.appuio.ch/v1alpha1
kind: Restore
metadata:
  name: restore-wordpress
spec:
  snapshot: 00e168245753439689922c6dff985b117b00ca0e859cc69cc062ac48bf8df8a3
  restoreMethod:
    folder:
      claimName: wordpress-pvc
  backend:
  • Apply the changes:
    • kubectl apply -f k8up/restore/wordpress.yaml
    • Use the kubectl get pods commands to see when your restore job is done.

Tip: Use the kubectl get pods --sort-by=.metadata.creationTimestamp command to list the pods from oldest to newest; the restore job pod will appear at the bottom of the list.
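
Editing the snapshot key by hand is easy to get wrong, so here is a minimal sketch of how the replacement could be scripted. It is not part of the tutorial repository; the function name set_snapshot_id is hypothetical, and it assumes that the manifest contains a single snapshot: key, as in the example above.

```shell
# Rewrite the value of the "snapshot:" key in a Restore manifest.
# The result goes to a temporary file first, because in-place editing
# with sed -i behaves differently on Linux and macOS.
set_snapshot_id() {
  manifest=$1
  snapshot_id=$2
  tmp=$(mktemp) || return 1
  if sed "s|^\([[:space:]]*snapshot:\).*|\1 $snapshot_id|" "$manifest" > "$tmp"; then
    cat "$tmp" > "$manifest"
  fi
  rm -f "$tmp"
}

# Usage, with XXXXXXXX being the short ID shown by restic snapshots:
#   id=$(restic snapshots XXXXXXXX --json | jq -r '.[0].id')
#   set_snapshot_id k8up/restore/wordpress.yaml "$id"
#   kubectl apply -f k8up/restore/wordpress.yaml
```

The indentation of the key is preserved, so the rest of the YAML file stays untouched.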

3.4.3 Restoring the MariaDB pod

In the case of the MariaDB pod, we have used a backupcommand annotation. This means that we have to „pipe“ the contents of the backup into the mysql command of the pod, so that the information can be restored.
Follow these steps to restore the database:

  1. Retrieve the ID of the MariaDB snapshot:
    • restic snapshots --json --last --path /default-mariadb | jq -r '.[0].id'
  2. Save the contents of the backup locally:
    • restic dump SNAPSHOT_ID /default-mariadb > backup.sql
  3. Get the name of the MariaDB pod:
    • kubectl get pods | grep mariadb | awk '{print $1}'
  4. Copy the backup into the MariaDB pod:
    • kubectl cp backup.sql MARIADB_POD:/
  5. Get a shell to the MariaDB pod:
    • kubectl exec -it MARIADB_POD -- /bin/bash
  6. Execute the mysql command in the MariaDB pod to restore the database:
    • mysql -uroot -p"${MARIADB_ROOT_PASSWORD}" < /backup.sql

Now refresh your WordPress browser window and you should see the previous state of the WordPress installation restored, working and looking as expected!
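
As a variation on the steps above, the dump can also be streamed directly into the mysql command of the pod, without saving or copying a backup.sql file. This is a sketch under that assumption, not the method used by the tutorial scripts; the function name restore_mariadb is hypothetical, and its two arguments are the snapshot ID and the pod name obtained in steps 1 and 3.

```shell
# Stream a MariaDB dump from a Restic snapshot straight into the mysql
# client of the given pod. The single quotes make sure that the variable
# MARIADB_ROOT_PASSWORD is expanded inside the pod, not on your laptop.
restore_mariadb() {
  snapshot_id=$1
  pod=$2
  restic dump "$snapshot_id" /default-mariadb |
    kubectl exec -i "$pod" -- /bin/bash -c \
      'mysql -uroot -p"${MARIADB_ROOT_PASSWORD}"'
}

# Usage:
#   restore_mariadb SNAPSHOT_ID MARIADB_POD
```

The -i flag of kubectl exec keeps standard input open, which is what allows the pipe to reach the mysql command.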

3.5 Scheduling regular backups

Note: The operations of this step can be executed at once using the scripts/5_schedule.sh script.
Instead of performing backups manually, you can also set a schedule for backups. This requires specifying the schedule in cron format.

backup:
  schedule: '*/2 * * * *'    # backup every 2 minutes
  keepJobs: 4
  promURL: http://minio:9000

Tip: Use crontab.guru to help you set up complex schedule formats in cron syntax.
The schedule can also specify archive and check tasks to be executed regularly.

archive:
  schedule: '0 0 1 * *'       # archive on the first day of every month
  restoreMethod:
    s3:
      endpoint: http://minio:9000
      bucket: archive
      accessKeyIDSecretRef:
        name: minio-credentials
        key: username
      secretAccessKeySecretRef:
        name: minio-credentials
        key: password
check:
  schedule: '0 1 * * 1'      # check every Monday at 01:00
  promURL: http://minio:9000

Run the kubectl apply -f k8up/schedule.yaml command. This will set up an automatic schedule to back up the PVCs every 2 minutes (on minutes that are multiples of 2).
Wait for at most 2 minutes, and run the restic snapshots command to see more backups piling up in the repository.
Tip: Running the watch restic snapshots command will give you a live console with your current snapshots on a terminal window, updated every 2 seconds.
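
If the */2 notation in the backup schedule above is new to you: a step value in the minute field fires whenever the minute of the hour is a multiple of the step. The sketch below illustrates just that rule; the function name fires_at_minute is hypothetical and it does not implement the rest of the cron syntax.

```shell
# Return success when a cron minute field of the form "*/step" fires at
# the given minute of the hour (0 to 59), that is, when that minute is a
# multiple of the step.
fires_at_minute() {
  minute_of_hour=$1
  step=$2
  [ $((minute_of_hour % step)) -eq 0 ]
}

# Show at which of the first ten minutes of an hour '*/2 * * * *' fires.
m=0
while [ "$m" -lt 10 ]; do
  if fires_at_minute "$m" 2; then
    echo "backup at minute $m"
  fi
  m=$((m + 1))
done
```

This is why a new snapshot should appear within 2 minutes of applying the schedule.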

3.6 Cleaning up the cluster

Note: The operations of this step can be executed at once using the scripts/6_stop.sh script.
When you are done with this tutorial, just execute the minikube stop command to shut the cluster down. You can also run minikube delete if you would like to get rid of it completely.

4. Conclusion

We hope that this walkthrough has given you a good overview of K8up and its capabilities. But it can do much more than that! We haven’t talked about the archive, prune, and check commands, or about the backup of any data piped to stdout (called „Application Aware“ backups.) You can check these features in the K8up documentation website where they are described in detail.
K8up is still a work in progress, but it is already being used in production in many clusters. It is also an open source project, and everybody is welcome to use it freely, and even better, to contribute to it!

Adrian Kosmaczewski

Adrian Kosmaczewski is in charge of Developer Relations at VSHN. He has been a software developer since 1996, as well as a trainer and published author. Adrian holds a Master's degree in Information Technology from the University of Liverpool.
