Backups with Restic

Restic is a powerful CLI-based backup tool written in Go that I use regularly across my Darwin laptop and NixOS desktop. The important features of restic are:

  • A variety of storage backends for your backup 'repositories': local directories, Amazon S3 and S3-compatible object stores, and Google Cloud Storage (my preference).
  • Deduplication of file chunks, greatly reducing the size of incremental backups.
  • A single repository can be shared by a group of machines to enable greater deduplication: the same file on two machines is only stored once.
  • Repositories are encrypted at rest, so an attacker who gains access to the storage behind your backup repo cannot read your data without the repository password.

In this post, I'll outline how I use restic on a regular basis using a custom bash script. But first, let's get set up with restic.

Installing and Getting Setup

The official restic documentation is a great place to look if you get stuck at some point in this article.

First, install restic. It is available through most package managers, and as always I recommend using one so you can keep restic up to date.

You can install restic in the following ways.

# Using nix, my preferred method
nix-env -i restic
# Using Homebrew
brew install restic
# Using pacman
pacman -S restic

I use two different repositories for backups: a locally attached external hard drive (referred to as Public Drive) and a Google Cloud Storage (GCS) bucket, which is a relatively low-cost, highly reliable, offsite blob storage option. The offsite nature of the GCS bucket is key. Restic natively supports local filesystem directories as backup repositories and also ships with support for GCS buckets. You can follow the restic documentation for an explanation of setting up a basic local repository, but a bit of shell scripting goes a long way toward making backups easy.

The key thing to note is that the repository-specific parts of a restic command can be set using environment variables (envars). For example, the repository password can be sourced from a 'password command' set using RESTIC_PASSWORD_COMMAND, and when using a GCS bucket, restic relies on the envar GOOGLE_PROJECT_ID.
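
As a minimal sketch of this idea, the bash snippet below checks that the repository settings are present in the environment before restic is invoked. The helper name require_restic_env is hypothetical and not part of restic. It checks only the two variables used in this setup, though restic can also take its password from other sources.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of restic): verify that the repository
# settings this setup relies on are present in the environment.
require_restic_env() {
  local name missing=0
  for name in RESTIC_REPOSITORY RESTIC_PASSWORD_COMMAND; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name
    if [ -z "${!name:-}" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

It could be used as a guard, for example: require_restic_env && restic snapshots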

A Shell Function

The first step I took toward making backups easy was creating a fish shell function that sets the necessary restic environment variables for my GCS repository.

function restic-gcs
  set -lx GOOGLE_PROJECT_ID REDACTED
  set -lx GOOGLE_APPLICATION_CREDENTIALS $HOME/.config/restic/envs/gcs/gcs-service-key.json
  set -lx GS_CONNECTIONS 50

  set -lx RESTIC_REPOSITORY gs:REDACTED:/
  set -lx RESTIC_PASSWORD_COMMAND "gopass-password-only password"


  set -lx KEEP_LAST 3
  set -lx RETENTION_HOURS 7
  set -lx RETENTION_DAYS 7
  set -lx RETENTION_WEEKS 7
  set -lx RETENTION_MONTHS 18
  set -lx RETENTION_YEARS 20

  restic \
    --option gs.connections=$GS_CONNECTIONS \
    $argv
end

The set -lx statements are the fish equivalent of export, but scoped to the local function. Once placed in ~/.config/fish/functions, the function can be invoked as restic-gcs; all arguments ($argv) are passed to restic, which runs with the appropriate envars set.

Running restic-gcs init will create the repository and restic-gcs {backup, forget, snapshots} will all work conveniently without the need to continually set the envars and pass excess options.
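
For readers who use bash rather than fish, a rough counterpart might look like the sketch below. This is an assumption about how the same idea translates, not something from my own setup: the name restic_gcs is hypothetical, the retention variables are left out for brevity, and the connection count is inlined.

```shell
#!/usr/bin/env bash
# Rough bash counterpart of the fish function (a sketch).
# The parentheses make the function body run in a subshell, so the exports
# do not leak into the calling shell, similar to `set -lx` in fish.
restic_gcs() (
  export GOOGLE_PROJECT_ID=REDACTED
  export GOOGLE_APPLICATION_CREDENTIALS="$HOME/.config/restic/envs/gcs/gcs-service-key.json"
  export RESTIC_REPOSITORY=gs:REDACTED:/
  export RESTIC_PASSWORD_COMMAND="gopass-password-only password"

  # Pass every argument through to restic unchanged
  restic --option gs.connections=50 "$@"
)
```

With this in place, restic_gcs snapshots behaves like restic snapshots run against the GCS repository.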

One step further: a script to simplify common operations

My next objective was to create a portable script (one that can be used on multiple machines) that allows convenient backups of common directories to a specified repo. The script I wrote is called back and is deployed to ~/.config/scripts/shared/misc/back using my dotfile setup. I wanted the script to use the following argument structure: back {repo} {operation: backup-full-home, backup-limited-home, forget-prune}

The repo-specific and machine-specific requirements are:

  • Machine
    • Specify paths and files to backup per machine
    • Specify paths and files to ignore per machine
  • Repo
    • Specify repository, password command, retention policy, etc. per repo

I decided that the machine-specific includes and excludes should be files located in ~/.config/restic/includes and ~/.config/restic/excludes respectively. This means the files can be deployed separately based on the machine without having to change the env files. The env file for each repo is located under ~/.config/restic/envs/{repo}/env.sh.

The Env file

The env file for each supported repo exports the appropriate envars. Much like the fish shell function above, the GCS env file is pretty simple.

#!/usr/bin/env bash
# ~/.config/restic/envs/gcs/env.sh

export GOOGLE_PROJECT_ID=REDACTED
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/.config/restic/envs/gcs/gcs-service-key.json"
export GS_CONNECTIONS=50

export RESTIC_REPOSITORY=gs:REDACTED:/
export RESTIC_PASSWORD_COMMAND="gopass-password-only password"


export KEEP_LAST=3
export RETENTION_HOURS=7
export RETENTION_DAYS=7
export RETENTION_WEEKS=7
export RETENTION_MONTHS=18
export RETENTION_YEARS=20

This script (exporting a bunch of envars) can be appropriately sourced by the back script.
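
As a sketch of how a script might load one of these env files by repo name, consider the helper below. Both load_repo_env and the BACK_CONFIG_DIR override are hypothetical names introduced for illustration; the default location matches the layout described in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: load the env file for a named repo.
# BACK_CONFIG_DIR is an assumption of this sketch that lets the location
# be overridden; it defaults to ~/.config/restic.
load_repo_env() {
  local repo="$1"
  local env_file="${BACK_CONFIG_DIR:-$HOME/.config/restic}/envs/$repo/env.sh"
  if [ ! -f "$env_file" ]; then
    echo "no env file for repo '$repo': $env_file" >&2
    return 1
  fi
  # `.` is the portable spelling of `source`; the exports land in the current shell
  . "$env_file"
}
```

Failing early on a missing env file avoids running restic with a half-configured environment.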

The Include/Exclude files

The include and exclude files are used by the back script's backup-limited-home, backup-home, and backup-full-home commands. These files determine which directories should be backed up and which files should be excluded. You can read more about includes and excludes here.

# ~/.config/restic/includes/limited-home

/Users/x/.config/
/Users/x/.password-store/
/Users/x/wallpapers/
/Users/x/nix-home/
/Users/x/notes/


# ~/.config/restic/excludes/full-home

video/
Bin/
Applications/
Library/
.Trash/
.android/
.cache/
.cocoapods/
.nix-profile/

devel/**/node_modules/
devel/**/bin/
devel/**/vendor/
devel/**/.venv/
devel/go/bin/
devel/go/pkg/

Using my dotfile deployment setup, I deploy different include/exclude files on a per-device basis.
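
Because the include files differ per device, it can be handy to check that every listed path actually exists on the current machine. The sketch below is one way to do that. The helper missing_includes is hypothetical, and it assumes each entry is a literal path (no glob patterns), one per line, with blank lines and lines starting with # skipped.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print include-file entries that do not exist on this machine.
# Assumes entries are literal paths, one per line.
missing_includes() {
  local file="$1" line
  # The `|| [ -n "$line" ]` part handles a final line without a trailing newline
  while IFS= read -r line || [ -n "$line" ]; do
    # Skip blank lines and comments
    case "$line" in
      ''|'#'*) continue ;;
    esac
    [ -e "$line" ] || echo "$line"
  done < "$file"
}
```

For example, missing_includes ~/.config/restic/includes/limited-home prints nothing when every listed directory is present.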

The script

The script itself uses an array to store the arguments passed to restic, and two case statements on $1 and $2 to append arguments and source envs based on the repo ($1) and the command ($2). I'm not going to explain the specifics of writing a script like this; hopefully, the shell script makes sense to you.

#!/usr/bin/env bash
# Run a restic backup command using the correct repo and environment values
set -e -o pipefail

# Args to be passed to the restic invocation
args=()

# The repo is the first argument
case $1 in
  gcs)
    source "$HOME/.config/restic/envs/gcs/env.sh"
    echo "Running command using GCS, repository: $RESTIC_REPOSITORY"

    # Add Args
    args+=( "--option" "gs.connections=$GS_CONNECTIONS" )
    ;;

  public-drive | pd)
    source "$HOME/.config/restic/envs/backup-drive/env.sh"
    echo "Running command using backup drive filesystem, repository: $RESTIC_REPOSITORY"
    ;;

  *)
    echo "The first argument is the repository to use for the backup, the options are..."
    echo "gcs: Google cloud storage repository."
    echo "pd | public-drive: Public hard drive mounted at /Volumes/public"
    exit 1
    ;;
esac

# The command to execute comes second
case $2 in
  backup-limited-home | blh)
    echo "Backing up limited home."
    echo "Directories: ~/.config/ ~/.password-store/ ~/wallpapers/ ~/notes/"

    # Add Args
    args+=( 'backup' )
    args+=( '--tag' 'limited-home' )
    args+=( "--files-from" "$HOME/.config/restic/includes/limited-home" )
    ;;

  backup-home | bh)
    echo "Backing up important user data home directories."
    echo "Directories: ~/.config/ ~/.password-store/ ~/wallpapers/ ~/notes/ ~/scripts/ ~/dotfiles/ ~/drive ~/devel ~/Music"

    # Add Args
    args+=( 'backup' )
    args+=( '--tag' 'home' )
    args+=( '--exclude-file' "$HOME/.config/restic/excludes/home" )
    args+=( "--files-from" "$HOME/.config/restic/includes/home" )
    ;;

  backup-full-home | bfh)
    echo "Backing up full home directory"
    echo "Excluding files using: $HOME/.config/restic/excludes/full-home"

    args+=( 'backup' )
    args+=( '--tag' 'full-home' )
    args+=( "$HOME" )
    args+=( "--exclude-file" "$HOME/.config/restic/excludes/full-home")
    ;;

  forget | f)
    echo "Forgetting old snapshots"

    args+=(
      "forget"
      "--host" "$HOSTNAME"
      "--group-by" "paths"
      "--keep-last" "$KEEP_LAST"
      "--keep-hourly" "$RETENTION_HOURS"
      "--keep-daily" "$RETENTION_DAYS"
      "--keep-weekly" "$RETENTION_WEEKS"
      "--keep-monthly" "$RETENTION_MONTHS"
      "--keep-yearly" "$RETENTION_YEARS"
    )
    ;;

  forget-prune | fp)
    echo "Forgetting and pruning old snapshots"

    args+=(
      "forget"
      "--prune"
      "--host" "$HOSTNAME"
      "--group-by" "paths"
      "--keep-last" "$KEEP_LAST"
      "--keep-hourly" "$RETENTION_HOURS"
      "--keep-daily" "$RETENTION_DAYS"
      "--keep-weekly" "$RETENTION_WEEKS"
      "--keep-monthly" "$RETENTION_MONTHS"
      "--keep-yearly" "$RETENTION_YEARS"
    )
    ;;

  *)
    echo "The second argument is the operation to run, the options are..."
    echo "blh | backup-limited-home: Backup a small selection of home folders, mainly configuration files."
    echo "bfh | backup-full-home: Backup the entire home directory, exclude some large directories like ~/Library/"
    echo "bh | backup-home: Backup home files containing important user documents, this includes folders like ~/Desktop, ~/Documents and ~/drive"
    echo "f | forget: Forgets snapshots according to the retention policies"
    echo "fp | forget-prune: Like forget but also prunes unused data from the repository"
    exit 1
    ;;
esac

restic "${args[@]}"
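
The forget and forget-prune branches share the same retention flags, so one possible refinement is to build them in a single place. The sketch below shows that idea. The helper add_retention_args is hypothetical and not part of the script above, and it uses bash's built-in HOSTNAME variable for the --host value.

```shell
#!/usr/bin/env bash
# Args to be passed to the restic invocation
args=()

# Hypothetical helper: append the shared retention flags to the args array
# so the two forget branches cannot drift apart.
add_retention_args() {
  args+=(
    "--host" "$HOSTNAME"
    "--group-by" "paths"
    "--keep-last" "$KEEP_LAST"
    "--keep-hourly" "$RETENTION_HOURS"
    "--keep-daily" "$RETENTION_DAYS"
    "--keep-weekly" "$RETENTION_WEEKS"
    "--keep-monthly" "$RETENTION_MONTHS"
    "--keep-yearly" "$RETENTION_YEARS"
  )
}

# Inside the case statement the two branches would then reduce to:
#   forget | f)        args+=( "forget" );           add_retention_args ;;
#   forget-prune | fp) args+=( "forget" "--prune" ); add_retention_args ;;
```

Keeping the flags in one function means a change to the retention policy only has to be made once.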

Usage

The script is complete. Its main purpose is to make it easy to run common backups and tasks. Instead of running restic-gcs backup ~/.config ~/wallpapers ~/notes --exclude ..., which has already been simplified by the fish shell function, I can write back gcs blh.

=>  back
The first argument is the repository to use for the backup, the options are...
gcs: Google cloud storage repository.
pd | public-drive: Public hard drive mounted at /Volumes/public

=> back gcs
Running command using GCS, repository: gs:REDACTED:/
The second argument is the operation to run, the options are...
blh | backup-limited-home: Backup a small selection of home folders, mainly configuration files.
bfh | backup-full-home: Backup the entire home directory, exclude some large directories like ~/Library/
bh | backup-home: Backup home files containing important user documents, this includes folders like ~/Desktop, ~/Documents and ~/drive
f | forget: Forgets snapshots according to the retention policies
fp | forget-prune: Like forget but also prunes unused data from the repository

=> back gcs blh
Running command using GCS, repository: gs:REDACTED:/
Backing up limited home.
Directories: ~/.config/ ~/.password-store/ ~/wallpapers/ ~/notes/

Files:          13 new,    10 changed, 13684 unmodified
Dirs:            0 new,     2 changed,     0 unmodified
Added to the repo: 3.322 KiB

processed 13707 files, 564.171 MiB in 0:07
snapshot REDACTED saved

The fish shell function is still just as important, though. I commonly run operations such as the following.

=> restic-gcs snapshots -c --group-by tags,host

snapshots for (host [some-device], tags [home]):
...
...
...
snapshots for (host [place], tags [limited-home]):
ID        Time                 Host   Tags
--------------------------------------------------
REDACTED  2019-XX-XX XX:41:00  place  limited-home
REDACTED  2019-XX-XX XX:32:15  place  limited-home
REDACTED  2019-XX-XX XX:23:29  place  limited-home
REDACTED  2019-XX-XX XX:10:51  place  limited-home
REDACTED  2019-XX-XX XX:12:24  place  limited-home
REDACTED  2019-XX-XX XX:59:51  place  limited-home
REDACTED  2019-XX-XX XX:29:22  place  limited-home
REDACTED  2019-XX-XX XX:21:01  place  limited-home
REDACTED  2019-XX-XX XX:50:46  place  limited-home
REDACTED  2019-XX-XX XX:03:22  place  limited-home
REDACTED  2019-XX-XX XX:36:51  place  limited-home
REDACTED  2019-XX-XX XX:48:00  place  limited-home
REDACTED  2019-XX-XX XX:29:41  place  limited-home
REDACTED  2019-XX-XX XX:52:05  place  limited-home
REDACTED  2019-XX-XX XX:43:22  place  limited-home
--------------------------------------------------
15 snapshots

That's it

Backups for me are now easy and convenient. I can make offsite backups to GCS over WiFi, and making frequent backups over the internet is often much easier than remembering to plug in an external hard drive each time.