Backing up your ActionKit database

Here’s how — and why! — to set up your own automated backups of your ActionKit database.

Why would I want this?

The ActionKit team maintains their own internal backups of your full database, but it can be useful to have your own external backups as well.

Within ActionKit, changes aren’t tracked on most fields — so any time you change a user’s name, address, phone number, custom user fields, or group memberships, the previous values of those fields become permanently inaccessible. That’s also true for most other information stored in your database: everything from custom action fields to event details is permanently overwritten every time an update occurs.1

That means that if you perform a big import and later discover that you included data for users that you shouldn’t have, it’s quite hard to roll back your user data to the state it was in beforehand.

If you have a recent backup available, though, you can revert those changes. Just create a new database on the fly with your latest backup as the source, then query that to get a CSV file you can import if you realize something was done in error.
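For example, assuming a MySQL server you control and a backup file named ak_yourinstance.sql (both names illustrative, as are the columns queried), the revert flow might look like this:

```shell
# The dump was made with --databases, so it carries its own CREATE
# DATABASE and USE statements; loading it recreates the database.
mysql < ak_yourinstance.sql

# Export the pre-import state of the rows you care about as a rough
# CSV (tab-to-comma here is naive: it won't quote embedded commas).
mysql ak_yourinstance --batch \
      -e "SELECT id, email, first_name, last_name FROM core_user" \
  | tr '\t' ',' > users_snapshot.csv
```

You can then trim that CSV down to just the rows that need reverting and import it back into ActionKit.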

You can also use your backup as a lightweight analytics platform. ActionKit’s reporting system and replica database are fantastic, but sometimes you’ll find that there are queries that you just can’t figure out how to construct — or how to run without waiting for hours — from within the system. With a backup that you can import on demand into a temporary database that isn’t limited to read-only queries, you can create new tables to aggregate and restructure your data, import data from other sources to join to your ActionKit tables, and then run much simpler (and faster) queries to generate dashboards, emailed summaries, or files to import back into ActionKit.

And if you ever need to migrate off ActionKit to a different platform, you should definitely keep a final export of your full database so that you can continue to query it for historical data and bring data into other systems via imports going forward. Don’t lose all those years’ worth of content and deep member engagement data!

How do I do it?

From a terminal, this command will generate a full export of your ActionKit MySQL replica database as an on-disk text file on your local computer or cloud server, assuming you’ve set the appropriate environment variables and have a MySQL client installed:

mysqldump --host=$ACTIONKIT_DB_HOST --user=$ACTIONKIT_DB_USER --password=$ACTIONKIT_DB_PASSWORD \
	--skip-lock-tables --set-gtid-purged=OFF --no-tablespaces \
	--databases $ACTIONKIT_DB_NAME > $ACTIONKIT_DB_NAME.sql

This set of flags2 should give you a backup that you can later restore to a cloud-managed database.

That’s not very helpful though. You probably don’t want to have to run that command every time you want a database backup — you want it to run automatically every night. And you almost certainly don’t want that backup file on your computer — you want it securely hosted in a cloud storage environment like Amazon S3.

How do I do it easily, cheaply, and automatically?

Fortunately, it’s quite easy to set up a version of this command to run on a fixed schedule in the cloud. And if your organization has a Github account, you can do this all for virtually no cost through Github Actions.

Just fork this repository that we’ve put together to make this simple:

thethirdbearsolutions/auto-actionkit-database-backups

Then navigate to your repository’s Settings → Secrets and variables → Actions and enter the following eight Repository Secrets using the big green “New repository secret” button:

  • ACTIONKIT_DB_NAME: the name of your replica database, e.g. ak_yourinstance

  • ACTIONKIT_DB_HOST: the hostname of your replica database, e.g. your-organization.client-db.actionkit.com

  • ACTIONKIT_DB_USER: the username to connect to your replica database, e.g. yourinstance_username

  • ACTIONKIT_DB_PASSWORD: the password for your database access that was sent to you via ots.actionkit.com

(The above four values can be found by logging in to your ActionKit instance, navigating to Staff → your user account, and clicking “MySQL Access: Create Account” or “MySQL Access: Reset Password.” You’ll then receive an email containing the information you need to fill in those four secrets.)

  • AWS_ACCESS_KEY_ID: your Amazon access key.

  • AWS_SECRET_ACCESS_KEY: the secret access key associated with your AWS access key.

(If you’re using ActionKit you probably already have an AWS account set up with at least one access key, which you provided to ActionKit support during onboarding. However, we recommend using AWS IAM to set up an entirely new user account with its own access key — since you’ll be uploading very sensitive data here (i.e. your entire database) you should keep access as isolated as possible so that you can audit, rotate, and/or revoke keys at any time without impact on other systems.)

  • AWS_S3_BUCKET_NAME: the name of a new S3 bucket that you created specifically for your backups. Make sure that the access key you provided above has full access to this bucket.

  • AWS_DEFAULT_REGION: the AWS region that your bucket was created in, e.g. us-east-1

And that’s it! Github Actions will now run a backup of your database every day at midnight UTC, compress the backup, and upload it to your S3 bucket as a .sql.gz file.
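To make the compress step concrete, here’s a local simulation using a dummy dump file; the upload line is shown commented out since it needs real AWS credentials, and the filenames are illustrative rather than what the workflow actually uses:

```shell
# Create a tiny stand-in for a real mysqldump output file.
echo "CREATE TABLE core_user (id INT);" > ak_example.sql

# Compress the dump; -9 trades CPU time for a smaller file.
gzip -9 -c ak_example.sql > ak_example.sql.gz

# Verify the archive round-trips before trusting it as a backup.
gunzip -c ak_example.sql.gz | diff - ak_example.sql && echo "archive OK"

# The real workflow would then upload it, along the lines of:
# aws s3 cp ak_example.sql.gz "s3://$AWS_S3_BUCKET_NAME/ak_example.sql.gz"
```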

Note that we recommend making your copy of the repository private. If your repository is public, your Repository Secrets will remain totally private, but the timing of your backups — including both how frequently you run them, and how long they take — would be publicly viewable, which would inadvertently reveal some indirect information about how big your database is.

How long will a backup take?

If your database is relatively small, expect each run of the backup to take 5-30 minutes. For large databases reflecting many years of member engagement and organizational history against a large list, the backup can take 2-4 hours. Jobs in the Github Actions platform are limited to six hours of execution time, which should be enough time for your backup to complete unless you have an extremely large ActionKit database.

To give a more concrete sense of scale, we’ve seen backups of a 6 GiB ActionKit database complete in 12 minutes, and backups of a 90 GiB ActionKit database complete in 2 hours and 45 minutes.

How much will a backup cost?

By default, Github Actions allows you to run your scripts for 2000-3000 minutes per month at no additional cost3, so if your backup completes in an hour or less you should be able to run it daily without incurring any costs, unless you have other jobs running in the Github Actions platform as well.
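As a quick sanity check on those numbers, assuming a 60-minute backup and a 31-day month:

```shell
# Worst-case monthly usage for a daily one-hour backup job.
MINUTES_PER_RUN=60
RUNS_PER_MONTH=31
USED=$((MINUTES_PER_RUN * RUNS_PER_MONTH))
echo "$USED of 2000 free minutes used"   # 1860: just under the free-plan cap
```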

For large databases that take 2-4 hours to back up, you can reduce the backup schedule4 and only run 2-3 times per week if you’re concerned about hitting the cap on free minutes. Or you can set a Github Actions spending limit to control exactly how much you’ll pay.

Storing your backups on S3 should generally be very cheap — no more than a few dollars per month, and probably less than a dollar. We’ve seen backups of a 6 GiB ActionKit database that result in a 1.2 GiB compressed backup file, and backups of a 90 GiB ActionKit database that result in a 30 GiB compressed backup file. At S3’s baseline price of $0.023 per GB these work out to well under a dollar per month each.
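Treating GiB as roughly GB, you can sanity-check that storage estimate yourself:

```shell
# Monthly S3 storage cost for one compressed backup at $0.023/GB.
small=$(awk 'BEGIN { printf "%.2f", 1.2 * 0.023 }')
large=$(awk 'BEGIN { printf "%.2f", 30  * 0.023 }')
echo "1.2 GB backup: \$$small/month"   # $0.03
echo "30 GB backup:  \$$large/month"   # $0.69
```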

Alternatives

If you’re not using Amazon S3, this command can be adapted very easily for any S3-compatible cloud storage platform. That includes Google Cloud Storage, Digital Ocean Spaces, and Cloudflare’s R2, among many others. You’ll need to modify the command a bit — it can be found in the .github/workflows/backup.yml file in your forked repository — and will likely need a slightly different set of repository secrets.
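For instance, with Cloudflare R2 the upload step typically just needs an alternate endpoint; here’s a sketch, where the account ID is a placeholder and your R2 credentials would go in the usual AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY secrets:

```shell
# Upload to an S3-compatible service by overriding the endpoint URL.
# <ACCOUNT_ID> is a placeholder for your Cloudflare account ID.
aws s3 cp backup.sql.gz "s3://$AWS_S3_BUCKET_NAME/" \
    --endpoint-url "https://<ACCOUNT_ID>.r2.cloudflarestorage.com"
```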

If you don’t have a Github account or are otherwise unable to use Github Actions, there are many other options for running scheduled commands. That said, you’ll need to make sure that you’re using a service that lets you execute potentially-very-long-running commands — a database backup can take a while, and many hosted scheduling services have pretty short timeouts. You’ll also want to be confident that you trust the service with your sensitive data — you’ll be providing access to your database after all. Github Actions checks both of those boxes, and it’s especially easy to set up since it comes with the AWS command-line toolkit and mysql client preinstalled; on other platforms you may need to do more setup work to install those dependencies before running the command.

You can also look into services that specifically offer scheduled database backup commands and/or fully-hosted backup solutions, though these are likely to cost more and may not provide the same peace of mind around stability, security, and transparency as a Github + Amazon/Google/Cloudflare combination.

We’ve used Skyvia a few times and have found it to be quite user-friendly and simple to set up for a variety of snapshotting, querying, and data integration use cases.

Many other options exist, including bakup.io, weap.io, and simplebackups.com. In all of these cases note that you may run into some trouble if you’re not able to pass the --skip-lock-tables and --no-tablespaces flags into your mysqldump command.

Now what?

In future posts we’ll layer in some improvements to this workflow, like automatic backup rotation. And we’ll also go over ways to easily spin up a database from your backup, if you need to use it to retrieve missing data or want to run queries against it.

Let us know how this goes! And if this is out of your comfort zone you can reach out to us about doing it for you.


  1. The major exception here is mailing list subscriptions, thanks to the core_subscriptionhistory table. ↩︎

  2. In particular, note: --skip-lock-tables avoids taking table locks, which you generally can’t (and don’t need to) acquire against a read-only replica; --set-gtid-purged=OFF leaves GTID statements out of the dump, which would otherwise prevent restoring it into many cloud-managed databases; and --no-tablespaces skips tablespace information, which as of MySQL 8.0.21 requires an extra privilege that your replica user likely doesn’t have. ↩︎
  3. Depending on whether you’re already paying for Github. If you’re on a free Github plan, you get 2000 minutes included. If you’re on a paid Github plan for your organization, you get 3000 minutes included. ↩︎

  4. To reduce the backup schedule, navigate to .github/workflows/backup.yml in your copy of the repository. Look for the line that starts - cron: and replace the rest of that line with a different schedule using cron syntax. (Crontab.guru is a great resource for remembering cron syntax.) For example, 0 0 * * 1,3,6 would run backups three times per week at midnight UTC every Monday, Wednesday, and Saturday. ↩︎
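Putting that together, the schedule section of backup.yml would look something like this (the exact surrounding keys are an assumption about the repository’s layout):

```yaml
on:
  schedule:
    # Midnight UTC every Monday, Wednesday, and Saturday.
    - cron: '0 0 * * 1,3,6'
```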

© 2024 The Third Bear Solutions, LLC