Distributed Database Replication, In One Evening2022-04-20
I have read about Litestream with interest and thought I should finally try it out. The appeal of a lightweight replication mechanism on top of one of my favorite little databases is too great to ignore.
First I Need an SFTP Server
I need an SFTP server not because I am nostalgic for the 90s but because it is one of the supported backends for Litestream. While the intended use seems to be replicating SQLite databases to cloud storage such as S3 there is in fact support for a more… self-hosted approach, via SFTP.
Litestream works by watching the write-ahead log from SQLite and basically streaming that log to remote storage. Supported backends include all the big cloud providers as well as smaller players like DigitalOcean and Linode. The code isn't actually too bad to just read and I felt like I could safely ignore all the complexity of the providers I am not interested in using.
With some rough understanding of what I intended to do the first
step is to create a dedicated SFTP user with
and avoid any headaches around managing logins or home directories:
ALPHA# cat /etc/sysusers.d/replicator.conf u replicator - "sftp user"
Apply that change to create the user:
ALPHA# systemd-sysusers /etc/sysusers.d/replicator.conf
Create an ssh key (not on the same server, obviously):
BRAVO# ssh-keygen -t ed25519 -C "replicator@bravo" -f /etc/replicator/replicator.key
Because this user will not have a home directory I need to bring the
public key from BRAVO to the SFTP
server, ALPHA, and put it some place
Configure the SSH server to disallow ssh access and root all remote
logins from this user into a single directory. I added this to the
Match User replicator ChrootDirectory /var/data/%u ForceCommand internal-sftp AllowTcpForwarding no AuthorizedKeysFile /etc/ssh/authorized_keys_%u
Obviously the chroot directory needs to exist, so in my case I need to:
ALPHA# mkdir -p /var/data/replicator/db
In order to chroot the
is owned by root, the final child directory (
is owned by the new system user:
ALPHA# chown replicator /var/data/replicator/db
What it looks like in practice
BRAVO$ sftp -i replicator.key replicator@ALPHA Connected to ALPHA. sftp> pwd Remote working directory: / sftp> cd .. sftp> pwd Remote working directory: / sftp> # can't escape beyond the chroot sftp> ls -al drwxr-xr-x 3 root root 4096 Apr 21 00:20 . drwxr-xr-x 3 root root 4096 Apr 21 00:20 .. drwxr-xr-x 2 982 root 4096 Apr 21 00:28 db sftp> sftp> put junk.txt . Uploading junk.txt to /./junk.txt dest open("/./junk.txt"): Permission denied sftp> # no permissions beyond the owned directory sftp> sftp> put junk.txt db Uploading junk.txt to /db/junk.txt junk.txt 100% 5 5.3KB/s 00:00 sftp> ls db db/junk.txt
Happily there is not a lot to mention here, it does exactly what it says on the tin. The documentation walks through setting up an example database with the following:
BRAVO# sqlite3 /var/data/db/fruits.db sqlite> create table fruits(name, color); sqlite> insert into fruits values ('appple', 'red'), ('banana', 'yellow');
One thing worth mentioning, because I didn't find it in the
documentation, the configuration file defaults
/etc/litestream.yml but can be overridden via the
LITESTREAM_CONFIG. While I have created a
similar directory structure on both hosts
/var/data/db) this is purely to make my own life
easier and not required by the software.
BRAVO# cat litestream.yml dbs: - path: /var/data/db/fruits.db replicas: - url: sftp://replicator@alpha/db key-path: /etc/replicator/replicator.key
With that configured it is then possible to invoke:
BRAVO# litestream replicate
And the write-ahead log for SQLite is shipped over to my SFTP server. There is nothing else to do. Further writes sync automatically and everything proceeds as normal.
Recover from total database loss
Of course it is important to verify the replication can recreate the database so I did the simplest thing and wiped all traces of the database from the "source" server.
BRAVO# rm -rf /var/data/db/*
Restoring the database looks like this:
BRAVO# litestream restore fruits.db
The write-ahead logs are streamed back from the SFTP server and the database is reconstituted. No fuss, no problems.
Of course the
litestream process should run in the
background and either start on-boot or chained to a related
service. This sounds like a pretty
service so I won't get into it here to belabor a contrived
example. While the SFTP connection is encrypted with public key
cryptography (same as ssh) it would be nice to isolate things even
further, like say, on a virtual private network. Finally I think I
should like to try one of
single-process applications I keep hearing about; I guess I need
to brush up on some Go and work up a fun little project idea.
I am impressed. Dead simple to configure both the SFTP server and litestream. I would like to see how litestream works in the long term although I have some reservations about utilizing it very much. Not that I foresee issues but rather I foresee a lack of issues. I have run multiple SQLite databases on a single host for years without incident so I don't know that I'll have much cause to restore databases. Certainly starting to feel like I am exercising my system administration skills and I am pleased to find some nice, simple tools to get a modern lightweight system configured.