quicproquo/docs/operations/backup-restore.md
Christian Nennemann 91c5495ab7 docs: add operational runbook, Grafana dashboard, and production docker-compose
Add comprehensive operational documentation:
- docs/operations/backup-restore.md: SQLCipher, file backend, blob backup/restore
- docs/operations/key-rotation.md: auth token, TLS, federation, DB key, OPAQUE rotation
- docs/operations/incident-response.md: playbook for common incidents
- docs/operations/scaling-guide.md: resource sizing, scaling triggers, capacity planning
- docs/operations/monitoring.md: Prometheus metrics, alert rules, log monitoring
- docs/operations/dashboards/qpq-overview.json: Grafana dashboard template
- docs/operations/prometheus.yml + alerts: Prometheus scrape and alert config
- docs/operations/grafana-provisioning/: auto-provisioning for datasources and dashboards
- docker-compose.prod.yml: production stack (server + Prometheus + Grafana)
- .env.example: documented environment variable template
2026-03-04 20:30:57 +01:00


Backup and Restore Procedures

This document covers backup and restore for all quicproquo server data stores.

Data Inventory

| Data | Location | Backend | Contains |
|------|----------|---------|----------|
| SQLCipher DB | QPQ_DB_PATH (default data/qpq.db) | store_backend=sql | Users, key packages, delivery queues, sessions, KT log, OPAQUE setup, blobs metadata, moderation |
| File store | QPQ_DATA_DIR (default data/) | store_backend=file | Bincode-serialized key packages, delivery queues, server state |
| Blob storage | QPQ_DATA_DIR/blobs/ | Filesystem | Uploaded file transfer blobs |
| TLS certificates | QPQ_TLS_CERT, QPQ_TLS_KEY | DER files | Server identity |
| OPAQUE ServerSetup | Inside DB or file store | Persisted | OPAQUE credential state (critical for auth) |
| Server signing key | Inside DB or file store | Persisted | Ed25519 key for delivery proofs |
| KT Merkle log | Inside DB or file store | Persisted | Key transparency audit log |

SQLCipher Backup

Hot Backup (Online)

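SQLCipher databases can be backed up with the sqlite3 shell's .backup command while the server is running (WAL mode allows concurrent readers). The commands below assume a sqlite3 binary built against SQLCipher (often installed as sqlcipher); a stock sqlite3 build cannot open the encrypted file.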

# 1. Open the encrypted database
sqlite3 data/qpq.db

# 2. At the sqlite3 prompt, set the encryption key
PRAGMA key = 'your-db-key-here';

# 3. Perform an online backup. The sqlite3 prompt does not expand shell
#    substitutions such as $(date ...), so spell out the file name.
.backup /backups/qpq-20260304-203000.db

.quit
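
Some SQLCipher builds refuse the SQLite backup API, and therefore `.backup`, on encrypted databases. If that happens, SQLCipher's `sqlcipher_export()` function can copy the open database into an attached file that uses the same key. The following is a minimal sketch under assumptions: the SQLCipher shell is installed as `sqlcipher`, and the helper name `export_sql`, the schema alias `bak`, and the destination path are illustrative rather than part of quicproquo.

```shell
# Print the SQL that copies the open database into a new encrypted file.
# Hypothetical helper: $1 = database key, $2 = destination path.
export_sql() {
  # Double any single quotes so the values are safe inside SQL string literals.
  key=$(printf '%s' "$1" | sed "s/'/''/g")
  dest=$(printf '%s' "$2" | sed "s/'/''/g")
  printf "PRAGMA key = '%s';\n" "$key"
  printf "ATTACH DATABASE '%s' AS bak KEY '%s';\n" "$dest" "$key"
  printf "SELECT sqlcipher_export('bak');\n"
  printf "DETACH DATABASE bak;\n"
}

# Usage (assumes QPQ_DB_KEY is set and `sqlcipher` is on PATH):
#   export_sql "$QPQ_DB_KEY" /backups/qpq-export.db | sqlcipher data/qpq.db
```

The exported file should pass the same integrity check as any other backup before it is trusted.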

Scripted Hot Backup

#!/bin/bash
set -euo pipefail

BACKUP_DIR="/backups/qpq"
DB_PATH="${QPQ_DB_PATH:-data/qpq.db}"
DB_KEY="${QPQ_DB_KEY:?QPQ_DB_KEY must be set}"
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
BACKUP_FILE="${BACKUP_DIR}/qpq-${TIMESTAMP}.db"

mkdir -p "$BACKUP_DIR"

sqlite3 "$DB_PATH" <<EOF
PRAGMA key = '${DB_KEY}';
.backup ${BACKUP_FILE}
EOF

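# Verify the backup is readable. Some SQLCipher versions also print "ok" for
# PRAGMA key, so compare only the last line, which is the integrity_check result.
if sqlite3 "$BACKUP_FILE" "PRAGMA key = '${DB_KEY}'; PRAGMA integrity_check;" \
  | tail -n 1 | grep -qx "ok"; then
  echo "Backup verified: $BACKUP_FILE"
else
  echo "ERROR: backup verification failed" >&2
  exit 1
fi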

# Delete backups older than 7 days
find "$BACKUP_DIR" -type f -name 'qpq-*.db' -mtime +7 -delete
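
The `find -mtime` rule prunes by age, so the number of files kept depends on how often the script runs. Where a fixed number of backups is wanted instead, a count-based rule can be sketched as below. The helper name `prune_keep_last` is hypothetical, and the sketch assumes backup names follow the `qpq-YYYYMMDD-HHMMSS.db` pattern used by the script and contain no newlines.

```shell
# Keep only the newest N backups in a directory, judged by file name.
# Hypothetical helper: $1 = backup directory, $2 = number of files to keep.
prune_keep_last() {
  dir=$1
  keep=$2
  # Names embed a sortable timestamp (qpq-YYYYMMDD-HHMMSS.db), so a reverse
  # lexical sort lists the newest backup first.
  find "$dir" -maxdepth 1 -type f -name 'qpq-*.db' | sort -r |
    tail -n +"$((keep + 1))" |
    while IFS= read -r old; do
      rm -f -- "$old"
    done
}

# Usage: keep the 28 most recent backups (7 days at one backup every 6 hours)
#   prune_keep_last /backups/qpq 28
```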

Cold Backup (Offline)

# 1. Stop the server
systemctl stop qpq-server   # or docker compose stop server

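# 2. Copy the database file
BACKUP="/backups/qpq-$(date +%Y%m%d).db"
cp data/qpq.db "$BACKUP"

# 3. Copy the WAL file if it exists, named to match the backup so SQLite can
#    pair them (restore it alongside the database as data/qpq.db-wal).
#    The SHM file is rebuilt automatically and does not need to be copied.
cp data/qpq.db-wal "${BACKUP}-wal" 2>/dev/null || true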

# 4. Restart the server
systemctl start qpq-server
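
Because a cold backup is a plain file copy, a checksum recorded at backup time gives a cheap way to detect later corruption without needing the database key. A minimal sketch, assuming GNU coreutils `sha256sum` is available; the helper names `write_checksum` and `verify_checksum` and the `.sha256` sidecar naming are illustrative choices.

```shell
# Write a SHA-256 sidecar next to a backup file so later corruption is detectable.
# Hypothetical helpers; both take the backup path as $1.
write_checksum() {
  # Run inside the backup's directory so the sidecar stores a relative name
  # and stays valid if the whole directory is moved.
  ( cd "$(dirname "$1")" &&
    sha256sum "$(basename "$1")" > "$(basename "$1").sha256" )
}

# Return 0 only if the file still matches its recorded checksum.
verify_checksum() {
  ( cd "$(dirname "$1")" &&
    sha256sum -c "$(basename "$1").sha256" > /dev/null )
}

# Usage:
#   write_checksum /backups/qpq-20260304.db     # right after the copy
#   verify_checksum /backups/qpq-20260304.db    # before restoring
```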

File Backend Backup

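When using store_backend=file, data is stored as bincode files under QPQ_DATA_DIR. A tar taken while the server is writing can capture files mid-update, so stop the server or archive a filesystem snapshot when a consistent copy is required.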

# Full directory backup
tar czf /backups/qpq-data-$(date +%Y%m%d-%H%M%S).tar.gz \
  -C "$(dirname "${QPQ_DATA_DIR:-data}")" \
  "$(basename "${QPQ_DATA_DIR:-data}")"

Blob Storage Backup

Blobs are stored in QPQ_DATA_DIR/blobs/. These are immutable once written.

# Incremental rsync (blobs are write-once, ideal for rsync)
rsync -av --progress data/blobs/ /backups/blobs/
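
Since blobs are write-once, a backup is complete when every file in the live directory also exists in the backup directory. The comparison can be sketched as follows; the helper name `missing_blobs` is hypothetical, and the sketch compares relative paths only, not file contents.

```shell
# List files present under the live tree but absent from the backup tree.
# Hypothetical helper: $1 = live blob directory, $2 = backup blob directory.
missing_blobs() {
  src_list=$(mktemp) || return 1
  dst_list=$(mktemp) || return 1
  # Collect relative paths from each tree in the same sort order.
  ( cd "$1" && find . -type f | sort ) > "$src_list"
  ( cd "$2" && find . -type f | sort ) > "$dst_list"
  # comm -23 prints lines that appear only in the first (live) list.
  comm -23 "$src_list" "$dst_list"
  rm -f "$src_list" "$dst_list"
}

# Usage: prints nothing when the backup holds every live blob
#   missing_blobs data/blobs /backups/blobs
```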

TLS Certificate Backup

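# Back up TLS certificates (store separately from DB backups)
# The directory holds a private key, so restrict it to the backup user
mkdir -p /backups/tls && chmod 700 /backups/tls
cp -p data/server-cert.der /backups/tls/server-cert.der
cp -p data/server-key.der /backups/tls/server-key.der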

# Federation certs (if federation is enabled)
cp data/federation-cert.der /backups/tls/federation-cert.der 2>/dev/null || true
cp data/federation-key.der /backups/tls/federation-key.der 2>/dev/null || true
cp data/federation-ca.der /backups/tls/federation-ca.der 2>/dev/null || true

Restore Procedures

Restore SQLCipher Database

# 1. Stop the server
systemctl stop qpq-server

# 2. Move the current (corrupt/lost) database aside
mv data/qpq.db data/qpq.db.broken 2>/dev/null || true
rm -f data/qpq.db-wal data/qpq.db-shm

# 3. Copy the backup in place
cp /backups/qpq-20260304.db data/qpq.db

# 4. Verify integrity
sqlite3 data/qpq.db "PRAGMA key = '${QPQ_DB_KEY}'; PRAGMA integrity_check;"

# 5. Start the server (migrations will apply automatically if needed)
systemctl start qpq-server
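
Step 2 above reuses the fixed name `qpq.db.broken`, so a second restore overwrites the copy kept from the first, and the WAL and SHM files are deleted outright. Where the damaged files should be kept for later inspection, a timestamped variant of that step could look like this sketch; the helper name `set_aside_db` and the `.broken-<timestamp>` suffix are illustrative.

```shell
# Move a damaged database and its WAL/SHM companions aside under a timestamped
# name, so repeated restores never overwrite earlier copies.
# Hypothetical helper: $1 = database path.
set_aside_db() {
  stamp=$(date +%Y%m%d-%H%M%S)
  # The companions keep their -wal/-shm suffix after the new base name,
  # so the set can still be opened together later.
  for suffix in "" -wal -shm; do
    if [ -e "$1$suffix" ]; then
      mv -- "$1$suffix" "$1.broken-$stamp$suffix" || return 1
    fi
  done
}

# Usage (with the server stopped):
#   set_aside_db data/qpq.db
```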

Restore File Backend

# 1. Stop the server
systemctl stop qpq-server

# 2. Replace the data directory
mv data data.broken 2>/dev/null || true
tar xzf /backups/qpq-data-20260304.tar.gz -C .

# 3. Restore TLS certs if not included in the data backup
cp /backups/tls/server-cert.der data/server-cert.der
cp /backups/tls/server-key.der data/server-key.der

# 4. Start the server
systemctl start qpq-server

Restore Blobs Only

rsync -av /backups/blobs/ data/blobs/

Backup Schedule Recommendations

| Frequency | What | Method |
|-----------|------|--------|
| Every 6 hours | SQLCipher database | Hot backup script via cron |
| Daily | File backend / full data dir | tar + offsite copy |
| Continuous | Blobs | rsync (incremental) |
| On change | TLS certificates | Manual + secret manager |

Cron Example

# SQLCipher hot backup every 6 hours
0 */6 * * * /opt/qpq/scripts/backup-db.sh >> /var/log/qpq-backup.log 2>&1

# Full data directory daily at 02:00
0 2 * * * tar czf /backups/qpq-data-$(date +\%Y\%m\%d).tar.gz -C /var/lib quicproquo

# Blob sync every hour
0 * * * * rsync -a /var/lib/quicproquo/blobs/ /backups/blobs/

# Prune backup files older than 30 days (weekly, Sunday 03:00)
0 3 * * 0 find /backups -type f -name 'qpq-*' -mtime +30 -delete

Verification

Always verify backups after creation:

# SQLCipher integrity check
sqlite3 /backups/qpq-latest.db \
  "PRAGMA key = '${QPQ_DB_KEY}'; PRAGMA integrity_check; SELECT count(*) FROM users;"

# File backend: check the archive is valid
tar tzf /backups/qpq-data-latest.tar.gz > /dev/null

# TLS cert: check it parses and has not expired
# (-checkend 0 makes openssl exit non-zero for an expired certificate)
openssl x509 -inform DER -in /backups/tls/server-cert.der -noout -dates -checkend 0
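
The commands above refer to `qpq-latest.db`, which the hot backup script in this document does not create. One way to resolve the newest backup is to pick it by name, as in the sketch below. The helper name `latest_backup` is hypothetical, and the sketch assumes the `qpq-YYYYMMDD-HHMMSS.db` naming from that script.

```shell
# Print the newest backup in a directory, judged by the timestamp in its name.
# Hypothetical helper: $1 = backup directory. Prints nothing if none exist.
latest_backup() {
  # Timestamps in the name sort lexically, so the last entry is the newest.
  find "$1" -maxdepth 1 -type f -name 'qpq-*.db' | sort | tail -n 1
}

# Usage: refresh a stable name that verification commands can point at
#   ln -sfn "$(latest_backup /backups/qpq)" /backups/qpq-latest.db
```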