chore: rename quicproquo → quicprochat in docs, Docker, CI, and packaging

Rename all project references from quicproquo/qpq to quicprochat/qpc
across documentation, Docker configuration, CI workflows, packaging
scripts, operational configs, and build tooling.

- Docker: crate paths, binary names, user/group, data dirs, env vars
- CI: workflow crate references, binary names, artifact names
- Docs: all markdown files under docs/, SDK READMEs, book.toml
- Packaging: OpenWrt Makefile, init script, UCI config (file renames)
- Scripts: justfile, dev-shell, screenshot, cross-compile, ai_team
- Operations: Prometheus config, alert rules, Grafana dashboard
- Config: .env.example (QPQ_* → QPC_*), CODEOWNERS paths
- Top-level: README, CONTRIBUTING, ROADMAP, CLAUDE.md
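A rename of this breadth is usually scripted rather than hand-edited. A minimal sketch of one way to do it (an assumed approach — the commit does not record its tooling; GNU `sed` and space-free paths are assumed):

```shell
#!/usr/bin/env bash
# Hypothetical rename helper (not the commit's actual tooling).
# Rewrites each identifier variant in place under the given directory.
set -euo pipefail

rename_tree() {
  local root="$1" pair old new f
  # Three independent old:new pairs; sed is case-sensitive, so the
  # QPQ_ env-var prefix is handled separately from lowercase qpq.
  for pair in "quicproquo:quicprochat" "QPQ_:QPC_" "qpq:qpc"; do
    old="${pair%%:*}" new="${pair##*:}"
    # grep exits nonzero when nothing matches; guard with || true.
    for f in $(grep -rl --exclude-dir=.git -- "$old" "$root" || true); do
      sed -i "s/${old}/${new}/g" "$f"
    done
  done
}
```

Invoked as `rename_tree .` from the repository root, this would rewrite matching text files in place; file renames (e.g. the UCI config files mentioned above) would still need a separate `git mv` pass.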
2026-03-07 18:46:43 +01:00
parent a710037dde
commit 2e081ead8e
179 changed files with 1645 additions and 1645 deletions
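The symmetric diffstat (1645 additions, 1645 deletions across 179 files) is what a pure textual rename should produce. A hedged way to confirm no stale identifier survives (a hypothetical check, not part of this commit):

```shell
#!/usr/bin/env bash
# Hypothetical post-rename check: succeed only when no case-insensitive
# "quicproquo" or "qpq" remains under the given directory.
set -euo pipefail

check_renamed() {
  ! grep -rIiq --exclude-dir=.git -e 'quicproquo' -e 'qpq' "$1"
}
```

Run as `check_renamed .` at the repo root; it exits nonzero if any old name survives. A case-insensitive `qpq` search can also flag unrelated strings, so any hits warrant manual review rather than blind replacement.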


@@ -1,15 +1,15 @@
 # Backup and Restore Procedures
-This document covers backup and restore for all quicproquo server data stores.
+This document covers backup and restore for all quicprochat server data stores.
 ## Data Inventory
 | Data | Location | Backend | Contains |
 |------|----------|---------|----------|
-| SQLCipher DB | `QPQ_DB_PATH` (default `data/qpq.db`) | `store_backend=sql` | Users, key packages, delivery queues, sessions, KT log, OPAQUE setup, blobs metadata, moderation |
-| File store | `QPQ_DATA_DIR` (default `data/`) | `store_backend=file` | Bincode-serialized key packages, delivery queues, server state |
-| Blob storage | `QPQ_DATA_DIR/blobs/` | Filesystem | Uploaded file transfer blobs |
-| TLS certificates | `QPQ_TLS_CERT`, `QPQ_TLS_KEY` | DER files | Server identity |
+| SQLCipher DB | `QPC_DB_PATH` (default `data/qpc.db`) | `store_backend=sql` | Users, key packages, delivery queues, sessions, KT log, OPAQUE setup, blobs metadata, moderation |
+| File store | `QPC_DATA_DIR` (default `data/`) | `store_backend=file` | Bincode-serialized key packages, delivery queues, server state |
+| Blob storage | `QPC_DATA_DIR/blobs/` | Filesystem | Uploaded file transfer blobs |
+| TLS certificates | `QPC_TLS_CERT`, `QPC_TLS_KEY` | DER files | Server identity |
 | OPAQUE ServerSetup | Inside DB or file store | Persisted | OPAQUE credential state (critical for auth) |
 | Server signing key | Inside DB or file store | Persisted | Ed25519 key for delivery proofs |
 | KT Merkle log | Inside DB or file store | Persisted | Key transparency audit log |
@@ -22,13 +22,13 @@ SQLCipher supports the `.backup` command while the server is running (WAL mode a
 ```bash
 # 1. Open the encrypted database with the same key
-sqlite3 data/qpq.db
+sqlite3 data/qpc.db
 # 2. At the sqlite3 prompt, set the encryption key
 PRAGMA key = 'your-db-key-here';
 # 3. Perform an online backup
-.backup /backups/qpq-$(date +%Y%m%d-%H%M%S).db
+.backup /backups/qpc-$(date +%Y%m%d-%H%M%S).db
 .quit
 ```
@@ -39,11 +39,11 @@ PRAGMA key = 'your-db-key-here';
 #!/bin/bash
 set -euo pipefail
-BACKUP_DIR="/backups/qpq"
-DB_PATH="${QPQ_DB_PATH:-data/qpq.db}"
-DB_KEY="${QPQ_DB_KEY}"
+BACKUP_DIR="/backups/qpc"
+DB_PATH="${QPC_DB_PATH:-data/qpc.db}"
+DB_KEY="${QPC_DB_KEY}"
 TIMESTAMP=$(date +%Y%m%d-%H%M%S)
-BACKUP_FILE="${BACKUP_DIR}/qpq-${TIMESTAMP}.db"
+BACKUP_FILE="${BACKUP_DIR}/qpc-${TIMESTAMP}.db"
 mkdir -p "$BACKUP_DIR"
@@ -58,40 +58,40 @@ sqlite3 "$BACKUP_FILE" "PRAGMA key = '${DB_KEY}'; PRAGMA integrity_check;" \
 || { echo "ERROR: backup verification failed"; exit 1; }
 # Retain last 7 daily backups
-find "$BACKUP_DIR" -name 'qpq-*.db' -mtime +7 -delete
+find "$BACKUP_DIR" -name 'qpc-*.db' -mtime +7 -delete
 ```
 ### Cold Backup (Offline)
 ```bash
 # 1. Stop the server
-systemctl stop qpq-server # or docker compose stop server
+systemctl stop qpc-server # or docker compose stop server
 # 2. Copy the database file
-cp data/qpq.db /backups/qpq-$(date +%Y%m%d).db
+cp data/qpc.db /backups/qpc-$(date +%Y%m%d).db
 # 3. Copy the WAL and SHM files if they exist
-cp data/qpq.db-wal /backups/ 2>/dev/null || true
-cp data/qpq.db-shm /backups/ 2>/dev/null || true
+cp data/qpc.db-wal /backups/ 2>/dev/null || true
+cp data/qpc.db-shm /backups/ 2>/dev/null || true
 # 4. Restart the server
-systemctl start qpq-server
+systemctl start qpc-server
 ```
 ## File Backend Backup
-When using `store_backend=file`, data is stored as bincode files under `QPQ_DATA_DIR`.
+When using `store_backend=file`, data is stored as bincode files under `QPC_DATA_DIR`.
 ```bash
 # Full directory backup
-tar czf /backups/qpq-data-$(date +%Y%m%d-%H%M%S).tar.gz \
--C "$(dirname "${QPQ_DATA_DIR:-data}")" \
-"$(basename "${QPQ_DATA_DIR:-data}")"
+tar czf /backups/qpc-data-$(date +%Y%m%d-%H%M%S).tar.gz \
+-C "$(dirname "${QPC_DATA_DIR:-data}")" \
+"$(basename "${QPC_DATA_DIR:-data}")"
 ```
 ## Blob Storage Backup
-Blobs are stored in `QPQ_DATA_DIR/blobs/`. These are immutable once written.
+Blobs are stored in `QPC_DATA_DIR/blobs/`. These are immutable once written.
 ```bash
 # Incremental rsync (blobs are write-once, ideal for rsync)
@@ -117,38 +117,38 @@ cp data/federation-ca.der /backups/tls/federation-ca.der 2>/dev/null || true
 ```bash
 # 1. Stop the server
-systemctl stop qpq-server
+systemctl stop qpc-server
 # 2. Move the current (corrupt/lost) database aside
-mv data/qpq.db data/qpq.db.broken 2>/dev/null || true
-rm -f data/qpq.db-wal data/qpq.db-shm
+mv data/qpc.db data/qpc.db.broken 2>/dev/null || true
+rm -f data/qpc.db-wal data/qpc.db-shm
 # 3. Copy the backup in place
-cp /backups/qpq-20260304.db data/qpq.db
+cp /backups/qpc-20260304.db data/qpc.db
 # 4. Verify integrity
-sqlite3 data/qpq.db "PRAGMA key = '${QPQ_DB_KEY}'; PRAGMA integrity_check;"
+sqlite3 data/qpc.db "PRAGMA key = '${QPC_DB_KEY}'; PRAGMA integrity_check;"
 # 5. Start the server (migrations will apply automatically if needed)
-systemctl start qpq-server
+systemctl start qpc-server
 ```
 ### Restore File Backend
 ```bash
 # 1. Stop the server
-systemctl stop qpq-server
+systemctl stop qpc-server
 # 2. Replace the data directory
 mv data data.broken 2>/dev/null || true
-tar xzf /backups/qpq-data-20260304.tar.gz -C .
+tar xzf /backups/qpc-data-20260304.tar.gz -C .
 # 3. Restore TLS certs if not included in the data backup
 cp /backups/tls/server-cert.der data/server-cert.der
 cp /backups/tls/server-key.der data/server-key.der
 # 4. Start the server
-systemctl start qpq-server
+systemctl start qpc-server
 ```
 ### Restore Blobs Only
@@ -170,16 +170,16 @@ rsync -av /backups/blobs/ data/blobs/
 ```cron
 # SQLCipher hot backup every 6 hours
-0 */6 * * * /opt/qpq/scripts/backup-db.sh >> /var/log/qpq-backup.log 2>&1
+0 */6 * * * /opt/qpc/scripts/backup-db.sh >> /var/log/qpc-backup.log 2>&1
 # Full data directory daily at 02:00
-0 2 * * * tar czf /backups/qpq-data-$(date +\%Y\%m\%d).tar.gz -C /var/lib quicproquo
+0 2 * * * tar czf /backups/qpc-data-$(date +\%Y\%m\%d).tar.gz -C /var/lib quicprochat
 # Blob sync every hour
-0 * * * * rsync -a /var/lib/quicproquo/blobs/ /backups/blobs/
+0 * * * * rsync -a /var/lib/quicprochat/blobs/ /backups/blobs/
 # Prune backups older than 30 days
-0 3 * * 0 find /backups -name 'qpq-*' -mtime +30 -delete
+0 3 * * 0 find /backups -name 'qpc-*' -mtime +30 -delete
 ```
 ## Verification
@@ -188,11 +188,11 @@ Always verify backups after creation:
 ```bash
 # SQLCipher integrity check
-sqlite3 /backups/qpq-latest.db \
-"PRAGMA key = '${QPQ_DB_KEY}'; PRAGMA integrity_check; SELECT count(*) FROM users;"
+sqlite3 /backups/qpc-latest.db \
+"PRAGMA key = '${QPC_DB_KEY}'; PRAGMA integrity_check; SELECT count(*) FROM users;"
 # File backend: check the archive is valid
-tar tzf /backups/qpq-data-latest.tar.gz > /dev/null
+tar tzf /backups/qpc-data-latest.tar.gz > /dev/null
 # TLS cert: check it parses and is not expired
 openssl x509 -inform DER -in /backups/tls/server-cert.der -noout -dates


@@ -42,10 +42,10 @@
 }
 ],
 "id": null,
-"uid": "qpq-overview",
-"title": "quicproquo Server Overview",
-"description": "Operational dashboard for quicproquo server instances",
-"tags": ["quicproquo", "qpq"],
+"uid": "qpc-overview",
+"title": "quicprochat Server Overview",
+"description": "Operational dashboard for quicprochat server instances",
+"tags": ["quicprochat", "qpc"],
 "timezone": "browser",
 "editable": true,
 "graphTooltip": 1,
@@ -61,7 +61,7 @@
 "gridPos": { "h": 4, "w": 4, "x": 0, "y": 0 },
 "targets": [
 {
-"expr": "up{job=\"qpq-server\"}",
+"expr": "up{job=\"qpc-server\"}",
 "legendFormat": "{{instance}}"
 }
 ],


@@ -3,7 +3,7 @@ apiVersion: 1
 providers:
 - name: 'default'
 orgId: 1
-folder: 'quicproquo'
+folder: 'quicprochat'
 type: file
 disableDeletion: false
 editable: true


@@ -1,6 +1,6 @@
 # Incident Response Playbook
-This document provides procedures for responding to common operational incidents in a quicproquo deployment.
+This document provides procedures for responding to common operational incidents in a quicprochat deployment.
 ## Severity Levels
@@ -20,7 +20,7 @@ This document provides procedures for responding to common operational incidents
 ```bash
 # Check server logs
-journalctl -u qpq-server --since "10 min ago" --no-pager
+journalctl -u qpc-server --since "10 min ago" --no-pager
 # Docker
 docker compose logs --tail=50 server
@@ -39,17 +39,17 @@ ls -la data/server-cert.der data/server-key.der
 **Missing auth token (production mode)**
 ```bash
-# Production requires QPQ_AUTH_TOKEN >= 16 chars, not "devtoken"
-echo $QPQ_AUTH_TOKEN | wc -c
+# Production requires QPC_AUTH_TOKEN >= 16 chars, not "devtoken"
+echo $QPC_AUTH_TOKEN | wc -c
 ```
 **Database locked or corrupt**
 ```bash
 # Check if another process holds the database
-fuser data/qpq.db
+fuser data/qpc.db
 # Verify database integrity
-sqlite3 data/qpq.db "PRAGMA key='${QPQ_DB_KEY}'; PRAGMA integrity_check;"
+sqlite3 data/qpc.db "PRAGMA key='${QPC_DB_KEY}'; PRAGMA integrity_check;"
 ```
 **Port already in use**
@@ -69,12 +69,12 @@ ss -tlnp | grep 7000
 ```bash
 # 1. Check if the process is running
-systemctl status qpq-server
+systemctl status qpc-server
 # or: docker compose ps
 # 2. Check resource usage
-top -bn1 | grep qpq
-df -h /var/lib/quicproquo
+top -bn1 | grep qpc
+df -h /var/lib/quicprochat
 free -h
 # 3. Check QUIC port is reachable
@@ -90,10 +90,10 @@ journalctl -k | grep -i oom
 ```bash
 # Restart the service
-systemctl restart qpq-server
+systemctl restart qpc-server
 # If OOM: increase memory limit
-systemctl edit qpq-server --force
+systemctl edit qpc-server --force
 # MemoryMax=2G
 # If disk full: see "Storage Full" incident below
@@ -110,15 +110,15 @@ systemctl edit qpq-server --force
 ```bash
 # Check disk usage
-df -h /var/lib/quicproquo
-du -sh /var/lib/quicproquo/*
+df -h /var/lib/quicprochat
+du -sh /var/lib/quicprochat/*
 # Check largest files
-du -a /var/lib/quicproquo | sort -rn | head -20
+du -a /var/lib/quicprochat | sort -rn | head -20
 # Check blob storage specifically
-du -sh /var/lib/quicproquo/blobs/
-find /var/lib/quicproquo/blobs/ -type f | wc -l
+du -sh /var/lib/quicprochat/blobs/
+find /var/lib/quicprochat/blobs/ -type f | wc -l
 ```
 ### Recovery
@@ -128,8 +128,8 @@ find /var/lib/quicproquo/blobs/ -type f | wc -l
 # but if it's behind, you can trigger manual cleanup)
 # For SQL backend: delete expired delivery messages
-sqlite3 data/qpq.db <<'EOF'
-PRAGMA key = '${QPQ_DB_KEY}';
+sqlite3 data/qpc.db <<'EOF'
+PRAGMA key = '${QPC_DB_KEY}';
 DELETE FROM delivery_queue WHERE expires_at IS NOT NULL AND expires_at < unixepoch();
 VACUUM;
 EOF
@@ -142,10 +142,10 @@ EOF
 # Then: resize2fs /dev/xvdf
 # 4. Move to a larger disk
-systemctl stop qpq-server
-rsync -av /var/lib/quicproquo/ /mnt/new-volume/quicproquo/
-# Update QPQ_DATA_DIR and QPQ_DB_PATH to point to the new location
-systemctl start qpq-server
+systemctl stop qpc-server
+rsync -av /var/lib/quicprochat/ /mnt/new-volume/quicprochat/
+# Update QPC_DATA_DIR and QPC_DB_PATH to point to the new location
+systemctl start qpc-server
 ```
 ### Prevention
@@ -188,10 +188,10 @@ iptables -A INPUT -s <attacker-ip> -j DROP
 # (Cloudflare Spectrum, AWS Shield, etc.)
 # 4. If the server is overwhelmed, restart to clear state
-systemctl restart qpq-server
+systemctl restart qpc-server
 # 5. Enable log redaction to reduce I/O pressure during attacks
-# Set QPQ_REDACT_LOGS=true
+# Set QPC_REDACT_LOGS=true
 ```
 ## Incident: Key Compromise
@@ -210,7 +210,7 @@ NEW_TOKEN=$(openssl rand -base64 32)
 # 3. Notify all legitimate clients of the new token
 # 4. Review logs for unauthorized access
-journalctl -u qpq-server | grep "auth_login_success" | tail -100
+journalctl -u qpc-server | grep "auth_login_success" | tail -100
 ```
 ### TLS Private Key Compromised
@@ -225,7 +225,7 @@ journalctl -u qpq-server | grep "auth_login_success" | tail -100
 # (procedure depends on your CA)
 # 3. Restart the server with the new certificate
-systemctl restart qpq-server
+systemctl restart qpc-server
 # 4. If clients pin certificates, notify them of the change
 ```
@@ -236,7 +236,7 @@ systemctl restart qpq-server
 ```bash
 # 1. Stop the server
-systemctl stop qpq-server
+systemctl stop qpc-server
 # 2. Rekey the database immediately
 # See: key-rotation.md "Database Encryption Key Rotation"
@@ -278,8 +278,8 @@ top -bn1 | head -20
 iostat -x 1 3
 # 2. Check database performance
-sqlite3 data/qpq.db <<'EOF'
-PRAGMA key = '${QPQ_DB_KEY}';
+sqlite3 data/qpc.db <<'EOF'
+PRAGMA key = '${QPC_DB_KEY}';
 PRAGMA integrity_check;
 PRAGMA wal_checkpoint(PASSIVE);
 -- Check table sizes
@@ -296,10 +296,10 @@ curl -s http://localhost:9090/metrics | grep delivery_queue_depth
 ```bash
 # 1. Checkpoint the WAL (reduces WAL file size)
-sqlite3 data/qpq.db "PRAGMA key='${QPQ_DB_KEY}'; PRAGMA wal_checkpoint(TRUNCATE);"
+sqlite3 data/qpc.db "PRAGMA key='${QPC_DB_KEY}'; PRAGMA wal_checkpoint(TRUNCATE);"
 # 2. VACUUM to reclaim space and defragment
-sqlite3 data/qpq.db "PRAGMA key='${QPQ_DB_KEY}'; VACUUM;"
+sqlite3 data/qpc.db "PRAGMA key='${QPC_DB_KEY}'; VACUUM;"
 # 3. If the queue is huge, check for clients not fetching
 # (delivery_queue rows accumulate when clients are offline)
@@ -323,7 +323,7 @@ openssl x509 -inform DER -in data/server-cert.der -noout -enddate
 # See: key-rotation.md "TLS Certificate Rotation"
 # 3. Verify the new certificate is loaded
-journalctl -u qpq-server --since "1 min ago" | grep -i cert
+journalctl -u qpc-server --since "1 min ago" | grep -i cert
 ```
 ## Post-Incident Checklist


@@ -1,10 +1,10 @@
 # Key Rotation Procedures
-This document provides step-by-step procedures for rotating all cryptographic material in a quicproquo deployment.
+This document provides step-by-step procedures for rotating all cryptographic material in a quicprochat deployment.
 ## Auth Token Rotation
-The auth token (`QPQ_AUTH_TOKEN`) is used for bearer-token authentication (auth version 1). OPAQUE-authenticated sessions are not affected by token rotation.
+The auth token (`QPC_AUTH_TOKEN`) is used for bearer-token authentication (auth version 1). OPAQUE-authenticated sessions are not affected by token rotation.
 ### Procedure
@@ -15,22 +15,22 @@ echo "New token: $NEW_TOKEN"
 # 2. Update the config file or environment
 # Option A: TOML config file
-sed -i "s/^auth_token = .*/auth_token = \"$NEW_TOKEN\"/" qpq-server.toml
+sed -i "s/^auth_token = .*/auth_token = \"$NEW_TOKEN\"/" qpc-server.toml
 # Option B: Environment variable (systemd)
-systemctl edit qpq-server --force
-# Add: Environment=QPQ_AUTH_TOKEN=<new-token>
+systemctl edit qpc-server --force
+# Add: Environment=QPC_AUTH_TOKEN=<new-token>
 # Option C: Docker Compose
-# Update QPQ_AUTH_TOKEN in docker-compose.prod.yml or .env file
+# Update QPC_AUTH_TOKEN in docker-compose.prod.yml or .env file
 # 3. Restart the server
-systemctl restart qpq-server
+systemctl restart qpc-server
 # or: docker compose restart server
 # 4. Update all clients with the new token
 # Clients using OPAQUE auth are unaffected.
-# Clients using bearer-token auth must update their QPQ_ACCESS_TOKEN.
+# Clients using bearer-token auth must update their QPC_ACCESS_TOKEN.
 ```
 ### Impact
@@ -49,7 +49,7 @@ The server uses DER-encoded X.509 certificates for QUIC TLS 1.3. The server vali
 # 1. Obtain a new certificate (example with Let's Encrypt / certbot)
 certbot certonly --standalone -d chat.example.com
-# 2. Convert PEM to DER format (qpq-server expects DER)
+# 2. Convert PEM to DER format (qpc-server expects DER)
 openssl x509 -in /etc/letsencrypt/live/chat.example.com/fullchain.pem \
 -outform DER -out /tmp/server-cert.der
@@ -71,10 +71,10 @@ cp /tmp/server-key.der data/server-key.der
 openssl x509 -inform DER -in data/server-cert.der -noout -text | head -20
 # 7. Restart the server (QUIC requires restart for new TLS config)
-systemctl restart qpq-server
+systemctl restart qpc-server
 # 8. Verify the server started with the new certificate
-journalctl -u qpq-server --since "1 min ago" | grep -i tls
+journalctl -u qpc-server --since "1 min ago" | grep -i tls
 ```
 ### Self-Signed Certificate (Development)
@@ -83,7 +83,7 @@ In non-production mode, the server auto-generates a self-signed certificate if n
 ```bash
 rm data/server-cert.der data/server-key.der
-systemctl restart qpq-server
+systemctl restart qpc-server
 # Server will generate a new self-signed cert for localhost/127.0.0.1/::1
 ```
@@ -91,26 +91,26 @@ systemctl restart qpq-server
 ```bash
 #!/bin/bash
-# /opt/qpq/scripts/renew-cert.sh
+# /opt/qpc/scripts/renew-cert.sh
 set -euo pipefail
 DOMAIN="chat.example.com"
 CERT_DIR="/etc/letsencrypt/live/$DOMAIN"
-QPQ_DATA="/var/lib/quicproquo"
+QPC_DATA="/var/lib/quicprochat"
 certbot renew --quiet
-openssl x509 -in "$CERT_DIR/fullchain.pem" -outform DER -out "$QPQ_DATA/server-cert.der"
-openssl pkey -in "$CERT_DIR/privkey.pem" -outform DER -out "$QPQ_DATA/server-key.der"
-chmod 600 "$QPQ_DATA/server-key.der"
-chown qpq:qpq "$QPQ_DATA/server-cert.der" "$QPQ_DATA/server-key.der"
+openssl x509 -in "$CERT_DIR/fullchain.pem" -outform DER -out "$QPC_DATA/server-cert.der"
+openssl pkey -in "$CERT_DIR/privkey.pem" -outform DER -out "$QPC_DATA/server-key.der"
+chmod 600 "$QPC_DATA/server-key.der"
+chown qpc:qpc "$QPC_DATA/server-cert.der" "$QPC_DATA/server-key.der"
-systemctl restart qpq-server
+systemctl restart qpc-server
 ```
 ```cron
 # Run cert renewal check twice daily
-0 3,15 * * * /opt/qpq/scripts/renew-cert.sh >> /var/log/qpq-cert-renew.log 2>&1
+0 3,15 * * * /opt/qpc/scripts/renew-cert.sh >> /var/log/qpc-cert-renew.log 2>&1
 ```
 ## Federation Certificate Rotation
@@ -134,43 +134,43 @@ openssl pkey -in /tmp/federation-key.pem -outform DER -out data/federation-key.d
 chmod 600 data/federation-key.der
 # 3. Restart the server
-systemctl restart qpq-server
+systemctl restart qpc-server
 # 4. Coordinate with federation peers: they must trust the same CA
 ```
 ## Database Encryption Key Rotation
-The SQLCipher database key (`QPQ_DB_KEY`) encrypts all data at rest.
+The SQLCipher database key (`QPC_DB_KEY`) encrypts all data at rest.
 ### Procedure (SQLCipher PRAGMA rekey)
 ```bash
 # 1. Stop the server
-systemctl stop qpq-server
+systemctl stop qpc-server
 # 2. Back up the database
-cp data/qpq.db /backups/qpq-pre-rekey-$(date +%Y%m%d).db
+cp data/qpc.db /backups/qpc-pre-rekey-$(date +%Y%m%d).db
 # 3. Rekey the database
-sqlite3 data/qpq.db <<EOF
+sqlite3 data/qpc.db <<EOF
 PRAGMA key = 'old-encryption-key';
 PRAGMA rekey = 'new-encryption-key';
 EOF
 # 4. Verify the database opens with the new key
-sqlite3 data/qpq.db "PRAGMA key = 'new-encryption-key'; PRAGMA integrity_check;"
+sqlite3 data/qpc.db "PRAGMA key = 'new-encryption-key'; PRAGMA integrity_check;"
 # 5. Update the environment/config with the new key
 # Option A: systemd
-systemctl edit qpq-server --force
-# Environment=QPQ_DB_KEY=new-encryption-key
+systemctl edit qpc-server --force
+# Environment=QPC_DB_KEY=new-encryption-key
 # Option B: Docker Compose .env
-echo "QPQ_DB_KEY=new-encryption-key" >> .env
+echo "QPC_DB_KEY=new-encryption-key" >> .env
 # 6. Start the server
-systemctl start qpq-server
+systemctl start qpc-server
 ```
 ### Full Re-encryption (Alternative)
@@ -179,18 +179,18 @@ If `PRAGMA rekey` is unavailable or you want a fresh database file:
 ```bash
 # 1. Stop the server and back up
-systemctl stop qpq-server
-cp data/qpq.db /backups/qpq-pre-rekey.db
+systemctl stop qpc-server
+cp data/qpc.db /backups/qpc-pre-rekey.db
 # 2. Export with old key, import with new key
-sqlite3 data/qpq.db "PRAGMA key='old-key'; .dump" | \
-sqlite3 data/qpq-new.db "PRAGMA key='new-key'; .read /dev/stdin"
+sqlite3 data/qpc.db "PRAGMA key='old-key'; .dump" | \
+sqlite3 data/qpc-new.db "PRAGMA key='new-key'; .read /dev/stdin"
 # 3. Replace the database
-mv data/qpq-new.db data/qpq.db
+mv data/qpc-new.db data/qpc.db
 # 4. Update config and restart
-systemctl start qpq-server
+systemctl start qpc-server
 ```
 ## OPAQUE ServerSetup Rotation
@@ -201,20 +201,20 @@ The OPAQUE ServerSetup is generated once and persisted. Rotating it invalidates
 ```bash
 # 1. Stop the server
-systemctl stop qpq-server
+systemctl stop qpc-server
 # 2. Back up the database
-cp data/qpq.db /backups/qpq-pre-opaque-rotate.db
+cp data/qpc.db /backups/qpc-pre-opaque-rotate.db
 # 3. Delete the persisted OPAQUE setup
 # For SQL backend:
-sqlite3 data/qpq.db "PRAGMA key='${QPQ_DB_KEY}'; DELETE FROM server_state WHERE key = 'opaque_setup';"
+sqlite3 data/qpc.db "PRAGMA key='${QPC_DB_KEY}'; DELETE FROM server_state WHERE key = 'opaque_setup';"
 # For file backend:
 rm data/opaque_setup.bin 2>/dev/null || true
 # 4. Start the server (it will generate a new OPAQUE ServerSetup)
-systemctl start qpq-server
+systemctl start qpc-server
 # 5. All users must re-register (existing OPAQUE credentials are invalid)
 ```
@@ -225,17 +225,17 @@ The Ed25519 signing key is used for delivery proofs. Rotating it means old deliv
 ```bash
 # 1. Stop the server
-systemctl stop qpq-server
+systemctl stop qpc-server
 # 2. Back up
-cp data/qpq.db /backups/qpq-pre-sigkey-rotate.db
+cp data/qpc.db /backups/qpc-pre-sigkey-rotate.db
 # 3. Delete the persisted signing key seed
 # For SQL backend:
-sqlite3 data/qpq.db "PRAGMA key='${QPQ_DB_KEY}'; DELETE FROM server_state WHERE key = 'signing_key_seed';"
+sqlite3 data/qpc.db "PRAGMA key='${QPC_DB_KEY}'; DELETE FROM server_state WHERE key = 'signing_key_seed';"
 # 4. Start the server (generates a new Ed25519 signing key)
-systemctl start qpq-server
+systemctl start qpc-server
 ```
 ## Rotation Schedule


@@ -1,6 +1,6 @@
 # Monitoring Guide
-This document covers metrics collection, alerting, and dashboards for quicproquo.
+This document covers metrics collection, alerting, and dashboards for quicprochat.
 ## Enabling Metrics
@@ -8,10 +8,10 @@ The server exports Prometheus metrics via HTTP when configured:
 ```bash
 # Environment variables
-QPQ_METRICS_LISTEN=0.0.0.0:9090
-QPQ_METRICS_ENABLED=true
+QPC_METRICS_LISTEN=0.0.0.0:9090
+QPC_METRICS_ENABLED=true
-# Or in qpq-server.toml
+# Or in qpc-server.toml
 metrics_listen = "0.0.0.0:9090"
 metrics_enabled = true
 ```
@@ -48,9 +48,9 @@ global:
 evaluation_interval: 15s
 scrape_configs:
-- job_name: 'qpq-server'
+- job_name: 'qpc-server'
 static_targets:
-- targets: ['qpq-server:9090']
+- targets: ['qpc-server:9090']
 scrape_interval: 10s
 ```
@@ -59,17 +59,17 @@ scrape_configs:
 ```yaml
 # prometheus-alerts.yml
 groups:
-- name: qpq-server
+- name: qpc-server
 rules:
 # Server down
 - alert: QpqServerDown
-expr: up{job="qpq-server"} == 0
+expr: up{job="qpc-server"} == 0
 for: 1m
 labels:
 severity: critical
 annotations:
-summary: "qpq-server is down"
-description: "Prometheus cannot scrape qpq-server metrics for > 1 minute."
+summary: "qpc-server is down"
+description: "Prometheus cannot scrape qpc-server metrics for > 1 minute."
 # High auth failure rate (potential brute force)
 - alert: QpqHighAuthFailureRate
@@ -136,7 +136,7 @@ groups:
 ## Key Dashboard Panels
-See `dashboards/qpq-overview.json` for the full Grafana dashboard. Key panels:
+See `dashboards/qpc-overview.json` for the full Grafana dashboard. Key panels:
 ### Message Throughput
 - **Enqueue rate**: `rate(enqueue_total[5m])`
@@ -160,10 +160,10 @@ See `dashboards/qpq-overview.json` for the full Grafana dashboard. Key panels:
 ## Grafana Dashboard
-Import the dashboard from `dashboards/qpq-overview.json`:
+Import the dashboard from `dashboards/qpc-overview.json`:
 1. Open Grafana -> Dashboards -> Import
-2. Upload `docs/operations/dashboards/qpq-overview.json`
+2. Upload `docs/operations/dashboards/qpc-overview.json`
 3. Select your Prometheus data source
 4. Save
@@ -176,7 +176,7 @@ The server uses `tracing` with `RUST_LOG` environment variable:
 RUST_LOG=info
 # Debug specific modules
-RUST_LOG=info,quicproquo_server::node_service=debug
+RUST_LOG=info,quicprochat_server::node_service=debug
 # Verbose debugging
 RUST_LOG=debug
@@ -189,7 +189,7 @@ RUST_LOG=debug
 | `"TLS certificate expires within 30 days"` | Cert expiring soon | Rotate certificate |
 | `"TLS certificate is self-signed"` | Self-signed cert in use | Replace with CA-signed cert in production |
 | `"connection rate limit exceeded"` | IP being rate limited | Check for DDoS |
-| `"running without QPQ_AUTH_TOKEN"` | Insecure mode | Must not appear in production |
+| `"running without QPC_AUTH_TOKEN"` | Insecure mode | Must not appear in production |
 | `"db_key is empty; SQL store will be plaintext"` | Unencrypted DB | Must not appear in production |
 | `"shutdown signal received"` | Graceful shutdown started | Expected during deploys |
 | `"generated and persisted new OPAQUE ServerSetup"` | Fresh OPAQUE setup | Expected on first start only |
@@ -200,13 +200,13 @@ For production, pipe logs to a log aggregator:
 ```bash
 # Systemd -> journald -> Loki/Elasticsearch
-journalctl -u qpq-server -f --output=json | \
+journalctl -u qpc-server -f --output=json | \
 promtail --stdin --client.url=http://loki:3100/loki/api/v1/push
 # Docker -> Loki driver
 docker run --log-driver=loki \
 --log-opt loki-url="http://loki:3100/loki/api/v1/push" \
-qpq-server
+qpc-server
 ```
 ## Health Checking
@@ -221,5 +221,5 @@ ss -ulnp | grep 7000
 curl -sf http://localhost:9090/metrics > /dev/null
 # Full client connection test
-qpq-client --server 127.0.0.1:7000 --auth-token "$TOKEN" --ping
+qpc-client --server 127.0.0.1:7000 --auth-token "$TOKEN" --ping
 ```


@@ -1,16 +1,16 @@
 groups:
-- name: qpq-server
+- name: qpc-server
 rules:
-- alert: QpqServerDown
-expr: up{job="qpq-server"} == 0
+- alert: QpcServerDown
+expr: up{job="qpc-server"} == 0
 for: 1m
 labels:
 severity: critical
 annotations:
-summary: "qpq-server is down"
-description: "Prometheus cannot scrape qpq-server metrics for > 1 minute."
+summary: "qpc-server is down"
+description: "Prometheus cannot scrape qpc-server metrics for > 1 minute."
-- alert: QpqHighAuthFailureRate
+- alert: QpcHighAuthFailureRate
 expr: rate(auth_login_failure_total[5m]) > 10
 for: 2m
 labels:
@@ -19,7 +19,7 @@ groups:
 summary: "High authentication failure rate"
 description: "{{ $value | printf \"%.1f\" }} auth failures/sec over 5 minutes."
-- alert: QpqRateLimitActive
+- alert: QpcRateLimitActive
 expr: rate(rate_limit_hit_total[5m]) > 5
 for: 5m
 labels:
@@ -27,7 +27,7 @@ groups:
 annotations:
 summary: "Rate limiting is actively rejecting requests"
-- alert: QpqDeliveryQueueHigh
+- alert: QpcDeliveryQueueHigh
 expr: delivery_queue_depth > 10000
 for: 10m
 labels:
@@ -35,7 +35,7 @@ groups:
 annotations:
 summary: "Delivery queue depth is high ({{ $value }})"
-- alert: QpqDeliveryQueueCritical
+- alert: QpcDeliveryQueueCritical
 expr: delivery_queue_depth > 100000
 for: 5m
 labels:
@@ -43,7 +43,7 @@ groups:
 annotations:
 summary: "Delivery queue depth is critical ({{ $value }})"
-- alert: QpqLowAuthSuccessRatio
+- alert: QpcLowAuthSuccessRatio
 expr: >
 rate(auth_login_success_total[5m])
 / (rate(auth_login_success_total[5m]) + rate(auth_login_failure_total[5m]))


@@ -6,7 +6,7 @@ rule_files:
 - "alerts.yml"
 scrape_configs:
-- job_name: 'qpq-server'
+- job_name: 'qpc-server'
 static_configs:
 - targets: ['server:9090']
 scrape_interval: 10s


@@ -1,10 +1,10 @@
 # Scaling Guide
-This document covers resource sizing, scaling triggers, and capacity planning for quicproquo deployments.
+This document covers resource sizing, scaling triggers, and capacity planning for quicprochat deployments.
 ## Architecture Overview
-quicproquo runs as a single-process server handling QUIC connections. Key resource consumers:
+quicprochat runs as a single-process server handling QUIC connections. Key resource consumers:
 - **CPU**: TLS 1.3 handshakes (QUIC), OPAQUE PAKE authentication, message routing
 - **Memory**: In-memory session state (DashMap), QUIC connection state, delivery waiters, rate limit entries
@@ -70,7 +70,7 @@ The server is async (Tokio) and benefits from multiple cores. QUIC TLS handshake
 ```bash
 # Check current CPU usage
-top -bn1 -p $(pgrep qpq-server)
+top -bn1 -p $(pgrep qpc-server)
 # For Docker: increase CPU limits
 # docker-compose.prod.yml:
@@ -107,22 +107,22 @@ iostat -x 1 5
 # Move to NVMe if on spinning disk
 # Increase WAL autocheckpoint threshold for burst writes
-sqlite3 data/qpq.db "PRAGMA key='${QPQ_DB_KEY}'; PRAGMA wal_autocheckpoint=2000;"
+sqlite3 data/qpc.db "PRAGMA key='${QPC_DB_KEY}'; PRAGMA wal_autocheckpoint=2000;"
 ```
 ## Horizontal Scaling
-quicproquo does not yet have built-in multi-node clustering. For horizontal scaling, use these patterns:
+quicprochat does not yet have built-in multi-node clustering. For horizontal scaling, use these patterns:
 ### Load Balancer (UDP/QUIC)
-Place a UDP load balancer in front of multiple qpq-server instances. Each instance runs independently with its own database.
+Place a UDP load balancer in front of multiple qpc-server instances. Each instance runs independently with its own database.
 ```
 +-----------+
-clients ------> | L4 LB | ----> qpq-server-1 (db-1)
-| (UDP/QUIC)| ----> qpq-server-2 (db-2)
-+-----------+ qpq-server-3 (db-3)
+clients ------> | L4 LB | ----> qpc-server-1 (db-1)
+| (UDP/QUIC)| ----> qpc-server-2 (db-2)
++-----------+ qpc-server-3 (db-3)
 ```
 **Requirements:**
@@ -134,7 +134,7 @@ Place a UDP load balancer in front of multiple qpq-server instances. Each instan
 Enable federation to relay messages between nodes:
 ```toml
-# qpq-server.toml on node-1
+# qpc-server.toml on node-1
 [federation]
 enabled = true
 domain = "node1.chat.example.com"
@@ -153,9 +153,9 @@ address = "10.0.1.2:7001"
 For true horizontal scaling, migrate from SQLCipher to a shared PostgreSQL instance. This is not yet implemented but is the planned approach for multi-node deployments.
 ```
-qpq-server-1 --\
-qpq-server-2 ---+--> PostgreSQL (shared)
-qpq-server-3 --/
+qpc-server-1 --\
+qpc-server-2 ---+--> PostgreSQL (shared)
+qpc-server-3 --/
 ```
 ## Connection Tuning
@@ -174,7 +174,7 @@ For high connection counts, consider:
 - Increasing UDP buffer sizes:
 ```bash
-# /etc/sysctl.d/99-qpq.conf
+# /etc/sysctl.d/99-qpc.conf
 net.core.rmem_max = 26214400
 net.core.wmem_max = 26214400
 net.core.rmem_default = 1048576
@@ -182,7 +182,7 @@ net.core.wmem_default = 1048576
 ```
 ```bash
-sysctl -p /etc/sysctl.d/99-qpq.conf
+sysctl -p /etc/sysctl.d/99-qpc.conf
 ```
 ## Docker Resource Limits
@@ -211,11 +211,11 @@ Use the included test infrastructure to benchmark:
 ```bash
 # Build the test client
-cargo build --release --bin qpq-client
+cargo build --release --bin qpc-client
 # Run concurrent connection test (example)
 for i in $(seq 1 100); do
-qpq-client --server 127.0.0.1:7000 --auth-token "$QPQ_AUTH_TOKEN" &
+qpc-client --server 127.0.0.1:7000 --auth-token "$QPC_AUTH_TOKEN" &
 done
 wait