feat: add post-quantum hybrid KEM + SQLCipher persistence

Feature 1 — Post-Quantum Hybrid KEM (X25519 + ML-KEM-768):
- Create hybrid_kem.rs with keygen, encrypt, decrypt + 11 unit tests
- Wire format: version(1) | x25519_eph_pk(32) | mlkem_ct(1088) | nonce(12) | ct
- Add uploadHybridKey/fetchHybridKey RPCs to node.capnp schema
- Server: hybrid key storage in FileBackedStore + RPC handlers
- Client: hybrid keypair in StoredState, auto-wrap/unwrap in send/recv/invite/join
- demo-group runs full hybrid PQ envelope round-trip

Feature 2 — SQLCipher Persistence:
- Extract Store trait from FileBackedStore API
- Create SqlStore (rusqlite + bundled-sqlcipher) with encrypted-at-rest SQLite
- Schema: key_packages, deliveries, hybrid_keys tables with indexes
- Server CLI: --store-backend=sql, --db-path, --db-key flags
- 5 unit tests for SqlStore (FIFO, round-trip, upsert, channel isolation)

Also includes: client lib.rs refactor, auth config, TOML config file support,
mdBook documentation, and assorted user-authored cleanups.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
commit f334ed3d43 (parent d1ddef4cea), 2026-02-22 08:07:48 +01:00
81 changed files with 14502 additions and 2289 deletions


# Storage Backend
quicnprotochat uses two storage backends: `FileBackedStore` on the server side
for KeyPackages and delivery queues, and `DiskKeyStore` on the client side for
MLS cryptographic key material. Both follow the same pattern: in-memory data
structures backed by optional file persistence, with full serialization on every
write.

**Sources:**
- `crates/quicnprotochat-server/src/storage.rs` (FileBackedStore)
- `crates/quicnprotochat-core/src/keystore.rs` (DiskKeyStore, StoreCrypto)
---
## FileBackedStore (Server-Side)
`FileBackedStore` provides persistent storage for the server's three data
domains: KeyPackages, delivery queues, and hybrid public keys.
### Structure
```rust
pub struct FileBackedStore {
    kp_path: PathBuf, // keypackages.bin
    ds_path: PathBuf, // deliveries.bin
    hk_path: PathBuf, // hybridkeys.bin
    key_packages: Mutex<HashMap<Vec<u8>, VecDeque<Vec<u8>>>>,  // identity -> KP queue
    deliveries: Mutex<HashMap<ChannelKey, VecDeque<Vec<u8>>>>, // (channel, recipient) -> msg queue
    hybrid_keys: Mutex<HashMap<Vec<u8>, Vec<u8>>>,             // identity -> hybrid PK
}
```
Each domain has its own `Mutex`-protected in-memory map and its own disk file.
The `Mutex` (not `RwLock`) is used because every read-path operation that
modifies state (e.g., `pop_front` in `fetch_key_package`) requires exclusive
access.
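
This pop-on-read behavior can be sketched with a minimal, std-only stand-in for the KeyPackage domain (illustrative only -- no persistence, no error handling):

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Mutex;

// Minimal stand-in for the KeyPackage domain: identity -> FIFO queue.
struct KpStore {
    key_packages: Mutex<HashMap<Vec<u8>, VecDeque<Vec<u8>>>>,
}

impl KpStore {
    fn upload(&self, identity: &[u8], package: Vec<u8>) {
        let mut map = self.key_packages.lock().unwrap();
        map.entry(identity.to_vec()).or_default().push_back(package);
    }

    // The "read" path mutates: it pops the front entry, so it needs
    // exclusive access -- hence Mutex rather than RwLock.
    fn fetch(&self, identity: &[u8]) -> Option<Vec<u8>> {
        let mut map = self.key_packages.lock().unwrap();
        map.get_mut(identity).and_then(|q| q.pop_front())
    }
}

fn main() {
    let store = KpStore { key_packages: Mutex::new(HashMap::new()) };
    store.upload(b"alice", b"kp1".to_vec());
    store.upload(b"alice", b"kp2".to_vec());
    assert_eq!(store.fetch(b"alice"), Some(b"kp1".to_vec())); // FIFO order
    assert_eq!(store.fetch(b"alice"), Some(b"kp2".to_vec()));
    assert_eq!(store.fetch(b"alice"), None); // single-use: queue drained
}
```

Because `fetch` mutates the queue, an `RwLock` read guard would not suffice; every caller needs the exclusive access a `Mutex` grants.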
### Initialization
```rust
FileBackedStore::open(dir: impl AsRef<Path>) -> Result<Self, StorageError>
```
1. Creates the directory if it does not exist.
2. Loads each map from its respective file, or initializes an empty map if the
file is missing.
3. Returns the initialized store.

File paths:
- `{dir}/keypackages.bin` -- KeyPackage queues
- `{dir}/deliveries.bin` -- Delivery queues
- `{dir}/hybridkeys.bin` -- Hybrid public keys

The default data directory is `data/`, configurable via `--data-dir` /
`QUICNPROTOCHAT_DATA_DIR`.
### Flush-on-Every-Write
Every mutation serializes the entire in-memory map to disk:
```text
upload_key_package(identity_key, package)
  |
  +-- lock key_packages Mutex
  |
  +-- map.entry(identity_key).or_default().push_back(package)
  |
  +-- flush_kp_map(path, &map)
  |     +-- QueueMapV1 { map: map.clone() }
  |     +-- bincode::serialize(&payload)
  |     +-- fs::write(path, bytes)
  |
  +-- unlock Mutex
```
This approach deliberately trades performance for simplicity and crash safety:
- **Crash safety:** Every successful RPC response guarantees the data has been
written to the filesystem.
- **No partial writes on success:** The entire map is written in a single
  `fs::write` call (though without a temp-file-plus-rename step, so a crash
  mid-write can still truncate the file -- an MVP trade-off).
- **Performance:** Not suitable for production scale. Every write serializes and
  writes the full map, which is O(n) in the total number of stored entries.

**Production improvement path:** Replace with a proper database (SQLite, sled,
or similar) for incremental writes, WAL-based crash safety, and concurrent
access without full serialization.
### KeyPackage Operations
| Method | Behavior |
|--------|----------|
| `upload_key_package(identity_key, package)` | Push to back of VecDeque; flush |
| `fetch_key_package(identity_key)` | Pop from front (FIFO, single-use); flush |
The KeyPackage map uses the `QueueMapV1` serialization wrapper:
```rust
#[derive(Serialize, Deserialize, Default)]
struct QueueMapV1 {
    map: HashMap<Vec<u8>, VecDeque<Vec<u8>>>,
}
```
### Delivery Queue Operations
| Method | Behavior |
|--------|----------|
| `enqueue(recipient_key, channel_id, payload)` | Construct ChannelKey; push to back; flush |
| `fetch(recipient_key, channel_id)` | Construct ChannelKey; drain entire VecDeque; flush |
The delivery map uses `QueueMapV2` with the compound `ChannelKey`:
```rust
// Hash is required because ChannelKey is used as a HashMap key.
#[derive(Serialize, Deserialize, Clone, Eq, PartialEq, Hash, Debug)]
pub struct ChannelKey {
    pub channel_id: Vec<u8>,
    pub recipient_key: Vec<u8>,
}

#[derive(Serialize, Deserialize, Default)]
struct QueueMapV2 {
    map: HashMap<ChannelKey, VecDeque<Vec<u8>>>,
}
```
See [Delivery Service Internals](delivery-service.md) for the full queue model
and channel-aware routing semantics.
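
The isolation the compound key buys -- the same recipient on two channels gets two independent queues -- can be shown with a std-only sketch (serde derives omitted; `Hash` derived because the struct keys a `HashMap`):

```rust
use std::collections::{HashMap, VecDeque};

// ChannelKey as in the schema above, minus the serde derives.
#[derive(Clone, Eq, PartialEq, Hash, Debug)]
struct ChannelKey {
    channel_id: Vec<u8>,
    recipient_key: Vec<u8>,
}

fn main() {
    let mut deliveries: HashMap<ChannelKey, VecDeque<Vec<u8>>> = HashMap::new();

    let bob_general = ChannelKey { channel_id: b"general".to_vec(), recipient_key: b"bob".to_vec() };
    let bob_random  = ChannelKey { channel_id: b"random".to_vec(),  recipient_key: b"bob".to_vec() };

    deliveries.entry(bob_general.clone()).or_default().push_back(b"msg1".to_vec());
    deliveries.entry(bob_random.clone()).or_default().push_back(b"msg2".to_vec());

    // Same recipient, different channels -> isolated queues.
    assert_eq!(deliveries[&bob_general].len(), 1);
    assert_eq!(deliveries[&bob_random].len(), 1);

    // A fetch drains the whole queue for exactly one (channel, recipient) pair.
    let drained: Vec<Vec<u8>> = deliveries.get_mut(&bob_general).unwrap().drain(..).collect();
    assert_eq!(drained, vec![b"msg1".to_vec()]);
    assert_eq!(deliveries[&bob_random].len(), 1); // the other channel is untouched
}
```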
### V1/V2 Delivery Map Migration
The delivery map format evolved from V1 (keyed by recipient key only) to V2
(keyed by `ChannelKey` with channel ID + recipient key). The load function
handles both formats transparently:
```rust
fn load_delivery_map(path: &Path) -> Result<HashMap<ChannelKey, VecDeque<Vec<u8>>>> {
    let bytes = fs::read(path)?;
    // Try V2 format first (channel-aware).
    if let Ok(map) = bincode::deserialize::<QueueMapV2>(&bytes) {
        return Ok(map.map);
    }
    // Fall back to the legacy V1 format: migrate by setting channel_id = empty.
    let legacy: QueueMapV1 = bincode::deserialize(&bytes)?;
    let mut upgraded = HashMap::new();
    for (recipient_key, queue) in legacy.map.into_iter() {
        upgraded.insert(
            ChannelKey { channel_id: Vec::new(), recipient_key },
            queue,
        );
    }
    Ok(upgraded)
}
```
Migration strategy:
1. Attempt to deserialize as V2 (`QueueMapV2`). If successful, use as-is.
2. If V2 fails, deserialize as V1 (`QueueMapV1`). Migrate each entry by
wrapping the recipient key in a `ChannelKey` with an empty `channel_id`.
3. The next flush will write V2 format, completing the migration.

This in-place migration is transparent to clients. Legacy messages (pre-channel
routing) appear under the empty channel ID and can still be fetched by clients
that pass an empty `channelId`.
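
The shape of this fallback is independent of bincode. A std-only sketch, with toy decoders standing in for `bincode::deserialize` (the tag byte and payloads below are purely illustrative, not the real wire format):

```rust
use std::collections::{HashMap, VecDeque};

#[derive(Clone, Eq, PartialEq, Hash, Debug)]
struct ChannelKey { channel_id: Vec<u8>, recipient_key: Vec<u8> }

type V2Map = HashMap<ChannelKey, VecDeque<Vec<u8>>>;
type V1Map = HashMap<Vec<u8>, VecDeque<Vec<u8>>>;

// Toy decoders: here "V2" bytes start with the tag b'2'; anything else is
// treated as legacy V1 holding a single recipient key. The real code instead
// distinguishes formats by whether bincode deserialization succeeds.
fn decode_v2(bytes: &[u8]) -> Result<V2Map, ()> {
    if bytes.first() != Some(&b'2') { return Err(()); }
    Ok(V2Map::new())
}
fn decode_v1(bytes: &[u8]) -> Result<V1Map, ()> {
    let mut m = V1Map::new();
    m.insert(bytes.to_vec(), VecDeque::from(vec![b"old-msg".to_vec()]));
    Ok(m)
}

fn load_delivery_map(bytes: &[u8]) -> Result<V2Map, ()> {
    // 1. Try the channel-aware V2 format first.
    if let Ok(map) = decode_v2(bytes) {
        return Ok(map);
    }
    // 2. Fall back to legacy V1 and migrate under the empty channel ID.
    let legacy = decode_v1(bytes)?;
    let mut upgraded = V2Map::new();
    for (recipient_key, queue) in legacy {
        upgraded.insert(ChannelKey { channel_id: Vec::new(), recipient_key }, queue);
    }
    Ok(upgraded)
}

fn main() {
    // Legacy bytes migrate under the empty channel ID.
    let migrated = load_delivery_map(b"alice").unwrap();
    let key = ChannelKey { channel_id: Vec::new(), recipient_key: b"alice".to_vec() };
    assert_eq!(migrated[&key].len(), 1);
    // V2 bytes pass through without migration.
    assert!(load_delivery_map(b"2-v2-payload").unwrap().is_empty());
}
```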
### Hybrid Key Operations
| Method | Behavior |
|--------|----------|
| `upload_hybrid_key(identity_key, hybrid_pk)` | Insert (overwrite); flush |
| `fetch_hybrid_key(identity_key)` | Read-only lookup; no flush needed |
The hybrid key map is a flat `HashMap<Vec<u8>, Vec<u8>>` serialized directly
with bincode. Unlike KeyPackages, hybrid keys are not single-use -- they persist
until overwritten.
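
A minimal std-only sketch of the upsert vs. read-only contrast (illustrative, not the project's actual code):

```rust
use std::collections::HashMap;

fn main() {
    // identity -> hybrid public key; a flat map, no queue.
    let mut hybrid_keys: HashMap<Vec<u8>, Vec<u8>> = HashMap::new();

    // upload_hybrid_key: insert overwrites any previous key (upsert).
    hybrid_keys.insert(b"alice".to_vec(), b"pk-v1".to_vec());
    hybrid_keys.insert(b"alice".to_vec(), b"pk-v2".to_vec());
    assert_eq!(hybrid_keys.len(), 1); // one entry per identity

    // fetch_hybrid_key: read-only lookup -- the key is NOT consumed,
    // so repeated fetches return the same value.
    assert_eq!(hybrid_keys.get(b"alice".as_slice()), Some(&b"pk-v2".to_vec()));
    assert_eq!(hybrid_keys.get(b"alice".as_slice()), Some(&b"pk-v2".to_vec()));
}
```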
### Error Type
```rust
#[derive(thiserror::Error, Debug)]
pub enum StorageError {
    #[error("io error: {0}")]
    Io(String),
    #[error("serialization error")]
    Serde,
}
```
I/O errors (disk full, permission denied) and serialization errors (corrupt
file) are the two failure modes. The server converts `StorageError` to
`capnp::Error` via the `storage_err` helper for RPC responses.

---
## DiskKeyStore (Client-Side)
`DiskKeyStore` is the client-side key store that implements the openmls
`OpenMlsKeyStore` trait. It holds MLS cryptographic key material -- most
importantly, the HPKE init private keys created during KeyPackage generation.
### Structure
```rust
pub struct DiskKeyStore {
    path: Option<PathBuf>,                     // None = ephemeral (in-memory only)
    values: RwLock<HashMap<Vec<u8>, Vec<u8>>>, // key reference -> serialized MLS entity
}
```
The `RwLock` (not `Mutex`) allows concurrent reads. Write operations (store,
delete) take an exclusive lock and flush to disk.
### Modes
| Mode | Constructor | Persistence |
|------|-------------|-------------|
| Ephemeral | `DiskKeyStore::ephemeral()` | None. Data exists only in memory. Lost on process exit. |
| Persistent | `DiskKeyStore::persistent(path)` | Yes. Every write flushes the full map to disk. Survives process restarts. |
**Ephemeral mode** is used for tests and the `register` / `demo-group` CLI
commands where session resumption is not needed.
**Persistent mode** is used for production clients (`register-state`, `invite`,
`join`, `send`, `recv` commands). The key store file path is derived from the
state file path by changing the extension to `.ks`:
```rust
fn keystore_path(state_path: &Path) -> PathBuf {
    let mut path = state_path.to_path_buf();
    path.set_extension("ks");
    path
}
```
So `quicnprotochat-state.bin` produces a key store at `quicnprotochat-state.ks`.
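
Since `Path::set_extension` replaces an existing extension or appends one when none is present, the derivation also behaves sensibly for extension-less state paths. A quick std-only check:

```rust
use std::path::{Path, PathBuf};

// Same derivation as in the client: swap the extension for ".ks".
fn keystore_path(state_path: &Path) -> PathBuf {
    let mut path = state_path.to_path_buf();
    path.set_extension("ks");
    path
}

fn main() {
    // An existing extension is replaced...
    assert_eq!(keystore_path(Path::new("quicnprotochat-state.bin")),
               PathBuf::from("quicnprotochat-state.ks"));
    // ...and a missing extension is simply appended.
    assert_eq!(keystore_path(Path::new("state")), PathBuf::from("state.ks"));
}
```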
### Persistence Format
The key store is serialized as a bincode-encoded `HashMap<Vec<u8>, Vec<u8>>`.
Individual values are serialized using `serde_json` (as required by openmls's
`MlsEntity` trait bound):
```rust
fn store<V: MlsEntity>(&self, k: &[u8], v: &V) -> Result<(), Self::Error> {
    let value = serde_json::to_vec(v)?; // MlsEntity -> JSON bytes
    let mut values = self.values.write().unwrap();
    values.insert(k.to_vec(), value);
    drop(values); // release lock before I/O
    self.flush()  // bincode-serialize full map to disk
}
```
The two-layer serialization (JSON for values, bincode for the map) is a
consequence of openmls requiring `serde_json`-compatible serialization for MLS
entities, while the outer map uses bincode for compactness.
### OpenMlsKeyStore Implementation
| Trait Method | DiskKeyStore Behavior |
|--------------|-----------------------|
| `store(k, v)` | JSON-serialize value, insert into HashMap, flush to disk |
| `read(k)` | Look up key, JSON-deserialize value, return `Option<V>` |
| `delete(k)` | Remove from HashMap, flush to disk |
The `read` method does not flush because it does not modify the map. A failed
deserialization (corrupt value) returns `None` rather than an error, which
matches the openmls `OpenMlsKeyStore` trait signature.
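
This swallow-corruption behavior can be sketched generically; here a toy UTF-8/`parse` decode stands in for `serde_json::from_slice` (std-only, illustrative):

```rust
use std::collections::HashMap;

// Toy "read": values are expected to be UTF-8 decimal numbers. A corrupt
// value decodes to None, indistinguishable from a missing key -- mirroring
// how DiskKeyStore::read swallows deserialization failures.
fn read(values: &HashMap<Vec<u8>, Vec<u8>>, k: &[u8]) -> Option<u32> {
    let bytes = values.get(k)?;
    std::str::from_utf8(bytes).ok()?.parse().ok()
}

fn main() {
    let mut values = HashMap::new();
    values.insert(b"good".to_vec(), b"42".to_vec());
    values.insert(b"corrupt".to_vec(), vec![0xFF, 0xFE]); // invalid UTF-8

    assert_eq!(read(&values, b"good"), Some(42));
    assert_eq!(read(&values, b"corrupt"), None); // corrupt -> None, not Err
    assert_eq!(read(&values, b"missing"), None);
}
```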
### Flush Behavior
```rust
fn flush(&self) -> Result<(), DiskKeyStoreError> {
    let Some(path) = &self.path else {
        return Ok(()); // ephemeral: no-op
    };
    let values = self.values.read().unwrap();
    let bytes = bincode::serialize(&*values)?;
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?; // ensure the parent directory exists
    }
    fs::write(path, bytes)?;
    Ok(())
}
```
Like `FileBackedStore`, the flush serializes the entire map on every write.
For client-side usage, the map is typically small (a handful of HPKE keys), so
this is not a performance concern.
### Error Type
```rust
#[derive(thiserror::Error, Debug, PartialEq, Eq)]
pub enum DiskKeyStoreError {
    #[error("serialization error")]
    Serialization,
    #[error("io error: {0}")]
    Io(String),
}
```
---
## StoreCrypto
`StoreCrypto` is a composite type that bundles a `DiskKeyStore` with the
`RustCrypto` provider from `openmls_rust_crypto`. It implements the openmls
`OpenMlsCryptoProvider` trait, which is the single entry point that openmls
uses for all cryptographic operations:
```rust
pub struct StoreCrypto {
    crypto: RustCrypto,      // AES-GCM, SHA-256, X25519, Ed25519, etc.
    key_store: DiskKeyStore, // HPKE init keys, MLS epoch secrets, etc.
}

impl OpenMlsCryptoProvider for StoreCrypto {
    type CryptoProvider = RustCrypto;
    type RandProvider = RustCrypto;
    type KeyStoreProvider = DiskKeyStore;

    fn crypto(&self) -> &RustCrypto { &self.crypto }
    fn rand(&self) -> &RustCrypto { &self.crypto }
    fn key_store(&self) -> &DiskKeyStore { &self.key_store }
}
```
`StoreCrypto` is the `backend` field of [`GroupMember`](group-member-lifecycle.md).
It is passed to every openmls operation -- `KeyPackage::builder().build()`,
`MlsGroup::new_with_group_id()`, `MlsGroup::new_from_welcome()`,
`create_message()`, `process_message()`, etc.
The critical property is that the **same `StoreCrypto` instance** (and therefore
the same `DiskKeyStore`) must be used from `generate_key_package()` through
`join_group()`, because the HPKE init private key is stored in and read from
this key store.

---
## Storage Architecture Summary
```text
Server                                 Client
======                                 ======
FileBackedStore                        DiskKeyStore
+-- key_packages (Mutex<HashMap>)      +-- values (RwLock<HashMap>)
|   Persisted: keypackages.bin         |   Persisted: {state}.ks
|   Format: bincode(QueueMapV1)        |   Format: bincode(HashMap)
|                                      |   Values: serde_json(MlsEntity)
+-- deliveries (Mutex<HashMap>)        |
|   Persisted: deliveries.bin          +-- Wrapped by StoreCrypto
|   Format: bincode(QueueMapV2)        |   implements OpenMlsCryptoProvider
|   Migration: V1 -> V2 on load        |
|                                      +-- Used by GroupMember.backend
+-- hybrid_keys (Mutex<HashMap>)
    Persisted: hybridkeys.bin
    Format: bincode(HashMap)
```
### Shared Design Patterns
Both backends share these characteristics:
1. **Full-map serialization.** Every write serializes the entire map to disk.
Simple, correct, but O(n) per write.
2. **Bincode format.** The outer map is always bincode-serialized. Compact and
fast, but not human-readable and not forward-compatible without wrapper
structs.
3. **No WAL / journaling.** A crash during `fs::write` could leave a corrupt
file. For the MVP, this is acceptable -- the data can be regenerated (clients
re-upload KeyPackages; delivery messages are ephemeral).
4. **No compaction.** Empty queues are not removed from the map. Over time, the
serialized size can grow with stale entries. A production implementation
should periodically compact empty entries.
5. **Directory creation.** Both backends call `fs::create_dir_all` before
writing, ensuring parent directories exist.
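
As a sketch of what pattern 4's fix could look like, a hypothetical compaction pass (not present in the codebase) is a one-liner with `HashMap::retain`:

```rust
use std::collections::{HashMap, VecDeque};

// Hypothetical compaction: drop entries whose queue has been fully drained,
// so stale identities stop inflating the serialized map.
fn compact(map: &mut HashMap<Vec<u8>, VecDeque<Vec<u8>>>) {
    map.retain(|_, queue| !queue.is_empty());
}

fn main() {
    let mut map: HashMap<Vec<u8>, VecDeque<Vec<u8>>> = HashMap::new();
    map.insert(b"drained".to_vec(), VecDeque::new());
    map.insert(b"live".to_vec(), VecDeque::from(vec![b"kp".to_vec()]));

    compact(&mut map);
    assert_eq!(map.len(), 1);
    assert!(map.contains_key(b"live".as_slice()));
}
```

Running such a pass before each flush (or periodically) would bound the serialized size to the live entries.
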
---
## Related Pages
- [GroupMember Lifecycle](group-member-lifecycle.md) -- how `StoreCrypto` and `DiskKeyStore` are used during MLS operations
- [KeyPackage Exchange Flow](keypackage-exchange.md) -- upload and fetch through `FileBackedStore`
- [Delivery Service Internals](delivery-service.md) -- delivery queue operations
- [Authentication Service Internals](authentication-service.md) -- KeyPackage and hybrid key storage
- [Key Lifecycle and Zeroization](../cryptography/key-lifecycle.md) -- how HPKE keys are created and destroyed