feat: add post-quantum hybrid KEM + SQLCipher persistence

Feature 1 — Post-Quantum Hybrid KEM (X25519 + ML-KEM-768):
- Create hybrid_kem.rs with keygen, encrypt, decrypt + 11 unit tests
- Wire format: version(1) | x25519_eph_pk(32) | mlkem_ct(1088) | nonce(12) | ct
- Add uploadHybridKey/fetchHybridKey RPCs to node.capnp schema
- Server: hybrid key storage in FileBackedStore + RPC handlers
- Client: hybrid keypair in StoredState, auto-wrap/unwrap in send/recv/invite/join
- demo-group runs full hybrid PQ envelope round-trip

Feature 2 — SQLCipher Persistence:
- Extract Store trait from FileBackedStore API
- Create SqlStore (rusqlite + bundled-sqlcipher) with encrypted-at-rest SQLite
- Schema: key_packages, deliveries, hybrid_keys tables with indexes
- Server CLI: --store-backend=sql, --db-path, --db-key flags
- 5 unit tests for SqlStore (FIFO, round-trip, upsert, channel isolation)

Also includes: client lib.rs refactor, auth config, TOML config file support,
mdBook documentation, and various manual cleanups by the user.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# Authentication Service Internals
The Authentication Service (AS) stores and distributes single-use MLS
KeyPackages. It is one of the two logical services exposed through the unified
`NodeService` RPC interface. The AS also stores hybrid (X25519 + ML-KEM-768)
public keys for post-quantum envelope encryption.
This page covers the server-side implementation of KeyPackage storage, the
`Auth` struct validation logic, and the hybrid key endpoints.
**Sources:**
- `crates/quicnprotochat-server/src/main.rs` (RPC handlers, auth validation)
- `crates/quicnprotochat-server/src/storage.rs` (FileBackedStore)
- `schemas/node.capnp` (wire schema)
---
## KeyPackage Storage
### Data Model
KeyPackages are stored in a `FileBackedStore` using a `Mutex`-protected
`HashMap`:
```text
key_packages: Mutex<HashMap<Vec<u8>, VecDeque<Vec<u8>>>>
                               ^              ^
                               |              |
                         identity_key    FIFO queue of
                      (32-byte Ed25519   TLS-encoded
                          public key)    KeyPackage bytes
```
Each identity can have multiple KeyPackages queued. This is essential because
KeyPackages are single-use (per RFC 9420): once fetched by a peer, they are
permanently removed. Clients should upload several KeyPackages to handle
concurrent group invitations.
The map is persisted to `data/keypackages.bin` using bincode serialization,
wrapped in the `QueueMapV1` struct. See [Storage Backend](storage-backend.md)
for persistence details.
### uploadKeyPackage
```capnp
uploadKeyPackage @0 (identityKey :Data, package :Data, auth :Auth)
    -> (fingerprint :Data);
```
**Handler logic:**
1. **Parse parameters.** Extract `identityKey`, `package`, and `auth`.
2. **Validate auth.** Call `validate_auth()` (see [Auth Validation](#auth-validation)
below).
3. **Validate inputs:**
| Check | Constraint | Error Message |
|-------|------------|---------------|
| Identity key length | Exactly 32 bytes | `"identityKey must be exactly 32 bytes, got {n}"` |
| Package non-empty | `package.len() > 0` | `"package must not be empty"` |
| Package size cap | `package.len() <= 1,048,576` | `"package exceeds max size (1048576 bytes)"` |
4. **Compute fingerprint.** `SHA-256(package_bytes)` produces a 32-byte digest.
5. **Store.** `FileBackedStore::upload_key_package(identity_key, package)` pushes
the package to the back of the identity's `VecDeque` and flushes to disk.
6. **Return fingerprint.** The SHA-256 hash is set in the response.
The fingerprint allows the uploading client to verify that the server stored the
exact bytes it sent. See [KeyPackage Exchange Flow](keypackage-exchange.md) for
the client-side verification logic.
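Putting the handler together, a hedged sketch (capnp plumbing and the `Auth` check elided; `handle_upload_key_package` and the error plumbing are illustrative shapes, not the literal server code):

```rust
use sha2::{Digest, Sha256};

// Sketch of steps 1-6 above; `FileBackedStore::upload_key_package` is the
// documented store call, everything else is assumed shape.
fn handle_upload_key_package(
    store: &FileBackedStore,
    identity_key: &[u8],
    package: &[u8],
) -> Result<[u8; 32], String> {
    if identity_key.len() != 32 {
        return Err(format!("identityKey must be exactly 32 bytes, got {}", identity_key.len()));
    }
    if package.is_empty() {
        return Err("package must not be empty".into());
    }
    if package.len() > 1_048_576 {
        return Err("package exceeds max size (1048576 bytes)".into());
    }
    let fingerprint: [u8; 32] = Sha256::digest(package).into(); // step 4
    store
        .upload_key_package(identity_key.to_vec(), package.to_vec())
        .map_err(|e| e.to_string())?; // step 5: push_back + flush to disk
    Ok(fingerprint) // step 6: returned in the response
}
```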
### fetchKeyPackage
```capnp
fetchKeyPackage @1 (identityKey :Data, auth :Auth) -> (package :Data);
```
**Handler logic:**
1. **Parse and validate** `identityKey` (32 bytes) and `auth`.
2. **Pop from queue.** `FileBackedStore::fetch_key_package(identity_key)` calls
`VecDeque::pop_front()` on the identity's queue, removing and returning the
oldest KeyPackage. The updated map is flushed to disk.
3. **Return.** If a KeyPackage was available, set it in the response. If the
queue was empty (or the identity has no entry), return empty `Data`.
**Single-use semantics:** The `pop_front()` operation ensures each KeyPackage is
returned exactly once. This is critical for MLS security -- reusing a KeyPackage
would allow conflicting group states. The removal is atomic with respect to the
`Mutex` lock, so concurrent fetch requests will not receive the same package.
**Empty response handling:** The client checks `package.is_empty()` to
distinguish between "no packages available" and "package fetched." An empty
response is not an error -- it means the target identity has exhausted their
KeyPackage supply and needs to upload more.
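A corresponding sketch of the fetch path (again hedged; the `Option` return shape of `fetch_key_package` is assumed):

```rust
// An empty Vec in the response means "no packages available", not an error.
fn handle_fetch_key_package(
    store: &FileBackedStore,
    identity_key: &[u8],
) -> Result<Vec<u8>, String> {
    if identity_key.len() != 32 {
        return Err(format!("identityKey must be exactly 32 bytes, got {}", identity_key.len()));
    }
    // pop_front under the Mutex: single-use, FIFO, flushed to disk.
    let package = store
        .fetch_key_package(identity_key)
        .map_err(|e| e.to_string())?;
    Ok(package.unwrap_or_default())
}
```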
---
## Auth Validation
All `NodeService` RPC methods accept an `Auth` struct:
```capnp
struct Auth {
  version     @0 :UInt16;  # 0 = legacy/none, 1 = token-based
  accessToken @1 :Data;    # opaque bearer token
  deviceId    @2 :Data;    # optional UUID for auditing
}
```
The server validates this struct through the `validate_auth` function:
```text
validate_auth(cfg, auth)
  |
  +-- version == 0?
  |     +-- cfg.allow_legacy_v0 == true?  -> OK
  |     +-- cfg.allow_legacy_v0 == false? -> ERROR "auth version 0 disabled"
  |
  +-- version == 1?
  |     +-- accessToken empty? -> ERROR "requires non-empty accessToken"
  |     +-- cfg.required_token is Some?
  |     |     +-- token matches?  -> OK
  |     |     +-- token mismatch? -> ERROR "invalid accessToken"
  |     +-- cfg.required_token is None? -> OK (any non-empty token accepted)
  |
  +-- version >= 2? -> ERROR "unsupported auth version"
```
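In Rust terms, the tree above reduces to a single `match`. This is a hedged sketch against the `AuthConfig` struct shown below; the parameter shapes are assumed:

```rust
fn validate_auth(cfg: &AuthConfig, version: u16, access_token: &[u8]) -> Result<(), String> {
    match version {
        0 if cfg.allow_legacy_v0 => Ok(()),
        0 => Err("auth version 0 disabled".into()),
        1 if access_token.is_empty() => {
            Err("auth.version=1 requires non-empty accessToken".into())
        }
        1 => match &cfg.required_token {
            Some(required) if access_token != required.as_slice() => {
                Err("invalid accessToken".into())
            }
            _ => Ok(()), // token matches, or no required_token is configured
        },
        v => Err(format!("unsupported auth version {v}")),
    }
}
```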
### AuthConfig
The server's auth behavior is controlled by `AuthConfig`:
```rust
struct AuthConfig {
    required_token: Option<Vec<u8>>, // None = accept any token
    allow_legacy_v0: bool,           // true = accept version 0 (no auth)
}
```
Configured via CLI flags / environment variables:
| Flag / Env Var | Default | Purpose |
|-----------------------------------|---------|---------|
| `--auth-token` / `QUICNPROTOCHAT_AUTH_TOKEN` | None | Required bearer token. If unset, any non-empty token is accepted for version 1. |
| `--allow-auth-v0` / `QUICNPROTOCHAT_ALLOW_AUTH_V0` | `true` | Whether to accept `auth.version=0` (legacy, unauthenticated) requests. |
### Version Semantics
| Version | Meaning | Token Required? |
|---------|---------|-----------------|
| 0 | Legacy / unauthenticated | No. Token is ignored. Server must have `allow_legacy_v0 = true`. |
| 1 | Token-based authentication | Yes. Must be non-empty. Must match `required_token` if configured. |
| 2+ | Reserved for future use | Rejected. |
### Current Limitations
The current auth implementation is intentionally minimal:
- **No identity binding.** The access token is not tied to a specific Ed25519
identity. Any valid token can upload or fetch KeyPackages for any identity.
- **No rate limiting.** There is no per-identity or per-IP rate limiting.
- **No token rotation.** Tokens are static strings configured at server startup.
- **No device management.** The `deviceId` field is accepted but not used for
authorization decisions.
The [Auth, Devices, and Tokens](../roadmap/authz-plan.md) roadmap item
addresses these gaps with a proper token issuance and validation system.
---
## Hybrid Key Endpoints
The AS also stores hybrid (X25519 + ML-KEM-768) public keys for post-quantum
envelope encryption. Unlike KeyPackages, hybrid keys are **not single-use** --
they are stored persistently and can be fetched multiple times.
### uploadHybridKey
```capnp
uploadHybridKey @6 (identityKey :Data, hybridPublicKey :Data) -> ();
```
**Handler logic:**
1. Validate `identityKey` (32 bytes) and `hybridPublicKey` (non-empty).
2. `FileBackedStore::upload_hybrid_key(identity_key, hybrid_pk)` stores the key,
overwriting any previous value for this identity.
3. Flushes to `data/hybridkeys.bin`.
The storage model is simpler than KeyPackages: a flat
`HashMap<Vec<u8>, Vec<u8>>` (identity key to hybrid public key bytes). There is
no queue -- each identity has at most one hybrid public key.
### fetchHybridKey
```capnp
fetchHybridKey @7 (identityKey :Data) -> (hybridPublicKey :Data);
```
**Handler logic:**
1. Validate `identityKey` (32 bytes).
2. Look up the hybrid public key in the store. Unlike `fetchKeyPackage`, this
does **not** remove the key -- it can be fetched repeatedly.
3. Return the key bytes, or empty `Data` if none is stored.
See [Hybrid KEM](../protocol-layers/hybrid-kem.md) for how the client uses
these keys to wrap MLS payloads in post-quantum envelopes.
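Both operations reduce to plain map access. A minimal, self-contained sketch (the free functions are illustrative -- the real methods live on `FileBackedStore` and flush after writes):

```rust
use std::collections::HashMap;

fn upload_hybrid_key(map: &mut HashMap<Vec<u8>, Vec<u8>>, identity: Vec<u8>, pk: Vec<u8>) {
    map.insert(identity, pk); // overwrites any previous key; then flush to hybridkeys.bin
}

fn fetch_hybrid_key(map: &HashMap<Vec<u8>, Vec<u8>>, identity: &[u8]) -> Vec<u8> {
    // Read-only: the key stays in the map and can be fetched repeatedly.
    map.get(identity).cloned().unwrap_or_default() // empty Data when absent
}
```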
---
## NodeServiceImpl Structure
The server-side implementation struct:
```rust
struct NodeServiceImpl {
    store: Arc<FileBackedStore>,                 // shared across connections
    waiters: Arc<DashMap<Vec<u8>, Arc<Notify>>>, // long-poll notification
    auth_cfg: Arc<AuthConfig>,                   // auth policy
}
```
All connections share the same `store` and `waiters` via `Arc`. The
`DashMap<Vec<u8>, Arc<Notify>>` is keyed by recipient key and provides the
push-notification mechanism for `fetchWait`. See
[Delivery Service Internals](delivery-service.md) for the long-polling
implementation.
---
## Connection Model
```text
QUIC endpoint (port 7000)
+-- TLS 1.3 handshake (self-signed cert by default)
+-- Accept bidirectional stream
+-- capnp-rpc VatNetwork (Side::Server)
+-- NodeServiceImpl { store, waiters, auth_cfg }
```
Each QUIC connection opens one bidirectional stream for Cap'n Proto RPC. The
`capnp-rpc` crate uses `Rc<RefCell<...>>` internally, making it `!Send`. All RPC
tasks run on a `tokio::task::LocalSet` to satisfy this constraint.
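A minimal, runnable illustration of why the `LocalSet` is needed (a toy `!Send` future standing in for the real `RpcSystem`; this is not the server's actual wiring):

```rust
use std::rc::Rc;
use tokio::task::LocalSet;

#[tokio::main]
async fn main() {
    let local = LocalSet::new();
    local
        .run_until(async {
            let not_send = Rc::new(42); // Rc makes the future !Send, like capnp-rpc's internals
            tokio::task::spawn_local(async move {
                println!("per-connection RPC task, value = {not_send}");
            })
            .await
            .unwrap();
        })
        .await;
}
```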
The server generates a self-signed TLS certificate on first start if no
certificate files exist. Certificate and key paths are configurable via
`--tls-cert` and `--tls-key`.
---
## Health Endpoint
```capnp
health @5 () -> (status :Text);
```
A simple readiness probe. Returns `"ok"` unconditionally. No auth validation is
performed. Useful for infrastructure health checks and measuring QUIC round-trip
time.
---
## Related Pages
- [KeyPackage Exchange Flow](keypackage-exchange.md) -- end-to-end upload and fetch flow including client-side logic
- [Delivery Service Internals](delivery-service.md) -- the DS half of NodeService
- [Storage Backend](storage-backend.md) -- FileBackedStore persistence model
- [GroupMember Lifecycle](group-member-lifecycle.md) -- how KeyPackages are generated and consumed
- [Auth, Devices, and Tokens](../roadmap/authz-plan.md) -- planned auth improvements
- [NodeService Schema](../wire-format/node-service-schema.md) -- Cap'n Proto schema reference
- [Hybrid KEM](../protocol-layers/hybrid-kem.md) -- post-quantum envelope encryption

# Delivery Service Internals
The Delivery Service (DS) is a store-and-forward relay for opaque MLS payloads.
It never inspects, decrypts, or validates MLS ciphertext -- it routes solely by
recipient identity key and channel identifier. The DS exposes three operations
through the `NodeService` RPC interface: `enqueue`, `fetch`, and `fetchWait`.
**Sources:**
- `crates/quicnprotochat-server/src/main.rs` (RPC handlers)
- `crates/quicnprotochat-server/src/storage.rs` (queue storage)
- `schemas/node.capnp` (wire schema)
---
## Architecture
```text
NodeService (port 7000)
=======================
enqueue(recipientKey, payload, channelId)
        |
        v
+------------------------------------------------------------+
|  FileBackedStore                                            |
|                                                             |
|  deliveries: Mutex<HashMap<ChannelKey, VecDeque<Vec<u8>>>>  |
|                                ^               ^            |
|                                |               |            |
|                         ChannelKey {      FIFO queue of     |
|                           channel_id,     opaque payload    |
|                           recipient_key   bytes             |
|                         }                                   |
|                                                             |
|  Persisted to: data/deliveries.bin (bincode, V2 format)     |
+------------------------------------------------------------+
        |
        v
notify_waiters() --> DashMap<Vec<u8>, Arc<Notify>>
                              ^
                              |
                     keyed by recipient_key
                     wakes blocked fetchWait calls
```
The DS is intentionally MLS-unaware. This design decision is documented in
[ADR-004: MLS-Unaware Delivery Service](../design-rationale/adr-004-mls-unaware-ds.md).
From the server's perspective, every payload is an opaque blob -- it could be
a Welcome, a Commit, an application message, or a hybrid-encrypted envelope.
---
## Queue Model
### ChannelKey
Delivery queues are indexed by a compound key:
```rust
#[derive(Serialize, Deserialize, Clone, Eq, PartialEq, Debug)]
pub struct ChannelKey {
    pub channel_id: Vec<u8>,
    pub recipient_key: Vec<u8>,
}
```
| Field | Size | Purpose |
|-----------------|-------------|---------|
| `channel_id` | Variable (typically 16 bytes UUID or empty) | Isolates messages by conversation. Empty for legacy/default channel. |
| `recipient_key` | 32 bytes | Ed25519 public key of the intended recipient. |
The `ChannelKey` implements `Hash` manually, hashing `channel_id` followed by
`recipient_key`.
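A sketch of that manual impl, assuming the field-by-field hashing just described:

```rust
use std::hash::{Hash, Hasher};

impl Hash for ChannelKey {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.channel_id.hash(state);    // channel first...
        self.recipient_key.hash(state); // ...then recipient
    }
}
```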
**Channel-aware routing** ensures that messages for different conversations do
not interfere with each other. A client fetching from channel A will not see
messages enqueued for channel B, even if both target the same recipient. For
legacy clients (or single-channel usage), `channel_id` is left empty.
### Queue Structure
Each `ChannelKey` maps to a `VecDeque<Vec<u8>>`:
```text
ChannelKey("chan-1", "alice-pk") -> [msg_1, msg_2, msg_3]
ChannelKey("chan-1", "bob-pk") -> [msg_4]
ChannelKey("chan-2", "alice-pk") -> [msg_5, msg_6]
ChannelKey("", "alice-pk") -> [msg_7] (legacy/default channel)
```
Messages within a queue are ordered FIFO (first-in, first-out). This preserves
MLS epoch ordering, which is critical: a recipient must process a Welcome before
application messages, and Commits in the order they were produced.
---
## RPC Operations
### enqueue
Appends a payload to the recipient's queue and wakes any blocked long-poll
waiters.
```capnp
enqueue @2 (recipientKey :Data, payload :Data, channelId :Data,
            version :UInt16, auth :Auth) -> ();
```
**Handler logic:**
1. **Parse parameters.** Extract `recipientKey`, `payload`, `channelId`,
`version`, and `auth` from the Cap'n Proto request.
2. **Validate auth.** Call `validate_auth()` to check the `Auth` struct. See
[Authentication Service Internals](authentication-service.md) for auth
validation details.
3. **Validate inputs:**
| Field | Constraint | Error on Violation |
|----------------|-------------------------|--------------------|
| `recipientKey` | Exactly 32 bytes | `"recipientKey must be exactly 32 bytes, got {n}"` |
| `payload` | Non-empty | `"payload must not be empty"` |
| `payload` | At most 5 MB | `"payload exceeds max size (5242880 bytes)"` |
| `version` | 0 (legacy) or 1 (current) | `"unsupported wire version {v} (expected 0 or 1)"` |
4. **Store.** Call `FileBackedStore::enqueue(recipient_key, channel_id, payload)`,
which constructs a `ChannelKey` from the channel ID and recipient key, then
pushes the payload to the back of the corresponding `VecDeque`. The entire
delivery map is flushed to disk.
5. **Notify waiters.** Look up or create a `tokio::sync::Notify` for the
recipient key in `DashMap<Vec<u8>, Arc<Notify>>` and call
`notify_waiters()`. This wakes all `fetchWait` calls currently blocked on
this recipient.
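Steps 4 and 5 combine into a short store-then-wake sequence. A hedged sketch (validation elided; the function shape is assumed, the types are from this page):

```rust
use std::sync::Arc;
use dashmap::DashMap;
use tokio::sync::Notify;

fn enqueue_and_notify(
    store: &FileBackedStore,
    waiters: &DashMap<Vec<u8>, Arc<Notify>>,
    recipient_key: Vec<u8>,
    channel_id: Vec<u8>,
    payload: Vec<u8>,
) -> Result<(), StorageError> {
    store.enqueue(recipient_key.clone(), channel_id, payload)?; // push_back + flush
    waiters
        .entry(recipient_key)
        .or_insert_with(|| Arc::new(Notify::new()))
        .notify_waiters(); // wake every fetchWait blocked on this recipient
    Ok(())
}
```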
### fetch
Atomically drains the entire queue for a recipient+channel and returns all
payloads.
```capnp
fetch @3 (recipientKey :Data, channelId :Data, version :UInt16, auth :Auth)
    -> (payloads :List(Data));
```
**Handler logic:**
1. Parse and validate `recipientKey` (32 bytes), `version` (0 or 1), and
`auth`.
2. Call `FileBackedStore::fetch(recipient_key, channel_id)`, which:
- Constructs a `ChannelKey`.
- Calls `VecDeque::drain(..)` on the matching queue, collecting all messages.
- Flushes the updated (now empty) map to disk.
- Returns the drained messages as `Vec<Vec<u8>>`.
3. Build a `List(Data)` response with all the payload bytes.
**Important:** The drain is atomic with respect to the `Mutex` lock. No
interleaving with concurrent `enqueue` calls is possible. The returned list
preserves FIFO order.
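The drain itself is a few lines once the lock is held. A hedged sketch over the raw map (the real method also flushes the updated map to disk before unlocking):

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Mutex;

fn fetch(
    deliveries: &Mutex<HashMap<ChannelKey, VecDeque<Vec<u8>>>>,
    recipient_key: Vec<u8>,
    channel_id: Vec<u8>,
) -> Vec<Vec<u8>> {
    let key = ChannelKey { channel_id, recipient_key };
    let mut map = deliveries.lock().unwrap(); // lock held for the whole drain
    map.get_mut(&key)
        .map(|q| q.drain(..).collect()) // FIFO order preserved
        .unwrap_or_default()
}
```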
### fetchWait (Long-Polling)
Combines `fetch` with a blocking wait. If the queue is empty, the server waits
for up to `timeoutMs` milliseconds for a new message to arrive.
```capnp
fetchWait @4 (recipientKey :Data, channelId :Data, version :UInt16,
          timeoutMs :UInt64, auth :Auth) -> (payloads :List(Data));
```
**Handler logic:**
```text
1. validate inputs (same as fetch)
2. messages = store.fetch(recipient_key, channel_id)
3. if messages.is_empty() AND timeout_ms > 0:
     a. waiter = waiters.entry(recipient_key).or_insert(Arc::new(Notify::new()))
     b. tokio::time::timeout(Duration::from_millis(timeout_ms), waiter.notified()).await
     c. messages = store.fetch(recipient_key, channel_id)  // re-fetch after wake
4. return messages
```
The implementation uses `Promise::from_future(async move { ... })` because the
`tokio::time::timeout` call is async. This is the only DS handler that produces
an async `Promise`.
**Timeout behavior:**
- If `timeout_ms == 0`, `fetchWait` behaves identically to `fetch` (immediate
return).
- If a message arrives before the timeout, `notify_waiters()` from `enqueue`
wakes the `Notify`, and the handler re-fetches immediately.
- If the timeout expires without a message, the handler re-fetches (which will
return empty) and returns an empty list.
**Waiter model:** The `DashMap<Vec<u8>, Arc<Notify>>` is keyed by recipient key
(not by `ChannelKey`). This means a notification for any channel targeting the
same recipient will wake all blocked `fetchWait` calls for that recipient. This
is a deliberate simplification -- the re-fetch after waking will only return
messages from the requested channel, so cross-channel wake-ups result in a
no-op re-fetch rather than incorrect behavior.
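The long-poll core can be shown self-contained with an in-memory queue standing in for `FileBackedStore` (the real handler wraps this logic in `Promise::from_future`):

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::{Arc, Mutex};
use std::time::Duration;
use tokio::sync::Notify;

type Queues = Arc<Mutex<HashMap<Vec<u8>, VecDeque<Vec<u8>>>>>;

fn drain(queues: &Queues, key: &[u8]) -> Vec<Vec<u8>> {
    let mut map = queues.lock().unwrap();
    map.get_mut(key).map(|q| q.drain(..).collect()).unwrap_or_default()
}

async fn fetch_wait(queues: Queues, notify: Arc<Notify>, key: Vec<u8>, timeout_ms: u64) -> Vec<Vec<u8>> {
    let mut msgs = drain(&queues, &key);
    if msgs.is_empty() && timeout_ms > 0 {
        // Block until enqueue() calls notify_waiters(), or until the timeout fires.
        let _ = tokio::time::timeout(Duration::from_millis(timeout_ms), notify.notified()).await;
        msgs = drain(&queues, &key); // re-fetch after wake (may still be empty)
    }
    msgs
}
```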
---
## Version Validation
The `version` field in `enqueue`, `fetch`, and `fetchWait` enables future
schema evolution:
| Version | Meaning |
|---------|---------|
| 0 | Legacy (pre-versioning). `channelId` is treated as empty. |
| 1 | Current wire format. `channelId` is a meaningful field. |
| 2+ | Rejected with `"unsupported wire version"`. |
Both 0 and 1 are accepted on the server side. The constant
`CURRENT_WIRE_VERSION = 1` is used in validation:
```rust
if version != 0 && version != CURRENT_WIRE_VERSION {
    return Promise::err(/* unsupported version */);
}
```
The client library always sends `version: 1` for new operations.
---
## Notification System
The waiter map provides a lightweight push-notification mechanism:
```text
enqueue()                               fetchWait()
    |                                        |
    v                                        v
store.enqueue(key, ch, payload)         messages = store.fetch(key, ch)
    |                                        |
    v                                        | (if empty)
waiter = waiters.entry(key)                  v
    .or_insert(Notify::new())           waiter = waiters.entry(key)
    |                                       .or_insert(Notify::new())
    v                                        |
waiter.notify_waiters()                      v
    |                               timeout(duration, waiter.notified())
    |                                        |
    +------------------ wakes -------------->+
                                             |
                                             v
                                    messages = store.fetch(key, ch)
                                             |
                                             v
                                        return messages
```
`tokio::sync::Notify` is a broadcast notification primitive. `notify_waiters()`
wakes all tasks currently awaiting `.notified()`. If no tasks are waiting, the
notification is lost (there is no stored permit in the `notify_waiters()` path).
This is acceptable because `fetchWait` always performs a fetch before blocking,
so messages that arrive before the wait begins are returned immediately.
---
## Data Flow Example: Two-Party Message Exchange
```text
Alice                       Server DS                          Bob
  |                            |                                |
  | encrypt("hello bob")       |                                |
  |   -> ct_bytes              |                                |
  |                            |                                |
  | enqueue(bob_pk, ct_bytes)  |                                |
  | -------------------------> |                                |
  |                            | queue[("", bob_pk)] += ct      |
  |                            | notify_waiters(bob_pk)         |
  |                            |                                |
  |                            | <---- fetchWait(bob_pk, 30s) - |
  |                            |      (was blocked, now woken)  |
  |                            |                                |
  |                            | drain queue[("", bob_pk)]      |
  |                            | ---- [ct_bytes] -------------> |
  |                            |                                |
  |                            |               decrypt(ct_bytes)|
  |                            |                 -> "hello bob" |
```
---
## Server Constants
| Constant | Value | Purpose |
|-------------------------|-----------|---------|
| `MAX_PAYLOAD_BYTES` | 5,242,880 (5 MB) | Maximum size of a single enqueued payload |
| `MAX_KEYPACKAGE_BYTES` | 1,048,576 (1 MB) | Maximum size of a KeyPackage (AS) |
| `CURRENT_WIRE_VERSION` | 1 | Current schema version; rejects > 1 |
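As Rust definitions (values from the table; the exact declarations in `main.rs` are assumed):

```rust
const MAX_PAYLOAD_BYTES: usize = 5 * 1024 * 1024;  // 5,242,880
const MAX_KEYPACKAGE_BYTES: usize = 1024 * 1024;   // 1,048,576
const CURRENT_WIRE_VERSION: u16 = 1;
```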
---
## Persistence
Delivery queues are persisted to `data/deliveries.bin` using bincode
serialization. The V2 format uses `ChannelKey` as the map key:
```rust
#[derive(Serialize, Deserialize, Default)]
struct QueueMapV2 {
    map: HashMap<ChannelKey, VecDeque<Vec<u8>>>,
}
```
On load, the server attempts V2 deserialization first. If that fails, it falls
back to V1 format (keyed by `Vec<u8>` recipient key only) and migrates in
memory by assigning empty `channel_id` to each entry:
```rust
for (recipient_key, queue) in legacy.map.into_iter() {
    upgraded.insert(
        ChannelKey { channel_id: Vec::new(), recipient_key },
        queue,
    );
}
```
See [Storage Backend](storage-backend.md) for the full persistence model.
---
## Related Pages
- [Authentication Service Internals](authentication-service.md) -- KeyPackage storage and retrieval
- [GroupMember Lifecycle](group-member-lifecycle.md) -- how `send_message()` and `receive_message()` produce and consume the payloads
- [Storage Backend](storage-backend.md) -- `FileBackedStore` persistence and migration
- [NodeService Schema](../wire-format/node-service-schema.md) -- Cap'n Proto schema reference
- [ADR-004: MLS-Unaware Delivery Service](../design-rationale/adr-004-mls-unaware-ds.md) -- design rationale
- [End-to-End Data Flow](../architecture/data-flow.md) -- sequence diagrams for registration, group creation, and messaging

# GroupMember Lifecycle
The `GroupMember` struct in `quicnprotochat-core` is the core MLS state machine
that manages a single client's membership in an MLS group. It wraps an openmls
`MlsGroup`, a persistent crypto backend, and the long-term Ed25519 identity
keypair. Every MLS operation -- key package generation, group creation, member
addition, joining, sending, and receiving -- flows through this struct.
**Source:** `crates/quicnprotochat-core/src/group.rs`
---
## Struct Fields
```rust
pub struct GroupMember {
    backend: StoreCrypto,           // persistent crypto backend (key store + RustCrypto)
    identity: Arc<IdentityKeypair>, // long-term Ed25519 signing keypair
    group: Option<MlsGroup>,        // active MLS group (None before create/join)
    config: MlsGroupConfig,         // shared group configuration
}
```
| Field | Type | Purpose |
|------------|-------------------------|---------|
| `backend` | `StoreCrypto` | Implements `OpenMlsCryptoProvider`. Couples a `RustCrypto` engine with a `DiskKeyStore` that holds HPKE init private keys. The backend is **persistent** -- the same instance must be used from `generate_key_package()` through `join_group()`. See [Storage Backend](storage-backend.md) for details on `DiskKeyStore`. |
| `identity` | `Arc<IdentityKeypair>` | The client's long-term Ed25519 keypair. Used as the MLS `Signer` for all group operations (signing Commits, KeyPackages, credentials). Also used to build the MLS `BasicCredential`. See [Ed25519 Identity Keys](../cryptography/identity-keys.md). |
| `group` | `Option<MlsGroup>` | `None` until the client creates or joins a group. Once set, all message operations (`send_message`, `receive_message`) operate on this group. |
| `config` | `MlsGroupConfig` | Shared configuration for all groups created by this member. Built once in the constructor. |
### MlsGroupConfig
The configuration is constructed as:
```rust
MlsGroupConfig::builder()
    .use_ratchet_tree_extension(true)
    .build()
```
Setting `use_ratchet_tree_extension = true` embeds the ratchet tree inside
Welcome messages (in the `GroupInfo` extension). This means `new_from_welcome`
can be called with `ratchet_tree = None` -- openmls extracts the tree from the
Welcome itself. This simplifies the protocol by eliminating the need for a
separate ratchet tree distribution mechanism.
---
## State Transition Diagram
```text
GroupMember::new(identity) -----> [No Group]
      |                            group = None
      |
      +-- generate_key_package() --> [Has KeyPackage, waiting for Welcome]
      |     Returns TLS-encoded       HPKE init key stored in backend
      |     KeyPackage bytes
      |
      +-- create_group(group_id) --> [Group Creator, epoch 0]
      |     group = Some(MlsGroup)    Sole member of the group
      |        |
      |        +-- add_member(kp_bytes) --> [epoch N+1]
      |              Returns (commit_bytes, welcome_bytes)
      |              Pending commit merged locally
      |              Creator ready to encrypt immediately
      |
      +-- join_group(welcome_bytes) --> [Group Member, epoch N]
            group = Some(MlsGroup)      Joined via Welcome
               |
               +-- send_message(plaintext) --> encrypted PrivateMessage bytes
               |
               +-- receive_message(bytes) --> Some(plaintext)  [ApplicationMessage]
                                              None             [Commit or Proposal]
```
### Transitions in Detail
1. **`new(identity)`** -- Creates a `GroupMember` with an ephemeral
`DiskKeyStore` and no active group. The `StoreCrypto` backend is initialized
fresh. An alternative constructor, `new_with_state`, accepts a pre-existing
`DiskKeyStore` and optional serialized `MlsGroup` for session resumption.
2. **`generate_key_package()`** -- Generates a fresh single-use MLS KeyPackage.
The HPKE init private key is stored in `self.backend`'s key store. Returns
TLS-encoded KeyPackage bytes suitable for upload to the
[Authentication Service](authentication-service.md).
3. **`create_group(group_id)`** -- Creates a new MLS group where the caller
becomes the sole member at epoch 0. The `group_id` can be any non-empty byte
string (SHA-256 of a human-readable name is recommended).
4. **`add_member(key_package_bytes)`** -- Adds a peer using their TLS-encoded
KeyPackage. Produces a Commit and a Welcome. The Commit is merged locally
(advancing the epoch), so the creator is immediately ready to encrypt. The
caller is responsible for distributing the Welcome to the new member via the
[Delivery Service](delivery-service.md).
5. **`join_group(welcome_bytes)`** -- Joins an existing group from a TLS-encoded
Welcome message. The caller must have previously called
`generate_key_package()` on **this same instance** so the HPKE init private
key is available in the backend.
6. **`send_message(plaintext)`** -- Encrypts plaintext as an MLS Application
message (PrivateMessage variant). Returns TLS-encoded bytes for delivery.
7. **`receive_message(bytes)`** -- Processes an incoming MLS message. Returns
`Some(plaintext)` for application messages, `None` for Commits (which advance
the group epoch) and Proposals (which are stored for a future Commit).
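The transitions read naturally as a two-party usage sketch. Only methods documented on this page are used, with error handling collapsed to `?` (illustrative, not a verbatim test):

```rust
let mut alice = GroupMember::new(alice_identity)?;
let mut bob = GroupMember::new(bob_identity)?;

let bob_kp = bob.generate_key_package()?;            // HPKE init key -> bob's backend
alice.create_group(b"group-1".to_vec())?;            // epoch 0, sole member
let (commit, welcome) = alice.add_member(&bob_kp)?;  // epoch 1; Welcome for Bob
bob.join_group(&welcome)?;                           // must be the SAME bob instance

let ct = alice.send_message(b"hello bob")?;
assert_eq!(bob.receive_message(&ct)?, Some(b"hello bob".to_vec()));
```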
---
## Critical Invariant: Backend Identity
The same `GroupMember` instance must be used from `generate_key_package()`
through `join_group()`. This is the most important invariant in the system.
**Why:** When `generate_key_package()` runs, openmls creates an HPKE key pair
and stores the private key in the `StoreCrypto` backend's in-memory key store
(the `DiskKeyStore`). When `join_group()` later processes the Welcome, openmls
calls `new_from_welcome`, which reads the HPKE init private key from the key
store to decrypt the Welcome's encrypted group secrets. If a different backend
instance is used, the private key will not be found, and `new_from_welcome` will
fail with a key-not-found error.
```text
generate_key_package()                   join_group(welcome)
         |                                        |
         v                                        v
KeyPackage::builder().build()            MlsGroup::new_from_welcome()
         |                                        |
         v                                        v
backend.key_store().store(               backend.key_store().read(
  init_key_ref, hpke_private_key)          init_key_ref) -> hpke_private_key
         |                                        |
         +--------- MUST BE SAME BACKEND ---------+
```
For persistent clients, the `DiskKeyStore::persistent(path)` constructor is used
so that the HPKE init keys survive process restarts. The client state file
stores the path alongside the identity seed and serialized group, and
`new_with_state` reconstructs the `GroupMember` with the persisted key store.
---
## Credential Construction
The `make_credential_with_key` helper builds the MLS `CredentialWithKey` used
for KeyPackage generation and group creation:
```rust
fn make_credential_with_key(&self) -> Result<CredentialWithKey, CoreError> {
    let credential = Credential::new(
        self.identity.public_key_bytes().to_vec(),
        CredentialType::Basic,
    )?;
    Ok(CredentialWithKey {
        credential,
        signature_key: self.identity.public_key_bytes().to_vec().into(),
    })
}
```
Key points:
- **Credential type:** `CredentialType::Basic` -- the simplest MLS credential
form, containing only the raw public key bytes.
- **Credential identity:** The raw 32-byte Ed25519 public key. This is what
peers use to identify the member within the group.
- **Signature key:** The same Ed25519 public key bytes, wrapped in the openmls
`SignaturePublicKey` type.
- **Signer:** The `IdentityKeypair` struct implements the openmls `Signer`
trait directly, so it can be passed to `KeyPackage::builder().build()` and
`MlsGroup::new_with_group_id()` without the external
`openmls_basic_credential` crate.
---
## MLS Ciphersuite
All operations use a single ciphersuite:
```text
MLS_128_DHKEMX25519_AES128GCM_SHA256_Ed25519
```
This provides:
| Component | Algorithm | Security Level |
|---------------|--------------------|----------------|
| HPKE KEM | DHKEM(X25519) | 128-bit classical |
| AEAD | AES-128-GCM | 128-bit |
| KDF / Hash | SHA-256 | 128-bit collision resistance |
| Signature | Ed25519 | 128-bit classical |
See [Cryptography Overview](../cryptography/overview.md) for the full algorithm
inventory across all protocol layers.
---
## KeyPackage Deserialization (openmls 0.5)
openmls 0.5 separates serializable and deserializable types. `KeyPackage`
derives `TlsSerialize` but not `TlsDeserialize`. To deserialize an incoming
KeyPackage:
```rust
let key_package: KeyPackage =
    KeyPackageIn::tls_deserialize(&mut bytes.as_ref())?
        .validate(backend.crypto(), ProtocolVersion::Mls10)?;
```
The `KeyPackageIn` type derives `TlsDeserialize` and provides `validate()`,
which verifies the KeyPackage's signature and returns a trusted `KeyPackage`.
Similarly, `MlsMessageIn` is used to deserialize incoming MLS messages, and its
`extract()` method returns the inner message body (`MlsMessageInBody`). The
`into_welcome()` and `into_protocol_message()` methods that existed in earlier
openmls versions are feature-gated in 0.5; `extract()` with pattern matching is
the public API:
```rust
let msg_in = MlsMessageIn::tls_deserialize(&mut bytes.as_ref())?;
match msg_in.extract() {
    MlsMessageInBody::Welcome(w) => { /* join_group path */ }
    MlsMessageInBody::PrivateMessage(m) => { /* ProtocolMessage::PrivateMessage(m) */ }
    MlsMessageInBody::PublicMessage(m) => { /* ProtocolMessage::PublicMessage(m) */ }
    _ => { /* error: unexpected message type */ }
}
```
---
## Message Processing
`receive_message` handles four variants of `ProcessedMessageContent`:
| Variant | Action | Return Value |
|----------------------------------|--------------------------------------------|--------------|
| `ApplicationMessage` | Extract plaintext bytes | `Some(plaintext)` |
| `StagedCommitMessage` | `merge_staged_commit()` -- epoch advances | `None` |
| `ProposalMessage` | `store_pending_proposal()` -- cached | `None` |
| `ExternalJoinProposalMessage` | `store_pending_proposal()` -- cached | `None` |
For Commit messages, `merge_staged_commit` advances the group's epoch and
updates the ratchet tree. Proposals are stored for inclusion in a future Commit;
this allows the group to accumulate multiple proposals before committing them as
a batch.
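A hedged sketch of that dispatch (openmls 0.5 names; the `backend` argument and the exact error plumbing are assumed or elided):

```rust
match processed_message.into_content() {
    ProcessedMessageContent::ApplicationMessage(app) => {
        Ok(Some(app.into_bytes())) // plaintext handed to the caller
    }
    ProcessedMessageContent::StagedCommitMessage(commit) => {
        group.merge_staged_commit(backend, *commit)?; // epoch advances
        Ok(None)
    }
    ProcessedMessageContent::ProposalMessage(proposal) => {
        group.store_pending_proposal(*proposal); // cached for a future Commit
        Ok(None)
    }
    ProcessedMessageContent::ExternalJoinProposalMessage(proposal) => {
        group.store_pending_proposal(*proposal);
        Ok(None)
    }
}
```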
---
## Error Handling
All `GroupMember` methods return `Result<_, CoreError>`. The MLS-specific error
variant is:
```rust
#[error("MLS error: {0}")]
Mls(String)
```
The inner string is the debug representation of the openmls error. This is a
deliberate design choice: openmls error types are complex enums with many
variants, and wrapping the debug output provides sufficient diagnostic
information without coupling `CoreError` to openmls's internal error hierarchy.
Common error scenarios:
| Operation | Failure Mode |
|------------------------|-------------------------------------------------|
| `generate_key_package` | Backend RNG failure (extremely unlikely) |
| `create_group` | Group already exists in state |
| `add_member` | Malformed KeyPackage, no active group |
| `join_group` | Welcome does not match any stored init key |
| `send_message` | No active group |
| `receive_message` | Malformed message, decryption failure, wrong epoch |
---
## Accessors
| Method | Returns | Purpose |
|-------------------|---------------------------------------|---------|
| `group_id()` | `Option<Vec<u8>>` | MLS group ID bytes, or `None` if no group is active |
| `identity()` | `&IdentityKeypair` | Reference to the long-term Ed25519 keypair |
| `identity_seed()` | `[u8; 32]` | Private seed bytes for state persistence |
| `backend()` | `&StoreCrypto` | Reference to the crypto provider |
| `group_ref()` | `Option<&MlsGroup>` | Reference to the MLS group for serialization |
---
## Unit Tests
The `two_party_mls_round_trip` test exercises the complete lifecycle:
1. Alice and Bob each create a `GroupMember` with fresh identities.
2. Bob generates a KeyPackage (stored in his backend).
3. Alice creates a group and adds Bob using his KeyPackage.
4. Bob joins via the Welcome message.
5. Alice sends "hello bob" -- Bob decrypts and verifies.
6. Bob sends "hello alice" -- Alice decrypts and verifies.
This test runs entirely in-memory (no server) and validates that the HPKE init
key invariant is maintained when the same `GroupMember` instance is used
throughout.
---
## Related Pages
- [KeyPackage Exchange Flow](keypackage-exchange.md) -- upload and fetch of KeyPackages via the server
- [Delivery Service Internals](delivery-service.md) -- how Commits and Welcomes are relayed
- [Authentication Service Internals](authentication-service.md) -- server-side KeyPackage storage
- [Storage Backend](storage-backend.md) -- `DiskKeyStore` and `FileBackedStore` persistence
- [Cryptography Overview](../cryptography/overview.md) -- algorithm inventory
- [Ed25519 Identity Keys](../cryptography/identity-keys.md) -- the `IdentityKeypair` struct

# KeyPackage Exchange Flow
MLS KeyPackages are single-use tokens that enable a group creator to add a new
member. The KeyPackage contains the member's HPKE init public key, their MLS
credential (Ed25519 public key), and a signature proving ownership. The
quicnprotochat Authentication Service (AS) provides a simple upload/fetch
interface for distributing KeyPackages between clients.
This page describes the end-to-end flow: from client-side generation through
server-side storage to peer-side retrieval and consumption.
**Sources:**
- `crates/quicnprotochat-core/src/group.rs` (client-side generation)
- `crates/quicnprotochat-server/src/main.rs` (server-side handlers)
- `crates/quicnprotochat-server/src/storage.rs` (server-side persistence)
- `crates/quicnprotochat-client/src/lib.rs` (client-side RPC calls)
- `schemas/node.capnp` (wire schema)
---
## Upload Flow
The upload flow moves a freshly generated KeyPackage from a client to the
server, where it is stored for later retrieval by a peer.
```text
Client                                          Server (AS)
  |                                               |
  | 1. GroupMember::generate_key_package()        |
  |    -> TLS-encoded KeyPackage bytes            |
  |    -> HPKE init key stored in backend         |
  |                                               |
  | 2. uploadKeyPackage RPC                       |
  |    identityKey = Ed25519 pub key (32 B)       |
  |    package     = TLS-encoded bytes            |
  |    auth        = Auth struct                  |
  | --------------------------------------------> |
  |                                               | 3. Validate inputs:
  |                                               |    - identityKey == 32 bytes
  |                                               |    - package non-empty
  |                                               |    - package <= 1 MB
  |                                               |    - auth version valid
  |                                               |
  |                                               | 4. Compute SHA-256(package)
  |                                               |
  |                                               | 5. Store: push_back to
  |                                               |    Mutex<HashMap<Vec<u8>, VecDeque<Vec<u8>>>>
  |                                               |    keyed by identity_key
  |                                               |
  | 6. Response: fingerprint (SHA-256 hash)       |
  | <-------------------------------------------- |
  |                                               |
  | 7. Verify: local SHA-256 == server SHA-256    |
  |                                               |
```
### Step-by-Step
1. **Client generates KeyPackage.** The client calls
`GroupMember::generate_key_package()`, which internally:
- Builds an MLS `CredentialWithKey` from the Ed25519 public key
(`CredentialType::Basic`).
- Calls `KeyPackage::builder().build()` with the ciphersuite
`MLS_128_DHKEMX25519_AES128GCM_SHA256_Ed25519`, the `StoreCrypto` backend,
and the `IdentityKeypair` as the signer.
- openmls generates an ephemeral HPKE key pair (X25519) and stores the
private key in the backend's `DiskKeyStore`.
- Returns the TLS-serialized KeyPackage bytes.
See [GroupMember Lifecycle](group-member-lifecycle.md) for the critical
invariant about backend identity.
2. **Client sends `uploadKeyPackage` RPC.** The request includes:
- `identityKey`: The raw 32-byte Ed25519 public key.
- `package`: The TLS-encoded KeyPackage bytes.
- `auth`: An [Auth struct](../wire-format/auth-schema.md) with version and
optional access token.
3. **Server validates inputs.** The server checks:
- `identityKey` is exactly 32 bytes (Ed25519 public key size).
- `package` is non-empty.
- `package` does not exceed `MAX_KEYPACKAGE_BYTES` (1 MB).
- The `Auth` struct version is acceptable (0 for legacy, 1 for token-based).
4. **Server computes fingerprint.** `SHA-256(package_bytes)` produces a 32-byte
digest used as a tamper-detection fingerprint.
5. **Server stores the KeyPackage.** The package bytes are pushed to the back of
a `VecDeque<Vec<u8>>` keyed by the identity key in the server's
`FileBackedStore`. This allows multiple KeyPackages per identity (clients
should upload several to handle concurrent invitations). The store flushes to
disk after every mutation.
6. **Server returns the fingerprint.** The SHA-256 digest is sent back in the
response's `fingerprint` field.
7. **Client verifies the fingerprint.** The client computes its own
`SHA-256(package_bytes)` and compares it to the server-returned value. A
mismatch indicates tampering (the server or a MITM modified the package in
transit) and the client aborts with a `fingerprint mismatch` error.
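The client-side check in step 7 is a one-liner over `sha2`. A hedged sketch (`server_fingerprint` stands for the RPC response field; surrounding plumbing elided):

```rust
use sha2::{Digest, Sha256};

let local_fp: [u8; 32] = Sha256::digest(&package_bytes).into();
if server_fingerprint != local_fp.as_slice() {
    // The server stored different bytes than we sent: abort.
    return Err("fingerprint mismatch".into());
}
```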
---
## Fetch Flow
The fetch flow allows a peer to retrieve a stored KeyPackage for a target
identity, consuming it in the process (single-use per RFC 9420).
```text
Peer                                            Server (AS)
  |                                               |
  | 1. fetchKeyPackage RPC                        |
  |    identityKey = target's Ed25519 pub key     |
  |    auth        = Auth struct                  |
  | --------------------------------------------> |
  |                                               | 2. Validate inputs:
  |                                               |    - identityKey == 32 bytes
  |                                               |    - auth version valid
  |                                               |
  |                                               | 3. Pop front of VecDeque
  |                                               |    (FIFO, single-use)
  |                                               |    Flush updated map to disk
  |                                               |
  | 4. Response: package bytes (or empty)         |
  | <-------------------------------------------- |
  |                                               |
  | 5. If non-empty:                              |
  |    KeyPackageIn::tls_deserialize()            |
  |      .validate(crypto, MLS10)                 |
  |    -> trusted KeyPackage for add_member()     |
  |                                               |
```
### Step-by-Step
1. **Peer sends `fetchKeyPackage` RPC.** The request includes the target's
Ed25519 public key (32 bytes) and an Auth context.
2. **Server validates inputs.** Same identity key length check as upload (32
bytes).
3. **Server pops from the front of the queue.** `VecDeque::pop_front()` returns
the oldest uploaded KeyPackage. This enforces FIFO ordering and **single-use
semantics**: once fetched, the KeyPackage is permanently removed from the
server. This is a hard requirement of the MLS specification -- reusing a
KeyPackage would allow an attacker to create conflicting group states.
The store is flushed to disk after the pop, ensuring the removal survives
server restarts.
4. **Server returns the package bytes.** If the queue was empty (no KeyPackages
available), the response contains an empty `Data` field. The client checks
for emptiness to distinguish "no packages available" from "package fetched."
5. **Peer deserializes and validates.** The peer uses `KeyPackageIn::tls_deserialize()`
followed by `.validate(crypto, ProtocolVersion::Mls10)` to verify the
KeyPackage signature. The validated `KeyPackage` can then be passed to
`GroupMember::add_member()`.
---
## Fingerprint Verification
The fingerprint mechanism provides a simple tamper-detection check:
```text
Client                       Server                       Client
SHA-256(pkg) ----------->  store pkg  ----------------->  SHA-256(pkg)
     |                     SHA-256(pkg) --------------->      |
     |                                                        |
     +------------ compare: local_fp == server_fp ------------+
```
**What it detects:**
- A malicious server replacing the package bytes.
- A network-layer MITM modifying the package in transit (though QUIC/TLS
already prevents this).
**What it does NOT detect:**
- A malicious server that simply returns the correct fingerprint for a package
it has replaced (since the server computes the hash itself). True
KeyPackage authenticity requires verifying the Ed25519 signature inside
the KeyPackage, which openmls does during `validate()`.
The fingerprint is best understood as a transport-level integrity check, not a
cryptographic proof of authenticity. The real authenticity guarantee comes from
the MLS KeyPackage signature verified on the receiving side.
---
## Storage Model
On the server, KeyPackages are stored in a `FileBackedStore`:
```text
FileBackedStore
  +-- key_packages: Mutex<HashMap<Vec<u8>, VecDeque<Vec<u8>>>>
  |                           ^                  ^
  |                           |                  |
  |                     identity_key      queue of TLS-encoded
  |                     (32 bytes)        KeyPackage bytes
  |
  +-- Persisted to: data/keypackages.bin (bincode serialized)
```
Each identity key maps to a FIFO queue of KeyPackage bytes. A client should
upload multiple KeyPackages so that peers can concurrently fetch them without
contention. If the queue is exhausted, fetches return empty until the client
uploads more.
The storage format uses the `QueueMapV1` wrapper for bincode serialization:
```rust
#[derive(Serialize, Deserialize, Default)]
struct QueueMapV1 {
    map: HashMap<Vec<u8>, VecDeque<Vec<u8>>>,
}
```
See [Storage Backend](storage-backend.md) for details on persistence,
flush-on-write semantics, and the V1/V2 delivery map migration.
---
## Input Validation Summary
| Field | Constraint | Error on Violation |
|----------------|---------------------------|--------------------|
| `identityKey` | Exactly 32 bytes | `"identityKey must be exactly 32 bytes, got {n}"` |
| `package` | Non-empty | `"package must not be empty"` |
| `package` | At most 1 MB (1,048,576) | `"package exceeds max size (1048576 bytes)"` |
| `auth.version` | 0 (legacy) or 1 (current) | `"unsupported auth version {v}"` |
| `auth.token` | Non-empty when version=1 | `"auth.version=1 requires non-empty accessToken"` |
---
## Wire Schema
From `schemas/node.capnp`:
```capnp
uploadKeyPackage @0 (identityKey :Data, package :Data, auth :Auth)
    -> (fingerprint :Data);

fetchKeyPackage @1 (identityKey :Data, auth :Auth)
    -> (package :Data);
```
The `Auth` struct is shared across all RPC methods:
```capnp
struct Auth {
  version     @0 :UInt16;  # 0 = legacy/none, 1 = token-based
  accessToken @1 :Data;    # opaque bearer token
  deviceId    @2 :Data;    # optional UUID for auditing
}
```
See [NodeService Schema](../wire-format/node-service-schema.md) for the
complete schema reference.
---
## Client-Side Usage
The CLI exposes two commands for KeyPackage exchange:
### `register` / `register-state`
Generates a fresh KeyPackage and uploads it. `register` uses an ephemeral
identity; `register-state` loads from (or initializes) a persistent state file.
```bash
# Ephemeral registration (for testing)
quicnprotochat register --server 127.0.0.1:7000
# Persistent registration (production)
quicnprotochat register-state --state alice.bin --server 127.0.0.1:7000
```
Output:
```
identity_key : 7a3f... (64 hex chars, 32 bytes)
fingerprint : 9e1c... (SHA-256 of KeyPackage)
KeyPackage uploaded successfully.
```
### `fetch-key`
Fetches a peer's KeyPackage by their hex-encoded Ed25519 public key:
```bash
quicnprotochat fetch-key --server 127.0.0.1:7000 7a3f...
```
---
## Security Considerations
1. **Single-use enforcement.** The server's `pop_front()` semantics ensure each
KeyPackage is consumed exactly once, satisfying RFC 9420's requirement.
However, a malicious server could duplicate KeyPackages before deletion. True
single-use is enforced at the MLS protocol level: duplicate KeyPackage usage
would be detected when processing the Welcome (mismatched group state).
2. **No authentication on fetch.** Currently, anyone can fetch any identity's
KeyPackage. This is intentional for the MVP but means an attacker could
exhaust a victim's KeyPackage supply. The
[Auth, Devices, and Tokens](../roadmap/authz-plan.md) plan addresses this
with token-based access control.
3. **HPKE init key lifetime.** The HPKE init private key lives in the
`DiskKeyStore` from generation until the Welcome is processed. For persistent
clients using `DiskKeyStore::persistent()`, this key survives process
restarts. For ephemeral clients, the key exists only in memory and is lost if
the process exits before `join_group()` is called.
---
## Related Pages
- [GroupMember Lifecycle](group-member-lifecycle.md) -- the MLS state machine that generates and consumes KeyPackages
- [Authentication Service Internals](authentication-service.md) -- server-side KeyPackage handling
- [Delivery Service Internals](delivery-service.md) -- how the Welcome message is relayed after `add_member()`
- [Storage Backend](storage-backend.md) -- `FileBackedStore` persistence model
- [NodeService Schema](../wire-format/node-service-schema.md) -- Cap'n Proto schema reference

# Storage Backend
quicnprotochat uses two storage backends: `FileBackedStore` on the server side
for KeyPackages and delivery queues, and `DiskKeyStore` on the client side for
MLS cryptographic key material. Both follow the same pattern: in-memory data
structures backed by optional file persistence, with full serialization on every
write.
**Sources:**
- `crates/quicnprotochat-server/src/storage.rs` (FileBackedStore)
- `crates/quicnprotochat-core/src/keystore.rs` (DiskKeyStore, StoreCrypto)
---
## FileBackedStore (Server-Side)
`FileBackedStore` provides persistent storage for the server's three data
domains: KeyPackages, delivery queues, and hybrid public keys.
### Structure
```rust
pub struct FileBackedStore {
    kp_path: PathBuf, // keypackages.bin
    ds_path: PathBuf, // deliveries.bin
    hk_path: PathBuf, // hybridkeys.bin
    key_packages: Mutex<HashMap<Vec<u8>, VecDeque<Vec<u8>>>>,  // identity -> KP queue
    deliveries: Mutex<HashMap<ChannelKey, VecDeque<Vec<u8>>>>, // (channel, recipient) -> msg queue
    hybrid_keys: Mutex<HashMap<Vec<u8>, Vec<u8>>>,             // identity -> hybrid PK
}
```
Each domain has its own `Mutex`-protected in-memory map and its own disk file.
The `Mutex` (not `RwLock`) is used because every read-path operation that
modifies state (e.g., `pop_front` in `fetch_key_package`) requires exclusive
access.
### Initialization
```rust
FileBackedStore::open(dir: impl AsRef<Path>) -> Result<Self, StorageError>
```
1. Creates the directory if it does not exist.
2. Loads each map from its respective file, or initializes an empty map if the
file is missing.
3. Returns the initialized store.
File paths:
- `{dir}/keypackages.bin` -- KeyPackage queues
- `{dir}/deliveries.bin` -- Delivery queues
- `{dir}/hybridkeys.bin` -- Hybrid public keys
The default data directory is `data/`, configurable via `--data-dir` /
`QUICNPROTOCHAT_DATA_DIR`.
### Flush-on-Every-Write
Every mutation serializes the entire in-memory map to disk:
```text
upload_key_package(identity_key, package)
  |
  +-- lock key_packages Mutex
  |
  +-- map.entry(identity_key).or_default().push_back(package)
  |
  +-- flush_kp_map(path, &map)
  |     +-- QueueMapV1 { map: map.clone() }
  |     +-- bincode::serialize(&payload)
  |     +-- fs::write(path, bytes)
  |
  +-- unlock Mutex
```
This approach is deliberately simple and correct:
- **Crash safety:** Every successful RPC response guarantees the data has been
written to the filesystem.
- **No partial in-memory states:** The entire map is serialized in one shot,
  though the file write itself is not atomic (no temp file plus rename -- an MVP
  trade-off).
- **Performance:** Not suitable for production scale. Every write serializes and
writes the full map, which is O(n) in the total number of stored entries.
**Production improvement path:** Replace with a proper database (SQLite, sled,
or similar) for incremental writes, WAL-based crash safety, and concurrent
access without full serialization.
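The flush helper itself is small. A hedged sketch using the `QueueMapV1` and `StorageError` types shown on this page (the helper name follows the diagram above; exact code assumed):

```rust
use std::collections::{HashMap, VecDeque};
use std::fs;
use std::path::Path;

fn flush_kp_map(
    path: &Path,
    map: &HashMap<Vec<u8>, VecDeque<Vec<u8>>>,
) -> Result<(), StorageError> {
    let payload = QueueMapV1 { map: map.clone() }; // wrapper struct
    let bytes = bincode::serialize(&payload)
        .map_err(|_| StorageError::Serde)?;        // full-map serialization
    fs::write(path, bytes).map_err(|e| StorageError::Io(e.to_string()))
}
```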
### KeyPackage Operations
| Method | Behavior |
|--------|----------|
| `upload_key_package(identity_key, package)` | Push to back of VecDeque; flush |
| `fetch_key_package(identity_key)` | Pop from front (FIFO, single-use); flush |
The KeyPackage map uses the `QueueMapV1` serialization wrapper:
```rust
#[derive(Serialize, Deserialize, Default)]
struct QueueMapV1 {
    map: HashMap<Vec<u8>, VecDeque<Vec<u8>>>,
}
```
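A hedged sketch of the single-use pop described in the table (method shape assumed; `flush_kp_map` as sketched earlier):

```rust
pub fn fetch_key_package(&self, identity_key: &[u8]) -> Result<Option<Vec<u8>>, StorageError> {
    let mut map = self.key_packages.lock().unwrap();
    let package = map.get_mut(identity_key).and_then(|q| q.pop_front()); // FIFO, single-use
    if package.is_some() {
        flush_kp_map(&self.kp_path, &map)?; // persist the removal before returning
    }
    Ok(package)
}
```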
### Delivery Queue Operations
| Method | Behavior |
|--------|----------|
| `enqueue(recipient_key, channel_id, payload)` | Construct ChannelKey; push to back; flush |
| `fetch(recipient_key, channel_id)` | Construct ChannelKey; drain entire VecDeque; flush |
The delivery map uses `QueueMapV2` with the compound `ChannelKey`:
```rust
#[derive(Serialize, Deserialize, Clone, Eq, PartialEq, Debug)]
pub struct ChannelKey {
    pub channel_id: Vec<u8>,
    pub recipient_key: Vec<u8>,
}

#[derive(Serialize, Deserialize, Default)]
struct QueueMapV2 {
    map: HashMap<ChannelKey, VecDeque<Vec<u8>>>,
}
```
See [Delivery Service Internals](delivery-service.md) for the full queue model
and channel-aware routing semantics.
### V1/V2 Delivery Map Migration
The delivery map format evolved from V1 (keyed by recipient key only) to V2
(keyed by `ChannelKey` with channel ID + recipient key). The load function
handles both formats transparently:
```rust
fn load_delivery_map(path: &Path) -> Result<HashMap<ChannelKey, VecDeque<Vec<u8>>>> {
    let bytes = fs::read(path)?;
    // Try V2 format first (channel-aware).
    if let Ok(map) = bincode::deserialize::<QueueMapV2>(&bytes) {
        return Ok(map.map);
    }
    // Fall back to the legacy V1 format: migrate by setting channel_id = empty.
    let legacy: QueueMapV1 = bincode::deserialize(&bytes)?;
    let mut upgraded = HashMap::new();
    for (recipient_key, queue) in legacy.map.into_iter() {
        upgraded.insert(
            ChannelKey { channel_id: Vec::new(), recipient_key },
            queue,
        );
    }
    Ok(upgraded)
}
```
Migration strategy:
1. Attempt to deserialize as V2 (`QueueMapV2`). If successful, use as-is.
2. If V2 fails, deserialize as V1 (`QueueMapV1`). Migrate each entry by
wrapping the recipient key in a `ChannelKey` with an empty `channel_id`.
3. The next flush will write V2 format, completing the migration.
This in-place migration is transparent to clients. Legacy messages (pre-channel
routing) appear under the empty channel ID and can still be fetched by clients
that pass an empty `channelId`.
### Hybrid Key Operations
| Method | Behavior |
|--------|----------|
| `upload_hybrid_key(identity_key, hybrid_pk)` | Insert (overwrite); flush |
| `fetch_hybrid_key(identity_key)` | Read-only lookup; no flush needed |
The hybrid key map is a flat `HashMap<Vec<u8>, Vec<u8>>` serialized directly
with bincode. Unlike KeyPackages, hybrid keys are not single-use -- they persist
until overwritten.
### Error Type
```rust
#[derive(thiserror::Error, Debug)]
pub enum StorageError {
#[error("io error: {0}")]
Io(String),
#[error("serialization error")]
Serde,
}
```
I/O errors (disk full, permission denied) and serialization errors (corrupt
file) are the two failure modes. The server converts `StorageError` to
`capnp::Error` via the `storage_err` helper for RPC responses.
---
## DiskKeyStore (Client-Side)
`DiskKeyStore` is the client-side key store that implements the openmls
`OpenMlsKeyStore` trait. It holds MLS cryptographic key material -- most
importantly, the HPKE init private keys created during KeyPackage generation.
### Structure
```rust
pub struct DiskKeyStore {
    path: Option<PathBuf>,                     // None = ephemeral (in-memory only)
    values: RwLock<HashMap<Vec<u8>, Vec<u8>>>, // key reference -> serialized MLS entity
}
```
The `RwLock` (not `Mutex`) allows concurrent reads. Write operations (store,
delete) take an exclusive lock and flush to disk.
### Modes
| Mode | Constructor | Persistence |
|------|-------------|-------------|
| Ephemeral | `DiskKeyStore::ephemeral()` | None. Data exists only in memory. Lost on process exit. |
| Persistent | `DiskKeyStore::persistent(path)` | Yes. Every write flushes the full map to disk. Survives process restarts. |
**Ephemeral mode** is used for tests and the `register` / `demo-group` CLI
commands where session resumption is not needed.
**Persistent mode** is used for production clients (`register-state`, `invite`,
`join`, `send`, `recv` commands). The key store file path is derived from the
state file path by changing the extension to `.ks`:
```rust
fn keystore_path(state_path: &Path) -> PathBuf {
    let mut path = state_path.to_path_buf();
    path.set_extension("ks");
    path
}
```
So `quicnprotochat-state.bin` produces a key store at `quicnprotochat-state.ks`.
### Persistence Format
The key store is serialized as a bincode-encoded `HashMap<Vec<u8>, Vec<u8>>`.
Individual values are serialized using `serde_json` (as required by openmls's
`MlsEntity` trait bound):
```rust
fn store<V: MlsEntity>(&self, k: &[u8], v: &V) -> Result<(), Self::Error> {
    let value = serde_json::to_vec(v)?; // MlsEntity -> JSON bytes
    let mut values = self.values.write().unwrap();
    values.insert(k.to_vec(), value);
    drop(values); // release the lock before I/O
    self.flush()  // bincode-serialize the full map to disk
}
```
The two-layer serialization (JSON for values, bincode for the map) is a
consequence of openmls requiring `serde_json`-compatible serialization for MLS
entities, while the outer map uses bincode for compactness.
### OpenMlsKeyStore Implementation
| Trait Method | DiskKeyStore Behavior |
|--------------|-----------------------|
| `store(k, v)` | JSON-serialize value, insert into HashMap, flush to disk |
| `read(k)` | Look up key, JSON-deserialize value, return `Option<V>` |
| `delete(k)` | Remove from HashMap, flush to disk |
The `read` method does not flush because it does not modify the map. A failed
deserialization (corrupt value) returns `None` rather than an error, which
matches the openmls `OpenMlsKeyStore` trait signature.
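A hedged sketch of `read` matching that contract (signature per the openmls trait; the body is assumed):

```rust
fn read<V: MlsEntity>(&self, k: &[u8]) -> Option<V> {
    let values = self.values.read().unwrap();
    let bytes = values.get(k)?;
    serde_json::from_slice(bytes).ok() // corrupt value -> None, per the trait contract
}
```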
### Flush Behavior
```rust
fn flush(&self) -> Result<(), DiskKeyStoreError> {
    let Some(path) = &self.path else {
        return Ok(()); // ephemeral: no-op
    };
    let values = self.values.read().unwrap();
    let bytes = bincode::serialize(&*values)?;
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?; // ensure the parent directory exists
    }
    fs::write(path, bytes)?;
    Ok(())
}
```
Like `FileBackedStore`, the flush serializes the entire map on every write.
For client-side usage, the map is typically small (a handful of HPKE keys), so
this is not a performance concern.
### Error Type
```rust
#[derive(thiserror::Error, Debug, PartialEq, Eq)]
pub enum DiskKeyStoreError {
#[error("serialization error")]
Serialization,
#[error("io error: {0}")]
Io(String),
}
```
---
## StoreCrypto
`StoreCrypto` is a composite type that bundles a `DiskKeyStore` with the
`RustCrypto` provider from `openmls_rust_crypto`. It implements the openmls
`OpenMlsCryptoProvider` trait, which is the single entry point that openmls
uses for all cryptographic operations:
```rust
pub struct StoreCrypto {
    crypto: RustCrypto,      // AES-GCM, SHA-256, X25519, Ed25519, etc.
    key_store: DiskKeyStore, // HPKE init keys, MLS epoch secrets, etc.
}

impl OpenMlsCryptoProvider for StoreCrypto {
    type CryptoProvider = RustCrypto;
    type RandProvider = RustCrypto;
    type KeyStoreProvider = DiskKeyStore;

    fn crypto(&self) -> &RustCrypto { &self.crypto }
    fn rand(&self) -> &RustCrypto { &self.crypto }
    fn key_store(&self) -> &DiskKeyStore { &self.key_store }
}
```
`StoreCrypto` is the `backend` field of [`GroupMember`](group-member-lifecycle.md).
It is passed to every openmls operation -- `KeyPackage::builder().build()`,
`MlsGroup::new_with_group_id()`, `MlsGroup::new_from_welcome()`,
`create_message()`, `process_message()`, etc.
The critical property is that the **same `StoreCrypto` instance** (and therefore
the same `DiskKeyStore`) must be used from `generate_key_package()` through
`join_group()`, because the HPKE init private key is stored in and read from
this key store.
---
## Storage Architecture Summary
```text
Server                                     Client
======                                     ======
FileBackedStore                            DiskKeyStore
  +-- key_packages (Mutex<HashMap>)          +-- values (RwLock<HashMap>)
  |     Persisted: keypackages.bin           |     Persisted: {state}.ks
  |     Format: bincode(QueueMapV1)          |     Format: bincode(HashMap)
  |                                          |     Values: serde_json(MlsEntity)
  +-- deliveries (Mutex<HashMap>)            |
  |     Persisted: deliveries.bin            +-- Wrapped by StoreCrypto
  |     Format: bincode(QueueMapV2)          |     implements OpenMlsCryptoProvider
  |     Migration: V1 -> V2 on load          |
  |                                          +-- Used by GroupMember.backend
  +-- hybrid_keys (Mutex<HashMap>)
        Persisted: hybridkeys.bin
        Format: bincode(HashMap)
```
### Shared Design Patterns
Both backends share these characteristics:
1. **Full-map serialization.** Every write serializes the entire map to disk.
Simple, correct, but O(n) per write.
2. **Bincode format.** The outer map is always bincode-serialized. Compact and
fast, but not human-readable and not forward-compatible without wrapper
structs.
3. **No WAL / journaling.** A crash during `fs::write` could leave a corrupt
file. For the MVP, this is acceptable -- the data can be regenerated (clients
re-upload KeyPackages; delivery messages are ephemeral).
4. **No compaction.** Empty queues are not removed from the map. Over time, the
serialized size can grow with stale entries. A production implementation
should periodically compact empty entries.
5. **Directory creation.** Both backends call `fs::create_dir_all` before
writing, ensuring parent directories exist.
---
## Related Pages
- [GroupMember Lifecycle](group-member-lifecycle.md) -- how `StoreCrypto` and `DiskKeyStore` are used during MLS operations
- [KeyPackage Exchange Flow](keypackage-exchange.md) -- upload and fetch through `FileBackedStore`
- [Delivery Service Internals](delivery-service.md) -- delivery queue operations
- [Authentication Service Internals](authentication-service.md) -- KeyPackage and hybrid key storage
- [Key Lifecycle and Zeroization](../cryptography/key-lifecycle.md) -- how HPKE keys are created and destroyed