feat: add post-quantum hybrid KEM + SQLCipher persistence

Feature 1 - Post-Quantum Hybrid KEM (X25519 + ML-KEM-768):
- Create hybrid_kem.rs with keygen, encrypt, decrypt + 11 unit tests
- Wire format: version(1) | x25519_eph_pk(32) | mlkem_ct(1088) | nonce(12) | ct
- Add uploadHybridKey/fetchHybridKey RPCs to node.capnp schema
- Server: hybrid key storage in FileBackedStore + RPC handlers
- Client: hybrid keypair in StoredState, auto-wrap/unwrap in send/recv/invite/join
- demo-group runs full hybrid PQ envelope round-trip

Feature 2 - SQLCipher Persistence:
- Extract Store trait from FileBackedStore API
- Create SqlStore (rusqlite + bundled-sqlcipher) with encrypted-at-rest SQLite
- Schema: key_packages, deliveries, hybrid_keys tables with indexes
- Server CLI: --store-backend=sql, --db-path, --db-key flags
- 5 unit tests for SqlStore (FIFO, round-trip, upsert, channel isolation)

Also includes: client lib.rs refactor, auth config, TOML config file support,
mdBook documentation, and various cleanups by user.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
docs/src/design-rationale/adr-004-mls-unaware-ds.md
@@ -0,0 +1,124 @@
# ADR-004: MLS-Unaware Delivery Service

**Status:** Accepted

---

## Context
The Delivery Service (DS) is the server-side component that stores and forwards messages between clients. A fundamental design question is: **should the DS understand MLS messages?**

An MLS-aware DS could inspect message types and perform optimizations:

- **Fan-out:** When a client sends a Commit or Application message intended for all group members, an MLS-aware DS could parse the group membership and deliver the message to all members automatically, instead of requiring the client to enqueue separately for each recipient.
- **Membership validation:** An MLS-aware DS could verify that a sender is actually a member of the group before accepting a message, preventing spam from non-members.
- **Epoch filtering:** An MLS-aware DS could reject messages from stale epochs, reducing the processing burden on recipients.
- **Tree optimization:** An MLS-aware DS could cache the ratchet tree and assist with tree synchronization.

However, an MLS-aware DS would also:

- Have access to MLS message metadata (group IDs, epoch numbers, sender positions in the tree).
- Require an MLS library dependency on the server.
- Be more complex to implement, test, and audit.
- Potentially violate the MLS architecture's trust model.
### What RFC 9420 says

RFC 9420 Section 4 defines the DS as a component that:

> "is responsible for ordering handshake messages and delivering them to each client."

Critically, the RFC specifies that the DS **does not have access to group keys** and treats message content as opaque. The DS's role is limited to:

1. Ordering: ensuring that handshake messages (Commits) are applied in a consistent order across all group members.
2. Delivery: routing messages to the correct recipients.
3. Optional: enforcing access control (e.g., only group members can send to the group).

The RFC explicitly envisions that the DS operates on opaque blobs, not on decrypted MLS content.

---

## Decision
The quicnprotochat Delivery Service is **MLS-unaware**. It routes opaque byte strings by `(recipientKey, channelId)` without parsing, inspecting, or validating any MLS content.

### What the DS sees

```text
DS perspective:
enqueue(recipientKey=0x1234..., payload=<opaque bytes>, channelId=<uuid>, version=1)
fetch(recipientKey=0x1234..., channelId=<uuid>, version=1) -> [<opaque bytes>, ...]

DS does NOT see:
- Whether the payload is a Welcome, Commit, or Application message
- The MLS group ID or epoch number
- The sender's position in the ratchet tree
- Any plaintext content
```
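
The queue model above can be sketched in a few lines of Rust. This is a minimal in-memory illustration, not the project's actual storage code: the `OpaqueDs` type, the byte-vector key types, and the choice that `fetch` drains the queue are all assumptions made for the example.

```rust
use std::collections::{HashMap, VecDeque};

/// Routing key as the DS sees it: (recipient_key, channel_id).
/// Both parts are opaque bytes; the tuple layout is an assumption of this sketch.
type QueueKey = (Vec<u8>, Vec<u8>);

/// Hypothetical MLS-unaware delivery service: a map of FIFO queues.
#[derive(Default)]
pub struct OpaqueDs {
    queues: HashMap<QueueKey, VecDeque<Vec<u8>>>,
}

impl OpaqueDs {
    /// Append a payload to the recipient's queue without inspecting it.
    pub fn enqueue(&mut self, recipient_key: &[u8], channel_id: &[u8], payload: Vec<u8>) {
        self.queues
            .entry((recipient_key.to_vec(), channel_id.to_vec()))
            .or_default()
            .push_back(payload);
    }

    /// Remove and return everything queued for this recipient and channel,
    /// oldest first. Draining on fetch is an assumption of this sketch.
    pub fn fetch(&mut self, recipient_key: &[u8], channel_id: &[u8]) -> Vec<Vec<u8>> {
        self.queues
            .remove(&(recipient_key.to_vec(), channel_id.to_vec()))
            .map(|queue| queue.into_iter().collect())
            .unwrap_or_default()
    }
}
```

Nothing in this sketch depends on the payload format, which is the point of the decision: the same structure would carry Welcome, Commit, and Application messages without telling them apart.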
### Routing responsibility

Because the DS does not parse MLS messages, the **client** is responsible for routing:

| MLS Operation | Client's Routing Responsibility |
|---|---|
| `add_members()` | Enqueue the Welcome message to the new member's `recipientKey`. Enqueue the Commit to each existing member's `recipientKey`. |
| `remove_members()` | Enqueue the Commit to each remaining member's `recipientKey`. |
| `create_message()` | Enqueue the Application message to each group member's `recipientKey`. |
| `self_update()` | Enqueue the Commit to each other member's `recipientKey`. |

This means that sending a message to a group of n members requires n-1 enqueue calls (one per recipient, excluding the sender). The client must maintain its own copy of the group membership list.
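
The client-side fan-out described above might look roughly like the following. This is a sketch under stated assumptions: `fan_out` is a hypothetical helper, the membership list is a plain slice of recipient keys, and the `enqueue` closure stands in for the real RPC call, whose signature is not shown in this document. The function returns the number of calls made, which is n-1 for a group of n that includes the sender.

```rust
/// Enqueue `payload` once per group member other than the sender.
/// `enqueue` is a stand-in for the real RPC; its shape is an assumption.
/// Returns how many enqueue calls were made.
pub fn fan_out<F>(members: &[Vec<u8>], sender: &[u8], payload: &[u8], mut enqueue: F) -> usize
where
    F: FnMut(&[u8], &[u8]),
{
    let mut calls = 0;
    // Skip the sender's own key; every other member gets one call.
    for member in members.iter().filter(|m| m.as_slice() != sender) {
        enqueue(member.as_slice(), payload);
        calls += 1;
    }
    calls
}
```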

---

## Consequences

### Benefits
- **Correct MLS architecture.** The DS does not hold group keys or inspect group state, which is the architecture recommended by RFC 9420 Section 4. A compromised DS learns nothing about message content or group structure beyond the routing metadata (recipient keys and channel IDs).

- **Audit-friendly.** The DS's audit log is a simple append-only sequence of `(timestamp, recipientKey, channelId, payload_hash)` entries. There is no complex state machine to audit. The server's behavior is trivially verifiable: it accepts blobs and returns them in FIFO order.

- **No MLS dependency on the server.** The server does not depend on `openmls` or any MLS library. This reduces the server's attack surface, compile time, and binary size. It also means the server is completely decoupled from MLS version upgrades.

- **Simplicity.** The DS is a hash map of FIFO queues. The entire implementation fits in a few hundred lines of Rust. There are no edge cases around epoch transitions, tree synchronization, or membership conflicts.

- **Protocol agnosticism.** The DS can carry any payload, not just MLS messages. Future protocol extensions (e.g., signaling for voice/video, file transfer metadata) can reuse the same delivery infrastructure without modification.
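
To make the audit-log shape from the list above concrete, here is a hedged sketch of an append-only log of `(timestamp, recipientKey, channelId, payload_hash)` entries. The `AuditEntry` and `AuditLog` types and their field types are hypothetical. The digest uses the standard library's `DefaultHasher` purely as a placeholder so the example has no external dependencies; it is not cryptographic, and a real log would need a collision-resistant hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One audit record. Field names follow the tuple in the text; types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct AuditEntry {
    pub timestamp: u64,
    pub recipient_key: Vec<u8>,
    pub channel_id: Vec<u8>,
    /// Placeholder digest: DefaultHasher is NOT cryptographic.
    pub payload_hash: u64,
}

/// Hypothetical append-only log: entries can be added and read, never edited.
#[derive(Default)]
pub struct AuditLog {
    entries: Vec<AuditEntry>,
}

impl AuditLog {
    /// Record one enqueue. Only a digest of the payload is stored, never the payload.
    pub fn record(&mut self, timestamp: u64, recipient_key: &[u8], channel_id: &[u8], payload: &[u8]) {
        let mut hasher = DefaultHasher::new();
        payload.hash(&mut hasher);
        self.entries.push(AuditEntry {
            timestamp,
            recipient_key: recipient_key.to_vec(),
            channel_id: channel_id.to_vec(),
            payload_hash: hasher.finish(),
        });
    }

    /// Read-only view of the log in insertion order.
    pub fn entries(&self) -> &[AuditEntry] {
        &self.entries
    }
}
```

Because the log only ever appends flat records, reviewing it requires no knowledge of MLS state.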
### Costs and trade-offs

- **No server-side fan-out.** The client must enqueue separately for each recipient. For a group of n members, this means n-1 enqueue calls per message, compared to 1 call if the DS could fan out. Because each call carries the same payload again, the client's upload bandwidth per message grows by a factor of roughly n-1.

- **No server-side membership validation.** The DS cannot verify that a sender is a member of the group. A malicious client could enqueue messages to any recipient key, potentially causing the recipient to process (and reject) invalid MLS messages. This is mitigated by MLS's own authentication: invalid messages are rejected during MLS processing.

- **No server-side ordering guarantees across the group.** RFC 9420 envisions the DS providing a consistent ordering of handshake messages. The current DS provides FIFO ordering per `(recipientKey, channelId)` queue, but it does not provide global ordering across all group members. In practice, clients can tolerate reordered delivery: Commits include the epoch number, and clients can buffer messages for future epochs. Choosing between concurrent Commits for the same epoch is left to the clients.

- **Client complexity.** The client must maintain the group membership list and perform per-recipient routing. This is additional state that the client must manage correctly. An incorrect membership list results in some members not receiving messages.
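
The epoch buffering mentioned in the ordering trade-off could be sketched as follows. This is an illustration only: `EpochBuffer` and `Accept` are hypothetical names, the caller is assumed to have already read the epoch number from the message framing, and rejecting every message from a past epoch is a simplification made for the example.

```rust
use std::collections::BTreeMap;

/// What the client should do with an incoming message.
#[derive(Debug, PartialEq)]
pub enum Accept {
    /// The message belongs to the current epoch and can be processed now.
    Ready(Vec<u8>),
    /// The message is for a future epoch and has been stored.
    Buffered,
    /// The message is from a past epoch and is rejected in this sketch.
    Stale,
}

/// Hypothetical client-side buffer for messages that arrive ahead of the local epoch.
pub struct EpochBuffer {
    current_epoch: u64,
    pending: BTreeMap<u64, Vec<Vec<u8>>>,
}

impl EpochBuffer {
    pub fn new(current_epoch: u64) -> Self {
        Self { current_epoch, pending: BTreeMap::new() }
    }

    /// Classify a message by the epoch it was sent in.
    pub fn accept(&mut self, epoch: u64, msg: Vec<u8>) -> Accept {
        if epoch < self.current_epoch {
            Accept::Stale
        } else if epoch == self.current_epoch {
            Accept::Ready(msg)
        } else {
            self.pending.entry(epoch).or_default().push(msg);
            Accept::Buffered
        }
    }

    /// Move to the next epoch (call after applying a Commit) and release
    /// any messages buffered for it, in arrival order.
    pub fn advance(&mut self) -> Vec<Vec<u8>> {
        self.current_epoch += 1;
        self.pending.remove(&self.current_epoch).unwrap_or_default()
    }
}
```

A client would call `accept` for each fetched message and `advance` each time it applies a Commit, then process whatever `advance` returns.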
### Residual risks

- **Metadata exposure.** While the DS does not see message content, it does see routing metadata: which recipient keys receive messages, when, and on which channels. This metadata can reveal communication patterns. Mitigation: use channel IDs that are not correlated with real-world identifiers, and consider padding to hide message sizes.

- **Denial of service.** Because the DS does not validate senders, a malicious client could flood a recipient's queue with garbage payloads. Mitigation: rate limiting (planned for a future milestone) and the `Auth` struct for sender identification.
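
The padding mitigation mentioned above could be approached along these lines. This is a sketch, not something the project is known to implement: `pad` and `unpad` are hypothetical helpers, the 4-byte big-endian length prefix and the bucket size are arbitrary choices, and the padding is assumed to be applied to the plaintext before encryption so that the DS never sees the true length.

```rust
/// Pad `payload` to the next multiple of `bucket` bytes.
/// Layout (an assumption of this sketch): 4-byte big-endian length, payload, zero fill.
/// Returns None if `bucket` is 0 or the payload length does not fit in a u32.
pub fn pad(payload: &[u8], bucket: usize) -> Option<Vec<u8>> {
    if bucket == 0 || payload.len() > u32::MAX as usize {
        return None;
    }
    let framed = 4 + payload.len();
    // Round up to a whole number of buckets.
    let padded_len = (framed + bucket - 1) / bucket * bucket;
    let mut out = Vec::with_capacity(padded_len);
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(payload);
    out.resize(padded_len, 0);
    Some(out)
}

/// Recover the original payload, or None if the input is malformed.
pub fn unpad(padded: &[u8]) -> Option<&[u8]> {
    if padded.len() < 4 {
        return None;
    }
    let len = u32::from_be_bytes([padded[0], padded[1], padded[2], padded[3]]) as usize;
    let end = 4usize.checked_add(len)?;
    padded.get(4..end)
}
```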

---

## Code references

| File | Relevance |
|---|---|
| `schemas/delivery.capnp` | DeliveryService RPC interface (opaque `Data` payloads) |
| `schemas/node.capnp` | NodeService: `enqueue`, `fetch`, `fetchWait` methods |
| `crates/quicnprotochat-server/src/storage.rs` | Server-side queue storage (DashMap-based FIFO queues) |
| `crates/quicnprotochat-server/src/main.rs` | NodeService RPC handler implementation |

---

## Further reading
- [Design Decisions Overview](overview.md) -- index of all ADRs
- [Delivery Schema](../wire-format/delivery-schema.md) -- the DS RPC interface definition
- [NodeService Schema](../wire-format/node-service-schema.md) -- the unified interface that includes DS methods
- [ADR-005: Single-Use KeyPackages](adr-005-single-use-keypackages.md) -- related AS design decision
- [Architecture Overview](../architecture/overview.md) -- system-level view showing DS in context
- [Why This Design, Not Signal/Matrix/...](why-not-signal.md) -- broader protocol comparison