feat: add post-quantum hybrid KEM + SQLCipher persistence

Feature 1 — Post-Quantum Hybrid KEM (X25519 + ML-KEM-768):
- Create hybrid_kem.rs with keygen, encrypt, decrypt + 11 unit tests
- Wire format: version(1) | x25519_eph_pk(32) | mlkem_ct(1088) | nonce(12) | ct
- Add uploadHybridKey/fetchHybridKey RPCs to node.capnp schema
- Server: hybrid key storage in FileBackedStore + RPC handlers
- Client: hybrid keypair in StoredState, auto-wrap/unwrap in send/recv/invite/join
- demo-group runs full hybrid PQ envelope round-trip

Feature 2 — SQLCipher Persistence:
- Extract Store trait from FileBackedStore API
- Create SqlStore (rusqlite + bundled-sqlcipher) with encrypted-at-rest SQLite
- Schema: key_packages, deliveries, hybrid_keys tables with indexes
- Server CLI: --store-backend=sql, --db-path, --db-key flags
- 5 unit tests for SqlStore (FIFO, round-trip, upsert, channel isolation)

Also includes: client lib.rs refactor, auth config, TOML config file support,
mdBook documentation, and various cleanups by user.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 08:07:48 +01:00
parent d1ddef4cea
commit f334ed3d43
81 changed files with 14502 additions and 2289 deletions

@@ -0,0 +1,278 @@
# Cap'n Proto Serialisation and RPC
quicnprotochat uses [Cap'n Proto](https://capnproto.org/) for both message serialisation and remote procedure calls. The serialisation layer encodes structured messages (Envelopes, Auth tokens, delivery payloads) into a compact binary format. The RPC layer provides the client-server interface for the Authentication Service, Delivery Service, and health checks -- all exposed through a single `NodeService` interface.
This page covers why Cap'n Proto was chosen, how schemas are compiled, the owned `ParsedEnvelope` type, serialisation helpers, and ALPN integration with QUIC.
## Why Cap'n Proto
Several serialisation formats were considered. The table below summarises the trade-offs:
| Format | Zero-copy reads | Schema enforcement | Built-in RPC | Canonical bytes for signing |
|---|---|---|---|---|
| **Cap'n Proto** | Yes | Yes (`.capnp` schemas) | Yes (`capnp-rpc`) | Yes (canonical serialisation mode) |
| Protocol Buffers | No (requires deserialisation) | Yes (`.proto` schemas) | Yes (`tonic`/gRPC) | No (non-deterministic field ordering) |
| MessagePack | No | No (untyped) | No | No |
| FlatBuffers | Yes | Yes (`.fbs` schemas) | No built-in RPC | Partial |
Cap'n Proto was selected for the following reasons:
1. **Zero-copy reads**: Cap'n Proto messages can be read directly from the wire buffer without deserialisation. The `Reader` type is a thin pointer into the original bytes. This eliminates allocation and copying on the hot path (message routing in the Delivery Service).
2. **Schema-enforced types**: All messages are defined in `.capnp` schema files. The compiler (`capnpc`) generates type-safe Rust code that prevents mismatched field types at compile time. This is especially valuable for a security-sensitive protocol where a type confusion bug could be exploitable.
3. **Canonical serialisation**: Cap'n Proto can produce deterministic byte representations of messages. This is critical for MLS, where Commits and KeyPackages must be signed -- the signature must cover exactly the same bytes that the verifier will see.
4. **Built-in async RPC**: The `capnp-rpc` crate provides a capability-based RPC system with promise pipelining. quicnprotochat uses it for the `NodeService` interface (KeyPackage upload/fetch, message enqueue/fetch, health checks, hybrid key operations). This avoids the need to hand-roll a request/response protocol.
5. **Compact wire format**: Cap'n Proto's binary wire format is more compact than JSON or XML and roughly comparable to Protocol Buffers (the unpacked encoding used here carries some alignment padding), with the advantage of no decode step.
## Schema compilation flow
Cap'n Proto schemas live in the workspace-root `schemas/` directory:
```text
schemas/
  envelope.capnp   -- Top-level wire message (MsgType enum + payload)
  auth.capnp       -- AuthenticationService RPC interface (legacy, pre-M3)
  delivery.capnp   -- DeliveryService RPC interface (legacy, pre-M3)
  node.capnp       -- Unified NodeService RPC interface (M3+)
```
### build.rs
The `quicnprotochat-proto` crate compiles these schemas at build time via `build.rs`:
```rust
capnpc::CompilerCommand::new()
    .src_prefix(&schemas_dir)
    .file(schemas_dir.join("envelope.capnp"))
    .file(schemas_dir.join("auth.capnp"))
    .file(schemas_dir.join("delivery.capnp"))
    .file(schemas_dir.join("node.capnp"))
    .run()
    .expect("Cap'n Proto schema compilation failed.");
```
Key details:
- **`src_prefix`**: Set to `schemas/` so that inter-schema imports resolve correctly.
- **Output location**: Generated Rust source is written to `$OUT_DIR` (Cargo's build directory). The filenames follow the convention `{schema_name}_capnp.rs`.
- **Rerun triggers**: `cargo:rerun-if-changed` directives ensure the build script re-runs whenever any `.capnp` file changes.
- **Prerequisite**: The `capnp` CLI binary must be installed on the build machine (`apt-get install capnproto` or `brew install capnp`).
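
The rerun triggers can be illustrated with a small std-only sketch. It assumes the four schema files listed on this page and a hypothetical helper name, `rerun_directives`; the real `build.rs` may emit its directives differently (for example by walking the directory).

```rust
use std::path::Path;

/// Schema files named on this page. A real build script might instead
/// enumerate the `schemas/` directory.
const SCHEMAS: [&str; 4] = [
    "envelope.capnp",
    "auth.capnp",
    "delivery.capnp",
    "node.capnp",
];

/// Builds one `cargo:rerun-if-changed=<path>` directive per schema file.
/// A build script would print each returned line to stdout for Cargo to read.
pub fn rerun_directives(schemas_dir: &Path) -> Vec<String> {
    SCHEMAS
        .iter()
        .map(|name| {
            let path = schemas_dir.join(name);
            format!("cargo:rerun-if-changed={}", path.display())
        })
        .collect()
}
```

In a build script, each returned string would be passed to `println!` before the `capnpc` invocation runs.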
### Generated module inclusion
The generated code is spliced into the `quicnprotochat-proto` crate via `include!` macros:
```rust
pub mod envelope_capnp {
    include!(concat!(env!("OUT_DIR"), "/envelope_capnp.rs"));
}
pub mod auth_capnp {
    include!(concat!(env!("OUT_DIR"), "/auth_capnp.rs"));
}
pub mod delivery_capnp {
    include!(concat!(env!("OUT_DIR"), "/delivery_capnp.rs"));
}
pub mod node_capnp {
    include!(concat!(env!("OUT_DIR"), "/node_capnp.rs"));
}
```
Consumers import types from these modules. For example, `node_capnp::node_service::Server` is the trait that the server implements.
## The Envelope schema
The `Envelope` is the top-level wire message for all quicnprotochat traffic. Every frame exchanged between peers (whether over Noise or QUIC) is serialised as an Envelope:
```capnp
struct Envelope {
  msgType     @0 :MsgType;
  groupId     @1 :Data;    # 32-byte SHA-256 digest of group name
  senderId    @2 :Data;    # 32-byte SHA-256 digest of Ed25519 pubkey
  payload     @3 :Data;    # Opaque payload (MLS blob or control data)
  timestampMs @4 :UInt64;  # Unix epoch milliseconds
  enum MsgType {
    ping               @0;
    pong               @1;
    keyPackageUpload   @2;
    keyPackageFetch    @3;
    keyPackageResponse @4;
    mlsWelcome         @5;
    mlsCommit          @6;
    mlsApplication     @7;
    error              @8;
  }
}
```
The Delivery Service routes by `(groupId, msgType)` without inspecting `payload`. This design keeps the DS MLS-unaware -- see [ADR-004: MLS-Unaware Delivery Service](../design-rationale/adr-004-mls-unaware-ds.md).
## The `ParsedEnvelope` owned type
Cap'n Proto readers (`envelope_capnp::envelope::Reader`) borrow from the original byte buffer and cannot be sent across async task boundaries (`!Send`). This is a fundamental limitation of zero-copy reads.
To bridge this gap, `quicnprotochat-proto` defines `ParsedEnvelope`:
```rust
pub struct ParsedEnvelope {
    pub msg_type: MsgType,
    pub group_id: Vec<u8>,
    pub sender_id: Vec<u8>,
    pub payload: Vec<u8>,
    pub timestamp_ms: u64,
}
```
`ParsedEnvelope` eagerly copies all byte fields out of the Cap'n Proto reader, making the type `Send + 'static`. This allows it to cross Tokio task boundaries, be stored in queues, and be passed through channels.
The trade-off is clear: `ParsedEnvelope` allocates and copies, defeating the zero-copy benefit. This is acceptable because:
1. The copying happens once per message at the protocol boundary.
2. Application-layer code (MLS encryption/decryption, routing) needs owned data anyway.
3. The performance-critical path (Delivery Service routing) works with opaque `Vec<u8>` payloads, not parsed Cap'n Proto readers.
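
To make the `Send + 'static` property concrete, here is a minimal std-only sketch. It uses a local stand-in for `ParsedEnvelope` and `MsgType` (only two variants shown) and a standard-library thread and channel rather than Tokio; the function names are hypothetical and not part of `quicnprotochat-proto`.

```rust
use std::sync::mpsc;
use std::thread;

/// Local stand-in for the generated `MsgType` enum (two variants only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MsgType {
    Ping,
    MlsApplication,
}

/// Mirrors the `ParsedEnvelope` fields listed above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ParsedEnvelope {
    pub msg_type: MsgType,
    pub group_id: Vec<u8>,
    pub sender_id: Vec<u8>,
    pub payload: Vec<u8>,
    pub timestamp_ms: u64,
}

/// Compile-time check: a call only type-checks if `T` is `Send + 'static`.
fn assert_send_static<T: Send + 'static>() {}

/// Moves an envelope into a worker thread and receives it back over a channel.
pub fn round_trip_through_thread(env: ParsedEnvelope) -> ParsedEnvelope {
    assert_send_static::<ParsedEnvelope>();
    let (tx, rx) = mpsc::channel::<ParsedEnvelope>();
    let worker = thread::spawn(move || {
        // The worker owns `env` outright; no borrow of a wire buffer is involved.
        tx.send(env).expect("receiver is alive");
    });
    let received = rx.recv().expect("worker sent one envelope");
    worker.join().expect("worker did not panic");
    received
}
```

The same move would not compile for a value that borrows from a stack-local buffer, since `thread::spawn` requires its closure to be `'static`.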
### Invariants
- `group_id` and `sender_id` are either empty (for control messages like Ping/Pong) or exactly 32 bytes (SHA-256 digest).
- `payload` is empty for Ping and Pong; non-empty for all MLS variants.
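
These invariants could be checked with a helper along the following lines. This is a sketch under assumptions: `check_invariants` and `InvariantError` are hypothetical names, the types are local stand-ins, and message types not mentioned in the invariants are left unconstrained.

```rust
/// Local stand-in for the generated `MsgType` enum in the Envelope schema.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MsgType {
    Ping,
    Pong,
    KeyPackageUpload,
    KeyPackageFetch,
    KeyPackageResponse,
    MlsWelcome,
    MlsCommit,
    MlsApplication,
    Error,
}

/// Mirrors the `ParsedEnvelope` fields listed above.
#[derive(Debug, Clone)]
pub struct ParsedEnvelope {
    pub msg_type: MsgType,
    pub group_id: Vec<u8>,
    pub sender_id: Vec<u8>,
    pub payload: Vec<u8>,
    pub timestamp_ms: u64,
}

/// Hypothetical error type describing which invariant was violated.
#[derive(Debug, PartialEq, Eq)]
pub enum InvariantError {
    BadIdLength { field: &'static str, len: usize },
    UnexpectedPayload,
    MissingPayload,
}

/// Length of a SHA-256 digest in bytes.
const DIGEST_LEN: usize = 32;

/// An identifier must be empty or exactly one digest long.
fn check_id(field: &'static str, id: &[u8]) -> Result<(), InvariantError> {
    if id.is_empty() || id.len() == DIGEST_LEN {
        Ok(())
    } else {
        Err(InvariantError::BadIdLength { field: field, len: id.len() })
    }
}

/// Checks the documented invariants; other message types are not constrained.
pub fn check_invariants(env: &ParsedEnvelope) -> Result<(), InvariantError> {
    check_id("group_id", &env.group_id)?;
    check_id("sender_id", &env.sender_id)?;
    match env.msg_type {
        MsgType::Ping | MsgType::Pong if !env.payload.is_empty() => {
            Err(InvariantError::UnexpectedPayload)
        }
        MsgType::MlsWelcome | MsgType::MlsCommit | MsgType::MlsApplication
            if env.payload.is_empty() =>
        {
            Err(InvariantError::MissingPayload)
        }
        _ => Ok(()),
    }
}
```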
## Serialisation helpers
Two functions handle the conversion between `ParsedEnvelope` and wire bytes:
### `build_envelope`
```rust
pub fn build_envelope(env: &ParsedEnvelope) -> Result<Vec<u8>, capnp::Error>
```
Serialises a `ParsedEnvelope` to unpacked Cap'n Proto wire bytes. The output includes the Cap'n Proto segment table header followed by the message data. These bytes are suitable as the body of a length-prefixed frame (the `LengthPrefixedCodec` in `quicnprotochat-core` prepends the 4-byte length) or as a payload within a QUIC stream.
Internally, it builds a `capnp::message::Builder`, populates an `Envelope` root, and serialises via `capnp::serialize::write_message`.
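
For orientation, the length-prefixed framing mentioned above can be sketched without any dependencies. This is not the `LengthPrefixedCodec` implementation: the function names are hypothetical, and the big-endian byte order of the 4-byte prefix is an assumption, since this page does not state the endianness.

```rust
/// Prepends a 4-byte length prefix to `body`.
/// Big-endian order is an assumption made for this sketch.
/// Returns `None` if the body does not fit in a `u32` length.
pub fn frame(body: &[u8]) -> Option<Vec<u8>> {
    if body.len() > u32::MAX as usize {
        return None;
    }
    let len = body.len() as u32;
    let mut out = Vec::with_capacity(4 + body.len());
    out.extend_from_slice(&len.to_be_bytes());
    out.extend_from_slice(body);
    Some(out)
}

/// Splits one complete frame off the front of `buf`.
/// Returns `(body, rest)`, or `None` if the frame is still incomplete.
pub fn unframe(buf: &[u8]) -> Option<(&[u8], &[u8])> {
    if buf.len() < 4 {
        return None;
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let rest = &buf[4..];
    if rest.len() < len {
        return None;
    }
    Some(rest.split_at(len))
}
```

Here `body` would be the output of `build_envelope`, and the body returned by `unframe` would be passed to `parse_envelope`.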
### `parse_envelope`
```rust
pub fn parse_envelope(bytes: &[u8]) -> Result<ParsedEnvelope, capnp::Error>
```
Deserialises unpacked Cap'n Proto wire bytes into a `ParsedEnvelope`. All data is copied out of the reader before returning, so the input slice is not retained.
It returns `capnp::Error` if:
- The bytes are not valid Cap'n Proto wire format.
- The `msgType` discriminant is not present in the current schema (forward-compatibility guard).
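
The forward-compatibility guard amounts to rejecting discriminants the current schema does not define. The sketch below shows the idea with a local enum whose ordinals follow the Envelope schema; `msg_type_from_raw` and `UnknownMsgType` are hypothetical names, and the generated Cap'n Proto code performs its own equivalent check.

```rust
/// Local stand-in for the generated `MsgType` enum; ordinals follow the schema.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MsgType {
    Ping,
    Pong,
    KeyPackageUpload,
    KeyPackageFetch,
    KeyPackageResponse,
    MlsWelcome,
    MlsCommit,
    MlsApplication,
    Error,
}

/// Hypothetical error carrying the unrecognised discriminant.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownMsgType(pub u16);

/// Maps a raw wire discriminant to a known variant, or rejects it.
pub fn msg_type_from_raw(raw: u16) -> Result<MsgType, UnknownMsgType> {
    let msg_type = match raw {
        0 => MsgType::Ping,
        1 => MsgType::Pong,
        2 => MsgType::KeyPackageUpload,
        3 => MsgType::KeyPackageFetch,
        4 => MsgType::KeyPackageResponse,
        5 => MsgType::MlsWelcome,
        6 => MsgType::MlsCommit,
        7 => MsgType::MlsApplication,
        8 => MsgType::Error,
        // A peer built from a newer schema may send values this build
        // does not know; surface them as an error.
        other => return Err(UnknownMsgType(other)),
    };
    Ok(msg_type)
}
```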
### Low-level helpers
Two additional functions provide raw byte-to-message conversions:
```rust
pub fn to_bytes<A: Allocator>(msg: &Builder<A>) -> Result<Vec<u8>, capnp::Error>
pub fn from_bytes(bytes: &[u8]) -> Result<Reader<OwnedSegments>, capnp::Error>
```
`from_bytes` uses `ReaderOptions::new()` with default limits:
- **Traversal limit**: 64 MiB (8 * 1024 * 1024 words)
- **Nesting limit**: 64 levels
These defaults are reasonable for trusted data. For untrusted data from the network, callers should consider tightening `traversal_limit_in_words` to bound the work spent on excessively large messages, and `nesting_limit` to bound recursion on deeply nested ones, since both are denial-of-service vectors. The server also enforces its own size limits: 5 MB per payload (`MAX_PAYLOAD_BYTES`) and 1 MB per KeyPackage (`MAX_KEYPACKAGE_BYTES`).
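
The arithmetic behind these limits can be worked through in a few lines. The constant value below is an assumption: this page gives the payload limit as "5 MB" without saying whether that is binary or decimal, so the sketch uses `5 * 1024 * 1024`. The helper names are hypothetical.

```rust
/// A Cap'n Proto word is 8 bytes.
pub const BYTES_PER_WORD: usize = 8;

/// Default traversal limit quoted above, in words.
pub const DEFAULT_TRAVERSAL_LIMIT_WORDS: usize = 8 * 1024 * 1024;

/// Assumed binary value for the server's 5 MB payload limit.
pub const MAX_PAYLOAD_BYTES: usize = 5 * 1024 * 1024;

/// Converts a limit expressed in words to bytes.
pub fn words_to_bytes(words: usize) -> usize {
    words * BYTES_PER_WORD
}

/// Smallest word count that covers `max_bytes`, rounding up.
/// A traversal limit this tight assumes each part of a message is read
/// roughly once; repeated reads count against the limit again.
pub fn traversal_limit_for(max_bytes: usize) -> usize {
    (max_bytes + BYTES_PER_WORD - 1) / BYTES_PER_WORD
}

/// Cheap size guard to run before handing bytes to the parser.
pub fn within_payload_limit(bytes: &[u8]) -> bool {
    bytes.len() <= MAX_PAYLOAD_BYTES
}
```

Under these assumptions the default limit works out to 64 MiB, and a limit derived from the payload cap is 655,360 words, a little under a twelfth of the default.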
## The NodeService RPC interface
The M3 unified RPC interface is defined in `schemas/node.capnp`:
```capnp
interface NodeService {
  uploadKeyPackage @0 (identityKey :Data, package :Data, auth :Auth)
      -> (fingerprint :Data);
  fetchKeyPackage  @1 (identityKey :Data, auth :Auth) -> (package :Data);
  enqueue          @2 (recipientKey :Data, payload :Data,
                       channelId :Data, version :UInt16, auth :Auth) -> ();
  fetch            @3 (recipientKey :Data, channelId :Data,
                       version :UInt16, auth :Auth) -> (payloads :List(Data));
  fetchWait        @4 (recipientKey :Data, channelId :Data,
                       version :UInt16, timeoutMs :UInt64, auth :Auth)
      -> (payloads :List(Data));
  health           @5 () -> (status :Text);
  uploadHybridKey  @6 (identityKey :Data, hybridPublicKey :Data) -> ();
  fetchHybridKey   @7 (identityKey :Data) -> (hybridPublicKey :Data);
}
```
This combines Authentication Service operations (`uploadKeyPackage`, `fetchKeyPackage`), Delivery Service operations (`enqueue`, `fetch`, `fetchWait`), health monitoring (`health`), and hybrid key management (`uploadHybridKey`, `fetchHybridKey`) into a single RPC interface.
### Auth context
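Most RPC methods accept an `Auth` struct. In the interface above, the KeyPackage and delivery methods (`uploadKeyPackage`, `fetchKeyPackage`, `enqueue`, `fetch`, `fetchWait`) take an `auth` parameter, while `health`, `uploadHybridKey`, and `fetchHybridKey` do not: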
```capnp
struct Auth {
  version     @0 :UInt16;  # 0 = legacy/none, 1 = token-based auth
  accessToken @1 :Data;    # opaque bearer token
  deviceId    @2 :Data;    # optional UUID bytes for auditing
}
```
The server validates the `version` field and rejects unknown versions. Token validation is planned for a future milestone. See [Auth, Devices, and Tokens](../roadmap/authz-plan.md).
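
The version check described here might look roughly like the following. It is a sketch only: `AuthMode`, `UnknownAuthVersion`, and `classify_auth_version` are hypothetical names, and the server's actual handler and error reporting may differ.

```rust
/// Auth modes corresponding to the documented `version` values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthMode {
    /// version = 0: legacy, no authentication.
    Legacy,
    /// version = 1: token-based authentication.
    Token,
}

/// Hypothetical error carrying the rejected version number.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownAuthVersion(pub u16);

/// Accepts the two documented versions and rejects everything else.
pub fn classify_auth_version(version: u16) -> Result<AuthMode, UnknownAuthVersion> {
    match version {
        0 => Ok(AuthMode::Legacy),
        1 => Ok(AuthMode::Token),
        other => Err(UnknownAuthVersion(other)),
    }
}
```

Rejecting unknown versions up front means a future version 2 client fails loudly against an older server instead of being treated as unauthenticated.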
## ALPN integration
Cap'n Proto RPC rides directly on the QUIC bidirectional stream. The ALPN (Application-Layer Protocol Negotiation) extension in the TLS handshake identifies the protocol:
```rust
tls.alpn_protocols = vec![b"capnp".to_vec()];
```
Both client and server set the ALPN to `b"capnp"`. If the client and server disagree on the ALPN, the TLS handshake fails before any application data is exchanged.
On the QUIC path, the flow is:
```text
Client                       Server
|                                 |
|── QUIC handshake (TLS 1.3) ────►| ALPN: "capnp"
|                                 |
|── open_bi() ───────────────────►| Bidirectional QUIC stream
|                                 |
|◄─────── capnp-rpc messages ────►| VatNetwork reads/writes on the stream
```
The `tokio-util` compat layer converts Quinn stream types into `futures::AsyncRead + AsyncWrite`, which `capnp-rpc`'s `VatNetwork` expects. See [QUIC + TLS 1.3](quic-tls.md) for the full connection setup.
On the legacy Noise path, the `into_capnp_io()` bridge serves the same purpose -- converting a Noise-encrypted TCP connection into a byte stream for `VatNetwork`. See [Noise\_XX Handshake](noise-xx.md) for details.
## Comparison with alternatives
### vs Protocol Buffers + gRPC
Protocol Buffers require a full deserialisation step to access any field. Cap'n Proto avoids this with zero-copy readers. gRPC is specified over HTTP/2, so adopting it would add HTTP/2 framing as an extra protocol layer instead of riding directly on a QUIC stream. Cap'n Proto RPC is leaner and maps naturally to a single QUIC stream.
### vs MessagePack
MessagePack is untyped -- there is no schema file, and type errors are caught at runtime. This is unacceptable for a security protocol where a misinterpreted field could be exploitable. MessagePack also has no RPC framework, requiring a hand-rolled request/response protocol.
### vs FlatBuffers
FlatBuffers supports zero-copy reads (like Cap'n Proto) but lacks a built-in RPC framework. The ecosystem and tooling are also less mature for Rust.
## Design constraints of `quicnprotochat-proto`
The `quicnprotochat-proto` crate enforces three design constraints:
1. **No crypto**: Key material never enters this crate. All encryption and signing happens in `quicnprotochat-core`.
2. **No I/O**: Callers own the transport. This crate only converts between bytes and types.
3. **No async**: Pure synchronous data-layer code. Async is the caller's responsibility.
These constraints keep the serialisation layer thin and auditable.
## Further reading
- [Envelope Schema](../wire-format/envelope-schema.md) -- Detailed field-by-field breakdown of the Envelope wire format.
- [NodeService Schema](../wire-format/node-service-schema.md) -- Full RPC interface documentation.
- [Auth Schema](../wire-format/auth-schema.md) -- Auth token structure and versioning.
- [MLS (RFC 9420)](mls.md) -- How MLS messages are carried as opaque payloads inside Cap'n Proto Envelopes.
- [ADR-002: Cap'n Proto over MessagePack](../design-rationale/adr-002-capnproto.md) -- Design rationale for choosing Cap'n Proto.
- [ADR-003: RPC Inside the Noise Tunnel](../design-rationale/adr-003-rpc-inside-noise.md) -- Why RPC runs inside the encrypted transport.