docs: rewrite mdBook documentation for v2 architecture

Update 25+ files and add 6 new pages to reflect the v2 migration from
Cap'n Proto to Protobuf framing over QUIC. Integrates SDK and Operations
docs into the mdBook, restructures SUMMARY.md, and rewrites the wire
format, architecture, and protocol sections with accurate v2 content.
2026-03-04 22:02:31 +01:00
parent f7a7f672b4
commit d073f614b3
31 changed files with 4423 additions and 2379 deletions

@@ -1,264 +1,299 @@
# Cap'n Proto Serialisation and RPC
# Protobuf Framing
quicproquo uses [Cap'n Proto](https://capnproto.org/) for both message serialisation and remote procedure calls. The serialisation layer encodes structured messages (Envelopes, Auth tokens, delivery payloads) into a compact binary format. The RPC layer provides the client-server interface for the Authentication Service, Delivery Service, and health checks -- all exposed through a single `NodeService` interface.
quicproquo v2 uses a custom binary framing protocol layered over QUIC streams: bidirectional streams for RPC calls and uni-directional streams for server push. Message payloads are serialised with Protocol Buffers (Protobuf) via the `prost` crate. The framing layer (implemented in `quicproquo-rpc`) prefixes each payload with a compact header whose size is fixed per frame type. A request header carries the method ID, request correlation ID, and payload length, so the server can pick a handler from the header alone, before decoding the payload and without a separate length-delimited codec.
This page covers why Cap'n Proto was chosen, how schemas are compiled, the owned `ParsedEnvelope` type, serialisation helpers, and ALPN integration with QUIC.
This page covers the three frame types, the method ID dispatch table, status codes, push event delivery, and the Protobuf schema organisation.
## Why Cap'n Proto
---
Several serialisation formats were considered. The table below summarises the trade-offs:
## Frame Types
| Format | Zero-copy reads | Schema enforcement | Built-in RPC | Canonical bytes for signing |
|---|---|---|---|---|
| **Cap'n Proto** | Yes | Yes (`.capnp` schemas) | Yes (`capnp-rpc`) | Yes (canonical serialisation mode) |
| Protocol Buffers | No (requires deserialisation) | Yes (`.proto` schemas) | Yes (`tonic`/gRPC) | No (non-deterministic field ordering) |
| MessagePack | No | No (untyped) | No | No |
| FlatBuffers | Yes | Yes (`.fbs` schemas) | No built-in RPC | Partial |
There are three frame types in the v2 protocol. All multi-byte integers are **big-endian** (network byte order).
Cap'n Proto was selected for the following reasons:
### Request Frame (client -> server)
1. **Zero-copy reads**: Cap'n Proto messages can be read directly from the wire buffer without deserialisation. The `Reader` type is a thin pointer into the original bytes. This eliminates allocation and copying on the hot path (message routing in the Delivery Service).
Sent on a QUIC bidirectional stream (one stream per RPC call):
2. **Schema-enforced types**: All messages are defined in `.capnp` schema files. The compiler (`capnpc`) generates type-safe Rust code that prevents mismatched field types at compile time. This is especially valuable for a security-sensitive protocol where a type confusion bug could be exploitable.
3. **Canonical serialisation**: Cap'n Proto can produce deterministic byte representations of messages. This is critical for MLS, where Commits and KeyPackages must be signed -- the signature must cover exactly the same bytes that the verifier will see.
4. **Built-in async RPC**: The `capnp-rpc` crate provides a capability-based RPC system with promise pipelining. quicproquo uses it for the `NodeService` interface (KeyPackage upload/fetch, message enqueue/fetch, health checks, hybrid key operations). This avoids the need to hand-roll a request/response protocol.
5. **Compact wire format**: Cap'n Proto's wire format is more compact than JSON or XML and comparable to Protocol Buffers, with the advantage of no decode step.
## Schema compilation flow
Cap'n Proto schemas live in the workspace-root `schemas/` directory:
```text
schemas/
envelope.capnp -- Top-level wire message (MsgType enum + payload)
auth.capnp -- AuthenticationService RPC interface (legacy, pre-M3)
delivery.capnp -- DeliveryService RPC interface (legacy, pre-M3)
node.capnp -- Unified NodeService RPC interface (M3+)
```
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| method_id (u16 BE) | request_id (u32 BE) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| request_id (cont.) | payload_len (u32 BE) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| payload_len (cont.) | protobuf payload ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
### build.rs
| Field | Type | Bytes | Description |
|---|---|---|---|
| `method_id` | `u16` | 0-1 | RPC method identifier (see method IDs table) |
| `request_id` | `u32` | 2-5 | Client-generated correlation ID; echoed back in the response |
| `payload_len` | `u32` | 6-9 | Length of the Protobuf payload in bytes |
| payload | bytes | 10+ | Protobuf-encoded request message |
The `quicproquo-proto` crate compiles these schemas at build time via `build.rs`:
Header size: **10 bytes**. Maximum payload: **4 MiB**.
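The request layout above can be sketched with the standard library alone. This is a minimal illustration, not the `quicproquo-rpc` implementation: the `RequestFrame` struct, `encode_request`, and `decode_request` are hypothetical names, and rejecting trailing bytes is an assumption that follows from using one stream per call.

```rust
/// Request header size: method_id (2) + request_id (4) + payload_len (4).
const REQUEST_HEADER_LEN: usize = 10;
/// Maximum payload size stated in the text above (4 MiB).
const MAX_PAYLOAD_LEN: usize = 4 * 1024 * 1024;

/// Hypothetical in-memory view of a request frame (illustration only).
#[derive(Debug, PartialEq)]
struct RequestFrame {
    method_id: u16,
    request_id: u32,
    payload: Vec<u8>,
}

/// Serialise a frame as header + payload, all integers big-endian.
fn encode_request(frame: &RequestFrame) -> Option<Vec<u8>> {
    if frame.payload.len() > MAX_PAYLOAD_LEN {
        return None; // payload exceeds the documented limit
    }
    let mut out = Vec::with_capacity(REQUEST_HEADER_LEN + frame.payload.len());
    out.extend_from_slice(&frame.method_id.to_be_bytes());
    out.extend_from_slice(&frame.request_id.to_be_bytes());
    out.extend_from_slice(&(frame.payload.len() as u32).to_be_bytes());
    out.extend_from_slice(&frame.payload);
    Some(out)
}

/// Parse a complete frame. Assumption: the buffer holds exactly one frame,
/// so any length mismatch (short or trailing bytes) is rejected.
fn decode_request(buf: &[u8]) -> Option<RequestFrame> {
    if buf.len() < REQUEST_HEADER_LEN {
        return None;
    }
    let method_id = u16::from_be_bytes([buf[0], buf[1]]);
    let request_id = u32::from_be_bytes([buf[2], buf[3], buf[4], buf[5]]);
    let payload_len = u32::from_be_bytes([buf[6], buf[7], buf[8], buf[9]]) as usize;
    if payload_len > MAX_PAYLOAD_LEN || buf.len() != REQUEST_HEADER_LEN + payload_len {
        return None;
    }
    Some(RequestFrame {
        method_id,
        request_id,
        payload: buf[REQUEST_HEADER_LEN..].to_vec(),
    })
}
```

Checking `payload_len` against the limit before touching the payload keeps a malformed header from driving a large allocation.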
### Response Frame (server -> client)
Sent on the same QUIC bidirectional stream as the request:
```
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| status (u8) | request_id (u32 BE) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| request_id (cont.) | payload_len (u32 BE) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| payload_len (cont.) | protobuf payload ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
| Field | Type | Bytes | Description |
|---|---|---|---|
| `status` | `u8` | 0 | Status code (see status codes table) |
| `request_id` | `u32` | 1-4 | Echoes the `request_id` from the request frame |
| `payload_len` | `u32` | 5-8 | Length of the Protobuf payload in bytes |
| payload | bytes | 9+ | Protobuf-encoded response message (may be empty on error) |
Header size: **9 bytes**.
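As a rough sketch of the client side, the 9-byte response header can be parsed from a fixed-size array and checked against the request ID the client sent. `ResponseHeader`, `parse_response_header`, and `matches_request` are hypothetical names used only for illustration.

```rust
/// Response header size: status (1) + request_id (4) + payload_len (4).
const RESPONSE_HEADER_LEN: usize = 9;

/// Hypothetical parsed form of the response header (illustration only).
#[derive(Debug, PartialEq)]
struct ResponseHeader {
    status: u8,
    request_id: u32,
    payload_len: u32,
}

/// Decode the fixed 9-byte header; multi-byte fields are big-endian.
fn parse_response_header(buf: &[u8; RESPONSE_HEADER_LEN]) -> ResponseHeader {
    ResponseHeader {
        status: buf[0],
        request_id: u32::from_be_bytes([buf[1], buf[2], buf[3], buf[4]]),
        payload_len: u32::from_be_bytes([buf[5], buf[6], buf[7], buf[8]]),
    }
}

/// The server echoes the request ID, so a mismatch signals a protocol error.
fn matches_request(header: &ResponseHeader, expected_request_id: u32) -> bool {
    header.request_id == expected_request_id
}
```

Because the status byte comes first, a client can branch on success or failure before reading the rest of the header.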
### Push Frame (server -> client, uni-stream)
Sent by the server on QUIC uni-directional streams for real-time event delivery. No request ID -- push frames are not correlated to any client request.
```
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| event_type (u16 BE) | payload_len (u32 BE) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| payload_len (cont.) | protobuf payload ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
| Field | Type | Bytes | Description |
|---|---|---|---|
| `event_type` | `u16` | 0-1 | Push event type (see push event types table) |
| `payload_len` | `u32` | 2-5 | Length of the Protobuf payload in bytes |
| payload | bytes | 6+ | Protobuf-encoded push event message |
Header size: **6 bytes**.
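A push frame can be read incrementally from any byte stream: first the 6-byte header, then exactly `payload_len` bytes. The sketch below uses `std::io::Read` as a stand-in for a QUIC receive stream. `read_push_frame` is a hypothetical helper, and applying the 4 MiB cap to push payloads is an assumption (the text states that limit only for requests).

```rust
use std::io::{self, Read};

/// Push header size: event_type (2) + payload_len (4).
const PUSH_HEADER_LEN: usize = 6;
/// Assumption: push payloads share the 4 MiB cap documented for requests.
const MAX_PUSH_PAYLOAD: usize = 4 * 1024 * 1024;

/// Read one push frame and return (event_type, payload).
fn read_push_frame<R: Read>(reader: &mut R) -> io::Result<(u16, Vec<u8>)> {
    let mut header = [0u8; PUSH_HEADER_LEN];
    reader.read_exact(&mut header)?; // fails on a truncated header
    let event_type = u16::from_be_bytes([header[0], header[1]]);
    let payload_len =
        u32::from_be_bytes([header[2], header[3], header[4], header[5]]) as usize;
    if payload_len > MAX_PUSH_PAYLOAD {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "push payload too large",
        ));
    }
    let mut payload = vec![0u8; payload_len];
    reader.read_exact(&mut payload)?; // fails on a truncated payload
    Ok((event_type, payload))
}
```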
---
## Status Codes
The `status` byte in a Response frame carries one of the following values:
| Value | `RpcStatus` variant | Meaning |
|-------|---------------------|---------|
| 0 | `Ok` | Success. Response payload contains the result. |
| 1 | `BadRequest` | Malformed request, missing required field, or failed validation. |
| 2 | `Unauthorized` | Missing or invalid session token. |
| 3 | `Forbidden` | Valid token but insufficient permissions for this operation. |
| 4 | `NotFound` | Requested resource does not exist (e.g., KeyPackage not found). |
| 5 | `RateLimited` | Request rate limit exceeded. Client should back off before retrying. |
| 8 | `DeadlineExceeded` | Server could not complete the request within the configured deadline. |
| 9 | `Unavailable` | Server temporarily unable to serve the request (e.g., storage unavailable). |
| 10 | `Internal` | Unexpected server-side error. |
| 11 | `UnknownMethod` | The `method_id` in the request is not registered. |
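One way to model the table in Rust is an enum with explicit discriminants plus a checked conversion from the wire byte. The variant names and values come from the table; `from_u8` and `is_retryable` are hypothetical helpers, and treating only `RateLimited` and `Unavailable` as retryable is an assumption based on their descriptions.

```rust
/// Status codes from the table above. Values 6 and 7 are not listed
/// in the table, so they are treated as unknown here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum RpcStatus {
    Ok = 0,
    BadRequest = 1,
    Unauthorized = 2,
    Forbidden = 3,
    NotFound = 4,
    RateLimited = 5,
    DeadlineExceeded = 8,
    Unavailable = 9,
    Internal = 10,
    UnknownMethod = 11,
}

impl RpcStatus {
    /// Convert a wire byte into a status, or None for an unlisted value.
    fn from_u8(value: u8) -> Option<RpcStatus> {
        match value {
            0 => Some(RpcStatus::Ok),
            1 => Some(RpcStatus::BadRequest),
            2 => Some(RpcStatus::Unauthorized),
            3 => Some(RpcStatus::Forbidden),
            4 => Some(RpcStatus::NotFound),
            5 => Some(RpcStatus::RateLimited),
            8 => Some(RpcStatus::DeadlineExceeded),
            9 => Some(RpcStatus::Unavailable),
            10 => Some(RpcStatus::Internal),
            11 => Some(RpcStatus::UnknownMethod),
            _ => None,
        }
    }

    /// Assumption: only transient conditions are worth retrying after a backoff.
    fn is_retryable(self) -> bool {
        matches!(self, RpcStatus::RateLimited | RpcStatus::Unavailable)
    }
}
```

Returning `None` for unlisted values lets a client surface an unknown status explicitly instead of mapping it to a wrong variant.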
---
## Method IDs
All RPC method IDs (49 in the tables below) are defined in `crates/quicproquo-proto/src/lib.rs` in the `method_ids` module. The numeric ranges group related methods by service category.
### Auth (100-103)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 100 | `OpaqueRegisterStart` | `OpaqueRegisterStartRequest` | `OpaqueRegisterStartResponse` |
| 101 | `OpaqueRegisterFinish` | `OpaqueRegisterFinishRequest` | `OpaqueRegisterFinishResponse` |
| 102 | `OpaqueLoginStart` | `OpaqueLoginStartRequest` | `OpaqueLoginStartResponse` |
| 103 | `OpaqueLoginFinish` | `OpaqueLoginFinishRequest` | `OpaqueLoginFinishResponse` |
### Delivery (200-205)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 200 | `Enqueue` | `EnqueueRequest` | `EnqueueResponse` |
| 201 | `Fetch` | `FetchRequest` | `FetchResponse` |
| 202 | `FetchWait` | `FetchWaitRequest` | `FetchWaitResponse` |
| 203 | `Peek` | `PeekRequest` | `PeekResponse` |
| 204 | `Ack` | `AckRequest` | `AckResponse` |
| 205 | `BatchEnqueue` | `BatchEnqueueRequest` | `BatchEnqueueResponse` |
### Keys (300-304)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 300 | `UploadKeyPackage` | `UploadKeyPackageRequest` | `UploadKeyPackageResponse` |
| 301 | `FetchKeyPackage` | `FetchKeyPackageRequest` | `FetchKeyPackageResponse` |
| 302 | `UploadHybridKey` | `UploadHybridKeyRequest` | `UploadHybridKeyResponse` |
| 303 | `FetchHybridKey` | `FetchHybridKeyRequest` | `FetchHybridKeyResponse` |
| 304 | `FetchHybridKeys` | `FetchHybridKeysRequest` | `FetchHybridKeysResponse` |
### Channel (400)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 400 | `CreateChannel` | `CreateChannelRequest` | `CreateChannelResponse` |
### Group Management (410-413)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 410 | `RemoveMember` | `RemoveMemberRequest` | `RemoveMemberResponse` |
| 411 | `UpdateGroupMetadata` | `UpdateGroupMetadataRequest` | `UpdateGroupMetadataResponse` |
| 412 | `ListGroupMembers` | `ListGroupMembersRequest` | `ListGroupMembersResponse` |
| 413 | `RotateKeys` | `RotateKeysRequest` | `RotateKeysResponse` |
### Moderation (420-424)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 420 | `ReportMessage` | `ReportMessageRequest` | `ReportMessageResponse` |
| 421 | `BanUser` | `BanUserRequest` | `BanUserResponse` |
| 422 | `UnbanUser` | `UnbanUserRequest` | `UnbanUserResponse` |
| 423 | `ListReports` | `ListReportsRequest` | `ListReportsResponse` |
| 424 | `ListBanned` | `ListBannedRequest` | `ListBannedResponse` |
### User / Identity (500-501)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 500 | `ResolveUser` | `ResolveUserRequest` | `ResolveUserResponse` |
| 501 | `ResolveIdentity` | `ResolveIdentityRequest` | `ResolveIdentityResponse` |
### Key Transparency (510-520)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 510 | `RevokeKey` | `RevokeKeyRequest` | `RevokeKeyResponse` |
| 511 | `CheckRevocation` | `CheckRevocationRequest` | `CheckRevocationResponse` |
| 520 | `AuditKeyTransparency` | `AuditKeyTransparencyRequest` | `AuditKeyTransparencyResponse` |
### Blob Storage (600-601)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 600 | `UploadBlob` | `UploadBlobRequest` | `UploadBlobResponse` |
| 601 | `DownloadBlob` | `DownloadBlobRequest` | `DownloadBlobResponse` |
### Device Management (700-710)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 700 | `RegisterDevice` | `RegisterDeviceRequest` | `RegisterDeviceResponse` |
| 701 | `ListDevices` | `ListDevicesRequest` | `ListDevicesResponse` |
| 702 | `RevokeDevice` | `RevokeDeviceRequest` | `RevokeDeviceResponse` |
| 710 | `RegisterPushToken` | `RegisterPushTokenRequest` | `RegisterPushTokenResponse` |
### Recovery (750-752)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 750 | `StoreRecoveryBundle` | `StoreRecoveryBundleRequest` | `StoreRecoveryBundleResponse` |
| 751 | `FetchRecoveryBundle` | `FetchRecoveryBundleRequest` | `FetchRecoveryBundleResponse` |
| 752 | `DeleteRecoveryBundle` | `DeleteRecoveryBundleRequest` | `DeleteRecoveryBundleResponse` |
### P2P and Health (800-802)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 800 | `PublishEndpoint` | `PublishEndpointRequest` | `PublishEndpointResponse` |
| 801 | `ResolveEndpoint` | `ResolveEndpointRequest` | `ResolveEndpointResponse` |
| 802 | `Health` | `HealthRequest` | `HealthResponse` |
### Federation (900-905)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 900 | `RelayEnqueue` | `RelayEnqueueRequest` | `RelayEnqueueResponse` |
| 901 | `RelayBatchEnqueue` | `RelayBatchEnqueueRequest` | `RelayBatchEnqueueResponse` |
| 902 | `ProxyFetchKeyPackage` | `ProxyFetchKeyPackageRequest` | `ProxyFetchKeyPackageResponse` |
| 903 | `ProxyFetchHybridKey` | `ProxyFetchHybridKeyRequest` | `ProxyFetchHybridKeyResponse` |
| 904 | `ProxyResolveUser` | `ProxyResolveUserRequest` | `ProxyResolveUserResponse` |
| 905 | `FederationHealth` | `FederationHealthRequest` | `FederationHealthResponse` |
### Account (950)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 950 | `DeleteAccount` | `DeleteAccountRequest` | `DeleteAccountResponse` |
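The grouping by numeric range can be illustrated with a single `match`. This is only a sketch of the tables above, not the server's dispatcher: `method_category` is a hypothetical function, and it lists the exact IDs from the tables, so gaps inside a heading's range (for example 512-519) are treated as unregistered.

```rust
/// Map a method ID to the service category it is listed under.
/// Returns None for IDs that do not appear in the tables; a server
/// would answer such a request with `UnknownMethod` (status 11).
fn method_category(method_id: u16) -> Option<&'static str> {
    match method_id {
        100..=103 => Some("Auth"),
        200..=205 => Some("Delivery"),
        300..=304 => Some("Keys"),
        400 => Some("Channel"),
        410..=413 => Some("Group Management"),
        420..=424 => Some("Moderation"),
        500..=501 => Some("User / Identity"),
        510 | 511 | 520 => Some("Key Transparency"),
        600..=601 => Some("Blob Storage"),
        700..=702 | 710 => Some("Device Management"),
        750..=752 => Some("Recovery"),
        800..=802 => Some("P2P and Health"),
        900..=905 => Some("Federation"),
        950 => Some("Account"),
        _ => None,
    }
}
```

A category lookup like this could be useful for per-service metrics or rate limits, since the category is known from the header before the payload is decoded.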
---
## Push Event Types
Server-to-client push events are delivered on QUIC uni-streams using the Push frame format. Event types are defined alongside method IDs in `quicproquo-proto/src/lib.rs`:
| Value | Event | Description |
|-------|-------|-------------|
| 1000 | `PushNewMessage` | A new message has been enqueued for the client. |
| 1001 | `PushTyping` | A group member has started or stopped typing. |
| 1002 | `PushPresence` | A contact's presence status has changed (online/offline). |
| 1003 | `PushMembership` | Group membership changed (member added or removed). |
Push events avoid the need for the client to long-poll `FetchWait` (202) for real-time delivery. The client can listen on a background task for incoming uni-streams and process push events independently of pending RPC calls.
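A background task that receives push frames needs to map the `event_type` field to something it can act on. The sketch below is a hypothetical client-side model: the `PushEvent` enum and both helpers are illustrative names, ignoring unknown event types is an assumption about forward compatibility, and reacting to `PushNewMessage` with a `Fetch` (201) call is likewise assumed rather than documented.

```rust
/// Push event types from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PushEvent {
    NewMessage,
    Typing,
    Presence,
    Membership,
}

/// Convert the wire value into an event, or None for an unknown type.
/// Assumption: unknown types are skipped so newer servers do not break
/// older clients.
fn push_event_from_type(event_type: u16) -> Option<PushEvent> {
    match event_type {
        1000 => Some(PushEvent::NewMessage),
        1001 => Some(PushEvent::Typing),
        1002 => Some(PushEvent::Presence),
        1003 => Some(PushEvent::Membership),
        _ => None,
    }
}

/// Assumption: only a new-message event should trigger a `Fetch` (201) RPC;
/// the other events just update local UI state.
fn should_fetch(event: PushEvent) -> bool {
    event == PushEvent::NewMessage
}
```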
---
## Stream Model
Each RPC call uses a **dedicated QUIC bidirectional stream**:
1. Client opens a new bidirectional stream (`connection.open_bi()`).
2. Client encodes the request into a `RequestFrame` and writes it to the send half.
3. Client closes the send half (marks end-of-write).
4. Server reads the complete `RequestFrame` from the receive half.
5. Server processes the request and writes a `ResponseFrame` to its send half.
6. Server closes the send half.
7. Client reads the complete `ResponseFrame`.
This allows many concurrent RPCs on a single QUIC connection without head-of-line blocking.
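The seven steps can be simulated without a network by letting an in-memory reader stand in for the QUIC receive half. In this sketch `read_to_end` returning plays the role of the client closing its send half. `build_request` and `serve_one` are hypothetical names, only `Health` (802) is "registered", and the empty response payload is a simplification of whatever `HealthResponse` actually carries.

```rust
use std::io::{self, Read};

const STATUS_OK: u8 = 0;
const STATUS_BAD_REQUEST: u8 = 1;
const STATUS_UNKNOWN_METHOD: u8 = 11;
const HEALTH: u16 = 802;

/// Steps 1-3: the bytes a client writes before closing its send half.
fn build_request(method_id: u16, request_id: u32, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(10 + payload.len());
    out.extend_from_slice(&method_id.to_be_bytes());
    out.extend_from_slice(&request_id.to_be_bytes());
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(payload);
    out
}

/// Steps 4-6: read the whole request, dispatch, and build the response bytes.
fn serve_one<R: Read>(recv: &mut R) -> io::Result<Vec<u8>> {
    let mut req = Vec::new();
    recv.read_to_end(&mut req)?; // completes once the peer has finished writing
    if req.len() < 10 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "truncated request header",
        ));
    }
    let method_id = u16::from_be_bytes([req[0], req[1]]);
    let request_id = [req[2], req[3], req[4], req[5]];
    let payload_len = u32::from_be_bytes([req[6], req[7], req[8], req[9]]) as usize;
    let status = if req.len() - 10 != payload_len {
        STATUS_BAD_REQUEST
    } else if method_id == HEALTH {
        STATUS_OK
    } else {
        STATUS_UNKNOWN_METHOD
    };
    // Response header: status, echoed request_id, payload_len = 0.
    let mut resp = Vec::with_capacity(9);
    resp.push(status);
    resp.extend_from_slice(&request_id);
    resp.extend_from_slice(&0u32.to_be_bytes());
    Ok(resp)
}
```

In real code the reader would be the receive half of the stream returned by `connection.open_bi()`, and each call would get its own stream, so a slow handler would delay only its own response.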
---
## Protobuf Schema Organisation
All message types are defined in `proto/qpq/v1/`:
| File | Contents |
|---|---|
| `auth.proto` | OPAQUE registration and login message types |
| `common.proto` | Auth context, account deletion, shared types |
| `delivery.proto` | Enqueue, Fetch, Peek, Ack, BatchEnqueue |
| `keys.proto` | MLS key packages, hybrid keys |
| `channel.proto` | Channel creation |
| `group.proto` | Group management (remove member, metadata, rotate keys) |
| `moderation.proto` | Report, ban, unban, list |
| `user.proto` | User and identity resolution |
| `kt.proto` | Key transparency (revoke, check, audit) |
| `blob.proto` | Binary object storage |
| `device.proto` | Multi-device management, push tokens |
| `recovery.proto` | Account recovery bundles |
| `p2p.proto` | P2P endpoints, health |
| `federation.proto` | Cross-server relay |
All `.proto` files use `package qpq.v1;` and are compiled to Rust at build time using `prost-build` via the `quicproquo-proto` crate's `build.rs`. The `protobuf-src` crate vendors `protoc`, so no system-wide `protoc` installation is required.
Generated Rust types are accessed via:
```rust
capnpc::CompilerCommand::new()
.src_prefix(&schemas_dir)
.file(schemas_dir.join("envelope.capnp"))
.file(schemas_dir.join("auth.capnp"))
.file(schemas_dir.join("delivery.capnp"))
.file(schemas_dir.join("node.capnp"))
.run()
.expect("Cap'n Proto schema compilation failed.");
use quicproquo_proto::qpq::v1::{EnqueueRequest, FetchResponse, /* ... */};
use quicproquo_proto::method_ids::{ENQUEUE, FETCH, /* ... */};
```
Key details:
---
- **`src_prefix`**: Set to `schemas/` so that inter-schema imports resolve correctly.
- **Output location**: Generated Rust source is written to `$OUT_DIR` (Cargo's build directory). The filenames follow the convention `{schema_name}_capnp.rs`.
- **Rerun triggers**: `cargo:rerun-if-changed` directives ensure the build script re-runs whenever any `.capnp` file changes.
- **Prerequisite**: The `capnp` CLI binary must be installed on the build machine (`apt-get install capnproto` or `brew install capnp`).
## Design Constraints of `quicproquo-proto`
### Generated module inclusion
The generated code is spliced into the `quicproquo-proto` crate via `include!` macros:
```rust
pub mod envelope_capnp {
include!(concat!(env!("OUT_DIR"), "/envelope_capnp.rs"));
}
pub mod auth_capnp {
include!(concat!(env!("OUT_DIR"), "/auth_capnp.rs"));
}
pub mod delivery_capnp {
include!(concat!(env!("OUT_DIR"), "/delivery_capnp.rs"));
}
pub mod node_capnp {
include!(concat!(env!("OUT_DIR"), "/node_capnp.rs"));
}
```
Consumers import types from these modules. For example, `node_capnp::node_service::Server` is the trait that the server implements.
## The Envelope schema
The `Envelope` is the top-level wire message for all quicproquo traffic. Every frame exchanged between peers is serialised as an Envelope:
```capnp
struct Envelope {
msgType @0 :MsgType;
groupId @1 :Data; # 32-byte SHA-256 digest of group name
senderId @2 :Data; # 32-byte SHA-256 digest of Ed25519 pubkey
payload @3 :Data; # Opaque payload (MLS blob or control data)
timestampMs @4 :UInt64; # Unix epoch milliseconds
enum MsgType {
ping @0;
pong @1;
keyPackageUpload @2;
keyPackageFetch @3;
keyPackageResponse @4;
mlsWelcome @5;
mlsCommit @6;
mlsApplication @7;
error @8;
}
}
```
The Delivery Service routes by `(groupId, msgType)` without inspecting `payload`. This design keeps the DS MLS-unaware -- see [ADR-004: MLS-Unaware Delivery Service](../design-rationale/adr-004-mls-unaware-ds.md).
## The `ParsedEnvelope` owned type
Cap'n Proto readers (`envelope_capnp::envelope::Reader`) borrow from the original byte buffer and cannot be sent across async task boundaries (`!Send`). This is a fundamental limitation of zero-copy reads.
To bridge this gap, `quicproquo-proto` defines `ParsedEnvelope`:
```rust
pub struct ParsedEnvelope {
pub msg_type: MsgType,
pub group_id: Vec<u8>,
pub sender_id: Vec<u8>,
pub payload: Vec<u8>,
pub timestamp_ms: u64,
}
```
`ParsedEnvelope` eagerly copies all byte fields out of the Cap'n Proto reader, making the type `Send + 'static`. This allows it to cross Tokio task boundaries, be stored in queues, and be passed through channels.
The trade-off is clear: `ParsedEnvelope` allocates and copies, defeating the zero-copy benefit. This is acceptable because:
1. The copying happens once per message at the protocol boundary.
2. Application-layer code (MLS encryption/decryption, routing) needs owned data anyway.
3. The performance-critical path (Delivery Service routing) works with opaque `Vec<u8>` payloads, not parsed Cap'n Proto readers.
### Invariants
- `group_id` and `sender_id` are either empty (for control messages like Ping/Pong) or exactly 32 bytes (SHA-256 digest).
- `payload` is empty for Ping and Pong; non-empty for all MLS variants.
## Serialisation helpers
Two functions handle the conversion between `ParsedEnvelope` and wire bytes:
### `build_envelope`
```rust
pub fn build_envelope(env: &ParsedEnvelope) -> Result<Vec<u8>, capnp::Error>
```
Serialises a `ParsedEnvelope` to unpacked Cap'n Proto wire bytes. The output includes the Cap'n Proto segment table header followed by the message data. These bytes are suitable as a payload within a QUIC stream.
Internally, it builds a `capnp::message::Builder`, populates an `Envelope` root, and serialises via `capnp::serialize::write_message`.
### `parse_envelope`
```rust
pub fn parse_envelope(bytes: &[u8]) -> Result<ParsedEnvelope, capnp::Error>
```
Deserialises unpacked Cap'n Proto wire bytes into a `ParsedEnvelope`. All data is copied out of the reader before returning, so the input slice is not retained.
It returns `capnp::Error` if:
- The bytes are not valid Cap'n Proto wire format.
- The `msgType` discriminant is not present in the current schema (forward-compatibility guard).
### Low-level helpers
Two additional functions provide raw byte-to-message conversions:
```rust
pub fn to_bytes<A: Allocator>(msg: &Builder<A>) -> Result<Vec<u8>, capnp::Error>
pub fn from_bytes(bytes: &[u8]) -> Result<Reader<OwnedSegments>, capnp::Error>
```
`from_bytes` uses `ReaderOptions::new()` with default limits:
- **Traversal limit**: 32 MiB (4 * 1024 * 1024 words)
- **Nesting limit**: 512 levels
The traversal limit bounds DoS from deeply nested or excessively large Cap'n Proto messages. The server also enforces size limits: 5 MB per payload (`MAX_PAYLOAD_BYTES`) and 1 MB per KeyPackage (`MAX_KEYPACKAGE_BYTES`).
## The NodeService RPC interface
The M3 unified RPC interface is defined in `schemas/node.capnp`:
```capnp
interface NodeService {
uploadKeyPackage @0 (identityKey :Data, package :Data, auth :Auth)
-> (fingerprint :Data);
fetchKeyPackage @1 (identityKey :Data, auth :Auth) -> (package :Data);
enqueue @2 (recipientKey :Data, payload :Data,
channelId :Data, version :UInt16, auth :Auth) -> ();
fetch @3 (recipientKey :Data, channelId :Data,
version :UInt16, auth :Auth) -> (payloads :List(Data));
fetchWait @4 (recipientKey :Data, channelId :Data,
version :UInt16, timeoutMs :UInt64, auth :Auth)
-> (payloads :List(Data));
health @5 () -> (status :Text);
uploadHybridKey @6 (identityKey :Data, hybridPublicKey :Data) -> ();
fetchHybridKey @7 (identityKey :Data) -> (hybridPublicKey :Data);
}
```
This combines Authentication Service operations (`uploadKeyPackage`, `fetchKeyPackage`), Delivery Service operations (`enqueue`, `fetch`, `fetchWait`), health monitoring (`health`), and hybrid key management (`uploadHybridKey`, `fetchHybridKey`) into a single RPC interface.
### Auth context
Every mutating RPC method accepts an `Auth` struct:
```capnp
struct Auth {
version @0 :UInt16; # 0 = legacy/none, 1 = token-based auth
accessToken @1 :Data; # opaque bearer token
deviceId @2 :Data; # optional UUID bytes for auditing
}
```
The server validates the `version` field and rejects unknown versions. Token validation is planned for a future milestone. See [Auth, Devices, and Tokens](../roadmap/authz-plan.md).
## ALPN integration
Cap'n Proto RPC rides directly on the QUIC bidirectional stream. The ALPN (Application-Layer Protocol Negotiation) extension in the TLS handshake identifies the protocol:
```rust
tls.alpn_protocols = vec![b"capnp".to_vec()];
```
Both client and server set the ALPN to `b"capnp"`. If the client and server disagree on the ALPN, the TLS handshake fails before any application data is exchanged.
On the QUIC path, the flow is:
```text
Client Server
| |
|── QUIC handshake (TLS 1.3) ────►| ALPN: "capnp"
| |
|── open_bi() ───────────────────►| Bidirectional QUIC stream
| |
|◄─────── capnp-rpc messages ────►| VatNetwork reads/writes on the stream
```
The `tokio-util` compat layer converts Quinn stream types into `futures::AsyncRead + AsyncWrite`, which `capnp-rpc`'s `VatNetwork` expects. See [QUIC + TLS 1.3](quic-tls.md) for the full connection setup.
## Comparison with alternatives
### vs Protocol Buffers + gRPC
Protocol Buffers require a full deserialisation step to access any field. Cap'n Proto avoids this with zero-copy readers. gRPC requires HTTP/2 framing, which adds overhead on top of QUIC. Cap'n Proto RPC is leaner and maps naturally to a single QUIC stream.
### vs MessagePack
MessagePack is untyped -- there is no schema file, and type errors are caught at runtime. This is unacceptable for a security protocol where a misinterpreted field could be exploitable. MessagePack also has no RPC framework, requiring a hand-rolled request/response protocol.
### vs FlatBuffers
FlatBuffers supports zero-copy reads (like Cap'n Proto) but lacks a built-in RPC framework. The ecosystem and tooling are also less mature for Rust.
## Design constraints of `quicproquo-proto`
The `quicproquo-proto` crate enforces three design constraints:
The `quicproquo-proto` crate enforces three constraints:
1. **No crypto**: Key material never enters this crate. All encryption and signing happens in `quicproquo-core`.
2. **No I/O**: Callers own the transport. This crate only converts between bytes and types.
@@ -266,10 +301,12 @@ The `quicproquo-proto` crate enforces three design constraints:
These constraints keep the serialisation layer thin and auditable.
---
## Further reading
- [Envelope Schema](../wire-format/envelope-schema.md) -- Detailed field-by-field breakdown of the Envelope wire format.
- [NodeService Schema](../wire-format/node-service-schema.md) -- Full RPC interface documentation.
- [Auth Schema](../wire-format/auth-schema.md) -- Auth token structure and versioning.
- [MLS (RFC 9420)](mls.md) -- How MLS messages are carried as opaque payloads inside Cap'n Proto Envelopes.
- [ADR-002: Cap'n Proto over MessagePack](../design-rationale/adr-002-capnproto.md) -- Design rationale for choosing Cap'n Proto.
- [QUIC + TLS 1.3](quic-tls.md) -- The transport layer that carries these frames.
- [Service Architecture](../architecture/service-architecture.md) -- How the server dispatches method IDs to handlers.
- [Wire Format Reference](../wire-format/overview.md) -- Full Protobuf schema documentation.
- [MLS (RFC 9420)](mls.md) -- How MLS messages are carried as opaque payloads inside Protobuf delivery messages.
- [ADR-007](../design-rationale/adr-007-protobuf-migration.md) -- Design rationale for the v1 Cap'n Proto to v2 Protobuf migration.

@@ -9,7 +9,7 @@ This page provides a high-level comparison and a suggested reading order. The de
| Layer | Standard / Spec | Crate(s) | Security Properties |
|---|---|---|---|
| **QUIC + TLS 1.3** | RFC 9000, RFC 9001 | `quinn 0.11`, `rustls 0.23` | Transport confidentiality, server authentication, 0-RTT resumption |
| **Cap'n Proto** | [capnproto.org specification](https://capnproto.org/encoding.html) | `capnp 0.19`, `capnp-rpc 0.19` | Zero-copy deserialisation, schema-enforced types, canonical serialisation for signing, async RPC |
| **Protobuf framing** | Custom binary header + [Protocol Buffers](https://protobuf.dev/) | `quicproquo-rpc`, `prost 0.13` | Typed length-prefixed frames, method dispatch, push events, status codes |
| **MLS** | [RFC 9420](https://www.rfc-editor.org/rfc/rfc9420.html) | `openmls 0.5` | Group key agreement, forward secrecy, post-compromise security (PCS) |
| **Hybrid KEM** | [draft-ietf-tls-hybrid-design](https://datatracker.ietf.org/doc/draft-ietf-tls-hybrid-design/) | `ml-kem 0.2`, `x25519-dalek 2` | Post-quantum resistance via ML-KEM-768 combined with X25519 |
@@ -27,7 +27,8 @@ Application plaintext
|
v
+-----------+
| Cap'n Proto| Schema-typed serialisation into Envelope frames
| Protobuf | Typed serialisation into Protobuf messages
| framing | + binary header [method_id/event_type][req_id][len]
+-----------+
|
v
@@ -39,19 +40,19 @@ Application plaintext
Network
```
The Hybrid KEM layer operates orthogonally: it wraps MLS payloads in an outer post-quantum encryption envelope before they enter the transport layer. It is implemented and tested but not yet integrated into the MLS ciphersuite (planned for the M5 milestone).
The Hybrid KEM layer operates orthogonally: it wraps MLS payloads in an outer post-quantum encryption envelope before they enter the transport layer. It is implemented and tested but not yet integrated into the MLS ciphersuite (planned for a future milestone).
## Suggested reading order
The pages in this section are ordered to build understanding incrementally:
1. **[QUIC + TLS 1.3](quic-tls.md)** -- Start here. This is the transport layer that every client-server connection uses. Understanding QUIC stream multiplexing and the TLS 1.3 handshake is prerequisite to understanding how Cap'n Proto RPC rides on top.
1. **[QUIC + TLS 1.3](quic-tls.md)** -- Start here. This is the transport layer that every client-server connection uses. Understanding QUIC stream multiplexing and the TLS 1.3 handshake is prerequisite to understanding how the Protobuf framing protocol rides on top.
2. **[MLS (RFC 9420)](mls.md)** -- The core cryptographic innovation. MLS provides the group key agreement that makes quicproquo an E2E encrypted group messenger rather than just a transport-encrypted relay. This is the longest and most detailed page.
3. **[Cap'n Proto Serialisation and RPC](capn-proto.md)** -- The serialisation and RPC layer that bridges MLS application data with the transport. Understanding the Envelope schema, the ParsedEnvelope owned type, and the NodeService RPC interface is essential for reading the server and client source code.
3. **[Protobuf Framing](capn-proto.md)** -- The framing and RPC layer that bridges MLS application data with the transport. Understanding the three frame types (Request, Response, Push), the method ID dispatch table, and status codes is essential for reading the server and client source code.
4. **[Hybrid KEM: X25519 + ML-KEM-768](hybrid-kem.md)** -- The post-quantum encryption layer. Read this last because it builds on concepts from all other layers: key encapsulation (from MLS), wire format conventions (from Cap'n Proto), and AEAD encryption.
4. **[Hybrid KEM: X25519 + ML-KEM-768](hybrid-kem.md)** -- The post-quantum encryption layer. Read this last because it builds on concepts from all other layers: key encapsulation (from MLS), wire format conventions (from Protobuf framing), and AEAD encryption.
## Cross-cutting concerns
@@ -59,9 +60,9 @@ Several topics span multiple layers and have their own dedicated pages elsewhere
- **Forward secrecy**: Provided by MLS epoch ratcheting. See [Forward Secrecy](../cryptography/forward-secrecy.md).
- **Post-compromise security**: Provided by MLS Update proposals. See [Post-Compromise Security](../cryptography/post-compromise-security.md).
- **Post-quantum readiness**: Currently provided by the standalone Hybrid KEM module; integration into MLS is planned for M5. See [Post-Quantum Readiness](../cryptography/post-quantum-readiness.md).
- **Post-quantum readiness**: Currently provided by the standalone Hybrid KEM module; integration into MLS is planned. See [Post-Quantum Readiness](../cryptography/post-quantum-readiness.md).
- **Key lifecycle and zeroization**: Private key material is zeroized after use across all layers. See [Key Lifecycle and Zeroization](../cryptography/key-lifecycle.md).
- **Wire format details**: The Cap'n Proto schema definitions are documented in the [Wire Format Reference](../wire-format/overview.md) section.
- **Wire format details**: The Protobuf schema definitions are documented in the [Wire Format Reference](../wire-format/overview.md) section.
- **Design rationale**: The ADR pages explain *why* each layer was chosen. See [Design Decisions Overview](../design-rationale/overview.md).
## Crate mapping
@@ -70,8 +71,9 @@ Each protocol layer maps to one or more workspace crates:
| Layer | Primary Crate | Source File(s) |
|---|---|---|
| QUIC + TLS 1.3 | `quicproquo-server`, `quicproquo-client` | `main.rs` (server and client entry points) |
| Cap'n Proto | `quicproquo-proto` | `src/lib.rs`, `build.rs`, `schemas/*.capnp` |
| QUIC + TLS 1.3 | `quicproquo-server`, `quicproquo-client` | Server and client entry points |
| Protobuf framing | `quicproquo-rpc` | `src/framing.rs`, `src/server.rs`, `src/client.rs` |
| Protobuf types + method IDs | `quicproquo-proto` | `src/lib.rs` (method_ids), `proto/qpq/v1/*.proto` |
| MLS | `quicproquo-core` | `src/group.rs`, `src/keystore.rs` |
| Hybrid KEM | `quicproquo-core` | `src/hybrid_kem.rs` |

@@ -10,26 +10,27 @@ QUIC provides several advantages over traditional TCP-based transports:
- **0-RTT resumption**: Returning clients can send data in the first flight, reducing connection setup latency.
- **Integrated encryption**: TLS 1.3 is integral to the QUIC handshake; no extra round-trips for transport security.
- **NAT traversal**: UDP-based; connection migration survives NAT rebinding.
- **Ecosystem support**: `capnp-rpc` can use QUIC bidirectional streams directly via the `tokio-util` compat layer.
- **Per-call concurrency**: The v2 RPC framework opens one bidirectional stream per RPC call. Multiple calls run concurrently without blocking each other.
- **Push streams**: Server-to-client push events use QUIC unidirectional streams, so delivery does not incur a request-response round trip.
## Crate integration
quicproquo uses the following crates for QUIC and TLS:
- **`quinn 0.11`** -- The async QUIC implementation for Tokio. Provides `Endpoint`, `Connection`, and bidirectional stream types.
- **`quinn 0.11`** -- The async QUIC implementation for Tokio. Provides `Endpoint`, `Connection`, and both bidirectional and unidirectional stream types.
- **`quinn-proto 0.11`** -- The protocol-level types, including `QuicServerConfig` and `QuicClientConfig` wrappers that bridge `rustls` into `quinn`.
- **`rustls 0.23`** -- The TLS implementation. quicproquo uses it in strict TLS 1.3 mode with no fallback to TLS 1.2.
- **`rcgen 0.13`** -- Self-signed certificate generation for development and testing.
### Server configuration
The server builds its QUIC endpoint configuration in `build_server_config()` (in `quicproquo-server/src/main.rs`):
The server builds its QUIC endpoint configuration with:
```rust
let mut tls = rustls::ServerConfig::builder_with_protocol_versions(&[&TLS13])
.with_no_client_auth()
.with_single_cert(cert_chain, key)?;
tls.alpn_protocols = vec![b"capnp".to_vec()];
tls.alpn_protocols = vec![b"qpq".to_vec()];
let crypto = QuicServerConfig::try_from(tls)?;
Ok(ServerConfig::with_crypto(Arc::new(crypto)))
@@ -39,9 +40,9 @@ Key points:
1. **TLS 1.3 strict mode**: `builder_with_protocol_versions(&[&TLS13])` ensures no TLS 1.2 fallback. This is a hard requirement: QUIC itself mandates TLS 1.3 (RFC 9001), and TLS 1.2 offers neither 0-RTT resumption nor mandatory forward secrecy, both of which quicproquo relies on.
2. **No client certificate authentication**: `with_no_client_auth()` means the server does not verify client certificates at the TLS layer. Client authentication is handled at the application layer via Ed25519 identity keys and MLS credentials. This is a deliberate design choice -- MLS provides stronger authentication properties than TLS client certificates.
2. **No client certificate authentication**: `with_no_client_auth()` means the server does not verify client certificates at the TLS layer. Client authentication is handled at the application layer via OPAQUE password authentication and Ed25519 identity keys. This is a deliberate design choice -- OPAQUE authenticates the client without revealing the password to the server and without requiring a PKI for client certificates.
3. **ALPN negotiation**: The Application-Layer Protocol Negotiation extension is set to `b"capnp"`, advertising that this endpoint speaks Cap'n Proto RPC. Both client and server must agree on this protocol identifier or the TLS handshake fails.
3. **ALPN negotiation**: The Application-Layer Protocol Negotiation extension is set to `b"qpq"`, advertising that this endpoint speaks the quicproquo v2 Protobuf framing protocol. Both client and server must agree on this protocol identifier or the TLS handshake fails.
4. **`QuicServerConfig` bridge**: The `quinn-proto` crate provides `QuicServerConfig::try_from(tls)` to adapt the `rustls::ServerConfig` for use with QUIC. This handles the QUIC-specific TLS parameters (transport parameters, QUIC header protection keys) automatically.
@@ -53,10 +54,10 @@ The client performs the mirror operation. It loads the server's DER-encoded cert
let mut roots = rustls::RootCertStore::empty();
roots.add(CertificateDer::from(cert_bytes))?;
let tls = rustls::ClientConfig::builder_with_protocol_versions(&[&TLS13])
let mut tls = rustls::ClientConfig::builder_with_protocol_versions(&[&TLS13])
.with_root_certificates(roots)
.with_no_client_auth();
tls.alpn_protocols = vec![b"capnp".to_vec()];
tls.alpn_protocols = vec![b"qpq".to_vec()];
let crypto = QuicClientConfig::try_from(tls)?;
```
@@ -65,20 +66,26 @@ The client trusts exactly one certificate: the server's self-signed cert loaded
### Per-connection handling
Each accepted QUIC connection spawns a handler task:
The v2 server accepts connections and handles streams concurrently:
```rust
let (send, recv) = connection.accept_bi().await?;
let (reader, writer) = (recv.compat(), send.compat_write());
// Accept a QUIC connection
let connection = endpoint.accept().await?;
let network = twoparty::VatNetwork::new(reader, writer, Side::Server, Default::default());
let service: node_service::Client = capnp_rpc::new_client(NodeServiceImpl { store, waiters });
RpcSystem::new(Box::new(network), Some(service.client)).await?;
// For each incoming bidirectional stream (one per RPC call):
let (send, recv) = connection.accept_bi().await?;
// Read RequestFrame, dispatch, write ResponseFrame
tokio::spawn(handle_rpc(send, recv, server_state));
// For server-initiated push events:
let send = connection.open_uni().await?;
// Write PushFrame
tokio::spawn(send_push(send, event));
```
The `tokio-util` compat layer (`compat()` and `compat_write()`) converts Quinn's `RecvStream` and `SendStream` into types that implement `futures::AsyncRead` and `futures::AsyncWrite`, which `capnp-rpc`'s `VatNetwork` requires. The entire Cap'n Proto RPC system then runs over this single QUIC bidirectional stream.
Because `capnp-rpc` uses `Rc<RefCell<>>` internally (making it `!Send`), all RPC tasks run on a `tokio::task::LocalSet`. The server spawns each connection handler via `tokio::task::spawn_local`.
Unlike the v1 Cap'n Proto RPC (which required `tokio::task::LocalSet` due to
`!Send` internals), the v2 framework uses `Arc`-based shared state and
`tokio::spawn` for full multi-threaded concurrency.
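The "read `RequestFrame`, dispatch" step in the per-connection code can be illustrated with a minimal, standard-library-only sketch. The field widths, field order, and big-endian byte order below are assumptions made for illustration, as are the names `RequestHeader`, `encode_header`, `decode_header`, and `split_frame`; the actual layout is defined in `quicproquo-rpc` (`src/framing.rs`) and may differ.

```rust
// Hypothetical header layout, NOT taken from quicproquo-rpc:
//   method_id: u16 | request_id: u32 | payload_len: u32  (all big-endian)
const HEADER_LEN: usize = 10;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct RequestHeader {
    method_id: u16,   // selects the RPC handler
    request_id: u32,  // correlates the response with its request
    payload_len: u32, // number of Protobuf payload bytes that follow
}

/// Serialises the header into a fixed-size byte array.
fn encode_header(h: &RequestHeader) -> [u8; HEADER_LEN] {
    let mut buf = [0u8; HEADER_LEN];
    buf[0..2].copy_from_slice(&h.method_id.to_be_bytes());
    buf[2..6].copy_from_slice(&h.request_id.to_be_bytes());
    buf[6..10].copy_from_slice(&h.payload_len.to_be_bytes());
    buf
}

/// Parses a header from the start of `bytes`; returns None if too short.
fn decode_header(bytes: &[u8]) -> Option<RequestHeader> {
    if bytes.len() < HEADER_LEN {
        return None;
    }
    Some(RequestHeader {
        method_id: u16::from_be_bytes([bytes[0], bytes[1]]),
        request_id: u32::from_be_bytes([bytes[2], bytes[3], bytes[4], bytes[5]]),
        payload_len: u32::from_be_bytes([bytes[6], bytes[7], bytes[8], bytes[9]]),
    })
}

/// Splits one complete frame into its header and a borrowed payload slice.
/// Returns None if the buffer is shorter than the header claims.
fn split_frame(bytes: &[u8]) -> Option<(RequestHeader, &[u8])> {
    let header = decode_header(bytes)?;
    let end = HEADER_LEN.checked_add(header.payload_len as usize)?;
    let payload = bytes.get(HEADER_LEN..end)?;
    Some((header, payload))
}
```

Because `split_frame` returns a slice borrowed from the input buffer, a handler could pass the payload to a Protobuf decoder without copying it first; a real server would also cap `payload_len` before allocating a read buffer.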
## Certificate trust model
@@ -126,9 +133,9 @@ The QUIC + TLS 1.3 layer provides:
### What TLS does *not* provide
- **Client authentication**: Handled by MLS identity credentials at the application layer. See [MLS (RFC 9420)](mls.md).
- **End-to-end encryption**: TLS terminates at the server. The server can read the Cap'n Proto RPC framing and message routing metadata. Payload confidentiality is provided by MLS. See [MLS (RFC 9420)](mls.md).
- **Post-quantum resistance**: TLS 1.3 key exchange uses classical ECDHE. Post-quantum protection of application data is provided by the [Hybrid KEM](hybrid-kem.md) layer (M5 milestone).
- **Client authentication**: Handled by OPAQUE password authentication (methods 100-103) and Ed25519 identity keys at the application layer. See [Service Architecture](../architecture/service-architecture.md).
- **End-to-end encryption**: TLS terminates at the server. The server can read the Protobuf framing and message routing metadata. Payload confidentiality is provided by MLS. See [MLS (RFC 9420)](mls.md).
- **Post-quantum resistance**: TLS 1.3 key exchange uses classical ECDHE. Post-quantum protection of application data is provided by the [Hybrid KEM](hybrid-kem.md) layer.
## Configuration reference
@@ -136,7 +143,7 @@ The QUIC + TLS 1.3 layer provides:
| Environment Variable | CLI Flag | Default | Description |
|---|---|---|---|
| `QPQ_LISTEN` | `--listen` | `0.0.0.0:7000` | QUIC listen address |
| `QPQ_LISTEN` | `--listen` | `0.0.0.0:5001` | QUIC listen address |
| `QPQ_TLS_CERT` | `--tls-cert` | `data/server-cert.der` | TLS certificate path |
| `QPQ_TLS_KEY` | `--tls-key` | `data/server-key.der` | TLS private key path |
| `QPQ_DATA_DIR` | `--data-dir` | `data` | Persistent storage directory |
@@ -147,9 +154,9 @@ The QUIC + TLS 1.3 layer provides:
|---|---|---|---|
| `QPQ_CA_CERT` | `--ca-cert` | `data/server-cert.der` | Server certificate to trust |
| `QPQ_SERVER_NAME` | `--server-name` | `localhost` | Expected TLS server name (must match certificate SAN) |
| `QPQ_SERVER` | `--server` | `127.0.0.1:7000` | Server address (per-subcommand) |
| `QPQ_SERVER` | `--server` | `127.0.0.1:5001` | Server address (per-subcommand) |
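As a rough illustration of how the address settings in the tables above might be resolved, the sketch below picks a socket address from a CLI flag, an environment variable, or a default. The precedence (flag, then environment, then default) is an assumption based on common CLI conventions rather than something confirmed by the quicproquo sources, and `resolve_addr` is a hypothetical helper name.

```rust
use std::net::{AddrParseError, SocketAddr};

/// Resolves a socket address setting.
/// Assumed precedence: CLI flag, then environment variable, then default.
fn resolve_addr(
    cli_flag: Option<&str>,
    env_value: Option<&str>,
    default: &str,
) -> Result<SocketAddr, AddrParseError> {
    cli_flag.or(env_value).unwrap_or(default).parse()
}
```

A server could call this as `resolve_addr(flag, std::env::var("QPQ_LISTEN").ok().as_deref(), "0.0.0.0:5001")`, and a client could do the same with `QPQ_SERVER` and `127.0.0.1:5001`.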
## Further reading
- [Cap'n Proto Serialisation and RPC](capn-proto.md) -- The RPC layer that runs on top of QUIC streams.
- [Service Architecture](../architecture/service-architecture.md) -- How the server's `NodeServiceImpl` binds to the QUIC endpoint.
- [Protobuf Framing](capn-proto.md) -- The RPC framing layer that runs on top of QUIC streams.
- [Service Architecture](../architecture/service-architecture.md) -- How the server binds to the QUIC endpoint and dispatches 44 RPC methods.