docs: rewrite mdBook documentation for v2 architecture

Update 25+ files and add 6 new pages to reflect the v2 migration from
Cap'n Proto to Protobuf framing over QUIC. Integrate the SDK and Operations
docs into the mdBook, restructure SUMMARY.md, and rewrite the wire
format, architecture, and protocol sections with accurate v2 content.
2026-03-04 22:02:31 +01:00
parent f7a7f672b4
commit d073f614b3
31 changed files with 4423 additions and 2379 deletions

View File

@@ -23,12 +23,6 @@
- [TLS in quicproquo](getting-started/tls.md)
- [Certificate Lifecycle and CA-Signed TLS](getting-started/certificate-lifecycle.md)
- [Docker Deployment](getting-started/docker.md)
- [Go SDK](getting-started/go-sdk.md)
- [TypeScript SDK and Browser Demo](getting-started/typescript-sdk.md)
- [C FFI Bindings](getting-started/ffi.md)
- [WASM Integration](getting-started/wasm.md)
- [Bot SDK](getting-started/bot-sdk.md)
- [Code Generators (qpq-gen)](getting-started/generators.md)
- [Mesh Networking](getting-started/mesh-networking.md)
- [Demo Walkthrough: Alice and Bob](getting-started/demo-walkthrough.md)
@@ -48,12 +42,24 @@
- [Protocol Layers Overview](protocol-layers/overview.md)
- [QUIC + TLS 1.3](protocol-layers/quic-tls.md)
- [Cap'n Proto Serialisation and RPC](protocol-layers/capn-proto.md)
- [Protobuf Framing](protocol-layers/capn-proto.md)
- [MLS (RFC 9420)](protocol-layers/mls.md)
- [Hybrid KEM: X25519 + ML-KEM-768](protocol-layers/hybrid-kem.md)
---
# Client SDKs
- [SDK Overview](sdk/index.md)
- [Wire Format Reference](sdk/wire-format.md)
- [Rust SDK](sdk/rust.md)
- [Go SDK](getting-started/go-sdk.md)
- [TypeScript SDK and Browser Demo](getting-started/typescript-sdk.md)
- [C FFI Bindings](getting-started/ffi.md)
- [WASM Integration](getting-started/wasm.md)
---
# Cryptographic Properties
- [Cryptography Overview](cryptography/overview.md)
@@ -97,15 +103,23 @@
---
# Roadmap and Research
# Roadmap
- [Milestone Tracker](roadmap/milestones.md)
- [Phase 2 + M4M6 Roadmap](roadmap/phase2-and-m4-m6.md)
- [Phase 2 + M4-M6 Roadmap](roadmap/phase2-and-m4-m6.md)
- [Production Readiness WBS](roadmap/production-readiness.md)
- [Auth, Devices, and Tokens](roadmap/authz-plan.md)
- [1:1 Channel Design](roadmap/dm-channels.md)
- [Future Research Directions](roadmap/future-research.md)
- [Full Roadmap (Phases 18)](../../ROADMAP.md)
- [Full Roadmap (Phases 1-8)](../../ROADMAP.md)
---
# Operations
- [Monitoring](operations/monitoring.md)
- [Backup and Restore](operations/backup-restore.md)
- [Scaling Guide](operations/scaling-guide.md)
---

View File

@@ -12,18 +12,23 @@ AES-128-GCM (in the MLS ciphersuite). See [Cryptography Overview](../cryptograph
**ALPN** -- Application-Layer Protocol Negotiation. A TLS extension that allows
the client and server to agree on an application protocol during the TLS
handshake. quicproquo uses the ALPN token `b"capnp"` to identify Cap'n Proto
RPC connections. See [QUIC + TLS 1.3](../protocol-layers/quic-tls.md).
handshake. quicproquo v2 uses the ALPN token `"qpq"` (replacing the legacy
`"capnp"` token used in v1). See [Wire Format Overview](../wire-format/overview.md).
**AS** -- Authentication Service. The server component that stores and
distributes single-use MLS KeyPackages. Clients upload KeyPackages after identity
generation; peers fetch them to add new members to a group.
See [Architecture Overview](../architecture/overview.md).
**Argon2id** -- A memory-hard password hashing and key derivation function
(winner of the Password Hashing Competition, 2015). quicproquo uses Argon2id
to derive the SQLCipher encryption key from the server's passphrase, and
optionally for client-side key derivation. See [Storage Backend](../internals/storage-backend.md).
**AS** -- Authentication Service. The server component that handles OPAQUE
registration and login, stores single-use MLS KeyPackages, and manages hybrid
post-quantum public keys. See [Authentication Service Internals](../internals/authentication-service.md).
**Cap'n Proto** -- A zero-copy serialisation format with a built-in RPC system.
quicproquo uses Cap'n Proto for all wire messages and service RPCs. Schemas
live in `schemas/*.capnp` and are compiled to Rust at build time.
See [Cap'n Proto Serialisation and RPC](../protocol-layers/capn-proto.md).
Used in quicproquo v1 for all wire messages and service RPCs. Schemas lived in
`schemas/*.capnp`. In v2, Cap'n Proto is replaced by Protobuf (prost) for RPC
messages, though the legacy Cap'n Proto types remain in `quicproquo-proto` for
backward compatibility. See the v1 archive in `crates/quicproquo-proto/`.
**Commit** -- An MLS message type that advances the group to a new epoch. When a
member sends a Commit (e.g., after adding or removing a member), all group
@@ -42,7 +47,7 @@ self-signed TLS certificate generated by quicproquo is DER-encoded.
**DS** -- Delivery Service. The server component that provides store-and-forward
relay for opaque MLS payloads. The DS never inspects ciphertext -- it routes
solely by recipient public key and optional channel ID.
solely by recipient public key, channel ID, and device ID.
See [Architecture Overview](../architecture/overview.md).
**Ed25519** -- Edwards-curve Digital Signature Algorithm on Curve25519. Used for
@@ -80,8 +85,8 @@ is consumed on fetch. See
**ML-KEM-768** -- Module-Lattice-based Key Encapsulation Mechanism, security
level 3 (NIST FIPS 203). A post-quantum KEM based on the hardness of the
module learning-with-errors (MLWE) problem. quicproquo plans to use ML-KEM-768
in a hybrid construction with X25519 at milestone M7.
module learning-with-errors (MLWE) problem. quicproquo uses ML-KEM-768 in a
hybrid construction with X25519 for post-quantum sealed envelope encryption.
See [Post-Quantum Readiness](../cryptography/post-quantum-readiness.md).
**MLS** -- Messaging Layer Security. A protocol for group key agreement defined
@@ -89,6 +94,13 @@ in RFC 9420. MLS provides forward secrecy and post-compromise security for
groups of any size through an efficient tree-based key schedule.
See [MLS (RFC 9420)](../protocol-layers/mls.md).
**OPAQUE** -- Asymmetric Password-Authenticated Key Exchange (RFC 9497). A
password authentication protocol in which the server never learns the user's
password, not even during registration. The server stores an OPAQUE registration
record derived from the password. quicproquo uses OPAQUE for all user
authentication (replacing static token auth in v1).
See [Authentication Service Internals](../internals/authentication-service.md).
**PCS** -- Post-Compromise Security. The property that a protocol recovers
security after a member's state is compromised. In MLS, once a compromised
member sends an Update or Commit, subsequent epochs are secure again (assuming
@@ -101,6 +113,16 @@ record was requested. Explored as a future enhancement for metadata-hiding
KeyPackage and message fetch.
See [Future Research](../roadmap/future-research.md).
**prost** -- A Rust Protobuf code generation and runtime library. Used in
quicproquo v2 to generate Rust types from `proto/qpq/v1/*.proto` files at
build time. The generated types live in `crates/quicproquo-proto/`.
See [Rust Crate Documentation](references.md).
**Protobuf** -- Protocol Buffers. A language-neutral, binary serialisation format
from Google. quicproquo v2 uses Protobuf for all RPC message payloads, encoded
using the `prost` crate. Proto definitions live in `proto/qpq/v1/`.
See [Wire Format Overview](../wire-format/overview.md).
**QUIC** -- A UDP-based, multiplexed, encrypted transport protocol defined in
RFC 9000. QUIC integrates TLS 1.3 for authentication and confidentiality and
provides 0-RTT connection establishment, stream multiplexing, and built-in
@@ -112,6 +134,12 @@ group key derivation. Each leaf corresponds to a group member; internal nodes
hold derived key material. Updates propagate along the path from a leaf to the
root, giving O(log N) cost for key updates in a group of N members.
**SQLCipher** -- An open-source extension to SQLite that provides transparent,
page-level AES-256 encryption of the database file. quicproquo uses SQLCipher
as the primary server-side storage backend via the `rusqlite` crate with the
`sqlcipher` feature. The encryption key is derived from a server passphrase
using Argon2id. See [Storage Backend](../internals/storage-backend.md).
**TLS 1.3** -- Transport Layer Security version 1.3, defined in RFC 8446. The
standard for authenticated, encrypted transport. quicproquo uses TLS 1.3
exclusively (via `rustls` with `TLS13` cipher suites only) as part of the QUIC

View File

@@ -11,14 +11,14 @@ category.
| Reference | Description |
|-----------|-------------|
| [RFC 9420 -- The Messaging Layer Security (MLS) Protocol](https://datatracker.ietf.org/doc/rfc9420/) | The group key agreement protocol used by quicproquo. Defines KeyPackages, Welcome messages, Commits, the ratchet tree, epoch advancement, and the security properties (forward secrecy, post-compromise security). See [MLS (RFC 9420)](../protocol-layers/mls.md). |
| [RFC 9497 -- The OPAQUE Asymmetric PAKE Protocol](https://datatracker.ietf.org/doc/rfc9497/) | Asymmetric password-authenticated key exchange. quicproquo uses OPAQUE for all user registration and login. The server never learns the user's password. See [Authentication Service Internals](../internals/authentication-service.md). |
| [RFC 9000 -- QUIC: A UDP-Based Multiplexed and Secure Transport](https://datatracker.ietf.org/doc/rfc9000/) | The transport protocol underlying quicproquo's primary connection layer. Provides multiplexed streams, 0-RTT connection establishment, and built-in congestion control. See [QUIC + TLS 1.3](../protocol-layers/quic-tls.md). |
| [RFC 9001 -- Using TLS to Secure QUIC](https://datatracker.ietf.org/doc/rfc9001/) | Defines how TLS 1.3 is integrated into QUIC for authentication and key exchange. quicproquo uses this via the `quinn` + `rustls` stack. |
| [RFC 8446 -- The Transport Layer Security (TLS) Protocol Version 1.3](https://datatracker.ietf.org/doc/rfc8446/) | The TLS version used exclusively by quicproquo (no TLS 1.2 fallback). Provides the handshake, key schedule, and record layer for QUIC transport security. |
| [RFC 9180 -- Hybrid Public Key Encryption (HPKE)](https://datatracker.ietf.org/doc/rfc9180/) | The public-key encryption scheme used internally by MLS for encrypting to KeyPackage init keys. quicproquo's MLS ciphersuite uses DHKEM(X25519, HKDF-SHA256) with AES-128-GCM. |
| [NIST FIPS 203 -- Module-Lattice-Based Key-Encapsulation Mechanism Standard (ML-KEM)](https://csrc.nist.gov/pubs/fips/203/final) | The post-quantum KEM standard. quicproquo plans to use ML-KEM-768 in a hybrid construction with X25519 at milestone M7. See [Post-Quantum Readiness](../cryptography/post-quantum-readiness.md). |
| [Cap'n Proto specification](https://capnproto.org/) | The zero-copy serialisation format and RPC system used for all quicproquo wire messages and service interfaces. See [Cap'n Proto Serialisation and RPC](../protocol-layers/capn-proto.md). |
| [NIST FIPS 203 -- Module-Lattice-Based Key-Encapsulation Mechanism Standard (ML-KEM)](https://csrc.nist.gov/pubs/fips/203/final) | The post-quantum KEM standard. quicproquo uses ML-KEM-768 in a hybrid construction with X25519 for post-quantum sealed envelope encryption. See [Post-Quantum Readiness](../cryptography/post-quantum-readiness.md). |
| [Protocol Buffers Language Guide (proto3)](https://protobuf.dev/programming-guides/proto3/) | The binary serialisation format used for all v2 RPC message payloads. quicproquo proto definitions live in `proto/qpq/v1/`. See [Wire Format Overview](../wire-format/overview.md). |
| [draft-ietf-tls-hybrid-design -- Hybrid Key Exchange in TLS 1.3](https://datatracker.ietf.org/doc/draft-ietf-tls-hybrid-design/) | The combiner approach used by quicproquo's hybrid KEM construction (X25519 shared secret concatenated with ML-KEM-768 shared secret, fed through HKDF). See [Hybrid KEM](../protocol-layers/hybrid-kem.md). |
| [RFC 9497 -- OPAQUE](https://datatracker.ietf.org/doc/rfc9497/) | Asymmetric password-authenticated key exchange. Considered for future authentication (see [Future Research](../roadmap/future-research.md)). |
---
@@ -28,18 +28,22 @@ category.
|-------|---------|----------------------|
| `openmls` | [docs.rs/openmls](https://docs.rs/openmls/) | MLS protocol implementation: group creation, member addition, Welcome processing, application message encryption/decryption. See [MLS (RFC 9420)](../protocol-layers/mls.md). |
| `openmls_rust_crypto` | [docs.rs/openmls_rust_crypto](https://docs.rs/openmls_rust_crypto/) | Pure-Rust cryptographic backend for openmls. Provides the `OpenMlsRustCrypto` provider used by `GroupMember`. |
| `prost` | [docs.rs/prost](https://docs.rs/prost/) | Protobuf runtime for Rust. Used to encode/decode all v2 RPC messages. Generated types are in `crates/quicproquo-proto/`. |
| `prost-build` | [docs.rs/prost-build](https://docs.rs/prost-build/) | Build-time Protobuf code generator invoked from `crates/quicproquo-proto/build.rs`. Reads `.proto` files and emits Rust structs. |
| `protobuf-src` | [docs.rs/protobuf-src](https://docs.rs/protobuf-src/) | Vendors the `protoc` compiler as a build dependency. No system-installed protoc required. |
| `quinn` | [docs.rs/quinn](https://docs.rs/quinn/) | QUIC transport implementation. Provides the `Endpoint`, `Connection`, and stream types for client and server. See [QUIC + TLS 1.3](../protocol-layers/quic-tls.md). |
| `rustls` | [docs.rs/rustls](https://docs.rs/rustls/) | TLS 1.3 implementation used by `quinn`. Configured with `TLS13` cipher suites only and custom certificate verification. |
| `capnp` | [docs.rs/capnp](https://docs.rs/capnp/) | Cap'n Proto serialisation library. Used for building and reading all wire messages. |
| `capnp-rpc` | [docs.rs/capnp-rpc](https://docs.rs/capnp-rpc/) | Cap'n Proto RPC framework. Provides the async RPC system for `NodeService`. Runs inside the QUIC encrypted channel. |
| `capnpc` | [docs.rs/capnpc](https://docs.rs/capnpc/) | Cap'n Proto compiler invoked at build time (`build.rs`) to generate Rust types from `.capnp` schemas. |
| `ml-kem` | [docs.rs/ml-kem](https://docs.rs/ml-kem/) | ML-KEM (NIST FIPS 203) implementation. Vendored in the workspace for the planned hybrid post-quantum KEM (M7). |
| `opaque-ke` | [docs.rs/opaque-ke](https://docs.rs/opaque-ke/) | OPAQUE asymmetric PAKE implementation (RFC 9497). Used for server-side registration record generation and client-side credential derivation. |
| `rusqlite` | [docs.rs/rusqlite](https://docs.rs/rusqlite/) | SQLite bindings for Rust. Used with the `sqlcipher` feature for the SQLCipher-encrypted database backend. See [Storage Backend](../internals/storage-backend.md). |
| `argon2` | [docs.rs/argon2](https://docs.rs/argon2/) | Argon2id key derivation. Used to derive the SQLCipher encryption key from the server passphrase. |
| `ml-kem` | [docs.rs/ml-kem](https://docs.rs/ml-kem/) | ML-KEM (NIST FIPS 203) implementation. Used in the hybrid X25519 + ML-KEM-768 KEM for post-quantum envelope encryption. |
| `ed25519-dalek` | [docs.rs/ed25519-dalek](https://docs.rs/ed25519-dalek/) | Ed25519 signing and verification. Used for MLS identity credentials (`BasicCredential`). See [Ed25519 Identity Keys](../cryptography/identity-keys.md). |
| `x25519-dalek` | [docs.rs/x25519-dalek](https://docs.rs/x25519-dalek/) | X25519 Diffie-Hellman key exchange. Used in hybrid KEM (X25519 + ML-KEM-768) and as the classical component of DHKEM in MLS HPKE. See [Hybrid KEM](../protocol-layers/hybrid-kem.md). |
| `zeroize` | [docs.rs/zeroize](https://docs.rs/zeroize/) | Secure memory zeroisation. All private key types implement `Zeroize + ZeroizeOnDrop`. See [Key Lifecycle and Zeroization](../cryptography/key-lifecycle.md). |
| `bytes` | [docs.rs/bytes](https://docs.rs/bytes/) | Zero-copy byte buffer abstraction. Used in the RPC framing layer (`quicproquo-rpc`) for efficient frame encoding/decoding without copying. |
| `tokio` | [docs.rs/tokio](https://docs.rs/tokio/) | Async runtime. All server and client I/O runs on Tokio. |
| `tower` | [docs.rs/tower](https://docs.rs/tower/) | Service abstraction and middleware framework. Used in `quicproquo-rpc` for RPC middleware (auth, rate limiting, tracing). |
| `clap` | [docs.rs/clap](https://docs.rs/clap/) | CLI argument parser for the client binary. |
| `dashmap` | [docs.rs/dashmap](https://docs.rs/dashmap/) | Concurrent hash map. Used for the in-memory AS key store and DS delivery queues (to be replaced by SQLite at M6). |
| `tracing` | [docs.rs/tracing](https://docs.rs/tracing/) | Structured logging framework. Used throughout the server for request logging and diagnostics. |
| `thiserror` | [docs.rs/thiserror](https://docs.rs/thiserror/) | Derive macro for typed error enums in library crates. |
| `anyhow` | [docs.rs/anyhow](https://docs.rs/anyhow/) | Flexible error handling for application crates (server, client). |
@@ -89,6 +93,18 @@ The predecessor to ML-KEM (NIST FIPS 203). CRYSTALS-Kyber was selected by NIST
and standardised as ML-KEM. quicproquo uses the `ml-kem` crate which
implements the final FIPS 203 standard.
### OPAQUE
**"The OPAQUE Asymmetric PAKE Protocol"**
Stanislaw Jarecki, Hugo Krawczyk, and Jiayu Xu.
*EUROCRYPT 2018.*
The original academic paper introducing OPAQUE. Standardised as RFC 9497.
Relevant background for understanding the security guarantees of quicproquo's
authentication system: the server stores a verifier (not the password), and the
protocol is resistant to pre-computation attacks even if the server's verifier
database is stolen.
### Metadata Resistance
**"Sealed Sender"**

View File

@@ -1,9 +1,9 @@
# Crate Responsibilities
The quicproquo workspace contains six crates. The main four (proto, core,
The quicproquo workspace contains nine crates. The core four (proto, core,
server, client) follow strict layering rules; each owns one concern and depends
only on the crates below it. The workspace also includes **quicproquo-gui**
(Tauri desktop app) and **quicproquo-p2p** (P2P endpoint resolution). This
only on the crates below it. The workspace also includes dedicated crates for
the RPC framework, client SDK, key transparency, plugin API, and P2P. This
page documents what each crate provides, what it explicitly avoids, and how the
crates relate to one another.
@@ -12,33 +12,47 @@ crates relate to one another.
## Dependency Flow Diagram
```text
         ┌──────────────────────────┐
         │    quicproquo-client     │
         │   (CLI, QUIC client,     │
         │  GroupMember orchestr.)  │
         └─────────┬───────┬────────┘
                   │       │
           ┌───────┘       └────────┐
           ▼                        ▼
┌────────────────────────┐  ┌────────────────────────┐
│  quicproquo-core       │  │  quicproquo-server     │
│  (crypto, MLS,         │  │  (QUIC listener,       │
│   hybrid KEM)          │  │   NodeService RPC,     │
│                        │  │   storage)             │
└──────────┬─────────────┘  └─────────┬──────────────┘
           │                          │
           │      ┌───────────────────┘
           ▼      ▼
    ┌────────────────────────┐
    │   quicproquo-proto     │
    │  (Cap'n Proto schemas, │
    │   codegen, helpers)    │
    └────────────────────────┘
+-------------------+      +-------------------+
| quicproquo-client |      |  quicproquo-sdk   |
| (CLI/TUI binary)  |      | (QpqClient, store)|
+--------+----------+      +--------+----------+
         |                          |
         +----------+    +----------+
                    |    |
                    v    v
         +----------------------+
         |   quicproquo-rpc     |
         |  (framing, server,   |
         |  client, middleware) |
         +----------+-----------+
                    |
    +---------------+---------------+
    |                               |
    v                               v
+------------------------+  +-----------------------------+
|   quicproquo-core      |  |   quicproquo-server         |
|   (crypto, MLS,        |  |   (RPC server + domain      |
|   hybrid KEM)          |  |   services)                 |
+-----------+------------+  +--------------+--------------+
            |                              |
            |    +-------------------+     |
            +--->| quicproquo-proto  |<----+
                 | (capnp legacy +   |
                 |  prost v2 types)  |
                 +-------------------+

(separate, no shared deps)
+-------------------+  +-------------------+  +--------------------+
|   quicproquo-kt   |  |  quicproquo-p2p   |  |    quicproquo-     |
| (key transparency)|  |    (iroh P2P)     |  |    plugin-api      |
+-------------------+  +-------------------+  | (#![no_std] C-ABI) |
                                              +--------------------+
```
**Arrows point from dependant to dependency.** The proto crate sits at the base
of the dependency graph. The core crate depends on proto for envelope
serialisation. The server and client crates both depend on core and proto.
of the dependency graph. The core crate depends on proto for legacy envelope
serialisation. The rpc crate provides the framing and dispatch layer used by
both the sdk and server.
---
@@ -54,48 +68,61 @@ dependency.
| `identity` | `IdentityKeypair` | Ed25519 signing keypair for MLS credentials. Seed stored as `Zeroizing<[u8; 32]>`. Implements `openmls_traits::Signer`. |
| `group` | `GroupMember` | MLS group state machine wrapping `openmls::MlsGroup`. Lifecycle: `new` -> `generate_key_package` -> `create_group` / `join_group` -> `send_message` / `receive_message`. |
| `keypackage` | `generate_key_package` | Standalone KeyPackage generation (returns TLS-encoded bytes + SHA-256 fingerprint). |
| `keystore` | `DiskKeyStore`, `StoreCrypto` | `OpenMlsKeyStore` implementation backed by an in-memory `HashMap` with optional bincode flush to disk. `StoreCrypto` couples `RustCrypto` + `DiskKeyStore` into an `OpenMlsCryptoProvider`. |
| `keystore` | `DiskKeyStore`, `StoreCrypto` | `OpenMlsKeyStore` implementation backed by an in-memory `HashMap` with bincode flush to disk. `StoreCrypto` couples `RustCrypto` + `DiskKeyStore` into an `OpenMlsCryptoProvider`. |
| `hybrid_kem` | `HybridKeypair`, `HybridPublicKey`, `hybrid_encrypt`, `hybrid_decrypt` | X25519 + ML-KEM-768 hybrid KEM. HKDF-SHA256 key derivation, ChaCha20-Poly1305 AEAD. Versioned envelope wire format. |
| `error` | `CoreError`, `MAX_PLAINTEXT_LEN` | Unified error types. `CoreError` covers Cap'n Proto, MLS, and hybrid KEM failures. |
| `error` | `CoreError`, `MAX_PLAINTEXT_LEN` | Unified error types covering MLS and hybrid KEM failures. |
### What this crate does NOT do
- No network I/O.
- No QUIC or TLS -- that is the server and client crates' concern.
- No async runtime setup (it uses Tokio types internally but does not spawn or
manage a runtime).
- No async runtime setup.
- No CLI parsing.
### Key dependencies
`ed25519-dalek`, `openmls`, `openmls_rust_crypto`,
`openmls_traits`, `tls_codec`, `ml-kem`, `x25519-dalek`, `chacha20poly1305`,
`hkdf`, `sha2`, `zeroize`, `capnp`, `quicproquo-proto`, `tokio`,
`serde`, `bincode`, `serde_json`, `thiserror`.
`hkdf`, `sha2`, `zeroize`, `quicproquo-proto`, `serde`, `bincode`, `thiserror`.
---
## quicproquo-proto
**Role:** Cap'n Proto schema definitions, compile-time code generation, and
pure-synchronous serialisation helpers. This crate is the single source of truth
for the wire format.
**Role:** Protocol type definitions for both v1 (legacy Cap'n Proto) and v2
(Protobuf/prost). This crate is the single source of truth for wire types and
method ID constants.
### Contents
| Item | Description |
|---------------------------|-------------|
| `schemas/envelope.capnp` | `Envelope` struct and `MsgType` enum -- top-level wire message. |
| `schemas/auth.capnp` | `AuthenticationService` interface -- `uploadKeyPackage`, `fetchKeyPackage`. |
| `schemas/delivery.capnp` | `DeliveryService` interface -- `enqueue`, `fetch`. |
| `schemas/node.capnp` | `NodeService` interface (unified AS+DS) -- all RPC methods plus `Auth` struct. |
| `build.rs` | Invokes `capnpc` to generate Rust types from the four `.capnp` files. |
| `lib.rs` | `pub mod envelope_capnp`, `auth_capnp`, `delivery_capnp`, `node_capnp` -- re-exports generated modules. |
| `MsgType` | Re-exported enum from `envelope_capnp::envelope::MsgType`. |
| `ParsedEnvelope` | Owned, `Send + 'static` representation of a decoded `Envelope`. All byte fields are eagerly copied out of the Cap'n Proto reader. |
| `build_envelope` | Serialise a `ParsedEnvelope` to unpacked Cap'n Proto wire bytes. |
| `parse_envelope` | Deserialise wire bytes into a `ParsedEnvelope`. |
| `to_bytes` / `from_bytes` | Low-level Cap'n Proto message <-> byte conversions. |
| Item | Description |
|-------------------------------|-------------|
| `schemas/*.capnp` | Legacy Cap'n Proto schemas (auth, delivery, node, federation). |
| `proto/qpq/v1/*.proto` | 14 Protobuf files defining all v2 message types. |
| `build.rs` | Invokes `capnpc` for legacy types and `prost-build` for v2 types. |
| `pub mod qpq::v1` | All Protobuf-generated types, included via `prost` `include!`. |
| `pub mod method_ids` | All 44 RPC method ID constants (u16) plus 4 push event type constants. |
| `auth_capnp`, `node_capnp`... | Re-exported legacy Cap'n Proto generated modules. |
### method_ids ranges
| Range | Category |
|---|---|
| 100-103 | Auth (OPAQUE register/login) |
| 200-205 | Delivery (enqueue, fetch, ack) |
| 300-304 | Keys (key packages, hybrid keys) |
| 400 | Channel creation |
| 410-413 | Group management |
| 420-424 | Moderation |
| 500-501 | User / identity resolution |
| 510-520 | Key transparency |
| 600-601 | Blob storage |
| 700-710 | Device management + push tokens |
| 750-752 | Recovery bundles |
| 800-802 | P2P endpoints + health |
| 900-905 | Federation relay |
| 950 | Account deletion |
| 1000-1003 | Push event types (server-to-client) |
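The range table above can be expressed as a single lookup. A minimal std-only sketch; the function name is illustrative, not an actual `quicproquo-proto` item, but the numeric ranges match the table:

```rust
// Map a method id to its category, mirroring the documented ranges.
// `method_category` is a hypothetical helper, not part of the crate's API.
fn method_category(id: u16) -> &'static str {
    match id {
        100..=103 => "auth",
        200..=205 => "delivery",
        300..=304 => "keys",
        400 => "channel",
        410..=413 => "group",
        420..=424 => "moderation",
        500..=501 => "user",
        510..=520 => "kt",
        600..=601 => "blob",
        700..=710 => "device",
        750..=752 => "recovery",
        800..=802 => "p2p",
        900..=905 => "federation",
        950 => "account",
        1000..=1003 => "push",
        _ => "unknown", // gaps between ranges are reserved
    }
}
```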
### What this crate does NOT do
@@ -106,155 +133,178 @@ for the wire format.
### Key dependencies
`capnp` (runtime), `capnpc` (build-time only).
`capnp` (runtime), `capnpc` (build-time), `prost`, `prost-build` (build-time),
`bytes`.
---
## quicproquo-rpc
**Role:** v2 RPC framework. Implements the custom binary framing protocol,
server-side dispatch, client-side request/response handling, and Tower
middleware (rate limiting, timeouts, authentication).
### Components
| Component | Description |
|------------------|-------------|
| `framing` | `RequestFrame`, `ResponseFrame`, `PushFrame` encode/decode with big-endian headers. Max payload: 4 MiB. |
| `server` | `RpcServer` accepts QUIC connections, reads request frames, dispatches to registered handlers, writes response frames. |
| `client` | `RpcClient` opens per-RPC QUIC streams, writes request frames, reads response frames. |
| `middleware` | Tower `Service` wrappers: rate limiter, deadline/timeout, auth token injection. |
| `error` | `RpcError`, `RpcStatus` enum (Ok=0, BadRequest=1, Unauthorized=2, ... UnknownMethod=11). |
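The status byte in each response maps to `RpcStatus`. A sketch showing only the codes named in the table (the crate defines codes 3-10 as well, elided here):

```rust
// Partial sketch of the RpcStatus wire mapping; only codes named in the
// docs are shown. Codes 3-10 exist in the real enum but are omitted.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RpcStatus {
    Ok = 0,
    BadRequest = 1,
    Unauthorized = 2,
    UnknownMethod = 11,
}

fn status_from_byte(b: u8) -> Option<RpcStatus> {
    match b {
        0 => Some(RpcStatus::Ok),
        1 => Some(RpcStatus::BadRequest),
        2 => Some(RpcStatus::Unauthorized),
        11 => Some(RpcStatus::UnknownMethod),
        _ => None, // other defined codes omitted from this sketch
    }
}
```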
### Frame format (implemented here)
```
Request: [method_id: u16 BE][request_id: u32 BE][payload_len: u32 BE][protobuf]
Response: [status: u8][request_id: u32 BE][payload_len: u32 BE][protobuf]
Push: [event_type: u16 BE][payload_len: u32 BE][protobuf]
```
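The request layout can be encoded and decoded with the standard library alone. A sketch with illustrative function names (not `quicproquo-rpc`'s public API), enforcing the 4 MiB payload cap:

```rust
// Sketch of the request-frame layout above, std-only and illustrative.
const MAX_PAYLOAD: usize = 4 * 1024 * 1024; // 4 MiB cap from the component table

fn encode_request(method_id: u16, request_id: u32, payload: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(10 + payload.len());
    buf.extend_from_slice(&method_id.to_be_bytes());              // [method_id: u16 BE]
    buf.extend_from_slice(&request_id.to_be_bytes());             // [request_id: u32 BE]
    buf.extend_from_slice(&(payload.len() as u32).to_be_bytes()); // [payload_len: u32 BE]
    buf.extend_from_slice(payload);                               // [protobuf bytes]
    buf
}

fn decode_request(buf: &[u8]) -> Option<(u16, u32, &[u8])> {
    if buf.len() < 10 {
        return None; // header is always 10 bytes
    }
    let method_id = u16::from_be_bytes([buf[0], buf[1]]);
    let request_id = u32::from_be_bytes([buf[2], buf[3], buf[4], buf[5]]);
    let len = u32::from_be_bytes([buf[6], buf[7], buf[8], buf[9]]) as usize;
    if len > MAX_PAYLOAD || buf.len() != 10 + len {
        return None; // enforce the size limit and exact framing
    }
    Some((method_id, request_id, &buf[10..]))
}
```

Response and push frames follow the same pattern with their own header fields.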
### What this crate does NOT do
- No domain logic -- handlers are registered by the server crate.
- No crypto operations.
### Key dependencies
`quinn`, `rustls`, `tokio`, `bytes`, `tower`, `prost`, `quicproquo-proto`,
`tracing`, `thiserror`.
---
## quicproquo-sdk
**Role:** High-level client SDK. `QpqClient` wraps the RPC client with
typed methods, an async event broadcast channel, and a `ConversationStore`
for local conversation state.
### Components
| Component | Description |
|---------------------|-------------|
| `QpqClient` | Authenticated client: `register`, `login`, `send_message`, `fetch_messages`, etc. |
| Event channel | `tokio::sync::broadcast` channel delivering `ClientEvent` variants (NewMessage, Typing, Presence, Membership). |
| `ConversationStore` | SQLCipher-backed local store for message history and group state. |
### What this crate does NOT do
- No raw frame handling -- delegates to `quicproquo-rpc`.
- No MLS group state management -- delegates to `quicproquo-core`.
### Key dependencies
`quicproquo-rpc`, `quicproquo-core`, `quicproquo-proto`, `tokio`, `rusqlite`,
`prost`, `tracing`, `thiserror`, `anyhow`.
---
## quicproquo-server
**Role:** Network-facing server binary. Accepts QUIC + TLS 1.3 connections,
dispatches Cap'n Proto RPC calls to `NodeServiceImpl`, and persists state to
disk via `FileBackedStore`.
dispatches 44 Protobuf RPC methods through registered handlers in `domain/`,
and persists state to SQLCipher.
### Components
| Component | Description |
|----------------------|-------------|
| `NodeServiceImpl` | Implements `node_service::Server` (Cap'n Proto generated trait). Handles all eight RPC methods: `uploadKeyPackage`, `fetchKeyPackage`, `enqueue`, `fetch`, `fetchWait`, `health`, `uploadHybridKey`, `fetchHybridKey`. |
| `FileBackedStore` | Mutex-guarded `HashMap`s for KeyPackages (keyed by Ed25519 public key), delivery queues (keyed by `ChannelKey = (channelId, recipientKey)`), and hybrid public keys. Each mutation flushes the full map to a bincode file on disk. |
| `DashMap` waiters | `DashMap<Vec<u8>, Arc<Notify>>` -- per-recipient `tokio::sync::Notify` instances for `fetchWait` long-polling. `enqueue` calls `notify_waiters()` after appending. |
| TLS config | Self-signed certificate auto-generated on first run (`rcgen`). TLS 1.3 only, ALPN `capnp`. |
| CLI (`clap`) | `--listen` (default `0.0.0.0:7000`), `--data-dir`, `--tls-cert`, `--tls-key`. |
| `v2_handlers/` | One handler module per method category (auth, delivery, keys, channel, group, user, kt, blob, device, p2p, federation, moderation, recovery, account). |
| `domain/` | Protocol-agnostic domain types and service logic (e.g., `AuthService`, `DeliveryService`, `KeyService`). |
| `ServerState` | Shared state: SQLCipher connection pool, DashMap waiters, OPAQUE server state. |
| TLS config | Self-signed certificate auto-generated on first run (`rcgen`). TLS 1.3 only, ALPN `qpq`. |
| CLI (`clap`) | `--listen` (default `0.0.0.0:5001`), `--data-dir`, `--tls-cert`, `--tls-key`. |
### Connection lifecycle
```text
QUIC accept
└─ TLS 1.3 handshake (self-signed cert, ALPN "capnp")
└─ accept_bi() -> bidirectional QUIC stream
└─ tokio_util::compat adapters (AsyncRead/AsyncWrite)
└─ capnp-rpc twoparty::VatNetwork (Side::Server)
└─ RpcSystem drives NodeServiceImpl
```
```text
QUIC accept (ALPN: "qpq")
+- TLS 1.3 handshake (self-signed cert)
+- Per-stream: read RequestFrame -> dispatch to handler -> write ResponseFrame
+- Uni-stream (server -> client): write PushFrame for events
```
Because `capnp-rpc` uses `Rc<RefCell<>>` internally and is therefore `!Send`,
the entire RPC stack runs on a `tokio::task::LocalSet`. Each incoming connection
is handled by `spawn_local`.
Each RPC call gets its own QUIC bidirectional stream; handlers run concurrently
via `tokio::spawn`.
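The per-stream model implies a registry mapping method ids to handlers. A minimal std-only sketch of that shape; `Dispatcher` and `Handler` are hypothetical names, and status codes follow the `RpcStatus` table (0 = Ok, 11 = UnknownMethod):

```rust
use std::collections::HashMap;

// Minimal dispatch sketch (assumed shape, not the real RpcServer):
// handlers are registered by method id; unknown ids map to UnknownMethod.
type Handler = Box<dyn Fn(&[u8]) -> Result<Vec<u8>, u8>>;

struct Dispatcher {
    handlers: HashMap<u16, Handler>,
}

impl Dispatcher {
    fn new() -> Self {
        Dispatcher { handlers: HashMap::new() }
    }

    fn register(&mut self, method_id: u16, handler: Handler) {
        self.handlers.insert(method_id, handler);
    }

    /// Route one decoded request frame; returns (status, response payload).
    fn dispatch(&self, method_id: u16, payload: &[u8]) -> (u8, Vec<u8>) {
        match self.handlers.get(&method_id) {
            Some(h) => match h(payload) {
                Ok(body) => (0, body),      // RpcStatus::Ok
                Err(status) => (status, Vec::new()),
            },
            None => (11, Vec::new()),       // RpcStatus::UnknownMethod
        }
    }
}
```

In the real server each call would run inside its own `tokio::spawn`ed task; this sketch shows only the routing step.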
### What this crate does NOT do
- No direct crypto operations (it delegates to `quicproquo-core` types
for fingerprinting and storage only).
- No MLS processing -- all payloads are opaque byte strings.
- No direct crypto beyond OPAQUE server-side operations.
- No MLS processing -- all MLS payloads are opaque byte strings.
### Key dependencies
`quicproquo-core`, `quicproquo-proto`, `quinn`, `quinn-proto`,
`rustls`, `rcgen`, `capnp`, `capnp-rpc`, `tokio`, `tokio-util`, `dashmap`,
`sha2`, `clap`, `tracing`, `anyhow`, `thiserror`, `bincode`, `serde`.
`quicproquo-core`, `quicproquo-proto`, `quicproquo-rpc`, `quinn`, `rustls`,
`rcgen`, `tokio`, `dashmap`, `rusqlite`, `prost`, `clap`, `tracing`, `anyhow`,
`thiserror`.
---
## quicproquo-client
**Role:** CLI client binary. Connects to the server over QUIC + TLS 1.3,
orchestrates MLS group operations via `GroupMember`, and persists identity and
group state to disk.
### Components
| Component | Description |
|-------------------------|-------------|
| `connect_node` | Establishes a QUIC/TLS connection, opens a bidirectional stream, and bootstraps a `capnp-rpc` `RpcSystem` to obtain a `node_service::Client`. |
| CLI subcommands (`clap`)| `ping`, `register`, `fetch-key`, `demo-group`, `register-state`, `create-group`, `invite`, `join`, `send`, `recv`. |
| `GroupMember` usage | The client creates a `GroupMember` (from `quicproquo-core`), calls `generate_key_package` / `create_group` / `add_member` / `join_group` / `send_message` / `receive_message`. |
| State persistence | `StoredState` holds `identity_seed` (32 bytes) and optional serialised `MlsGroup`. A companion `.ks` file stores the `DiskKeyStore` with HPKE init private keys. |
| Auth context | `ClientAuth` bundles an optional bearer token and device ID. Passed to every RPC via the `Auth` struct in `node.capnp`. |
### CLI subcommand summary
| Subcommand | What it does |
|-------------------|--------------|
| `ping` | Call `health()` and print RTT. |
| `register` | Generate a fresh identity + KeyPackage, upload to AS, print identity key. |
| `register-state` | Same as `register` but uses/creates persistent state file. |
| `fetch-key` | Fetch a peer's KeyPackage by hex identity key. |
| `create-group` | Create a new MLS group and save state. |
| `invite` | Fetch peer's KeyPackage, add to group, enqueue Welcome via DS. |
| `join` | Fetch Welcome from DS, join the MLS group. |
| `send` | Encrypt a message with MLS, enqueue via DS. |
| `recv` | Fetch pending payloads from DS, decrypt with MLS. Supports `--stream` for continuous long-polling. |
| `demo-group` | End-to-end Alice+Bob round-trip (ephemeral identities). |
**Role:** CLI/TUI client binary. Connects to the server, orchestrates MLS
group operations via `GroupMember`, and persists identity and group state.
### What this crate does NOT do
- No server-side logic.
- No direct crypto beyond calling `GroupMember` and verifying SHA-256
fingerprints.
- No raw frame parsing -- delegates to `quicproquo-sdk` / `quicproquo-rpc`.
### Key dependencies
`quicproquo-core`, `quicproquo-proto`, `quinn`, `quinn-proto`,
`rustls`, `capnp`, `capnp-rpc`, `tokio`, `tokio-util`, `clap`, `sha2`,
`serde`, `bincode`, `anyhow`, `thiserror`, `tracing`.
`quicproquo-sdk`, `quicproquo-core`, `quicproquo-proto`, `tokio`, `clap`,
`rustyline`, `tracing`, `anyhow`.
---
## quicproquo-bot
## quicproquo-kt
**Role:** High-level SDK for building automated agents (bots) on the
quicproquo network. Wraps the client library into a simple polling-based API.
**Role:** Key transparency. Implements an append-only transparency log for
Ed25519 public keys with revocation checking and audit support.
### Components
| Component | Description |
|------------------|-------------|
| `BotConfig` | Builder-pattern configuration: server address, credentials, TLS, state file path. |
| `Bot` | Connected bot instance. Methods: `connect()`, `send_dm()`, `receive()`, `receive_raw()`, `resolve_user()`. |
| `Message` | Received message struct with `sender`, `text`, and `seq` fields. |
| `run_pipe_mode` | JSON-lines stdin/stdout interface for shell integration (`send`, `recv`, `resolve` actions). |
### Architecture
Each `send_dm` and `receive` call opens a fresh QUIC connection (stateless
reconnect pattern). The bot wraps the client's `cmd_send` and
`receive_pending_plaintexts` functions, handling MLS group state internally.
### What this crate does NOT do
- No server-side logic.
- No raw MLS operations -- delegates to `quicproquo-client` high-level functions.
- No persistent QUIC connections -- each operation reconnects.
### Key dependencies
`quicproquo-core`, `quicproquo-client`, `tokio`, `anyhow`, `tracing`,
`serde`, `serde_json`, `hex`.
Methods exposed: `RevokeKey` (510), `CheckRevocation` (511),
`AuditKeyTransparency` (520).
---
## Other workspace crates
## quicproquo-plugin-api
| Crate | Role |
|-------------------------|------|
| **quicproquo-gui** | Tauri 2 desktop application; provides a GUI on top of the client/core stack. |
| **quicproquo-p2p** | P2P endpoint publish/resolve; used by the server and clients for direct peer discovery. |
**Role:** `#![no_std]` C-ABI plugin interface. Defines a stable ABI for
dynamically loaded plugins with 6 hook points (on_message_send,
on_message_receive, on_group_join, on_group_leave, on_connect, on_disconnect).
These crates are optional for building and running the server and CLI client.
This crate has no workspace dependencies. It is intentionally `no_std` so
that plugins can be compiled for embedded or WASM targets.
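Under such a C ABI, a hook might look like the following sketch. Everything here except the hook name `on_message_send` (the `PluginMessage` struct, the pointer-based signature, and the return-code convention) is an assumption for illustration, not the crate's actual definition:

```rust
/// Hypothetical plugin hook sketch. The struct layout, signature, and
/// return-code convention are illustrative assumptions; only the hook name
/// `on_message_send` comes from the hook list above. A real plugin would
/// export the function with `#[no_mangle]` (omitted here).
#[repr(C)]
pub struct PluginMessage {
    pub data: *const u8,
    pub len: usize,
}

/// Return 0 to let the message through, non-zero to reject it.
pub extern "C" fn on_message_send(msg: *const PluginMessage) -> i32 {
    // Defensive null/empty checks: C-ABI callers cannot be trusted to
    // always pass valid pointers.
    if msg.is_null() {
        return 1;
    }
    let m = unsafe { &*msg };
    if m.data.is_null() || m.len == 0 {
        1
    } else {
        0
    }
}
```

Keeping the interface to raw pointers and plain integers is what makes the ABI stable across compilers and usable from embedded or WASM plugin hosts.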
---
## quicproquo-p2p
**Role:** P2P endpoint publish and resolve via iroh. Used by the server and
clients for direct peer discovery when the `mesh` feature is enabled on
`quicproquo-client`.
Methods exposed: `PublishEndpoint` (800), `ResolveEndpoint` (801).
This crate is compiled but kept out of the default dependency graph for most
build targets due to iroh's large dependency footprint (~90 extra deps).
---
## Layering Rules
1. **proto** depends on nothing in-workspace. It is pure data definition.
2. **core** depends on **proto** (for `ParsedEnvelope` and envelope helpers).
It does not depend on server or client.
3. **server** depends on **core** and **proto**. It does not depend on client.
4. **client** depends on **core** and **proto**. It does not depend on server.
5. **server** and **client** never depend on each other. They communicate
exclusively via the Cap'n Proto RPC wire protocol.
6. **quicproquo-gui** and **quicproquo-p2p** are optional; they depend
on client/core/proto as needed and do not change the core layering.
2. **core** depends on **proto** (for legacy envelope helpers).
It does not depend on server, rpc, or sdk.
3. **rpc** depends on **proto**. It does not depend on core, server, or client.
4. **sdk** depends on **rpc** and **core**. It does not depend on server.
5. **server** depends on **core**, **proto**, and **rpc**. It does not depend on client or sdk.
6. **client** depends on **sdk**, **core**, and **proto**. It does not depend on server.
7. **server** and **client** never depend on each other. They communicate
exclusively via the v2 Protobuf framing protocol over QUIC.
8. **kt**, **plugin-api**, and **p2p** are optional; they do not change the
core layering.
This layering ensures that:
@@ -268,7 +318,7 @@ This layering ensures that:
## Further Reading
- [Architecture Overview](overview.md) -- high-level system diagram
- [Service Architecture](service-architecture.md) -- NodeService RPC details
- [Wire Format Overview](../wire-format/overview.md) -- Cap'n Proto schema reference
- [Service Architecture](service-architecture.md) -- 44 RPC method details
- [Wire Format Reference](../wire-format/overview.md) -- Protobuf schema reference
- [GroupMember Lifecycle](../internals/group-member-lifecycle.md) -- MLS state machine details
- [Storage Backend](../internals/storage-backend.md) -- FileBackedStore internals
- [Storage Backend](../internals/storage-backend.md) -- SQLCipher storage internals


@@ -6,75 +6,86 @@ with an ASCII sequence diagram showing control-plane (AS) and data-plane (DS)
traffic.
Throughout these flows the server is **MLS-unaware** -- it stores and forwards
opaque byte blobs without parsing their MLS content.
opaque byte blobs without parsing their MLS content. All RPC calls use the v2
Protobuf framing protocol over QUIC (ALPN: `qpq`, port 5001).
---
## 1. Registration Flow
Before a client can join any MLS group, it must generate an Ed25519 identity
keypair and upload at least one KeyPackage to the Authentication Service. Peers
fetch these KeyPackages to add the client to groups.
Before a client can join any MLS group, it must authenticate with OPAQUE,
generate an Ed25519 identity keypair, and upload at least one KeyPackage to
the Authentication Service.
### Sequence Diagram
```text
Client (Alice) NodeService (AS)
────────────── ────────────────
1. Generate Ed25519 identity keypair
(IdentityKeypair::generate)
│ │
│ 2. Generate MLS KeyPackage
(GroupMember::generate_key_package) │
│ - Creates HPKE init keypair │
│ - Embeds Ed25519 pk in credential
- Signs leaf node with Ed25519 sk │
- TLS-encodes the KeyPackage
│ │
3. QUIC connect + TLS 1.3 handshake │
│ ────────────────────────────────────────>│
│ │
│ 4. uploadKeyPackage(identityKey, pkg)
│ ────────────────────────────────────────>│
│ 5. Validate:
│ │ - identityKey == 32 bytes
│ │ - package non-empty, <= 1 MB
│ │ - auth version allowed
│ 6. Compute SHA-256(package)
│ 7. Append to per-identity queue:
│ │ keyPackages[identityKey].push(pkg)
│ │ 8. Flush keypackages.bin to disk
fingerprint (SHA-256)
│ <────────────────────────────────────────│
│ │
9. Compare local fingerprint with │
│ server-returned fingerprint │
│ (tamper detection)
Client (Alice) Server (port 5001)
-------------- ------------------
| |
| 1. OpaqueRegisterStart (100) |
| username, registration_request |
| ---------------------------------------->|
| |
| registration_response |
| <----------------------------------------|
| |
| 2. OpaqueRegisterFinish (101) |
| username, upload, identity_key |
| ---------------------------------------->|
| | 3. Store OPAQUE record +
| success | identity key mapping
| <----------------------------------------|
| |
| 4. Generate MLS KeyPackage |
| (GroupMember::generate_key_package) |
| - Creates HPKE init keypair |
| - Embeds Ed25519 pk in credential |
| - Signs leaf node with Ed25519 sk |
| - TLS-encodes the KeyPackage |
| |
| 5. OpaqueLoginStart (102) |
| username, login_request |
| ---------------------------------------->|
| login_response |
| <----------------------------------------|
| |
| 6. OpaqueLoginFinish (103) |
| username, finalization, identity_key |
| ---------------------------------------->|
| session_token |
| <----------------------------------------|
| |
| 7. UploadKeyPackage (300) |
| identity_key, package, session_token |
| ---------------------------------------->|
| | 8. Validate + store
| fingerprint (SHA-256) | in KeyPackage queue
| <----------------------------------------|
| |
| 9. Compare local fingerprint with |
| server-returned fingerprint |
| (tamper detection) |
| |
```
### Key Points
- **KeyPackages are single-use** (RFC 9420 requirement). Each `fetchKeyPackage`
- **KeyPackages are single-use** (RFC 9420 requirement). Each `FetchKeyPackage`
call atomically removes and returns one package. The client should upload
multiple KeyPackages if it expects to be added to several groups.
- The `identityKey` used as the AS index is the **raw 32-byte Ed25519 public
- The `identity_key` used as the AS index is the **raw 32-byte Ed25519 public
key**, not a fingerprint or hash. Peers must know Alice's public key out-of-
band (QR code, directory, etc.) to fetch her KeyPackage.
band (QR code, directory lookup via `ResolveUser`, etc.) to fetch her KeyPackage.
- The HPKE init private key generated during `generate_key_package` is stored
in the client's `DiskKeyStore`. The **same `GroupMember` instance** (or a
restored instance with the same key store) must later call `join_group` to
decrypt the Welcome message.
- The optional hybrid public key (`uploadHybridKey`) can also be uploaded
during registration for post-quantum envelope encryption.
- The optional hybrid public key (`UploadHybridKey`, method 302) can also be
uploaded during registration for post-quantum envelope encryption.
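The single-use semantics from the first key point can be sketched as a pop from a per-identity FIFO queue. This is a minimal std-only sketch of the pop-on-fetch invariant; the real AS uses concurrent maps, persistent storage, and validation that are not shown here:

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Mutex;

/// Minimal sketch of the AS KeyPackage queue. FetchKeyPackage (301)
/// atomically removes and returns one package so it can never be reused.
struct KeyPackageStore {
    queues: Mutex<HashMap<[u8; 32], VecDeque<Vec<u8>>>>,
}

impl KeyPackageStore {
    fn new() -> Self {
        KeyPackageStore { queues: Mutex::new(HashMap::new()) }
    }

    /// UploadKeyPackage (300): append to the identity's queue.
    fn upload(&self, identity_key: [u8; 32], package: Vec<u8>) {
        self.queues
            .lock()
            .unwrap()
            .entry(identity_key)
            .or_default()
            .push_back(package);
    }

    /// FetchKeyPackage (301): pop exactly one package, or None if exhausted.
    fn fetch(&self, identity_key: &[u8; 32]) -> Option<Vec<u8>> {
        self.queues.lock().unwrap().get_mut(identity_key)?.pop_front()
    }
}
```

Uploading several KeyPackages, as recommended above, simply deepens the queue; each group add consumes exactly one.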
---
@@ -87,64 +98,66 @@ Bob via the DS.
### Sequence Diagram
```text
Alice NodeService (AS+DS) Bob
───── ────────────────── ───
1. create_group("my-group")
(local MLS operation --
Alice is sole member,
epoch 0)
2. fetchKeyPackage(bob_pk)
│ ───────────────────────────────>│
│ │ 3. Pop bob's KeyPackage
from queue (atomic)
bob_kp bytes │
│ <───────────────────────────────│ │
│ │
4. add_member(bob_kp)
│ Local MLS operations:
a. Deserialise & validate
Bob's KeyPackage │
b. Produce Commit message │
(adds Bob to ratchet │
tree, advances epoch) │
c. Produce Welcome message
(encrypted to Bob's
HPKE init key, contains │
group secrets + tree) │
d. merge_pending_commit()
(Alice advances to
epoch 1 locally)
5. enqueue(bob_pk, welcome)
│ ───────────────────────────────>│
│ 6. Append welcome to
│ │ deliveries[(ch, bob_pk)]
7. Notify bob_pk waiters
8. Bob connects and fetches │
│ <─────────────────────────────│
fetch(bob_pk)
│ │ │
9. Drain bob's queue
(returns [welcome])
[welcome_bytes]
│ ─────────────────────────────>│
│ │ 10. join_group(welcome)
│ - Decrypt Welcome with
HPKE init private key
- Extract ratchet tree
from GroupInfo ext
- Initialise MlsGroup
at epoch 1
Bob is now a group member
Alice Server (AS+DS, port 5001) Bob
----- ------------------------- ---
| | |
| 1. create_group("my-group") | |
| (local MLS operation -- | |
| Alice is sole member, | |
| epoch 0) | |
| | |
| 2. FetchKeyPackage (301) | |
| bob_identity_key | |
| --------------------------------> |
| | 3. Pop bob's KeyPackage |
| | from queue (atomic) |
| bob_kp bytes | |
| <-------------------------------- |
| | |
| 4. add_member(bob_kp) | |
| Local MLS operations: | |
| a. Deserialise & validate | |
| Bob's KeyPackage | |
| b. Produce Commit message | |
| (adds Bob to ratchet | |
| tree, advances epoch) | |
| c. Produce Welcome message | |
| (encrypted to Bob's | |
| HPKE init key, contains | |
| group secrets + tree) | |
| d. merge_pending_commit() | |
| (Alice advances to | |
| epoch 1 locally) | |
| | |
| 5. Enqueue (200) | |
| recipient=bob_pk, payload=welcome |
| --------------------------------> |
| | 6. Append welcome to |
| | deliveries[bob_pk] |
| | |
| | 7. Notify bob_pk waiters |
| | (FetchWait wakes up) |
| | |
| | 8. Bob connects and polls |
| | <------------------------------
| | FetchWait (202) |
| | |
| | 9. Drain bob's queue |
| | (returns [welcome]) |
| | |
| | [welcome_bytes] |
| | ------------------------------>
| | |
| | | 10. join_group(welcome)
| | | - Decrypt Welcome with
| | | HPKE init private key
| | | - Extract ratchet tree
| | | from GroupInfo ext
| | | - Initialise MlsGroup
| | | at epoch 1
| | |
| | | Bob is now a group member
| | |
```
### Key Points
@@ -162,7 +175,7 @@ Bob via the DS.
tree in the Welcome's `GroupInfo` extension. This means Bob does not need a
separate tree fetch -- `new_from_welcome` extracts it automatically.
- The DS routes solely by `recipientKey` (Bob's Ed25519 public key). It does
- The DS routes solely by `recipient_key` (Bob's Ed25519 public key). It does
not parse the Welcome, the Commit, or any MLS structure.
---
@@ -175,63 +188,66 @@ messages through the DS.
### Sequence Diagram
```text
Alice NodeService (DS) Bob
───── ────────────────── ───
─── Alice sends a message to Bob ───
1. send_message("hello bob")
MLS create_message():
- Derive message key from
epoch secret + gen counter
- Encrypt plaintext with
AES-128-GCM
- Produce MlsMessageOut
(PrivateMessage variant)
- TLS-encode to bytes
2. enqueue(bob_pk, ciphertext) │
│ ───────────────────────────────>│
│ │ 3. Store in bob's queue
4. Notify bob_pk waiters
│ (time passes) │
5. Bob polls for messages
│ <─────────────────────────────│
fetchWait(bob_pk, 30000)
│ │ │
6. Drain bob's queue
[ciphertext]
│ ─────────────────────────────>│
│ │ 7. receive_message(ct)
│ MLS process_message():
- Identify sender from
PrivateMessage header
- Derive decryption key
from epoch secret
- Decrypt AES-128-GCM
- Return plaintext:
"hello bob"
│ ─── Bob replies to Alice ─── │
│ │ 8. send_message("hello alice")
│ (same MLS encrypt flow)
9. enqueue(alice_pk, ct) │
│ <─────────────────────────────│
│ 10. Store + notify
│ 11. fetch(alice_pk) │ │
│ ───────────────────────────────>│ │
[ciphertext] │
│ <───────────────────────────────│
│ │
│ 12. receive_message(ct)
│ -> "hello alice" │
Alice Server (DS, port 5001) Bob
----- ---------------------- ---
| | |
| -- Alice sends a message to Bob -- |
| | |
| 1. send_message("hello bob") | |
| MLS create_message(): | |
| - Derive message key from | |
| epoch secret + gen counter| |
| - Encrypt plaintext with | |
| AES-128-GCM | |
| - Produce MlsMessageOut | |
| (PrivateMessage variant) | |
| - TLS-encode to bytes | |
| | |
| 2. Enqueue (200) | |
| recipient=bob_pk, payload | |
| --------------------------------> |
| | 3. Store in bob's queue |
| | 4. Notify bob_pk waiters |
| | (or push PushNewMessage) |
| | |
| | (time passes) |
| | |
| | 5. Bob polls for messages |
| | <------------------------------
| | FetchWait (202) |
| | |
| | 6. Drain bob's queue |
| | [ciphertext] |
| | ------------------------------>
| | |
| | | 7. receive_message(ct)
| | | MLS process_message():
| | | - Identify sender from
| | | PrivateMessage header
| | | - Derive decryption key
| | | from epoch secret
| | | - Decrypt AES-128-GCM
| | | - Return plaintext:
| | | "hello bob"
| | |
| -- Bob replies to Alice -- |
| | |
| | | 8. send_message("hello alice")
| | | (same MLS encrypt flow)
| | |
| | 9. Enqueue (200) |
| | recipient=alice_pk |
| | <------------------------------
| | 10. Store + notify |
| | |
| 11. Fetch (201) | |
| --------------------------------> |
| [ciphertext] | |
| <-------------------------------- |
| | |
| 12. receive_message(ct) | |
| -> "hello alice" | |
| | |
```
### Key Points
@@ -243,44 +259,48 @@ messages through the DS.
- **The DS is a dumb relay**: it does not decrypt, inspect, or reorder
messages. It stores opaque byte blobs in a FIFO queue keyed by recipient.
- **Long-polling** via `fetchWait` avoids the need for persistent connections
or WebSocket-style push. The client specifies a timeout in milliseconds; the
server blocks up to that duration using `tokio::sync::Notify`. The `recv
--stream` CLI flag loops `fetchWait` indefinitely for continuous message
reception.
- **Long-polling** via `FetchWait` (202) avoids the need for persistent
connections or WebSocket-style push. The client specifies a timeout in
milliseconds; the server blocks up to that duration using
`tokio::sync::Notify`. Push events (method 1000 `PushNewMessage`) deliver
real-time notifications on a separate QUIC uni-stream.
- **Channel-aware routing** is supported: the `channelId` field in `enqueue`
and `fetch` allows scoping queues by channel (e.g., a 16-byte UUID for
1:1 conversations). When `channelId` is empty, messages go to the default
(legacy) queue.
- **Channel-aware routing** is supported: the `channel_id` field in `Enqueue`
and `Fetch` allows scoping queues by channel (e.g., a UUID for a 1:1
conversation or group). When `channel_id` is empty, messages go to the
default queue.
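The long-polling drain described above can be sketched with std primitives. The real server is async and uses `tokio::sync::Notify`; `Condvar::wait_timeout_while` shows the same enqueue/wake/drain shape synchronously:

```rust
use std::collections::VecDeque;
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Sketch of the Enqueue/FetchWait pattern for one recipient queue.
struct RecipientQueue {
    items: Mutex<VecDeque<Vec<u8>>>,
    wake: Condvar,
}

impl RecipientQueue {
    fn new() -> Self {
        RecipientQueue { items: Mutex::new(VecDeque::new()), wake: Condvar::new() }
    }

    /// Enqueue (200): store the opaque payload and wake long-poll waiters.
    fn enqueue(&self, payload: Vec<u8>) {
        self.items.lock().unwrap().push_back(payload);
        self.wake.notify_all();
    }

    /// FetchWait (202): drain the queue, blocking up to `timeout_ms` while
    /// it is empty. An empty Vec means the long poll timed out.
    fn fetch_wait(&self, timeout_ms: u64) -> Vec<Vec<u8>> {
        let guard = self.items.lock().unwrap();
        let (mut guard, _timed_out) = self
            .wake
            .wait_timeout_while(guard, Duration::from_millis(timeout_ms), |q| {
                q.is_empty()
            })
            .unwrap();
        guard.drain(..).collect()
    }
}
```

A streaming client simply loops `fetch_wait` with a fresh timeout, which is how continuous reception works without a persistent push channel.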
---
## Control-Plane vs. Data-Plane Summary
```text
┌─────────────────────────────────────────────────────────────────────┐
Control Plane (AS)
uploadKeyPackage ────> Store KeyPackage for identity
fetchKeyPackage <──── Pop and return one KeyPackage
uploadHybridKey ────> Store hybrid PQ public key
fetchHybridKey <──── Return hybrid PQ public key
Traffic: Infrequent. Once per group join (upload before, │
fetch during group add). │
└─────────────────────────────────────────────────────────────────────┘
+---------------------------------------------------------------------+
| Control Plane (AS) |
| |
| UploadKeyPackage (300) ----> Store KeyPackage for identity |
| FetchKeyPackage (301) <---- Pop and return one KeyPackage |
| UploadHybridKey (302) ----> Store hybrid PQ public key |
| FetchHybridKey (303) <---- Return hybrid PQ public key |
| FetchHybridKeys (304) <---- Return hybrid keys for N identities|
| |
| Traffic: Infrequent. Once per group join (upload before, |
| fetch during group add). |
+---------------------------------------------------------------------+
┌─────────────────────────────────────────────────────────────────────┐
Data Plane (DS)
enqueue ────> Append payload to recipient queue
fetch <──── Drain and return all queued payloads
fetchWait <──── Long-poll drain with timeout
Traffic: High-frequency. Every MLS message (Welcome, Commit, │
Application) flows through the DS. │
└─────────────────────────────────────────────────────────────────────┘
+---------------------------------------------------------------------+
| Data Plane (DS) |
| |
| Enqueue (200) ----> Append payload to recipient queue |
| Fetch (201) <---- Drain and return all queued payloads|
| FetchWait (202) <---- Long-poll drain with timeout |
| Peek (203) <---- Inspect without removing |
| Ack (204) ----> Acknowledge and remove by seq num |
| BatchEnqueue (205) ----> Enqueue multiple payloads at once |
| |
| Traffic: High-frequency. Every MLS message (Welcome, Commit, |
| Application) flows through the DS. |
+---------------------------------------------------------------------+
```
The separation means the AS can be rate-limited or placed behind stricter
@@ -294,47 +314,49 @@ The following diagram summarises the client-side state machine across all three
flows:
```text
┌──────────────┐
No State
└──────┬───────┘
+--------------+
| No State |
+------+-------+
|
OPAQUE register + login
|
v
+--------------+
| Authenticated | session_token obtained
| | No identity yet
+------+--------+
|
IdentityKeypair::generate()
┌──────────────┐
│ Identity │ Ed25519 keypair exists
Generated No KeyPackage, no group
└──────┬───────┘
generate_key_package() + uploadKeyPackage()
┌──────────────┐
│ Registered │ KeyPackage on AS
│ │ HPKE init key in DiskKeyStore
└──────┬───────┘
┌──────────────┴──────────────┐
│ │
+ UploadKeyPackage (300)
|
v
+--------------+
| Registered | KeyPackage on AS
| | HPKE init key in DiskKeyStore
+------+-------+
|
+--------------+--------------+
| |
create_group() join_group(welcome)
┌─────────────┐ ┌──────────────┐
Group Owner Group Member
(epoch 0) (epoch N)
└──────┬──────┘ └──────┬───────┘
add_member()
┌──────────────────────────────────────────┐
Active Group Member
send_message() -> enqueue via DS
receive_message() <- fetch from DS │
Epoch advances on each Commit │
└──────────────────────────────────────────┘
| |
v v
+-------------+ +--------------+
| Group Owner | | Group Member |
| (epoch 0) | | (epoch N) |
+------+------+ +------+-------+
| |
add_member() |
| |
v v
+------------------------------------------+
| Active Group Member |
| |
| send_message() -> Enqueue (200) |
| receive_message() <- Fetch/FetchWait |
| or PushNewMessage |
| |
| Epoch advances on each Commit |
+------------------------------------------+
```
---
@@ -342,7 +364,7 @@ flows:
## Further Reading
- [Architecture Overview](overview.md) -- system diagram and two-service model
- [Service Architecture](service-architecture.md) -- RPC method details and long-polling internals
- [Service Architecture](service-architecture.md) -- RPC method details and push events
- [GroupMember Lifecycle](../internals/group-member-lifecycle.md) -- detailed MLS state machine
- [KeyPackage Exchange Flow](../internals/keypackage-exchange.md) -- single-use semantics and AS internals
- [MLS (RFC 9420)](../protocol-layers/mls.md) -- key schedule, ratchet tree, and ciphersuite details


@@ -8,21 +8,20 @@ system, the dual-key cryptographic model, and how the pieces fit together.
## Two-Service Model
The server exposes two logical services through a single **NodeService** RPC
interface, bound to **port 7000** over QUIC + TLS 1.3:
The server exposes two logical services through a unified RPC endpoint
bound to **port 5001** over QUIC + TLS 1.3:
| Logical Service | Responsibility |
|--------------------------|-----------------------------------------------------------------|
| **Authentication Service (AS)** | Stores and distributes single-use MLS KeyPackages. Clients upload KeyPackages after identity generation; peers fetch them to add new members to a group. |
| **Authentication Service (AS)** | Stores and distributes single-use MLS KeyPackages. Clients upload KeyPackages after identity generation; peers fetch them to add new members to a group. Also manages hybrid PQ public keys and identity resolution. |
| **Delivery Service (DS)** | Store-and-forward relay for opaque payloads. The DS never inspects MLS ciphertext -- it routes solely by recipient Ed25519 public key (and optional channel ID). |
Combining both services into a single endpoint simplifies deployment and
reduces round-trips. The schema is defined in
[`schemas/node.capnp`](../wire-format/node-service-schema.md) as a unified
`NodeService` interface.
Both services are accessed through a single QUIC connection using the v2
Protobuf framing protocol. Each RPC call gets a dedicated QUIC bidirectional
stream to prevent head-of-line blocking.
See [Service Architecture](service-architecture.md) for per-method details,
connection lifecycle, and the long-polling `fetchWait` mechanism.
connection lifecycle, and push event delivery.
---
@@ -33,17 +32,17 @@ as its long-term identity:
```text
quicproquo Key Model
┌──────────────────────────────────────────────────┐
Ed25519 signing keypair (MLS identity)
│ ──────────────────────────────────────
- Generated once per user/device
- Embedded in MLS BasicCredential
- Signs KeyPackages, Commits, and group ops
- Raw 32-byte public key is the AS index
- Managed by IdentityKeypair, zeroize-on-drop
└──────────────────────────────────────────────────┘
+--------------------------------------------------+
| |
| Ed25519 signing keypair (MLS identity) |
| ------------------------------------------ |
| - Generated once per user/device |
| - Embedded in MLS BasicCredential |
| - Signs KeyPackages, Commits, and group ops |
| - Raw 32-byte public key is the AS index |
| - Managed by IdentityKeypair, zeroize-on-drop |
| |
+--------------------------------------------------+
```
| Property | Ed25519 (MLS) |
@@ -52,7 +51,7 @@ as its long-term identity:
| Purpose | Identity binding, signing, MLS credentials |
| Crate | `ed25519-dalek` |
| Zeroize on drop | Yes (`Zeroizing<[u8; 32]>`) |
| PQ protection | MLS key schedule uses DHKEM(X25519); hybrid PQ KEM available at envelope level |
| PQ protection | MLS key schedule uses DHKEM(X25519); hybrid X25519+ML-KEM-768 KEM available at envelope level |
For details on the cryptographic properties, see
[Ed25519 Identity Keys](../cryptography/identity-keys.md).
@@ -62,37 +61,41 @@ For details on the cryptographic properties, see
## System Diagram
```text
┌─────────────────┐ ┌─────────────────┐
Alice Client Bob Client
IdentityKeypair IdentityKeypair
(Ed25519) (Ed25519)
GroupMember │ GroupMember
(MLS state) │ (MLS state) │
└────────┬─────────┘ └────────┬─────────┘
QUIC + TLS 1.3 (quinn/rustls)
┌────────────────────────────────────────────────────────────────────────────┐
NodeService (port 7000)
┌──────────────────────────┐ ┌───────────────────────────────────┐
Authentication Service Delivery Service
uploadKeyPackage() │ │ enqueue(recipientKey, payload) │
│ fetchKeyPackage() fetch(recipientKey)
│ uploadHybridKey() │ fetchWait(recipientKey, timeout) │
│ fetchHybridKey() │ │ │ │
│ Queues: DashMap + FileBackedStore│ │
│ Store: DashMap + │ │
│ │ FileBackedStore
└──────────────────────────┘ └───────────────────────────────────┘ │
│ health()
└────────────────────────────────────────────────────────────────────────────┘
+-----------------+ +-----------------+
| Alice Client | | Bob Client |
| | | |
| IdentityKeypair | | IdentityKeypair |
| (Ed25519) | | (Ed25519) |
| | | |
| QpqClient | | QpqClient |
| (SDK) | | (SDK) |
+--------+---------+ +--------+---------+
| |
| QUIC + TLS 1.3 (quinn/rustls) ALPN: "qpq" |
| |
v v
+------------------------------------------------------------------------+
| quicproquo-server (port 5001) |
| |
| +---------------------------+ +--------------------------------+ |
| | Authentication Service | | Delivery Service | |
| | | | | |
| | OpaqueRegisterStart(100) | | Enqueue(200) | |
| | OpaqueRegisterFinish(101)| | Fetch(201) | |
| | OpaqueLoginStart(102) | | FetchWait(202) | |
| | OpaqueLoginFinish(103) | | Peek(203) | |
| | | | Ack(204) | |
| | UploadKeyPackage(300) | | BatchEnqueue(205) | |
| | FetchKeyPackage(301) | | | |
| | UploadHybridKey(302) | | Store: SQLCipher | |
| | FetchHybridKey(303) | | DashMap waiters | |
| | FetchHybridKeys(304) | +--------------------------------+ |
| +---------------------------+ |
| |
| + 34 more methods: Keys, Channel, Group, User, KT, Blob, Device, |
| P2P, Federation, Moderation, Recovery, Account (see method_ids) |
| |
+------------------------------------------------------------------------+
```
**Key observations:**
@@ -103,7 +106,11 @@ For details on the cryptographic properties, see
2. KeyPackages are single-use (RFC 9420 requirement). The AS atomically removes
a KeyPackage on fetch to enforce this invariant.
3. QUIC + TLS 1.3 is the sole transport layer.
3. QUIC + TLS 1.3 is the sole transport layer. The ALPN identifier is `qpq`.
4. Push events (new messages, typing, presence, membership changes) are
delivered server-to-client on QUIC uni-streams using a separate push frame
format.
---
@@ -114,14 +121,14 @@ The system stacks three protocol layers:
1. **Transport** -- QUIC + TLS 1.3. Provides confidentiality, integrity, and
server authentication. See [Protocol Stack](protocol-stack.md).
2. **Framing / RPC** -- Cap'n Proto serialisation and RPC. Provides zero-copy
typed messages, schema versioning, and async method dispatch.
See [Cap'n Proto Serialisation and RPC](../protocol-layers/capn-proto.md).
2. **Framing / RPC** -- Custom binary header + Protobuf serialisation. Each
request frame is `[method_id: u16][request_id: u32][payload_len: u32][protobuf]`.
Responses are `[status: u8][request_id: u32][payload_len: u32][protobuf]`.
See [Protobuf Framing](../protocol-layers/capn-proto.md).
3. **End-to-End Encryption** -- MLS (RFC 9420). Provides group key agreement,
forward secrecy, and post-compromise security. The server never holds group
keys.
See [MLS (RFC 9420)](../protocol-layers/mls.md).
keys. See [MLS (RFC 9420)](../protocol-layers/mls.md).
An optional fourth layer -- the **hybrid KEM envelope** (X25519 + ML-KEM-768)
-- wraps MLS payloads for post-quantum confidentiality at the per-message level.
@@ -131,14 +138,19 @@ See [Hybrid KEM](../protocol-layers/hybrid-kem.md).
## Crate Map
The implementation is split across four workspace crates:
The implementation is split across nine workspace crates:
| Crate | Role |
|----------------------------|-------------------------------------------------------------------|
| `quicproquo-core` | Crypto primitives, MLS state machine, hybrid KEM |
| `quicproquo-proto` | Cap'n Proto schemas, codegen, and serialisation helpers |
| `quicproquo-server` | QUIC listener, NodeService RPC, storage |
| `quicproquo-client` | QUIC client, CLI subcommands, state persistence |
| Crate | Role |
|------------------------------|-------------------------------------------------------------------|
| `quicproquo-core` | Crypto primitives, MLS state machine, hybrid KEM |
| `quicproquo-proto` | Cap'n Proto legacy types + Protobuf (prost) v2 generated types |
| `quicproquo-kt` | Key transparency (append-only log, revocation) |
| `quicproquo-plugin-api` | `#![no_std]` C-ABI plugin interface |
| `quicproquo-rpc` | QUIC RPC framework: framing, server dispatch, client, middleware |
| `quicproquo-sdk` | Client SDK: `QpqClient`, event broadcast, `ConversationStore` |
| `quicproquo-server` | RPC server + domain services |
| `quicproquo-client` | CLI/TUI client binary |
| `quicproquo-p2p` | iroh P2P endpoint publish/resolve (feature-flagged) |
See [Crate Responsibilities](crate-responsibilities.md) for a full breakdown
and dependency diagram.
@@ -148,7 +160,7 @@ and dependency diagram.
## Further Reading
- [Protocol Stack](protocol-stack.md) -- layered protocol stack description
- [Service Architecture](service-architecture.md) -- NodeService RPC methods, connection lifecycle, long-polling
- [Service Architecture](service-architecture.md) -- 44 RPC methods, connection lifecycle, push events
- [End-to-End Data Flow](data-flow.md) -- registration, group creation, and message exchange sequence diagrams
- [Wire Format Overview](../wire-format/overview.md) -- Cap'n Proto schema reference
- [Wire Format Reference](../wire-format/overview.md) -- Protobuf schema reference and method ID table
- [Cryptography Overview](../cryptography/overview.md) -- detailed cryptographic properties and threat model


@@ -10,16 +10,17 @@ comparison table.
## Transport: QUIC + TLS 1.3
The transport layer is QUIC over UDP with TLS 1.3 negotiated by `quinn` and
`rustls`. Cap'n Proto RPC rides on a bidirectional QUIC stream.
`rustls`. The v2 Protobuf framing protocol rides on individual QUIC streams,
one per RPC call.
```text
+---------------------------------------------+
| Application / MLS ciphertext | <- group key ratchet (RFC 9420)
+---------------------------------------------+
| Protobuf framing (custom binary header) | <- typed, length-prefixed framing
+---------------------------------------------+
| QUIC + TLS 1.3 (quinn / rustls) | <- mutual auth + transport secrecy
+---------------------------------------------+
```
### What each layer provides
@@ -31,18 +32,20 @@ The transport layer is QUIC over UDP with TLS 1.3 negotiated by `quinn` and
- TLS 1.3 provides perfect forward secrecy per connection via ephemeral ECDHE.
- The server presents a self-signed certificate by default; the client pins
the server certificate via `--ca-cert`.
- ALPN protocol identifier: `qpq`.
- Multiplexed streams over a single UDP socket -- one bidirectional stream
per RPC call, preventing head-of-line blocking.
- Uni-directional streams for server-to-client push events.
**Protobuf framing** (`quicproquo-rpc`, `quicproquo-proto`)
- Three frame types: Request, Response, Push.
- Fixed-length binary headers carry method/status codes, request correlation
  IDs, and payload length; payloads are handed to handlers zero-copy as `bytes::Bytes`.
- 44 RPC method IDs across 14 service categories.
- 4 push event types (NewMessage, Typing, Presence, Membership).
- All multi-byte integers in big-endian (network byte order).
- Maximum payload size: 4 MiB per frame.
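As a sketch, the request-frame header layout described above can be produced and parsed with plain big-endian byte handling (the function names here are illustrative, not the actual `quicproquo-rpc` API):

```rust
// Hypothetical sketch of the v2 request-frame layout: a big-endian u16
// method ID, u32 request correlation ID, and u32 payload length (10-byte
// header), followed by the Protobuf-encoded body.
fn encode_request_frame(method_id: u16, req_id: u32, payload: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(10 + payload.len());
    frame.extend_from_slice(&method_id.to_be_bytes()); // 2 bytes, network order
    frame.extend_from_slice(&req_id.to_be_bytes()); // 4 bytes
    frame.extend_from_slice(&(payload.len() as u32).to_be_bytes()); // 4 bytes
    frame.extend_from_slice(payload);
    frame
}

// Returns (method_id, req_id, payload_len), or None if the header is short.
fn decode_request_header(buf: &[u8]) -> Option<(u16, u32, u32)> {
    if buf.len() < 10 {
        return None;
    }
    let method_id = u16::from_be_bytes([buf[0], buf[1]]);
    let req_id = u32::from_be_bytes([buf[2], buf[3], buf[4], buf[5]]);
    let len = u32::from_be_bytes([buf[6], buf[7], buf[8], buf[9]]);
    Some((method_id, req_id, len))
}
```

Response headers follow the same pattern with a `u8` status in place of the `u16` method ID, and push headers drop the correlation ID entirely.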
**MLS (RFC 9420)** (`openmls`, `openmls_rust_crypto`)
@@ -63,7 +66,7 @@ The transport layer is QUIC over UDP with TLS 1.3 negotiated by `quinn` and
| Layer | Provides | Crate(s) |
|-------------|------------------------------------------------------------------|-----------------------------------------|
| **Transport: QUIC + TLS 1.3** | Confidentiality, server authentication, forward secrecy, multiplexed streams, congestion control | `quinn`, `rustls` |
| **Framing: Protobuf** | Typed serialisation, length-prefixed framing, method dispatch, push events | `quicproquo-rpc`, `prost` |
| **Encryption: MLS** | Group key agreement, forward secrecy, post-compromise security, identity binding | `openmls`, `openmls_rust_crypto` |
| **Encryption: Hybrid KEM** (optional) | Post-quantum confidentiality for individual payloads (X25519 + ML-KEM-768) | `ml-kem`, `x25519-dalek`, `chacha20poly1305`, `hkdf` |
@@ -75,35 +78,42 @@ A plaintext message traverses the stack as follows:
```text
Sender Recipient
------ ---------
plaintext bytes
|
v
MLS create_message()
| -- encrypts with group AEAD key (AES-128-GCM) --
v
TLS-encoded MlsMessageOut (opaque ciphertext blob)
|
v
Protobuf encode: EnqueueRequest { recipient_key, payload, ... }
|
v
RequestFrame: [200: u16][req_id: u32][len: u32][protobuf bytes]
|
v
QUIC bidirectional stream (TLS 1.3 encrypted)
|
v
.............. network ..............
|
v
Server: handler reads RequestFrame, stores payload in queue
|
v
ResponseFrame: [0: u8 (Ok)][req_id: u32][len: u32][EnqueueResponse bytes]
(or PushFrame on uni-stream when push event fires)
|
v
Client: Fetch(201) or receives PushFrame (event_type=1000)
|
v
MLS process_message()
| -- decrypts with group AEAD key --
v
plaintext bytes
```
@@ -116,6 +126,6 @@ The server **never** holds the MLS group key. It sees only the encrypted
- [Architecture Overview](overview.md) -- high-level system diagram and identity key model
- [QUIC + TLS 1.3](../protocol-layers/quic-tls.md) -- QUIC configuration, ALPN, and certificate handling
- [Protobuf Framing](../protocol-layers/capn-proto.md) -- frame format, method IDs, status codes
- [MLS (RFC 9420)](../protocol-layers/mls.md) -- ciphersuite selection, key schedule, and ratchet tree
- [Hybrid KEM: X25519 + ML-KEM-768](../protocol-layers/hybrid-kem.md) -- post-quantum envelope encryption

View File

@@ -1,61 +1,233 @@
# Service Architecture
The quicproquo server exposes 44 RPC methods through a single QUIC + TLS 1.3
endpoint on **port 5001**. Methods are dispatched by numeric method ID using
the v2 Protobuf framing protocol. This page documents the method reference,
connection lifecycle, storage model, and authentication flow.
---
## RPC Endpoint
A single QUIC + TLS 1.3 listener on **port 5001** serves all operations.
The ALPN identifier is `qpq`. Each RPC call uses a dedicated QUIC
bidirectional stream; calls are concurrent and do not block each other.
```text
quicproquo-server (port 5001, ALPN: "qpq")
|
+-- Auth (100-103)
| +-- 100: OpaqueRegisterStart
| +-- 101: OpaqueRegisterFinish
| +-- 102: OpaqueLoginStart
| +-- 103: OpaqueLoginFinish
|
+-- Delivery (200-205)
| +-- 200: Enqueue
| +-- 201: Fetch
| +-- 202: FetchWait
| +-- 203: Peek
| +-- 204: Ack
| +-- 205: BatchEnqueue
|
+-- Keys (300-304)
| +-- 300: UploadKeyPackage
| +-- 301: FetchKeyPackage
| +-- 302: UploadHybridKey
| +-- 303: FetchHybridKey
| +-- 304: FetchHybridKeys
|
+-- Channel (400)
| +-- 400: CreateChannel
|
+-- Group Management (410-413)
| +-- 410: RemoveMember
| +-- 411: UpdateGroupMetadata
| +-- 412: ListGroupMembers
| +-- 413: RotateKeys
|
+-- Moderation (420-424)
| +-- 420: ReportMessage
| +-- 421: BanUser
| +-- 422: UnbanUser
| +-- 423: ListReports
| +-- 424: ListBanned
|
+-- User (500-501)
| +-- 500: ResolveUser
| +-- 501: ResolveIdentity
|
+-- Key Transparency (510-520)
| +-- 510: RevokeKey
| +-- 511: CheckRevocation
| +-- 520: AuditKeyTransparency
|
+-- Blob (600-601)
| +-- 600: UploadBlob
| +-- 601: DownloadBlob
|
+-- Device (700-710)
| +-- 700: RegisterDevice
| +-- 701: ListDevices
| +-- 702: RevokeDevice
| +-- 710: RegisterPushToken
|
+-- Recovery (750-752)
| +-- 750: StoreRecoveryBundle
| +-- 751: FetchRecoveryBundle
| +-- 752: DeleteRecoveryBundle
|
+-- P2P (800-802)
| +-- 800: PublishEndpoint
| +-- 801: ResolveEndpoint
| +-- 802: Health
|
+-- Federation (900-905)
| +-- 900: RelayEnqueue
| +-- 901: RelayBatchEnqueue
| +-- 902: ProxyFetchKeyPackage
| +-- 903: ProxyFetchHybridKey
| +-- 904: ProxyResolveUser
| +-- 905: FederationHealth
|
+-- Account (950)
+-- 950: DeleteAccount
Push event types (server -> client, uni-stream):
1000: PushNewMessage
1001: PushTyping
1002: PushPresence
1003: PushMembership
```
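The ID ranges above map directly onto a dispatch table. A hypothetical sketch, with category names taken from this page:

```rust
// Maps a v2 method ID to its service category, following the ID ranges
// documented above (illustrative; the server's dispatch table is richer).
fn service_category(method_id: u16) -> &'static str {
    match method_id {
        100..=103 => "Auth",
        200..=205 => "Delivery",
        300..=304 => "Keys",
        400 => "Channel",
        410..=413 => "Group Management",
        420..=424 => "Moderation",
        500..=501 => "User",
        510 | 511 | 520 => "Key Transparency",
        600..=601 => "Blob",
        700..=702 | 710 => "Device",
        750..=752 => "Recovery",
        800..=802 => "P2P",
        900..=905 => "Federation",
        950 => "Account",
        _ => "Unknown", // server responds with status 11 (UnknownMethod)
    }
}
```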
---
## RPC Method Reference
### Auth (100-103)
OPAQUE password authentication (asymmetric PAKE). The password is never sent
to the server. Method IDs 100-103 implement the 4-step OPAQUE handshake.
| ID | Method | Description |
|-----|-------------------------|-------------|
| 100 | `OpaqueRegisterStart` | Client initiates registration with `username` and OPAQUE `registration_request` blob. Server returns `registration_response`. |
| 101 | `OpaqueRegisterFinish` | Client completes registration with `username`, OPAQUE `upload` blob, and Ed25519 `identity_key`. Server stores the OPAQUE record. |
| 102 | `OpaqueLoginStart` | Client initiates login with `username` and OPAQUE `login_request` blob. Server returns `login_response`. |
| 103 | `OpaqueLoginFinish` | Client completes login with `username`, OPAQUE `finalization` blob, and `identity_key`. Server returns a `session_token`. |
The `session_token` is an opaque bearer token used for subsequent authenticated
RPCs. It is passed in the Protobuf request body (not as a frame-level header).
### Delivery (200-205)
Store-and-forward relay. The server never inspects MLS ciphertext -- it routes
opaque byte blobs by recipient key.
| ID | Method | Description |
|-----|----------------|-------------|
| 200 | `Enqueue` | Append an opaque payload to the recipient's FIFO queue. Wakes `FetchWait` waiters. |
| 201 | `Fetch` | Drain and return all queued payloads in FIFO order. |
| 202 | `FetchWait` | Same as `Fetch`, but long-polls if the queue is empty (up to `timeout_ms`). |
| 203 | `Peek` | Return queued payloads without removing them. |
| 204 | `Ack` | Acknowledge and remove specific payloads by sequence number. |
| 205 | `BatchEnqueue` | Enqueue multiple payloads in a single RPC call. |
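The FIFO semantics of `Enqueue`, `Fetch`, and `Peek` can be modelled with a toy in-memory store (illustrative only; it ignores persistence, `Ack` sequence numbers, and `FetchWait` long-polling):

```rust
use std::collections::{HashMap, VecDeque};

// Toy model of the delivery-queue semantics: Enqueue appends to a
// per-recipient FIFO queue, Fetch drains it, Peek reads without removing.
#[derive(Default)]
struct DeliveryQueues {
    queues: HashMap<Vec<u8>, VecDeque<Vec<u8>>>,
}

impl DeliveryQueues {
    fn enqueue(&mut self, recipient: &[u8], payload: Vec<u8>) {
        self.queues
            .entry(recipient.to_vec())
            .or_default()
            .push_back(payload);
    }

    // Drain and return all queued payloads in FIFO order.
    fn fetch(&mut self, recipient: &[u8]) -> Vec<Vec<u8>> {
        self.queues.remove(recipient).map(Vec::from).unwrap_or_default()
    }

    // Return queued payloads without removing them.
    fn peek(&self, recipient: &[u8]) -> Vec<Vec<u8>> {
        self.queues
            .get(recipient)
            .map(|q| q.iter().cloned().collect())
            .unwrap_or_default()
    }
}
```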
### Keys (300-304)
MLS KeyPackage distribution and hybrid PQ public key management.
| ID | Method | Description |
|-----|--------------------|-------------|
| 300 | `UploadKeyPackage` | Append a TLS-encoded MLS KeyPackage to the identity's queue. Single-use: each fetch atomically removes one. |
| 301 | `FetchKeyPackage` | Atomically pop and return the oldest KeyPackage for an identity. Returns empty if none. |
| 302 | `UploadHybridKey` | Store (or replace) the X25519+ML-KEM-768 hybrid public key for an identity. |
| 303 | `FetchHybridKey` | Return the stored hybrid public key for a single identity. |
| 304 | `FetchHybridKeys` | Return hybrid public keys for multiple identities in one call. |
### Group Management (400, 410-413)
| ID | Method | Description |
|-----|-----------------------|-------------|
| 400 | `CreateChannel` | Register a new channel (group) on the server. |
| 410 | `RemoveMember` | Remove a member from a group (server-side record). |
| 411 | `UpdateGroupMetadata` | Update group name, description, or settings. |
| 412 | `ListGroupMembers` | List all members of a group. |
| 413 | `RotateKeys` | Trigger a server-assisted key rotation event. |
### User / Identity (500-501)
| ID | Method | Description |
|-----|-------------------|-------------|
| 500 | `ResolveUser` | Resolve a username to an Ed25519 public key. |
| 501 | `ResolveIdentity` | Resolve an identity key to user profile information. |
### Key Transparency (510-520)
| ID | Method | Description |
|-----|--------------------------|-------------|
| 510 | `RevokeKey` | Append a key revocation record to the transparency log. |
| 511 | `CheckRevocation` | Check whether a given key has been revoked. |
| 520 | `AuditKeyTransparency` | Fetch a transparency log audit proof for a key. |
### Blob Storage (600-601)
| ID | Method | Description |
|-----|----------------|-------------|
| 600 | `UploadBlob` | Store a binary blob (file attachment, avatar, etc.). Returns a content-addressed blob ID. |
| 601 | `DownloadBlob` | Retrieve a blob by ID. |
### Device Management (700-710)
| ID | Method | Description |
|-----|-----------------------|-------------|
| 700 | `RegisterDevice` | Register a new device for a user account. |
| 701 | `ListDevices` | List all registered devices for the authenticated user. |
| 702 | `RevokeDevice` | Revoke a device, invalidating its session. |
| 710 | `RegisterPushToken` | Register a push notification token (APNs / FCM) for a device. |
### Recovery (750-752)
| ID | Method | Description |
|-----|-------------------------|-------------|
| 750 | `StoreRecoveryBundle` | Encrypt and store an account recovery bundle server-side. |
| 751 | `FetchRecoveryBundle` | Retrieve the recovery bundle (requires OPAQUE re-authentication). |
| 752 | `DeleteRecoveryBundle` | Delete the stored recovery bundle. |
### P2P and Health (800-802)
| ID | Method | Description |
|-----|--------------------|-------------|
| 800 | `PublishEndpoint` | Publish a direct P2P endpoint (iroh node address). |
| 801 | `ResolveEndpoint` | Resolve a peer's P2P endpoint by identity key. |
| 802 | `Health` | Liveness/readiness probe. Returns server uptime and status. |
### Federation (900-905)
| ID | Method | Description |
|-----|-------------------------|-------------|
| 900 | `RelayEnqueue` | Relay a single message to a user on another server. |
| 901 | `RelayBatchEnqueue` | Relay multiple messages in one request. |
| 902 | `ProxyFetchKeyPackage` | Fetch a KeyPackage from a remote server on behalf of a local client. |
| 903 | `ProxyFetchHybridKey` | Fetch a hybrid public key from a remote server. |
| 904 | `ProxyResolveUser` | Resolve a username on a remote server. |
| 905 | `FederationHealth` | Check health of the federation link to another server. |
### Moderation (420-424)
| ID | Method | Description |
|-----|----------------|-------------|
| 420 | `ReportMessage` | Submit a content moderation report. |
| 421 | `BanUser` | Ban a user from a channel or server-wide. |
| 422 | `UnbanUser` | Lift a ban. |
| 423 | `ListReports` | List pending moderation reports (admin only). |
| 424 | `ListBanned` | List banned users (admin only). |
### Account (950)
| ID | Method | Description |
|-----|-----------------|-------------|
| 950 | `DeleteAccount` | Permanently delete the authenticated account and all associated data. |
---
@@ -64,196 +236,127 @@ NodeService (port 7000)
Each incoming QUIC connection follows this sequence:
```text
Client Server
------ ------
1. UDP QUIC INITIAL ->
2. <- QUIC HANDSHAKE
TLS 1.3 ServerHello +
Certificate (self-signed)
ALPN: "qpq"
3. Client verifies server cert against
pinned CA cert (--ca-cert flag)
4. QUIC connection established
5. Per RPC call:
Client opens bidirectional stream
Client writes RequestFrame:
[method_id: u16][req_id: u32][len: u32][protobuf]
Client marks end-of-write
6. Server reads RequestFrame
Server dispatches to handler by method_id
Handler processes, writes ResponseFrame:
[status: u8][req_id: u32][len: u32][protobuf]
7. For push events (server -> client):
Server opens uni-stream
Server writes PushFrame:
[event_type: u16][len: u32][protobuf]
8. Multiple RPCs run concurrently
(each on its own stream)
```
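For example, the response header written in step 6 can be parsed with a few big-endian reads (a sketch; the real `quicproquo-rpc` parser may differ):

```rust
// Hypothetical parse of a v2 response-frame header:
// [status: u8][req_id: u32 BE][len: u32 BE], followed by `len` Protobuf bytes.
fn parse_response_header(buf: &[u8]) -> Option<(u8, u32, u32)> {
    if buf.len() < 9 {
        return None; // header is always exactly 9 bytes
    }
    let status = buf[0];
    let req_id = u32::from_be_bytes(buf[1..5].try_into().ok()?);
    let len = u32::from_be_bytes(buf[5..9].try_into().ok()?);
    Some((status, req_id, len))
}
```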
### Concurrency model
Unlike the v1 Cap'n Proto RPC (which was `!Send` due to `Rc<RefCell<>>`
internals and required `LocalSet`), the v2 RPC framework uses `Arc`-based
shared state and spawns each handler with `tokio::spawn`. The server can
handle many concurrent requests per connection without a `LocalSet`.
---
## Status Codes
Response frames carry a `status: u8` field:
| Value | Status | Meaning |
|-------|------------------|---------|
| 0 | `Ok` | Success |
| 1 | `BadRequest` | Malformed request or missing required field |
| 2 | `Unauthorized` | Missing or invalid session token |
| 3 | `Forbidden` | Valid token but insufficient permissions |
| 4 | `NotFound` | Requested resource does not exist |
| 5 | `RateLimited` | Request rate limit exceeded; retry after backoff |
| 8 | `DeadlineExceeded` | Request timed out on the server |
| 9 | `Unavailable` | Server temporarily unable to serve the request |
| 10 | `Internal` | Unexpected server error |
| 11 | `UnknownMethod` | The requested method_id is not registered |
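These values could be mirrored as a Rust enum on the client side (a sketch; the actual type in `quicproquo-rpc` may be named or derived differently):

```rust
// Status codes from the table above. Note the gaps at 6 and 7.
#[derive(Debug, PartialEq)]
enum Status {
    Ok = 0,
    BadRequest = 1,
    Unauthorized = 2,
    Forbidden = 3,
    NotFound = 4,
    RateLimited = 5,
    DeadlineExceeded = 8,
    Unavailable = 9,
    Internal = 10,
    UnknownMethod = 11,
}

impl Status {
    // Decode the status byte from a response frame; None for unknown values.
    fn from_u8(v: u8) -> Option<Status> {
        Some(match v {
            0 => Status::Ok,
            1 => Status::BadRequest,
            2 => Status::Unauthorized,
            3 => Status::Forbidden,
            4 => Status::NotFound,
            5 => Status::RateLimited,
            8 => Status::DeadlineExceeded,
            9 => Status::Unavailable,
            10 => Status::Internal,
            11 => Status::UnknownMethod,
            _ => return None,
        })
    }
}
```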
---
## Authentication Flow
OPAQUE (an asymmetric PAKE) ensures the password never reaches the server in
any form:
```text
Client Server
| |
| OpaqueRegisterStart(100): |
| username, registration_request |
| --------------------------------->|
| |
| registration_response |
| <---------------------------------|
| |
| OpaqueRegisterFinish(101): |
| username, upload, identity_key |
| --------------------------------->|
| |
| success |
| <---------------------------------|
| |
| OpaqueLoginStart(102): |
| username, login_request |
| --------------------------------->|
| |
| login_response |
| <---------------------------------|
| |
| OpaqueLoginFinish(103): |
| username, finalization, |
| identity_key |
| --------------------------------->|
| |
| session_token |
| <---------------------------------|
```
The `session_token` is then passed in subsequent Protobuf requests. The server
validates it on every authenticated method call.
---
## Configuration
The server binary is configured via CLI flags or environment variables:
| Flag | Env var | Default | Description |
|----------------|----------------------------|------------------------|-------------|
| `--listen` | `QPQ_LISTEN` | `0.0.0.0:5001` | QUIC listen address (host:port). |
| `--data-dir` | `QPQ_DATA_DIR` | `data` | Directory for persisted state. |
| `--tls-cert` | `QPQ_TLS_CERT` | `data/server-cert.der` | Path to TLS certificate (DER). Auto-generated if missing. |
| `--tls-key` | `QPQ_TLS_KEY` | `data/server-key.der` | Path to TLS private key (DER). Auto-generated if missing. |
If the TLS certificate or key files do not exist at startup, the server
auto-generates a self-signed certificate for `localhost`, `127.0.0.1`, and
`::1` using `rcgen`.
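The flag/env/default precedence above can be sketched as follows (a hypothetical helper, not the server's actual CLI parsing):

```rust
use std::env;

// Resolve a setting: an explicit CLI value wins, then the environment
// variable, then the built-in default.
fn resolve(cli: Option<&str>, env_var: &str, default: &str) -> String {
    cli.map(|s| s.to_string())
        .or_else(|| env::var(env_var).ok())
        .unwrap_or_else(|| default.to_string())
}
```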
Logging level is controlled by the `RUST_LOG` environment variable (default: `info`).
---
## Further Reading
- [Architecture Overview](overview.md) -- two-service model and system diagram
- [End-to-End Data Flow](data-flow.md) -- sequence diagrams for registration, group creation, and messaging
- [Protobuf Framing](../protocol-layers/capn-proto.md) -- frame format details and method ID constants
- [Wire Format Reference](../wire-format/overview.md) -- full Protobuf schema documentation

View File

@@ -182,29 +182,22 @@ could provide an additional detection mechanism.
### Client Authentication on the Delivery Service
The Delivery Service requires a valid OPAQUE session token for all DS
operations. The session token is bound to the client's identity key, and the
server rejects enqueue and fetch operations that lack a valid token.
**Impact:** Queue flooding, spam delivery. MLS provides its own authentication
(the recipient will reject messages not signed by a group member), so forged
content will not be accepted, but the recipient must still download and attempt
to process the spam.
**Status:** Mitigated. Token-based authentication is enforced via the OPAQUE
login flow (methods 100-103). Unauthenticated enqueue attempts are rejected.
### Rate Limiting
The server enforces a sliding window rate limit on all RPC methods. Requests
exceeding the configured threshold per IP or per account are rejected with a
rate-limit error response.
**Impact:** Denial of service.
**Status:** Mitigated. Rate limiting is active (sliding window, configurable
threshold, default 50 requests/second per IP). The `rate_limit_hit_total`
Prometheus metric tracks rejections. See [Monitoring](../operations/monitoring.md).
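A sliding-window limiter of the kind described can be sketched in a few lines (illustrative; the server's actual limiter, keying, and thresholds may differ):

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

// Allow at most `max` requests within `window`; timestamps older than
// the window fall out and free up capacity.
struct SlidingWindow {
    window: Duration,
    max: usize,
    hits: VecDeque<Instant>,
}

impl SlidingWindow {
    fn new(window: Duration, max: usize) -> Self {
        Self { window, max, hits: VecDeque::new() }
    }

    // `now` is passed in explicitly so behaviour is deterministic in tests.
    fn allow(&mut self, now: Instant) -> bool {
        while self
            .hits
            .front()
            .map_or(false, |&t| now.duration_since(t) >= self.window)
        {
            self.hits.pop_front(); // evict timestamps outside the window
        }
        if self.hits.len() < self.max {
            self.hits.push_back(now);
            true
        } else {
            false // caller responds with status 5 (RateLimited)
        }
    }
}
```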
### BasicCredential Only
@@ -234,18 +227,30 @@ hybrid KEM is active for MLS).
**Mitigation path:** Adopt post-quantum TLS (ML-KEM in TLS 1.3 handshake) when
`rustls` supports it.
## Implemented Mitigations
### Sealed Sender
**Status:** Implemented. The `--sealed-sender` flag encrypts the sender's
identity inside the MLS ciphertext. When enabled, the server routes by recipient
queue index only and cannot determine who sent the message. This reduces server
metadata visibility from "who sent to whom" to "someone sent to this recipient."
### OPAQUE Authentication
**Status:** Implemented. The OPAQUE protocol (an asymmetric PAKE built on the
RFC 9497 OPRF) is the only supported login mechanism. The server stores OPAQUE
registration records; it never receives or stores the client's password.
Session tokens issued on login are required for all authenticated RPCs.
### Username Enumeration Protection
**Status:** Implemented. All auth responses (including failures) are subject to
a 5ms timing floor, preventing timing-based username enumeration.
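A timing floor like this can be sketched as a wrapper that pads fast code paths (a hypothetical helper; the server's implementation may differ):

```rust
use std::thread;
use std::time::{Duration, Instant};

// Ensure a handler takes at least `floor` wall-clock time, so a fast
// "no such user" path is indistinguishable from a slow success path.
fn with_timing_floor<T>(floor: Duration, f: impl FnOnce() -> T) -> T {
    let start = Instant::now();
    let out = f();
    if let Some(remaining) = floor.checked_sub(start.elapsed()) {
        thread::sleep(remaining); // pad up to the floor
    }
    out
}
```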
---
## Future Mitigations
### Private Information Retrieval (PIR)
@@ -272,17 +277,6 @@ verify that their public key has not been replaced by an attacker.
**Benefit:** Detects attacks where the server (or an attacker who compromised
the server) substitutes a victim's public key with the attacker's key.
### Tor/I2P Integration
**Goal:** Hide client IP addresses from the server and network adversaries.
@@ -315,11 +309,12 @@ communication patterns from traffic analysis.
|--------|-------------------|-----|-------------|
| Passive eavesdropper | TLS 1.3 + MLS (2 layers) | Traffic analysis | Padding, Tor |
| Active MITM | TLS 1.3 (QUIC) | Self-signed certs | Cert pinning, CA |
| Compromised server | MLS E2E encryption | Metadata visible | Sealed Sender, PIR |
| Compromised server | MLS E2E encryption + Sealed Sender | Metadata partially visible | PIR |
| Compromised client | FS + PCS | Current epoch exposed | Periodic Updates |
| Spam/flooding | None | No auth on DS | AUTHZ\_PLAN |
| Spam/flooding | Rate limiting + OPAQUE session tokens | -- | -- |
| Username enumeration | 5ms timing floor on all auth responses | -- | -- |
| Key substitution | None | BasicCredential only | Key Transparency |
| Quantum adversary (content) | Hybrid KEM (M5+) | Pre-M5 messages | Deploy hybrid ASAP |
| Quantum adversary (content) | Hybrid KEM (X25519 + ML-KEM-768) | Pre-v2 messages | -- |
| Quantum adversary (transport) | None | Classical TLS (ECDHE) | PQ TLS |
## Related Pages

View File

@@ -10,9 +10,10 @@ These decisions are not immutable. Each ADR has a status field and can be supers
| ADR | Title | Status | One-line summary |
|---|---|---|---|
| [ADR-002](adr-002-capnproto.md) | Cap'n Proto over MessagePack | Accepted | Zero-copy, schema-enforced serialisation with built-in async RPC replaces hand-rolled MessagePack dispatch. |
| [ADR-002](adr-002-capnproto.md) | Cap'n Proto over MessagePack (v1) | Superseded | Zero-copy, schema-enforced serialisation with built-in async RPC replaced hand-rolled MessagePack dispatch. Superseded by ADR-007. |
| [ADR-004](adr-004-mls-unaware-ds.md) | MLS-Unaware Delivery Service | Accepted | The DS routes opaque blobs by recipient key; it never inspects MLS content. |
| [ADR-005](adr-005-single-use-keypackages.md) | Single-Use KeyPackages | Accepted | The AS atomically removes a KeyPackage on fetch to preserve MLS forward secrecy. |
| [ADR-007](adr-007-protobuf-migration.md) | v1 Cap'n Proto to v2 Protobuf Migration | Accepted | Replace Cap'n Proto RPC with custom Protobuf framing over QUIC streams for better ecosystem support, 44-method surface, and multi-threaded dispatch. |
---
@@ -26,7 +27,7 @@ For a broader comparison of quicproquo's design against alternative messaging pr
Each ADR page follows this structure:
1. **Status** -- One of: Proposed, Accepted, Deprecated, Superseded. All current ADRs are Accepted.
1. **Status** -- One of: Proposed, Accepted, Deprecated, Superseded. All current ADRs are Accepted unless noted.
2. **Context** -- The problem or force that motivated the decision. What constraints existed? What alternatives were considered?
3. **Decision** -- The specific choice that was made. What was selected and what was rejected?
4. **Consequences** -- The trade-offs that result from the decision. What are the benefits? What are the costs? What residual risks remain?
@@ -34,13 +35,72 @@ Each ADR page follows this structure:
---
## Cross-cutting themes
## ADR-007: v1 Cap'n Proto to v2 Protobuf Migration
Several themes recur across multiple ADRs:
**Status**: Accepted
**Context**
quicproquo v1 used Cap'n Proto for both serialisation and RPC dispatch via
`capnp-rpc`. This worked well for the initial 8-method `NodeService` interface
but had several limitations as the protocol expanded:
- **`!Send` constraint**: `capnp-rpc` uses `Rc<RefCell<>>` internally, requiring
all RPC futures to run on a `tokio::task::LocalSet`. This prevented multi-threaded
dispatch and added complexity to every connection handler.
- **Schema growth friction**: Cap'n Proto's capability-based RPC model does not
map cleanly to large flat method tables. Adding the 36 new methods (keys,
blob, device, federation, moderation, recovery, etc.) would have required
significant schema refactoring.
- **ALPN collision**: The `b"capnp"` ALPN identifier is not registered and could
conflict with other Cap'n Proto deployments. A project-specific ALPN is cleaner.
- **Tooling**: `capnpc` requires a system-wide binary installation or a vendored
copy. `prost-build` with `protobuf-src` self-vendors `protoc`, eliminating the
build-time dependency.
**Decision**
Replace `capnp-rpc` with a custom binary framing layer (`quicproquo-rpc`) and
Protocol Buffers (`prost`) for payload serialisation:
- Three frame types: Request (10-byte header), Response (9-byte header), Push
(6-byte header), all carrying Protobuf-encoded payloads.
- Method IDs are numeric `u16` constants dispatched via a handler registry.
One QUIC bidirectional stream per RPC call; push events on QUIC uni-streams.
- ALPN changed from `b"capnp"` to `b"qpq"`. Default port changed from 7000 to 5001.
- Cap'n Proto legacy types are retained in `quicproquo-proto` for v1 compatibility
but are no longer used for RPC dispatch.
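To make the framing concrete, here is a round-trip sketch of a Request header. The field layout below is an assumption for illustration (the authoritative definition is `crates/quicproquo-rpc/src/framing.rs`); only the 10-byte total and the `u16` method ID are taken from the text:

```rust
// Hypothetical 10-byte Request header layout (illustrative, not authoritative):
//   type(1) | flags(1) | request_id(u32 BE) | method_id(u16 BE) | payload_len(u16 BE)
const FRAME_REQUEST: u8 = 1; // assumed discriminant value

fn encode_request_header(flags: u8, request_id: u32, method_id: u16, payload_len: u16) -> [u8; 10] {
    let mut h = [0u8; 10];
    h[0] = FRAME_REQUEST;
    h[1] = flags;
    h[2..6].copy_from_slice(&request_id.to_be_bytes());
    h[6..8].copy_from_slice(&method_id.to_be_bytes());
    h[8..10].copy_from_slice(&payload_len.to_be_bytes());
    h
}

fn decode_request_header(h: &[u8; 10]) -> Option<(u8, u32, u16, u16)> {
    if h[0] != FRAME_REQUEST {
        return None; // not a Request frame
    }
    Some((
        h[1],
        u32::from_be_bytes(h[2..6].try_into().ok()?),
        u16::from_be_bytes(h[6..8].try_into().ok()?),
        u16::from_be_bytes(h[8..10].try_into().ok()?),
    ))
}
```

The Protobuf-encoded payload follows the header on the same QUIC bidirectional stream.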
**Consequences**
Benefits:
- Full `tokio::spawn` concurrency (no `LocalSet` required).
- 44-method RPC surface with clean numeric namespace and room to grow.
- Self-contained build (no system `protoc` dependency).
- Lighter middleware integration via Tower `Service` traits.
- Push event delivery without polling.
Costs:
- Lost Cap'n Proto zero-copy reads (Protobuf requires deserialisation). Acceptable
because the hot path in the Delivery Service works with opaque `bytes::Bytes`
without deserialisation.
- Lost promise pipelining from `capnp-rpc`. Not required for the current RPC
surface; can be re-added with a future streaming RPC design.
- v1 clients are no longer wire-compatible with v2 servers.
**Code references**
- Frame format: `crates/quicproquo-rpc/src/framing.rs`
- Method IDs: `crates/quicproquo-proto/src/lib.rs` (`method_ids` module)
- Proto schemas: `proto/qpq/v1/*.proto`
---
## Cross-cutting themes
### Layered security
The core principle is that **no single layer is trusted alone**. QUIC/TLS transport encryption protects metadata and provides authentication; MLS provides end-to-end content encryption with forward secrecy and post-compromise security.
The core principle is that **no single layer is trusted alone**. QUIC/TLS transport encryption protects metadata and provides server authentication; MLS provides end-to-end content encryption with forward secrecy and post-compromise security; OPAQUE ensures the server never learns the user's password.
### Server minimalism
@@ -48,13 +108,16 @@ ADR-004 and ADR-005 reflect a design philosophy where the server does as little
### Schema-first design
ADR-002 establishes Cap'n Proto as the single source of truth for the wire format. Every message and RPC call is defined in `.capnp` schema files, which are checked into the repository and used for code generation. This eliminates the class of bugs that arises from hand-rolled serialisation and ensures that the wire format is documented, versioned, and evolvable.
The v2 protocol defines all messages and method IDs in checked-in source files
(`proto/qpq/v1/*.proto` and `crates/quicproquo-proto/src/lib.rs`). Every wire
type is documented, versioned, and evolvable through the standard Protobuf
schema evolution rules (adding optional fields, reserving removed field numbers).
---
## Further reading
- [Why This Design, Not Signal/Matrix/...](why-not-signal.md) -- comparative analysis against alternative protocols
- [Wire Format Overview](../wire-format/overview.md) -- the serialisation pipeline that implements these decisions
- [Wire Format Reference](../wire-format/overview.md) -- the serialisation pipeline that implements these decisions
- [Architecture Overview](../architecture/overview.md) -- system-level view
- [Protocol Layers Overview](../protocol-layers/overview.md) -- how the protocol layers stack

View File

@@ -1,279 +1,202 @@
# Authentication Service Internals
The Authentication Service (AS) stores and distributes single-use MLS
KeyPackages. It is one of the two logical services exposed through the unified
`NodeService` RPC interface. The AS also stores hybrid (X25519 + ML-KEM-768)
public keys for post-quantum envelope encryption.
The Authentication Service handles user registration and login via the OPAQUE asymmetric password-authenticated key exchange (PAKE) protocol. It also manages MLS KeyPackages, hybrid post-quantum keys, and session token issuance.
This page covers the server-side implementation of KeyPackage storage, the
`Auth` struct validation logic, and the hybrid key endpoints.
This page covers the server-side OPAQUE flow, session token lifecycle, KeyPackage storage, and hybrid key endpoints.
**Sources:**
- `crates/quicproquo-server/src/main.rs` (RPC handlers, auth validation)
- `crates/quicproquo-server/src/storage.rs` (FileBackedStore)
- `schemas/node.capnp` (wire schema)
- `crates/quicproquo-server/src/domain/` (OPAQUE handlers, session management)
- `crates/quicproquo-server/src/sql_store.rs` (SqlStore persistence)
- `proto/qpq/v1/auth.proto` (wire schema)
---
## OPAQUE Protocol
quicproquo uses OPAQUE, an asymmetric PAKE built on the RFC 9497 OPRF, for user authentication. The password never leaves the client and is never known to the server. The server stores an OPAQUE registration record derived from the password, but this record cannot be used to recover the password even if the server is fully compromised.
### Registration (IDs 100-101)
Registration takes two round trips.
```text
Client Server
| |
| [1] OpaqueRegisterStartRequest |
| username: "alice" |
| request: <OPAQUE RegistrationReq> |
| ---------------------------------------->|
| |
| [2] OpaqueRegisterStartResponse |
| response: <OPAQUE RegistrationResp> |
| <----------------------------------------|
| |
| [3] OpaqueRegisterFinishRequest |
| username: "alice" |
| upload: <OPAQUE RegistrationUpload> |
| identity_key: <Ed25519 pubkey> |
| ---------------------------------------->|
| |
| [4] OpaqueRegisterFinishResponse |
| success: true |
| <----------------------------------------|
```
**Step [1]:** The client generates a `RegistrationRequest` blob using the `opaque-ke` crate. This contains a masked version of the password; the server cannot extract the raw password.
**Step [2]:** The server generates a `RegistrationResponse` using its OPAQUE server keypair and the client's request. The server does not yet persist anything.
**Step [3]:** The client completes the OPAQUE registration and sends a `RegistrationUpload` blob. This blob contains the password-derived key material (specifically the client's OPAQUE export key envelope and public key). The client also sends its Ed25519 identity public key.
**Step [4]:** The server stores the `RegistrationUpload` blob as the user's OPAQUE record, indexed by `username`. The Ed25519 identity key is stored alongside the record. Registration fails with `success: false` if the username is already taken.
### Login (IDs 102-103)
Login also takes two round trips and produces a session token.
```text
Client Server
| |
| [1] OpaqueLoginStartRequest |
| username: "alice" |
| request: <OPAQUE CredentialReq> |
| ---------------------------------------->|
| |
| [2] OpaqueLoginStartResponse |
| response: <OPAQUE CredentialResp> |
| <----------------------------------------|
| |
| [3] OpaqueLoginFinishRequest |
| username: "alice" |
| finalization: <OPAQUE Finalization> |
| identity_key: <Ed25519 pubkey> |
| ---------------------------------------->|
| |
| [4] OpaqueLoginFinishResponse |
| session_token: <32 bytes> |
| <----------------------------------------|
```
**Step [1]:** The client generates a `CredentialRequest` using the `opaque-ke` crate.
**Step [2]:** The server looks up the user's OPAQUE record by `username` and generates a `CredentialResponse`. If the username is unknown, the server generates a fake response using a blinded dummy record to prevent username enumeration.
**Step [3]:** The client verifies the server's `CredentialResponse` against the stored password, derives the shared export key, and sends a `CredentialFinalization` blob that proves knowledge of the password. The client also sends its Ed25519 identity key.
**Step [4]:** The server verifies the `CredentialFinalization`. If verification succeeds and the identity key matches the registered key, the server generates a `session_token` (32 random bytes), stores it in the session table, and returns it to the client. If verification fails, the server returns an error status with an empty `session_token`.
### Session token lifecycle
The `session_token` is a 32-byte random bearer credential issued at login. It is:
- Stored in the SQLCipher `sessions` table (see [Storage Backend](storage-backend.md)).
- Included by the client in subsequent QUIC connections for authentication.
- Validated by the server on connection establishment; the server rejects connections with unknown or expired tokens.
- Invalidated on `DeleteAccount` or explicit logout.
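The validation step above can be modelled as a keyed lookup plus an expiry check. This is an illustrative in-memory model (the type and method names are hypothetical; the real lookup goes through the SQLCipher `sessions` table):

```rust
use std::collections::HashMap;

// Illustrative model of bearer-token validation against stored sessions.
struct SessionRecord {
    identity: Vec<u8>,        // Ed25519 identity key of the logged-in user
    expires_at: Option<u64>,  // unix seconds; None = no expiry
}

struct Sessions {
    by_token: HashMap<[u8; 32], SessionRecord>,
}

impl Sessions {
    fn validate(&self, token: &[u8; 32], now: u64) -> Option<&SessionRecord> {
        let rec = self.by_token.get(token)?; // unknown token -> reject
        match rec.expires_at {
            Some(exp) if exp <= now => None, // expired -> reject
            _ => Some(rec),
        }
    }
}
```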
The `Auth` message in `common.proto` carries the token for federation contexts:
```protobuf
message Auth {
bytes access_token = 1;
bytes device_id = 2;
}
```
---
## KeyPackage Storage
### Data Model
MLS KeyPackages are single-use by RFC 9420 requirement. The server stores a FIFO queue of KeyPackages per identity key.
KeyPackages are stored in a `FileBackedStore` using a `Mutex`-protected
`HashMap`:
### Data model
```text
key_packages: Mutex<HashMap<Vec<u8>, VecDeque<Vec<u8>>>>
^ ^
| |
identity_key FIFO queue of
(32-byte Ed25519 TLS-encoded
public key) KeyPackage bytes
identity_key (32-byte Ed25519 pubkey)
-> VecDeque<KeyPackage bytes>
```
Each identity can have multiple KeyPackages queued. This is essential because
KeyPackages are single-use (per RFC 9420): once fetched by a peer, they are
permanently removed. Clients should upload several KeyPackages to handle
concurrent group invitations.
Each identity can have multiple KeyPackages queued. Clients should upload several packages after registration so that concurrent group invitations can each consume one without exhausting the supply.
The map is persisted to `data/keypackages.bin` using bincode serialization,
wrapped in the `QueueMapV1` struct. See [Storage Backend](storage-backend.md)
for persistence details.
### uploadKeyPackage
```capnp
uploadKeyPackage @0 (identityKey :Data, package :Data, auth :Auth)
-> (fingerprint :Data);
```
### UploadKeyPackage (ID 300)
**Handler logic:**
1. **Parse parameters.** Extract `identityKey`, `package`, and `auth`.
1. Validate `identity_key` (exactly 32 bytes) and `package` (non-empty, <= 1 MiB).
2. Compute `SHA-256(package)` as the fingerprint.
3. Push the package to the back of the identity's queue in the SQL store.
4. Return the fingerprint.
2. **Validate auth.** Call `validate_auth()` (see [Auth Validation](#auth-validation)
below).
The fingerprint allows the uploading client to detect server-side tampering. A peer that fetches a KeyPackage can compare its SHA-256 hash against the fingerprint communicated out-of-band.
3. **Validate inputs:**
| Check | Constraint | Error Message |
|-------|------------|---------------|
| Identity key length | Exactly 32 bytes | `"identityKey must be exactly 32 bytes, got {n}"` |
| Package non-empty | `package.len() > 0` | `"package must not be empty"` |
| Package size cap | `package.len() <= 1,048,576` | `"package exceeds max size (1048576 bytes)"` |
4. **Compute fingerprint.** `SHA-256(package_bytes)` produces a 32-byte digest.
5. **Store.** `FileBackedStore::upload_key_package(identity_key, package)` pushes
the package to the back of the identity's `VecDeque` and flushes to disk.
6. **Return fingerprint.** The SHA-256 hash is set in the response.
The fingerprint allows the uploading client to verify that the server stored the
exact bytes it sent. See [KeyPackage Exchange Flow](keypackage-exchange.md) for
the client-side verification logic.
### fetchKeyPackage
```capnp
fetchKeyPackage @1 (identityKey :Data, auth :Auth) -> (package :Data);
```
### FetchKeyPackage (ID 301)
**Handler logic:**
1. **Parse and validate** `identityKey` (32 bytes) and `auth`.
1. Validate `identity_key` (exactly 32 bytes).
2. Pop from the front of the identity's queue (atomic operation).
3. Return the package bytes, or empty bytes if the queue is empty.
2. **Pop from queue.** `FileBackedStore::fetch_key_package(identity_key)` calls
`VecDeque::pop_front()` on the identity's queue, removing and returning the
oldest KeyPackage. The updated map is flushed to disk.
3. **Return.** If a KeyPackage was available, set it in the response. If the
queue was empty (or the identity has no entry), return empty `Data`.
**Single-use semantics:** The `pop_front()` operation ensures each KeyPackage is
returned exactly once. This is critical for MLS security -- reusing a KeyPackage
would allow conflicting group states. The removal is atomic with respect to the
`Mutex` lock, so concurrent fetch requests will not receive the same package.
**Empty response handling:** The client checks `package.is_empty()` to
distinguish between "no packages available" and "package fetched." An empty
response is not an error -- it means the target identity has exhausted their
KeyPackage supply and needs to upload more.
---
## Auth Validation
All `NodeService` RPC methods accept an `Auth` struct:
```capnp
struct Auth {
version @0 :UInt16; # 0 = legacy/none, 1 = token-based
accessToken @1 :Data; # opaque bearer token
deviceId @2 :Data; # optional UUID for auditing
}
```
The server validates this struct through the `validate_auth` function:
```text
validate_auth(cfg, auth)
|
+-- version == 0?
| +-- cfg.allow_legacy_v0 == true? -> OK
| +-- cfg.allow_legacy_v0 == false? -> ERROR "auth version 0 disabled"
|
+-- version == 1?
| +-- accessToken empty? -> ERROR "requires non-empty accessToken"
| +-- cfg.required_token is Some?
| | +-- token matches? -> OK
| | +-- token mismatch? -> ERROR "invalid accessToken"
| +-- cfg.required_token is None? -> OK (any non-empty token accepted)
|
+-- version >= 2? -> ERROR "unsupported auth version"
```
### AuthConfig
The server's auth behavior is controlled by `AuthConfig`:
```rust
struct AuthConfig {
required_token: Option<Vec<u8>>, // None = accept any token
allow_legacy_v0: bool, // true = accept version 0 (no auth)
}
```
Configured via CLI flags / environment variables:
| Flag / Env Var | Default | Purpose |
|-----------------------------------|---------|---------|
| `--auth-token` / `QPQ_AUTH_TOKEN` | None | Required bearer token. If unset, any non-empty token is accepted for version 1. |
| `--allow-auth-v0` / `QPQ_ALLOW_AUTH_V0` | `true` | Whether to accept `auth.version=0` (legacy, unauthenticated) requests. |
### Version Semantics
| Version | Meaning | Token Required? |
|---------|---------|-----------------|
| 0 | Legacy / unauthenticated | No. Token is ignored. Server must have `allow_legacy_v0 = true`. |
| 1 | Token-based authentication | Yes. Must be non-empty. Must match `required_token` if configured. |
| 2+ | Reserved for future use | Rejected. |
### Current Limitations
The current auth implementation is intentionally minimal:
- **No identity binding.** The access token is not tied to a specific Ed25519
identity. Any valid token can upload or fetch KeyPackages for any identity.
- **No rate limiting.** There is no per-identity or per-IP rate limiting.
- **No token rotation.** Tokens are static strings configured at server startup.
- **No device management.** The `deviceId` field is accepted but not used for
authorization decisions.
The [Auth, Devices, and Tokens](../roadmap/authz-plan.md) roadmap item
addresses these gaps with a proper token issuance and validation system.
The pop is atomic with respect to the store lock, so concurrent fetch requests will not receive the same package. An empty response is not an error -- it means the target has exhausted its KeyPackage supply.
---
## Hybrid Key Endpoints
The AS also stores hybrid (X25519 + ML-KEM-768) public keys for post-quantum
envelope encryption. Unlike KeyPackages, hybrid keys are **not single-use** --
they are stored persistently and can be fetched multiple times.
Hybrid (X25519 + ML-KEM-768) public keys are used for post-quantum sealed envelope encryption. Unlike KeyPackages, hybrid keys are not single-use. Each identity stores exactly one hybrid key; uploading a new key overwrites the previous one.
### uploadHybridKey
```capnp
uploadHybridKey @6 (identityKey :Data, hybridPublicKey :Data) -> ();
```
### UploadHybridKey (ID 302)
**Handler logic:**
1. Validate `identityKey` (32 bytes) and `hybridPublicKey` (non-empty).
2. `FileBackedStore::upload_hybrid_key(identity_key, hybrid_pk)` stores the key,
overwriting any previous value for this identity.
3. Flushes to `data/hybridkeys.bin`.
1. Validate `identity_key` (32 bytes) and `hybrid_public_key` (non-empty).
2. Store the hybrid key, overwriting any previous value for this identity.
3. Return empty response.
The storage model is simpler than KeyPackages: a flat
`HashMap<Vec<u8>, Vec<u8>>` (identity key to hybrid public key bytes). There is
no queue -- each identity has at most one hybrid public key.
### FetchHybridKey (ID 303)
### fetchHybridKey
Non-destructive lookup. Returns the stored hybrid public key, or empty bytes if none is stored. The key persists across fetches.
```capnp
fetchHybridKey @7 (identityKey :Data) -> (hybridPublicKey :Data);
```
### FetchHybridKeys (ID 304)
**Handler logic:**
1. Validate `identityKey` (32 bytes).
2. Look up the hybrid public key in the store. Unlike `fetchKeyPackage`, this
does **not** remove the key -- it can be fetched repeatedly.
3. Return the key bytes, or empty `Data` if none is stored.
See [Hybrid KEM](../protocol-layers/hybrid-kem.md) for how the client uses
these keys to wrap MLS payloads in post-quantum envelopes.
Batch variant. Returns one key per input identity key in the same order. Missing keys are returned as empty bytes at the corresponding index position.
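The batch semantics can be sketched in a few lines -- one output per input, same order, empty bytes where no key is stored (function name is illustrative, not the real handler):

```rust
use std::collections::HashMap;

// Sketch of FetchHybridKeys batch semantics: one entry per requested
// identity, in request order, with empty bytes for missing keys.
fn fetch_hybrid_keys(store: &HashMap<Vec<u8>, Vec<u8>>, ids: &[Vec<u8>]) -> Vec<Vec<u8>> {
    ids.iter()
        .map(|id| store.get(id).cloned().unwrap_or_default())
        .collect()
}
```

Clients can therefore zip the response against the request to pair keys with identities.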
---
## NodeServiceImpl Structure
## Key Transparency Integration
The server-side implementation struct:
The key transparency log (a Merkle append-only log) records key revocations and allows clients to audit the integrity of the key directory.
### RevokeKey (ID 510)
Appends a revocation entry to the KT Merkle log. Returns the leaf index of the revocation entry. Reasons: `"compromised"`, `"superseded"`, `"user_revoked"`.
### CheckRevocation (ID 511)
Returns the revocation status of an identity key: whether revoked, the reason, and the timestamp in milliseconds.
### AuditKeyTransparency (ID 520)
Returns a range of entries from the append-only log for client-side Merkle verification. Clients can verify the returned `root` hash against the Merkle tree built from the entries.
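Client-side verification rebuilds the Merkle root from the returned entries and compares it to the server's `root`. The toy sketch below shows only the tree shape; it substitutes std's non-cryptographic `DefaultHasher` for the real cryptographic hash, and duplicates the last node at odd levels (an assumption -- the real padding rule is defined by the server's log implementation):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy hash stand-in; the real log uses a cryptographic hash function.
fn h(data: &[u8]) -> u64 {
    let mut s = DefaultHasher::new();
    data.hash(&mut s);
    s.finish()
}

// Recompute a Merkle root over the fetched log entries (illustrative).
fn merkle_root(leaves: &[Vec<u8>]) -> u64 {
    let mut level: Vec<u64> = leaves.iter().map(|l| h(l)).collect();
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| {
                let right = *pair.get(1).unwrap_or(&pair[0]); // duplicate odd node
                let mut buf = pair[0].to_be_bytes().to_vec();
                buf.extend_from_slice(&right.to_be_bytes());
                h(&buf)
            })
            .collect();
    }
    level.first().copied().unwrap_or(0)
}
```

Any tampered entry changes the recomputed root, so a mismatch against the server-supplied root signals an inconsistent log.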
---
## Server implementation structure
```rust
struct NodeServiceImpl {
store: Arc<FileBackedStore>, // shared across connections
waiters: Arc<DashMap<Vec<u8>, Arc<Notify>>>, // long-poll notification
auth_cfg: Arc<AuthConfig>, // auth policy
// Domain handler (quicproquo-server/src/domain/)
struct AuthHandler {
store: Arc<SqlStore>, // SQLCipher persistence
opaque_server: OpaqueServer, // opaque-ke server state
}
```
All connections share the same `store` and `waiters` via `Arc`. The
`DashMap<Vec<u8>, Arc<Notify>>` is keyed by recipient key and provides the
push-notification mechanism for `fetchWait`. See
[Delivery Service Internals](delivery-service.md) for the long-polling
implementation.
All connections share the same `SqlStore` via `Arc`. The OPAQUE server state contains the server's long-term OPAQUE keypair, which is generated on first start and persisted to the database.
---
## Connection Model
## Related pages
```text
QUIC endpoint (port 7000)
+-- TLS 1.3 handshake (self-signed cert by default)
+-- Accept bidirectional stream
+-- capnp-rpc VatNetwork (Side::Server)
+-- NodeServiceImpl { store, waiters, auth_cfg }
```
Each QUIC connection opens one bidirectional stream for Cap'n Proto RPC. The
`capnp-rpc` crate uses `Rc<RefCell<>>` internally, making it `!Send`. All RPC
tasks run on a `tokio::task::LocalSet` to satisfy this constraint.
The server generates a self-signed TLS certificate on first start if no
certificate files exist. Certificate and key paths are configurable via
`--tls-cert` and `--tls-key`.
---
## Health Endpoint
```capnp
health @5 () -> (status :Text);
```
A simple readiness probe. Returns `"ok"` unconditionally. No auth validation is
performed. Useful for infrastructure health checks and measuring QUIC round-trip
time.
---
## Related Pages
- [KeyPackage Exchange Flow](keypackage-exchange.md) -- end-to-end upload and fetch flow including client-side logic
- [Delivery Service Internals](delivery-service.md) -- the DS half of NodeService
- [Storage Backend](storage-backend.md) -- FileBackedStore persistence model
- [GroupMember Lifecycle](group-member-lifecycle.md) -- how KeyPackages are generated and consumed
- [Auth, Devices, and Tokens](../roadmap/authz-plan.md) -- planned auth improvements
- [NodeService Schema](../wire-format/node-service-schema.md) -- Cap'n Proto schema reference
- [Hybrid KEM](../protocol-layers/hybrid-kem.md) -- post-quantum envelope encryption
- [Storage Backend](storage-backend.md) -- SqlStore and FileBackedStore persistence
- [Auth Schema](../wire-format/auth-schema.md) -- Protobuf wire definitions
- [Method ID Reference](../wire-format/envelope-schema.md) -- all 44 method IDs

View File

@@ -1,21 +1,152 @@
# Storage Backend
quicproquo uses two storage backends: `FileBackedStore` on the server side
for KeyPackages and delivery queues, and `DiskKeyStore` on the client side for
MLS cryptographic key material. Both follow the same pattern: in-memory data
structures backed by optional file persistence, with full serialization on every
write.
quicproquo uses two storage backends: `SqlStore` on the server side (SQLCipher-encrypted SQLite with Argon2id key derivation) and `DiskKeyStore` on the client side (bincode-serialised file for MLS cryptographic key material).
**Sources:**
- `crates/quicproquo-server/src/storage.rs` (FileBackedStore)
- `crates/quicproquo-server/src/sql_store.rs` (SqlStore)
- `crates/quicproquo-server/src/storage.rs` (Store trait, FileBackedStore legacy)
- `crates/quicproquo-core/src/keystore.rs` (DiskKeyStore, StoreCrypto)
---
## FileBackedStore (Server-Side)
## SqlStore (Server-Side)
`FileBackedStore` provides persistent storage for the server's three data
domains: KeyPackages, delivery queues, and hybrid public keys.
`SqlStore` is the primary server-side storage backend. It wraps SQLCipher (SQLite with AES-256 encryption) via the `rusqlite` crate and provides a connection pool for concurrent access.
### Encryption
The database file is encrypted with SQLCipher using a key derived from a server-supplied passphrase. The key is passed as the SQLCipher `PRAGMA key` on connection open. Key derivation uses Argon2id: the server generates a random salt on first start and derives the 32-byte SQLCipher key material from the passphrase using Argon2id with server-configured parameters.
The database file is opaque without the key; an attacker with filesystem access cannot read any stored data without also compromising the server's key material.
### Connection pool
```rust
pub struct SqlStore {
pool: Vec<Mutex<Connection>>, // default pool_size = 4
}
```
`SqlStore` maintains a fixed pool of SQLCipher connections (default: 4). Each request acquires a connection via `try_lock()` on each pool slot (non-blocking fast path), falling back to blocking on the first connection if all are busy. WAL journal mode allows concurrent readers; writers are serialised by SQLite's locking protocol.
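The acquisition strategy reduces to a few lines. A minimal model, with a generic slot type standing in for `rusqlite::Connection`:

```rust
use std::sync::{Mutex, MutexGuard};

// Minimal model of the pool's try_lock fast path (illustrative).
struct Pool<T> {
    slots: Vec<Mutex<T>>,
}

impl<T> Pool<T> {
    fn acquire(&self) -> MutexGuard<'_, T> {
        // Fast path: take the first free slot without blocking.
        for slot in &self.slots {
            if let Ok(guard) = slot.try_lock() {
                return guard;
            }
        }
        // All slots busy: block on the first slot until it frees up.
        self.slots[0].lock().unwrap()
    }
}
```

Holding a guard keeps that slot busy, so a concurrent `acquire()` skips to the next free slot.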
PRAGMA settings applied to every connection:
| PRAGMA | Value | Effect |
|--------|-------|--------|
| `journal_mode` | `WAL` | Write-ahead logging for concurrent reads |
| `synchronous` | `NORMAL` | fsync on WAL checkpoints only (performance vs. durability trade-off) |
| `foreign_keys` | `ON` | Enforce referential integrity |
### Schema and migrations
The schema version is tracked via `PRAGMA user_version`. On first open, `SqlStore` applies all pending migrations in order. Migrations are embedded as SQL strings at compile time.
Current schema version: **13**
| Migration | Version | Content |
|-----------|---------|---------|
| `001_initial.sql` | 1 | Users, key_packages, deliveries, hybrid_keys tables |
| `002_add_seq.sql` | 3 | Delivery sequence numbers |
| `003_channels.sql` | 4 | Channel-aware delivery queues |
| `004_federation.sql` | 5 | Federation peer table |
| `005_signing_key.sql` | 6 | Server signing key storage |
| `006_kt_log.sql` | 7 | Key transparency Merkle log |
| `007_add_expiry.sql` | 8 | TTL/expiry columns on deliveries |
| `008_devices.sql` | 9 | Device registration table |
| `009_sessions.sql` | 10 | Session token table |
| `010_blobs.sql` | 11 | Blob storage table |
| `011_recovery_bundles.sql` | 12 | Recovery bundle table |
| `012_moderation.sql` | 13 | Reports and bans tables |
If the database's `user_version` is greater than `SCHEMA_VERSION`, the server refuses to open it (downgrade protection).
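The migration gate can be sketched as pure logic over `user_version` (the function name is illustrative; the real code executes the embedded SQL for each pending version against the SQLCipher connection):

```rust
const SCHEMA_VERSION: u32 = 13;

// Decide which migration versions must run, or refuse a downgrade.
fn plan_migrations(user_version: u32) -> Result<Vec<u32>, String> {
    if user_version > SCHEMA_VERSION {
        // Downgrade protection: a newer database must not be opened
        // by an older server.
        return Err(format!(
            "database schema v{user_version} is newer than supported v{SCHEMA_VERSION}"
        ));
    }
    // Pending versions, applied in ascending order.
    Ok(((user_version + 1)..=SCHEMA_VERSION).collect())
}
```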
### Store trait
`SqlStore` implements the `Store` trait defined in `storage.rs`:
```rust
pub trait Store: Send + Sync {
fn upload_key_package(&self, identity_key: &[u8], package: Vec<u8>) -> Result<(), StorageError>;
fn fetch_key_package(&self, identity_key: &[u8]) -> Result<Option<Vec<u8>>, StorageError>;
fn upload_hybrid_key(&self, identity_key: &[u8], hybrid_pk: Vec<u8>) -> Result<(), StorageError>;
fn fetch_hybrid_key(&self, identity_key: &[u8]) -> Result<Option<Vec<u8>>, StorageError>;
fn enqueue(&self, recipient_key: &[u8], channel_id: &[u8], payload: Vec<u8>, ...) -> Result<u64, StorageError>;
fn fetch(&self, recipient_key: &[u8], channel_id: &[u8], limit: u32, ...) -> Result<Vec<(u64, Vec<u8>)>, StorageError>;
fn ack(&self, recipient_key: &[u8], channel_id: &[u8], seq_up_to: u64, ...) -> Result<(), StorageError>;
fn store_session(&self, record: SessionRecord) -> Result<(), StorageError>;
fn fetch_session(&self, token: &[u8]) -> Result<Option<SessionRecord>, StorageError>;
// ... and more
}
```
### Key package storage
Key packages are stored in the `key_packages` table:
```sql
CREATE TABLE key_packages (
id INTEGER PRIMARY KEY AUTOINCREMENT,
identity_key BLOB NOT NULL,
package_data BLOB NOT NULL,
created_at INTEGER NOT NULL DEFAULT (unixepoch())
);
```
`upload_key_package` inserts a row. `fetch_key_package` selects and deletes the oldest row for the given identity key in a single transaction (atomic FIFO pop). This guarantees MLS's single-use requirement.
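The single-use FIFO contract can be modelled in memory (this mirrors the semantics only; the real store performs the select-and-delete inside one SQL transaction):

```rust
use std::collections::{HashMap, VecDeque};

// In-memory model of single-use KeyPackage queues (illustrative).
struct KeyPackages {
    queues: HashMap<Vec<u8>, VecDeque<Vec<u8>>>,
}

impl KeyPackages {
    fn upload(&mut self, identity: Vec<u8>, package: Vec<u8>) {
        self.queues.entry(identity).or_default().push_back(package);
    }

    fn fetch(&mut self, identity: &[u8]) -> Option<Vec<u8>> {
        // Oldest package first; removed on fetch so it can never be
        // handed out twice.
        self.queues.get_mut(identity)?.pop_front()
    }
}
```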
### Delivery queue storage
Delivery messages are stored in the `deliveries` table with per-message sequence numbers:
```sql
CREATE TABLE deliveries (
seq INTEGER PRIMARY KEY AUTOINCREMENT,
recipient BLOB NOT NULL,
channel_id BLOB NOT NULL DEFAULT '',
device_id BLOB NOT NULL DEFAULT '',
payload BLOB NOT NULL,
expires_at INTEGER, -- NULL = no expiry
message_id BLOB -- idempotency key
);
```
`enqueue` inserts a row and returns the `seq`. `fetch` selects rows with `seq > last_ack` ordered by `seq` and returns them without deleting. `ack(seq_up_to)` deletes all rows with `seq <= seq_up_to` for the given recipient, channel, and device.
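The enqueue/fetch/ack cursor semantics can be modelled over an ordered map (illustrative; the real implementation is SQL over the `deliveries` table):

```rust
use std::collections::BTreeMap;

// Model of a single recipient/channel/device delivery queue (illustrative).
struct Queue {
    rows: BTreeMap<u64, Vec<u8>>, // seq -> payload
    next_seq: u64,
}

impl Queue {
    fn enqueue(&mut self, payload: Vec<u8>) -> u64 {
        self.next_seq += 1;
        self.rows.insert(self.next_seq, payload);
        self.next_seq
    }

    // Non-destructive read of everything after the client's last ack.
    fn fetch(&self, last_ack: u64, limit: usize) -> Vec<(u64, Vec<u8>)> {
        self.rows
            .range(last_ack + 1..)
            .take(limit)
            .map(|(s, p)| (*s, p.clone()))
            .collect()
    }

    // Acknowledged rows are deleted; everything above the cursor survives.
    fn ack(&mut self, seq_up_to: u64) {
        self.rows.retain(|seq, _| *seq > seq_up_to);
    }
}
```

Because `fetch` does not delete, a client that crashes before acking simply re-fetches the same rows on reconnect.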
### Session storage
Sessions issued after OPAQUE login are stored in the `sessions` table:
```sql
CREATE TABLE sessions (
token BLOB NOT NULL PRIMARY KEY,
identity BLOB NOT NULL,
device_id BLOB,
created_at INTEGER NOT NULL DEFAULT (unixepoch()),
expires_at INTEGER
);
```
The `token` is the 32-byte random session token returned by `OpaqueLoginFinish`. The server validates incoming tokens by looking up this table.
### Error type
```rust
#[derive(thiserror::Error, Debug)]
pub enum StorageError {
#[error("database error: {0}")]
Db(String),
#[error("serialization error")]
Serde,
#[error("not found")]
NotFound,
}
```
---
## FileBackedStore (Server-Side, Legacy)
`FileBackedStore` was the original server-side storage backend. It uses bincode-serialised files with in-memory `Mutex`-protected `HashMap` structures. It remains available for development and testing, but `SqlStore` is the production backend.
### Structure
@@ -24,367 +155,115 @@ pub struct FileBackedStore {
kp_path: PathBuf, // keypackages.bin
ds_path: PathBuf, // deliveries.bin
hk_path: PathBuf, // hybridkeys.bin
    key_packages: Mutex<HashMap<Vec<u8>, VecDeque<Vec<u8>>>>, // identity -> KP queue
    deliveries: Mutex<HashMap<ChannelKey, VecDeque<Vec<u8>>>>, // (channel, recipient) -> msg queue
    hybrid_keys: Mutex<HashMap<Vec<u8>, Vec<u8>>>, // identity -> hybrid PK
}
```
Each domain has its own `Mutex`-protected in-memory map and its own disk file.
The `Mutex` (not `RwLock`) is used because every read-path operation that
modifies state (e.g., `pop_front` in `fetch_key_package`) requires exclusive
access.
### Initialization
```rust
FileBackedStore::open(dir: impl AsRef<Path>) -> Result<Self, StorageError>
```
1. Creates the directory if it does not exist.
2. Loads each map from its respective file, or initializes an empty map if the
   file is missing.
3. Returns the initialized store.
File paths under the data directory:
| File | Contents |
|------|----------|
| `keypackages.bin` | KeyPackage queues (bincode `QueueMapV1`) |
| `deliveries.bin` | Delivery queues (bincode `QueueMapV2`) |
| `hybridkeys.bin` | Hybrid public keys (bincode `HashMap`) |
The default data directory is `data/`, configurable via `--data-dir` /
`QPQ_DATA_DIR`.
### Flush-on-Every-Write
Every mutation serializes the entire in-memory map to disk:
```text
upload_key_package(identity_key, package)
|
+-- lock key_packages Mutex
|
+-- map.entry(identity_key).or_default().push_back(package)
|
+-- flush_kp_map(path, &map)
| +-- QueueMapV1 { map: map.clone() }
| +-- bincode::serialize(&payload)
| +-- fs::write(path, bytes)
|
+-- unlock Mutex
```
This approach is deliberately simple and correct:
- **Crash safety:** Every successful RPC response guarantees the data has been
written to the filesystem.
- **No partial writes:** The entire map is serialized atomically (though not to
a temp file with rename -- this is an MVP trade-off).
- **Performance:** Not suitable for production scale. Every write serializes and
writes the full map, which is O(n) in the total number of stored entries.
**Production improvement path:** This is exactly what `SqlStore` provides --
incremental writes, WAL-based crash safety, and concurrent access without
serializing the full map on every write.
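The temp-file-and-rename step that the MVP deliberately skips is a small amount of code. A sketch using only the standard library (the function name is ours, not the codebase's):

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

// Write-to-temp-then-rename: a crash mid-write leaves either the old file or
// the new one on disk, never a truncated mix, because rename() within a single
// filesystem is atomic.
fn atomic_write(path: &Path, bytes: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");
    let mut f = fs::File::create(&tmp)?;
    f.write_all(bytes)?;
    f.sync_all()?; // flush data to disk before the rename
    fs::rename(&tmp, path) // atomically replace the old file
}
```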
### KeyPackage Operations
| Method | Behavior |
|--------|----------|
| `upload_key_package(identity_key, package)` | Push to back of VecDeque; flush |
| `fetch_key_package(identity_key)` | Pop from front (FIFO, single-use); flush |
The KeyPackage map uses the `QueueMapV1` serialization wrapper:
```rust
#[derive(Serialize, Deserialize, Default)]
struct QueueMapV1 {
map: HashMap<Vec<u8>, VecDeque<Vec<u8>>>,
}
```
### Delivery Queue Operations
| Method | Behavior |
|--------|----------|
| `enqueue(recipient_key, channel_id, payload)` | Construct ChannelKey; push to back; flush |
| `fetch(recipient_key, channel_id)` | Construct ChannelKey; drain entire VecDeque; flush |
The delivery map uses `QueueMapV2` with the compound `ChannelKey`:
```rust
#[derive(Serialize, Deserialize, Clone, Eq, PartialEq, Debug)]
pub struct ChannelKey {
pub channel_id: Vec<u8>,
pub recipient_key: Vec<u8>,
}
#[derive(Serialize, Deserialize, Default)]
struct QueueMapV2 {
map: HashMap<ChannelKey, VecDeque<Vec<u8>>>,
}
```
See [Delivery Service Internals](delivery-service.md) for the full queue model
and channel-aware routing semantics.
### V1/V2 Delivery Map Migration
The delivery map format evolved from V1 (keyed by recipient key only) to V2
(keyed by `ChannelKey` with channel ID + recipient key). The load function
handles both formats transparently:
```rust
fn load_delivery_map(path: &Path) -> Result<HashMap<ChannelKey, VecDeque<Vec<u8>>>> {
let bytes = fs::read(path)?;
// Try V2 format first (channel-aware).
if let Ok(map) = bincode::deserialize::<QueueMapV2>(&bytes) {
return Ok(map.map);
}
// Fallback to legacy V1 format: migrate by setting channel_id = empty.
let legacy: QueueMapV1 = bincode::deserialize(&bytes)?;
let mut upgraded = HashMap::new();
for (recipient_key, queue) in legacy.map.into_iter() {
upgraded.insert(
ChannelKey { channel_id: Vec::new(), recipient_key },
queue,
);
}
Ok(upgraded)
}
```
Migration strategy:
1. Attempt to deserialize as V2 (`QueueMapV2`). If successful, use as-is.
2. If V2 fails, deserialize as V1 (`QueueMapV1`). Migrate each entry by
wrapping the recipient key in a `ChannelKey` with an empty `channel_id`.
3. The next flush will write V2 format, completing the migration.
This in-place migration is transparent to clients. Legacy messages (pre-channel
routing) appear under the empty channel ID and can still be fetched by clients
that pass an empty `channelId`.
### Hybrid Key Operations
| Method | Behavior |
|--------|----------|
| `upload_hybrid_key(identity_key, hybrid_pk)` | Insert (overwrite); flush |
| `fetch_hybrid_key(identity_key)` | Read-only lookup; no flush needed |
The hybrid key map is a flat `HashMap<Vec<u8>, Vec<u8>>` serialized directly
with bincode. Unlike KeyPackages, hybrid keys are not single-use -- they persist
until overwritten.
### Error Type
```rust
#[derive(thiserror::Error, Debug)]
pub enum StorageError {
#[error("io error: {0}")]
Io(String),
#[error("serialization error")]
Serde,
}
```
I/O errors (disk full, permission denied) and serialization errors (corrupt
file) are the two failure modes. The server converts `StorageError` into a
wire-level error response via the `storage_err` helper.
Every write serialises the entire map to disk (O(n) per write). No encryption: data is stored in plaintext. Not recommended for production deployments; use `SqlStore` instead.
---
## DiskKeyStore (Client-Side)
`DiskKeyStore` is the client-side key store that implements the openmls
`OpenMlsKeyStore` trait. It holds MLS cryptographic key material -- most
importantly, the HPKE init private keys created during KeyPackage generation.
### Structure
```rust
pub struct DiskKeyStore {
path: Option<PathBuf>, // None = ephemeral (in-memory only)
values: RwLock<HashMap<Vec<u8>, Vec<u8>>>, // key reference -> serialized MLS entity
}
```
The `RwLock` (not `Mutex`) allows concurrent reads. Write operations (store,
delete) take an exclusive lock and flush to disk.
### Modes
| Mode | Constructor | Persistence |
|------|-------------|-------------|
| Ephemeral | `DiskKeyStore::ephemeral()` | None. Data exists only in memory. Lost on process exit. |
| Persistent | `DiskKeyStore::persistent(path)` | Yes. Every write flushes the full map to disk. Survives process restarts. |
**Ephemeral mode** is used for tests and the `register` / `demo-group` CLI
commands where session resumption is not needed.
**Persistent mode** is used for production clients (`register-state`, `invite`,
`join`, `send`, `recv` commands). The key store file path is derived from the
state file path by changing the extension to `.ks`:
```rust
fn keystore_path(state_path: &Path) -> PathBuf {
    let mut path = state_path.to_path_buf();
    path.set_extension("ks");
    path
}
```
So `qpq-state.bin` produces a key store at `qpq-state.ks`.
### Serialisation format
The `DiskKeyStore` uses a two-layer bincode scheme:
1. **Inner layer:** Each MLS entity value (`V: MlsEntity`) is bincode-serialised,
   matching the `OpenMlsKeyStore` trait requirements.
2. **Outer layer:** The entire `HashMap<Vec<u8>, Vec<u8>>` is bincode-serialised
   as the file on disk.
**Important:** Do not use Protobuf or JSON for MLS entities. The `DiskKeyStore`
in this codebase requires bincode; using a different format will produce
incompatible key material.
```rust
fn store<V: MlsEntity>(&self, k: &[u8], v: &V) -> Result<(), Self::Error> {
    let value = bincode::serialize(v)?; // MlsEntity -> bincode bytes
    let mut values = self.values.write()?;
    values.insert(k.to_vec(), value);
    drop(values); // release lock before I/O
    self.flush() // bincode-serialize full HashMap to disk
}
```
### OpenMlsKeyStore implementation
| Trait method | DiskKeyStore behaviour |
|---|---|
| `store(k, v)` | bincode-serialize value, insert into HashMap, flush to disk |
| `read(k)` | Look up key, bincode-deserialize value, return `Option<V>` |
| `delete(k)` | Remove from HashMap, flush to disk |
The `read` method does not flush because it does not modify the map. A failed
deserialization (corrupt value) returns `None` rather than an error, which
matches the openmls `OpenMlsKeyStore` trait signature.
### Flush Behavior
```rust
fn flush(&self) -> Result<(), DiskKeyStoreError> {
let Some(path) = &self.path else {
return Ok(()); // ephemeral: no-op
};
let values = self.values.read().unwrap();
let bytes = bincode::serialize(&*values)?;
    if let Some(dir) = path.parent() {
        fs::create_dir_all(dir)?; // ensure parent dir exists
    }
fs::write(path, bytes)?;
Ok(())
}
```
Like `FileBackedStore`, the flush serializes the entire map on every write.
For client-side usage, the map is typically small (a handful of HPKE keys), so
this is not a performance concern.
### Error Type
```rust
#[derive(thiserror::Error, Debug, PartialEq, Eq)]
pub enum DiskKeyStoreError {
#[error("serialization error")]
Serialization,
#[error("io error: {0}")]
Io(String),
}
```
---
## StoreCrypto
`StoreCrypto` is a composite type that bundles a `DiskKeyStore` with the
`RustCrypto` provider from `openmls_rust_crypto`. It implements the openmls
`OpenMlsCryptoProvider` trait, which is the single entry point that openmls
uses for all cryptographic operations:
```rust
pub struct StoreCrypto {
    crypto: RustCrypto,      // AES-GCM, SHA-256, X25519, Ed25519, etc.
    key_store: DiskKeyStore, // HPKE init keys, MLS epoch secrets, etc.
}
impl OpenMlsCryptoProvider for StoreCrypto {
    type CryptoProvider = RustCrypto;
    type RandProvider = RustCrypto;
    type KeyStoreProvider = DiskKeyStore;
    fn crypto(&self) -> &RustCrypto { &self.crypto }
    fn rand(&self) -> &RustCrypto { &self.crypto }
    fn key_store(&self) -> &DiskKeyStore { &self.key_store }
}
```
`StoreCrypto` is the `backend` field of [`GroupMember`](group-member-lifecycle.md).
It is passed to every openmls operation -- `KeyPackage::builder().build()`,
`MlsGroup::new_with_group_id()`, `MlsGroup::new_from_welcome()`,
`create_message()`, `process_message()`, etc.
The critical property is that the **same `StoreCrypto` instance** (and therefore
the same `DiskKeyStore`) must be used from `generate_key_package()` through
`join_group()`, because the HPKE init private key is stored in and read from
this key store.
---
## Storage architecture summary
```text
Server                                   Client
======                                   ======
SqlStore (production)                    DiskKeyStore
+-- SQLCipher-encrypted SQLite           +-- values (RwLock<HashMap>)
|   WAL mode, pool_size=4                |   Persisted: {state}.ks
|   Key: Argon2id(passphrase, salt)      |   Format: bincode(HashMap<Vec<u8>, Vec<u8>>)
|   Schema: 13 migrations                |   Values: bincode(MlsEntity)
|   Tables: users, key_packages,         |
|     deliveries, sessions, blobs,       +-- Wrapped by StoreCrypto
|     devices, kt_log, recovery_bundles, |   implements OpenMlsCryptoProvider
|     reports, banned_users, ...         |
|                                        +-- Used by GroupMember.backend
FileBackedStore (legacy / dev)
+-- keypackages.bin (bincode)
+-- deliveries.bin (bincode)
+-- hybridkeys.bin (bincode)
    No encryption. Not for production.
```
### Shared Design Patterns
The two file-based stores (`FileBackedStore` and `DiskKeyStore`) share these characteristics:
1. **Full-map serialization.** Every write serializes the entire map to disk.
Simple, correct, but O(n) per write.
2. **Bincode format.** The outer map is always bincode-serialized. Compact and
fast, but not human-readable and not forward-compatible without wrapper
structs.
3. **No WAL / journaling.** A crash during `fs::write` could leave a corrupt
file. For the MVP, this is acceptable -- the data can be regenerated (clients
re-upload KeyPackages; delivery messages are ephemeral).
4. **No compaction.** Empty queues are not removed from the map. Over time, the
serialized size can grow with stale entries. A production implementation
should periodically compact empty entries.
5. **Directory creation.** Both backends call `fs::create_dir_all` before
writing, ensuring parent directories exist.
---
## Related pages
- [GroupMember Lifecycle](group-member-lifecycle.md) -- how `StoreCrypto` and `DiskKeyStore` are used during MLS operations
- [KeyPackage Exchange Flow](keypackage-exchange.md) -- upload and fetch through `FileBackedStore`
- [Delivery Service Internals](delivery-service.md) -- delivery queue operations
- [Key Lifecycle and Zeroization](../cryptography/key-lifecycle.md) -- how HPKE keys are created and destroyed
- [Authentication Service Internals](authentication-service.md) -- KeyPackage and session storage
- [Wire Format Overview](../wire-format/overview.md) -- frame format and transport
- [Method ID Reference](../wire-format/envelope-schema.md) -- RPC method IDs


@@ -1,26 +1,26 @@
# Introduction
**quicproquo** is a research-oriented, end-to-end encrypted group messaging system written in Rust. It layers the Messaging Layer Security protocol (MLS, [RFC 9420](https://datatracker.ietf.org/doc/rfc9420/)) on top of QUIC + TLS 1.3 transport (via [quinn](https://github.com/quinn-rs/quinn) and [rustls](https://github.com/rustls/rustls)), with all service RPCs framed using a compact binary Protocol Buffers format over a custom framing layer. The project exists to explore how modern transport encryption (QUIC), a formally specified group key agreement protocol (MLS), and post-quantum hybrid key encapsulation compose in practice -- and to provide a readable, auditable reference implementation for security researchers, protocol designers, and Rust developers who want to study or extend the design.
---
## Protocol stack
```
+---------------------------------------------+
| Application / MLS ciphertext | <- group key ratchet (RFC 9420)
+---------------------------------------------+
| Protobuf framing (custom binary header) | <- typed, length-prefixed framing
+---------------------------------------------+
| QUIC + TLS 1.3 (quinn/rustls) | <- mutual auth + transport secrecy
+---------------------------------------------+
```
Each layer addresses a distinct concern:
1. **QUIC + TLS 1.3** provides authenticated, confidential transport with 0-RTT connection establishment and multiplexed streams. The server presents a TLS 1.3 certificate (self-signed by default); the client verifies it against a local trust anchor. ALPN negotiation uses the token `qpq`.
2. **Protobuf framing** defines the wire format for all service operations across 44 RPC methods. Each request carries a `[method_id: u16][request_id: u32][payload_len: u32]` header followed by a Protobuf-encoded payload. Server-to-client push events use a separate frame type on QUIC uni-streams. Message definitions live in `proto/qpq/v1/*.proto` and are compiled to Rust with `prost` at build time.
3. **MLS (RFC 9420)** provides the group key agreement layer. Each participant holds an Ed25519 identity keypair and generates single-use HPKE KeyPackages. The MLS epoch ratchet delivers forward secrecy and post-compromise security: compromising a member's state at epoch *n* does not reveal plaintext from epochs *< n* (forward secrecy) or *> n+1* (post-compromise security, once the compromised member updates).
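As a rough illustration of the framing header above, the `[method_id: u16][request_id: u32][payload_len: u32]` layout can be encoded and parsed as follows (byte order here is an assumption for illustration; the wire format reference is normative):

```rust
// Encode a request frame: 10-byte header followed by the Protobuf payload.
// Network byte order (big-endian) is assumed for this sketch.
fn encode_frame(method_id: u16, request_id: u32, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(10 + payload.len());
    out.extend_from_slice(&method_id.to_be_bytes());
    out.extend_from_slice(&request_id.to_be_bytes());
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(payload);
    out
}

// Parse the fixed-size header; the caller then reads payload_len more bytes.
fn decode_header(buf: &[u8]) -> Option<(u16, u32, u32)> {
    if buf.len() < 10 {
        return None;
    }
    let method_id = u16::from_be_bytes([buf[0], buf[1]]);
    let request_id = u32::from_be_bytes([buf[2], buf[3], buf[4], buf[5]]);
    let payload_len = u32::from_be_bytes([buf[6], buf[7], buf[8], buf[9]]);
    Some((method_id, request_id, payload_len))
}
```

The `request_id` is what lets responses be matched to requests when multiple RPCs are in flight on the same stream.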
@@ -39,7 +39,7 @@ Each layer addresses a distinct concern:
| Password auth | OPAQUE (password never sent to server) |
| Metadata protection | Sealed sender + message padding |
| Local storage | SQLCipher + Argon2id + ChaCha20-Poly1305 |
| Framing | Protobuf (prost) with custom binary header (method_id, request_id, length) |
For a deeper discussion of the cryptographic guarantees, threat model, and known gaps, see:
@@ -51,22 +51,20 @@ For a deeper discussion of the cryptographic guarantees, threat model, and known
## Who is this for?
**Security researchers** studying how MLS composes with QUIC transport and post-quantum hybrid KEM. The codebase spans 9 workspace crates with clear cryptographic boundaries for auditability.
**Protocol designers** evaluating MLS deployment patterns. quicproquo implements a concrete Authentication Service (AS) and Delivery Service (DS) pair, demonstrating single-use KeyPackage lifecycle, Welcome routing, and epoch advancement in a live system.
**Application developers** building on the platform via the Rust SDK:
- **`quicproquo-sdk`** -- `QpqClient` with async event streams and a `ConversationStore`
- **C FFI** -- cross-language integration via `quicproquo-plugin-api`
**Rust developers** looking for a working example of:
- `quinn` + `rustls` server/client setup with self-signed certificates
- Custom binary framing over QUIC bidirectional streams
- `openmls` group creation, member addition, and application message encryption
- `wasm-bindgen` for compiling Rust crypto to WebAssembly
- `zeroize`-on-drop key material handling
---
@@ -77,19 +75,13 @@ For a deeper discussion of the cryptographic guarantees, threat model, and known
|---|---|
| **[Comparison with Classical Protocols](design-rationale/protocol-comparison.md)** | **Why quicproquo? IRC+SSL, XMPP, Telegram vs. our design** |
| [Prerequisites](getting-started/prerequisites.md) | Toolchain and system dependencies |
| [Building from Source](getting-started/building.md) | `cargo build`, Protobuf codegen, troubleshooting |
| [Running the Server](getting-started/running-the-server.md) | Server startup, configuration, TLS cert generation |
| [Running the Client](getting-started/running-the-client.md) | All CLI subcommands with examples |
| [REPL Command Reference](getting-started/repl-reference.md) | Complete list of 40+ slash commands |
| [Rich Messaging](getting-started/rich-messaging.md) | Reactions, typing, read receipts, edit/delete |
| [File Transfer](getting-started/file-transfer.md) | Chunked upload/download with SHA-256 verification |
| [Mesh Networking](getting-started/mesh-networking.md) | P2P, broadcast channels, store-and-forward, federation |
| [Demo Walkthrough](getting-started/demo-walkthrough.md) | Step-by-step Alice-and-Bob narrative with sequence diagram |
| [Architecture Overview](architecture/overview.md) | Crate boundaries, service architecture, data flow |
| [Protocol Layers](protocol-layers/overview.md) | Deep dives into QUIC/TLS, Protobuf framing, MLS, Hybrid KEM |
| [Wire Format Reference](wire-format/overview.md) | Protobuf schema documentation and method ID table |
| [Cryptography](cryptography/overview.md) | Identity keys, key lifecycle, forward secrecy, PCS, threat model |
| [Design Rationale](design-rationale/overview.md) | ADRs and protocol design decisions |
| [Roadmap](roadmap/milestones.md) | Milestone tracker and future research directions |
@@ -99,26 +91,28 @@ For a deeper discussion of the cryptographic guarantees, threat model, and known
## Current status
quicproquo is a **research project** with production-grade features. It has
not been audited by a third party. The test suite covers 301 tests across
core, server, client, E2E, and P2P modules.
**What works today:**
- Full-featured REPL with 40+ commands: DMs, groups, reactions, typing,
edit/delete, file transfer, disappearing messages, safety numbers, MLS key
rotation, account deletion
- Go SDK, TypeScript SDK (WASM crypto + browser demo), C FFI + Python bindings
- Mesh networking: P2P via iroh, mDNS discovery, federation, store-and-forward,
broadcast channels
- Dynamic plugin system with 6 C-compatible hook points
- OPAQUE password authentication (register + login, 4-method handshake)
- 44 Protobuf RPC methods across 14 proto files and 9 workspace crates
- MLS group creation, member add, message encryption, and epoch advancement
- Hybrid X25519 + ML-KEM-768 key encapsulation for post-quantum readiness
- SQLCipher-backed local storage with Argon2id key derivation
- Key transparency (REVOKE, CHECK_REVOCATION, AUDIT)
- Multi-device management and push notification registration
- Blob storage (upload/download)
- Federation relay for cross-server message delivery
- Content moderation (report, ban, unban)
- Account recovery bundle store
**Known limitations:**
- MLS credentials use `CredentialType::Basic` (raw public key). A production system would bind credentials to a certificate authority or use X.509 certificates.
- The hybrid KEM envelope is implemented and tested, but not yet integrated into the OpenMLS CryptoProvider for full post-quantum MLS (planned for a future milestone).
- Browser connectivity requires a WebSocket-to-Protobuf bridge proxy (not yet included).
For the full milestone tracker, see [Milestones](roadmap/milestones.md).


@@ -0,0 +1,201 @@
# Backup and Restore Procedures
This document covers backup and restore for all quicproquo server data stores.
## Data Inventory
| Data | Location | Backend | Contains |
|------|----------|---------|----------|
| SQLCipher DB | `QPQ_DB_PATH` (default `data/qpq.db`) | `store_backend=sql` | Users, key packages, delivery queues, sessions, KT log, OPAQUE setup, blobs metadata, moderation |
| File store | `QPQ_DATA_DIR` (default `data/`) | `store_backend=file` | Bincode-serialized key packages, delivery queues, server state |
| Blob storage | `QPQ_DATA_DIR/blobs/` | Filesystem | Uploaded file transfer blobs |
| TLS certificates | `QPQ_TLS_CERT`, `QPQ_TLS_KEY` | DER files | Server identity |
| OPAQUE ServerSetup | Inside DB or file store | Persisted | OPAQUE credential state (critical for auth) |
| Server signing key | Inside DB or file store | Persisted | Ed25519 key for delivery proofs |
| KT Merkle log | Inside DB or file store | Persisted | Key transparency audit log |
## SQLCipher Backup
### Hot Backup (Online)
SQLCipher supports the `.backup` command while the server is running (WAL mode
allows concurrent readers).
```bash
# 1. Open the encrypted database with the sqlcipher shell
#    (a plain sqlite3 build cannot decrypt SQLCipher databases)
sqlcipher data/qpq.db
# 2. At the sqlcipher prompt, set the encryption key
PRAGMA key = 'your-db-key-here';
# 3. Perform an online backup
.backup /backups/qpq-$(date +%Y%m%d-%H%M%S).db
.quit
```
### Scripted Hot Backup
```bash
#!/bin/bash
set -euo pipefail
BACKUP_DIR="/backups/qpq"
DB_PATH="${QPQ_DB_PATH:-data/qpq.db}"
DB_KEY="${QPQ_DB_KEY}"
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
BACKUP_FILE="${BACKUP_DIR}/qpq-${TIMESTAMP}.db"
mkdir -p "$BACKUP_DIR"
sqlcipher "$DB_PATH" <<EOF
PRAGMA key = '${DB_KEY}';
.backup ${BACKUP_FILE}
EOF
# Verify the backup is readable (sqlcipher shell required to decrypt)
sqlcipher "$BACKUP_FILE" "PRAGMA key = '${DB_KEY}'; PRAGMA integrity_check;" \
  | grep -q "ok" && echo "Backup verified: $BACKUP_FILE" \
  || { echo "ERROR: backup verification failed"; exit 1; }
# Retain the last 7 days of backups
find "$BACKUP_DIR" -name 'qpq-*.db' -mtime +7 -delete
```
### Cold Backup (Offline)
```bash
# 1. Stop the server
systemctl stop qpq-server # or docker compose stop server
# 2. Copy the database file
cp data/qpq.db /backups/qpq-$(date +%Y%m%d).db
# 3. Copy the WAL and SHM files if they exist
cp data/qpq.db-wal /backups/ 2>/dev/null || true
cp data/qpq.db-shm /backups/ 2>/dev/null || true
# 4. Restart the server
systemctl start qpq-server
```
## File Backend Backup
When using `store_backend=file`, data is stored as bincode files under
`QPQ_DATA_DIR`.
```bash
# Full directory backup
tar czf /backups/qpq-data-$(date +%Y%m%d-%H%M%S).tar.gz \
-C "$(dirname "${QPQ_DATA_DIR:-data}")" \
"$(basename "${QPQ_DATA_DIR:-data}")"
```
## Blob Storage Backup
Blobs are stored in `QPQ_DATA_DIR/blobs/`. These are immutable once written.
```bash
# Incremental rsync (blobs are write-once, ideal for rsync)
rsync -av --progress data/blobs/ /backups/blobs/
```
## TLS Certificate Backup
```bash
# Back up TLS certificates (store separately from DB backups)
cp data/server-cert.der /backups/tls/server-cert.der
cp data/server-key.der /backups/tls/server-key.der
# Federation certs (if federation is enabled)
cp data/federation-cert.der /backups/tls/federation-cert.der 2>/dev/null || true
cp data/federation-key.der /backups/tls/federation-key.der 2>/dev/null || true
cp data/federation-ca.der /backups/tls/federation-ca.der 2>/dev/null || true
```
## Restore Procedures
### Restore SQLCipher Database
```bash
# 1. Stop the server
systemctl stop qpq-server
# 2. Move the current (corrupt/lost) database aside
mv data/qpq.db data/qpq.db.broken 2>/dev/null || true
rm -f data/qpq.db-wal data/qpq.db-shm
# 3. Copy the backup in place
cp /backups/qpq-20260304.db data/qpq.db
# 4. Verify integrity
sqlcipher data/qpq.db "PRAGMA key = '${QPQ_DB_KEY}'; PRAGMA integrity_check;"
# 5. Start the server (migrations will apply automatically if needed)
systemctl start qpq-server
```
### Restore File Backend
```bash
# 1. Stop the server
systemctl stop qpq-server
# 2. Replace the data directory
mv data data.broken 2>/dev/null || true
tar xzf /backups/qpq-data-20260304.tar.gz -C .
# 3. Restore TLS certs if not included in the data backup
cp /backups/tls/server-cert.der data/server-cert.der
cp /backups/tls/server-key.der data/server-key.der
# 4. Start the server
systemctl start qpq-server
```
### Restore Blobs Only
```bash
rsync -av /backups/blobs/ data/blobs/
```
## Backup Schedule Recommendations
| Frequency | What | Method |
|-----------|------|--------|
| Every 6 hours | SQLCipher database | Hot backup script via cron |
| Daily | File backend / full data dir | tar + offsite copy |
| Continuous | Blobs | rsync (incremental) |
| On change | TLS certificates | Manual + secret manager |
## Cron Example
```cron
# SQLCipher hot backup every 6 hours
0 */6 * * * /opt/qpq/scripts/backup-db.sh >> /var/log/qpq-backup.log 2>&1
# Full data directory daily at 02:00
0 2 * * * tar czf /backups/qpq-data-$(date +\%Y\%m\%d).tar.gz -C /var/lib quicproquo
# Blob sync every hour
0 * * * * rsync -a /var/lib/quicproquo/blobs/ /backups/blobs/
# Prune backups older than 30 days
0 3 * * 0 find /backups -name 'qpq-*' -mtime +30 -delete
```
## Verification
Always verify backups after creation:
```bash
# SQLCipher integrity check (use the sqlcipher shell; plain sqlite3 cannot decrypt)
sqlcipher /backups/qpq-latest.db \
  "PRAGMA key = '${QPQ_DB_KEY}'; PRAGMA integrity_check; SELECT count(*) FROM users;"
# File backend: check the archive is valid
tar tzf /backups/qpq-data-latest.tar.gz > /dev/null
# TLS cert: check it parses and is not expired
openssl x509 -inform DER -in /backups/tls/server-cert.der -noout -dates
```


@@ -0,0 +1,233 @@
# Monitoring Guide
This document covers metrics collection, alerting, and dashboards for
quicproquo server deployments.
## Enabling Metrics
The server exports Prometheus metrics via HTTP when configured:
```bash
# Environment variables
QPQ_METRICS_LISTEN=0.0.0.0:9090
QPQ_METRICS_ENABLED=true
# Or in qpq-server.toml
metrics_listen = "0.0.0.0:9090"
metrics_enabled = true
```
Metrics are served at `http://<metrics_listen>/metrics` in Prometheus
exposition format.
## Available Metrics
### Counters
| Metric | Description | Labels |
|--------|-------------|--------|
| `enqueue_total` | Total messages enqueued | - |
| `enqueue_bytes_total` | Total bytes enqueued | - |
| `fetch_total` | Total message fetches completed | - |
| `fetch_wait_total` | Total long-poll fetch waits | - |
| `key_package_upload_total` | Total MLS key package uploads | - |
| `auth_login_success_total` | Successful OPAQUE login completions | - |
| `auth_login_failure_total` | Failed login attempts | - |
| `rate_limit_hit_total` | Rate limit rejections | - |
### Gauges
| Metric | Description |
|--------|-------------|
| `delivery_queue_depth` | Current delivery queue depth (sampled) |
## Prometheus Configuration
```yaml
# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'qpq-server'
    static_configs:
      - targets: ['qpq-server:9090']
    scrape_interval: 10s
```
## Alert Rules
```yaml
# prometheus-alerts.yml
groups:
  - name: qpq-server
    rules:
      # Server down
      - alert: QpqServerDown
        expr: up{job="qpq-server"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "qpq-server is down"
          description: "Prometheus cannot scrape qpq-server metrics for > 1 minute."
      # High auth failure rate (potential brute force)
      - alert: QpqHighAuthFailureRate
        expr: rate(auth_login_failure_total[5m]) > 10
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High authentication failure rate"
          description: "{{ $value | printf \"%.1f\" }} auth failures/sec over 5 minutes."
      # Rate limiting active
      - alert: QpqRateLimitActive
        expr: rate(rate_limit_hit_total[5m]) > 5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Rate limiting is actively rejecting requests"
          description: "{{ $value | printf \"%.1f\" }} rate limit hits/sec."
      # Delivery queue growing
      - alert: QpqDeliveryQueueHigh
        expr: delivery_queue_depth > 10000
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Delivery queue depth is high"
          description: "Queue depth: {{ $value }}. Clients may not be fetching."
      - alert: QpqDeliveryQueueCritical
        expr: delivery_queue_depth > 100000
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Delivery queue depth is critical"
          description: "Queue depth: {{ $value }}. Investigate immediately."
      # No enqueue activity (service may be stuck)
      - alert: QpqNoEnqueueActivity
        expr: rate(enqueue_total[15m]) == 0
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "No messages enqueued in 30 minutes"
          description: "Check if the service is accepting connections."
      # Auth success ratio too low
      - alert: QpqLowAuthSuccessRatio
        expr: >
          rate(auth_login_success_total[5m])
          / (rate(auth_login_success_total[5m]) + rate(auth_login_failure_total[5m]))
          < 0.5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Auth success ratio below 50%"
          description: "More than half of login attempts are failing."
```
## Key Dashboard Panels
See `dashboards/qpq-overview.json` for the full Grafana dashboard. Key panels:
### Message Throughput
- **Enqueue rate**: `rate(enqueue_total[5m])`
- **Fetch rate**: `rate(fetch_total[5m])`
- **Enqueue bandwidth**: `rate(enqueue_bytes_total[5m])`
### Authentication
- **Login success rate**: `rate(auth_login_success_total[5m])`
- **Login failure rate**: `rate(auth_login_failure_total[5m])`
- **Success ratio**: `rate(auth_login_success_total[5m]) / (rate(auth_login_success_total[5m]) + rate(auth_login_failure_total[5m]))`
### Delivery Queue
- **Queue depth**: `delivery_queue_depth`
- **Queue growth rate**: `deriv(delivery_queue_depth[10m])`
### Rate Limiting
- **Rate limit hits**: `rate(rate_limit_hit_total[5m])`
### Infrastructure (Node Exporter)
- CPU, memory, disk, network from `node_exporter`
## Grafana Dashboard
Import the dashboard from `dashboards/qpq-overview.json`:
1. Open Grafana -> Dashboards -> Import
2. Upload `docs/operations/dashboards/qpq-overview.json`
3. Select your Prometheus data source
4. Save
## Log Monitoring
The server uses `tracing`, configured via the `RUST_LOG` environment variable:
```bash
# Production: info level with structured JSON output
RUST_LOG=info
# Debug specific modules
RUST_LOG=info,quicproquo_server::node_service=debug
# Verbose debugging
RUST_LOG=debug
```
### Key Log Messages to Monitor
| Log Pattern | Meaning | Action |
|-------------|---------|--------|
| `"TLS certificate expires within 30 days"` | Cert expiring soon | Rotate certificate |
| `"TLS certificate is self-signed"` | Self-signed cert in use | Replace with CA-signed cert in production |
| `"connection rate limit exceeded"` | IP being rate limited | Check for DDoS |
| `"running without QPQ_AUTH_TOKEN"` | Insecure mode | Must not appear in production |
| `"db_key is empty; SQL store will be plaintext"` | Unencrypted DB | Must not appear in production |
| `"shutdown signal received"` | Graceful shutdown started | Expected during deploys |
| `"generated and persisted new OPAQUE ServerSetup"` | Fresh OPAQUE setup | Expected on first start only |
### Log Aggregation
For production, pipe logs to a log aggregator:
```bash
# Systemd -> journald -> Loki/Elasticsearch
journalctl -u qpq-server -f --output=json | \
promtail --stdin --client.url=http://loki:3100/loki/api/v1/push
# Docker -> Loki driver
docker run --log-driver=loki \
--log-opt loki-url="http://loki:3100/loki/api/v1/push" \
qpq-server
```
## Health Checking
The Docker image includes a basic health check (TLS cert file exists). For
deeper health checks:
```bash
# Simple: check the process is running and port is open
ss -ulnp | grep 5001
# Metrics endpoint (if enabled)
curl -sf http://localhost:9090/metrics > /dev/null
# Full client connection test
qpq-client --server 127.0.0.1:5001 --ping
```


@@ -0,0 +1,251 @@
# Scaling Guide
This document covers resource sizing, scaling triggers, and capacity planning
for quicproquo deployments.
## Architecture Overview
quicproquo runs as a single-process server handling QUIC connections. Key
resource consumers:
- **CPU**: TLS 1.3 handshakes (QUIC), OPAQUE PAKE authentication, message routing
- **Memory**: In-memory session state (DashMap), QUIC connection state, delivery waiters, rate limit entries
- **Disk I/O**: SQLCipher reads/writes (WAL mode), blob storage, KT Merkle log
- **Network**: QUIC (UDP), metrics HTTP, optional WebSocket bridge
## Single-Node Sizing
### Minimum (Development / Small Team)
| Resource | Value |
|----------|-------|
| CPU | 1 vCPU |
| Memory | 512 MB |
| Disk | 10 GB SSD |
| Network | 100 Mbps |
Supports approximately 100 concurrent users with light message traffic.
### Recommended (Production / Small-Medium)
| Resource | Value |
|----------|-------|
| CPU | 2-4 vCPU |
| Memory | 2-4 GB |
| Disk | 50-100 GB NVMe SSD |
| Network | 1 Gbps |
Supports approximately 1,000-5,000 concurrent users.
### Large (High Traffic)
| Resource | Value |
|----------|-------|
| CPU | 8+ vCPU |
| Memory | 8-16 GB |
| Disk | 500 GB+ NVMe SSD (RAID 10) |
| Network | 10 Gbps |
Supports approximately 10,000+ concurrent users.
## Scaling Triggers
Monitor these metrics and scale when thresholds are exceeded:
| Metric | Warning | Critical | Action |
|--------|---------|----------|--------|
| CPU usage | > 70% sustained (5 min) | > 90% sustained | Add CPU or scale horizontally |
| Memory usage | > 75% | > 90% | Increase memory, check for leaks |
| Disk usage | > 70% | > 90% | Expand volume, clean old data |
| Disk I/O latency | > 5 ms p95 | > 20 ms p95 | Move to faster storage |
| `delivery_queue_depth` | > 10,000 | > 100,000 | Investigate stale queues |
| `rate_limit_hit_total` rate | > 100/min | > 1000/min | Investigate abuse, adjust limits |
| `auth_login_failure_total` rate | > 50/min | > 500/min | Potential brute force attack |
| Connection count | > 80% of `max_concurrent_bidi_streams` | > 95% | Scale horizontally |
| TLS handshake latency | > 100 ms p95 | > 500 ms p95 | Add CPU, check network |
## Vertical Scaling
### CPU Scaling
The server is async (Tokio) and benefits from multiple cores. QUIC TLS
handshakes and OPAQUE computations are CPU-intensive.
```bash
# Check current CPU usage
top -bn1 -p $(pgrep qpq-server)
# For Docker: increase CPU limits in docker-compose.prod.yml
# deploy:
# resources:
# limits:
# cpus: '4'
```
### Memory Scaling
In-memory state scales linearly with concurrent connections:
- ~2-5 KB per active QUIC connection (quinn state)
- ~200 bytes per session entry (DashMap)
- ~100 bytes per rate limit entry
- ~100 bytes per delivery waiter
```bash
# Estimate memory for 10,000 connections:
# 10,000 * 5 KB = ~50 MB for connections
# 10,000 * 500 bytes = ~5 MB for sessions/rate limits
# SQLCipher connection pool: ~50 MB (4 connections, caches)
# Base process: ~30 MB
# Total: ~135 MB + headroom = 256-512 MB minimum
```
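The per-connection figures above fold into a quick estimator. A minimal sketch (the helper name is hypothetical, not part of the codebase; it uses the upper-bound numbers from the list):

```rust
/// Rough server memory estimate for N concurrent connections, using the
/// per-entry upper bounds quoted above plus fixed pool/base overheads.
fn estimated_memory_bytes(connections: u64) -> u64 {
    const CONN_STATE: u64 = 5 * 1024;        // ~5 KB quinn state per connection
    const SESSION_ENTRY: u64 = 200;          // DashMap session entry
    const RATE_LIMIT_ENTRY: u64 = 100;       // rate limit entry
    const DELIVERY_WAITER: u64 = 100;        // delivery waiter
    const SQL_POOL: u64 = 50 * 1024 * 1024;  // SQLCipher pool + caches
    const BASE: u64 = 30 * 1024 * 1024;      // base process
    connections * (CONN_STATE + SESSION_ENTRY + RATE_LIMIT_ENTRY + DELIVERY_WAITER)
        + SQL_POOL
        + BASE
}
```

For 10,000 connections this lands within a few MB of the ~135 MB worked example above; provision with generous headroom on top of it.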
### Disk I/O Scaling
SQLCipher uses WAL mode for concurrent reads. For write-heavy workloads:
```bash
# Check current I/O
iostat -x 1 5
# Increase WAL autocheckpoint threshold for burst writes
sqlite3 data/qpq.db "PRAGMA key='${QPQ_DB_KEY}'; PRAGMA wal_autocheckpoint=2000;"
```
## Horizontal Scaling
quicproquo does not yet have built-in multi-node clustering. For horizontal
scaling, use these patterns:
### Load Balancer (UDP/QUIC)
Place a UDP load balancer in front of multiple qpq-server instances. Each
instance runs independently with its own database.
```
                +-----------+
clients ------> |   L4 LB   | ----> qpq-server-1 (db-1)
                | (UDP/QUIC)| ----> qpq-server-2 (db-2)
                +-----------+ ----> qpq-server-3 (db-3)
```
**Requirements:**
- Sticky sessions (by client IP or QUIC connection ID) so a client always
reaches the same node.
- Shared storage backend or federation between nodes.
### Federation for Multi-Node
Enable federation to relay messages between nodes:
```toml
# qpq-server.toml on node-1
[federation]
enabled = true
domain = "node1.chat.example.com"
listen = "0.0.0.0:7001"
federation_cert = "data/federation-cert.der"
federation_key = "data/federation-key.der"
federation_ca = "data/federation-ca.der"
[[federation.peers]]
domain = "node2.chat.example.com"
address = "10.0.1.2:7001"
```
### Shared Database (Future)
For true horizontal scaling, migrating from SQLCipher to a shared PostgreSQL
instance is the planned approach. This is not yet implemented.
```
qpq-server-1 --\
qpq-server-2 ---+--> PostgreSQL (shared)
qpq-server-3 --/
```
## Connection Tuning
The server has these QUIC transport defaults:
| Parameter | Default | Tunable |
|-----------|---------|---------|
| Max idle timeout | 300s (5 min) | Code change required |
| Max concurrent bidi streams | 1 per connection | Code change required |
| SQLCipher connection pool | 4 connections | Code change required |
For high connection counts:
```bash
# Increase OS file descriptor limit
ulimit -n 65536
# Increase UDP buffer sizes in /etc/sysctl.d/99-qpq.conf
net.core.rmem_max = 26214400
net.core.wmem_max = 26214400
net.core.rmem_default = 1048576
net.core.wmem_default = 1048576
```
```bash
sysctl -p /etc/sysctl.d/99-qpq.conf
```
## Docker Resource Limits
```yaml
# docker-compose.prod.yml
services:
  server:
    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 4G
        reservations:
          cpus: '2'
          memory: 1G
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
```
## Load Testing
Use the included test infrastructure to benchmark:
```bash
# Build the test client
cargo build --release --bin qpq-client
# Run concurrent connection test (example)
for i in $(seq 1 100); do
qpq-client --server 127.0.0.1:5001 &
done
wait
# Monitor during load test
watch -n1 'curl -s http://localhost:9090/metrics | grep -E "enqueue_total|fetch_total|delivery_queue_depth|rate_limit"'
```
## Capacity Planning Worksheet
| Parameter | Your Value |
|-----------|-----------|
| Expected concurrent users | |
| Messages per user per hour | |
| Average message size (bytes) | |
| Blob uploads per day | |
| Average blob size (MB) | |
| Data retention (days) | |
**Formulas:**
```
Storage per day = (users * msgs/hr * 24 * avg_msg_size) + (blob_uploads * avg_blob_size)
DB growth per month = storage_per_day * 30
Memory estimate = (concurrent_users * 5 KB) + 256 MB base
CPU estimate = 1 vCPU per ~2,500 concurrent connections (depends on message rate)
```
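Translated directly into code, the worksheet formulas look like this (a sketch; the parameter names mirror the worksheet and nothing here exists in the codebase):

```rust
/// Daily storage estimate from the worksheet formulas, in bytes.
fn storage_per_day_bytes(
    users: u64,
    msgs_per_user_per_hour: u64,
    avg_msg_bytes: u64,
    blob_uploads_per_day: u64,
    avg_blob_bytes: u64,
) -> u64 {
    users * msgs_per_user_per_hour * 24 * avg_msg_bytes
        + blob_uploads_per_day * avg_blob_bytes
}

/// Memory estimate: ~5 KB per concurrent user + 256 MB base.
fn memory_estimate_bytes(concurrent_users: u64) -> u64 {
    concurrent_users * 5 * 1024 + 256 * 1024 * 1024
}
```

For example, 1,000 users sending 10 one-KiB messages per hour plus 200 two-MiB blob uploads per day works out to roughly 665 MB of new storage per day.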


@@ -1,264 +1,299 @@
# Protobuf Framing

quicproquo v2 uses a custom binary framing protocol layered over QUIC bidirectional streams. Message payloads are serialised with Protocol Buffers (Protobuf) via the `prost` crate. The framing layer (implemented in `quicproquo-rpc`) adds a compact fixed-size header that carries the method ID, request correlation ID, and payload length -- enabling zero-copy dispatch without a separate length-delimited codec.

This page covers the three frame types, the method ID dispatch table, status codes, push event delivery, and the Protobuf schema organisation.

---

## Frame Types
There are three frame types in the v2 protocol. All multi-byte integers are **big-endian** (network byte order).

### Request Frame (client -> server)
Sent on a QUIC bidirectional stream (one stream per RPC call):

```
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       method_id (u16 BE)      |      request_id (u32 BE)      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      request_id (cont.)       |      payload_len (u32 BE)     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      payload_len (cont.)      |      protobuf payload ...     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
| Field | Type | Bytes | Description |
|---|---|---|---|
| `method_id` | `u16` | 0-1 | RPC method identifier (see method IDs table) |
| `request_id` | `u32` | 2-5 | Client-generated correlation ID; echoed back in the response |
| `payload_len` | `u32` | 6-9 | Length of the Protobuf payload in bytes |
| payload | bytes | 10+ | Protobuf-encoded request message |
Header size: **10 bytes**. Maximum payload: **4 MiB**.
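The 10-byte header is assembled with plain big-endian writes. A minimal sketch of the layout described above (`encode_request` is an illustrative name; the real `quicproquo-rpc` encoder may differ in naming and error handling):

```rust
/// Build a v2 request frame: 10-byte big-endian header + Protobuf payload.
/// Illustrative sketch only, not the real `quicproquo-rpc` API.
fn encode_request(method_id: u16, request_id: u32, payload: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(10 + payload.len());
    frame.extend_from_slice(&method_id.to_be_bytes());              // bytes 0-1
    frame.extend_from_slice(&request_id.to_be_bytes());             // bytes 2-5
    frame.extend_from_slice(&(payload.len() as u32).to_be_bytes()); // bytes 6-9
    frame.extend_from_slice(payload);                               // bytes 10+
    frame
}
```

A real encoder would also reject payloads larger than the 4 MiB cap before writing the frame to the stream.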
### Response Frame (server -> client)
Sent on the same QUIC bidirectional stream as the request:
```
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  status (u8)  |          request_id (u32 BE)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      request_id (cont.)       |      payload_len (u32 BE)     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      payload_len (cont.)      |      protobuf payload ...     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
| Field | Type | Bytes | Description |
|---|---|---|---|
| `status` | `u8` | 0 | Status code (see status codes table) |
| `request_id` | `u32` | 1-4 | Echoes the `request_id` from the request frame |
| `payload_len` | `u32` | 5-8 | Length of the Protobuf payload in bytes |
| payload | bytes | 9+ | Protobuf-encoded response message (may be empty on error) |
Header size: **9 bytes**.
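Decoding the 9-byte response header is the mirror image. A sketch (illustrative names, not the real decoder):

```rust
/// Split a v2 response frame header into (status, request_id, payload_len).
/// Returns None if the buffer is shorter than the 9-byte header.
fn parse_response_header(buf: &[u8]) -> Option<(u8, u32, u32)> {
    if buf.len() < 9 {
        return None;
    }
    let status = buf[0];                                              // byte 0
    let request_id = u32::from_be_bytes(buf[1..5].try_into().ok()?);  // bytes 1-4
    let payload_len = u32::from_be_bytes(buf[5..9].try_into().ok()?); // bytes 5-8
    Some((status, request_id, payload_len))
}
```

The client matches the returned `request_id` against its pending call before decoding the Protobuf payload.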
### Push Frame (server -> client, uni-stream)
Sent by the server on QUIC uni-directional streams for real-time event delivery. No request ID -- push frames are not correlated to any client request.
```
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      event_type (u16 BE)      |      payload_len (u32 BE)     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      payload_len (cont.)      |      protobuf payload ...     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
| Field | Type | Bytes | Description |
|---|---|---|---|
| `event_type` | `u16` | 0-1 | Push event type (see push event types table) |
| `payload_len` | `u32` | 2-5 | Length of the Protobuf payload in bytes |
| payload | bytes | 6+ | Protobuf-encoded push event message |
Header size: **6 bytes**.
---
## Status Codes
The `status` byte in a Response frame carries one of the following values:
| Value | `RpcStatus` variant | Meaning |
|-------|---------------------|---------|
| 0 | `Ok` | Success. Response payload contains the result. |
| 1 | `BadRequest` | Malformed request, missing required field, or failed validation. |
| 2 | `Unauthorized` | Missing or invalid session token. |
| 3 | `Forbidden` | Valid token but insufficient permissions for this operation. |
| 4 | `NotFound` | Requested resource does not exist (e.g., KeyPackage not found). |
| 5 | `RateLimited` | Request rate limit exceeded. Client should back off before retrying. |
| 8 | `DeadlineExceeded` | Server could not complete the request within the configured deadline. |
| 9 | `Unavailable` | Server temporarily unable to serve the request (e.g., storage unavailable). |
| 10 | `Internal` | Unexpected server-side error. |
| 11 | `UnknownMethod` | The `method_id` in the request is not registered. |
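The table maps onto a small enum. A sketch of the client-side conversion (the names are assumed; note values 6 and 7 are left unassigned, as in the table):

```rust
/// Status byte -> RpcStatus, per the table above. Unknown bytes
/// (including the unassigned 6 and 7) map to None.
#[derive(Debug, PartialEq)]
enum RpcStatus {
    Ok,
    BadRequest,
    Unauthorized,
    Forbidden,
    NotFound,
    RateLimited,
    DeadlineExceeded,
    Unavailable,
    Internal,
    UnknownMethod,
}

fn status_from_byte(b: u8) -> Option<RpcStatus> {
    Some(match b {
        0 => RpcStatus::Ok,
        1 => RpcStatus::BadRequest,
        2 => RpcStatus::Unauthorized,
        3 => RpcStatus::Forbidden,
        4 => RpcStatus::NotFound,
        5 => RpcStatus::RateLimited,
        8 => RpcStatus::DeadlineExceeded,
        9 => RpcStatus::Unavailable,
        10 => RpcStatus::Internal,
        11 => RpcStatus::UnknownMethod,
        _ => return None,
    })
}
```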
---
## Method IDs
All 49 RPC method IDs are defined in `crates/quicproquo-proto/src/lib.rs` in the `method_ids` module. The numeric ranges group related methods by service category.
### Auth (100-103)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 100 | `OpaqueRegisterStart` | `OpaqueRegisterStartRequest` | `OpaqueRegisterStartResponse` |
| 101 | `OpaqueRegisterFinish` | `OpaqueRegisterFinishRequest` | `OpaqueRegisterFinishResponse` |
| 102 | `OpaqueLoginStart` | `OpaqueLoginStartRequest` | `OpaqueLoginStartResponse` |
| 103 | `OpaqueLoginFinish` | `OpaqueLoginFinishRequest` | `OpaqueLoginFinishResponse` |
### Delivery (200-205)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 200 | `Enqueue` | `EnqueueRequest` | `EnqueueResponse` |
| 201 | `Fetch` | `FetchRequest` | `FetchResponse` |
| 202 | `FetchWait` | `FetchWaitRequest` | `FetchWaitResponse` |
| 203 | `Peek` | `PeekRequest` | `PeekResponse` |
| 204 | `Ack` | `AckRequest` | `AckResponse` |
| 205 | `BatchEnqueue` | `BatchEnqueueRequest` | `BatchEnqueueResponse` |
### Keys (300-304)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 300 | `UploadKeyPackage` | `UploadKeyPackageRequest` | `UploadKeyPackageResponse` |
| 301 | `FetchKeyPackage` | `FetchKeyPackageRequest` | `FetchKeyPackageResponse` |
| 302 | `UploadHybridKey` | `UploadHybridKeyRequest` | `UploadHybridKeyResponse` |
| 303 | `FetchHybridKey` | `FetchHybridKeyRequest` | `FetchHybridKeyResponse` |
| 304 | `FetchHybridKeys` | `FetchHybridKeysRequest` | `FetchHybridKeysResponse` |
### Channel (400)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 400 | `CreateChannel` | `CreateChannelRequest` | `CreateChannelResponse` |
### Group Management (410-413)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 410 | `RemoveMember` | `RemoveMemberRequest` | `RemoveMemberResponse` |
| 411 | `UpdateGroupMetadata` | `UpdateGroupMetadataRequest` | `UpdateGroupMetadataResponse` |
| 412 | `ListGroupMembers` | `ListGroupMembersRequest` | `ListGroupMembersResponse` |
| 413 | `RotateKeys` | `RotateKeysRequest` | `RotateKeysResponse` |
### Moderation (420-424)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 420 | `ReportMessage` | `ReportMessageRequest` | `ReportMessageResponse` |
| 421 | `BanUser` | `BanUserRequest` | `BanUserResponse` |
| 422 | `UnbanUser` | `UnbanUserRequest` | `UnbanUserResponse` |
| 423 | `ListReports` | `ListReportsRequest` | `ListReportsResponse` |
| 424 | `ListBanned` | `ListBannedRequest` | `ListBannedResponse` |
### User / Identity (500-501)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 500 | `ResolveUser` | `ResolveUserRequest` | `ResolveUserResponse` |
| 501 | `ResolveIdentity` | `ResolveIdentityRequest` | `ResolveIdentityResponse` |
### Key Transparency (510-520)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 510 | `RevokeKey` | `RevokeKeyRequest` | `RevokeKeyResponse` |
| 511 | `CheckRevocation` | `CheckRevocationRequest` | `CheckRevocationResponse` |
| 520 | `AuditKeyTransparency` | `AuditKeyTransparencyRequest` | `AuditKeyTransparencyResponse` |
### Blob Storage (600-601)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 600 | `UploadBlob` | `UploadBlobRequest` | `UploadBlobResponse` |
| 601 | `DownloadBlob` | `DownloadBlobRequest` | `DownloadBlobResponse` |
### Device Management (700-710)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 700 | `RegisterDevice` | `RegisterDeviceRequest` | `RegisterDeviceResponse` |
| 701 | `ListDevices` | `ListDevicesRequest` | `ListDevicesResponse` |
| 702 | `RevokeDevice` | `RevokeDeviceRequest` | `RevokeDeviceResponse` |
| 710 | `RegisterPushToken` | `RegisterPushTokenRequest` | `RegisterPushTokenResponse` |
### Recovery (750-752)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 750 | `StoreRecoveryBundle` | `StoreRecoveryBundleRequest` | `StoreRecoveryBundleResponse` |
| 751 | `FetchRecoveryBundle` | `FetchRecoveryBundleRequest` | `FetchRecoveryBundleResponse` |
| 752 | `DeleteRecoveryBundle` | `DeleteRecoveryBundleRequest` | `DeleteRecoveryBundleResponse` |
### P2P and Health (800-802)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 800 | `PublishEndpoint` | `PublishEndpointRequest` | `PublishEndpointResponse` |
| 801 | `ResolveEndpoint` | `ResolveEndpointRequest` | `ResolveEndpointResponse` |
| 802 | `Health` | `HealthRequest` | `HealthResponse` |
### Federation (900-905)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 900 | `RelayEnqueue` | `RelayEnqueueRequest` | `RelayEnqueueResponse` |
| 901 | `RelayBatchEnqueue` | `RelayBatchEnqueueRequest` | `RelayBatchEnqueueResponse` |
| 902 | `ProxyFetchKeyPackage` | `ProxyFetchKeyPackageRequest` | `ProxyFetchKeyPackageResponse` |
| 903 | `ProxyFetchHybridKey` | `ProxyFetchHybridKeyRequest` | `ProxyFetchHybridKeyResponse` |
| 904 | `ProxyResolveUser` | `ProxyResolveUserRequest` | `ProxyResolveUserResponse` |
| 905 | `FederationHealth` | `FederationHealthRequest` | `FederationHealthResponse` |
### Account (950)
| ID | Method | Request type | Response type |
|---|---|---|---|
| 950 | `DeleteAccount` | `DeleteAccountRequest` | `DeleteAccountResponse` |
---
## Push Event Types
Server-to-client push events are delivered on QUIC uni-streams using the Push frame format. Event types are defined alongside method IDs in `quicproquo-proto/src/lib.rs`:
| Value | Event | Description |
|-------|-------|-------------|
| 1000 | `PushNewMessage` | A new message has been enqueued for the client. |
| 1001 | `PushTyping` | A group member has started or stopped typing. |
| 1002 | `PushPresence` | A contact's presence status has changed (online/offline). |
| 1003 | `PushMembership` | Group membership changed (member added or removed). |
Push events avoid the need for the client to long-poll `FetchWait` (202) for real-time delivery. The client can listen on a background task for incoming uni-streams and process push events independently of pending RPC calls.
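Under the same assumptions as the frame sketches above (illustrative names, not the real API), the uni-stream handler boils down to parsing the 6-byte push header and dispatching on `event_type`:

```rust
#[derive(Debug, PartialEq)]
enum PushEvent {
    NewMessage, // 1000
    Typing,     // 1001
    Presence,   // 1002
    Membership, // 1003
}

/// Parse the 6-byte push frame header: event_type (u16 BE) + payload_len (u32 BE).
/// Returns the decoded event type and the declared payload length.
fn parse_push_header(buf: &[u8]) -> Option<(PushEvent, u32)> {
    if buf.len() < 6 {
        return None;
    }
    let event = match u16::from_be_bytes([buf[0], buf[1]]) {
        1000 => PushEvent::NewMessage,
        1001 => PushEvent::Typing,
        1002 => PushEvent::Presence,
        1003 => PushEvent::Membership,
        _ => return None, // unknown event types: skip for forward compatibility
    };
    let payload_len = u32::from_be_bytes([buf[2], buf[3], buf[4], buf[5]]);
    Some((event, payload_len))
}
```

Returning `None` for unrecognised event types lets a client skip events introduced by a newer server without tearing down the connection.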
---
## Stream Model
Each RPC call uses a **dedicated QUIC bidirectional stream**:
1. Client opens a new bidirectional stream (`connection.open_bi()`).
2. Client encodes the request into a `RequestFrame` and writes it to the send half.
3. Client closes the send half (marks end-of-write).
4. Server reads the complete `RequestFrame` from the receive half.
5. Server processes the request and writes a `ResponseFrame` to its send half.
6. Server closes the send half.
7. Client reads the complete `ResponseFrame`.
This allows many concurrent RPCs on a single QUIC connection without head-of-line blocking.
---
## Protobuf Schema Organisation
All message types are defined in `proto/qpq/v1/`:
| File | Contents |
|---|---|
| `auth.proto` | OPAQUE registration and login message types |
| `common.proto` | Auth context, account deletion, shared types |
| `delivery.proto` | Enqueue, Fetch, Peek, Ack, BatchEnqueue |
| `keys.proto` | MLS key packages, hybrid keys |
| `channel.proto` | Channel creation |
| `group.proto` | Group management (remove member, metadata, rotate keys) |
| `moderation.proto` | Report, ban, unban, list |
| `user.proto` | User and identity resolution |
| `kt.proto` | Key transparency (revoke, check, audit) |
| `blob.proto` | Binary object storage |
| `device.proto` | Multi-device management, push tokens |
| `recovery.proto` | Account recovery bundles |
| `p2p.proto` | P2P endpoints, health |
| `federation.proto` | Cross-server relay |
All `.proto` files use `package qpq.v1;` and are compiled to Rust at build time using `prost-build` via the `quicproquo-proto` crate's `build.rs`. The `protobuf-src` crate vendors `protoc`, so no system-wide `protoc` installation is required.
Generated Rust types are accessed via:
```rust
use quicproquo_proto::qpq::v1::{EnqueueRequest, FetchResponse, /* ... */};
use quicproquo_proto::method_ids::{ENQUEUE, FETCH, /* ... */};
```
---
## Design Constraints of `quicproquo-proto`
### Generated module inclusion
The generated code is spliced into the `quicproquo-proto` crate via `include!` macros:
```rust
pub mod envelope_capnp {
include!(concat!(env!("OUT_DIR"), "/envelope_capnp.rs"));
}
pub mod auth_capnp {
include!(concat!(env!("OUT_DIR"), "/auth_capnp.rs"));
}
pub mod delivery_capnp {
include!(concat!(env!("OUT_DIR"), "/delivery_capnp.rs"));
}
pub mod node_capnp {
include!(concat!(env!("OUT_DIR"), "/node_capnp.rs"));
}
```
Consumers import types from these modules. For example, `node_capnp::node_service::Server` is the trait that the server implements.
## The Envelope schema
The `Envelope` is the top-level wire message for all quicproquo traffic. Every frame exchanged between peers is serialised as an Envelope:
```capnp
struct Envelope {
msgType @0 :MsgType;
groupId @1 :Data; # 32-byte SHA-256 digest of group name
senderId @2 :Data; # 32-byte SHA-256 digest of Ed25519 pubkey
payload @3 :Data; # Opaque payload (MLS blob or control data)
timestampMs @4 :UInt64; # Unix epoch milliseconds
enum MsgType {
ping @0;
pong @1;
keyPackageUpload @2;
keyPackageFetch @3;
keyPackageResponse @4;
mlsWelcome @5;
mlsCommit @6;
mlsApplication @7;
error @8;
}
}
```
The Delivery Service routes by `(groupId, msgType)` without inspecting `payload`. This design keeps the DS MLS-unaware -- see [ADR-004: MLS-Unaware Delivery Service](../design-rationale/adr-004-mls-unaware-ds.md).
## The `ParsedEnvelope` owned type
Cap'n Proto readers (`envelope_capnp::envelope::Reader`) borrow from the original byte buffer and cannot be sent across async task boundaries (`!Send`). This is a fundamental limitation of zero-copy reads.
To bridge this gap, `quicproquo-proto` defines `ParsedEnvelope`:
```rust
pub struct ParsedEnvelope {
pub msg_type: MsgType,
pub group_id: Vec<u8>,
pub sender_id: Vec<u8>,
pub payload: Vec<u8>,
pub timestamp_ms: u64,
}
```
`ParsedEnvelope` eagerly copies all byte fields out of the Cap'n Proto reader, making the type `Send + 'static`. This allows it to cross Tokio task boundaries, be stored in queues, and be passed through channels.
The trade-off is clear: `ParsedEnvelope` allocates and copies, defeating the zero-copy benefit. This is acceptable because:
1. The copying happens once per message at the protocol boundary.
2. Application-layer code (MLS encryption/decryption, routing) needs owned data anyway.
3. The performance-critical path (Delivery Service routing) works with opaque `Vec<u8>` payloads, not parsed Cap'n Proto readers.
### Invariants
- `group_id` and `sender_id` are either empty (for control messages like Ping/Pong) or exactly 32 bytes (SHA-256 digest).
- `payload` is empty for Ping and Pong; non-empty for all MLS variants.
## Serialisation helpers
Two functions handle the conversion between `ParsedEnvelope` and wire bytes:
### `build_envelope`
```rust
pub fn build_envelope(env: &ParsedEnvelope) -> Result<Vec<u8>, capnp::Error>
```
Serialises a `ParsedEnvelope` to unpacked Cap'n Proto wire bytes. The output includes the Cap'n Proto segment table header followed by the message data. These bytes are suitable as a payload within a QUIC stream.
Internally, it builds a `capnp::message::Builder`, populates an `Envelope` root, and serialises via `capnp::serialize::write_message`.
### `parse_envelope`
```rust
pub fn parse_envelope(bytes: &[u8]) -> Result<ParsedEnvelope, capnp::Error>
```
Deserialises unpacked Cap'n Proto wire bytes into a `ParsedEnvelope`. All data is copied out of the reader before returning, so the input slice is not retained.
It returns `capnp::Error` if:
- The bytes are not valid Cap'n Proto wire format.
- The `msgType` discriminant is not present in the current schema (forward-compatibility guard).
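The forward-compatibility guard can be sketched in plain Rust: an unrecognised discriminant is a hard parse error rather than a silently misread variant. The function name and error type are illustrative; the variant numbering follows the `MsgType` enum above, with variants 3..=7 elided for brevity:

```rust
// Illustrative discriminant guard (not the crate's generated code).
#[derive(Debug, PartialEq)]
enum MsgType {
    Ping,
    Pong,
    KeyPackageUpload,
    // ... variants 3..=7 elided in this sketch ...
    Error,
}

fn msg_type_from_discriminant(d: u16) -> Result<MsgType, String> {
    match d {
        0 => Ok(MsgType::Ping),
        1 => Ok(MsgType::Pong),
        2 => Ok(MsgType::KeyPackageUpload),
        8 => Ok(MsgType::Error),
        // A schema from the future may define new discriminants; reject
        // them instead of guessing.
        other => Err(format!("unknown MsgType discriminant {other}")),
    }
}

fn main() {
    assert_eq!(msg_type_from_discriminant(0), Ok(MsgType::Ping));
    assert!(msg_type_from_discriminant(42).is_err());
}
```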
### Low-level helpers
Two additional functions provide raw byte-to-message conversions:
```rust
pub fn to_bytes<A: Allocator>(msg: &Builder<A>) -> Result<Vec<u8>, capnp::Error>
pub fn from_bytes(bytes: &[u8]) -> Result<Reader<OwnedSegments>, capnp::Error>
```
`from_bytes` uses `ReaderOptions::new()` with default limits:
- **Traversal limit**: 64 MiB (8 * 1024 * 1024 words)
- **Nesting limit**: 64 levels
The traversal limit caps the total amount of message data a reader will walk, and the nesting limit caps pointer depth; together they bound DoS from excessively large or deeply nested Cap'n Proto messages. The server also enforces size limits: 5 MB per payload (`MAX_PAYLOAD_BYTES`) and 1 MB per KeyPackage (`MAX_KEYPACKAGE_BYTES`).
## The NodeService RPC interface
The M3 unified RPC interface is defined in `schemas/node.capnp`:
```capnp
interface NodeService {
uploadKeyPackage @0 (identityKey :Data, package :Data, auth :Auth)
-> (fingerprint :Data);
fetchKeyPackage @1 (identityKey :Data, auth :Auth) -> (package :Data);
enqueue @2 (recipientKey :Data, payload :Data,
channelId :Data, version :UInt16, auth :Auth) -> ();
fetch @3 (recipientKey :Data, channelId :Data,
version :UInt16, auth :Auth) -> (payloads :List(Data));
fetchWait @4 (recipientKey :Data, channelId :Data,
version :UInt16, timeoutMs :UInt64, auth :Auth)
-> (payloads :List(Data));
health @5 () -> (status :Text);
uploadHybridKey @6 (identityKey :Data, hybridPublicKey :Data) -> ();
fetchHybridKey @7 (identityKey :Data) -> (hybridPublicKey :Data);
}
```
This combines Authentication Service operations (`uploadKeyPackage`, `fetchKeyPackage`), Delivery Service operations (`enqueue`, `fetch`, `fetchWait`), health monitoring (`health`), and hybrid key management (`uploadHybridKey`, `fetchHybridKey`) into a single RPC interface.
### Auth context
Every mutating RPC method accepts an `Auth` struct:
```capnp
struct Auth {
version @0 :UInt16; # 0 = legacy/none, 1 = token-based auth
accessToken @1 :Data; # opaque bearer token
deviceId @2 :Data; # optional UUID bytes for auditing
}
```
The server validates the `version` field and rejects unknown versions. Token validation is planned for a future milestone. See [Auth, Devices, and Tokens](../roadmap/authz-plan.md).
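The version gate can be sketched as follows. The `Auth` struct and `validate_auth` names are illustrative, and -- per the roadmap -- the server currently checks only the version field, not the token itself:

```rust
// Illustrative sketch of the Auth version check described above.
#[allow(dead_code)]
struct Auth {
    version: u16,
    access_token: Vec<u8>,
    device_id: Vec<u8>,
}

fn validate_auth(auth: &Auth) -> Result<(), String> {
    match auth.version {
        0 => Ok(()), // legacy/none
        1 => Ok(()), // token-based; token validation is a future milestone
        v => Err(format!("unknown auth version {v}")),
    }
}

fn main() {
    let ok = Auth { version: 1, access_token: b"tok".to_vec(), device_id: vec![] };
    let bad = Auth { version: 7, access_token: vec![], device_id: vec![] };
    assert!(validate_auth(&ok).is_ok());
    assert!(validate_auth(&bad).is_err());
}
```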
## ALPN integration
Cap'n Proto RPC rides directly on the QUIC bidirectional stream. The ALPN (Application-Layer Protocol Negotiation) extension in the TLS handshake identifies the protocol:
```rust
tls.alpn_protocols = vec![b"capnp".to_vec()];
```
Both client and server set the ALPN to `b"capnp"`. If the client and server disagree on the ALPN, the TLS handshake fails before any application data is exchanged.
On the QUIC path, the flow is:
```text
Client Server
| |
|── QUIC handshake (TLS 1.3) ────►| ALPN: "capnp"
| |
|── open_bi() ───────────────────►| Bidirectional QUIC stream
| |
|◄─────── capnp-rpc messages ────►| VatNetwork reads/writes on the stream
```
The `tokio-util` compat layer converts Quinn stream types into `futures::AsyncRead + AsyncWrite`, which `capnp-rpc`'s `VatNetwork` expects. See [QUIC + TLS 1.3](quic-tls.md) for the full connection setup.
## Comparison with alternatives
### vs Protocol Buffers + gRPC
Protocol Buffers require a full deserialisation step before any field can be accessed. Cap'n Proto avoids this with zero-copy readers. gRPC also requires HTTP/2 framing, which duplicates the stream multiplexing QUIC already provides. Cap'n Proto RPC is leaner and maps naturally to a single QUIC stream.
### vs MessagePack
MessagePack is schemaless -- there is no schema file, so type errors surface only at runtime. This is unacceptable for a security protocol where a misinterpreted field could be exploitable. MessagePack also has no RPC framework, so a request/response protocol would have to be hand-rolled.
### vs FlatBuffers
FlatBuffers supports zero-copy reads (like Cap'n Proto) but lacks a built-in RPC framework. The ecosystem and tooling are also less mature for Rust.
## Design constraints of `quicproquo-proto`
The `quicproquo-proto` crate enforces three design constraints:
1. **No crypto**: Key material never enters this crate. All encryption and signing happens in `quicproquo-core`.
2. **No I/O**: Callers own the transport. This crate only converts between bytes and types.
@@ -266,10 +301,12 @@ The `quicproquo-proto` crate enforces three design constraints:
These constraints keep the serialisation layer thin and auditable.
---
## Further reading
- [Envelope Schema](../wire-format/envelope-schema.md) -- Detailed field-by-field breakdown of the Envelope wire format.
- [NodeService Schema](../wire-format/node-service-schema.md) -- Full RPC interface documentation.
- [Auth Schema](../wire-format/auth-schema.md) -- Auth token structure and versioning.
- [MLS (RFC 9420)](mls.md) -- How MLS messages are carried as opaque payloads inside Cap'n Proto Envelopes.
- [ADR-002: Cap'n Proto over MessagePack](../design-rationale/adr-002-capnproto.md) -- Design rationale for choosing Cap'n Proto.
- [QUIC + TLS 1.3](quic-tls.md) -- The transport layer that carries these frames.
- [Service Architecture](../architecture/service-architecture.md) -- How the server dispatches method IDs to handlers.
- [Wire Format Reference](../wire-format/overview.md) -- Full Protobuf schema documentation.
- [MLS (RFC 9420)](mls.md) -- How MLS messages are carried as opaque payloads inside Protobuf delivery messages.
- [ADR-007](../design-rationale/adr-007-protobuf-migration.md) -- Design rationale for the v1 Cap'n Proto to v2 Protobuf migration.

View File

@@ -9,7 +9,7 @@ This page provides a high-level comparison and a suggested reading order. The de
| Layer | Standard / Spec | Crate(s) | Security Properties |
|---|---|---|---|
| **QUIC + TLS 1.3** | RFC 9000, RFC 9001 | `quinn 0.11`, `rustls 0.23` | Transport confidentiality, server authentication, 0-RTT resumption |
| **Cap'n Proto** | [capnproto.org specification](https://capnproto.org/encoding.html) | `capnp 0.19`, `capnp-rpc 0.19` | Zero-copy deserialisation, schema-enforced types, canonical serialisation for signing, async RPC |
| **Protobuf framing** | Custom binary header + [Protocol Buffers](https://protobuf.dev/) | `quicproquo-rpc`, `prost 0.13` | Typed length-prefixed frames, method dispatch, push events, status codes |
| **MLS** | [RFC 9420](https://www.rfc-editor.org/rfc/rfc9420.html) | `openmls 0.5` | Group key agreement, forward secrecy, post-compromise security (PCS) |
| **Hybrid KEM** | [draft-ietf-tls-hybrid-design](https://datatracker.ietf.org/doc/draft-ietf-tls-hybrid-design/) | `ml-kem 0.2`, `x25519-dalek 2` | Post-quantum resistance via ML-KEM-768 combined with X25519 |
@@ -27,7 +27,8 @@ Application plaintext
|
v
+-----------+
| Cap'n Proto| Schema-typed serialisation into Envelope frames
| Protobuf | Typed serialisation into Protobuf messages
| framing | + binary header [method_id/event_type][req_id][len]
+-----------+
|
v
@@ -39,19 +40,19 @@ Application plaintext
Network
```
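The binary header in the framing box above can be sketched with plain byte packing. The field widths and endianness below are assumptions for illustration (the authoritative layout lives in the Wire Format Reference), but the shape -- method/event discriminant, request id, length prefix -- matches the diagram:

```rust
// Hypothetical [method_id][req_id][len] header layout.
// Assumed widths: u16 + u64 + u32, little-endian (illustrative only).
fn encode_header(method_id: u16, req_id: u64, len: u32) -> [u8; 14] {
    let mut h = [0u8; 14];
    h[0..2].copy_from_slice(&method_id.to_le_bytes());
    h[2..10].copy_from_slice(&req_id.to_le_bytes());
    h[10..14].copy_from_slice(&len.to_le_bytes());
    h
}

fn decode_header(h: &[u8; 14]) -> (u16, u64, u32) {
    (
        u16::from_le_bytes(h[0..2].try_into().unwrap()),
        u64::from_le_bytes(h[2..10].try_into().unwrap()),
        u32::from_le_bytes(h[10..14].try_into().unwrap()),
    )
}

fn main() {
    // Round-trip a header: the length prefix tells the reader how many
    // Protobuf payload bytes follow on the stream.
    let h = encode_header(7, 42, 1024);
    assert_eq!(decode_header(&h), (7, 42, 1024));
}
```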
The Hybrid KEM layer operates orthogonally: it wraps MLS payloads in an outer post-quantum encryption envelope before they enter the transport layer. It is implemented and tested but not yet integrated into the MLS ciphersuite (planned for the M5 milestone).
The Hybrid KEM layer operates orthogonally: it wraps MLS payloads in an outer post-quantum encryption envelope before they enter the transport layer. It is implemented and tested but not yet integrated into the MLS ciphersuite (planned for a future milestone).
## Suggested reading order
The pages in this section are ordered to build understanding incrementally:
1. **[QUIC + TLS 1.3](quic-tls.md)** -- Start here. This is the transport layer that every client-server connection uses. Understanding QUIC stream multiplexing and the TLS 1.3 handshake is prerequisite to understanding how Cap'n Proto RPC rides on top.
1. **[QUIC + TLS 1.3](quic-tls.md)** -- Start here. This is the transport layer that every client-server connection uses. Understanding QUIC stream multiplexing and the TLS 1.3 handshake is prerequisite to understanding how the Protobuf framing protocol rides on top.
2. **[MLS (RFC 9420)](mls.md)** -- The core cryptographic innovation. MLS provides the group key agreement that makes quicproquo an E2E encrypted group messenger rather than just a transport-encrypted relay. This is the longest and most detailed page.
3. **[Cap'n Proto Serialisation and RPC](capn-proto.md)** -- The serialisation and RPC layer that bridges MLS application data with the transport. Understanding the Envelope schema, the ParsedEnvelope owned type, and the NodeService RPC interface is essential for reading the server and client source code.
3. **[Protobuf Framing](capn-proto.md)** -- The framing and RPC layer that bridges MLS application data with the transport. Understanding the three frame types (Request, Response, Push), the method ID dispatch table, and status codes is essential for reading the server and client source code.
4. **[Hybrid KEM: X25519 + ML-KEM-768](hybrid-kem.md)** -- The post-quantum encryption layer. Read this last because it builds on concepts from all other layers: key encapsulation (from MLS), wire format conventions (from Cap'n Proto), and AEAD encryption.
4. **[Hybrid KEM: X25519 + ML-KEM-768](hybrid-kem.md)** -- The post-quantum encryption layer. Read this last because it builds on concepts from all other layers: key encapsulation (from MLS), wire format conventions (from Protobuf framing), and AEAD encryption.
## Cross-cutting concerns
@@ -59,9 +60,9 @@ Several topics span multiple layers and have their own dedicated pages elsewhere
- **Forward secrecy**: Provided by MLS epoch ratcheting. See [Forward Secrecy](../cryptography/forward-secrecy.md).
- **Post-compromise security**: Provided by MLS Update proposals. See [Post-Compromise Security](../cryptography/post-compromise-security.md).
- **Post-quantum readiness**: Currently provided by the standalone Hybrid KEM module; integration into MLS is planned for M5. See [Post-Quantum Readiness](../cryptography/post-quantum-readiness.md).
- **Post-quantum readiness**: Currently provided by the standalone Hybrid KEM module; integration into MLS is planned. See [Post-Quantum Readiness](../cryptography/post-quantum-readiness.md).
- **Key lifecycle and zeroization**: Private key material is zeroized after use across all layers. See [Key Lifecycle and Zeroization](../cryptography/key-lifecycle.md).
- **Wire format details**: The Cap'n Proto schema definitions are documented in the [Wire Format Reference](../wire-format/overview.md) section.
- **Wire format details**: The Protobuf schema definitions are documented in the [Wire Format Reference](../wire-format/overview.md) section.
- **Design rationale**: The ADR pages explain *why* each layer was chosen. See [Design Decisions Overview](../design-rationale/overview.md).
## Crate mapping
@@ -70,8 +71,9 @@ Each protocol layer maps to one or more workspace crates:
| Layer | Primary Crate | Source File(s) |
|---|---|---|
| QUIC + TLS 1.3 | `quicproquo-server`, `quicproquo-client` | `main.rs` (server and client entry points) |
| Cap'n Proto | `quicproquo-proto` | `src/lib.rs`, `build.rs`, `schemas/*.capnp` |
| QUIC + TLS 1.3 | `quicproquo-server`, `quicproquo-client` | Server and client entry points |
| Protobuf framing | `quicproquo-rpc` | `src/framing.rs`, `src/server.rs`, `src/client.rs` |
| Protobuf types + method IDs | `quicproquo-proto` | `src/lib.rs` (method_ids), `proto/qpq/v1/*.proto` |
| MLS | `quicproquo-core` | `src/group.rs`, `src/keystore.rs` |
| Hybrid KEM | `quicproquo-core` | `src/hybrid_kem.rs` |

View File

@@ -10,26 +10,27 @@ QUIC provides several advantages over traditional TCP-based transports:
- **0-RTT resumption**: Returning clients can send data in the first flight, reducing connection setup latency.
- **Integrated encryption**: TLS 1.3 is integral to the QUIC handshake; no extra round-trips for transport security.
- **NAT traversal**: UDP-based; connection migration survives NAT rebinding.
- **Ecosystem support**: `capnp-rpc` can use QUIC bidirectional streams directly via the `tokio-util` compat layer.
- **Per-call concurrency**: The v2 RPC framework opens one bidirectional stream per RPC call. Multiple calls run concurrently without blocking each other.
- **Push streams**: Server-to-client push events use QUIC uni-directional streams, avoiding any request-response overhead.
## Crate integration
quicproquo uses the following crates for QUIC and TLS:
- **`quinn 0.11`** -- The async QUIC implementation for Tokio. Provides `Endpoint`, `Connection`, and bidirectional stream types.
- **`quinn 0.11`** -- The async QUIC implementation for Tokio. Provides `Endpoint`, `Connection`, and bidirectional/uni-directional stream types.
- **`quinn-proto 0.11`** -- The protocol-level types, including `QuicServerConfig` and `QuicClientConfig` wrappers that bridge `rustls` into `quinn`.
- **`rustls 0.23`** -- The TLS implementation. quicproquo uses it in strict TLS 1.3 mode with no fallback to TLS 1.2.
- **`rcgen 0.13`** -- Self-signed certificate generation for development and testing.
### Server configuration
The server builds its QUIC endpoint configuration in `build_server_config()` (in `quicproquo-server/src/main.rs`):
The server builds its QUIC endpoint configuration with:
```rust
let mut tls = rustls::ServerConfig::builder_with_protocol_versions(&[&TLS13])
.with_no_client_auth()
.with_single_cert(cert_chain, key)?;
tls.alpn_protocols = vec![b"capnp".to_vec()];
tls.alpn_protocols = vec![b"qpq".to_vec()];
let crypto = QuicServerConfig::try_from(tls)?;
Ok(ServerConfig::with_crypto(Arc::new(crypto)))
@@ -39,9 +40,9 @@ Key points:
1. **TLS 1.3 strict mode**: `builder_with_protocol_versions(&[&TLS13])` ensures no TLS 1.2 fallback. This is a hard requirement: TLS 1.2 lacks the 0-RTT and full forward secrecy guarantees that quicproquo relies on.
2. **No client certificate authentication**: `with_no_client_auth()` means the server does not verify client certificates at the TLS layer. Client authentication is handled at the application layer via Ed25519 identity keys and MLS credentials. This is a deliberate design choice -- MLS provides stronger authentication properties than TLS client certificates.
2. **No client certificate authentication**: `with_no_client_auth()` means the server does not verify client certificates at the TLS layer. Client authentication is handled at the application layer via OPAQUE password authentication and Ed25519 identity keys. This is a deliberate design choice -- OPAQUE provides stronger authentication properties than TLS client certificates without requiring PKI infrastructure.
3. **ALPN negotiation**: The Application-Layer Protocol Negotiation extension is set to `b"capnp"`, advertising that this endpoint speaks Cap'n Proto RPC. Both client and server must agree on this protocol identifier or the TLS handshake fails.
3. **ALPN negotiation**: The Application-Layer Protocol Negotiation extension is set to `b"qpq"`, advertising that this endpoint speaks the quicproquo v2 Protobuf framing protocol. Both client and server must agree on this protocol identifier or the TLS handshake fails.
4. **`QuicServerConfig` bridge**: The `quinn-proto` crate provides `QuicServerConfig::try_from(tls)` to adapt the `rustls::ServerConfig` for use with QUIC. This handles the QUIC-specific TLS parameters (transport parameters, QUIC header protection keys) automatically.
@@ -53,10 +54,10 @@ The client performs the mirror operation. It loads the server's DER-encoded cert
let mut roots = rustls::RootCertStore::empty();
roots.add(CertificateDer::from(cert_bytes))?;
let tls = rustls::ClientConfig::builder_with_protocol_versions(&[&TLS13])
let mut tls = rustls::ClientConfig::builder_with_protocol_versions(&[&TLS13])
.with_root_certificates(roots)
.with_no_client_auth();
tls.alpn_protocols = vec![b"capnp".to_vec()];
tls.alpn_protocols = vec![b"qpq".to_vec()];
let crypto = QuicClientConfig::try_from(tls)?;
```
@@ -65,20 +66,26 @@ The client trusts exactly one certificate: the server's self-signed cert loaded
### Per-connection handling
Each accepted QUIC connection spawns a handler task:
The v2 server accepts connections and handles streams concurrently:
```rust
let (send, recv) = connection.accept_bi().await?;
let (reader, writer) = (recv.compat(), send.compat_write());
// Accept a QUIC connection
let connection = endpoint.accept().await?;
let network = twoparty::VatNetwork::new(reader, writer, Side::Server, Default::default());
let service: node_service::Client = capnp_rpc::new_client(NodeServiceImpl { store, waiters });
RpcSystem::new(Box::new(network), Some(service.client)).await?;
// For each incoming bidirectional stream (one per RPC call):
let (send, recv) = connection.accept_bi().await?;
// Read RequestFrame, dispatch, write ResponseFrame
tokio::spawn(handle_rpc(send, recv, server_state));
// For server-initiated push events:
let send = connection.open_uni().await?;
// Write PushFrame
tokio::spawn(send_push(send, event));
```
The `tokio-util` compat layer (`compat()` and `compat_write()`) converts Quinn's `RecvStream` and `SendStream` into types that implement `futures::AsyncRead` and `futures::AsyncWrite`, which `capnp-rpc`'s `VatNetwork` requires. The entire Cap'n Proto RPC system then runs over this single QUIC bidirectional stream.
Because `capnp-rpc` uses `Rc<RefCell<>>` internally (making it `!Send`), all RPC tasks run on a `tokio::task::LocalSet`. The server spawns each connection handler via `tokio::task::spawn_local`.
Unlike the v1 Cap'n Proto RPC (which required `tokio::task::LocalSet` due to
`!Send` internals), the v2 framework uses `Arc`-based shared state and
`tokio::spawn` for full multi-threaded concurrency.
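The `Send`-friendly pattern can be illustrated with std threads standing in for `tokio::spawn`: state behind an `Arc` can be handed to any number of handler tasks, with no `LocalSet` pinning. `ServerState` and `serve_concurrently` are illustrative names, not the server's API:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Shared server state; Send + Sync, so any worker thread may serve a stream.
struct ServerState {
    handled: Mutex<u64>,
}

fn serve_concurrently(state: Arc<ServerState>, calls: u64) {
    let handles: Vec<_> = (0..calls)
        .map(|_| {
            let state = Arc::clone(&state); // one clone per "RPC stream" handler
            thread::spawn(move || {
                *state.handled.lock().unwrap() += 1; // handle one call
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}

fn main() {
    let state = Arc::new(ServerState { handled: Mutex::new(0) });
    serve_concurrently(Arc::clone(&state), 4);
    assert_eq!(*state.handled.lock().unwrap(), 4);
}
```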
## Certificate trust model
@@ -126,9 +133,9 @@ The QUIC + TLS 1.3 layer provides:
### What TLS does *not* provide
- **Client authentication**: Handled by MLS identity credentials at the application layer. See [MLS (RFC 9420)](mls.md).
- **End-to-end encryption**: TLS terminates at the server. The server can read the Cap'n Proto RPC framing and message routing metadata. Payload confidentiality is provided by MLS. See [MLS (RFC 9420)](mls.md).
- **Post-quantum resistance**: TLS 1.3 key exchange uses classical ECDHE. Post-quantum protection of application data is provided by the [Hybrid KEM](hybrid-kem.md) layer (M5 milestone).
- **Client authentication**: Handled by OPAQUE password authentication (methods 100-103) and Ed25519 identity keys at the application layer. See [Service Architecture](../architecture/service-architecture.md).
- **End-to-end encryption**: TLS terminates at the server. The server can read the Protobuf framing and message routing metadata. Payload confidentiality is provided by MLS. See [MLS (RFC 9420)](mls.md).
- **Post-quantum resistance**: TLS 1.3 key exchange uses classical ECDHE. Post-quantum protection of application data is provided by the [Hybrid KEM](hybrid-kem.md) layer.
## Configuration reference
@@ -136,7 +143,7 @@ The QUIC + TLS 1.3 layer provides:
| Environment Variable | CLI Flag | Default | Description |
|---|---|---|---|
| `QPQ_LISTEN` | `--listen` | `0.0.0.0:7000` | QUIC listen address |
| `QPQ_LISTEN` | `--listen` | `0.0.0.0:5001` | QUIC listen address |
| `QPQ_TLS_CERT` | `--tls-cert` | `data/server-cert.der` | TLS certificate path |
| `QPQ_TLS_KEY` | `--tls-key` | `data/server-key.der` | TLS private key path |
| `QPQ_DATA_DIR` | `--data-dir` | `data` | Persistent storage directory |
@@ -147,9 +154,9 @@ The QUIC + TLS 1.3 layer provides:
|---|---|---|---|
| `QPQ_CA_CERT` | `--ca-cert` | `data/server-cert.der` | Server certificate to trust |
| `QPQ_SERVER_NAME` | `--server-name` | `localhost` | Expected TLS server name (must match certificate SAN) |
| `QPQ_SERVER` | `--server` | `127.0.0.1:7000` | Server address (per-subcommand) |
| `QPQ_SERVER` | `--server` | `127.0.0.1:5001` | Server address (per-subcommand) |
## Further reading
- [Cap'n Proto Serialisation and RPC](capn-proto.md) -- The RPC layer that runs on top of QUIC streams.
- [Service Architecture](../architecture/service-architecture.md) -- How the server's `NodeServiceImpl` binds to the QUIC endpoint.
- [Protobuf Framing](capn-proto.md) -- The RPC framing layer that runs on top of QUIC streams.
- [Service Architecture](../architecture/service-architecture.md) -- How the server binds to the QUIC endpoint and dispatches 44 RPC methods.

View File

@@ -12,24 +12,6 @@ For the production readiness work breakdown, see
## Transport and Networking
### LibP2P / iroh (n0)
**Problem:** The current architecture is strictly client-server. Clients behind
NAT cannot communicate directly, and the server is a single point of failure for
delivery.
**Solution:** [LibP2P](https://libp2p.io/) and [iroh](https://iroh.computer/)
(from n0) provide peer discovery, NAT traversal (hole-punching), and relay
fallback. iroh is particularly interesting because it is Rust-native and built on
QUIC, aligning with quicproquo's existing transport layer.
**Architecture impact:** Move from pure client-server to a hybrid topology where
peers communicate directly when possible and fall back to server relay when NAT
traversal fails. The server role shifts from mandatory relay to optional
rendezvous/relay node.
**Crates:** `libp2p`, `iroh`, `iroh-net`
### WebTransport (HTTP/3)
**Problem:** Browser clients cannot use raw QUIC. The current stack requires a
@@ -66,23 +48,6 @@ significantly, so this should be optional.
## Storage and Persistence
### SQLCipher / libsql (Turso)
**Problem:** At M6, quicproquo needs persistent storage for group state, key
material, and message queues. Storing private keys in a plaintext SQLite database
is insufficient.
**Solution:** [SQLCipher](https://www.zetetic.net/sqlcipher/) provides
transparent, page-level AES-256 encryption for SQLite. Alternatively,
[libsql](https://turso.tech/libsql) (Turso) offers a SQLite fork with
encryption, replication, and embedded server capabilities.
**Architecture impact:** Replace the `sqlx` SQLite backend with SQLCipher.
Encryption key derived from a user-provided passphrase (via Argon2id) or a
hardware-backed key.
**Crates:** `rusqlite` (with `bundled-sqlcipher` feature), `libsql`
### CRDTs (Automerge / Yrs)
**Problem:** Multi-device support requires synchronising state (group membership,
@@ -153,20 +118,6 @@ queries. This is a significant performance trade-off: PIR has high computational
cost. Suitable for KeyPackage fetch (small database) before message fetch (large
database).
### Sealed Sender (Signal-style)
**Problem:** The server sees `(sender, recipient, timestamp)` metadata on every
enqueued message. Even without reading content, this metadata reveals social
graphs.
**Solution:** [Sealed Sender](https://signal.org/blog/sealed-sender/) encrypts
the sender's identity inside the MLS ciphertext. The server routes by
`recipientKey` only and cannot determine who sent the message.
**Architecture impact:** Modify the `enqueue` RPC to omit sender identity from
the server-visible metadata. The sender identity is included only inside the
MLS application message (encrypted).
### Key Transparency (RFC draft)
**Problem:** A compromised server could substitute public keys, performing a
@@ -200,24 +151,6 @@ DID URIs. The server resolves DIDs to public keys for routing.
**Crates:** `did-key`, `ssi`
### OPAQUE (aPAKE)
**Problem:** If quicproquo adds password-based account registration, the
server must never see the password -- not even a hash.
**Solution:** [OPAQUE](https://datatracker.ietf.org/doc/rfc9497/) is an
asymmetric password-authenticated key exchange where the server stores only a
one-way transformation of the password. The server cannot perform offline
dictionary attacks.
**Architecture impact:** Replace the registration/login flow with OPAQUE. The
server stores an OPAQUE registration record; the client runs the OPAQUE protocol
to authenticate and derive a session key.
**Crates:** `opaque-ke`
**References:** RFC 9497
### WebAuthn / Passkeys
**Problem:** Password-based auth (even with OPAQUE) is vulnerable to phishing.
@@ -380,18 +313,25 @@ command sets up the toolchain, `capnp`, and all dependencies.
---
## Top 5 Priority Implementations
## Top Priority Implementations
The following table ranks the most impactful technologies for near-term adoption,
considering the current state of the codebase and the [milestone plan](milestones.md).
| Priority | Technology | Why | Unlocks |
|----------|-----------|-----|---------|
| 1 | **Post-quantum hybrid KEM** | `ml-kem` is already vendored in the workspace. Completing the hybrid `OpenMlsCryptoProvider` makes quicproquo one of the first PQ MLS implementations. | M7 |
| 2 | **SQLCipher persistence** | Encrypted-at-rest storage is the prerequisite for multi-device support, offline usage, and server restart survival. | M6 |
| 3 | **OPAQUE auth** | Zero-knowledge password authentication is a massive security uplift for the account system. The server never sees or stores passwords. | Phase 3 (authz) |
| 4 | **iroh / LibP2P** | NAT traversal and optional P2P mesh makes quicproquo deployable without centralised infrastructure. Aligns with the existing QUIC transport. | Beyond M7 |
| 5 | **Sealed Sender + PIR** | Content encryption is table stakes. Metadata resistance (hiding who talks to whom) is the frontier of private messaging research. | Beyond M7 |
Items marked **Implemented** are already part of the v2 codebase.
| Priority | Technology | Why | Status |
|----------|-----------|-----|--------|
| -- | **Post-quantum hybrid KEM** | `ml-kem` vendored; custom `OpenMlsCryptoProvider` with X25519 + ML-KEM-768. | **Implemented** |
| -- | **SQLCipher persistence** | Encrypted-at-rest storage via rusqlite + bundled-sqlcipher + Argon2id key derivation. | **Implemented** |
| -- | **OPAQUE auth** | Zero-knowledge password authentication via `opaque-ke`. Server never stores passwords. | **Implemented** |
| -- | **iroh P2P** | NAT traversal and optional P2P mesh via the `quicproquo-p2p` crate (feature-flagged). | **Implemented** |
| -- | **Sealed Sender** | `--sealed-sender` flag encrypts sender identity inside MLS ciphertext. | **Implemented** |
| 1 | **PIR (Private Information Retrieval)** | Fetch messages without revealing the recipient's identity to the server. | Future |
| 2 | **Key Transparency** | Verifiable, append-only log of public key bindings. Detects key substitution attacks. | Future |
| 3 | **WebTransport (HTTP/3)** | Enables browser clients without a WebSocket bridge. | Future |
| 4 | **OpenTelemetry** | Distributed tracing and structured metrics for production observability. | Future |
| 5 | **WebAuthn / Passkeys** | Hardware-backed authentication to replace password-based login. | Future |
---

View File

@@ -17,7 +17,7 @@ for what that means in practice.
| M4 | Group CLI Subcommands | **Complete** | Persistent CLI (create-group, invite, join, send, recv), OPAQUE login |
| M5 | Multi-party Groups | **Complete** | N > 2 members, Commit fan-out, send --all, epoch sync |
| M6 | Persistence | **Complete** | SQLite/SQLCipher, migrations, durable server + client state |
| M7 | Post-quantum | **Next** | PQ hybrid for MLS/HPKE (X25519 + ML-KEM-768) |
| M7 | Post-quantum | **Complete** | PQ hybrid for MLS/HPKE (X25519 + ML-KEM-768) |
---
@@ -129,14 +129,13 @@ optional follow-ups.
**Goal:** Server survives restart. Client state persists across sessions.
**Deliverables:** SQLite/SQLCipher via rusqlite, `migrations/` directory and
migration runner; client state file and DiskKeyStore (encrypted QPCE optional).
See [Future Research: SQLCipher](future-research.md#storage--persistence) for
encrypted-at-rest options.
**Deliverables:** SQLCipher via rusqlite (bundled-sqlcipher feature), `migrations/`
directory and migration runner; client state file and DiskKeyStore with
Argon2id key derivation and ChaCha20-Poly1305 encryption at rest.
---
## M7 -- Post-quantum (Next)
## M7 -- Post-quantum (Complete)
**Goal:** Replace the MLS crypto backend with a hybrid X25519 + ML-KEM-768 KEM,
providing post-quantum confidentiality for all group key material.

View File

@@ -36,14 +36,14 @@ The following legacy behaviour has been removed; only current behaviour is suppo
| Deliverable | Status |
|-------------|--------|
| `create-group` | Planned |
| `invite <identity>` | Planned |
| `join` | Planned |
| `send <message>` | Planned |
| `recv` | Planned |
| Keep `demo-group` | Existing |
| `create-group` | **Complete** |
| `invite <identity>` | **Complete** |
| `join` | **Complete** |
| `send <message>` | **Complete** |
| `recv` | **Complete** |
| Keep `demo-group` | **Complete** |
See [Milestones](milestones.md#m4--group-cli-subcommands-next).
See [Milestones](milestones.md#m4--group-cli-subcommands-complete).
---
@@ -53,10 +53,10 @@ See [Milestones](milestones.md#m4--group-cli-subcommands-next).
| Deliverable | Status |
|-------------|--------|
| Commit fan-out via DS | Planned |
| Proposal handling (Add, Remove, Update) | Planned |
| Epoch sync across N members | Planned |
| Benchmarks | Planned |
| Commit fan-out via DS | **Complete** |
| Proposal handling (Add, Remove, Update) | **Complete** |
| Epoch sync across N members | **Complete** |
| Benchmarks | **Complete** |
---
@@ -66,10 +66,10 @@ See [Milestones](milestones.md#m4--group-cli-subcommands-next).
| Deliverable | Status |
|-------------|--------|
| SQLite/SQLCipher (AS + DS) | Partial (SqlStore exists) |
| `migrations/` | Planned |
| Client reconnect + session resume | Planned |
| Docker + healthcheck | Partial (Dockerfile exists) |
| SQLCipher (AS + DS) | **Complete** |
| `migrations/` | **Complete** |
| Client reconnect + session resume | **Complete** |
| Docker + healthcheck | **Complete** |
---

View File

@@ -44,7 +44,7 @@ how they are enforced in code.
### Transport Policy
- TLS 1.3 only (`rustls` configured with `TLS13` cipher suites exclusively).
- ALPN token `b"capnp"` required; reject connections with mismatched ALPN.
- ALPN token `b"qpq"` required; reject connections with mismatched ALPN.
- Self-signed certificates acceptable for development; production deployments
must use a CA-signed certificate or certificate pinning.
- Connection draining on shutdown (QUIC `CONNECTION_CLOSE`).
@@ -60,7 +60,7 @@ how they are enforced in code.
### Input Validation
- All incoming Cap'n Proto messages validated against schema before processing.
- All incoming Protobuf messages validated against schema before processing.
- Maximum payload size: 5 MB per RPC call.
- Group ID, identity key, and channel ID fields validated for correct length
(32 bytes, 32 bytes, 16 bytes respectively).
@@ -101,7 +101,7 @@ how they are enforced in code.
- Integration tests for every RPC method.
- Negative tests: malformed input, expired tokens, wrong identity, replay attempts.
- N-1 compatibility tests (old client against new server).
- Fuzzing targets for Cap'n Proto parsers and MLS message handling (Phase 5).
- Fuzzing targets for Protobuf parsers and MLS message handling (Phase 5).
---
@@ -125,10 +125,10 @@ how they are enforced in code.
| Task | Description |
|------|-------------|
| Wire versioning | Add `version` field to all Cap'n Proto structs; reject unknown versions |
| Wire versioning | Version field in all Protobuf frames; reject unknown versions |
| Ciphersuite allowlist | Server rejects KeyPackages outside the allowed set |
| Downgrade guards | Prevent epoch rollback; reject Commits with weaker ciphersuites |
| ALPN enforcement | Reject connections without `b"capnp"` ALPN token |
| ALPN enforcement | Reject connections without `b"qpq"` ALPN token |
| Connection draining | Graceful QUIC `CONNECTION_CLOSE` on server shutdown |
| KeyPackage rotation | Client-side timer to upload fresh KeyPackages before TTL expiry |
@@ -172,7 +172,7 @@ See [1:1 Channel Design](dm-channels.md) for the DM-specific design.
| Positive E2E tests | Full group lifecycle: register, create, invite, join, send, recv, leave |
| Negative E2E tests | Expired tokens, wrong identity, replay, malformed messages |
| Compat matrix | N-1 client/server version testing |
| Fuzz targets | `cargo-fuzz` targets for Cap'n Proto parsers, MLS message handlers |
| Fuzz targets | `cargo-fuzz` targets for Protobuf parsers, MLS message handlers |
| Golden-wire fixtures | Serialised test vectors for regression testing across versions |
### Phase 6 -- Reliability, Performance, and Operations

docs/src/sdk/index.md Normal file

@@ -0,0 +1,64 @@
# Client SDKs
This guide covers how to build clients for the quicproquo E2E encrypted messenger,
either with one of the official SDKs or by implementing a new one.
## Official SDKs
| Language | Location | Transport | Status |
|----------|----------|-----------|--------|
| **Rust** | `crates/quicproquo-client` | QUIC + Protobuf (v2) | Production |
| **Go** | `sdks/go/` | QUIC + Protobuf (v2) | Production |
| **TypeScript** | `sdks/typescript/` | WebSocket bridge + WASM crypto | Production |
| **Python** | `sdks/python/` | QUIC + Protobuf (v2) / Rust FFI | Production |
| **C** | `crates/quicproquo-ffi/` | Rust FFI (synchronous) | Production |
| **Swift** | `sdks/swift/` | C FFI wrapper | In progress |
| **Kotlin** | `sdks/kotlin/` | JNI + C FFI | In progress |
| **Java** | `sdks/java/` | JNI + C FFI | In progress |
| **Ruby** | `sdks/ruby/` | FFI gem | In progress |
## Architecture Overview
```
Client SDK Server
---------- ------
+------------+ QUIC/TLS 1.3 +------------+
| App code | <--------------> | RPC |
| | v2 wire frames | dispatch |
| SDK API | | |
| | [method_id:u16] | handlers |
| Proto | [req_id:u32] | |
| encode/ | [len:u32] | storage |
| decode | [protobuf] | |
| | | |
| QUIC | | QUIC |
| transport | | listener |
+------------+ +------------+
```
Each RPC call opens a new QUIC bidirectional stream. The request and response
use the same 10-byte framing header followed by a protobuf payload.
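As an illustration, a frame can be assembled with nothing but the standard library; `encode_frame` below is a hypothetical helper, not part of any SDK:

```rust
/// Build one v2 wire frame: 10-byte header followed by the protobuf payload.
/// All header integers are big-endian (network byte order).
fn encode_frame(method_id: u16, req_id: u32, payload: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(10 + payload.len());
    frame.extend_from_slice(&method_id.to_be_bytes()); // bytes 0-1
    frame.extend_from_slice(&req_id.to_be_bytes()); // bytes 2-5
    frame.extend_from_slice(&(payload.len() as u32).to_be_bytes()); // bytes 6-9
    frame.extend_from_slice(payload); // bytes 10+
    frame
}

fn main() {
    // Method 802 (Health) with an empty protobuf payload.
    let frame = encode_frame(802, 1, &[]);
    assert_eq!(frame.len(), 10);
    assert_eq!(frame[..2].to_vec(), vec![0x03, 0x22]); // 802 big-endian
}
```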
## Quick Start
1. Choose an SDK for your language (see table above).
2. Connect to the server over QUIC (or WebSocket bridge for browsers).
3. Authenticate with OPAQUE (register or login).
4. Upload MLS key packages for E2E encryption.
5. Send and receive encrypted messages.
## Canonical Schemas
- **Protobuf** (v2): `proto/qpq/v1/*.proto` -- 14 service definitions
The protobuf schemas in `proto/qpq/v1/` are the canonical API contract for
the v2 protocol. New SDKs should implement against these definitions.
## Documentation
- [Wire Format Reference](wire-format.md) -- v2 QUIC + Protobuf framing and method IDs
- [Rust SDK](rust.md) -- native Rust client using `quicproquo-sdk`
- [Go SDK](../getting-started/go-sdk.md) -- Go client with QUIC transport
- [TypeScript SDK](../getting-started/typescript-sdk.md) -- browser and Node.js client
- [C FFI Bindings](../getting-started/ffi.md) -- C bindings for language integrations
- [WASM Integration](../getting-started/wasm.md) -- WASM crypto for browser clients

docs/src/sdk/rust.md Normal file

@@ -0,0 +1,67 @@
# Rust SDK
The Rust client is the reference implementation, located in
`crates/quicproquo-client/`. It is built on top of the `quicproquo-sdk` crate,
which provides the high-level v2 API over QUIC + Protobuf.
## Installation
Add to your `Cargo.toml`:
```toml
[dependencies]
quicproquo-sdk = { path = "crates/quicproquo-sdk" }
```
## Connection
```rust
use quicproquo_sdk::QpqClient;
// Runs inside an async context; `tls_config` is the client's TLS configuration.
let client = QpqClient::connect("127.0.0.1:5001", &tls_config).await?;
let health = client.health().await?;
```
## CLI Client Usage
The `quicproquo-client` binary provides a CLI/TUI interface:
```rust
use quicproquo_client::{cmd_health, cmd_login, cmd_send};
// Health check
cmd_health("127.0.0.1:5001", &ca_cert_path, "localhost").await?;
// Login via OPAQUE
cmd_login(
    "127.0.0.1:5001", &ca_cert_path, "localhost",
    "alice", "password123",
    None,              // identity_key_hex
    Some(&state_path), // state persistence
    None,              // state_password
).await?;
```
## Key Features
- Full MLS (RFC 9420) group encryption
- Hybrid post-quantum KEM (X25519 + ML-KEM-768)
- OPAQUE authentication with zeroizing credential storage
- SQLCipher local state with Argon2id key derivation
- Sealed sender metadata protection (`--sealed-sender` flag)
- v2 QUIC + Protobuf transport via the `quicproquo-sdk` crate
## Crate Structure
| Crate | Purpose |
|-------|---------|
| `quicproquo-core` | Crypto primitives, MLS, hybrid KEM |
| `quicproquo-proto` | Protobuf generated types |
| `quicproquo-rpc` | QUIC RPC framework (framing, dispatch) |
| `quicproquo-sdk` | High-level client SDK (`QpqClient`) |
| `quicproquo-client` | CLI/TUI client application |
## Related
- [Wire Format Reference](wire-format.md) -- frame layout and method IDs
- [SDK Overview](index.md) -- all language SDKs

docs/src/sdk/wire-format.md Normal file

@@ -0,0 +1,210 @@
# Wire Format Reference
The quicproquo v2 protocol uses QUIC (RFC 9000) with TLS 1.3 as the transport
layer and Protocol Buffers for message serialization.
## Connection
- **Protocol**: QUIC with TLS 1.3
- **ALPN**: `qpq`
- **Port**: 5001 (default)
- **Certificate**: Server presents a TLS certificate; clients verify against a CA cert
## Frame Format
Every RPC request and response is wrapped in a 10-byte binary header:
```
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| method_id (u16) | req_id (u32) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| req_id (cont.) | payload_len (u32) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| payload_len (cont.) | protobuf payload ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
| Field | Type | Bytes | Description |
|-------|------|-------|-------------|
| `method_id` | `u16` | 0-1 | RPC method identifier (network byte order) |
| `req_id` | `u32` | 2-5 | Client-generated request correlation ID (network byte order) |
| `payload_len` | `u32` | 6-9 | Length of the protobuf payload (network byte order) |
| payload | bytes | 10+ | Protobuf-encoded request or response message |
All multi-byte integers are **big-endian** (network byte order).
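A sketch of header parsing under these rules (standard library only; `FrameHeader` and `parse_header` are illustrative names, not SDK API):

```rust
struct FrameHeader {
    method_id: u16,
    req_id: u32,
    payload_len: u32,
}

/// Parse the 10-byte big-endian header at the start of a stream buffer.
/// Returns None if fewer than 10 bytes are available.
fn parse_header(buf: &[u8]) -> Option<FrameHeader> {
    if buf.len() < 10 {
        return None;
    }
    Some(FrameHeader {
        method_id: u16::from_be_bytes([buf[0], buf[1]]),
        req_id: u32::from_be_bytes([buf[2], buf[3], buf[4], buf[5]]),
        payload_len: u32::from_be_bytes([buf[6], buf[7], buf[8], buf[9]]),
    })
}

fn main() {
    // Health request (method 802), req_id 1, empty payload.
    let wire = [0x03, 0x22, 0, 0, 0, 1, 0, 0, 0, 0];
    let h = parse_header(&wire).unwrap();
    assert_eq!(h.method_id, 802);
    assert_eq!(h.req_id, 1);
    assert_eq!(h.payload_len, 0);
}
```

The receiver then reads exactly `payload_len` more bytes and hands them to the protobuf decoder for the method's request or response type.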
## Stream Model
Each RPC call uses a **dedicated QUIC bidirectional stream**:
1. Client opens a new stream.
2. Client sends the request frame and marks end-of-stream.
3. Server reads the request, processes it, and sends the response frame.
4. Server marks end-of-stream.
This allows concurrent RPCs without head-of-line blocking.
## Method IDs
### Auth (100-103)
| ID | Method | Request | Response |
|----|--------|---------|----------|
| 100 | `OpaqueRegisterStart` | `OpaqueRegisterStartRequest` | `OpaqueRegisterStartResponse` |
| 101 | `OpaqueRegisterFinish` | `OpaqueRegisterFinishRequest` | `OpaqueRegisterFinishResponse` |
| 102 | `OpaqueLoginStart` | `OpaqueLoginStartRequest` | `OpaqueLoginStartResponse` |
| 103 | `OpaqueLoginFinish` | `OpaqueLoginFinishRequest` | `OpaqueLoginFinishResponse` |
### Delivery (200-205)
| ID | Method | Request | Response |
|----|--------|---------|----------|
| 200 | `Enqueue` | `EnqueueRequest` | `EnqueueResponse` |
| 201 | `Fetch` | `FetchRequest` | `FetchResponse` |
| 202 | `FetchWait` | `FetchWaitRequest` | `FetchWaitResponse` |
| 203 | `Peek` | `PeekRequest` | `PeekResponse` |
| 204 | `Ack` | `AckRequest` | `AckResponse` |
| 205 | `BatchEnqueue` | `BatchEnqueueRequest` | `BatchEnqueueResponse` |
### Keys (300-304)
| ID | Method | Request | Response |
|----|--------|---------|----------|
| 300 | `UploadKeyPackage` | `UploadKeyPackageRequest` | `UploadKeyPackageResponse` |
| 301 | `FetchKeyPackage` | `FetchKeyPackageRequest` | `FetchKeyPackageResponse` |
| 302 | `UploadHybridKey` | `UploadHybridKeyRequest` | `UploadHybridKeyResponse` |
| 303 | `FetchHybridKey` | `FetchHybridKeyRequest` | `FetchHybridKeyResponse` |
| 304 | `FetchHybridKeys` | `FetchHybridKeysRequest` | `FetchHybridKeysResponse` |
### Channel (400)
| ID | Method | Request | Response |
|----|--------|---------|----------|
| 400 | `CreateChannel` | `CreateChannelRequest` | `CreateChannelResponse` |
### Group Management (410-413)
| ID | Method | Request | Response |
|----|--------|---------|----------|
| 410 | `RemoveMember` | `RemoveMemberRequest` | `RemoveMemberResponse` |
| 411 | `UpdateGroupMetadata` | `UpdateGroupMetadataRequest` | `UpdateGroupMetadataResponse` |
| 412 | `ListGroupMembers` | `ListGroupMembersRequest` | `ListGroupMembersResponse` |
| 413 | `RotateKeys` | `RotateKeysRequest` | `RotateKeysResponse` |
### User (500-501)
| ID | Method | Request | Response |
|----|--------|---------|----------|
| 500 | `ResolveUser` | `ResolveUserRequest` | `ResolveUserResponse` |
| 501 | `ResolveIdentity` | `ResolveIdentityRequest` | `ResolveIdentityResponse` |
### Blob (600-601)
| ID | Method | Request | Response |
|----|--------|---------|----------|
| 600 | `UploadBlob` | `UploadBlobRequest` | `UploadBlobResponse` |
| 601 | `DownloadBlob` | `DownloadBlobRequest` | `DownloadBlobResponse` |
### Device (700-702)
| ID | Method | Request | Response |
|----|--------|---------|----------|
| 700 | `RegisterDevice` | `RegisterDeviceRequest` | `RegisterDeviceResponse` |
| 701 | `ListDevices` | `ListDevicesRequest` | `ListDevicesResponse` |
| 702 | `RevokeDevice` | `RevokeDeviceRequest` | `RevokeDeviceResponse` |
### P2P / Health (800-802)
| ID | Method | Request | Response |
|----|--------|---------|----------|
| 800 | `PublishEndpoint` | `PublishEndpointRequest` | `PublishEndpointResponse` |
| 801 | `ResolveEndpoint` | `ResolveEndpointRequest` | `ResolveEndpointResponse` |
| 802 | `Health` | `HealthRequest` | `HealthResponse` |
### Federation (900-905)
| ID | Method | Request | Response |
|----|--------|---------|----------|
| 900 | `RelayEnqueue` | `RelayEnqueueRequest` | `RelayEnqueueResponse` |
| 901-905 | Reserved | -- | -- |
### Account (950)
| ID | Method | Request | Response |
|----|--------|---------|----------|
| 950 | `DeleteAccount` | `DeleteAccountRequest` | `DeleteAccountResponse` |
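A client or server needs a registry mapping these IDs to handlers. A minimal sketch, using a hypothetical `method_name` lookup that covers only a subset of the IDs above:

```rust
/// Map a method ID to its name, e.g. for logging and debugging.
/// (Partial sketch; the full set is listed in the tables above.)
fn method_name(id: u16) -> Option<&'static str> {
    Some(match id {
        100 => "OpaqueRegisterStart",
        103 => "OpaqueLoginFinish",
        200 => "Enqueue",
        205 => "BatchEnqueue",
        300 => "UploadKeyPackage",
        400 => "CreateChannel",
        802 => "Health",
        950 => "DeleteAccount",
        _ => return None, // unknown ID: the server resets the stream
    })
}

fn main() {
    assert_eq!(method_name(802), Some("Health"));
    assert_eq!(method_name(999), None);
}
```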
## Protobuf Definitions
All message types are defined in `proto/qpq/v1/*.proto`:
| File | Services |
|------|----------|
| `auth.proto` | OPAQUE registration and login |
| `common.proto` | Auth context, account deletion |
| `delivery.proto` | Message enqueue, fetch, peek, ack |
| `keys.proto` | MLS key packages, hybrid keys |
| `channel.proto` | Channel creation |
| `user.proto` | User/identity resolution |
| `group.proto` | Group management |
| `blob.proto` | Binary object storage |
| `device.proto` | Multi-device management |
| `p2p.proto` | P2P endpoints, health |
| `federation.proto` | Cross-server relay |
| `push.proto` | Push notifications |
| `recovery.proto` | Account recovery |
| `moderation.proto` | Content moderation |
## Authentication Flow
Authentication uses the OPAQUE protocol (asymmetric PAKE):
```
Client Server
| |
| OpaqueRegisterStart(username, |
| registration_request) |
| ---------------------------------->|
| |
| OpaqueRegisterStartResponse( |
| registration_response) |
| <----------------------------------|
| |
| OpaqueRegisterFinish(username, |
| upload, identity_key) |
| ---------------------------------->|
| |
| OpaqueRegisterFinishResponse( |
| success) |
| <----------------------------------|
| |
| OpaqueLoginStart(username, |
| login_request) |
| ---------------------------------->|
| |
| OpaqueLoginStartResponse( |
| login_response) |
| <----------------------------------|
| |
| OpaqueLoginFinish(username, |
| finalization, identity_key) |
| ---------------------------------->|
| |
| OpaqueLoginFinishResponse( |
| session_token) |
| <----------------------------------|
```
The `session_token` returned on login is passed in subsequent authenticated RPCs.
## Error Handling
The server returns protobuf-encoded error responses on the same stream. Error
conditions include:
- Invalid method ID: stream reset
- Authentication failure: error response with details
- Rate limiting: error response with retry-after hint
- Internal errors: generic error response


@@ -1,149 +1,215 @@
# Auth Schema
**Schema file:** `schemas/auth.capnp`
**File ID:** `@0xb3a8f1c2e4d97650`
**Proto file:** `proto/qpq/v1/auth.proto`
**Package:** `qpq.v1`
**Method IDs:** 100-103
The `AuthenticationService` interface defines the RPC contract for uploading and fetching MLS KeyPackages. It is the standalone version of the Authentication Service; in the current architecture, these methods are integrated into the unified [NodeService](node-service-schema.md) interface.
The auth proto defines the OPAQUE asymmetric password-authenticated key exchange (PAKE) messages used for user registration and login. OPAQUE never transmits the password to the server; the server stores only an opaque registration record from which the password cannot be recovered.
Registration and login are each two-round-trip flows (start + finish). On successful login, the server returns a `session_token` used to authenticate subsequent RPCs.
See [Authentication Service Internals](../internals/authentication-service.md) for the server-side implementation and the full flow diagram.
---
## Full schema listing
## Full proto listing
```capnp
# auth.capnp -- Authentication Service RPC interface.
#
# Clients call uploadKeyPackage before joining any group so that peers can
# fetch their key material to add them. Each KeyPackage is single-use (MLS
# requirement): fetchKeyPackage removes and returns one package atomically.
#
# The server indexes packages by the raw Ed25519 public key bytes (32 bytes),
# not a fingerprint, so callers must know the target's identity public key
# out-of-band (e.g. from a directory or QR code scan).
#
# ID generated with: capnp id
@0xb3a8f1c2e4d97650;
```protobuf
syntax = "proto3";
package qpq.v1;
interface AuthenticationService {
# Upload a single-use KeyPackage for later retrieval by peers.
#
# identityKey : Ed25519 public key bytes (exactly 32 bytes).
# package : openmls-serialised KeyPackage blob (TLS encoding).
#
# Returns the SHA-256 fingerprint of `package`. Clients should record this
# and compare it against the fingerprint returned by a peer's fetchKeyPackage
# to detect tampering.
uploadKeyPackage @0 (identityKey :Data, package :Data) -> (fingerprint :Data);
// OPAQUE registration + login (4 methods).
// Method IDs: 100-103.
# Fetch and atomically remove one KeyPackage for a given identity key.
#
# Returns empty Data if no KeyPackage is currently stored for this identity.
# Callers should handle the empty case by asking the target to upload more
# packages before retrying.
fetchKeyPackage @1 (identityKey :Data) -> (package :Data);
message OpaqueRegisterStartRequest {
string username = 1;
bytes request = 2;
}
message OpaqueRegisterStartResponse {
bytes response = 1;
}
message OpaqueRegisterFinishRequest {
string username = 1;
bytes upload = 2;
bytes identity_key = 3;
}
message OpaqueRegisterFinishResponse {
bool success = 1;
}
message OpaqueLoginStartRequest {
string username = 1;
bytes request = 2;
}
message OpaqueLoginStartResponse {
bytes response = 1;
}
message OpaqueLoginFinishRequest {
string username = 1;
bytes finalization = 2;
bytes identity_key = 3;
}
message OpaqueLoginFinishResponse {
bytes session_token = 1;
}
```
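To make the encoding concrete, a login-start request can be hand-assembled on the Protobuf wire format. This is illustration only: real clients use generated code (e.g. `prost`), and the sketch assumes both field lengths stay under 128 bytes so each length fits a single varint byte.

```rust
/// Hand-encode an OpaqueLoginStartRequest on the protobuf wire format.
/// Both fields are length-delimited (wire type 2):
/// tag 0x0A = field 1 (username), tag 0x12 = field 2 (request).
fn encode_login_start(username: &str, request: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    out.push(0x0A); // field 1, wire type 2
    out.push(username.len() as u8); // single-byte varint; assumes len < 128
    out.extend_from_slice(username.as_bytes());
    out.push(0x12); // field 2, wire type 2
    out.push(request.len() as u8); // assumes len < 128
    out.extend_from_slice(request);
    out
}

fn main() {
    let msg = encode_login_start("alice", &[0xAA, 0xBB]);
    assert_eq!(
        msg,
        vec![0x0A, 5, b'a', b'l', b'i', b'c', b'e', 0x12, 2, 0xAA, 0xBB]
    );
}
```

The resulting bytes would then be wrapped in a frame with method ID 102 (`OpaqueLoginStart`).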
---
## Method-by-method analysis
## Registration flow (IDs 100-101)
### `uploadKeyPackage @0`
User registration takes two round trips. The `request` and `response` fields carry OPAQUE protocol blobs whose internal structure is defined by the `opaque-ke` crate; the server treats them as opaque bytes.
### OpaqueRegisterStart (ID 100)
```
uploadKeyPackage (identityKey :Data, package :Data) -> (fingerprint :Data)
Client Server
| |
| OpaqueRegisterStartRequest |
| username: "alice" |
| request: <OPAQUE blob> |
| -----------------------------> |
| |
| OpaqueRegisterStartResponse |
| response: <OPAQUE blob> |
| <----------------------------- |
```
**Purpose:** A client uploads a single-use MLS KeyPackage so that peers can later fetch it to add the client to a group.
**Request fields:**
**Parameters:**
| Field | Type | Description |
|-------|------|-------------|
| `username` | `string` | The username being registered. Must be unique on the server. |
| `request` | `bytes` | OPAQUE `RegistrationRequest` blob generated by the client using the `opaque-ke` crate. |
| Parameter | Type | Size | Description |
|---|---|---|---|
| `identityKey` | `Data` | Exactly 32 bytes | The uploader's raw Ed25519 public key bytes. This is the index key under which the package is stored. |
| `package` | `Data` | Variable (bounded by transport max) | An openmls-serialised KeyPackage blob in TLS encoding. Contains the client's HPKE init key, credential, and signature. |
**Response fields:**
**Return value:**
| Field | Type | Description |
|-------|------|-------------|
| `response` | `bytes` | OPAQUE `RegistrationResponse` blob generated by the server. Client feeds this into the finish step. |
| Field | Type | Size | Description |
|---|---|---|---|
| `fingerprint` | `Data` | 32 bytes | SHA-256 digest of the uploaded `package` bytes. |
**Fingerprint semantics:** The returned fingerprint allows the uploading client to verify that the server stored the package correctly. More importantly, when a peer later fetches a KeyPackage, it can compare the fetched package's SHA-256 hash against the fingerprint (communicated out-of-band) to detect tampering by a malicious server.
**Idempotency:** Uploading the same package twice appends a second copy to the queue. The server does not deduplicate. Clients should avoid uploading duplicates to conserve their KeyPackage supply.
### `fetchKeyPackage @1`
### OpaqueRegisterFinish (ID 101)
```
fetchKeyPackage (identityKey :Data) -> (package :Data)
Client Server
| |
| OpaqueRegisterFinishRequest |
| username: "alice" |
| upload: <OPAQUE record> |
| identity_key: <32 bytes> |
| -----------------------------> |
| |
| OpaqueRegisterFinishResponse |
| success: true |
| <----------------------------- |
```
**Purpose:** Fetch and atomically remove one KeyPackage for a given identity. This is the mechanism by which a group creator obtains a peer's key material in order to add them to a group via MLS `add_members()`.
**Request fields:**
**Parameters:**
| Field | Type | Description |
|-------|------|-------------|
| `username` | `string` | Must match the username from the start request. |
| `upload` | `bytes` | OPAQUE `RegistrationUpload` blob. The server stores this as the user's OPAQUE record; it contains the password-derived key material without revealing the password. |
| `identity_key` | `bytes` | The user's Ed25519 identity public key (32 bytes). Stored alongside the OPAQUE record and used as the user's long-term identifier for key packages and delivery queues. |
| Parameter | Type | Size | Description |
|---|---|---|---|
| `identityKey` | `Data` | Exactly 32 bytes | The raw Ed25519 public key of the target peer whose KeyPackage is being requested. |
**Response fields:**
**Return value:**
| Field | Type | Size | Description |
|---|---|---|---|
| `package` | `Data` | Variable, or 0 bytes | The fetched KeyPackage blob, or empty `Data` if no packages are stored for this identity. |
**Atomic removal:** The fetch operation is destructive: it removes the returned KeyPackage from the server's store in the same operation that returns it. This guarantees MLS's single-use requirement -- a KeyPackage is never served to two different requesters.
**Empty response handling:** Callers must check for an empty response. An empty `package` means the target has no KeyPackages available. The caller should either:
1. Retry after a delay, hoping the target uploads more packages.
2. Signal the user that the target is unreachable for group addition.
| Field | Type | Description |
|-------|------|-------------|
| `success` | `bool` | `true` if the registration record was stored successfully. `false` if the username is already taken or another error occurred. |
---
## Indexing by raw Ed25519 public key
## Login flow (IDs 102-103)
The Authentication Service indexes KeyPackages by the **raw 32-byte Ed25519 public key**, not by a fingerprint or any higher-level identifier. This design choice has several implications:
User login also takes two round trips. On success, the server issues a `session_token` that the client attaches to subsequent authenticated RPCs.
1. **No directory service required for lookup.** The caller must already know the target's Ed25519 public key (obtained out-of-band via QR code scan, manual exchange, or a future directory service).
### OpaqueLoginStart (ID 102)
2. **Consistent with DS indexing.** The [Delivery Service](delivery-schema.md) uses the same 32-byte Ed25519 key as its queue index, so a single key serves as the universal identifier across both services.
```
Client Server
| |
| OpaqueLoginStartRequest |
| username: "alice" |
| request: <OPAQUE blob> |
| -----------------------------> |
| |
| OpaqueLoginStartResponse |
| response: <OPAQUE blob> |
| <----------------------------- |
```
3. **No ambiguity.** Unlike fingerprints (which could collide if truncated) or human-readable names (which require a mapping layer), the raw public key is the canonical, collision-resistant identifier.
**Request fields:**
| Field | Type | Description |
|-------|------|-------------|
| `username` | `string` | The username logging in. |
| `request` | `bytes` | OPAQUE `CredentialRequest` blob generated by the client. |
**Response fields:**
| Field | Type | Description |
|-------|------|-------------|
| `response` | `bytes` | OPAQUE `CredentialResponse` blob. Contains the server's masked public key and envelope for the client to derive its export key. |
### OpaqueLoginFinish (ID 103)
```
Client Server
| |
| OpaqueLoginFinishRequest |
| username: "alice" |
| finalization: <OPAQUE blob> |
| identity_key: <32 bytes> |
| -----------------------------> |
| |
| OpaqueLoginFinishResponse |
| session_token: <32 bytes> |
| <----------------------------- |
```
**Request fields:**
| Field | Type | Description |
|-------|------|-------------|
| `username` | `string` | Must match the username from the start request. |
| `finalization` | `bytes` | OPAQUE `CredentialFinalization` blob containing the client's proof of knowledge of the password. The server verifies this against its stored OPAQUE record. |
| `identity_key` | `bytes` | The user's Ed25519 identity public key (32 bytes). The server verifies this matches the key registered during `OpaqueRegisterFinish`. |
**Response fields:**
| Field | Type | Description |
|-------|------|-------------|
| `session_token` | `bytes` | Opaque bearer token (32 bytes). Included in subsequent RPC requests to authenticate the session. The server associates this token with the user's identity and device. |
If login fails (wrong password, unknown username, or identity key mismatch), the server returns an error status in the response frame; the `session_token` field is empty.
---
## Single-use semantics
## Session token usage
MLS requires that each KeyPackage be used at most once to preserve the forward secrecy of the initial key exchange. The Authentication Service enforces this by atomically removing the KeyPackage on fetch.
After a successful `OpaqueLoginFinish`, the client uses the `session_token` as a bearer credential for all authenticated RPC methods. The token is passed at the QUIC connection level (not per-frame); the server validates it on connection establishment and maintains the association for the lifetime of the connection.
**Consequences for clients:**
The `Auth` message in `common.proto` carries the token for federation and internal use:
- Clients should **pre-upload multiple KeyPackages** after generating their identity, so that several peers can add them to groups concurrently without exhausting the supply.
- Clients should **monitor their KeyPackage count** on the server (via a future monitoring endpoint or periodic re-upload) and replenish when the supply runs low.
- If a client has zero KeyPackages stored, it is effectively unreachable for new group invitations until it uploads more.
For the design rationale behind single-use KeyPackages, see [ADR-005: Single-Use KeyPackages](../design-rationale/adr-005-single-use-keypackages.md).
---
## Relationship to NodeService
In the current unified architecture, the Authentication Service methods are exposed as part of the [NodeService interface](node-service-schema.md):
| AuthenticationService Method | NodeService Method | Additional Parameters |
|---|---|---|
| `uploadKeyPackage @0` | `uploadKeyPackage @0` | `auth :Auth` |
| `fetchKeyPackage @1` | `fetchKeyPackage @1` | `auth :Auth` |
The standalone `AuthenticationService` interface remains in the schema for documentation purposes and for use in contexts where the full NodeService is not needed.
```protobuf
message Auth {
bytes access_token = 1;
bytes device_id = 2;
}
```
---
## Further reading
- [Wire Format Overview](overview.md) -- serialisation pipeline context
- [NodeService Schema](node-service-schema.md) -- unified interface that subsumes AuthenticationService
- [Delivery Schema](delivery-schema.md) -- the companion service for message routing
- [Envelope Schema](envelope-schema.md) -- legacy framing that used `keyPackageUpload`/`keyPackageFetch` message types
- [ADR-005: Single-Use KeyPackages](../design-rationale/adr-005-single-use-keypackages.md) -- design rationale for atomic removal on fetch
- [ADR-004: MLS-Unaware Delivery Service](../design-rationale/adr-004-mls-unaware-ds.md) -- why the server does not inspect MLS content
- [Wire Format Overview](overview.md) -- frame format and transport parameters
- [Method ID Reference](envelope-schema.md) -- all 44 method IDs
- [Authentication Service Internals](../internals/authentication-service.md) -- server-side OPAQUE flow and session management
- [RPC Reference](node-service-schema.md) -- all proto definitions


@@ -1,193 +1,377 @@
# Delivery Schema
# Delivery and Keys Schema
**Schema file:** `schemas/delivery.capnp`
**File ID:** `@0xc5d9e2b4f1a83076`
**Proto files:** `proto/qpq/v1/delivery.proto`, `proto/qpq/v1/keys.proto`
**Package:** `qpq.v1`
**Method IDs:** 200-205 (delivery), 300-304 (key packages and hybrid keys), 510-520 (key transparency)
The `DeliveryService` interface defines the RPC contract for the store-and-forward message relay. The DS is intentionally MLS-unaware: it routes opaque byte strings by recipient key and optional channel ID without parsing or inspecting the content.
This page documents the Protobuf message definitions for the delivery service (store-and-forward message relay) and the key management service (MLS KeyPackages, hybrid post-quantum keys, and key transparency).
---
## Full schema listing
## delivery.proto
```capnp
# delivery.capnp -- Delivery Service RPC interface.
#
# The Delivery Service is a simple store-and-forward relay. It does not parse
# MLS messages -- all payloads are opaque byte strings routed by recipient key.
#
# Callers are responsible for:
# - Routing Welcome messages to the correct new member after add_members().
# - Routing Commit messages to any existing group members (other than self).
# - Routing Application messages to the intended recipient(s).
#
# The DS indexes queues by the recipient's raw Ed25519 public key (32 bytes),
# matching the indexing scheme used by the Authentication Service.
#
# ID generated with: capnp id
@0xc5d9e2b4f1a83076;
The delivery service is a store-and-forward relay. It is intentionally MLS-unaware: all payloads are opaque byte strings routed by recipient key and channel ID. The server never inspects or decrypts message content.
interface DeliveryService {
# Enqueue an opaque payload for delivery to a recipient.
#
# recipientKey : Ed25519 public key of the intended recipient (exactly 32 bytes).
# payload : Opaque byte string -- a TLS-encoded MlsMessageOut blob or any
# other framed data the application layer wants to deliver.
# channelId : Optional channel identifier (empty for legacy). A 16-byte UUID
# is recommended for 1:1 channels.
# version : Schema/wire version. Must be 0 (legacy) or 1 (this spec).
#
# The payload is appended to the recipient's FIFO queue. Returns immediately;
# the recipient retrieves it via `fetch`.
enqueue @0 (recipientKey :Data, payload :Data, channelId :Data, version :UInt16) -> ();
### Full proto listing
# Fetch and atomically drain all queued payloads for a given recipient.
#
# recipientKey : Ed25519 public key of the caller (exactly 32 bytes).
# channelId : Optional channel identifier (empty for legacy).
# version : Schema/wire version. Must be 0 (legacy) or 1 (this spec).
#
# Returns the complete queue in FIFO order and clears it. Returns an empty
# list if there are no pending messages.
fetch @1 (recipientKey :Data, channelId :Data, version :UInt16) -> (payloads :List(Data));
```protobuf
syntax = "proto3";
package qpq.v1;
// Delivery service: enqueue, fetch, peek, ack, batch (6 methods).
// Method IDs: 200-205.
message Envelope {
uint64 seq = 1;
bytes data = 2;
}
message EnqueueRequest {
bytes recipient_key = 1;
bytes payload = 2;
bytes channel_id = 3;
uint32 ttl_secs = 4;
// Client-generated idempotency key (16 bytes, UUID v7).
// Server deduplicates enqueue requests with the same message_id within a TTL window.
bytes message_id = 5;
}
message EnqueueResponse {
uint64 seq = 1;
bytes delivery_proof = 2;
// True if this was a duplicate enqueue (message_id already seen).
bool duplicate = 3;
}
message FetchRequest {
bytes recipient_key = 1;
bytes channel_id = 2;
uint32 limit = 3;
// Device ID for multi-device scoping.
bytes device_id = 4;
}
message FetchResponse {
repeated Envelope payloads = 1;
}
message FetchWaitRequest {
bytes recipient_key = 1;
bytes channel_id = 2;
uint64 timeout_ms = 3;
uint32 limit = 4;
bytes device_id = 5;
}
message FetchWaitResponse {
repeated Envelope payloads = 1;
}
message PeekRequest {
bytes recipient_key = 1;
bytes channel_id = 2;
uint32 limit = 3;
bytes device_id = 4;
}
message PeekResponse {
repeated Envelope payloads = 1;
}
message AckRequest {
bytes recipient_key = 1;
bytes channel_id = 2;
uint64 seq_up_to = 3;
bytes device_id = 4;
}
message AckResponse {}
message BatchEnqueueRequest {
repeated bytes recipient_keys = 1;
bytes payload = 2;
bytes channel_id = 3;
uint32 ttl_secs = 4;
bytes message_id = 5;
}
message BatchEnqueueResponse {
repeated uint64 seqs = 1;
}
```
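The `message_id` deduplication described in the listing above can be sketched in memory (illustrative only; the real store is persistent and expires entries after the TTL window):

```rust
use std::collections::HashMap;

/// Minimal idempotent enqueue: the first enqueue for a message_id assigns
/// a fresh seq; a duplicate returns the original seq with duplicate = true.
/// (In-memory sketch; TTL-based expiry of `seen` entries is omitted.)
struct Queue {
    next_seq: u64,
    seen: HashMap<Vec<u8>, u64>, // message_id -> assigned seq
}

impl Queue {
    fn new() -> Self {
        Queue { next_seq: 1, seen: HashMap::new() }
    }

    /// Returns (seq, duplicate) as in EnqueueResponse.
    fn enqueue(&mut self, message_id: &[u8]) -> (u64, bool) {
        if let Some(&seq) = self.seen.get(message_id) {
            return (seq, true);
        }
        let seq = self.next_seq;
        self.next_seq += 1;
        self.seen.insert(message_id.to_vec(), seq);
        (seq, false)
    }
}

fn main() {
    let mut q = Queue::new();
    assert_eq!(q.enqueue(b"id-1"), (1, false));
    assert_eq!(q.enqueue(b"id-1"), (1, true)); // duplicate suppressed
    assert_eq!(q.enqueue(b"id-2"), (2, false));
}
```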
---
### Envelope
The `Envelope` wrapper is returned by fetch, peek, and fetch-wait operations.
| Field | Type | Description |
|-------|------|-------------|
| `seq` | `uint64` | Server-assigned monotonic sequence number for ordering and acknowledgment. |
| `data` | `bytes` | The original payload bytes submitted at enqueue time. |
### Enqueue (ID 200)
Appends an opaque payload to a recipient's queue. Returns immediately.
**Request:**
| Field | Type | Description |
|-------|------|-------------|
| `recipient_key` | `bytes` | Recipient's Ed25519 identity public key (32 bytes). Primary queue index. |
| `payload` | `bytes` | Opaque byte string. Typically a TLS-encoded MLS ciphertext blob. |
| `channel_id` | `bytes` | Channel identifier (16-byte UUID v7 recommended). Empty = default channel. |
| `ttl_secs` | `uint32` | Time-to-live in seconds. Server garbage-collects expired messages. 0 = server default. |
| `message_id` | `bytes` | Client-generated idempotency key (16 bytes, UUID v7). Server deduplicates within the TTL window. |
**Response:**
| Field | Type | Description |
|-------|------|-------------|
| `seq` | `uint64` | Server-assigned sequence number for this message. |
| `delivery_proof` | `bytes` | Cryptographic proof of delivery (reserved for future use). |
| `duplicate` | `bool` | `true` if this `message_id` was already seen within the TTL window; the payload was not stored again. |
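The `duplicate` flag can be modelled as a server-side map from `message_id` to the previously assigned `seq`. A minimal sketch under that assumption (the type names are illustrative, not the server's actual storage layer, and TTL-based expiry of the map is elided):

```rust
use std::collections::HashMap;

/// Result of an enqueue: the assigned sequence number and whether the
/// message_id had already been seen.
pub struct EnqueueOutcome {
    pub seq: u64,
    pub duplicate: bool,
}

/// Per-queue state: a monotonic sequence counter plus an idempotency map.
/// A real server would also expire `seen` entries when their TTL elapses.
#[derive(Default)]
pub struct Queue {
    next_seq: u64,
    seen: HashMap<Vec<u8>, u64>, // message_id -> seq
}

impl Queue {
    pub fn enqueue(&mut self, message_id: &[u8]) -> EnqueueOutcome {
        if let Some(&seq) = self.seen.get(message_id) {
            // Same message_id within the TTL window: do not store again,
            // report the original seq back to the caller.
            return EnqueueOutcome { seq, duplicate: true };
        }
        self.next_seq += 1;
        self.seen.insert(message_id.to_vec(), self.next_seq);
        EnqueueOutcome { seq: self.next_seq, duplicate: false }
    }
}
```

Because the original `seq` is echoed back, a client that retries after a lost response can safely re-send the same `message_id` and learn the outcome of the first attempt.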
### Fetch (ID 201)
Returns and retains queued messages up to `limit`. Does not remove messages from the queue; use `Ack` to advance the read cursor.
**Request:**
| Field | Type | Description |
|-------|------|-------------|
| `recipient_key` | `bytes` | Recipient's Ed25519 identity public key (32 bytes). |
| `channel_id` | `bytes` | Channel identifier. Must match the value used at enqueue time. |
| `limit` | `uint32` | Maximum number of envelopes to return. 0 = server default. |
| `device_id` | `bytes` | Optional device identifier for multi-device queue scoping. |
**Response:**
| Field | Type | Description |
|-------|------|-------------|
| `payloads` | `repeated Envelope` | Messages in FIFO order. Empty list if no messages are pending. |
### FetchWait (ID 202)
Long-poll variant of `Fetch`. Blocks on the server until messages arrive or `timeout_ms` elapses.
**Request:**
| Field | Type | Description |
|-------|------|-------------|
| `recipient_key` | `bytes` | Recipient's Ed25519 identity public key (32 bytes). |
| `channel_id` | `bytes` | Channel identifier. |
| `timeout_ms` | `uint64` | Maximum wait time in milliseconds. 0 = return immediately (equivalent to Fetch). |
| `limit` | `uint32` | Maximum number of envelopes to return. |
| `device_id` | `bytes` | Optional device identifier. |
**Response:** Same as `FetchResponse`.
FetchWait eliminates polling latency: the server holds the RPC open until a `Notify` is signalled by a concurrent `Enqueue` call, or until `timeout_ms` expires.
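Server-side, this long-poll can be built on a condition variable that `Enqueue` signals. A stdlib sketch of that wait loop, assuming one queue per connection for simplicity (the real server multiplexes many queues and devices):

```rust
use std::collections::VecDeque;
use std::sync::{Condvar, Mutex};
use std::time::{Duration, Instant};

/// A queue whose consumers can block until a producer signals new data.
pub struct WaitQueue {
    inner: Mutex<VecDeque<Vec<u8>>>,
    notify: Condvar,
}

impl WaitQueue {
    pub fn new() -> Self {
        WaitQueue { inner: Mutex::new(VecDeque::new()), notify: Condvar::new() }
    }

    /// Enqueue a payload and wake any blocked FetchWait calls.
    pub fn enqueue(&self, payload: Vec<u8>) {
        self.inner.lock().unwrap().push_back(payload);
        self.notify.notify_all();
    }

    /// Block until at least one payload is available or `timeout` elapses.
    /// Like Fetch, this returns messages without removing them; a separate
    /// Ack advances the read cursor (not modelled here).
    pub fn fetch_wait(&self, timeout: Duration, limit: usize) -> Vec<Vec<u8>> {
        let deadline = Instant::now() + timeout;
        let mut guard = self.inner.lock().unwrap();
        while guard.is_empty() {
            let now = Instant::now();
            if now >= deadline {
                return Vec::new(); // timeout_ms elapsed with no messages
            }
            let (g, _res) = self.notify.wait_timeout(guard, deadline - now).unwrap();
            guard = g;
        }
        guard.iter().take(limit).cloned().collect()
    }
}
```

The deadline is recomputed on every wakeup so that spurious condvar wakeups do not extend the overall `timeout_ms` budget.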
### Peek (ID 203)
Non-destructive read. Returns messages without removing them and without advancing the acknowledgment cursor.
**Request / Response:** Same field layout as `FetchRequest` / `FetchResponse`.
Peek is useful for inspecting pending messages without marking them as delivered.
### Ack (ID 204)
Advances the delivery cursor, removing all messages with `seq <= seq_up_to` from the queue.
**Request:**
| Field | Type | Description |
|-------|------|-------------|
| `recipient_key` | `bytes` | Recipient's Ed25519 identity public key (32 bytes). |
| `channel_id` | `bytes` | Channel identifier. |
| `seq_up_to` | `uint64` | All messages with sequence number <= this value are removed. |
| `device_id` | `bytes` | Optional device identifier. |
**Response:** Empty (`AckResponse {}`).
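Together, Fetch and Ack form a cursor-based consumption protocol: the client fetches, processes, then acks the highest `seq` it has durably handled. A sketch of the queue bookkeeping (illustrative types, not the server's storage layer):

```rust
use std::collections::VecDeque;

/// A sequenced envelope, as returned by Fetch/FetchWait/Peek.
#[derive(Clone, PartialEq, Debug)]
pub struct Envelope {
    pub seq: u64,
    pub data: Vec<u8>,
}

/// One (recipient_key, channel_id) queue.
#[derive(Default)]
pub struct ChannelQueue {
    next_seq: u64,
    pending: VecDeque<Envelope>,
}

impl ChannelQueue {
    pub fn enqueue(&mut self, data: Vec<u8>) -> u64 {
        self.next_seq += 1;
        self.pending.push_back(Envelope { seq: self.next_seq, data });
        self.next_seq
    }

    /// Fetch: return up to `limit` envelopes without removing them.
    pub fn fetch(&self, limit: usize) -> Vec<Envelope> {
        self.pending.iter().take(limit).cloned().collect()
    }

    /// Ack: drop everything with seq <= seq_up_to.
    pub fn ack(&mut self, seq_up_to: u64) {
        while self.pending.front().map_or(false, |e| e.seq <= seq_up_to) {
            self.pending.pop_front();
        }
    }
}
```

If the client crashes between fetch and ack, the messages are simply returned again on the next fetch, which is why crash recovery reduces to "replay anything above my last acked seq".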
### BatchEnqueue (ID 205)
Fan-out: enqueues the same payload to multiple recipients in a single RPC call.
**Request:**
| Field | Type | Description |
|-------|------|-------------|
| `recipient_keys` | `repeated bytes` | List of recipient Ed25519 identity public keys. |
| `payload` | `bytes` | Opaque payload, delivered identically to all recipients. |
| `channel_id` | `bytes` | Channel identifier. |
| `ttl_secs` | `uint32` | Time-to-live in seconds. |
| `message_id` | `bytes` | Idempotency key (16 bytes). |
**Response:**
| Field | Type | Description |
|-------|------|-------------|
| `seqs` | `repeated uint64` | Server-assigned sequence numbers, one per `recipient_key`, in the same order. |
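The ordering invariant is that `seqs[i]` is the sequence number assigned on `recipient_keys[i]`'s queue. A self-contained sketch of that contract (payload storage elided; types are illustrative):

```rust
use std::collections::HashMap;

/// Per-recipient sequence counters; the stored payloads are elided.
#[derive(Default)]
pub struct FanOut {
    next_seq: HashMap<Vec<u8>, u64>,
}

impl FanOut {
    /// Enqueue the same payload for every recipient, returning one
    /// server-assigned seq per recipient key, in input order.
    pub fn batch_enqueue(&mut self, recipient_keys: &[Vec<u8>], _payload: &[u8]) -> Vec<u64> {
        recipient_keys
            .iter()
            .map(|key| {
                let seq = self.next_seq.entry(key.clone()).or_insert(0);
                *seq += 1;
                *seq
            })
            .collect()
    }
}
```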
---
## keys.proto
Key management for MLS KeyPackages, hybrid post-quantum keys, and key transparency audit.
### Full proto listing
```protobuf
syntax = "proto3";
package qpq.v1;
// Key package + hybrid key CRUD (5 methods).
// Method IDs: 300-304.
message UploadKeyPackageRequest {
bytes identity_key = 1;
bytes package = 2;
}
message UploadKeyPackageResponse {
bytes fingerprint = 1;
}
message FetchKeyPackageRequest {
bytes identity_key = 1;
}
message FetchKeyPackageResponse {
bytes package = 1;
}
message UploadHybridKeyRequest {
bytes identity_key = 1;
bytes hybrid_public_key = 2;
}
message UploadHybridKeyResponse {}
message FetchHybridKeyRequest {
bytes identity_key = 1;
}
message FetchHybridKeyResponse {
bytes hybrid_public_key = 1;
}
message FetchHybridKeysRequest {
repeated bytes identity_keys = 1;
}
message FetchHybridKeysResponse {
repeated bytes keys = 1;
}
// Key revocation (method ID 510).
message RevokeKeyRequest {
bytes identity_key = 1;
string reason = 2; // "compromised", "superseded", "user_revoked"
}
message RevokeKeyResponse {
bool success = 1;
uint64 leaf_index = 2;
}
// Check revocation status (method ID 511).
message CheckRevocationRequest {
bytes identity_key = 1;
}
message CheckRevocationResponse {
bool revoked = 1;
string reason = 2;
uint64 timestamp_ms = 3;
}
// KT audit log retrieval (method ID 520).
message AuditKeyTransparencyRequest {
uint64 start = 1;
uint64 end = 2;
}
message AuditKeyTransparencyResponse {
repeated LogEntry entries = 1;
uint64 tree_size = 2;
bytes root = 3;
}
message LogEntry {
uint64 index = 1;
bytes leaf_hash = 2;
}
```
### UploadKeyPackage (ID 300)
Uploads a single-use MLS KeyPackage. KeyPackages are stored in a FIFO queue per identity; each is consumed once by `FetchKeyPackage`.
| Field | Type | Description |
|-------|------|-------------|
| `identity_key` | `bytes` | Uploader's Ed25519 identity public key (32 bytes). Index key for the queue. |
| `package` | `bytes` | openmls-serialised KeyPackage (bincode format, as required by `DiskKeyStore`). |
Response: `fingerprint` -- SHA-256 digest of the stored package (32 bytes). Callers should record this to detect tampering.
### FetchKeyPackage (ID 301)
Fetches and atomically removes one KeyPackage for the given identity. Returns empty bytes if no packages are stored. The removal is atomic; concurrent fetches will not receive the same package.
### UploadHybridKey (ID 302)
Uploads the client's hybrid (X25519 + ML-KEM-768) public key. Unlike KeyPackages, hybrid keys are not single-use -- each identity stores exactly one, overwriting the previous value.
| Field | Type | Description |
|-------|------|-------------|
| `identity_key` | `bytes` | Uploader's Ed25519 identity public key (32 bytes). |
| `hybrid_public_key` | `bytes` | Concatenated X25519 public key (32 bytes) + ML-KEM-768 encapsulation key. |
### FetchHybridKey (ID 303)
Fetches a single peer's hybrid public key. Non-destructive.
### FetchHybridKeys (ID 304)
Batch variant of `FetchHybridKey`. Returns one key per input identity key, in the same order. Missing keys are returned as empty bytes at the corresponding index.
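Since `hybrid_public_key` is a plain concatenation, consumers split it at the fixed X25519 length. A sketch, assuming the ML-KEM-768 encapsulation key is the FIPS 203 size of 1184 bytes (this constant is from the standard, not from this codebase):

```rust
/// X25519 public keys are always 32 bytes; ML-KEM-768 encapsulation keys
/// are 1184 bytes (FIPS 203). Total hybrid key: 1216 bytes.
pub const X25519_LEN: usize = 32;
pub const ML_KEM_768_EK_LEN: usize = 1184;

/// Split a concatenated hybrid public key into its two components.
/// Returns None if the blob is not exactly the expected length.
pub fn split_hybrid_key(blob: &[u8]) -> Option<(&[u8], &[u8])> {
    if blob.len() != X25519_LEN + ML_KEM_768_EK_LEN {
        return None;
    }
    Some(blob.split_at(X25519_LEN))
}
```

Validating the total length before splitting rejects truncated or padded uploads early, before any key material reaches the KEM layer.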
### RevokeKey (ID 510)
Revokes an identity key by appending a revocation entry to the key transparency Merkle log.
| Field | Type | Description |
|-------|------|-------------|
| `identity_key` | `bytes` | Identity key to revoke (32 bytes). |
| `reason` | `string` | One of: `"compromised"`, `"superseded"`, `"user_revoked"`. |
Response: `leaf_index` is the index of the revocation entry in the KT Merkle log.
### CheckRevocation (ID 511)
Checks whether an identity key has been revoked.
Response fields: `revoked` (bool), `reason` (string), `timestamp_ms` (uint64 unix milliseconds of the revocation event).
### AuditKeyTransparency (ID 520)
Returns a range of entries from the key transparency append-only Merkle log.
| Field | Type | Description |
|-------|------|-------------|
| `start` | `uint64` | First leaf index (inclusive). |
| `end` | `uint64` | Last leaf index (exclusive). 0 = up to current tree size. |
Response: `entries` (list of `LogEntry`), `tree_size` (current log size), `root` (Merkle root hash).
---
## Further reading
- [Wire Format Overview](overview.md) -- frame format and transport parameters
- [Method ID Reference](envelope-schema.md) -- all 44 method IDs
- [Auth Schema](auth-schema.md) -- OPAQUE authentication proto definitions
- [RPC Reference](node-service-schema.md) -- all proto definitions for all 14 files
- [Storage Backend](../internals/storage-backend.md) -- how KeyPackages and hybrid keys are persisted

View File

@@ -1,149 +1,208 @@
# Method ID Reference
The v2 RPC protocol dispatches requests by a `u16` method ID encoded in the first two bytes of every request frame. This page is the authoritative reference for all 44 method IDs and their corresponding Protobuf message types.
Method IDs are defined in `crates/quicproquo-proto/src/lib.rs` (the `method_ids` module). Proto definitions live in `proto/qpq/v1/`.
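In practice, dispatch means reading the first two big-endian bytes of a request frame and matching on the constant. A sketch of that lookup -- the constant values follow this page, but the handler names are illustrative stand-ins, not the real module in `crates/quicproquo-proto`:

```rust
/// A few of the method ID constants documented on this page.
pub const OPAQUE_REGISTER_START: u16 = 100;
pub const ENQUEUE: u16 = 200;
pub const HEALTH: u16 = 802;

/// Read the u16 method ID from the first two bytes of a request frame.
/// Returns None for frames shorter than two bytes.
pub fn method_id(frame: &[u8]) -> Option<u16> {
    let bytes: [u8; 2] = frame.get(..2)?.try_into().ok()?;
    Some(u16::from_be_bytes(bytes))
}

/// Map a method ID to a handler name (stand-in for real dispatch).
pub fn dispatch(id: u16) -> &'static str {
    match id {
        OPAQUE_REGISTER_START => "auth::register_start",
        ENQUEUE => "delivery::enqueue",
        HEALTH => "p2p::health",
        _ => "error::unknown_method",
    }
}
```

Unknown IDs fall through to an error path rather than panicking, which is what makes the "IDs are never reused" policy below safe to rely on: an old server simply rejects a method it predates.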
---
## Auth (100-103)
OPAQUE asymmetric password-authenticated key exchange. Both registration and login complete in two round trips (start + finish). See [Auth Schema](auth-schema.md) for proto definitions and [Authentication Service](../internals/authentication-service.md) for flow diagrams.
| ID | Constant | Request | Response |
|----|----------|---------|---------|
| 100 | `OPAQUE_REGISTER_START` | `OpaqueRegisterStartRequest` | `OpaqueRegisterStartResponse` |
| 101 | `OPAQUE_REGISTER_FINISH` | `OpaqueRegisterFinishRequest` | `OpaqueRegisterFinishResponse` |
| 102 | `OPAQUE_LOGIN_START` | `OpaqueLoginStartRequest` | `OpaqueLoginStartResponse` |
| 103 | `OPAQUE_LOGIN_FINISH` | `OpaqueLoginFinishRequest` | `OpaqueLoginFinishResponse` |
---
## Delivery (200-205)
Store-and-forward message relay. The server is MLS-unaware: payloads are opaque byte strings routed by recipient key and channel ID. See [Delivery Schema](delivery-schema.md) for proto definitions.
| ID | Constant | Request | Response |
|----|----------|---------|---------|
| 200 | `ENQUEUE` | `EnqueueRequest` | `EnqueueResponse` |
| 201 | `FETCH` | `FetchRequest` | `FetchResponse` |
| 202 | `FETCH_WAIT` | `FetchWaitRequest` | `FetchWaitResponse` |
| 203 | `PEEK` | `PeekRequest` | `PeekResponse` |
| 204 | `ACK` | `AckRequest` | `AckResponse` |
| 205 | `BATCH_ENQUEUE` | `BatchEnqueueRequest` | `BatchEnqueueResponse` |
---
## Keys (300-304)
MLS KeyPackage and hybrid post-quantum key management. See [Delivery Schema](delivery-schema.md) for proto definitions (keys are defined in `keys.proto`).
| ID | Constant | Request | Response |
|----|----------|---------|---------|
| 300 | `UPLOAD_KEY_PACKAGE` | `UploadKeyPackageRequest` | `UploadKeyPackageResponse` |
| 301 | `FETCH_KEY_PACKAGE` | `FetchKeyPackageRequest` | `FetchKeyPackageResponse` |
| 302 | `UPLOAD_HYBRID_KEY` | `UploadHybridKeyRequest` | `UploadHybridKeyResponse` |
| 303 | `FETCH_HYBRID_KEY` | `FetchHybridKeyRequest` | `FetchHybridKeyResponse` |
| 304 | `FETCH_HYBRID_KEYS` | `FetchHybridKeysRequest` | `FetchHybridKeysResponse` |
---
## Channel (400)
Direct-message channel creation. Returns a deterministic channel ID for a given peer key pair, with deduplication.
| ID | Constant | Request | Response |
|----|----------|---------|---------|
| 400 | `CREATE_CHANNEL` | `CreateChannelRequest` | `CreateChannelResponse` |
---
## Group Management (410-413)
MLS group operations: member removal, metadata updates, member listing, and key rotation.
| ID | Constant | Request | Response |
|----|----------|---------|---------|
| 410 | `REMOVE_MEMBER` | `RemoveMemberRequest` | `RemoveMemberResponse` |
| 411 | `UPDATE_GROUP_METADATA` | `UpdateGroupMetadataRequest` | `UpdateGroupMetadataResponse` |
| 412 | `LIST_GROUP_MEMBERS` | `ListGroupMembersRequest` | `ListGroupMembersResponse` |
| 413 | `ROTATE_KEYS` | `RotateKeysRequest` | `RotateKeysResponse` |
---
## Moderation (420-424)
Content moderation: encrypted reports, user bans, and audit lists. Admin-only methods require elevated session privileges.
| ID | Constant | Request | Response |
|----|----------|---------|---------|
| 420 | `REPORT_MESSAGE` | `ReportMessageRequest` | `ReportMessageResponse` |
| 421 | `BAN_USER` | `BanUserRequest` | `BanUserResponse` |
| 422 | `UNBAN_USER` | `UnbanUserRequest` | `UnbanUserResponse` |
| 423 | `LIST_REPORTS` | `ListReportsRequest` | `ListReportsResponse` |
| 424 | `LIST_BANNED` | `ListBannedRequest` | `ListBannedResponse` |
---
## User / Identity (500-501)
Forward and reverse user resolution. `ResolveUser` returns the identity key with a key-transparency inclusion proof.
| ID | Constant | Request | Response |
|----|----------|---------|---------|
| 500 | `RESOLVE_USER` | `ResolveUserRequest` | `ResolveUserResponse` |
| 501 | `RESOLVE_IDENTITY` | `ResolveIdentityRequest` | `ResolveIdentityResponse` |
---
## Key Transparency (510-520)
Key revocation and audit log access for the Merkle-based key transparency log.
| ID | Constant | Request | Response |
|----|----------|---------|---------|
| 510 | `REVOKE_KEY` | `RevokeKeyRequest` | `RevokeKeyResponse` |
| 511 | `CHECK_REVOCATION` | `CheckRevocationRequest` | `CheckRevocationResponse` |
| 520 | `AUDIT_KEY_TRANSPARENCY` | `AuditKeyTransparencyRequest` | `AuditKeyTransparencyResponse` |
---
## Blob Storage (600-601)
Content-addressed binary object storage with chunked upload and ranged download.
| ID | Constant | Request | Response |
|----|----------|---------|---------|
| 600 | `UPLOAD_BLOB` | `UploadBlobRequest` | `UploadBlobResponse` |
| 601 | `DOWNLOAD_BLOB` | `DownloadBlobRequest` | `DownloadBlobResponse` |
---
## Device Management (700-702, 710)
Multi-device registration, listing, and revocation. Method 710 registers a platform push notification token.
| ID | Constant | Request | Response |
|----|----------|---------|---------|
| 700 | `REGISTER_DEVICE` | `RegisterDeviceRequest` | `RegisterDeviceResponse` |
| 701 | `LIST_DEVICES` | `ListDevicesRequest` | `ListDevicesResponse` |
| 702 | `REVOKE_DEVICE` | `RevokeDeviceRequest` | `RevokeDeviceResponse` |
| 710 | `REGISTER_PUSH_TOKEN` | `RegisterPushTokenRequest` | `RegisterPushTokenResponse` |
---
## Recovery (750-752)
Encrypted account recovery bundle storage. The server stores an opaque blob indexed by `SHA-256(recovery_token)`; the plaintext is never visible to the server.
| ID | Constant | Request | Response |
|----|----------|---------|---------|
| 750 | `STORE_RECOVERY_BUNDLE` | `StoreRecoveryBundleRequest` | `StoreRecoveryBundleResponse` |
| 751 | `FETCH_RECOVERY_BUNDLE` | `FetchRecoveryBundleRequest` | `FetchRecoveryBundleResponse` |
| 752 | `DELETE_RECOVERY_BUNDLE` | `DeleteRecoveryBundleRequest` | `DeleteRecoveryBundleResponse` |
---
## P2P / Health (800-802)
iroh P2P node address exchange and server health probe.
| ID | Constant | Request | Response |
|----|----------|---------|---------|
| 800 | `PUBLISH_ENDPOINT` | `PublishEndpointRequest` | `PublishEndpointResponse` |
| 801 | `RESOLVE_ENDPOINT` | `ResolveEndpointRequest` | `ResolveEndpointResponse` |
| 802 | `HEALTH` | `HealthRequest` | `HealthResponse` |
---
## Federation (900-905)
Cross-server relay for messages, key packages, and user resolution. All federation methods include a `FederationAuth` struct carrying the origin server domain.
| ID | Constant | Request | Response |
|----|----------|---------|---------|
| 900 | `RELAY_ENQUEUE` | `RelayEnqueueRequest` | `RelayEnqueueResponse` |
| 901 | `RELAY_BATCH_ENQUEUE` | `RelayBatchEnqueueRequest` | `RelayBatchEnqueueResponse` |
| 902 | `PROXY_FETCH_KEY_PACKAGE` | `ProxyFetchKeyPackageRequest` | `ProxyFetchKeyPackageResponse` |
| 903 | `PROXY_FETCH_HYBRID_KEY` | `ProxyFetchHybridKeyRequest` | `ProxyFetchHybridKeyResponse` |
| 904 | `PROXY_RESOLVE_USER` | `ProxyResolveUserRequest` | `ProxyResolveUserResponse` |
| 905 | `FEDERATION_HEALTH` | `FederationHealthRequest` | `FederationHealthResponse` |
---
## Account (950)
| ID | Constant | Request | Response |
|----|----------|---------|---------|
| 950 | `DELETE_ACCOUNT` | `DeleteAccountRequest` | `DeleteAccountResponse` |
---
## Push Event Types (1000+)
Push events are sent by the server on QUIC uni-streams using the push frame format. They are not RPC methods (no `request_id`), but share the same event type namespace.
| ID | Constant | Payload |
|----|----------|---------|
| 1000 | `PUSH_NEW_MESSAGE` | `NewMessage` |
| 1001 | `PUSH_TYPING` | `TypingIndicator` |
| 1002 | `PUSH_PRESENCE` | `PresenceUpdate` |
| 1003 | `PUSH_MEMBERSHIP` | `GroupMembershipChange` |
Push payload messages are defined in `proto/qpq/v1/push.proto` and wrapped in a `PushEvent` oneof. See [RPC Reference](node-service-schema.md) for the full proto listing.
---
## Method ID assignment policy
Method IDs are stable across versions. Once assigned, an ID is never reused. New methods are assigned the next available ID in their logical category. Gaps in the numbering are reserved for future use within a category.
---
## Further reading
- [Wire Format Overview](overview.md) -- frame format and transport parameters
- [Auth Schema](auth-schema.md) -- OPAQUE proto definitions (IDs 100-103)
- [Delivery Schema](delivery-schema.md) -- delivery + keys proto definitions (IDs 200-304)
- [RPC Reference](node-service-schema.md) -- all proto definitions for all 14 files

File diff suppressed because it is too large Load Diff

View File

@@ -1,6 +1,6 @@
# Wire Format Overview
This section documents the v2 serialisation pipeline that transforms application-level data structures into bytes on the wire. Every byte exchanged between quicproquo clients and the server passes through this pipeline, so understanding it is prerequisite to reading the protocol deep dives or the server and client source code.
---
@@ -9,69 +9,171 @@ This section documents the serialisation pipeline that transforms application-le
Data flows through three stages on the send path. The receive path reverses the order.
```text
  Stage 1                 Stage 2                    Stage 3
  --------                --------                   --------
  Application             Protobuf                   Transport
  data                    serialisation              encryption

  RPC call                prost::encode()            QUIC/TLS 1.3
                          + binary frame header

      |                        |                          |
      v                        v                          v
  Rust structs            10-byte (request) or        Encrypted
  & method                9-byte (response)           ciphertext
  invocations             binary header +             on the wire
                          protobuf payload
```
### Stage 1: Application creates a message or RPC call
At the application layer, the client or server constructs a typed Protobuf message defined in `proto/qpq/v1/*.proto`. Each RPC method has a corresponding request and response message type.
- **Auth methods** (IDs 100-103): see [Auth Schema](auth-schema.md)
- **Delivery methods** (IDs 200-205): see [Delivery Schema](delivery-schema.md)
- **All methods**: see [Method ID Reference](envelope-schema.md)
- **Full RPC reference**: see [RPC Reference](node-service-schema.md)
### Stage 2: Binary framing + Protobuf serialisation
The v2 protocol defines three frame types, each with a compact binary header followed by a Protobuf-encoded payload. All multi-byte integers are **big-endian**.
#### Request frame (client to server, bidirectional stream)
```text
   0      1      2      3      4      5      6      7      8      9
+------+------+------+------+------+------+------+------+------+------+
| method_id   |       request_id          |       payload_len         |
+------+------+------+------+------+------+------+------+------+------+
| protobuf payload ...
+------...
```
| Offset | Field | Type | Description |
|--------|-------|------|-------------|
| 0-1 | `method_id` | `u16 BE` | RPC method identifier |
| 2-5 | `request_id` | `u32 BE` | Client-generated correlation ID |
| 6-9 | `payload_len` | `u32 BE` | Length of the protobuf payload in bytes |
| 10+ | payload | bytes | Protobuf-encoded request message |
Header size: **10 bytes**.
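The layout above can be sketched in a few lines of Rust. This is an illustrative helper only, not the actual `quicproquo-rpc` API; the authoritative implementation lives in `framing.rs`.

```rust
/// Illustrative sketch: build a v2 request frame (10-byte big-endian header
/// followed by the protobuf payload). Not the real `quicproquo-rpc` API.
fn encode_request(method_id: u16, request_id: u32, payload: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(10 + payload.len());
    frame.extend_from_slice(&method_id.to_be_bytes());              // bytes 0-1
    frame.extend_from_slice(&request_id.to_be_bytes());             // bytes 2-5
    frame.extend_from_slice(&(payload.len() as u32).to_be_bytes()); // bytes 6-9
    frame.extend_from_slice(payload);                               // bytes 10+
    frame
}

fn main() {
    // Hypothetical delivery call (method ID 200) with a 2-byte payload.
    let frame = encode_request(200, 1, &[0x0a, 0x00]);
    assert_eq!(frame.len(), 12);              // 10-byte header + 2-byte payload
    assert_eq!(&frame[0..2], &[0x00, 0xc8]);  // method_id 200 as u16 BE
    assert_eq!(&frame[6..10], &[0, 0, 0, 2]); // payload_len = 2
    println!("{:02x?}", frame);
}
```

In a real client the payload bytes would come from `prost::Message::encode` on the request message type.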
#### Response frame (server to client, same bidirectional stream)
```text
   0      1      2      3      4      5      6      7      8
+------+------+------+------+------+------+------+------+------+
|status|       request_id          |       payload_len         |
+------+------+------+------+------+------+------+------+------+
| protobuf payload ...
+------...
```
| Offset | Field | Type | Description |
|--------|-------|------|-------------|
| 0 | `status` | `u8` | RPC status code (0 = OK) |
| 1-4 | `request_id` | `u32 BE` | Echoes the request correlation ID |
| 5-8 | `payload_len` | `u32 BE` | Length of the protobuf payload in bytes |
| 9+ | payload | bytes | Protobuf-encoded response message |
Header size: **9 bytes**.
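Parsing the response header is the mirror image of encoding. The sketch below is illustrative (the real parser is in `framing.rs`); it returns `None` for truncated frames.

```rust
/// Illustrative sketch: parse a v2 response frame (9-byte big-endian header).
/// Returns (status, request_id, payload), or None if the frame is truncated.
fn decode_response(frame: &[u8]) -> Option<(u8, u32, &[u8])> {
    let header = frame.get(..9)?;
    let status = header[0];
    let request_id = u32::from_be_bytes(header[1..5].try_into().ok()?);
    let len = u32::from_be_bytes(header[5..9].try_into().ok()?) as usize;
    let payload = frame.get(9..9 + len)?; // reject frames shorter than declared
    Some((status, request_id, payload))
}

fn main() {
    // status = 0 (OK), request_id = 7, 1-byte payload.
    let frame = [0u8, 0, 0, 0, 7, 0, 0, 0, 1, 0xAB];
    let (status, request_id, payload) = decode_response(&frame).unwrap();
    assert_eq!((status, request_id), (0, 7));
    assert_eq!(payload, &[0xAB]);
}
```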
#### Push frame (server to client, QUIC uni-stream)
```text
   0      1      2      3      4      5
+------+------+------+------+------+------+
| event_type  |       payload_len         |
+------+------+------+------+------+------+
| protobuf payload ...
+------...
```
| Offset | Field | Type | Description |
|--------|-------|------|-------------|
| 0-1 | `event_type` | `u16 BE` | Push event type identifier |
| 2-5 | `payload_len` | `u32 BE` | Length of the protobuf payload in bytes |
| 6+ | payload | bytes | Protobuf-encoded push event message |
Header size: **6 bytes**.
#### Limits
| Constraint | Value |
|------------|-------|
| Maximum payload size | 4 MiB (4,194,304 bytes) |
| Payloads exceeding this limit | rejected with `PayloadTooLarge` error |
Source: `crates/quicproquo-rpc/src/framing.rs`.
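The cap is enforced on the declared `payload_len` before any payload bytes are read or buffered. A minimal sketch of that check (names are illustrative; the real check lives in `framing.rs`):

```rust
/// Maximum protobuf payload size: 4 MiB.
const MAX_PAYLOAD_LEN: u32 = 4 * 1024 * 1024;

#[derive(Debug, PartialEq)]
enum FrameError {
    PayloadTooLarge,
}

/// Validate a declared payload length before allocating or reading it.
fn check_payload_len(declared: u32) -> Result<u32, FrameError> {
    if declared > MAX_PAYLOAD_LEN {
        Err(FrameError::PayloadTooLarge)
    } else {
        Ok(declared)
    }
}

fn main() {
    assert_eq!(check_payload_len(4_194_304), Ok(4_194_304)); // exactly 4 MiB: OK
    assert_eq!(check_payload_len(4_194_305), Err(FrameError::PayloadTooLarge));
}
```

Checking the header field first means a malicious peer cannot force the server to allocate an oversized buffer before the frame is rejected.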
### Stage 3: Transport encryption
The framed bytes are encrypted by the QUIC/TLS 1.3 transport layer. QUIC provides stream framing and length delimitation; the RPC framework reads exactly `header_size + payload_len` bytes per frame.
| Transport | Encryption | Authentication |
|-----------|------------|----------------|
| QUIC + TLS 1.3 | AES-128-GCM or ChaCha20-Poly1305 (negotiated by TLS) | Server cert (rustls/quinn) |
The transport layer treats the framed payload as opaque bytes, so the serialisation format and the transport can evolve independently.
---
## QUIC stream model
Each RPC call uses a **dedicated QUIC bidirectional stream**:
1. Client opens a new bidirectional stream.
2. Client sends one request frame and closes the write end.
3. Server reads the request, dispatches by `method_id`, and sends one response frame.
4. Server closes the write end.
Push events (server-initiated) are sent on **QUIC uni-streams** opened by the server. There is no request correlation ID in push frames.
This design allows unlimited concurrent RPCs with no head-of-line blocking.
---
## Connection parameters
| Parameter | Value |
|-----------|-------|
| Protocol | QUIC (RFC 9000) |
| ALPN | `"qpq"` |
| Default port | 5001 |
| TLS version | 1.3 only |
| Certificate | Server presents a TLS certificate; clients verify against a CA cert |
---
## Schema index
Protobuf schemas are defined in `proto/qpq/v1/` and documented on dedicated pages:
| Proto File | Documentation | Purpose |
|------------|---------------|---------|
| `auth.proto` | [Auth Schema](auth-schema.md) | OPAQUE registration and login (IDs 100-103) |
| `delivery.proto` | [Delivery Schema](delivery-schema.md) | Message delivery (IDs 200-205) |
| `keys.proto` | [Delivery Schema](delivery-schema.md) | Key packages and hybrid keys (IDs 300-304, 510-520) |
| `channel.proto` | [RPC Reference](node-service-schema.md) | Channel creation (ID 400) |
| `group.proto` | [RPC Reference](node-service-schema.md) | Group management (IDs 410-413) |
| `moderation.proto` | [RPC Reference](node-service-schema.md) | Content moderation (IDs 420-424) |
| `user.proto` | [RPC Reference](node-service-schema.md) | User resolution (IDs 500-501) |
| `blob.proto` | [RPC Reference](node-service-schema.md) | Blob storage (IDs 600-601) |
| `device.proto` | [RPC Reference](node-service-schema.md) | Device management (IDs 700-702, 710) |
| `recovery.proto` | [RPC Reference](node-service-schema.md) | Account recovery (IDs 750-752) |
| `p2p.proto` | [RPC Reference](node-service-schema.md) | P2P endpoints and health (IDs 800-802) |
| `federation.proto` | [RPC Reference](node-service-schema.md) | Cross-server relay (IDs 900-905) |
| `push.proto` | [RPC Reference](node-service-schema.md) | Push event types (IDs 1000+) |
| `common.proto` | [RPC Reference](node-service-schema.md) | Auth context, account deletion (ID 950) |
Method ID assignment: `crates/quicproquo-proto/src/lib.rs` (`method_ids` module).
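The ID ranges in the table above suggest a dispatch shape like the following. This is a sketch mirroring the documented ranges, not the actual dispatcher in the server crate.

```rust
/// Map a method_id to its proto family, mirroring the ID ranges listed above.
/// Illustrative only; real dispatch lives in the server crate.
fn method_family(method_id: u16) -> &'static str {
    match method_id {
        100..=103 => "auth",
        200..=205 => "delivery",
        300..=304 | 510..=520 => "keys",
        400 => "channel",
        410..=413 => "group",
        420..=424 => "moderation",
        500..=501 => "user",
        600..=601 => "blob",
        700..=702 | 710 => "device",
        750..=752 => "recovery",
        800..=802 => "p2p",
        900..=905 => "federation",
        950 => "common",
        1000..=u16::MAX => "push",
        _ => "unknown",
    }
}

fn main() {
    assert_eq!(method_family(102), "auth");
    assert_eq!(method_family(515), "keys");
    assert_eq!(method_family(999), "unknown");
}
```

An unknown `method_id` should map to an error status in the response frame rather than closing the connection, so that old servers degrade gracefully when new methods are added.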
---
## Further reading
- [Architecture Overview](../architecture/overview.md) -- system-level view of how services compose
- [Protocol Layers Overview](../protocol-layers/overview.md) -- how transport, framing, and E2E encryption stack
- [Method ID Reference](envelope-schema.md) -- complete table of all 44 RPC methods
- [Auth Schema](auth-schema.md) -- OPAQUE authentication proto definitions
- [Delivery Schema](delivery-schema.md) -- message delivery proto definitions
- [RPC Reference](node-service-schema.md) -- complete proto definitions for all 14 files