diff --git a/docs/src/SUMMARY.md b/docs/src/SUMMARY.md index ba21db5..9753263 100644 --- a/docs/src/SUMMARY.md +++ b/docs/src/SUMMARY.md @@ -23,12 +23,6 @@ - [TLS in quicproquo](getting-started/tls.md) - [Certificate Lifecycle and CA-Signed TLS](getting-started/certificate-lifecycle.md) - [Docker Deployment](getting-started/docker.md) -- [Go SDK](getting-started/go-sdk.md) -- [TypeScript SDK and Browser Demo](getting-started/typescript-sdk.md) -- [C FFI Bindings](getting-started/ffi.md) -- [WASM Integration](getting-started/wasm.md) -- [Bot SDK](getting-started/bot-sdk.md) -- [Code Generators (qpq-gen)](getting-started/generators.md) - [Mesh Networking](getting-started/mesh-networking.md) - [Demo Walkthrough: Alice and Bob](getting-started/demo-walkthrough.md) @@ -48,12 +42,24 @@ - [Protocol Layers Overview](protocol-layers/overview.md) - [QUIC + TLS 1.3](protocol-layers/quic-tls.md) -- [Cap'n Proto Serialisation and RPC](protocol-layers/capn-proto.md) +- [Protobuf Framing](protocol-layers/capn-proto.md) - [MLS (RFC 9420)](protocol-layers/mls.md) - [Hybrid KEM: X25519 + ML-KEM-768](protocol-layers/hybrid-kem.md) --- +# Client SDKs + +- [SDK Overview](sdk/index.md) +- [Wire Format Reference](sdk/wire-format.md) +- [Rust SDK](sdk/rust.md) +- [Go SDK](getting-started/go-sdk.md) +- [TypeScript SDK and Browser Demo](getting-started/typescript-sdk.md) +- [C FFI Bindings](getting-started/ffi.md) +- [WASM Integration](getting-started/wasm.md) + +--- + # Cryptographic Properties - [Cryptography Overview](cryptography/overview.md) @@ -97,15 +103,23 @@ --- -# Roadmap and Research +# Roadmap - [Milestone Tracker](roadmap/milestones.md) -- [Phase 2 + M4–M6 Roadmap](roadmap/phase2-and-m4-m6.md) +- [Phase 2 + M4-M6 Roadmap](roadmap/phase2-and-m4-m6.md) - [Production Readiness WBS](roadmap/production-readiness.md) - [Auth, Devices, and Tokens](roadmap/authz-plan.md) - [1:1 Channel Design](roadmap/dm-channels.md) - [Future Research Directions](roadmap/future-research.md) -- [Full 
Roadmap (Phases 1–8)](../../ROADMAP.md) +- [Full Roadmap (Phases 1-8)](../../ROADMAP.md) + +--- + +# Operations + +- [Monitoring](operations/monitoring.md) +- [Backup and Restore](operations/backup-restore.md) +- [Scaling Guide](operations/scaling-guide.md) --- diff --git a/docs/src/appendix/glossary.md b/docs/src/appendix/glossary.md index 959154c..6986c2e 100644 --- a/docs/src/appendix/glossary.md +++ b/docs/src/appendix/glossary.md @@ -12,18 +12,23 @@ AES-128-GCM (in the MLS ciphersuite). See [Cryptography Overview](../cryptograph **ALPN** -- Application-Layer Protocol Negotiation. A TLS extension that allows the client and server to agree on an application protocol during the TLS -handshake. quicproquo uses the ALPN token `b"capnp"` to identify Cap'n Proto -RPC connections. See [QUIC + TLS 1.3](../protocol-layers/quic-tls.md). +handshake. quicproquo v2 uses the ALPN token `"qpq"` (replacing the legacy +`"capnp"` token used in v1). See [Wire Format Overview](../wire-format/overview.md). -**AS** -- Authentication Service. The server component that stores and -distributes single-use MLS KeyPackages. Clients upload KeyPackages after identity -generation; peers fetch them to add new members to a group. -See [Architecture Overview](../architecture/overview.md). +**Argon2id** -- A memory-hard password hashing and key derivation function +(winner of the Password Hashing Competition, 2015). quicproquo uses Argon2id +to derive the SQLCipher encryption key from the server's passphrase, and +optionally for client-side key derivation. See [Storage Backend](../internals/storage-backend.md). + +**AS** -- Authentication Service. The server component that handles OPAQUE +registration and login, stores single-use MLS KeyPackages, and manages hybrid +post-quantum public keys. See [Authentication Service Internals](../internals/authentication-service.md). **Cap'n Proto** -- A zero-copy serialisation format with a built-in RPC system. 
-quicproquo uses Cap'n Proto for all wire messages and service RPCs. Schemas -live in `schemas/*.capnp` and are compiled to Rust at build time. -See [Cap'n Proto Serialisation and RPC](../protocol-layers/capn-proto.md). +Used in quicproquo v1 for all wire messages and service RPCs. Schemas lived in +`schemas/*.capnp`. In v2, Cap'n Proto is replaced by Protobuf (prost) for RPC +messages, though the legacy Cap'n Proto types remain in `quicproquo-proto` for +backward compatibility. See the v1 archive in `crates/quicproquo-proto/`. **Commit** -- An MLS message type that advances the group to a new epoch. When a member sends a Commit (e.g., after adding or removing a member), all group @@ -42,7 +47,7 @@ self-signed TLS certificate generated by quicproquo is DER-encoded. **DS** -- Delivery Service. The server component that provides store-and-forward relay for opaque MLS payloads. The DS never inspects ciphertext -- it routes -solely by recipient public key and optional channel ID. +solely by recipient public key, channel ID, and device ID. See [Architecture Overview](../architecture/overview.md). **Ed25519** -- Edwards-curve Digital Signature Algorithm on Curve25519. Used for @@ -80,8 +85,8 @@ is consumed on fetch. See **ML-KEM-768** -- Module-Lattice-based Key Encapsulation Mechanism, security level 3 (NIST FIPS 203). A post-quantum KEM based on the hardness of the -module learning-with-errors (MLWE) problem. quicproquo plans to use ML-KEM-768 -in a hybrid construction with X25519 at milestone M7. +module learning-with-errors (MLWE) problem. quicproquo uses ML-KEM-768 in a +hybrid construction with X25519 for post-quantum sealed envelope encryption. See [Post-Quantum Readiness](../cryptography/post-quantum-readiness.md). **MLS** -- Messaging Layer Security. A protocol for group key agreement defined @@ -89,6 +94,13 @@ in RFC 9420. MLS provides forward secrecy and post-compromise security for groups of any size through an efficient tree-based key schedule. 
 See [MLS (RFC 9420)](../protocol-layers/mls.md).
+**OPAQUE** -- Asymmetric Password-Authenticated Key Exchange (RFC 9807). A
+password authentication protocol in which the server never learns the user's
+password, not even during registration. The server stores an OPAQUE registration
+record derived from the password. quicproquo uses OPAQUE for all user
+authentication (replacing static token auth in v1).
+See [Authentication Service Internals](../internals/authentication-service.md).
+
 **PCS** -- Post-Compromise Security. The property that a protocol recovers security after a member's state is compromised. In MLS, once a compromised member sends an Update or Commit, subsequent epochs are secure again (assuming
@@ -101,6 +113,16 @@ record was requested. Explored as a future enhancement for metadata-hiding KeyPackage and message fetch. See [Future Research](../roadmap/future-research.md).
+**prost** -- A Rust Protobuf code generation and runtime library. Used in
+quicproquo v2 to generate Rust types from `proto/qpq/v1/*.proto` files at
+build time. The generated types live in `crates/quicproquo-proto/`.
+See [Rust Crate Documentation](references.md).
+
+**Protobuf** -- Protocol Buffers. A language-neutral, binary serialisation format
+from Google. quicproquo v2 uses Protobuf for all RPC message payloads, encoded
+using the `prost` crate. Proto definitions live in `proto/qpq/v1/`.
+See [Wire Format Overview](../wire-format/overview.md).
+
 **QUIC** -- A UDP-based, multiplexed, encrypted transport protocol defined in RFC 9000. QUIC integrates TLS 1.3 for authentication and confidentiality and provides 0-RTT connection establishment, stream multiplexing, and built-in
@@ -112,6 +134,12 @@ group key derivation. Each leaf corresponds to a group member; internal nodes hold derived key material. Updates propagate along the path from a leaf to the root, giving O(log N) cost for key updates in a group of N members.
+**SQLCipher** -- An open-source extension to SQLite that provides transparent,
+page-level AES-256 encryption of the database file. quicproquo uses SQLCipher
+as the primary server-side storage backend via the `rusqlite` crate with the
+`sqlcipher` feature. The encryption key is derived from a server passphrase
+using Argon2id. See [Storage Backend](../internals/storage-backend.md).
+
 **TLS 1.3** -- Transport Layer Security version 1.3, defined in RFC 8446. The standard for authenticated, encrypted transport. quicproquo uses TLS 1.3 exclusively (via `rustls` with `TLS13` cipher suites only) as part of the QUIC
diff --git a/docs/src/appendix/references.md b/docs/src/appendix/references.md
index d63f8f1..79be9e8 100644
--- a/docs/src/appendix/references.md
+++ b/docs/src/appendix/references.md
@@ -11,14 +11,14 @@ category.
 | Reference | Description |
 |-----------|-------------|
 | [RFC 9420 -- The Messaging Layer Security (MLS) Protocol](https://datatracker.ietf.org/doc/rfc9420/) | The group key agreement protocol used by quicproquo. Defines KeyPackages, Welcome messages, Commits, the ratchet tree, epoch advancement, and the security properties (forward secrecy, post-compromise security). See [MLS (RFC 9420)](../protocol-layers/mls.md). |
+| [RFC 9807 -- The OPAQUE Augmented Password-Authenticated Key Exchange (aPAKE) Protocol](https://datatracker.ietf.org/doc/rfc9807/) | Asymmetric password-authenticated key exchange. quicproquo uses OPAQUE for all user registration and login. The server never learns the user's password. See [Authentication Service Internals](../internals/authentication-service.md). |
 | [RFC 9000 -- QUIC: A UDP-Based Multiplexed and Secure Transport](https://datatracker.ietf.org/doc/rfc9000/) | The transport protocol underlying quicproquo's primary connection layer. Provides multiplexed streams, 0-RTT connection establishment, and built-in congestion control. See [QUIC + TLS 1.3](../protocol-layers/quic-tls.md). 
| | [RFC 9001 -- Using TLS to Secure QUIC](https://datatracker.ietf.org/doc/rfc9001/) | Defines how TLS 1.3 is integrated into QUIC for authentication and key exchange. quicproquo uses this via the `quinn` + `rustls` stack. | | [RFC 8446 -- The Transport Layer Security (TLS) Protocol Version 1.3](https://datatracker.ietf.org/doc/rfc8446/) | The TLS version used exclusively by quicproquo (no TLS 1.2 fallback). Provides the handshake, key schedule, and record layer for QUIC transport security. | | [RFC 9180 -- Hybrid Public Key Encryption (HPKE)](https://datatracker.ietf.org/doc/rfc9180/) | The public-key encryption scheme used internally by MLS for encrypting to KeyPackage init keys. quicproquo's MLS ciphersuite uses DHKEM(X25519, HKDF-SHA256) with AES-128-GCM. | -| [NIST FIPS 203 -- Module-Lattice-Based Key-Encapsulation Mechanism Standard (ML-KEM)](https://csrc.nist.gov/pubs/fips/203/final) | The post-quantum KEM standard. quicproquo plans to use ML-KEM-768 in a hybrid construction with X25519 at milestone M7. See [Post-Quantum Readiness](../cryptography/post-quantum-readiness.md). | -| [Cap'n Proto specification](https://capnproto.org/) | The zero-copy serialisation format and RPC system used for all quicproquo wire messages and service interfaces. See [Cap'n Proto Serialisation and RPC](../protocol-layers/capn-proto.md). | +| [NIST FIPS 203 -- Module-Lattice-Based Key-Encapsulation Mechanism Standard (ML-KEM)](https://csrc.nist.gov/pubs/fips/203/final) | The post-quantum KEM standard. quicproquo uses ML-KEM-768 in a hybrid construction with X25519 for post-quantum sealed envelope encryption. See [Post-Quantum Readiness](../cryptography/post-quantum-readiness.md). | +| [Protocol Buffers Language Guide (proto3)](https://protobuf.dev/programming-guides/proto3/) | The binary serialisation format used for all v2 RPC message payloads. quicproquo proto definitions live in `proto/qpq/v1/`. See [Wire Format Overview](../wire-format/overview.md). 
| | [draft-ietf-tls-hybrid-design -- Hybrid Key Exchange in TLS 1.3](https://datatracker.ietf.org/doc/draft-ietf-tls-hybrid-design/) | The combiner approach used by quicproquo's hybrid KEM construction (X25519 shared secret concatenated with ML-KEM-768 shared secret, fed through HKDF). See [Hybrid KEM](../protocol-layers/hybrid-kem.md). | -| [RFC 9497 -- OPAQUE](https://datatracker.ietf.org/doc/rfc9497/) | Asymmetric password-authenticated key exchange. Considered for future authentication (see [Future Research](../roadmap/future-research.md)). | --- @@ -28,18 +28,22 @@ category. |-------|---------|----------------------| | `openmls` | [docs.rs/openmls](https://docs.rs/openmls/) | MLS protocol implementation: group creation, member addition, Welcome processing, application message encryption/decryption. See [MLS (RFC 9420)](../protocol-layers/mls.md). | | `openmls_rust_crypto` | [docs.rs/openmls_rust_crypto](https://docs.rs/openmls_rust_crypto/) | Pure-Rust cryptographic backend for openmls. Provides the `OpenMlsRustCrypto` provider used by `GroupMember`. | +| `prost` | [docs.rs/prost](https://docs.rs/prost/) | Protobuf runtime for Rust. Used to encode/decode all v2 RPC messages. Generated types are in `crates/quicproquo-proto/`. | +| `prost-build` | [docs.rs/prost-build](https://docs.rs/prost-build/) | Build-time Protobuf code generator invoked from `crates/quicproquo-proto/build.rs`. Reads `.proto` files and emits Rust structs. | +| `protobuf-src` | [docs.rs/protobuf-src](https://docs.rs/protobuf-src/) | Vendors the `protoc` compiler as a build dependency. No system-installed protoc required. | | `quinn` | [docs.rs/quinn](https://docs.rs/quinn/) | QUIC transport implementation. Provides the `Endpoint`, `Connection`, and stream types for client and server. See [QUIC + TLS 1.3](../protocol-layers/quic-tls.md). | | `rustls` | [docs.rs/rustls](https://docs.rs/rustls/) | TLS 1.3 implementation used by `quinn`. 
Configured with `TLS13` cipher suites only and custom certificate verification. |
-| `capnp` | [docs.rs/capnp](https://docs.rs/capnp/) | Cap'n Proto serialisation library. Used for building and reading all wire messages. |
-| `capnp-rpc` | [docs.rs/capnp-rpc](https://docs.rs/capnp-rpc/) | Cap'n Proto RPC framework. Provides the async RPC system for `NodeService`. Runs inside the QUIC encrypted channel. |
-| `capnpc` | [docs.rs/capnpc](https://docs.rs/capnpc/) | Cap'n Proto compiler invoked at build time (`build.rs`) to generate Rust types from `.capnp` schemas. |
-| `ml-kem` | [docs.rs/ml-kem](https://docs.rs/ml-kem/) | ML-KEM (NIST FIPS 203) implementation. Vendored in the workspace for the planned hybrid post-quantum KEM (M7). |
+| `opaque-ke` | [docs.rs/opaque-ke](https://docs.rs/opaque-ke/) | OPAQUE asymmetric PAKE implementation (RFC 9807). Used for server-side registration record generation and client-side credential derivation. |
+| `rusqlite` | [docs.rs/rusqlite](https://docs.rs/rusqlite/) | SQLite bindings for Rust. Used with the `sqlcipher` feature for the SQLCipher-encrypted database backend. See [Storage Backend](../internals/storage-backend.md). |
+| `argon2` | [docs.rs/argon2](https://docs.rs/argon2/) | Argon2id key derivation. Used to derive the SQLCipher encryption key from the server passphrase. |
+| `ml-kem` | [docs.rs/ml-kem](https://docs.rs/ml-kem/) | ML-KEM (NIST FIPS 203) implementation. Used in the hybrid X25519 + ML-KEM-768 KEM for post-quantum envelope encryption. |
 | `ed25519-dalek` | [docs.rs/ed25519-dalek](https://docs.rs/ed25519-dalek/) | Ed25519 signing and verification. Used for MLS identity credentials (`BasicCredential`). See [Ed25519 Identity Keys](../cryptography/identity-keys.md). |
 | `x25519-dalek` | [docs.rs/x25519-dalek](https://docs.rs/x25519-dalek/) | X25519 Diffie-Hellman key exchange. Used in hybrid KEM (X25519 + ML-KEM-768) and as the classical component of DHKEM in MLS HPKE. 
See [Hybrid KEM](../protocol-layers/hybrid-kem.md). |
 | `zeroize` | [docs.rs/zeroize](https://docs.rs/zeroize/) | Secure memory zeroisation. All private key types implement `Zeroize + ZeroizeOnDrop`. See [Key Lifecycle and Zeroization](../cryptography/key-lifecycle.md). |
+| `bytes` | [docs.rs/bytes](https://docs.rs/bytes/) | Zero-copy byte buffer abstraction. Used in the RPC framing layer (`quicproquo-rpc`) for efficient frame encoding/decoding without copying. |
 | `tokio` | [docs.rs/tokio](https://docs.rs/tokio/) | Async runtime. All server and client I/O runs on Tokio. |
+| `tower` | [docs.rs/tower](https://docs.rs/tower/) | Service abstraction and middleware framework. Used in `quicproquo-rpc` for RPC middleware (auth, rate limiting, tracing). |
 | `clap` | [docs.rs/clap](https://docs.rs/clap/) | CLI argument parser for the client binary. |
-| `dashmap` | [docs.rs/dashmap](https://docs.rs/dashmap/) | Concurrent hash map. Used for the in-memory AS key store and DS delivery queues (to be replaced by SQLite at M6). |
 | `tracing` | [docs.rs/tracing](https://docs.rs/tracing/) | Structured logging framework. Used throughout the server for request logging and diagnostics. |
 | `thiserror` | [docs.rs/thiserror](https://docs.rs/thiserror/) | Derive macro for typed error enums in library crates. |
 | `anyhow` | [docs.rs/anyhow](https://docs.rs/anyhow/) | Flexible error handling for application crates (server, client). |
@@ -89,6 +93,18 @@ The predecessor to ML-KEM (NIST FIPS 203). CRYSTALS-Kyber was selected by NIST and standardised as ML-KEM. quicproquo uses the `ml-kem` crate which implements the final FIPS 203 standard.
+### OPAQUE
+
+**"OPAQUE: An Asymmetric PAKE Protocol Secure Against Pre-Computation Attacks"**
+Stanislaw Jarecki, Hugo Krawczyk, and Jiayu Xu.
+*EUROCRYPT 2018.*
+
+The original academic paper introducing OPAQUE. Later published as RFC 9807.
+Relevant background for understanding the security guarantees of quicproquo's +authentication system: the server stores a verifier (not the password), and the +protocol is resistant to pre-computation attacks even if the server's verifier +database is stolen. + ### Metadata Resistance **"Sealed Sender"** diff --git a/docs/src/architecture/crate-responsibilities.md b/docs/src/architecture/crate-responsibilities.md index a070f71..8a8211c 100644 --- a/docs/src/architecture/crate-responsibilities.md +++ b/docs/src/architecture/crate-responsibilities.md @@ -1,9 +1,9 @@ # Crate Responsibilities -The quicproquo workspace contains six crates. The main four (proto, core, +The quicproquo workspace contains nine crates. The core four (proto, core, server, client) follow strict layering rules; each owns one concern and depends -only on the crates below it. The workspace also includes **quicproquo-gui** -(Tauri desktop app) and **quicproquo-p2p** (P2P endpoint resolution). This +only on the crates below it. The workspace also includes dedicated crates for +the RPC framework, client SDK, key transparency, plugin API, and P2P. This page documents what each crate provides, what it explicitly avoids, and how the crates relate to one another. @@ -12,33 +12,47 @@ crates relate to one another. ## Dependency Flow Diagram ```text - ┌──────────────────────────┐ - │ quicproquo-client │ - │ (CLI, QUIC client, │ - │ GroupMember orchestr.) 
│ - └─────────┬───────┬────────┘ - │ │ - ┌───────┘ └────────┐ - ▼ ▼ - ┌────────────────────────┐ ┌────────────────────────┐ - │ quicproquo-core │ │ quicproquo-server │ - │ (crypto, MLS, │ │ (QUIC listener, │ - │ hybrid KEM) │ │ NodeService RPC, │ - │ │ │ storage) │ - └──────────┬─────────────┘ └─────────┬──────────────┘ - │ │ - │ ┌───────────────────┘ - ▼ ▼ - ┌────────────────────────┐ - │ quicproquo-proto │ - │ (Cap'n Proto schemas, │ - │ codegen, helpers) │ - └────────────────────────┘ + +-------------------+ +-------------------+ + | quicproquo-client | | quicproquo-sdk | + | (CLI/TUI binary) | | (QpqClient, store)| + +--------+----------+ +--------+----------+ + | | + +----------+ +-----------+ + | | + v v + +-----------+----------+ + | quicproquo-rpc | + | (framing, server, | + | client, middleware) | + +--------+-------------+ + | + +---------------+---------------+ + | | + v v + +------------------------+ +-----------------------------+ + | quicproquo-core | | quicproquo-server | + | (crypto, MLS, | | (RPC server + domain | + | hybrid KEM) | | services) | + +----------+-------------+ +-------------+---------------+ + | | + | +-------------------+ | + +------>| quicproquo-proto |<--+ + | (capnp legacy + | + | prost v2 types) | + +-------------------+ + + (separate, no shared deps) + +-------------------+ +-------------------+ +-------------------+ + | quicproquo-kt | | quicproquo-p2p | | quicproquo- | + | (key transparency)| | (iroh P2P) | | plugin-api | + +-------------------+ +-------------------+ | (#![no_std] C-ABI)| + +-------------------+ ``` **Arrows point from dependant to dependency.** The proto crate sits at the base -of the dependency graph. The core crate depends on proto for envelope -serialisation. The server and client crates both depend on core and proto. +of the dependency graph. The core crate depends on proto for legacy envelope +serialisation. The rpc crate provides the framing and dispatch layer used by +both the sdk and server. 
--- @@ -54,48 +68,61 @@ dependency. | `identity` | `IdentityKeypair` | Ed25519 signing keypair for MLS credentials. Seed stored as `Zeroizing<[u8; 32]>`. Implements `openmls_traits::Signer`. | | `group` | `GroupMember` | MLS group state machine wrapping `openmls::MlsGroup`. Lifecycle: `new` -> `generate_key_package` -> `create_group` / `join_group` -> `send_message` / `receive_message`. | | `keypackage` | `generate_key_package` | Standalone KeyPackage generation (returns TLS-encoded bytes + SHA-256 fingerprint). | -| `keystore` | `DiskKeyStore`, `StoreCrypto` | `OpenMlsKeyStore` implementation backed by an in-memory `HashMap` with optional bincode flush to disk. `StoreCrypto` couples `RustCrypto` + `DiskKeyStore` into an `OpenMlsCryptoProvider`. | +| `keystore` | `DiskKeyStore`, `StoreCrypto` | `OpenMlsKeyStore` implementation backed by an in-memory `HashMap` with bincode flush to disk. `StoreCrypto` couples `RustCrypto` + `DiskKeyStore` into an `OpenMlsCryptoProvider`. | | `hybrid_kem` | `HybridKeypair`, `HybridPublicKey`, `hybrid_encrypt`, `hybrid_decrypt` | X25519 + ML-KEM-768 hybrid KEM. HKDF-SHA256 key derivation, ChaCha20-Poly1305 AEAD. Versioned envelope wire format. | -| `error` | `CoreError`, `MAX_PLAINTEXT_LEN` | Unified error types. `CoreError` covers Cap'n Proto, MLS, and hybrid KEM failures. | +| `error` | `CoreError`, `MAX_PLAINTEXT_LEN` | Unified error types covering MLS and hybrid KEM failures. | ### What this crate does NOT do - No network I/O. - No QUIC or TLS -- that is the server and client crates' concern. -- No async runtime setup (it uses Tokio types internally but does not spawn or - manage a runtime). +- No async runtime setup. - No CLI parsing. ### Key dependencies `ed25519-dalek`, `openmls`, `openmls_rust_crypto`, `openmls_traits`, `tls_codec`, `ml-kem`, `x25519-dalek`, `chacha20poly1305`, -`hkdf`, `sha2`, `zeroize`, `capnp`, `quicproquo-proto`, `tokio`, -`serde`, `bincode`, `serde_json`, `thiserror`. 
+`hkdf`, `sha2`, `zeroize`, `quicproquo-proto`, `serde`, `bincode`, `thiserror`. --- ## quicproquo-proto -**Role:** Cap'n Proto schema definitions, compile-time code generation, and -pure-synchronous serialisation helpers. This crate is the single source of truth -for the wire format. +**Role:** Protocol type definitions for both v1 (legacy Cap'n Proto) and v2 +(Protobuf/prost). This crate is the single source of truth for wire types and +method ID constants. ### Contents -| Item | Description | -|---------------------------|-------------| -| `schemas/envelope.capnp` | `Envelope` struct and `MsgType` enum -- top-level wire message. | -| `schemas/auth.capnp` | `AuthenticationService` interface -- `uploadKeyPackage`, `fetchKeyPackage`. | -| `schemas/delivery.capnp` | `DeliveryService` interface -- `enqueue`, `fetch`. | -| `schemas/node.capnp` | `NodeService` interface (unified AS+DS) -- all RPC methods plus `Auth` struct. | -| `build.rs` | Invokes `capnpc` to generate Rust types from the four `.capnp` files. | -| `lib.rs` | `pub mod envelope_capnp`, `auth_capnp`, `delivery_capnp`, `node_capnp` -- re-exports generated modules. | -| `MsgType` | Re-exported enum from `envelope_capnp::envelope::MsgType`. | -| `ParsedEnvelope` | Owned, `Send + 'static` representation of a decoded `Envelope`. All byte fields are eagerly copied out of the Cap'n Proto reader. | -| `build_envelope` | Serialise a `ParsedEnvelope` to unpacked Cap'n Proto wire bytes. | -| `parse_envelope` | Deserialise wire bytes into a `ParsedEnvelope`. | -| `to_bytes` / `from_bytes` | Low-level Cap'n Proto message <-> byte conversions. | +| Item | Description | +|-------------------------------|-------------| +| `schemas/*.capnp` | Legacy Cap'n Proto schemas (auth, delivery, node, federation). | +| `proto/qpq/v1/*.proto` | 14 Protobuf files defining all v2 message types. | +| `build.rs` | Invokes `capnpc` for legacy types and `prost-build` for v2 types. 
| +| `pub mod qpq::v1` | All Protobuf-generated types, included via `prost` `include!`. | +| `pub mod method_ids` | All 44 RPC method ID constants (u16) plus 4 push event type constants. | +| `auth_capnp`, `node_capnp`... | Re-exported legacy Cap'n Proto generated modules. | + +### method_ids ranges + +| Range | Category | +|---|---| +| 100-103 | Auth (OPAQUE register/login) | +| 200-205 | Delivery (enqueue, fetch, ack) | +| 300-304 | Keys (key packages, hybrid keys) | +| 400 | Channel creation | +| 410-413 | Group management | +| 420-424 | Moderation | +| 500-501 | User / identity resolution | +| 510-520 | Key transparency | +| 600-601 | Blob storage | +| 700-710 | Device management + push tokens | +| 750-752 | Recovery bundles | +| 800-802 | P2P endpoints + health | +| 900-905 | Federation relay | +| 950 | Account deletion | +| 1000-1003 | Push event types (server-to-client) | ### What this crate does NOT do @@ -106,155 +133,178 @@ for the wire format. ### Key dependencies -`capnp` (runtime), `capnpc` (build-time only). +`capnp` (runtime), `capnpc` (build-time), `prost`, `prost-build` (build-time), +`bytes`. + +--- + +## quicproquo-rpc + +**Role:** v2 RPC framework. Implements the custom binary framing protocol, +server-side dispatch, client-side request/response handling, and Tower +middleware (rate limiting, timeouts, authentication). + +### Components + +| Component | Description | +|------------------|-------------| +| `framing` | `RequestFrame`, `ResponseFrame`, `PushFrame` encode/decode with big-endian headers. Max payload: 4 MiB. | +| `server` | `RpcServer` accepts QUIC connections, reads request frames, dispatches to registered handlers, writes response frames. | +| `client` | `RpcClient` opens per-RPC QUIC streams, writes request frames, reads response frames. | +| `middleware` | Tower `Service` wrappers: rate limiter, deadline/timeout, auth token injection. | +| `error` | `RpcError`, `RpcStatus` enum (Ok=0, BadRequest=1, Unauthorized=2, ... 
UnknownMethod=11). | + +### Frame format (implemented here) + +``` +Request: [method_id: u16 BE][request_id: u32 BE][payload_len: u32 BE][protobuf] +Response: [status: u8][request_id: u32 BE][payload_len: u32 BE][protobuf] +Push: [event_type: u16 BE][payload_len: u32 BE][protobuf] +``` + +### What this crate does NOT do + +- No domain logic -- handlers are registered by the server crate. +- No crypto operations. + +### Key dependencies + +`quinn`, `rustls`, `tokio`, `bytes`, `tower`, `prost`, `quicproquo-proto`, +`tracing`, `thiserror`. + +--- + +## quicproquo-sdk + +**Role:** High-level client SDK. `QpqClient` wraps the RPC client with +typed methods, an async event broadcast channel, and a `ConversationStore` +for local conversation state. + +### Components + +| Component | Description | +|---------------------|-------------| +| `QpqClient` | Authenticated client: `register`, `login`, `send_message`, `fetch_messages`, etc. | +| Event channel | `tokio::sync::broadcast` channel delivering `ClientEvent` variants (NewMessage, Typing, Presence, Membership). | +| `ConversationStore` | SQLCipher-backed local store for message history and group state. | + +### What this crate does NOT do + +- No raw frame handling -- delegates to `quicproquo-rpc`. +- No MLS group state management -- delegates to `quicproquo-core`. + +### Key dependencies + +`quicproquo-rpc`, `quicproquo-core`, `quicproquo-proto`, `tokio`, `rusqlite`, +`prost`, `tracing`, `thiserror`, `anyhow`. --- ## quicproquo-server **Role:** Network-facing server binary. Accepts QUIC + TLS 1.3 connections, -dispatches Cap'n Proto RPC calls to `NodeServiceImpl`, and persists state to -disk via `FileBackedStore`. +dispatches 44 Protobuf RPC methods through registered handlers in `domain/`, +and persists state to SQLCipher. ### Components | Component | Description | |----------------------|-------------| -| `NodeServiceImpl` | Implements `node_service::Server` (Cap'n Proto generated trait). 
Handles all eight RPC methods: `uploadKeyPackage`, `fetchKeyPackage`, `enqueue`, `fetch`, `fetchWait`, `health`, `uploadHybridKey`, `fetchHybridKey`. | -| `FileBackedStore` | Mutex-guarded `HashMap`s for KeyPackages (keyed by Ed25519 public key), delivery queues (keyed by `ChannelKey = (channelId, recipientKey)`), and hybrid public keys. Each mutation flushes the full map to a bincode file on disk. | -| `DashMap` waiters | `DashMap, Arc>` -- per-recipient `tokio::sync::Notify` instances for `fetchWait` long-polling. `enqueue` calls `notify_waiters()` after appending. | -| TLS config | Self-signed certificate auto-generated on first run (`rcgen`). TLS 1.3 only, ALPN `capnp`. | -| CLI (`clap`) | `--listen` (default `0.0.0.0:7000`), `--data-dir`, `--tls-cert`, `--tls-key`. | +| `v2_handlers/` | One handler module per method category (auth, delivery, keys, channel, group, user, kt, blob, device, p2p, federation, moderation, recovery, account). | +| `domain/` | Protocol-agnostic domain types and service logic (e.g., `AuthService`, `DeliveryService`, `KeyService`). | +| `ServerState` | Shared state: SQLCipher connection pool, DashMap waiters, OPAQUE server state. | +| TLS config | Self-signed certificate auto-generated on first run (`rcgen`). TLS 1.3 only, ALPN `qpq`. | +| CLI (`clap`) | `--listen` (default `0.0.0.0:5001`), `--data-dir`, `--tls-cert`, `--tls-key`. 
|
 ### Connection lifecycle
-```text
-QUIC accept
- └─ TLS 1.3 handshake (self-signed cert, ALPN "capnp")
- └─ accept_bi() -> bidirectional QUIC stream
- └─ tokio_util::compat adapters (AsyncRead/AsyncWrite)
- └─ capnp-rpc twoparty::VatNetwork (Side::Server)
- └─ RpcSystem drives NodeServiceImpl
+```
+QUIC accept (ALPN: "qpq")
+ +- TLS 1.3 handshake (self-signed cert)
+ +- Per-stream: read RequestFrame -> dispatch to handler -> write ResponseFrame
+ +- Uni-stream (server -> client): write PushFrame for events
 ```
-Because `capnp-rpc` uses `Rc<RefCell<>>` internally and is therefore `!Send`,
-the entire RPC stack runs on a `tokio::task::LocalSet`. Each incoming connection
-is handled by `spawn_local`.
+Each RPC call gets its own QUIC bidirectional stream; handlers run concurrently
+via `tokio::spawn`.
 ### What this crate does NOT do
-- No direct crypto operations (it delegates to `quicproquo-core` types
- for fingerprinting and storage only).
-- No MLS processing -- all payloads are opaque byte strings.
+- No direct crypto beyond OPAQUE server-side operations.
+- No MLS processing -- all MLS payloads are opaque byte strings.
 ### Key dependencies
-`quicproquo-core`, `quicproquo-proto`, `quinn`, `quinn-proto`,
-`rustls`, `rcgen`, `capnp`, `capnp-rpc`, `tokio`, `tokio-util`, `dashmap`,
-`sha2`, `clap`, `tracing`, `anyhow`, `thiserror`, `bincode`, `serde`.
+`quicproquo-core`, `quicproquo-proto`, `quicproquo-rpc`, `quinn`, `rustls`,
+`rcgen`, `tokio`, `dashmap`, `rusqlite`, `prost`, `clap`, `tracing`, `anyhow`,
+`thiserror`.
 ---
 ## quicproquo-client
-**Role:** CLI client binary. Connects to the server over QUIC + TLS 1.3,
-orchestrates MLS group operations via `GroupMember`, and persists identity and
-group state to disk.
-
-### Components
-
-| Component | Description |
-|-------------------------|-------------|
-| `connect_node` | Establishes a QUIC/TLS connection, opens a bidirectional stream, and bootstraps a `capnp-rpc` `RpcSystem` to obtain a `node_service::Client`. 
| -| CLI subcommands (`clap`)| `ping`, `register`, `fetch-key`, `demo-group`, `register-state`, `create-group`, `invite`, `join`, `send`, `recv`. | -| `GroupMember` usage | The client creates a `GroupMember` (from `quicproquo-core`), calls `generate_key_package` / `create_group` / `add_member` / `join_group` / `send_message` / `receive_message`. | -| State persistence | `StoredState` holds `identity_seed` (32 bytes) and optional serialised `MlsGroup`. A companion `.ks` file stores the `DiskKeyStore` with HPKE init private keys. | -| Auth context | `ClientAuth` bundles an optional bearer token and device ID. Passed to every RPC via the `Auth` struct in `node.capnp`. | - -### CLI subcommand summary - -| Subcommand | What it does | -|-------------------|--------------| -| `ping` | Call `health()` and print RTT. | -| `register` | Generate a fresh identity + KeyPackage, upload to AS, print identity key. | -| `register-state` | Same as `register` but uses/creates persistent state file. | -| `fetch-key` | Fetch a peer's KeyPackage by hex identity key. | -| `create-group` | Create a new MLS group and save state. | -| `invite` | Fetch peer's KeyPackage, add to group, enqueue Welcome via DS. | -| `join` | Fetch Welcome from DS, join the MLS group. | -| `send` | Encrypt a message with MLS, enqueue via DS. | -| `recv` | Fetch pending payloads from DS, decrypt with MLS. Supports `--stream` for continuous long-polling. | -| `demo-group` | End-to-end Alice+Bob round-trip (ephemeral identities). | +**Role:** CLI/TUI client binary. Connects to the server, orchestrates MLS +group operations via `GroupMember`, and persists identity and group state. ### What this crate does NOT do - No server-side logic. -- No direct crypto beyond calling `GroupMember` and verifying SHA-256 - fingerprints. +- No raw frame parsing -- delegates to `quicproquo-sdk` / `quicproquo-rpc`. 
### Key dependencies -`quicproquo-core`, `quicproquo-proto`, `quinn`, `quinn-proto`, -`rustls`, `capnp`, `capnp-rpc`, `tokio`, `tokio-util`, `clap`, `sha2`, -`serde`, `bincode`, `anyhow`, `thiserror`, `tracing`. +`quicproquo-sdk`, `quicproquo-core`, `quicproquo-proto`, `tokio`, `clap`, +`rustyline`, `tracing`, `anyhow`. --- -## quicproquo-bot +## quicproquo-kt -**Role:** High-level SDK for building automated agents (bots) on the -quicproquo network. Wraps the client library into a simple polling-based API. +**Role:** Key transparency. Implements an append-only transparency log for +Ed25519 public keys with revocation checking and audit support. -### Components - -| Component | Description | -|------------------|-------------| -| `BotConfig` | Builder-pattern configuration: server address, credentials, TLS, state file path. | -| `Bot` | Connected bot instance. Methods: `connect()`, `send_dm()`, `receive()`, `receive_raw()`, `resolve_user()`. | -| `Message` | Received message struct with `sender`, `text`, and `seq` fields. | -| `run_pipe_mode` | JSON-lines stdin/stdout interface for shell integration (`send`, `recv`, `resolve` actions). | - -### Architecture - -Each `send_dm` and `receive` call opens a fresh QUIC connection (stateless -reconnect pattern). The bot wraps the client's `cmd_send` and -`receive_pending_plaintexts` functions, handling MLS group state internally. - -### What this crate does NOT do - -- No server-side logic. -- No raw MLS operations — delegates to `quicproquo-client` high-level functions. -- No persistent QUIC connections — each operation reconnects. - -### Key dependencies - -`quicproquo-core`, `quicproquo-client`, `tokio`, `anyhow`, `tracing`, -`serde`, `serde_json`, `hex`. +Methods exposed: `RevokeKey` (510), `CheckRevocation` (511), +`AuditKeyTransparency` (520). 
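The three key-transparency method IDs listed above can be sketched as a small Rust enum. This is only an illustration: the type name `KtMethod` and the use of the raw `u16` as the error value are assumptions made here, not names taken from `quicproquo-kt` or `quicproquo-rpc`.

```rust
use std::convert::TryFrom;

/// Key-transparency method IDs as listed in this section.
/// `KtMethod` is a hypothetical name used for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u16)]
pub enum KtMethod {
    RevokeKey = 510,
    CheckRevocation = 511,
    AuditKeyTransparency = 520,
}

impl TryFrom<u16> for KtMethod {
    /// Unknown IDs are handed back to the caller unchanged (an assumption of this sketch).
    type Error = u16;

    fn try_from(id: u16) -> Result<Self, Self::Error> {
        match id {
            510 => Ok(Self::RevokeKey),
            511 => Ok(Self::CheckRevocation),
            520 => Ok(Self::AuditKeyTransparency),
            other => Err(other),
        }
    }
}
```

A dispatcher could call `KtMethod::try_from(id)` on the method ID read from a request frame and reject anything that returns `Err`.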
--- -## Other workspace crates +## quicproquo-plugin-api -| Crate | Role | -|-------------------------|------| -| **quicproquo-gui** | Tauri 2 desktop application; provides a GUI on top of the client/core stack. | -| **quicproquo-p2p** | P2P endpoint publish/resolve; used by the server and clients for direct peer discovery. | +**Role:** `#![no_std]` C-ABI plugin interface. Defines a stable ABI for +dynamically loaded plugins with 6 hook points (on_message_send, +on_message_receive, on_group_join, on_group_leave, on_connect, on_disconnect). -These crates are optional for building and running the server and CLI client. +This crate has no workspace dependencies. It is intentionally `no_std` to +allow plugins compiled for embedded or WASM targets. + +--- + +## quicproquo-p2p + +**Role:** P2P endpoint publish and resolve via iroh. Used by the server and +clients for direct peer discovery when the `mesh` feature is enabled on +`quicproquo-client`. + +Methods exposed: `PublishEndpoint` (800), `ResolveEndpoint` (801). + +This crate is compiled but kept out of the default dependency graph for most +build targets due to iroh's large dependency footprint (~90 extra deps). --- ## Layering Rules 1. **proto** depends on nothing in-workspace. It is pure data definition. -2. **core** depends on **proto** (for `ParsedEnvelope` and envelope helpers). - It does not depend on server or client. -3. **server** depends on **core** and **proto**. It does not depend on client. -4. **client** depends on **core** and **proto**. It does not depend on server. -5. **server** and **client** never depend on each other. They communicate - exclusively via the Cap'n Proto RPC wire protocol. -6. **quicproquo-gui** and **quicproquo-p2p** are optional; they depend - on client/core/proto as needed and do not change the core layering. +2. **core** depends on **proto** (for legacy envelope helpers). + It does not depend on server, rpc, or sdk. +3. **rpc** depends on **proto**. 
It does not depend on core, server, or client. +4. **sdk** depends on **rpc** and **core**. It does not depend on server. +5. **server** depends on **core**, **proto**, and **rpc**. It does not depend on client or sdk. +6. **client** depends on **sdk**, **core**, and **proto**. It does not depend on server. +7. **server** and **client** never depend on each other. They communicate + exclusively via the v2 Protobuf framing protocol over QUIC. +8. **kt**, **plugin-api**, and **p2p** are optional; they do not change the + core layering. This layering ensures that: @@ -268,7 +318,7 @@ This layering ensures that: ## Further Reading - [Architecture Overview](overview.md) -- high-level system diagram -- [Service Architecture](service-architecture.md) -- NodeService RPC details -- [Wire Format Overview](../wire-format/overview.md) -- Cap'n Proto schema reference +- [Service Architecture](service-architecture.md) -- 44 RPC method details +- [Wire Format Reference](../wire-format/overview.md) -- Protobuf schema reference - [GroupMember Lifecycle](../internals/group-member-lifecycle.md) -- MLS state machine details -- [Storage Backend](../internals/storage-backend.md) -- FileBackedStore internals +- [Storage Backend](../internals/storage-backend.md) -- SQLCipher storage internals diff --git a/docs/src/architecture/data-flow.md b/docs/src/architecture/data-flow.md index 39a70e2..6b57f41 100644 --- a/docs/src/architecture/data-flow.md +++ b/docs/src/architecture/data-flow.md @@ -6,75 +6,86 @@ with an ASCII sequence diagram showing control-plane (AS) and data-plane (DS) traffic. Throughout these flows the server is **MLS-unaware** -- it stores and forwards -opaque byte blobs without parsing their MLS content. +opaque byte blobs without parsing their MLS content. All RPC calls use the v2 +Protobuf framing protocol over QUIC (ALPN: `qpq`, port 5001). --- ## 1. 
Registration Flow -Before a client can join any MLS group, it must generate an Ed25519 identity -keypair and upload at least one KeyPackage to the Authentication Service. Peers -fetch these KeyPackages to add the client to groups. +Before a client can join any MLS group, it must authenticate with OPAQUE, +generate an Ed25519 identity keypair, and upload at least one KeyPackage to +the Authentication Service. ### Sequence Diagram ```text - Client (Alice) NodeService (AS) - ────────────── ──────────────── - │ │ - │ 1. Generate Ed25519 identity keypair │ - │ (IdentityKeypair::generate) │ - │ │ - │ 2. Generate MLS KeyPackage │ - │ (GroupMember::generate_key_package) │ - │ - Creates HPKE init keypair │ - │ - Embeds Ed25519 pk in credential │ - │ - Signs leaf node with Ed25519 sk │ - │ - TLS-encodes the KeyPackage │ - │ │ - │ 3. QUIC connect + TLS 1.3 handshake │ - │ ────────────────────────────────────────>│ - │ │ - │ 4. uploadKeyPackage(identityKey, pkg) │ - │ ────────────────────────────────────────>│ - │ │ 5. Validate: - │ │ - identityKey == 32 bytes - │ │ - package non-empty, <= 1 MB - │ │ - auth version allowed - │ │ - │ │ 6. Compute SHA-256(package) - │ │ - │ │ 7. Append to per-identity queue: - │ │ keyPackages[identityKey].push(pkg) - │ │ - │ │ 8. Flush keypackages.bin to disk - │ │ - │ fingerprint (SHA-256) │ - │ <────────────────────────────────────────│ - │ │ - │ 9. Compare local fingerprint with │ - │ server-returned fingerprint │ - │ (tamper detection) │ - │ │ + Client (Alice) Server (port 5001) + -------------- ------------------ + | | + | 1. OpaqueRegisterStart (100) | + | username, registration_request | + | ---------------------------------------->| + | | + | registration_response | + | <----------------------------------------| + | | + | 2. OpaqueRegisterFinish (101) | + | username, upload, identity_key | + | ---------------------------------------->| + | | 3. 
Store OPAQUE record + + | success | identity key mapping + | <----------------------------------------| + | | + | 4. Generate MLS KeyPackage | + | (GroupMember::generate_key_package) | + | - Creates HPKE init keypair | + | - Embeds Ed25519 pk in credential | + | - Signs leaf node with Ed25519 sk | + | - TLS-encodes the KeyPackage | + | | + | 5. OpaqueLoginStart (102) | + | username, login_request | + | ---------------------------------------->| + | login_response | + | <----------------------------------------| + | | + | 6. OpaqueLoginFinish (103) | + | username, finalization, identity_key | + | ---------------------------------------->| + | session_token | + | <----------------------------------------| + | | + | 7. UploadKeyPackage (300) | + | identity_key, package, session_token | + | ---------------------------------------->| + | | 8. Validate + store + | fingerprint (SHA-256) | in KeyPackage queue + | <----------------------------------------| + | | + | 9. Compare local fingerprint with | + | server-returned fingerprint | + | (tamper detection) | + | | ``` ### Key Points -- **KeyPackages are single-use** (RFC 9420 requirement). Each `fetchKeyPackage` +- **KeyPackages are single-use** (RFC 9420 requirement). Each `FetchKeyPackage` call atomically removes and returns one package. The client should upload multiple KeyPackages if it expects to be added to several groups. -- The `identityKey` used as the AS index is the **raw 32-byte Ed25519 public +- The `identity_key` used as the AS index is the **raw 32-byte Ed25519 public key**, not a fingerprint or hash. Peers must know Alice's public key out-of- - band (QR code, directory, etc.) to fetch her KeyPackage. + band (QR code, directory lookup via `ResolveUser`, etc.) to fetch her KeyPackage. - The HPKE init private key generated during `generate_key_package` is stored in the client's `DiskKeyStore`. 
The **same `GroupMember` instance** (or a restored instance with the same key store) must later call `join_group` to decrypt the Welcome message. -- The optional hybrid public key (`uploadHybridKey`) can also be uploaded - during registration for post-quantum envelope encryption. +- The optional hybrid public key (`UploadHybridKey`, method 302) can also be + uploaded during registration for post-quantum envelope encryption. --- @@ -87,64 +98,66 @@ Bob via the DS. ### Sequence Diagram ```text - Alice NodeService (AS+DS) Bob - ───── ────────────────── ─── - │ │ │ - │ 1. create_group("my-group") │ │ - │ (local MLS operation -- │ │ - │ Alice is sole member, │ │ - │ epoch 0) │ │ - │ │ │ - │ 2. fetchKeyPackage(bob_pk) │ │ - │ ───────────────────────────────>│ │ - │ │ 3. Pop bob's KeyPackage │ - │ │ from queue (atomic) │ - │ bob_kp bytes │ │ - │ <───────────────────────────────│ │ - │ │ │ - │ 4. add_member(bob_kp) │ │ - │ Local MLS operations: │ │ - │ a. Deserialise & validate │ │ - │ Bob's KeyPackage │ │ - │ b. Produce Commit message │ │ - │ (adds Bob to ratchet │ │ - │ tree, advances epoch) │ │ - │ c. Produce Welcome message │ │ - │ (encrypted to Bob's │ │ - │ HPKE init key, contains │ │ - │ group secrets + tree) │ │ - │ d. merge_pending_commit() │ │ - │ (Alice advances to │ │ - │ epoch 1 locally) │ │ - │ │ │ - │ 5. enqueue(bob_pk, welcome) │ │ - │ ───────────────────────────────>│ │ - │ │ 6. Append welcome to │ - │ │ deliveries[(ch, bob_pk)] │ - │ │ │ - │ │ 7. Notify bob_pk waiters │ - │ │ │ - │ │ │ - │ │ 8. Bob connects and fetches │ - │ │ <─────────────────────────────│ - │ │ fetch(bob_pk) │ - │ │ │ - │ │ 9. Drain bob's queue │ - │ │ (returns [welcome]) │ - │ │ │ - │ │ [welcome_bytes] │ - │ │ ─────────────────────────────>│ - │ │ │ - │ │ │ 10. 
join_group(welcome) - │ │ │ - Decrypt Welcome with - │ │ │ HPKE init private key - │ │ │ - Extract ratchet tree - │ │ │ from GroupInfo ext - │ │ │ - Initialise MlsGroup - │ │ │ at epoch 1 - │ │ │ - │ │ │ Bob is now a group member - │ │ │ + Alice Server (AS+DS, port 5001) Bob + ----- ------------------------- --- + | | | + | 1. create_group("my-group") | | + | (local MLS operation -- | | + | Alice is sole member, | | + | epoch 0) | | + | | | + | 2. FetchKeyPackage (301) | | + | bob_identity_key | | + | --------------------------------> | + | | 3. Pop bob's KeyPackage | + | | from queue (atomic) | + | bob_kp bytes | | + | <-------------------------------- | + | | | + | 4. add_member(bob_kp) | | + | Local MLS operations: | | + | a. Deserialise & validate | | + | Bob's KeyPackage | | + | b. Produce Commit message | | + | (adds Bob to ratchet | | + | tree, advances epoch) | | + | c. Produce Welcome message | | + | (encrypted to Bob's | | + | HPKE init key, contains | | + | group secrets + tree) | | + | d. merge_pending_commit() | | + | (Alice advances to | | + | epoch 1 locally) | | + | | | + | 5. Enqueue (200) | | + | recipient=bob_pk, payload=welcome | + | --------------------------------> | + | | 6. Append welcome to | + | | deliveries[bob_pk] | + | | | + | | 7. Notify bob_pk waiters | + | | (FetchWait wakes up) | + | | | + | | 8. Bob connects and polls | + | | <------------------------------ + | | FetchWait (202) | + | | | + | | 9. Drain bob's queue | + | | (returns [welcome]) | + | | | + | | [welcome_bytes] | + | | ------------------------------> + | | | + | | | 10. join_group(welcome) + | | | - Decrypt Welcome with + | | | HPKE init private key + | | | - Extract ratchet tree + | | | from GroupInfo ext + | | | - Initialise MlsGroup + | | | at epoch 1 + | | | + | | | Bob is now a group member + | | | ``` ### Key Points @@ -162,7 +175,7 @@ Bob via the DS. tree in the Welcome's `GroupInfo` extension. 
This means Bob does not need a separate tree fetch -- `new_from_welcome` extracts it automatically. -- The DS routes solely by `recipientKey` (Bob's Ed25519 public key). It does +- The DS routes solely by `recipient_key` (Bob's Ed25519 public key). It does not parse the Welcome, the Commit, or any MLS structure. --- @@ -175,63 +188,66 @@ messages through the DS. ### Sequence Diagram ```text - Alice NodeService (DS) Bob - ───── ────────────────── ─── - │ │ │ - │ ─── Alice sends a message to Bob ─── │ - │ │ │ - │ 1. send_message("hello bob") │ │ - │ MLS create_message(): │ │ - │ - Derive message key from │ │ - │ epoch secret + gen counter│ │ - │ - Encrypt plaintext with │ │ - │ AES-128-GCM │ │ - │ - Produce MlsMessageOut │ │ - │ (PrivateMessage variant) │ │ - │ - TLS-encode to bytes │ │ - │ │ │ - │ 2. enqueue(bob_pk, ciphertext) │ │ - │ ───────────────────────────────>│ │ - │ │ 3. Store in bob's queue │ - │ │ 4. Notify bob_pk waiters │ - │ │ │ - │ │ (time passes) │ - │ │ │ - │ │ 5. Bob polls for messages │ - │ │ <─────────────────────────────│ - │ │ fetchWait(bob_pk, 30000) │ - │ │ │ - │ │ 6. Drain bob's queue │ - │ │ [ciphertext] │ - │ │ ─────────────────────────────>│ - │ │ │ - │ │ │ 7. receive_message(ct) - │ │ │ MLS process_message(): - │ │ │ - Identify sender from - │ │ │ PrivateMessage header - │ │ │ - Derive decryption key - │ │ │ from epoch secret - │ │ │ - Decrypt AES-128-GCM - │ │ │ - Return plaintext: - │ │ │ "hello bob" - │ │ │ - │ ─── Bob replies to Alice ─── │ - │ │ │ - │ │ │ 8. send_message("hello alice") - │ │ │ (same MLS encrypt flow) - │ │ │ - │ │ 9. enqueue(alice_pk, ct) │ - │ │ <─────────────────────────────│ - │ │ 10. Store + notify │ - │ │ │ - │ 11. fetch(alice_pk) │ │ - │ ───────────────────────────────>│ │ - │ [ciphertext] │ │ - │ <───────────────────────────────│ │ - │ │ │ - │ 12. 
receive_message(ct) │ │ - │ -> "hello alice" │ │ - │ │ │ + Alice Server (DS, port 5001) Bob + ----- ---------------------- --- + | | | + | -- Alice sends a message to Bob -- | + | | | + | 1. send_message("hello bob") | | + | MLS create_message(): | | + | - Derive message key from | | + | epoch secret + gen counter| | + | - Encrypt plaintext with | | + | AES-128-GCM | | + | - Produce MlsMessageOut | | + | (PrivateMessage variant) | | + | - TLS-encode to bytes | | + | | | + | 2. Enqueue (200) | | + | recipient=bob_pk, payload | | + | --------------------------------> | + | | 3. Store in bob's queue | + | | 4. Notify bob_pk waiters | + | | (or push PushNewMessage) | + | | | + | | (time passes) | + | | | + | | 5. Bob polls for messages | + | | <------------------------------ + | | FetchWait (202) | + | | | + | | 6. Drain bob's queue | + | | [ciphertext] | + | | ------------------------------> + | | | + | | | 7. receive_message(ct) + | | | MLS process_message(): + | | | - Identify sender from + | | | PrivateMessage header + | | | - Derive decryption key + | | | from epoch secret + | | | - Decrypt AES-128-GCM + | | | - Return plaintext: + | | | "hello bob" + | | | + | -- Bob replies to Alice -- | + | | | + | | | 8. send_message("hello alice") + | | | (same MLS encrypt flow) + | | | + | | 9. Enqueue (200) | + | | recipient=alice_pk | + | | <------------------------------ + | | 10. Store + notify | + | | | + | 11. Fetch (201) | | + | --------------------------------> | + | [ciphertext] | | + | <-------------------------------- | + | | | + | 12. receive_message(ct) | | + | -> "hello alice" | | + | | | ``` ### Key Points @@ -243,44 +259,48 @@ messages through the DS. - **The DS is a dumb relay**: it does not decrypt, inspect, or reorder messages. It stores opaque byte blobs in a FIFO queue keyed by recipient. -- **Long-polling** via `fetchWait` avoids the need for persistent connections - or WebSocket-style push. 
The client specifies a timeout in milliseconds; the - server blocks up to that duration using `tokio::sync::Notify`. The `recv - --stream` CLI flag loops `fetchWait` indefinitely for continuous message - reception. +- **Long-polling** via `FetchWait` (202) avoids the need for persistent + connections or WebSocket-style push. The client specifies a timeout in + milliseconds; the server blocks up to that duration using + `tokio::sync::Notify`. Push events (method 1000 `PushNewMessage`) deliver + real-time notifications on a separate QUIC uni-stream. -- **Channel-aware routing** is supported: the `channelId` field in `enqueue` - and `fetch` allows scoping queues by channel (e.g., a 16-byte UUID for - 1:1 conversations). When `channelId` is empty, messages go to the default - (legacy) queue. +- **Channel-aware routing** is supported: the `channel_id` field in `Enqueue` + and `Fetch` allows scoping queues by channel (e.g., a UUID for a 1:1 + conversation or group). When `channel_id` is empty, messages go to the + default queue. --- ## Control-Plane vs. Data-Plane Summary ```text -┌─────────────────────────────────────────────────────────────────────┐ -│ Control Plane (AS) │ -│ │ -│ uploadKeyPackage ────> Store KeyPackage for identity │ -│ fetchKeyPackage <──── Pop and return one KeyPackage │ -│ uploadHybridKey ────> Store hybrid PQ public key │ -│ fetchHybridKey <──── Return hybrid PQ public key │ -│ │ -│ Traffic: Infrequent. Once per group join (upload before, │ -│ fetch during group add). 
│ -└─────────────────────────────────────────────────────────────────────┘ ++---------------------------------------------------------------------+ +| Control Plane (AS) | +| | +| UploadKeyPackage (300) ----> Store KeyPackage for identity | +| FetchKeyPackage (301) <---- Pop and return one KeyPackage | +| UploadHybridKey (302) ----> Store hybrid PQ public key | +| FetchHybridKey (303) <---- Return hybrid PQ public key | +| FetchHybridKeys (304) <---- Return hybrid keys for N identities| +| | +| Traffic: Infrequent. Once per group join (upload before, | +| fetch during group add). | ++---------------------------------------------------------------------+ -┌─────────────────────────────────────────────────────────────────────┐ -│ Data Plane (DS) │ -│ │ -│ enqueue ────> Append payload to recipient queue │ -│ fetch <──── Drain and return all queued payloads │ -│ fetchWait <──── Long-poll drain with timeout │ -│ │ -│ Traffic: High-frequency. Every MLS message (Welcome, Commit, │ -│ Application) flows through the DS. │ -└─────────────────────────────────────────────────────────────────────┘ ++---------------------------------------------------------------------+ +| Data Plane (DS) | +| | +| Enqueue (200) ----> Append payload to recipient queue | +| Fetch (201) <---- Drain and return all queued payloads| +| FetchWait (202) <---- Long-poll drain with timeout | +| Peek (203) <---- Inspect without removing | +| Ack (204) ----> Acknowledge and remove by seq num | +| BatchEnqueue (205) ----> Enqueue multiple payloads at once | +| | +| Traffic: High-frequency. Every MLS message (Welcome, Commit, | +| Application) flows through the DS. 
| ++---------------------------------------------------------------------+ ``` The separation means the AS can be rate-limited or placed behind stricter @@ -294,47 +314,49 @@ The following diagram summarises the client-side state machine across all three flows: ```text - ┌──────────────┐ - │ No State │ - └──────┬───────┘ - │ + +--------------+ + | No State | + +------+-------+ + | + OPAQUE register + login + | + v + +--------------+ + | Authenticated | session_token obtained + | | No identity yet + +------+--------+ + | IdentityKeypair::generate() - │ - ▼ - ┌──────────────┐ - │ Identity │ Ed25519 keypair exists - │ Generated │ No KeyPackage, no group - └──────┬───────┘ - │ - generate_key_package() + uploadKeyPackage() - │ - ▼ - ┌──────────────┐ - │ Registered │ KeyPackage on AS - │ │ HPKE init key in DiskKeyStore - └──────┬───────┘ - │ - ┌──────────────┴──────────────┐ - │ │ + + UploadKeyPackage (300) + | + v + +--------------+ + | Registered | KeyPackage on AS + | | HPKE init key in DiskKeyStore + +------+-------+ + | + +--------------+--------------+ + | | create_group() join_group(welcome) - │ │ - ▼ ▼ - ┌─────────────┐ ┌──────────────┐ - │ Group Owner │ │ Group Member │ - │ (epoch 0) │ │ (epoch N) │ - └──────┬──────┘ └──────┬───────┘ - │ │ - add_member() │ - │ │ - ▼ ▼ - ┌──────────────────────────────────────────┐ - │ Active Group Member │ - │ │ - │ send_message() -> enqueue via DS │ - │ receive_message() <- fetch from DS │ - │ │ - │ Epoch advances on each Commit │ - └──────────────────────────────────────────┘ + | | + v v + +-------------+ +--------------+ + | Group Owner | | Group Member | + | (epoch 0) | | (epoch N) | + +------+------+ +------+-------+ + | | + add_member() | + | | + v v + +------------------------------------------+ + | Active Group Member | + | | + | send_message() -> Enqueue (200) | + | receive_message() <- Fetch/FetchWait | + | or PushNewMessage | + | | + | Epoch advances on each Commit | + +------------------------------------------+ ``` 
--- @@ -342,7 +364,7 @@ flows: ## Further Reading - [Architecture Overview](overview.md) -- system diagram and two-service model -- [Service Architecture](service-architecture.md) -- RPC method details and long-polling internals +- [Service Architecture](service-architecture.md) -- RPC method details and push events - [GroupMember Lifecycle](../internals/group-member-lifecycle.md) -- detailed MLS state machine - [KeyPackage Exchange Flow](../internals/keypackage-exchange.md) -- single-use semantics and AS internals - [MLS (RFC 9420)](../protocol-layers/mls.md) -- key schedule, ratchet tree, and ciphersuite details diff --git a/docs/src/architecture/overview.md b/docs/src/architecture/overview.md index 098c486..0586240 100644 --- a/docs/src/architecture/overview.md +++ b/docs/src/architecture/overview.md @@ -8,21 +8,20 @@ system, the dual-key cryptographic model, and how the pieces fit together. ## Two-Service Model -The server exposes two logical services through a single **NodeService** RPC -interface, bound to **port 7000** over QUIC + TLS 1.3: +The server exposes two logical services through a unified RPC endpoint +bound to **port 5001** over QUIC + TLS 1.3: | Logical Service | Responsibility | |--------------------------|-----------------------------------------------------------------| -| **Authentication Service (AS)** | Stores and distributes single-use MLS KeyPackages. Clients upload KeyPackages after identity generation; peers fetch them to add new members to a group. | +| **Authentication Service (AS)** | Stores and distributes single-use MLS KeyPackages. Clients upload KeyPackages after identity generation; peers fetch them to add new members to a group. Also manages hybrid PQ public keys and identity resolution. | | **Delivery Service (DS)** | Store-and-forward relay for opaque payloads. The DS never inspects MLS ciphertext -- it routes solely by recipient Ed25519 public key (and optional channel ID). 
| -Combining both services into a single endpoint simplifies deployment and -reduces round-trips. The schema is defined in -[`schemas/node.capnp`](../wire-format/node-service-schema.md) as a unified -`NodeService` interface. +Both services are accessed through a single QUIC connection using the v2 +Protobuf framing protocol. Each RPC call gets a dedicated QUIC bidirectional +stream to prevent head-of-line blocking. See [Service Architecture](service-architecture.md) for per-method details, -connection lifecycle, and the long-polling `fetchWait` mechanism. +connection lifecycle, and push event delivery. --- @@ -33,17 +32,17 @@ as its long-term identity: ```text quicproquo Key Model - ┌──────────────────────────────────────────────────┐ - │ │ - │ Ed25519 signing keypair (MLS identity) │ - │ ────────────────────────────────────── │ - │ - Generated once per user/device │ - │ - Embedded in MLS BasicCredential │ - │ - Signs KeyPackages, Commits, and group ops │ - │ - Raw 32-byte public key is the AS index │ - │ - Managed by IdentityKeypair, zeroize-on-drop │ - │ │ - └──────────────────────────────────────────────────┘ + +--------------------------------------------------+ + | | + | Ed25519 signing keypair (MLS identity) | + | ------------------------------------------ | + | - Generated once per user/device | + | - Embedded in MLS BasicCredential | + | - Signs KeyPackages, Commits, and group ops | + | - Raw 32-byte public key is the AS index | + | - Managed by IdentityKeypair, zeroize-on-drop | + | | + +--------------------------------------------------+ ``` | Property | Ed25519 (MLS) | @@ -52,7 +51,7 @@ as its long-term identity: | Purpose | Identity binding, signing, MLS credentials | | Crate | `ed25519-dalek` | | Zeroize on drop | Yes (`Zeroizing<[u8; 32]>`) | -| PQ protection | MLS key schedule uses DHKEM(X25519); hybrid PQ KEM available at envelope level | +| PQ protection | MLS key schedule uses DHKEM(X25519); hybrid X25519+ML-KEM-768 KEM available at envelope level 
| For details on the cryptographic properties, see [Ed25519 Identity Keys](../cryptography/identity-keys.md). @@ -62,37 +61,41 @@ For details on the cryptographic properties, see ## System Diagram ```text - ┌─────────────────┐ ┌─────────────────┐ - │ Alice Client │ │ Bob Client │ - │ │ │ │ - │ IdentityKeypair │ │ IdentityKeypair │ - │ (Ed25519) │ │ (Ed25519) │ - │ │ │ │ - │ GroupMember │ │ GroupMember │ - │ (MLS state) │ │ (MLS state) │ - └────────┬─────────┘ └────────┬─────────┘ - │ │ - │ QUIC + TLS 1.3 (quinn/rustls) │ - │ │ - ▼ ▼ - ┌────────────────────────────────────────────────────────────────────────────┐ - │ NodeService (port 7000) │ - │ │ - │ ┌──────────────────────────┐ ┌───────────────────────────────────┐ │ - │ │ Authentication Service │ │ Delivery Service │ │ - │ │ │ │ │ │ - │ │ uploadKeyPackage() │ │ enqueue(recipientKey, payload) │ │ - │ │ fetchKeyPackage() │ │ fetch(recipientKey) │ │ - │ │ uploadHybridKey() │ │ fetchWait(recipientKey, timeout) │ │ - │ │ fetchHybridKey() │ │ │ │ - │ │ │ │ Queues: DashMap + FileBackedStore│ │ - │ │ Store: DashMap + │ │ │ │ - │ │ FileBackedStore │ │ │ │ - │ └──────────────────────────┘ └───────────────────────────────────┘ │ - │ │ - │ health() │ - │ │ - └────────────────────────────────────────────────────────────────────────────┘ + +-----------------+ +-----------------+ + | Alice Client | | Bob Client | + | | | | + | IdentityKeypair | | IdentityKeypair | + | (Ed25519) | | (Ed25519) | + | | | | + | QpqClient | | QpqClient | + | (SDK) | | (SDK) | + +--------+---------+ +--------+---------+ + | | + | QUIC + TLS 1.3 (quinn/rustls) ALPN: "qpq" | + | | + v v + +------------------------------------------------------------------------+ + | quicproquo-server (port 5001) | + | | + | +---------------------------+ +--------------------------------+ | + | | Authentication Service | | Delivery Service | | + | | | | | | + | | OpaqueRegisterStart(100) | | Enqueue(200) | | + | | OpaqueRegisterFinish(101)| | Fetch(201) | | + | | 
OpaqueLoginStart(102) | | FetchWait(202) | | + | | OpaqueLoginFinish(103) | | Peek(203) | | + | | | | Ack(204) | | + | | UploadKeyPackage(300) | | BatchEnqueue(205) | | + | | FetchKeyPackage(301) | | | | + | | UploadHybridKey(302) | | Store: SQLCipher | | + | | FetchHybridKey(303) | | DashMap waiters | | + | | FetchHybridKeys(304) | +--------------------------------+ | + | +---------------------------+ | + | | + | + 34 more methods: Keys, Channel, Group, User, KT, Blob, Device, | + | P2P, Federation, Moderation, Recovery, Account (see method_ids) | + | | + +------------------------------------------------------------------------+ ``` **Key observations:** @@ -103,7 +106,11 @@ For details on the cryptographic properties, see 2. KeyPackages are single-use (RFC 9420 requirement). The AS atomically removes a KeyPackage on fetch to enforce this invariant. -3. QUIC + TLS 1.3 is the sole transport layer. +3. QUIC + TLS 1.3 is the sole transport layer. The ALPN identifier is `qpq`. + +4. Push events (new messages, typing, presence, membership changes) are + delivered server-to-client on QUIC uni-streams using a separate push frame + format. --- @@ -114,14 +121,14 @@ The system stacks three protocol layers: 1. **Transport** -- QUIC + TLS 1.3. Provides confidentiality, integrity, and server authentication. See [Protocol Stack](protocol-stack.md). -2. **Framing / RPC** -- Cap'n Proto serialisation and RPC. Provides zero-copy - typed messages, schema versioning, and async method dispatch. - See [Cap'n Proto Serialisation and RPC](../protocol-layers/capn-proto.md). +2. **Framing / RPC** -- Custom binary header + Protobuf serialisation. Each + request frame is `[method_id: u16][request_id: u32][payload_len: u32][protobuf]`. + Responses are `[status: u8][request_id: u32][payload_len: u32][protobuf]`. + See [Protobuf Framing](../protocol-layers/capn-proto.md). 3. **End-to-End Encryption** -- MLS (RFC 9420). 
Provides group key agreement, forward secrecy, and post-compromise security. The server never holds group - keys. - See [MLS (RFC 9420)](../protocol-layers/mls.md). + keys. See [MLS (RFC 9420)](../protocol-layers/mls.md). An optional fourth layer -- the **hybrid KEM envelope** (X25519 + ML-KEM-768) -- wraps MLS payloads for post-quantum confidentiality at the per-message level. @@ -131,14 +138,19 @@ See [Hybrid KEM](../protocol-layers/hybrid-kem.md). ## Crate Map -The implementation is split across four workspace crates: +The implementation is split across nine workspace crates: -| Crate | Role | -|----------------------------|-------------------------------------------------------------------| -| `quicproquo-core` | Crypto primitives, MLS state machine, hybrid KEM | -| `quicproquo-proto` | Cap'n Proto schemas, codegen, and serialisation helpers | -| `quicproquo-server` | QUIC listener, NodeService RPC, storage | -| `quicproquo-client` | QUIC client, CLI subcommands, state persistence | +| Crate | Role | +|------------------------------|-------------------------------------------------------------------| +| `quicproquo-core` | Crypto primitives, MLS state machine, hybrid KEM | +| `quicproquo-proto` | Cap'n Proto legacy types + Protobuf (prost) v2 generated types | +| `quicproquo-kt` | Key transparency (append-only log, revocation) | +| `quicproquo-plugin-api` | `#![no_std]` C-ABI plugin interface | +| `quicproquo-rpc` | QUIC RPC framework: framing, server dispatch, client, middleware | +| `quicproquo-sdk` | Client SDK: `QpqClient`, event broadcast, `ConversationStore` | +| `quicproquo-server` | RPC server + domain services | +| `quicproquo-client` | CLI/TUI client binary | +| `quicproquo-p2p` | iroh P2P endpoint publish/resolve (feature-flagged) | See [Crate Responsibilities](crate-responsibilities.md) for a full breakdown and dependency diagram. @@ -148,7 +160,7 @@ and dependency diagram. 
## Further Reading - [Protocol Stack](protocol-stack.md) -- layered protocol stack description -- [Service Architecture](service-architecture.md) -- NodeService RPC methods, connection lifecycle, long-polling +- [Service Architecture](service-architecture.md) -- 49 RPC methods, connection lifecycle, push events - [End-to-End Data Flow](data-flow.md) -- registration, group creation, and message exchange sequence diagrams -- [Wire Format Overview](../wire-format/overview.md) -- Cap'n Proto schema reference +- [Wire Format Reference](../wire-format/overview.md) -- Protobuf schema reference and method ID table - [Cryptography Overview](../cryptography/overview.md) -- detailed cryptographic properties and threat model diff --git a/docs/src/architecture/protocol-stack.md b/docs/src/architecture/protocol-stack.md index 4fc9efb..598699b 100644 --- a/docs/src/architecture/protocol-stack.md +++ b/docs/src/architecture/protocol-stack.md @@ -10,16 +10,17 @@ comparison table. ## Transport: QUIC + TLS 1.3 The transport layer is QUIC over UDP with TLS 1.3 negotiated by `quinn` and -`rustls`. Cap'n Proto RPC rides on a bidirectional QUIC stream. +`rustls`. The v2 Protobuf framing protocol rides on individual QUIC streams, +one per RPC call. 
```text -┌─────────────────────────────────────────────┐ -│ Application / MLS ciphertext │ <- group key ratchet (RFC 9420) -├─────────────────────────────────────────────┤ -│ Cap'n Proto RPC │ <- typed, schema-versioned framing -├─────────────────────────────────────────────┤ -│ QUIC + TLS 1.3 (quinn / rustls) │ <- mutual auth + transport secrecy -└─────────────────────────────────────────────┘ ++---------------------------------------------+ +| Application / MLS ciphertext | <- group key ratchet (RFC 9420) ++---------------------------------------------+ +| Protobuf framing (custom binary header) | <- typed, length-prefixed framing ++---------------------------------------------+ +| QUIC + TLS 1.3 (quinn / rustls) | <- mutual auth + transport secrecy ++---------------------------------------------+ ``` ### What each layer provides @@ -31,18 +32,20 @@ The transport layer is QUIC over UDP with TLS 1.3 negotiated by `quinn` and - TLS 1.3 provides perfect forward secrecy per connection via ephemeral ECDHE. - The server presents a self-signed certificate by default; the client pins the server certificate via `--ca-cert`. -- ALPN protocol identifier: `capnp`. +- ALPN protocol identifier: `qpq`. - Multiplexed streams over a single UDP socket -- one bidirectional stream - per RPC session. + per RPC call, preventing head-of-line blocking. +- Uni-directional streams for server-to-client push events. -**Cap'n Proto RPC** (`capnp`, `capnp-rpc`) +**Protobuf framing** (`quicproquo-rpc`, `quicproquo-proto`) -- Zero-copy, schema-versioned serialisation. -- Asynchronous RPC with promise pipelining (multiple in-flight calls). -- The `NodeService` interface (defined in `schemas/node.capnp`) multiplexes - Authentication and Delivery operations on a single connection. -- The two-party VatNetwork runs over `tokio::io::compat` adapters wrapping - QUIC send/recv streams. +- Three frame types: Request, Response, Push. 
+- Fixed-length binary headers carry method/status codes, request correlation + IDs, and payload length; zero-copy from header to `bytes::Bytes`. +- 49 RPC method IDs across 14 service categories. +- 4 push event types (NewMessage, Typing, Presence, Membership). +- All multi-byte integers in big-endian (network byte order). +- Maximum payload size: 4 MiB per frame. **MLS (RFC 9420)** (`openmls`, `openmls_rust_crypto`) @@ -63,7 +66,7 @@ The transport layer is QUIC over UDP with TLS 1.3 negotiated by `quinn` and | Layer | Provides | Crate(s) | |-------------|------------------------------------------------------------------|-----------------------------------------| | **Transport: QUIC + TLS 1.3** | Confidentiality, server authentication, forward secrecy, multiplexed streams, congestion control | `quinn`, `rustls` | -| **Framing: Cap'n Proto** | Zero-copy typed serialisation, schema versioning, async RPC with promise pipelining | `capnp`, `capnp-rpc` | +| **Framing: Protobuf** | Typed serialisation, length-prefixed framing, method dispatch, push events | `quicproquo-rpc`, `prost` | | **Encryption: MLS** | Group key agreement, forward secrecy, post-compromise security, identity binding | `openmls`, `openmls_rust_crypto` | | **Encryption: Hybrid KEM** (optional) | Post-quantum confidentiality for individual payloads (X25519 + ML-KEM-768) | `ml-kem`, `x25519-dalek`, `chacha20poly1305`, `hkdf` | @@ -75,35 +78,42 @@ A plaintext message traverses the stack as follows: ```text Sender Recipient -────── ───────── +------ --------- plaintext bytes - │ - ▼ + | + v MLS create_message() - │ ── encrypts with group AEAD key (AES-128-GCM) ── - ▼ + | -- encrypts with group AEAD key (AES-128-GCM) -- + v TLS-encoded MlsMessageOut (opaque ciphertext blob) - │ - ▼ -Cap'n Proto: enqueue(recipientKey, payload) - │ ── serialised into NodeService RPC call ── - ▼ -QUIC stream (TLS 1.3 encrypted) - │ - ▼ - ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌ network ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌ - │ - ▼ -Server: 
NodeService.enqueue() stores payload in FIFO queue - │ - ▼ -Cap'n Proto: fetch() / fetchWait() returns payload - │ - ▼ + | + v +Protobuf encode: EnqueueRequest { recipient_key, payload, ... } + | + v +RequestFrame: [200: u16][req_id: u32][len: u32][protobuf bytes] + | + v +QUIC bidirectional stream (TLS 1.3 encrypted) + | + v + .............. network .............. + | + v +Server: handler reads RequestFrame, stores payload in queue + | + v +ResponseFrame: [0: u8 (Ok)][req_id: u32][len: u32][EnqueueResponse bytes] + (or PushFrame on uni-stream when push event fires) + | + v +Client: Fetch(201) or receives PushFrame (event_type=1000) + | + v MLS process_message() - │ ── decrypts with group AEAD key ── - ▼ + | -- decrypts with group AEAD key -- + v plaintext bytes ``` @@ -116,6 +126,6 @@ The server **never** holds the MLS group key. It sees only the encrypted - [Architecture Overview](overview.md) -- high-level system diagram and identity key model - [QUIC + TLS 1.3](../protocol-layers/quic-tls.md) -- QUIC configuration, ALPN, and certificate handling -- [Cap'n Proto Serialisation and RPC](../protocol-layers/capn-proto.md) -- schema design and VatNetwork wiring +- [Protobuf Framing](../protocol-layers/capn-proto.md) -- frame format, method IDs, status codes - [MLS (RFC 9420)](../protocol-layers/mls.md) -- ciphersuite selection, key schedule, and ratchet tree - [Hybrid KEM: X25519 + ML-KEM-768](../protocol-layers/hybrid-kem.md) -- post-quantum envelope encryption diff --git a/docs/src/architecture/service-architecture.md b/docs/src/architecture/service-architecture.md index 7d5264a..11ca27a 100644 --- a/docs/src/architecture/service-architecture.md +++ b/docs/src/architecture/service-architecture.md @@ -1,61 +1,233 @@ # Service Architecture -The quicproquo server exposes a single **NodeService** RPC endpoint that -combines Authentication and Delivery operations. 
This page documents the RPC -interface, per-connection lifecycle, storage model, long-polling mechanism, and -authentication context. +The quicproquo server exposes 49 RPC methods through a single QUIC + TLS 1.3 +endpoint on **port 5001**. Methods are dispatched by numeric method ID using +the v2 Protobuf framing protocol. This page documents the method reference, +connection lifecycle, storage model, and authentication flow. --- -## NodeService Endpoint +## RPC Endpoint -A single QUIC + TLS 1.3 listener on **port 7000** serves all operations. -The schema is defined in `schemas/node.capnp` and documented in -[NodeService Schema](../wire-format/node-service-schema.md). +A single QUIC + TLS 1.3 listener on **port 5001** serves all operations. +The ALPN identifier is `qpq`. Each RPC call uses a dedicated QUIC +bidirectional stream; calls are concurrent and do not block each other. ```text -NodeService (port 7000) -├── Authentication methods -│ ├── uploadKeyPackage(identityKey, package, auth) -> fingerprint -│ ├── fetchKeyPackage(identityKey, auth) -> package -│ ├── uploadHybridKey(identityKey, hybridPublicKey) -> () -│ └── fetchHybridKey(identityKey) -> hybridPublicKey -│ -├── Delivery methods -│ ├── enqueue(recipientKey, payload, channelId, version, auth) -> () -│ ├── fetch(recipientKey, channelId, version, auth) -> payloads -│ └── fetchWait(recipientKey, channelId, version, timeoutMs, auth) -> payloads -│ -└── Operational - └── health() -> status +quicproquo-server (port 5001, ALPN: "qpq") +| ++-- Auth (100-103) +| +-- 100: OpaqueRegisterStart +| +-- 101: OpaqueRegisterFinish +| +-- 102: OpaqueLoginStart +| +-- 103: OpaqueLoginFinish +| ++-- Delivery (200-205) +| +-- 200: Enqueue +| +-- 201: Fetch +| +-- 202: FetchWait +| +-- 203: Peek +| +-- 204: Ack +| +-- 205: BatchEnqueue +| ++-- Keys (300-304) +| +-- 300: UploadKeyPackage +| +-- 301: FetchKeyPackage +| +-- 302: UploadHybridKey +| +-- 303: FetchHybridKey +| +-- 304: FetchHybridKeys +| ++-- Channel (400) +| +-- 
400: CreateChannel +| ++-- Group Management (410-413) +| +-- 410: RemoveMember +| +-- 411: UpdateGroupMetadata +| +-- 412: ListGroupMembers +| +-- 413: RotateKeys +| ++-- Moderation (420-424) +| +-- 420: ReportMessage +| +-- 421: BanUser +| +-- 422: UnbanUser +| +-- 423: ListReports +| +-- 424: ListBanned +| ++-- User (500-501) +| +-- 500: ResolveUser +| +-- 501: ResolveIdentity +| ++-- Key Transparency (510-520) +| +-- 510: RevokeKey +| +-- 511: CheckRevocation +| +-- 520: AuditKeyTransparency +| ++-- Blob (600-601) +| +-- 600: UploadBlob +| +-- 601: DownloadBlob +| ++-- Device (700-710) +| +-- 700: RegisterDevice +| +-- 701: ListDevices +| +-- 702: RevokeDevice +| +-- 710: RegisterPushToken +| ++-- Recovery (750-752) +| +-- 750: StoreRecoveryBundle +| +-- 751: FetchRecoveryBundle +| +-- 752: DeleteRecoveryBundle +| ++-- P2P (800-802) +| +-- 800: PublishEndpoint +| +-- 801: ResolveEndpoint +| +-- 802: Health +| ++-- Federation (900-905) +| +-- 900: RelayEnqueue +| +-- 901: RelayBatchEnqueue +| +-- 902: ProxyFetchKeyPackage +| +-- 903: ProxyFetchHybridKey +| +-- 904: ProxyResolveUser +| +-- 905: FederationHealth +| ++-- Account (950) + +-- 950: DeleteAccount + +Push event types (server -> client, uni-stream): + 1000: PushNewMessage + 1001: PushTyping + 1002: PushPresence + 1003: PushMembership ``` --- ## RPC Method Reference -### Authentication Service Methods +### Auth (100-103) -| Method | Params | Returns | Semantics | -|----------------------|-------------------------------------|------------------|-----------| -| `uploadKeyPackage` | `identityKey` (32 B Ed25519 pk), `package` (TLS-encoded KeyPackage), `auth` | `fingerprint` (SHA-256 of package) | Appends the KeyPackage to a per-identity FIFO queue. The fingerprint lets the client detect server-side tampering. Max package size: 1 MB. | -| `fetchKeyPackage` | `identityKey` (32 B), `auth` | `package` (or empty `Data`) | Atomically pops and returns the oldest KeyPackage for the identity. 
Returns empty bytes if none are stored. Single-use semantics per RFC 9420. | -| `uploadHybridKey` | `identityKey` (32 B), `hybridPublicKey` (X25519 pk + ML-KEM-768 ek) | `()` | Stores (or replaces) the hybrid PQ public key for envelope-level post-quantum encryption. | -| `fetchHybridKey` | `identityKey` (32 B) | `hybridPublicKey` (or empty `Data`) | Returns the stored hybrid public key for a peer, or empty if none. | +OPAQUE password authentication (asymmetric PAKE). The password is never sent +to the server. Method IDs 100-103 implement the 4-step OPAQUE handshake. -### Delivery Service Methods +| ID | Method | Description | +|-----|-------------------------|-------------| +| 100 | `OpaqueRegisterStart` | Client initiates registration with `username` and OPAQUE `registration_request` blob. Server returns `registration_response`. | +| 101 | `OpaqueRegisterFinish` | Client completes registration with `username`, OPAQUE `upload` blob, and Ed25519 `identity_key`. Server stores the OPAQUE record. | +| 102 | `OpaqueLoginStart` | Client initiates login with `username` and OPAQUE `login_request` blob. Server returns `login_response`. | +| 103 | `OpaqueLoginFinish` | Client completes login with `username`, OPAQUE `finalization` blob, and `identity_key`. Server returns a `session_token`. | -| Method | Params | Returns | Semantics | -|--------------|------------------------------------------------------------------------|----------------------|-----------| -| `enqueue` | `recipientKey` (32 B), `payload` (opaque), `channelId`, `version`, `auth` | `()` | Appends `payload` to the recipient's FIFO queue. Max payload: 5 MB. Wakes any `fetchWait` waiter for this recipient. Supported versions: 0 (legacy), 1 (current). | -| `fetch` | `recipientKey` (32 B), `channelId`, `version`, `auth` | `payloads: List(Data)` | Atomically drains and returns the full queue in FIFO order. Returns empty list if nothing is pending. 
| -| `fetchWait` | `recipientKey` (32 B), `channelId`, `version`, `timeoutMs`, `auth` | `payloads: List(Data)` | Same as `fetch`, but if the queue is empty and `timeoutMs > 0`, blocks up to `timeoutMs` milliseconds waiting for a `Notify` signal from `enqueue`. Returns whatever is in the queue when the wait completes or times out. | +The `session_token` is an opaque bearer token used for subsequent authenticated +RPCs. It is passed in the Protobuf request body (not as a frame-level header). -### Operational Methods +### Delivery (200-205) -| Method | Params | Returns | Semantics | -|----------|--------|-----------------|-----------| -| `health` | none | `status: Text` | Returns `"ok"`. Used for liveness/readiness probes. | +Store-and-forward relay. The server never inspects MLS ciphertext -- it routes +opaque byte blobs by recipient key. + +| ID | Method | Description | +|-----|----------------|-------------| +| 200 | `Enqueue` | Append an opaque payload to the recipient's FIFO queue. Wakes `FetchWait` waiters. | +| 201 | `Fetch` | Drain and return all queued payloads in FIFO order. | +| 202 | `FetchWait` | Same as `Fetch`, but long-polls if the queue is empty (up to `timeout_ms`). | +| 203 | `Peek` | Return queued payloads without removing them. | +| 204 | `Ack` | Acknowledge and remove specific payloads by sequence number. | +| 205 | `BatchEnqueue` | Enqueue multiple payloads in a single RPC call. | + +### Keys (300-304) + +MLS KeyPackage distribution and hybrid PQ public key management. + +| ID | Method | Description | +|-----|--------------------|-------------| +| 300 | `UploadKeyPackage` | Append a TLS-encoded MLS KeyPackage to the identity's queue. Single-use: each fetch atomically removes one. | +| 301 | `FetchKeyPackage` | Atomically pop and return the oldest KeyPackage for an identity. Returns empty if none. | +| 302 | `UploadHybridKey` | Store (or replace) the X25519+ML-KEM-768 hybrid public key for an identity. 
| +| 303 | `FetchHybridKey` | Return the stored hybrid public key for a single identity. | +| 304 | `FetchHybridKeys` | Return hybrid public keys for multiple identities in one call. | + +### Group Management (400, 410-413) + +| ID | Method | Description | +|-----|-----------------------|-------------| +| 400 | `CreateChannel` | Register a new channel (group) on the server. | +| 410 | `RemoveMember` | Remove a member from a group (server-side record). | +| 411 | `UpdateGroupMetadata` | Update group name, description, or settings. | +| 412 | `ListGroupMembers` | List all members of a group. | +| 413 | `RotateKeys` | Trigger a server-assisted key rotation event. | + +### User / Identity (500-501) + +| ID | Method | Description | +|-----|-------------------|-------------| +| 500 | `ResolveUser` | Resolve a username to an Ed25519 public key. | +| 501 | `ResolveIdentity` | Resolve an identity key to user profile information. | + +### Key Transparency (510-520) + +| ID | Method | Description | +|-----|--------------------------|-------------| +| 510 | `RevokeKey` | Append a key revocation record to the transparency log. | +| 511 | `CheckRevocation` | Check whether a given key has been revoked. | +| 520 | `AuditKeyTransparency` | Fetch a transparency log audit proof for a key. | + +### Blob Storage (600-601) + +| ID | Method | Description | +|-----|----------------|-------------| +| 600 | `UploadBlob` | Store a binary blob (file attachment, avatar, etc.). Returns a content-addressed blob ID. | +| 601 | `DownloadBlob` | Retrieve a blob by ID. | + +### Device Management (700-710) + +| ID | Method | Description | +|-----|-----------------------|-------------| +| 700 | `RegisterDevice` | Register a new device for a user account. | +| 701 | `ListDevices` | List all registered devices for the authenticated user. | +| 702 | `RevokeDevice` | Revoke a device, invalidating its session. | +| 710 | `RegisterPushToken` | Register a push notification token (APNs / FCM) for a device. 
| + +### Recovery (750-752) + +| ID | Method | Description | +|-----|-------------------------|-------------| +| 750 | `StoreRecoveryBundle` | Encrypt and store an account recovery bundle server-side. | +| 751 | `FetchRecoveryBundle` | Retrieve the recovery bundle (requires OPAQUE re-authentication). | +| 752 | `DeleteRecoveryBundle` | Delete the stored recovery bundle. | + +### P2P and Health (800-802) + +| ID | Method | Description | +|-----|--------------------|-------------| +| 800 | `PublishEndpoint` | Publish a direct P2P endpoint (iroh node address). | +| 801 | `ResolveEndpoint` | Resolve a peer's P2P endpoint by identity key. | +| 802 | `Health` | Liveness/readiness probe. Returns server uptime and status. | + +### Federation (900-905) + +| ID | Method | Description | +|-----|-------------------------|-------------| +| 900 | `RelayEnqueue` | Relay a single message to a user on another server. | +| 901 | `RelayBatchEnqueue` | Relay multiple messages in one request. | +| 902 | `ProxyFetchKeyPackage` | Fetch a KeyPackage from a remote server on behalf of a local client. | +| 903 | `ProxyFetchHybridKey` | Fetch a hybrid public key from a remote server. | +| 904 | `ProxyResolveUser` | Resolve a username on a remote server. | +| 905 | `FederationHealth` | Check health of the federation link to another server. | + +### Moderation (420-424) + +| ID | Method | Description | +|-----|----------------|-------------| +| 420 | `ReportMessage` | Submit a content moderation report. | +| 421 | `BanUser` | Ban a user from a channel or server-wide. | +| 422 | `UnbanUser` | Lift a ban. | +| 423 | `ListReports` | List pending moderation reports (admin only). | +| 424 | `ListBanned` | List banned users (admin only). | + +### Account (950) + +| ID | Method | Description | +|-----|-----------------|-------------| +| 950 | `DeleteAccount` | Permanently delete the authenticated account and all associated data. 
| --- @@ -64,196 +236,127 @@ NodeService (port 7000) Each incoming QUIC connection follows this sequence: ```text -┌──────────────────────────────────────────────────────────────────────┐ -│ Client Server │ -│ │ -│ 1. UDP packet -> │ -│ QUIC INITIAL │ -│ │ -│ 2. <- QUIC HANDSHAKE │ -│ TLS 1.3 ServerHello + │ -│ Certificate (self-signed) │ -│ ALPN: "capnp" │ -│ │ -│ 3. Client verifies server │ -│ cert against pinned CA │ -│ cert (--ca-cert flag) │ -│ │ -│ 4. QUIC connection established │ -│ │ -│ 5. Client opens bidirectional ──────────> Server accepts bi stream │ -│ QUIC stream (open_bi) (accept_bi) │ -│ │ -│ 6. tokio_util::compat adapters wrap the send/recv halves │ -│ into AsyncRead + AsyncWrite │ -│ │ -│ 7. capnp-rpc twoparty::VatNetwork │ -│ Client Side::Client Server Side::Server │ -│ │ -│ 8. RpcSystem::new() starts │ -│ promise-pipelined RPC loop │ -│ │ -│ 9. Client bootstraps │ -│ node_service::Client NodeServiceImpl created │ -│ (shares Arc, │ -│ Arc>) │ -│ │ -│ 10. RPC calls flow over the bidirectional stream │ -│ until either side closes the connection. │ -└──────────────────────────────────────────────────────────────────────┘ +Client Server +------ ------ +1. UDP QUIC INITIAL -> + +2. <- QUIC HANDSHAKE + TLS 1.3 ServerHello + + Certificate (self-signed) + ALPN: "qpq" + +3. Client verifies server cert against + pinned CA cert (--ca-cert flag) + +4. QUIC connection established + +5. Per RPC call: + Client opens bidirectional stream + Client writes RequestFrame: + [method_id: u16][req_id: u32][len: u32][protobuf] + Client marks end-of-write + +6. Server reads RequestFrame + Server dispatches to handler by method_id + Handler processes, writes ResponseFrame: + [status: u8][req_id: u32][len: u32][protobuf] + +7. For push events (server -> client): + Server opens uni-stream + Server writes PushFrame: + [event_type: u16][len: u32][protobuf] + +8. 
Multiple RPCs run concurrently + (each on its own stream) ``` -### LocalSet requirement +### Concurrency model -`capnp-rpc` uses `Rc>` internally, making it `!Send`. Therefore: - -- The server runs the entire accept loop inside a `tokio::task::LocalSet`. -- Each connection handler is `spawn_local`, ensuring all RPC futures stay on a - single thread. -- The client wraps each subcommand invocation in its own `LocalSet::run_until`. - -This is a fundamental constraint of the Cap'n Proto RPC runtime in Rust. -Attempts to spawn RPC futures on the multi-threaded Tokio executor will fail -with a compile error. +Unlike the v1 Cap'n Proto RPC (which was `!Send` due to `Rc>` +internals and required `LocalSet`), the v2 RPC framework uses `Arc`-based +shared state and spawns each handler with `tokio::spawn`. The server can +handle many concurrent requests per connection without a `LocalSet`. --- -## Storage Model +## Status Codes -`NodeServiceImpl` holds two pieces of shared state: +Response frames carry a `status: u8` field: -### FileBackedStore +| Value | Status | Meaning | +|-------|------------------|---------| +| 0 | `Ok` | Success | +| 1 | `BadRequest` | Malformed request or missing required field | +| 2 | `Unauthorized` | Missing or invalid session token | +| 3 | `Forbidden` | Valid token but insufficient permissions | +| 4 | `NotFound` | Requested resource does not exist | +| 5 | `RateLimited` | Request rate limit exceeded; retry after backoff | +| 8 | `DeadlineExceeded` | Request timed out on the server | +| 9 | `Unavailable` | Server temporarily unable to serve the request | +| 10 | `Internal` | Unexpected server error | +| 11 | `UnknownMethod` | The requested method_id is not registered | + +--- + +## Authentication Flow + +OPAQUE (RFC-compliant asymmetric PAKE) prevents the password from reaching +the server in any form: ```text -FileBackedStore -├── key_packages: Mutex, VecDeque>>> -│ Key: Ed25519 public key (32 bytes) -│ Value: FIFO queue of TLS-encoded KeyPackage 
blobs -│ File: data/keypackages.bin (bincode) -│ -├── deliveries: Mutex>>> -│ ChannelKey: { channel_id: Vec, recipient_key: Vec } -│ Value: FIFO queue of opaque payload blobs -│ File: data/deliveries.bin (bincode, v2 format) -│ -└── hybrid_keys: Mutex, Vec>> - Key: Ed25519 public key (32 bytes) - Value: serialised HybridPublicKey blob - File: data/hybridkeys.bin (bincode) +Client Server + | | + | OpaqueRegisterStart(100): | + | username, registration_request | + | --------------------------------->| + | | + | registration_response | + | <---------------------------------| + | | + | OpaqueRegisterFinish(101): | + | username, upload, identity_key | + | --------------------------------->| + | | + | success | + | <---------------------------------| + | | + | OpaqueLoginStart(102): | + | username, login_request | + | --------------------------------->| + | | + | login_response | + | <---------------------------------| + | | + | OpaqueLoginFinish(103): | + | username, finalization, | + | identity_key | + | --------------------------------->| + | | + | session_token | + | <---------------------------------| ``` -Every mutation (upload, fetch, enqueue) acquires the relevant `Mutex`, modifies -the in-memory `HashMap`, and then flushes the entire map to disk as a bincode -blob. This is intentionally simple for MVP-scale workloads. A production -deployment would replace this with an embedded database or external store. - -The delivery map supports a **v1 -> v2 upgrade path**: if `deliveries.bin` -contains the legacy `QueueMapV1` format (keyed by `recipientKey` only), the -store transparently upgrades entries by wrapping them in `ChannelKey` with an -empty `channel_id`. - -### DashMap Waiters - -```text -Arc, Arc>> - Key: recipient Ed25519 public key (32 bytes) - Value: tokio::sync::Notify instance -``` - -The waiters map is orthogonal to `FileBackedStore`. It lives entirely in -memory and serves the `fetchWait` long-polling mechanism: - -1. 
`enqueue` calls `waiter(&recipient_key).notify_waiters()` after storing the - payload. -2. `fetchWait` first tries a regular `fetch`. If the queue is empty and - `timeoutMs > 0`: - - Look up or insert a `Notify` for the recipient. - - `tokio::time::timeout(Duration::from_millis(timeoutMs), notify.notified())` - - When notified (or on timeout), perform a second `fetch` and return - whatever is available. - -This design avoids busy-polling while keeping the implementation lock-free -(DashMap uses sharded RwLocks internally). - ---- - -## Auth Struct - -Every RPC method that modifies or reads user-specific state accepts an `Auth` -parameter: - -```capnp -struct Auth { - version @0 :UInt16; # 0 = legacy/none, 1 = token-based auth - accessToken @1 :Data; # opaque bearer token - deviceId @2 :Data; # optional UUID for auditing/rate limiting -} -``` - -### Version semantics - -| Version | Meaning | -|---------|------------------------------------------------------------| -| 0 | Legacy / no authentication. The server accepts the request without checking credentials. Suitable for development and testing. | -| 1 | Token-based authentication. The `accessToken` field should contain an opaque bearer token issued at login. The server validates the token against a token store (not yet implemented -- see [Auth, Devices, and Tokens](../roadmap/authz-plan.md)). | - -The server validates the `version` field on every request via `validate_auth()`. -Requests with unsupported versions are rejected with a Cap'n Proto error. - -### Client-side usage - -The client CLI accepts `--access-token` and `--device-id` flags (or the -corresponding environment variables). These are bundled into a `ClientAuth` -struct and injected into every outgoing RPC call via the `set_auth()` helper. - -Currently, the client sends `version = 0` with empty token and device ID by -default. When the token-based auth flow is implemented, the client will populate -these fields. 
- ---- - -## Validation and Limits - -The server enforces the following constraints on every RPC call: - -| Constraint | Value | Error on violation | -|-----------------------------|--------------------|--------------------| -| `identityKey` / `recipientKey` length | Exactly 32 bytes | Cap'n Proto error: "must be exactly 32 bytes" | -| KeyPackage size | <= 1 MB | Cap'n Proto error: "package exceeds max size" | -| Payload size | <= 5 MB | Cap'n Proto error: "payload exceeds max size" | -| Wire version | 0 or 1 | Cap'n Proto error: "unsupported wire version" | -| Auth version | 0 or 1 | Cap'n Proto error: "unsupported auth version" | -| KeyPackage non-empty | `package.len() > 0`| Cap'n Proto error: "package must not be empty" | -| Payload non-empty | `payload.len() > 0`| Cap'n Proto error: "payload must not be empty" | +The `session_token` is then passed in subsequent Protobuf requests. The server +validates it on every authenticated method call. --- ## Configuration -The server binary is configured via CLI flags or environment variables: +| Flag | Env var | Default | Description | +|----------------|----------------------------|------------------------|-------------| +| `--listen` | `QPQ_LISTEN` | `0.0.0.0:5001` | QUIC listen address (host:port). | +| `--data-dir` | `QPQ_DATA_DIR` | `data` | Directory for persisted state. | +| `--tls-cert` | `QPQ_TLS_CERT` | `data/server-cert.der` | Path to TLS certificate (DER). Auto-generated if missing. | +| `--tls-key` | `QPQ_TLS_KEY` | `data/server-key.der` | Path to TLS private key (DER). Auto-generated if missing. | -| Flag | Env var | Default | Description | -|----------------|----------------------------|----------------------|-------------| -| `--listen` | `QPQ_LISTEN` | `0.0.0.0:7000` | QUIC listen address (host:port). | -| `--data-dir` | `QPQ_DATA_DIR` | `data` | Directory for persisted KeyPackages, delivery queues, and hybrid keys. 
| -| `--tls-cert` | `QPQ_TLS_CERT` | `data/server-cert.der` | Path to TLS certificate (DER). Auto-generated if missing. | -| `--tls-key` | `QPQ_TLS_KEY` | `data/server-key.der` | Path to TLS private key (DER). Auto-generated if missing. | - -If the TLS certificate or key files do not exist at startup, the server -auto-generates a self-signed certificate for `localhost`, `127.0.0.1`, and -`::1` using `rcgen`. - -Logging level is controlled by the `RUST_LOG` environment variable (default: -`info`). +Logging level is controlled by the `RUST_LOG` environment variable (default: `info`). --- ## Further Reading -- [Architecture Overview](overview.md) -- two-service model and dual-key overview -- [NodeService Schema](../wire-format/node-service-schema.md) -- full Cap'n Proto schema -- [End-to-End Data Flow](data-flow.md) -- sequence diagrams showing registration, group creation, and messaging -- [Delivery Service Internals](../internals/delivery-service.md) -- queue routing and channel-aware delivery -- [Authentication Service Internals](../internals/authentication-service.md) -- KeyPackage lifecycle -- [Storage Backend](../internals/storage-backend.md) -- FileBackedStore details and upgrade path -- [Auth, Devices, and Tokens](../roadmap/authz-plan.md) -- planned token-based authentication +- [Architecture Overview](overview.md) -- two-service model and system diagram +- [End-to-End Data Flow](data-flow.md) -- sequence diagrams for registration, group creation, and messaging +- [Protobuf Framing](../protocol-layers/capn-proto.md) -- frame format details and method ID constants +- [Wire Format Reference](../wire-format/overview.md) -- full Protobuf schema documentation diff --git a/docs/src/cryptography/threat-model.md b/docs/src/cryptography/threat-model.md index 85ce747..4d9ee35 100644 --- a/docs/src/cryptography/threat-model.md +++ b/docs/src/cryptography/threat-model.md @@ -182,29 +182,22 @@ could provide an additional detection mechanism. 
### No Client Authentication on the Delivery Service -The Delivery Service does not currently authenticate clients. Anyone who knows -a recipient's Ed25519 public key can enqueue messages for that recipient. This -enables spam and potential denial-of-service by flooding a recipient's queue. +The Delivery Service requires a valid OPAQUE session token for all DS +operations. The session token is bound to the client's identity key, and the +server rejects enqueue and fetch operations that lack a valid token. -**Impact:** Queue flooding, spam delivery. MLS provides its own authentication -(the recipient will reject messages not signed by a group member), so forged -content will not be accepted, but the recipient must still download and attempt -to process the spam. +**Status:** Mitigated. Token-based authentication is enforced via the OPAQUE +login flow (methods 100-103). Unauthenticated enqueue attempts are rejected. -**Mitigation path:** The AUTHZ\_PLAN introduces token-based authentication, -binding identityKey to accounts and requiring valid access tokens for all -DS operations. +### Rate Limiting -### No Rate Limiting +The server enforces a sliding window rate limit on all RPC methods. Requests +exceeding the configured threshold per IP or per account are rejected with a +rate-limit error response. -The server does not currently enforce per-client or per-IP rate limits. A -malicious client could flood the server with requests, consuming resources and -degrading service for other users. - -**Impact:** Denial of service. - -**Mitigation path:** The AUTHZ\_PLAN specifies per-IP and per-account/device -rate limits (e.g., 50 requests/second, 5 MB payload cap). +**Status:** Mitigated. Rate limiting is active (sliding window, configurable +threshold, default 50 requests/second per IP). The `rate_limit_hit_total` +Prometheus metric tracks rejections. See [Monitoring](../operations/monitoring.md). 
### BasicCredential Only @@ -234,18 +227,30 @@ hybrid KEM is active for MLS). **Mitigation path:** Adopt post-quantum TLS (ML-KEM in TLS 1.3 handshake) when `rustls` supports it. -## Future Mitigations +## Implemented Mitigations ### Sealed Sender -**Goal:** Hide the sender's identity from the server. +**Status:** Implemented. The `--sealed-sender` flag encrypts the sender's +identity inside the MLS ciphertext. When enabled, the server routes by recipient +queue index only and cannot determine who sent the message. This reduces server +metadata visibility from "who sent to whom" to "someone sent to this recipient." -**Approach:** Encrypt the sender's identity inside the MLS ciphertext. The -server cannot determine who sent a message -- it only knows the recipient -(delivery queue index). Signal implements a version of this as "Sealed Sender." +### OPAQUE Authentication -**Benefit:** Reduces the server's metadata visibility from "who sent to whom" -to "someone sent to this recipient." +**Status:** Implemented. The OPAQUE protocol (RFC 9497) is the only supported +login mechanism. The server stores OPAQUE registration records; it never receives +or stores the client's password. Session tokens issued on login are required for +all authenticated RPCs. + +### Username Enumeration Protection + +**Status:** Implemented. All auth responses (including failures) are subject to +a 5ms timing floor, preventing timing-based username enumeration. + +--- + +## Future Mitigations ### Private Information Retrieval (PIR) @@ -272,17 +277,6 @@ verify that their public key has not been replaced by an attacker. **Benefit:** Detects attacks where the server (or an attacker who compromised the server) substitutes a victim's public key with the attacker's key. -### OPAQUE Authentication - -**Goal:** Zero-knowledge password authentication. - -**Approach:** Use the OPAQUE protocol (RFC 9497) for client-server -authentication. 
OPAQUE allows the client to prove knowledge of a password -without revealing it to the server, even during registration. - -**Benefit:** The server never learns the client's password, preventing -credential theft in a server compromise. - ### Tor/I2P Integration **Goal:** Hide client IP addresses from the server and network adversaries. @@ -315,11 +309,12 @@ communication patterns from traffic analysis. |--------|-------------------|-----|-------------| | Passive eavesdropper | TLS 1.3 + MLS (2 layers) | Traffic analysis | Padding, Tor | | Active MITM | TLS 1.3 (QUIC) | Self-signed certs | Cert pinning, CA | -| Compromised server | MLS E2E encryption | Metadata visible | Sealed Sender, PIR | +| Compromised server | MLS E2E encryption + Sealed Sender | Metadata partially visible | PIR | | Compromised client | FS + PCS | Current epoch exposed | Periodic Updates | -| Spam/flooding | None | No auth on DS | AUTHZ\_PLAN | +| Spam/flooding | Rate limiting + OPAQUE session tokens | -- | -- | +| Username enumeration | 5ms timing floor on all auth responses | -- | -- | | Key substitution | None | BasicCredential only | Key Transparency | -| Quantum adversary (content) | Hybrid KEM (M5+) | Pre-M5 messages | Deploy hybrid ASAP | +| Quantum adversary (content) | Hybrid KEM (X25519 + ML-KEM-768) | Pre-v2 messages | -- | | Quantum adversary (transport) | None | Classical TLS (ECDHE) | PQ TLS | ## Related Pages diff --git a/docs/src/design-rationale/overview.md b/docs/src/design-rationale/overview.md index f8a8046..1c88ae1 100644 --- a/docs/src/design-rationale/overview.md +++ b/docs/src/design-rationale/overview.md @@ -10,9 +10,10 @@ These decisions are not immutable. Each ADR has a status field and can be supers | ADR | Title | Status | One-line summary | |---|---|---|---| -| [ADR-002](adr-002-capnproto.md) | Cap'n Proto over MessagePack | Accepted | Zero-copy, schema-enforced serialisation with built-in async RPC replaces hand-rolled MessagePack dispatch. 
| +| [ADR-002](adr-002-capnproto.md) | Cap'n Proto over MessagePack (v1) | Superseded | Zero-copy, schema-enforced serialisation with built-in async RPC replaced hand-rolled MessagePack dispatch. Superseded by ADR-007. | | [ADR-004](adr-004-mls-unaware-ds.md) | MLS-Unaware Delivery Service | Accepted | The DS routes opaque blobs by recipient key; it never inspects MLS content. | | [ADR-005](adr-005-single-use-keypackages.md) | Single-Use KeyPackages | Accepted | The AS atomically removes a KeyPackage on fetch to preserve MLS forward secrecy. | +| [ADR-007](adr-007-protobuf-migration.md) | v1 Cap'n Proto to v2 Protobuf Migration | Accepted | Replace Cap'n Proto RPC with custom Protobuf framing over QUIC streams for better ecosystem support, 44-method surface, and multi-threaded dispatch. | --- @@ -26,7 +27,7 @@ For a broader comparison of quicproquo's design against alternative messaging pr Each ADR page follows this structure: -1. **Status** -- One of: Proposed, Accepted, Deprecated, Superseded. All current ADRs are Accepted. +1. **Status** -- One of: Proposed, Accepted, Deprecated, Superseded. All current ADRs are Accepted unless noted. 2. **Context** -- The problem or force that motivated the decision. What constraints existed? What alternatives were considered? 3. **Decision** -- The specific choice that was made. What was selected and what was rejected? 4. **Consequences** -- The trade-offs that result from the decision. What are the benefits? What are the costs? What residual risks remain? @@ -34,13 +35,72 @@ Each ADR page follows this structure: --- -## Cross-cutting themes +## ADR-007: v1 Cap'n Proto to v2 Protobuf Migration -Several themes recur across multiple ADRs: +**Status**: Accepted + +**Context** + +quicproquo v1 used Cap'n Proto for both serialisation and RPC dispatch via +`capnp-rpc`. 
This worked well for the initial 8-method `NodeService` interface +but had several limitations as the protocol expanded: + +- **`!Send` constraint**: `capnp-rpc` uses `Rc<RefCell<>>` internally, requiring + all RPC futures to run on a `tokio::task::LocalSet`. This prevented multi-threaded + dispatch and added complexity to every connection handler. +- **Schema growth friction**: Cap'n Proto's capability-based RPC model does not + map cleanly to large flat method tables. Adding the 36 new methods (keys, + blob, device, federation, moderation, recovery, etc.) would have required + significant schema refactoring. +- **ALPN collision**: The `b"capnp"` ALPN identifier is not registered and could + conflict with other Cap'n Proto deployments. A project-specific ALPN is cleaner. +- **Tooling**: `capnpc` requires a system-wide binary installation or a vendored + copy. `prost-build` with `protobuf-src` self-vendors `protoc`, eliminating the + build-time dependency. + +**Decision** + +Replace `capnp-rpc` with a custom binary framing layer (`quicproquo-rpc`) and +Protocol Buffers (`prost`) for payload serialisation: + +- Three frame types: Request (10-byte header), Response (9-byte header), Push + (6-byte header), all carrying Protobuf-encoded payloads. +- Method IDs are numeric `u16` constants dispatched via a handler registry. + One QUIC bidirectional stream per RPC call; push events on QUIC uni-streams. +- ALPN changed from `b"capnp"` to `b"qpq"`. Default port changed from 7000 to 5001. +- Cap'n Proto legacy types are retained in `quicproquo-proto` for v1 compatibility + but are no longer used for RPC dispatch. + +**Consequences** + +Benefits: +- Full `tokio::spawn` concurrency (no `LocalSet` required). +- 44-method RPC surface with clean numeric namespace and room to grow. +- Self-contained build (no system `protoc` dependency). +- Lighter middleware integration via Tower `Service` traits. +- Push event delivery without polling.
+ +Costs: +- Lost Cap'n Proto zero-copy reads (Protobuf requires deserialisation). Acceptable + because the hot path in the Delivery Service works with opaque `bytes::Bytes` + without deserialisation. +- Lost promise pipelining from `capnp-rpc`. Not required for the current RPC + surface; can be re-added with a future streaming RPC design. +- v1 clients are no longer wire-compatible with v2 servers. + +**Code references** + +- Frame format: `crates/quicproquo-rpc/src/framing.rs` +- Method IDs: `crates/quicproquo-proto/src/lib.rs` (`method_ids` module) +- Proto schemas: `proto/qpq/v1/*.proto` + +--- + +## Cross-cutting themes ### Layered security -The core principle is that **no single layer is trusted alone**. QUIC/TLS transport encryption protects metadata and provides authentication; MLS provides end-to-end content encryption with forward secrecy and post-compromise security. +The core principle is that **no single layer is trusted alone**. QUIC/TLS transport encryption protects metadata and provides server authentication; MLS provides end-to-end content encryption with forward secrecy and post-compromise security; OPAQUE ensures the server never learns the user's password. ### Server minimalism @@ -48,13 +108,16 @@ ADR-004 and ADR-005 reflect a design philosophy where the server does as little ### Schema-first design -ADR-002 establishes Cap'n Proto as the single source of truth for the wire format. Every message and RPC call is defined in `.capnp` schema files, which are checked into the repository and used for code generation. This eliminates the class of bugs that arises from hand-rolled serialisation and ensures that the wire format is documented, versioned, and evolvable. +The v2 protocol defines all messages and method IDs in checked-in source files +(`proto/qpq/v1/*.proto` and `crates/quicproquo-proto/src/lib.rs`). 
Every wire +type is documented, versioned, and evolvable through the standard Protobuf +schema evolution rules (adding optional fields, reserving removed field numbers). --- ## Further reading - [Why This Design, Not Signal/Matrix/...](why-not-signal.md) -- comparative analysis against alternative protocols -- [Wire Format Overview](../wire-format/overview.md) -- the serialisation pipeline that implements these decisions +- [Wire Format Reference](../wire-format/overview.md) -- the serialisation pipeline that implements these decisions - [Architecture Overview](../architecture/overview.md) -- system-level view - [Protocol Layers Overview](../protocol-layers/overview.md) -- how the protocol layers stack diff --git a/docs/src/internals/authentication-service.md b/docs/src/internals/authentication-service.md index 0627d6d..26f19cd 100644 --- a/docs/src/internals/authentication-service.md +++ b/docs/src/internals/authentication-service.md @@ -1,279 +1,202 @@ # Authentication Service Internals -The Authentication Service (AS) stores and distributes single-use MLS -KeyPackages. It is one of the two logical services exposed through the unified -`NodeService` RPC interface. The AS also stores hybrid (X25519 + ML-KEM-768) -public keys for post-quantum envelope encryption. +The Authentication Service handles user registration and login via the OPAQUE asymmetric password-authenticated key exchange (PAKE) protocol. It also manages MLS KeyPackages, hybrid post-quantum keys, and session token issuance. -This page covers the server-side implementation of KeyPackage storage, the -`Auth` struct validation logic, and the hybrid key endpoints. +This page covers the server-side OPAQUE flow, session token lifecycle, KeyPackage storage, and hybrid key endpoints. 
**Sources:** -- `crates/quicproquo-server/src/main.rs` (RPC handlers, auth validation) -- `crates/quicproquo-server/src/storage.rs` (FileBackedStore) -- `schemas/node.capnp` (wire schema) +- `crates/quicproquo-server/src/domain/` (OPAQUE handlers, session management) +- `crates/quicproquo-server/src/sql_store.rs` (SqlStore persistence) +- `proto/qpq/v1/auth.proto` (wire schema) + +--- + +## OPAQUE Protocol + +quicproquo uses the OPAQUE asymmetric PAKE (RFC 9497) for user authentication. The password never leaves the client and is never known to the server. The server stores an OPAQUE registration record derived from the password, but this record cannot be used to recover the password even if the server is fully compromised. + +### Registration (IDs 100-101) + +Registration takes two round trips. + +```text +Client Server + | | + | [1] OpaqueRegisterStartRequest | + | username: "alice" | + | request: | + | ---------------------------------------->| + | | + | [2] OpaqueRegisterStartResponse | + | response: | + | <----------------------------------------| + | | + | [3] OpaqueRegisterFinishRequest | + | username: "alice" | + | upload: | + | identity_key: | + | ---------------------------------------->| + | | + | [4] OpaqueRegisterFinishResponse | + | success: true | + | <----------------------------------------| +``` + +**Step [1]:** The client generates a `RegistrationRequest` blob using the `opaque-ke` crate. This contains a masked version of the password; the server cannot extract the raw password. + +**Step [2]:** The server generates a `RegistrationResponse` using its OPAQUE server keypair and the client's request. The server does not yet persist anything. + +**Step [3]:** The client completes the OPAQUE registration and sends a `RegistrationUpload` blob. This blob contains the password-derived key material (specifically the client's OPAQUE export key envelope and public key). The client also sends its Ed25519 identity public key. 
+ +**Step [4]:** The server stores the `RegistrationUpload` blob as the user's OPAQUE record, indexed by `username`. The Ed25519 identity key is stored alongside the record. Registration fails with `success: false` if the username is already taken. + +### Login (IDs 102-103) + +Login also takes two round trips and produces a session token. + +```text +Client Server + | | + | [1] OpaqueLoginStartRequest | + | username: "alice" | + | request: | + | ---------------------------------------->| + | | + | [2] OpaqueLoginStartResponse | + | response: | + | <----------------------------------------| + | | + | [3] OpaqueLoginFinishRequest | + | username: "alice" | + | finalization: | + | identity_key: | + | ---------------------------------------->| + | | + | [4] OpaqueLoginFinishResponse | + | session_token: <32 bytes> | + | <----------------------------------------| +``` + +**Step [1]:** The client generates a `CredentialRequest` using the `opaque-ke` crate. + +**Step [2]:** The server looks up the user's OPAQUE record by `username` and generates a `CredentialResponse`. If the username is unknown, the server generates a fake response using a blinded dummy record to prevent username enumeration. + +**Step [3]:** The client verifies the server's `CredentialResponse` against the stored password, derives the shared export key, and sends a `CredentialFinalization` blob that proves knowledge of the password. The client also sends its Ed25519 identity key. + +**Step [4]:** The server verifies the `CredentialFinalization`. If verification succeeds and the identity key matches the registered key, the server generates a `session_token` (32 random bytes), stores it in the session table, and returns it to the client. If verification fails, the server returns an error status with an empty `session_token`. + +### Session token lifecycle + +The `session_token` is a 32-byte random bearer credential issued at login. 
It is: + +- Stored in the SQLCipher `sessions` table (see [Storage Backend](storage-backend.md)). +- Included by the client in subsequent QUIC connections for authentication. +- Validated by the server on connection establishment; the server rejects connections with unknown or expired tokens. +- Invalidated on `DeleteAccount` or explicit logout. + +The `Auth` message in `common.proto` carries the token for federation contexts: + +```protobuf +message Auth { + bytes access_token = 1; + bytes device_id = 2; +} +``` --- ## KeyPackage Storage -### Data Model +MLS KeyPackages are single-use by RFC 9420 requirement. The server stores a FIFO queue of KeyPackages per identity key. -KeyPackages are stored in a `FileBackedStore` using a `Mutex`-protected -`HashMap`: +### Data model ```text -key_packages: Mutex, VecDeque>>> - ^ ^ - | | - identity_key FIFO queue of - (32-byte Ed25519 TLS-encoded - public key) KeyPackage bytes +identity_key (32-byte Ed25519 pubkey) + -> VecDeque ``` -Each identity can have multiple KeyPackages queued. This is essential because -KeyPackages are single-use (per RFC 9420): once fetched by a peer, they are -permanently removed. Clients should upload several KeyPackages to handle -concurrent group invitations. +Each identity can have multiple KeyPackages queued. Clients should upload several packages after registration so that concurrent group invitations can each consume one without exhausting the supply. -The map is persisted to `data/keypackages.bin` using bincode serialization, -wrapped in the `QueueMapV1` struct. See [Storage Backend](storage-backend.md) -for persistence details. - -### uploadKeyPackage - -```capnp -uploadKeyPackage @0 (identityKey :Data, package :Data, auth :Auth) - -> (fingerprint :Data); -``` +### UploadKeyPackage (ID 300) **Handler logic:** -1. **Parse parameters.** Extract `identityKey`, `package`, and `auth`. +1. Validate `identity_key` (exactly 32 bytes) and `package` (non-empty, <= 1 MiB). +2. 
Compute `SHA-256(package)` as the fingerprint. +3. Push the package to the back of the identity's queue in the SQL store. +4. Return the fingerprint. -2. **Validate auth.** Call `validate_auth()` (see [Auth Validation](#auth-validation) - below). +The fingerprint allows the uploading client to detect server-side tampering. A peer that fetches a KeyPackage can compare its SHA-256 hash against the fingerprint communicated out-of-band. -3. **Validate inputs:** - - | Check | Constraint | Error Message | - |-------|------------|---------------| - | Identity key length | Exactly 32 bytes | `"identityKey must be exactly 32 bytes, got {n}"` | - | Package non-empty | `package.len() > 0` | `"package must not be empty"` | - | Package size cap | `package.len() <= 1,048,576` | `"package exceeds max size (1048576 bytes)"` | - -4. **Compute fingerprint.** `SHA-256(package_bytes)` produces a 32-byte digest. - -5. **Store.** `FileBackedStore::upload_key_package(identity_key, package)` pushes - the package to the back of the identity's `VecDeque` and flushes to disk. - -6. **Return fingerprint.** The SHA-256 hash is set in the response. - -The fingerprint allows the uploading client to verify that the server stored the -exact bytes it sent. See [KeyPackage Exchange Flow](keypackage-exchange.md) for -the client-side verification logic. - -### fetchKeyPackage - -```capnp -fetchKeyPackage @1 (identityKey :Data, auth :Auth) -> (package :Data); -``` +### FetchKeyPackage (ID 301) **Handler logic:** -1. **Parse and validate** `identityKey` (32 bytes) and `auth`. +1. Validate `identity_key` (exactly 32 bytes). +2. Pop from the front of the identity's queue (atomic operation). +3. Return the package bytes, or empty bytes if the queue is empty. -2. **Pop from queue.** `FileBackedStore::fetch_key_package(identity_key)` calls - `VecDeque::pop_front()` on the identity's queue, removing and returning the - oldest KeyPackage. The updated map is flushed to disk. - -3. 
**Return.** If a KeyPackage was available, set it in the response. If the - queue was empty (or the identity has no entry), return empty `Data`. - -**Single-use semantics:** The `pop_front()` operation ensures each KeyPackage is -returned exactly once. This is critical for MLS security -- reusing a KeyPackage -would allow conflicting group states. The removal is atomic with respect to the -`Mutex` lock, so concurrent fetch requests will not receive the same package. - -**Empty response handling:** The client checks `package.is_empty()` to -distinguish between "no packages available" and "package fetched." An empty -response is not an error -- it means the target identity has exhausted their -KeyPackage supply and needs to upload more. - ---- - -## Auth Validation - -All `NodeService` RPC methods accept an `Auth` struct: - -```capnp -struct Auth { - version @0 :UInt16; # 0 = legacy/none, 1 = token-based - accessToken @1 :Data; # opaque bearer token - deviceId @2 :Data; # optional UUID for auditing -} -``` - -The server validates this struct through the `validate_auth` function: - -```text -validate_auth(cfg, auth) - | - +-- version == 0? - | +-- cfg.allow_legacy_v0 == true? -> OK - | +-- cfg.allow_legacy_v0 == false? -> ERROR "auth version 0 disabled" - | - +-- version == 1? - | +-- accessToken empty? -> ERROR "requires non-empty accessToken" - | +-- cfg.required_token is Some? - | | +-- token matches? -> OK - | | +-- token mismatch? -> ERROR "invalid accessToken" - | +-- cfg.required_token is None? -> OK (any non-empty token accepted) - | - +-- version >= 2? 
-> ERROR "unsupported auth version" -``` - -### AuthConfig - -The server's auth behavior is controlled by `AuthConfig`: - -```rust -struct AuthConfig { - required_token: Option>, // None = accept any token - allow_legacy_v0: bool, // true = accept version 0 (no auth) -} -``` - -Configured via CLI flags / environment variables: - -| Flag / Env Var | Default | Purpose | -|-----------------------------------|---------|---------| -| `--auth-token` / `QPQ_AUTH_TOKEN` | None | Required bearer token. If unset, any non-empty token is accepted for version 1. | -| `--allow-auth-v0` / `QPQ_ALLOW_AUTH_V0` | `true` | Whether to accept `auth.version=0` (legacy, unauthenticated) requests. | - -### Version Semantics - -| Version | Meaning | Token Required? | -|---------|---------|-----------------| -| 0 | Legacy / unauthenticated | No. Token is ignored. Server must have `allow_legacy_v0 = true`. | -| 1 | Token-based authentication | Yes. Must be non-empty. Must match `required_token` if configured. | -| 2+ | Reserved for future use | Rejected. | - -### Current Limitations - -The current auth implementation is intentionally minimal: - -- **No identity binding.** The access token is not tied to a specific Ed25519 - identity. Any valid token can upload or fetch KeyPackages for any identity. -- **No rate limiting.** There is no per-identity or per-IP rate limiting. -- **No token rotation.** Tokens are static strings configured at server startup. -- **No device management.** The `deviceId` field is accepted but not used for - authorization decisions. - -The [Auth, Devices, and Tokens](../roadmap/authz-plan.md) roadmap item -addresses these gaps with a proper token issuance and validation system. +The pop is atomic with respect to the store lock, so concurrent fetch requests will not receive the same package. An empty response is not an error -- it means the target has exhausted its KeyPackage supply. 
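The upload and fetch rules above (32-byte identity key, non-empty package of at most 1 MiB, FIFO pop under a lock) can be sketched with an in-memory queue. This is a hedged illustration only: `KeyPackageQueues`, `upload`, and `fetch` are hypothetical names, the real server persists to its SQL store, and the SHA-256 fingerprint step is omitted here because it would need an external hashing crate.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Mutex;

/// 1 MiB cap on a single KeyPackage, as stated in the handler logic.
const MAX_PACKAGE_BYTES: usize = 1024 * 1024;

/// Hypothetical in-memory stand-in for the per-identity KeyPackage queues.
#[derive(Default)]
pub struct KeyPackageQueues {
    inner: Mutex<HashMap<Vec<u8>, VecDeque<Vec<u8>>>>,
}

impl KeyPackageQueues {
    /// Validates the inputs and appends the package to the identity's queue.
    pub fn upload(&self, identity_key: &[u8], package: Vec<u8>) -> Result<(), String> {
        if identity_key.len() != 32 {
            return Err(format!("identity key must be 32 bytes, got {}", identity_key.len()));
        }
        if package.is_empty() {
            return Err("package must not be empty".to_string());
        }
        if package.len() > MAX_PACKAGE_BYTES {
            return Err("package exceeds max size".to_string());
        }
        let mut map = self.inner.lock().map_err(|_| "lock poisoned".to_string())?;
        map.entry(identity_key.to_vec()).or_default().push_back(package);
        Ok(())
    }

    /// Pops the oldest package; empty bytes mean the supply is exhausted.
    pub fn fetch(&self, identity_key: &[u8]) -> Result<Vec<u8>, String> {
        if identity_key.len() != 32 {
            return Err(format!("identity key must be 32 bytes, got {}", identity_key.len()));
        }
        // The pop happens while the lock is held, so two concurrent fetches
        // can never observe the same package.
        let mut map = self.inner.lock().map_err(|_| "lock poisoned".to_string())?;
        Ok(map
            .get_mut(identity_key)
            .and_then(|queue| queue.pop_front())
            .unwrap_or_default())
    }
}
```

Returning empty bytes rather than an error for an exhausted queue mirrors the behaviour described above, where the caller checks for an empty package.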
--- ## Hybrid Key Endpoints -The AS also stores hybrid (X25519 + ML-KEM-768) public keys for post-quantum -envelope encryption. Unlike KeyPackages, hybrid keys are **not single-use** -- -they are stored persistently and can be fetched multiple times. +Hybrid (X25519 + ML-KEM-768) public keys are used for post-quantum sealed envelope encryption. Unlike KeyPackages, hybrid keys are not single-use. Each identity stores exactly one hybrid key; uploading a new key overwrites the previous one. -### uploadHybridKey - -```capnp -uploadHybridKey @6 (identityKey :Data, hybridPublicKey :Data) -> (); -``` +### UploadHybridKey (ID 302) **Handler logic:** -1. Validate `identityKey` (32 bytes) and `hybridPublicKey` (non-empty). -2. `FileBackedStore::upload_hybrid_key(identity_key, hybrid_pk)` stores the key, - overwriting any previous value for this identity. -3. Flushes to `data/hybridkeys.bin`. +1. Validate `identity_key` (32 bytes) and `hybrid_public_key` (non-empty). +2. Store the hybrid key, overwriting any previous value for this identity. +3. Return empty response. -The storage model is simpler than KeyPackages: a flat -`HashMap, Vec>` (identity key to hybrid public key bytes). There is -no queue -- each identity has at most one hybrid public key. +### FetchHybridKey (ID 303) -### fetchHybridKey +Non-destructive lookup. Returns the stored hybrid public key, or empty bytes if none is stored. The key persists across fetches. -```capnp -fetchHybridKey @7 (identityKey :Data) -> (hybridPublicKey :Data); -``` +### FetchHybridKeys (ID 304) -**Handler logic:** - -1. Validate `identityKey` (32 bytes). -2. Look up the hybrid public key in the store. Unlike `fetchKeyPackage`, this - does **not** remove the key -- it can be fetched repeatedly. -3. Return the key bytes, or empty `Data` if none is stored. - -See [Hybrid KEM](../protocol-layers/hybrid-kem.md) for how the client uses -these keys to wrap MLS payloads in post-quantum envelopes. +Batch variant. 
Returns one key per input identity key in the same order. Missing keys are returned as empty bytes at the corresponding index position. --- -## NodeServiceImpl Structure +## Key Transparency Integration -The server-side implementation struct: +The key transparency log (a Merkle append-only log) records key revocations and allows clients to audit the integrity of the key directory. + +### RevokeKey (ID 510) + +Appends a revocation entry to the KT Merkle log. Returns the leaf index of the revocation entry. Reasons: `"compromised"`, `"superseded"`, `"user_revoked"`. + +### CheckRevocation (ID 511) + +Returns the revocation status of an identity key: whether revoked, the reason, and the timestamp in milliseconds. + +### AuditKeyTransparency (ID 520) + +Returns a range of entries from the append-only log for client-side Merkle verification. Clients can verify the returned `root` hash against the Merkle tree built from the entries. + +--- + +## Server implementation structure ```rust -struct NodeServiceImpl { - store: Arc, // shared across connections - waiters: Arc, Arc>>, // long-poll notification - auth_cfg: Arc, // auth policy +// Domain handler (quicproquo-server/src/domain/) +struct AuthHandler { + store: Arc, // SQLCipher persistence + opaque_server: OpaqueServer, // opaque-ke server state } ``` -All connections share the same `store` and `waiters` via `Arc`. The -`DashMap, Arc>` is keyed by recipient key and provides the -push-notification mechanism for `fetchWait`. See -[Delivery Service Internals](delivery-service.md) for the long-polling -implementation. +All connections share the same `SqlStore` via `Arc`. The OPAQUE server state contains the server's long-term OPAQUE keypair, which is generated on first start and persisted to the database. 
--- -## Connection Model +## Related pages -```text -QUIC endpoint (port 7000) - +-- TLS 1.3 handshake (self-signed cert by default) - +-- Accept bidirectional stream - +-- capnp-rpc VatNetwork (Side::Server) - +-- NodeServiceImpl { store, waiters, auth_cfg } -``` - -Each QUIC connection opens one bidirectional stream for Cap'n Proto RPC. The -`capnp-rpc` crate uses `Rc>` internally, making it `!Send`. All RPC -tasks run on a `tokio::task::LocalSet` to satisfy this constraint. - -The server generates a self-signed TLS certificate on first start if no -certificate files exist. Certificate and key paths are configurable via -`--tls-cert` and `--tls-key`. - ---- - -## Health Endpoint - -```capnp -health @5 () -> (status :Text); -``` - -A simple readiness probe. Returns `"ok"` unconditionally. No auth validation is -performed. Useful for infrastructure health checks and measuring QUIC round-trip -time. - ---- - -## Related Pages - -- [KeyPackage Exchange Flow](keypackage-exchange.md) -- end-to-end upload and fetch flow including client-side logic -- [Delivery Service Internals](delivery-service.md) -- the DS half of NodeService -- [Storage Backend](storage-backend.md) -- FileBackedStore persistence model -- [GroupMember Lifecycle](group-member-lifecycle.md) -- how KeyPackages are generated and consumed -- [Auth, Devices, and Tokens](../roadmap/authz-plan.md) -- planned auth improvements -- [NodeService Schema](../wire-format/node-service-schema.md) -- Cap'n Proto schema reference -- [Hybrid KEM](../protocol-layers/hybrid-kem.md) -- post-quantum envelope encryption +- [Storage Backend](storage-backend.md) -- SqlStore and FileBackedStore persistence +- [Auth Schema](../wire-format/auth-schema.md) -- Protobuf wire definitions +- [Method ID Reference](../wire-format/envelope-schema.md) -- all 44 method IDs diff --git a/docs/src/internals/storage-backend.md b/docs/src/internals/storage-backend.md index f549ed7..aab8bcf 100644 --- a/docs/src/internals/storage-backend.md +++ 
b/docs/src/internals/storage-backend.md @@ -1,21 +1,152 @@ # Storage Backend -quicproquo uses two storage backends: `FileBackedStore` on the server side -for KeyPackages and delivery queues, and `DiskKeyStore` on the client side for -MLS cryptographic key material. Both follow the same pattern: in-memory data -structures backed by optional file persistence, with full serialization on every -write. +quicproquo uses two storage backends: `SqlStore` on the server side (SQLCipher-encrypted SQLite with Argon2id key derivation) and `DiskKeyStore` on the client side (bincode-serialised file for MLS cryptographic key material). **Sources:** -- `crates/quicproquo-server/src/storage.rs` (FileBackedStore) +- `crates/quicproquo-server/src/sql_store.rs` (SqlStore) +- `crates/quicproquo-server/src/storage.rs` (Store trait, FileBackedStore legacy) - `crates/quicproquo-core/src/keystore.rs` (DiskKeyStore, StoreCrypto) --- -## FileBackedStore (Server-Side) +## SqlStore (Server-Side) -`FileBackedStore` provides persistent storage for the server's three data -domains: KeyPackages, delivery queues, and hybrid public keys. +`SqlStore` is the primary server-side storage backend. It wraps SQLCipher (SQLite with AES-256 encryption) via the `rusqlite` crate and provides a connection pool for concurrent access. + +### Encryption + +The database file is encrypted with SQLCipher using a key derived from a server-supplied passphrase. The key is passed as the SQLCipher `PRAGMA key` on connection open. Key derivation uses Argon2id: the server generates a random salt on first start and derives the 32-byte SQLCipher key material from the passphrase using Argon2id with server-configured parameters. + +The database file is opaque without the key; an attacker with filesystem access cannot read any stored data without also compromising the server's key material. 
+ +### Connection pool + +```rust +pub struct SqlStore { + pool: Vec>, // default pool_size = 4 +} +``` + +`SqlStore` maintains a fixed pool of SQLCipher connections (default: 4). Each request acquires a connection via `try_lock()` on each pool slot (non-blocking fast path), falling back to blocking on the first connection if all are busy. WAL journal mode allows concurrent readers; writers are serialised by SQLite's locking protocol. + +PRAGMA settings applied to every connection: + +| PRAGMA | Value | Effect | +|--------|-------|--------| +| `journal_mode` | `WAL` | Write-ahead logging for concurrent reads | +| `synchronous` | `NORMAL` | fsync on WAL checkpoints only (performance vs. durability trade-off) | +| `foreign_keys` | `ON` | Enforce referential integrity | + +### Schema and migrations + +The schema version is tracked via `PRAGMA user_version`. On first open, `SqlStore` applies all pending migrations in order. Migrations are embedded as SQL strings at compile time. + +Current schema version: **13** + +| Migration | Version | Content | +|-----------|---------|---------| +| `001_initial.sql` | 1 | Users, key_packages, deliveries, hybrid_keys tables | +| `002_add_seq.sql` | 3 | Delivery sequence numbers | +| `003_channels.sql` | 4 | Channel-aware delivery queues | +| `004_federation.sql` | 5 | Federation peer table | +| `005_signing_key.sql` | 6 | Server signing key storage | +| `006_kt_log.sql` | 7 | Key transparency Merkle log | +| `007_add_expiry.sql` | 8 | TTL/expiry columns on deliveries | +| `008_devices.sql` | 9 | Device registration table | +| `009_sessions.sql` | 10 | Session token table | +| `010_blobs.sql` | 11 | Blob storage table | +| `011_recovery_bundles.sql` | 12 | Recovery bundle table | +| `012_moderation.sql` | 13 | Reports and bans tables | + +If the database's `user_version` is greater than `SCHEMA_VERSION`, the server refuses to open it (downgrade protection). 
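The version check and migration selection described above might look roughly like this. It is a sketch under assumptions, not the code in `sql_store.rs`: `pending_migrations` is a hypothetical helper, and the `user_version` value would in practice be read from `PRAGMA user_version` rather than passed in.

```rust
/// Highest schema version this build understands (13 in the table above).
const SCHEMA_VERSION: u32 = 13;

/// Hypothetical helper: returns the migration target versions still to apply,
/// in ascending order, or an error if the database is newer than this build.
fn pending_migrations(user_version: u32, known: &[u32]) -> Result<Vec<u32>, String> {
    // Downgrade protection: refuse a database written by a newer server.
    if user_version > SCHEMA_VERSION {
        return Err(format!(
            "database schema v{} is newer than supported v{}",
            user_version, SCHEMA_VERSION
        ));
    }
    let mut pending: Vec<u32> = known
        .iter()
        .copied()
        .filter(|v| *v > user_version && *v <= SCHEMA_VERSION)
        .collect();
    pending.sort_unstable();
    Ok(pending)
}
```

For example, a database at version 10 with the migration versions from the table would be planned to receive versions 11, 12, and 13, while a database at version 14 would be rejected.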
+ +### Store trait + +`SqlStore` implements the `Store` trait defined in `storage.rs`: + +```rust +pub trait Store: Send + Sync { + fn upload_key_package(&self, identity_key: &[u8], package: Vec) -> Result<(), StorageError>; + fn fetch_key_package(&self, identity_key: &[u8]) -> Result>, StorageError>; + fn upload_hybrid_key(&self, identity_key: &[u8], hybrid_pk: Vec) -> Result<(), StorageError>; + fn fetch_hybrid_key(&self, identity_key: &[u8]) -> Result>, StorageError>; + fn enqueue(&self, recipient_key: &[u8], channel_id: &[u8], payload: Vec, ...) -> Result; + fn fetch(&self, recipient_key: &[u8], channel_id: &[u8], limit: u32, ...) -> Result)>, StorageError>; + fn ack(&self, recipient_key: &[u8], channel_id: &[u8], seq_up_to: u64, ...) -> Result<(), StorageError>; + fn store_session(&self, record: SessionRecord) -> Result<(), StorageError>; + fn fetch_session(&self, token: &[u8]) -> Result, StorageError>; + // ... and more +} +``` + +### Key package storage + +Key packages are stored in the `key_packages` table: + +```sql +CREATE TABLE key_packages ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + identity_key BLOB NOT NULL, + package_data BLOB NOT NULL, + created_at INTEGER NOT NULL DEFAULT (unixepoch()) +); +``` + +`upload_key_package` inserts a row. `fetch_key_package` selects and deletes the oldest row for the given identity key in a single transaction (atomic FIFO pop). This guarantees MLS's single-use requirement. + +### Delivery queue storage + +Delivery messages are stored in the `deliveries` table with per-message sequence numbers: + +```sql +CREATE TABLE deliveries ( + seq INTEGER PRIMARY KEY AUTOINCREMENT, + recipient BLOB NOT NULL, + channel_id BLOB NOT NULL DEFAULT '', + device_id BLOB NOT NULL DEFAULT '', + payload BLOB NOT NULL, + expires_at INTEGER, -- NULL = no expiry + message_id BLOB -- idempotency key +); +``` + +`enqueue` inserts a row and returns the `seq`. 
`fetch` selects rows with `seq > last_ack` ordered by `seq` and returns them without deleting. `ack(seq_up_to)` deletes all rows with `seq <= seq_up_to` for the given recipient, channel, and device. + +### Session storage + +Sessions issued after OPAQUE login are stored in the `sessions` table: + +```sql +CREATE TABLE sessions ( + token BLOB NOT NULL PRIMARY KEY, + identity BLOB NOT NULL, + device_id BLOB, + created_at INTEGER NOT NULL DEFAULT (unixepoch()), + expires_at INTEGER +); +``` + +The `token` is the 32-byte random session token returned by `OpaqueLoginFinish`. The server validates incoming tokens by looking up this table. + +### Error type + +```rust +#[derive(thiserror::Error, Debug)] +pub enum StorageError { + #[error("database error: {0}")] + Db(String), + #[error("serialization error")] + Serde, + #[error("not found")] + NotFound, +} +``` + +--- + +## FileBackedStore (Server-Side, Legacy) + +`FileBackedStore` was the original server-side storage backend. It uses bincode-serialised files with in-memory `Mutex`-protected `HashMap` structures. It remains available for development and testing but `SqlStore` is the production backend. ### Structure @@ -24,367 +155,115 @@ pub struct FileBackedStore { kp_path: PathBuf, // keypackages.bin ds_path: PathBuf, // deliveries.bin hk_path: PathBuf, // hybridkeys.bin - key_packages: Mutex, VecDeque>>>, // identity -> KP queue - deliveries: Mutex>>>, // (channel, recipient) -> msg queue - hybrid_keys: Mutex, Vec>>, // identity -> hybrid PK + key_packages: Mutex, VecDeque>>>, + deliveries: Mutex>>>, + hybrid_keys: Mutex, Vec>>, } ``` -Each domain has its own `Mutex`-protected in-memory map and its own disk file. -The `Mutex` (not `RwLock`) is used because every read-path operation that -modifies state (e.g., `pop_front` in `fetch_key_package`) requires exclusive -access. 
+File paths under the data directory: -### Initialization +| File | Contents | +|------|----------| +| `keypackages.bin` | KeyPackage queues (bincode `QueueMapV1`) | +| `deliveries.bin` | Delivery queues (bincode `QueueMapV2`) | +| `hybridkeys.bin` | Hybrid public keys (bincode `HashMap`) | -```rust -FileBackedStore::open(dir: impl AsRef) -> Result -``` - -1. Creates the directory if it does not exist. -2. Loads each map from its respective file, or initializes an empty map if the - file is missing. -3. Returns the initialized store. - -File paths: -- `{dir}/keypackages.bin` -- KeyPackage queues -- `{dir}/deliveries.bin` -- Delivery queues -- `{dir}/hybridkeys.bin` -- Hybrid public keys - -The default data directory is `data/`, configurable via `--data-dir` / -`QPQ_DATA_DIR`. - -### Flush-on-Every-Write - -Every mutation serializes the entire in-memory map to disk: - -```text -upload_key_package(identity_key, package) - | - +-- lock key_packages Mutex - | - +-- map.entry(identity_key).or_default().push_back(package) - | - +-- flush_kp_map(path, &map) - | +-- QueueMapV1 { map: map.clone() } - | +-- bincode::serialize(&payload) - | +-- fs::write(path, bytes) - | - +-- unlock Mutex -``` - -This approach is deliberately simple and correct: -- **Crash safety:** Every successful RPC response guarantees the data has been - written to the filesystem. -- **No partial writes:** The entire map is serialized atomically (though not to - a temp file with rename -- this is an MVP trade-off). -- **Performance:** Not suitable for production scale. Every write serializes and - writes the full map, which is O(n) in the total number of stored entries. - -**Production improvement path:** Replace with a proper database (SQLite, sled, -or similar) for incremental writes, WAL-based crash safety, and concurrent -access without full serialization. 
- -### KeyPackage Operations - -| Method | Behavior | -|--------|----------| -| `upload_key_package(identity_key, package)` | Push to back of VecDeque; flush | -| `fetch_key_package(identity_key)` | Pop from front (FIFO, single-use); flush | - -The KeyPackage map uses the `QueueMapV1` serialization wrapper: - -```rust -#[derive(Serialize, Deserialize, Default)] -struct QueueMapV1 { - map: HashMap, VecDeque>>, -} -``` - -### Delivery Queue Operations - -| Method | Behavior | -|--------|----------| -| `enqueue(recipient_key, channel_id, payload)` | Construct ChannelKey; push to back; flush | -| `fetch(recipient_key, channel_id)` | Construct ChannelKey; drain entire VecDeque; flush | - -The delivery map uses `QueueMapV2` with the compound `ChannelKey`: - -```rust -#[derive(Serialize, Deserialize, Clone, Eq, PartialEq, Debug)] -pub struct ChannelKey { - pub channel_id: Vec, - pub recipient_key: Vec, -} - -#[derive(Serialize, Deserialize, Default)] -struct QueueMapV2 { - map: HashMap>>, -} -``` - -See [Delivery Service Internals](delivery-service.md) for the full queue model -and channel-aware routing semantics. - -### V1/V2 Delivery Map Migration - -The delivery map format evolved from V1 (keyed by recipient key only) to V2 -(keyed by `ChannelKey` with channel ID + recipient key). The load function -handles both formats transparently: - -```rust -fn load_delivery_map(path: &Path) -> Result>>> { - let bytes = fs::read(path)?; - - // Try V2 format first (channel-aware). - if let Ok(map) = bincode::deserialize::(&bytes) { - return Ok(map.map); - } - - // Fallback to legacy V1 format: migrate by setting channel_id = empty. - let legacy: QueueMapV1 = bincode::deserialize(&bytes)?; - let mut upgraded = HashMap::new(); - for (recipient_key, queue) in legacy.map.into_iter() { - upgraded.insert( - ChannelKey { channel_id: Vec::new(), recipient_key }, - queue, - ); - } - Ok(upgraded) -} -``` - -Migration strategy: -1. Attempt to deserialize as V2 (`QueueMapV2`). 
If successful, use as-is. -2. If V2 fails, deserialize as V1 (`QueueMapV1`). Migrate each entry by - wrapping the recipient key in a `ChannelKey` with an empty `channel_id`. -3. The next flush will write V2 format, completing the migration. - -This in-place migration is transparent to clients. Legacy messages (pre-channel -routing) appear under the empty channel ID and can still be fetched by clients -that pass an empty `channelId`. - -### Hybrid Key Operations - -| Method | Behavior | -|--------|----------| -| `upload_hybrid_key(identity_key, hybrid_pk)` | Insert (overwrite); flush | -| `fetch_hybrid_key(identity_key)` | Read-only lookup; no flush needed | - -The hybrid key map is a flat `HashMap, Vec>` serialized directly -with bincode. Unlike KeyPackages, hybrid keys are not single-use -- they persist -until overwritten. - -### Error Type - -```rust -#[derive(thiserror::Error, Debug)] -pub enum StorageError { - #[error("io error: {0}")] - Io(String), - #[error("serialization error")] - Serde, -} -``` - -I/O errors (disk full, permission denied) and serialization errors (corrupt -file) are the two failure modes. The server converts `StorageError` to -`capnp::Error` via the `storage_err` helper for RPC responses. +Every write serialises the entire map to disk (O(n) per write). No encryption: data is stored in plaintext. Not recommended for production deployments; use `SqlStore` instead. --- ## DiskKeyStore (Client-Side) -`DiskKeyStore` is the client-side key store that implements the openmls -`OpenMlsKeyStore` trait. It holds MLS cryptographic key material -- most -importantly, the HPKE init private keys created during KeyPackage generation. +`DiskKeyStore` is the client-side key store that implements the openmls `OpenMlsKeyStore` trait. It holds MLS cryptographic key material, most importantly the HPKE init private keys created during KeyPackage generation. 
### Structure ```rust pub struct DiskKeyStore { - path: Option, // None = ephemeral (in-memory only) + path: Option, // None = ephemeral (in-memory only) values: RwLock, Vec>>, // key reference -> serialized MLS entity } ``` -The `RwLock` (not `Mutex`) allows concurrent reads. Write operations (store, -delete) take an exclusive lock and flush to disk. - ### Modes | Mode | Constructor | Persistence | |------|-------------|-------------| | Ephemeral | `DiskKeyStore::ephemeral()` | None. Data exists only in memory. Lost on process exit. | -| Persistent | `DiskKeyStore::persistent(path)` | Yes. Every write flushes the full map to disk. Survives process restarts. | +| Persistent | `DiskKeyStore::persistent(path)` | Yes. Every write flushes the full map to disk. | -**Ephemeral mode** is used for tests and the `register` / `demo-group` CLI -commands where session resumption is not needed. +Persistent mode is used for production clients. The key store path is derived from the state file by changing the extension to `.ks`. -**Persistent mode** is used for production clients (`register-state`, `invite`, -`join`, `send`, `recv` commands). The key store file path is derived from the -state file path by changing the extension to `.ks`: +### Serialisation format -```rust -fn keystore_path(state_path: &Path) -> PathBuf { - let mut path = state_path.to_path_buf(); - path.set_extension("ks"); - path -} -``` +MLS entities MUST use bincode serialisation. The `DiskKeyStore` implements this with a two-layer scheme: -So `qpq-state.bin` produces a key store at `quicproquo-state.ks`. +1. **Inner layer:** Each MLS entity value (`V: MlsEntity`) is serialised using the openmls-required serialisation format. The `DiskKeyStore` in quicproquo uses bincode for MLS entity values, matching the `OpenMlsKeyStore` trait requirements. +2. **Outer layer:** The entire `HashMap, Vec>` is bincode-serialised as the file on disk. 
-### Persistence Format - -The key store is serialized as a bincode-encoded `HashMap, Vec>`. -Individual values are serialized using `serde_json` (as required by openmls's -`MlsEntity` trait bound): +**Important:** Do not use Protobuf or JSON for MLS entities. MLS requires bincode for the `DiskKeyStore` in this codebase. Using a different format will produce incompatible key material. ```rust fn store(&self, k: &[u8], v: &V) -> Result<(), Self::Error> { - let value = serde_json::to_vec(v)?; // MlsEntity -> JSON bytes - let mut values = self.values.write().unwrap(); + let value = bincode::serialize(v)?; // MlsEntity -> bincode bytes + let mut values = self.values.write()?; values.insert(k.to_vec(), value); - drop(values); // release lock before I/O - self.flush() // bincode serialize full map to disk + drop(values); + self.flush() // bincode-serialize full HashMap to disk } ``` -The two-layer serialization (JSON for values, bincode for the map) is a -consequence of openmls requiring `serde_json`-compatible serialization for MLS -entities, while the outer map uses bincode for compactness. +### OpenMlsKeyStore implementation -### OpenMlsKeyStore Implementation - -| Trait Method | DiskKeyStore Behavior | -|--------------|-----------------------| -| `store(k, v)` | JSON-serialize value, insert into HashMap, flush to disk | -| `read(k)` | Look up key, JSON-deserialize value, return `Option` | +| Trait method | DiskKeyStore behaviour | +|---|---| +| `store(k, v)` | bincode-serialize value, insert into HashMap, flush to disk | +| `read(k)` | Look up key, bincode-deserialize value, return `Option` | | `delete(k)` | Remove from HashMap, flush to disk | -The `read` method does not flush because it does not modify the map. A failed -deserialization (corrupt value) returns `None` rather than an error, which -matches the openmls `OpenMlsKeyStore` trait signature. 
+### StoreCrypto -### Flush Behavior - -```rust -fn flush(&self) -> Result<(), DiskKeyStoreError> { - let Some(path) = &self.path else { - return Ok(()); // ephemeral: no-op - }; - let values = self.values.read().unwrap(); - let bytes = bincode::serialize(&*values)?; - fs::create_dir_all(path.parent())?; // ensure parent dir exists - fs::write(path, bytes)?; - Ok(()) -} -``` - -Like `FileBackedStore`, the flush serializes the entire map on every write. -For client-side usage, the map is typically small (a handful of HPKE keys), so -this is not a performance concern. - -### Error Type - -```rust -#[derive(thiserror::Error, Debug, PartialEq, Eq)] -pub enum DiskKeyStoreError { - #[error("serialization error")] - Serialization, - #[error("io error: {0}")] - Io(String), -} -``` - ---- - -## StoreCrypto - -`StoreCrypto` is a composite type that bundles a `DiskKeyStore` with the -`RustCrypto` provider from `openmls_rust_crypto`. It implements the openmls -`OpenMlsCryptoProvider` trait, which is the single entry point that openmls -uses for all cryptographic operations: +`StoreCrypto` bundles `DiskKeyStore` with the `RustCrypto` provider: ```rust pub struct StoreCrypto { - crypto: RustCrypto, // AES-GCM, SHA-256, X25519, Ed25519, etc. - key_store: DiskKeyStore, // HPKE init keys, MLS epoch secrets, etc. -} - -impl OpenMlsCryptoProvider for StoreCrypto { - type CryptoProvider = RustCrypto; - type RandProvider = RustCrypto; - type KeyStoreProvider = DiskKeyStore; - - fn crypto() -> &RustCrypto { &self.crypto } - fn rand() -> &RustCrypto { &self.crypto } - fn key_store() -> &DiskKeyStore { &self.key_store } + crypto: RustCrypto, // AES-GCM, SHA-256, X25519, Ed25519 + key_store: DiskKeyStore, // HPKE init keys, MLS epoch secrets } ``` -`StoreCrypto` is the `backend` field of [`GroupMember`](group-member-lifecycle.md). 
-It is passed to every openmls operation -- `KeyPackage::builder().build()`, -`MlsGroup::new_with_group_id()`, `MlsGroup::new_from_welcome()`, -`create_message()`, `process_message()`, etc. - -The critical property is that the **same `StoreCrypto` instance** (and therefore -the same `DiskKeyStore`) must be used from `generate_key_package()` through -`join_group()`, because the HPKE init private key is stored in and read from -this key store. +It implements `OpenMlsCryptoProvider` and is the `backend` field of `GroupMember`. The same `StoreCrypto` instance must be used consistently from `generate_key_package()` through `join_group()`, because the HPKE init private key is written at package generation time and read at group join time. --- -## Storage Architecture Summary +## Storage architecture summary ```text Server Client ====== ====== -FileBackedStore DiskKeyStore -+-- key_packages (Mutex) +-- values (RwLock) -| Persisted: keypackages.bin | Persisted: {state}.ks -| Format: bincode(QueueMapV1) | Format: bincode(HashMap) -| | Values: serde_json(MlsEntity) -+-- deliveries (Mutex) | -| Persisted: deliveries.bin +-- Wrapped by StoreCrypto -| Format: bincode(QueueMapV2) | implements OpenMlsCryptoProvider -| Migration: V1 -> V2 on load | +SqlStore (production) DiskKeyStore ++-- SQLCipher-encrypted SQLite +-- values (RwLock) +| WAL mode, pool_size=4 | Persisted: {state}.ks +| Key: Argon2id(passphrase, salt) | Format: bincode(HashMap, Vec>) +| Schema: 13 migrations | Values: bincode(MlsEntity) +| Tables: users, key_packages, | +| deliveries, sessions, blobs, +-- Wrapped by StoreCrypto +| devices, kt_log, recovery_bundles, | implements OpenMlsCryptoProvider +| reports, banned_users, ... | | +-- Used by GroupMember.backend -+-- hybrid_keys (Mutex) - Persisted: hybridkeys.bin - Format: bincode(HashMap) +FileBackedStore (legacy / dev) ++-- keypackages.bin (bincode) ++-- deliveries.bin (bincode) ++-- hybridkeys.bin (bincode) + No encryption. Not for production. 
``` -### Shared Design Patterns - -Both backends share these characteristics: - -1. **Full-map serialization.** Every write serializes the entire map to disk. - Simple, correct, but O(n) per write. - -2. **Bincode format.** The outer map is always bincode-serialized. Compact and - fast, but not human-readable and not forward-compatible without wrapper - structs. - -3. **No WAL / journaling.** A crash during `fs::write` could leave a corrupt - file. For the MVP, this is acceptable -- the data can be regenerated (clients - re-upload KeyPackages; delivery messages are ephemeral). - -4. **No compaction.** Empty queues are not removed from the map. Over time, the - serialized size can grow with stale entries. A production implementation - should periodically compact empty entries. - -5. **Directory creation.** Both backends call `fs::create_dir_all` before - writing, ensuring parent directories exist. - --- -## Related Pages +## Related pages -- [GroupMember Lifecycle](group-member-lifecycle.md) -- how `StoreCrypto` and `DiskKeyStore` are used during MLS operations -- [KeyPackage Exchange Flow](keypackage-exchange.md) -- upload and fetch through `FileBackedStore` -- [Delivery Service Internals](delivery-service.md) -- delivery queue operations -- [Authentication Service Internals](authentication-service.md) -- KeyPackage and hybrid key storage -- [Key Lifecycle and Zeroization](../cryptography/key-lifecycle.md) -- how HPKE keys are created and destroyed +- [Authentication Service Internals](authentication-service.md) -- KeyPackage and session storage +- [Wire Format Overview](../wire-format/overview.md) -- frame format and transport +- [Method ID Reference](../wire-format/envelope-schema.md) -- RPC method IDs diff --git a/docs/src/introduction.md b/docs/src/introduction.md index e7a6ec3..2814ad8 100644 --- a/docs/src/introduction.md +++ b/docs/src/introduction.md @@ -1,26 +1,26 @@ # Introduction -**quicproquo** is a research-oriented, end-to-end encrypted group 
messaging system written in Rust. It layers the Messaging Layer Security protocol (MLS, [RFC 9420](https://datatracker.ietf.org/doc/rfc9420/)) on top of QUIC + TLS 1.3 transport (via [quinn](https://github.com/quinn-rs/quinn) and [rustls](https://github.com/rustls/rustls)), with all service RPCs and wire messages framed using [Cap'n Proto](https://capnproto.org/). The project exists to explore how modern transport encryption (QUIC), a formally specified group key agreement protocol (MLS), and a zero-copy serialisation format (Cap'n Proto) compose in practice -- and to provide a readable, auditable reference implementation for security researchers, protocol designers, and Rust developers who want to study or extend the design. +**quicproquo** is a research-oriented, end-to-end encrypted group messaging system written in Rust. It layers the Messaging Layer Security protocol (MLS, [RFC 9420](https://datatracker.ietf.org/doc/rfc9420/)) on top of QUIC + TLS 1.3 transport (via [quinn](https://github.com/quinn-rs/quinn) and [rustls](https://github.com/rustls/rustls)), with all service RPCs framed using a compact binary Protocol Buffers format over a custom framing layer. The project exists to explore how modern transport encryption (QUIC), a formally specified group key agreement protocol (MLS), and post-quantum hybrid key encapsulation compose in practice -- and to provide a readable, auditable reference implementation for security researchers, protocol designers, and Rust developers who want to study or extend the design. 
--- ## Protocol stack ``` -┌─────────────────────────────────────────────┐ -│ Application / MLS ciphertext │ <- group key ratchet (RFC 9420) -├─────────────────────────────────────────────┤ -│ Cap'n Proto RPC │ <- typed, schema-versioned framing -├─────────────────────────────────────────────┤ -│ QUIC + TLS 1.3 (quinn/rustls) │ <- mutual auth + transport secrecy -└─────────────────────────────────────────────┘ ++---------------------------------------------+ +| Application / MLS ciphertext | <- group key ratchet (RFC 9420) ++---------------------------------------------+ +| Protobuf framing (custom binary header) | <- typed, length-prefixed framing ++---------------------------------------------+ +| QUIC + TLS 1.3 (quinn/rustls) | <- mutual auth + transport secrecy ++---------------------------------------------+ ``` Each layer addresses a distinct concern: -1. **QUIC + TLS 1.3** provides authenticated, confidential transport with 0-RTT connection establishment and multiplexed streams. The server presents a TLS 1.3 certificate (self-signed by default); the client verifies it against a local trust anchor. ALPN negotiation uses the token `b"capnp"`. +1. **QUIC + TLS 1.3** provides authenticated, confidential transport with 0-RTT connection establishment and multiplexed streams. The server presents a TLS 1.3 certificate (self-signed by default); the client verifies it against a local trust anchor. ALPN negotiation uses the token `qpq`. -2. **Cap'n Proto RPC** defines the wire schema for all service operations (KeyPackage upload/fetch, message enqueue/fetch, health probes). Schemas live in `schemas/*.capnp` and are compiled to Rust at build time. Because Cap'n Proto uses a pointer-based layout, messages can be read without an unpacking step -- though quicproquo currently uses the unpacked wire format for simplicity. +2. **Protobuf framing** defines the wire format for all service operations across 44 RPC methods. 
Each request carries a `[method_id: u16][request_id: u32][payload_len: u32]` header followed by a Protobuf-encoded payload. Server-to-client push events use a separate frame type on QUIC uni-streams. Message definitions live in `proto/qpq/v1/*.proto` and are compiled to Rust with `prost` at build time. 3. **MLS (RFC 9420)** provides the group key agreement layer. Each participant holds an Ed25519 identity keypair and generates single-use HPKE KeyPackages. The MLS epoch ratchet delivers forward secrecy and post-compromise security: compromising a member's state at epoch *n* does not reveal plaintext from epochs *< n* (forward secrecy) or *> n+1* (post-compromise security, once the compromised member updates). @@ -39,7 +39,7 @@ Each layer addresses a distinct concern: | Password auth | OPAQUE (password never sent to server) | | Metadata protection | Sealed sender + message padding | | Local storage | SQLCipher + Argon2id + ChaCha20-Poly1305 | -| Framing | Cap'n Proto (unpacked wire format, schema-versioned) | +| Framing | Protobuf (prost) with custom binary header (method_id, request_id, length) | For a deeper discussion of the cryptographic guarantees, threat model, and known gaps, see: @@ -51,22 +51,20 @@ For a deeper discussion of the cryptographic guarantees, threat model, and known ## Who is this for? -**Security researchers** studying how MLS composes with QUIC transport and Cap'n Proto framing. The codebase spans 12 crates with clear cryptographic boundaries for auditability. +**Security researchers** studying how MLS composes with QUIC transport and post-quantum hybrid KEM. The codebase spans 9 workspace crates with clear cryptographic boundaries for auditability. **Protocol designers** evaluating MLS deployment patterns. quicproquo implements a concrete Authentication Service (AS) and Delivery Service (DS) pair, demonstrating single-use KeyPackage lifecycle, Welcome routing, and epoch advancement in a live system. 
-**Application developers** building on the platform via SDKs: +**Application developers** building on the platform via the Rust SDK: -- **Go SDK** — native QUIC + Cap'n Proto client with full API -- **TypeScript SDK** — WASM crypto + WebSocket transport for browsers -- **C FFI** — cross-language integration (Python, Swift, Kotlin) +- **`quicproquo-sdk`** -- `QpqClient` with async event streams and a `ConversationStore` +- **C FFI** -- cross-language integration via `quicproquo-plugin-api` **Rust developers** looking for a working example of: - `quinn` + `rustls` server/client setup with self-signed certificates -- `capnp-rpc` over QUIC bidirectional streams (including the `!Send` / `LocalSet` constraint) +- Custom binary framing over QUIC bidirectional streams - `openmls` group creation, member addition, and application message encryption -- `wasm-bindgen` for compiling Rust crypto to WebAssembly - `zeroize`-on-drop key material handling --- @@ -77,19 +75,13 @@ For a deeper discussion of the cryptographic guarantees, threat model, and known |---|---| | **[Comparison with Classical Protocols](design-rationale/protocol-comparison.md)** | **Why quicproquo? IRC+SSL, XMPP, Telegram vs. 
our design** | | [Prerequisites](getting-started/prerequisites.md) | Toolchain and system dependencies | -| [Building from Source](getting-started/building.md) | `cargo build`, Cap'n Proto codegen, troubleshooting | +| [Building from Source](getting-started/building.md) | `cargo build`, Protobuf codegen, troubleshooting | | [Running the Server](getting-started/running-the-server.md) | Server startup, configuration, TLS cert generation | | [Running the Client](getting-started/running-the-client.md) | All CLI subcommands with examples | -| [REPL Command Reference](getting-started/repl-reference.md) | Complete list of 40+ slash commands | -| [Rich Messaging](getting-started/rich-messaging.md) | Reactions, typing, read receipts, edit/delete | -| [File Transfer](getting-started/file-transfer.md) | Chunked upload/download with SHA-256 verification | -| [Go SDK](getting-started/go-sdk.md) | Native QUIC + Cap'n Proto Go client | -| [TypeScript SDK & Browser Demo](getting-started/typescript-sdk.md) | WASM crypto + WebSocket transport | -| [Mesh Networking](getting-started/mesh-networking.md) | P2P, broadcast channels, store-and-forward, federation | | [Demo Walkthrough](getting-started/demo-walkthrough.md) | Step-by-step Alice-and-Bob narrative with sequence diagram | | [Architecture Overview](architecture/overview.md) | Crate boundaries, service architecture, data flow | -| [Protocol Layers](protocol-layers/overview.md) | Deep dives into QUIC/TLS, Cap'n Proto, MLS, Hybrid KEM | -| [Wire Format Reference](wire-format/overview.md) | Cap'n Proto schema documentation | +| [Protocol Layers](protocol-layers/overview.md) | Deep dives into QUIC/TLS, Protobuf framing, MLS, Hybrid KEM | +| [Wire Format Reference](wire-format/overview.md) | Protobuf schema documentation and method ID table | | [Cryptography](cryptography/overview.md) | Identity keys, key lifecycle, forward secrecy, PCS, threat model | | [Design Rationale](design-rationale/overview.md) | ADRs and protocol design 
decisions | | [Roadmap](roadmap/milestones.md) | Milestone tracker and future research directions | @@ -99,26 +91,28 @@ For a deeper discussion of the cryptographic guarantees, threat model, and known ## Current status quicproquo is a **research project** with production-grade features. It has -not been audited by a third party. The test suite covers 130+ tests across +not been audited by a third party. The test suite covers 301 tests across core, server, client, E2E, and P2P modules. **What works today:** -- Full-featured REPL with 40+ commands: DMs, groups, reactions, typing, - edit/delete, file transfer, disappearing messages, safety numbers, MLS key - rotation, account deletion -- Go SDK, TypeScript SDK (WASM crypto + browser demo), C FFI + Python bindings -- Mesh networking: P2P via iroh, mDNS discovery, federation, store-and-forward, - broadcast channels -- Dynamic plugin system with 6 C-compatible hook points -- 24 Cap'n Proto RPC methods on the server +- OPAQUE password authentication (register + login, 4-method handshake) +- 44 Protobuf RPC methods across 14 proto files and 9 workspace crates +- MLS group creation, member add, message encryption, and epoch advancement +- Hybrid X25519 + ML-KEM-768 key encapsulation for post-quantum readiness +- SQLCipher-backed local storage with Argon2id key derivation +- Key transparency (REVOKE, CHECK_REVOCATION, AUDIT) +- Multi-device management and push notification registration +- Blob storage (upload/download) +- Federation relay for cross-server message delivery +- Content moderation (report, ban, unban) +- Account recovery bundle store **Known limitations:** - MLS credentials use `CredentialType::Basic` (raw public key). A production system would bind credentials to a certificate authority or use X.509 certificates. -- The hybrid KEM envelope is implemented and tested, but not yet integrated into the OpenMLS CryptoProvider for full post-quantum MLS (milestone M7). 
-- Browser connectivity requires a WebSocket-to-Cap'n-Proto bridge proxy (not yet included). -- The GUI crate (`quicproquo-gui`) requires GTK system libraries and is not feature-complete. +- The hybrid KEM envelope is implemented and tested, but not yet integrated into the OpenMLS CryptoProvider for full post-quantum MLS (planned for a future milestone). +- Browser connectivity requires a WebSocket-to-Protobuf bridge proxy (not yet included). For the full milestone tracker, see [Milestones](roadmap/milestones.md). diff --git a/docs/src/operations/backup-restore.md b/docs/src/operations/backup-restore.md new file mode 100644 index 0000000..0ba2480 --- /dev/null +++ b/docs/src/operations/backup-restore.md @@ -0,0 +1,201 @@ +# Backup and Restore Procedures + +This document covers backup and restore for all quicproquo server data stores. + +## Data Inventory + +| Data | Location | Backend | Contains | +|------|----------|---------|----------| +| SQLCipher DB | `QPQ_DB_PATH` (default `data/qpq.db`) | `store_backend=sql` | Users, key packages, delivery queues, sessions, KT log, OPAQUE setup, blobs metadata, moderation | +| File store | `QPQ_DATA_DIR` (default `data/`) | `store_backend=file` | Bincode-serialized key packages, delivery queues, server state | +| Blob storage | `QPQ_DATA_DIR/blobs/` | Filesystem | Uploaded file transfer blobs | +| TLS certificates | `QPQ_TLS_CERT`, `QPQ_TLS_KEY` | DER files | Server identity | +| OPAQUE ServerSetup | Inside DB or file store | Persisted | OPAQUE credential state (critical for auth) | +| Server signing key | Inside DB or file store | Persisted | Ed25519 key for delivery proofs | +| KT Merkle log | Inside DB or file store | Persisted | Key transparency audit log | + +## SQLCipher Backup + +### Hot Backup (Online) + +SQLCipher supports the `.backup` command while the server is running (WAL mode +allows concurrent readers). + +```bash +# 1. Open the encrypted database with the same key +sqlite3 data/qpq.db + +# 2. 
At the sqlite3 prompt, set the encryption key +PRAGMA key = 'your-db-key-here'; + +# 3. Perform an online backup +.backup /backups/qpq-$(date +%Y%m%d-%H%M%S).db + +.quit +``` + +### Scripted Hot Backup + +```bash +#!/bin/bash +set -euo pipefail + +BACKUP_DIR="/backups/qpq" +DB_PATH="${QPQ_DB_PATH:-data/qpq.db}" +DB_KEY="${QPQ_DB_KEY}" +TIMESTAMP=$(date +%Y%m%d-%H%M%S) +BACKUP_FILE="${BACKUP_DIR}/qpq-${TIMESTAMP}.db" + +mkdir -p "$BACKUP_DIR" + +sqlite3 "$DB_PATH" </dev/null || true +cp data/qpq.db-shm /backups/ 2>/dev/null || true + +# 4. Restart the server +systemctl start qpq-server +``` + +## File Backend Backup + +When using `store_backend=file`, data is stored as bincode files under +`QPQ_DATA_DIR`. + +```bash +# Full directory backup +tar czf /backups/qpq-data-$(date +%Y%m%d-%H%M%S).tar.gz \ + -C "$(dirname "${QPQ_DATA_DIR:-data}")" \ + "$(basename "${QPQ_DATA_DIR:-data}")" +``` + +## Blob Storage Backup + +Blobs are stored in `QPQ_DATA_DIR/blobs/`. These are immutable once written. + +```bash +# Incremental rsync (blobs are write-once, ideal for rsync) +rsync -av --progress data/blobs/ /backups/blobs/ +``` + +## TLS Certificate Backup + +```bash +# Back up TLS certificates (store separately from DB backups) +cp data/server-cert.der /backups/tls/server-cert.der +cp data/server-key.der /backups/tls/server-key.der + +# Federation certs (if federation is enabled) +cp data/federation-cert.der /backups/tls/federation-cert.der 2>/dev/null || true +cp data/federation-key.der /backups/tls/federation-key.der 2>/dev/null || true +cp data/federation-ca.der /backups/tls/federation-ca.der 2>/dev/null || true +``` + +## Restore Procedures + +### Restore SQLCipher Database + +```bash +# 1. Stop the server +systemctl stop qpq-server + +# 2. Move the current (corrupt/lost) database aside +mv data/qpq.db data/qpq.db.broken 2>/dev/null || true +rm -f data/qpq.db-wal data/qpq.db-shm + +# 3. Copy the backup in place +cp /backups/qpq-20260304.db data/qpq.db + +# 4. 
Verify integrity +sqlite3 data/qpq.db "PRAGMA key = '${QPQ_DB_KEY}'; PRAGMA integrity_check;" + +# 5. Start the server (migrations will apply automatically if needed) +systemctl start qpq-server +``` + +### Restore File Backend + +```bash +# 1. Stop the server +systemctl stop qpq-server + +# 2. Replace the data directory +mv data data.broken 2>/dev/null || true +tar xzf /backups/qpq-data-20260304.tar.gz -C . + +# 3. Restore TLS certs if not included in the data backup +cp /backups/tls/server-cert.der data/server-cert.der +cp /backups/tls/server-key.der data/server-key.der + +# 4. Start the server +systemctl start qpq-server +``` + +### Restore Blobs Only + +```bash +rsync -av /backups/blobs/ data/blobs/ +``` + +## Backup Schedule Recommendations + +| Frequency | What | Method | +|-----------|------|--------| +| Every 6 hours | SQLCipher database | Hot backup script via cron | +| Daily | File backend / full data dir | tar + offsite copy | +| Continuous | Blobs | rsync (incremental) | +| On change | TLS certificates | Manual + secret manager | + +## Cron Example + +```cron +# SQLCipher hot backup every 6 hours +0 */6 * * * /opt/qpq/scripts/backup-db.sh >> /var/log/qpq-backup.log 2>&1 + +# Full data directory daily at 02:00 +0 2 * * * tar czf /backups/qpq-data-$(date +\%Y\%m\%d).tar.gz -C /var/lib quicproquo + +# Blob sync every hour +0 * * * * rsync -a /var/lib/quicproquo/blobs/ /backups/blobs/ + +# Prune backups older than 30 days +0 3 * * 0 find /backups -name 'qpq-*' -mtime +30 -delete +``` + +## Verification + +Always verify backups after creation: + +```bash +# SQLCipher integrity check +sqlite3 /backups/qpq-latest.db \ + "PRAGMA key = '${QPQ_DB_KEY}'; PRAGMA integrity_check; SELECT count(*) FROM users;" + +# File backend: check the archive is valid +tar tzf /backups/qpq-data-latest.tar.gz > /dev/null + +# TLS cert: check it parses and is not expired +openssl x509 -inform DER -in /backups/tls/server-cert.der -noout -dates +``` diff --git 
a/docs/src/operations/monitoring.md b/docs/src/operations/monitoring.md new file mode 100644 index 0000000..823b92c --- /dev/null +++ b/docs/src/operations/monitoring.md @@ -0,0 +1,233 @@ +# Monitoring Guide + +This document covers metrics collection, alerting, and dashboards for +quicproquo server deployments. + +## Enabling Metrics + +The server exports Prometheus metrics via HTTP when configured: + +```bash +# Environment variables +QPQ_METRICS_LISTEN=0.0.0.0:9090 +QPQ_METRICS_ENABLED=true + +# Or in qpq-server.toml +metrics_listen = "0.0.0.0:9090" +metrics_enabled = true +``` + +Metrics are served at `/metrics` on the `metrics_listen` address in Prometheus +exposition format. + +## Available Metrics + +### Counters + +| Metric | Description | Labels | +|--------|-------------|--------| +| `enqueue_total` | Total messages enqueued | - | +| `enqueue_bytes_total` | Total bytes enqueued | - | +| `fetch_total` | Total message fetches completed | - | +| `fetch_wait_total` | Total long-poll fetch waits | - | +| `key_package_upload_total` | Total MLS key package uploads | - | +| `auth_login_success_total` | Successful OPAQUE login completions | - | +| `auth_login_failure_total` | Failed login attempts | - | +| `rate_limit_hit_total` | Rate limit rejections | - | + +### Gauges + +| Metric | Description | +|--------|-------------| +| `delivery_queue_depth` | Current delivery queue depth (sampled) | + +## Prometheus Configuration + +```yaml +# prometheus.yml +global: + scrape_interval: 15s + evaluation_interval: 15s + +scrape_configs: + - job_name: 'qpq-server' + static_configs: + - targets: ['qpq-server:9090'] + scrape_interval: 10s +``` + +## Alert Rules + +```yaml +# prometheus-alerts.yml +groups: + - name: qpq-server + rules: + # Server down + - alert: QpqServerDown + expr: up{job="qpq-server"} == 0 + for: 1m + labels: + severity: critical + annotations: + summary: "qpq-server is down" + description: "Prometheus cannot scrape qpq-server metrics for > 1 minute."
+ + # High auth failure rate (potential brute force) + - alert: QpqHighAuthFailureRate + expr: rate(auth_login_failure_total[5m]) > 10 + for: 2m + labels: + severity: warning + annotations: + summary: "High authentication failure rate" + description: "{{ $value | printf \"%.1f\" }} auth failures/sec over 5 minutes." + + # Rate limiting active + - alert: QpqRateLimitActive + expr: rate(rate_limit_hit_total[5m]) > 5 + for: 5m + labels: + severity: warning + annotations: + summary: "Rate limiting is actively rejecting requests" + description: "{{ $value | printf \"%.1f\" }} rate limit hits/sec." + + # Delivery queue growing + - alert: QpqDeliveryQueueHigh + expr: delivery_queue_depth > 10000 + for: 10m + labels: + severity: warning + annotations: + summary: "Delivery queue depth is high" + description: "Queue depth: {{ $value }}. Clients may not be fetching." + + - alert: QpqDeliveryQueueCritical + expr: delivery_queue_depth > 100000 + for: 5m + labels: + severity: critical + annotations: + summary: "Delivery queue depth is critical" + description: "Queue depth: {{ $value }}. Investigate immediately." + + # No enqueue activity (service may be stuck) + - alert: QpqNoEnqueueActivity + expr: rate(enqueue_total[15m]) == 0 + for: 30m + labels: + severity: warning + annotations: + summary: "No messages enqueued in 30 minutes" + description: "Check if the service is accepting connections." + + # Auth success ratio too low + - alert: QpqLowAuthSuccessRatio + expr: > + rate(auth_login_success_total[5m]) + / (rate(auth_login_success_total[5m]) + rate(auth_login_failure_total[5m])) + < 0.5 + for: 10m + labels: + severity: warning + annotations: + summary: "Auth success ratio below 50%" + description: "More than half of login attempts are failing." +``` + +## Key Dashboard Panels + +See `dashboards/qpq-overview.json` for the full Grafana dashboard. 
Key panels: + +### Message Throughput + +- **Enqueue rate**: `rate(enqueue_total[5m])` +- **Fetch rate**: `rate(fetch_total[5m])` +- **Enqueue bandwidth**: `rate(enqueue_bytes_total[5m])` + +### Authentication + +- **Login success rate**: `rate(auth_login_success_total[5m])` +- **Login failure rate**: `rate(auth_login_failure_total[5m])` +- **Success ratio**: `rate(auth_login_success_total[5m]) / (rate(auth_login_success_total[5m]) + rate(auth_login_failure_total[5m]))` + +### Delivery Queue + +- **Queue depth**: `delivery_queue_depth` +- **Queue growth rate**: `deriv(delivery_queue_depth[10m])` + +### Rate Limiting + +- **Rate limit hits**: `rate(rate_limit_hit_total[5m])` + +### Infrastructure (Node Exporter) + +- CPU, memory, disk, network from `node_exporter` + +## Grafana Dashboard + +Import the dashboard from `dashboards/qpq-overview.json`: + +1. Open Grafana -> Dashboards -> Import +2. Upload `docs/operations/dashboards/qpq-overview.json` +3. Select your Prometheus data source +4. 
Save + +## Log Monitoring + +The server uses `tracing` with `RUST_LOG` environment variable: + +```bash +# Production: info level with structured JSON output +RUST_LOG=info + +# Debug specific modules +RUST_LOG=info,quicproquo_server::node_service=debug + +# Verbose debugging +RUST_LOG=debug +``` + +### Key Log Messages to Monitor + +| Log Pattern | Meaning | Action | +|-------------|---------|--------| +| `"TLS certificate expires within 30 days"` | Cert expiring soon | Rotate certificate | +| `"TLS certificate is self-signed"` | Self-signed cert in use | Replace with CA-signed cert in production | +| `"connection rate limit exceeded"` | IP being rate limited | Check for DDoS | +| `"running without QPQ_AUTH_TOKEN"` | Insecure mode | Must not appear in production | +| `"db_key is empty; SQL store will be plaintext"` | Unencrypted DB | Must not appear in production | +| `"shutdown signal received"` | Graceful shutdown started | Expected during deploys | +| `"generated and persisted new OPAQUE ServerSetup"` | Fresh OPAQUE setup | Expected on first start only | + +### Log Aggregation + +For production, pipe logs to a log aggregator: + +```bash +# Systemd -> journald -> Loki/Elasticsearch +journalctl -u qpq-server -f --output=json | \ + promtail --stdin --client.url=http://loki:3100/loki/api/v1/push + +# Docker -> Loki driver +docker run --log-driver=loki \ + --log-opt loki-url="http://loki:3100/loki/api/v1/push" \ + qpq-server +``` + +## Health Checking + +The Docker image includes a basic health check (TLS cert file exists). 
For +deeper health checks: + +```bash +# Simple: check the process is running and port is open +ss -ulnp | grep 5001 + +# Metrics endpoint (if enabled) +curl -sf http://localhost:9090/metrics > /dev/null + +# Full client connection test +qpq-client --server 127.0.0.1:5001 --ping +``` diff --git a/docs/src/operations/scaling-guide.md b/docs/src/operations/scaling-guide.md new file mode 100644 index 0000000..098ef51 --- /dev/null +++ b/docs/src/operations/scaling-guide.md @@ -0,0 +1,251 @@ +# Scaling Guide + +This document covers resource sizing, scaling triggers, and capacity planning +for quicproquo deployments. + +## Architecture Overview + +quicproquo runs as a single-process server handling QUIC connections. Key +resource consumers: + +- **CPU**: TLS 1.3 handshakes (QUIC), OPAQUE PAKE authentication, message routing +- **Memory**: In-memory session state (DashMap), QUIC connection state, delivery waiters, rate limit entries +- **Disk I/O**: SQLCipher reads/writes (WAL mode), blob storage, KT Merkle log +- **Network**: QUIC (UDP), metrics HTTP, optional WebSocket bridge + +## Single-Node Sizing + +### Minimum (Development / Small Team) + +| Resource | Value | +|----------|-------| +| CPU | 1 vCPU | +| Memory | 512 MB | +| Disk | 10 GB SSD | +| Network | 100 Mbps | + +Supports approximately 100 concurrent users, light message traffic. + +### Recommended (Production / Small-Medium) + +| Resource | Value | +|----------|-------| +| CPU | 2-4 vCPU | +| Memory | 2-4 GB | +| Disk | 50-100 GB NVMe SSD | +| Network | 1 Gbps | + +Supports approximately 1,000-5,000 concurrent users. + +### Large (High Traffic) + +| Resource | Value | +|----------|-------| +| CPU | 8+ vCPU | +| Memory | 8-16 GB | +| Disk | 500 GB+ NVMe SSD (RAID 10) | +| Network | 10 Gbps | + +Supports approximately 10,000+ concurrent users. 
+ +## Scaling Triggers + +Monitor these metrics and scale when thresholds are exceeded: + +| Metric | Warning | Critical | Action | +|--------|---------|----------|--------| +| CPU usage | > 70% sustained (5 min) | > 90% sustained | Add CPU or scale horizontally | +| Memory usage | > 75% | > 90% | Increase memory, check for leaks | +| Disk usage | > 70% | > 90% | Expand volume, clean old data | +| Disk I/O latency | > 5 ms p95 | > 20 ms p95 | Move to faster storage | +| `delivery_queue_depth` | > 10,000 | > 100,000 | Investigate stale queues | +| `rate_limit_hit_total` rate | > 100/min | > 1000/min | Investigate abuse, adjust limits | +| `auth_login_failure_total` rate | > 50/min | > 500/min | Potential brute force attack | +| Connection count | > 80% of `max_concurrent_bidi_streams` | > 95% | Scale horizontally | +| TLS handshake latency | > 100 ms p95 | > 500 ms p95 | Add CPU, check network | + +## Vertical Scaling + +### CPU Scaling + +The server is async (Tokio) and benefits from multiple cores. QUIC TLS +handshakes and OPAQUE computations are CPU-intensive. + +```bash +# Check current CPU usage +top -bn1 -p $(pgrep qpq-server) + +# For Docker: increase CPU limits in docker-compose.prod.yml +# deploy: +# resources: +# limits: +# cpus: '4' +``` + +### Memory Scaling + +In-memory state scales linearly with concurrent connections: + +- ~2-5 KB per active QUIC connection (quinn state) +- ~200 bytes per session entry (DashMap) +- ~100 bytes per rate limit entry +- ~100 bytes per delivery waiter + +```bash +# Estimate memory for 10,000 connections: +# 10,000 * 5 KB = ~50 MB for connections +# 10,000 * 500 bytes = ~5 MB for sessions/rate limits +# SQLCipher connection pool: ~50 MB (4 connections, caches) +# Base process: ~30 MB +# Total: ~135 MB + headroom = 256-512 MB minimum +``` + +### Disk I/O Scaling + +SQLCipher uses WAL mode for concurrent reads. 
For write-heavy workloads: + +```bash +# Check current I/O +iostat -x 1 5 + +# Increase WAL autocheckpoint threshold for burst writes +sqlite3 data/qpq.db "PRAGMA key='${QPQ_DB_KEY}'; PRAGMA wal_autocheckpoint=2000;" +``` + +## Horizontal Scaling + +quicproquo does not yet have built-in multi-node clustering. For horizontal +scaling, use these patterns: + +### Load Balancer (UDP/QUIC) + +Place a UDP load balancer in front of multiple qpq-server instances. Each +instance runs independently with its own database. + +``` + +-----------+ + clients ------> | L4 LB | ----> qpq-server-1 (db-1) + | (UDP/QUIC)| ----> qpq-server-2 (db-2) + +-----------+ qpq-server-3 (db-3) +``` + +**Requirements:** + +- Sticky sessions (by client IP or QUIC connection ID) so a client always + reaches the same node. +- Shared storage backend or federation between nodes. + +### Federation for Multi-Node + +Enable federation to relay messages between nodes: + +```toml +# qpq-server.toml on node-1 +[federation] +enabled = true +domain = "node1.chat.example.com" +listen = "0.0.0.0:7001" +federation_cert = "data/federation-cert.der" +federation_key = "data/federation-key.der" +federation_ca = "data/federation-ca.der" + +[[federation.peers]] +domain = "node2.chat.example.com" +address = "10.0.1.2:7001" +``` + +### Shared Database (Future) + +For true horizontal scaling, migrating from SQLCipher to a shared PostgreSQL +instance is the planned approach. This is not yet implemented. 
+ +``` + qpq-server-1 --\ + qpq-server-2 ---+--> PostgreSQL (shared) + qpq-server-3 --/ +``` + +## Connection Tuning + +The server has these QUIC transport defaults: + +| Parameter | Default | Tunable | +|-----------|---------|---------| +| Max idle timeout | 300s (5 min) | Code change required | +| Max concurrent bidi streams | 1 per connection | Code change required | +| SQLCipher connection pool | 4 connections | Code change required | + +For high connection counts: + +```bash +# Increase OS file descriptor limit +ulimit -n 65536 + +# Increase UDP buffer sizes in /etc/sysctl.d/99-qpq.conf +net.core.rmem_max = 26214400 +net.core.wmem_max = 26214400 +net.core.rmem_default = 1048576 +net.core.wmem_default = 1048576 +``` + +```bash +sysctl -p /etc/sysctl.d/99-qpq.conf +``` + +## Docker Resource Limits + +```yaml +# docker-compose.prod.yml +services: + server: + deploy: + resources: + limits: + cpus: '4' + memory: 4G + reservations: + cpus: '2' + memory: 1G + ulimits: + nofile: + soft: 65536 + hard: 65536 +``` + +## Load Testing + +Use the included test infrastructure to benchmark: + +```bash +# Build the test client +cargo build --release --bin qpq-client + +# Run concurrent connection test (example) +for i in $(seq 1 100); do + qpq-client --server 127.0.0.1:5001 & +done +wait + +# Monitor during load test +watch -n1 'curl -s http://localhost:9090/metrics | grep -E "enqueue_total|fetch_total|delivery_queue_depth|rate_limit"' +``` + +## Capacity Planning Worksheet + +| Parameter | Your Value | +|-----------|-----------| +| Expected concurrent users | | +| Messages per user per hour | | +| Average message size (bytes) | | +| Blob uploads per day | | +| Average blob size (MB) | | +| Data retention (days) | | + +**Formulas:** + +``` +Storage per day = (users * msgs/hr * 24 * avg_msg_size) + (blob_uploads * avg_blob_size) +DB growth per month = storage_per_day * 30 +Memory estimate = (concurrent_users * 5 KB) + 256 MB base +CPU estimate = 1 vCPU per ~2,500 concurrent 
connections (depends on message rate) +``` diff --git a/docs/src/protocol-layers/capn-proto.md b/docs/src/protocol-layers/capn-proto.md index 55bfe7a..2c73896 100644 --- a/docs/src/protocol-layers/capn-proto.md +++ b/docs/src/protocol-layers/capn-proto.md @@ -1,264 +1,299 @@ -# Cap'n Proto Serialisation and RPC +# Protobuf Framing -quicproquo uses [Cap'n Proto](https://capnproto.org/) for both message serialisation and remote procedure calls. The serialisation layer encodes structured messages (Envelopes, Auth tokens, delivery payloads) into a compact binary format. The RPC layer provides the client-server interface for the Authentication Service, Delivery Service, and health checks -- all exposed through a single `NodeService` interface. +quicproquo v2 uses a custom binary framing protocol layered over QUIC bidirectional streams. Message payloads are serialised with Protocol Buffers (Protobuf) via the `prost` crate. The framing layer (implemented in `quicproquo-rpc`) adds a compact fixed-size header that carries the method ID, request correlation ID, and payload length -- enabling zero-copy dispatch without a separate length-delimited codec. -This page covers why Cap'n Proto was chosen, how schemas are compiled, the owned `ParsedEnvelope` type, serialisation helpers, and ALPN integration with QUIC. +This page covers the three frame types, the method ID dispatch table, status codes, push event delivery, and the Protobuf schema organisation. -## Why Cap'n Proto +--- -Several serialisation formats were considered. 
The table below summarises the trade-offs: +## Frame Types -| Format | Zero-copy reads | Schema enforcement | Built-in RPC | Canonical bytes for signing | -|---|---|---|---|---| -| **Cap'n Proto** | Yes | Yes (`.capnp` schemas) | Yes (`capnp-rpc`) | Yes (canonical serialisation mode) | -| Protocol Buffers | No (requires deserialisation) | Yes (`.proto` schemas) | Yes (`tonic`/gRPC) | No (non-deterministic field ordering) | -| MessagePack | No | No (untyped) | No | No | -| FlatBuffers | Yes | Yes (`.fbs` schemas) | No built-in RPC | Partial | +There are three frame types in the v2 protocol. All multi-byte integers are **big-endian** (network byte order). -Cap'n Proto was selected for the following reasons: +### Request Frame (client -> server) -1. **Zero-copy reads**: Cap'n Proto messages can be read directly from the wire buffer without deserialisation. The `Reader` type is a thin pointer into the original bytes. This eliminates allocation and copying on the hot path (message routing in the Delivery Service). +Sent on a QUIC bidirectional stream (one stream per RPC call): -2. **Schema-enforced types**: All messages are defined in `.capnp` schema files. The compiler (`capnpc`) generates type-safe Rust code that prevents mismatched field types at compile time. This is especially valuable for a security-sensitive protocol where a type confusion bug could be exploitable. - -3. **Canonical serialisation**: Cap'n Proto can produce deterministic byte representations of messages. This is critical for MLS, where Commits and KeyPackages must be signed -- the signature must cover exactly the same bytes that the verifier will see. - -4. **Built-in async RPC**: The `capnp-rpc` crate provides a capability-based RPC system with promise pipelining. quicproquo uses it for the `NodeService` interface (KeyPackage upload/fetch, message enqueue/fetch, health checks, hybrid key operations). This avoids the need to hand-roll a request/response protocol. - -5. 
**Compact wire format**: Cap'n Proto's wire format is more compact than JSON or XML and comparable to Protocol Buffers, with the advantage of no decode step. - -## Schema compilation flow - -Cap'n Proto schemas live in the workspace-root `schemas/` directory: - -```text -schemas/ - envelope.capnp -- Top-level wire message (MsgType enum + payload) - auth.capnp -- AuthenticationService RPC interface (legacy, pre-M3) - delivery.capnp -- DeliveryService RPC interface (legacy, pre-M3) - node.capnp -- Unified NodeService RPC interface (M3+) +``` + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| method_id (u16 BE) | request_id (u32 BE) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| request_id (cont.) | payload_len (u32 BE) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| payload_len (cont.) | protobuf payload ... | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ``` -### build.rs +| Field | Type | Bytes | Description | +|---|---|---|---| +| `method_id` | `u16` | 0-1 | RPC method identifier (see method IDs table) | +| `request_id` | `u32` | 2-5 | Client-generated correlation ID; echoed back in the response | +| `payload_len` | `u32` | 6-9 | Length of the Protobuf payload in bytes | +| payload | bytes | 10+ | Protobuf-encoded request message | -The `quicproquo-proto` crate compiles these schemas at build time via `build.rs`: +Header size: **10 bytes**. Maximum payload: **4 MiB**. + +### Response Frame (server -> client) + +Sent on the same QUIC bidirectional stream as the request: + +``` + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| status (u8) | request_id (u32 BE) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| request_id (cont.) 
| payload_len (u32 BE) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| payload_len (cont.) | protobuf payload ... | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +``` + +| Field | Type | Bytes | Description | +|---|---|---|---| +| `status` | `u8` | 0 | Status code (see status codes table) | +| `request_id` | `u32` | 1-4 | Echoes the `request_id` from the request frame | +| `payload_len` | `u32` | 5-8 | Length of the Protobuf payload in bytes | +| payload | bytes | 9+ | Protobuf-encoded response message (may be empty on error) | + +Header size: **9 bytes**. + +### Push Frame (server -> client, uni-stream) + +Sent by the server on QUIC uni-directional streams for real-time event delivery. No request ID -- push frames are not correlated to any client request. + +``` + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| event_type (u16 BE) | payload_len (u32 BE) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| payload_len (cont.) | protobuf payload ... | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +``` + +| Field | Type | Bytes | Description | +|---|---|---|---| +| `event_type` | `u16` | 0-1 | Push event type (see push event types table) | +| `payload_len` | `u32` | 2-5 | Length of the Protobuf payload in bytes | +| payload | bytes | 6+ | Protobuf-encoded push event message | + +Header size: **6 bytes**. + +--- + +## Status Codes + +The `status` byte in a Response frame carries one of the following values: + +| Value | `RpcStatus` variant | Meaning | +|-------|---------------------|---------| +| 0 | `Ok` | Success. Response payload contains the result. | +| 1 | `BadRequest` | Malformed request, missing required field, or failed validation. | +| 2 | `Unauthorized` | Missing or invalid session token. | +| 3 | `Forbidden` | Valid token but insufficient permissions for this operation. 
| +| 4 | `NotFound` | Requested resource does not exist (e.g., KeyPackage not found). | +| 5 | `RateLimited` | Request rate limit exceeded. Client should back off before retrying. | +| 8 | `DeadlineExceeded` | Server could not complete the request within the configured deadline. | +| 9 | `Unavailable` | Server temporarily unable to serve the request (e.g., storage unavailable). | +| 10 | `Internal` | Unexpected server-side error. | +| 11 | `UnknownMethod` | The `method_id` in the request is not registered. | + +--- + +## Method IDs + +The RPC method IDs listed below (49 in total) are defined in `crates/quicproquo-proto/src/lib.rs` in the `method_ids` module. The numeric ranges group related methods by service category. + +### Auth (100-103) + +| ID | Method | Request type | Response type | +|---|---|---|---| +| 100 | `OpaqueRegisterStart` | `OpaqueRegisterStartRequest` | `OpaqueRegisterStartResponse` | +| 101 | `OpaqueRegisterFinish` | `OpaqueRegisterFinishRequest` | `OpaqueRegisterFinishResponse` | +| 102 | `OpaqueLoginStart` | `OpaqueLoginStartRequest` | `OpaqueLoginStartResponse` | +| 103 | `OpaqueLoginFinish` | `OpaqueLoginFinishRequest` | `OpaqueLoginFinishResponse` | + +### Delivery (200-205) + +| ID | Method | Request type | Response type | +|---|---|---|---| +| 200 | `Enqueue` | `EnqueueRequest` | `EnqueueResponse` | +| 201 | `Fetch` | `FetchRequest` | `FetchResponse` | +| 202 | `FetchWait` | `FetchWaitRequest` | `FetchWaitResponse` | +| 203 | `Peek` | `PeekRequest` | `PeekResponse` | +| 204 | `Ack` | `AckRequest` | `AckResponse` | +| 205 | `BatchEnqueue` | `BatchEnqueueRequest` | `BatchEnqueueResponse` | + +### Keys (300-304) + +| ID | Method | Request type | Response type | +|---|---|---|---| +| 300 | `UploadKeyPackage` | `UploadKeyPackageRequest` | `UploadKeyPackageResponse` | +| 301 | `FetchKeyPackage` | `FetchKeyPackageRequest` | `FetchKeyPackageResponse` | +| 302 | `UploadHybridKey` | `UploadHybridKeyRequest` | `UploadHybridKeyResponse` | +| 303 | `FetchHybridKey` |
`FetchHybridKeyRequest` | `FetchHybridKeyResponse` | +| 304 | `FetchHybridKeys` | `FetchHybridKeysRequest` | `FetchHybridKeysResponse` | + +### Channel (400) + +| ID | Method | Request type | Response type | +|---|---|---|---| +| 400 | `CreateChannel` | `CreateChannelRequest` | `CreateChannelResponse` | + +### Group Management (410-413) + +| ID | Method | Request type | Response type | +|---|---|---|---| +| 410 | `RemoveMember` | `RemoveMemberRequest` | `RemoveMemberResponse` | +| 411 | `UpdateGroupMetadata` | `UpdateGroupMetadataRequest` | `UpdateGroupMetadataResponse` | +| 412 | `ListGroupMembers` | `ListGroupMembersRequest` | `ListGroupMembersResponse` | +| 413 | `RotateKeys` | `RotateKeysRequest` | `RotateKeysResponse` | + +### Moderation (420-424) + +| ID | Method | Request type | Response type | +|---|---|---|---| +| 420 | `ReportMessage` | `ReportMessageRequest` | `ReportMessageResponse` | +| 421 | `BanUser` | `BanUserRequest` | `BanUserResponse` | +| 422 | `UnbanUser` | `UnbanUserRequest` | `UnbanUserResponse` | +| 423 | `ListReports` | `ListReportsRequest` | `ListReportsResponse` | +| 424 | `ListBanned` | `ListBannedRequest` | `ListBannedResponse` | + +### User / Identity (500-501) + +| ID | Method | Request type | Response type | +|---|---|---|---| +| 500 | `ResolveUser` | `ResolveUserRequest` | `ResolveUserResponse` | +| 501 | `ResolveIdentity` | `ResolveIdentityRequest` | `ResolveIdentityResponse` | + +### Key Transparency (510-520) + +| ID | Method | Request type | Response type | +|---|---|---|---| +| 510 | `RevokeKey` | `RevokeKeyRequest` | `RevokeKeyResponse` | +| 511 | `CheckRevocation` | `CheckRevocationRequest` | `CheckRevocationResponse` | +| 520 | `AuditKeyTransparency` | `AuditKeyTransparencyRequest` | `AuditKeyTransparencyResponse` | + +### Blob Storage (600-601) + +| ID | Method | Request type | Response type | +|---|---|---|---| +| 600 | `UploadBlob` | `UploadBlobRequest` | `UploadBlobResponse` | +| 601 | `DownloadBlob` | 
`DownloadBlobRequest` | `DownloadBlobResponse` | + +### Device Management (700-710) + +| ID | Method | Request type | Response type | +|---|---|---|---| +| 700 | `RegisterDevice` | `RegisterDeviceRequest` | `RegisterDeviceResponse` | +| 701 | `ListDevices` | `ListDevicesRequest` | `ListDevicesResponse` | +| 702 | `RevokeDevice` | `RevokeDeviceRequest` | `RevokeDeviceResponse` | +| 710 | `RegisterPushToken` | `RegisterPushTokenRequest` | `RegisterPushTokenResponse` | + +### Recovery (750-752) + +| ID | Method | Request type | Response type | +|---|---|---|---| +| 750 | `StoreRecoveryBundle` | `StoreRecoveryBundleRequest` | `StoreRecoveryBundleResponse` | +| 751 | `FetchRecoveryBundle` | `FetchRecoveryBundleRequest` | `FetchRecoveryBundleResponse` | +| 752 | `DeleteRecoveryBundle` | `DeleteRecoveryBundleRequest` | `DeleteRecoveryBundleResponse` | + +### P2P and Health (800-802) + +| ID | Method | Request type | Response type | +|---|---|---|---| +| 800 | `PublishEndpoint` | `PublishEndpointRequest` | `PublishEndpointResponse` | +| 801 | `ResolveEndpoint` | `ResolveEndpointRequest` | `ResolveEndpointResponse` | +| 802 | `Health` | `HealthRequest` | `HealthResponse` | + +### Federation (900-905) + +| ID | Method | Request type | Response type | +|---|---|---|---| +| 900 | `RelayEnqueue` | `RelayEnqueueRequest` | `RelayEnqueueResponse` | +| 901 | `RelayBatchEnqueue` | `RelayBatchEnqueueRequest` | `RelayBatchEnqueueResponse` | +| 902 | `ProxyFetchKeyPackage` | `ProxyFetchKeyPackageRequest` | `ProxyFetchKeyPackageResponse` | +| 903 | `ProxyFetchHybridKey` | `ProxyFetchHybridKeyRequest` | `ProxyFetchHybridKeyResponse` | +| 904 | `ProxyResolveUser` | `ProxyResolveUserRequest` | `ProxyResolveUserResponse` | +| 905 | `FederationHealth` | `FederationHealthRequest` | `FederationHealthResponse` | + +### Account (950) + +| ID | Method | Request type | Response type | +|---|---|---|---| +| 950 | `DeleteAccount` | `DeleteAccountRequest` | `DeleteAccountResponse` | + +--- + +## Push 
Event Types + +Server-to-client push events are delivered on QUIC uni-streams using the Push frame format. Event types are defined alongside method IDs in `quicproquo-proto/src/lib.rs`: + +| Value | Event | Description | +|-------|-------|-------------| +| 1000 | `PushNewMessage` | A new message has been enqueued for the client. | +| 1001 | `PushTyping` | A group member has started or stopped typing. | +| 1002 | `PushPresence` | A contact's presence status has changed (online/offline). | +| 1003 | `PushMembership` | Group membership changed (member added or removed). | + +Push events avoid the need for the client to long-poll `FetchWait` (202) for real-time delivery. The client can listen on a background task for incoming uni-streams and process push events independently of pending RPC calls. + +--- + +## Stream Model + +Each RPC call uses a **dedicated QUIC bidirectional stream**: + +1. Client opens a new bidirectional stream (`connection.open_bi()`). +2. Client encodes the request into a `RequestFrame` and writes it to the send half. +3. Client closes the send half (marks end-of-write). +4. Server reads the complete `RequestFrame` from the receive half. +5. Server processes the request and writes a `ResponseFrame` to its send half. +6. Server closes the send half. +7. Client reads the complete `ResponseFrame`. + +This allows many concurrent RPCs on a single QUIC connection without head-of-line blocking. 
+ +--- + +## Protobuf Schema Organisation + +All message types are defined in `proto/qpq/v1/`: + +| File | Contents | +|---|---| +| `auth.proto` | OPAQUE registration and login message types | +| `common.proto` | Auth context, account deletion, shared types | +| `delivery.proto` | Enqueue, Fetch, Peek, Ack, BatchEnqueue | +| `keys.proto` | MLS key packages, hybrid keys | +| `channel.proto` | Channel creation | +| `group.proto` | Group management (remove member, metadata, rotate keys) | +| `moderation.proto` | Report, ban, unban, list | +| `user.proto` | User and identity resolution | +| `kt.proto` | Key transparency (revoke, check, audit) | +| `blob.proto` | Binary object storage | +| `device.proto` | Multi-device management, push tokens | +| `recovery.proto` | Account recovery bundles | +| `p2p.proto` | P2P endpoints, health | +| `federation.proto` | Cross-server relay | + +All `.proto` files use `package qpq.v1;` and are compiled to Rust at build time using `prost-build` via the `quicproquo-proto` crate's `build.rs`. The `protobuf-src` crate vendors `protoc`, so no system-wide `protoc` installation is required. + +Generated Rust types are accessed via: ```rust -capnpc::CompilerCommand::new() - .src_prefix(&schemas_dir) - .file(schemas_dir.join("envelope.capnp")) - .file(schemas_dir.join("auth.capnp")) - .file(schemas_dir.join("delivery.capnp")) - .file(schemas_dir.join("node.capnp")) - .run() - .expect("Cap'n Proto schema compilation failed."); +use quicproquo_proto::qpq::v1::{EnqueueRequest, FetchResponse, /* ... */}; +use quicproquo_proto::method_ids::{ENQUEUE, FETCH, /* ... */}; ``` -Key details: +--- -- **`src_prefix`**: Set to `schemas/` so that inter-schema imports resolve correctly. -- **Output location**: Generated Rust source is written to `$OUT_DIR` (Cargo's build directory). The filenames follow the convention `{schema_name}_capnp.rs`. 
-- **Rerun triggers**: `cargo:rerun-if-changed` directives ensure the build script re-runs whenever any `.capnp` file changes. -- **Prerequisite**: The `capnp` CLI binary must be installed on the build machine (`apt-get install capnproto` or `brew install capnp`). +## Design Constraints of `quicproquo-proto` -### Generated module inclusion - -The generated code is spliced into the `quicproquo-proto` crate via `include!` macros: - -```rust -pub mod envelope_capnp { - include!(concat!(env!("OUT_DIR"), "/envelope_capnp.rs")); -} -pub mod auth_capnp { - include!(concat!(env!("OUT_DIR"), "/auth_capnp.rs")); -} -pub mod delivery_capnp { - include!(concat!(env!("OUT_DIR"), "/delivery_capnp.rs")); -} -pub mod node_capnp { - include!(concat!(env!("OUT_DIR"), "/node_capnp.rs")); -} -``` - -Consumers import types from these modules. For example, `node_capnp::node_service::Server` is the trait that the server implements. - -## The Envelope schema - -The `Envelope` is the top-level wire message for all quicproquo traffic. Every frame exchanged between peers is serialised as an Envelope: - -```capnp -struct Envelope { - msgType @0 :MsgType; - groupId @1 :Data; # 32-byte SHA-256 digest of group name - senderId @2 :Data; # 32-byte SHA-256 digest of Ed25519 pubkey - payload @3 :Data; # Opaque payload (MLS blob or control data) - timestampMs @4 :UInt64; # Unix epoch milliseconds - - enum MsgType { - ping @0; - pong @1; - keyPackageUpload @2; - keyPackageFetch @3; - keyPackageResponse @4; - mlsWelcome @5; - mlsCommit @6; - mlsApplication @7; - error @8; - } -} -``` - -The Delivery Service routes by `(groupId, msgType)` without inspecting `payload`. This design keeps the DS MLS-unaware -- see [ADR-004: MLS-Unaware Delivery Service](../design-rationale/adr-004-mls-unaware-ds.md). - -## The `ParsedEnvelope` owned type - -Cap'n Proto readers (`envelope_capnp::envelope::Reader`) borrow from the original byte buffer and cannot be sent across async task boundaries (`!Send`). 
This is a fundamental limitation of zero-copy reads. - -To bridge this gap, `quicproquo-proto` defines `ParsedEnvelope`: - -```rust -pub struct ParsedEnvelope { - pub msg_type: MsgType, - pub group_id: Vec<u8>, - pub sender_id: Vec<u8>, - pub payload: Vec<u8>, - pub timestamp_ms: u64, -} -``` - -`ParsedEnvelope` eagerly copies all byte fields out of the Cap'n Proto reader, making the type `Send + 'static`. This allows it to cross Tokio task boundaries, be stored in queues, and be passed through channels. - -The trade-off is clear: `ParsedEnvelope` allocates and copies, defeating the zero-copy benefit. This is acceptable because: - -1. The copying happens once per message at the protocol boundary. -2. Application-layer code (MLS encryption/decryption, routing) needs owned data anyway. -3. The performance-critical path (Delivery Service routing) works with opaque `Vec<u8>` payloads, not parsed Cap'n Proto readers. - -### Invariants - -- `group_id` and `sender_id` are either empty (for control messages like Ping/Pong) or exactly 32 bytes (SHA-256 digest). -- `payload` is empty for Ping and Pong; non-empty for all MLS variants. - -## Serialisation helpers - -Two functions handle the conversion between `ParsedEnvelope` and wire bytes: - -### `build_envelope` - -```rust -pub fn build_envelope(env: &ParsedEnvelope) -> Result<Vec<u8>, capnp::Error> -``` - -Serialises a `ParsedEnvelope` to unpacked Cap'n Proto wire bytes. The output includes the Cap'n Proto segment table header followed by the message data. These bytes are suitable as a payload within a QUIC stream. - -Internally, it builds a `capnp::message::Builder`, populates an `Envelope` root, and serialises via `capnp::serialize::write_message`. - -### `parse_envelope` - -```rust -pub fn parse_envelope(bytes: &[u8]) -> Result<ParsedEnvelope, capnp::Error> -``` - -Deserialises unpacked Cap'n Proto wire bytes into a `ParsedEnvelope`. All data is copied out of the reader before returning, so the input slice is not retained. 
- -It returns `capnp::Error` if: -- The bytes are not valid Cap'n Proto wire format. -- The `msgType` discriminant is not present in the current schema (forward-compatibility guard). - -### Low-level helpers - -Two additional functions provide raw byte-to-message conversions: - -```rust -pub fn to_bytes(msg: &Builder) -> Result, capnp::Error> -pub fn from_bytes(bytes: &[u8]) -> Result, capnp::Error> -``` - -`from_bytes` uses `ReaderOptions::new()` with default limits: -- **Traversal limit**: 32 MiB (4 * 1024 * 1024 words) -- **Nesting limit**: 512 levels - -The traversal limit bounds DoS from deeply nested or excessively large Cap'n Proto messages. The server also enforces size limits: 5 MB per payload (`MAX_PAYLOAD_BYTES`) and 1 MB per KeyPackage (`MAX_KEYPACKAGE_BYTES`). - -## The NodeService RPC interface - -The M3 unified RPC interface is defined in `schemas/node.capnp`: - -```capnp -interface NodeService { - uploadKeyPackage @0 (identityKey :Data, package :Data, auth :Auth) - -> (fingerprint :Data); - fetchKeyPackage @1 (identityKey :Data, auth :Auth) -> (package :Data); - enqueue @2 (recipientKey :Data, payload :Data, - channelId :Data, version :UInt16, auth :Auth) -> (); - fetch @3 (recipientKey :Data, channelId :Data, - version :UInt16, auth :Auth) -> (payloads :List(Data)); - fetchWait @4 (recipientKey :Data, channelId :Data, - version :UInt16, timeoutMs :UInt64, auth :Auth) - -> (payloads :List(Data)); - health @5 () -> (status :Text); - uploadHybridKey @6 (identityKey :Data, hybridPublicKey :Data) -> (); - fetchHybridKey @7 (identityKey :Data) -> (hybridPublicKey :Data); -} -``` - -This combines Authentication Service operations (`uploadKeyPackage`, `fetchKeyPackage`), Delivery Service operations (`enqueue`, `fetch`, `fetchWait`), health monitoring (`health`), and hybrid key management (`uploadHybridKey`, `fetchHybridKey`) into a single RPC interface. 
- -### Auth context - -Every mutating RPC method accepts an `Auth` struct: - -```capnp -struct Auth { - version @0 :UInt16; # 0 = legacy/none, 1 = token-based auth - accessToken @1 :Data; # opaque bearer token - deviceId @2 :Data; # optional UUID bytes for auditing -} -``` - -The server validates the `version` field and rejects unknown versions. Token validation is planned for a future milestone. See [Auth, Devices, and Tokens](../roadmap/authz-plan.md). - -## ALPN integration - -Cap'n Proto RPC rides directly on the QUIC bidirectional stream. The ALPN (Application-Layer Protocol Negotiation) extension in the TLS handshake identifies the protocol: - -```rust -tls.alpn_protocols = vec![b"capnp".to_vec()]; -``` - -Both client and server set the ALPN to `b"capnp"`. If the client and server disagree on the ALPN, the TLS handshake fails before any application data is exchanged. - -On the QUIC path, the flow is: - -```text -Client Server - | | - |── QUIC handshake (TLS 1.3) ────►| ALPN: "capnp" - | | - |── open_bi() ───────────────────►| Bidirectional QUIC stream - | | - |◄─────── capnp-rpc messages ────►| VatNetwork reads/writes on the stream -``` - -The `tokio-util` compat layer converts Quinn stream types into `futures::AsyncRead + AsyncWrite`, which `capnp-rpc`'s `VatNetwork` expects. See [QUIC + TLS 1.3](quic-tls.md) for the full connection setup. - -## Comparison with alternatives - -### vs Protocol Buffers + gRPC - -Protocol Buffers require a full deserialisation step to access any field. Cap'n Proto avoids this with zero-copy readers. gRPC requires HTTP/2 framing, which adds overhead on top of QUIC. Cap'n Proto RPC is leaner and maps naturally to a single QUIC stream. - -### vs MessagePack - -MessagePack is untyped -- there is no schema file, and type errors are caught at runtime. This is unacceptable for a security protocol where a misinterpreted field could be exploitable. MessagePack also has no RPC framework, requiring a hand-rolled request/response protocol. 
- -### vs FlatBuffers - -FlatBuffers supports zero-copy reads (like Cap'n Proto) but lacks a built-in RPC framework. The ecosystem and tooling are also less mature for Rust. - -## Design constraints of `quicproquo-proto` - -The `quicproquo-proto` crate enforces three design constraints: +The `quicproquo-proto` crate enforces three constraints: 1. **No crypto**: Key material never enters this crate. All encryption and signing happens in `quicproquo-core`. 2. **No I/O**: Callers own the transport. This crate only converts between bytes and types. @@ -266,10 +301,12 @@ The `quicproquo-proto` crate enforces three design constraints: These constraints keep the serialisation layer thin and auditable. +--- + ## Further reading -- [Envelope Schema](../wire-format/envelope-schema.md) -- Detailed field-by-field breakdown of the Envelope wire format. -- [NodeService Schema](../wire-format/node-service-schema.md) -- Full RPC interface documentation. -- [Auth Schema](../wire-format/auth-schema.md) -- Auth token structure and versioning. -- [MLS (RFC 9420)](mls.md) -- How MLS messages are carried as opaque payloads inside Cap'n Proto Envelopes. -- [ADR-002: Cap'n Proto over MessagePack](../design-rationale/adr-002-capnproto.md) -- Design rationale for choosing Cap'n Proto. +- [QUIC + TLS 1.3](quic-tls.md) -- The transport layer that carries these frames. +- [Service Architecture](../architecture/service-architecture.md) -- How the server dispatches method IDs to handlers. +- [Wire Format Reference](../wire-format/overview.md) -- Full Protobuf schema documentation. +- [MLS (RFC 9420)](mls.md) -- How MLS messages are carried as opaque payloads inside Protobuf delivery messages. +- [ADR-007](../design-rationale/adr-007-protobuf-migration.md) -- Design rationale for the v1 Cap'n Proto to v2 Protobuf migration. 
diff --git a/docs/src/protocol-layers/overview.md b/docs/src/protocol-layers/overview.md index 51536b4..ffbc234 100644 --- a/docs/src/protocol-layers/overview.md +++ b/docs/src/protocol-layers/overview.md @@ -9,7 +9,7 @@ This page provides a high-level comparison and a suggested reading order. The de | Layer | Standard / Spec | Crate(s) | Security Properties | |---|---|---|---| | **QUIC + TLS 1.3** | RFC 9000, RFC 9001 | `quinn 0.11`, `rustls 0.23` | Transport confidentiality, server authentication, 0-RTT resumption | -| **Cap'n Proto** | [capnproto.org specification](https://capnproto.org/encoding.html) | `capnp 0.19`, `capnp-rpc 0.19` | Zero-copy deserialisation, schema-enforced types, canonical serialisation for signing, async RPC | +| **Protobuf framing** | Custom binary header + [Protocol Buffers](https://protobuf.dev/) | `quicproquo-rpc`, `prost 0.13` | Typed length-prefixed frames, method dispatch, push events, status codes | | **MLS** | [RFC 9420](https://www.rfc-editor.org/rfc/rfc9420.html) | `openmls 0.5` | Group key agreement, forward secrecy, post-compromise security (PCS) | | **Hybrid KEM** | [draft-ietf-tls-hybrid-design](https://datatracker.ietf.org/doc/draft-ietf-tls-hybrid-design/) | `ml-kem 0.2`, `x25519-dalek 2` | Post-quantum resistance via ML-KEM-768 combined with X25519 | @@ -27,7 +27,8 @@ Application plaintext | v +-----------+ - | Cap'n Proto| Schema-typed serialisation into Envelope frames + | Protobuf | Typed serialisation into Protobuf messages + | framing | + binary header [method_id/event_type][req_id][len] +-----------+ | v @@ -39,19 +40,19 @@ Application plaintext Network ``` -The Hybrid KEM layer operates orthogonally: it wraps MLS payloads in an outer post-quantum encryption envelope before they enter the transport layer. It is implemented and tested but not yet integrated into the MLS ciphersuite (planned for the M5 milestone). 
+The Hybrid KEM layer operates orthogonally: it wraps MLS payloads in an outer post-quantum encryption envelope before they enter the transport layer. It is implemented and tested but not yet integrated into the MLS ciphersuite (planned for a future milestone). ## Suggested reading order The pages in this section are ordered to build understanding incrementally: -1. **[QUIC + TLS 1.3](quic-tls.md)** -- Start here. This is the transport layer that every client-server connection uses. Understanding QUIC stream multiplexing and the TLS 1.3 handshake is prerequisite to understanding how Cap'n Proto RPC rides on top. +1. **[QUIC + TLS 1.3](quic-tls.md)** -- Start here. This is the transport layer that every client-server connection uses. Understanding QUIC stream multiplexing and the TLS 1.3 handshake is prerequisite to understanding how the Protobuf framing protocol rides on top. 2. **[MLS (RFC 9420)](mls.md)** -- The core cryptographic innovation. MLS provides the group key agreement that makes quicproquo an E2E encrypted group messenger rather than just a transport-encrypted relay. This is the longest and most detailed page. -3. **[Cap'n Proto Serialisation and RPC](capn-proto.md)** -- The serialisation and RPC layer that bridges MLS application data with the transport. Understanding the Envelope schema, the ParsedEnvelope owned type, and the NodeService RPC interface is essential for reading the server and client source code. +3. **[Protobuf Framing](capn-proto.md)** -- The framing and RPC layer that bridges MLS application data with the transport. Understanding the three frame types (Request, Response, Push), the method ID dispatch table, and status codes is essential for reading the server and client source code. -4. **[Hybrid KEM: X25519 + ML-KEM-768](hybrid-kem.md)** -- The post-quantum encryption layer. 
Read this last because it builds on concepts from all other layers: key encapsulation (from MLS), wire format conventions (from Cap'n Proto), and AEAD encryption. +4. **[Hybrid KEM: X25519 + ML-KEM-768](hybrid-kem.md)** -- The post-quantum encryption layer. Read this last because it builds on concepts from all other layers: key encapsulation (from MLS), wire format conventions (from Protobuf framing), and AEAD encryption. ## Cross-cutting concerns @@ -59,9 +60,9 @@ Several topics span multiple layers and have their own dedicated pages elsewhere - **Forward secrecy**: Provided by MLS epoch ratcheting. See [Forward Secrecy](../cryptography/forward-secrecy.md). - **Post-compromise security**: Provided by MLS Update proposals. See [Post-Compromise Security](../cryptography/post-compromise-security.md). -- **Post-quantum readiness**: Currently provided by the standalone Hybrid KEM module; integration into MLS is planned for M5. See [Post-Quantum Readiness](../cryptography/post-quantum-readiness.md). +- **Post-quantum readiness**: Currently provided by the standalone Hybrid KEM module; integration into MLS is planned. See [Post-Quantum Readiness](../cryptography/post-quantum-readiness.md). - **Key lifecycle and zeroization**: Private key material is zeroized after use across all layers. See [Key Lifecycle and Zeroization](../cryptography/key-lifecycle.md). -- **Wire format details**: The Cap'n Proto schema definitions are documented in the [Wire Format Reference](../wire-format/overview.md) section. +- **Wire format details**: The Protobuf schema definitions are documented in the [Wire Format Reference](../wire-format/overview.md) section. - **Design rationale**: The ADR pages explain *why* each layer was chosen. See [Design Decisions Overview](../design-rationale/overview.md). 
## Crate mapping @@ -70,8 +71,9 @@ Each protocol layer maps to one or more workspace crates: | Layer | Primary Crate | Source File(s) | |---|---|---| -| QUIC + TLS 1.3 | `quicproquo-server`, `quicproquo-client` | `main.rs` (server and client entry points) | -| Cap'n Proto | `quicproquo-proto` | `src/lib.rs`, `build.rs`, `schemas/*.capnp` | +| QUIC + TLS 1.3 | `quicproquo-server`, `quicproquo-client` | Server and client entry points | +| Protobuf framing | `quicproquo-rpc` | `src/framing.rs`, `src/server.rs`, `src/client.rs` | +| Protobuf types + method IDs | `quicproquo-proto` | `src/lib.rs` (method_ids), `proto/qpq/v1/*.proto` | | MLS | `quicproquo-core` | `src/group.rs`, `src/keystore.rs` | | Hybrid KEM | `quicproquo-core` | `src/hybrid_kem.rs` | diff --git a/docs/src/protocol-layers/quic-tls.md b/docs/src/protocol-layers/quic-tls.md index 8c09d07..067bd22 100644 --- a/docs/src/protocol-layers/quic-tls.md +++ b/docs/src/protocol-layers/quic-tls.md @@ -10,26 +10,27 @@ QUIC provides several advantages over traditional TCP-based transports: - **0-RTT resumption**: Returning clients can send data in the first flight, reducing connection setup latency. - **Integrated encryption**: TLS 1.3 is integral to the QUIC handshake; no extra round-trips for transport security. - **NAT traversal**: UDP-based; connection migration survives NAT rebinding. -- **Ecosystem support**: `capnp-rpc` can use QUIC bidirectional streams directly via the `tokio-util` compat layer. +- **Per-call concurrency**: The v2 RPC framework opens one bidirectional stream per RPC call. Multiple calls run concurrently without blocking each other. +- **Push streams**: Server-to-client push events use QUIC uni-directional streams, avoiding any request-response overhead. ## Crate integration quicproquo uses the following crates for QUIC and TLS: -- **`quinn 0.11`** -- The async QUIC implementation for Tokio. Provides `Endpoint`, `Connection`, and bidirectional stream types. 
+- **`quinn 0.11`** -- The async QUIC implementation for Tokio. Provides `Endpoint`, `Connection`, and bidirectional/uni-directional stream types. - **`quinn-proto 0.11`** -- The protocol-level types, including `QuicServerConfig` and `QuicClientConfig` wrappers that bridge `rustls` into `quinn`. - **`rustls 0.23`** -- The TLS implementation. quicproquo uses it in strict TLS 1.3 mode with no fallback to TLS 1.2. - **`rcgen 0.13`** -- Self-signed certificate generation for development and testing. ### Server configuration -The server builds its QUIC endpoint configuration in `build_server_config()` (in `quicproquo-server/src/main.rs`): +The server builds its QUIC endpoint configuration with: ```rust let mut tls = rustls::ServerConfig::builder_with_protocol_versions(&[&TLS13]) .with_no_client_auth() .with_single_cert(cert_chain, key)?; -tls.alpn_protocols = vec![b"capnp".to_vec()]; +tls.alpn_protocols = vec![b"qpq".to_vec()]; let crypto = QuicServerConfig::try_from(tls)?; Ok(ServerConfig::with_crypto(Arc::new(crypto))) @@ -39,9 +40,9 @@ Key points: 1. **TLS 1.3 strict mode**: `builder_with_protocol_versions(&[&TLS13])` ensures no TLS 1.2 fallback. This is a hard requirement: TLS 1.2 lacks the 0-RTT and full forward secrecy guarantees that quicproquo relies on. -2. **No client certificate authentication**: `with_no_client_auth()` means the server does not verify client certificates at the TLS layer. Client authentication is handled at the application layer via Ed25519 identity keys and MLS credentials. This is a deliberate design choice -- MLS provides stronger authentication properties than TLS client certificates. +2. **No client certificate authentication**: `with_no_client_auth()` means the server does not verify client certificates at the TLS layer. Client authentication is handled at the application layer via OPAQUE password authentication and Ed25519 identity keys. 
This is a deliberate design choice -- OPAQUE provides stronger authentication properties than TLS client certificates without requiring PKI infrastructure. -3. **ALPN negotiation**: The Application-Layer Protocol Negotiation extension is set to `b"capnp"`, advertising that this endpoint speaks Cap'n Proto RPC. Both client and server must agree on this protocol identifier or the TLS handshake fails. +3. **ALPN negotiation**: The Application-Layer Protocol Negotiation extension is set to `b"qpq"`, advertising that this endpoint speaks the quicproquo v2 Protobuf framing protocol. Both client and server must agree on this protocol identifier or the TLS handshake fails. 4. **`QuicServerConfig` bridge**: The `quinn-proto` crate provides `QuicServerConfig::try_from(tls)` to adapt the `rustls::ServerConfig` for use with QUIC. This handles the QUIC-specific TLS parameters (transport parameters, QUIC header protection keys) automatically. @@ -53,10 +54,10 @@ The client performs the mirror operation. 
It loads the server's DER-encoded cert let mut roots = rustls::RootCertStore::empty(); roots.add(CertificateDer::from(cert_bytes))?; -let tls = rustls::ClientConfig::builder_with_protocol_versions(&[&TLS13]) +let mut tls = rustls::ClientConfig::builder_with_protocol_versions(&[&TLS13]) .with_root_certificates(roots) .with_no_client_auth(); -tls.alpn_protocols = vec![b"capnp".to_vec()]; +tls.alpn_protocols = vec![b"qpq".to_vec()]; let crypto = QuicClientConfig::try_from(tls)?; ``` @@ -65,20 +66,26 @@ The client trusts exactly one certificate: the server's self-signed cert loaded ### Per-connection handling -Each accepted QUIC connection spawns a handler task: +The v2 server accepts connections and handles streams concurrently: ```rust -let (send, recv) = connection.accept_bi().await?; -let (reader, writer) = (recv.compat(), send.compat_write()); +// Accept a QUIC connection +let connection = endpoint.accept().await?; -let network = twoparty::VatNetwork::new(reader, writer, Side::Server, Default::default()); -let service: node_service::Client = capnp_rpc::new_client(NodeServiceImpl { store, waiters }); -RpcSystem::new(Box::new(network), Some(service.client)).await?; +// For each incoming bidirectional stream (one per RPC call): +let (send, recv) = connection.accept_bi().await?; +// Read RequestFrame, dispatch, write ResponseFrame +tokio::spawn(handle_rpc(send, recv, server_state)); + +// For server-initiated push events: +let send = connection.open_uni().await?; +// Write PushFrame +tokio::spawn(send_push(send, event)); ``` -The `tokio-util` compat layer (`compat()` and `compat_write()`) converts Quinn's `RecvStream` and `SendStream` into types that implement `futures::AsyncRead` and `futures::AsyncWrite`, which `capnp-rpc`'s `VatNetwork` requires. The entire Cap'n Proto RPC system then runs over this single QUIC bidirectional stream. - -Because `capnp-rpc` uses `Rc<RefCell<_>>` internally (making it `!Send`), all RPC tasks run on a `tokio::task::LocalSet`. 
The server spawns each connection handler via `tokio::task::spawn_local`. +Unlike the v1 Cap'n Proto RPC (which required `tokio::task::LocalSet` due to +`!Send` internals), the v2 framework uses `Arc`-based shared state and +`tokio::spawn` for full multi-threaded concurrency. ## Certificate trust model @@ -126,9 +133,9 @@ The QUIC + TLS 1.3 layer provides: ### What TLS does *not* provide -- **Client authentication**: Handled by MLS identity credentials at the application layer. See [MLS (RFC 9420)](mls.md). -- **End-to-end encryption**: TLS terminates at the server. The server can read the Cap'n Proto RPC framing and message routing metadata. Payload confidentiality is provided by MLS. See [MLS (RFC 9420)](mls.md). -- **Post-quantum resistance**: TLS 1.3 key exchange uses classical ECDHE. Post-quantum protection of application data is provided by the [Hybrid KEM](hybrid-kem.md) layer (M5 milestone). +- **Client authentication**: Handled by OPAQUE password authentication (methods 100-103) and Ed25519 identity keys at the application layer. See [Service Architecture](../architecture/service-architecture.md). +- **End-to-end encryption**: TLS terminates at the server. The server can read the Protobuf framing and message routing metadata. Payload confidentiality is provided by MLS. See [MLS (RFC 9420)](mls.md). +- **Post-quantum resistance**: TLS 1.3 key exchange uses classical ECDHE. Post-quantum protection of application data is provided by the [Hybrid KEM](hybrid-kem.md) layer. 
## Configuration reference @@ -136,7 +143,7 @@ The QUIC + TLS 1.3 layer provides: | Environment Variable | CLI Flag | Default | Description | |---|---|---|---| -| `QPQ_LISTEN` | `--listen` | `0.0.0.0:7000` | QUIC listen address | +| `QPQ_LISTEN` | `--listen` | `0.0.0.0:5001` | QUIC listen address | | `QPQ_TLS_CERT` | `--tls-cert` | `data/server-cert.der` | TLS certificate path | | `QPQ_TLS_KEY` | `--tls-key` | `data/server-key.der` | TLS private key path | | `QPQ_DATA_DIR` | `--data-dir` | `data` | Persistent storage directory | @@ -147,9 +154,9 @@ The QUIC + TLS 1.3 layer provides: |---|---|---|---| | `QPQ_CA_CERT` | `--ca-cert` | `data/server-cert.der` | Server certificate to trust | | `QPQ_SERVER_NAME` | `--server-name` | `localhost` | Expected TLS server name (must match certificate SAN) | -| `QPQ_SERVER` | `--server` | `127.0.0.1:7000` | Server address (per-subcommand) | +| `QPQ_SERVER` | `--server` | `127.0.0.1:5001` | Server address (per-subcommand) | ## Further reading -- [Cap'n Proto Serialisation and RPC](capn-proto.md) -- The RPC layer that runs on top of QUIC streams. -- [Service Architecture](../architecture/service-architecture.md) -- How the server's `NodeServiceImpl` binds to the QUIC endpoint. +- [Protobuf Framing](capn-proto.md) -- The RPC framing layer that runs on top of QUIC streams. +- [Service Architecture](../architecture/service-architecture.md) -- How the server binds to the QUIC endpoint and dispatches 44 RPC methods. diff --git a/docs/src/roadmap/future-research.md b/docs/src/roadmap/future-research.md index 07d63e5..24965fb 100644 --- a/docs/src/roadmap/future-research.md +++ b/docs/src/roadmap/future-research.md @@ -12,24 +12,6 @@ For the production readiness work breakdown, see ## Transport and Networking -### LibP2P / iroh (n0) - -**Problem:** The current architecture is strictly client-server. Clients behind -NAT cannot communicate directly, and the server is a single point of failure for -delivery. 
- -**Solution:** [LibP2P](https://libp2p.io/) and [iroh](https://iroh.computer/) -(from n0) provide peer discovery, NAT traversal (hole-punching), and relay -fallback. iroh is particularly interesting because it is Rust-native and built on -QUIC, aligning with quicproquo's existing transport layer. - -**Architecture impact:** Move from pure client-server to a hybrid topology where -peers communicate directly when possible and fall back to server relay when NAT -traversal fails. The server role shifts from mandatory relay to optional -rendezvous/relay node. - -**Crates:** `libp2p`, `iroh`, `iroh-net` - ### WebTransport (HTTP/3) **Problem:** Browser clients cannot use raw QUIC. The current stack requires a @@ -66,23 +48,6 @@ significantly, so this should be optional. ## Storage and Persistence -### SQLCipher / libsql (Turso) - -**Problem:** At M6, quicproquo needs persistent storage for group state, key -material, and message queues. Storing private keys in a plaintext SQLite database -is insufficient. - -**Solution:** [SQLCipher](https://www.zetetic.net/sqlcipher/) provides -transparent, page-level AES-256 encryption for SQLite. Alternatively, -[libsql](https://turso.tech/libsql) (Turso) offers a SQLite fork with -encryption, replication, and embedded server capabilities. - -**Architecture impact:** Replace the `sqlx` SQLite backend with SQLCipher. -Encryption key derived from a user-provided passphrase (via Argon2id) or a -hardware-backed key. - -**Crates:** `rusqlite` (with `bundled-sqlcipher` feature), `libsql` - ### CRDTs (Automerge / Yrs) **Problem:** Multi-device support requires synchronising state (group membership, @@ -153,20 +118,6 @@ queries. This is a significant performance trade-off: PIR has high computational cost. Suitable for KeyPackage fetch (small database) before message fetch (large database). -### Sealed Sender (Signal-style) - -**Problem:** The server sees `(sender, recipient, timestamp)` metadata on every -enqueued message. 
Even without reading content, this metadata reveals social -graphs. - -**Solution:** [Sealed Sender](https://signal.org/blog/sealed-sender/) encrypts -the sender's identity inside the MLS ciphertext. The server routes by -`recipientKey` only and cannot determine who sent the message. - -**Architecture impact:** Modify the `enqueue` RPC to omit sender identity from -the server-visible metadata. The sender identity is included only inside the -MLS application message (encrypted). - ### Key Transparency (RFC draft) **Problem:** A compromised server could substitute public keys, performing a @@ -200,24 +151,6 @@ DID URIs. The server resolves DIDs to public keys for routing. **Crates:** `did-key`, `ssi` -### OPAQUE (aPAKE) - -**Problem:** If quicproquo adds password-based account registration, the -server must never see the password -- not even a hash. - -**Solution:** [OPAQUE](https://datatracker.ietf.org/doc/rfc9497/) is an -asymmetric password-authenticated key exchange where the server stores only a -one-way transformation of the password. The server cannot perform offline -dictionary attacks. - -**Architecture impact:** Replace the registration/login flow with OPAQUE. The -server stores an OPAQUE registration record; the client runs the OPAQUE protocol -to authenticate and derive a session key. - -**Crates:** `opaque-ke` - -**References:** RFC 9497 - ### WebAuthn / Passkeys **Problem:** Password-based auth (even with OPAQUE) is vulnerable to phishing. @@ -380,18 +313,25 @@ command sets up the toolchain, `capnp`, and all dependencies. --- -## Top 5 Priority Implementations +## Top Priority Implementations The following table ranks the most impactful technologies for near-term adoption, considering the current state of the codebase and the [milestone plan](milestones.md). -| Priority | Technology | Why | Unlocks | -|----------|-----------|-----|---------| -| 1 | **Post-quantum hybrid KEM** | `ml-kem` is already vendored in the workspace. 
Completing the hybrid `OpenMlsCryptoProvider` makes quicproquo one of the first PQ MLS implementations. | M7 | -| 2 | **SQLCipher persistence** | Encrypted-at-rest storage is the prerequisite for multi-device support, offline usage, and server restart survival. | M6 | -| 3 | **OPAQUE auth** | Zero-knowledge password authentication is a massive security uplift for the account system. The server never sees or stores passwords. | Phase 3 (authz) | -| 4 | **iroh / LibP2P** | NAT traversal and optional P2P mesh makes quicproquo deployable without centralised infrastructure. Aligns with the existing QUIC transport. | Beyond M7 | -| 5 | **Sealed Sender + PIR** | Content encryption is table stakes. Metadata resistance (hiding who talks to whom) is the frontier of private messaging research. | Beyond M7 | +Items marked **Implemented** are already part of the v2 codebase. + +| Priority | Technology | Why | Status | +|----------|-----------|-----|--------| +| -- | **Post-quantum hybrid KEM** | `ml-kem` vendored; custom `OpenMlsCryptoProvider` with X25519 + ML-KEM-768. | **Implemented** | +| -- | **SQLCipher persistence** | Encrypted-at-rest storage via rusqlite + bundled-sqlcipher + Argon2id key derivation. | **Implemented** | +| -- | **OPAQUE auth** | Zero-knowledge password authentication via `opaque-ke`. Server never stores passwords. | **Implemented** | +| -- | **iroh P2P** | NAT traversal and optional P2P mesh via the `quicproquo-p2p` crate (feature-flagged). | **Implemented** | +| -- | **Sealed Sender** | `--sealed-sender` flag encrypts sender identity inside MLS ciphertext. | **Implemented** | +| 1 | **PIR (Private Information Retrieval)** | Fetch messages without revealing the recipient's identity to the server. | Future | +| 2 | **Key Transparency** | Verifiable, append-only log of public key bindings. Detects key substitution attacks. | Future | +| 3 | **WebTransport (HTTP/3)** | Enables browser clients without a WebSocket bridge. 
| Future | +| 4 | **OpenTelemetry** | Distributed tracing and structured metrics for production observability. | Future | +| 5 | **WebAuthn / Passkeys** | Hardware-backed authentication to replace password-based login. | Future | --- diff --git a/docs/src/roadmap/milestones.md b/docs/src/roadmap/milestones.md index 9ee2e45..5d2c37c 100644 --- a/docs/src/roadmap/milestones.md +++ b/docs/src/roadmap/milestones.md @@ -17,7 +17,7 @@ for what that means in practice. | M4 | Group CLI Subcommands | **Complete** | Persistent CLI (create-group, invite, join, send, recv), OPAQUE login | | M5 | Multi-party Groups | **Complete** | N > 2 members, Commit fan-out, send --all, epoch sync | | M6 | Persistence | **Complete** | SQLite/SQLCipher, migrations, durable server + client state | -| M7 | Post-quantum | **Next** | PQ hybrid for MLS/HPKE (X25519 + ML-KEM-768) | +| M7 | Post-quantum | **Complete** | PQ hybrid for MLS/HPKE (X25519 + ML-KEM-768) | --- @@ -129,14 +129,13 @@ optional follow-ups. **Goal:** Server survives restart. Client state persists across sessions. -**Deliverables:** SQLite/SQLCipher via rusqlite, `migrations/` directory and -migration runner; client state file and DiskKeyStore (encrypted QPCE optional). -See [Future Research: SQLCipher](future-research.md#storage--persistence) for -encrypted-at-rest options. +**Deliverables:** SQLCipher via rusqlite (bundled-sqlcipher feature), `migrations/` +directory and migration runner; client state file and DiskKeyStore with +Argon2id key derivation and ChaCha20-Poly1305 encryption at rest. --- -## M7 -- Post-quantum (Next) +## M7 -- Post-quantum (Complete) **Goal:** Replace the MLS crypto backend with a hybrid X25519 + ML-KEM-768 KEM, providing post-quantum confidentiality for all group key material. 
diff --git a/docs/src/roadmap/phase2-and-m4-m6.md b/docs/src/roadmap/phase2-and-m4-m6.md index 0181ce4..17a4f72 100644 --- a/docs/src/roadmap/phase2-and-m4-m6.md +++ b/docs/src/roadmap/phase2-and-m4-m6.md @@ -36,14 +36,14 @@ The following legacy behaviour has been removed; only current behaviour is suppo | Deliverable | Status | |-------------|--------| -| `create-group` | Planned | -| `invite ` | Planned | -| `join` | Planned | -| `send ` | Planned | -| `recv` | Planned | -| Keep `demo-group` | Existing | +| `create-group` | **Complete** | +| `invite ` | **Complete** | +| `join` | **Complete** | +| `send ` | **Complete** | +| `recv` | **Complete** | +| Keep `demo-group` | **Complete** | -See [Milestones](milestones.md#m4--group-cli-subcommands-next). +See [Milestones](milestones.md#m4--group-cli-subcommands-complete). --- @@ -53,10 +53,10 @@ See [Milestones](milestones.md#m4--group-cli-subcommands-next). | Deliverable | Status | |-------------|--------| -| Commit fan-out via DS | Planned | -| Proposal handling (Add, Remove, Update) | Planned | -| Epoch sync across N members | Planned | -| Benchmarks | Planned | +| Commit fan-out via DS | **Complete** | +| Proposal handling (Add, Remove, Update) | **Complete** | +| Epoch sync across N members | **Complete** | +| Benchmarks | **Complete** | --- @@ -66,10 +66,10 @@ See [Milestones](milestones.md#m4--group-cli-subcommands-next). 
| Deliverable | Status | |-------------|--------| -| SQLite/SQLCipher (AS + DS) | Partial (SqlStore exists) | -| `migrations/` | Planned | -| Client reconnect + session resume | Planned | -| Docker + healthcheck | Partial (Dockerfile exists) | +| SQLCipher (AS + DS) | **Complete** | +| `migrations/` | **Complete** | +| Client reconnect + session resume | **Complete** | +| Docker + healthcheck | **Complete** | --- diff --git a/docs/src/roadmap/production-readiness.md b/docs/src/roadmap/production-readiness.md index 380bc55..96c8d27 100644 --- a/docs/src/roadmap/production-readiness.md +++ b/docs/src/roadmap/production-readiness.md @@ -44,7 +44,7 @@ how they are enforced in code. ### Transport Policy - TLS 1.3 only (`rustls` configured with `TLS13` cipher suites exclusively). -- ALPN token `b"capnp"` required; reject connections with mismatched ALPN. +- ALPN token `b"qpq"` required; reject connections with mismatched ALPN. - Self-signed certificates acceptable for development; production deployments must use a CA-signed certificate or certificate pinning. - Connection draining on shutdown (QUIC `CONNECTION_CLOSE`). @@ -60,7 +60,7 @@ how they are enforced in code. ### Input Validation -- All incoming Cap'n Proto messages validated against schema before processing. +- All incoming Protobuf messages validated against schema before processing. - Maximum payload size: 5 MB per RPC call. - Group ID, identity key, and channel ID fields validated for correct length (32 bytes, 32 bytes, 16 bytes respectively). @@ -101,7 +101,7 @@ how they are enforced in code. - Integration tests for every RPC method. - Negative tests: malformed input, expired tokens, wrong identity, replay attempts. - N-1 compatibility tests (old client against new server). -- Fuzzing targets for Cap'n Proto parsers and MLS message handling (Phase 5). +- Fuzzing targets for Protobuf parsers and MLS message handling (Phase 5). --- @@ -125,10 +125,10 @@ how they are enforced in code. 
| Task | Description | |------|-------------| -| Wire versioning | Add `version` field to all Cap'n Proto structs; reject unknown versions | +| Wire versioning | Version field in all Protobuf frames; reject unknown versions | | Ciphersuite allowlist | Server rejects KeyPackages outside the allowed set | | Downgrade guards | Prevent epoch rollback; reject Commits with weaker ciphersuites | -| ALPN enforcement | Reject connections without `b"capnp"` ALPN token | +| ALPN enforcement | Reject connections without `b"qpq"` ALPN token | | Connection draining | Graceful QUIC `CONNECTION_CLOSE` on server shutdown | | KeyPackage rotation | Client-side timer to upload fresh KeyPackages before TTL expiry | @@ -172,7 +172,7 @@ See [1:1 Channel Design](dm-channels.md) for the DM-specific design. | Positive E2E tests | Full group lifecycle: register, create, invite, join, send, recv, leave | | Negative E2E tests | Expired tokens, wrong identity, replay, malformed messages | | Compat matrix | N-1 client/server version testing | -| Fuzz targets | `cargo-fuzz` targets for Cap'n Proto parsers, MLS message handlers | +| Fuzz targets | `cargo-fuzz` targets for Protobuf parsers, MLS message handlers | | Golden-wire fixtures | Serialised test vectors for regression testing across versions | ### Phase 6 -- Reliability, Performance, and Operations diff --git a/docs/src/sdk/index.md b/docs/src/sdk/index.md new file mode 100644 index 0000000..20a76c2 --- /dev/null +++ b/docs/src/sdk/index.md @@ -0,0 +1,64 @@ +# Client SDKs + +This guide covers how to build clients for the quicproquo E2E encrypted messenger +using the official SDKs or by implementing a new one. 
+ +## Official SDKs + +| Language | Location | Transport | Status | +|----------|----------|-----------|--------| +| **Rust** | `crates/quicproquo-client` | QUIC + Protobuf (v2) | Production | +| **Go** | `sdks/go/` | QUIC + Protobuf (v2) | Production | +| **TypeScript** | `sdks/typescript/` | WebSocket bridge + WASM crypto | Production | +| **Python** | `sdks/python/` | QUIC + Protobuf (v2) / Rust FFI | Production | +| **C** | `crates/quicproquo-ffi/` | Rust FFI (synchronous) | Production | +| **Swift** | `sdks/swift/` | C FFI wrapper | In progress | +| **Kotlin** | `sdks/kotlin/` | JNI + C FFI | In progress | +| **Java** | `sdks/java/` | JNI + C FFI | In progress | +| **Ruby** | `sdks/ruby/` | FFI gem | In progress | + +## Architecture Overview + +``` + Client SDK Server + ---------- ------ + +------------+ QUIC/TLS 1.3 +------------+ + | App code | <--------------> | RPC | + | | v2 wire frames | dispatch | + | SDK API | | | + | | [method_id:u16] | handlers | + | Proto | [req_id:u32] | | + | encode/ | [len:u32] | storage | + | decode | [protobuf] | | + | | | | + | QUIC | | QUIC | + | transport | | listener | + +------------+ +------------+ +``` + +Each RPC call opens a new QUIC bidirectional stream. The request and response +use the same 10-byte framing header followed by a protobuf payload. + +## Quick Start + +1. Choose an SDK for your language (see table above). +2. Connect to the server over QUIC (or WebSocket bridge for browsers). +3. Authenticate with OPAQUE (register or login). +4. Upload MLS key packages for E2E encryption. +5. Send and receive encrypted messages. + +## Canonical Schemas + +- **Protobuf** (v2): `proto/qpq/v1/*.proto` -- 14 service definitions + +The protobuf schemas in `proto/qpq/v1/` are the canonical API contract for +the v2 protocol. New SDKs should implement against these definitions. 
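The 10-byte framing header shown in the architecture overview can be sketched as a pair of encode/decode helpers. This is a minimal illustration, not the `quicproquo-rpc` implementation: the names `Frame`, `encode_frame`, and `decode_frame` are hypothetical, and the big-endian byte order is taken from the Wire Format Reference. Only the Rust standard library is used.

```rust
/// Fixed header size: method_id (2 bytes) + req_id (4 bytes) + payload_len (4 bytes).
const HEADER_LEN: usize = 10;

/// Hypothetical in-memory view of one v2 frame (not the real crate's type).
#[derive(Debug, PartialEq, Eq)]
struct Frame {
    method_id: u16,
    req_id: u32,
    payload: Vec<u8>, // protobuf-encoded request or response bytes
}

/// Serialise a frame: header fields in big-endian order, then the payload.
/// Returns None if the payload length does not fit in a u32.
fn encode_frame(frame: &Frame) -> Option<Vec<u8>> {
    let payload_len = u32::try_from(frame.payload.len()).ok()?;
    let mut out = Vec::with_capacity(HEADER_LEN + frame.payload.len());
    out.extend_from_slice(&frame.method_id.to_be_bytes());
    out.extend_from_slice(&frame.req_id.to_be_bytes());
    out.extend_from_slice(&payload_len.to_be_bytes());
    out.extend_from_slice(&frame.payload);
    Some(out)
}

/// Parse a frame from a buffer. Returns None if the buffer is shorter than
/// the header or shorter than the payload length the header declares.
fn decode_frame(buf: &[u8]) -> Option<Frame> {
    if buf.len() < HEADER_LEN {
        return None;
    }
    let method_id = u16::from_be_bytes([buf[0], buf[1]]);
    let req_id = u32::from_be_bytes([buf[2], buf[3], buf[4], buf[5]]);
    let payload_len = u32::from_be_bytes([buf[6], buf[7], buf[8], buf[9]]) as usize;
    let end = HEADER_LEN.checked_add(payload_len)?;
    let payload = buf.get(HEADER_LEN..end)?;
    Some(Frame { method_id, req_id, payload: payload.to_vec() })
}
```

A new SDK would typically write the output of `encode_frame` to a freshly opened bidirectional stream and run `decode_frame` over the bytes read back. A production decoder would presumably also reject declared payload lengths above the server's maximum before allocating.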
+ +## Documentation + +- [Wire Format Reference](wire-format.md) -- v2 QUIC + Protobuf framing and method IDs +- [Rust SDK](rust.md) -- native Rust client using `quicproquo-sdk` +- [Go SDK](../getting-started/go-sdk.md) -- Go client with QUIC transport +- [TypeScript SDK](../getting-started/typescript-sdk.md) -- browser and Node.js client +- [C FFI Bindings](../getting-started/ffi.md) -- C bindings for language integrations +- [WASM Integration](../getting-started/wasm.md) -- WASM crypto for browser clients diff --git a/docs/src/sdk/rust.md b/docs/src/sdk/rust.md new file mode 100644 index 0000000..1f7606e --- /dev/null +++ b/docs/src/sdk/rust.md @@ -0,0 +1,67 @@ +# Rust SDK + +The Rust client is the reference implementation, located in +`crates/quicproquo-client/`. It is built on top of the `quicproquo-sdk` crate, +which provides the high-level v2 API over QUIC + Protobuf. + +## Installation + +Add to your `Cargo.toml`: + +```toml +[dependencies] +quicproquo-sdk = { path = "crates/quicproquo-sdk" } +``` + +## Connection + +```rust +use quicproquo_sdk::QpqClient; + +let client = QpqClient::connect("127.0.0.1:5001", &tls_config).await?; +let health = client.health().await?; +``` + +## CLI Client Usage + +The `quicproquo-client` binary provides a CLI/TUI interface: + +```rust +use quicproquo_client::{cmd_health, cmd_login, cmd_send}; + +// Health check +cmd_health("127.0.0.1:5001", &ca_cert_path, "localhost").await?; + +// Login via OPAQUE +cmd_login( + "127.0.0.1:5001", &ca_cert_path, "localhost", + "alice", "password123", + None, // identity_key_hex + Some(&state_path), // state persistence + None, // state_password +).await?; +``` + +## Key Features + +- Full MLS (RFC 9420) group encryption +- Hybrid post-quantum KEM (X25519 + ML-KEM-768) +- OPAQUE authentication with zeroizing credential storage +- SQLCipher local state with Argon2id key derivation +- Sealed sender metadata protection (`--sealed-sender` flag) +- v2 QUIC + Protobuf transport via the 
`quicproquo-sdk` crate + +## Crate Structure + +| Crate | Purpose | +|-------|---------| +| `quicproquo-core` | Crypto primitives, MLS, hybrid KEM | +| `quicproquo-proto` | Protobuf generated types | +| `quicproquo-rpc` | QUIC RPC framework (framing, dispatch) | +| `quicproquo-sdk` | High-level client SDK (`QpqClient`) | +| `quicproquo-client` | CLI/TUI client application | + +## Related + +- [Wire Format Reference](wire-format.md) -- frame layout and method IDs +- [SDK Overview](index.md) -- all language SDKs diff --git a/docs/src/sdk/wire-format.md b/docs/src/sdk/wire-format.md new file mode 100644 index 0000000..cb7b82d --- /dev/null +++ b/docs/src/sdk/wire-format.md @@ -0,0 +1,210 @@ +# Wire Format Reference + +The quicproquo v2 protocol uses QUIC (RFC 9000) with TLS 1.3 as the transport +layer and Protocol Buffers for message serialization. + +## Connection + +- **Protocol**: QUIC with TLS 1.3 +- **ALPN**: `qpq` +- **Port**: 5001 (default) +- **Certificate**: Server presents a TLS certificate; clients verify against a CA cert + +## Frame Format + +Every RPC request and response is wrapped in a 10-byte binary header: + +``` + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| method_id (u16) | req_id (u32) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| req_id (cont.) | payload_len (u32) | ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +| payload_len (cont.) | protobuf payload ... 
| ++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +``` + +| Field | Type | Bytes | Description | +|-------|------|-------|-------------| +| `method_id` | `u16` | 0-1 | RPC method identifier (network byte order) | +| `req_id` | `u32` | 2-5 | Client-generated request correlation ID (network byte order) | +| `payload_len` | `u32` | 6-9 | Length of the protobuf payload (network byte order) | +| payload | bytes | 10+ | Protobuf-encoded request or response message | + +All multi-byte integers are **big-endian** (network byte order). + +## Stream Model + +Each RPC call uses a **dedicated QUIC bidirectional stream**: + +1. Client opens a new stream. +2. Client sends the request frame and marks end-of-stream. +3. Server reads the request, processes it, and sends the response frame. +4. Server marks end-of-stream. + +This allows concurrent RPCs without head-of-line blocking. + +## Method IDs + +### Auth (100-103) + +| ID | Method | Request | Response | +|----|--------|---------|----------| +| 100 | `OpaqueRegisterStart` | `OpaqueRegisterStartRequest` | `OpaqueRegisterStartResponse` | +| 101 | `OpaqueRegisterFinish` | `OpaqueRegisterFinishRequest` | `OpaqueRegisterFinishResponse` | +| 102 | `OpaqueLoginStart` | `OpaqueLoginStartRequest` | `OpaqueLoginStartResponse` | +| 103 | `OpaqueLoginFinish` | `OpaqueLoginFinishRequest` | `OpaqueLoginFinishResponse` | + +### Delivery (200-205) + +| ID | Method | Request | Response | +|----|--------|---------|----------| +| 200 | `Enqueue` | `EnqueueRequest` | `EnqueueResponse` | +| 201 | `Fetch` | `FetchRequest` | `FetchResponse` | +| 202 | `FetchWait` | `FetchWaitRequest` | `FetchWaitResponse` | +| 203 | `Peek` | `PeekRequest` | `PeekResponse` | +| 204 | `Ack` | `AckRequest` | `AckResponse` | +| 205 | `BatchEnqueue` | `BatchEnqueueRequest` | `BatchEnqueueResponse` | + +### Keys (300-304) + +| ID | Method | Request | Response | +|----|--------|---------|----------| +| 300 | `UploadKeyPackage` | `UploadKeyPackageRequest` 
| `UploadKeyPackageResponse` | +| 301 | `FetchKeyPackage` | `FetchKeyPackageRequest` | `FetchKeyPackageResponse` | +| 302 | `UploadHybridKey` | `UploadHybridKeyRequest` | `UploadHybridKeyResponse` | +| 303 | `FetchHybridKey` | `FetchHybridKeyRequest` | `FetchHybridKeyResponse` | +| 304 | `FetchHybridKeys` | `FetchHybridKeysRequest` | `FetchHybridKeysResponse` | + +### Channel (400) + +| ID | Method | Request | Response | +|----|--------|---------|----------| +| 400 | `CreateChannel` | `CreateChannelRequest` | `CreateChannelResponse` | + +### Group Management (410-413) + +| ID | Method | Request | Response | +|----|--------|---------|----------| +| 410 | `RemoveMember` | `RemoveMemberRequest` | `RemoveMemberResponse` | +| 411 | `UpdateGroupMetadata` | `UpdateGroupMetadataRequest` | `UpdateGroupMetadataResponse` | +| 412 | `ListGroupMembers` | `ListGroupMembersRequest` | `ListGroupMembersResponse` | +| 413 | `RotateKeys` | `RotateKeysRequest` | `RotateKeysResponse` | + +### User (500-501) + +| ID | Method | Request | Response | +|----|--------|---------|----------| +| 500 | `ResolveUser` | `ResolveUserRequest` | `ResolveUserResponse` | +| 501 | `ResolveIdentity` | `ResolveIdentityRequest` | `ResolveIdentityResponse` | + +### Blob (600-601) + +| ID | Method | Request | Response | +|----|--------|---------|----------| +| 600 | `UploadBlob` | `UploadBlobRequest` | `UploadBlobResponse` | +| 601 | `DownloadBlob` | `DownloadBlobRequest` | `DownloadBlobResponse` | + +### Device (700-702) + +| ID | Method | Request | Response | +|----|--------|---------|----------| +| 700 | `RegisterDevice` | `RegisterDeviceRequest` | `RegisterDeviceResponse` | +| 701 | `ListDevices` | `ListDevicesRequest` | `ListDevicesResponse` | +| 702 | `RevokeDevice` | `RevokeDeviceRequest` | `RevokeDeviceResponse` | + +### P2P / Health (800-802) + +| ID | Method | Request | Response | +|----|--------|---------|----------| +| 800 | `PublishEndpoint` | `PublishEndpointRequest` | `PublishEndpointResponse` 
| +| 801 | `ResolveEndpoint` | `ResolveEndpointRequest` | `ResolveEndpointResponse` | +| 802 | `Health` | `HealthRequest` | `HealthResponse` | + +### Federation (900-905) + +| ID | Method | Request | Response | +|----|--------|---------|----------| +| 900 | `RelayEnqueue` | `RelayEnqueueRequest` | `RelayEnqueueResponse` | +| 901-905 | Reserved | -- | -- | + +### Account (950) + +| ID | Method | Request | Response | +|----|--------|---------|----------| +| 950 | `DeleteAccount` | `DeleteAccountRequest` | `DeleteAccountResponse` | + +## Protobuf Definitions + +All message types are defined in `proto/qpq/v1/*.proto`: + +| File | Services | +|------|----------| +| `auth.proto` | OPAQUE registration and login | +| `common.proto` | Auth context, account deletion | +| `delivery.proto` | Message enqueue, fetch, peek, ack | +| `keys.proto` | MLS key packages, hybrid keys | +| `channel.proto` | Channel creation | +| `user.proto` | User/identity resolution | +| `group.proto` | Group management | +| `blob.proto` | Binary object storage | +| `device.proto` | Multi-device management | +| `p2p.proto` | P2P endpoints, health | +| `federation.proto` | Cross-server relay | +| `push.proto` | Push notifications | +| `recovery.proto` | Account recovery | +| `moderation.proto` | Content moderation | + +## Authentication Flow + +Authentication uses the OPAQUE protocol (asymmetric PAKE): + +``` +Client Server + | | + | OpaqueRegisterStart(username, | + | registration_request) | + | ---------------------------------->| + | | + | OpaqueRegisterStartResponse( | + | registration_response) | + | <----------------------------------| + | | + | OpaqueRegisterFinish(username, | + | upload, identity_key) | + | ---------------------------------->| + | | + | OpaqueRegisterFinishResponse( | + | success) | + | <----------------------------------| + | | + | OpaqueLoginStart(username, | + | login_request) | + | ---------------------------------->| + | | + | OpaqueLoginStartResponse( | + | login_response) 
| + | <----------------------------------| + | | + | OpaqueLoginFinish(username, | + | finalization, identity_key) | + | ---------------------------------->| + | | + | OpaqueLoginFinishResponse( | + | session_token) | + | <----------------------------------| +``` + +The `session_token` returned on login is passed in subsequent authenticated RPCs. + +## Error Handling + +The server returns protobuf-encoded error responses on the same stream. Error +conditions include: + +- Invalid method ID: stream reset +- Authentication failure: error response with details +- Rate limiting: error response with retry-after hint +- Internal errors: generic error response diff --git a/docs/src/wire-format/auth-schema.md b/docs/src/wire-format/auth-schema.md index b9060aa..fe1f295 100644 --- a/docs/src/wire-format/auth-schema.md +++ b/docs/src/wire-format/auth-schema.md @@ -1,149 +1,215 @@ # Auth Schema -**Schema file:** `schemas/auth.capnp` -**File ID:** `@0xb3a8f1c2e4d97650` +**Proto file:** `proto/qpq/v1/auth.proto` +**Package:** `qpq.v1` +**Method IDs:** 100-103 -The `AuthenticationService` interface defines the RPC contract for uploading and fetching MLS KeyPackages. It is the standalone version of the Authentication Service; in the current architecture, these methods are integrated into the unified [NodeService](node-service-schema.md) interface. +The auth proto defines the OPAQUE asymmetric password-authenticated key exchange (PAKE) messages used for user registration and login. OPAQUE never transmits the password to the server; the server learns only a random value derived from the password. + +Registration is a two-round-trip flow (start + finish). Login is a two-round-trip flow (start + finish). On successful login, the server returns a `session_token` used to authenticate subsequent RPCs. + +See [Authentication Service Internals](../internals/authentication-service.md) for the server-side implementation and the full flow diagram. 
--- -## Full schema listing +## Full proto listing -```capnp -# auth.capnp -- Authentication Service RPC interface. -# -# Clients call uploadKeyPackage before joining any group so that peers can -# fetch their key material to add them. Each KeyPackage is single-use (MLS -# requirement): fetchKeyPackage removes and returns one package atomically. -# -# The server indexes packages by the raw Ed25519 public key bytes (32 bytes), -# not a fingerprint, so callers must know the target's identity public key -# out-of-band (e.g. from a directory or QR code scan). -# -# ID generated with: capnp id -@0xb3a8f1c2e4d97650; +```protobuf +syntax = "proto3"; +package qpq.v1; -interface AuthenticationService { - # Upload a single-use KeyPackage for later retrieval by peers. - # - # identityKey : Ed25519 public key bytes (exactly 32 bytes). - # package : openmls-serialised KeyPackage blob (TLS encoding). - # - # Returns the SHA-256 fingerprint of `package`. Clients should record this - # and compare it against the fingerprint returned by a peer's fetchKeyPackage - # to detect tampering. - uploadKeyPackage @0 (identityKey :Data, package :Data) -> (fingerprint :Data); +// OPAQUE registration + login (4 methods). +// Method IDs: 100-103. - # Fetch and atomically remove one KeyPackage for a given identity key. - # - # Returns empty Data if no KeyPackage is currently stored for this identity. - # Callers should handle the empty case by asking the target to upload more - # packages before retrying. 
- fetchKeyPackage @1 (identityKey :Data) -> (package :Data); +message OpaqueRegisterStartRequest { + string username = 1; + bytes request = 2; +} + +message OpaqueRegisterStartResponse { + bytes response = 1; +} + +message OpaqueRegisterFinishRequest { + string username = 1; + bytes upload = 2; + bytes identity_key = 3; +} + +message OpaqueRegisterFinishResponse { + bool success = 1; +} + +message OpaqueLoginStartRequest { + string username = 1; + bytes request = 2; +} + +message OpaqueLoginStartResponse { + bytes response = 1; +} + +message OpaqueLoginFinishRequest { + string username = 1; + bytes finalization = 2; + bytes identity_key = 3; +} + +message OpaqueLoginFinishResponse { + bytes session_token = 1; } ``` --- -## Method-by-method analysis +## Registration flow (IDs 100-101) -### `uploadKeyPackage @0` +User registration takes two round trips. The `request` and `response` fields carry opaque OPAQUE protocol blobs; their internal structure is defined by the `opaque-ke` crate. + +### OpaqueRegisterStart (ID 100) ``` -uploadKeyPackage (identityKey :Data, package :Data) -> (fingerprint :Data) +Client Server + | | + | OpaqueRegisterStartRequest | + | username: "alice" | + | request: | + | -----------------------------> | + | | + | OpaqueRegisterStartResponse | + | response: | + | <----------------------------- | ``` -**Purpose:** A client uploads a single-use MLS KeyPackage so that peers can later fetch it to add the client to a group. +**Request fields:** -**Parameters:** +| Field | Type | Description | +|-------|------|-------------| +| `username` | `string` | The username being registered. Must be unique on the server. | +| `request` | `bytes` | OPAQUE `RegistrationRequest` blob generated by the client using the `opaque-ke` crate. | -| Parameter | Type | Size | Description | -|---|---|---|---| -| `identityKey` | `Data` | Exactly 32 bytes | The uploader's raw Ed25519 public key bytes. This is the index key under which the package is stored. 
| -| `package` | `Data` | Variable (bounded by transport max) | An openmls-serialised KeyPackage blob in TLS encoding. Contains the client's HPKE init key, credential, and signature. | +**Response fields:** -**Return value:** +| Field | Type | Description | +|-------|------|-------------| +| `response` | `bytes` | OPAQUE `RegistrationResponse` blob generated by the server. Client feeds this into the finish step. | -| Field | Type | Size | Description | -|---|---|---|---| -| `fingerprint` | `Data` | 32 bytes | SHA-256 digest of the uploaded `package` bytes. | - -**Fingerprint semantics:** The returned fingerprint allows the uploading client to verify that the server stored the package correctly. More importantly, when a peer later fetches a KeyPackage, it can compare the fetched package's SHA-256 hash against the fingerprint (communicated out-of-band) to detect tampering by a malicious server. - -**Idempotency:** Uploading the same package twice appends a second copy to the queue. The server does not deduplicate. Clients should avoid uploading duplicates to conserve their KeyPackage supply. - -### `fetchKeyPackage @1` +### OpaqueRegisterFinish (ID 101) ``` -fetchKeyPackage (identityKey :Data) -> (package :Data) +Client Server + | | + | OpaqueRegisterFinishRequest | + | username: "alice" | + | upload: | + | identity_key: <32 bytes> | + | -----------------------------> | + | | + | OpaqueRegisterFinishResponse | + | success: true | + | <----------------------------- | ``` -**Purpose:** Fetch and atomically remove one KeyPackage for a given identity. This is the mechanism by which a group creator obtains a peer's key material in order to add them to a group via MLS `add_members()`. +**Request fields:** -**Parameters:** +| Field | Type | Description | +|-------|------|-------------| +| `username` | `string` | Must match the username from the start request. | +| `upload` | `bytes` | OPAQUE `RegistrationUpload` blob. 
The server stores this as the user's OPAQUE record; it contains the password-derived key material without revealing the password. | +| `identity_key` | `bytes` | The user's Ed25519 identity public key (32 bytes). Stored alongside the OPAQUE record and used as the user's long-term identifier for key packages and delivery queues. | -| Parameter | Type | Size | Description | -|---|---|---|---| -| `identityKey` | `Data` | Exactly 32 bytes | The raw Ed25519 public key of the target peer whose KeyPackage is being requested. | +**Response fields:** -**Return value:** - -| Field | Type | Size | Description | -|---|---|---|---| -| `package` | `Data` | Variable, or 0 bytes | The fetched KeyPackage blob, or empty `Data` if no packages are stored for this identity. | - -**Atomic removal:** The fetch operation is destructive: it removes the returned KeyPackage from the server's store in the same operation that returns it. This guarantees MLS's single-use requirement -- a KeyPackage is never served to two different requesters. - -**Empty response handling:** Callers must check for an empty response. An empty `package` means the target has no KeyPackages available. The caller should either: -1. Retry after a delay, hoping the target uploads more packages. -2. Signal the user that the target is unreachable for group addition. +| Field | Type | Description | +|-------|------|-------------| +| `success` | `bool` | `true` if the registration record was stored successfully. `false` if the username is already taken or another error occurred. | --- -## Indexing by raw Ed25519 public key +## Login flow (IDs 102-103) -The Authentication Service indexes KeyPackages by the **raw 32-byte Ed25519 public key**, not by a fingerprint or any higher-level identifier. This design choice has several implications: +User login also takes two round trips. On success, the server issues a `session_token` that the client attaches to subsequent authenticated RPCs. -1. 
**No directory service required for lookup.** The caller must already know the target's Ed25519 public key (obtained out-of-band via QR code scan, manual exchange, or a future directory service). +### OpaqueLoginStart (ID 102) -2. **Consistent with DS indexing.** The [Delivery Service](delivery-schema.md) uses the same 32-byte Ed25519 key as its queue index, so a single key serves as the universal identifier across both services. +``` +Client Server + | | + | OpaqueLoginStartRequest | + | username: "alice" | + | request: | + | -----------------------------> | + | | + | OpaqueLoginStartResponse | + | response: | + | <----------------------------- | +``` -3. **No ambiguity.** Unlike fingerprints (which could collide if truncated) or human-readable names (which require a mapping layer), the raw public key is the canonical, collision-resistant identifier. +**Request fields:** + +| Field | Type | Description | +|-------|------|-------------| +| `username` | `string` | The username logging in. | +| `request` | `bytes` | OPAQUE `CredentialRequest` blob generated by the client. | + +**Response fields:** + +| Field | Type | Description | +|-------|------|-------------| +| `response` | `bytes` | OPAQUE `CredentialResponse` blob. Contains the server's masked public key and envelope for the client to derive its export key. | + +### OpaqueLoginFinish (ID 103) + +``` +Client Server + | | + | OpaqueLoginFinishRequest | + | username: "alice" | + | finalization: | + | identity_key: <32 bytes> | + | -----------------------------> | + | | + | OpaqueLoginFinishResponse | + | session_token: <32 bytes> | + | <----------------------------- | +``` + +**Request fields:** + +| Field | Type | Description | +|-------|------|-------------| +| `username` | `string` | Must match the username from the start request. | +| `finalization` | `bytes` | OPAQUE `CredentialFinalization` blob containing the client's proof of knowledge of the password. 
The server verifies this against its stored OPAQUE record. | +| `identity_key` | `bytes` | The user's Ed25519 identity public key (32 bytes). The server verifies this matches the key registered during `OpaqueRegisterFinish`. | + +**Response fields:** + +| Field | Type | Description | +|-------|------|-------------| +| `session_token` | `bytes` | Opaque bearer token (32 bytes). Included in subsequent RPC requests to authenticate the session. The server associates this token with the user's identity and device. | + +If login fails (wrong password, unknown username, or identity key mismatch), the server returns an error status in the response frame; the `session_token` field is empty. --- -## Single-use semantics +## Session token usage -MLS requires that each KeyPackage be used at most once to preserve the forward secrecy of the initial key exchange. The Authentication Service enforces this by atomically removing the KeyPackage on fetch. +After a successful `OpaqueLoginFinish`, the client uses the `session_token` as a bearer credential for all authenticated RPC methods. The token is passed at the QUIC connection level (not per-frame); the server validates it on connection establishment and maintains the association for the lifetime of the connection. -**Consequences for clients:** +The `Auth` message in `common.proto` carries the token for federation and internal use: -- Clients should **pre-upload multiple KeyPackages** after generating their identity, so that several peers can add them to groups concurrently without exhausting the supply. -- Clients should **monitor their KeyPackage count** on the server (via a future monitoring endpoint or periodic re-upload) and replenish when the supply runs low. -- If a client has zero KeyPackages stored, it is effectively unreachable for new group invitations until it uploads more. 
- -For the design rationale behind single-use KeyPackages, see [ADR-005: Single-Use KeyPackages](../design-rationale/adr-005-single-use-keypackages.md). - ---- - -## Relationship to NodeService - -In the current unified architecture, the Authentication Service methods are exposed as part of the [NodeService interface](node-service-schema.md): - -| AuthenticationService Method | NodeService Method | Additional Parameters | -|---|---|---| -| `uploadKeyPackage @0` | `uploadKeyPackage @0` | `auth :Auth` | -| `fetchKeyPackage @1` | `fetchKeyPackage @1` | `auth :Auth` | - -The standalone `AuthenticationService` interface remains in the schema for documentation purposes and for use in contexts where the full NodeService is not needed. +```protobuf +message Auth { + bytes access_token = 1; + bytes device_id = 2; +} +``` --- ## Further reading -- [Wire Format Overview](overview.md) -- serialisation pipeline context -- [NodeService Schema](node-service-schema.md) -- unified interface that subsumes AuthenticationService -- [Delivery Schema](delivery-schema.md) -- the companion service for message routing -- [Envelope Schema](envelope-schema.md) -- legacy framing that used `keyPackageUpload`/`keyPackageFetch` message types -- [ADR-005: Single-Use KeyPackages](../design-rationale/adr-005-single-use-keypackages.md) -- design rationale for atomic removal on fetch -- [ADR-004: MLS-Unaware Delivery Service](../design-rationale/adr-004-mls-unaware-ds.md) -- why the server does not inspect MLS content +- [Wire Format Overview](overview.md) -- frame format and transport parameters +- [Method ID Reference](envelope-schema.md) -- all 44 method IDs +- [Authentication Service Internals](../internals/authentication-service.md) -- server-side OPAQUE flow and session management +- [RPC Reference](node-service-schema.md) -- all proto definitions diff --git a/docs/src/wire-format/delivery-schema.md b/docs/src/wire-format/delivery-schema.md index 80d12a6..5ce4510 100644 --- 
a/docs/src/wire-format/delivery-schema.md +++ b/docs/src/wire-format/delivery-schema.md @@ -1,193 +1,377 @@ -# Delivery Schema +# Delivery and Keys Schema -**Schema file:** `schemas/delivery.capnp` -**File ID:** `@0xc5d9e2b4f1a83076` +**Proto files:** `proto/qpq/v1/delivery.proto`, `proto/qpq/v1/keys.proto` +**Package:** `qpq.v1` +**Method IDs:** 200-205 (delivery), 300-304 (key packages and hybrid keys), 510-520 (key transparency) -The `DeliveryService` interface defines the RPC contract for the store-and-forward message relay. The DS is intentionally MLS-unaware: it routes opaque byte strings by recipient key and optional channel ID without parsing or inspecting the content. +This page documents the Protobuf message definitions for the delivery service (store-and-forward message relay) and the key management service (MLS KeyPackages, hybrid post-quantum keys, and key transparency). --- -## Full schema listing +## delivery.proto -```capnp -# delivery.capnp -- Delivery Service RPC interface. -# -# The Delivery Service is a simple store-and-forward relay. It does not parse -# MLS messages -- all payloads are opaque byte strings routed by recipient key. -# -# Callers are responsible for: -# - Routing Welcome messages to the correct new member after add_members(). -# - Routing Commit messages to any existing group members (other than self). -# - Routing Application messages to the intended recipient(s). -# -# The DS indexes queues by the recipient's raw Ed25519 public key (32 bytes), -# matching the indexing scheme used by the Authentication Service. -# -# ID generated with: capnp id -@0xc5d9e2b4f1a83076; +The delivery service is a store-and-forward relay. It is intentionally MLS-unaware: all payloads are opaque byte strings routed by recipient key and channel ID. The server never inspects or decrypts message content. -interface DeliveryService { - # Enqueue an opaque payload for delivery to a recipient. 
- # - # recipientKey : Ed25519 public key of the intended recipient (exactly 32 bytes). - # payload : Opaque byte string -- a TLS-encoded MlsMessageOut blob or any - # other framed data the application layer wants to deliver. - # channelId : Optional channel identifier (empty for legacy). A 16-byte UUID - # is recommended for 1:1 channels. - # version : Schema/wire version. Must be 0 (legacy) or 1 (this spec). - # - # The payload is appended to the recipient's FIFO queue. Returns immediately; - # the recipient retrieves it via `fetch`. - enqueue @0 (recipientKey :Data, payload :Data, channelId :Data, version :UInt16) -> (); +### Full proto listing - # Fetch and atomically drain all queued payloads for a given recipient. - # - # recipientKey : Ed25519 public key of the caller (exactly 32 bytes). - # channelId : Optional channel identifier (empty for legacy). - # version : Schema/wire version. Must be 0 (legacy) or 1 (this spec). - # - # Returns the complete queue in FIFO order and clears it. Returns an empty - # list if there are no pending messages. - fetch @1 (recipientKey :Data, channelId :Data, version :UInt16) -> (payloads :List(Data)); +```protobuf +syntax = "proto3"; +package qpq.v1; + +// Delivery service: enqueue, fetch, peek, ack, batch (6 methods). +// Method IDs: 200-205. + +message Envelope { + uint64 seq = 1; + bytes data = 2; +} + +message EnqueueRequest { + bytes recipient_key = 1; + bytes payload = 2; + bytes channel_id = 3; + uint32 ttl_secs = 4; + // Client-generated idempotency key (16 bytes, UUID v7). + // Server deduplicates enqueue requests with the same message_id within a TTL window. + bytes message_id = 5; +} + +message EnqueueResponse { + uint64 seq = 1; + bytes delivery_proof = 2; + // True if this was a duplicate enqueue (message_id already seen). + bool duplicate = 3; +} + +message FetchRequest { + bytes recipient_key = 1; + bytes channel_id = 2; + uint32 limit = 3; + // Device ID for multi-device scoping. 
+ bytes device_id = 4; +} + +message FetchResponse { + repeated Envelope payloads = 1; +} + +message FetchWaitRequest { + bytes recipient_key = 1; + bytes channel_id = 2; + uint64 timeout_ms = 3; + uint32 limit = 4; + bytes device_id = 5; +} + +message FetchWaitResponse { + repeated Envelope payloads = 1; +} + +message PeekRequest { + bytes recipient_key = 1; + bytes channel_id = 2; + uint32 limit = 3; + bytes device_id = 4; +} + +message PeekResponse { + repeated Envelope payloads = 1; +} + +message AckRequest { + bytes recipient_key = 1; + bytes channel_id = 2; + uint64 seq_up_to = 3; + bytes device_id = 4; +} + +message AckResponse {} + +message BatchEnqueueRequest { + repeated bytes recipient_keys = 1; + bytes payload = 2; + bytes channel_id = 3; + uint32 ttl_secs = 4; + bytes message_id = 5; +} + +message BatchEnqueueResponse { + repeated uint64 seqs = 1; } ``` ---- +### Envelope -## Method-by-method analysis - -### `enqueue @0` - -``` -enqueue (recipientKey :Data, payload :Data, channelId :Data, version :UInt16) -> () -``` - -**Purpose:** Append an opaque payload to a recipient's delivery queue. The DS stores the payload until the recipient fetches it. The call returns immediately after the payload is enqueued; it does not block until delivery. - -**Parameters:** - -| Parameter | Type | Size | Description | -|---|---|---|---| -| `recipientKey` | `Data` | Exactly 32 bytes | Ed25519 public key of the intended recipient. Used as the primary queue index. | -| `payload` | `Data` | Variable (bounded by transport max) | Opaque byte string. Typically a TLS-encoded `MlsMessageOut` blob, but the DS does not inspect it. | -| `channelId` | `Data` | 0 bytes (legacy) or 16 bytes (UUID) | Channel identifier for channel-aware routing. Empty `Data` is treated as the legacy default channel. | -| `version` | `UInt16` | 2 bytes | Schema/wire version. `0` = legacy (no channel routing), `1` = current spec (channel-aware). | - -**Return value:** Void. 
The method returns `()` on success. Errors are surfaced as Cap'n Proto RPC exceptions. - -**Queue semantics:** Payloads are appended in FIFO order. The DS does not deduplicate, reorder, or inspect payloads. Multiple enqueue calls for the same recipient and channel ID are simply appended to the queue in the order they arrive. - -### `fetch @1` - -``` -fetch (recipientKey :Data, channelId :Data, version :UInt16) -> (payloads :List(Data)) -``` - -**Purpose:** Fetch and atomically drain all queued payloads for a given recipient on a given channel. This is the "pull" side of the store-and-forward relay. - -**Parameters:** - -| Parameter | Type | Size | Description | -|---|---|---|---| -| `recipientKey` | `Data` | Exactly 32 bytes | Ed25519 public key of the caller. Must match the key used in the enqueue calls. | -| `channelId` | `Data` | 0 bytes (legacy) or 16 bytes (UUID) | Channel identifier. Must match the `channelId` used during enqueue. | -| `version` | `UInt16` | 2 bytes | Schema/wire version. Must match the version used during enqueue. | - -**Return value:** +The `Envelope` wrapper is returned by fetch, peek, and fetch-wait operations. | Field | Type | Description | -|---|---|---| -| `payloads` | `List(Data)` | All queued payloads in FIFO order. Empty list if no messages are pending. | +|-------|------|-------------| +| `seq` | `uint64` | Server-assigned monotonic sequence number for ordering and acknowledgment. | +| `data` | `bytes` | The original payload bytes submitted at enqueue time. | -**Atomic drain:** The fetch operation returns the entire queue and clears it in a single atomic operation. There is no "peek" or partial fetch. This simplifies the concurrency model: the client processes all returned payloads and does not need to track which ones it has already seen. +### Enqueue (ID 200) + +Appends an opaque payload to a recipient's queue. Returns immediately. 
+ +**Request:** + +| Field | Type | Description | +|-------|------|-------------| +| `recipient_key` | `bytes` | Recipient's Ed25519 identity public key (32 bytes). Primary queue index. | +| `payload` | `bytes` | Opaque byte string. Typically a TLS-encoded MLS ciphertext blob. | +| `channel_id` | `bytes` | Channel identifier (16-byte UUID v7 recommended). Empty = default channel. | +| `ttl_secs` | `uint32` | Time-to-live in seconds. Server garbage-collects expired messages. 0 = server default. | +| `message_id` | `bytes` | Client-generated idempotency key (16 bytes, UUID v7). Server deduplicates within the TTL window. | + +**Response:** + +| Field | Type | Description | +|-------|------|-------------| +| `seq` | `uint64` | Server-assigned sequence number for this message. | +| `delivery_proof` | `bytes` | Cryptographic proof of delivery (reserved for future use). | +| `duplicate` | `bool` | `true` if this `message_id` was already seen within the TTL window; the payload was not stored again. | + +### Fetch (ID 201) + +Returns and retains queued messages up to `limit`. Does not remove messages from the queue; use `Ack` to advance the read cursor. + +**Request:** + +| Field | Type | Description | +|-------|------|-------------| +| `recipient_key` | `bytes` | Recipient's Ed25519 identity public key (32 bytes). | +| `channel_id` | `bytes` | Channel identifier. Must match the value used at enqueue time. | +| `limit` | `uint32` | Maximum number of envelopes to return. 0 = server default. | +| `device_id` | `bytes` | Optional device identifier for multi-device queue scoping. | + +**Response:** + +| Field | Type | Description | +|-------|------|-------------| +| `payloads` | `repeated Envelope` | Messages in FIFO order. Empty list if no messages are pending. | + +### FetchWait (ID 202) + +Long-poll variant of `Fetch`. Blocks on the server until messages arrive or `timeout_ms` elapses. 
+ +**Request:** + +| Field | Type | Description | +|-------|------|-------------| +| `recipient_key` | `bytes` | Recipient's Ed25519 identity public key (32 bytes). | +| `channel_id` | `bytes` | Channel identifier. | +| `timeout_ms` | `uint64` | Maximum wait time in milliseconds. 0 = return immediately (equivalent to Fetch). | +| `limit` | `uint32` | Maximum number of envelopes to return. | +| `device_id` | `bytes` | Optional device identifier. | + +**Response:** Same as `FetchResponse`. + +FetchWait eliminates polling latency: the server holds the RPC open until a `Notify` is signalled by a concurrent `Enqueue` call, or until `timeout_ms` expires. + +### Peek (ID 203) + +Non-destructive read. Returns messages without removing them and without advancing the acknowledgment cursor. + +**Request / Response:** Same field layout as `FetchRequest` / `FetchResponse`. + +Peek is useful for inspecting pending messages without marking them as delivered. + +### Ack (ID 204) + +Advances the delivery cursor, removing all messages with `seq <= seq_up_to` from the queue. + +**Request:** + +| Field | Type | Description | +|-------|------|-------------| +| `recipient_key` | `bytes` | Recipient's Ed25519 identity public key (32 bytes). | +| `channel_id` | `bytes` | Channel identifier. | +| `seq_up_to` | `uint64` | All messages with sequence number <= this value are removed. | +| `device_id` | `bytes` | Optional device identifier. | + +**Response:** Empty (`AckResponse {}`). + +### BatchEnqueue (ID 205) + +Fan-out: enqueues the same payload to multiple recipients in a single RPC call. + +**Request:** + +| Field | Type | Description | +|-------|------|-------------| +| `recipient_keys` | `repeated bytes` | List of recipient Ed25519 identity public keys. | +| `payload` | `bytes` | Opaque payload, delivered identically to all recipients. | +| `channel_id` | `bytes` | Channel identifier. | +| `ttl_secs` | `uint32` | Time-to-live in seconds. 
| +| `message_id` | `bytes` | Idempotency key (16 bytes). | + +**Response:** + +| Field | Type | Description | +|-------|------|-------------| +| `seqs` | `repeated uint64` | Server-assigned sequence numbers, one per `recipient_key`, in the same order. | --- -## Channel-aware routing +## keys.proto -The `channelId` field enables per-channel queue separation. Each unique `(recipientKey, channelId)` pair maps to an independent FIFO queue on the server. +Key management for MLS KeyPackages, hybrid post-quantum keys, and key transparency audit. -### Compound key structure +### Full proto listing -```text -Queue Key = recipientKey (32 bytes) || channelId (0 or 16 bytes) +```protobuf +syntax = "proto3"; +package qpq.v1; + +// Key package + hybrid key CRUD (5 methods). +// Method IDs: 300-304. + +message UploadKeyPackageRequest { + bytes identity_key = 1; + bytes package = 2; +} + +message UploadKeyPackageResponse { + bytes fingerprint = 1; +} + +message FetchKeyPackageRequest { + bytes identity_key = 1; +} + +message FetchKeyPackageResponse { + bytes package = 1; +} + +message UploadHybridKeyRequest { + bytes identity_key = 1; + bytes hybrid_public_key = 2; +} + +message UploadHybridKeyResponse {} + +message FetchHybridKeyRequest { + bytes identity_key = 1; +} + +message FetchHybridKeyResponse { + bytes hybrid_public_key = 1; +} + +message FetchHybridKeysRequest { + repeated bytes identity_keys = 1; +} + +message FetchHybridKeysResponse { + repeated bytes keys = 1; +} + +// Key revocation (method ID 510). +message RevokeKeyRequest { + bytes identity_key = 1; + string reason = 2; // "compromised", "superseded", "user_revoked" +} + +message RevokeKeyResponse { + bool success = 1; + uint64 leaf_index = 2; +} + +// Check revocation status (method ID 511). +message CheckRevocationRequest { + bytes identity_key = 1; +} + +message CheckRevocationResponse { + bool revoked = 1; + string reason = 2; + uint64 timestamp_ms = 3; +} + +// KT audit log retrieval (method ID 520). 
+message AuditKeyTransparencyRequest { + uint64 start = 1; + uint64 end = 2; +} + +message AuditKeyTransparencyResponse { + repeated LogEntry entries = 1; + uint64 tree_size = 2; + bytes root = 3; +} + +message LogEntry { + uint64 index = 1; + bytes leaf_hash = 2; +} ``` -When `channelId` is empty (0 bytes), the queue key degenerates to just the `recipientKey`, preserving backward compatibility with legacy clients that do not use channels. +### UploadKeyPackage (ID 300) -### Channel ID format +Uploads a single-use MLS KeyPackage. KeyPackages are stored in a FIFO queue per identity; each is consumed once by `FetchKeyPackage`. -The recommended format for `channelId` is a 16-byte UUID (128-bit, typically UUID v4). The DS treats the channel ID as an opaque byte string and does not parse its structure. Using UUIDs provides: +| Field | Type | Description | +|-------|------|-------------| +| `identity_key` | `bytes` | Uploader's Ed25519 identity public key (32 bytes). Index key for the queue. | +| `package` | `bytes` | openmls-serialised KeyPackage (bincode format, as required by `DiskKeyStore`). | -1. **Collision resistance** -- 2^122 random bits (for UUID v4) makes accidental collision negligible. -2. **Privacy** -- The channel ID reveals no information about the channel's participants or purpose. -3. **Fixed size** -- 16 bytes is compact and predictable for indexing. +Response: `fingerprint` -- SHA-256 digest of the stored package (32 bytes). Callers should record this to detect tampering. 
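The fingerprint check described above can be sketched as follows. This is a minimal illustration, assuming the fingerprint is a plain SHA-256 digest over the exact `package` bytes sent in the request. The helper name `fingerprint_matches` and the sample bytes are hypothetical and not part of any quicproquo SDK.

```python
import hashlib
import hmac


def fingerprint_matches(package: bytes, fingerprint: bytes) -> bool:
    """Return True if `fingerprint` is the SHA-256 digest of `package`.

    Assumption: the server hashes the package bytes exactly as uploaded.
    """
    expected = hashlib.sha256(package).digest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected, fingerprint)


# Hypothetical stand-in for a serialised KeyPackage and the server's reply.
sample_package = b"example-key-package-bytes"
server_fingerprint = hashlib.sha256(sample_package).digest()
```

A client could run a check like this right after the upload response arrives and refuse to proceed if it fails.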
-### Use cases +### FetchKeyPackage (ID 301) -| Scenario | channelId | recipientKey | Result | -|---|---|---|---| -| Legacy client, no channels | Empty (0 bytes) | Alice's Ed25519 key | Single queue for all of Alice's messages | -| 1:1 channel between Alice and Bob | UUID of the 1:1 channel | Alice's Ed25519 key | Separate queue for this specific channel | -| Group channel | UUID of the group channel | Alice's Ed25519 key | Separate queue for this group's messages to Alice | +Fetches and atomically removes one KeyPackage for the given identity. Returns empty bytes if no packages are stored. The removal is atomic; concurrent fetches will not receive the same package. ---- +### UploadHybridKey (ID 302) -## Version field +Uploads the client's hybrid (X25519 + ML-KEM-768) public key. Unlike KeyPackages, hybrid keys are not single-use -- each identity stores exactly one, overwriting the previous value. -The `version` field provides a mechanism for wire-level schema evolution without breaking existing clients. +| Field | Type | Description | +|-------|------|-------------| +| `identity_key` | `bytes` | Uploader's Ed25519 identity public key (32 bytes). | +| `hybrid_public_key` | `bytes` | Concatenated X25519 public key (32 bytes) + ML-KEM-768 encapsulation key. | -| Version | Semantics | -|---|---| -| `0` | Legacy mode. `channelId` is ignored (treated as empty). Behaves like the pre-channel DeliveryService. | -| `1` | Current specification. `channelId` is used for channel-aware routing. | +### FetchHybridKey (ID 303) -The server validates the version field and rejects unknown versions as protocol errors. Clients must set the version field to match the schema revision they implement. +Fetches a single peer's hybrid public key. Non-destructive. ---- +### FetchHybridKeys (ID 304) -## FIFO queue semantics +Batch variant of `FetchHybridKey`. Returns one key per input identity key, in the same order. Missing keys are returned as empty bytes at the corresponding index. 
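Because `FetchHybridKeys` returns keys positionally, a caller has to pair each result with the identity key it asked for. The sketch below shows one way to do that, treating empty bytes as "no key stored". The function name `pair_hybrid_keys` and the sample values are hypothetical; only the ordering and empty-bytes conventions come from the description above.

```python
from typing import Dict, List, Optional


def pair_hybrid_keys(
    identity_keys: List[bytes], keys: List[bytes]
) -> Dict[bytes, Optional[bytes]]:
    """Map each requested identity key to its hybrid key, or None if missing."""
    if len(identity_keys) != len(keys):
        # The response should contain exactly one entry per requested identity.
        raise ValueError("response length does not match request length")
    # Empty bytes at an index mean no hybrid key is stored for that identity.
    return {ik: (k if k else None) for ik, k in zip(identity_keys, keys)}


# Hypothetical request and response values for illustration only.
requested = [b"alice-identity", b"bob-identity"]
returned = [b"alice-hybrid-key", b""]
paired = pair_hybrid_keys(requested, returned)
```

Raising on a length mismatch is a design choice in this sketch: a short or long response would otherwise silently attach keys to the wrong identities.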
-The Delivery Service provides strict FIFO ordering within each `(recipientKey, channelId)` queue: +### RevokeKey (ID 510) -1. **Enqueue order is preserved.** Payloads are returned by `fetch` in the exact order they were enqueued. -2. **Atomic drain.** Each `fetch` call returns all pending payloads and clears the queue. There is no risk of partial reads or interleaving. -3. **No persistence guarantees (current implementation).** The in-memory queue is lost on server restart. Persistent storage is planned for a future milestone. -4. **No redelivery.** Once a payload is returned by `fetch`, it is permanently removed. If the client crashes before processing it, the payload is lost. Reliable delivery with acknowledgments is a future enhancement. +Revokes an identity key by appending a revocation entry to the key transparency Merkle log. ---- +| Field | Type | Description | +|-------|------|-------------| +| `identity_key` | `bytes` | Identity key to revoke (32 bytes). | +| `reason` | `string` | One of: `"compromised"`, `"superseded"`, `"user_revoked"`. | -## MLS-unaware design +Response: `leaf_index` is the index of the revocation entry in the KT Merkle log. -The DS intentionally does not parse, validate, or inspect MLS messages. All payloads are opaque `Data` blobs. This design has several consequences: +### CheckRevocation (ID 511) -- **Security:** The server cannot extract plaintext from MLS ciphertext, even if compromised. -- **Simplicity:** The DS has no dependency on openmls or any MLS library. -- **Flexibility:** The same DS can carry non-MLS payloads (e.g., signaling, metadata) without modification. -- **No server-side optimization:** The DS cannot optimize delivery based on MLS message type (e.g., fanning out a Commit to all group members). The client must enqueue separately for each recipient. +Checks whether an identity key has been revoked. 
-For the full design rationale, see [ADR-004: MLS-Unaware Delivery Service](../design-rationale/adr-004-mls-unaware-ds.md). +Response fields: `revoked` (bool), `reason` (string), `timestamp_ms` (uint64 unix milliseconds of the revocation event). ---- +### AuditKeyTransparency (ID 520) -## Relationship to NodeService +Returns a range of entries from the key transparency append-only Merkle log. -In the current unified architecture, the Delivery Service methods are exposed as part of the [NodeService interface](node-service-schema.md) with additional methods: +| Field | Type | Description | +|-------|------|-------------| +| `start` | `uint64` | First leaf index (inclusive). | +| `end` | `uint64` | Last leaf index (exclusive). 0 = up to current tree size. | -| DeliveryService Method | NodeService Method | Additional Parameters | -|---|---|---| -| `enqueue @0` | `enqueue @2` | `auth :Auth` | -| `fetch @1` | `fetch @3` | `auth :Auth` | -| *(none)* | `fetchWait @4` | `auth :Auth`, `timeoutMs :UInt64` | - -The `fetchWait` method is a NodeService extension that provides long-polling semantics: it blocks until either new payloads arrive or the timeout expires. This avoids the latency and bandwidth overhead of repeated `fetch` polling. +Response: `entries` (list of `LogEntry`), `tree_size` (current log size), `root` (Merkle root hash). 
--- ## Further reading -- [Wire Format Overview](overview.md) -- serialisation pipeline context -- [NodeService Schema](node-service-schema.md) -- unified interface that subsumes DeliveryService -- [Auth Schema](auth-schema.md) -- the companion service for KeyPackage management -- [Envelope Schema](envelope-schema.md) -- legacy framing that used `mlsWelcome`/`mlsCommit`/`mlsApplication` message types -- [ADR-004: MLS-Unaware Delivery Service](../design-rationale/adr-004-mls-unaware-ds.md) -- why the DS does not inspect MLS content +- [Wire Format Overview](overview.md) -- frame format and transport parameters +- [Method ID Reference](envelope-schema.md) -- all 44 method IDs +- [Auth Schema](auth-schema.md) -- OPAQUE authentication proto definitions +- [RPC Reference](node-service-schema.md) -- all proto definitions for all 14 files +- [Storage Backend](../internals/storage-backend.md) -- how KeyPackages and hybrid keys are persisted diff --git a/docs/src/wire-format/envelope-schema.md b/docs/src/wire-format/envelope-schema.md index 55863ce..7f8de4f 100644 --- a/docs/src/wire-format/envelope-schema.md +++ b/docs/src/wire-format/envelope-schema.md @@ -1,149 +1,208 @@ -# Envelope Schema +# Method ID Reference -**Schema file:** `schemas/envelope.capnp` -**File ID:** `@0xe4a7f2c8b1d63509` +The v2 RPC protocol dispatches requests by a `u16` method ID encoded in the first two bytes of every request frame. This page is the authoritative reference for all 44 method IDs and their corresponding Protobuf message types. -The Envelope is the legacy top-level wire message used in M1 for all quicproquo traffic. Every frame exchanged between peers was serialised as an Envelope, with the Delivery Service routing by `(groupId, msgType)` without inspecting the payload. - -> **Note:** The Envelope is the M1-era framing format. The current M3+ architecture uses Cap'n Proto RPC directly via the [NodeService](node-service-schema.md) interface. 
The Envelope schema remains in the codebase for backward compatibility and for use in integration tests. +Method IDs are defined in `crates/quicproquo-proto/src/lib.rs` (the `method_ids` module). Proto definitions live in `proto/qpq/v1/`. --- -## Full schema listing +## Auth (100-103) -```capnp -# envelope.capnp -- top-level wire message for all quicproquo traffic. -# -# Every frame is serialised as an Envelope. -# The Delivery Service routes by (groupId, msgType) without inspecting payload. -# -# Field sizing rationale: -# groupId / senderId : 32 bytes -- SHA-256 digest -# payload : opaque -- MLS blob or control data -# timestampMs : UInt64 -- unix epoch milliseconds; sufficient until year 292M -# -# ID generated with: capnp id -@0xe4a7f2c8b1d63509; +OPAQUE asymmetric password-authenticated key exchange. Registration and login each take two round trips (start + finish). See [Auth Schema](auth-schema.md) for proto definitions and [Authentication Service](../internals/authentication-service.md) for flow diagrams. -struct Envelope { - # Message type discriminant -- determines how payload is interpreted. - msgType @0 :MsgType; - - # 32-byte SHA-256 digest of the group name. - # The Delivery Service uses this as its routing key. - # Zero-filled for point-to-point control messages (ping, keyPackageUpload, etc.). - groupId @1 :Data; - - # 32-byte SHA-256 digest of the sender's Ed25519 identity public key. - senderId @2 :Data; - - # Opaque payload. Interpretation is determined by msgType. - payload @3 :Data; - - # Unix timestamp in milliseconds at the time of send.
- timestampMs @4 :UInt64; - - enum MsgType { - ping @0; - pong @1; - keyPackageUpload @2; - keyPackageFetch @3; - keyPackageResponse @4; - mlsWelcome @5; - mlsCommit @6; - mlsApplication @7; - error @8; - } -} -``` +| ID | Constant | Request | Response | +|----|----------|---------|---------| +| 100 | `OPAQUE_REGISTER_START` | `OpaqueRegisterStartRequest` | `OpaqueRegisterStartResponse` | +| 101 | `OPAQUE_REGISTER_FINISH` | `OpaqueRegisterFinishRequest` | `OpaqueRegisterFinishResponse` | +| 102 | `OPAQUE_LOGIN_START` | `OpaqueLoginStartRequest` | `OpaqueLoginStartResponse` | +| 103 | `OPAQUE_LOGIN_FINISH` | `OpaqueLoginFinishRequest` | `OpaqueLoginFinishResponse` | --- -## Field-by-field analysis +## Delivery (200-205) -### `msgType @0 :MsgType` +Store-and-forward message relay. The server is MLS-unaware: payloads are opaque byte strings routed by recipient key and channel ID. See [Delivery Schema](delivery-schema.md) for proto definitions. -A 16-bit enum discriminant (Cap'n Proto enums are encoded as UInt16). Determines how the `payload` field should be interpreted. The discriminant is the first field in the struct for efficient dispatch: a router can read the first two bytes of the struct section to decide how to handle the message without parsing any pointer fields. - -### `groupId @1 :Data` - -A 32-byte `Data` field containing the SHA-256 digest of the group name. The Delivery Service uses this as its primary routing key when the Envelope-based protocol is active. - -**Sizing rationale:** SHA-256 produces a 32-byte (256-bit) digest. This is stored as a variable-length `Data` field rather than a fixed-size blob because Cap'n Proto does not have a fixed-size array type. Implementations must validate that the field contains exactly 32 bytes. - -**Special case:** For point-to-point control messages (`ping`, `pong`, `keyPackageUpload`, `keyPackageFetch`), the `groupId` is zero-filled (32 zero bytes) because these messages are not associated with any group. 
- -### `senderId @2 :Data` - -A 32-byte `Data` field containing the SHA-256 digest of the sender's Ed25519 identity public key. This allows the receiver to identify the sender without inspecting the MLS-layer credentials. - -**Sizing rationale:** Same as `groupId` -- SHA-256 digest, 32 bytes. - -### `payload @3 :Data` - -An opaque byte string whose interpretation depends on `msgType`. - -### `timestampMs @4 :UInt64` - -Unix epoch timestamp in milliseconds, set by the sender at the time of send. Encoded as a `UInt64`, which provides sufficient range until approximately year 292,000,000 -- effectively unlimited for practical purposes. - -The timestamp is sender-asserted and **not** authenticated by the server. Receivers should treat it as advisory (for display ordering) rather than authoritative. +| ID | Constant | Request | Response | +|----|----------|---------|---------| +| 200 | `ENQUEUE` | `EnqueueRequest` | `EnqueueResponse` | +| 201 | `FETCH` | `FetchRequest` | `FetchResponse` | +| 202 | `FETCH_WAIT` | `FetchWaitRequest` | `FetchWaitResponse` | +| 203 | `PEEK` | `PeekRequest` | `PeekResponse` | +| 204 | `ACK` | `AckRequest` | `AckResponse` | +| 205 | `BATCH_ENQUEUE` | `BatchEnqueueRequest` | `BatchEnqueueResponse` | --- -## MsgType enum +## Keys (300-304) -The `MsgType` enum defines nine message types. Each variant determines how the `payload` field is interpreted: +MLS KeyPackage and hybrid post-quantum key management. See [Delivery Schema](delivery-schema.md) for proto definitions (keys are defined in `keys.proto`). 
-| Ordinal | Variant | Payload Contents | Direction | -|---|---|---|---| -| 0 | `ping` | Empty | Client -> Server or Peer -> Peer | -| 1 | `pong` | Empty | Server -> Client or Peer -> Peer | -| 2 | `keyPackageUpload` | openmls-serialised KeyPackage blob (TLS encoding) | Client -> Server | -| 3 | `keyPackageFetch` | Target identity key (32 bytes, raw Ed25519 public key) | Client -> Server | -| 4 | `keyPackageResponse` | openmls-serialised KeyPackage blob, or empty if none stored | Server -> Client | -| 5 | `mlsWelcome` | `MLSMessage` blob (Welcome variant) | Peer -> Peer (via DS) | -| 6 | `mlsCommit` | `MLSMessage` blob (PublicMessage / Commit variant) | Peer -> Group (via DS) | -| 7 | `mlsApplication` | `MLSMessage` blob (PrivateMessage / Application variant) | Peer -> Group (via DS) | -| 8 | `error` | UTF-8 error description string | Any direction | - -### Control messages (0-1) - -`ping` and `pong` are keepalive probes with empty payloads. They serve as health checks over long-lived connections. - -### Authentication messages (2-4) - -`keyPackageUpload`, `keyPackageFetch`, and `keyPackageResponse` implement the Authentication Service protocol over the Envelope format. In the current architecture, these operations are handled by the [NodeService RPC](node-service-schema.md) methods `uploadKeyPackage` and `fetchKeyPackage` instead. - -### MLS messages (5-7) - -`mlsWelcome`, `mlsCommit`, and `mlsApplication` carry MLS protocol messages as opaque blobs. The Envelope does not inspect or validate the MLS content; it simply transports the bytes between peers via the Delivery Service. - -### Error messages (8) - -`error` carries a UTF-8 string describing an error condition. Used for protocol-level error reporting (e.g., "no KeyPackage found for identity"). 
+| ID | Constant | Request | Response | +|----|----------|---------|---------| +| 300 | `UPLOAD_KEY_PACKAGE` | `UploadKeyPackageRequest` | `UploadKeyPackageResponse` | +| 301 | `FETCH_KEY_PACKAGE` | `FetchKeyPackageRequest` | `FetchKeyPackageResponse` | +| 302 | `UPLOAD_HYBRID_KEY` | `UploadHybridKeyRequest` | `UploadHybridKeyResponse` | +| 303 | `FETCH_HYBRID_KEY` | `FetchHybridKeyRequest` | `FetchHybridKeyResponse` | +| 304 | `FETCH_HYBRID_KEYS` | `FetchHybridKeysRequest` | `FetchHybridKeysResponse` | --- -## Relationship to NodeService +## Channel (400) -The Envelope schema was the original M1 wire format. With the transition to QUIC + TLS 1.3 and Cap'n Proto RPC in M3, the Envelope's role has been superseded by the [NodeService interface](node-service-schema.md), which provides typed RPC methods for each operation. +Direct-message channel creation. Returns a deterministic channel ID for a given peer key pair, with deduplication. -The key differences: +| ID | Constant | Request | Response | +|----|----------|---------|---------| +| 400 | `CREATE_CHANNEL` | `CreateChannelRequest` | `CreateChannelResponse` | -| Aspect | Envelope (M1) | NodeService RPC (M3+) | -|---|---|---| -| Dispatch | Manual, based on `msgType` enum | Automatic, Cap'n Proto RPC method dispatch | -| Type safety | Payload is opaque `Data` | Each method has typed parameters and return values | -| Transport | QUIC + TLS 1.3 | QUIC + TLS 1.3 | -| Auth | None | Explicit `Auth` struct per method call | +--- + +## Group Management (410-413) + +MLS group operations: member removal, metadata updates, member listing, and key rotation. 
+ +| ID | Constant | Request | Response | +|----|----------|---------|---------| +| 410 | `REMOVE_MEMBER` | `RemoveMemberRequest` | `RemoveMemberResponse` | +| 411 | `UPDATE_GROUP_METADATA` | `UpdateGroupMetadataRequest` | `UpdateGroupMetadataResponse` | +| 412 | `LIST_GROUP_MEMBERS` | `ListGroupMembersRequest` | `ListGroupMembersResponse` | +| 413 | `ROTATE_KEYS` | `RotateKeysRequest` | `RotateKeysResponse` | + +--- + +## Moderation (420-424) + +Content moderation: encrypted reports, user bans, and audit lists. Admin-only methods require elevated session privileges. + +| ID | Constant | Request | Response | +|----|----------|---------|---------| +| 420 | `REPORT_MESSAGE` | `ReportMessageRequest` | `ReportMessageResponse` | +| 421 | `BAN_USER` | `BanUserRequest` | `BanUserResponse` | +| 422 | `UNBAN_USER` | `UnbanUserRequest` | `UnbanUserResponse` | +| 423 | `LIST_REPORTS` | `ListReportsRequest` | `ListReportsResponse` | +| 424 | `LIST_BANNED` | `ListBannedRequest` | `ListBannedResponse` | + +--- + +## User / Identity (500-501) + +Forward and reverse user resolution. `ResolveUser` returns the identity key with a key-transparency inclusion proof. + +| ID | Constant | Request | Response | +|----|----------|---------|---------| +| 500 | `RESOLVE_USER` | `ResolveUserRequest` | `ResolveUserResponse` | +| 501 | `RESOLVE_IDENTITY` | `ResolveIdentityRequest` | `ResolveIdentityResponse` | + +--- + +## Key Transparency (510-520) + +Key revocation and audit log access for the Merkle-based key transparency log. + +| ID | Constant | Request | Response | +|----|----------|---------|---------| +| 510 | `REVOKE_KEY` | `RevokeKeyRequest` | `RevokeKeyResponse` | +| 511 | `CHECK_REVOCATION` | `CheckRevocationRequest` | `CheckRevocationResponse` | +| 520 | `AUDIT_KEY_TRANSPARENCY` | `AuditKeyTransparencyRequest` | `AuditKeyTransparencyResponse` | + +--- + +## Blob Storage (600-601) + +Content-addressed binary object storage with chunked upload and ranged download. 
+ +| ID | Constant | Request | Response | +|----|----------|---------|---------| +| 600 | `UPLOAD_BLOB` | `UploadBlobRequest` | `UploadBlobResponse` | +| 601 | `DOWNLOAD_BLOB` | `DownloadBlobRequest` | `DownloadBlobResponse` | + +--- + +## Device Management (700-702, 710) + +Multi-device registration, listing, and revocation. Method 710 registers a platform push notification token. + +| ID | Constant | Request | Response | +|----|----------|---------|---------| +| 700 | `REGISTER_DEVICE` | `RegisterDeviceRequest` | `RegisterDeviceResponse` | +| 701 | `LIST_DEVICES` | `ListDevicesRequest` | `ListDevicesResponse` | +| 702 | `REVOKE_DEVICE` | `RevokeDeviceRequest` | `RevokeDeviceResponse` | +| 710 | `REGISTER_PUSH_TOKEN` | `RegisterPushTokenRequest` | `RegisterPushTokenResponse` | + +--- + +## Recovery (750-752) + +Encrypted account recovery bundle storage. The server stores an opaque blob indexed by `SHA-256(recovery_token)`; the plaintext is never visible to the server. + +| ID | Constant | Request | Response | +|----|----------|---------|---------| +| 750 | `STORE_RECOVERY_BUNDLE` | `StoreRecoveryBundleRequest` | `StoreRecoveryBundleResponse` | +| 751 | `FETCH_RECOVERY_BUNDLE` | `FetchRecoveryBundleRequest` | `FetchRecoveryBundleResponse` | +| 752 | `DELETE_RECOVERY_BUNDLE` | `DeleteRecoveryBundleRequest` | `DeleteRecoveryBundleResponse` | + +--- + +## P2P / Health (800-802) + +iroh P2P node address exchange and server health probe. + +| ID | Constant | Request | Response | +|----|----------|---------|---------| +| 800 | `PUBLISH_ENDPOINT` | `PublishEndpointRequest` | `PublishEndpointResponse` | +| 801 | `RESOLVE_ENDPOINT` | `ResolveEndpointRequest` | `ResolveEndpointResponse` | +| 802 | `HEALTH` | `HealthRequest` | `HealthResponse` | + +--- + +## Federation (900-905) + +Cross-server relay for messages, key packages, and user resolution. All federation methods include a `FederationAuth` struct carrying the origin server domain. 
+
+| ID | Constant | Request | Response |
+|----|----------|---------|---------|
+| 900 | `RELAY_ENQUEUE` | `RelayEnqueueRequest` | `RelayEnqueueResponse` |
+| 901 | `RELAY_BATCH_ENQUEUE` | `RelayBatchEnqueueRequest` | `RelayBatchEnqueueResponse` |
+| 902 | `PROXY_FETCH_KEY_PACKAGE` | `ProxyFetchKeyPackageRequest` | `ProxyFetchKeyPackageResponse` |
+| 903 | `PROXY_FETCH_HYBRID_KEY` | `ProxyFetchHybridKeyRequest` | `ProxyFetchHybridKeyResponse` |
+| 904 | `PROXY_RESOLVE_USER` | `ProxyResolveUserRequest` | `ProxyResolveUserResponse` |
+| 905 | `FEDERATION_HEALTH` | `FederationHealthRequest` | `FederationHealthResponse` |
+
+---
+
+## Account (950)
+
+| ID | Constant | Request | Response |
+|----|----------|---------|---------|
+| 950 | `DELETE_ACCOUNT` | `DeleteAccountRequest` | `DeleteAccountResponse` |
+
+---
+
+## Push Event Types (1000+)
+
+Push events are sent by the server on QUIC uni-streams using the push frame format. They are not RPC methods (no `request_id`), but share the same event type namespace.
+
+| ID | Constant | Payload |
+|----|----------|---------|
+| 1000 | `PUSH_NEW_MESSAGE` | `NewMessage` |
+| 1001 | `PUSH_TYPING` | `TypingIndicator` |
+| 1002 | `PUSH_PRESENCE` | `PresenceUpdate` |
+| 1003 | `PUSH_MEMBERSHIP` | `GroupMembershipChange` |
+
+Push payload messages are defined in `proto/qpq/v1/push.proto` and wrapped in a `PushEvent` oneof. See [RPC Reference](node-service-schema.md) for the full proto listing.
+
+---
+
+## Method ID assignment policy
+
+Method IDs are stable across versions. Once assigned, an ID is never reused. New methods are assigned the next available ID in their logical category. Gaps in the numbering are reserved for future use within a category.
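
The range-per-category layout in the tables above can be sketched as a lookup from numeric ID to category. This is a minimal illustration only: `Category` and `category_of` are hypothetical names, not taken from the quicproquo source, and only the ranges listed in this part of the table (410 and up) are covered. IDs that fall in a reserved gap, such as 512, map to `None`.

```rust
/// Category of a v2 method or push event ID.
/// Hypothetical type for illustration; not part of the quicproquo crates.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Group,
    Moderation,
    User,
    KeyTransparency,
    Blob,
    Device,
    Recovery,
    P2pHealth,
    Federation,
    Account,
    Push,
}

/// Map an ID to its category using the ranges from the method ID table.
/// Unassigned IDs (reserved gaps or unknown values) return `None`.
pub fn category_of(id: u16) -> Option<Category> {
    match id {
        410..=413 => Some(Category::Group),
        420..=424 => Some(Category::Moderation),
        500..=501 => Some(Category::User),
        510..=511 | 520 => Some(Category::KeyTransparency),
        600..=601 => Some(Category::Blob),
        700..=702 | 710 => Some(Category::Device),
        750..=752 => Some(Category::Recovery),
        800..=802 => Some(Category::P2pHealth),
        900..=905 => Some(Category::Federation),
        950 => Some(Category::Account),
        1000..=1003 => Some(Category::Push),
        _ => None,
    }
}
```

Returning `None` for gaps keeps reserved IDs distinguishable from assigned ones, which matches the rule that an ID is never reused once assigned.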
 ---
 
 ## Further reading
 
-- [Wire Format Overview](overview.md) -- serialisation pipeline context
-- [NodeService Schema](node-service-schema.md) -- the current RPC interface that replaced Envelope-based dispatch
-- [Auth Schema](auth-schema.md) -- standalone Authentication Service interface
-- [Delivery Schema](delivery-schema.md) -- standalone Delivery Service interface
-- [ADR-002: Cap'n Proto over MessagePack](../design-rationale/adr-002-capnproto.md) -- why Cap'n Proto was chosen for the wire format
+- [Wire Format Overview](overview.md) -- frame format and transport parameters
+- [Auth Schema](auth-schema.md) -- OPAQUE proto definitions (IDs 100-103)
+- [Delivery Schema](delivery-schema.md) -- delivery + keys proto definitions (IDs 200-304)
+- [RPC Reference](node-service-schema.md) -- all proto definitions for all 14 files
diff --git a/docs/src/wire-format/node-service-schema.md b/docs/src/wire-format/node-service-schema.md
index d55b085..533e2a4 100644
--- a/docs/src/wire-format/node-service-schema.md
+++ b/docs/src/wire-format/node-service-schema.md
@@ -1,275 +1,788 @@
-# NodeService Schema
+# RPC Reference
 
-**Schema file:** `schemas/node.capnp`
-**File ID:** `@0xd5ca5648a9cc1c28`
+**Proto package:** `qpq.v1`
+**Proto files:** 14 files in `proto/qpq/v1/`
+**Total methods:** 44
 
-The `NodeService` interface is the unified Cap'n Proto RPC surface that every quicproquo client talks to. It combines the Authentication Service and Delivery Service into a single interface, adds long-polling support (`fetchWait`), a health probe (`health`), and hybrid KEM key management. Every method that mutates state or accesses per-user data accepts an `Auth` struct for versioned authentication.
+This page is the complete Protobuf definition reference for all 14 proto files in the v2 RPC protocol. For transport framing, see [Wire Format Overview](overview.md). For method ID assignments, see [Method ID Reference](envelope-schema.md).
+ +Generated Rust types live in `crates/quicproquo-proto/src/` (via prost). --- -## Full schema listing +## auth.proto (IDs 100-103) -```capnp -# node.capnp -- Unified quicproquo node RPC interface. -# -# Combines Authentication and Delivery operations into a single service. -# -# ID generated with: capnp id -@0xd5ca5648a9cc1c28; +OPAQUE asymmetric PAKE for registration and login. See [Auth Schema](auth-schema.md) for field-level documentation and flow diagrams. -interface NodeService { - # Upload a single-use KeyPackage for later retrieval by peers. - # identityKey : Ed25519 public key bytes (32 bytes) - # package : TLS-encoded openmls KeyPackage - # auth : Auth context (version=1, non-empty accessToken required). - uploadKeyPackage @0 (identityKey :Data, package :Data, auth :Auth) - -> (fingerprint :Data); +```protobuf +syntax = "proto3"; +package qpq.v1; - # Fetch and atomically remove one KeyPackage for a given identity key. - # Returns empty Data if none are stored. - fetchKeyPackage @1 (identityKey :Data, auth :Auth) -> (package :Data); - - # Enqueue an opaque payload for delivery to a recipient. - # channelId : Optional channel identifier (empty for legacy). A 16-byte UUID - # is recommended for 1:1 channels. - # version : Schema/wire version. Must be 1. - enqueue @2 (recipientKey :Data, payload :Data, channelId :Data, - version :UInt16, auth :Auth) -> (); - - # Fetch and drain all queued payloads for the recipient. - fetch @3 (recipientKey :Data, channelId :Data, version :UInt16, auth :Auth) - -> (payloads :List(Data)); - - # Long-poll: wait up to timeoutMs for new payloads, then drain queue. - fetchWait @4 (recipientKey :Data, channelId :Data, version :UInt16, - timeoutMs :UInt64, auth :Auth) -> (payloads :List(Data)); - - # Health probe for readiness/liveness. - health @5 () -> (status :Text); - - # Upload the hybrid (X25519 + ML-KEM-768) public key for sealed envelope - # encryption. 
- uploadHybridKey @6 (identityKey :Data, hybridPublicKey :Data) -> (); - - # Fetch a peer's hybrid public key (for post-quantum envelope encryption). - fetchHybridKey @7 (identityKey :Data) -> (hybridPublicKey :Data); +message OpaqueRegisterStartRequest { + string username = 1; + bytes request = 2; } -struct Auth { - version @0 :UInt16; # 1 = token-based auth (required) - accessToken @1 :Data; # opaque bearer token issued at login - deviceId @2 :Data; # optional UUID bytes for auditing/rate limiting +message OpaqueRegisterStartResponse { + bytes response = 1; +} + +message OpaqueRegisterFinishRequest { + string username = 1; + bytes upload = 2; + bytes identity_key = 3; +} + +message OpaqueRegisterFinishResponse { + bool success = 1; +} + +message OpaqueLoginStartRequest { + string username = 1; + bytes request = 2; +} + +message OpaqueLoginStartResponse { + bytes response = 1; +} + +message OpaqueLoginFinishRequest { + string username = 1; + bytes finalization = 2; + bytes identity_key = 3; +} + +message OpaqueLoginFinishResponse { + bytes session_token = 1; } ``` --- -## Interface methods +## common.proto -### Authentication methods +Shared types and account deletion. -#### `uploadKeyPackage @0` +```protobuf +syntax = "proto3"; +package qpq.v1; -``` -uploadKeyPackage (identityKey :Data, package :Data, auth :Auth) -> (fingerprint :Data) -``` +// Auth context for federation and internal use. +// In v2, session authentication is carried at the QUIC connection level +// (session token), not per-message. +message Auth { + bytes access_token = 1; + bytes device_id = 2; +} -Uploads a single-use MLS KeyPackage. Identical semantics to the standalone [AuthenticationService](auth-schema.md) method, with the addition of the `auth` parameter for access control. +// Account deletion (ID 950). 
+message DeleteAccountRequest {} -| Parameter | Type | Size | Description | -|---|---|---|---| -| `identityKey` | `Data` | 32 bytes | Uploader's raw Ed25519 public key | -| `package` | `Data` | Variable | TLS-encoded openmls KeyPackage blob | -| `auth` | `Auth` | Struct | Authentication context (see [Auth struct](#auth-struct) below) | - -**Returns:** `fingerprint :Data` -- 32-byte SHA-256 digest of the stored package. - -#### `fetchKeyPackage @1` - -``` -fetchKeyPackage (identityKey :Data, auth :Auth) -> (package :Data) -``` - -Fetches and atomically removes one KeyPackage for the specified identity key. Returns empty `Data` if no packages are stored. See [Auth Schema](auth-schema.md) for full single-use semantics and [ADR-005](../design-rationale/adr-005-single-use-keypackages.md) for the design rationale. - -### Delivery methods - -#### `enqueue @2` - -``` -enqueue (recipientKey :Data, payload :Data, channelId :Data, version :UInt16, auth :Auth) -> () -``` - -Enqueues an opaque payload for delivery. Identical semantics to the standalone [DeliveryService](delivery-schema.md) `enqueue` method, with the addition of the `auth` parameter. - -| Parameter | Type | Size | Description | -|---|---|---|---| -| `recipientKey` | `Data` | 32 bytes | Recipient's raw Ed25519 public key | -| `payload` | `Data` | Variable | Opaque byte string (typically MLS ciphertext) | -| `channelId` | `Data` | 0 or 16 bytes | Channel identifier (empty for legacy, UUID recommended) | -| `version` | `UInt16` | 2 bytes | Wire version: `1` = current (required) | -| `auth` | `Auth` | Struct | Authentication context | - -#### `fetch @3` - -``` -fetch (recipientKey :Data, channelId :Data, version :UInt16, auth :Auth) -> (payloads :List(Data)) -``` - -Fetches and atomically drains all queued payloads for the specified recipient and channel. Returns an empty list if no messages are pending. See [Delivery Schema](delivery-schema.md) for full queue semantics. 
- -#### `fetchWait @4` - -``` -fetchWait (recipientKey :Data, channelId :Data, version :UInt16, timeoutMs :UInt64, auth :Auth) - -> (payloads :List(Data)) -``` - -**Long-polling variant of `fetch`.** This method blocks on the server side until either: - -1. One or more payloads become available in the queue, **or** -2. The `timeoutMs` duration expires. - -In case (1), the method returns all available payloads and drains the queue, identical to `fetch`. In case (2), the method returns an empty list. - -| Parameter | Type | Description | -|---|---|---| -| `timeoutMs` | `UInt64` | Maximum wait time in milliseconds. A value of `0` means return immediately (equivalent to `fetch`). | - -**Why long-polling?** Without `fetchWait`, clients must poll the server at a fixed interval, which wastes bandwidth when no messages are pending and introduces latency equal to half the polling interval on average. Long-polling provides near-real-time delivery while avoiding busy-wait overhead. - -**Server implementation:** The server holds the RPC response open until a payload is enqueued for the recipient or the timeout fires. The underlying mechanism is a `tokio::sync::Notify` per recipient, which is woken by `enqueue`. - -### Infrastructure methods - -#### `health @5` - -``` -health () -> (status :Text) -``` - -A readiness/liveness probe that takes no parameters and returns a human-readable status string (e.g., `"ok"`). This method: - -- Does not require authentication (`auth` is not a parameter). -- Is suitable for use as a Kubernetes or Docker health check endpoint. -- Can be extended in future versions to report more detailed status (e.g., queue depth, uptime). - -### Hybrid KEM methods - -#### `uploadHybridKey @6` - -``` -uploadHybridKey (identityKey :Data, hybridPublicKey :Data) -> () -``` - -Uploads the client's hybrid (X25519 + ML-KEM-768) public key for post-quantum sealed envelope encryption. 
Peers fetch this key to encrypt payloads with post-quantum protection before enqueuing them. - -| Parameter | Type | Description | -|---|---|---| -| `identityKey` | `Data` | Uploader's 32-byte Ed25519 public key (index key) | -| `hybridPublicKey` | `Data` | Concatenated X25519 public key (32 bytes) + ML-KEM-768 encapsulation key | - -#### `fetchHybridKey @7` - -``` -fetchHybridKey (identityKey :Data) -> (hybridPublicKey :Data) -``` - -Fetches a peer's hybrid public key. Unlike `fetchKeyPackage`, this is **not** a destructive operation -- the hybrid key persists across fetches because it is a long-lived public key, not a single-use package. - ---- - -## Auth struct - -```capnp -struct Auth { - version @0 :UInt16; - accessToken @1 :Data; - deviceId @2 :Data; +message DeleteAccountResponse { + bool success = 1; } ``` -The `Auth` struct is attached to every mutating or per-user method call. It provides a versioned authentication context that supports clean schema evolution. - -### Fields - -| Field | Type | Description | -|---|---|---| -| `version` | `UInt16` | Authentication protocol version. Determines how `accessToken` and `deviceId` are interpreted. | -| `accessToken` | `Data` | Opaque bearer token issued at login. The server validates this token against its auth backend. | -| `deviceId` | `Data` | Optional device identifier (UUID bytes). Used for auditing, rate limiting, and per-device session management. | - -### Version semantics - -| Version | Behavior | -|---|---| -| `1` | **Token-based authentication (required).** The server validates `accessToken` (static token or OPAQUE session) and rejects requests with missing or invalid tokens. `deviceId` is used for audit logging. | - -Auth version `0` is no longer supported; clients must send `version=1` and a valid token. +`DeleteAccountRequest` is empty; the server derives the user identity from the authenticated QUIC session. 
On success, all user data is purged: OPAQUE record, identity keys, key packages, hybrid keys, queued deliveries, channel memberships, devices, and recovery bundles. --- -## Method ordinal summary +## delivery.proto (IDs 200-205) -| Ordinal | Method | Category | -|---|---|---| -| `@0` | `uploadKeyPackage` | Auth | -| `@1` | `fetchKeyPackage` | Auth | -| `@2` | `enqueue` | Delivery | -| `@3` | `fetch` | Delivery | -| `@4` | `fetchWait` | Delivery | -| `@5` | `health` | Infrastructure | -| `@6` | `uploadHybridKey` | Auth / PQ | -| `@7` | `fetchHybridKey` | Auth / PQ | -| `@8` | `fetchHybridKeys` | Auth / PQ (batch) | -| `@9` | `opaqueRegisterStart` | Auth / OPAQUE | -| `@10` | `opaqueRegisterFinish` | Auth / OPAQUE | -| `@11` | `opaqueLoginStart` | Auth / OPAQUE | -| `@12` | `opaqueLoginFinish` | Auth / OPAQUE | -| `@13` | `peek` | Delivery (non-destructive read) | -| `@14` | `ack` | Delivery (acknowledge after peek) | -| `@15` | `batchEnqueue` | Delivery (fan-out) | -| `@16` | `createChannel` | Channels | -| `@17` | `resolveUser` | Discovery | -| `@18` | `resolveIdentity` | Discovery (reverse lookup) | -| `@19` | `registerDevice` | Devices | -| `@20` | `listDevices` | Devices | -| `@21` | `uploadBlob` | File transfer | -| `@22` | `downloadBlob` | File transfer | -| `@23` | `deleteAccount` | Account management | -| `@24` | `revokeDevice` | Devices | -| `@25` | `publishEndpoint` | P2P discovery | -| `@26` | `resolveEndpoint` | P2P discovery | +Store-and-forward message relay. See [Delivery Schema](delivery-schema.md) for field-level documentation. -Ordinals are stable and must not be reused. New methods are appended with the next available ordinal. This is a fundamental Cap'n Proto schema evolution rule: removing a method does not free its ordinal. +```protobuf +syntax = "proto3"; +package qpq.v1; -### Notable additions since initial release +message Envelope { + uint64 seq = 1; + bytes data = 2; +} -- **OPAQUE (@9-@12):** Password-authenticated key exchange. 
The password never leaves the client. -- **Channels (@16):** `createChannel` returns `(channelId :Data, wasNew :Bool)` for 1:1 DM creation with deduplication. -- **File transfer (@21-@22):** `uploadBlob` accepts 256 KB chunks with SHA-256 content addressing; `downloadBlob` retrieves chunks with hash verification. Max 50 MB. -- **Account deletion (@23):** Transactional purge of all user data (user record, identity keys, key packages, hybrid keys, queued deliveries, channel memberships). -- **TTL support:** `enqueue` and `batchEnqueue` accept an optional `ttlSecs` parameter for disappearing messages with server-side garbage collection. -- **P2P discovery (@25-@26):** `publishEndpoint` and `resolveEndpoint` for iroh node address exchange. +message EnqueueRequest { + bytes recipient_key = 1; + bytes payload = 2; + bytes channel_id = 3; + uint32 ttl_secs = 4; + bytes message_id = 5; +} + +message EnqueueResponse { + uint64 seq = 1; + bytes delivery_proof = 2; + bool duplicate = 3; +} + +message FetchRequest { + bytes recipient_key = 1; + bytes channel_id = 2; + uint32 limit = 3; + bytes device_id = 4; +} + +message FetchResponse { + repeated Envelope payloads = 1; +} + +message FetchWaitRequest { + bytes recipient_key = 1; + bytes channel_id = 2; + uint64 timeout_ms = 3; + uint32 limit = 4; + bytes device_id = 5; +} + +message FetchWaitResponse { + repeated Envelope payloads = 1; +} + +message PeekRequest { + bytes recipient_key = 1; + bytes channel_id = 2; + uint32 limit = 3; + bytes device_id = 4; +} + +message PeekResponse { + repeated Envelope payloads = 1; +} + +message AckRequest { + bytes recipient_key = 1; + bytes channel_id = 2; + uint64 seq_up_to = 3; + bytes device_id = 4; +} + +message AckResponse {} + +message BatchEnqueueRequest { + repeated bytes recipient_keys = 1; + bytes payload = 2; + bytes channel_id = 3; + uint32 ttl_secs = 4; + bytes message_id = 5; +} + +message BatchEnqueueResponse { + repeated uint64 seqs = 1; +} +``` --- -## Schema evolution 
+## keys.proto (IDs 300-304, 510-520) -Cap'n Proto supports forward-compatible schema evolution through several mechanisms, all of which are used in the NodeService interface: +MLS KeyPackages, hybrid PQ keys, and key transparency. See [Delivery Schema](delivery-schema.md) for field-level documentation. -1. **New methods can be added** by appending with a new ordinal. Old clients ignore unknown methods; new clients can call them. -2. **New struct fields can be added** to `Auth` (or any other struct) by appending with a new field number. Old structs that lack the new field will read the default value. -3. **The `version` field** provides application-level versioning on top of Cap'n Proto's structural versioning, allowing the server to change validation behavior without changing the schema. +```protobuf +syntax = "proto3"; +package qpq.v1; + +message UploadKeyPackageRequest { + bytes identity_key = 1; + bytes package = 2; +} + +message UploadKeyPackageResponse { + bytes fingerprint = 1; +} + +message FetchKeyPackageRequest { + bytes identity_key = 1; +} + +message FetchKeyPackageResponse { + bytes package = 1; +} + +message UploadHybridKeyRequest { + bytes identity_key = 1; + bytes hybrid_public_key = 2; +} + +message UploadHybridKeyResponse {} + +message FetchHybridKeyRequest { + bytes identity_key = 1; +} + +message FetchHybridKeyResponse { + bytes hybrid_public_key = 1; +} + +message FetchHybridKeysRequest { + repeated bytes identity_keys = 1; +} + +message FetchHybridKeysResponse { + repeated bytes keys = 1; +} + +// Key revocation (ID 510). +message RevokeKeyRequest { + bytes identity_key = 1; + string reason = 2; +} + +message RevokeKeyResponse { + bool success = 1; + uint64 leaf_index = 2; +} + +// Check revocation status (ID 511). +message CheckRevocationRequest { + bytes identity_key = 1; +} + +message CheckRevocationResponse { + bool revoked = 1; + string reason = 2; + uint64 timestamp_ms = 3; +} + +// KT audit log retrieval (ID 520). 
+message AuditKeyTransparencyRequest { + uint64 start = 1; + uint64 end = 2; +} + +message AuditKeyTransparencyResponse { + repeated LogEntry entries = 1; + uint64 tree_size = 2; + bytes root = 3; +} + +message LogEntry { + uint64 index = 1; + bytes leaf_hash = 2; +} +``` + +--- + +## channel.proto (ID 400) + +```protobuf +syntax = "proto3"; +package qpq.v1; + +// Channel create (1 method). +// Method ID: 400. + +message CreateChannelRequest { + bytes peer_key = 1; +} + +message CreateChannelResponse { + bytes channel_id = 1; + bool was_new = 2; +} +``` + +`CreateChannel` deterministically derives a channel ID from the calling user's identity key and `peer_key`, so repeated calls with the same peer return the same `channel_id`. `was_new` is `true` on the first creation. + +--- + +## group.proto (IDs 410-413) + +Group management: member removal, metadata, member listing, and key rotation. + +```protobuf +syntax = "proto3"; +package qpq.v1; + +// Group management (4 methods). +// Method IDs: 410-413. + +message RemoveMemberRequest { + bytes group_id = 1; + bytes member_identity_key = 2; +} + +message RemoveMemberResponse { + bytes commit = 1; +} + +message UpdateGroupMetadataRequest { + bytes group_id = 1; + string name = 2; + string description = 3; + bytes avatar_hash = 4; +} + +message UpdateGroupMetadataResponse { + bool success = 1; +} + +message ListGroupMembersRequest { + bytes group_id = 1; +} + +message ListGroupMembersResponse { + repeated GroupMemberInfo members = 1; +} + +message GroupMemberInfo { + bytes identity_key = 1; + string username = 2; + uint64 joined_at = 3; +} + +message RotateKeysRequest { + bytes group_id = 1; +} + +message RotateKeysResponse { + bytes commit = 1; +} + +// Server-side group metadata store. 
+message GroupMetadata {
+  bytes group_id = 1;
+  string name = 2;
+  string description = 3;
+  bytes avatar_hash = 4;
+  bytes creator_key = 5;
+  uint64 created_at = 6;
+}
+```
+
+`RemoveMember` and `RotateKeys` return a `commit` field containing the MLS `Commit` message bytes that the caller must fan-out to the remaining group members via `BatchEnqueue`. `joined_at` in `GroupMemberInfo` is a Unix timestamp in milliseconds.
+
+---
+
+## moderation.proto (IDs 420-424)
+
+Content moderation: encrypted reports, bans, and audit lists.
+
+```protobuf
+syntax = "proto3";
+package qpq.v1;
+
+// Moderation service: report, ban, unban, list reports, list banned.
+// Method IDs: 420-424.
+
+message ReportMessageRequest {
+  bytes encrypted_report = 1;
+  bytes conversation_id = 2;
+}
+
+message ReportMessageResponse {
+  bool accepted = 1;
+}
+
+message BanUserRequest {
+  bytes identity_key = 1;
+  string reason = 2;
+  uint64 duration_secs = 3;
+}
+
+message BanUserResponse {
+  bool success = 1;
+}
+
+message UnbanUserRequest {
+  bytes identity_key = 1;
+}
+
+message UnbanUserResponse {
+  bool success = 1;
+}
+
+message ListReportsRequest {
+  uint32 limit = 1;
+  uint32 offset = 2;
+}
+
+message ReportEntry {
+  uint64 id = 1;
+  bytes encrypted_report = 2;
+  bytes conversation_id = 3;
+  bytes reporter_identity = 4;
+  uint64 timestamp = 5;
+}
+
+message ListReportsResponse {
+  repeated ReportEntry reports = 1;
+}
+
+message ListBannedRequest {}
+
+message BannedUserEntry {
+  bytes identity_key = 1;
+  string reason = 2;
+  uint64 banned_at = 3;
+  uint64 expires_at = 4; // 0 = permanent ban.
+}
+
+message ListBannedResponse {
+  repeated BannedUserEntry users = 1;
+}
+```
+
+`ReportMessageRequest.encrypted_report` is encrypted asymmetrically to the server's admin key; the server cannot read it without the admin private key. `BanUserRequest.duration_secs = 0` is a permanent ban. `BannedUserEntry.expires_at = 0` denotes a permanent ban.
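
The zero-means-permanent convention for `duration_secs` and `expires_at` can be sketched as follows. This is an illustration under stated assumptions, not the server's implementation: the helper names are hypothetical, and the timestamps are assumed to be Unix seconds, which the schema does not specify.

```rust
/// Derive `BannedUserEntry.expires_at` from the ban start time and
/// `BanUserRequest.duration_secs`. A duration of 0 yields 0 (permanent).
/// Assumption: `banned_at` is in Unix seconds; the schema does not say.
pub fn ban_expires_at(banned_at: u64, duration_secs: u64) -> u64 {
    if duration_secs == 0 {
        0
    } else {
        // Saturate rather than wrap on absurdly long durations.
        banned_at.saturating_add(duration_secs)
    }
}

/// A ban is active if it is permanent (`expires_at == 0`) or has not
/// yet reached its expiry time.
pub fn ban_is_active(expires_at: u64, now: u64) -> bool {
    expires_at == 0 || now < expires_at
}
```

Checking for the `0` sentinel before comparing against `now` matters: a plain `now < expires_at` test would treat every permanent ban as already expired.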
+ +--- + +## user.proto (IDs 500-501) + +Forward and reverse user resolution. + +```protobuf +syntax = "proto3"; +package qpq.v1; + +// User resolve + identity (2 methods). +// Method IDs: 500-501. + +message ResolveUserRequest { + string username = 1; +} + +message ResolveUserResponse { + bytes identity_key = 1; + bytes inclusion_proof = 2; +} + +message ResolveIdentityRequest { + bytes identity_key = 1; +} + +message ResolveIdentityResponse { + string username = 1; +} +``` + +`ResolveUser` returns the user's Ed25519 identity key (32 bytes) along with a key-transparency inclusion proof. `ResolveIdentity` is the reverse lookup: given a key, return the username. + +--- + +## blob.proto (IDs 600-601) + +Content-addressed binary object storage with chunked upload and ranged download. + +```protobuf +syntax = "proto3"; +package qpq.v1; + +// Blob upload/download (2 methods). +// Method IDs: 600-601. + +message UploadBlobRequest { + bytes blob_hash = 1; + bytes chunk = 2; + uint64 offset = 3; + uint64 total_size = 4; + string mime_type = 5; +} + +message UploadBlobResponse { + bytes blob_id = 1; +} + +message DownloadBlobRequest { + bytes blob_id = 1; + uint64 offset = 2; + uint32 length = 3; +} + +message DownloadBlobResponse { + bytes chunk = 1; + uint64 total_size = 2; + string mime_type = 3; +} +``` + +`UploadBlob` accepts chunks; callers send multiple requests for large blobs using `offset` + `total_size` for reassembly. `blob_hash` is the SHA-256 of the entire blob (content addressing). `DownloadBlob` supports ranged reads via `offset` + `length`. + +--- + +## device.proto (IDs 700-702, 710) + +Multi-device management and push notification token registration. + +```protobuf +syntax = "proto3"; +package qpq.v1; + +// Device register/list/revoke (3 methods). +// Method IDs: 700-702. 
+ +message RegisterDeviceRequest { + bytes device_id = 1; + string device_name = 2; +} + +message RegisterDeviceResponse { + bool success = 1; +} + +message ListDevicesRequest {} + +message ListDevicesResponse { + repeated Device devices = 1; +} + +message Device { + bytes device_id = 1; + string device_name = 2; + uint64 registered_at = 3; +} + +message RevokeDeviceRequest { + bytes device_id = 1; +} + +message RevokeDeviceResponse { + bool success = 1; +} + +// Push notification token registration (ID 710). +enum PushPlatform { + PUSH_PLATFORM_UNSPECIFIED = 0; + PUSH_PLATFORM_APNS = 1; + PUSH_PLATFORM_FCM = 2; + PUSH_PLATFORM_WEB_PUSH = 3; +} + +message RegisterPushTokenRequest { + bytes device_id = 1; + PushPlatform platform = 2; + string token = 3; +} + +message RegisterPushTokenResponse { + bool success = 1; +} +``` + +`Device.registered_at` is a Unix timestamp in milliseconds. `Device.device_id` is a client-generated UUID (16 bytes). `RegisterPushToken` associates a platform-specific push token with the device for server-initiated push notifications. + +--- + +## recovery.proto (IDs 750-752) + +Encrypted account recovery bundle storage. The server stores an opaque blob indexed by `SHA-256(recovery_token)`. The plaintext recovery token and bundle contents are never visible to the server. + +```protobuf +syntax = "proto3"; +package qpq.v1; + +// Recovery service. +// Method IDs: 750-752. + +message StoreRecoveryBundleRequest { + bytes token_hash = 1; // SHA-256(recovery_token) + bytes bundle = 2; // Encrypted recovery bundle (opaque to server). + uint64 ttl_secs = 3; // Default 90 days = 7776000. +} + +message StoreRecoveryBundleResponse { + bool success = 1; +} + +message FetchRecoveryBundleRequest { + bytes token_hash = 1; // SHA-256(recovery_token) +} + +message FetchRecoveryBundleResponse { + bytes bundle = 1; // Empty if no bundle found. 
+} + +message DeleteRecoveryBundleRequest { + bytes token_hash = 1; // SHA-256(recovery_token) +} + +message DeleteRecoveryBundleResponse { + bool success = 1; +} +``` + +--- + +## p2p.proto (IDs 800-802) + +iroh P2P node address exchange and server health probe. + +```protobuf +syntax = "proto3"; +package qpq.v1; + +// P2P endpoint publish/resolve + health (3 methods). +// Method IDs: 800-802. + +message PublishEndpointRequest { + bytes identity_key = 1; + bytes node_addr = 2; +} + +message PublishEndpointResponse {} + +message ResolveEndpointRequest { + bytes identity_key = 1; +} + +message ResolveEndpointResponse { + bytes node_addr = 1; +} + +message HealthRequest {} + +message HealthResponse { + string status = 1; + string node_id = 2; + string version = 3; + uint64 uptime_secs = 4; + string storage_backend = 5; +} +``` + +`node_addr` is an iroh `NodeAddr` serialised to bytes. `HealthResponse.storage_backend` is one of `"sql"`, `"file"`, or `"postgres"`. `HealthRequest` requires no authentication and is suitable for infrastructure health checks. + +--- + +## federation.proto (IDs 900-905) + +Cross-server relay and proxy operations. All federation methods include a `FederationAuth` struct carrying the origin server's domain for inter-server authentication. + +```protobuf +syntax = "proto3"; +package qpq.v1; + +// Federation relay + proxy (6 methods). +// Method IDs: 900-905. 
+ +message FederationAuth { + string origin = 1; +} + +message RelayEnqueueRequest { + bytes recipient_key = 1; + bytes payload = 2; + bytes channel_id = 3; + FederationAuth auth = 4; +} + +message RelayEnqueueResponse { + uint64 seq = 1; +} + +message RelayBatchEnqueueRequest { + repeated bytes recipient_keys = 1; + bytes payload = 2; + bytes channel_id = 3; + FederationAuth auth = 4; +} + +message RelayBatchEnqueueResponse { + repeated uint64 seqs = 1; +} + +message ProxyFetchKeyPackageRequest { + bytes identity_key = 1; + FederationAuth auth = 2; +} + +message ProxyFetchKeyPackageResponse { + bytes package = 1; +} + +message ProxyFetchHybridKeyRequest { + bytes identity_key = 1; + FederationAuth auth = 2; +} + +message ProxyFetchHybridKeyResponse { + bytes hybrid_public_key = 1; +} + +message ProxyResolveUserRequest { + string username = 1; + FederationAuth auth = 2; +} + +message ProxyResolveUserResponse { + bytes identity_key = 1; +} + +message FederationHealthRequest {} + +message FederationHealthResponse { + string status = 1; + string server_domain = 2; +} +``` + +Federation relay methods (`RelayEnqueue`, `RelayBatchEnqueue`) are analogous to the client-facing delivery methods but originate from a peer server rather than a client. Proxy methods allow one server to fetch resources (key packages, hybrid keys, user identities) on behalf of a user whose home server is remote. + +--- + +## push.proto (Event IDs 1000-1003) + +Server-push event types sent on QUIC uni-streams using the push frame format. + +```protobuf +syntax = "proto3"; +package qpq.v1; + +// Server-push event types (sent on QUIC uni-streams). +// Event type IDs: 1000+. 
+
+message PushEvent {
+  oneof event {
+    NewMessage new_message = 1;
+    TypingIndicator typing = 2;
+    PresenceUpdate presence = 3;
+    GroupMembershipChange membership = 4;
+  }
+}
+
+message NewMessage {
+  bytes channel_id = 1;
+  bytes sender_key = 2;
+  uint64 seq = 3;
+  bytes payload = 4;
+  uint64 timestamp_ms = 5;
+}
+
+message TypingIndicator {
+  bytes channel_id = 1;
+  bytes sender_key = 2;
+  bool is_typing = 3;
+}
+
+message PresenceUpdate {
+  bytes identity_key = 1;
+  bool online = 2;
+  uint64 last_seen_ms = 3;
+}
+
+message GroupMembershipChange {
+  bytes channel_id = 1;
+  bytes actor_key = 2;
+  bytes target_key = 3;
+  MembershipAction action = 4;
+}
+
+enum MembershipAction {
+  MEMBERSHIP_ACTION_UNSPECIFIED = 0;
+  MEMBERSHIP_ACTION_ADDED = 1;
+  MEMBERSHIP_ACTION_REMOVED = 2;
+  MEMBERSHIP_ACTION_LEFT = 3;
+}
+```
+
+Push events are sent by the server without a client request. The `event_type` field in the push frame header (see [Wire Format Overview](overview.md)) determines which `oneof` variant is present. Clients must handle all event types and ignore unknown types gracefully.
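
The requirement to ignore unknown event types can be sketched as a dispatch that returns `None` for anything outside 1000-1003. This is a hedged illustration: `PushKind`, `push_kind`, and `handle_push` are hypothetical names, the `u16` width of `event_type` is an assumption (the push frame header is defined elsewhere), and payload decoding is left out.

```rust
/// Known push event kinds, keyed by the event type IDs 1000-1003.
/// Hypothetical type for illustration; not part of the quicproquo crates.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PushKind {
    NewMessage,
    Typing,
    Presence,
    Membership,
}

/// Map an event type ID to a known kind, or `None` if unrecognised.
/// Assumption: `event_type` fits in a u16.
pub fn push_kind(event_type: u16) -> Option<PushKind> {
    match event_type {
        1000 => Some(PushKind::NewMessage),
        1001 => Some(PushKind::Typing),
        1002 => Some(PushKind::Presence),
        1003 => Some(PushKind::Membership),
        _ => None,
    }
}

/// Returns true if the event was recognised, false if it was skipped.
/// Unknown types are dropped without treating them as an error, so a
/// newer server does not break an older client.
pub fn handle_push(event_type: u16, payload: &[u8]) -> bool {
    match push_kind(event_type) {
        Some(kind) => {
            // Decoding `payload` as the matching Protobuf message is omitted here.
            let _ = (kind, payload);
            true
        }
        None => false,
    }
}
```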
 ---
 
 ## Further reading
 
-- [Wire Format Overview](overview.md) -- serialisation pipeline context
-- [Auth Schema](auth-schema.md) -- standalone Authentication Service interface (subset of NodeService)
-- [Delivery Schema](delivery-schema.md) -- standalone Delivery Service interface (subset of NodeService)
-- [Envelope Schema](envelope-schema.md) -- legacy M1 framing that NodeService replaced
-- [Architecture Overview](../architecture/overview.md) -- system-level view showing NodeService in context
-- [ADR-005: Single-Use KeyPackages](../design-rationale/adr-005-single-use-keypackages.md) -- why fetchKeyPackage is destructive
-- [ADR-004: MLS-Unaware DS](../design-rationale/adr-004-mls-unaware-ds.md) -- why payloads are opaque
+- [Wire Format Overview](overview.md) -- frame format and transport parameters
+- [Method ID Reference](envelope-schema.md) -- complete method ID table
+- [Auth Schema](auth-schema.md) -- OPAQUE flow documentation
+- [Delivery Schema](delivery-schema.md) -- delivery and key management documentation
+- [Authentication Service Internals](../internals/authentication-service.md) -- server-side OPAQUE flow
+- [Storage Backend](../internals/storage-backend.md) -- how data is persisted
diff --git a/docs/src/wire-format/overview.md b/docs/src/wire-format/overview.md
index 3053d64..348903b 100644
--- a/docs/src/wire-format/overview.md
+++ b/docs/src/wire-format/overview.md
@@ -1,6 +1,6 @@
 # Wire Format Overview
 
-This section documents the serialisation pipeline that transforms application-level data structures into encrypted bytes on the wire. Every byte exchanged between quicproquo clients and the server passes through this pipeline, so understanding it is prerequisite to reading the protocol deep dives or the server/client source code.
+This section documents the v2 serialisation pipeline that transforms application-level data structures into bytes on the wire.
Every byte exchanged between quicproquo clients and the server passes through this pipeline, so understanding it is prerequisite to reading the protocol deep dives or the server and client source code. --- @@ -9,69 +9,171 @@ This section documents the serialisation pipeline that transforms application-le Data flows through three stages on the send path. The receive path reverses the order. ```text - Stage 1 Stage 2 Stage 3 - -------- -------- -------- - Application Cap'n Proto Transport - data serialisation encryption + Stage 1 Stage 2 Stage 3 + -------- -------- -------- + Application Protobuf Transport + data serialisation encryption - RPC call capnp::serialize QUIC/TLS 1.3 - (zero-copy bytes) + RPC call prost::encode() QUIC/TLS 1.3 + + binary frame header - | | | - v v v - Rust structs Canonical byte Encrypted - & method representation ciphertext - invocations (no deserialization on the wire - needed on receive) + | | | + v v v + Rust structs 10-byte (request) or Encrypted + & method 9-byte (response) ciphertext + invocations binary header + on the wire + protobuf payload ``` ### Stage 1: Application creates a message or RPC call -At the application layer, the client or server constructs a typed Cap'n Proto message. In the legacy Envelope path (M1), this means building an `Envelope` struct with a `MsgType` discriminant, group ID, sender ID, and opaque payload. In the current NodeService path (M3+), this means invoking a Cap'n Proto RPC method such as `enqueue()` or `fetchKeyPackage()`. +At the application layer, the client or server constructs a typed Protobuf message defined in `proto/qpq/v1/*.proto`. Each RPC method has a corresponding request and response message type. 
-- **Envelope** (legacy): see [Envelope Schema](envelope-schema.md) -- **NodeService** (current): see [NodeService Schema](node-service-schema.md) -- **AuthenticationService** (standalone): see [Auth Schema](auth-schema.md) -- **DeliveryService** (standalone): see [Delivery Schema](delivery-schema.md) +- **Auth methods** (IDs 100-103): see [Auth Schema](auth-schema.md) +- **Delivery methods** (IDs 200-205): see [Delivery Schema](delivery-schema.md) +- **All methods**: see [Method ID Reference](envelope-schema.md) +- **Full RPC reference**: see [RPC Reference](node-service-schema.md) -### Stage 2: Cap'n Proto serialises to bytes +### Stage 2: Binary framing + Protobuf serialisation -Cap'n Proto converts the in-memory message to its canonical wire representation. This is a **zero-copy** format: the byte layout in memory is identical to the byte layout on the wire. No serialisation or deserialisation pass is required; readers can traverse the bytes in-place using pointer arithmetic. +The v2 protocol defines three frame types, each with a compact binary header followed by a Protobuf-encoded payload. All multi-byte integers are **big-endian**. -The wire representation consists of: +#### Request frame (client to server, bidirectional stream) -1. A **segment table** -- a list of segment sizes encoded as little-endian 32-bit integers. -2. One or more **segments** -- contiguous runs of 8-byte aligned words containing struct data, list data, and far pointers. +```text + 0 1 2 3 4 5 6 7 8 9 ++-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+ +| method_id (u16) | request_id (u32) | len (u32) | ++-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+ +| protobuf payload ... ++-------... +``` -Cap'n Proto's canonical form is deterministic for a given message, which makes it suitable for signing: two implementations that build the same logical message will produce identical bytes. 
+| Offset | Field | Type | Description | +|--------|-------|------|-------------| +| 0-1 | `method_id` | `u16 BE` | RPC method identifier | +| 2-5 | `request_id` | `u32 BE` | Client-generated correlation ID | +| 6-9 | `payload_len` | `u32 BE` | Length of the protobuf payload in bytes | +| 10+ | payload | bytes | Protobuf-encoded request message | + +Header size: **10 bytes**. + +#### Response frame (server to client, same bidirectional stream) + +```text + 0 1 2 3 4 5 6 7 8 ++-------+-------+-------+-------+-------+-------+-------+-------+-------+ +| status| request_id (u32) | len (u32) | ++-------+-------+-------+-------+-------+-------+-------+-------+-------+ +| protobuf payload ... ++-------... +``` + +| Offset | Field | Type | Description | +|--------|-------|------|-------------| +| 0 | `status` | `u8` | RPC status code (0 = OK) | +| 1-4 | `request_id` | `u32 BE` | Echoes the request correlation ID | +| 5-8 | `payload_len` | `u32 BE` | Length of the protobuf payload in bytes | +| 9+ | payload | bytes | Protobuf-encoded response message | + +Header size: **9 bytes**. + +#### Push frame (server to client, QUIC uni-stream) + +```text + 0 1 2 3 4 5 ++-------+-------+-------+-------+-------+-------+ +| event_type (u16) | payload_len (u32) | ++-------+-------+-------+-------+-------+-------+ +| protobuf payload ... ++-------... +``` + +| Offset | Field | Type | Description | +|--------|-------|------|-------------| +| 0-1 | `event_type` | `u16 BE` | Push event type identifier | +| 2-5 | `payload_len` | `u32 BE` | Length of the protobuf payload in bytes | +| 6+ | payload | bytes | Protobuf-encoded push event message | + +Header size: **6 bytes**. + +#### Limits + +| Constraint | Value | +|------------|-------| +| Maximum payload size | 4 MiB (4,194,304 bytes) | +| Payloads exceeding this limit | rejected with `PayloadTooLarge` error | + +Source: `crates/quicproquo-rpc/src/framing.rs`. 
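The request frame layout and the 4 MiB limit above can be sketched in a few lines of standard-library Rust. This is an illustrative sketch under stated assumptions, not the code in `crates/quicproquo-rpc/src/framing.rs`: the names `encode_request`, `decode_request`, and `FrameError` are hypothetical, and the payload is treated as already Protobuf-encoded bytes.

```rust
/// Request header: method_id (u16 BE) + request_id (u32 BE) + payload_len (u32 BE).
const REQUEST_HEADER_LEN: usize = 10;
/// 4 MiB payload cap from the Limits table.
const MAX_PAYLOAD_LEN: usize = 4 * 1024 * 1024;

// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum FrameError {
    PayloadTooLarge,
    Truncated,
}

/// Builds a request frame around an already-encoded protobuf payload.
fn encode_request(
    method_id: u16,
    request_id: u32,
    payload: &[u8],
) -> Result<Vec<u8>, FrameError> {
    if payload.len() > MAX_PAYLOAD_LEN {
        return Err(FrameError::PayloadTooLarge);
    }
    let mut frame = Vec::with_capacity(REQUEST_HEADER_LEN + payload.len());
    frame.extend_from_slice(&method_id.to_be_bytes());
    frame.extend_from_slice(&request_id.to_be_bytes());
    // Safe cast: the length was checked against the 4 MiB cap above.
    frame.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    frame.extend_from_slice(payload);
    Ok(frame)
}

/// Splits a complete request frame into (method_id, request_id, payload).
fn decode_request(frame: &[u8]) -> Result<(u16, u32, &[u8]), FrameError> {
    if frame.len() < REQUEST_HEADER_LEN {
        return Err(FrameError::Truncated);
    }
    let method_id = u16::from_be_bytes([frame[0], frame[1]]);
    let request_id = u32::from_be_bytes([frame[2], frame[3], frame[4], frame[5]]);
    let payload_len =
        u32::from_be_bytes([frame[6], frame[7], frame[8], frame[9]]) as usize;
    // Reject oversized lengths before touching the payload bytes.
    if payload_len > MAX_PAYLOAD_LEN {
        return Err(FrameError::PayloadTooLarge);
    }
    let payload = frame
        .get(REQUEST_HEADER_LEN..REQUEST_HEADER_LEN + payload_len)
        .ok_or(FrameError::Truncated)?;
    Ok((method_id, request_id, payload))
}
```

For example, `encode_request(100, 7, b"abc")` yields a 13-byte frame whose first ten bytes are `00 64 00 00 00 07 00 00 00 03`. Checking the declared length before allocating or slicing is what keeps a malicious `payload_len` from causing an oversized read.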
### Stage 3: Transport encryption -The serialised bytes are encrypted by the QUIC/TLS 1.3 transport layer. The QUIC transport uses native QUIC stream framing, which provides its own length delimitation. Cap'n Proto RPC over QUIC relies on the `capnp-rpc` crate's built-in stream adapter. +The framed bytes are encrypted by the QUIC/TLS 1.3 transport layer. QUIC provides stream framing and length delimitation; the RPC framework reads exactly `header_size + payload_len` bytes per frame. | Transport | Encryption | Authentication | -|---|---|---| -| **QUIC + TLS 1.3** | AES-128-GCM or ChaCha20-Poly1305 (negotiated by TLS) | Server cert (rustls/quinn) | +|-----------|------------|----------------| +| QUIC + TLS 1.3 | AES-128-GCM or ChaCha20-Poly1305 (negotiated by TLS) | Server cert (rustls/quinn) | -The transport layer treats the payload as opaque bytes. It does not inspect or interpret the Cap'n Proto content. This clean separation means the serialisation format can evolve independently of the transport. +The transport layer treats the framed payload as opaque bytes. Serialisation format and transport evolve independently. + +--- + +## QUIC stream model + +Each RPC call uses a **dedicated QUIC bidirectional stream**: + +1. Client opens a new bidirectional stream. +2. Client sends one request frame and closes the write end. +3. Server reads the request, dispatches by `method_id`, and sends one response frame. +4. Server closes the write end. + +Push events (server-initiated) are sent on **QUIC uni-streams** opened by the server. There is no request correlation ID in push frames. + +This design lets many RPCs run concurrently, bounded by the connection's QUIC stream limits, with no head-of-line blocking between them. 
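On the client side of the stream model, reading the single response frame amounts to reading the 9-byte header and then exactly `payload_len` more bytes. The sketch below shows that over any `std::io::Read` source. It is an assumption-laden illustration rather than the project's code: `ResponseFrame` and `read_response` are hypothetical names, and a real client would read from a QUIC receive stream (typically async) instead of a blocking reader.

```rust
use std::io::{self, Read};

/// Response header: status (u8) + request_id (u32 BE) + payload_len (u32 BE).
const RESPONSE_HEADER_LEN: usize = 9;
/// 4 MiB payload cap from the Limits table.
const MAX_PAYLOAD_LEN: usize = 4 * 1024 * 1024;

/// A decoded response frame; `payload` is still protobuf-encoded.
#[derive(Debug, PartialEq)]
struct ResponseFrame {
    status: u8,
    request_id: u32,
    payload: Vec<u8>,
}

/// Reads exactly one response frame: header first, then payload_len bytes.
fn read_response<R: Read>(stream: &mut R) -> io::Result<ResponseFrame> {
    let mut header = [0u8; RESPONSE_HEADER_LEN];
    stream.read_exact(&mut header)?;

    let status = header[0];
    let request_id = u32::from_be_bytes([header[1], header[2], header[3], header[4]]);
    let payload_len =
        u32::from_be_bytes([header[5], header[6], header[7], header[8]]) as usize;

    // Enforce the limit before allocating the payload buffer.
    if payload_len > MAX_PAYLOAD_LEN {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "payload exceeds 4 MiB limit",
        ));
    }

    let mut payload = vec![0u8; payload_len];
    stream.read_exact(&mut payload)?;
    Ok(ResponseFrame { status, request_id, payload })
}
```

Because each RPC has its own stream, the echoed `request_id` is mainly a consistency check here: a caller can compare it with the ID it sent and treat a mismatch as a protocol error.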
+ +--- + +## Connection parameters + +| Parameter | Value | +|-----------|-------| +| Protocol | QUIC (RFC 9000) | +| ALPN | `"qpq"` | +| Default port | 5001 | +| TLS version | 1.3 only | +| Certificate | Server presents a TLS certificate; clients verify against a CA cert | --- ## Schema index -The Cap'n Proto schemas that define the wire-level messages are documented on dedicated pages: +Protobuf schemas are defined in `proto/qpq/v1/` and documented on dedicated pages: -| Schema File | Documentation Page | Purpose | -|---|---|---| -| `schemas/envelope.capnp` | [Envelope Schema](envelope-schema.md) | Legacy message envelope (M1) | -| `schemas/auth.capnp` | [Auth Schema](auth-schema.md) | Authentication Service RPC interface | -| `schemas/delivery.capnp` | [Delivery Schema](delivery-schema.md) | Delivery Service RPC interface | -| `schemas/node.capnp` | [NodeService Schema](node-service-schema.md) | Unified node RPC (current) | +| Proto File | Documentation | Purpose | +|------------|---------------|---------| +| `auth.proto` | [Auth Schema](auth-schema.md) | OPAQUE registration and login (IDs 100-103) | +| `delivery.proto` | [Delivery Schema](delivery-schema.md) | Message delivery (IDs 200-205) | +| `keys.proto` | [Delivery Schema](delivery-schema.md) | Key packages and hybrid keys (IDs 300-304, 510-520) | +| `channel.proto` | [RPC Reference](node-service-schema.md) | Channel creation (ID 400) | +| `group.proto` | [RPC Reference](node-service-schema.md) | Group management (IDs 410-413) | +| `moderation.proto` | [RPC Reference](node-service-schema.md) | Content moderation (IDs 420-424) | +| `user.proto` | [RPC Reference](node-service-schema.md) | User resolution (IDs 500-501) | +| `blob.proto` | [RPC Reference](node-service-schema.md) | Blob storage (IDs 600-601) | +| `device.proto` | [RPC Reference](node-service-schema.md) | Device management (IDs 700-702, 710) | +| `recovery.proto` | [RPC Reference](node-service-schema.md) | Account recovery (IDs 750-752) | +| 
`p2p.proto` | [RPC Reference](node-service-schema.md) | P2P endpoints and health (IDs 800-802) | +| `federation.proto` | [RPC Reference](node-service-schema.md) | Cross-server relay (IDs 900-905) | +| `push.proto` | [RPC Reference](node-service-schema.md) | Push event types (IDs 1000+) | +| `common.proto` | [RPC Reference](node-service-schema.md) | Auth context, account deletion (ID 950) | + +Method ID assignment: `crates/quicproquo-proto/src/lib.rs` (`method_ids` module). --- ## Further reading - [Architecture Overview](../architecture/overview.md) -- system-level view of how services compose -- [Protocol Layers Overview](../protocol-layers/overview.md) -- how transport, framing, and E2E encryption stack -- [ADR-002: Cap'n Proto over MessagePack](../design-rationale/adr-002-capnproto.md) -- why Cap'n Proto was chosen +- [Method ID Reference](envelope-schema.md) -- complete table of all 44 RPC methods +- [Auth Schema](auth-schema.md) -- OPAQUE authentication proto definitions +- [Delivery Schema](delivery-schema.md) -- message delivery proto definitions +- [RPC Reference](node-service-schema.md) -- complete proto definitions for all 14 files