feat: add post-quantum hybrid KEM + SQLCipher persistence

Feature 1 — Post-Quantum Hybrid KEM (X25519 + ML-KEM-768):
- Create hybrid_kem.rs with keygen, encrypt, decrypt + 11 unit tests
- Wire format: version(1) | x25519_eph_pk(32) | mlkem_ct(1088) | nonce(12) | ct
- Add uploadHybridKey/fetchHybridKey RPCs to node.capnp schema
- Server: hybrid key storage in FileBackedStore + RPC handlers
- Client: hybrid keypair in StoredState, auto-wrap/unwrap in send/recv/invite/join
- demo-group runs full hybrid PQ envelope round-trip

Feature 2 — SQLCipher Persistence:
- Extract Store trait from FileBackedStore API
- Create SqlStore (rusqlite + bundled-sqlcipher) with encrypted-at-rest SQLite
- Schema: key_packages, deliveries, hybrid_keys tables with indexes
- Server CLI: --store-backend=sql, --db-path, --db-key flags
- 5 unit tests for SqlStore (FIFO, round-trip, upsert, channel isolation)

Also includes: client lib.rs refactor, auth config, TOML config file support,
mdBook documentation, and various cleanups by user.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
259
docs/src/architecture/service-architecture.md
Normal file
@@ -0,0 +1,259 @@
# Service Architecture

The quicnprotochat server exposes a single **NodeService** RPC endpoint that
combines Authentication and Delivery operations. This page documents the RPC
interface, per-connection lifecycle, storage model, long-polling mechanism, and
authentication context.

---

## NodeService Endpoint

A single QUIC + TLS 1.3 listener on **port 7000** serves all operations.
The schema is defined in `schemas/node.capnp` and documented in
[NodeService Schema](../wire-format/node-service-schema.md).

```text
NodeService (port 7000)
├── Authentication methods
│   ├── uploadKeyPackage(identityKey, package, auth) -> fingerprint
│   ├── fetchKeyPackage(identityKey, auth) -> package
│   ├── uploadHybridKey(identityKey, hybridPublicKey) -> ()
│   └── fetchHybridKey(identityKey) -> hybridPublicKey
│
├── Delivery methods
│   ├── enqueue(recipientKey, payload, channelId, version, auth) -> ()
│   ├── fetch(recipientKey, channelId, version, auth) -> payloads
│   └── fetchWait(recipientKey, channelId, version, timeoutMs, auth) -> payloads
│
└── Operational
    └── health() -> status
```

---

## RPC Method Reference

### Authentication Service Methods

| Method | Params | Returns | Semantics |
|----------------------|-------------------------------------|------------------|-----------|
| `uploadKeyPackage` | `identityKey` (32 B Ed25519 pk), `package` (TLS-encoded KeyPackage), `auth` | `fingerprint` (SHA-256 of package) | Appends the KeyPackage to a per-identity FIFO queue. The fingerprint lets the client detect server-side tampering. Max package size: 1 MB. |
| `fetchKeyPackage` | `identityKey` (32 B), `auth` | `package` (or empty `Data`) | Atomically pops and returns the oldest KeyPackage for the identity. Returns empty bytes if none are stored. Single-use semantics per RFC 9420. |
| `uploadHybridKey` | `identityKey` (32 B), `hybridPublicKey` (X25519 pk + ML-KEM-768 ek) | `()` | Stores (or replaces) the hybrid PQ public key for envelope-level post-quantum encryption. |
| `fetchHybridKey` | `identityKey` (32 B) | `hybridPublicKey` (or empty `Data`) | Returns the stored hybrid public key for a peer, or empty if none. |

### Delivery Service Methods

| Method | Params | Returns | Semantics |
|--------------|------------------------------------------------------------------------|----------------------|-----------|
| `enqueue` | `recipientKey` (32 B), `payload` (opaque), `channelId`, `version`, `auth` | `()` | Appends `payload` to the recipient's FIFO queue. Max payload: 5 MB. Wakes any `fetchWait` waiter for this recipient. Supported versions: 0 (legacy), 1 (current). |
| `fetch` | `recipientKey` (32 B), `channelId`, `version`, `auth` | `payloads: List(Data)` | Atomically drains and returns the full queue in FIFO order. Returns empty list if nothing is pending. |
| `fetchWait` | `recipientKey` (32 B), `channelId`, `version`, `timeoutMs`, `auth` | `payloads: List(Data)` | Same as `fetch`, but if the queue is empty and `timeoutMs > 0`, blocks up to `timeoutMs` milliseconds waiting for a `Notify` signal from `enqueue`. Returns whatever is in the queue when the wait completes or times out. |
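
The `enqueue` / `fetch` semantics in the table can be modelled with a small in-memory sketch. This is an illustration only, not the server's code: `DeliveryQueues` is a hypothetical type, it has no locking or persistence, and it assumes queues are keyed by channel and recipient together.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical queue key: one FIFO per (channel, recipient) pair.
#[derive(Clone, PartialEq, Eq, Hash)]
struct ChannelKey {
    channel_id: Vec<u8>,
    recipient_key: Vec<u8>,
}

/// Illustrative in-memory delivery queues (no locking, no persistence).
#[derive(Default)]
struct DeliveryQueues {
    queues: HashMap<ChannelKey, VecDeque<Vec<u8>>>,
}

impl DeliveryQueues {
    fn key(recipient_key: &[u8], channel_id: &[u8]) -> ChannelKey {
        ChannelKey {
            channel_id: channel_id.to_vec(),
            recipient_key: recipient_key.to_vec(),
        }
    }

    /// Append a payload to the back of the recipient's queue.
    fn enqueue(&mut self, recipient_key: &[u8], channel_id: &[u8], payload: Vec<u8>) {
        let key = Self::key(recipient_key, channel_id);
        self.queues.entry(key).or_default().push_back(payload);
    }

    /// Drain the whole queue in FIFO order; empty if nothing is pending.
    fn fetch(&mut self, recipient_key: &[u8], channel_id: &[u8]) -> Vec<Vec<u8>> {
        let key = Self::key(recipient_key, channel_id);
        let drained: Vec<Vec<u8>> = self
            .queues
            .remove(&key)
            .map(Vec::from)
            .unwrap_or_default();
        drained
    }
}

fn main() {
    let mut dq = DeliveryQueues::default();
    let alice = [1u8; 32];

    dq.enqueue(&alice, b"chan-a", b"first".to_vec());
    dq.enqueue(&alice, b"chan-a", b"second".to_vec());
    dq.enqueue(&alice, b"chan-b", b"other".to_vec());

    // fetch drains one channel's queue in insertion order.
    let drained = dq.fetch(&alice, b"chan-a");
    assert_eq!(drained, vec![b"first".to_vec(), b"second".to_vec()]);

    // A second fetch finds nothing; the other channel is untouched.
    assert!(dq.fetch(&alice, b"chan-a").is_empty());
    assert_eq!(dq.fetch(&alice, b"chan-b").len(), 1);
}
```

Removing the whole queue entry on `fetch` is one simple way to get the "drain everything" behaviour; the real store may keep empty queues around instead.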

### Operational Methods

| Method | Params | Returns | Semantics |
|----------|--------|-----------------|-----------|
| `health` | none | `status: Text` | Returns `"ok"`. Used for liveness/readiness probes. |

---

## Per-Connection Lifecycle

Each incoming QUIC connection follows this sequence:

```text
┌──────────────────────────────────────────────────────────────────────────┐
│  Client                                    Server                        │
│                                                                          │
│  1. UDP packet ->                                                        │
│     QUIC INITIAL                                                         │
│                                                                          │
│  2.                                        <- QUIC HANDSHAKE             │
│                                            TLS 1.3 ServerHello +         │
│                                            Certificate (self-signed)     │
│                                            ALPN: "capnp"                 │
│                                                                          │
│  3. Client verifies server                                               │
│     cert against pinned CA                                               │
│     cert (--ca-cert flag)                                                │
│                                                                          │
│  4. QUIC connection established                                          │
│                                                                          │
│  5. Client opens bidirectional ──────────> Server accepts bi stream      │
│     QUIC stream (open_bi)                  (accept_bi)                   │
│                                                                          │
│  6. tokio_util::compat adapters wrap the send/recv halves                │
│     into AsyncRead + AsyncWrite                                          │
│                                                                          │
│  7. capnp-rpc twoparty::VatNetwork                                       │
│     Client Side::Client                    Server Side::Server           │
│                                                                          │
│  8. RpcSystem::new() starts                                              │
│     promise-pipelined RPC loop                                           │
│                                                                          │
│  9. Client bootstraps                                                    │
│     node_service::Client                   NodeServiceImpl created       │
│                                            (shares Arc<FileBackedStore>, │
│                                             Arc<DashMap<..., Notify>>)   │
│                                                                          │
│ 10. RPC calls flow over the bidirectional stream                         │
│     until either side closes the connection.                             │
└──────────────────────────────────────────────────────────────────────────┘
```

### LocalSet requirement

`capnp-rpc` uses `Rc<RefCell<>>` internally, which makes its RPC futures
`!Send`. Therefore:

- The server runs the entire accept loop inside a `tokio::task::LocalSet`.
- Each connection handler is started with `spawn_local`, ensuring all RPC
  futures stay on a single thread.
- The client wraps each subcommand invocation in its own `LocalSet::run_until`.

This is a fundamental constraint of the Cap'n Proto RPC runtime in Rust.
Attempting to spawn RPC futures on the multi-threaded Tokio executor with
`tokio::spawn` fails at compile time, because `tokio::spawn` requires `Send`
futures.

---

## Storage Model

`NodeServiceImpl` holds two pieces of shared state:

### FileBackedStore

```text
FileBackedStore
├── key_packages: Mutex<HashMap<Vec<u8>, VecDeque<Vec<u8>>>>
│     Key:   Ed25519 public key (32 bytes)
│     Value: FIFO queue of TLS-encoded KeyPackage blobs
│     File:  data/keypackages.bin (bincode)
│
├── deliveries: Mutex<HashMap<ChannelKey, VecDeque<Vec<u8>>>>
│     ChannelKey: { channel_id: Vec<u8>, recipient_key: Vec<u8> }
│     Value:      FIFO queue of opaque payload blobs
│     File:       data/deliveries.bin (bincode, v2 format)
│
└── hybrid_keys: Mutex<HashMap<Vec<u8>, Vec<u8>>>
      Key:   Ed25519 public key (32 bytes)
      Value: serialised HybridPublicKey blob
      File:  data/hybridkeys.bin (bincode)
```

Every mutation (upload, fetch, enqueue) acquires the relevant `Mutex`, modifies
the in-memory `HashMap`, and then flushes the entire map to disk as a bincode
blob. Because each write re-serialises the whole map, write cost grows with the
total amount of stored data. This is intentionally simple and adequate for
MVP-scale workloads. A production deployment would replace it with an embedded
database or external store.

The delivery map supports a **v1 -> v2 upgrade path**: if `deliveries.bin`
contains the legacy `QueueMapV1` format (keyed by `recipientKey` only), the
store transparently upgrades entries by wrapping them in `ChannelKey` with an
empty `channel_id`.
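
The upgrade step itself amounts to re-keying each queue. The sketch below shows only that in-memory conversion; the type alias `QueueMapV2` and the function `upgrade_v1_to_v2` are hypothetical names, and detecting the on-disk format and decoding bincode are left out.

```rust
use std::collections::{HashMap, VecDeque};

/// Mirrors the ChannelKey layout shown in the storage diagram.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct ChannelKey {
    channel_id: Vec<u8>,
    recipient_key: Vec<u8>,
}

/// Legacy layout: queues keyed by recipient key only.
type QueueMapV1 = HashMap<Vec<u8>, VecDeque<Vec<u8>>>;
/// Current layout (hypothetical alias): queues keyed by ChannelKey.
type QueueMapV2 = HashMap<ChannelKey, VecDeque<Vec<u8>>>;

/// Wrap every v1 entry in a ChannelKey with an empty channel_id.
/// Queue contents and their FIFO order are moved over unchanged.
fn upgrade_v1_to_v2(v1: QueueMapV1) -> QueueMapV2 {
    v1.into_iter()
        .map(|(recipient_key, queue)| {
            let key = ChannelKey {
                channel_id: Vec::new(),
                recipient_key,
            };
            (key, queue)
        })
        .collect()
}

fn main() {
    let mut v1 = QueueMapV1::new();
    v1.insert(vec![9u8; 32], VecDeque::from(vec![b"m1".to_vec(), b"m2".to_vec()]));

    let v2 = upgrade_v1_to_v2(v1);

    let key = ChannelKey {
        channel_id: Vec::new(),
        recipient_key: vec![9u8; 32],
    };
    let queue = v2.get(&key).expect("upgraded entry should exist");
    assert_eq!(queue.len(), 2);
    assert_eq!(queue.front(), Some(&b"m1".to_vec()));
}
```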

### DashMap Waiters

```text
Arc<DashMap<Vec<u8>, Arc<Notify>>>
  Key:   recipient Ed25519 public key (32 bytes)
  Value: tokio::sync::Notify instance
```

The waiters map is independent of `FileBackedStore`. It lives entirely in
memory and serves the `fetchWait` long-polling mechanism:

1. `enqueue` calls `waiter(&recipient_key).notify_waiters()` after storing the
   payload.
2. `fetchWait` first tries a regular `fetch`. If the queue is empty and
   `timeoutMs > 0`, it:
   - looks up or inserts a `Notify` for the recipient;
   - awaits `tokio::time::timeout(Duration::from_millis(timeoutMs), notify.notified())`;
   - when notified (or on timeout), performs a second `fetch` and returns
     whatever is available.

This design avoids busy-polling. The waiters map is not lock-free (DashMap
uses sharded RwLocks internally), but sharding keeps contention between
unrelated recipients low.
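
The wait-then-refetch pattern can be tried out without an async runtime. The sketch below is a synchronous analogue, not the server's implementation: it swaps `tokio::sync::Notify` for a standard-library `Condvar`, models a single recipient instead of a `DashMap` of waiters, and uses the hypothetical names `Mailbox` and `fetch_wait`.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// One recipient's queue plus a Condvar standing in for `Notify`.
#[derive(Default)]
struct Mailbox {
    queue: Mutex<VecDeque<Vec<u8>>>,
    wake: Condvar,
}

impl Mailbox {
    /// Store the payload, then wake every thread blocked in `fetch_wait`.
    fn enqueue(&self, payload: Vec<u8>) {
        self.queue.lock().unwrap().push_back(payload);
        self.wake.notify_all();
    }

    /// Drain the queue immediately, in FIFO order.
    fn fetch(&self) -> Vec<Vec<u8>> {
        let drained: Vec<Vec<u8>> = self.queue.lock().unwrap().drain(..).collect();
        drained
    }

    /// Like `fetch`, but if the queue is empty, block for up to
    /// `timeout_ms` milliseconds until something is enqueued.
    fn fetch_wait(&self, timeout_ms: u64) -> Vec<Vec<u8>> {
        let guard = self.queue.lock().unwrap();
        // Returns at once if the queue is non-empty or the timeout is zero.
        let (mut guard, _timeout_result) = self
            .wake
            .wait_timeout_while(guard, Duration::from_millis(timeout_ms), |q| q.is_empty())
            .unwrap();
        let drained: Vec<Vec<u8>> = guard.drain(..).collect();
        drained
    }
}

fn main() {
    let mailbox = Arc::new(Mailbox::default());
    let producer = Arc::clone(&mailbox);

    // Deliver a message shortly after the consumer starts waiting.
    let handle = thread::spawn(move || {
        thread::sleep(Duration::from_millis(50));
        producer.enqueue(b"hello".to_vec());
    });

    let got = mailbox.fetch_wait(5_000);
    handle.join().unwrap();
    assert_eq!(got, vec![b"hello".to_vec()]);

    // Nothing left: a short wait times out with an empty result.
    assert!(mailbox.fetch_wait(10).is_empty());
    assert!(mailbox.fetch().is_empty());
}
```

Unlike the `Notify`-based flow, the `Condvar` version checks the queue and waits under the same lock, so it needs no separate second `fetch`.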

---

## Auth Struct

Every RPC method that modifies or reads user-specific state accepts an `Auth`
parameter:

```capnp
struct Auth {
  version     @0 :UInt16;  # 0 = legacy/none, 1 = token-based auth
  accessToken @1 :Data;    # opaque bearer token
  deviceId    @2 :Data;    # optional UUID for auditing/rate limiting
}
```

### Version semantics

| Version | Meaning |
|---------|------------------------------------------------------------|
| 0 | Legacy / no authentication. The server accepts the request without checking credentials. Suitable for development and testing. |
| 1 | Token-based authentication. The `accessToken` field should contain an opaque bearer token issued at login. The server validates the token against a token store (not yet implemented -- see [Auth, Devices, and Tokens](../roadmap/authz-plan.md)). |

The server validates the `version` field on every request via `validate_auth()`.
Requests with unsupported versions are rejected with a Cap'n Proto error.

### Client-side usage

The client CLI accepts `--access-token` and `--device-id` flags (or the
corresponding environment variables). These are bundled into a `ClientAuth`
struct and injected into every outgoing RPC call via the `set_auth()` helper.

Currently, the client sends `version = 0` with empty token and device ID by
default. When the token-based auth flow is implemented, the client will populate
these fields.

---

## Validation and Limits

The server enforces the following constraints on every RPC call:

| Constraint | Value | Error on violation |
|-----------------------------|--------------------|--------------------|
| `identityKey` / `recipientKey` length | Exactly 32 bytes | Cap'n Proto error: "must be exactly 32 bytes" |
| KeyPackage size | <= 1 MB | Cap'n Proto error: "package exceeds max size" |
| Payload size | <= 5 MB | Cap'n Proto error: "payload exceeds max size" |
| Wire version | 0 or 1 | Cap'n Proto error: "unsupported wire version" |
| Auth version | 0 or 1 | Cap'n Proto error: "unsupported auth version" |
| KeyPackage non-empty | `package.len() > 0`| Cap'n Proto error: "package must not be empty" |
| Payload non-empty | `payload.len() > 0`| Cap'n Proto error: "payload must not be empty" |
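
As a rough illustration, the `enqueue` checks from the table could be expressed as a single function. This is a sketch under assumptions: `validate_enqueue` is a hypothetical name, the order of the checks is a guess, "5 MB" is read as 5 * 1024 * 1024 bytes, and plain string errors stand in for Cap'n Proto errors.

```rust
/// Limits taken from the table; reading "MB" as MiB is an assumption.
const KEY_LEN: usize = 32;
const MAX_PAYLOAD_BYTES: usize = 5 * 1024 * 1024;
const MAX_SUPPORTED_VERSION: u16 = 1;

/// Hypothetical validator for an `enqueue` request.
/// Returns the table's error text on the first violated constraint.
fn validate_enqueue(
    recipient_key: &[u8],
    payload: &[u8],
    wire_version: u16,
    auth_version: u16,
) -> Result<(), &'static str> {
    if auth_version > MAX_SUPPORTED_VERSION {
        return Err("unsupported auth version");
    }
    if wire_version > MAX_SUPPORTED_VERSION {
        return Err("unsupported wire version");
    }
    if recipient_key.len() != KEY_LEN {
        return Err("must be exactly 32 bytes");
    }
    if payload.is_empty() {
        return Err("payload must not be empty");
    }
    if payload.len() > MAX_PAYLOAD_BYTES {
        return Err("payload exceeds max size");
    }
    Ok(())
}

fn main() {
    let key = [0u8; 32];

    // A well-formed request passes every check.
    assert_eq!(validate_enqueue(&key, b"hi", 1, 0), Ok(()));

    // Each violation maps to the matching error text.
    assert_eq!(validate_enqueue(&key[..31], b"hi", 1, 0), Err("must be exactly 32 bytes"));
    assert_eq!(validate_enqueue(&key, b"", 1, 0), Err("payload must not be empty"));
    assert_eq!(validate_enqueue(&key, b"hi", 2, 0), Err("unsupported wire version"));
    assert_eq!(validate_enqueue(&key, b"hi", 1, 7), Err("unsupported auth version"));

    let too_big = vec![0u8; MAX_PAYLOAD_BYTES + 1];
    assert_eq!(validate_enqueue(&key, &too_big, 1, 0), Err("payload exceeds max size"));
}
```

The KeyPackage checks for `uploadKeyPackage` would follow the same shape with the 1 MB limit and the "package" error messages.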

---

## Configuration

The server binary is configured via CLI flags or environment variables:

| Flag | Env var | Default | Description |
|----------------|----------------------------|----------------------|-------------|
| `--listen` | `QUICNPROTOCHAT_LISTEN` | `0.0.0.0:7000` | QUIC listen address (host:port). |
| `--data-dir` | `QUICNPROTOCHAT_DATA_DIR` | `data` | Directory for persisted KeyPackages, delivery queues, and hybrid keys. |
| `--tls-cert` | `QUICNPROTOCHAT_TLS_CERT` | `data/server-cert.der` | Path to TLS certificate (DER). Auto-generated if missing. |
| `--tls-key` | `QUICNPROTOCHAT_TLS_KEY` | `data/server-key.der` | Path to TLS private key (DER). Auto-generated if missing. |

If the TLS certificate or key files do not exist at startup, the server
auto-generates a self-signed certificate for `localhost`, `127.0.0.1`, and
`::1` using `rcgen`.

Logging level is controlled by the `RUST_LOG` environment variable (default:
`info`).

---

## Further Reading

- [Architecture Overview](overview.md) -- two-service model and dual-key overview
- [NodeService Schema](../wire-format/node-service-schema.md) -- full Cap'n Proto schema
- [End-to-End Data Flow](data-flow.md) -- sequence diagrams showing registration, group creation, and messaging
- [Delivery Service Internals](../internals/delivery-service.md) -- queue routing and channel-aware delivery
- [Authentication Service Internals](../internals/authentication-service.md) -- KeyPackage lifecycle
- [Storage Backend](../internals/storage-backend.md) -- FileBackedStore details and upgrade path
- [Auth, Devices, and Tokens](../roadmap/authz-plan.md) -- planned token-based authentication