feat: add post-quantum hybrid KEM + SQLCipher persistence

Feature 1 — Post-Quantum Hybrid KEM (X25519 + ML-KEM-768):
- Create hybrid_kem.rs with keygen, encrypt, decrypt + 11 unit tests
- Wire format: version(1) | x25519_eph_pk(32) | mlkem_ct(1088) | nonce(12) | ct
- Add uploadHybridKey/fetchHybridKey RPCs to node.capnp schema
- Server: hybrid key storage in FileBackedStore + RPC handlers
- Client: hybrid keypair in StoredState, auto-wrap/unwrap in send/recv/invite/join
- demo-group runs full hybrid PQ envelope round-trip

Feature 2 — SQLCipher Persistence:
- Extract Store trait from FileBackedStore API
- Create SqlStore (rusqlite + bundled-sqlcipher) with encrypted-at-rest SQLite
- Schema: key_packages, deliveries, hybrid_keys tables with indexes
- Server CLI: --store-backend=sql, --db-path, --db-key flags
- 5 unit tests for SqlStore (FIFO, round-trip, upsert, channel isolation)

Also includes: client lib.rs refactor, auth config, TOML config file support,
mdBook documentation, and various cleanups by user.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 08:07:48 +01:00
parent d1ddef4cea
commit f334ed3d43
81 changed files with 14502 additions and 2289 deletions

@@ -0,0 +1,256 @@
# Auth, Devices, and Tokens
This page describes the authentication, device management, and authorisation
design for quicnprotochat. It introduces account and device identities, gates
server operations by authenticated identity, enforces rate and size limits, and
binds MLS identity keys to accounts.
This design cuts across milestones M4 through M6. For the broader production
readiness plan, see [Production Readiness WBS](production-readiness.md).
---
## Goals
1. **Introduce accounts and devices** with authenticated access to `NodeService`.
2. **Gate operations by identity:** enqueue/fetch/fetchWait require a valid token
bound to the caller's account and device.
3. **Enforce rate and size limits** per account, per device, and per IP.
4. **Bind MLS identity keys to accounts:** a KeyPackage upload must be associated
with the uploading account, preventing impersonation.
5. **Keep wire changes minimal and versioned:** the `Auth` struct is additive
and uses a version field for backward compatibility.
---
## Data Model (Server)
### Accounts
| Field | Type | Description |
|-------|------|-------------|
| `account_id` | UUID | Unique account identifier |
| `created_at` | Timestamp | Account creation time |
| `status` | Enum | `active`, `suspended`, `deleted` |
### Devices
| Field | Type | Description |
|-------|------|-------------|
| `device_id` | UUID | Unique device identifier |
| `account_id` | UUID | Owning account (foreign key) |
| `device_pubkey` | Ed25519 public key (32 bytes) | Device signing key |
| `created_at` | Timestamp | Device registration time |
| `status` | Enum | `active`, `revoked` |
### Sessions / Tokens
| Field | Type | Description |
|-------|------|-------------|
| `session_id` | UUID | Unique session identifier |
| `account_id` | UUID | Owning account |
| `device_id` | UUID | Originating device |
| `access_token` | Opaque bytes | Short-lived bearer token |
| `refresh_token` | Opaque bytes | Long-lived token for renewal |
| `expires_at` | Timestamp | Access token expiry |
| `created_at` | Timestamp | Session creation time |
### Identity Binding
| Field | Type | Description |
|-------|------|-------------|
| `account_id` | UUID | Owning account |
| `mls_identity_key` | Ed25519 public key (32 bytes) | MLS credential public key |
| `verified_fp` | SHA-256 fingerprint (32 bytes) | Fingerprint of the bound key |
The identity binding table ensures that only the account that registered an
Ed25519 public key can upload KeyPackages for that key. This prevents a
compromised or malicious client from uploading KeyPackages under another
account's identity.
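
The binding rule above can be sketched with an in-memory map. This is a minimal illustration, not the project's implementation: the `IdentityBindings` type, its method names, and the use of raw byte arrays for IDs are assumptions made for the example; only the `IDENTITY_MISMATCH` error name comes from this page.

```rust
use std::collections::HashMap;

// Hypothetical aliases: the page only specifies sizes and roles.
type AccountId = [u8; 16]; // UUID bytes
type IdentityKey = [u8; 32]; // Ed25519 public key

/// In-memory stand-in for the identity binding table (assumed shape).
#[derive(Default)]
struct IdentityBindings {
    owner_by_key: HashMap<IdentityKey, AccountId>,
}

impl IdentityBindings {
    /// Bind `key` to `account`. Fails if another account already owns the key.
    fn bind(&mut self, account: AccountId, key: IdentityKey) -> Result<(), &'static str> {
        match self.owner_by_key.get(&key) {
            Some(owner) if *owner != account => Err("IDENTITY_MISMATCH"),
            _ => {
                self.owner_by_key.insert(key, account);
                Ok(())
            }
        }
    }

    /// True only if `key` is bound to `account`; gate KeyPackage uploads on this.
    fn may_upload(&self, account: &AccountId, key: &IdentityKey) -> bool {
        self.owner_by_key.get(key) == Some(account)
    }
}
```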
---
## Wire / API Changes
### Auth Struct
A new `Auth` struct is added to all `NodeService` RPC methods:
```capnp
struct Auth {
  version     @0 :UInt16;  # 0 = legacy (no auth), 1 = token-based
  accessToken @1 :Data;    # opaque bearer token
  deviceId    @2 :Data;    # optional UUID (16 bytes) for audit/rate limit
}
```
The `Auth` struct is included as a parameter in `enqueue`, `fetch`, `fetchWait`,
`uploadKeyPackage`, and `fetchKeyPackage`.
### Versioning
| Version | Meaning |
|---------|---------|
| 0 | Legacy mode: no authentication. The server may accept these calls in development but rejects them by default in production. |
| 1 | Token-based authentication. `accessToken` is required and validated. |
The server rejects any `version` value higher than its current maximum. This
ensures that a newer client connecting to an older server fails cleanly rather
than silently skipping auth.
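
A minimal sketch of the version gate described above, assuming a hypothetical `classify_version` helper and a `production` flag taken from server configuration. The `UnsupportedVersion` error is an invented name for the example; this page only specifies that higher versions are rejected.

```rust
/// Highest `Auth.version` this (hypothetical) server build understands.
const MAX_AUTH_VERSION: u16 = 1;

#[derive(Debug, PartialEq)]
enum AuthMode {
    Legacy, // version 0, accepted only outside production
    Token,  // version 1, accessToken must be validated next
}

#[derive(Debug, PartialEq)]
enum VersionError {
    AuthenticationRequired,  // legacy call in production mode
    UnsupportedVersion(u16), // assumed name; the page does not name this error
}

/// Decide how to treat a request based only on `Auth.version`.
fn classify_version(version: u16, production: bool) -> Result<AuthMode, VersionError> {
    match version {
        0 if production => Err(VersionError::AuthenticationRequired),
        0 => Ok(AuthMode::Legacy),
        v if v <= MAX_AUTH_VERSION => Ok(AuthMode::Token),
        v => Err(VersionError::UnsupportedVersion(v)),
    }
}
```

Raising `MAX_AUTH_VERSION` in a later release would accept the new version without touching the rejection path for anything newer still.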
### Optional Device ID
The `deviceId` field is optional. When present, the server uses it for:
- Per-device rate limiting (in addition to per-account limits).
- Audit logging (which device performed which operation).
- Future: device revocation without revoking the entire account.
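
Because a Cap'n Proto `Data` field defaults to empty, one way to model the optional `deviceId` on the server is to treat an empty value as "absent" and anything other than 16 bytes as malformed. That policy, and the `parse_device_id` helper itself, are assumptions for illustration rather than behaviour stated on this page.

```rust
#[derive(Debug, PartialEq)]
enum DeviceIdError {
    /// The field was present but not a 16-byte UUID.
    BadLength(usize),
}

/// Interpret the raw `deviceId` bytes from the `Auth` struct.
/// Empty means "not supplied"; exactly 16 bytes is a UUID.
fn parse_device_id(raw: &[u8]) -> Result<Option<[u8; 16]>, DeviceIdError> {
    match raw.len() {
        0 => Ok(None),
        16 => {
            let mut id = [0u8; 16];
            id.copy_from_slice(raw);
            Ok(Some(id))
        }
        n => Err(DeviceIdError::BadLength(n)),
    }
}
```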
---
## Server Enforcement
### Token Validation
1. Extract `Auth` struct from the incoming RPC.
2. If `version == 0` and server is in production mode, reject with
`AUTHENTICATION_REQUIRED`.
3. If `version == 1`, validate `accessToken`:
   - Token must exist in the session store.
   - Token must not be expired (`expires_at > now`).
   - Associated account must have `status == active`.
   - Associated device (if `deviceId` present) must have `status == active`.
4. Map validated token to `(account_id, device_id)` for downstream authorisation.
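
The validation steps for `version == 1` can be sketched as a single function over in-memory maps. All type and field names here (`AuthState`, `Session`, `AuthError`, integer IDs, Unix-second timestamps) are assumptions for the example; the checks and their order follow the list above.

```rust
use std::collections::HashMap;

#[derive(PartialEq)]
enum AccountStatus { Active, Suspended, Deleted }

#[derive(PartialEq)]
enum DeviceStatus { Active, Revoked }

struct Session {
    account_id: u128,
    device_id: u128,
    expires_at: u64, // Unix seconds (assumed representation)
}

#[derive(Debug, PartialEq)]
enum AuthError { UnknownToken, TokenExpired, AccountInactive, DeviceInactive }

/// Hypothetical in-memory session, account, and device stores.
#[derive(Default)]
struct AuthState {
    sessions: HashMap<Vec<u8>, Session>,
    accounts: HashMap<u128, AccountStatus>,
    devices: HashMap<u128, DeviceStatus>,
}

impl AuthState {
    /// Returns `(account_id, device_id)` for downstream authorisation.
    fn validate(
        &self,
        token: &[u8],
        device_id: Option<u128>,
        now: u64,
    ) -> Result<(u128, u128), AuthError> {
        // Token must exist in the session store.
        let s = self.sessions.get(token).ok_or(AuthError::UnknownToken)?;
        // Token must not be expired (expires_at > now).
        if s.expires_at <= now {
            return Err(AuthError::TokenExpired);
        }
        // Associated account must be active.
        if self.accounts.get(&s.account_id) != Some(&AccountStatus::Active) {
            return Err(AuthError::AccountInactive);
        }
        // Associated device, if supplied, must be active.
        if let Some(d) = device_id {
            if self.devices.get(&d) != Some(&DeviceStatus::Active) {
                return Err(AuthError::DeviceInactive);
            }
        }
        Ok((s.account_id, s.device_id))
    }
}
```

Passing `now` as a parameter instead of reading the clock inside `validate` keeps the expiry check deterministic in unit tests.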
### Identity Matching
- **uploadKeyPackage:** The `identityKey` in the RPC must match an identity
binding for the authenticated account. Reject with `IDENTITY_MISMATCH` if the
key is not bound to the caller's account.
- **fetchKeyPackage:** No identity restriction (any authenticated client can
fetch any identity's KeyPackage -- this is required for the MLS add-member flow).
- **enqueue:** If `channelId` is present, the caller's identity must be in the
channel membership. If `channelId` is absent (legacy mode), the operation is
allowed for any authenticated client.
- **fetch / fetchWait:** The `recipientKey` must correspond to an identity bound
to the caller's account.
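
The enqueue and fetch rules above might look like the following once a token has been mapped to an account. This sketch assumes that "the caller's identity" means any identity key bound to the caller's account, and that channel membership is stored as a set of identity keys; neither detail is specified on this page, and the `Authz` type is hypothetical.

```rust
use std::collections::{HashMap, HashSet};

type Key = [u8; 32]; // Ed25519 identity key
type AccountId = u128; // stand-in for a UUID

/// Hypothetical authorisation state held by the server.
#[derive(Default)]
struct Authz {
    bindings: HashMap<Key, AccountId>,        // identity key -> owning account
    members: HashMap<Vec<u8>, HashSet<Key>>,  // channel id -> member identity keys
}

impl Authz {
    fn owns(&self, account: AccountId, key: &Key) -> bool {
        self.bindings.get(key) == Some(&account)
    }

    /// fetch / fetchWait: the recipient key must belong to the caller.
    fn may_fetch(&self, account: AccountId, recipient_key: &Key) -> bool {
        self.owns(account, recipient_key)
    }

    /// enqueue: no channel means legacy mode (any authenticated caller);
    /// otherwise one of the caller's identities must be a channel member.
    fn may_enqueue(&self, account: AccountId, channel_id: Option<&[u8]>) -> bool {
        match channel_id {
            None => true,
            Some(ch) => self
                .members
                .get(ch)
                .map_or(false, |m| m.iter().any(|k| self.owns(account, k))),
        }
    }
}
```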
### Rate Limits
| Limit | Scope | Default |
|-------|-------|---------|
| Request rate | Per IP | 50 requests/second |
| Request rate | Per account | 50 requests/second |
| Request rate | Per device | 50 requests/second |
| Payload size | Per RPC call | 5 MB |
| KeyPackage TTL | Per package | 24 hours |
| KeyPackage uploads | Per account | Configurable (prevents store exhaustion) |
Rate limit counters use a sliding window. When a limit is exceeded, the server
responds with `RATE_LIMITED` and includes a `Retry-After` hint.
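
A sliding-window counter of the kind described above can be sketched with a queue of timestamps per scope. The `SlidingWindow` type and the string scope keys (for example `"ip:…"` or `"account:…"`) are illustrative assumptions; a production limiter would likely bound memory and share state differently.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// Sliding-window limiter: at most `limit` hits per `window` per scope key.
struct SlidingWindow {
    limit: usize,
    window: Duration,
    hits: HashMap<String, VecDeque<Instant>>,
}

impl SlidingWindow {
    fn new(limit: usize, window: Duration) -> Self {
        Self { limit, window, hits: HashMap::new() }
    }

    /// `Ok(())` if the request is allowed; `Err(retry_after)` if rate limited.
    fn check(&mut self, scope: &str, now: Instant) -> Result<(), Duration> {
        let (limit, window) = (self.limit, self.window);
        let q = self.hits.entry(scope.to_owned()).or_default();
        // Evict timestamps that have slid out of the window.
        while let Some(&oldest) = q.front() {
            if now.duration_since(oldest) >= window {
                q.pop_front();
            } else {
                break;
            }
        }
        if q.len() < limit {
            q.push_back(now);
            return Ok(());
        }
        // The next slot frees up when the oldest remaining hit expires.
        let retry_after = match q.front() {
            Some(&oldest) => window - now.duration_since(oldest),
            None => window, // only reachable when limit == 0
        };
        Err(retry_after)
    }
}
```

The `Err` value is what a server could surface as the `Retry-After` hint alongside `RATE_LIMITED`.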
### Audit Logging
The following events are logged at audit level:
- Authentication success (account, device, IP).
- Authentication failure (reason, IP).
- Token issuance and refresh (account, device).
- KeyPackage upload (account, identity key fingerprint).
- Enqueue (account, channel, recipient).
- Fetch / fetchWait (account, recipient).
- Rate limit exceeded (scope, account/IP, current rate).
All audit log entries include a timestamp and correlation ID. Sensitive fields
(token values, ciphertext, private keys) are never logged.
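
One way to make "never logged" hard to violate by accident is a wrapper type whose `Debug` output is always redacted. The `Secret` wrapper, the `AuthSuccess` record, and its field names are hypothetical and shown only to illustrate the idea.

```rust
use std::fmt;

/// Wrapper for values that must never reach a log line (hypothetical helper).
struct Secret(Vec<u8>);

impl Secret {
    /// Explicit access for code that genuinely needs the bytes.
    fn expose(&self) -> &[u8] {
        &self.0
    }
}

impl fmt::Debug for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Never print the contents, regardless of formatter flags.
        f.write_str("[redacted]")
    }
}

/// One audit record; every entry carries a timestamp and correlation ID.
#[derive(Debug)]
struct AuthSuccess {
    timestamp: u64,
    correlation_id: u64,
    account_id: u128,
    access_token: Secret,
}

/// Render the record as it would appear in the audit log.
fn audit_line(event: &AuthSuccess) -> String {
    format!("{:?}", event)
}
```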
---
## Client Changes
### Login / Register Flow
1. **Register:** Client generates an Ed25519 identity keypair, sends the public
key to the server. Server creates an account, binds the identity key, and
returns an `(access_token, refresh_token)` pair.
2. **Login:** Client presents credentials (initially, a challenge signed with
the device key). Server validates them and issues tokens.
3. **Token storage:** Access and refresh tokens are stored in the client state
file (same location as the identity keypair). The state file should be
permission-restricted (`0600`).
4. **Token refresh:** Client detects `TOKEN_EXPIRED` errors and uses the refresh
token to obtain a new access token without re-authenticating.
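
The refresh step can be sketched as a wrapper that retries a call once after a `TOKEN_EXPIRED` error. The `Tokens` struct, the `RpcError` enum, and the closure-based `call_with_refresh` helper are assumptions for illustration; the real client is async and talks to `NodeService` over Cap'n Proto.

```rust
#[derive(Debug, PartialEq)]
enum RpcError {
    TokenExpired,
    Other(String),
}

/// Hypothetical client-side token holder (persisted in the state file).
struct Tokens {
    access: Vec<u8>,
    refresh: Vec<u8>,
}

/// Run `call` with the current access token. On `TokenExpired`, obtain a new
/// access token via `refresh` and retry exactly once.
fn call_with_refresh<T>(
    tokens: &mut Tokens,
    mut call: impl FnMut(&[u8]) -> Result<T, RpcError>,
    mut refresh: impl FnMut(&[u8]) -> Result<Vec<u8>, RpcError>,
) -> Result<T, RpcError> {
    let first = call(&tokens.access);
    match first {
        Err(RpcError::TokenExpired) => {
            // Swap in the new access token; the refresh token is unchanged.
            tokens.access = refresh(&tokens.refresh)?;
            call(&tokens.access)
        }
        other => other,
    }
}
```

Retrying only once avoids a loop when the refresh token itself is no longer valid; in that case the second error is returned and the client falls back to a full login.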
### RPC Integration
Every RPC call includes the `Auth` struct:
```rust
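// Pseudocode for client RPC calls: build the Auth value from the tokens
// held in client state and pass it as the first argument of every call.
let auth = Auth {
    version: 1,                               // token-based auth
    access_token: state.access_token.clone(), // opaque bearer token
    device_id: Some(state.device_id),         // optional: per-device limits and audit
};
node_service.enqueue(auth, recipient_key, channel_id, payload).await?;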
```
### Identity Binding
At registration, the client's Ed25519 public key is bound to the new account.
The client must refuse to upload KeyPackages if the local identity key does not
match the bound key -- this prevents accidental identity confusion after key
rotation.
---
## Compatibility
### Wire Version Field
The `Auth` struct includes its own `version` field, independent of the delivery
message version. This allows auth changes to evolve separately from the delivery
protocol.
### Legacy Support
- `version == 0`: No auth. Server behaviour is configurable:
  - **Development:** Allow legacy calls (default for `cargo run`).
  - **Production:** Reject legacy calls (default for Docker deployment).
- `version == 1`: Full auth. This is the target for M4+.
### N-1 Integration Tests
Compatibility testing covers:
- New client (v1 auth) against new server -- expected: full auth flow works.
- Old client (v0 legacy) against new server in dev mode -- expected: legacy
calls succeed.
- Old client (v0 legacy) against new server in prod mode -- expected: clean
rejection with `AUTHENTICATION_REQUIRED`.
- New client (v1 auth) against an old server that predates the `Auth` struct --
expected: the server ignores the unknown `Auth` fields; operations succeed if
the server does not enforce auth.
---
## Implementation Sequence
1. Extend Cap'n Proto schemas with the `Auth` struct and add it to all
`NodeService` methods.
2. Implement token validation middleware in server RPC handlers; add an in-memory
token store (upgradeable to SQLite at M6).
3. Bind `identityKey` to account on upload; enforce on fetch/enqueue.
4. Add tests: unit tests for token validation; integration tests for auth
success and failure paths.
5. Add rate limiting middleware with configurable thresholds.
6. Add audit logging for all auth-related events.
---
## Cross-references
- [Milestones](milestones.md) -- M4 and M6 deliverables
- [Production Readiness WBS](production-readiness.md) -- Phase 3 (Auth/Device/Server Hardening)
- [1:1 Channel Design](dm-channels.md) -- channel-level authz
- [Wire Format: NodeService Schema](../wire-format/node-service-schema.md) -- RPC schema
- [Coding Standards](../contributing/coding-standards.md) -- security-by-design requirements