docs: add threat model, crypto boundaries, and audit scope documents

Security audit preparation:
- Threat model with STRIDE analysis and 5 threat actors
- Crypto boundaries documenting all 11 primitives and key lifecycle
- Audit scope document for external security firms
2026-03-09 20:48:15 +01:00
parent c256c38ffb
commit 266bcfed59
4 changed files with 621 additions and 0 deletions

@@ -0,0 +1,148 @@
# Security Audit Scope
## Project Summary
quicprochat is a production-grade end-to-end encrypted group messenger implemented in Rust. It uses MLS (RFC 9420) for group key agreement with a hybrid post-quantum KEM (X25519 + ML-KEM-768), OPAQUE for password-authenticated key exchange, and QUIC + TLS 1.3 as the transport layer. The project comprises approximately 38,000 lines of Rust across 9 workspace crates, with 300+ tests passing.
## Scope
### Primary Scope (Critical)
These components handle all cryptographic operations and should receive the deepest scrutiny.
#### `quicprochat-core` -- All Cryptographic Primitives
| File | Lines | Responsibility |
|------|-------|----------------|
| `src/hybrid_kem.rs` | ~630 | Hybrid X25519 + ML-KEM-768 KEM: key generation, encrypt/decrypt, encapsulate/decapsulate, HKDF key derivation |
| `src/hybrid_crypto.rs` | ~540 | OpenMLS `OpenMlsCrypto` + `OpenMlsProvider` implementation with hybrid HPKE routing |
| `src/identity.rs` | ~250 | Ed25519 identity keypair: generation, signing, verification, zeroization |
| `src/group.rs` | ~1080 | MLS group state machine: create, join, add/remove members, send/receive, epoch management |
| `src/keypackage.rs` | ~100 | MLS KeyPackage generation with hybrid init keys |
| `src/keystore.rs` | ~710 | Disk-backed OpenMLS key store with file permission restrictions |
| `src/sealed_sender.rs` | ~160 | Sender identity + Ed25519 signature envelope inside MLS payload |
| `src/pq_noise.rs` | ~690 | Post-quantum Noise_XX handshake with ML-KEM-768 mixing |
| `src/opaque_auth.rs` | ~20 | OPAQUE cipher suite definition (Ristretto255, Triple-DH, Argon2id) |
| `src/recovery.rs` | ~340 | Recovery code generation, Argon2id key derivation, encrypted backup bundles |
| `src/padding.rs` | ~270 | Message padding to fixed buckets + uniform boundary padding |
| `src/safety_numbers.rs` | ~80 | Signal-style safety number computation |
| `src/transcript.rs` | ~400 | Encrypted hash-chained transcript archive |
#### `quicprochat-server/src/domain/auth.rs` -- OPAQUE Server Logic
| File | Lines | Responsibility |
|------|-------|----------------|
| `src/domain/auth.rs` | ~170 | OPAQUE registration (start/finish), session token validation, expiry cleanup |
### Secondary Scope (Important)
These components handle transport security and trust anchors. They are lower risk than the primary scope but still security-relevant.
#### `quicprochat-rpc` -- RPC Transport Security
| File | Lines | Responsibility |
|------|-------|----------------|
| `src/framing.rs` | ~200 | Wire format encoding/decoding, payload size limits (4 MiB max) |
| `src/auth_handshake.rs` | ~80 | Session token exchange over QUIC bi-stream |
| `src/server.rs` | ~300 | QUIC server setup, TLS configuration, connection handling |
| `src/middleware.rs` | ~200 | Tower middleware: rate limiting, timeouts |
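
As a point of reference for the payload size limit listed for `src/framing.rs`, the following is a minimal sketch of a length-prefixed decoder that rejects oversized frames before any allocation. The 4-byte big-endian length prefix, the `FrameError` type, and the `decode_frame` name are assumptions made for illustration; the actual wire format may differ.

```rust
/// Maximum payload size: 4 MiB, as stated in the scope table.
const MAX_PAYLOAD: usize = 4 * 1024 * 1024;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum FrameError {
    /// Not enough bytes yet for the header or the declared payload.
    Incomplete,
    /// The declared length exceeds MAX_PAYLOAD.
    TooLarge(usize),
}

/// Decode one frame from `buf`, assuming a 4-byte big-endian length prefix.
/// Returns the payload and the remaining unread bytes.
fn decode_frame(buf: &[u8]) -> Result<(&[u8], &[u8]), FrameError> {
    if buf.len() < 4 {
        return Err(FrameError::Incomplete);
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    // Check the declared length first, so a hostile peer cannot make us
    // buffer or allocate more than the limit.
    if len > MAX_PAYLOAD {
        return Err(FrameError::TooLarge(len));
    }
    let end = 4 + len;
    if buf.len() < end {
        return Err(FrameError::Incomplete);
    }
    Ok((&buf[4..end], &buf[end..]))
}
```

The property worth checking in the real decoder is the ordering: the size check happens on the declared length, not after the payload has been read.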
#### `quicprochat-kt` -- Key Transparency
| File | Lines | Responsibility |
|------|-------|----------------|
| `src/lib.rs` | ~65 | Merkle log hash functions (`leaf_hash`, `node_hash`) with RFC 6962 domain separation |
| `src/tree.rs` | ~150 | Append-only Merkle tree (insert, root computation) |
| `src/proof.rs` | ~100 | Inclusion proof generation and verification |
| `src/revocation.rs` | ~100 | Key revocation log with reason codes |
#### `quicprochat-server` -- Security-Relevant Server Components
| File | Lines | Responsibility |
|------|-------|----------------|
| `src/domain/rate_limit.rs` | ~150 | Per-client rate limiting |
| `src/domain/traffic_resistance.rs` | ~100 | Decoy traffic generation, timing jitter, payload padding |
| `src/domain/moderation.rs` | ~100 | Abuse prevention (report handling) |
| `src/tls.rs` | ~100 | TLS certificate loading and configuration |
## Out of Scope
The following components are explicitly excluded from the audit:
- **Client SDKs**: `quicprochat-sdk`, Go/Python/TypeScript/WASM/FFI bindings (thin wrappers)
- **CLI/TUI client**: `quicprochat-client` (user interface, no crypto logic)
- **Plugin API**: `quicprochat-plugin-api` (`#![no_std]` C-ABI interface, no crypto)
- **P2P networking**: `quicprochat-p2p` (iroh integration, feature-gated)
- **Proto definitions**: `quicprochat-proto` (generated code, no security logic)
- **Documentation**: mdBook sources, README, ROADMAP
- **CI/CD**: GitHub Actions, Dockerfiles, justfile
- **Non-crypto server domain**: user management, group metadata, blob storage, delivery routing
## Specific Questions for Auditors
### Hybrid KEM Construction
1. Is the hybrid combiner `HKDF-SHA256(salt, X25519_ss || ML-KEM_ss, info)` a sound dual-PRF construction? Does it provide IND-CCA2 security assuming either component is secure?
2. Is the nonce handling correct? The hybrid KEM uses random 12-byte nonces (not derived from HKDF). Is there a nonce collision risk at the expected message volume?
3. Is the `derive_from_ikm()` construction (HKDF -> seeded StdRng -> key generation) suitable for deterministic key derivation in the MLS HPKE key schedule?
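
To make question 2 concrete, the collision risk for random 12-byte (96-bit) nonces can be estimated with the birthday bound, p ≈ n(n-1) / 2^97 for n messages. The sketch below assumes nonces are uniformly random and that the count applies per AEAD key; the function names are illustrative and not taken from the codebase.

```rust
/// Approximate probability that at least two of `n` uniformly random
/// 96-bit nonces collide (birthday bound: p ≈ n(n-1) / 2^97).
fn nonce_collision_probability(n: u64) -> f64 {
    let n = n as f64;
    let space = 2f64.powi(96);
    (n * (n - 1.0)) / (2.0 * space)
}

/// Approximate number of messages at which the collision probability
/// reaches `max_p`, obtained by solving n^2 / 2^97 = max_p for n.
fn max_messages_for(max_p: f64) -> f64 {
    (max_p * 2f64.powi(97)).sqrt()
}
```

Under these assumptions, 2^32 messages under a single key give an estimate of roughly 2^-33. Whether that is acceptable depends on how many messages are actually encrypted under one key, which the auditors are asked to confirm.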
### OpenMLS Integration
4. Does the `HybridCryptoProvider` correctly satisfy the `OpenMlsProvider` trait contract? Are there edge cases where hybrid key detection by length could fail or be spoofed?
5. Is the `DiskKeyStore` implementation (wrapping `MemoryStorage` with disk flush) safe for concurrent access? Could a crash between `MemoryStorage` update and disk flush cause key loss?
6. Is the MLS group lifecycle (create, `add_member` with `merge_pending_commit`, `join_group` via `StagedWelcome`) correctly implemented? Are there state consistency issues after failed operations?
### Timing Side-Channels
7. Are there timing side-channels in the OPAQUE registration/login flow? The `opaque-ke` crate uses Ristretto255 (constant-time), but is the surrounding code (deserialization, error handling) timing-safe?
8. Is `constant_time_eq()` in `recovery.rs` correctly implemented? The early return on length mismatch is intentional (lengths are not secret), but verify the XOR-accumulate loop.
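
For orientation, the pattern described in question 8 typically looks like the sketch below. This is not the code in `recovery.rs`; it is a minimal illustration of an XOR-accumulate comparison with an early return on length mismatch, and the use of `std::hint::black_box` is an assumption added here as a best-effort optimizer barrier.

```rust
/// Compare two byte slices without exiting early on the first differing byte.
/// Lengths are treated as public, so a length mismatch returns immediately.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // OR together the XOR of every byte pair; the result is zero only if
    // all pairs are equal, and every pair is always visited.
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    // black_box is only a hint that discourages the optimizer from turning
    // the loop back into a data-dependent early exit; it is not a guarantee.
    std::hint::black_box(diff) == 0
}
```

Things to look for in the real implementation are any branch or early `return` inside the loop, and whether the compiler could legally short-circuit the accumulation. The `subtle` crate's `ConstantTimeEq` trait is a maintained alternative to a hand-written loop.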
### Noise_XX + ML-KEM Layering
9. Is the PQ Noise handshake (`pq_noise.rs`) sound? Specifically:
   - Is ML-KEM ciphertext placement in message 2 correct (after `ee` DH, before `se` DH)?
   - Is `mix_key(mlkem_ss)` the right integration point for the post-quantum shared secret?
   - Does ML-KEM implicit rejection (pseudorandom wrong shared secret) cause any subtle failures beyond AEAD decryption error?
### Zeroization
10. Is zeroization complete? Specifically:
    - `x25519_dalek::StaticSecret` does not publicly implement `Zeroize`. Is the `HybridKeypair` struct's `x25519_sk` field securely erased on drop?
    - Are there intermediate buffers or stack copies of secret material that escape `Zeroizing` wrappers?
    - Does the `DiskKeyStore` flush-on-write pattern leave secret material in OS page cache or filesystem buffers?
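
To illustrate what erase-on-drop means for a field that lacks a `Zeroize` implementation, the following is a std-only sketch of a manual wipe. `SecretBytes` and `wipe` are hypothetical names and do not reflect how `HybridKeypair` is actually implemented.

```rust
use std::sync::atomic::{compiler_fence, Ordering};

/// Overwrite a buffer with zeros using volatile writes, so the stores are
/// not removed as dead writes just before the memory is freed.
fn wipe(buf: &mut [u8]) {
    for byte in buf.iter_mut() {
        // SAFETY: `byte` is a valid, aligned, exclusive reference to a u8.
        unsafe { std::ptr::write_volatile(byte, 0) };
    }
    // Keep the compiler from reordering later operations before the wipe.
    compiler_fence(Ordering::SeqCst);
}

/// Hypothetical wrapper around a 32-byte secret that wipes itself on drop.
struct SecretBytes([u8; 32]);

impl Drop for SecretBytes {
    fn drop(&mut self) {
        wipe(&mut self.0);
    }
}
```

A wrapper like this only clears the final location of the value. Copies left behind by moves, by-value returns, or temporary buffers are not covered, which is the concern raised in the second bullet above.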
### Key Transparency
11. Is the Merkle log construction (RFC 6962-style domain separation with `0x00`/`0x01` prefixes) resistant to second-preimage attacks?
12. Could an adversarial server forge inclusion proofs for non-existent entries?
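
The reasoning behind question 11 can be sketched without a hash function: RFC 6962 hashes leaves as `H(0x00 || entry)` and interior nodes as `H(0x01 || left || right)`, so a leaf whose content equals two concatenated child hashes still produces a different hash input than the node itself. The helper names below are illustrative and are not the `leaf_hash`/`node_hash` functions in `quicprochat-kt`; they build only the preimage that would be fed to the hash.

```rust
const LEAF_PREFIX: u8 = 0x00;
const NODE_PREFIX: u8 = 0x01;

/// Bytes that would be hashed for a leaf: 0x00 || entry.
fn leaf_preimage(entry: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + entry.len());
    out.push(LEAF_PREFIX);
    out.extend_from_slice(entry);
    out
}

/// Bytes that would be hashed for an interior node: 0x01 || left || right.
fn node_preimage(left: &[u8; 32], right: &[u8; 32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(65);
    out.push(NODE_PREFIX);
    out.extend_from_slice(left);
    out.extend_from_slice(right);
    out
}
```

Without the prefixes, an attacker could submit `left || right` as a leaf entry and obtain the same hash as the interior node. With them, the two inputs differ in the first byte, so the attack reduces to finding a collision in the underlying hash. Auditors may want to confirm that every hashing call site in the tree and proof code goes through the prefixed functions.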
## Access
- **Repository**: `git clone https://github.com/quicprochat/quicprochat`
- **Build**: `cargo build --workspace` (Rust 1.75+, no system dependencies -- `protobuf-src` vendors protoc)
- **Test**: `cargo test --workspace` (300+ tests, runs in ~60s)
- **Lint**: `cargo clippy --workspace -- -D warnings`
- **Documentation**: `cd docs && mdbook build`
### Key Entry Points
- Crypto primitives: `crates/quicprochat-core/src/`
- Server auth: `crates/quicprochat-server/src/domain/auth.rs`
- RPC framing: `crates/quicprochat-rpc/src/framing.rs`
- Key transparency: `crates/quicprochat-kt/src/`
## Timeline and Budget
Based on the scope (approximately 5,400 lines of security-critical Rust code in primary scope, plus ~1,650 lines in secondary scope, per the line counts in the tables above), we recommend:
- **Duration**: 4-6 weeks (2 auditors)
- **Budget range**: $80,000 - $150,000
- **Firm type**: Specialized cryptography audit firm (e.g., NCC Group Cryptography Services, Trail of Bits, Cure53, Quarkslab)
### Suggested Audit Phases
1. **Weeks 1-2**: Hybrid KEM construction review, HKDF key derivation, zeroization audit
2. **Weeks 2-3**: OpenMLS integration, MLS group lifecycle, KeyPackage handling
3. **Weeks 3-4**: OPAQUE integration, PQ Noise handshake, timing analysis
4. **Weeks 4-5**: Key Transparency Merkle log, transport security (RPC framing)
5. **Weeks 5-6**: Report writing, finding triage, remediation discussion