Full codebase review by 4 independent agents (security, architecture,
code quality, correctness) identified ~80 findings. This commit fixes 40
of them across all workspace crates.
Critical fixes:
- Federation service: validate origin against mTLS cert CN/SAN (C1)
- WS bridge: add DM channel auth, size limits, rate limiting (C2)
- hpke_seal: panic on error instead of silent empty ciphertext (C3)
- hpke_setup_sender_and_export: error on parse fail, no PQ downgrade (C7)
Security fixes:
- Zeroize: seed_bytes() returns Zeroizing<[u8;32]>, private_to_bytes()
returns Zeroizing<Vec<u8>>, ClientAuth.access_token, SessionState.password,
conversation hex_key all wrapped in Zeroizing
- Keystore: 0o600 file permissions on Unix
- MeshIdentity: 0o600 file permissions on Unix
- Timing floors: resolveIdentity + WS bridge resolve_user get 5ms floor
- Mobile: TLS verification gated behind insecure-dev feature flag
- Proto: from_bytes default limit tightened from 64 MiB to 8 MiB
Correctness fixes:
- fetch_wait: register waiter before fetch to close TOCTOU window
- MeshEnvelope: exclude hop_count from signature (forwarding no longer
invalidates sender signature)
- BroadcastChannel: encrypt returns Result instead of panicking
- transcript: rename verify_transcript_chain → validate_transcript_structure
- group.rs: extract shared process_incoming() for receive_message variants
- auth_ops: remove spurious RegistrationRequest deserialization
- MeshStore.seen: bounded to 100K with FIFO eviction
Quality fixes:
- FFI error classification: typed downcast instead of string matching
- Plugin HookVTable: SAFETY documentation for unsafe Send+Sync
- clippy::unwrap_used: warn → deny workspace-wide
- Various .unwrap_or("") → proper error returns
Review report: docs/REVIEW-2026-03-04.md
152 tests passing (72 core + 35 server + 14 E2E + 1 doctest + 30 P2P)
318 lines
17 KiB
Markdown
318 lines
17 KiB
Markdown
# Consolidated Codebase Review — quicproquo
|
|
**Date:** 2026-03-04
|
|
**Reviewers:** 4 independent agents (security, architecture, code quality, correctness)
|
|
**Scope:** Full codebase — all workspace crates, schemas, Cargo.toml
|
|
|
|
---
|
|
|
|
## CRITICAL (7 findings)
|
|
|
|
### C1. Federation service has NO authentication on inbound requests
|
|
**Source:** Security | **File:** `crates/quicproquo-server/src/federation/service.rs:22-201`
|
|
`FederationServiceImpl` handles inbound federation requests (`relay_enqueue`, `relay_batch_enqueue`, `proxy_fetch_key_package`, `proxy_resolve_user`) but performs **zero authentication** on the caller. The `auth` field in the request is only logged (`origin` string), never validated. While mTLS protects the transport, any server with a valid federation certificate can inject arbitrary messages, enumerate users, and fetch KeyPackages. The `FederationAuth.origin` field is a self-declared string, not verified against the mTLS certificate's subject.
|
|
**Fix:** Validate `origin` against the mTLS client certificate's CN/SAN. Enforce per-peer rate limits. Consider signing federation messages at the application layer.
|
|
|
|
### C2. WebSocket bridge bypasses DM channel authorization
|
|
**Source:** Security | **File:** `crates/quicproquo-server/src/ws_bridge.rs:230-305`
|
|
The `handle_send` function lets any authenticated user enqueue a message to any recipient, bypassing the DM channel membership verification that the Cap'n Proto `enqueue` path enforces in `delivery.rs:93-135`. The WS bridge calls `store.enqueue()` directly, skipping channel membership auth, payload size limits (5 MB), rate limiting, hook invocations, delivery proof generation, and audit logging.
|
|
**Fix:** Apply the same authorization, size limits, rate limiting, hooks, and audit logging as the Cap'n Proto delivery path.
|
|
|
|
### C3. `hpke_seal` silently returns empty ciphertext on error
|
|
**Source:** Quality | **File:** `crates/quicproquo-core/src/hybrid_crypto.rs:198-201,216-219`
|
|
`HybridCrypto::hpke_seal()` catches errors and returns `Ok(vec![])` instead of propagating. Empty ciphertexts are sent as if valid — data loss and security issue.
|
|
**Fix:** Propagate errors via `Result`.
|
|
|
|
### C4. NodeServiceImpl god object (15 fields, 27 RPC methods, no capability segmentation)
|
|
**Source:** Architecture | **File:** `crates/quicproquo-server/src/node_service/mod.rs:253-336`
|
|
Single struct implements 27 methods spanning auth, delivery, key management, channels, blobs, devices, federation, and account lifecycle. `handle_node_connection` takes 15 parameters. Unauthenticated clients get capability to invoke all 27 methods.
|
|
**Fix:** Extract `ServerContext` struct. Split Cap'n Proto schema into capability interfaces (AuthService, DeliveryService, etc.) vended after auth.
|
|
|
|
### C5. FileBackedStore O(n) full-map serialization on every mutation
|
|
**Source:** Architecture | **File:** `crates/quicproquo-server/src/storage.rs:327-442`
|
|
Every write locks Mutex, mutates HashMap, serializes **entire** map to disk via `fs::write`. No fsync, no atomic rename. Performance cliff and data-loss vector.
|
|
**Fix:** Make SqlStore the default. If FileBackedStore remains for dev, use write-to-temp-then-rename.
|
|
|
|
### C6. `std::sync::Mutex` in async context (server and P2P)
|
|
**Source:** Architecture | **Files:** `storage.rs:1-7,25-28`, `sql_store.rs:4,50`, `node_service/mod.rs:272`
|
|
Holding `std::sync::Mutex` across disk I/O blocks Tokio worker threads, causing head-of-line blocking.
|
|
**Fix:** Replace with `tokio::sync::Mutex` or use `spawn_blocking` for disk I/O.
|
|
|
|
### C7. `hpke_setup_sender_and_export` silently downgrades to classical crypto on parse error
|
|
**Source:** Security | **File:** `crates/quicproquo-core/src/hybrid_crypto.rs:263`
|
|
On invalid hybrid public key, silently falls back to classical RustCrypto provider. Malicious server can force PQ downgrade.
|
|
**Fix:** Return error on hybrid key parse failure, consistent with `hpke_seal`.
|
|
|
|
---
|
|
|
|
## HIGH (14 findings)
|
|
|
|
### H1. Global `AUTH_CONTEXT: RwLock<Option<ClientAuth>>`
|
|
**Source:** Architecture + Quality | **File:** `crates/quicproquo-client/src/lib.rs:36,40,82`
|
|
Blocks multi-account, creates hidden coupling, root cause of `AUTH_LOCK` test serialization hack.
|
|
**Fix:** Replace with `ClientContext` struct passed to all functions.
|
|
|
|
### H2. Store trait is 30+ method monolith
|
|
**Source:** Architecture | **File:** `crates/quicproquo-server/src/storage.rs:33-180`
|
|
Any new storage backend must implement every method. Cannot be composed or partially implemented.
|
|
**Fix:** Split into sub-traits: `KeyPackageStore`, `DeliveryStore`, `UserStore`, `ChannelStore`, etc.
|
|
|
|
### H3. CoreError::Mls wraps errors as String, losing type info
|
|
**Source:** Architecture | **File:** `crates/quicproquo-core/src/error.rs:16-17`
|
|
Impossible to match on specific MLS error conditions.
|
|
**Fix:** Create MLS sub-error variants or wrap boxed error.
|
|
|
|
### H4. Proto `from_bytes` uses default 64 MiB traversal limit
|
|
**Source:** Architecture | **File:** `crates/quicproquo-proto/src/lib.rs:67-72`
|
|
DoS amplification vector for direct callers (client, bot, FFI).
|
|
**Fix:** Accept `ReaderOptions` as parameter, make default stricter.
|
|
|
|
### H5. Mobile crate hardcodes SkipServerVerification
|
|
**Source:** Architecture + Security | **File:** `crates/quicproquo-mobile/src/lib.rs:93-100,165-172`
|
|
Unconditionally skips TLS verification. Inherently MITM-vulnerable.
|
|
**Fix:** Add certificate_verifier parameter or feature flag.
|
|
|
|
### H6. Duplicate InsecureServerCertVerifier implementations
|
|
**Source:** Architecture | **Files:** `client/rpc.rs:27-29`, `mobile/lib.rs:165-167`
|
|
**Fix:** Consolidate into shared crate behind `cfg(feature = "insecure")`.
|
|
|
|
### H7. DiskKeyStore writes HPKE private keys to disk unencrypted
|
|
**Source:** Security | **File:** `crates/quicproquo-core/src/keystore.rs`
|
|
No encryption, no file permissions. HPKE private keys are MLS epoch secrets.
|
|
**Fix:** Encrypt with Argon2id + ChaCha20-Poly1305. Set 0o600 permissions.
|
|
|
|
### H8. `identity.rs:seed_bytes()` returns unzeroized copy of secret seed
|
|
**Source:** Security | **File:** `crates/quicproquo-core/src/identity.rs:52`
|
|
Copies 32-byte Ed25519 seed out of `Zeroizing` wrapper. Returned value not zeroized.
|
|
**Fix:** Return `&[u8]` reference or `Zeroizing<[u8; 32]>`.
|
|
|
|
### H9. `hybrid_kem.rs:private_to_bytes()` returns unzeroized `Vec<u8>`
|
|
**Source:** Security | **File:** `crates/quicproquo-core/src/hybrid_kem.rs:162`
|
|
Hybrid private key material lingers in memory.
|
|
**Fix:** Return `Zeroizing<Vec<u8>>`.
|
|
|
|
### H10. MeshIdentity stores Ed25519 seed as plaintext JSON
|
|
**Source:** Security | **File:** `crates/quicproquo-p2p/src/identity.rs:72-79`
|
|
No encryption, no file permissions.
|
|
**Fix:** Encrypt identity file, set 0o600 permissions.
|
|
|
|
### H11. WebSocket bridge has rate_limits field but never checks it
|
|
**Source:** Security | **File:** `crates/quicproquo-server/src/ws_bridge.rs:28-36`
|
|
**Fix:** Apply `check_rate_limit()` in all WS bridge handlers.
|
|
|
|
### H12. ~100 lines duplicated between `receive_message` and `receive_message_with_sender`
|
|
**Source:** Quality | **File:** `crates/quicproquo-core/src/group.rs:471-583`
|
|
Bug fix in one must be manually replicated to other. Security-critical MLS code.
|
|
**Fix:** Extract shared MLS message processing helper.
|
|
|
|
### H13. Synchronous file I/O in async blob handler
|
|
**Source:** Quality | **File:** `crates/quicproquo-server/src/node_service/blob_ops.rs:124-137`
|
|
Blocking `std::fs` calls in async handler stall event loop.
|
|
**Fix:** Use `tokio::fs` or `spawn_blocking`.
|
|
|
|
### H14. `MeshEnvelope::forwarded()` invalidates signature without re-signing
|
|
**Source:** Quality + Correctness + Security | **File:** `crates/quicproquo-p2p/src/envelope.rs:172-176`
|
|
Increments `hop_count` included in signed bytes. All forwarded envelopes fail `verify()`.
|
|
**Fix:** Exclude `hop_count` from signature, or add separate forwarding signature.
|
|
|
|
---
|
|
|
|
## MEDIUM (19 findings)
|
|
|
|
### M1. fetch_wait TOCTOU: missed notification window
|
|
**Source:** Correctness | **File:** `crates/quicproquo-server/src/node_service/delivery.rs:496-522`
|
|
TOCTOU between initial fetch (empty) and waiter registration. Enqueue between these points fires notify before waiter exists.
|
|
**Fix:** Register waiter before initial fetch.
|
|
|
|
### M2. `verify_transcript_chain` never checks hashes — misleading name
|
|
**Source:** Correctness | **File:** `crates/quicproquo-core/src/transcript.rs:215-251`
|
|
Only validates structural integrity, not hash chain. Name implies verification.
|
|
**Fix:** Rename to `validate_transcript_structure` or implement actual chain verification.
|
|
|
|
### M3. Non-atomic file writes in FileBackedStore
|
|
**Source:** Correctness | **File:** `crates/quicproquo-server/src/storage.rs:332-337` (all flush_* methods)
|
|
`fs::write` directly — crash mid-write corrupts file, loses all data.
|
|
**Fix:** Use `tempfile::NamedTempFile` + `persist()`.
|
|
|
|
### M4. `delete_account` non-atomic multi-lock
|
|
**Source:** Correctness | **File:** `crates/quicproquo-server/src/storage.rs:800-864`
|
|
6 sequential Mutex locks. Concurrent fetch could see partially deleted account.
|
|
**Fix:** Use single transaction or hold all locks simultaneously.
|
|
|
|
### M5. Timing side channel in `resolveIdentity` — no timing floor
|
|
**Source:** Security | **File:** `crates/quicproquo-server/src/node_service/user_ops.rs:142-178`
|
|
Unlike `resolveUser` which has 5ms floor.
|
|
**Fix:** Apply same `RESOLVE_TIMING_FLOOR`.
|
|
|
|
### M6. WS bridge `resolve_user` has no timing floor
|
|
**Source:** Security | **File:** `crates/quicproquo-server/src/ws_bridge.rs:158-181`
|
|
**Fix:** Add same timing floor as Cap'n Proto handler.
|
|
|
|
### M7. `AuthContext.token` not zeroized
|
|
**Source:** Security | **File:** `crates/quicproquo-server/src/auth.rs:68-72`
|
|
**Fix:** Wrap in `Zeroizing<Vec<u8>>`.
|
|
|
|
### M8. Client `ClientAuth.access_token` not zeroized
|
|
**Source:** Security | **File:** `crates/quicproquo-client/src/lib.rs:50-55`
|
|
**Fix:** Use `Zeroizing<Vec<u8>>`.
|
|
|
|
### M9. `SessionState.password` stores plaintext password in memory
|
|
**Source:** Security | **File:** `crates/quicproquo-client/src/client/session.rs:29`
|
|
**Fix:** Use `Zeroizing<String>`, derive key at startup and zeroize password.
|
|
|
|
### M10. `conversation.rs:172` hex-encodes derived key without zeroization
|
|
**Source:** Security | **File:** `crates/quicproquo-client/src/client/conversation.rs:172`
|
|
**Fix:** Use `Zeroizing<String>` for `hex_key`.
|
|
|
|
### M11. `device_ops.rs:49` uses `.unwrap_or("")` on untrusted input
|
|
**Source:** Security | **File:** `crates/quicproquo-server/src/node_service/device_ops.rs:49`
|
|
**Fix:** Return error for invalid UTF-8.
|
|
|
|
### M12. `MeshEnvelope::forwarded()` invalidates signature (duplicate of H14)
|
|
**Source:** Security | **File:** `crates/quicproquo-p2p/src/envelope.rs:172-176`
|
|
|
|
### M13. FileBackedStore `create_channel` O(n) linear scan
|
|
**Source:** Architecture + Quality + Correctness | **File:** `crates/quicproquo-server/src/storage.rs:749-765`
|
|
**Fix:** Secondary index or deterministic channel ID from member pair hash.
|
|
|
|
### M14. `resolve_identity_key` O(n) linear scan
|
|
**Source:** Architecture + Quality | **File:** `crates/quicproquo-server/src/storage.rs:676-684`
|
|
**Fix:** Maintain reverse map.
|
|
|
|
### M15. FFI error classification by string matching
|
|
**Source:** Architecture | **File:** `crates/quicproquo-ffi/src/lib.rs:183`
|
|
**Fix:** Match on typed error variants.
|
|
|
|
### M16. Documentation drift: master-prompt.md says Noise/TCP, code uses QUIC/TLS
|
|
**Source:** Architecture | **Files:** `master-prompt.md`, server/client `Cargo.toml`
|
|
**Fix:** Update master-prompt.md to reflect actual transport.
|
|
|
|
### M17. Plugin `HookVTable` unsafe Send+Sync without safety docs
|
|
**Source:** Architecture | **File:** `crates/quicproquo-plugin-api/src/lib.rs:190-192`
|
|
**Fix:** Add `// SAFETY:` documentation blocks.
|
|
|
|
### M18. OPAQUE register_finish: spurious RegistrationRequest deserialization
|
|
**Source:** Quality + Correctness | **File:** `crates/quicproquo-server/src/node_service/auth_ops.rs:335-343`
|
|
Dead code — deserializes upload_bytes as wrong type first.
|
|
**Fix:** Remove lines 335-343.
|
|
|
|
### M19. Mixed serialization formats in DiskKeyStore (bincode + serde_json)
|
|
**Source:** Quality | **File:** `crates/quicproquo-core/src/keystore.rs`
|
|
**Fix:** Standardize on one format.
|
|
|
|
---
|
|
|
|
## LOW (23 findings)
|
|
|
|
### L1. `BroadcastChannel.key` not zeroized on drop
|
|
**File:** `crates/quicproquo-p2p/src/broadcast.rs:18-19`
|
|
|
|
### L2. Plugin loader `CStr::from_ptr` on plugin-returned string (UB risk)
|
|
**File:** `crates/quicproquo-server/src/plugin_loader.rs:102`
|
|
|
|
### L3. Token cache stores session token as plaintext hex when no password set
|
|
**File:** `crates/quicproquo-client/src/client/token_cache.rs:63-68`
|
|
|
|
### L4. `--password` CLI flag visible in process list
|
|
**File:** `crates/quicproquo-client/src/main.rs:104`
|
|
|
|
### L5. `clippy::unwrap_used` is warn not deny
|
|
**File:** `Cargo.toml:88`
|
|
|
|
### L6. `strip = "symbols"` hinders post-mortem debugging
|
|
**File:** `Cargo.toml:94`
|
|
|
|
### L7. `thread_rng()` for channel ID generation instead of `OsRng`
|
|
**Files:** `storage.rs:760`, `sql_store.rs:545`
|
|
|
|
### L8. `conversation.rs:548` — `limit` cast from `usize` to `u32` without saturation
|
|
**File:** `crates/quicproquo-client/src/client/conversation.rs:548`
|
|
|
|
### L9. `conversation.rs:363-365,420-422` — bincode deserialize errors silently dropped
|
|
**File:** `crates/quicproquo-client/src/client/conversation.rs`
|
|
|
|
### L10. `repl.rs:610` — static AtomicU64 for padding timer instead of session state
|
|
**File:** `crates/quicproquo-client/src/client/repl.rs:610`
|
|
|
|
### L11. `command_engine.rs:148` — `to_slash()` clones entire Command enum unnecessarily
|
|
**File:** `crates/quicproquo-client/src/client/command_engine.rs:148`
|
|
|
|
### L12. `conversation.rs:201-203` — SQL ATTACH with format string
|
|
**File:** `crates/quicproquo-client/src/client/conversation.rs:201-203`
|
|
|
|
### L13. Client `hex.rs` trivial wrapper with zero value-add
|
|
**File:** `crates/quicproquo-client/src/client/hex.rs`
|
|
|
|
### L14. `config.rs` `#[allow(dead_code)]` on EffectiveFederationConfig
|
|
**File:** `crates/quicproquo-server/src/config.rs`
|
|
|
|
### L15. `federation/address.rs` `#[allow(dead_code)]` on entire module
|
|
**File:** `crates/quicproquo-server/src/federation/address.rs`
|
|
|
|
### L16. ANSI escape codes hardcoded without terminal capability detection
|
|
**File:** `crates/quicproquo-client/src/client/display.rs`
|
|
|
|
### L17. GUI `lib.rs:75` `.expect()` on Tauri run
|
|
**File:** `crates/quicproquo-gui/src/lib.rs:75`
|
|
|
|
### L18. `session.rs:186` DiskKeyStore failure silently falls back to ephemeral
|
|
**File:** `crates/quicproquo-client/src/client/session.rs:186`
|
|
|
|
### L19. `retry.rs:37` `thread_rng()` for jitter instead of `OsRng`
|
|
**File:** `crates/quicproquo-client/src/client/retry.rs:37`
|
|
|
|
### L20. `MeshStore.seen` set grows unboundedly
|
|
**File:** `crates/quicproquo-p2p/src/` (MeshStore)
|
|
|
|
### L21. `envelope.rs:58` `.expect()` on system clock in non-test code
|
|
**File:** `crates/quicproquo-p2p/src/envelope.rs:58`
|
|
|
|
### L22. `broadcast.rs:47` `.expect()` on encryption in non-test code
|
|
**File:** `crates/quicproquo-p2p/src/broadcast.rs:47`
|
|
|
|
### L23. Bot crate hardcodes sender as "peer" (TODO)
|
|
**File:** `crates/quicproquo-bot/src/lib.rs`
|
|
|
|
---
|
|
|
|
## TESTING GAPS (6 findings)
|
|
|
|
### T1. No unit tests for `plugin_loader.rs`
|
|
### T2. No unit tests for `federation/tls.rs`
|
|
### T3. No tests for `blob_ops.rs`
|
|
### T4. No tests for `delivery.rs`
|
|
### T5. `conversation.rs` migration code untested
|
|
### T6. No negative test for `MeshEnvelope::verify()` with wrong key
|
|
|
|
---
|
|
|
|
## Strengths (positive findings from security audit)
|
|
|
|
- Constant-time token comparison via `subtle::ConstantTimeEq`
|
|
- Parameterized SQL everywhere (no injection)
|
|
- Proper Argon2id (19 MiB, t=2, p=1) + ChaCha20-Poly1305 for client state
|
|
- Zero `.unwrap()` in non-test server code (grep-verified)
|
|
- TLS 1.3 only, mTLS for federation
|
|
- Rate limiting: 100 enqueues/60s, 50 connections/IP/60s
|
|
- Delivery proof signing: SHA-256(seq||recipient||timestamp) + Ed25519
|
|
- KeyPackage ciphersuite validation (only 0x0001 accepted)
|
|
- Payload size limits: 5 MB message, 50 MB blob, 1 MB KeyPackage
|
|
- Queue depth limits: 1000/inbox, 100K sessions, 100K waiters
|
|
- Blob path traversal protection via hex-encoded hash
|
|
- Audit logging with secret redaction (identity keys prefix-only)
|
|
- Production config validation (rejects devtoken, empty auth, missing TLS)
|
|
|
|
---
|
|
|
|
## Recommended fix priority
|
|
|
|
1. **Federation auth** (C1) — auth-gate inbound requests, validate origin against mTLS cert
|
|
2. **WS bridge authz** (C2) + rate limits (H11) + timing floors (M6) — parity with Cap'n Proto path
|
|
3. **Crypto error propagation** (C3, C7) — hpke_seal and hpke_setup_sender_and_export
|
|
4. **Zeroization sweep** (H8, H9, H10, M7-M10, L1) — all leaked secret material
|
|
5. **ServerContext extraction** (C4) — foundation for capability-based security
|
|
6. **FileBackedStore atomic writes** (C5, M3) — prevent data loss on crash
|
|
7. **std::sync::Mutex → tokio::sync::Mutex** (C6) — unblock Tokio workers
|
|
8. **Mobile TLS verification** (H5) — remove hardcoded skip
|
|
9. **fetch_wait TOCTOU** (M1) — register waiter before fetch
|
|
10. **Testing gaps** (T1-T6) — critical untested paths
|