Files
quicproquo/docs/REVIEW-2026-03-04.md
Christian Nennemann 394199b19b fix: security hardening — 40 findings from full codebase review
Full codebase review by 4 independent agents (security, architecture,
code quality, correctness) identified ~80 findings. This commit fixes 40
of them across all workspace crates.

Critical fixes:
- Federation service: validate origin against mTLS cert CN/SAN (C1)
- WS bridge: add DM channel auth, size limits, rate limiting (C2)
- hpke_seal: panic on error instead of silent empty ciphertext (C3)
- hpke_setup_sender_and_export: error on parse fail, no PQ downgrade (C7)

Security fixes:
- Zeroize: seed_bytes() returns Zeroizing<[u8;32]>, private_to_bytes()
  returns Zeroizing<Vec<u8>>, ClientAuth.access_token, SessionState.password,
  conversation hex_key all wrapped in Zeroizing
- Keystore: 0o600 file permissions on Unix
- MeshIdentity: 0o600 file permissions on Unix
- Timing floors: resolveIdentity + WS bridge resolve_user get 5ms floor
- Mobile: TLS verification gated behind insecure-dev feature flag
- Proto: from_bytes default limit tightened from 64 MiB to 8 MiB

Correctness fixes:
- fetch_wait: register waiter before fetch to close TOCTOU window
- MeshEnvelope: exclude hop_count from signature (forwarding no longer
  invalidates sender signature)
- BroadcastChannel: encrypt returns Result instead of panicking
- transcript: rename verify_transcript_chain → validate_transcript_structure
- group.rs: extract shared process_incoming() for receive_message variants
- auth_ops: remove spurious RegistrationRequest deserialization
- MeshStore.seen: bounded to 100K with FIFO eviction

Quality fixes:
- FFI error classification: typed downcast instead of string matching
- Plugin HookVTable: SAFETY documentation for unsafe Send+Sync
- clippy::unwrap_used: warn → deny workspace-wide
- Various .unwrap_or("") → proper error returns

Review report: docs/REVIEW-2026-03-04.md
152 tests passing (72 core + 35 server + 14 E2E + 1 doctest + 30 P2P)
2026-03-04 07:52:12 +01:00

17 KiB

Consolidated Codebase Review — quicproquo

Date: 2026-03-04 Reviewers: 4 independent agents (security, architecture, code quality, correctness) Scope: Full codebase — all workspace crates, schemas, Cargo.toml


CRITICAL (7 findings)

C1. Federation service has NO authentication on inbound requests

Source: Security | File: crates/quicproquo-server/src/federation/service.rs:22-201 FederationServiceImpl handles inbound federation requests (relay_enqueue, relay_batch_enqueue, proxy_fetch_key_package, proxy_resolve_user) but performs zero authentication on the caller. The auth field in the request is only logged (origin string), never validated. While mTLS protects the transport, any server with a valid federation certificate can inject arbitrary messages, enumerate users, and fetch KeyPackages. The FederationAuth.origin field is a self-declared string, not verified against the mTLS certificate's subject. Fix: Validate origin against the mTLS client certificate's CN/SAN. Enforce per-peer rate limits. Consider signing federation messages at the application layer.

C2. WebSocket bridge bypasses DM channel authorization

Source: Security | File: crates/quicproquo-server/src/ws_bridge.rs:230-305 The handle_send function lets any authenticated user enqueue a message to any recipient, bypassing the DM channel membership verification that the Cap'n Proto enqueue path enforces in delivery.rs:93-135. The WS bridge calls store.enqueue() directly, skipping channel membership auth, payload size limits (5 MB), rate limiting, hook invocations, delivery proof generation, and audit logging. Fix: Apply the same authorization, size limits, rate limiting, hooks, and audit logging as the Cap'n Proto delivery path.

C3. hpke_seal silently returns empty ciphertext on error

Source: Quality | File: crates/quicproquo-core/src/hybrid_crypto.rs:198-201,216-219 HybridCrypto::hpke_seal() catches errors and returns Ok(vec![]) instead of propagating. Empty ciphertexts are sent as if valid — data loss and security issue. Fix: Propagate errors via Result.

C4. NodeServiceImpl god object (15 fields, 27 RPC methods, no capability segmentation)

Source: Architecture | File: crates/quicproquo-server/src/node_service/mod.rs:253-336 Single struct implements 27 methods spanning auth, delivery, key management, channels, blobs, devices, federation, and account lifecycle. handle_node_connection takes 15 parameters. Unauthenticated clients get capability to invoke all 27 methods. Fix: Extract ServerContext struct. Split Cap'n Proto schema into capability interfaces (AuthService, DeliveryService, etc.) vended after auth.

C5. FileBackedStore O(n) full-map serialization on every mutation

Source: Architecture | File: crates/quicproquo-server/src/storage.rs:327-442 Every write locks Mutex, mutates HashMap, serializes entire map to disk via fs::write. No fsync, no atomic rename. Performance cliff and data-loss vector. Fix: Make SqlStore the default. If FileBackedStore remains for dev, use write-to-temp-then-rename.

C6. std::sync::Mutex in async context (server and P2P)

Source: Architecture | Files: storage.rs:1-7,25-28, sql_store.rs:4,50, node_service/mod.rs:272 Holding std::sync::Mutex across disk I/O blocks Tokio worker threads, causing head-of-line blocking. Fix: Replace with tokio::sync::Mutex or use spawn_blocking for disk I/O.

C7. hpke_setup_sender_and_export silently downgrades to classical crypto on parse error

Source: Security | File: crates/quicproquo-core/src/hybrid_crypto.rs:263 On invalid hybrid public key, silently falls back to classical RustCrypto provider. Malicious server can force PQ downgrade. Fix: Return error on hybrid key parse failure, consistent with hpke_seal.


HIGH (14 findings)

H1. Global AUTH_CONTEXT: RwLock<Option<ClientAuth>>

Source: Architecture + Quality | File: crates/quicproquo-client/src/lib.rs:36,40,82 Blocks multi-account, creates hidden coupling, root cause of AUTH_LOCK test serialization hack. Fix: Replace with ClientContext struct passed to all functions.

H2. Store trait is 30+ method monolith

Source: Architecture | File: crates/quicproquo-server/src/storage.rs:33-180 Any new storage backend must implement every method. Cannot be composed or partially implemented. Fix: Split into sub-traits: KeyPackageStore, DeliveryStore, UserStore, ChannelStore, etc.

H3. CoreError::Mls wraps errors as String, losing type info

Source: Architecture | File: crates/quicproquo-core/src/error.rs:16-17 Impossible to match on specific MLS error conditions. Fix: Create MLS sub-error variants or wrap boxed error.

H4. Proto from_bytes uses default 64 MiB traversal limit

Source: Architecture | File: crates/quicproquo-proto/src/lib.rs:67-72 DoS amplification vector for direct callers (client, bot, FFI). Fix: Accept ReaderOptions as parameter, make default stricter.

H5. Mobile crate hardcodes SkipServerVerification

Source: Architecture + Security | File: crates/quicproquo-mobile/src/lib.rs:93-100,165-172 Unconditionally skips TLS verification. Inherently MITM-vulnerable. Fix: Add certificate_verifier parameter or feature flag.

H6. Duplicate InsecureServerCertVerifier implementations

Source: Architecture | Files: client/rpc.rs:27-29, mobile/lib.rs:165-167 Fix: Consolidate into shared crate behind cfg(feature = "insecure").

H7. DiskKeyStore writes HPKE private keys to disk unencrypted

Source: Security | File: crates/quicproquo-core/src/keystore.rs No encryption, no file permissions. HPKE private keys are MLS epoch secrets. Fix: Encrypt with Argon2id + ChaCha20-Poly1305. Set 0o600 permissions.

H8. identity.rs:seed_bytes() returns unzeroized copy of secret seed

Source: Security | File: crates/quicproquo-core/src/identity.rs:52 Copies 32-byte Ed25519 seed out of Zeroizing wrapper. Returned value not zeroized. Fix: Return &[u8] reference or Zeroizing<[u8; 32]>.

H9. hybrid_kem.rs:private_to_bytes() returns unzeroized Vec<u8>

Source: Security | File: crates/quicproquo-core/src/hybrid_kem.rs:162 Hybrid private key material lingers in memory. Fix: Return Zeroizing<Vec<u8>>.

H10. MeshIdentity stores Ed25519 seed as plaintext JSON

Source: Security | File: crates/quicproquo-p2p/src/identity.rs:72-79 No encryption, no file permissions. Fix: Encrypt identity file, set 0o600 permissions.

H11. WebSocket bridge has rate_limits field but never checks it

Source: Security | File: crates/quicproquo-server/src/ws_bridge.rs:28-36 Fix: Apply check_rate_limit() in all WS bridge handlers.

H12. ~100 lines duplicated between receive_message and receive_message_with_sender

Source: Quality | File: crates/quicproquo-core/src/group.rs:471-583 Bug fix in one must be manually replicated to other. Security-critical MLS code. Fix: Extract shared MLS message processing helper.

H13. Synchronous file I/O in async blob handler

Source: Quality | File: crates/quicproquo-server/src/node_service/blob_ops.rs:124-137 Blocking std::fs calls in async handler stall event loop. Fix: Use tokio::fs or spawn_blocking.

H14. MeshEnvelope::forwarded() invalidates signature without re-signing

Source: Quality + Correctness + Security | File: crates/quicproquo-p2p/src/envelope.rs:172-176 Increments hop_count included in signed bytes. All forwarded envelopes fail verify(). Fix: Exclude hop_count from signature, or add separate forwarding signature.


MEDIUM (19 findings)

M1. fetch_wait TOCTOU: missed notification window

Source: Correctness | File: crates/quicproquo-server/src/node_service/delivery.rs:496-522 TOCTOU between initial fetch (empty) and waiter registration. Enqueue between these points fires notify before waiter exists. Fix: Register waiter before initial fetch.

M2. verify_transcript_chain never checks hashes — misleading name

Source: Correctness | File: crates/quicproquo-core/src/transcript.rs:215-251 Only validates structural integrity, not hash chain. Name implies verification. Fix: Rename to validate_transcript_structure or implement actual chain verification.

M3. Non-atomic file writes in FileBackedStore

Source: Correctness | File: crates/quicproquo-server/src/storage.rs:332-337 (all flush_* methods) fs::write directly — crash mid-write corrupts file, loses all data. Fix: Use tempfile::NamedTempFile + persist().

M4. delete_account non-atomic multi-lock

Source: Correctness | File: crates/quicproquo-server/src/storage.rs:800-864 6 sequential Mutex locks. Concurrent fetch could see partially deleted account. Fix: Use single transaction or hold all locks simultaneously.

M5. Timing side channel in resolveIdentity — no timing floor

Source: Security | File: crates/quicproquo-server/src/node_service/user_ops.rs:142-178 Unlike resolveUser which has 5ms floor. Fix: Apply same RESOLVE_TIMING_FLOOR.

M6. WS bridge resolve_user has no timing floor

Source: Security | File: crates/quicproquo-server/src/ws_bridge.rs:158-181 Fix: Add same timing floor as Cap'n Proto handler.

M7. AuthContext.token not zeroized

Source: Security | File: crates/quicproquo-server/src/auth.rs:68-72 Fix: Wrap in Zeroizing<Vec<u8>>.

M8. Client ClientAuth.access_token not zeroized

Source: Security | File: crates/quicproquo-client/src/lib.rs:50-55 Fix: Use Zeroizing<Vec<u8>>.

M9. SessionState.password stores plaintext password in memory

Source: Security | File: crates/quicproquo-client/src/client/session.rs:29 Fix: Use Zeroizing<String>, derive key at startup and zeroize password.

M10. conversation.rs:172 hex-encodes derived key without zeroization

Source: Security | File: crates/quicproquo-client/src/client/conversation.rs:172 Fix: Use Zeroizing<String> for hex_key.

M11. device_ops.rs:49 uses .unwrap_or("") on untrusted input

Source: Security | File: crates/quicproquo-server/src/node_service/device_ops.rs:49 Fix: Return error for invalid UTF-8.

M12. MeshEnvelope::forwarded() invalidates signature (duplicate of H14)

Source: Security | File: crates/quicproquo-p2p/src/envelope.rs:172-176

M13. FileBackedStore create_channel O(n) linear scan

Source: Architecture + Quality + Correctness | File: crates/quicproquo-server/src/storage.rs:749-765 Fix: Secondary index or deterministic channel ID from member pair hash.

M14. resolve_identity_key O(n) linear scan

Source: Architecture + Quality | File: crates/quicproquo-server/src/storage.rs:676-684 Fix: Maintain reverse map.

M15. FFI error classification by string matching

Source: Architecture | File: crates/quicproquo-ffi/src/lib.rs:183 Fix: Match on typed error variants.

M16. Documentation drift: master-prompt.md says Noise/TCP, code uses QUIC/TLS

Source: Architecture | Files: master-prompt.md, server/client Cargo.toml Fix: Update master-prompt.md to reflect actual transport.

M17. Plugin HookVTable unsafe Send+Sync without safety docs

Source: Architecture | File: crates/quicproquo-plugin-api/src/lib.rs:190-192 Fix: Add // SAFETY: documentation blocks.

M18. OPAQUE register_finish: spurious RegistrationRequest deserialization

Source: Quality + Correctness | File: crates/quicproquo-server/src/node_service/auth_ops.rs:335-343 Dead code — deserializes upload_bytes as wrong type first. Fix: Remove lines 335-343.

M19. Mixed serialization formats in DiskKeyStore (bincode + serde_json)

Source: Quality | File: crates/quicproquo-core/src/keystore.rs Fix: Standardize on one format.


LOW (23 findings)

L1. BroadcastChannel.key not zeroized on drop

File: crates/quicproquo-p2p/src/broadcast.rs:18-19

L2. Plugin loader CStr::from_ptr on plugin-returned string (UB risk)

File: crates/quicproquo-server/src/plugin_loader.rs:102

L3. Token cache stores session token as plaintext hex when no password set

File: crates/quicproquo-client/src/client/token_cache.rs:63-68

L4. --password CLI flag visible in process list

File: crates/quicproquo-client/src/main.rs:104

L5. clippy::unwrap_used is warn not deny

File: Cargo.toml:88

L6. strip = "symbols" hinders post-mortem debugging

File: Cargo.toml:94

L7. thread_rng() for channel ID generation instead of OsRng

Files: storage.rs:760, sql_store.rs:545

L8. conversation.rs:548limit cast from usize to u32 without saturation

File: crates/quicproquo-client/src/client/conversation.rs:548

L9. conversation.rs:363-365,420-422 — bincode deserialize errors silently dropped

File: crates/quicproquo-client/src/client/conversation.rs

L10. repl.rs:610 — static AtomicU64 for padding timer instead of session state

File: crates/quicproquo-client/src/client/repl.rs:610

L11. command_engine.rs:148to_slash() clones entire Command enum unnecessarily

File: crates/quicproquo-client/src/client/command_engine.rs:148

L12. conversation.rs:201-203 — SQL ATTACH with format string

File: crates/quicproquo-client/src/client/conversation.rs:201-203

L13. Client hex.rs trivial wrapper with zero value-add

File: crates/quicproquo-client/src/client/hex.rs

L14. config.rs #[allow(dead_code)] on EffectiveFederationConfig

File: crates/quicproquo-server/src/config.rs

L15. federation/address.rs #[allow(dead_code)] on entire module

File: crates/quicproquo-server/src/federation/address.rs

L16. ANSI escape codes hardcoded without terminal capability detection

File: crates/quicproquo-client/src/client/display.rs

L17. GUI lib.rs:75 .expect() on Tauri run

File: crates/quicproquo-gui/src/lib.rs:75

L18. session.rs:186 DiskKeyStore failure silently falls back to ephemeral

File: crates/quicproquo-client/src/client/session.rs:186

L19. retry.rs:37 thread_rng() for jitter instead of OsRng

File: crates/quicproquo-client/src/client/retry.rs:37

L20. MeshStore.seen set grows unboundedly

File: crates/quicproquo-p2p/src/ (MeshStore)

L21. envelope.rs:58 .expect() on system clock in non-test code

File: crates/quicproquo-p2p/src/envelope.rs:58

L22. broadcast.rs:47 .expect() on encryption in non-test code

File: crates/quicproquo-p2p/src/broadcast.rs:47

L23. Bot crate hardcodes sender as "peer" (TODO)

File: crates/quicproquo-bot/src/lib.rs


TESTING GAPS (6 findings)

T1. No unit tests for plugin_loader.rs

T2. No unit tests for federation/tls.rs

T3. No tests for blob_ops.rs

T4. No tests for delivery.rs

T5. conversation.rs migration code untested

T6. No negative test for MeshEnvelope::verify() with wrong key


Strengths (positive findings from security audit)

  • Constant-time token comparison via subtle::ConstantTimeEq
  • Parameterized SQL everywhere (no injection)
  • Proper Argon2id (19 MiB, t=2, p=1) + ChaCha20-Poly1305 for client state
  • Zero .unwrap() in non-test server code (grep-verified)
  • TLS 1.3 only, mTLS for federation
  • Rate limiting: 100 enqueues/60s, 50 connections/IP/60s
  • Delivery proof signing: SHA-256(seq||recipient||timestamp) + Ed25519
  • KeyPackage ciphersuite validation (only 0x0001 accepted)
  • Payload size limits: 5 MB message, 50 MB blob, 1 MB KeyPackage
  • Queue depth limits: 1000/inbox, 100K sessions, 100K waiters
  • Blob path traversal protection via hex-encoded hash
  • Audit logging with secret redaction (identity keys prefix-only)
  • Production config validation (rejects devtoken, empty auth, missing TLS)

  1. Federation auth (C1) — auth-gate inbound requests, validate origin against mTLS cert
  2. WS bridge authz (C2) + rate limits (H11) + timing floors (M6) — parity with Cap'n Proto path
  3. Crypto error propagation (C3, C7) — hpke_seal and hpke_setup_sender_and_export
  4. Zeroization sweep (H8, H9, H10, M7-M10, L1) — all leaked secret material
  5. ServerContext extraction (C4) — foundation for capability-based security
  6. FileBackedStore atomic writes (C5, M3) — prevent data loss on crash
  7. std::sync::Mutex → tokio::sync::Mutex (C6) — unblock Tokio workers
  8. Mobile TLS verification (H5) — remove hardcoded skip
  9. fetch_wait TOCTOU (M1) — register waiter before fetch
  10. Testing gaps (T1-T6) — critical untested paths