quicproquo/docs/SPRINT-PLAN-NEXT.md
Christian Nennemann 394199b19b fix: security hardening — 40 findings from full codebase review
Full codebase review by 4 independent agents (security, architecture,
code quality, correctness) identified ~80 findings. This commit fixes 40
of them across all workspace crates.

Critical fixes:
- Federation service: validate origin against mTLS cert CN/SAN (C1)
- WS bridge: add DM channel auth, size limits, rate limiting (C2)
- hpke_seal: panic on error instead of silent empty ciphertext (C3)
- hpke_setup_sender_and_export: error on parse fail, no PQ downgrade (C7)

Security fixes:
- Zeroize: seed_bytes() returns Zeroizing<[u8;32]>, private_to_bytes()
  returns Zeroizing<Vec<u8>>, ClientAuth.access_token, SessionState.password,
  conversation hex_key all wrapped in Zeroizing
- Keystore: 0o600 file permissions on Unix
- MeshIdentity: 0o600 file permissions on Unix
- Timing floors: resolveIdentity + WS bridge resolve_user get 5ms floor
- Mobile: TLS verification gated behind insecure-dev feature flag
- Proto: from_bytes default limit tightened from 64 MiB to 8 MiB

Correctness fixes:
- fetch_wait: register waiter before fetch to close TOCTOU window
- MeshEnvelope: exclude hop_count from signature (forwarding no longer
  invalidates sender signature)
- BroadcastChannel: encrypt returns Result instead of panicking
- transcript: rename verify_transcript_chain → validate_transcript_structure
- group.rs: extract shared process_incoming() for receive_message variants
- auth_ops: remove spurious RegistrationRequest deserialization
- MeshStore.seen: bounded to 100K with FIFO eviction

Quality fixes:
- FFI error classification: typed downcast instead of string matching
- Plugin HookVTable: SAFETY documentation for unsafe Send+Sync
- clippy::unwrap_used: warn → deny workspace-wide
- Various .unwrap_or("") → proper error returns

Review report: docs/REVIEW-2026-03-04.md
152 tests passing (72 core + 35 server + 14 E2E + 1 doctest + 30 P2P)
2026-03-04 07:52:12 +01:00


Next Sprint Planning — quicproquo

Pick 8 of the 24 features below for the next sprint cycle.

Created: 2026-03-04 | Status: PENDING SELECTION

Completed Sprints (this cycle)

| # | Sprint | Commit | Summary |
|---|--------|--------|---------|
| 4 | Rich Messaging | 81d5e2e | Read receipts, typing, reactions, edit/delete |
| 5 | File Transfer | 3350d76 | Chunked blob upload/download, /send-file |
| 6 | Disappearing + Groups | fd21ea6 | TTL messages, /group-info, deleteAccount |
| 7 | Go SDK | 65ff262 | QUIC + Cap'n Proto, 24 RPC methods, 14 API functions |
| 8 | TypeScript SDK | 28ceaaf | 175KB WASM crypto, WebSocket transport, browser demo |
| 9 | Mesh Networking | 1b61b7e | MeshIdentity, store-and-forward, broadcast channels |
| 10 | Privacy Hardening | 9244e80 | --redact-logs, traffic padding, /privacy suite, /verify-fs |
| 11 | Multi-Device | 9244e80 | Device registry (3 RPCs), /devices, max 5 per identity |

Current Codebase Stats

  • 27 Cap'n Proto RPCs (@0–@26) on NodeService
  • 10 AppMessage types (0x01–0x09 + file ref)
  • ~40 REPL commands
  • Tests: 72 core + 35 server + 28 P2P + 14 E2E = 149
  • SDKs: Rust (native), Go, TypeScript/WASM, C FFI, Python ctypes
  • Crates: core, proto, server, client, p2p, bot, gen, kt, plugin-api, gui, mobile, ffi

Feature Candidates (pick 8)

A. Federation Wiring

Effort: Medium | Area: Server

Wire the existing outbound federation relay into the actual delivery flow. When a message targets user@remote.domain, the server routes via FederationClient::relay_enqueue() instead of local store. Add /federate <domain> admin command to configure peers. Test with two server instances. Currently all federation code exists but is marked #[allow(dead_code)].
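
The local-versus-remote routing decision described above could look roughly like the following. This is a minimal sketch, not the project's code: `Route` and `route_for` are hypothetical names, and it assumes that a bare `user` or a `user@<local domain>` address stays local. The actual hand-off to `FederationClient::relay_enqueue()` is not shown.

```rust
/// Where a message should be delivered (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq)]
pub enum Route<'a> {
    /// Deliver through the local store.
    Local { user: &'a str },
    /// Hand off to the federation relay for `domain`.
    Remote { user: &'a str, domain: &'a str },
}

/// Decide the route for `recipient`, given this server's own domain.
/// Assumption: a missing, empty, or local domain part means local delivery.
pub fn route_for<'a>(recipient: &'a str, local_domain: &str) -> Route<'a> {
    match recipient.rsplit_once('@') {
        Some((user, domain))
            if !domain.is_empty() && !domain.eq_ignore_ascii_case(local_domain) =>
        {
            Route::Remote { user, domain }
        }
        Some((user, _)) => Route::Local { user },
        None => Route::Local { user: recipient },
    }
}
```

With a local domain of `home.example`, `route_for("carol@remote.domain", "home.example")` yields a `Route::Remote`, which is the case where the server would call the relay instead of writing to the local store.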

B. Contact Management & Blocking

Effort: Medium | Area: Client + Server

Contact list with add/remove/block/unblock. Server-side: addContact @27, removeContact @28, blockUser @29, listContacts @30 RPCs + contacts table. Client: /contacts, /block <user>, /unblock <user>. Blocked users can't enqueue messages to you (server enforces). Import/export contacts as JSON.
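
The server-side enforcement rule (blocked users cannot enqueue messages to the blocker) can be sketched as below. `BlockList` is a hypothetical in-memory stand-in for the proposed contacts table; the real storage layer and RPC handlers are assumed, not shown.

```rust
use std::collections::{HashMap, HashSet};

/// In-memory stand-in for the proposed contacts table (hypothetical).
#[derive(Default)]
pub struct BlockList {
    /// Maps a user to the set of senders that user has blocked.
    blocked: HashMap<String, HashSet<String>>,
}

impl BlockList {
    /// Record that `owner` blocks `target`.
    pub fn block(&mut self, owner: &str, target: &str) {
        self.blocked
            .entry(owner.to_owned())
            .or_default()
            .insert(target.to_owned());
    }

    /// Remove a previous block, if any.
    pub fn unblock(&mut self, owner: &str, target: &str) {
        if let Some(set) = self.blocked.get_mut(owner) {
            set.remove(target);
        }
    }

    /// Check to run before enqueueing: false if `recipient` blocked `sender`.
    pub fn may_enqueue(&self, sender: &str, recipient: &str) -> bool {
        !self
            .blocked
            .get(recipient)
            .map_or(false, |set| set.contains(sender))
    }
}
```

Blocking is one-directional in this sketch: after `block("alice", "mallory")`, Mallory cannot enqueue to Alice, but Alice can still enqueue to Mallory.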

C. Voice/Video Call Signaling

Effort: High | Area: Core + Client

WebRTC signaling over MLS for E2E encrypted calls. Add CallOffer, CallAnswer, CallIce, CallHangup AppMessage types (0x0A–0x0D). Client REPL: /call <user>, /answer, /hangup. The actual media (audio/video) uses WebRTC peer-to-peer; qpq only handles the encrypted signaling. Include SDP offer/answer exchange and ICE candidate relay.
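
A minimal sketch of how the four signaling types might be tagged on the wire. It assumes they take the consecutive tags 0x0A, 0x0B, 0x0C, and 0x0D in the order listed above; the plan only gives the range, so the per-type assignment is an assumption, and `CallSignal` is a hypothetical name.

```rust
use std::convert::TryFrom;

/// Call-signaling message tags (assumed assignment within 0x0A..=0x0D).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum CallSignal {
    Offer = 0x0A,
    Answer = 0x0B,
    Ice = 0x0C,
    Hangup = 0x0D,
}

impl TryFrom<u8> for CallSignal {
    /// The unrecognized tag is returned as the error value.
    type Error = u8;

    fn try_from(tag: u8) -> Result<Self, u8> {
        match tag {
            0x0A => Ok(CallSignal::Offer),
            0x0B => Ok(CallSignal::Answer),
            0x0C => Ok(CallSignal::Ice),
            0x0D => Ok(CallSignal::Hangup),
            other => Err(other),
        }
    }
}
```

Returning the unknown tag as the error lets a receiver log it or pass the message to the existing AppMessage decoder instead of failing outright.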

D. Encrypted Backup & Restore

Effort: Medium | Area: Client + Core

Export all local state (message history, keys, group state) as an encrypted archive. Key derivation from user password via Argon2id. Format: encrypted SQLite dump + identity seed + MLS group states. /backup <path> and /restore <path> commands. Verify integrity on restore. Critical for device migration and disaster recovery.

E. Group Permissions & Roles

Effort: Medium | Area: Server + Client

Admin/moderator/member roles within MLS groups. Server-side role storage per channel. Admins can: remove members, rename group, set TTL policy. Moderators can: mute members. Members can: send messages. /role <user> admin|mod|member, /mute <user> <duration>. Enforced at both server (RPC level) and client (MLS proposal validation).
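
The permission matrix above can be sketched as a small check function. The plan lists capabilities per role but does not say whether they are cumulative; this sketch assumes a hierarchy (admin ≥ moderator ≥ member). `Role`, `Action`, and `allows` are hypothetical names.

```rust
/// Roles in ascending order of privilege (derived `Ord` follows this order).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Role {
    Member,
    Moderator,
    Admin,
}

/// Group actions named in the plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    SendMessage,
    MuteMember,
    RemoveMember,
    RenameGroup,
    SetTtlPolicy,
}

impl Role {
    /// Parse the argument of `/role <user> admin|mod|member`.
    pub fn parse(s: &str) -> Option<Role> {
        match s {
            "admin" => Some(Role::Admin),
            "mod" => Some(Role::Moderator),
            "member" => Some(Role::Member),
            _ => None,
        }
    }

    /// True if this role may perform `action` (assumes cumulative roles).
    pub fn allows(self, action: Action) -> bool {
        let required = match action {
            Action::SendMessage => Role::Member,
            Action::MuteMember => Role::Moderator,
            Action::RemoveMember | Action::RenameGroup | Action::SetTtlPolicy => Role::Admin,
        };
        self >= required
    }
}
```

Because the same function has no I/O, it could in principle be shared by the server-side RPC check and the client-side proposal validation.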

F. Key Transparency Audit Client

Effort: Medium | Area: Client + KT crate

Client-side verification of the KT Merkle log. The KT crate (quicproquo-kt) already has the Merkle tree and audit log. Add: /kt audit <username> to verify a user's key history is consistent, /kt monitor to continuously watch for key changes, /kt proof <username> to fetch and verify inclusion proofs. Alert on unexpected key changes (TOFU violation).

G. Message Search

Effort: Low-Medium | Area: Client

Full-text search over local encrypted message history. Add FTS5 virtual table to the conversation SQLite DB. /search <query> returns matching messages with context, timestamps, and conversation names. /search <query> in:<conversation> for scoped search. Highlight matching terms. Index on message insert.

H. Server Clustering & HA

Effort: High | Area: Server + Infra

Run multiple qpq-server instances behind a shared state layer. Options: shared PostgreSQL backend (replace SQLite for clustered mode), or Raft consensus for delivery queue. Add --cluster-peers flag, health-based leader election, delivery queue synchronization. Docker Compose with 3-node cluster. This is the path to production-scale deployment.

I. Protocol Compliance Testing

Effort: Medium | Area: Testing

Comprehensive MLS RFC 9420 compliance test suite. Verify: TreeKEM operations, epoch advancement, proposal/commit sequences, welcome message handling, group context extensions, PSK injection, external joins. Cross-test with other MLS implementations (OpenMLS test vectors). Add to CI. Target: 50+ protocol-level tests covering edge cases.

J. User Profiles & Status

Effort: Low | Area: Server + Client

Profile pictures (stored as blobs), display names, status messages ("Available", "Away", custom text), about/bio text. updateProfile @27 and fetchProfile @28 RPCs. Profile data is signed by the identity key for authenticity. /profile set-name <name>, /profile set-status <text>, /profile set-avatar <path>, /profile <username> to view. Cache profiles locally.

K. Notification Framework

Effort: Medium | Area: Server + Client

Per-conversation notification settings: all, mentions-only, muted. Server-side WebPush integration for browser clients (using the TS SDK). Add updateNotificationSettings @27 RPC. Client: /mute <conversation>, /unmute, /notify mentions-only. Push notification payload: encrypted sender + conversation hint (no message content). APNs/FCM gateway as a separate microservice.

L. Mobile App Shell

Effort: High | Area: Mobile + FFI

React Native app using the C FFI bindings (quicproquo-ffi). Screens: login, conversation list, chat view, settings. Bridge FFI functions to React Native via NativeModules. Use the existing qpq_connect, qpq_login, qpq_send, qpq_receive C API. iOS + Android targets. Alternatively: Flutter with dart:ffi. Includes push notification registration.

M. Message Threading & Replies

Effort: Low-Medium | Area: Client + Core

Threaded conversations within channels. Add thread_id field to Chat AppMessage: replies to a message inherit its thread_id (or create one). /thread <msg-index> enters a thread view showing only that thread's messages. /threads lists active threads with last activity. Thread-aware notification counts. Local storage: add thread_id column to messages table, filter queries by thread.
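
The "inherit or create" rule for thread_id can be sketched as follows. This is an illustration under stated assumptions: a new thread reuses the parent message's own id as its thread_id, and the thread view includes the root message. `ChatMessage` and both helper functions are hypothetical.

```rust
/// Simplified chat message (hypothetical shape, not the real AppMessage).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChatMessage {
    pub id: u64,
    /// None for messages that are not part of any thread yet.
    pub thread_id: Option<u64>,
    pub body: String,
}

/// Thread id a reply to `parent` should carry: inherit the parent's
/// thread, or start a new one keyed by the parent's id (assumption).
pub fn thread_id_for_reply(parent: &ChatMessage) -> u64 {
    parent.thread_id.unwrap_or(parent.id)
}

/// Messages shown by a `/thread` view: the root plus all of its replies.
pub fn thread_view(messages: &[ChatMessage], thread_id: u64) -> Vec<&ChatMessage> {
    messages
        .iter()
        .filter(|m| m.id == thread_id || m.thread_id == Some(thread_id))
        .collect()
}
```

Keying a new thread by the parent's id avoids allocating a separate identifier; a SQL version of `thread_view` would filter on the proposed thread_id column in the same way.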

N. Cross-Signing & Identity Verification

Effort: Medium | Area: Core + Client

Out-of-band identity verification via QR codes and emoji comparison. Generate a short verification code from both parties' identity keys (similar to Signal's safety numbers but interactive). /verify <user> starts a verification session, displays emoji sequence or QR payload. /verify confirm marks the contact as verified. Verified contacts show a checkmark. Store verification state locally. Alert if a verified contact's key changes.

O. Offline Message Queue with Priorities

Effort: Low-Medium | Area: Client

Smart offline queue that prioritizes messages when reconnecting. Messages queued while offline get priority levels: critical (key rotation, group ops), normal (chat), low (typing, read receipts). On reconnect, send critical first, then normal, drop stale low-priority. /outbox shows pending messages. /outbox flush forces immediate send. /outbox clear discards unsent. Exponential backoff with jitter for reconnection.
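
The reconnect behaviour described above (critical first, then normal, stale low-priority dropped) and the backoff ceiling might be sketched like this. All names are hypothetical, and the staleness threshold is a parameter because the plan does not specify one.

```rust
use std::time::Duration;

/// Priority levels; derived `Ord` sorts Critical before Normal before Low.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Priority {
    Critical,
    Normal,
    Low,
}

/// A message waiting in the outbox (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Queued {
    pub priority: Priority,
    /// How long the message has been waiting.
    pub age: Duration,
    pub payload: String,
}

/// Order the outbox for sending after a reconnect: drop low-priority
/// entries older than `max_low_age`, then sort by priority. The sort is
/// stable, so arrival order is preserved within each level.
pub fn drain_order(mut outbox: Vec<Queued>, max_low_age: Duration) -> Vec<Queued> {
    outbox.retain(|m| m.priority != Priority::Low || m.age <= max_low_age);
    outbox.sort_by_key(|m| m.priority);
    outbox
}

/// Upper bound for the reconnect delay: base * 2^attempt, capped at `cap`.
/// With "full jitter" the caller sleeps a random duration in [0, ceiling];
/// the random draw is left out because std has no RNG.
pub fn backoff_ceiling(attempt: u32, base: Duration, cap: Duration) -> Duration {
    base.saturating_mul(2u32.saturating_pow(attempt)).min(cap)
}
```

The saturating arithmetic keeps `backoff_ceiling` from overflowing on large attempt counts; the result simply stays at `cap`.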

P. Audit Log & Compliance Export

Effort: Medium | Area: Server

Persistent server-side audit log for compliance. Every RPC call logged to a dedicated audit_events table: timestamp, identity, operation, result, metadata. Configurable retention policy (30/60/90 days). qpq-admin audit --from --to --user CLI to query. Export to JSON/CSV. GDPR data export: /export-my-data RPC returns all data the server holds about a user. Separate from redact-logs (this is structured, queryable).

Q. Bot Framework Enhancements

Effort: Medium | Area: Bot SDK + Server

Enhance the existing quicproquo-bot crate into a full bot platform. Add: slash command registration (/weather, /poll, etc.), interactive message components (buttons/selects as structured AppMessage extensions), bot permissions (scoped access tokens), webhook delivery (HTTP POST on events). BotBuilder pattern: Bot::new().command("ping", handle_ping).on_message(handle_msg).run(). Example bots: echo, reminder, RSS feed.

R. Tor/I2P Transport

Effort: High | Area: Server + Client + P2P

Anonymous transport layer for privacy-critical deployments. Server: listen on Tor hidden service (.onion) via arti or tor crate, configurable via --tor-hidden-service. Client: connect through SOCKS5 proxy to .onion address, --tor-proxy socks5://127.0.0.1:9050. P2P mesh: route through Tor for metadata-resistant peer communication. Optional I2P support via SAM bridge. All existing QUIC+TLS works over the tunnel.

S. Plugin Marketplace & Hot-Reload

Effort: Medium | Area: Server + Plugin API

Extend the existing plugin system into a discoverable marketplace. Plugin manifest format (TOML) with name, version, permissions, hooks. qpq-server --plugin-dir ./plugins/ auto-loads .so/.dylib files. Hot-reload: watch plugin directory, reload on change without server restart. Plugin isolation: each plugin runs in its own thread with limited Store access. Add qpq-gen plugin <name> scaffolding. Example: spam filter plugin, message archiver.

T. Stress Testing & Benchmarking Suite

Effort: Medium | Area: Testing + Infra

Production-grade load testing tool. Simulate N concurrent clients: register, login, create channels, send/receive at configurable rate. Measure: messages/sec throughput, p50/p95/p99 latency, memory usage, connection limits. cargo bench integration for micro-benchmarks (already have some). New qpq-loadtest binary: qpq-loadtest --clients 100 --rate 50/s --duration 60s --server localhost:5001. Generate HTML report with charts. Identify bottlenecks before production.
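
The p50/p95/p99 latency figures could be computed from raw samples with a nearest-rank percentile, sketched below. The function name is hypothetical, and nearest-rank is one of several common percentile definitions; the plan does not say which one the load tester would use.

```rust
use std::time::Duration;

/// Nearest-rank percentile: the smallest sample such that at least
/// `pct` percent of all samples are less than or equal to it.
/// Returns None for an empty slice or a `pct` above 100.
pub fn percentile(samples: &[Duration], pct: u32) -> Option<Duration> {
    if samples.is_empty() || pct > 100 {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort();
    let n = sorted.len();
    // rank = ceil(pct * n / 100), computed in integers to avoid float rounding.
    let rank = (pct as usize * n + 99) / 100;
    // pct == 0 gives rank 0; treat it as the minimum sample.
    Some(sorted[rank.max(1) - 1])
}
```

Sorting a copy keeps the caller's sample buffer untouched; a long-running load test would more likely use a streaming histogram, which is outside the scope of this sketch.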

U. Disappearing Media & View-Once

Effort: Low | Area: Client + Core

View-once messages that auto-delete after first viewing. Add ViewOnce flag to FileRef AppMessage: recipient can view the file/image once, then it's deleted locally. Server-side: auto-delete blob after first download. /send-once <path> command. Display "[view-once media]" placeholder until opened. Prevent screenshots (best-effort: clear clipboard, disable screen recording notification). Extends existing file transfer infrastructure.

V. Emoji Status & Presence

Effort: Low | Area: Server + Client

Lightweight presence system. Users set an emoji + short text status ("🏖️ On vacation", "🔴 Do not disturb", "🟢 Available"). Ephemeral: not stored permanently, expires after configurable duration. publishPresence RPC (piggyback on existing publishEndpoint). Client poll or push-based presence updates. /status 🎯 Focusing to set, /status to view, /who shows online contacts with their status. No tracking: presence is opt-in and ephemeral.
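
Ephemeral presence with expiry could be held in memory along these lines. This is a sketch only: `PresenceBoard` and its methods are hypothetical, and the current time is passed in explicitly so that expiry is easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// One user's current status (hypothetical shape).
pub struct Presence {
    pub emoji: String,
    pub text: String,
    expires_at: Instant,
}

/// In-memory presence table; nothing is persisted.
#[derive(Default)]
pub struct PresenceBoard {
    entries: HashMap<String, Presence>,
}

impl PresenceBoard {
    /// Set or replace `user`'s status, valid for `ttl` starting at `now`.
    pub fn publish(&mut self, user: &str, emoji: &str, text: &str, now: Instant, ttl: Duration) {
        let presence = Presence {
            emoji: emoji.to_owned(),
            text: text.to_owned(),
            expires_at: now + ttl,
        };
        self.entries.insert(user.to_owned(), presence);
    }

    /// Current status, or None if the user never published or it expired.
    pub fn status(&self, user: &str, now: Instant) -> Option<&Presence> {
        self.entries.get(user).filter(|p| now < p.expires_at)
    }

    /// Drop expired entries; intended to run periodically.
    pub fn sweep(&mut self, now: Instant) {
        self.entries.retain(|_, p| now < p.expires_at);
    }
}
```

Because `status` checks the expiry itself, a stale entry is never returned even if `sweep` has not run yet.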

W. Markdown & Rich Text Messages

Effort: Low | Area: Core + Client

Rich text formatting in messages. Support a subset of Markdown in chat: bold, italic, code, code blocks, strikethrough, > quotes, links. Parse on display (client-side only; wire format stays plain text with Markdown syntax). TUI renderer: ANSI escape codes for bold/italic/color. Browser demo: render as HTML. Add /format on|off toggle. No changes to MLS or wire protocol; purely presentational.

X. Invite Links

Effort: Low-Medium | Area: Server + Client

Shareable invitation links for joining the server or a group. createInvite RPC generates a time-limited, usage-limited token. Format: qpq://server:port/invite/TOKEN or QR code encoding. /invite create [--expires 24h] [--uses 10] generates link. /invite list shows active invites. /invite revoke <id> cancels. New users can register via invite: qpq-client --invite qpq://.... Group invites: generate a link that auto-adds the joiner to a specific group after registration.
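
Parsing the `qpq://server:port/invite/TOKEN` link format could be sketched as below. The `Invite` struct and `parse_invite` function are hypothetical, and the sketch assumes the port is mandatory and the token is an opaque string without slashes; the plan does not specify either detail.

```rust
/// Parsed invitation link (hypothetical shape).
#[derive(Debug, PartialEq, Eq)]
pub struct Invite {
    pub host: String,
    pub port: u16,
    pub token: String,
}

/// Parse `qpq://host:port/invite/TOKEN`; returns None on any mismatch.
pub fn parse_invite(link: &str) -> Option<Invite> {
    let rest = link.strip_prefix("qpq://")?;
    let (authority, token) = rest.split_once("/invite/")?;
    // Split at the last ':' so bracketed IPv6 hosts keep their colons.
    let (host, port) = authority.rsplit_once(':')?;
    if host.is_empty() || token.is_empty() || token.contains('/') {
        return None;
    }
    Some(Invite {
        host: host.to_owned(),
        port: port.parse().ok()?,
        token: token.to_owned(),
    })
}
```

Expiry and usage limits are not encoded in the link in this sketch; they would be checked server-side when the token is redeemed.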

Y. Command Engine & Playbooks

Effort: Medium | Area: Client + Testing

Unified command abstraction layer making every REPL action available via code and YAML. Command registry maps string names to typed Command variants. YAML playbook format for declarative multi-step scenarios with variables, assertions, and loops. qpq-client --run playbook.yaml for batch execution. Programmatic Rust API: engine.execute(Command::Send { ... }). Enables: CI smoke tests, reproducible environments, bot scripting, onboarding demos, load test scenarios, migration scripts. Pairs with every other feature.
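
The mapping from string names to typed Command variants might look like the sketch below. Only `Command::Send` is named in the plan; its fields, the other variants, and `parse_command` are hypothetical and do not reflect the actual command_engine.rs.

```rust
/// Typed commands (illustrative subset; variants and fields are assumed).
#[derive(Debug, PartialEq, Eq)]
pub enum Command {
    Send { to: String, body: String },
    Join { group: String },
    Whoami,
}

/// Map a textual command line to a typed `Command`.
/// The same entry point could serve the REPL and a playbook runner.
pub fn parse_command(line: &str) -> Result<Command, String> {
    // At most three parts: name, first argument, rest of the line.
    let mut parts = line.trim().splitn(3, ' ');
    let name = parts.next().unwrap_or("");
    match name {
        "send" => {
            let to = parts.next().ok_or("send: missing recipient")?;
            let body = parts.next().ok_or("send: missing message body")?;
            Ok(Command::Send { to: to.to_owned(), body: body.to_owned() })
        }
        "join" => {
            let group = parts.next().ok_or("join: missing group name")?;
            Ok(Command::Join { group: group.to_owned() })
        }
        "whoami" => Ok(Command::Whoami),
        other => Err(format!("unknown command: {:?}", other)),
    }
}
```

Funnelling every input path through one parser means a playbook step and a typed REPL line produce the same `Command` value, which is what makes scripted runs reproducible.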


Selection Guide

Privacy-first (maximum anonymity & security): A (federation), D (backup), F (KT audit), N (cross-signing), R (Tor), U (view-once)

Production-ready (deploy to real users): A (federation), B (contacts), H (clustering), I (compliance), K (notifications), T (stress test)

User experience (make it feel like a real messenger): B (contacts), C (calls), G (search), J (profiles), V (presence), W (rich text), X (invites)

Mobile launch (ship an app): D (backup), J (profiles), K (notifications), L (mobile app), X (invites)

Developer ecosystem (grow the community): Q (bot framework), S (plugin marketplace), T (stress test), I (compliance)

Mesh/Freifunk (offline-first, decentralized): A (federation), N (cross-signing), O (offline queue), R (Tor)


Completed (this planning cycle)

| Sprint | Feature | Status |
|--------|---------|--------|
| | Y. Command Engine & Playbooks | Done: command_engine.rs, playbook.rs, --run CLI, 5 example playbooks |

Selected Features (fill in after choosing)

Pick 8 of A–X above, then we'll plan sprint assignments.

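| Sprint | Feature | Notes |
|--------|---------|-------|
| 12 | | |
| 13 | | |
| 14 | | |
| 15 | | |
| 16 | | |
| 17 | | |
| 18 | | |
| 19 | | |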