Future Improvements
This document consolidates suggested improvements for quicprochat, drawn from the roadmap, production readiness WBS, security audit, production readiness audit, and future research. Items are grouped by theme and ordered by impact and dependency.
1. Security and hardening
1.1 M7 — Post-quantum MLS (next milestone)
- Goal: Hybrid X25519 + ML-KEM-768 in the MLS crypto provider so group key material has post-quantum confidentiality.
- Ref: Milestones § M7, Hybrid KEM.
- Status: Hybrid KEM exists at the envelope level; integrate into OpenMLS provider and run full test suite.
1.2 CA-signed TLS / certificate lifecycle
- Current: Self-signed certs; client pins by using server cert as `ca_cert`.
- Improve: Document or add support for CA-issued certs (e.g. Let's Encrypt), cert rotation, and optional OCSP/CRL. Keep pinning as the recommended option for single-server deployments.
- Ref: Threat model § Known gaps.
1.3 Stronger credential binding
- Current: MLS `BasicCredential` (raw Ed25519); no revocation or CA chain.
- Improve: X.509-based MLS credentials, or Key Transparency / verifiable log for public keys to detect substitution.
- Ref: Threat model, Future research.
1.4 Username enumeration
- Current: OPAQUE login start uses `get_user_record`; timing or response shape might reveal user existence.
- Improve: If user enumeration is in scope, consider constant-time or uniform response for unknown users (without weakening OPAQUE).
- Ref: Security audit § 8.3.
2. Authorization and abuse prevention
2.1 Full AUTHZ plan (accounts, devices, tokens)
- Current: Bearer/session tokens and identity binding; no formal account/device model.
- Improve: Implement the authz plan: accounts, devices, device_id in Auth, per-account/per-device rate limits, and binding KeyPackage uploads to the authenticated account.
- Ref: Production readiness WBS, Threat model § No client auth on DS.
2.2 Per-IP and connection limits
- Current: Per-token rate limit; no per-IP or global connection cap.
- Improve: Configurable per-IP rate limit and max concurrent QUIC connections to reduce DoS and resource exhaustion.
- Ref: Production readiness WBS § Abuse / DoS.
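A per-IP limit combined with a global connection cap could be sketched as a token bucket keyed by client address. This is only an illustration: `ConnLimiter`, its fields, and its parameters are hypothetical names, not taken from the quicprochat codebase, and a real implementation would also need to evict idle buckets.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::Instant;

/// Token bucket for one client IP (hypothetical type, not from the codebase).
struct Bucket {
    tokens: f64,
    last_refill: Instant,
}

/// Per-IP rate limiter plus a global cap on concurrent connections.
pub struct ConnLimiter {
    buckets: HashMap<IpAddr, Bucket>, // never evicted in this sketch
    capacity: f64,                    // burst size per IP
    refill_per_sec: f64,              // sustained accept rate per IP
    max_connections: usize,           // global cap
    open_connections: usize,
}

impl ConnLimiter {
    pub fn new(burst: u32, refill_per_sec: f64, max_connections: usize) -> Self {
        ConnLimiter {
            buckets: HashMap::new(),
            capacity: f64::from(burst),
            refill_per_sec,
            max_connections,
            open_connections: 0,
        }
    }

    /// Returns true if a new connection from `ip` may be accepted at `now`.
    pub fn try_accept(&mut self, ip: IpAddr, now: Instant) -> bool {
        if self.open_connections >= self.max_connections {
            return false;
        }
        let capacity = self.capacity;
        let bucket = self.buckets.entry(ip).or_insert(Bucket {
            tokens: capacity,
            last_refill: now,
        });
        // Refill in proportion to elapsed time, capped at the burst size.
        let elapsed = now.saturating_duration_since(bucket.last_refill).as_secs_f64();
        bucket.tokens = (bucket.tokens + elapsed * self.refill_per_sec).min(capacity);
        bucket.last_refill = now;
        if bucket.tokens < 1.0 {
            return false;
        }
        bucket.tokens -= 1.0;
        self.open_connections += 1;
        true
    }

    /// Call when a connection closes so the global cap frees a slot.
    pub fn release(&mut self) {
        self.open_connections = self.open_connections.saturating_sub(1);
    }
}
```

Passing `now` explicitly keeps the limiter deterministic under test; the accept loop would call `try_accept(remote_ip, Instant::now())` before completing the handshake.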
3. Reliability and resilience
3.1 Client offline queue and retry
- Current: Retry with backoff for RPCs; no offline queue or gap detection.
- Improve: Offline message queue, idempotent message IDs, and gap detection so clients can recover after long disconnects without duplicate or lost messages.
- Ref: Production readiness WBS § Client resilience.
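One way to picture duplicate suppression and gap detection on the client is a tracker keyed by message sequence number. The sketch assumes the server assigns each queue a monotonically increasing `u64` sequence number that doubles as the idempotent message ID; that assumption, and the names `InboundTracker` and `Delivery`, are illustrative and not part of the current protocol.

```rust
use std::collections::BTreeMap;

/// Outcome of handing one received message to the tracker.
#[derive(Debug, PartialEq)]
pub enum Delivery {
    /// Payloads that can now be delivered in order.
    Deliver(Vec<Vec<u8>>),
    /// Already seen; safe to drop (this is what makes redelivery idempotent).
    Duplicate,
    /// Message buffered; sequence numbers in this inclusive range should be re-fetched.
    Gap { missing_from: u64, missing_to: u64 },
}

/// Tracks the next expected sequence number and buffers out-of-order messages.
pub struct InboundTracker {
    next_expected: u64,
    pending: BTreeMap<u64, Vec<u8>>,
}

impl InboundTracker {
    pub fn new(next_expected: u64) -> Self {
        InboundTracker { next_expected, pending: BTreeMap::new() }
    }

    pub fn on_message(&mut self, seq: u64, payload: Vec<u8>) -> Delivery {
        if seq < self.next_expected || self.pending.contains_key(&seq) {
            return Delivery::Duplicate;
        }
        if seq > self.next_expected {
            // Coarse range: some numbers inside it may already be buffered.
            self.pending.insert(seq, payload);
            return Delivery::Gap {
                missing_from: self.next_expected,
                missing_to: seq - 1,
            };
        }
        // seq == next_expected: deliver it, then flush any contiguous buffered run.
        let mut out = vec![payload];
        self.next_expected += 1;
        while let Some(p) = self.pending.remove(&self.next_expected) {
            out.push(p);
            self.next_expected += 1;
        }
        Delivery::Deliver(out)
    }
}
```

After a long disconnect the client would persist `next_expected`, reconnect, and request the reported range before surfacing anything to the user.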
3.2 Connection draining and graceful shutdown
- Current: QUIC endpoint closed on ctrl_c; in-flight RPCs may be cut.
- Improve: Draining period: stop accepting new connections, wait for in-flight RPCs (with timeout), then close. Document expected behaviour for load balancers.
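The draining sequence (refuse new work, wait for in-flight RPCs with a timeout, then close) might look roughly like the following. This is a thread-based, standard-library-only sketch; the server is presumably async, so a real version would use its runtime's primitives. `DrainState` and `RpcGuard` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Shared shutdown state (hypothetical; not from the codebase).
#[derive(Default)]
pub struct DrainState {
    draining: AtomicBool,
    in_flight: AtomicUsize,
}

/// Held for the lifetime of one RPC; decrements the in-flight count on drop.
pub struct RpcGuard(Arc<DrainState>);

impl Drop for RpcGuard {
    fn drop(&mut self) {
        self.0.in_flight.fetch_sub(1, Ordering::SeqCst);
    }
}

impl DrainState {
    /// Returns a guard if the RPC may start, or None once draining has begun.
    pub fn begin_rpc(self: &Arc<Self>) -> Option<RpcGuard> {
        // Count first, then check the flag, so a concurrent drain cannot miss this RPC.
        self.in_flight.fetch_add(1, Ordering::SeqCst);
        let guard = RpcGuard(Arc::clone(self));
        if self.draining.load(Ordering::SeqCst) {
            return None; // dropping `guard` undoes the increment
        }
        Some(guard)
    }

    /// Stops admitting RPCs and waits for in-flight ones.
    /// Returns true if everything finished before `timeout`.
    pub fn drain(&self, timeout: Duration) -> bool {
        self.draining.store(true, Ordering::SeqCst);
        let deadline = Instant::now() + timeout;
        while self.in_flight.load(Ordering::SeqCst) > 0 {
            if Instant::now() >= deadline {
                return false;
            }
            thread::sleep(Duration::from_millis(5));
        }
        true
    }
}
```

The shutdown handler would call `drain` with the configured timeout and close the QUIC endpoint afterwards regardless of the result; a `false` return is worth logging because it means RPCs were cut.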
3.3 N-1 compatibility and wire versioning
- Current: `CURRENT_WIRE_VERSION` and server-side check; no formal N-1 support policy.
- Improve: Document supported client/server version matrix and how to deprecate old wire versions safely.
- Ref: Production readiness WBS § Compatibility.
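An N-1 policy can be stated compactly in code, which may help when writing the version matrix. In this sketch only the name `CURRENT_WIRE_VERSION` comes from the project; its type and value here, the `SUPPORTED_BACK` constant, and the `VersionCheck` enum are assumptions made for illustration.

```rust
/// Assumed type and value; the real constant lives in the project.
pub const CURRENT_WIRE_VERSION: u16 = 3;
/// How many older versions the server still accepts (1 means N-1).
pub const SUPPORTED_BACK: u16 = 1;

#[derive(Debug, PartialEq)]
pub enum VersionCheck {
    /// Client speaks the current version.
    Current,
    /// Accepted, but the client should be told to upgrade.
    Deprecated,
    /// Older than the support window; reject.
    TooOld,
    /// Newer than the server understands; reject.
    TooNew,
}

pub fn check_wire_version(client: u16) -> VersionCheck {
    let oldest = CURRENT_WIRE_VERSION.saturating_sub(SUPPORTED_BACK);
    if client > CURRENT_WIRE_VERSION {
        VersionCheck::TooNew
    } else if client == CURRENT_WIRE_VERSION {
        VersionCheck::Current
    } else if client >= oldest {
        VersionCheck::Deprecated
    } else {
        VersionCheck::TooOld
    }
}
```

Returning a distinct `Deprecated` outcome instead of a plain accept gives the server a hook for warning clients before the next version bump drops them.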
4. Operations and observability
4.1 CI pipeline
- Add: GitHub Actions (or equivalent) for:
  - `cargo test --workspace`
  - `cargo clippy`
  - `cargo fmt --check`
  - `cargo audit` (and optionally `cargo deny check`)
- Ref: Production readiness audit § 10.
4.2 CODEOWNERS and review policy
- Add: `.github/CODEOWNERS` mapping crates to owners; document that security-sensitive changes (crypto, auth, wire format) require two reviewers.
- Ref: Production readiness WBS § Governance.
4.3 Dependency policy (deny.toml)
- Add: `deny.toml` (or equivalent) for `cargo deny` (licenses, duplicate crates, banned crates, etc.) and run in CI.
- Ref: Production readiness audit § 13.
4.4 HTTP health endpoint (optional)
- Current: Health is an RPC over QUIC; no separate HTTP endpoint.
- Improve: Optional HTTP (e.g. port 8080) `/health` or `/ready` for load balancers and orchestrators that expect HTTP, or document that health is QUIC-only and how to probe it.
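If the HTTP option is chosen, the endpoint can stay very small. The following standard-library-only sketch answers a single request on a plain TCP listener; the function names, the response bodies, and the choice of 503 for "not ready" are assumptions, and a production server would more likely reuse an existing HTTP crate.

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::TcpListener;

/// Builds a minimal HTTP/1.1 response for the probe paths.
/// `ready` is whatever readiness signal the server exposes (hypothetical).
pub fn health_response(path: &str, ready: bool) -> String {
    let (status, body) = match (path, ready) {
        ("/health", _) => ("200 OK", "ok"),
        ("/ready", true) => ("200 OK", "ready"),
        ("/ready", false) => ("503 Service Unavailable", "not ready"),
        _ => ("404 Not Found", "not found"),
    };
    format!(
        "HTTP/1.1 {}\r\nContent-Type: text/plain\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        status,
        body.len(),
        body
    )
}

/// Accepts one connection, reads the request head, and writes the response.
pub fn serve_once(listener: &TcpListener, ready: bool) -> std::io::Result<()> {
    let (mut stream, _) = listener.accept()?;
    let mut request_line = String::new();
    let mut reader = BufReader::new(&stream);
    reader.read_line(&mut request_line)?;
    // Consume the remaining headers so the socket is not closed with unread data.
    let mut line = String::new();
    loop {
        line.clear();
        if reader.read_line(&mut line)? == 0 || line == "\r\n" {
            break;
        }
    }
    // Request line looks like "GET /health HTTP/1.1"; the path is the second token.
    let path = request_line.split_whitespace().nth(1).unwrap_or("/");
    stream.write_all(health_response(path, ready).as_bytes())
}
```

A caller would bind with `TcpListener::bind("0.0.0.0:8080")` and run `serve_once` in a loop on a dedicated thread, keeping the probe isolated from the QUIC endpoint.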
4.5 Docker user and writable paths
- Current: Image runs as `nobody`; data dir may not be writable.
- Improve: Create a dedicated user/group in the image and set `QPC_DATA_DIR` (and cert paths) to a directory writable by that user; document in deployment docs.
- Ref: Production readiness audit § 15.
5. Features and product
5.1 Private 1:1 channels (DM)
- Goal: Channel creation, per-channel authz, TTL, and DM-specific flows so 1:1 chats are first-class and access-controlled.
- Ref: DM channels, Production readiness WBS.
5.2 MLS lifecycle (remove, update, proposals)
- Current: Add member, send, receive; no remove/update or explicit proposal handling.
- Improve: Member remove, credential update, and handling of MLS proposals (Remove, Update) for full group lifecycle.
- Ref: Milestones § M5 (optional follow-ups).
5.3 Sealed Sender and metadata resistance
- Goal: Hide sender identity from the server (sender inside MLS ciphertext); optionally PIR for fetch so server does not learn which queue was accessed.
- Ref: Threat model § Future mitigations, Future research.
5.4 Traffic analysis resistance
- Goal: Padding and/or traffic shaping to reduce inference from message sizes and timing.
- Ref: Threat model § Future mitigations.
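Size padding is commonly done by rounding every plaintext up to one of a few fixed bucket sizes before encryption, so ciphertext length reveals only the bucket. The sketch below assumes a 4-byte big-endian length prefix and an arbitrary bucket table; both are illustrative choices, not a proposed wire format.

```rust
/// Allowed padded sizes in bytes (hypothetical values).
pub const BUCKETS: [usize; 4] = [256, 1024, 4096, 16384];

/// Pads `msg` to the smallest bucket that fits it plus a 4-byte length prefix.
/// Returns None if the message is too large for every bucket.
pub fn pad(msg: &[u8]) -> Option<Vec<u8>> {
    let needed = msg.len().checked_add(4)?;
    let size = *BUCKETS.iter().find(|&&b| b >= needed)?;
    let mut out = Vec::with_capacity(size);
    out.extend_from_slice(&(msg.len() as u32).to_be_bytes());
    out.extend_from_slice(msg);
    out.resize(size, 0); // zero fill; must be applied before encryption
    Some(out)
}

/// Recovers the original message, or None if the prefix is malformed.
pub fn unpad(padded: &[u8]) -> Option<&[u8]> {
    let h = padded.get(..4)?;
    let len = u32::from_be_bytes([h[0], h[1], h[2], h[3]]) as usize;
    let end = 4usize.checked_add(len)?;
    padded.get(4..end)
}
```

Padding hides sizes only; timing still leaks, so it would need to be paired with traffic shaping (for example fixed send intervals) to address both concerns named in the goal.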
6. Transport and topology
6.1 P2P / NAT traversal (iroh, LibP2P)
- Goal: Direct peer-to-peer when possible; server as optional relay/rendezvous. Reduces single-point-of-failure and can improve latency.
- Ref: Future research § LibP2P / iroh. The `quicprochat-p2p` crate is a starting point.
6.2 WebTransport (browser client)
- Goal: HTTP/3 + WebTransport endpoint so a web client can use the same RPC layer without raw QUIC in the browser.
- Ref: Future research § WebTransport.
6.3 Tor / I2P
- Goal: Optional routing over Tor or I2P to hide client IP and reduce metadata leakage.
- Ref: Threat model § Future mitigations, Future research.
7. Code and maintenance
7.1 Warnings and dead code
- Clean up: Cap'n Proto generated `unused_parens`; `SessionInfo` dead fields (use or document); E2E deprecated `cargo_bin` and `unused_mut`; track openmls future-incompat.
- Ref: Production readiness audit § 14.
7.2 Integration and E2E coverage
- Add: More integration tests (e.g. auth + delivery together, failure paths, concurrent register, rate limit, queue full). Broader E2E scenarios (multi-party, rejoin, key refresh).
- Ref: Multi-perspective review maintainability section.
Priority overview
| Priority | Theme | Examples |
|---|---|---|
| High | Security | M7 PQ, CA/pinning docs, AUTHZ plan, CI + audit |
| High | Ops | CI, CODEOWNERS, deny.toml, Docker user/paths |
| Medium | Reliability | Offline queue, draining, N-1 policy |
| Medium | Features | DM channels, MLS remove/update |
| Lower | Research | Sealed Sender, PIR, P2P, WebTransport, Tor |
Related documents
- ROADMAP.md — phased execution plan (Phases 1–8) incorporating all items below
- Milestones — M7 and beyond
- Production readiness WBS — phased hardening
- Future research — technologies and options
- Security audit — recommendations and status
- Production readiness audit — checklist and fixes
- ADR-006: SDK-First Adoption — no REST gateway, native QUIC SDKs