DM channels (createChannel), channel authz, security/docs, future improvements

- Add createChannel RPC (node.capnp @18): create 1:1 channel, returns 16-byte channelId
- Store: create_channel(member_a, member_b), get_channel_members(channel_id)
- FileBackedStore: channels.bin; SqlStore: migration 003_channels, schema v4
- channel_ops: handle_create_channel (auth + identity, peerKey 32 bytes)
- Delivery authz: when channel_id.len() == 16, require caller and recipient are channel members (E022/E023)
- Error codes E022 CHANNEL_ACCESS_DENIED, E023 CHANNEL_NOT_FOUND
- SUMMARY: link Certificate lifecycle; security audit, future improvements, multi-agent plan docs
- Certificate lifecycle doc, SECURITY-AUDIT, FUTURE-IMPROVEMENTS, MULTI-AGENT-WORK-PLAN
- Client/core/tls/auth/server main: assorted fixes and updates from review and audit

Co-authored-by: Cursor <cursoragent@cursor.com>

docs/FUTURE-IMPROVEMENTS.md (new file, 182 lines)
@@ -0,0 +1,182 @@

# Future Improvements

This document consolidates suggested improvements for quicnprotochat, drawn from the [roadmap](src/roadmap/milestones.md), [production readiness WBS](src/roadmap/production-readiness.md), [security audit](SECURITY-AUDIT.md), [production readiness audit](PRODUCTION-READINESS-AUDIT.md), and [future research](src/roadmap/future-research.md). Items are grouped by theme and ordered by impact and dependency.

---

## 1. Security and hardening

### 1.1 M7: Post-quantum MLS (next milestone)

- **Goal:** Hybrid X25519 + ML-KEM-768 in the MLS crypto provider so group key material has post-quantum confidentiality.
- **Ref:** [Milestones § M7](src/roadmap/milestones.md), [Hybrid KEM](src/protocol-layers/hybrid-kem.md).
- **Status:** Hybrid KEM exists at the envelope level. Remaining work: integrate it into the OpenMLS provider and run the full test suite.

### 1.2 CA-signed TLS / certificate lifecycle

- **Current:** Self-signed certs; the client pins the server by using the server cert as `ca_cert`.
- **Improve:** Document or add support for CA-issued certs (e.g. Let's Encrypt), cert rotation, and optional OCSP/CRL. Keep pinning as the recommended option for single-server deployments.
- **Ref:** [Threat model § Known gaps](src/cryptography/threat-model.md).

### 1.3 Stronger credential binding

- **Current:** MLS `BasicCredential` (raw Ed25519); no revocation or CA chain.
- **Improve:** X.509-based MLS credentials, or Key Transparency / a verifiable log for public keys to detect key substitution.
- **Ref:** [Threat model](src/cryptography/threat-model.md), [Future research](src/roadmap/future-research.md).

### 1.4 Username enumeration

- **Current:** OPAQUE login start uses `get_user_record`; timing or response shape might reveal whether a user exists.
- **Improve:** If user enumeration is in scope, consider a constant-time or uniform response for unknown users (without weakening OPAQUE).
- **Ref:** [Security audit § 8.3](SECURITY-AUDIT.md).

---

## 2. Authorization and abuse prevention

### 2.1 Full AUTHZ plan (accounts, devices, tokens)

- **Current:** Bearer/session tokens and identity binding; no formal account/device model.
- **Improve:** Implement the [authz plan](src/roadmap/authz-plan.md): accounts, devices, device_id in Auth, per-account/per-device rate limits, and binding KeyPackage uploads to the authenticated account.
- **Ref:** [Production readiness WBS](src/roadmap/production-readiness.md), [Threat model § No client auth on DS](src/cryptography/threat-model.md).

### 2.2 Per-IP and connection limits

- **Current:** Per-token rate limit; no per-IP or global connection cap.
- **Improve:** Configurable per-IP rate limit and max concurrent QUIC connections to reduce DoS and resource exhaustion.
- **Ref:** [Production readiness WBS § Abuse / DoS](src/roadmap/production-readiness.md).
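
One possible shape for the limits in 2.2 is a fixed-window counter per source IP combined with a global connection counter. The sketch below is an assumption, not existing server code: `IpLimiter`, its fields, and its limits are hypothetical names chosen for illustration. It never prunes old entries, so a real version would also need to evict idle IPs.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Fixed-window request counter per source IP plus a global connection cap.
/// All names and limits are hypothetical, not taken from the codebase.
pub struct IpLimiter {
    window: Duration,
    max_per_window: u32,
    max_connections: usize,
    open_connections: usize,
    /// Per-IP (window start, requests seen in that window).
    counters: HashMap<IpAddr, (Instant, u32)>,
}

impl IpLimiter {
    pub fn new(window: Duration, max_per_window: u32, max_connections: usize) -> Self {
        Self {
            window,
            max_per_window,
            max_connections,
            open_connections: 0,
            counters: HashMap::new(),
        }
    }

    /// Returns true if a new connection may be accepted. The caller must
    /// pair every successful call with `release_connection`.
    pub fn try_accept_connection(&mut self) -> bool {
        if self.open_connections >= self.max_connections {
            return false;
        }
        self.open_connections += 1;
        true
    }

    pub fn release_connection(&mut self) {
        self.open_connections = self.open_connections.saturating_sub(1);
    }

    /// Returns true if `ip` is still under its per-window request budget.
    pub fn allow_request(&mut self, ip: IpAddr, now: Instant) -> bool {
        let entry = self.counters.entry(ip).or_insert((now, 0));
        if now.duration_since(entry.0) >= self.window {
            // The previous window has expired: start a fresh one.
            *entry = (now, 0);
        }
        if entry.1 >= self.max_per_window {
            return false;
        }
        entry.1 += 1;
        true
    }
}
```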

---

## 3. Reliability and resilience

### 3.1 Client offline queue and retry

- **Current:** Retry with backoff for RPCs; no offline queue or gap detection.
- **Improve:** Offline message queue, idempotent message IDs, and gap detection so clients can recover after long disconnects without duplicate or lost messages.
- **Ref:** [Production readiness WBS § Client resilience](src/roadmap/production-readiness.md).
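
To make the idempotent-ID and gap-detection idea concrete, here is a minimal client-side sketch. It assumes each message carries a 16-byte ID and a per-queue sequence number that increases by one; neither field is confirmed by this document, and `ReceiveTracker` and `Delivery` are hypothetical names.

```rust
use std::collections::HashSet;

/// Outcome of feeding one delivered message into the tracker.
#[derive(Debug, PartialEq, Eq)]
pub enum Delivery {
    /// New message at the expected sequence number.
    InOrder,
    /// Message ID was seen before; safe to drop.
    Duplicate,
    /// New message that fills a previously reported gap.
    Backfill,
    /// New message, but sequence numbers in `missing_from..missing_to`
    /// (end exclusive) were skipped and should be re-fetched.
    Gap { missing_from: u64, missing_to: u64 },
}

/// Hypothetical tracker; assumes sequence numbers start at 0 per queue.
pub struct ReceiveTracker {
    next_seq: u64,
    seen_ids: HashSet<[u8; 16]>,
}

impl ReceiveTracker {
    pub fn new() -> Self {
        Self { next_seq: 0, seen_ids: HashSet::new() }
    }

    pub fn on_message(&mut self, id: [u8; 16], seq: u64) -> Delivery {
        // `insert` returns false when the ID is already present.
        if !self.seen_ids.insert(id) {
            return Delivery::Duplicate;
        }
        if seq < self.next_seq {
            return Delivery::Backfill;
        }
        let expected = self.next_seq;
        self.next_seq = seq + 1;
        if seq == expected {
            Delivery::InOrder
        } else {
            Delivery::Gap { missing_from: expected, missing_to: seq }
        }
    }
}
```

A client could persist `next_seq` and the recent IDs across restarts so that a reconnect after a long disconnect reports exactly which range to request again.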

### 3.2 Connection draining and graceful shutdown

- **Current:** QUIC endpoint closed on ctrl_c; in-flight RPCs may be cut off.
- **Improve:** Add a draining period: stop accepting new connections, wait for in-flight RPCs (with a timeout), then close. Document the expected behaviour for load balancers.
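
The draining sequence above (stop admitting, wait with timeout, close) can be sketched with a shared flag and an in-flight counter. This is a std-only illustration under assumed names (`Drain`, `RpcGuard`); the real server would hook this into its async runtime and QUIC endpoint rather than polling with `thread::sleep`.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Shared shutdown state: a "draining" flag plus a count of in-flight RPCs.
#[derive(Default)]
pub struct Drain {
    draining: AtomicBool,
    in_flight: AtomicUsize,
}

/// RAII guard: the in-flight count drops when the RPC handler finishes.
pub struct RpcGuard(Arc<Drain>);

impl Drop for RpcGuard {
    fn drop(&mut self) {
        self.0.in_flight.fetch_sub(1, Ordering::SeqCst);
    }
}

impl Drain {
    /// Called at the start of each RPC; returns None once draining has begun.
    pub fn begin_rpc(this: &Arc<Drain>) -> Option<RpcGuard> {
        // Count first, then check the flag, so `drain` cannot miss an RPC
        // that was admitted concurrently.
        this.in_flight.fetch_add(1, Ordering::SeqCst);
        let guard = RpcGuard(Arc::clone(this));
        if this.draining.load(Ordering::SeqCst) {
            return None; // dropping `guard` undoes the increment
        }
        Some(guard)
    }

    /// Stop admitting work, then wait up to `timeout` for in-flight RPCs.
    /// Returns true if everything finished before the deadline.
    pub fn drain(&self, timeout: Duration) -> bool {
        self.draining.store(true, Ordering::SeqCst);
        let deadline = Instant::now() + timeout;
        while self.in_flight.load(Ordering::SeqCst) > 0 {
            if Instant::now() >= deadline {
                return false;
            }
            thread::sleep(Duration::from_millis(5));
        }
        true
    }
}
```

On ctrl_c the server would call `drain` with the configured timeout and close the endpoint afterwards, whether or not the wait succeeded.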

### 3.3 N-1 compatibility and wire versioning

- **Current:** `CURRENT_WIRE_VERSION` and server-side check; no formal N-1 support policy.
- **Improve:** Document supported client/server version matrix and how to deprecate old wire versions safely.
- **Ref:** [Production readiness WBS § Compatibility](src/roadmap/production-readiness.md).
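
A formal N-1 policy could be encoded directly in the server-side check. The sketch below is illustrative only: the numeric value of `CURRENT_WIRE_VERSION`, the `SUPPORTED_BACK` constant, and the `VersionCheck` enum are assumptions. Returning `Deprecated` rather than a plain accept would let the server warn clients before a version is dropped.

```rust
/// Placeholder value for illustration; the real constant lives in the codebase.
pub const CURRENT_WIRE_VERSION: u16 = 3;
/// How many older versions are still accepted (1 means N and N-1). Hypothetical.
pub const SUPPORTED_BACK: u16 = 1;

#[derive(Debug, PartialEq, Eq)]
pub enum VersionCheck {
    /// Client speaks the current version.
    Current,
    /// Client is older but still inside the support window.
    Deprecated,
    /// Client is older than the support window; reject.
    TooOld,
    /// Client is newer than this server; reject.
    TooNew,
}

pub fn check_wire_version(client: u16) -> VersionCheck {
    let oldest = CURRENT_WIRE_VERSION.saturating_sub(SUPPORTED_BACK);
    if client > CURRENT_WIRE_VERSION {
        VersionCheck::TooNew
    } else if client == CURRENT_WIRE_VERSION {
        VersionCheck::Current
    } else if client >= oldest {
        VersionCheck::Deprecated
    } else {
        VersionCheck::TooOld
    }
}
```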

---

## 4. Operations and observability

### 4.1 CI pipeline

- **Add:** GitHub Actions (or equivalent) for:
  - `cargo test --workspace`
  - `cargo clippy`
  - `cargo fmt --check`
  - `cargo audit` (and optionally `cargo deny check`)
- **Ref:** [Production readiness audit § 10](PRODUCTION-READINESS-AUDIT.md).

### 4.2 CODEOWNERS and review policy

- **Add:** `.github/CODEOWNERS` mapping crates to owners; document that security-sensitive changes (crypto, auth, wire format) require two reviewers.
- **Ref:** [Production readiness WBS § Governance](src/roadmap/production-readiness.md).

### 4.3 Dependency policy (deny.toml)

- **Add:** `deny.toml` (or equivalent) for `cargo deny` (licenses, duplicate crates, banned crates, etc.) and run in CI.
- **Ref:** [Production readiness audit § 13](PRODUCTION-READINESS-AUDIT.md).

### 4.4 HTTP health endpoint (optional)

- **Current:** Health is an RPC over QUIC; no separate HTTP endpoint.
- **Improve:** Optional HTTP (e.g. port 8080) `/health` or `/ready` for load balancers and orchestrators that expect HTTP, or document that health is QUIC-only and how to probe it.

### 4.5 Docker user and writable paths

- **Current:** Image runs as `nobody`; data dir may not be writable.
- **Improve:** Create a dedicated user/group in the image and set `QUICNPROTOCHAT_DATA_DIR` (and cert paths) to a directory writable by that user; document in deployment docs.
- **Ref:** [Production readiness audit § 15](PRODUCTION-READINESS-AUDIT.md).

---

## 5. Features and product

### 5.1 Private 1:1 channels (DM)

- **Goal:** Channel creation, per-channel authz, TTL, and DM-specific flows so 1:1 chats are first-class and access-controlled.
- **Ref:** [DM channels](src/roadmap/dm-channels.md), [Production readiness WBS](src/roadmap/production-readiness.md).
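
The commit that adds this document describes the delivery rule as: when `channel_id.len() == 16`, the caller and recipient must both be channel members, with errors E022 CHANNEL_ACCESS_DENIED and E023 CHANNEL_NOT_FOUND. A minimal sketch of that check follows. `MemStore` is a hypothetical in-memory stand-in for the real store, the function name `authorize_delivery` is invented, and the mapping of each failure case to E022 or E023 is an inference from the error names.

```rust
use std::collections::HashMap;

type IdentityKey = [u8; 32];
type ChannelId = [u8; 16];

#[derive(Debug, PartialEq, Eq)]
pub enum ChannelError {
    /// E022 CHANNEL_ACCESS_DENIED
    AccessDenied,
    /// E023 CHANNEL_NOT_FOUND
    NotFound,
}

/// Hypothetical in-memory store holding the two members of each 1:1 channel.
#[derive(Default)]
pub struct MemStore {
    channels: HashMap<ChannelId, (IdentityKey, IdentityKey)>,
}

impl MemStore {
    pub fn create_channel(&mut self, id: ChannelId, a: IdentityKey, b: IdentityKey) {
        self.channels.insert(id, (a, b));
    }

    pub fn get_channel_members(&self, id: &ChannelId) -> Option<(IdentityKey, IdentityKey)> {
        self.channels.get(id).copied()
    }
}

pub fn authorize_delivery(
    store: &MemStore,
    channel_id: &[u8],
    caller: &IdentityKey,
    recipient: &IdentityKey,
) -> Result<(), ChannelError> {
    // Only 16-byte IDs are treated as DM channels; other lengths skip the check.
    if channel_id.len() != 16 {
        return Ok(());
    }
    let mut id: ChannelId = [0u8; 16];
    id.copy_from_slice(channel_id);

    let (a, b) = store.get_channel_members(&id).ok_or(ChannelError::NotFound)?;
    let is_member = |k: &IdentityKey| *k == a || *k == b;
    if is_member(caller) && is_member(recipient) {
        Ok(())
    } else {
        Err(ChannelError::AccessDenied)
    }
}
```

Returning `NotFound` for an unknown channel tells a non-member whether an ID exists; whether that matters depends on how channel IDs are generated.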

### 5.2 MLS lifecycle (remove, update, proposals)

- **Current:** Add member, send, receive; no remove/update or explicit proposal handling.
- **Improve:** Member remove, credential update, and handling of MLS proposals (Remove, Update) for full group lifecycle.
- **Ref:** [Milestones § M5](src/roadmap/milestones.md) (optional follow-ups).

### 5.3 Sealed Sender and metadata resistance

- **Goal:** Hide sender identity from the server (sender inside MLS ciphertext); optionally PIR for fetch so server does not learn which queue was accessed.
- **Ref:** [Threat model § Future mitigations](src/cryptography/threat-model.md), [Future research](src/roadmap/future-research.md).

### 5.4 Traffic analysis resistance

- **Goal:** Padding and/or traffic shaping to reduce inference from message sizes and timing.
- **Ref:** [Threat model § Future mitigations](src/cryptography/threat-model.md).
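
For the padding half of 5.4, one common approach is to pad every plaintext up to a size bucket before encryption so that ciphertext length reveals only the bucket. The sketch below is an assumption rather than a design decision of this project: the 4-byte length prefix, the 256-byte minimum, and power-of-two buckets are all illustrative choices, and timing is not addressed.

```rust
/// Smallest padded size in bytes (hypothetical value).
const MIN_BUCKET: usize = 256;

/// Pads `plaintext` to the next power-of-two bucket.
/// Layout: 4-byte big-endian payload length, payload, zero fill.
/// Returns None if the payload length does not fit in a u32.
pub fn pad(plaintext: &[u8]) -> Option<Vec<u8>> {
    if plaintext.len() > u32::MAX as usize {
        return None;
    }
    let framed = plaintext.len().checked_add(4)?;
    let bucket = framed.max(MIN_BUCKET).checked_next_power_of_two()?;
    let mut out = Vec::with_capacity(bucket);
    out.extend_from_slice(&(plaintext.len() as u32).to_be_bytes());
    out.extend_from_slice(plaintext);
    out.resize(bucket, 0);
    Some(out)
}

/// Recovers the payload from a padded buffer, or None if it is malformed.
pub fn unpad(padded: &[u8]) -> Option<&[u8]> {
    let header = padded.get(..4)?;
    let len = u32::from_be_bytes([header[0], header[1], header[2], header[3]]) as usize;
    let end = len.checked_add(4)?;
    padded.get(4..end)
}
```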

---

## 6. Transport and topology

### 6.1 P2P / NAT traversal (iroh, LibP2P)

- **Goal:** Direct peer-to-peer when possible; server as optional relay/rendezvous. Reduces single-point-of-failure and can improve latency.
- **Ref:** [Future research § LibP2P / iroh](src/roadmap/future-research.md). The `quicnprotochat-p2p` crate is a starting point.

### 6.2 WebTransport (browser client)

- **Goal:** HTTP/3 + WebTransport endpoint so a web client can use the same RPC layer without raw QUIC in the browser.
- **Ref:** [Future research § WebTransport](src/roadmap/future-research.md).

### 6.3 Tor / I2P

- **Goal:** Optional routing over Tor or I2P to hide client IP and reduce metadata leakage.
- **Ref:** [Threat model § Future mitigations](src/cryptography/threat-model.md), [Future research](src/roadmap/future-research.md).

---

## 7. Code and maintenance

### 7.1 Warnings and dead code

- **Clean up:** Cap'n Proto generated `unused_parens`; `SessionInfo` dead fields (use or document); E2E deprecated `cargo_bin` and `unused_mut`; track openmls future-incompat.
- **Ref:** [Production readiness audit § 14](PRODUCTION-READINESS-AUDIT.md).

### 7.2 Integration and E2E coverage

- **Add:** More integration tests (e.g. auth + delivery together, failure paths, concurrent register, rate limit, queue full). Broader E2E scenarios (multi-party, rejoin, key refresh).
- **Ref:** [Multi-perspective review](SECURITY-AUDIT.md) maintainability section.

---

## Priority overview

| Priority | Theme | Examples |
|----------|-------|----------|
| **High** | Security | M7 PQ, CA/pinning docs, AUTHZ plan, CI + audit |
| **High** | Ops | CI, CODEOWNERS, deny.toml, Docker user/paths |
| **Medium** | Reliability | Offline queue, draining, N-1 policy |
| **Medium** | Features | DM channels, MLS remove/update |
| **Lower** | Research | Sealed Sender, PIR, P2P, WebTransport, Tor |

---

## Related documents

- [Milestones](src/roadmap/milestones.md): M7 and beyond
- [Production readiness WBS](src/roadmap/production-readiness.md): phased hardening
- [Future research](src/roadmap/future-research.md): technologies and options
- [Security audit](SECURITY-AUDIT.md): recommendations and status
- [Production readiness audit](PRODUCTION-READINESS-AUDIT.md): checklist and fixes