chore: fix all clippy warnings across workspace

2026-03-04 14:13:58 +01:00
parent 4013b223ff
commit 5a66c2e954
43 changed files with 2124 additions and 57 deletions

docs/V2-DESIGN-ANALYSIS.md Normal file

@@ -0,0 +1,380 @@
# quicproquo v2 — Design Analysis & Recommendations
> Multi-perspective retrospective of the v1 architecture.
> Produced 2026-03-04 by four parallel analysis agents examining server,
> client/UX, crypto/security, and project structure/DX.
---
## Executive Summary
quicproquo v1 demonstrates strong fundamentals: QUIC-native transport, RFC 9420
MLS group encryption, post-quantum hybrid KEM, OPAQUE zero-knowledge auth, and a
working multi-language SDK surface. These are the right bets and put the project
ahead of most open-source messengers on the crypto front.
However, three architectural choices limit the path to production:
1. **capnp-rpc is `!Send`** — forces single-threaded RPC handling, blocking
scalability.
2. **Monolithic client with global state** — business logic is tangled into the
REPL, duplicated across TUI/GUI/Web, and cannot be used as a library.
3. **Poll-based delivery** — 1-second polling wastes bandwidth and adds latency;
no server-push channel exists.
A v2 should keep the crypto stack (MLS + hybrid PQ KEM + OPAQUE), keep QUIC, but
rearchitect the RPC layer, extract an SDK crate, and add push-based delivery.
---
## Part 1 — What Works Well
### Transport & Protocol
- **QUIC (quinn) + TLS 1.3** — correct choice. Built-in encryption, connection
migration, 0-RTT potential. No reason to change.
- **Cap'n Proto schemas as API contract** — zero-copy wire format, compact
binary, schema evolution via ordinals. The *schemas* are good; the *RPC
runtime* is the problem.
### Cryptography
- **MLS (RFC 9420, openmls)** — only IETF-standard group E2E protocol. No
realistic alternative for groups > 2 members. Test suite is thorough (1005
lines covering 2-party, 3-party, hybrid, removal, leave, stale epoch).
- **Hybrid PQ KEM (X25519 + ML-KEM-768)** — forward-thinking dual-algorithm
protection. Well-implemented with versioned wire format, proper zeroization,
and 12 targeted tests. Ahead of Signal (PQXDH, late 2023) and Matrix (no PQ).
- **OPAQUE (RFC 9807)** — server never sees passwords. Ristretto255 + Argon2id
is best-in-class.
- **Sealed sender, safety numbers, message padding** — all clean, simple,
correct. Safety numbers use a 5200-iteration HMAC-SHA256 derivation, matching
the iteration count of Signal's safety numbers.
- **Zeroization discipline** — secrets wrapped in `Zeroizing`, Debug impls
redact keys, no `.unwrap()` in crypto paths.
- **WASM feature gating** — `core/native` cleanly separates WASM-safe crypto
from native-only modules (MLS, OPAQUE, filesystem).
### Server Design
- **Store trait abstraction** — 30+ methods, clean backend swap (SqlStore vs
FileBackedStore). Well-factored.
- **OPAQUE auth with timing floors** — `resolveUser`/`resolveIdentity` mask
lookup timing to prevent username enumeration.
- **Delivery proofs** — Ed25519-signed receipt of server acceptance. Clients get
cryptographic evidence.
- **`wasNew` flag on createChannel** — elegantly solves the dual-MLS-group race
condition where both DM parties try to initialize.
- **Plugin hooks (C-ABI)** — `#![no_std]` vtable, zero dependencies, chained
hooks with continue/reject protocol. Clean extensibility.
- **Production config validation** — enforces encrypted storage, strong auth
tokens, pre-existing TLS certs.
### Client & DX
- **Zero-config local dev** — `qpq --username alice --password pass` auto-starts
server, generates TLS certs, registers, and logs in. Genuinely excellent.
- **Encrypted-at-rest everything** — state file (QPCE), conversation DB
(SQLCipher), session cache. Argon2id + ChaCha20-Poly1305 throughout.
- **Playbook system** — YAML-scripted command execution with assertions. Great
for CI/integration testing.
- **Conversation store** — SQLite with deduplication, outbox for offline
queuing, activity tracking.
- **Conventional commits, GPG-signed** — consistent `feat:`/`fix:`/`docs:`
discipline.
- **Security lints enforced by build** — `clippy::unwrap_used = "deny"`,
`unsafe_code = "warn"`.
---
## Part 2 — What Needs Rethinking
### 2.1 RPC Layer: capnp-rpc is the #1 Scalability Bottleneck
**Problem:** `capnp-rpc` uses `Rc` internally and is `!Send`. Everything runs on
a `LocalSet` with `spawn_local`. All 27 RPC methods serialize through a single
thread. No work-stealing, no multi-core utilization.
**Impact:** With 1000+ concurrent clients, all RPC work competes for a single
core, so the single-threaded executor becomes the throughput ceiling. A slow
`fetchWait` (30s timeout) blocks the entire connection.
**Also:** The WebSocket bridge (`ws_bridge.rs`, 645 lines) exists solely because
Cap'n Proto cannot run in browsers. This duplicates handler logic and creates
maintenance burden.
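
To make the `!Send` constraint concrete, here is a minimal sketch using only the standard library. It does not use capnp-rpc or Tokio; `handle_on_workers` and `assert_send` are hypothetical names introduced for illustration. State shared through `Arc<Mutex<_>>` can be moved onto worker threads, while the same code written with `Rc` is rejected at compile time, which is why an `Rc`-based runtime stays pinned to one thread.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Compiles only for types that are allowed to cross thread boundaries.
fn assert_send<T: Send>() {}

/// Sums `requests` on up to `workers` OS threads (a stand-in for RPC handlers).
pub fn handle_on_workers(requests: Vec<u32>, workers: usize) -> u32 {
    assert_send::<Arc<Mutex<u32>>>();
    // assert_send::<std::rc::Rc<u32>>(); // rejected by the compiler: `Rc` is `!Send`

    let workers = workers.max(1);
    let chunk_size = ((requests.len() + workers - 1) / workers).max(1);
    let total = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = requests
        .chunks(chunk_size)
        .map(|chunk| {
            let chunk = chunk.to_vec();
            let total = Arc::clone(&total);
            // Each chunk is processed on its own thread; only `Send` state may move in.
            thread::spawn(move || {
                let sum: u32 = chunk.iter().sum();
                *total.lock().expect("mutex poisoned") += sum;
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("worker panicked");
    }
    let result = *total.lock().expect("mutex poisoned");
    result
}
```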
### 2.2 Client Architecture: Monolith with Global State
**Problem:** `AUTH_CONTEXT` is a process-wide `RwLock<Option<ClientAuth>>`.
Business logic (MLS processing, sealed sender, hybrid decryption, message
routing) lives inside `repl.rs`'s `poll_messages()` — a 100-line function that
mixes transport, crypto, routing, and storage.
**Impact:** Every frontend (REPL, TUI, GUI, Web) must reimplement message
processing. The TUI already duplicates it. The GUI stub and mobile PoC would need
yet another copy. Client cannot be used as a library.
### 2.3 Delivery Model: Poll-Based, No Push Channel
**Problem:** Client polls every 1 second with `fetch_wait(timeout_ms=0)` — never
actually long-polls. Constant network traffic even when idle. ~1 second latency
for message delivery.
**Also:** `fetch` is destructive (drains queue). If the client crashes between
receive and processing, messages are lost.
### 2.4 Connection Model: Single Stream
**Problem:** `max_concurrent_bidi_streams(1)` means the entire QUIC connection is
effectively single-stream. A blocking `fetchWait` prevents all other RPCs.
### 2.5 Storage: Single Mutex-Guarded SQLite Connection
**Problem:** `SqlStore` uses `Mutex<Connection>`. Every database operation
acquires a global lock. Under concurrent load, all storage access serializes.
**Also:** `FileBackedStore` flushes the entire map on every write (O(n) I/O).
Sessions are in-memory only — server restart forces all clients to re-login.
### 2.6 Key Management Gaps
- **DiskKeyStore** — HPKE private keys stored as plaintext bincode on disk. No
encryption at rest.
- **MLS group state** — `GroupMember` holds `MlsGroup` in memory only. Process
crash loses all group state.
- **Token zeroization** — `AuthContext.token`, `ClientAuth.access_token` are not
wrapped in `Zeroizing`.
### 2.7 Workspace Bloat
12 crates for a project at this maturity is excessive. Several are thin stubs
(`quicproquo-gen`, `quicproquo-bot` at 354 lines) or broken (`quicproquo-gui`
fails `cargo build --workspace`).
---
## Part 3 — v2 Architecture Recommendations
### 3.1 Replace capnp-rpc with a Send-Compatible RPC Framework
**Recommendation:** Switch to **tonic (gRPC)** or a custom framing layer.
| Dimension | capnp-rpc (v1) | tonic/gRPC (v2) |
|-----------|---------------|-----------------|
| Threading | `!Send`, single-threaded | `Send + Sync`, multi-threaded |
| Browser | Requires WS bridge | grpc-web (via `tonic-web`) |
| Streaming | Not supported | Built-in |
| Middleware | None (copy-paste auth) | Interceptors/layers |
| Ecosystem | Niche | Massive (every language) |
**Alternative:** Keep Cap'n Proto *schemas* for serialization (zero-copy
advantage) but replace capnp-rpc with custom framing over QUIC streams. This
preserves the wire format while gaining `Send` compatibility.
The WS bridge would be eliminated entirely — grpc-web or WebTransport gives
browsers direct access.
### 3.2 Extract an SDK Crate (Most Important Client Change)
Create `quicproquo-sdk` that owns all business logic:
```
quicproquo-sdk/
  src/
    client.rs        -- QpqClient: connect, login, send, receive
    events.rs        -- ClientEvent enum (push-based)
    conversation.rs  -- ConversationHandle, group management
    crypto.rs        -- MLS pipeline, sealed sender, hybrid decryption
    sync.rs          -- message sync, offline queue, retry
```
All frontends become thin shells:
```
CLI/REPL -> calls sdk
TUI -> calls sdk
Tauri GUI -> calls sdk (via Tauri commands)
Mobile -> calls sdk (via C FFI)
Web/WASM -> calls sdk (compiled to wasm32)
```
**Key API shape:**
```rust
// Signature listing only; method bodies are omitted.
pub struct QpqClient { /* session, rpc, crypto pipeline */ }
impl QpqClient {
    /// Opens the QUIC connection; the returned client is not yet authenticated.
    pub async fn connect(config: ClientConfig) -> Result<Self>;
    /// OPAQUE login on an existing connection; auth state is stored on `self`.
    pub async fn login(&mut self, username: &str, password: &str) -> Result<()>;
    pub async fn dm(&mut self, username: &str) -> Result<ConversationHandle>;
    pub async fn create_group(&mut self, name: &str) -> Result<ConversationHandle>;
    pub async fn send(&mut self, text: &str) -> Result<MessageId>;
    pub fn subscribe(&self) -> Receiver<ClientEvent>;
}
```
No global state. No `AUTH_CONTEXT`. Auth context is per-`QpqClient` instance.
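
A minimal sketch of what "auth context per instance" and `subscribe()` could look like, using only the standard library. Everything here is an assumption for illustration: the field names, `set_token`, and `publish` are hypothetical, `std::sync::mpsc` stands in for whatever async channel the SDK ends up using, and `subscribe` takes `&mut self` here only because the subscriber list is a plain `Vec`.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ClientEvent {
    MessageReceived { from: String, text: String },
    ConnectionLost,
}

/// Each client owns its auth state; nothing is stored in a process-wide static.
pub struct QpqClient {
    username: String,
    access_token: Option<Vec<u8>>,
    subscribers: Vec<Sender<ClientEvent>>,
}

impl QpqClient {
    pub fn new(username: &str) -> Self {
        Self {
            username: username.to_owned(),
            access_token: None,
            subscribers: Vec::new(),
        }
    }

    pub fn username(&self) -> &str {
        &self.username
    }

    /// Hypothetical hook called once login succeeds.
    pub fn set_token(&mut self, token: Vec<u8>) {
        self.access_token = Some(token);
    }

    pub fn is_authenticated(&self) -> bool {
        self.access_token.is_some()
    }

    /// Registers a new listener; every later event is delivered to it.
    pub fn subscribe(&mut self) -> Receiver<ClientEvent> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Fans an event out to all listeners, dropping those whose receiver is gone.
    pub fn publish(&mut self, event: ClientEvent) {
        self.subscribers.retain(|tx| tx.send(event.clone()).is_ok());
    }
}
```

Because two `QpqClient` values share nothing, tests that log in as different users can run in parallel within one process.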
### 3.3 Add Push-Based Delivery
**Recommendation:** Dedicated QUIC unidirectional stream for server-push
notifications.
```
Client opens bidi stream (ID 0) -> RPC channel (request/response)
Server opens uni stream (ID 3)  -> push notifications (new message, typing, etc.)
```
Benefits:
- Near-immediate message delivery (no 1-second poll interval)
- No idle network traffic
- Typing indicators delivered in real-time
- Graceful degradation: fall back to long-poll if push stream fails
**Also:** Make `peek` + `ack` the default delivery pattern (not destructive
`fetch`). Add idempotency keys to prevent duplicate messages on retry.
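
The `peek` + `ack` pattern with idempotency keys can be sketched as an in-memory queue. This is an illustration under stated assumptions, not the server's storage code: the `Queue` type, the `u64` key, and the sequence numbering are hypothetical, and a real implementation would persist state and expire old keys.

```rust
use std::collections::{HashSet, VecDeque};

pub struct Queue {
    next_seq: u64,
    pending: VecDeque<(u64, Vec<u8>)>,
    // Idempotency keys already accepted. Grows without bound in this sketch.
    seen_keys: HashSet<u64>,
}

impl Queue {
    pub fn new() -> Self {
        Self { next_seq: 0, pending: VecDeque::new(), seen_keys: HashSet::new() }
    }

    /// Returns `false` (and stores nothing) when the key was seen before,
    /// so a client retry after a lost response cannot create a duplicate.
    pub fn enqueue(&mut self, idempotency_key: u64, payload: Vec<u8>) -> bool {
        if !self.seen_keys.insert(idempotency_key) {
            return false;
        }
        self.next_seq += 1;
        self.pending.push_back((self.next_seq, payload));
        true
    }

    /// Non-destructive read: messages stay queued until acknowledged.
    pub fn peek(&self, limit: usize) -> Vec<(u64, Vec<u8>)> {
        self.pending.iter().take(limit).cloned().collect()
    }

    /// Removes every message with sequence number <= `up_to`; returns the count.
    pub fn ack(&mut self, up_to: u64) -> usize {
        let before = self.pending.len();
        while matches!(self.pending.front(), Some((seq, _)) if *seq <= up_to) {
            self.pending.pop_front();
        }
        before - self.pending.len()
    }
}
```

A client that crashes between `peek` and `ack` sees the same messages again on restart, which is the property destructive `fetch` lacks.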
### 3.4 Multi-Stream Connections
Allow 4-8 concurrent bidirectional QUIC streams per connection. This enables:
- Pipelined RPCs (send while fetching)
- Concurrent blob upload + chat
- `fetchWait` on one stream without blocking others
### 3.5 Storage Improvements
| Change | Rationale |
|--------|-----------|
| Drop `FileBackedStore` | O(n) flush per write, no federation support |
| Connection pool for SQLite | Replace `Mutex<Connection>` with r2d2/deadpool |
| Persist sessions to DB | Server restart shouldn't force re-login |
| Encrypt DiskKeyStore at rest | Plaintext HPKE private keys on disk are a real vulnerability |
| Persist MLS group state | Process crash shouldn't lose group state |
| Atomic keystore writes | tempfile-then-rename pattern |
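
The tempfile-then-rename pattern from the last row can be sketched with the standard library alone. `write_atomic` is a hypothetical helper, and the fixed `.tmp` suffix is a simplification: concurrent writers would need unique temp names, and a hardened version would also sync the parent directory.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Writes `bytes` to `path` so that readers see either the old or the new
/// contents, never a partially written file.
pub fn write_atomic(path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Keep the temp file next to the target so the rename stays on one filesystem.
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);

    {
        let mut file = File::create(&tmp_path)?;
        file.write_all(bytes)?;
        // Flush file contents to disk before the rename makes them visible.
        file.sync_all()?;
    }

    // Replaces any existing file at `path` in a single step.
    fs::rename(&tmp_path, path)
}
```

If the process dies before the rename, the previous keystore file is untouched and only a stray `.tmp` file remains.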
### 3.6 Crypto Stack Refinements
The algorithms are correct. The refinements are operational:
| Change | Rationale |
|--------|-----------|
| Typed MLS error variants | Stop losing error info via `format!("{e:?}")` |
| Formalize hybrid PQ ciphersuite ID | Replace length-based key detection |
| Remove all InsecureServerCertVerifier | No TLS bypass on any platform |
| Add passkey/WebAuthn alt-auth | Better UX for GUI/mobile, no password to forget |
| Consider Double Ratchet for 1:1 DMs | MLS is over-engineered for 2-party; DR gives better per-message forward secrecy |
| Token/session secret zeroization | `AuthContext.token` et al. need `Zeroizing` wrappers |
| Fix serde deserialization of secrets | Intermediate non-zeroized `Vec<u8>` in `IdentityKeypair::deserialize` |
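
As an illustration of the "typed MLS error variants" row, the sketch below maps an upstream error into an enum that callers can match on instead of flattening it with `format!("{e:?}")`. `UpstreamMlsError` is a hypothetical stand-in, not an openmls type, and the variants are invented for the example.

```rust
use std::fmt;

/// Hypothetical stand-in for an error coming out of the MLS library.
#[derive(Debug)]
pub enum UpstreamMlsError {
    StaleEpoch { got: u64, current: u64 },
    UnknownSender,
}

/// Typed error exposed to callers; the variant and its fields survive.
#[derive(Debug, PartialEq, Eq)]
pub enum MlsError {
    StaleEpoch { got: u64, current: u64 },
    UnknownSender,
}

impl From<UpstreamMlsError> for MlsError {
    fn from(err: UpstreamMlsError) -> Self {
        match err {
            UpstreamMlsError::StaleEpoch { got, current } => {
                MlsError::StaleEpoch { got, current }
            }
            UpstreamMlsError::UnknownSender => MlsError::UnknownSender,
        }
    }
}

impl fmt::Display for MlsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MlsError::StaleEpoch { got, current } => {
                write!(f, "stale epoch: message is for epoch {got}, group is at {current}")
            }
            MlsError::UnknownSender => write!(f, "unknown sender"),
        }
    }
}

impl std::error::Error for MlsError {}

/// Callers can branch on the variant, which a formatted string would not allow.
pub fn should_resync(err: &MlsError) -> bool {
    matches!(err, MlsError::StaleEpoch { .. })
}
```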
### 3.7 Workspace Restructuring
**Reduce from 12 to 8 crates:**
```
quicproquo-core -- crypto primitives (keep)
quicproquo-proto -- schema codegen (keep)
quicproquo-plugin-api -- #![no_std] C-ABI (keep)
quicproquo-kt -- key transparency (keep)
quicproquo-sdk -- NEW: business logic library
quicproquo-server -- server binary (keep)
quicproquo-client -- CLI/TUI binary, depends on sdk (keep, slimmed)
quicproquo-p2p -- mesh networking (keep, feature-flagged)
```
**Merge/remove:**
- `bot` -> `sdk::bot` module
- `ffi` -> `sdk` with `--features c-ffi`
- `gen` -> `scripts/` or `xtask`
- `gui` -> `apps/gui/` outside workspace (Tauri project)
- `mobile` -> `examples/` (research spike)
**Add `default-members` under `[workspace]`** so `cargo build` doesn't attempt GUI.
**Add `justfile`** with `build`, `test`, `test-e2e`, `build-wasm`, `docker`.
### 3.8 Plugin System Evolution
| Change | Rationale |
|--------|-----------|
| Add `version: u32` to `HookVTable` | ABI stability — check version on load |
| Config passthrough | `qpq_plugin_init(vtable, config_json)` |
| Async hooks | Plugins that call external services shouldn't block Tokio |
| Evaluate WASM plugins | Sandboxed community plugins (keep C-ABI for first-party) |
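
A minimal sketch of the version check from the first row. The layout of `HookVTable` beyond the `version` field, the `on_message` hook, and the constant `HOOK_ABI_VERSION` are assumptions made for the example; the real vtable is defined in `quicproquo-plugin-api`.

```rust
/// ABI version the host was built against (hypothetical value).
pub const HOOK_ABI_VERSION: u32 = 2;

/// C-compatible vtable; `version` comes first so it can be read before
/// anything else in the struct is trusted.
#[repr(C)]
pub struct HookVTable {
    pub version: u32,
    pub on_message: Option<extern "C" fn(payload: *const u8, len: usize) -> i32>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum LoadError {
    VersionMismatch { expected: u32, found: u32 },
    MissingHook,
}

/// Rejects plugins built against a different ABI before any hook is called.
pub fn validate_vtable(vtable: &HookVTable) -> Result<(), LoadError> {
    if vtable.version != HOOK_ABI_VERSION {
        return Err(LoadError::VersionMismatch {
            expected: HOOK_ABI_VERSION,
            found: vtable.version,
        });
    }
    if vtable.on_message.is_none() {
        return Err(LoadError::MissingHook);
    }
    Ok(())
}

/// Example hook that lets every message through (0 = continue, by assumption).
pub extern "C" fn allow_all(_payload: *const u8, _len: usize) -> i32 {
    0
}
```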
### 3.9 Federation Improvements
| Change | Rationale |
|--------|-----------|
| DNS SRV / .well-known discovery | Static peer config doesn't scale |
| Persistent relay queue with retry | Messages to offline peers are currently lost |
| Deterministic channel ID derivation | Avoid cross-server channel conflicts |
| Keep mDNS as optional mesh feature | Not for internet-scale, but good for LAN |
### 3.10 Test & CI Improvements
| Change | Rationale |
|--------|-----------|
| Per-client auth context | Removes `--test-threads 1` constraint |
| Mock server for client unit tests | Fast tests without spawning real server |
| Fuzz testing (cargo-fuzz) | Hybrid KEM, sealed sender, padding, Cap'n Proto deser |
| WS bridge unit tests | 645 lines, zero tests, security-critical |
| WASM + Go SDK in CI | Currently untested in CI |
| Separate E2E from unit test CI job | Different speed, different failure modes |
| macOS CI | FFI/mobile cross-compilation validation |
| Release automation | Binary artifacts, Docker tags, WASM npm publish |
---
## Part 4 — Ecosystem Positioning
### Don't compete with Signal or Matrix directly.
**Target: Privacy-first messaging infrastructure for developers and
organizations.**
quicproquo's differentiators — QUIC-native transport, post-quantum crypto, MLS,
plugin system, multi-language SDKs, embeddable architecture — point toward an
infrastructure play, not a consumer app.
Think: *"the Postgres of E2E encrypted messaging"* — a high-quality open-source
server and protocol that other projects build on.
| Segment | Value Proposition |
|---------|-------------------|
| **Developer tool** | API-first messenger for encrypted bots and integrations |
| **Embeddable** | C FFI + WASM + Go SDK for embedding in other apps |
| **Enterprise** | On-prem, plugins for compliance/audit, OPAQUE zero-knowledge auth |
| **Research** | Post-quantum crypto, MLS reference implementation, mesh networking |
---
## Part 5 — Priority Ordering
### Phase 1: Foundation (unblocks everything else)
1. Replace capnp-rpc with Send-compatible framework
2. Extract SDK crate from client
3. Per-client auth context (no global state)
### Phase 2: Reliability
4. Push-based delivery (QUIC uni-stream)
5. Multi-stream connections
6. Persist sessions + MLS group state
7. Encrypt DiskKeyStore at rest
8. peek+ack as default delivery
### Phase 3: Polish
9. Workspace restructuring (12 -> 8 crates)
10. TUI as primary interactive mode (built on SDK)
11. Plugin system v2 (versioning, config, async)
12. Federation retry queue + discovery
### Phase 4: Ecosystem
13. Full MLS in WASM (browser E2E)
14. WebTransport (eliminate WS bridge)
15. Tauri GUI (built on SDK)
16. Release automation + expanded CI
---
## Appendix — Analysis Sources
This document was produced by four parallel analysis agents:
| Agent | Scope | Files Read |
|-------|-------|-----------|
| server-analyst | Transport, RPC, delivery, storage, federation | 27 server .rs files, 4 schemas, core transport |
| client-analyst | REPL, UX, state, multi-platform, SDK design | All client .rs, GUI, mobile, TS demo |
| security-analyst | MLS, OPAQUE, hybrid KEM, keystore, identity | All core .rs, review doc |
| dx-analyst | Workspace, build, tests, plugins, CI, ecosystem | All Cargo.toml, tests, CI, plugins, SDKs |

docs/V2-MASTER-PLAN.md Normal file

@@ -0,0 +1,328 @@
# quicproquo v2 — Master Implementation Plan
> Created 2026-03-04. This is the authoritative plan for the v2 rewrite.
> See also: `docs/V2-DESIGN-ANALYSIS.md` for the detailed retrospective.
## Context
The v1 codebase has strong crypto foundations (MLS, hybrid PQ KEM, OPAQUE) but three
architectural bottlenecks: capnp-rpc is `!Send` (single-threaded), client business logic
is trapped in a monolithic REPL with global state, and delivery is poll-based.
This plan creates v2 on a new branch, keeping the crypto stack intact and replacing
the RPC/transport layer, extracting an SDK, and restructuring the workspace.
**Key decisions:**
- Transport: Protobuf (prost) + custom framing over QUIC (quinn)
- Mobile: Tauri 2 (same Rust SDK backend, web UI)
- Branch strategy: `v2` branch from main, not a fresh repo
- Constraints: Rust, QUIC, GPG-signed commits, zeroize secrets, no stubs
---
## Architecture Overview
```
┌─────────────────────────────────────────────────────┐
│                      Frontends                      │
│  CLI/TUI  │  Tauri GUI/Mobile  │ Web (WebTransport) │
└─────┬─────┴────────┬───────────┴──────────┬─────────┘
      │              │                      │
      ▼              ▼                      ▼
┌─────────────────────────────────────────────────────┐
│                   quicproquo-sdk                    │
│ QpqClient { connect, login, send, recv, subscribe } │
│ Event system (tokio broadcast)                      │
│ Crypto pipeline (MLS, sealed sender, hybrid)        │
│ Conversation store (SQLCipher)                      │
└──────────────────────┬──────────────────────────────┘
┌─────────────────────────────────────────────────────┐
│                   quicproquo-rpc                    │
│ QUIC framing: [method:u16][req_id:u32][len:u32][pb] │
│ Multi-stream (1 RPC per stream)                     │
│ Server-push via uni-streams                         │
│ tower middleware (auth, rate-limit)                 │
└──────────────────────┬──────────────────────────────┘
┌─────────────────────────────────────────────────────┐
│                  quicproquo-server                  │
│ Domain services (auth, delivery, channel, blob)     │
│ Store trait → SqlStore (connection pool)            │
│ Plugin hooks, federation, KT                        │
└─────────────────────────────────────────────────────┘
```
### Wire Format
Per QUIC bidirectional stream (request/response):
```
Request: [method_id: u16][request_id: u32][payload_len: u32][protobuf bytes]
Response: [status: u8][request_id: u32][payload_len: u32][protobuf bytes]
```
Per QUIC unidirectional stream (server → client push):
```
Push: [event_type: u16][payload_len: u32][protobuf bytes]
```
Each RPC opens its own QUIC bidirectional stream, so RPCs multiplex naturally and a slow call does not head-of-line block the others.
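
The request layout above can be sketched as an encoder and decoder in plain Rust. Two things are assumptions, since the plan does not state them: integers are written big-endian, and payloads are capped at a hypothetical `MAX_PAYLOAD` of 1 MiB. The payload is treated as opaque bytes; protobuf encoding is out of scope here.

```rust
/// method_id (2) + request_id (4) + payload_len (4)
pub const REQUEST_HEADER_LEN: usize = 2 + 4 + 4;
/// Hypothetical upper bound so a bad length field cannot force a huge allocation.
pub const MAX_PAYLOAD: u32 = 1 << 20;

#[derive(Debug, PartialEq, Eq)]
pub struct RequestFrame {
    pub method_id: u16,
    pub request_id: u32,
    pub payload: Vec<u8>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum FrameError {
    /// Fewer bytes available than the header plus `payload_len` require.
    Incomplete,
    PayloadTooLarge(u32),
}

pub fn encode_request(frame: &RequestFrame) -> Result<Vec<u8>, FrameError> {
    let len = u32::try_from(frame.payload.len())
        .map_err(|_| FrameError::PayloadTooLarge(u32::MAX))?;
    if len > MAX_PAYLOAD {
        return Err(FrameError::PayloadTooLarge(len));
    }
    let mut out = Vec::with_capacity(REQUEST_HEADER_LEN + frame.payload.len());
    out.extend_from_slice(&frame.method_id.to_be_bytes());
    out.extend_from_slice(&frame.request_id.to_be_bytes());
    out.extend_from_slice(&len.to_be_bytes());
    out.extend_from_slice(&frame.payload);
    Ok(out)
}

/// Returns the decoded frame and the number of bytes consumed from `buf`.
pub fn decode_request(buf: &[u8]) -> Result<(RequestFrame, usize), FrameError> {
    if buf.len() < REQUEST_HEADER_LEN {
        return Err(FrameError::Incomplete);
    }
    let method_id = u16::from_be_bytes([buf[0], buf[1]]);
    let request_id = u32::from_be_bytes([buf[2], buf[3], buf[4], buf[5]]);
    let len = u32::from_be_bytes([buf[6], buf[7], buf[8], buf[9]]);
    if len > MAX_PAYLOAD {
        return Err(FrameError::PayloadTooLarge(len));
    }
    let end = REQUEST_HEADER_LEN + len as usize;
    if buf.len() < end {
        return Err(FrameError::Incomplete);
    }
    let payload = buf[REQUEST_HEADER_LEN..end].to_vec();
    Ok((RequestFrame { method_id, request_id, payload }, end))
}
```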
---
## Workspace Structure (v2: 9 crates)
```
quicproquo/
├── crates/
│   ├── quicproquo-core/        # KEEP AS-IS — crypto primitives, MLS, hybrid KEM
│   ├── quicproquo-kt/          # KEEP AS-IS — key transparency
│   ├── quicproquo-plugin-api/  # KEEP AS-IS — #![no_std] C-ABI
│   ├── quicproquo-proto/       # REWRITE — protobuf schemas + prost codegen
│   ├── quicproquo-rpc/         # NEW — QUIC RPC framework (framing, dispatch, tower)
│   ├── quicproquo-sdk/         # NEW — client business logic library
│   ├── quicproquo-server/      # REWRITE — domain services + RPC handlers
│   ├── quicproquo-client/      # REWRITE — thin CLI/TUI shell over SDK
│   └── quicproquo-p2p/         # KEEP — iroh mesh (feature-flagged, later)
├── apps/
│   └── gui/                    # Tauri 2 desktop + mobile app (outside workspace)
├── proto/                      # .proto source files
│   └── qpq/v1/
│       ├── auth.proto          # OPAQUE registration + login (4 methods)
│       ├── delivery.proto      # enqueue, fetch, peek, ack, batch (6 methods)
│       ├── keys.proto          # key package + hybrid key CRUD (5 methods)
│       ├── channel.proto       # channel create (1 method)
│       ├── user.proto          # resolve user/identity (2 methods)
│       ├── blob.proto          # upload/download (2 methods)
│       ├── device.proto        # register/list/revoke (3 methods)
│       ├── p2p.proto           # endpoint publish/resolve + health (3 methods)
│       ├── federation.proto    # relay + proxy (6 methods)
│       ├── push.proto          # server-push events (NEW)
│       └── common.proto        # shared types (Auth, Envelope, Error)
├── sdks/
│   ├── go/                     # Go SDK (regenerate from .proto)
│   └── typescript/             # TS SDK (WebTransport client)
├── justfile                    # NEW — build commands
└── Cargo.toml                  # workspace root
```
**Removed from workspace:**
- `quicproquo-bot` → `sdk::bot` module
- `quicproquo-ffi` → `sdk` with `--features c-ffi`
- `quicproquo-gen` → `scripts/`
- `quicproquo-gui` → `apps/gui/` (Tauri project, outside workspace)
- `quicproquo-mobile` → merged into `apps/gui/` (Tauri 2 mobile)
---
## Crate Reuse Assessment
| v1 Crate | capnp deps? | v2 Action | Effort |
|----------|:-----------:|-----------|--------|
| **quicproquo-core** | None | Copy as-is | Zero |
| **quicproquo-kt** | None | Copy as-is | Zero |
| **quicproquo-plugin-api** | None | Copy as-is | Zero |
| **quicproquo-p2p** | None | Copy as-is | Zero |
| **quicproquo-proto** | 100% capnp | Replace with prost codegen | Medium |
| **quicproquo-server** | 16/20 files | Extract domain logic, rewrite handlers | High |
| **quicproquo-client** | 6/10 files | Extract to SDK, thin CLI shell | High |
### Key Files to Reuse Directly
| Source (v1) | Destination (v2) | Notes |
|-------------|------------------|-------|
| `crates/quicproquo-core/` (entire) | same path | Zero changes |
| `crates/quicproquo-kt/` (entire) | same path | Zero changes |
| `crates/quicproquo-plugin-api/` (entire) | same path | Zero changes |
| `server/src/storage.rs` | `server/src/storage.rs` | Store trait — keep |
| `server/src/sql_store.rs` | `server/src/sql_store.rs` | Add connection pool |
| `server/src/hooks.rs` | `server/src/hooks.rs` | Plugin system — keep |
| `server/src/plugin_loader.rs` | `server/src/plugin_loader.rs` | Keep |
| `server/src/error_codes.rs` | `server/src/error_codes.rs` | Keep |
| `server/src/config.rs` | `server/src/config.rs` | Update for new transport |
| `client/src/conversation.rs` | `sdk/src/conversation.rs` | Move to SDK |
| `client/src/token_cache.rs` | `sdk/src/token_cache.rs` | Move to SDK |
| `client/src/display.rs` | `client/src/display.rs` | Keep in CLI |
| `schemas/*.capnp` | reference only | Translate to .proto |
---
## Phased Implementation
### Phase 1: Foundation
**Goal:** v2 branch with new workspace, proto schemas, RPC framework skeleton, SDK skeleton.
**Scope:** Compiles, no runtime functionality yet.
1. **Create v2 branch** from main
2. **Restructure workspace** — update root Cargo.toml, create new crate dirs, add justfile
3. **Write .proto files** — translate all 33 RPC methods + push events from Cap'n Proto
4. **Create quicproquo-proto crate** — prost-build codegen
5. **Create quicproquo-rpc crate** — QUIC RPC framework:
- `framing.rs` — wire format encode/decode (request, response, push)
- `server.rs` — accept QUIC connections, dispatch to handlers
- `client.rs` — connect, send requests, receive responses + push events
- `middleware.rs` — tower-based auth + rate-limit layers
- `method.rs` — method registry (method_id → async handler fn)
6. **Create quicproquo-sdk crate** — public API skeleton:
- `client.rs` — `QpqClient` struct
- `events.rs` — `ClientEvent` enum
- `conversation.rs` — `ConversationHandle`, `ConversationStore`
- `config.rs` — `ClientConfig`
7. **Extract server domain types** — `server/src/domain/` module:
- `types.rs` — plain Rust request/response types
- `auth.rs` — OPAQUE logic extracted from auth_ops.rs
- `delivery.rs` — enqueue/fetch logic extracted from delivery.rs
**Verification:**
- `cargo build --workspace` succeeds
- `cargo test -p quicproquo-core` passes (72 tests)
- Proto codegen works
- RPC framework compiles
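
The method registry planned for `method.rs` (method_id → handler) can be sketched as below. To stay self-contained the handlers here are synchronous closures over raw bytes, whereas the plan calls for async handlers; the `Status` codes and all names are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical response status codes (the wire format reserves one byte).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok = 0,
    UnknownMethod = 1,
    BadRequest = 2,
}

/// A handler maps request bytes to response bytes or an error status.
type Handler = Box<dyn Fn(&[u8]) -> Result<Vec<u8>, Status> + Send + Sync>;

#[derive(Default)]
pub struct MethodRegistry {
    handlers: HashMap<u16, Handler>,
}

impl MethodRegistry {
    /// Registers (or replaces) the handler for `method_id`.
    pub fn register<F>(&mut self, method_id: u16, handler: F)
    where
        F: Fn(&[u8]) -> Result<Vec<u8>, Status> + Send + Sync + 'static,
    {
        self.handlers.insert(method_id, Box::new(handler));
    }

    /// Looks up the handler and runs it; unknown IDs yield `UnknownMethod`.
    pub fn dispatch(&self, method_id: u16, payload: &[u8]) -> (Status, Vec<u8>) {
        match self.handlers.get(&method_id) {
            None => (Status::UnknownMethod, Vec::new()),
            Some(handler) => match handler(payload) {
                Ok(body) => (Status::Ok, body),
                Err(status) => (status, Vec::new()),
            },
        }
    }
}
```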
---
### Phase 2: Server Core
**Goal:** Working server with all 33 RPC handlers over QUIC.
1. **RPC dispatch** — method registry, connection lifecycle
2. **Domain handlers** — all 33 methods as `async fn(Request) -> Result<Response>`
- Auth (4): OPAQUE register start/finish, login start/finish
- Delivery (6): enqueue, fetch, fetchWait, peek, ack, batchEnqueue
- Keys (5): upload/fetch key package, upload/fetch/batch-fetch hybrid key
- Channels (1): createChannel
- Users (2): resolveUser, resolveIdentity
- Blobs (2): uploadBlob, downloadBlob
- Devices (3): registerDevice, listDevices, revokeDevice
- P2P (3): health, publishEndpoint, resolveEndpoint
- Federation (6): relay enqueue/batch, proxy fetch/resolve, health
3. **Server-push** — notification stream via QUIC uni-stream
4. **Storage upgrades:**
- Drop `FileBackedStore`
- Connection pool (deadpool-sqlite)
- Persist sessions to SQLite
- Atomic queue depth check + enqueue
5. **Tower middleware** — auth validation, rate limiting, audit logging
6. **Multi-stream** — concurrent RPCs per connection (remove 1-stream limit)
**Verification:**
- Server starts, accepts QUIC connections
- Health check RPC works
- OPAQUE registration + login works
- Message enqueue + fetch round-trip
---
### Phase 3: SDK
**Goal:** Complete client SDK library — the heart of v2.
1. **QpqClient** — connect, OPAQUE auth, session management (no global state)
2. **Crypto pipeline** — MLS processing, sealed sender unwrap, hybrid decrypt
(extracted from repl.rs `poll_messages()`)
3. **Conversation management** — create DM, create group, invite, remove, send, receive
4. **Event system** — `tokio::sync::broadcast` channel of `ClientEvent` replacing poll loop
- `MessageReceived`, `TypingIndicator`, `ConversationCreated`
- `MemberJoined`, `MemberLeft`, `ConnectionLost`, `Reconnected`
5. **Offline support** — outbox queue, retry with backoff, sync on reconnect
6. **ConversationStore** — SQLCipher local DB (migrate from client/conversation.rs)
7. **Key management** — encrypted DiskKeyStore, MLS group state persistence
8. **Token/secret zeroization** — `AuthContext.token` etc. wrapped in `Zeroizing`
**Verification:**
- SDK integration test: connect → login → create DM → send → receive
- No global state (`AUTH_CONTEXT` eliminated)
- Event subscription works
- Offline outbox drains on reconnect
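
For the "retry with backoff" part of offline support, one possible delay schedule is capped exponential backoff. The sketch below is an assumption about policy, not something the plan specifies: `retry_delay` is a hypothetical helper, and jitter is left out for brevity.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`, capped at `max`.
pub fn retry_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // `checked_shl` returns None once the shift would exceed the width of u32;
    // treat that as "as large as possible" and let the cap apply.
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    // `checked_mul` returns None on overflow, which also falls back to the cap.
    base.checked_mul(factor).map_or(max, |delay| delay.min(max))
}
```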
---
### Phase 4: Client
**Goal:** CLI and TUI as thin shells over SDK.
1. **CLI binary** (`qpq`) — clap subcommands calling `QpqClient`
2. **REPL** — readline with tab-completion (rustyline), categorized `/help`
3. **TUI** — ratatui, subscribes to `QpqClient::subscribe()` events
4. **Simplified commands:**
- Hide MLS/KeyPackage internals (auto-refresh)
- Message references by short ID (not index)
- Batch operations (`/create-group team alice bob`)
- Categorized help (Chat, Groups, Security, System)
5. **Auto-server-launch** — keep zero-config DX from v1
6. **Playbook system** — keep YAML-based test scripting
**Verification:**
- `qpq --username alice --password pass` starts REPL (same UX as v1)
- TUI mode works with live event updates
- Tab-completion for commands and usernames
- E2E test: two clients exchange messages
---
### Phase 5: Desktop & Mobile
**Goal:** Tauri 2 app for all platforms.
1. **Tauri 2 project** in `apps/gui/`
2. **Rust backend** — Tauri commands wrapping `QpqClient`
3. **Web frontend** — Svelte or vanilla HTML/JS
4. **Desktop** — Linux, macOS, Windows
5. **Mobile** — iOS, Android via Tauri 2 mobile
6. **QUIC connection migration** — automatic wifi↔cellular handoff
**Verification:**
- Desktop app builds and runs on Linux
- Mobile app builds for Android (emulator)
- Send message from CLI → received in GUI
---
### Phase 6: Polish & Ecosystem
**Goal:** Production readiness.
1. **Federation improvements** — DNS SRV discovery, persistent relay queue with retry
2. **Plugin system v2** — version field, config passthrough, async hooks, WASM plugins
3. **WebTransport** — browser clients over HTTP/3 (same quinn endpoint)
4. **WASM MLS** — compile openmls to wasm32 for browser E2E encryption
5. **CI/CD** — release automation, WASM CI, multi-platform (Linux + macOS)
6. **Security hardening:**
- Fuzz testing (hybrid KEM, sealed sender, padding, protobuf deser)
- Remove all `InsecureServerCertVerifier` paths
- Certificate pinning
- Add passkey/WebAuthn as alternative auth
7. **Double Ratchet for 1:1 DMs** — better per-message forward secrecy than MLS for 2-party
---
## RPC Method Inventory (33 total)
| Category | Methods | Proto File |
|----------|---------|-----------|
| Auth (OPAQUE) | opaqueRegisterStart, opaqueRegisterFinish, opaqueLoginStart, opaqueLoginFinish | auth.proto |
| Delivery | enqueue, fetch, fetchWait, peek, ack, batchEnqueue | delivery.proto |
| Keys | uploadKeyPackage, fetchKeyPackage, uploadHybridKey, fetchHybridKey, fetchHybridKeys | keys.proto |
| Channel | createChannel | channel.proto |
| User | resolveUser, resolveIdentity | user.proto |
| Blob | uploadBlob, downloadBlob | blob.proto |
| Device | registerDevice, listDevices, revokeDevice | device.proto |
| P2P | health, publishEndpoint, resolveEndpoint | p2p.proto |
| Federation | relayEnqueue, relayBatchEnqueue, proxyFetchKeyPackage, proxyFetchHybridKey, proxyResolveUser, federationHealth | federation.proto |
**New in v2:**
| Push Events | Description | Proto File |
|-------------|-------------|-----------|
| MessageNotification | New message available | push.proto |
| TypingNotification | Peer is typing | push.proto |
| ChannelUpdate | Channel created/member changed | push.proto |
| SessionExpired | Auth session expired | push.proto |
---
## Engineering Standards (carried from v1)
- Conventional commits: `feat:`, `fix:`, `chore:`, `docs:`, `test:`, `refactor:`
- GPG-signed commits only
- No `Co-authored-by` trailers
- No `.unwrap()` on crypto or I/O in non-test paths
- Secrets: zeroize on drop, never in logs
- No stubs / `todo!()` / `unimplemented!()` in production code
- `clippy::unwrap_used = "deny"` at workspace level