# quicproquo v2 — Master Implementation Plan

> Created 2026-03-04. This is the authoritative plan for the v2 rewrite.
> See also: `docs/V2-DESIGN-ANALYSIS.md` for the detailed retrospective.

## Context

The v1 codebase has strong crypto foundations (MLS, hybrid PQ KEM, OPAQUE) but suffers from
three architectural bottlenecks: capnp-rpc is `!Send` (single-threaded), client business logic
is trapped in a monolithic REPL with global state, and delivery is poll-based.

This plan creates v2 on a new branch, keeping the crypto stack intact while replacing
the RPC/transport layer, extracting an SDK, and restructuring the workspace.

**Key decisions:**
- Transport: Protobuf (prost) + custom framing over QUIC (quinn)
- Mobile: Tauri 2 (same Rust SDK backend, web UI)
- Branch strategy: `v2` branch from main, not a fresh repo
- Constraints: Rust, QUIC, GPG-signed commits, zeroize secrets, no stubs

---

## Architecture Overview

```
┌─────────────────────────────────────────────────────┐
│                      Frontends                      │
│  CLI/TUI  │  Tauri GUI/Mobile  │  Web (WebTransport)│
└─────┬─────┴────────┬───────────┴──────────┬─────────┘
      │              │                      │
      ▼              ▼                      ▼
┌─────────────────────────────────────────────────────┐
│                   quicproquo-sdk                    │
│ QpqClient { connect, login, send, recv, subscribe } │
│ Event system (tokio broadcast)                      │
│ Crypto pipeline (MLS, sealed sender, hybrid)        │
│ Conversation store (SQLCipher)                      │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│                   quicproquo-rpc                    │
│ QUIC framing: [method:u16][req_id:u32][len:u32][pb] │
│ Multi-stream (1 RPC per stream)                     │
│ Server-push via uni-streams                         │
│ tower middleware (auth, rate-limit)                 │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│                  quicproquo-server                  │
│ Domain services (auth, delivery, channel, blob)     │
│ Store trait → SqlStore (connection pool)            │
│ Plugin hooks, federation, KT                        │
└─────────────────────────────────────────────────────┘
```

### Wire Format

Per QUIC bidirectional stream (request/response):
```
Request:  [method_id: u16][request_id: u32][payload_len: u32][protobuf bytes]
Response: [status: u8][request_id: u32][payload_len: u32][protobuf bytes]
```

Per QUIC unidirectional stream (server → client push):
```
Push: [event_type: u16][payload_len: u32][protobuf bytes]
```

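The layouts above leave byte order and size limits open. As a minimal sketch, assuming big-endian (network order) integers and a hypothetical 1 MiB payload cap, a request frame could be encoded and decoded as shown here. `RequestFrame`, `FrameError`, `MAX_PAYLOAD_LEN`, and both function names are illustrative and not taken from the v1 or v2 code; response and push frames would follow the same pattern with their own headers.

```rust
/// Request header size: method_id (2) + request_id (4) + payload_len (4).
const REQUEST_HEADER_LEN: usize = 2 + 4 + 4;

/// ASSUMPTION: the plan does not specify a payload cap; 1 MiB is a placeholder.
pub const MAX_PAYLOAD_LEN: usize = 1 << 20;

/// A decoded request frame. `payload` holds the still-encoded protobuf bytes.
#[derive(Debug, PartialEq)]
pub struct RequestFrame {
    pub method_id: u16,
    pub request_id: u32,
    pub payload: Vec<u8>,
}

#[derive(Debug, PartialEq)]
pub enum FrameError {
    /// Fewer bytes than the header or the declared payload length requires.
    Truncated,
    /// The payload length exceeds `MAX_PAYLOAD_LEN`.
    TooLarge,
}

/// Serializes a request as [method_id][request_id][payload_len][payload].
/// ASSUMPTION: all integers are big-endian.
pub fn encode_request(frame: &RequestFrame) -> Result<Vec<u8>, FrameError> {
    if frame.payload.len() > MAX_PAYLOAD_LEN {
        return Err(FrameError::TooLarge);
    }
    let mut out = Vec::with_capacity(REQUEST_HEADER_LEN + frame.payload.len());
    out.extend_from_slice(&frame.method_id.to_be_bytes());
    out.extend_from_slice(&frame.request_id.to_be_bytes());
    out.extend_from_slice(&(frame.payload.len() as u32).to_be_bytes());
    out.extend_from_slice(&frame.payload);
    Ok(out)
}

/// Parses one request frame from the start of `buf`, validating both lengths.
pub fn decode_request(buf: &[u8]) -> Result<RequestFrame, FrameError> {
    if buf.len() < REQUEST_HEADER_LEN {
        return Err(FrameError::Truncated);
    }
    let method_id = u16::from_be_bytes([buf[0], buf[1]]);
    let request_id = u32::from_be_bytes([buf[2], buf[3], buf[4], buf[5]]);
    let payload_len = u32::from_be_bytes([buf[6], buf[7], buf[8], buf[9]]) as usize;
    if payload_len > MAX_PAYLOAD_LEN {
        return Err(FrameError::TooLarge);
    }
    let body = &buf[REQUEST_HEADER_LEN..];
    if body.len() < payload_len {
        return Err(FrameError::Truncated);
    }
    Ok(RequestFrame {
        method_id,
        request_id,
        payload: body[..payload_len].to_vec(),
    })
}
```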
Each RPC opens its own QUIC bidirectional stream, so concurrent RPCs are multiplexed naturally and a stalled stream does not block the others (no head-of-line blocking between RPCs).

---

## Workspace Structure (v2: 9 crates)

```
quicproquo/
├── crates/
│   ├── quicproquo-core/          # KEEP AS-IS — crypto primitives, MLS, hybrid KEM
│   ├── quicproquo-kt/            # KEEP AS-IS — key transparency
│   ├── quicproquo-plugin-api/    # KEEP AS-IS — #![no_std] C-ABI
│   ├── quicproquo-proto/         # REWRITE — protobuf schemas + prost codegen
│   ├── quicproquo-rpc/           # NEW — QUIC RPC framework (framing, dispatch, tower)
│   ├── quicproquo-sdk/           # NEW — client business logic library
│   ├── quicproquo-server/        # REWRITE — domain services + RPC handlers
│   ├── quicproquo-client/        # REWRITE — thin CLI/TUI shell over SDK
│   └── quicproquo-p2p/           # KEEP — iroh mesh (feature-flagged, later)
├── apps/
│   └── gui/                      # Tauri 2 desktop + mobile app (outside workspace)
├── proto/                        # .proto source files
│   └── qpq/v1/
│       ├── auth.proto            # OPAQUE registration + login (4 methods)
│       ├── delivery.proto        # enqueue, fetch, fetchWait, peek, ack, batch (6 methods)
│       ├── keys.proto            # key package + hybrid key CRUD (5 methods)
│       ├── channel.proto         # channel create (1 method)
│       ├── user.proto            # resolve user/identity (2 methods)
│       ├── blob.proto            # upload/download (2 methods)
│       ├── device.proto          # register/list/revoke (3 methods)
│       ├── p2p.proto             # endpoint publish/resolve + health (3 methods)
│       ├── federation.proto      # relay + proxy (6 methods)
│       ├── push.proto            # server-push events (NEW)
│       └── common.proto          # shared types (Auth, Envelope, Error)
├── sdks/
│   ├── go/                       # Go SDK (regenerate from .proto)
│   └── typescript/               # TS SDK (WebTransport client)
├── justfile                      # NEW — build commands
└── Cargo.toml                    # workspace root
```

**Removed from workspace:**
- `quicproquo-bot` → `sdk::bot` module
- `quicproquo-ffi` → `sdk` with `--features c-ffi`
- `quicproquo-gen` → `scripts/`
- `quicproquo-gui` → `apps/gui/` (Tauri project, outside workspace)
- `quicproquo-mobile` → merged into `apps/gui/` (Tauri 2 mobile)

---

## Crate Reuse Assessment

| v1 Crate | capnp deps? | v2 Action | Effort |
|----------|:-----------:|-----------|--------|
| **quicproquo-core** | None | Copy as-is | Zero |
| **quicproquo-kt** | None | Copy as-is | Zero |
| **quicproquo-plugin-api** | None | Copy as-is | Zero |
| **quicproquo-p2p** | None | Copy as-is | Zero |
| **quicproquo-proto** | 100% capnp | Replace with prost codegen | Medium |
| **quicproquo-server** | 16/20 files | Extract domain logic, rewrite handlers | High |
| **quicproquo-client** | 6/10 files | Extract to SDK, thin CLI shell | High |

### Key Files to Reuse Directly

| Source (v1) | Destination (v2) | Notes |
|-------------|------------------|-------|
| `crates/quicproquo-core/` (entire) | same path | Zero changes |
| `crates/quicproquo-kt/` (entire) | same path | Zero changes |
| `crates/quicproquo-plugin-api/` (entire) | same path | Zero changes |
| `server/src/storage.rs` | `server/src/storage.rs` | Store trait — keep |
| `server/src/sql_store.rs` | `server/src/sql_store.rs` | Add connection pool |
| `server/src/hooks.rs` | `server/src/hooks.rs` | Plugin system — keep |
| `server/src/plugin_loader.rs` | `server/src/plugin_loader.rs` | Keep |
| `server/src/error_codes.rs` | `server/src/error_codes.rs` | Keep |
| `server/src/config.rs` | `server/src/config.rs` | Update for new transport |
| `client/src/conversation.rs` | `sdk/src/conversation.rs` | Move to SDK |
| `client/src/token_cache.rs` | `sdk/src/token_cache.rs` | Move to SDK |
| `client/src/display.rs` | `client/src/display.rs` | Keep in CLI |
| `schemas/*.capnp` | reference only | Translate to .proto |

---

## Phased Implementation

### Phase 1: Foundation
**Goal:** v2 branch with new workspace, proto schemas, RPC framework skeleton, SDK skeleton.
**Scope:** Compiles, no runtime functionality yet.

1. **Create v2 branch** from main
2. **Restructure workspace** — update root Cargo.toml, create new crate dirs, add justfile
3. **Write .proto files** — translate all 33 RPC methods + push events from Cap'n Proto
4. **Create quicproquo-proto crate** — prost-build codegen
5. **Create quicproquo-rpc crate** — QUIC RPC framework:
   - `framing.rs` — wire format encode/decode (request, response, push)
   - `server.rs` — accept QUIC connections, dispatch to handlers
   - `client.rs` — connect, send requests, receive responses + push events
   - `middleware.rs` — tower-based auth + rate-limit layers
   - `method.rs` — method registry (method_id → async handler fn)
6. **Create quicproquo-sdk crate** — public API skeleton:
   - `client.rs` — `QpqClient` struct
   - `events.rs` — `ClientEvent` enum
   - `conversation.rs` — `ConversationHandle`, `ConversationStore`
   - `config.rs` — `ClientConfig`
7. **Extract server domain types** — `server/src/domain/` module:
   - `types.rs` — plain Rust request/response types
   - `auth.rs` — OPAQUE logic extracted from auth_ops.rs
   - `delivery.rs` — enqueue/fetch logic extracted from delivery.rs

**Verification:**
- `cargo build --workspace` succeeds
- `cargo test -p quicproquo-core` passes (72 tests)
- Proto codegen works
- RPC framework compiles

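One way to picture the method registry planned for `method.rs` (step 5) is a map from `method_id` to a boxed handler. The sketch below is std-only and synchronous to stay self-contained, whereas the plan calls for async handlers behind tower middleware. `MethodRegistry`, `Handler`, and the two status values are hypothetical and chosen only for illustration.

```rust
use std::collections::HashMap;

/// Handler shape for this sketch: raw request payload in, raw response payload
/// out, or a non-zero status byte on failure.
/// ASSUMPTION: synchronous closures stand in for the planned async handlers.
pub type Handler = Box<dyn Fn(&[u8]) -> Result<Vec<u8>, u8> + Send + Sync>;

/// ASSUMPTION: status codes are placeholders; the plan only fixes `status: u8`.
pub const STATUS_OK: u8 = 0;
pub const STATUS_UNKNOWN_METHOD: u8 = 1;

/// Maps a wire-level `method_id` to the handler that serves it.
pub struct MethodRegistry {
    handlers: HashMap<u16, Handler>,
}

impl MethodRegistry {
    pub fn new() -> Self {
        MethodRegistry { handlers: HashMap::new() }
    }

    /// Registers `handler` under `method_id`.
    /// Returns false and keeps the existing handler if the id is already taken.
    pub fn register<F>(&mut self, method_id: u16, handler: F) -> bool
    where
        F: Fn(&[u8]) -> Result<Vec<u8>, u8> + Send + Sync + 'static,
    {
        if self.handlers.contains_key(&method_id) {
            return false;
        }
        self.handlers.insert(method_id, Box::new(handler));
        true
    }

    /// Looks up the handler and runs it, returning (status, response payload).
    pub fn dispatch(&self, method_id: u16, payload: &[u8]) -> (u8, Vec<u8>) {
        match self.handlers.get(&method_id) {
            None => (STATUS_UNKNOWN_METHOD, Vec::new()),
            Some(handler) => match handler(payload) {
                Ok(body) => (STATUS_OK, body),
                Err(status) => (status, Vec::new()),
            },
        }
    }
}
```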
---
### Phase 2: Server Core
**Goal:** Working server with all 33 RPC handlers over QUIC.

1. **RPC dispatch** — method registry, connection lifecycle
2. **Domain handlers** — all 33 methods as `async fn(Request) -> Result<Response>`
   - Auth (4): OPAQUE register start/finish, login start/finish
   - Delivery (6): enqueue, fetch, fetchWait, peek, ack, batchEnqueue
   - Keys (5): upload/fetch key package, upload/fetch/batch-fetch hybrid key
   - Channels (1): createChannel
   - Users (2): resolveUser, resolveIdentity
   - Blobs (2): uploadBlob, downloadBlob
   - Devices (3): registerDevice, listDevices, revokeDevice
   - P2P (3): health, publishEndpoint, resolveEndpoint
   - Federation (6): relay enqueue/batch, proxy fetch/resolve, health
3. **Server-push** — notification stream via QUIC uni-stream
4. **Storage upgrades:**
   - Drop `FileBackedStore`
   - Connection pool (deadpool-sqlite)
   - Persist sessions to SQLite
   - Atomic queue depth check + enqueue
5. **Tower middleware** — auth validation, rate limiting, audit logging
6. **Multi-stream** — concurrent RPCs per connection (remove 1-stream limit)

**Verification:**
- Server starts, accepts QUIC connections
- Health check RPC works
- OPAQUE registration + login works
- Message enqueue + fetch round-trip

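The "atomic queue depth check + enqueue" item in step 4 can be illustrated in memory: the depth check and the append happen under a single lock acquisition, so two concurrent callers cannot both pass the check and overshoot the limit. In the real `SqlStore` the same guarantee would presumably come from a single SQL transaction. Here `BoundedQueues` and `EnqueueError` are hypothetical names and the `Mutex` is only a stand-in for the database.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Mutex;

#[derive(Debug, PartialEq)]
pub enum EnqueueError {
    /// The recipient's queue already holds `limit` messages.
    QueueFull { depth: usize, limit: usize },
}

/// Per-recipient message queues with a shared depth limit.
/// ASSUMPTION: an in-memory map replaces the SQLite-backed store for this sketch.
pub struct BoundedQueues {
    limit: usize,
    queues: Mutex<HashMap<String, VecDeque<Vec<u8>>>>,
}

impl BoundedQueues {
    pub fn new(limit: usize) -> Self {
        BoundedQueues { limit, queues: Mutex::new(HashMap::new()) }
    }

    /// Checks the depth and appends while holding the lock the whole time.
    /// Returns the queue depth after the append.
    pub fn enqueue(&self, recipient: &str, msg: Vec<u8>) -> Result<usize, EnqueueError> {
        // Recover the guard even if another thread panicked while holding it.
        let mut queues = self.queues.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
        let queue = queues
            .entry(recipient.to_string())
            .or_insert_with(VecDeque::new);
        if queue.len() >= self.limit {
            return Err(EnqueueError::QueueFull { depth: queue.len(), limit: self.limit });
        }
        queue.push_back(msg);
        Ok(queue.len())
    }
}
```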
---
### Phase 3: SDK
**Goal:** Complete client SDK library — the heart of v2.

1. **QpqClient** — connect, OPAQUE auth, session management (no global state)
2. **Crypto pipeline** — MLS processing, sealed sender unwrap, hybrid decrypt
   (extracted from repl.rs `poll_messages()`)
3. **Conversation management** — create DM, create group, invite, remove, send, receive
4. **Event system** — `tokio::broadcast<ClientEvent>` replacing poll loop
   - `MessageReceived`, `TypingIndicator`, `ConversationCreated`
   - `MemberJoined`, `MemberLeft`, `ConnectionLost`, `Reconnected`
5. **Offline support** — outbox queue, retry with backoff, sync on reconnect
6. **ConversationStore** — SQLCipher local DB (migrate from client/conversation.rs)
7. **Key management** — encrypted DiskKeyStore, MLS group state persistence
8. **Token/secret zeroization** — `AuthContext.token` etc. wrapped in `Zeroizing`

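The event system in step 4 is planned on a tokio broadcast channel. To keep a sketch dependency-free, the fan-out idea can be shown with `std::sync::mpsc` instead, which is a stand-in and not the planned implementation: each subscriber gets its own receiver, and every published event is cloned to all live subscribers. Only the variant names come from the list above; the payload fields on `ClientEvent` and the `EventBus` type are assumptions.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Events a frontend can react to. Variant names follow the plan;
/// ASSUMPTION: all payload fields are illustrative.
#[derive(Clone, Debug, PartialEq)]
pub enum ClientEvent {
    MessageReceived { conversation_id: u64, body: Vec<u8> },
    TypingIndicator { conversation_id: u64 },
    ConversationCreated { conversation_id: u64 },
    MemberJoined { conversation_id: u64, member: String },
    MemberLeft { conversation_id: u64, member: String },
    ConnectionLost,
    Reconnected,
}

/// Minimal fan-out bus: one std mpsc channel per subscriber.
/// ASSUMPTION: stand-in for the tokio broadcast channel named in the plan.
pub struct EventBus {
    subscribers: Vec<Sender<ClientEvent>>,
}

impl EventBus {
    pub fn new() -> Self {
        EventBus { subscribers: Vec::new() }
    }

    /// Registers a new subscriber and hands back its receiving end.
    pub fn subscribe(&mut self) -> Receiver<ClientEvent> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Clones `event` to every subscriber, forgetting those whose receiver
    /// was dropped. Returns the number of subscribers that got the event.
    pub fn publish(&mut self, event: &ClientEvent) -> usize {
        self.subscribers.retain(|tx| tx.send(event.clone()).is_ok());
        self.subscribers.len()
    }
}
```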
**Verification:**
- SDK integration test: connect → login → create DM → send → receive
- No global state (`AUTH_CONTEXT` eliminated)
- Event subscription works
- Offline outbox drains on reconnect

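For the offline outbox (step 5), the plan does not pin down a backoff policy. A minimal sketch, assuming exponential doubling with a cap and strictly in-order delivery, might look like this. `Outbox`, `drain_with`, and `backoff_delay` are hypothetical names, and the `send` closure stands in for the real RPC call.

```rust
use std::collections::VecDeque;
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
/// ASSUMPTION: exponential doubling; the plan only says "retry with backoff".
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // Saturate the multiplier instead of overflowing on large attempt counts.
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).map_or(max, |delay| delay.min(max))
}

/// FIFO queue of messages that could not be sent while offline.
pub struct Outbox {
    pending: VecDeque<Vec<u8>>,
    failed_attempts: u32,
    base: Duration,
    max: Duration,
}

impl Outbox {
    pub fn new(base: Duration, max: Duration) -> Self {
        Outbox { pending: VecDeque::new(), failed_attempts: 0, base, max }
    }

    pub fn push(&mut self, msg: Vec<u8>) {
        self.pending.push_back(msg);
    }

    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }

    /// Sends queued messages in order via `send`, which returns true on success.
    /// Stops at the first failure and returns the delay to wait before retrying;
    /// returns None once the outbox is empty.
    pub fn drain_with<F>(&mut self, mut send: F) -> Option<Duration>
    where
        F: FnMut(&[u8]) -> bool,
    {
        loop {
            let sent = match self.pending.front() {
                Some(msg) => send(msg.as_slice()),
                None => return None,
            };
            if !sent {
                let delay = backoff_delay(self.failed_attempts, self.base, self.max);
                self.failed_attempts = self.failed_attempts.saturating_add(1);
                return Some(delay);
            }
            // Remove the message only after it was sent, then reset the backoff.
            self.pending.pop_front();
            self.failed_attempts = 0;
        }
    }
}
```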
---
### Phase 4: Client
**Goal:** CLI and TUI as thin shells over SDK.

1. **CLI binary** (`qpq`) — clap subcommands calling `QpqClient`
2. **REPL** — readline with tab-completion (rustyline), categorized `/help`
3. **TUI** — ratatui, subscribes to `QpqClient::subscribe()` events
4. **Simplified commands:**
   - Hide MLS/KeyPackage internals (auto-refresh)
   - Message references by short ID (not index)
   - Batch operations (`/create-group team alice bob`)
   - Categorized help (Chat, Groups, Security, System)
5. **Auto-server-launch** — keep zero-config DX from v1
6. **Playbook system** — keep YAML-based test scripting

**Verification:**
- `qpq --username alice --password pass` starts REPL (same UX as v1)
- TUI mode works with live event updates
- Tab-completion for commands and usernames
- E2E test: two clients exchange messages

---

### Phase 5: Desktop & Mobile
**Goal:** Tauri 2 app for all platforms.

1. **Tauri 2 project** in `apps/gui/`
2. **Rust backend** — Tauri commands wrapping `QpqClient`
3. **Web frontend** — Svelte or vanilla HTML/JS
4. **Desktop** — Linux, macOS, Windows
5. **Mobile** — iOS, Android via Tauri 2 mobile
6. **QUIC connection migration** — automatic wifi↔cellular handoff

**Verification:**
- Desktop app builds and runs on Linux
- Mobile app builds for Android (emulator)
- Send message from CLI → received in GUI

---

### Phase 6: Polish & Ecosystem
**Goal:** Production readiness.

1. **Federation improvements** — DNS SRV discovery, persistent relay queue with retry
2. **Plugin system v2** — version field, config passthrough, async hooks, WASM plugins
3. **WebTransport** — browser clients over HTTP/3 (same quinn endpoint)
4. **WASM MLS** — compile openmls to wasm32 for browser E2E encryption
5. **CI/CD** — release automation, WASM CI, multi-platform (Linux + macOS)
6. **Security hardening:**
   - Fuzz testing (hybrid KEM, sealed sender, padding, protobuf deser)
   - Remove all `InsecureServerCertVerifier` paths
   - Certificate pinning
   - Add passkey/WebAuthn as alternative auth
7. **Double Ratchet for 1:1 DMs** — better per-message forward secrecy than MLS for 2-party

---

## RPC Method Inventory (33 total)

| Category | Methods | Proto File |
|----------|---------|-----------|
| Auth (OPAQUE) | opaqueRegisterStart, opaqueRegisterFinish, opaqueLoginStart, opaqueLoginFinish | auth.proto |
| Delivery | enqueue, fetch, fetchWait, peek, ack, batchEnqueue | delivery.proto |
| Keys | uploadKeyPackage, fetchKeyPackage, uploadHybridKey, fetchHybridKey, fetchHybridKeys | keys.proto |
| Channel | createChannel | channel.proto |
| User | resolveUser, resolveIdentity | user.proto |
| Blob | uploadBlob, downloadBlob | blob.proto |
| Device | registerDevice, listDevices, revokeDevice | device.proto |
| P2P | health, publishEndpoint, resolveEndpoint | p2p.proto |
| Federation | relayEnqueue, relayBatchEnqueue, proxyFetchKeyPackage, proxyFetchHybridKey, proxyResolveUser, federationHealth | federation.proto |

**New in v2:**

| Push Events | Description | Proto File |
|-------------|-------------|-----------|
| MessageNotification | New message available | push.proto |
| TypingNotification | Peer is typing | push.proto |
| ChannelUpdate | Channel created/member changed | push.proto |
| SessionExpired | Auth session expired | push.proto |

---

## Engineering Standards (carried from v1)

- Conventional commits: `feat:`, `fix:`, `chore:`, `docs:`, `test:`, `refactor:`
- GPG-signed commits only
- No `Co-authored-by` trailers
- No `.unwrap()` on crypto or I/O in non-test paths
- Secrets: zeroize on drop, never in logs
- No stubs / `todo!()` / `unimplemented!()` in production code
- `clippy::unwrap_used = "deny"` at workspace level