chore: fix all clippy warnings across workspace
380
docs/V2-DESIGN-ANALYSIS.md
Normal file
@@ -0,0 +1,380 @@
# quicproquo v2 — Design Analysis & Recommendations

> Multi-perspective retrospective of the v1 architecture.
> Produced 2026-03-04 by four parallel analysis agents examining server,
> client/UX, crypto/security, and project structure/DX.

---

## Executive Summary

quicproquo v1 demonstrates strong fundamentals: QUIC-native transport, RFC 9420
MLS group encryption, post-quantum hybrid KEM, OPAQUE zero-knowledge auth, and a
working multi-language SDK surface. These are the right bets and put the project
ahead of most open-source messengers on the crypto front.

However, three architectural choices limit the path to production:

1. **capnp-rpc is `!Send`** — forces single-threaded RPC handling, blocking
   scalability.
2. **Monolithic client with global state** — business logic is tangled into the
   REPL, duplicated across TUI/GUI/Web, and cannot be used as a library.
3. **Poll-based delivery** — 1-second polling wastes bandwidth and adds latency;
   no server-push channel exists.

A v2 should keep the crypto stack (MLS + hybrid PQ KEM + OPAQUE), keep QUIC, but
rearchitect the RPC layer, extract an SDK crate, and add push-based delivery.

---

## Part 1 — What Works Well

### Transport & Protocol
- **QUIC (quinn) + TLS 1.3** — correct choice. Built-in encryption, connection
  migration, 0-RTT potential. No reason to change.
- **Cap'n Proto schemas as API contract** — zero-copy wire format, compact
  binary, schema evolution via ordinals. The *schemas* are good; the *RPC
  runtime* is the problem.

### Cryptography
- **MLS (RFC 9420, openmls)** — only IETF-standard group E2E protocol. No
  realistic alternative for groups > 2 members. Test suite is thorough (1005
  lines covering 2-party, 3-party, hybrid, removal, leave, stale epoch).
- **Hybrid PQ KEM (X25519 + ML-KEM-768)** — forward-thinking dual-algorithm
  protection. Well-implemented with versioned wire format, proper zeroization,
  and 12 targeted tests. Ahead of Signal (PQXDH, late 2023) and Matrix (no PQ).
- **OPAQUE (RFC 9807)** — server never sees passwords. Ristretto255 + Argon2id
  is best-in-class.
- **Sealed sender, safety numbers, message padding** — all clean, simple,
  correct. Safety numbers use 5200 HMAC-SHA256 iterations, the same iteration
  count as Signal's fingerprints.
- **Zeroization discipline** — secrets wrapped in `Zeroizing`, Debug impls
  redact keys, no `.unwrap()` in crypto paths.
- **WASM feature gating** — `core/native` cleanly separates WASM-safe crypto
  from native-only modules (MLS, OPAQUE, filesystem).

### Server Design
- **Store trait abstraction** — 30+ methods, clean backend swap (SqlStore vs
  FileBackedStore). Well-factored.
- **OPAQUE auth with timing floors** — `resolveUser`/`resolveIdentity` mask
  lookup timing to prevent username enumeration.
- **Delivery proofs** — Ed25519-signed receipt of server acceptance. Clients get
  cryptographic evidence.
- **`wasNew` flag on createChannel** — elegantly solves the dual-MLS-group race
  condition where both DM parties try to initialize.
- **Plugin hooks (C-ABI)** — `#![no_std]` vtable, zero dependencies, chained
  hooks with continue/reject protocol. Clean extensibility.
- **Production config validation** — enforces encrypted storage, strong auth
  tokens, pre-existing TLS certs.

### Client & DX
- **Zero-config local dev** — `qpq --username alice --password pass` auto-starts
  server, generates TLS certs, registers, and logs in. Genuinely excellent.
- **Encrypted-at-rest everything** — state file (QPCE), conversation DB
  (SQLCipher), session cache. Argon2id + ChaCha20-Poly1305 throughout.
- **Playbook system** — YAML-scripted command execution with assertions. Great
  for CI/integration testing.
- **Conversation store** — SQLite with deduplication, outbox for offline
  queuing, activity tracking.
- **Conventional commits, GPG-signed** — consistent `feat:`/`fix:`/`docs:`
  discipline.
- **Security lints enforced by build** — `clippy::unwrap_used = "deny"`,
  `unsafe_code = "warn"`.

---

## Part 2 — What Needs Rethinking

### 2.1 RPC Layer: capnp-rpc is the #1 Scalability Bottleneck

**Problem:** `capnp-rpc` uses `Rc` internally and is `!Send`. Everything runs on
a `LocalSet` with `spawn_local`. All 27 RPC methods serialize through a single
thread. No work-stealing, no multi-core utilization.

**Impact:** With 1000+ concurrent clients, the single-threaded executor cannot
keep up. A slow `fetchWait` (30s timeout) blocks the entire connection.

**Also:** The WebSocket bridge (`ws_bridge.rs`, 645 lines) exists solely because
Cap'n Proto cannot run in browsers. This duplicates handler logic and creates
maintenance burden.

### 2.2 Client Architecture: Monolith with Global State

**Problem:** `AUTH_CONTEXT` is a process-wide `RwLock<Option<ClientAuth>>`.
Business logic (MLS processing, sealed sender, hybrid decryption, message
routing) lives inside `repl.rs`'s `poll_messages()` — a 100-line function that
mixes transport, crypto, routing, and storage.

**Impact:** Every frontend (REPL, TUI, GUI, Web) must reimplement message
processing. The TUI already duplicates it. The GUI stub and mobile PoC would need
yet another copy. Client cannot be used as a library.

### 2.3 Delivery Model: Poll-Based, No Push Channel

**Problem:** Client polls every 1 second with `fetch_wait(timeout_ms=0)`, so it
never actually long-polls. The result is constant network traffic even when
idle, and up to ~1 second of added latency for message delivery.

**Also:** `fetch` is destructive (drains queue). If the client crashes between
receive and processing, messages are lost.

### 2.4 Connection Model: Single Stream

**Problem:** `max_concurrent_bidi_streams(1)` means the entire QUIC connection is
effectively single-stream. A blocking `fetchWait` prevents all other RPCs.

### 2.5 Storage: Single Mutex-Guarded SQLite Connection

**Problem:** `SqlStore` uses `Mutex<Connection>`. Every database operation
acquires a global lock. Under concurrent load, all storage access serializes.

**Also:** `FileBackedStore` flushes the entire map on every write (O(n) I/O).
Sessions are in-memory only — server restart forces all clients to re-login.

### 2.6 Key Management Gaps

- **DiskKeyStore** — HPKE private keys stored as plaintext bincode on disk. No
  encryption at rest.
- **MLS group state** — `GroupMember` holds `MlsGroup` in memory only. Process
  crash loses all group state.
- **Token zeroization** — `AuthContext.token`, `ClientAuth.access_token` are not
  wrapped in `Zeroizing`.

### 2.7 Workspace Bloat

12 crates for a project at this maturity is excessive. Several are thin stubs
(`quicproquo-gen`, `quicproquo-bot` at 354 lines) or broken (`quicproquo-gui`
fails `cargo build --workspace`).

---

## Part 3 — v2 Architecture Recommendations

### 3.1 Replace capnp-rpc with a Send-Compatible RPC Framework

**Recommendation:** Switch to **tonic (gRPC)** or a custom framing layer.

| Dimension | capnp-rpc (v1) | tonic/gRPC (v2) |
|-----------|---------------|-----------------|
| Threading | `!Send`, single-threaded | `Send + Sync`, multi-threaded |
| Browser | Requires WS bridge | grpc-web native |
| Streaming | Not supported | Built-in |
| Middleware | None (copy-paste auth) | Interceptors/layers |
| Ecosystem | Niche | Massive (every language) |

**Alternative:** Keep Cap'n Proto *schemas* for serialization (zero-copy
advantage) but replace capnp-rpc with custom framing over QUIC streams. This
preserves the wire format while gaining `Send` compatibility.

The WS bridge would be eliminated entirely — grpc-web or WebTransport gives
browsers direct access.

### 3.2 Extract an SDK Crate (Most Important Client Change)

Create `quicproquo-sdk` that owns all business logic:

```
quicproquo-sdk/
  src/
    client.rs        -- QpqClient: connect, login, send, receive
    events.rs        -- ClientEvent enum (push-based)
    conversation.rs  -- ConversationHandle, group management
    crypto.rs        -- MLS pipeline, sealed sender, hybrid decryption
    sync.rs          -- message sync, offline queue, retry
```

All frontends become thin shells:

```
CLI/REPL   -> calls sdk
TUI        -> calls sdk
Tauri GUI  -> calls sdk (via Tauri commands)
Mobile     -> calls sdk (via C FFI)
Web/WASM   -> calls sdk (compiled to wasm32)
```

**Key API shape:**
```rust
// API sketch: signatures only, bodies omitted.
pub struct QpqClient { /* session, rpc, crypto pipeline */ }

impl QpqClient {
    /// Opens the connection; all state lives in the returned instance.
    pub async fn connect(config: ClientConfig) -> Result<Self>;
    /// Authenticates this instance; the session is stored in `self`.
    pub async fn login(&mut self, username: &str, password: &str) -> Result<()>;
    pub async fn dm(&mut self, username: &str) -> Result<ConversationHandle>;
    pub async fn create_group(&mut self, name: &str) -> Result<ConversationHandle>;
    /// Sends `text` to the given conversation.
    pub async fn send(&mut self, conversation: &ConversationHandle, text: &str) -> Result<MessageId>;
    /// Returns a receiver of push-based client events.
    pub fn subscribe(&self) -> Receiver<ClientEvent>;
}
```

No global state. No `AUTH_CONTEXT`. Auth context is per-`QpqClient` instance.

### 3.3 Add Push-Based Delivery

**Recommendation:** Dedicated QUIC unidirectional stream for server-push
notifications.

```
Client opens bidi stream 0  -> RPC channel (request/response)
Server opens uni stream 3   -> push notifications (new message, typing, etc.)
```

Stream 3 is the first server-initiated unidirectional stream ID in QUIC.

Benefits:
- Near-immediate message delivery (no polling interval)
- No idle network traffic
- Typing indicators delivered in real-time
- Graceful degradation: fall back to long-poll if push stream fails

**Also:** Make `peek` + `ack` the default delivery pattern (not destructive
`fetch`). Add idempotency keys to prevent duplicate messages on retry.
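
The `peek` + `ack` pattern with idempotency keys could look roughly like the following in-memory sketch. `DeliveryQueue`, its method names, and the string-keyed dedup set are hypothetical and not the server's actual types; a real store would persist the queue and bound the set of remembered keys.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical in-memory queue illustrating peek + ack delivery.
pub struct DeliveryQueue {
    next_seq: u64,
    pending: VecDeque<(u64, Vec<u8>)>,
    seen_keys: HashSet<String>,
}

impl DeliveryQueue {
    pub fn new() -> Self {
        Self { next_seq: 1, pending: VecDeque::new(), seen_keys: HashSet::new() }
    }

    /// Stores a message once per idempotency key.
    /// Returns the assigned sequence number, or `None` for a duplicate retry.
    pub fn enqueue(&mut self, idempotency_key: &str, payload: Vec<u8>) -> Option<u64> {
        if !self.seen_keys.insert(idempotency_key.to_owned()) {
            return None;
        }
        let seq = self.next_seq;
        self.next_seq += 1;
        self.pending.push_back((seq, payload));
        Some(seq)
    }

    /// Non-destructive read: messages stay queued until acknowledged.
    pub fn peek(&self, limit: usize) -> Vec<(u64, Vec<u8>)> {
        self.pending.iter().take(limit).cloned().collect()
    }

    /// Drops every message with sequence number <= `up_to`.
    /// Returns how many messages were removed.
    pub fn ack(&mut self, up_to: u64) -> usize {
        let mut removed = 0;
        while matches!(self.pending.front(), Some((seq, _)) if *seq <= up_to) {
            self.pending.pop_front();
            removed += 1;
        }
        removed
    }
}
```

Because `peek` leaves messages in place, a client that crashes before calling `ack` simply sees the same messages again on its next `peek`.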

### 3.4 Multi-Stream Connections

Allow 4-8 concurrent bidirectional QUIC streams per connection. This enables:
- Pipelined RPCs (send while fetching)
- Concurrent blob upload + chat
- `fetchWait` on one stream without blocking others

### 3.5 Storage Improvements

| Change | Rationale |
|--------|-----------|
| Drop `FileBackedStore` | O(n) flush per write, no federation support |
| Connection pool for SQLite | Replace `Mutex<Connection>` with r2d2/deadpool |
| Persist sessions to DB | Server restart shouldn't force re-login |
| Encrypt DiskKeyStore at rest | HPKE private keys in plaintext is a real vuln |
| Persist MLS group state | Process crash shouldn't lose group state |
| Atomic keystore writes | tempfile-then-rename pattern |
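
The tempfile-then-rename pattern from the last row can be sketched with the standard library alone. `write_atomic` and the `.tmp` sibling name are assumptions for illustration, not the existing keystore API; the sketch also assumes the temporary file and the target live on the same filesystem, which is what makes the final rename a single replacement step.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Writes `bytes` to `path` so that readers see either the old file or the
/// complete new file, never a partially written one.
pub fn write_atomic(path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Hypothetical naming: a sibling file with a `.tmp` extension.
    let tmp = path.with_extension("tmp");
    {
        let mut file = File::create(&tmp)?;
        file.write_all(bytes)?;
        // Flush file contents to disk before the rename makes them visible.
        file.sync_all()?;
    }
    // Replaces the destination in one step if it already exists.
    fs::rename(&tmp, path)
}
```

A crash before the rename leaves the previous keystore untouched; only the stray `.tmp` file needs cleaning up on the next start.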

### 3.6 Crypto Stack Refinements

The algorithms are correct. The refinements are operational:

| Change | Rationale |
|--------|-----------|
| Typed MLS error variants | Stop losing error info via `format!("{e:?}")` |
| Formalize hybrid PQ ciphersuite ID | Replace length-based key detection |
| Remove all InsecureServerCertVerifier | No TLS bypass on any platform |
| Add passkey/WebAuthn alt-auth | Better UX for GUI/mobile, no password to forget |
| Consider Double Ratchet for 1:1 DMs | MLS is over-engineered for 2-party; DR gives better per-message forward secrecy |
| Token/session secret zeroization | `AuthContext.token` et al. need `Zeroizing` wrappers |
| Fix serde deserialization of secrets | Intermediate non-zeroized `Vec<u8>` in `IdentityKeypair::deserialize` |
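
As an illustration of the first row, typed error variants let callers branch on the failure instead of parsing a debug string. The `MlsError` enum, its variants, and `should_resync` below are hypothetical names chosen for this sketch; they are not the openmls error types or the project's current ones.

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Hypothetical typed error for the MLS pipeline.
#[derive(Debug)]
pub enum MlsError {
    /// The message was encrypted for an epoch other than the group's current one.
    StaleEpoch { group_epoch: u64, message_epoch: u64 },
    /// No local state exists for the addressed group.
    UnknownGroup,
    /// Persisting or loading group state failed; the cause is preserved.
    Storage(io::Error),
}

impl fmt::Display for MlsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MlsError::StaleEpoch { group_epoch, message_epoch } => write!(
                f,
                "stale epoch: message is at {message_epoch}, group is at {group_epoch}"
            ),
            MlsError::UnknownGroup => write!(f, "unknown group"),
            MlsError::Storage(e) => write!(f, "group state storage failed: {e}"),
        }
    }
}

impl Error for MlsError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            MlsError::Storage(e) => Some(e),
            _ => None,
        }
    }
}

impl From<io::Error> for MlsError {
    fn from(e: io::Error) -> Self {
        MlsError::Storage(e)
    }
}

/// Callers can now match on the variant rather than inspecting text.
pub fn should_resync(err: &MlsError) -> bool {
    matches!(err, MlsError::StaleEpoch { .. })
}
```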

### 3.7 Workspace Restructuring

**Reduce from 12 to 8 crates:**

```
quicproquo-core        -- crypto primitives (keep)
quicproquo-proto       -- schema codegen (keep)
quicproquo-plugin-api  -- #![no_std] C-ABI (keep)
quicproquo-kt          -- key transparency (keep)
quicproquo-sdk         -- NEW: business logic library
quicproquo-server      -- server binary (keep)
quicproquo-client      -- CLI/TUI binary, depends on sdk (keep, slimmed)
quicproquo-p2p         -- mesh networking (keep, feature-flagged)
```

**Merge/remove:**
- `bot` -> `sdk::bot` module
- `ffi` -> `sdk` with `--features c-ffi`
- `gen` -> `scripts/` or `xtask`
- `gui` -> `apps/gui/` outside workspace (Tauri project)
- `mobile` -> `examples/` (research spike)

**Add `default-members` to the `[workspace]` table** so `cargo build` doesn't attempt GUI.
**Add `justfile`** with `build`, `test`, `test-e2e`, `build-wasm`, `docker`.

### 3.8 Plugin System Evolution

| Change | Rationale |
|--------|-----------|
| Add `version: u32` to `HookVTable` | ABI stability — check version on load |
| Config passthrough | `qpq_plugin_init(vtable, config_json)` |
| Async hooks | Plugins that call external services shouldn't block Tokio |
| Evaluate WASM plugins | Sandboxed community plugins (keep C-ABI for first-party) |
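
A version check on load might look like the sketch below. Only the `version: u32` field and the `HookVTable` name come from the table above; the single `on_message` hook, the constants, and `validate_vtable` are assumptions made for illustration and do not reflect the real vtable layout.

```rust
/// Hypothetical ABI version the server was built against.
pub const PLUGIN_ABI_VERSION: u32 = 2;
/// Hypothetical return codes for the continue/reject protocol.
pub const HOOK_CONTINUE: i32 = 0;
pub const HOOK_REJECT: i32 = 1;

pub type OnMessageHook = extern "C" fn(payload: *const u8, len: usize) -> i32;

/// Sketch of a versioned vtable. `version` comes first so a loader can read
/// it before trusting the layout of the remaining fields.
#[repr(C)]
pub struct HookVTable {
    pub version: u32,
    pub on_message: Option<OnMessageHook>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum PluginLoadError {
    VersionMismatch { expected: u32, found: u32 },
    MissingHook,
}

/// Rejects plugins built against a different ABI version before any hook runs.
pub fn validate_vtable(vtable: &HookVTable) -> Result<OnMessageHook, PluginLoadError> {
    if vtable.version != PLUGIN_ABI_VERSION {
        return Err(PluginLoadError::VersionMismatch {
            expected: PLUGIN_ABI_VERSION,
            found: vtable.version,
        });
    }
    vtable.on_message.ok_or(PluginLoadError::MissingHook)
}

/// Example hook that lets every message through.
pub extern "C" fn allow_all(_payload: *const u8, _len: usize) -> i32 {
    HOOK_CONTINUE
}
```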

### 3.9 Federation Improvements

| Change | Rationale |
|--------|-----------|
| DNS SRV / .well-known discovery | Static peer config doesn't scale |
| Persistent relay queue with retry | Messages to offline peers are currently lost |
| Deterministic channel ID derivation | Avoid cross-server channel conflicts |
| Keep mDNS as optional mesh feature | Not for internet-scale, but good for LAN |

### 3.10 Test & CI Improvements

| Change | Rationale |
|--------|-----------|
| Per-client auth context | Removes `--test-threads 1` constraint |
| Mock server for client unit tests | Fast tests without spawning real server |
| Fuzz testing (cargo-fuzz) | Hybrid KEM, sealed sender, padding, Cap'n Proto deser |
| WS bridge unit tests | 645 lines, zero tests, security-critical |
| WASM + Go SDK in CI | Currently untested in CI |
| Separate E2E from unit test CI job | Different speed, different failure modes |
| macOS CI | FFI/mobile cross-compilation validation |
| Release automation | Binary artifacts, Docker tags, WASM npm publish |

---

## Part 4 — Ecosystem Positioning

### Don't compete with Signal or Matrix directly.

**Target: Privacy-first messaging infrastructure for developers and
organizations.**

quicproquo's differentiators — QUIC-native transport, post-quantum crypto, MLS,
plugin system, multi-language SDKs, embeddable architecture — point toward an
infrastructure play, not a consumer app.

Think: *"the Postgres of E2E encrypted messaging"* — a high-quality open-source
server and protocol that other projects build on.

| Segment | Value Proposition |
|---------|-------------------|
| **Developer tool** | API-first messenger for encrypted bots and integrations |
| **Embeddable** | C FFI + WASM + Go SDK for embedding in other apps |
| **Enterprise** | On-prem, plugins for compliance/audit, OPAQUE zero-knowledge auth |
| **Research** | Post-quantum crypto, MLS reference implementation, mesh networking |

---

## Part 5 — Priority Ordering

### Phase 1: Foundation (unblocks everything else)
1. Replace capnp-rpc with Send-compatible framework
2. Extract SDK crate from client
3. Per-client auth context (no global state)

### Phase 2: Reliability
4. Push-based delivery (QUIC uni-stream)
5. Multi-stream connections
6. Persist sessions + MLS group state
7. Encrypt DiskKeyStore at rest
8. peek+ack as default delivery

### Phase 3: Polish
9. Workspace restructuring (12 -> 8 crates)
10. TUI as primary interactive mode (built on SDK)
11. Plugin system v2 (versioning, config, async)
12. Federation retry queue + discovery

### Phase 4: Ecosystem
13. Full MLS in WASM (browser E2E)
14. WebTransport (eliminate WS bridge)
15. Tauri GUI (built on SDK)
16. Release automation + expanded CI

---

## Appendix — Analysis Sources

This document was produced by four parallel analysis agents:

| Agent | Scope | Files Read |
|-------|-------|-----------|
| server-analyst | Transport, RPC, delivery, storage, federation | 27 server .rs files, 4 schemas, core transport |
| client-analyst | REPL, UX, state, multi-platform, SDK design | All client .rs, GUI, mobile, TS demo |
| security-analyst | MLS, OPAQUE, hybrid KEM, keystore, identity | All core .rs, review doc |
| dx-analyst | Workspace, build, tests, plugins, CI, ecosystem | All Cargo.toml, tests, CI, plugins, SDKs |
328
docs/V2-MASTER-PLAN.md
Normal file
@@ -0,0 +1,328 @@
# quicproquo v2 — Master Implementation Plan

> Created 2026-03-04. This is the authoritative plan for the v2 rewrite.
> See also: `docs/V2-DESIGN-ANALYSIS.md` for the detailed retrospective.

## Context

The v1 codebase has strong crypto foundations (MLS, hybrid PQ KEM, OPAQUE) but three
architectural bottlenecks: capnp-rpc is `!Send` (single-threaded), client business logic
is trapped in a monolithic REPL with global state, and delivery is poll-based.

This plan creates v2 on a new branch, keeping the crypto stack intact and replacing
the RPC/transport layer, extracting an SDK, and restructuring the workspace.

**Key decisions:**
- Transport: Protobuf (prost) + custom framing over QUIC (quinn)
- Mobile: Tauri 2 (same Rust SDK backend, web UI)
- Branch strategy: `v2` branch from main, not a fresh repo
- Constraints: Rust, QUIC, GPG-signed commits, zeroize secrets, no stubs

---

## Architecture Overview

```
┌─────────────────────────────────────────────────────┐
│                      Frontends                      │
│  CLI/TUI  │  Tauri GUI/Mobile  │ Web (WebTransport) │
└─────┬─────┴────────┬───────────┴──────────┬─────────┘
      │              │                      │
      ▼              ▼                      ▼
┌─────────────────────────────────────────────────────┐
│                   quicproquo-sdk                    │
│ QpqClient { connect, login, send, recv, subscribe } │
│ Event system (tokio broadcast)                      │
│ Crypto pipeline (MLS, sealed sender, hybrid)        │
│ Conversation store (SQLCipher)                      │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│                   quicproquo-rpc                    │
│ QUIC framing: [method:u16][req_id:u32][len:u32][pb] │
│ Multi-stream (1 RPC per stream)                     │
│ Server-push via uni-streams                         │
│ tower middleware (auth, rate-limit)                 │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│                  quicproquo-server                  │
│ Domain services (auth, delivery, channel, blob)     │
│ Store trait → SqlStore (connection pool)            │
│ Plugin hooks, federation, KT                        │
└─────────────────────────────────────────────────────┘
```

### Wire Format

Per QUIC bidirectional stream (request/response):
```
Request:  [method_id: u16][request_id: u32][payload_len: u32][protobuf bytes]
Response: [status: u8][request_id: u32][payload_len: u32][protobuf bytes]
```

Per QUIC unidirectional stream (server → client push):
```
Push:     [event_type: u16][payload_len: u32][protobuf bytes]
```
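
The request layout above can be encoded and decoded in a few lines. This is a sketch under assumptions the plan does not state: integers are taken to be big-endian (network order), and `MAX_PAYLOAD_LEN`, `RequestFrame`, `FrameError`, and the function names are hypothetical. The payload is treated as opaque bytes, so no protobuf library is involved here.

```rust
/// Fixed header size of a request: 2 + 4 + 4 bytes.
pub const REQUEST_HEADER_LEN: usize = 10;
/// Hypothetical cap so a peer cannot force an arbitrarily large allocation.
pub const MAX_PAYLOAD_LEN: usize = 1 << 20;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RequestFrame {
    pub method_id: u16,
    pub request_id: u32,
    /// Protobuf-encoded body, opaque at the framing layer.
    pub payload: Vec<u8>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum FrameError {
    /// Fewer bytes than the header or the declared payload length.
    Truncated,
    /// Declared or actual payload exceeds `MAX_PAYLOAD_LEN`.
    TooLarge,
}

/// Serializes a request as [method_id][request_id][payload_len][payload].
pub fn encode_request(frame: &RequestFrame) -> Result<Vec<u8>, FrameError> {
    if frame.payload.len() > MAX_PAYLOAD_LEN {
        return Err(FrameError::TooLarge);
    }
    let mut out = Vec::with_capacity(REQUEST_HEADER_LEN + frame.payload.len());
    out.extend_from_slice(&frame.method_id.to_be_bytes());
    out.extend_from_slice(&frame.request_id.to_be_bytes());
    out.extend_from_slice(&(frame.payload.len() as u32).to_be_bytes());
    out.extend_from_slice(&frame.payload);
    Ok(out)
}

/// Parses one request from `buf`, validating lengths before copying.
pub fn decode_request(buf: &[u8]) -> Result<RequestFrame, FrameError> {
    if buf.len() < REQUEST_HEADER_LEN {
        return Err(FrameError::Truncated);
    }
    let method_id = u16::from_be_bytes([buf[0], buf[1]]);
    let request_id = u32::from_be_bytes([buf[2], buf[3], buf[4], buf[5]]);
    let len = u32::from_be_bytes([buf[6], buf[7], buf[8], buf[9]]) as usize;
    if len > MAX_PAYLOAD_LEN {
        return Err(FrameError::TooLarge);
    }
    let body = &buf[REQUEST_HEADER_LEN..];
    if body.len() < len {
        return Err(FrameError::Truncated);
    }
    Ok(RequestFrame { method_id, request_id, payload: body[..len].to_vec() })
}
```

Checking the declared length against a cap before allocating is the part worth fuzzing, since the length field is attacker-controlled.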

Each RPC opens its own QUIC bidi stream, which gives natural multi-stream behavior with no head-of-line blocking between RPCs.

---

## Workspace Structure (v2: 9 crates)

```
quicproquo/
├── crates/
│   ├── quicproquo-core/         # KEEP AS-IS — crypto primitives, MLS, hybrid KEM
│   ├── quicproquo-kt/           # KEEP AS-IS — key transparency
│   ├── quicproquo-plugin-api/   # KEEP AS-IS — #![no_std] C-ABI
│   ├── quicproquo-proto/        # REWRITE — protobuf schemas + prost codegen
│   ├── quicproquo-rpc/          # NEW — QUIC RPC framework (framing, dispatch, tower)
│   ├── quicproquo-sdk/          # NEW — client business logic library
│   ├── quicproquo-server/       # REWRITE — domain services + RPC handlers
│   ├── quicproquo-client/       # REWRITE — thin CLI/TUI shell over SDK
│   └── quicproquo-p2p/          # KEEP — iroh mesh (feature-flagged, later)
├── apps/
│   └── gui/                     # Tauri 2 desktop + mobile app (outside workspace)
├── proto/                       # .proto source files
│   └── qpq/v1/
│       ├── auth.proto           # OPAQUE registration + login (4 methods)
│       ├── delivery.proto       # enqueue, fetch, fetchWait, peek, ack, batch (6 methods)
│       ├── keys.proto           # key package + hybrid key CRUD (5 methods)
│       ├── channel.proto        # channel create (1 method)
│       ├── user.proto           # resolve user/identity (2 methods)
│       ├── blob.proto           # upload/download (2 methods)
│       ├── device.proto         # register/list/revoke (3 methods)
│       ├── p2p.proto            # endpoint publish/resolve + health (3 methods)
│       ├── federation.proto     # relay + proxy (6 methods)
│       ├── push.proto           # server-push events (NEW)
│       └── common.proto         # shared types (Auth, Envelope, Error)
├── sdks/
│   ├── go/                      # Go SDK (regenerate from .proto)
│   └── typescript/              # TS SDK (WebTransport client)
├── justfile                     # NEW — build commands
└── Cargo.toml                   # workspace root
```

**Removed from workspace:**
- `quicproquo-bot` → `sdk::bot` module
- `quicproquo-ffi` → `sdk` with `--features c-ffi`
- `quicproquo-gen` → `scripts/`
- `quicproquo-gui` → `apps/gui/` (Tauri project, outside workspace)
- `quicproquo-mobile` → merged into `apps/gui/` (Tauri 2 mobile)

---

## Crate Reuse Assessment

| v1 Crate | capnp deps? | v2 Action | Effort |
|----------|:-----------:|-----------|--------|
| **quicproquo-core** | None | Copy as-is | Zero |
| **quicproquo-kt** | None | Copy as-is | Zero |
| **quicproquo-plugin-api** | None | Copy as-is | Zero |
| **quicproquo-p2p** | None | Copy as-is | Zero |
| **quicproquo-proto** | 100% capnp | Replace with prost codegen | Medium |
| **quicproquo-server** | 16/20 files | Extract domain logic, rewrite handlers | High |
| **quicproquo-client** | 6/10 files | Extract to SDK, thin CLI shell | High |

### Key Files to Reuse Directly

| Source (v1) | Destination (v2) | Notes |
|-------------|------------------|-------|
| `crates/quicproquo-core/` (entire) | same path | Zero changes |
| `crates/quicproquo-kt/` (entire) | same path | Zero changes |
| `crates/quicproquo-plugin-api/` (entire) | same path | Zero changes |
| `server/src/storage.rs` | `server/src/storage.rs` | Store trait — keep |
| `server/src/sql_store.rs` | `server/src/sql_store.rs` | Add connection pool |
| `server/src/hooks.rs` | `server/src/hooks.rs` | Plugin system — keep |
| `server/src/plugin_loader.rs` | `server/src/plugin_loader.rs` | Keep |
| `server/src/error_codes.rs` | `server/src/error_codes.rs` | Keep |
| `server/src/config.rs` | `server/src/config.rs` | Update for new transport |
| `client/src/conversation.rs` | `sdk/src/conversation.rs` | Move to SDK |
| `client/src/token_cache.rs` | `sdk/src/token_cache.rs` | Move to SDK |
| `client/src/display.rs` | `client/src/display.rs` | Keep in CLI |
| `schemas/*.capnp` | reference only | Translate to .proto |

---

## Phased Implementation

### Phase 1: Foundation
**Goal:** v2 branch with new workspace, proto schemas, RPC framework skeleton, SDK skeleton.
**Scope:** Compiles, no runtime functionality yet.

1. **Create v2 branch** from main
2. **Restructure workspace** — update root Cargo.toml, create new crate dirs, add justfile
3. **Write .proto files** — translate all 32 RPC methods + push events from Cap'n Proto
4. **Create quicproquo-proto crate** — prost-build codegen
5. **Create quicproquo-rpc crate** — QUIC RPC framework:
   - `framing.rs` — wire format encode/decode (request, response, push)
   - `server.rs` — accept QUIC connections, dispatch to handlers
   - `client.rs` — connect, send requests, receive responses + push events
   - `middleware.rs` — tower-based auth + rate-limit layers
   - `method.rs` — method registry (method_id → async handler fn)
6. **Create quicproquo-sdk crate** — public API skeleton:
   - `client.rs` — `QpqClient` struct
   - `events.rs` — `ClientEvent` enum
   - `conversation.rs` — `ConversationHandle`, `ConversationStore`
   - `config.rs` — `ClientConfig`
7. **Extract server domain types** — `server/src/domain/` module:
   - `types.rs` — plain Rust request/response types
   - `auth.rs` — OPAQUE logic extracted from auth_ops.rs
   - `delivery.rs` — enqueue/fetch logic extracted from delivery.rs
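
The method registry in step 5 maps a `method_id` to a handler. The sketch below shows the shape with synchronous closures so it stays dependency-free; the plan calls for async handlers, and `MethodRegistry`, `Handler`, and the status constants are hypothetical names rather than the planned `method.rs` API.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical status bytes for the response frame.
pub const STATUS_OK: u8 = 0;
pub const STATUS_UNKNOWN_METHOD: u8 = 1;

/// Handler: request payload in, response payload or error status out.
/// (Synchronous here for brevity; the plan describes async handlers.)
pub type Handler = Box<dyn Fn(&[u8]) -> Result<Vec<u8>, u8> + Send + Sync>;

pub struct MethodRegistry {
    handlers: HashMap<u16, Handler>,
}

impl MethodRegistry {
    pub fn new() -> Self {
        Self { handlers: HashMap::new() }
    }

    /// Registers a handler; returns `false` if the id is already taken,
    /// so two methods cannot silently share one id.
    pub fn register<F>(&mut self, method_id: u16, handler: F) -> bool
    where
        F: Fn(&[u8]) -> Result<Vec<u8>, u8> + Send + Sync + 'static,
    {
        match self.handlers.entry(method_id) {
            Entry::Occupied(_) => false,
            Entry::Vacant(slot) => {
                slot.insert(Box::new(handler));
                true
            }
        }
    }

    /// Looks up the handler and returns (status, response payload).
    pub fn dispatch(&self, method_id: u16, payload: &[u8]) -> (u8, Vec<u8>) {
        match self.handlers.get(&method_id) {
            None => (STATUS_UNKNOWN_METHOD, Vec::new()),
            Some(handler) => match handler(payload) {
                Ok(body) => (STATUS_OK, body),
                Err(status) => (status, Vec::new()),
            },
        }
    }
}
```

The `Send + Sync` bound on `Handler` is the point of the exercise: unlike the `Rc`-based capnp-rpc runtime, such a registry can be shared across worker threads.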

**Verification:**
- `cargo build --workspace` succeeds
- `cargo test -p quicproquo-core` passes (72 tests)
- Proto codegen works
- RPC framework compiles

---

### Phase 2: Server Core
**Goal:** Working server with all 32 RPC handlers over QUIC.

1. **RPC dispatch** — method registry, connection lifecycle
2. **Domain handlers** — all 32 methods as `async fn(Request) -> Result<Response>`
   - Auth (4): OPAQUE register start/finish, login start/finish
   - Delivery (6): enqueue, fetch, fetchWait, peek, ack, batchEnqueue
   - Keys (5): upload/fetch key package, upload/fetch/batch-fetch hybrid key
   - Channels (1): createChannel
   - Users (2): resolveUser, resolveIdentity
   - Blobs (2): uploadBlob, downloadBlob
   - Devices (3): registerDevice, listDevices, revokeDevice
   - P2P (3): health, publishEndpoint, resolveEndpoint
   - Federation (6): relay enqueue/batch, proxy fetch/resolve, health
3. **Server-push** — notification stream via QUIC uni-stream
4. **Storage upgrades:**
   - Drop `FileBackedStore`
   - Connection pool (deadpool-sqlite)
   - Persist sessions to SQLite
   - Atomic queue depth check + enqueue
5. **Tower middleware** — auth validation, rate limiting, audit logging
6. **Multi-stream** — concurrent RPCs per connection (remove 1-stream limit)

**Verification:**
- Server starts, accepts QUIC connections
- Health check RPC works
- OPAQUE registration + login works
- Message enqueue + fetch round-trip

---

### Phase 3: SDK
**Goal:** Complete client SDK library — the heart of v2.

1. **QpqClient** — connect, OPAQUE auth, session management (no global state)
2. **Crypto pipeline** — MLS processing, sealed sender unwrap, hybrid decrypt
   (extracted from repl.rs `poll_messages()`)
3. **Conversation management** — create DM, create group, invite, remove, send, receive
4. **Event system** — a `tokio::sync::broadcast` channel of `ClientEvent` replacing the poll loop
   - `MessageReceived`, `TypingIndicator`, `ConversationCreated`
   - `MemberJoined`, `MemberLeft`, `ConnectionLost`, `Reconnected`
5. **Offline support** — outbox queue, retry with backoff, sync on reconnect
6. **ConversationStore** — SQLCipher local DB (migrate from client/conversation.rs)
7. **Key management** — encrypted DiskKeyStore, MLS group state persistence
8. **Token/secret zeroization** — `AuthContext.token` etc. wrapped in `Zeroizing`
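
For the offline-support item, retry with backoff reduces to a small delay function. The version below is a minimal sketch of capped exponential backoff; `backoff_delay` and its parameters are hypothetical, and it omits jitter, which a production outbox would normally add so that reconnecting clients do not retry in lockstep.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`,
/// capped at `max`. Overflow at any step also yields `max`.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // `checked_shl` returns None once the shift no longer fits in a u32.
    let factor = match 1u32.checked_shl(attempt) {
        Some(f) => f,
        None => return max,
    };
    match base.checked_mul(factor) {
        Some(delay) => delay.min(max),
        None => max,
    }
}
```

With a 500 ms base and a 30 s cap, the delays run 0.5 s, 1 s, 2 s, 4 s and so on until they reach the cap.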

**Verification:**
- SDK integration test: connect → login → create DM → send → receive
- No global state (`AUTH_CONTEXT` eliminated)
- Event subscription works
- Offline outbox drains on reconnect

---

### Phase 4: Client
**Goal:** CLI and TUI as thin shells over SDK.

1. **CLI binary** (`qpq`) — clap subcommands calling `QpqClient`
2. **REPL** — readline with tab-completion (rustyline), categorized `/help`
3. **TUI** — ratatui, subscribes to `QpqClient::subscribe()` events
4. **Simplified commands:**
   - Hide MLS/KeyPackage internals (auto-refresh)
   - Message references by short ID (not index)
   - Batch operations (`/create-group team alice bob`)
   - Categorized help (Chat, Groups, Security, System)
5. **Auto-server-launch** — keep zero-config DX from v1
6. **Playbook system** — keep YAML-based test scripting

**Verification:**
- `qpq --username alice --password pass` starts REPL (same UX as v1)
- TUI mode works with live event updates
- Tab-completion for commands and usernames
- E2E test: two clients exchange messages

---

### Phase 5: Desktop & Mobile
**Goal:** Tauri 2 app for all platforms.

1. **Tauri 2 project** in `apps/gui/`
2. **Rust backend** — Tauri commands wrapping `QpqClient`
3. **Web frontend** — Svelte or vanilla HTML/JS
4. **Desktop** — Linux, macOS, Windows
5. **Mobile** — iOS, Android via Tauri 2 mobile
6. **QUIC connection migration** — automatic wifi↔cellular handoff

**Verification:**
- Desktop app builds and runs on Linux
- Mobile app builds for Android (emulator)
- Send message from CLI → received in GUI

---

### Phase 6: Polish & Ecosystem
**Goal:** Production readiness.

1. **Federation improvements** — DNS SRV discovery, persistent relay queue with retry
2. **Plugin system v2** — version field, config passthrough, async hooks, WASM plugins
3. **WebTransport** — browser clients over HTTP/3 (same quinn endpoint)
4. **WASM MLS** — compile openmls to wasm32 for browser E2E encryption
5. **CI/CD** — release automation, WASM CI, multi-platform (Linux + macOS)
6. **Security hardening:**
   - Fuzz testing (hybrid KEM, sealed sender, padding, protobuf deser)
   - Remove all `InsecureServerCertVerifier` paths
   - Certificate pinning
   - Add passkey/WebAuthn as alternative auth
7. **Double Ratchet for 1:1 DMs** — better per-message forward secrecy than MLS for 2-party

---

## RPC Method Inventory (32 total)

| Category | Methods | Proto File |
|----------|---------|-----------|
| Auth (OPAQUE) | opaqueRegisterStart, opaqueRegisterFinish, opaqueLoginStart, opaqueLoginFinish | auth.proto |
| Delivery | enqueue, fetch, fetchWait, peek, ack, batchEnqueue | delivery.proto |
| Keys | uploadKeyPackage, fetchKeyPackage, uploadHybridKey, fetchHybridKey, fetchHybridKeys | keys.proto |
| Channel | createChannel | channel.proto |
| User | resolveUser, resolveIdentity | user.proto |
| Blob | uploadBlob, downloadBlob | blob.proto |
| Device | registerDevice, listDevices, revokeDevice | device.proto |
| P2P | health, publishEndpoint, resolveEndpoint | p2p.proto |
| Federation | relayEnqueue, relayBatchEnqueue, proxyFetchKeyPackage, proxyFetchHybridKey, proxyResolveUser, federationHealth | federation.proto |

**New in v2:**

| Push Events | Description | Proto File |
|-------------|-------------|-----------|
| MessageNotification | New message available | push.proto |
| TypingNotification | Peer is typing | push.proto |
| ChannelUpdate | Channel created/member changed | push.proto |
| SessionExpired | Auth session expired | push.proto |

---

## Engineering Standards (carried from v1)

- Conventional commits: `feat:`, `fix:`, `chore:`, `docs:`, `test:`, `refactor:`
- GPG-signed commits only
- No `Co-authored-by` trailers
- No `.unwrap()` on crypto or I/O in non-test paths
- Secrets: zeroize on drop, never in logs
- No stubs / `todo!()` / `unimplemented!()` in production code
- `clippy::unwrap_used = "deny"` at workspace level