# Future Research Directions

This page catalogues technologies and research directions that could strengthen quicprochat beyond the current [milestone plan](milestones.md). Each entry includes a brief description, the problem it solves, relevant crates or specifications, and how it maps to the project architecture.

For the production readiness work breakdown, see [Production Readiness WBS](production-readiness.md).

---

## Transport and Networking

### WebTransport (HTTP/3)

**Problem:** Browser clients cannot use raw QUIC. The current stack requires a native Rust binary.

**Solution:** [WebTransport](https://w3c.github.io/webtransport/) exposes QUIC-like semantics (multiplexed bidirectional streams, datagrams) to browsers over HTTP/3. A WebTransport endpoint alongside the existing QUIC listener would enable a web client without WebSocket degradation.

**Architecture impact:** Add a second listener (HTTP/3 + WebTransport) that terminates WebTransport and bridges into the existing `NodeService` RPC layer. Cap'n Proto serialisation works in WASM via `capnp` crate.

**Crates:** `h3`, `h3-webtransport`, `wtransport`

### Tor / I2P Integration

**Problem:** MLS protects message content, but connection metadata (who connects to the server, when, how often) leaks to the server and network observers.

**Solution:** Route client-server connections through [Tor](https://www.torproject.org/) onion services or [I2P](https://geti2p.net/) tunnels. This provides metadata resistance at the network layer.

**Architecture impact:** The server exposes a `.onion` address (Tor) or an I2P destination. Clients connect through the anonymity network. Latency increases significantly, so this should be optional.

**Crates:** `arti` (Tor client in Rust), `arti-client`

---

## Storage and Persistence

### CRDTs (Automerge / Yrs)

**Problem:** Multi-device support requires synchronising state (group membership, read receipts, settings) across devices without a central authority resolving conflicts.

**Solution:** Conflict-free replicated data types (CRDTs) allow concurrent edits to converge without coordination. [Automerge](https://automerge.org/) and [Yrs](https://docs.rs/yrs/) (Yjs in Rust) provide production-quality CRDT implementations.

**Architecture impact:** Client-side state (contact list, group membership cache, read markers) stored as CRDT documents. Synchronisation happens over the existing MLS-encrypted channel, ensuring the server never sees the state.

**Crates:** `automerge`, `yrs`

### Object Storage (S3-compatible)

**Problem:** Encrypted file and media attachments need a storage backend that the server can host without seeing the content.

**Solution:** An S3-compatible object store (MinIO, Garage, or a cloud provider) for encrypted blobs. Clients encrypt attachments client-side (using a key derived from the MLS group secret) and upload the ciphertext. The server stores and serves opaque blobs.

**Architecture impact:** Add a media upload/download RPC to `NodeService`. The server proxies to the object store or returns pre-signed URLs.

**Crates:** `aws-sdk-s3`, `opendal`

---

## Cryptography and Privacy

### ML-KEM + ML-DSA Hybrid (Post-Quantum MLS)

**Problem:** Quantum computers threaten X25519 and Ed25519. While MLS content is protected by ephemeral key exchange, the init keys and credential signatures are vulnerable to harvest-now-decrypt-later attacks.

**Solution:** Hybrid X25519 + ML-KEM-768 KEM for MLS init keys, and optionally hybrid Ed25519 + ML-DSA-65 for credential signatures. The `ml-kem` crate is already vendored in the workspace.

**Architecture impact:** Custom `OpenMlsCryptoProvider` in `quicprochat-core` implementing the hybrid combiner. This is the M7 milestone -- see [Milestones](milestones.md#m7----post-quantum-planned) and [Hybrid KEM](../protocol-layers/hybrid-kem.md).

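
As a rough illustration of what a hybrid combiner does, the sketch below shows only the concatenation step used by concatenation-style hybrid designs: the two fixed-length shared secrets are joined into one input for a KDF. The function name `hybrid_kdf_input` and the ordering (ML-KEM first) are assumptions made for this example, not the combiner defined in the project's Hybrid KEM design; a real implementation must pass the result through a proper KDF such as HKDF rather than use it directly as a key.

```rust
/// Length in bytes of an X25519 shared secret.
pub const X25519_SS_LEN: usize = 32;
/// Length in bytes of an ML-KEM-768 shared secret.
pub const MLKEM768_SS_LEN: usize = 32;

/// Hypothetical helper: build the KDF input for a concatenation-style
/// hybrid KEM. Both secrets have a fixed length, so the boundary between
/// them is unambiguous. The ordering here is an assumption of this sketch.
pub fn hybrid_kdf_input(
    mlkem_ss: &[u8; MLKEM768_SS_LEN],
    x25519_ss: &[u8; X25519_SS_LEN],
) -> [u8; MLKEM768_SS_LEN + X25519_SS_LEN] {
    let mut ikm = [0u8; MLKEM768_SS_LEN + X25519_SS_LEN];
    ikm[..MLKEM768_SS_LEN].copy_from_slice(mlkem_ss);
    ikm[MLKEM768_SS_LEN..].copy_from_slice(x25519_ss);
    // NOTE: this is input keying material, not a key. Feed it to a KDF.
    ikm
}
```

Because the derived key depends on both inputs, an attacker would need to recover both the classical and the post-quantum secret to reconstruct it, which is the property the hybrid construction is after.
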
**Crates:** `ml-kem`, `ml-dsa`

**References:** NIST FIPS 203 (ML-KEM), `draft-ietf-tls-hybrid-design`

### Private Information Retrieval (PIR)

**Problem:** When a client fetches messages or KeyPackages, the server learns *which* recipient is requesting -- even though it cannot read the content.

**Solution:** Private Information Retrieval (PIR) allows a client to fetch a record from the server without revealing which record was requested. [SealPIR](https://github.com/microsoft/SealPIR) and SimplePIR provide practical constructions.

**Architecture impact:** Replace the `fetch` / `fetchKeyPackage` RPCs with PIR queries. This is a significant performance trade-off: PIR has high computational cost. Suitable for KeyPackage fetch (small database) before message fetch (large database).

### Key Transparency (RFC draft)

**Problem:** A compromised server could substitute public keys, performing a man-in-the-middle attack on MLS group formation.

**Solution:** A verifiable, append-only log of public key bindings (similar to Certificate Transparency for TLS). Clients verify that the server's response matches the log before trusting a fetched KeyPackage.

**Architecture impact:** Add a key transparency log (Merkle tree) alongside the Authentication Service. Clients verify inclusion proofs on every `fetchKeyPackage` response.

**References:** `draft-ietf-keytrans-protocol`

---

## Identity and Authentication

### DIDs (Decentralized Identifiers)

**Problem:** User identities are currently bound to the server. If the server goes away, identities are lost.

**Solution:** [Decentralized Identifiers](https://www.w3.org/TR/did-core/) (`did:key`, `did:web`) provide self-sovereign identity. A user's DID is derived from their Ed25519 public key and is portable across servers.

**Architecture impact:** Replace raw Ed25519 public keys in MLS credentials with DID URIs. The server resolves DIDs to public keys for routing.

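
To make the derivation concrete, here is a minimal sketch of how a `did:key` identifier can be computed from a raw Ed25519 public key: prepend the Ed25519 multicodec prefix (`0xed 0x01`), base58btc-encode the result, and add the multibase `z` prefix. The function names are hypothetical and the base58 encoder is a simple, unoptimised version written for this example; a real client would more likely rely on one of the crates listed for this section.

```rust
/// Base58 alphabet used by Bitcoin and by multibase's `z` (base58btc) prefix.
const BASE58_ALPHABET: &[u8; 58] =
    b"123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

/// Multicodec prefix for an Ed25519 public key (0xed as an unsigned varint).
const ED25519_PUB_MULTICODEC: [u8; 2] = [0xed, 0x01];

/// Encode bytes as base58btc (simple quadratic-time implementation).
pub fn base58btc_encode(input: &[u8]) -> String {
    // Base-58 digits of the input, least significant digit first.
    let mut digits: Vec<u8> = Vec::new();
    for &byte in input {
        // Multiply the current number by 256 and add the new byte.
        let mut carry = byte as u32;
        for d in digits.iter_mut() {
            carry += (*d as u32) << 8;
            *d = (carry % 58) as u8;
            carry /= 58;
        }
        while carry > 0 {
            digits.push((carry % 58) as u8);
            carry /= 58;
        }
    }
    // Each leading zero byte is represented by the first alphabet character.
    let zeros = input.iter().take_while(|&&b| b == 0).count();
    let mut out = String::with_capacity(zeros + digits.len());
    for _ in 0..zeros {
        out.push('1');
    }
    for &d in digits.iter().rev() {
        out.push(BASE58_ALPHABET[d as usize] as char);
    }
    out
}

/// Hypothetical helper: derive a `did:key` identifier from a raw
/// 32-byte Ed25519 public key.
pub fn did_key_from_ed25519(public_key: &[u8; 32]) -> String {
    let mut bytes = Vec::with_capacity(ED25519_PUB_MULTICODEC.len() + public_key.len());
    bytes.extend_from_slice(&ED25519_PUB_MULTICODEC);
    bytes.extend_from_slice(public_key);
    format!("did:key:z{}", base58btc_encode(&bytes))
}
```

Because the identifier is a pure function of the public key, any server (or client) can recover the key from the DID without a lookup, which is what makes `did:key` portable across servers.
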
**Crates:** `did-key`, `ssi`

### WebAuthn / Passkeys

**Problem:** Password-based auth (even with OPAQUE) is vulnerable to phishing. Hardware-backed authentication provides stronger device binding.

**Solution:** [WebAuthn](https://www.w3.org/TR/webauthn-3/) / Passkeys allow authentication via hardware tokens (YubiKey), platform authenticators (Touch ID, Windows Hello), or synced passkeys.

**Architecture impact:** Add a WebAuthn registration/authentication flow to the account system. Requires a server-side WebAuthn relying party implementation.

**Crates:** `webauthn-rs`

### Verifiable Credentials (W3C VC)

**Problem:** Proving attributes (organization membership, role, age) without revealing full identity.

**Solution:** [Verifiable Credentials](https://www.w3.org/TR/vc-data-model/) allow a user to present cryptographic proofs of attributes issued by a trusted authority.

**Architecture impact:** Extend MLS credentials with VC presentation. A group admin could require proof of organization membership before allowing join.

---

## Application Layer

### Matrix-style Federation

**Problem:** A single server is a single point of failure and a single point of trust. Users on different servers cannot communicate.

**Solution:** Federation allows multiple quicprochat servers to exchange messages, similar to [Matrix](https://matrix.org/) homeserver federation. Each server manages its own users and relays messages to peer servers.

**Architecture impact:** Major. Requires server-to-server protocol, distributed identity resolution, and cross-server MLS group management.

### WASM Plugin System

**Problem:** Extensibility (bots, bridges, custom message types) currently requires forking the codebase.

**Solution:** A sandboxed WASM plugin system allows third-party extensions to run inside the client or server without access to private key material.

**Architecture impact:** Define a plugin API (message hooks, command handlers).
Plugins compiled to WASM and loaded at runtime via `wasmtime` or `wasmer`.

**Crates:** `wasmtime`, `wasmer`, `extism`

### Double-Ratchet DM Layer

**Problem:** MLS is optimised for groups. For efficient 1:1 conversations, the Signal double ratchet (X3DH + Axolotl) provides better performance characteristics (no tree overhead for two parties).

**Solution:** Implement a double-ratchet layer for 1:1 DMs, using MLS only for groups with N > 2. The [1:1 Channel Design](dm-channels.md) currently uses MLS for DMs; this would be an optimisation.

**References:** [The Double Ratchet Algorithm](https://signal.org/docs/specifications/doubleratchet/), [X3DH Key Agreement Protocol](https://signal.org/docs/specifications/x3dh/)

---

## Observability and Operations

### OpenTelemetry (Tracing + Metrics)

**Problem:** The current logging is `tracing`-based but lacks distributed tracing context and structured metrics export.

**Solution:** [OpenTelemetry](https://opentelemetry.io/) provides a unified framework for distributed tracing, metrics, and log correlation. OTLP export enables integration with any observability backend.

**Architecture impact:** Add `tracing-opentelemetry` and `opentelemetry-otlp` to the server. Instrument RPC handlers with spans. Export to Jaeger, Grafana Tempo, or any OTLP-compatible backend.

**Crates:** `opentelemetry`, `opentelemetry-otlp`, `tracing-opentelemetry`

### Prometheus + Grafana

**Problem:** No quantitative visibility into server performance (throughput, latency, queue depth, epoch advancement rate).

**Solution:** Export Prometheus metrics from the server. Visualise with Grafana dashboards.

**Metrics to export:** message throughput (enqueue/fetch per second), RPC latency histograms, MLS epoch advancement rate, delivery queue depth, KeyPackage store size, active connections.

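
To give a feel for what a scrape of a few of these metrics could look like, the following standard-library-only sketch keeps counters and gauges in atomics and renders them in the Prometheus text exposition format. The struct, field, and metric names (for example `quicprochat_messages_enqueued_total`) are hypothetical, and a real server would more likely use one of the crates listed for this section than hand-render the format.

```rust
use std::fmt::Write as _;
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical in-process metrics registry for the server.
#[derive(Default)]
pub struct ServerMetrics {
    pub messages_enqueued: AtomicU64,
    pub messages_fetched: AtomicU64,
    pub delivery_queue_depth: AtomicU64,
    pub active_connections: AtomicU64,
}

impl ServerMetrics {
    /// Render all metrics in the Prometheus text exposition format.
    pub fn render(&self) -> String {
        // (metric name, metric type, current value); names are illustrative.
        let rows = [
            ("quicprochat_messages_enqueued_total", "counter", &self.messages_enqueued),
            ("quicprochat_messages_fetched_total", "counter", &self.messages_fetched),
            ("quicprochat_delivery_queue_depth", "gauge", &self.delivery_queue_depth),
            ("quicprochat_active_connections", "gauge", &self.active_connections),
        ];
        let mut out = String::new();
        for (name, kind, value) in rows {
            // Writing to a String cannot fail, so the results are ignored.
            let _ = writeln!(out, "# TYPE {name} {kind}");
            let _ = writeln!(out, "{name} {}", value.load(Ordering::Relaxed));
        }
        out
    }
}
```

An RPC handler would bump a counter with something like `metrics.messages_enqueued.fetch_add(1, Ordering::Relaxed)`, and a metrics endpoint would return `metrics.render()`. Per-second throughput is then computed on the Prometheus side from the monotonically increasing counters rather than in the server.
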
**Crates:** `prometheus`, `metrics`, `metrics-exporter-prometheus`

### Testcontainers-rs

**Problem:** Integration tests currently run server and client in the same process (`tokio::spawn`). This does not test real network conditions, container startup, or multi-process interactions.

**Solution:** [Testcontainers-rs](https://docs.rs/testcontainers/) runs Docker containers from Rust tests, enabling true end-to-end CI with real network boundaries.

**Architecture impact:** Add testcontainers-based integration tests alongside the existing in-process tests. The Docker image is already maintained.

**Crates:** `testcontainers`, `testcontainers-modules`

---

## Developer Experience

### Tauri / Dioxus (Native GUI)

**Problem:** The current interface is CLI-only. A graphical client would broaden the user base for testing and demonstration.

**Solution:** [Tauri](https://tauri.app/) or [Dioxus](https://dioxuslabs.com/) provide native cross-platform GUI frameworks in Rust. The `quicprochat-core` crate can be shared directly with the GUI client.

**Architecture impact:** Add a `quicprochat-gui` crate that depends on `quicprochat-core` and `quicprochat-proto`. The GUI drives the same `GroupMember` and RPC logic as the CLI client.

**Crates:** `tauri`, `dioxus`

### uniffi / diplomat (Mobile FFI)

**Problem:** Mobile clients (iOS, Android) cannot use the Rust binary directly.

**Solution:** [uniffi](https://github.com/mozilla/uniffi-rs) (Mozilla) and [diplomat](https://github.com/rust-diplomat/diplomat) generate idiomatic Swift and Kotlin bindings from Rust definitions.

**Architecture impact:** Expose `quicprochat-core` through a C-compatible FFI layer. Mobile apps call into the Rust crypto and protocol logic.

**Crates:** `uniffi`, `diplomat`

### Nix Flakes

**Problem:** The development environment requires `capnp` (Cap'n Proto compiler), a specific Rust toolchain version, and test infrastructure. Setup varies across developer machines.

**Solution:** [Nix flakes](https://nixos.wiki/wiki/Flakes) provide a reproducible, declarative development environment. A single `nix develop` command sets up the toolchain, `capnp`, and all dependencies.

**Architecture impact:** Add `flake.nix` and `flake.lock` to the repository root.

---

## Top Priority Implementations

The following table ranks the most impactful technologies for near-term adoption, considering the current state of the codebase and the [milestone plan](milestones.md). Items marked **Implemented** are already part of the v2 codebase.

| Priority | Technology | Why | Status |
|----------|-----------|-----|--------|
| -- | **Post-quantum hybrid KEM** | `ml-kem` vendored; custom `OpenMlsCryptoProvider` with X25519 + ML-KEM-768. | **Implemented** |
| -- | **SQLCipher persistence** | Encrypted-at-rest storage via rusqlite + bundled-sqlcipher + Argon2id key derivation. | **Implemented** |
| -- | **OPAQUE auth** | Zero-knowledge password authentication via `opaque-ke`. Server never stores passwords. | **Implemented** |
| -- | **iroh P2P** | NAT traversal and optional P2P mesh via the `quicprochat-p2p` crate (feature-flagged). | **Implemented** |
| -- | **Sealed Sender** | `--sealed-sender` flag encrypts sender identity inside MLS ciphertext. | **Implemented** |
| 1 | **PIR (Private Information Retrieval)** | Fetch messages without revealing the recipient's identity to the server. | Future |
| 2 | **Key Transparency** | Verifiable, append-only log of public key bindings. Detects key substitution attacks. | Future |
| 3 | **WebTransport (HTTP/3)** | Enables browser clients without a WebSocket bridge. | Future |
| 4 | **OpenTelemetry** | Distributed tracing and structured metrics for production observability. | Future |
| 5 | **WebAuthn / Passkeys** | Hardware-backed authentication to replace password-based login. | Future |

---

## Cross-references

- [Milestones](milestones.md) -- current milestone tracker
- [Production Readiness WBS](production-readiness.md) -- phased work breakdown
- [Auth, Devices, and Tokens](authz-plan.md) -- OPAQUE integration point
- [1:1 Channel Design](dm-channels.md) -- double-ratchet optimisation context
- [Hybrid KEM](../protocol-layers/hybrid-kem.md) -- existing PQ design
- [References](../appendix/references.md) -- standards and crate documentation