# Reticulum-Inspired Mesh Upgrade Plan

> **Goal:** Transform quicprochat's P2P layer from a simple direct/relay hybrid into a
> self-organizing, multi-hop mesh capable of running over LoRa, Packet Radio, Serial,
> and other low-bandwidth transports — incorporating 8 years of Reticulum design
> learnings, but with Rust, MLS, and post-quantum crypto.
>
> Created: 2026-03-30 | Sprints: 6 | Area: `quicprochat-p2p` + `quicprochat-core`

---

## Architecture Vision

```
Before (current):
  Client A ──── iroh QUIC ────► Client B     (direct P2P)
    │                                  │
    └── QUIC/TLS ── Server ── QUIC/TLS ┘    (relay fallback)

After (target):
  Client A ── LoRa ── Node X ── WiFi ── Node Y ── Serial ── Client B
    │                                                         │
    └── iroh QUIC ── Server (optional) ── iroh QUIC ──────────┘
                             ▲
    any transport works: LoRa, Serial, TCP, UDP, WiFi, Packet Radio, QUIC
```

Key difference from Reticulum: we keep MLS group encryption, post-quantum hybrid KEM,
and formal Protobuf framing. Reticulum's transport-agnostic routing and announce
semantics are the inspiration, not the crypto.

---

## Sprint Overview

| Sprint | Name | Focus | Key Deliverable |
|--------|------|-------|-----------------|
| S1 | Binary Wire Format | Efficiency | CBOR `MeshEnvelope`, ~70% size reduction |
| S2 | Transport Abstraction | Architecture | `MeshTransport` trait, pluggable backends |
| S3 | Announce & Discovery | Self-Organization | Network-wide announce propagation + routing table |
| S4 | Multi-Hop Routing | Core Mesh | Autonomous packet forwarding across intermediate nodes |
| S5 | Truncated Addresses + Lightweight Handshake | LoRa-Ready | 16-byte addresses, minimal handshake for constrained links |
| S6 | LoRa Transport + Integration | Hardware | Working LoRa backend, end-to-end mesh demo |

---

## S1 — Binary Wire Format

**Problem:** `MeshEnvelope::to_bytes()` uses JSON serialization. A typical envelope is
~500-800 bytes in JSON. On LoRa at 300 bps, that's 13-21 seconds per message.

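The airtime figure follows directly from payload size and link rate. A minimal sketch of the arithmetic, assuming an idealized link with no preamble, header, or retransmission overhead (`airtime_secs` is a hypothetical helper, not part of the codebase):

```rust
/// Seconds needed to push `payload_bytes` over a link running at `bitrate_bps`.
/// Idealized assumption: ignores preamble, framing, and retransmissions.
fn airtime_secs(payload_bytes: usize, bitrate_bps: u64) -> f64 {
    (payload_bytes as f64 * 8.0) / bitrate_bps as f64
}
```

At 300 bps this gives roughly 13.3 s for 500 bytes and 21.3 s for 800 bytes, while a 150-byte envelope would need 4 s.
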
**Solution:** CBOR binary serialization via `ciborium` (already in workspace deps).

**Deliverables:**

1. **`envelope_binary.rs`** — new serialization functions:
   - `MeshEnvelope::to_cbor() -> Vec<u8>` — compact binary encoding
   - `MeshEnvelope::from_cbor(bytes: &[u8]) -> Result<Self>` — decoding
   - Keep `to_bytes()`/`from_bytes()` as JSON for debug/human-readable use
   - Add `to_wire() -> Vec<u8>` as the default wire format (CBOR)
   - Add `from_wire(bytes: &[u8]) -> Result<Self>` for receiving

2. **Compact field encoding:**
   - `sender_key`: 32 bytes raw (not hex-encoded)
   - `recipient_key`: 32 bytes raw (or 16 bytes truncated, prep for S5)
   - `signature`: 64 bytes raw
   - `id`: 32 bytes raw
   - `payload`: raw bytes (no base64)
   - `timestamp`: u64 (8 bytes)
   - `ttl_secs`: u32 (4 bytes)
   - `hop_count`: u8 (1 byte)
   - `max_hops`: u8 (1 byte)

3. **Size comparison test:**
   - Create identical envelopes, serialize both ways, assert CBOR < 50% of JSON
   - Expected: ~140-160 bytes CBOR vs ~500-800 bytes JSON for a typical message.
     The four raw byte fields above (`sender_key`, `recipient_key`, `signature`, `id`)
     already sum to 160 bytes, so treat this range as a target that depends on the
     16-byte truncated addresses from S5 and a short payload.

4. **Migration:** `P2pNode::send_mesh()` and `broadcast()` switch to `to_wire()`.
   `from_wire()` tries CBOR first, falls back to JSON for backward compat.

**Tests:** Roundtrip CBOR, size comparison, backward compat with JSON, fuzz test for
malformed CBOR input.

**Estimated changes:** ~150 lines new code, ~20 lines modified.

---

## S2 — Transport Abstraction

**Problem:** P2P layer is hardcoded to iroh QUIC. Cannot support LoRa, Serial, Packet
Radio, or other media.

**Solution:** Abstract transport behind a trait. Reticulum calls this "Interface" — we
call it `MeshTransport`.

**Deliverables:**

1. **`transport.rs`** — trait definition:

```rust
#[async_trait]
pub trait MeshTransport: Send + Sync {
    /// Human-readable transport name (e.g., "iroh-quic", "lora", "serial").
    fn name(&self) -> &str;

    /// Maximum transmission unit in bytes.
    fn mtu(&self) -> usize;

    /// Estimated bitrate in bits/second (for routing cost calculation).
    fn bitrate(&self) -> u64;

    /// Whether this transport supports bidirectional communication.
    fn is_bidirectional(&self) -> bool;

    /// Send raw bytes to a destination address.
    async fn send(&self, dest: &TransportAddr, data: &[u8]) -> Result<()>;

    /// Receive the next incoming packet. Waits until data arrives.
    async fn recv(&self) -> Result<(TransportAddr, Vec<u8>)>;

    /// List reachable peers on this transport (e.g., mDNS scan, LoRa beacon).
    async fn discover(&self) -> Result<Vec<TransportAddr>>;
}

/// Transport-agnostic address.
pub enum TransportAddr {
    /// iroh node ID + optional relay.
    Iroh(iroh::EndpointAddr),
    /// IP:port for TCP/UDP transports.
    Socket(std::net::SocketAddr),
    /// LoRa device address (4 bytes).
    LoRa([u8; 4]),
    /// Serial port path.
    Serial(String),
    /// Raw bytes for unknown transports.
    Raw(Vec<u8>),
}
```

2. **`transport_iroh.rs`** — refactor existing `P2pNode` send/recv into
   `IrohTransport` implementing `MeshTransport`.

3. **`transport_tcp.rs`** — simple TCP transport for testing and wired mesh nodes.
   Length-prefixed packets over a TCP stream.

4. **`P2pNode` refactor:** Accept a collection of transports (e.g.,
   `Vec<Box<dyn MeshTransport>>`) instead of a hardcoded `Endpoint`. The node listens
   on all transports simultaneously.

5. **`TransportManager`** — manages multiple transports, routes outbound packets
   to the best available transport for a given destination.

**Tests:** IrohTransport passes existing P2P tests, TcpTransport roundtrip,
multi-transport node startup.

**Estimated changes:** ~400 lines new code, ~100 lines refactored.

---

## S3 — Announce & Discovery Protocol

**Problem:** No mesh-wide discovery. mDNS only works on LAN. Nodes beyond one hop are
invisible.

**Solution:** Reticulum-style announce propagation. Nodes broadcast signed
announcements that propagate through the mesh, building a distributed routing table.

**Deliverables:**

1. **`announce.rs`** — Announce packet:

```rust
pub struct MeshAnnounce {
    /// Ed25519 public key of the announcing node.
    pub identity_key: [u8; 32],
    /// Truncated address (hash of identity_key, 16 bytes). Prep for S5.
    pub address: [u8; 16],
    /// Capabilities bitfield (supports_relay, supports_store, etc.).
    pub capabilities: u16,
    /// Sequence number (monotonically increasing per node).
    pub sequence: u64,
    /// Unix timestamp.
    pub timestamp: u64,
    /// Transports this node is reachable on (list of transport name + addr).
    pub reachable_via: Vec<(String, Vec<u8>)>,
    /// Ed25519 signature over all above fields.
    pub signature: [u8; 64],
}
```

2. **Announce propagation rules (Reticulum-inspired):**
   - On startup: broadcast own announce on all transports
   - On receiving an announce: verify signature, check sequence > last_seen,
     update routing table, re-broadcast on all *other* transports (not the one
     it arrived on) with hop_count incremented
   - Dedup by `(identity_key, sequence)` — don't re-broadcast already-seen announces
   - TTL: announces expire after configurable duration (default 30 minutes)
   - Periodic re-announce: every 10 minutes (configurable)

3. **`routing_table.rs`** — Distributed routing table:

```rust
pub struct RoutingTable {
    /// Known destinations: address -> routing entry.
    entries: HashMap<[u8; 16], RoutingEntry>,
}

pub struct RoutingEntry {
    /// Full public key of the destination.
    pub identity_key: [u8; 32],
    /// Next-hop transport + address to reach this destination.
    pub next_hop: (String, TransportAddr),
    /// Number of hops to destination (from announce hop_count).
    pub hops: u8,
    /// Estimated cost (hops * inverse_bitrate_weight).
    pub cost: f64,
    /// When this entry was last refreshed.
    pub last_seen: Instant,
    /// Capabilities of the destination.
    pub capabilities: u16,
}
```

4.
   **REPL commands:**
   - `/mesh announce` — force re-announce
   - `/mesh routes` — show full routing table (replaces current `/mesh route`)
   - `/mesh nodes` — list all known nodes with hop count and transport

**Tests:** Announce create/verify, propagation dedup, routing table CRUD, announce
expiry, 3-node propagation simulation.

**Estimated changes:** ~500 lines new code.

---

## S4 — Multi-Hop Routing

**Problem:** Messages can only be sent directly or via server relay. No intermediate
node forwarding.

**Solution:** Autonomous packet forwarding using the routing table from S3. Every node
can relay packets for other nodes.

**Deliverables:**

1. **`router.rs`** — replace `HybridRouter` with `MeshRouter`:

```rust
// Inner types marked "placeholder" are assumptions; substitute the crate's actual types.
pub struct MeshRouter {
    /// This node's identity.
    identity: MeshIdentity,
    /// Routing table (populated by announce protocol).
    routes: Arc<RwLock<RoutingTable>>, // lock type is a placeholder
    /// Available transports.
    transports: Arc<TransportManager>,
    /// Optional server relay (kept as last-resort fallback).
    server_relay: Option<Arc<ServerRelay>>, // `ServerRelay` is a placeholder name
    /// Store-and-forward for unreachable destinations.
    store: Arc<Mutex<MeshStore>>, // placeholder lock and store types
    /// Per-peer delivery stats.
    stats: Arc<RwLock<HashMap<MeshAddress, PeerStats>>>, // placeholder key/value types
}
```

2. **Routing algorithm:**

```
send(destination_addr, payload):
  1. Look up destination in routing table
  2. If direct transport available → send directly
  3. If next-hop known → wrap in MeshEnvelope, send to next-hop
     (next-hop node will repeat this process)
  4. If no route → store-and-forward (queue for later)
  5. If server relay available → use as last resort
```

3. **Forwarding logic (every node runs this):**

```
on_receive(envelope):
  1. Verify signature
  2. If addressed to us → deliver to application layer
  3. If addressed to someone else:
     a. Check hop_count < max_hops and not expired
     b. Look up destination in routing table
     c. Forward via next-hop transport
     d. If no route → store for later forwarding
```

4.
   **Path MTU Discovery:**
   - When routing across transports with different MTUs, fragment if needed
   - Fragment header: `[fragment_id: u32][seq: u8][total: u8][payload]`
   - Reassembly buffer with timeout

5. **Routing metrics:**
   - Track per-path latency, success rate, hop count
   - Prefer routes with lower cost (fewer hops, higher bitrate)
   - Exponential backoff on failed routes

6. **REPL commands:**
   - `/mesh send` — now works multi-hop
   - `/mesh trace` — show the route a message would take
   - `/mesh stats` — delivery statistics per destination

**Tests:** 3-node relay chain (A→B→C), route failover, fragmentation roundtrip,
store-and-forward when intermediate node offline, routing metric updates.

**Estimated changes:** ~600 lines new code, ~200 lines refactored from existing router.

---

## S5 — Truncated Addresses & Lightweight Handshake

**Problem:** Full 32-byte public keys in every envelope waste bandwidth on constrained
links. QUIC TLS handshake is too heavy for LoRa (2-4 KB).

**Solution:** Truncated hash-based addresses (Reticulum-style) and a minimal ECDH
handshake for low-bandwidth transports.

**Deliverables:**

1. **`address.rs`** — Mesh address type:

```rust
// Assumes the `sha2` crate is available as a dependency.
use sha2::{Digest, Sha256};

/// 16-byte truncated address derived from Ed25519 public key.
/// Matches Reticulum's approach but with different hash construction.
pub struct MeshAddress([u8; 16]);

impl MeshAddress {
    /// Derive from an Ed25519 public key.
    /// SHA-256(public_key)[0..16]
    pub fn from_public_key(key: &[u8; 32]) -> Self {
        let digest = Sha256::digest(key);
        let mut addr = [0u8; 16];
        addr.copy_from_slice(&digest[..16]);
        Self(addr)
    }

    /// Check if this address matches a given public key.
    pub fn matches(&self, key: &[u8; 32]) -> bool {
        self.0 == Self::from_public_key(key).0
    }
}
```

2. **Envelope v2 with truncated addresses:**
   - Replace `sender_key: Vec<u8>` (32 bytes) with `sender_addr: MeshAddress` (16 bytes)
   - Replace `recipient_key: Vec<u8>` (32 bytes) with `recipient_addr: MeshAddress` (16 bytes)
   - Full public keys are exchanged during announce (S3) and cached in routing table
   - Saves 32 bytes per envelope (significant on LoRa)

3.
   **Lightweight handshake for constrained transports:**

```
Link Setup (inspired by Reticulum, but with PQ option):

Packet 1 (Initiator → Responder): 80 bytes
  [initiator_addr: 16][ephemeral_x25519_pub: 32][nonce: 24][flags: 8]

Packet 2 (Responder → Initiator): 112 bytes
  [responder_addr: 16][ephemeral_x25519_pub: 32][encrypted_identity_proof: 48][nonce: 16]

Packet 3 (Initiator → Responder): 48 bytes
  [encrypted_identity_proof: 48]

Total: 240 bytes (vs 2000-4000 for QUIC TLS)
Shared secret: HKDF-SHA256(X25519(eph_a, eph_b) || X25519(id_a, eph_b))
```

4. **`link.rs`** — `MeshLink` session type:
   - Negotiated via lightweight handshake on constrained transports
   - ChaCha20-Poly1305 for subsequent messages (using derived shared secret)
   - Heartbeat to keep link alive (configurable, default every 5 min)
   - Link teardown notification
   - Automatic upgrade to QUIC if both sides support it

5. **Feature flag:** `--features constrained-transport` gates the lightweight
   handshake. QUIC remains the default for Internet/LAN.

**Tests:** Address derivation, collision resistance (generate 10K addresses, check no
collisions), handshake 3-packet roundtrip, link encryption roundtrip, envelope v2 with
truncated addresses.

**Estimated changes:** ~500 lines new code.

---

## S6 — LoRa Transport & Integration Demo

**Problem:** All the mesh infrastructure from S1-S5 needs a real constrained transport
to prove it works.

**Solution:** LoRa transport backend + end-to-end demo with Meshtastic-compatible or
standalone LoRa hardware.

**Deliverables:**

1. **`transport_lora.rs`** — LoRa transport implementation:

```rust
pub struct LoRaTransport {
    /// Serial connection to LoRa modem (e.g., SX1276/SX1262 via UART).
    serial: AsyncSerial,
    /// LoRa parameters.
    config: LoRaConfig,
}

pub struct LoRaConfig {
    /// Serial port path (e.g., /dev/ttyUSB0).
    pub port: String,
    /// Baud rate for serial connection to modem.
    pub baud_rate: u32,
    /// LoRa frequency in Hz (e.g., 868_100_000 for EU868).
    pub frequency: u64,
    /// Spreading factor (7-12).
    pub spreading_factor: u8,
    /// Bandwidth in Hz (125000, 250000, 500000).
    pub bandwidth: u32,
    /// Coding rate (5-8, meaning 4/5 to 4/8).
    pub coding_rate: u8,
    /// TX power in dBm.
    pub tx_power: i8,
}
```

2. **MTU-aware fragmentation:**
   - LoRa MTU is typically 222 bytes (SF7/BW125) to 51 bytes (SF12/BW125)
   - Automatic fragmentation/reassembly in `TransportManager`
   - Fragment numbering for out-of-order reassembly

3. **Duty cycle management:**
   - EU868: 1% duty cycle enforcement
   - TX budget tracking: don't exceed legal limits
   - Queue with priority (announces < data < emergency)

4. **End-to-end integration demo:**

```
Setup:
  Node A (Laptop + LoRa) ── LoRa ── Node B (RPi + LoRa) ── WiFi ── Node C (Laptop)

Demo script:
  1. All three nodes start, announce on their transports
  2. A discovers C through B's routing announcements
  3. A sends encrypted message to C: LoRa → B (relay) → WiFi → C
  4. C replies: WiFi → B (relay) → LoRa → A
  5. Show routing table, hop counts, delivery stats at each node
```

5. **`scripts/mesh-demo.sh`** — automated demo setup script.

6. **Termux integration:**
   - Update existing Termux build scripts for the mesh features
   - Android phone as a LoRa mesh node (via USB OTG to LoRa modem)

**Tests:** LoRa transport with mock serial (loopback), fragmentation across LoRa MTU,
duty cycle enforcement, 3-node integration test (simulated transports).

**Hardware needed:** 2-3x LoRa modules (SX1262 recommended), RPi or similar.

**Estimated changes:** ~600 lines new code, ~50 lines build/script changes.

---

## Dependency Graph

```
S1 (Binary Wire)     S2 (Transport Trait)
       │                      │
       └──────┬───────────────┘
              │
   S3 (Announce/Discovery)
              │
   S4 (Multi-Hop Routing)
              │
 S5 (Addresses + Handshake)
              │
      S6 (LoRa + Demo)
```

S1 and S2 can run in **parallel** (no dependency). S3+ are sequential.

---

## Comparison: quicprochat (after) vs Reticulum

| Dimension | Reticulum | quicprochat (post-upgrade) |
|-----------|-----------|---------------------------|
| Language | Python | Rust (no_std possible) |
| Crypto | X25519, AES-256-CBC, HMAC-SHA256 | Ed25519, X25519+ML-KEM-768, ChaCha20-Poly1305, MLS |
| Post-Quantum | No | Yes (ML-KEM-768 hybrid) |
| Group Encryption | None (link-level only) | MLS RFC 9420 (forward secrecy + PCS) |
| Wire Format | msgpack | CBOR (compact, IETF standard) |
| Spec | Reference implementation only | Protobuf schemas + potential IETF Draft |
| Transport Agnostic | Yes (mature, 8 years) | Yes (new, but Rust-native) |
| Multi-Hop Routing | Yes (announce + path discovery) | Yes (inspired by Reticulum) |
| Handshake Size | 297 bytes | ~240 bytes |
| Security Audit | None | Designed for auditability (fuzzing, formal model) |
| Embedded Targets | No (CPython required) | Yes (Rust cross-compile, no_std core) |
| LoRa Support | Yes (via RNode) | Yes (direct SX1262 + Meshtastic compat) |

---

## Risk Register

| Risk | Impact | Mitigation |
|------|--------|------------|
| LoRa hardware availability | Blocks S6 | S1-S5 work with simulated transports; LoRa is optional |
| iroh API breaking changes | Medium | Pin iroh version, abstract behind transport trait (S2) |
| Address collision (16-byte truncation) | Low (birthday: ~2^64) | Monitor, option to use full 32-byte if needed |
| Lightweight handshake security gaps | High | Get crypto review before deploying on real networks |
| Fragmentation complexity | Medium | Start with simple stop-and-wait, optimize later |

---

## Success Criteria

After S4 (minimum viable mesh):

- [ ] 3+ nodes form a self-organizing mesh over TCP transports
- [ ] Messages route automatically through intermediate nodes
- [ ] Node join/leave is handled gracefully (re-announce, route expiry)
- [ ] Wire format is <200 bytes for a typical chat message envelope

After S6 (full demo):

- [ ] Working LoRa ↔ WiFi ↔ QUIC heterogeneous mesh
- [ ] Message delivery across 3 hops with different transports
- [ ] Duty cycle compliance on EU868
- [ ] Android (Termux) node participates in the mesh
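
The EU868 duty-cycle criterion lends itself to a small budget tracker. The sketch below is one possible shape for it, not the planned implementation: it assumes the 1% limit is enforced as 36 seconds of airtime within any rolling one-hour window, and `DutyCycleBudget` and its methods are hypothetical names.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Sliding-window TX budget: at most `limit` of airtime per `window`.
/// Hypothetical type for illustration only.
struct DutyCycleBudget {
    window: Duration,
    limit: Duration,
    /// Recorded transmissions as (start time, airtime), oldest first.
    sent: VecDeque<(Instant, Duration)>,
}

impl DutyCycleBudget {
    /// Assumed EU868 policy: 1% of a one-hour window, i.e. 36 s of airtime.
    fn eu868_one_percent() -> Self {
        Self {
            window: Duration::from_secs(3600),
            limit: Duration::from_secs(36),
            sent: VecDeque::new(),
        }
    }

    /// Records the transmission and returns true if it fits the budget;
    /// returns false (recording nothing) if it would exceed the limit.
    fn try_reserve(&mut self, now: Instant, airtime: Duration) -> bool {
        // Drop transmissions that have aged out of the window.
        while let Some(&(start, _)) = self.sent.front() {
            if now.duration_since(start) >= self.window {
                self.sent.pop_front();
            } else {
                break;
            }
        }
        let used: Duration = self.sent.iter().map(|&(_, d)| d).sum();
        if used + airtime > self.limit {
            return false;
        }
        self.sent.push_back((now, airtime));
        true
    }
}
```

A transport would call `try_reserve` with the computed airtime before each transmission and leave the packet queued when it returns `false`, which is also the natural place to apply the announce/data/emergency priority ordering.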