Reticulum-Inspired Mesh Upgrade Plan
Goal: Transform quicprochat's P2P layer from a simple direct/relay hybrid into a self-organizing, multi-hop mesh capable of running over LoRa, Packet Radio, Serial, and other low-bandwidth transports — incorporating 8 years of Reticulum design learnings, but with Rust, MLS, and post-quantum crypto.
Created: 2026-03-30 | Sprints: 6 | Area: quicprochat-p2p + quicprochat-core
Architecture Vision
Before (current):

```
Client A ──── iroh QUIC ────► Client B          (direct P2P)
    │                                  │
    └── QUIC/TLS ── Server ── QUIC/TLS ┘        (relay fallback)
```

After (target):

```
Client A ── LoRa ── Node X ── WiFi ── Node Y ── Serial ── Client B
    │                                                         │
    └── iroh QUIC ── Server (optional) ── iroh QUIC ──────────┘
                          ▲
                 any transport works:
      LoRa, Serial, TCP, UDP, WiFi, Packet Radio, QUIC
```
Key difference from Reticulum: we keep MLS group encryption, post-quantum hybrid KEM, and formal Protobuf framing. Reticulum's transport-agnostic routing and announce semantics are the inspiration, not the crypto.
Sprint Overview
| Sprint | Name | Focus | Key Deliverable |
|---|---|---|---|
| S1 | Binary Wire Format | Efficiency | CBOR MeshEnvelope, ~70% size reduction |
| S2 | Transport Abstraction | Architecture | MeshTransport trait, pluggable backends |
| S3 | Announce & Discovery | Self-Organization | Network-wide announce propagation + routing table |
| S4 | Multi-Hop Routing | Core Mesh | Autonomous packet forwarding across intermediate nodes |
| S5 | Truncated Addresses + Lightweight Handshake | LoRa-Ready | 16-byte addresses, minimal handshake for constrained links |
| S6 | LoRa Transport + Integration | Hardware | Working LoRa backend, end-to-end mesh demo |
S1 — Binary Wire Format
Problem: MeshEnvelope::to_bytes() uses JSON serialization. A typical envelope
is ~500-800 bytes in JSON. On LoRa at 300 bps, that's 13-21 seconds per message.
Solution: CBOR binary serialization via ciborium (already in workspace deps).
Deliverables:
- `envelope_binary.rs` (new serialization functions):
  - `MeshEnvelope::to_cbor() -> Vec<u8>`: compact binary encoding
  - `MeshEnvelope::from_cbor(bytes: &[u8]) -> Result<Self>`: decoding
  - Keep `to_bytes()`/`from_bytes()` as JSON for debug/human-readable use
  - Add `to_wire() -> Vec<u8>` as the default wire format (CBOR)
  - Add `from_wire(bytes: &[u8]) -> Result<Self>` for receiving
- Compact field encoding:
  - `sender_key`: 32 bytes raw (not hex-encoded)
  - `recipient_key`: 32 bytes raw (or 16 bytes truncated, prep for S5)
  - `signature`: 64 bytes raw
  - `id`: 32 bytes raw
  - `payload`: raw bytes (no base64)
  - `timestamp`: u64 (8 bytes)
  - `ttl_secs`: u32 (4 bytes)
  - `hop_count`: u8 (1 byte)
  - `max_hops`: u8 (1 byte)
- Size comparison test:
  - Create identical envelopes, serialize both ways, assert CBOR < 50% of JSON
  - Expected: ~140-160 bytes CBOR vs ~500-800 bytes JSON for a typical message
- Migration: `P2pNode::send_mesh()` and `broadcast()` switch to `to_wire()`. `from_wire()` tries CBOR first, falls back to JSON for backward compat.
Tests: Roundtrip CBOR, size comparison, backward compat with JSON, fuzz test for malformed CBOR input.
Estimated changes: ~150 lines new code, ~20 lines modified.
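The size and airtime figures above can be checked with a few lines of arithmetic. The sketch below sums the raw field sizes from the compact encoding list and converts byte counts into transmission time at a given bitrate. The names `FIXED_FIELDS`, `fixed_overhead_bytes`, and `airtime_secs` are illustrative and not part of the codebase. CBOR framing bytes and the payload are deliberately left out, so the sum is a lower bound.

```rust
/// Fixed-size envelope fields from the S1 list, in bytes (payload excluded).
const FIXED_FIELDS: [(&str, usize); 8] = [
    ("sender_key", 32),
    ("recipient_key", 32),
    ("signature", 64),
    ("id", 32),
    ("timestamp", 8),
    ("ttl_secs", 4),
    ("hop_count", 1),
    ("max_hops", 1),
];

/// Sum of the raw field sizes, ignoring CBOR framing overhead and payload.
fn fixed_overhead_bytes() -> usize {
    FIXED_FIELDS.iter().map(|(_, size)| size).sum()
}

/// Seconds needed to push `bytes` through a link of `bits_per_second`,
/// ignoring preamble, radio headers, and duty-cycle pauses.
fn airtime_secs(bytes: usize, bits_per_second: u64) -> f64 {
    (bytes as f64 * 8.0) / bits_per_second as f64
}
```

With full 32-byte keys the fixed fields alone come to 174 bytes (158 with a 16-byte truncated recipient), so the 140-160 byte estimate and the 50% assertion may need adjusting once real envelopes are measured. At 300 bps, 174 bytes is about 4.6 seconds of airtime, against 13-21 seconds for the 500-800 byte JSON form.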
S2 — Transport Abstraction
Problem: P2P layer is hardcoded to iroh QUIC. Cannot support LoRa, Serial, Packet Radio, or other media.
Solution: Abstract transport behind a trait. Reticulum calls this "Interface" —
we call it MeshTransport.
Deliverables:
- `transport.rs`: trait definition:

  ```rust
  #[async_trait]
  pub trait MeshTransport: Send + Sync {
      /// Human-readable transport name (e.g., "iroh-quic", "lora", "serial").
      fn name(&self) -> &str;

      /// Maximum transmission unit in bytes.
      fn mtu(&self) -> usize;

      /// Estimated bitrate in bits/second (for routing cost calculation).
      fn bitrate(&self) -> u64;

      /// Whether this transport supports bidirectional communication.
      fn is_bidirectional(&self) -> bool;

      /// Send raw bytes to a destination address.
      async fn send(&self, dest: &TransportAddr, data: &[u8]) -> Result<()>;

      /// Receive the next incoming packet. Blocks until data arrives.
      async fn recv(&self) -> Result<(TransportAddr, Vec<u8>)>;

      /// List reachable peers on this transport (e.g., mDNS scan, LoRa beacon).
      async fn discover(&self) -> Result<Vec<TransportAddr>>;
  }

  /// Transport-agnostic address.
  pub enum TransportAddr {
      /// iroh node ID + optional relay.
      Iroh(iroh::EndpointAddr),
      /// IP:port for TCP/UDP transports.
      Socket(std::net::SocketAddr),
      /// LoRa device address (4 bytes).
      LoRa([u8; 4]),
      /// Serial port path.
      Serial(String),
      /// Raw bytes for unknown transports.
      Raw(Vec<u8>),
  }
  ```

- `transport_iroh.rs`: refactor existing `P2pNode` send/recv into `IrohTransport` implementing `MeshTransport`.
- `transport_tcp.rs`: simple TCP transport for testing and wired mesh nodes. Length-prefixed packets over a TCP stream.
- `P2pNode` refactor: accept `Vec<Box<dyn MeshTransport>>` instead of a hardcoded `Endpoint`. The node listens on all transports simultaneously.
- `TransportManager`: manages multiple transports, routes outbound packets to the best available transport for a given destination.
Tests: IrohTransport passes existing P2P tests, TcpTransport roundtrip, multi-transport node startup.
Estimated changes: ~400 lines new code, ~100 lines refactored.
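The TCP transport is described as length-prefixed packets over a stream. A minimal sketch of that framing follows, using only `std::io`. The 4-byte big-endian prefix, the `MAX_FRAME_LEN` cap, and the function names are assumptions made for illustration; the plan does not fix a prefix width or byte order.

```rust
use std::io::{self, Read, Write};

/// Upper bound on accepted frame size; hypothetical value chosen for this sketch.
const MAX_FRAME_LEN: usize = 64 * 1024;

/// Write one packet as a 4-byte big-endian length followed by the payload.
fn write_frame<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    writer.write_all(&(payload.len() as u32).to_be_bytes())?;
    writer.write_all(payload)
}

/// Read one packet written by `write_frame`, rejecting oversized lengths
/// before allocating the receive buffer.
fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    reader.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

Both functions are generic over `Read`/`Write`, so the same code can be exercised against an in-memory buffer in tests and a `TcpStream` in a blocking prototype. An async `TcpTransport` would need the equivalent calls from its runtime.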
S3 — Announce & Discovery Protocol
Problem: No mesh-wide discovery. mDNS only works on LAN. Nodes beyond one hop are invisible.
Solution: Reticulum-style announce propagation. Nodes broadcast signed announcements that propagate through the mesh, building a distributed routing table.
Deliverables:
- `announce.rs`: announce packet:

  ```rust
  pub struct MeshAnnounce {
      /// Ed25519 public key of the announcing node.
      pub identity_key: [u8; 32],
      /// Truncated address (hash of identity_key, 16 bytes). Prep for S5.
      pub address: [u8; 16],
      /// Capabilities bitfield (supports_relay, supports_store, etc.).
      pub capabilities: u16,
      /// Sequence number (monotonically increasing per node).
      pub sequence: u64,
      /// Unix timestamp.
      pub timestamp: u64,
      /// Transports this node is reachable on (list of transport name + addr).
      pub reachable_via: Vec<(String, Vec<u8>)>,
      /// Ed25519 signature over all above fields.
      pub signature: [u8; 64],
  }
  ```

- Announce propagation rules (Reticulum-inspired):
  - On startup: broadcast own announce on all transports
  - On receiving an announce: verify signature, check sequence > last_seen, update routing table, re-broadcast on all other transports (not the one it arrived on) with hop_count incremented
  - Dedup by `(identity_key, sequence)`: don't re-broadcast already-seen announces
  - TTL: announces expire after configurable duration (default 30 minutes)
  - Periodic re-announce: every 10 minutes (configurable)
- `routing_table.rs`: distributed routing table:

  ```rust
  pub struct RoutingTable {
      /// Known destinations: address -> routing entry.
      entries: HashMap<[u8; 16], RoutingEntry>,
  }

  pub struct RoutingEntry {
      /// Full public key of the destination.
      pub identity_key: [u8; 32],
      /// Next-hop transport + address to reach this destination.
      pub next_hop: (String, TransportAddr),
      /// Number of hops to destination (from announce hop_count).
      pub hops: u8,
      /// Estimated cost (hops * inverse_bitrate_weight).
      pub cost: f64,
      /// When this entry was last refreshed.
      pub last_seen: Instant,
      /// Capabilities of the destination.
      pub capabilities: u16,
  }
  ```

- REPL commands:
  - `/mesh announce`: force re-announce
  - `/mesh routes`: show full routing table (replaces current `/mesh route`)
  - `/mesh nodes`: list all known nodes with hop count and transport
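The receive-side propagation rules (verify, check sequence, dedup, re-broadcast on other transports) can be sketched with std-only Rust. `AnnounceFilter`, `AnnounceAction`, and `rebroadcast_targets` are hypothetical names. Signature verification is represented by a boolean because the sketch does no real Ed25519 work. Since sequence numbers increase monotonically per node, tracking the highest sequence seen for each identity key covers both the `sequence > last_seen` check and the `(identity_key, sequence)` dedup.

```rust
use std::collections::HashMap;

/// Outcome of processing a received announce.
#[derive(Debug, PartialEq)]
enum AnnounceAction {
    /// New information: update the routing table and re-broadcast.
    AcceptAndRebroadcast,
    /// Invalid, already seen, or older than what we hold: drop.
    Drop,
}

/// Tracks the highest sequence number seen per identity key.
#[derive(Default)]
struct AnnounceFilter {
    last_seen: HashMap<[u8; 32], u64>,
}

impl AnnounceFilter {
    /// `signature_ok` stands in for Ed25519 verification (not done here).
    fn process(&mut self, identity_key: [u8; 32], sequence: u64, signature_ok: bool) -> AnnounceAction {
        if !signature_ok {
            return AnnounceAction::Drop;
        }
        match self.last_seen.get(&identity_key) {
            Some(&seen) if sequence <= seen => AnnounceAction::Drop,
            _ => {
                self.last_seen.insert(identity_key, sequence);
                AnnounceAction::AcceptAndRebroadcast
            }
        }
    }
}

/// Transports to re-broadcast on: every transport except the arrival one.
fn rebroadcast_targets<'a>(all: &[&'a str], arrived_on: &str) -> Vec<&'a str> {
    all.iter().copied().filter(|name| *name != arrived_on).collect()
}
```

Expiry (the 30-minute TTL) is not modelled here; a fuller version would also store a timestamp per entry and evict stale keys.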
Tests: Announce create/verify, propagation dedup, routing table CRUD, announce expiry, 3-node propagation simulation.
Estimated changes: ~500 lines new code.
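The routing entry's cost is given as `hops * inverse_bitrate_weight`, but the weight itself is left undefined. The sketch below assumes the weight is a reference bitrate divided by the link bitrate, so slower links cost more. `REFERENCE_BPS`, `Candidate`, `cost`, and `best_route` are illustrative names, and the 1 Mbit/s reference is an arbitrary choice.

```rust
/// Hypothetical reference bitrate used to normalise link speed (1 Mbit/s).
const REFERENCE_BPS: f64 = 1_000_000.0;

/// One possible path to a destination, as learned from an announce.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    next_hop: String,
    hops: u8,
    bitrate_bps: u64,
}

/// cost = hops * inverse_bitrate_weight, with the weight assumed to be
/// REFERENCE_BPS / bitrate. A zero bitrate is clamped to 1 bps.
fn cost(c: &Candidate) -> f64 {
    f64::from(c.hops) * (REFERENCE_BPS / c.bitrate_bps.max(1) as f64)
}

/// Pick the lowest-cost candidate, if any.
fn best_route(candidates: &[Candidate]) -> Option<&Candidate> {
    candidates.iter().min_by(|a, b| {
        cost(a)
            .partial_cmp(&cost(b))
            .unwrap_or(std::cmp::Ordering::Equal)
    })
}
```

Under this weighting a three-hop path over 10 Mbit/s links scores far lower than a single 300 bps LoRa hop. Whether that preference is desirable (for example when the fast path depends on a server) is a design decision the plan leaves open.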
S4 — Multi-Hop Routing
Problem: Messages can only be sent directly or via server relay. No intermediate node forwarding.
Solution: Autonomous packet forwarding using the routing table from S3. Every node can relay packets for other nodes.
Deliverables:
- `router.rs`: replace `HybridRouter` with `MeshRouter`:

  ```rust
  pub struct MeshRouter {
      /// This node's identity.
      identity: MeshIdentity,
      /// Routing table (populated by announce protocol).
      routes: Arc<RwLock<RoutingTable>>,
      /// Available transports.
      transports: Arc<TransportManager>,
      /// Optional server relay (kept as last-resort fallback).
      server_relay: Option<Arc<dyn ServerRelay>>,
      /// Store-and-forward for unreachable destinations.
      store: Arc<Mutex<MeshStore>>,
      /// Per-peer delivery stats.
      stats: Arc<Mutex<HashMap<[u8; 16], ConnectionStats>>>,
  }
  ```

- Routing algorithm:

  ```
  send(destination_addr, payload):
    1. Look up destination in routing table
    2. If direct transport available → send directly
    3. If next-hop known → wrap in MeshEnvelope, send to next-hop
       (next-hop node will repeat this process)
    4. If no route → store-and-forward (queue for later)
    5. If server relay available → use as last resort
  ```

- Forwarding logic (every node runs this):

  ```
  on_receive(envelope):
    1. Verify signature
    2. If addressed to us → deliver to application layer
    3. If addressed to someone else:
       a. Check hop_count < max_hops and not expired
       b. Look up destination in routing table
       c. Forward via next-hop transport
       d. If no route → store for later forwarding
  ```

- Path MTU Discovery:
  - When routing across transports with different MTUs, fragment if needed
  - Fragment header: `[fragment_id: u32][seq: u8][total: u8][payload]`
  - Reassembly buffer with timeout
- Routing metrics:
  - Track per-path latency, success rate, hop count
  - Prefer routes with lower cost (fewer hops, higher bitrate)
  - Exponential backoff on failed routes
- REPL commands:
  - `/mesh send <address> <message>`: now works multi-hop
  - `/mesh trace <address>`: show the route a message would take
  - `/mesh stats`: delivery statistics per destination
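The forwarding steps every node runs can be expressed as a pure decision function, which makes them easy to unit-test before any transport exists. In this sketch the `Envelope` struct carries only the fields the steps inspect, the routing table is reduced to a map from address to a next-hop label, and `Forward` is a hypothetical result type. Dropping envelopes that fail the hop-limit or expiry check is an assumption; the plan does not say what happens to them.

```rust
use std::collections::HashMap;

type Addr = [u8; 16];

/// Minimal stand-in for the envelope fields the forwarding steps inspect.
struct Envelope {
    recipient: Addr,
    hop_count: u8,
    max_hops: u8,
    /// Result of a TTL check done elsewhere.
    expired: bool,
    /// Result of signature verification done elsewhere.
    signature_ok: bool,
}

#[derive(Debug, PartialEq)]
enum Forward {
    /// Addressed to this node: hand to the application layer.
    Deliver,
    /// Forward via the named next hop (hypothetical string label).
    Relay(String),
    /// No route known: keep for later forwarding.
    Store,
    /// Invalid, expired, or over the hop limit (assumed behaviour).
    Drop,
}

fn on_receive(env: &Envelope, me: &Addr, routes: &HashMap<Addr, String>) -> Forward {
    if !env.signature_ok {
        return Forward::Drop;
    }
    if &env.recipient == me {
        return Forward::Deliver;
    }
    if env.hop_count >= env.max_hops || env.expired {
        return Forward::Drop;
    }
    match routes.get(&env.recipient) {
        Some(next_hop) => Forward::Relay(next_hop.clone()),
        None => Forward::Store,
    }
}
```

Keeping the decision separate from I/O means the A→B→C relay test can assert on `Forward` values directly, with the actual send performed by whichever component owns the transports.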
Tests: 3-node relay chain (A→B→C), route failover, fragmentation roundtrip, store-and-forward when intermediate node offline, routing metric updates.
Estimated changes: ~600 lines new code, ~200 lines refactored from existing router.
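The fragment header `[fragment_id: u32][seq: u8][total: u8][payload]` adds 6 bytes per fragment. The sketch below splits a message to fit a given MTU and reassembles fragments that arrive in any order. Big-endian encoding of `fragment_id` and the function names are assumptions, and the reassembly timeout is omitted.

```rust
/// fragment_id (4 bytes) + seq (1 byte) + total (1 byte).
const HEADER_LEN: usize = 6;

/// Split `data` into fragments that each fit within `mtu` bytes, header included.
/// Returns None if the MTU leaves no room for payload, the data is empty,
/// or more than 255 fragments would be needed.
fn fragment(fragment_id: u32, data: &[u8], mtu: usize) -> Option<Vec<Vec<u8>>> {
    if mtu <= HEADER_LEN || data.is_empty() {
        return None;
    }
    let chunk_len = mtu - HEADER_LEN;
    let total = (data.len() + chunk_len - 1) / chunk_len;
    if total > u8::MAX as usize {
        return None;
    }
    let frames = data
        .chunks(chunk_len)
        .enumerate()
        .map(|(seq, chunk)| {
            let mut frame = Vec::with_capacity(HEADER_LEN + chunk.len());
            frame.extend_from_slice(&fragment_id.to_be_bytes());
            frame.push(seq as u8);
            frame.push(total as u8);
            frame.extend_from_slice(chunk);
            frame
        })
        .collect();
    Some(frames)
}

/// Reassemble the fragments of one message, accepting them in any order.
/// Returns None if a fragment is missing, duplicated, malformed, or
/// belongs to a different fragment_id.
fn reassemble(frames: &[Vec<u8>]) -> Option<Vec<u8>> {
    let first = frames.first()?;
    if first.len() < HEADER_LEN {
        return None;
    }
    let id = &first[0..4];
    let total = first[5] as usize;
    if total == 0 || frames.len() != total {
        return None;
    }
    let mut slots: Vec<Option<&[u8]>> = vec![None; total];
    for frame in frames {
        if frame.len() < HEADER_LEN || &frame[0..4] != id || frame[5] as usize != total {
            return None;
        }
        let seq = frame[4] as usize;
        if seq >= total || slots[seq].is_some() {
            return None;
        }
        slots[seq] = Some(&frame[HEADER_LEN..]);
    }
    let mut out = Vec::new();
    for slot in slots {
        out.extend_from_slice(slot?);
    }
    Some(out)
}
```

At a 51-byte MTU each fragment carries 45 payload bytes, so a 120-byte envelope needs three fragments. The one-byte `total` field caps a message at 255 fragments.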
S5 — Truncated Addresses & Lightweight Handshake
Problem: Full 32-byte public keys in every envelope waste bandwidth on constrained links. QUIC TLS handshake is too heavy for LoRa (2-4 KB).
Solution: Truncated hash-based addresses (Reticulum-style) and a minimal ECDH handshake for low-bandwidth transports.
Deliverables:
- `address.rs`: mesh address type:

  ```rust
  /// 16-byte truncated address derived from Ed25519 public key.
  /// Matches Reticulum's approach but with different hash construction.
  pub struct MeshAddress([u8; 16]);

  impl MeshAddress {
      /// Derive from an Ed25519 public key.
      /// SHA-256(public_key)[0..16]
      pub fn from_public_key(key: &[u8; 32]) -> Self;

      /// Check if this address matches a given public key.
      pub fn matches(&self, key: &[u8; 32]) -> bool;
  }
  ```

- Envelope v2 with truncated addresses:
  - Replace `sender_key: Vec<u8>` (32 bytes) with `sender_addr: MeshAddress` (16 bytes)
  - Replace `recipient_key: Vec<u8>` (32 bytes) with `recipient_addr: MeshAddress` (16 bytes)
  - Full public keys are exchanged during announce (S3) and cached in routing table
  - Saves 32 bytes per envelope (significant on LoRa)
- Lightweight handshake for constrained transports:

  ```
  Link Setup (inspired by Reticulum, but with PQ option):

  Packet 1 (Initiator → Responder): 80 bytes
    [initiator_addr: 16][ephemeral_x25519_pub: 32][nonce: 24][flags: 8]

  Packet 2 (Responder → Initiator): 112 bytes
    [responder_addr: 16][ephemeral_x25519_pub: 32][encrypted_identity_proof: 48][nonce: 16]

  Packet 3 (Initiator → Responder): 48 bytes
    [encrypted_identity_proof: 48]

  Total: 240 bytes (vs 2000-4000 for QUIC TLS)

  Shared secret: HKDF-SHA256(X25519(eph_a, eph_b) || X25519(id_a, eph_b))
  ```

- `link.rs`: `MeshLink` session type:
  - Negotiated via lightweight handshake on constrained transports
  - ChaCha20-Poly1305 for subsequent messages (using derived shared secret)
  - Heartbeat to keep link alive (configurable, default every 5 min)
  - Link teardown notification
  - Automatic upgrade to QUIC if both sides support it
- Feature flag: `--features constrained-transport` gates the lightweight handshake. QUIC remains the default for Internet/LAN.
Tests: Address derivation, collision resistance (generate 10K addresses, check no collisions), handshake 3-packet roundtrip, link encryption roundtrip, envelope v2 with truncated addresses.
Estimated changes: ~500 lines new code.
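The handshake packets are fixed-layout, so they can be encoded without a serialization framework. As an illustration, here is Packet 1 as a plain struct with byte-exact encode and decode. `LinkRequest` and its field names are hypothetical. The `flags` field is taken to be 8 bytes, which is the reading consistent with the stated 80-byte total. No cryptography is performed.

```rust
/// Packet 1 of the link setup:
/// [initiator_addr: 16][ephemeral_x25519_pub: 32][nonce: 24][flags: 8] = 80 bytes.
#[derive(Debug, Clone, PartialEq)]
struct LinkRequest {
    initiator_addr: [u8; 16],
    ephemeral_pub: [u8; 32],
    nonce: [u8; 24],
    flags: [u8; 8],
}

const LINK_REQUEST_LEN: usize = 16 + 32 + 24 + 8;

/// Sum of the three packet sizes given in the plan (80 + 112 + 48).
const HANDSHAKE_TOTAL: usize = LINK_REQUEST_LEN + 112 + 48;

impl LinkRequest {
    /// Serialize into the fixed 80-byte wire layout.
    fn to_bytes(&self) -> [u8; LINK_REQUEST_LEN] {
        let mut out = [0u8; LINK_REQUEST_LEN];
        out[0..16].copy_from_slice(&self.initiator_addr);
        out[16..48].copy_from_slice(&self.ephemeral_pub);
        out[48..72].copy_from_slice(&self.nonce);
        out[72..80].copy_from_slice(&self.flags);
        out
    }

    /// Parse the wire layout; any length other than 80 bytes is rejected.
    fn from_bytes(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != LINK_REQUEST_LEN {
            return None;
        }
        let mut req = LinkRequest {
            initiator_addr: [0; 16],
            ephemeral_pub: [0; 32],
            nonce: [0; 24],
            flags: [0; 8],
        };
        req.initiator_addr.copy_from_slice(&bytes[0..16]);
        req.ephemeral_pub.copy_from_slice(&bytes[16..48]);
        req.nonce.copy_from_slice(&bytes[48..72]);
        req.flags.copy_from_slice(&bytes[72..80]);
        Some(req)
    }
}
```

At 80 bytes the first packet fits in a single frame at the 222-byte LoRa MTU, but exceeds the 51-byte MTU quoted for SF12, where it would have to be fragmented.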
S6 — LoRa Transport & Integration Demo
Problem: The mesh infrastructure from S1-S5 needs a real constrained transport to prove it works.
Solution: LoRa transport backend + end-to-end demo with Meshtastic-compatible or standalone LoRa hardware.
Deliverables:
- `transport_lora.rs`: LoRa transport implementation:

  ```rust
  pub struct LoRaTransport {
      /// Serial connection to LoRa modem (e.g., SX1276/SX1262 via UART).
      serial: AsyncSerial,
      /// LoRa parameters.
      config: LoRaConfig,
  }

  pub struct LoRaConfig {
      /// Serial port path (e.g., /dev/ttyUSB0).
      pub port: String,
      /// Baud rate for serial connection to modem.
      pub baud_rate: u32,
      /// LoRa frequency in Hz (e.g., 868_100_000 for EU868).
      pub frequency: u64,
      /// Spreading factor (7-12).
      pub spreading_factor: u8,
      /// Bandwidth in Hz (125000, 250000, 500000).
      pub bandwidth: u32,
      /// Coding rate (5-8, meaning 4/5 to 4/8).
      pub coding_rate: u8,
      /// TX power in dBm.
      pub tx_power: i8,
  }
  ```

- MTU-aware fragmentation:
  - LoRa MTU is typically 222 bytes (SF7/BW125) to 51 bytes (SF12/BW125)
  - Automatic fragmentation/reassembly in `TransportManager`
  - Fragment numbering for out-of-order reassembly
- Duty cycle management:
  - EU868: 1% duty cycle enforcement
  - TX budget tracking: don't exceed legal limits
  - Queue with priority (announces < data < emergency)
- End-to-end integration demo:

  ```
  Setup:
    Node A (Laptop + LoRa) ── LoRa ── Node B (RPi + LoRa) ── WiFi ── Node C (Laptop)

  Demo script:
    1. All three nodes start, announce on their transports
    2. A discovers C through B's routing announcements
    3. A sends encrypted message to C: LoRa → B (relay) → WiFi → C
    4. C replies: WiFi → B (relay) → LoRa → A
    5. Show routing table, hop counts, delivery stats at each node
  ```

- `scripts/mesh-demo.sh`: automated demo setup script.
- Termux integration:
  - Update existing Termux build scripts for the mesh features
  - Android phone as a LoRa mesh node (via USB OTG to LoRa modem)
Tests: LoRa transport with mock serial (loopback), fragmentation across LoRa MTU, duty cycle enforcement, 3-node integration test (simulated transports).
Hardware needed: 2-3x LoRa modules (SX1262 recommended), RPi or similar.
Estimated changes: ~600 lines new code, ~50 lines build/script changes.
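A 1% duty cycle over a one-hour window allows 36 seconds of transmit time. TX budget tracking can be sketched as a sliding-window ledger of past transmissions. `DutyCycleBudget` and its methods are hypothetical names. Timestamps are passed in as `Duration` offsets from an arbitrary epoch so the logic can be tested without a real clock. This is a simplification and not a regulatory-grade implementation: sub-band rules and time-on-air calculation are out of scope.

```rust
use std::collections::VecDeque;
use std::time::Duration;

/// Sliding-window airtime accounting for a duty-cycle-limited band.
struct DutyCycleBudget {
    window: Duration,
    max_airtime: Duration,
    /// (start time, airtime) of each transmission still inside the window.
    history: VecDeque<(Duration, Duration)>,
}

impl DutyCycleBudget {
    /// `percent` is the allowed duty cycle, e.g. 1 for the 1% EU868 limit.
    fn new(window: Duration, percent: u32) -> Self {
        DutyCycleBudget {
            window,
            max_airtime: window * percent / 100,
            history: VecDeque::new(),
        }
    }

    /// Airtime used by transmissions that started within the window ending at `now`.
    fn used(&mut self, now: Duration) -> Duration {
        while let Some(&(start, _)) = self.history.front() {
            if start + self.window <= now {
                self.history.pop_front();
            } else {
                break;
            }
        }
        self.history.iter().map(|&(_, airtime)| airtime).sum()
    }

    /// Record the transmission and return true if it fits the remaining
    /// budget; otherwise leave the ledger untouched and return false.
    fn try_transmit(&mut self, now: Duration, airtime: Duration) -> bool {
        if self.used(now) + airtime > self.max_airtime {
            return false;
        }
        self.history.push_back((now, airtime));
        true
    }
}
```

A transmit queue would call `try_transmit` before each send and hold back lower-priority traffic (announces before data, data before emergency) when it returns false.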
Dependency Graph
```
S1 (Binary Wire)     S2 (Transport Trait)
     │                      │
     └──────┬───────────────┘
            │
     S3 (Announce/Discovery)
            │
     S4 (Multi-Hop Routing)
            │
     S5 (Addresses + Handshake)
            │
     S6 (LoRa + Demo)
```
S1 and S2 can run in parallel (no dependency). S3+ are sequential.
Comparison: quicprochat (after) vs Reticulum
| Dimension | Reticulum | quicprochat (post-upgrade) |
|---|---|---|
| Language | Python | Rust (no_std possible) |
| Crypto | X25519, AES-256-CBC, HMAC-SHA256 | Ed25519, X25519+ML-KEM-768, ChaCha20-Poly1305, MLS |
| Post-Quantum | No | Yes (ML-KEM-768 hybrid) |
| Group Encryption | None (link-level only) | MLS RFC 9420 (forward secrecy + PCS) |
| Wire Format | msgpack | CBOR (compact, IETF standard) |
| Spec | Reference implementation only | Protobuf schemas + potential IETF Draft |
| Transport Agnostic | Yes (mature, 8 years) | Yes (new, but Rust-native) |
| Multi-Hop Routing | Yes (announce + path discovery) | Yes (inspired by Reticulum) |
| Handshake Size | 297 bytes | ~240 bytes |
| Security Audit | None | Designed for auditability (fuzzing, formal model) |
| Embedded Targets | No (CPython required) | Yes (Rust cross-compile, no_std core) |
| LoRa Support | Yes (via RNode) | Yes (direct SX1262 + Meshtastic compat) |
Risk Register
| Risk | Impact | Mitigation |
|---|---|---|
| LoRa hardware availability | Blocks S6 | S1-S5 work with simulated transports; LoRa is optional |
| iroh API breaking changes | Medium | Pin iroh version, abstract behind transport trait (S2) |
| Address collision (16-byte truncation) | Low (birthday: ~2^64) | Monitor, option to use full 32-byte if needed |
| Lightweight handshake security gaps | High | Get crypto review before deploying on real networks |
| Fragmentation complexity | Medium | Start with simple stop-and-wait, optimize later |
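
The birthday figure in the address-collision row follows from the size of the address space. As a worked estimate, assume addresses behave like uniform random 128-bit values, which is the usual idealisation for a truncated SHA-256 output. With $n$ addresses drawn from $N = 2^{128}$ possibilities:

```latex
P_{\text{collision}}(n) \;\approx\; 1 - e^{-n(n-1)/(2N)} \;\approx\; \frac{n^2}{2N} = \frac{n^2}{2^{129}}
\qquad (n \ll \sqrt{N} = 2^{64})

n = 10^4:\quad P_{\text{collision}} \approx \frac{10^{8}}{2^{129}} \approx 1.5 \times 10^{-31}
```

The collision probability only approaches 50% at roughly $1.18 \cdot 2^{64}$ addresses. The 10K-address test in S5 therefore guards against implementation bugs in the derivation rather than measuring the hash's collision resistance.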
Success Criteria
After S4 (minimum viable mesh):
- 3+ nodes form a self-organizing mesh over TCP transports
- Messages route automatically through intermediate nodes
- Node join/leave is handled gracefully (re-announce, route expiry)
- Wire format is <200 bytes for a typical chat message envelope
After S6 (full demo):
- Working LoRa ↔ WiFi ↔ QUIC heterogeneous mesh
- Message delivery across 3 hops with different transports
- Duty cycle compliance on EU868
- Android (Termux) node participates in the mesh