# Status Log ## 2026-04-11 — Observability & MeshNode run() wiring ### Completed - **observability.rs** — new module with health checks, Prometheus text export, HTTP server - `NodeHealth` struct with per-subsystem health checks (transport, routing, store) - `HealthStatus` enum (Healthy/Degraded/Draining/Unhealthy) with HTTP status codes - `prometheus_text()` — renders `MetricsSnapshot` in Prometheus exposition format - `HealthServer` — lightweight TCP-based HTTP server for `/healthz` and `/metricsz` - **MeshNode.run()** — starts background tasks and returns a `RunHandle` - Periodic GC task (store, routing table, rate limiters) with configurable interval - Health/metrics HTTP server (optional, via `MeshNodeBuilder.health_listen()`) - Shutdown coordination via `watch` channel - **RunHandle** — public API for interacting with a running node - `.node()` — access to the MeshNode - `.health()` — current health snapshot - `.metrics_snapshot()` — current metrics - `.health_addr()` — bound health server address - `.shutdown()` — graceful shutdown (signals tasks + drains transports) - **Tracing spans** — `#[tracing::instrument]` on `process_incoming()` and `send()` - Includes sender/dest address and payload length as span fields - GC cycle wrapped in `mesh_gc` info span - **Draining flag** — `AtomicBool` for shutdown awareness; health endpoint returns 503 ### Test Coverage - 232 total tests passing (212 lib + 3 fapp_flow + 1 meshservice + 16 multi_node) - 7 new observability unit tests (health healthy/degraded/draining, prometheus format) - Full workspace `cargo check` clean ### What's Next 1. Wire `MeshNode.run()` into an example binary or the server 2. Announce loop task (periodic re-announce to neighbors) 3. Grafana dashboard for mesh metrics 4. Integration test for health HTTP endpoint --- ## 2026-04-01 — meshservice workspace integration ### Completed - **Workspace** — `crates/meshservice/` is a workspace member (`Cargo.toml`); `cargo check -p meshservice` and full `cargo check --workspace` succeed. - **P2P bridge test** — `crates/quicprochat-p2p/tests/meshservice_tcp_transport.rs`: same Ed25519 seed for `MeshIdentity` and `meshservice::ServiceIdentity`; FAPP announce encoded with `meshservice::wire`, sent over `TcpTransport`, decoded and handled by `ServiceRouter` + `FappService::relay()`. - **Client command engine** — `SlashCommand::MeshTrace` / `MeshStats` wired through `Command` and `execute_slash` (fixes non-exhaustive match); playbook steps `mesh-trace` / `mesh-stats` added. ### Integration notes - **Transport**: `meshservice` is transport-agnostic; carry `wire::encode` bytes inside `MeshEnvelope` / mesh ALPN (`quicprochat/mesh/1`) for production — not yet a direct dependency from `quicprochat-p2p` lib code. - **FAPP duplication**: `quicprochat-p2p::fapp` (legacy mesh FAPP) and `meshservice::services::fapp` (generic service layer) coexist; long-term alignment TBD. --- ## 2026-04-01 — Production Infrastructure Sprint ### Completed - **Error handling** — `error.rs`: Structured error types with context for all subsystems - MeshError, TransportError, RoutingError, CryptoError, ProtocolError, StoreError, ConfigError - ErrorContext trait for chaining errors with context - Helper methods for common error construction - **Configuration** — `config.rs`: Runtime config with TOML parsing - MeshConfig, IdentityConfig, AnnounceConfig, RoutingConfig, StoreConfig - TransportConfig (QUIC/TCP/LoRa), CryptoConfig, RateLimitConfig, LoggingConfig - Validation with meaningful error messages - MeshConfig::constrained() preset for low-resource devices - **Metrics/Observability** — `metrics.rs`: Counter/Gauge/Histogram primitives - Per-transport metrics (sent/received/errors/bytes) - Routing metrics (table size, lookups, misses) - Store metrics (stored/delivered/expired) - Crypto metrics (encryptions, failures, replay detections) - JSON-serializable MetricsSnapshot for export - **Rate limiting** — `rate_limit.rs`: DoS protection - TokenBucket with configurable refill rate - Per-peer limiters for messages, announces, KeyPackage requests - DutyCycleTracker for LoRa EU868 compliance - BackpressureController with priority-based shedding - **Persistence** — `persistence.rs`: Durable storage - AppendLog with JSON entries and compaction - PersistentRoutingTable with TTL-based expiry - PersistentMessageStore for offline delivery - Atomic file operations with fsync - **Graceful shutdown** — `shutdown.rs`: Coordinated termination - ShutdownCoordinator with phase transitions (Draining → Persisting → Cleanup → Complete) - TaskGuard RAII for tracking active tasks - ConnectionDrainer for clean connection teardown - ShutdownHooks for persist/cleanup callbacks - **Integration tests** — `tests/multi_node.rs`: 16 production scenarios - Rate limiting per-peer isolation - Store-and-forward, message dedup, GC - Envelope V2 signatures, forwarding, broadcast - Config validation, TOML roundtrip - Shutdown coordination, concurrent access ### Test Coverage - 189 unit tests + 16 integration tests = **205 total** - All passing ### What's Next 1. Wire new modules into P2pNode startup 2. Add tracing spans for distributed tracing 3. Health check HTTP endpoint 4. Prometheus metrics export --- ## 2026-04-01 — MeshNode: Production Integration ### Completed - **MeshNode** — `mesh_node.rs`: Production-ready node integrating all subsystems - `MeshNodeBuilder`: Fluent API for configuration - `MeshConfig` integration for all settings - `MeshMetrics` tracking for all operations - Rate limiting on incoming messages via `RateLimiter` - Backpressure control via `BackpressureController` - Graceful shutdown via `ShutdownCoordinator` - Optional `FappRouter` based on capabilities - `MeshRouter` for envelope routing - `TransportManager` for multi-transport support ### Key APIs ```rust // Build a mesh node let node = MeshNodeBuilder::new() .config(config) .identity(identity) .fapp_relay() .fapp_patient() .build() .await?; // Process incoming with rate limiting + metrics let action = node.process_incoming(&sender_addr, envelope)?; // Garbage collection node.gc()?; // Graceful shutdown node.shutdown().await; ``` ### Test Coverage - 222 total tests (203 lib + 3 fapp_flow + 16 multi_node) - 5 new mesh_node tests --- ## 2026-04-01 — FAPP: Complete E2E Flow ### Completed (Latest) - **E2E Encryption** — `fapp.rs`: SlotReserve/SlotConfirm with X25519 + ChaCha20-Poly1305 - `PatientEphemeralKey`: generates X25519 keypair for reservation - `TherapistCrypto`: decrypts reserves, creates confirms with forward secrecy - `PatientCrypto`: creates reserves, decrypts confirmations - Each confirmation uses fresh ephemeral key for forward secrecy - **FappRouter Reserve/Confirm** — `fapp_router.rs`: - `DeliverReserve` / `DeliverConfirm` action variants - `process_slot_reserve()`: routes to therapist or floods - `process_slot_confirm()`: delivers to patient - `send_reserve()` / `send_confirm()`: capability-checked sends - `send_response()`: relay-to-patient response routing - **Integration Tests** — `tests/fapp_flow.rs`: - `full_fapp_flow_announce_query_reserve_confirm`: Complete flow from announce to confirmed appointment - `fapp_rejection_flow`: Tests therapist declining a reservation - `fapp_query_filters`: Tests Fachrichtung, PLZ, and other filters ### Test Coverage - 217 total tests (198 lib + 3 fapp_flow + 16 multi_node) - 31 FAPP-specific tests (24 fapp + 7 fapp_router) ### What's Next 1. Wire FappRouter into P2pNode startup 2. LoRa testing for FAPP messages --- ## 2026-03-31 — FAPP: Free Appointment Propagation Protocol ### Completed - **Protocol spec** — `docs/specs/fapp-protocol.md`: decentralized psychotherapy appointment discovery over mesh - **Rust module** — `crates/quicprochat-p2p/src/fapp.rs`: full data structures, store, query matching, signature verification - **Message types**: SlotAnnounce, SlotQuery, SlotResponse, SlotReserve, SlotConfirm - **Domain model**: Fachrichtung, Modalitaet, Kostentraeger, SlotType (German enum names for domain concepts) - **FappStore**: in-memory cache with dedup (therapist_address + sequence), TTL expiry, signature verification, capacity limits - **Query matching**: filter by Fachrichtung, Modalitaet, Kostentraeger, PLZ prefix, time range, SlotType, max_results - **Privacy model**: therapist identity public (Approbation-bound), patient queries anonymous ### Design Decisions - Extends announce.rs capability bitfield with CAP_FAPP_THERAPIST (0x0100), CAP_FAPP_RELAY (0x0200), CAP_FAPP_PATIENT (0x0400) - Uses same signing pattern as MeshAnnounce: hop_count excluded from signature, forwarding nodes don't re-sign - CBOR wire format consistent with existing envelope/announce code - Location hint is PLZ only (e.g. "80331") — never exact address - Anti-spam: Approbation hash binding, signature verification, sequence-based dedup, rate limiting, TTL enforcement --- ## 2026-03-30 — Mesh Protocol Infrastructure Sprint ### Completed (Latest) - **KeyPackage distribution** — `keypackage_cache.rs` + `mesh_protocol.rs` - MeshAnnounce extended with `keypackage_hash` field - KeyPackageRequest/Response/Unavailable messages - KeyPackageCache with TTL, per-address limits, LRU eviction - **Transport capability negotiation** — `transport.rs` TransportCapability - Auto-classification: Unconstrained/Medium/Constrained/SeverelyConstrained - CryptoMode recommendation per capability level - TransportManager.recommended_crypto(), select_for_size() - **MLS-Lite upgrade path** — `crypto_negotiation.rs` - GroupCryptoState tracks current mode - MlsLiteBootstrap derives MLS-Lite keys from MLS epoch secret - Enables same group to use full MLS on WiFi, MLS-Lite on LoRa ### Previously Completed - **S4: Multi-hop routing** — `MeshRouter` with `send()`, `handle_incoming()`, `forward()`, `drain_store_for()` - **S4: REPL commands** — `/mesh trace
` and `/mesh stats` - **S5: Truncated addresses** — `MeshEnvelopeV2` with 16-byte addresses (~18% smaller) - **MLS-Lite** — Lightweight symmetric mode for constrained links (`mls_lite.rs`) - **Size measurements** — Actual MLS and envelope sizes benchmarked ### Actual Measured Sizes (Key Finding!) | Component | Size | LoRa SF12 fragments | |-----------|------|---------------------| | MLS KeyPackage | 306 bytes | 6 | | MLS Welcome | 840 bytes | 17 | | MLS-Lite (no sig) | 129 bytes | 3 | | MLS-Lite (with sig) | 262 bytes | 6 | | MeshEnvelope V1 | 410 bytes | 9 | | MeshEnvelope V2 | 336 bytes | 7 | | MLS KeyPackage (PQ hybrid) | 2,676 bytes | 53 | **Key insight:** Classical MLS is actually LoRa-viable! 6 fragments for KeyPackage, ~14 sec for group setup at 1% duty. PQ hybrid remains impractical. ### What's Next 1. KeyPackage distribution over mesh (announce-based) 2. Transport capability negotiation 3. Real hardware testing (LoRa boards) 4. MLS-Lite upgrade path to full MLS --- ## 2026-03-30 — Mesh Protocol Gap Analysis ### Completed - Created `docs/plans/mesh-protocol-gaps.md` — honest assessment of QuicProChat vs. Reticulum/Meshtastic/Briar - Created `docs/src/design-rationale/mesh-protocol-comparison.md` — technical comparison document - Updated `docs/positioning.md` — sharper messaging + honest limitations ### Key Insight QuicProChat has **best-in-class crypto** AND **viable mesh efficiency** (for classical MLS). PQ hybrid mode needs constrained-link fallback. ### Open Design Questions - How to distribute KeyPackages over mesh without server? - Should we implement LXMF compatibility for Reticulum interop? --- ## 2026-03-30 — Sprint 6: LoRa transport & integration demo ### Completed - Added `transport_lora.rs`: `LoRaConfig`, Semtech-style airtime estimate, `DutyCycleTracker` (rolling 1 h window, `eu868_one_percent()`), `LoRaMockMedium` + `LoRaTransport` implementing `MeshTransport` (`lora` name for `TransportManager`), LR framing with automatic fragmentation/reassembly, tests (mock roundtrip, fragmentation, duty accounting, `split_for_mtu`). - Example `mesh_lora_relay_demo`: A (LoRa mock) → B (relay) → C (TCP) and reply path; `scripts/mesh-demo.sh` runs it. - Wired `pub mod transport_lora` in `lib.rs`. - Adjusted `cbor_smaller_than_json` to assert CBOR is materially smaller than JSON (fixed overhead dominates; a strict half-JSON threshold failed on current envelope sizes). ### What's next - Optional: UART-backed `LoRaTransport` behind a feature flag (modem-specific framing). - Hardware runbook: replace mock medium with RNode / SX1262 serial when available. ## 2026-03-30 — Sprint 3: Announce & Discovery Protocol ### Completed - Created `MeshAnnounce` struct with Ed25519 signed announcements, CBOR wire format, hop forwarding - Created `compute_address()` — SHA-256 truncation of identity key to 16-byte mesh address - Created `RoutingTable` with `RoutingEntry` — keyed by 16-byte address, supports lookup by address or full key, TTL-based expiry, sequence-based stale rejection - Created `AnnounceDedup` for loop prevention (address+sequence deduplication) - Created `AnnounceConfig` with sensible defaults (10min interval, 30min max age, 8 max hops) - Created `create_announce()` and `process_received_announce()` — complete announce processing pipeline (verify, expiry check, dedup, routing update, propagation decision) - Capability flags: CAP_RELAY, CAP_STORE, CAP_GATEWAY, CAP_CONSTRAINED - Tests: 17 tests across 3 modules covering signature verification, tampering, forwarding, expiry, dedup, routing updates, stale rejection, CBOR roundtrip, address determinism - Updated lib.rs with `announce`, `announce_protocol`, `routing_table` modules ### What's Next - S4: Multi-Hop Routing - Integrate announce protocol with TransportManager for actual broadcast/receive loops - Add tokio async announce loop (periodic re-announce, GC timer) ### Notes - Signature excludes `hop_count` (same design as MeshEnvelope) so forwarding doesn't break verification - Protocol engine uses free functions rather than a stateful struct — simpler, more testable - Cannot run `cargo test` in this environment (no C toolchain / linker available) ## 2026-03-30 — Sprint 2: Transport Abstraction Layer ### Completed - Created `MeshTransport` trait with `send`, `recv`, `discover`, `close` methods - Created `TransportAddr` enum for transport-agnostic addressing (Iroh, Socket, LoRa, Serial, Raw) - Created `TransportInfo` struct for transport capability metadata - Implemented `IrohTransport` wrapping iroh `Endpoint` with same length-prefixed framing as `P2pNode` - Implemented `TcpTransport` using tokio `TcpListener`/`TcpStream` with length-prefixed framing - Implemented `TransportManager` for multi-transport routing based on address type - Added `async-trait` dependency, enabled tokio `net` + `io-util` features - Tests: TransportAddr Display formatting, TCP roundtrip, TransportManager routing, error cases ### What's Next - S3: Announce & Discovery Protocol - Future: integrate transport layer into `HybridRouter` / replace direct iroh usage ### Notes - New transport layer sits alongside existing `P2pNode` — no breaking changes - `IrohTransport` uses separate ALPN (`quicprochat/mesh/1`) to avoid conflicts with `P2pNode` - Cannot run `cargo test`/`cargo clippy` in this environment (no Rust toolchain installed)