quicproquo/docs/status.md
Christian Nennemann 8eba12170e feat: integrate meshservice crate into workspace
- Add meshservice to workspace members
- Fix quicprochat-client: add MeshTrace/MeshStats slash commands
- Add integration test: meshservice_tcp_transport
- Document integration points in README and docs/status.md
- Verify shared identity (IdentityKeypair → MeshAddress)
2026-04-01 18:56:25 +02:00


# Status Log
## 2026-04-01 — meshservice workspace integration
### Completed
- **Workspace** — `crates/meshservice/` is a workspace member (`Cargo.toml`); `cargo check -p meshservice` and full `cargo check --workspace` succeed.
- **P2P bridge test** — `crates/quicprochat-p2p/tests/meshservice_tcp_transport.rs`: same Ed25519 seed for `MeshIdentity` and `meshservice::ServiceIdentity`; FAPP announce encoded with `meshservice::wire`, sent over `TcpTransport`, decoded and handled by `ServiceRouter` + `FappService::relay()`.
- **Client command engine** — `SlashCommand::MeshTrace` / `MeshStats` wired through `Command` and `execute_slash` (fixes non-exhaustive match); playbook steps `mesh-trace` / `mesh-stats` added.
### Integration notes
- **Transport**: `meshservice` is transport-agnostic; carry `wire::encode` bytes inside `MeshEnvelope` / mesh ALPN (`quicprochat/mesh/1`) for production — not yet a direct dependency from `quicprochat-p2p` lib code.
- **FAPP duplication**: `quicprochat-p2p::fapp` (legacy mesh FAPP) and `meshservice::services::fapp` (generic service layer) coexist; long-term alignment TBD.
---
## 2026-04-01 — Production Infrastructure Sprint
### Completed
- **Error handling** — `error.rs`: Structured error types with context for all subsystems
  - MeshError, TransportError, RoutingError, CryptoError, ProtocolError, StoreError, ConfigError
  - ErrorContext trait for chaining errors with context
  - Helper methods for common error construction
- **Configuration** — `config.rs`: Runtime config with TOML parsing
  - MeshConfig, IdentityConfig, AnnounceConfig, RoutingConfig, StoreConfig
  - TransportConfig (QUIC/TCP/LoRa), CryptoConfig, RateLimitConfig, LoggingConfig
  - Validation with meaningful error messages
  - MeshConfig::constrained() preset for low-resource devices
- **Metrics/Observability** — `metrics.rs`: Counter/Gauge/Histogram primitives
  - Per-transport metrics (sent/received/errors/bytes)
  - Routing metrics (table size, lookups, misses)
  - Store metrics (stored/delivered/expired)
  - Crypto metrics (encryptions, failures, replay detections)
  - JSON-serializable MetricsSnapshot for export
- **Rate limiting** — `rate_limit.rs`: DoS protection
  - TokenBucket with configurable refill rate
  - Per-peer limiters for messages, announces, KeyPackage requests
  - DutyCycleTracker for LoRa EU868 compliance
  - BackpressureController with priority-based shedding
- **Persistence** — `persistence.rs`: Durable storage
  - AppendLog with JSON entries and compaction
  - PersistentRoutingTable with TTL-based expiry
  - PersistentMessageStore for offline delivery
  - Atomic file operations with fsync
- **Graceful shutdown** — `shutdown.rs`: Coordinated termination
  - ShutdownCoordinator with phase transitions (Draining → Persisting → Cleanup → Complete)
  - TaskGuard RAII for tracking active tasks
  - ConnectionDrainer for clean connection teardown
  - ShutdownHooks for persist/cleanup callbacks
- **Integration tests** — `tests/multi_node.rs`: 16 production scenarios
  - Rate limiting per-peer isolation
  - Store-and-forward, message dedup, GC
  - Envelope V2 signatures, forwarding, broadcast
  - Config validation, TOML roundtrip
  - Shutdown coordination, concurrent access
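
The per-peer token bucket idea listed under rate limiting can be sketched as follows. This is a minimal illustration, not the code in `rate_limit.rs`: the type names `SketchBucket` and `PerPeerLimiter`, the 16-byte peer key, and the one-token-per-message cost are all assumptions made for the example.

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Hypothetical token bucket; not the `TokenBucket` from `rate_limit.rs`.
pub struct SketchBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl SketchBucket {
    /// Starts full, so a new peer may burst up to `capacity` messages.
    pub fn new(capacity: u32, refill_per_sec: f64, now: Instant) -> Self {
        Self {
            capacity: capacity as f64,
            tokens: capacity as f64,
            refill_per_sec,
            last_refill: now,
        }
    }

    /// Takes one token if available; refills first based on elapsed time.
    pub fn try_acquire(&mut self, now: Instant) -> bool {
        if now > self.last_refill {
            let elapsed = now.duration_since(self.last_refill).as_secs_f64();
            self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
            self.last_refill = now;
        }
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

/// One bucket per peer, so a noisy peer cannot exhaust another peer's budget.
pub struct PerPeerLimiter {
    capacity: u32,
    refill_per_sec: f64,
    buckets: HashMap<[u8; 16], SketchBucket>,
}

impl PerPeerLimiter {
    pub fn new(capacity: u32, refill_per_sec: f64) -> Self {
        Self { capacity, refill_per_sec, buckets: HashMap::new() }
    }

    pub fn allow(&mut self, peer: [u8; 16], now: Instant) -> bool {
        let (cap, rate) = (self.capacity, self.refill_per_sec);
        self.buckets
            .entry(peer)
            .or_insert_with(|| SketchBucket::new(cap, rate, now))
            .try_acquire(now)
    }
}
```

Passing `now` explicitly keeps the sketch deterministic under test; a real limiter would also need to evict idle peers so the map does not grow without bound.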
### Test Coverage
- 189 unit tests + 16 integration tests = **205 total**
- All passing
### What's Next
1. Wire new modules into P2pNode startup
2. Add tracing spans for distributed tracing
3. Health check HTTP endpoint
4. Prometheus metrics export
---
## 2026-04-01 — MeshNode: Production Integration
### Completed
- **MeshNode** — `mesh_node.rs`: Production-ready node integrating all subsystems
  - `MeshNodeBuilder`: Fluent API for configuration
  - `MeshConfig` integration for all settings
  - `MeshMetrics` tracking for all operations
  - Rate limiting on incoming messages via `RateLimiter`
  - Backpressure control via `BackpressureController`
  - Graceful shutdown via `ShutdownCoordinator`
  - Optional `FappRouter` based on capabilities
  - `MeshRouter` for envelope routing
  - `TransportManager` for multi-transport support
### Key APIs
```rust
// Build a mesh node
let node = MeshNodeBuilder::new()
    .config(config)
    .identity(identity)
    .fapp_relay()
    .fapp_patient()
    .build()
    .await?;

// Process incoming with rate limiting + metrics
let action = node.process_incoming(&sender_addr, envelope)?;

// Garbage collection
node.gc()?;

// Graceful shutdown
node.shutdown().await;
```
### Test Coverage
- 222 total tests (203 lib + 3 fapp_flow + 16 multi_node)
- 5 new mesh_node tests
---
## 2026-04-01 — FAPP: Complete E2E Flow
### Completed (Latest)
- **E2E Encryption** — `fapp.rs`: SlotReserve/SlotConfirm with X25519 + ChaCha20-Poly1305
  - `PatientEphemeralKey`: generates X25519 keypair for reservation
  - `TherapistCrypto`: decrypts reserves, creates confirms with forward secrecy
  - `PatientCrypto`: creates reserves, decrypts confirmations
  - Each confirmation uses fresh ephemeral key for forward secrecy
- **FappRouter Reserve/Confirm** — `fapp_router.rs`:
  - `DeliverReserve` / `DeliverConfirm` action variants
  - `process_slot_reserve()`: routes to therapist or floods
  - `process_slot_confirm()`: delivers to patient
  - `send_reserve()` / `send_confirm()`: capability-checked sends
  - `send_response()`: relay-to-patient response routing
- **Integration Tests** — `tests/fapp_flow.rs`:
  - `full_fapp_flow_announce_query_reserve_confirm`: Complete flow from announce to confirmed appointment
  - `fapp_rejection_flow`: Tests therapist declining a reservation
  - `fapp_query_filters`: Tests Fachrichtung, PLZ, and other filters
### Test Coverage
- 217 total tests (198 lib + 3 fapp_flow + 16 multi_node)
- 31 FAPP-specific tests (24 fapp + 7 fapp_router)
### What's Next
1. Wire FappRouter into P2pNode startup
2. LoRa testing for FAPP messages
---
## 2026-03-31 — FAPP: Free Appointment Propagation Protocol
### Completed
- **Protocol spec** — `docs/specs/fapp-protocol.md`: decentralized psychotherapy appointment discovery over mesh
- **Rust module** — `crates/quicprochat-p2p/src/fapp.rs`: full data structures, store, query matching, signature verification
- **Message types**: SlotAnnounce, SlotQuery, SlotResponse, SlotReserve, SlotConfirm
- **Domain model**: Fachrichtung, Modalitaet, Kostentraeger, SlotType (German enum names for domain concepts)
- **FappStore**: in-memory cache with dedup (therapist_address + sequence), TTL expiry, signature verification, capacity limits
- **Query matching**: filter by Fachrichtung, Modalitaet, Kostentraeger, PLZ prefix, time range, SlotType, max_results
- **Privacy model**: therapist identity public (Approbation-bound), patient queries anonymous
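
To illustrate the query matching described above, here is a reduced sketch covering only the PLZ prefix, time range, and `max_results` filters. The `Slot` and `Query` structs and their field names are hypothetical stand-ins, not the types in `fapp.rs`; the remaining filters (Fachrichtung, Modalitaet, Kostentraeger, SlotType) would follow the same optional-filter pattern.

```rust
/// Hypothetical, simplified stand-in for an announced slot.
#[derive(Debug, Clone, PartialEq)]
pub struct Slot {
    pub plz: String,     // postal code hint, e.g. "80331"
    pub start_unix: u64, // slot start time, seconds since epoch
}

/// Hypothetical query: a `None` filter matches everything.
#[derive(Debug)]
pub struct Query {
    pub plz_prefix: Option<String>,
    pub time_range: Option<(u64, u64)>, // inclusive bounds
    pub max_results: usize,
}

/// True if the slot passes every filter that is set.
pub fn matches(slot: &Slot, q: &Query) -> bool {
    let plz_ok = q
        .plz_prefix
        .as_deref()
        .map_or(true, |p| slot.plz.starts_with(p));
    let time_ok = q
        .time_range
        .map_or(true, |(from, to)| slot.start_unix >= from && slot.start_unix <= to);
    plz_ok && time_ok
}

/// Applies the filters and truncates to `max_results`.
pub fn run_query<'a>(slots: &'a [Slot], q: &Query) -> Vec<&'a Slot> {
    slots
        .iter()
        .filter(|s| matches(s, q))
        .take(q.max_results)
        .collect()
}
```

Matching on a prefix lets a patient widen or narrow the search area without the protocol ever carrying more than a postal code.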
### Design Decisions
- Extends announce.rs capability bitfield with CAP_FAPP_THERAPIST (0x0100), CAP_FAPP_RELAY (0x0200), CAP_FAPP_PATIENT (0x0400)
- Uses same signing pattern as MeshAnnounce: hop_count excluded from signature, forwarding nodes don't re-sign
- CBOR wire format consistent with existing envelope/announce code
- Location hint is PLZ only (e.g. "80331") — never exact address
- Anti-spam: Approbation hash binding, signature verification, sequence-based dedup, rate limiting, TTL enforcement
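
The capability bits listed above can be checked with plain bit masking. The sketch below assumes a `u16` bitfield (the source does not state the width; the values need at least 11 bits), and `has_cap` is a hypothetical helper name.

```rust
// Flag values as listed in the design decisions above.
pub const CAP_FAPP_THERAPIST: u16 = 0x0100;
pub const CAP_FAPP_RELAY: u16 = 0x0200;
pub const CAP_FAPP_PATIENT: u16 = 0x0400;

/// Hypothetical helper: true if every bit in `flag` is set in `caps`.
pub fn has_cap(caps: u16, flag: u16) -> bool {
    caps & flag == flag
}

/// Example of a capability-gated decision: only nodes that advertise the
/// patient capability would be allowed to originate a reservation.
pub fn may_send_reserve(caps: u16) -> bool {
    has_cap(caps, CAP_FAPP_PATIENT)
}
```

Because each role has its own bit, a node can combine roles, for example relay plus patient as in the `MeshNodeBuilder` example elsewhere in this log.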
---
## 2026-03-30 — Mesh Protocol Infrastructure Sprint
### Completed (Latest)
- **KeyPackage distribution** — `keypackage_cache.rs` + `mesh_protocol.rs`
  - MeshAnnounce extended with `keypackage_hash` field
  - KeyPackageRequest/Response/Unavailable messages
  - KeyPackageCache with TTL, per-address limits, LRU eviction
- **Transport capability negotiation** — `transport.rs` TransportCapability
  - Auto-classification: Unconstrained/Medium/Constrained/SeverelyConstrained
  - CryptoMode recommendation per capability level
  - TransportManager.recommended_crypto(), select_for_size()
- **MLS-Lite upgrade path** — `crypto_negotiation.rs`
  - GroupCryptoState tracks current mode
  - MlsLiteBootstrap derives MLS-Lite keys from MLS epoch secret
  - Enables same group to use full MLS on WiFi, MLS-Lite on LoRa
### Previously Completed
- **S4: Multi-hop routing** — `MeshRouter` with `send()`, `handle_incoming()`, `forward()`, `drain_store_for()`
- **S4: REPL commands** — `/mesh trace <address>` and `/mesh stats`
- **S5: Truncated addresses** — `MeshEnvelopeV2` with 16-byte addresses (~18% smaller)
- **MLS-Lite** — Lightweight symmetric mode for constrained links (`mls_lite.rs`)
- **Size measurements** — Actual MLS and envelope sizes benchmarked
### Actual Measured Sizes (Key Finding!)
| Component | Size | LoRa SF12 fragments |
|-----------|------|---------------------|
| MLS KeyPackage | 306 bytes | 6 |
| MLS Welcome | 840 bytes | 17 |
| MLS-Lite (no sig) | 129 bytes | 3 |
| MLS-Lite (with sig) | 262 bytes | 6 |
| MeshEnvelope V1 | 410 bytes | 9 |
| MeshEnvelope V2 | 336 bytes | 7 |
| MLS KeyPackage (PQ hybrid) | 2,676 bytes | 53 |
**Key insight:** Classical MLS is LoRa-viable: a KeyPackage fits in 6 fragments, and group setup takes ~14 s at 1% duty cycle. PQ hybrid remains impractical (53 fragments for a KeyPackage alone).
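
The fragment counts in the table are all consistent with a ceiling division by 51 payload bytes per fragment. That per-fragment figure is inferred from the table rather than stated in this log, so treat the constant below as an assumption; `fragments_needed` is a hypothetical helper.

```rust
/// Assumed usable payload per LoRa SF12 fragment, inferred from the table.
pub const ASSUMED_FRAGMENT_PAYLOAD: usize = 51;

/// Number of fragments needed to carry `size` bytes (ceiling division).
pub fn fragments_needed(size: usize, payload_per_fragment: usize) -> usize {
    assert!(payload_per_fragment > 0, "fragment payload must be non-zero");
    (size + payload_per_fragment - 1) / payload_per_fragment
}

/// Measured sizes from the table, paired with the fragment counts it reports.
pub const MEASURED: [(&str, usize, usize); 7] = [
    ("MLS KeyPackage", 306, 6),
    ("MLS Welcome", 840, 17),
    ("MLS-Lite (no sig)", 129, 3),
    ("MLS-Lite (with sig)", 262, 6),
    ("MeshEnvelope V1", 410, 9),
    ("MeshEnvelope V2", 336, 7),
    ("MLS KeyPackage (PQ hybrid)", 2676, 53),
];
```

Under this assumption every row of the table is reproduced, which also makes it easy to estimate the fragment cost of any new message type before measuring it on hardware.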
### What's Next
1. KeyPackage distribution over mesh (announce-based)
2. Transport capability negotiation
3. Real hardware testing (LoRa boards)
4. MLS-Lite upgrade path to full MLS
---
## 2026-03-30 — Mesh Protocol Gap Analysis
### Completed
- Created `docs/plans/mesh-protocol-gaps.md` — honest assessment of QuicProChat vs. Reticulum/Meshtastic/Briar
- Created `docs/src/design-rationale/mesh-protocol-comparison.md` — technical comparison document
- Updated `docs/positioning.md` — sharper messaging + honest limitations
### Key Insight
QuicProChat has **best-in-class crypto** AND **viable mesh efficiency** (for classical MLS). PQ hybrid mode needs constrained-link fallback.
### Open Design Questions
- How to distribute KeyPackages over mesh without server?
- Should we implement LXMF compatibility for Reticulum interop?
---
## 2026-03-30 — Sprint 6: LoRa transport & integration demo
### Completed
- Added `transport_lora.rs`: `LoRaConfig`, Semtech-style airtime estimate, `DutyCycleTracker` (rolling 1 h window, `eu868_one_percent()`), `LoRaMockMedium` + `LoRaTransport` implementing `MeshTransport` (`lora` name for `TransportManager`), LR framing with automatic fragmentation/reassembly, tests (mock roundtrip, fragmentation, duty accounting, `split_for_mtu`).
- Example `mesh_lora_relay_demo`: A (LoRa mock) → B (relay) → C (TCP) and reply path; `scripts/mesh-demo.sh` runs it.
- Wired `pub mod transport_lora` in `lib.rs`.
- Adjusted `cbor_smaller_than_json` to assert CBOR is materially smaller than JSON (fixed overhead dominates; a strict half-JSON threshold failed on current envelope sizes).
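
As a rough illustration of the airtime estimate and duty-cycle accounting mentioned above, the sketch below uses the time-on-air formula from Semtech's SX127x datasheets and a rolling one-hour window with a 1% budget (36 s). The names `LoRaParams`, `airtime_ms`, and `DutyBudget` are hypothetical and do not mirror `LoRaConfig` or `DutyCycleTracker` in `transport_lora.rs`.

```rust
use std::collections::VecDeque;

/// Hypothetical radio parameters; field names are illustrative only.
pub struct LoRaParams {
    pub spreading_factor: u32,   // 7..=12
    pub bandwidth_hz: f64,       // e.g. 125_000.0
    pub coding_rate: u32,        // 1..=4, meaning 4/5 .. 4/8
    pub preamble_symbols: u32,   // commonly 8
    pub explicit_header: bool,
    pub crc: bool,
    pub low_data_rate_opt: bool, // usually enabled for SF11/SF12 at 125 kHz
}

/// Estimated time on air in milliseconds for a payload of `payload_len` bytes.
pub fn airtime_ms(p: &LoRaParams, payload_len: usize) -> f64 {
    let sf = p.spreading_factor as f64;
    let t_sym_ms = (1u64 << p.spreading_factor) as f64 / p.bandwidth_hz * 1000.0;
    let t_preamble_ms = (p.preamble_symbols as f64 + 4.25) * t_sym_ms;
    let de = if p.low_data_rate_opt { 1.0 } else { 0.0 };
    let ih = if p.explicit_header { 0.0 } else { 1.0 };
    let crc = if p.crc { 1.0 } else { 0.0 };
    let numerator = 8.0 * payload_len as f64 - 4.0 * sf + 28.0 + 16.0 * crc - 20.0 * ih;
    let denominator = 4.0 * (sf - 2.0 * de);
    let extra = ((numerator / denominator).ceil() * (p.coding_rate as f64 + 4.0)).max(0.0);
    t_preamble_ms + (8.0 + extra) * t_sym_ms
}

/// Rolling-window airtime budget, e.g. 1% of one hour.
pub struct DutyBudget {
    window_ms: u64,
    budget_ms: f64,
    log: VecDeque<(u64, f64)>, // (timestamp_ms, airtime_ms)
}

impl DutyBudget {
    pub fn one_percent_hourly() -> Self {
        Self { window_ms: 3_600_000, budget_ms: 36_000.0, log: VecDeque::new() }
    }

    /// Records the transmission and returns true if it fits the budget.
    pub fn try_transmit(&mut self, now_ms: u64, airtime_ms: f64) -> bool {
        // Drop entries that have aged out of the rolling window.
        while let Some(&(t, _)) = self.log.front() {
            if now_ms.saturating_sub(t) >= self.window_ms {
                self.log.pop_front();
            } else {
                break;
            }
        }
        let used: f64 = self.log.iter().map(|&(_, a)| a).sum();
        if used + airtime_ms > self.budget_ms {
            return false;
        }
        self.log.push_back((now_ms, airtime_ms));
        true
    }
}
```

With SF12 at 125 kHz, coding rate 4/5, and a 51-byte payload, the formula gives roughly 2.47 s per frame, so only about 14 such frames fit into one hour at 1% duty cycle.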
### What's Next
- Optional: UART-backed `LoRaTransport` behind a feature flag (modem-specific framing).
- Hardware runbook: replace mock medium with RNode / SX1262 serial when available.
---
## 2026-03-30 — Sprint 3: Announce & Discovery Protocol
### Completed
- Created `MeshAnnounce` struct with Ed25519 signed announcements, CBOR wire format, hop forwarding
- Created `compute_address()` — SHA-256 truncation of identity key to 16-byte mesh address
- Created `RoutingTable` with `RoutingEntry` — keyed by 16-byte address, supports lookup by address or full key, TTL-based expiry, sequence-based stale rejection
- Created `AnnounceDedup` for loop prevention (address+sequence deduplication)
- Created `AnnounceConfig` with sensible defaults (10min interval, 30min max age, 8 max hops)
- Created `create_announce()` and `process_received_announce()` — complete announce processing pipeline (verify, expiry check, dedup, routing update, propagation decision)
- Capability flags: CAP_RELAY, CAP_STORE, CAP_GATEWAY, CAP_CONSTRAINED
- Tests: 17 tests across 3 modules covering signature verification, tampering, forwarding, expiry, dedup, routing updates, stale rejection, CBOR roundtrip, address determinism
- Updated lib.rs with `announce`, `announce_protocol`, `routing_table` modules
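
The dedup, routing-update, and propagation steps of the announce pipeline can be sketched as below. Signature verification and the expiry check are omitted, the `u64` sequence type is an assumption, and all type and function names (`SeenSet`, `SeqTable`, `process`) are hypothetical rather than taken from `announce_protocol.rs`.

```rust
use std::collections::{HashMap, HashSet};

/// 16-byte mesh address, as described above.
pub type MeshAddr = [u8; 16];

/// Loop prevention: remembers every (address, sequence) pair already seen.
#[derive(Default)]
pub struct SeenSet {
    seen: HashSet<(MeshAddr, u64)>,
}

impl SeenSet {
    /// Returns true only the first time a pair is observed.
    pub fn first_time(&mut self, addr: MeshAddr, seq: u64) -> bool {
        self.seen.insert((addr, seq))
    }
}

/// Tracks the newest sequence number per address for stale rejection.
#[derive(Default)]
pub struct SeqTable {
    latest: HashMap<MeshAddr, u64>,
}

impl SeqTable {
    /// Accepts the update only if `seq` is newer than what is stored.
    pub fn update(&mut self, addr: MeshAddr, seq: u64) -> bool {
        let is_newer = self.latest.get(&addr).map_or(true, |&old| seq > old);
        if is_newer {
            self.latest.insert(addr, seq);
        }
        is_newer
    }
}

#[derive(Debug, PartialEq)]
pub enum Decision {
    Drop,       // duplicate or stale
    AcceptOnly, // routing updated, hop limit reached
    Forward,    // routing updated, propagate further
}

/// Free function in the spirit of the notes below: state is passed in explicitly.
pub fn process(
    seen: &mut SeenSet,
    table: &mut SeqTable,
    addr: MeshAddr,
    seq: u64,
    hop_count: u8,
    max_hops: u8,
) -> Decision {
    if !seen.first_time(addr, seq) || !table.update(addr, seq) {
        return Decision::Drop;
    }
    if hop_count < max_hops {
        Decision::Forward
    } else {
        Decision::AcceptOnly
    }
}
```

Keeping the state in plain structs passed by reference makes each step testable in isolation, which matches the free-function design noted for the protocol engine.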
### What's Next
- S4: Multi-Hop Routing
- Integrate announce protocol with TransportManager for actual broadcast/receive loops
- Add tokio async announce loop (periodic re-announce, GC timer)
### Notes
- Signature excludes `hop_count` (same design as MeshEnvelope) so forwarding doesn't break verification
- Protocol engine uses free functions rather than a stateful struct — simpler, more testable
- Cannot run `cargo test` in this environment (no C toolchain / linker available)
---
## 2026-03-30 — Sprint 2: Transport Abstraction Layer
### Completed
- Created `MeshTransport` trait with `send`, `recv`, `discover`, `close` methods
- Created `TransportAddr` enum for transport-agnostic addressing (Iroh, Socket, LoRa, Serial, Raw)
- Created `TransportInfo` struct for transport capability metadata
- Implemented `IrohTransport` wrapping iroh `Endpoint` with same length-prefixed framing as `P2pNode`
- Implemented `TcpTransport` using tokio `TcpListener`/`TcpStream` with length-prefixed framing
- Implemented `TransportManager` for multi-transport routing based on address type
- Added `async-trait` dependency, enabled tokio `net` + `io-util` features
- Tests: TransportAddr Display formatting, TCP roundtrip, TransportManager routing, error cases
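
Length-prefixed framing, as used by the TCP and Iroh transports above, can be sketched with blocking `std::io` traits. The real transports are async (tokio), and neither the prefix width nor any size limit is stated in this log, so the 4-byte big-endian prefix and `MAX_FRAME` below are assumptions; `write_frame` and `read_frame` are hypothetical names.

```rust
use std::io::{self, Read, Write};

/// Assumed upper bound on a single frame; guards against hostile length prefixes.
pub const MAX_FRAME: usize = 64 * 1024;

/// Writes a 4-byte big-endian length followed by the payload.
pub fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    let len = payload.len() as u32; // safe: bounded by MAX_FRAME above
    w.write_all(&len.to_be_bytes())?;
    w.write_all(payload)
}

/// Reads one frame; fails on truncated input or an oversized length prefix.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_FRAME {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame exceeds limit"));
    }
    let mut buf = vec![0u8; len];
    r.read_exact(&mut buf)?;
    Ok(buf)
}
```

Checking the length before allocating matters on a mesh transport, since a peer could otherwise force a large allocation with a single forged prefix.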
### What's Next
- S3: Announce & Discovery Protocol
- Future: integrate transport layer into `HybridRouter` / replace direct iroh usage
### Notes
- New transport layer sits alongside existing `P2pNode` — no breaking changes
- `IrohTransport` uses separate ALPN (`quicprochat/mesh/1`) to avoid conflicts with `P2pNode`
- Cannot run `cargo test`/`cargo clippy` in this environment (no Rust toolchain installed)