# Status Log
## 2026-04-01 — Production Infrastructure Sprint
### Completed
- **Error handling** — `error.rs`: Structured error types with context for all subsystems
  - MeshError, TransportError, RoutingError, CryptoError, ProtocolError, StoreError, ConfigError
  - ErrorContext trait for chaining errors with context
  - Helper methods for common error construction
- **Configuration** — `config.rs`: Runtime config with TOML parsing
  - MeshConfig, IdentityConfig, AnnounceConfig, RoutingConfig, StoreConfig
  - TransportConfig (QUIC/TCP/LoRa), CryptoConfig, RateLimitConfig, LoggingConfig
  - Validation with meaningful error messages
  - MeshConfig::constrained() preset for low-resource devices
- **Metrics/Observability** — `metrics.rs`: Counter/Gauge/Histogram primitives
  - Per-transport metrics (sent/received/errors/bytes)
  - Routing metrics (table size, lookups, misses)
  - Store metrics (stored/delivered/expired)
  - Crypto metrics (encryptions, failures, replay detections)
  - JSON-serializable MetricsSnapshot for export
- **Rate limiting** — `rate_limit.rs`: DoS protection
  - TokenBucket with configurable refill rate
  - Per-peer limiters for messages, announces, KeyPackage requests
  - DutyCycleTracker for LoRa EU868 compliance
  - BackpressureController with priority-based shedding
- **Persistence** — `persistence.rs`: Durable storage
  - AppendLog with JSON entries and compaction
  - PersistentRoutingTable with TTL-based expiry
  - PersistentMessageStore for offline delivery
  - Atomic file operations with fsync
- **Graceful shutdown** — `shutdown.rs`: Coordinated termination
  - ShutdownCoordinator with phase transitions (Draining → Persisting → Cleanup → Complete)
  - TaskGuard RAII for tracking active tasks
  - ConnectionDrainer for clean connection teardown
  - ShutdownHooks for persist/cleanup callbacks
- **Integration tests** — `tests/multi_node.rs`: 16 production scenarios
  - Rate limiting per-peer isolation
  - Store-and-forward, message dedup, GC
  - Envelope V2 signatures, forwarding, broadcast
  - Config validation, TOML roundtrip
  - Shutdown coordination, concurrent access
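
The rate-limiting item above (TokenBucket with a configurable refill rate, one limiter per peer) can be sketched roughly as follows. This is a minimal illustration, not the contents of `rate_limit.rs`: the field names, the `PerPeerLimiter` type, the 16-byte peer key, and passing `now` explicitly (for testability) are all assumptions.

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Illustrative token bucket; names and fields are assumptions,
/// not the actual `rate_limit.rs` API.
#[derive(Debug)]
pub struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl TokenBucket {
    /// Starts full, so a new peer can burst up to `capacity` messages.
    pub fn new(capacity: u32, refill_per_sec: f64, now: Instant) -> Self {
        TokenBucket {
            capacity: f64::from(capacity),
            tokens: f64::from(capacity),
            refill_per_sec,
            last_refill: now,
        }
    }

    /// Credits tokens for the time elapsed since the last refill, capped at capacity.
    fn refill(&mut self, now: Instant) {
        if now > self.last_refill {
            let elapsed = now.duration_since(self.last_refill).as_secs_f64();
            self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
            self.last_refill = now;
        }
    }

    /// Takes `cost` tokens if available; `false` means the caller should drop or delay.
    pub fn try_acquire(&mut self, cost: u32, now: Instant) -> bool {
        self.refill(now);
        let cost = f64::from(cost);
        if self.tokens >= cost {
            self.tokens -= cost;
            true
        } else {
            false
        }
    }
}

/// One bucket per peer, so a noisy peer cannot exhaust another peer's budget.
pub struct PerPeerLimiter {
    capacity: u32,
    refill_per_sec: f64,
    buckets: HashMap<[u8; 16], TokenBucket>,
}

impl PerPeerLimiter {
    pub fn new(capacity: u32, refill_per_sec: f64) -> Self {
        PerPeerLimiter { capacity, refill_per_sec, buckets: HashMap::new() }
    }

    /// Charges one token to `peer`, creating its bucket on first contact.
    pub fn allow(&mut self, peer: [u8; 16], now: Instant) -> bool {
        let (capacity, rate) = (self.capacity, self.refill_per_sec);
        self.buckets
            .entry(peer)
            .or_insert_with(|| TokenBucket::new(capacity, rate, now))
            .try_acquire(1, now)
    }
}
```

Keeping a separate bucket per peer is what gives the "per-peer isolation" property exercised by the integration tests: exhausting one peer's tokens leaves every other peer's budget untouched.
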
### Test Coverage
- 189 unit tests + 16 integration tests = **205 total**
- All passing
### What's Next
1. Wire new modules into P2pNode startup
2. Add tracing spans for distributed tracing
3. Health check HTTP endpoint
4. Prometheus metrics export
---
## 2026-03-31 — FAPP: Free Appointment Propagation Protocol
### Completed
- **Protocol spec** — `docs/specs/fapp-protocol.md`: decentralized psychotherapy appointment discovery over mesh
- **Rust module** — `crates/quicprochat-p2p/src/fapp.rs`: full data structures, store, query matching, signature verification
- **Message types**: SlotAnnounce, SlotQuery, SlotResponse, SlotReserve, SlotConfirm
- **Domain model**: Fachrichtung, Modalitaet, Kostentraeger, SlotType (German enum names for domain concepts)
- **FappStore**: in-memory cache with dedup (therapist_address + sequence), TTL expiry, signature verification, capacity limits
- **Query matching**: filter by Fachrichtung, Modalitaet, Kostentraeger, PLZ prefix, time range, SlotType, max_results
- **Tests**: 16 inline tests covering creation, signing, verification, tampering, forwarding, expiry, CBOR roundtrip, store dedup, sequence supersede, query filters (PLZ, SlotType, Kostentraeger, max_results)
- **Privacy model**: therapist identity public (Approbation-bound), patient queries anonymous
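
The query-matching item above can be illustrated with a reduced filter covering only PLZ prefix, time range, and max_results. The `Slot` and `SlotQuery` shapes below are hypothetical stand-ins; the real structures in `fapp.rs` carry more fields (Fachrichtung, Modalitaet, Kostentraeger, SlotType) that are omitted here.

```rust
/// Hypothetical, reduced slot record; the real SlotAnnounce has more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Slot {
    pub therapist_address: [u8; 16],
    pub plz: String,
    pub start_unix: u64,
}

/// Hypothetical query: every `None` filter matches everything.
#[derive(Debug, Clone, Default)]
pub struct SlotQuery {
    pub plz_prefix: Option<String>,
    pub time_range: Option<(u64, u64)>,
    pub max_results: usize,
}

impl SlotQuery {
    /// A slot matches only if it passes every filter that is set.
    pub fn matches(&self, slot: &Slot) -> bool {
        let plz_ok = self
            .plz_prefix
            .as_deref()
            .map_or(true, |prefix| slot.plz.starts_with(prefix));
        let time_ok = self
            .time_range
            .map_or(true, |(from, to)| (from..=to).contains(&slot.start_unix));
        plz_ok && time_ok
    }
}

/// Returns at most `max_results` matching slots, in store order.
pub fn run_query<'a>(slots: &'a [Slot], query: &SlotQuery) -> Vec<&'a Slot> {
    slots
        .iter()
        .filter(|slot| query.matches(slot))
        .take(query.max_results)
        .collect()
}
```

Matching on a PLZ prefix rather than a full postcode lets a patient widen or narrow the search area (for example `"80"` versus `"80331"`) without revealing anything more precise than the prefix.
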
### Design Decisions
- Extends announce.rs capability bitfield with CAP_FAPP_THERAPIST (0x0100), CAP_FAPP_RELAY (0x0200), CAP_FAPP_PATIENT (0x0400)
- Uses same signing pattern as MeshAnnounce: hop_count excluded from signature, forwarding nodes don't re-sign
- CBOR wire format consistent with existing envelope/announce code
- Location hint is PLZ only (e.g. "80331") — never exact address
- Anti-spam: Approbation hash binding, signature verification, sequence-based dedup, rate limiting, TTL enforcement
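
For reference, the three FAPP capability bits listed above compose like any other bitfield. The flag values come from this log; the `u16` width and the helper function names are assumptions made for the sketch.

```rust
// Flag values as listed in the design decisions above.
pub const CAP_FAPP_THERAPIST: u16 = 0x0100;
pub const CAP_FAPP_RELAY: u16 = 0x0200;
pub const CAP_FAPP_PATIENT: u16 = 0x0400;

/// True if every bit of `flag` is set in `caps` (hypothetical helper).
pub fn has_cap(caps: u16, flag: u16) -> bool {
    caps & flag == flag
}

/// Returns `caps` with `flag` added (hypothetical helper).
pub fn with_cap(caps: u16, flag: u16) -> u16 {
    caps | flag
}

/// Lists the FAPP roles advertised in a capability field.
pub fn fapp_roles(caps: u16) -> Vec<&'static str> {
    let mut roles = Vec::new();
    if has_cap(caps, CAP_FAPP_THERAPIST) {
        roles.push("therapist");
    }
    if has_cap(caps, CAP_FAPP_RELAY) {
        roles.push("relay");
    }
    if has_cap(caps, CAP_FAPP_PATIENT) {
        roles.push("patient");
    }
    roles
}
```

Because the FAPP bits sit in the upper byte, they can be combined with flags in the lower byte without collision, assuming the existing announce flags occupy only that lower byte.
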
### FAPP integration — status
**2026-04-01: FappRouter implemented!**
New `fapp_router.rs` module:
- `FappAction` enum: Ignore, Dropped, Forward, QueryResponse
- Wire format: 1-byte tag (0x01-0x05) + CBOR body
- `FappRouter` struct with shared `RoutingTable` + `TransportManager`
- `handle_incoming()` decodes and dispatches FAPP frames
- `process_slot_announce()` with relay/flood logic (dedup, hop check, store, forward)
- `process_slot_query()` answers from local `FappStore`
- `broadcast_announce()` / `send_query()` for outbound floods
- `drain_pending_sends()` for async send integration
- 3 unit tests passing
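
The wire format above (1-byte tag followed by a CBOR body) can be sketched as a split/build pair. The assignment of tags 0x01 to 0x05 to the five message types in listing order is an assumption, as are the `FappTag` and `FrameError` names; the CBOR body is treated as opaque bytes here.

```rust
/// Assumed tag assignment: the five FAPP message types in the order
/// they are listed in this log. The real mapping may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FappTag {
    SlotAnnounce = 0x01,
    SlotQuery = 0x02,
    SlotResponse = 0x03,
    SlotReserve = 0x04,
    SlotConfirm = 0x05,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FrameError {
    Empty,
    UnknownTag(u8),
}

impl FappTag {
    pub fn from_byte(byte: u8) -> Option<FappTag> {
        match byte {
            0x01 => Some(FappTag::SlotAnnounce),
            0x02 => Some(FappTag::SlotQuery),
            0x03 => Some(FappTag::SlotResponse),
            0x04 => Some(FappTag::SlotReserve),
            0x05 => Some(FappTag::SlotConfirm),
            _ => None,
        }
    }
}

/// Prepends the tag byte to an already-encoded CBOR body.
pub fn build_frame(tag: FappTag, cbor_body: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(1 + cbor_body.len());
    frame.push(tag as u8);
    frame.extend_from_slice(cbor_body);
    frame
}

/// Splits a frame into its tag and the still-encoded CBOR body.
pub fn split_frame(frame: &[u8]) -> Result<(FappTag, &[u8]), FrameError> {
    let (&first, body) = frame.split_first().ok_or(FrameError::Empty)?;
    let tag = FappTag::from_byte(first).ok_or(FrameError::UnknownTag(first))?;
    Ok((tag, body))
}
```

A dispatcher such as `handle_incoming()` would presumably match on the tag and only then decode the body, so unknown tags can be rejected before any CBOR parsing happens.
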
**Remaining steps**
1. **Integration test:** Multi-node demo (therapist → relay → patient flow)
2. **Wire to P2pNode:** Add `FappRouter` to `start_with_mesh()` or similar
3. **SlotReserve/SlotConfirm:** E2E encrypted reservation flow
4. **LoRa test:** Verify FAPP over constrained links
**Definition of done**
- announce → query → response works over multi-hop (automated or manual)
- SlotReserve/Confirm E2E encryption works
- LoRa test or documented blocker
---
## 2026-03-30 — Mesh Protocol Infrastructure Sprint
### Completed (Latest)
- **KeyPackage distribution** — `keypackage_cache.rs` + `mesh_protocol.rs`
  - MeshAnnounce extended with `keypackage_hash` field
  - KeyPackageRequest/Response/Unavailable messages
  - KeyPackageCache with TTL, per-address limits, LRU eviction
- **Transport capability negotiation** — `transport.rs` TransportCapability
  - Auto-classification: Unconstrained/Medium/Constrained/SeverelyConstrained
  - CryptoMode recommendation per capability level
  - TransportManager.recommended_crypto(), select_for_size()
- **MLS-Lite upgrade path** — `crypto_negotiation.rs`
  - GroupCryptoState tracks current mode
  - MlsLiteBootstrap derives MLS-Lite keys from MLS epoch secret
  - Enables same group to use full MLS on WiFi, MLS-Lite on LoRa
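
One way the capability classification and per-level crypto recommendation could fit together is sketched below. Only the four capability level names come from this log. The bitrate thresholds, the `CryptoMode` variants, and the level-to-mode mapping are all hypothetical and chosen purely for illustration; the real `transport.rs` may classify on different inputs.

```rust
/// Capability levels as named in this log, ordered from weakest to strongest link.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TransportCapability {
    SeverelyConstrained,
    Constrained,
    Medium,
    Unconstrained,
}

/// Hypothetical crypto modes; the real `CryptoMode` variants are not shown in this log.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CryptoMode {
    MlsLite,
    MlsClassical,
    MlsPqHybrid,
}

/// Classifies a link by nominal bitrate. The thresholds are placeholders.
pub fn classify(bitrate_bps: u64) -> TransportCapability {
    match bitrate_bps {
        0..=999 => TransportCapability::SeverelyConstrained,
        1_000..=99_999 => TransportCapability::Constrained,
        100_000..=9_999_999 => TransportCapability::Medium,
        _ => TransportCapability::Unconstrained,
    }
}

/// Assumed mapping: lighter crypto on weaker links, PQ hybrid only where
/// multi-kilobyte KeyPackages are cheap to send.
pub fn recommended_crypto(cap: TransportCapability) -> CryptoMode {
    match cap {
        TransportCapability::SeverelyConstrained | TransportCapability::Constrained => {
            CryptoMode::MlsLite
        }
        TransportCapability::Medium => CryptoMode::MlsClassical,
        TransportCapability::Unconstrained => CryptoMode::MlsPqHybrid,
    }
}
```

Deriving `Ord` on the capability enum makes it straightforward to pick the weakest link in a group and recommend a mode that every member can afford.
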
### Previously Completed
- **S4: Multi-hop routing** — `MeshRouter` with `send()`, `handle_incoming()`, `forward()`, `drain_store_for()`
- **S4: REPL commands** — `/mesh trace <address>` and `/mesh stats`
- **S5: Truncated addresses** — `MeshEnvelopeV2` with 16-byte addresses (~18% smaller)
- **MLS-Lite** — Lightweight symmetric mode for constrained links (`mls_lite.rs`)
- **Size measurements** — Actual MLS and envelope sizes benchmarked
### Actual Measured Sizes (Key Finding!)
| Component | Size | LoRa SF12 fragments |
|-----------|------|---------------------|
| MLS KeyPackage | 306 bytes | 6 |
| MLS Welcome | 840 bytes | 17 |
| MLS-Lite (no sig) | 129 bytes | 3 |
| MLS-Lite (with sig) | 262 bytes | 6 |
| MeshEnvelope V1 | 410 bytes | 9 |
| MeshEnvelope V2 | 336 bytes | 7 |
| MLS KeyPackage (PQ hybrid) | 2,676 bytes | 53 |
**Key insight:** Classical MLS is LoRa-viable: a KeyPackage fits in 6 fragments, and group setup takes ~14 sec at a 1% duty cycle. PQ hybrid remains impractical (53 fragments for a KeyPackage alone).
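
The fragment column in the table can be reproduced with a ceiling division. The 51-byte per-fragment payload is not stated in this log; it is inferred here because it is the only whole-byte payload size that yields all seven fragment counts. The function and constant names are illustrative.

```rust
/// Inferred from the table above, not taken from the code base:
/// 51 bytes is the only integer payload size consistent with every row.
pub const FRAGMENT_PAYLOAD_BYTES: usize = 51;

/// Number of fragments needed for a message of `size` bytes (ceiling division).
pub fn fragments_needed(size: usize) -> usize {
    (size + FRAGMENT_PAYLOAD_BYTES - 1) / FRAGMENT_PAYLOAD_BYTES
}

/// (component, size in bytes) pairs copied from the table above.
pub const MEASURED_SIZES: [(&str, usize); 7] = [
    ("MLS KeyPackage", 306),
    ("MLS Welcome", 840),
    ("MLS-Lite (no sig)", 129),
    ("MLS-Lite (with sig)", 262),
    ("MeshEnvelope V1", 410),
    ("MeshEnvelope V2", 336),
    ("MLS KeyPackage (PQ hybrid)", 2676),
];

/// Fragment counts for every measured component, in table order.
pub fn fragment_table() -> Vec<(&'static str, usize)> {
    MEASURED_SIZES
        .iter()
        .map(|&(name, size)| (name, fragments_needed(size)))
        .collect()
}
```

Note that the 306-byte KeyPackage is exactly 6 × 51 bytes, so a single extra byte would push it to a seventh fragment.
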
### What's Next
1. KeyPackage distribution over mesh (announce-based)
2. Transport capability negotiation
3. Real hardware testing (LoRa boards)
4. MLS-Lite upgrade path to full MLS
---
## 2026-03-30 — Mesh Protocol Gap Analysis
### Completed
- Created `docs/plans/mesh-protocol-gaps.md` — honest assessment of QuicProChat vs. Reticulum/Meshtastic/Briar
- Created `docs/src/design-rationale/mesh-protocol-comparison.md` — technical comparison document
- Updated `docs/positioning.md` — sharper messaging + honest limitations
### Key Insight
QuicProChat has **best-in-class crypto** AND **viable mesh efficiency** (for classical MLS). PQ hybrid mode needs constrained-link fallback.
### Open Design Questions
- How to distribute KeyPackages over mesh without a server?
- Should we implement LXMF compatibility for Reticulum interop?
---
## 2026-03-30 — Sprint 6: LoRa transport & integration demo
### Completed
- Added `transport_lora.rs`:
  - `LoRaConfig`
  - Semtech-style airtime estimate
  - `DutyCycleTracker` (rolling 1 h window, `eu868_one_percent()`)
  - `LoRaMockMedium` + `LoRaTransport` implementing `MeshTransport` (`lora` name for `TransportManager`)
  - LR framing with automatic fragmentation/reassembly
  - Tests: mock roundtrip, fragmentation, duty accounting, `split_for_mtu`
- Example `mesh_lora_relay_demo`: A (LoRa mock) → B (relay) → C (TCP) and reply path; `scripts/mesh-demo.sh` runs it.
- Wired `pub mod transport_lora` in `lib.rs`.
- Adjusted `cbor_smaller_than_json` to assert CBOR is materially smaller than JSON (fixed overhead dominates; a strict half-JSON threshold failed on current envelope sizes).
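
A Semtech-style airtime estimate like the one mentioned above is usually built on the time-on-air formula from the SX127x datasheet. The sketch below implements that formula; the `LoRaParams` struct and its field names are assumptions, and the real `LoRaConfig` may be organised differently.

```rust
/// Hypothetical parameter set; not the actual `LoRaConfig`.
#[derive(Debug, Clone, Copy)]
pub struct LoRaParams {
    pub spreading_factor: u32,        // 7..=12
    pub bandwidth_hz: f64,            // e.g. 125_000.0
    pub coding_rate: u32,             // 1..=4, meaning 4/5..4/8
    pub preamble_symbols: u32,        // typically 8
    pub explicit_header: bool,
    pub crc: bool,
    pub low_data_rate_optimize: bool, // usually on for SF11/SF12 at 125 kHz
}

/// Time on air in milliseconds for one packet with `payload_len` bytes,
/// following the SX127x datasheet formula.
pub fn airtime_ms(p: &LoRaParams, payload_len: u32) -> f64 {
    let sf = f64::from(p.spreading_factor);
    let t_sym_ms = f64::from(1u32 << p.spreading_factor) / p.bandwidth_hz * 1000.0;
    let t_preamble_ms = (f64::from(p.preamble_symbols) + 4.25) * t_sym_ms;

    let implicit_header = if p.explicit_header { 0.0 } else { 1.0 };
    let crc = if p.crc { 1.0 } else { 0.0 };
    let de = if p.low_data_rate_optimize { 1.0 } else { 0.0 };

    let numerator =
        8.0 * f64::from(payload_len) - 4.0 * sf + 28.0 + 16.0 * crc - 20.0 * implicit_header;
    let denominator = 4.0 * (sf - 2.0 * de);
    let coded = (numerator / denominator).ceil() * (f64::from(p.coding_rate) + 4.0);
    let payload_symbols = 8.0 + coded.max(0.0);

    t_preamble_ms + payload_symbols * t_sym_ms
}

/// Minimum spacing between packet starts that keeps the long-run
/// duty cycle at or below `duty` (0.01 for a 1% limit).
pub fn min_interval_ms(airtime_ms: f64, duty: f64) -> f64 {
    airtime_ms / duty
}
```

With SF12 at 125 kHz, coding rate 4/5, an 8-symbol preamble, explicit header, CRC, and low data rate optimisation enabled, a 51-byte payload comes out at roughly 2.47 s on air, which is why duty accounting matters so much on this link.
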
### What's Next
- Optional: UART-backed `LoRaTransport` behind a feature flag (modem-specific framing).
- Hardware runbook: replace mock medium with RNode / SX1262 serial when available.
## 2026-03-30 — Sprint 3: Announce & Discovery Protocol
### Completed
- Created `MeshAnnounce` struct with Ed25519 signed announcements, CBOR wire format, hop forwarding
- Created `compute_address()` — SHA-256 truncation of identity key to 16-byte mesh address
- Created `RoutingTable` with `RoutingEntry` — keyed by 16-byte address, supports lookup by address or full key, TTL-based expiry, sequence-based stale rejection
- Created `AnnounceDedup` for loop prevention (address+sequence deduplication)
- Created `AnnounceConfig` with sensible defaults (10 min interval, 30 min max age, 8 max hops)
- Created `create_announce()` and `process_received_announce()` — complete announce processing pipeline (verify, expiry check, dedup, routing update, propagation decision)
- Capability flags: CAP_RELAY, CAP_STORE, CAP_GATEWAY, CAP_CONSTRAINED
- Tests: 17 tests across 3 modules covering signature verification, tampering, forwarding, expiry, dedup, routing updates, stale rejection, CBOR roundtrip, address determinism
- Updated lib.rs with `announce`, `announce_protocol`, `routing_table` modules
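
The routing-table behaviour described above (keyed by 16-byte address, TTL-based expiry, sequence-based stale rejection) can be sketched with a plain `HashMap`. The struct layout and method names below are assumptions rather than the real `RoutingTable` API, and `now` is passed in explicitly to keep the sketch deterministic.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

pub type MeshAddress = [u8; 16];

/// Hypothetical, reduced routing entry.
#[derive(Debug, Clone)]
pub struct RoutingEntry {
    pub sequence: u64,
    pub hop_count: u8,
    pub expires_at: Instant,
}

pub struct RoutingTable {
    entries: HashMap<MeshAddress, RoutingEntry>,
    ttl: Duration,
}

impl RoutingTable {
    pub fn new(ttl: Duration) -> Self {
        RoutingTable { entries: HashMap::new(), ttl }
    }

    /// Accepts an announce only if it is newer than the live entry for `addr`.
    /// Returns `false` for stale or duplicate sequence numbers.
    pub fn update(&mut self, addr: MeshAddress, sequence: u64, hop_count: u8, now: Instant) -> bool {
        if let Some(existing) = self.entries.get(&addr) {
            if existing.expires_at > now && sequence <= existing.sequence {
                return false;
            }
        }
        let entry = RoutingEntry { sequence, hop_count, expires_at: now + self.ttl };
        self.entries.insert(addr, entry);
        true
    }

    /// Looks up a route, treating expired entries as absent.
    pub fn lookup(&self, addr: &MeshAddress, now: Instant) -> Option<&RoutingEntry> {
        self.entries.get(addr).filter(|entry| entry.expires_at > now)
    }

    /// Drops expired entries and returns how many were removed.
    pub fn purge_expired(&mut self, now: Instant) -> usize {
        let before = self.entries.len();
        self.entries.retain(|_, entry| entry.expires_at > now);
        before - self.entries.len()
    }
}
```

In this sketch an expired entry no longer blocks an older sequence number, so a node that restarts and resets its counter becomes routable again once its previous announce has aged out. Whether the real table behaves this way is not stated in this log.
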
### What's Next
- S4: Multi-Hop Routing
- Integrate announce protocol with TransportManager for actual broadcast/receive loops
- Add tokio async announce loop (periodic re-announce, GC timer)
### Notes
- Signature excludes `hop_count` (same design as MeshEnvelope) so forwarding doesn't break verification
- Protocol engine uses free functions rather than a stateful struct — simpler, more testable
- Cannot run `cargo test` in this environment (no C toolchain / linker available)
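
The first note above (the signature excludes `hop_count`) comes down to which bytes are fed to the signer. The sketch below shows only that byte selection, with no real Ed25519 calls; the `Announce` layout and the big-endian field encoding are assumptions, and the real code signs a CBOR-based representation.

```rust
/// Hypothetical, reduced announce; the real `MeshAnnounce` has more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Announce {
    pub address: [u8; 16],
    pub sequence: u64,
    pub capabilities: u16,
    pub hop_count: u8,
    /// Opaque signature bytes produced by the originator.
    pub signature: Vec<u8>,
}

impl Announce {
    /// Bytes covered by the signature. `hop_count` is deliberately left out,
    /// so relays can change it without invalidating the signature.
    pub fn signable_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(16 + 8 + 2);
        out.extend_from_slice(&self.address);
        out.extend_from_slice(&self.sequence.to_be_bytes());
        out.extend_from_slice(&self.capabilities.to_be_bytes());
        out
    }

    /// What a relay sends on: same signed content, hop count incremented.
    pub fn forwarded(&self) -> Announce {
        let mut next = self.clone();
        next.hop_count = next.hop_count.saturating_add(1);
        next
    }
}
```

The trade-off is that `hop_count` is unauthenticated: a relay can lie about it, so it should only be used for loop limiting and route preference, never for trust decisions.
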
## 2026-03-30 — Sprint 2: Transport Abstraction Layer
### Completed
- Created `MeshTransport` trait with `send`, `recv`, `discover`, `close` methods
- Created `TransportAddr` enum for transport-agnostic addressing (Iroh, Socket, LoRa, Serial, Raw)
- Created `TransportInfo` struct for transport capability metadata
- Implemented `IrohTransport` wrapping iroh `Endpoint` with same length-prefixed framing as `P2pNode`
- Implemented `TcpTransport` using tokio `TcpListener`/`TcpStream` with length-prefixed framing
- Implemented `TransportManager` for multi-transport routing based on address type
- Added `async-trait` dependency, enabled tokio `net` + `io-util` features
- Tests: TransportAddr Display formatting, TCP roundtrip, TransportManager routing, error cases
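
Length-prefixed framing, as used by `IrohTransport` and `TcpTransport` above, can be sketched over blocking `std::io` streams. The actual transports use tokio's async I/O; the 4-byte big-endian prefix and the `MAX_FRAME_LEN` cap are assumptions, since the log does not state the prefix width.

```rust
use std::io::{self, Read, Write};

/// Hypothetical upper bound, so a corrupt length cannot trigger a huge allocation.
pub const MAX_FRAME_LEN: u32 = 1 << 20;

/// Writes one frame: 4-byte big-endian length, then the payload.
pub fn write_frame<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "payload too large"));
    }
    let len = payload.len() as u32;
    writer.write_all(&len.to_be_bytes())?;
    writer.write_all(payload)
}

/// Reads one frame, failing on a truncated stream or an oversized length.
pub fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    reader.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf);
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len as usize];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

Because both functions are generic over `Read` and `Write`, the same framing can be exercised against an in-memory buffer in tests and a `TcpStream` in a blocking prototype.
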
### What's Next
- S3: Announce & Discovery Protocol
- Future: integrate transport layer into `HybridRouter` / replace direct iroh usage
### Notes
- New transport layer sits alongside existing `P2pNode` — no breaking changes
- `IrohTransport` uses separate ALPN (`quicprochat/mesh/1`) to avoid conflicts with `P2pNode`
- Cannot run `cargo test`/`cargo clippy` in this environment (no Rust toolchain installed)