feat: Phase 9 — developer experience, extensibility, and community growth

New crates:
- quicproquo-bot: Bot SDK with polling API + JSON pipe mode
- quicproquo-kt: Key Transparency Merkle log (RFC 9162 subset)
- quicproquo-plugin-api: no_std C-compatible plugin vtable API
- quicproquo-gen: scaffolding tool (qpq-gen plugin/bot/rpc/hook)

Server features:
- ServerHooks trait wired into all RPC handlers (enqueue, fetch, auth,
  channel, registration) with plugin rejection support
- Dynamic plugin loader (libloading) with --plugin-dir config
- Delivery proof canary tokens (Ed25519 server signatures on enqueue)
- Key Transparency Merkle log with inclusion proofs on resolveUser

Core library:
- Safety numbers (60-digit HMAC-SHA256 key verification codes)
- Verifiable transcript archive (CBOR + ChaCha20-Poly1305 + hash chain)
- Delivery proof verification utility
- Criterion benchmarks (hybrid KEM, MLS, identity, sealed sender, padding)

Client:
- /verify REPL command for out-of-band key verification
- Full-screen TUI via Ratatui (feature-gated --features tui)
- qpq export / qpq export-verify CLI subcommands
- KT inclusion proof verification on user resolution

Also: ROADMAP Phase 9 added, bot SDK docs, server hooks docs,
crate-responsibilities updated, example plugins (rate_limit, logging).
2026-03-03 22:47:38 +01:00
parent b6483dedbc
commit dc4e4e49a0
62 changed files with 6959 additions and 62 deletions
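
The commit message mentions Key Transparency inclusion proofs based on an RFC 9162 subset. As a reading aid, here is a minimal Python sketch of the RFC 9162 inclusion-proof check (section 2.1.3.2) over SHA-256. It is not the quicproquo-kt code, which is Rust and may differ in detail; every function name below is hypothetical and chosen for illustration only.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # RFC 9162: leaf hashes are domain-separated with a 0x00 prefix.
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 9162: interior nodes are domain-separated with a 0x01 prefix.
    return hashlib.sha256(b"\x01" + left + right).digest()


def _split(n: int) -> int:
    # Largest power of two strictly less than n (n >= 2).
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(leaves: list[bytes]) -> bytes:
    if not leaves:
        return hashlib.sha256(b"").digest()
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = _split(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def inclusion_path(index: int, leaves: list[bytes]) -> list[bytes]:
    # Sibling hashes from the leaf up to the root (test helper for the demo).
    if len(leaves) == 1:
        return []
    k = _split(len(leaves))
    if index < k:
        return inclusion_path(index, leaves[:k]) + [merkle_root(leaves[k:])]
    return inclusion_path(index - k, leaves[k:]) + [merkle_root(leaves[:k])]


def verify_inclusion(index: int, tree_size: int, leaf: bytes,
                     path: list[bytes], root: bytes) -> bool:
    if index >= tree_size:
        return False
    fn, sn = index, tree_size - 1
    r = leaf_hash(leaf)
    for p in path:
        if sn == 0:
            return False
        if fn & 1 or fn == sn:
            r = node_hash(p, r)
            # Skip levels where this node had no right sibling.
            while fn != 0 and not fn & 1:
                fn >>= 1
                sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root
```

A caller would obtain `path` and `root` from the log operator and compute `leaf` locally, so a proof for a substituted key fails the final comparison.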


@@ -4,6 +4,7 @@
quicproquo AI Team
==================
A multi-agent Claude team specialised for the quicproquo Rust workspace.
Agents cover development, security, testing, documentation, and infrastructure.
Usage:
python scripts/ai_team.py "<task>" # orchestrator
@@ -12,6 +13,7 @@ Usage:
python scripts/ai_team.py --parallel \\
"rust-server-dev: Fix unwrap() in server" \\
"security-auditor: Audit quicproquo-core" # ad-hoc parallel
python scripts/ai_team.py --pipeline # full production readiness pipeline
python scripts/ai_team.py --list-agents
python scripts/ai_team.py --list-sprints
@@ -19,6 +21,8 @@ Requires:
pip install claude-agent-sdk
The ANTHROPIC_API_KEY environment variable must be set.
Team specification: docs/AGENT-TEAM.md
"""
import argparse
@@ -292,6 +296,86 @@ After writing tests, run them with Bash and report:
tools=["Read", "Glob", "Grep", "Edit", "Write", "Bash"],
),
"devops-engineer": AgentDefinition(
description=(
"Infrastructure and deployment engineer for quicproquo. Owns Docker, CI/CD "
"(GitHub Actions), deployment configs, cross-compilation, monitoring setup, "
"release automation, and binary size optimisation. Edits docker/, .github/, "
"docker-compose.yml, and infrastructure scripts."
),
prompt=f"""{PROJECT_CONTEXT}
You are the **DevOps Engineer** for quicproquo.
You own: `docker/`, `.github/`, `docker-compose.yml`, deployment configs, CI pipelines.
Responsibilities:
- Docker image builds: multi-stage, minimal final image, non-root user, security hardening.
- GitHub Actions CI: build matrix, test parallelism, caching, artifact publishing.
- Release automation: cargo-release workflow, CHANGELOG generation, version tagging.
- Cross-compilation: musl static builds for x86_64, armv7, aarch64 (OpenWrt targets).
- Monitoring: Prometheus metrics endpoint stub, health check endpoint.
- Infrastructure-as-code: docker-compose for dev/staging, systemd unit files.
Before any edit:
1. Read the target file in full.
2. Check ROADMAP.md Phase 1.3, 1.4, 2.3 for infrastructure items.
3. Test Docker builds with `docker build -f docker/Dockerfile .`
Quality gates:
- Docker image builds successfully.
- CI pipeline syntax is valid (check with `act --dryrun` if available).
- No secrets in Dockerfile ARG/ENV, no running as root in final stage.
- `.gitignore` covers all sensitive file patterns (*.der, *.pem, *.db, *.bin, *.ks).
""",
tools=["Read", "Glob", "Grep", "Edit", "Write", "Bash"],
),
"docs-engineer": AgentDefinition(
description=(
"Technical writer for quicproquo. Writes and maintains user guides, operator "
"documentation, API references, architecture docs, SECURITY.md, CONTRIBUTING.md, "
"and the mdBook site in docs/. Ensures all public APIs have doc comments. "
"Edits docs/, README.md, and inline doc comments only."
),
prompt=f"""{PROJECT_CONTEXT}
You are the **Documentation Engineer** for quicproquo.
You own: `docs/`, `README.md`, `CONTRIBUTING.md`, `SECURITY.md`, and inline `///` doc
comments on public API items.
Documentation tiers (in priority order):
1. **User docs** — Getting started, installation, REPL commands, configuration reference.
2. **Operator docs** — Deployment guide (Docker, systemd), certificate setup, backup/restore,
monitoring, operational runbook, troubleshooting.
3. **Developer docs** — Architecture overview, crate responsibilities, contribution guide,
coding standards, testing guide, PR review checklist.
4. **Protocol docs** — Wire format reference, Cap'n Proto schema docs, MLS integration,
Noise transport spec, federation protocol.
5. **Security docs** — Threat model, trust boundaries, key lifecycle, responsible disclosure
policy, audit report summaries.
Before any edit:
1. Read the target file and any related source code to ensure accuracy.
2. Check the mdBook structure in `docs/book.toml` and `docs/src/SUMMARY.md`.
3. Verify code examples compile (`cargo test --doc` for inline examples).
Quality gates:
- `mdbook build docs/` succeeds without warnings.
- All internal links resolve (no broken cross-references).
- No stale information — verify claims against current source code.
- Spelling and grammar are correct.
Style:
- Write for an audience of experienced developers who may not know Rust.
- Use active voice, present tense.
- Include code examples where they clarify usage.
- Cross-reference related docs sections with relative links.
""",
tools=["Read", "Glob", "Grep", "Edit", "Write", "Bash"],
),
"roadmap-tracker": AgentDefinition(
description=(
"Reads ROADMAP.md and the codebase to determine: which milestones are complete, "
@@ -314,10 +398,10 @@ Steps:
Output format (Markdown):
## Roadmap Status Report
### Completed
### Completed
- Phase X, item Y: ...
### In Progress 🔄
### In Progress
- Phase X, item Y: partial — what exists vs what's missing.
### Next Actionable Tasks (prioritised)
@@ -422,8 +506,193 @@ SPRINTS: dict[str, list[tuple[str, str]]] = {
"key material, any new logging that might leak secrets, and any new external inputs that "
"lack validation. Produce a concise finding report."),
],
# ── Documentation sprints ─────────────────────────────────────────────────
"docs-foundation": [
("docs-engineer",
"Create a root-level SECURITY.md file for quicproquo. Include: "
"(1) Responsible disclosure policy — where to report vulnerabilities (email, PGP key if available). "
"(2) Scope — what's covered (server, client, core crypto, protocol). "
"(3) Response timeline — acknowledge within 48h, triage within 7 days, fix within 30 days for critical. "
"(4) Security contact — project maintainer contact info. "
"(5) Out-of-scope — social engineering, DoS against test instances, etc. "
"Read existing docs/SECURITY-AUDIT.md for context on known security posture. "
"Keep it concise and professional. Follow the format used by major open-source crypto projects."),
("docs-engineer",
"Create a root-level CONTRIBUTING.md file for quicproquo. Read the existing guidance in "
"docs/src/contributing/coding-standards.md and docs/src/contributing/testing.md first. "
"Include: (1) Development setup (Rust toolchain, Cap'n Proto compiler, SQLCipher). "
"(2) Building the project (cargo build --workspace, feature flags). "
"(3) Running tests (cargo test --workspace, E2E with --test-threads 1). "
"(4) PR process (branch naming, conventional commits, review checklist). "
"(5) Coding standards summary (link to full docs). "
"(6) Security requirements for contributions (no unwrap on crypto, zeroize, etc). "
"Keep it actionable — a new contributor should be able to submit a PR after reading this."),
("docs-engineer",
"Write a comprehensive operator deployment guide at docs/src/getting-started/deployment.md. "
"Read the existing docs/src/getting-started/ pages and docker/Dockerfile first. "
"Cover: (1) Docker deployment (docker-compose, volume mounts, env vars). "
"(2) Bare-metal deployment (systemd unit file example, user/group setup). "
"(3) TLS certificate setup (self-signed for dev, Let's Encrypt for prod). "
"(4) Configuration reference (all QPQ_* environment variables). "
"(5) Backup and restore (SQLite/SQLCipher database, key material). "
"(6) Monitoring (structured log output, health checks). "
"(7) Troubleshooting common issues. "
"Update docs/src/SUMMARY.md to include the new page if needed."),
("docs-engineer",
"Audit all existing docs/src/ pages for accuracy against the current codebase. "
"Read each .md file in docs/src/ and cross-reference claims against actual source code. "
"Fix: (1) Stale API references (function names, struct names that changed). "
"(2) Broken internal links between docs pages. "
"(3) Outdated architecture descriptions (e.g. references to MessagePack, old ALPN strings). "
"(4) Missing entries in docs/src/SUMMARY.md for pages that exist. "
"Produce a list of all changes made and any issues you couldn't fix."),
],
"docs-api": [
("docs-engineer",
"Ensure every public API item in quicproquo-core has a doc comment (/// or //!). "
"Read crates/quicproquo-core/src/lib.rs to find all pub exports. For each pub fn, "
"pub struct, pub enum, and pub trait: check if it has a doc comment. If missing, "
"read the implementation to understand what it does, then add a concise doc comment "
"with: one-line summary, parameters, return value, errors, and a short example where "
"appropriate. Run `cargo doc -p quicproquo-core --no-deps` to verify."),
("docs-engineer",
"Document all Cap'n Proto schemas in schemas/. For each .capnp file (auth.capnp, "
"delivery.capnp, federation.capnp, node.capnp): read the schema and the Rust "
"implementation that uses it. Write or update docs/src/wire-format/ pages with: "
"(1) Purpose of each interface. (2) Method signatures with parameter semantics. "
"(3) Error conditions. (4) Example message flows (e.g. auth flow, message send flow). "
"Ensure docs/src/wire-format/overview.md links to all sub-pages."),
],
# ── Infrastructure sprints ────────────────────────────────────────────────
"infra-hardening": [
("devops-engineer",
"Fix the Dockerfile at docker/Dockerfile for production readiness. Read it first. "
"Changes needed: (1) Create a dedicated non-root user 'qpq' (not nobody) with a "
"specific UID/GID. (2) Set QPQ_DATA_DIR=/var/lib/qpq with correct ownership. "
"(3) Handle the excluded p2p crate correctly in workspace build. "
"(4) Add HEALTHCHECK instruction. (5) Use specific base image tags (not :latest). "
"(6) Ensure COPY commands don't pull in .git, target/, logs/, or test data. "
"Test with: docker build -f docker/Dockerfile ."),
("devops-engineer",
"Harden .gitignore at project root. Read the current .gitignore first. Add missing "
"patterns: data/, *.der, *.pem, *.db, *.db-shm, *.db-wal, *.bin, *.ks, "
"qpq-state.*, logs/ai_team/, .env, .env.*, *.key. "
"Verify no sensitive files are already tracked: run git ls-files for each pattern. "
"If any are tracked, report them (do NOT remove from git without confirmation)."),
("devops-engineer",
"Enhance CI pipeline at .github/workflows/ci.yml. Read it first. Add or verify: "
"(1) cargo fmt check passes. (2) cargo clippy --workspace -- -D warnings. "
"(3) cargo test --workspace (with --test-threads 1 for E2E). "
"(4) cargo deny check runs on every PR. (5) cargo audit as blocking check. "
"(6) Docker build validation job (docker build -f docker/Dockerfile .). "
"(7) Rust cache action for faster builds. (8) Matrix for stable + nightly Rust. "
"Also check .github/CODEOWNERS is correctly configured for crypto paths."),
],
# ── Security sprints ──────────────────────────────────────────────────────
"security-full": [
("security-auditor",
"Perform a FULL security audit of the entire quicproquo codebase. Read every .rs file "
"in crates/quicproquo-core/src/, crates/quicproquo-server/src/, and "
"crates/quicproquo-client/src/. Check every file for ALL of: "
"(1) .unwrap()/.expect() outside #[cfg(test)] on crypto, I/O, lock, or parse operations. "
"(2) Key material types missing Zeroize/ZeroizeOnDrop. "
"(3) Secret bytes (keys, passwords, tokens, nonces) potentially reaching tracing/log/println. "
"(4) Non-constant-time comparisons on auth tags, tokens, MACs, or passwords. "
"(5) panic!/unreachable! in production paths. "
"(6) unsafe blocks without // SAFETY: documentation. "
"(7) Missing input validation on RPC boundaries (data from network). "
"(8) Race conditions in shared state (DashMap, Mutex, RwLock). "
"(9) Replay attack vectors in message delivery. "
"(10) Timing side channels in OPAQUE or token validation. "
"Produce a prioritised finding report: Critical > High > Medium > Low > Informational. "
"Each finding must include: file:line, description, attack scenario, remediation."),
("security-auditor",
"Audit infrastructure security. Read docker/Dockerfile, docker-compose.yml, "
".github/workflows/ci.yml, and all files in scripts/. Check: "
"(1) Dockerfile: running as root? secrets in ENV/ARG? base image pinned? "
"(2) docker-compose: volumes expose host paths? ports exposed unnecessarily? "
"(3) CI: secrets handled correctly? artifact permissions? supply chain attacks? "
"(4) Scripts: command injection? path traversal? unsafe eval? "
"(5) Dependencies: check deny.toml config, look for unmaintained/yanked crates. "
"Produce a separate infrastructure security report."),
("security-auditor",
"Review the threat model at docs/src/cryptography/threat-model.md against the current "
"implementation. Read the threat model doc, then verify each claim: "
"(1) Are the stated trust boundaries correctly implemented in code? "
"(2) Does the OPAQUE flow match the documented auth model? "
"(3) Is the Noise_XX handshake configured as documented (XX pattern, not IK/KK)? "
"(4) Does the MLS integration follow RFC 9420 as claimed? "
"(5) Is the hybrid KEM combiner implemented as documented (HKDF-SHA256 with correct info string)? "
"(6) Are there attack vectors NOT covered by the threat model? "
"Produce a threat model gap analysis report."),
],
"security-review": [
("security-auditor",
"Post-change security review. Read all modified files from the most recent work. "
"Focus on: any new .unwrap()/.expect() introduced, new code paths handling key material, "
"new logging that might leak secrets, new external inputs lacking validation, and "
"any new unsafe blocks. Compare against the engineering standards in master-prompt.md. "
"Produce a concise pass/fail report with findings."),
("roadmap-tracker",
"Quick progress check after recent changes. Read ROADMAP.md and check which Phase 1 "
"and Phase 2 items have been completed by the recent work. Update the status report "
"with: items newly completed, items still in progress, next priorities."),
],
# ── Release preparation ───────────────────────────────────────────────────
"release-prep": [
("devops-engineer",
"Prepare release infrastructure. Read Cargo.toml (workspace root) and all crate "
"Cargo.toml files. (1) Verify version numbers are consistent across all crates. "
"(2) Create or update CHANGELOG.md at project root — read git log for recent commits "
"and categorise by: Added, Changed, Fixed, Security. Follow keepachangelog.com format. "
"(3) Verify docker/Dockerfile builds successfully with release profile. "
"(4) Check that cargo package -p quicproquo-server would succeed (dry run). "
"(5) Verify .github/workflows/ci.yml has a release/tag-triggered job if applicable."),
("docs-engineer",
"Final documentation review for release readiness. "
"(1) Verify README.md: feature matrix matches actual implementation, quick start "
"instructions work, crate layout is accurate, all badges are correct. "
"(2) Verify docs/src/getting-started/ pages are up to date. "
"(3) Check that SECURITY.md and CONTRIBUTING.md exist and are accurate. "
"(4) Run mdbook build docs/ and verify no warnings. "
"(5) Produce a docs readiness report: pass/fail with specific issues found."),
("roadmap-tracker",
"Final pre-release status report. Read ROADMAP.md completely. Classify every item as: "
"Complete (implemented + tested), Deferred (not blocking release), or Blocking (must fix "
"before release). Focus on Phase 1 (Production Hardening) — all items must be Complete "
"or have documented mitigations. Produce a release readiness assessment."),
],
}
# ── Production readiness pipeline ─────────────────────────────────────────────
# Ordered list of sprints that form the full production readiness path.
# Each sprint must pass its quality gate before the next begins.
# Agents within a sprint run in parallel; sprints run sequentially.
PIPELINE: list[tuple[str, str]] = [
("status", "Baseline: assess current state and recent security posture"),
("audit", "Deep dive: full security audit + detailed roadmap analysis"),
("phase1-hardening", "Code: eliminate crash paths across all crates (parallel by crate)"),
("phase1-infra", "Infra: fix Dockerfile, .gitignore, design TLS lifecycle"),
("infra-hardening", "Infra: CI hardening, Docker production config, .gitignore completion"),
("phase2-tests", "Tests: E2E coverage, unit tests for untested paths"),
("docs-foundation", "Docs: SECURITY.md, CONTRIBUTING.md, deployment guide, accuracy audit"),
("docs-api", "Docs: public API doc comments, Cap'n Proto schema documentation"),
("security-full", "Security: comprehensive audit of all code + infra + threat model"),
("security-review", "Gate: post-change security review + progress check"),
("release-prep", "Release: changelog, version consistency, final docs review"),
]
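
Both `list_sprints()` and `run_pipeline()` index `SPRINTS[sprint_name]` directly, so a pipeline entry with no matching sprint would surface as a `KeyError` only at run time. A minimal sketch of an up-front consistency check follows. The helper `missing_sprints` and the toy data are hypothetical and are not part of this commit.

```python
def missing_sprints(
    pipeline: list[tuple[str, str]],
    sprints: dict[str, list[tuple[str, str]]],
) -> list[str]:
    """Return pipeline sprint names that have no entry in `sprints`, in order."""
    return [name for name, _ in pipeline if name not in sprints]


# Toy stand-ins for SPRINTS and PIPELINE (illustrative values only).
TOY_SPRINTS = {
    "docs-api": [("docs-engineer", "Document the public API")],
    "security-review": [("security-auditor", "Review recent changes")],
}
TOY_PIPELINE = [
    ("docs-api", "Docs: public API doc comments"),
    ("security-review", "Gate: post-change security review"),
    ("release-prep", "Release: changelog and version consistency"),
]

unknown = missing_sprints(TOY_PIPELINE, TOY_SPRINTS)
if unknown:
    print(f"Pipeline references undefined sprints: {', '.join(unknown)}")
```

Running such a check at import time or in a unit test would catch a renamed sprint before a long pipeline run reaches the broken step.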
# ── Orchestrator system prompt ─────────────────────────────────────────────────
@@ -433,28 +702,40 @@ You are the **Orchestrator** for the quicproquo AI development team.
Your team of specialist subagents:
| Agent | Role |
|-------|------|
| rust-architect | Architecture design, ADRs, design reviews |
| rust-core-dev | quicproquo-core crate: crypto, MLS, Noise codec |
| rust-server-dev | quicproquo-server crate: AS, DS, RPC server |
| rust-client-dev | quicproquo-client crate: CLI, REPL, local state |
| security-auditor | Security review: unwrap(), zeroize, secrets in logs |
| test-engineer | Unit/integration tests, cargo test runs |
| roadmap-tracker | Roadmap progress assessment |
| Agent | Role | Edits? |
|-------|------|--------|
| rust-architect | Architecture design, ADRs, design reviews | No |
| rust-core-dev | quicproquo-core: crypto, MLS, Noise codec | Yes |
| rust-server-dev | quicproquo-server: AS, DS, RPC server | Yes |
| rust-client-dev | quicproquo-client: CLI, REPL, local state | Yes |
| security-auditor | Security review: code, infra, threat model | No |
| test-engineer | Unit, integration, E2E tests | Yes (tests) |
| devops-engineer | Docker, CI/CD, deployment, monitoring | Yes (infra) |
| docs-engineer | User/operator/developer/protocol docs | Yes (docs) |
| roadmap-tracker | Roadmap progress assessment | No |
Parallelisation rules:
- Agents that own DIFFERENT crates or concern areas can run in parallel.
- rust-core-dev, rust-server-dev, rust-client-dev ALWAYS run in parallel (different crates).
- security-auditor runs AFTER code-changing agents complete (reads their output).
- test-engineer runs AFTER code-changing agents complete (tests their changes).
- docs-engineer and devops-engineer can run in parallel with each other and with dev agents.
- roadmap-tracker can run in parallel with anything (read-only).
Workflow:
1. Read the task carefully.
2. Decide which agent(s) are needed. For multi-step tasks, sequence them logically.
3. Call each required agent with a precise, scoped prompt.
4. Synthesise the agents' outputs into a final report or code deliverable.
5. Always end with: "Next suggested task: ..." based on the ROADMAP.
3. Maximise parallelism: launch agents that touch different files simultaneously.
4. Call each required agent with a precise, scoped prompt.
5. Synthesise the agents' outputs into a final report or code deliverable.
6. Always end with: "Next suggested task: ..." based on the ROADMAP.
Rules:
- Read master-prompt.md and ROADMAP.md before delegating significant tasks.
- Do NOT delegate everything to one agent — split by crate/concern.
- If a task touches security, always invoke security-auditor after code changes.
- If a task adds/modifies functionality, always invoke test-engineer last.
- If a task touches security, always invoke security-auditor AFTER code changes.
- If a task adds/modifies functionality, always invoke test-engineer LAST.
- docs-engineer and devops-engineer work independently — launch them in parallel.
- Keep your synthesis concise — prefer structured output (headers, bullet lists).
"""
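
The parallelisation rules in the orchestrator prompt amount to "parallel within a group, sequential across groups". The sketch below illustrates that scheduling shape with `asyncio.gather`. The coroutine `run_agent` is a hypothetical stand-in that only records events; it does not call the Claude agent SDK, and the real `run_parallel` in this script may be structured differently.

```python
import asyncio


async def run_agent(name: str, task: str, log: list[tuple[str, str]]) -> str:
    # Stand-in for a real agent call: record start, yield once, record finish.
    log.append(("start", name))
    await asyncio.sleep(0)  # give sibling agents in the same group a turn
    log.append(("done", name))
    return f"{name}: {task}"


async def run_groups(
    groups: list[list[tuple[str, str]]],
    log: list[tuple[str, str]],
) -> list[str]:
    results: list[str] = []
    for agent_tasks in groups:
        # Agents within one group run concurrently...
        batch = await asyncio.gather(
            *(run_agent(agent, task, log) for agent, task in agent_tasks)
        )
        # ...and the next group starts only after the whole batch finishes.
        results.extend(batch)
    return results


def demo() -> tuple[list[str], list[tuple[str, str]]]:
    log: list[tuple[str, str]] = []
    groups = [
        [("rust-core-dev", "fix unwrap()"), ("rust-server-dev", "fix unwrap()")],
        [("security-auditor", "review the changes")],
    ]
    results = asyncio.run(run_groups(groups, log))
    return results, log
```

In the event log returned by `demo()`, both dev agents start before either finishes, and the auditor starts only after both are done, which matches the "security-auditor runs AFTER code-changing agents" rule.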
@@ -673,6 +954,17 @@ def build_parser() -> argparse.ArgumentParser:
action="store_true",
help="List predefined sprints and exit",
)
parser.add_argument(
"--pipeline",
action="store_true",
help="Run the full production readiness pipeline (all sprints in dependency order)",
)
parser.add_argument(
"--pipeline-from",
metavar="SPRINT",
default=None,
help="Start the pipeline from a specific sprint (skip earlier steps)",
)
parser.add_argument(
"--max-turns",
type=int,
@@ -707,6 +999,12 @@ def list_sprints() -> None:
print(f" [{agent}] {preview}")
print()
print("Production readiness pipeline (--pipeline):\n")
for i, (sprint_name, description) in enumerate(PIPELINE, 1):
count = len(SPRINTS[sprint_name])
print(f" {i:2d}. {sprint_name:<20s} {count} agent(s) — {description}")
print()
def parse_parallel_args(args: list[str]) -> list[tuple[str, str]]:
"""
@@ -733,6 +1031,62 @@ def parse_parallel_args(args: list[str]) -> list[tuple[str, str]]:
return pairs
# ── Pipeline runner ────────────────────────────────────────────────────────────
async def run_pipeline(
max_turns: int,
verbose: bool,
start_from: str | None = None,
) -> None:
"""
Run the full production readiness pipeline: all sprints in dependency order.
Each sprint runs its agents in parallel. Sprints run sequentially because
later sprints depend on earlier ones (e.g. security-review after code changes).
If start_from is set, skip all sprints before that one.
"""
pipeline = list(PIPELINE)
if start_from:
names = [name for name, _ in PIPELINE]
if start_from not in names:
print(f"ERROR: unknown sprint {start_from!r} in pipeline.")
print(f" Valid: {', '.join(names)}")
sys.exit(1)
idx = names.index(start_from)
pipeline = pipeline[idx:]
print(f"\n Skipping {idx} sprint(s), starting from: {start_from}\n")
total = len(pipeline)
print(f"\n{'=' * 70}")
print(f" quicproquo AI Team — Production Readiness Pipeline")
print(f" Steps: {total} | Max turns per agent: {max_turns}")
print(f"{'=' * 70}")
for i, (name, desc) in enumerate(pipeline, 1):
count = len(SPRINTS[name])
print(f" {i:2d}. [{name}] {count} agent(s) — {desc}")
print(f"{'=' * 70}\n")
for step, (sprint_name, description) in enumerate(pipeline, 1):
print(f"\n{'#' * 70}")
print(f" PIPELINE STEP {step}/{total}: {sprint_name}")
print(f" {description}")
print(f"{'#' * 70}\n")
agent_tasks = SPRINTS[sprint_name]
await run_parallel(
agent_tasks, max_turns, verbose, sprint_name=sprint_name
)
if step < total:
print(f"\n Step {step}/{total} complete. Proceeding to next step...\n")
print(f"\n{'=' * 70}")
print(f" PIPELINE COMPLETE — {total} steps executed")
print(" Review outputs in: logs/ai_team/")
print(f"{'=' * 70}\n")
# ── Entry point ────────────────────────────────────────────────────────────────
async def main() -> None:
@@ -752,7 +1106,12 @@ async def main() -> None:
sys.exit(1)
try:
if args.sprint:
if args.pipeline or args.pipeline_from:
await run_pipeline(
args.max_turns, args.verbose, start_from=args.pipeline_from
)
elif args.sprint:
agent_tasks = SPRINTS[args.sprint]
await run_parallel(
agent_tasks, args.max_turns, args.verbose, sprint_name=args.sprint