diff --git a/skills/review/SKILL.md b/skills/review/SKILL.md new file mode 100644 index 0000000..5777d26 --- /dev/null +++ b/skills/review/SKILL.md @@ -0,0 +1,141 @@ +--- +name: review +description: | + Review-only mode. Run Guardian + optional reviewers on an existing diff or branch, + without any Plan/Do orchestration. The highest-ROI mode for catching design-level bugs. + User: "af-review" + User: "Review the last commit" + User: "af-review --reviewers guardian,skeptic" +--- + +# ArcheFlow Review Mode + +Run reviewers on existing code changes without orchestrating implementation. +This is the most cost-effective mode — it delivers Guardian's error-path analysis +without the Maker overhead. + +## When to Use + +- After you've implemented something and want a quality check +- On a PR or branch before merging +- When the sprint runner flags a task as DONE_WITH_CONCERNS +- As a pre-commit quality gate for complex changes + +## Invocation + +``` +af-review # Review uncommitted changes +af-review --branch feat/batch-api # Review branch diff against main +af-review --commit HEAD~3..HEAD # Review last 3 commits +af-review --reviewers guardian,skeptic,sage # Choose reviewers (default: guardian) +af-review --evidence # Enable evidence-gating (stricter) +``` + +--- + +## Execution + +### Step 1: Get the Diff + +```bash +# Uncommitted changes +DIFF=$(git diff HEAD) + +# Branch diff +DIFF=$(git diff main...HEAD) + +# Commit range +DIFF=$(git diff HEAD~3..HEAD) + +# If diff is too large (>500 lines), split by file +if [[ $(echo "$DIFF" | wc -l) -gt 500 ]]; then + # Review per-file to keep context focused + FILES=$(git diff --name-only HEAD) +fi +``` + +### Step 2: Spawn Reviewers + +Default: Guardian only (fastest, highest ROI). +With `--reviewers`: spawn requested reviewers in parallel. + +**Guardian** (always first): +``` +Agent( + description: "Guardian: review changes for ", + prompt: "You are the GUARDIAN archetype — security and risk reviewer. 
+ + Review this diff for: security vulnerabilities, error handling gaps, + data loss scenarios, race conditions, and breaking changes. + + For each finding: cite specific code (file:line), state what you tested + or observed, state what the correct behavior should be. + + Diff: + + + STATUS: DONE | DONE_WITH_CONCERNS | NEEDS_CONTEXT | BLOCKED", + subagent_type: "code-reviewer" +) +``` + +**Skeptic** (if requested): +- Focus: hidden assumptions, edge cases, scalability +- Context: diff + any design docs + +**Sage** (if requested): +- Focus: code quality, test coverage, maintainability +- Context: diff + surrounding code + +**Trickster** (if requested): +- Focus: adversarial inputs, failure injection, chaos testing +- Context: diff only + +### Step 3: Collect and Report + +Parse each reviewer's output. Show findings: + +``` +── af-review: ─────────────────────── +Reviewers: guardian, skeptic + +🛡️ Guardian: 2 findings (1 HIGH, 1 MEDIUM) + [HIGH] Timeout marks variant as done — loses batch state (fanout.py:552) + [MEDIUM] No JSON error handling on corrupted state (batch.py:310) + +🤔 Skeptic: 1 finding (1 INFO) + [INFO] hash() non-deterministic across processes (fanout.py:524) + +Total: 3 findings (1 HIGH, 1 MEDIUM, 1 INFO) +──────────────────────────────────────────────── +``` + +### Step 4: Evidence Gate (if --evidence) + +When `--evidence` is active, apply the evidence requirements from `archeflow:check-phase`: +- Scan findings for banned phrases ("might be", "could potentially", etc.) 
+- Check for evidence markers (exit codes, line numbers, reproduction steps) +- Downgrade unsupported findings to INFO + +--- + +## Integration with Sprint Runner + +The sprint runner can invoke `af-review` automatically: + +| Sprint trigger | Review action | +|----------------|--------------| +| Task marked DONE_WITH_CONCERNS | Run Guardian on the agent's changes | +| Task is L/XL estimate | Run Guardian + Skeptic after completion | +| Task involves security keywords | Run Guardian automatically | +| User requests | Run specified reviewers | + +--- + +## Cost + +Review-only is 60-80% cheaper than full PDCA: +- No Explorer research (~30% of PDCA cost) +- No Creator planning (~20% of PDCA cost) +- No Maker implementation (already done) +- Only reviewer token costs remain diff --git a/skills/sprint/SKILL.md b/skills/sprint/SKILL.md new file mode 100644 index 0000000..3e70123 --- /dev/null +++ b/skills/sprint/SKILL.md @@ -0,0 +1,269 @@ +--- +name: sprint +description: | + Workspace sprint runner. Reads queue.json, spawns parallel agent teams across projects, + manages lifecycle (commit, push, next task), tracks progress. The main operational mode + for ArcheFlow in multi-project workspaces. + User: "af-sprint" + User: "Run the sprint" + User: "af-sprint --slots 5 --dry-run" +--- + +# Workspace Sprint Runner + +Read the task queue, spawn parallel agents across projects, collect results, commit+push, +spawn next batch. Repeat until the queue is drained or budget is exhausted. + +## When to Use + +This is the **primary operational mode** for ArcheFlow in multi-project workspaces. +Use it when the user says "run the sprint", "work the queue", "go autonomous", or +invokes `af-sprint`. + +Do NOT use `archeflow:run` for individual tasks within a sprint — the sprint runner +handles task dispatch internally, using `archeflow:run` only when a task warrants +full PDCA orchestration. 
+ +## Prerequisites + +- `docs/orchestra/queue.json` — task queue (managed by `./scripts/ws`) +- `./scripts/ws` — workspace CLI for queue operations +- Each project is a separate git repo under the workspace root + +## Invocation + +``` +af-sprint # Run sprint with defaults (4 slots, AUTONOM mode) +af-sprint --slots 5 # Max 5 parallel agents +af-sprint --dry-run # Show what would run, don't execute +af-sprint --priority P0,P1 # Only process P0 and P1 items +af-sprint --project writing.colette # Only process items for this project +``` + +--- + +## Execution Protocol + +### Step 0: Orient + +```bash +# Load queue and workspace state +QUEUE=$(cat docs/orchestra/queue.json) +MODE=$(echo "$QUEUE" | jq -r '.mode') +``` + +Check mode: +- `AUTONOM` → proceed without asking +- `ATTENDED` → show plan, wait for user approval before each batch +- `PAUSED` → report status only, do not start tasks + +Show one-line status: +``` +sprint: AUTONOM · 7 pending (1×P0, 1×P2, 5×P3) · 4 slots +``` + +### Step 1: Select Batch + +Pick tasks for the next batch. Rules: + +1. **Priority cascade**: P0 first, then P1, then P2. Never start P3 unless user explicitly includes it. +2. **Dependency check**: Skip tasks whose `depends_on` items aren't all `completed`. +3. **One agent per project**: Never run two tasks on the same project simultaneously. +4. **Cost-aware concurrency**: + - Estimate task cost from `estimate` field: S=cheap, M=moderate, L=expensive, XL=very expensive + - **Expensive tasks** (L, XL): max 2 concurrent + - **Cheap tasks** (S, M): fill remaining slots + - Target mix: 1-2 expensive + 2-3 cheap = 4-5 total +5. **Slot limit**: Never exceed `--slots` (default 4). 
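Rule 2 is the only rule that has to look at other queue items, so it helps to see it in isolation. The following is a minimal sketch, not the runner's actual implementation. It assumes each queue item is a dict with `id`, `status`, and an optional `depends_on` list of ids. Only `depends_on` and the `completed` status come from the rules above; the `id` key, the dict layout, and the two-argument signature are assumptions for illustration.

```python
from typing import Iterable, Mapping


def deps_satisfied(task: Mapping, items: Iterable[Mapping]) -> bool:
    """Return True when every id in task['depends_on'] refers to a completed item.

    A dependency id that does not exist in the queue counts as unsatisfied,
    so a typo in depends_on keeps the task out of the batch instead of
    letting it run too early.
    """
    status_by_id = {item["id"]: item.get("status") for item in items}
    return all(
        status_by_id.get(dep) == "completed"
        for dep in (task.get("depends_on") or [])
    )


# Hypothetical queue items, for illustration only.
items = [
    {"id": "t1", "status": "completed"},
    {"id": "t2", "status": "pending", "depends_on": ["t1"]},
    {"id": "t3", "status": "pending", "depends_on": ["t1", "t2"]},
]

# Only t2 is ready: t3 still waits on t2, which is pending.
ready = [
    t["id"]
    for t in items
    if t["status"] == "pending" and deps_satisfied(t, items)
]
```

Treating unknown ids as unsatisfied is a deliberately conservative choice in this sketch; a real queue validator might prefer to reject such entries outright.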
+ +```python +# Pseudocode for batch selection +batch = [] +used_projects = set() +expensive_count = 0 + +for priority in ["P0", "P1", "P2"]: + for task in queue_items(priority, status="pending"): + if len(batch) >= MAX_SLOTS: + break + if task.project in used_projects: + continue # One agent per project + if not deps_satisfied(task): + continue + if task.estimate in ("L", "XL"): + if expensive_count >= 2: + continue + expensive_count += 1 + batch.append(task) + used_projects.add(task.project) +``` + +### Step 2: Assess and Dispatch + +For each task in the batch, decide the execution strategy: + +| Signal | Strategy | What happens | +|--------|----------|-------------| +| Estimate S, clear scope | **Direct** | Spawn Agent() with task description, no orchestration | +| Estimate M, multi-file | **Pipeline** | Spawn Agent() with af-run --strategy pipeline | +| Estimate L/XL, complex | **PDCA** | Spawn Agent() with af-run --strategy pdca | +| Task contains "validate", "test", "lint", "check" | **Direct** | Cheap analytical task, no orchestration | +| Task contains "review", "audit", "security" | **Review** | Spawn Guardian + relevant reviewers only | + +**Agent spawn template:** + +For each task in the batch, spawn an Agent in the SAME message (parallel dispatch): + +``` +Agent( + description: ": ", + prompt: "You are working on project at . 
+ Task: + + + Rules: + - Read the project's CLAUDE.md first + - Commit with: git -c user.signingkey=/home/c/.ssh/id_ed25519_dev.pub commit + - NO Co-Authored-By trailers + - Conventional commits + - Push when done: GIT_SSH_COMMAND='ssh -i /home/c/.ssh/id_ed25519_dev -o IdentitiesOnly=yes' git push origin main + - Run tests if the project has them + - Report: what you did, what changed, any blockers + + STATUS: DONE | DONE_WITH_CONCERNS | NEEDS_CONTEXT | BLOCKED", + subagent_type: "general-purpose", + isolation: "worktree" # Only for L/XL tasks; S/M tasks run directly +) +``` + +**CRITICAL: Spawn all batch agents in a SINGLE message.** This enables parallel execution. +Do not spawn them sequentially. + +### Step 3: Mark Running + +After spawning, update the queue: + +```bash +# For each spawned task +./scripts/ws start # or manually update queue.json status to "running" +``` + +If `./scripts/ws start` doesn't exist, update queue.json directly: +```python +task["status"] = "running" +# Write back to docs/orchestra/queue.json +``` + +### Step 4: Collect Results + +As agents complete, process their results: + +1. **Parse status token** from agent output (last line: `STATUS: DONE|...`) +2. **Based on status**: + - `DONE` → mark completed, note result + - `DONE_WITH_CONCERNS` → mark completed, log concerns for user review + - `NEEDS_CONTEXT` → mark pending, add concern to notes, skip for now + - `BLOCKED` → mark failed, add blocker to notes +3. 
**Update queue**: + ```bash + ./scripts/ws done -r "" + # or + ./scripts/ws fail -r "" + ``` + +### Step 5: Report and Loop + +After batch completes, show sprint status: + +``` +── Sprint Batch 1 ────────────────────────────── + ✓ writing.colette fanout run done (45s) + ✓ book.3sets validation done (30s) + △ book.sos meta-book concept needs_context (missing outline) + ✓ tool.archeflow af-review mode done (60s) + +Queue: 3 completed, 1 blocked, 3 remaining +Next batch: 2 items ready +──────────────────────────────────────────────── +``` + +Then **immediately select and dispatch the next batch** (Step 1). Don't wait for user input in AUTONOM mode. + +### Step 6: Sprint Complete + +When no more tasks are schedulable (all done, blocked, or P3-only): + +1. Update `docs/control-center.md` Handoff section +2. Run `./scripts/ws log --summary ""` if available +3. Show final sprint report: + +``` +── Sprint Complete ───────────────────────────── +Duration: 12 min +Tasks: 5 completed, 1 blocked, 1 remaining (P3) +Projects touched: 4 +Commits: 7 +──────────────────────────────────────────────── +``` + +--- + +## Mode Behavior + +### AUTONOM +- Dispatch immediately, no user confirmation +- Commit + push after each agent completes +- Only pause for BLOCKED tasks or budget exhaustion +- Report between batches (one-line status) + +### ATTENDED +- Show the selected batch before dispatching +- Wait for user to approve: "Proceed with this batch? [y/n]" +- After each batch, show results and ask: "Continue to next batch? [y/n/edit]" +- "edit" lets the user reprioritize before next batch + +### PAUSED +- Show queue status only +- Do not dispatch any agents +- Useful for reviewing state between sessions + +--- + +## When to Use ArcheFlow Orchestration Within Sprint + +Most sprint tasks should be **direct agent dispatch** (no PDCA/pipeline overhead). 
+Only escalate to full orchestration when: + +| Signal | Action | +|--------|--------| +| Task is S/M, clear scope, single project | Direct dispatch | +| Task is L/XL | Use pipeline or PDCA strategy | +| Task mentions "security", "auth", "encryption" | Add Guardian review | +| Task is a review/audit | Spawn reviewers only (af-review mode) | +| Task failed in a previous sprint | Escalate to PDCA with Explorer | + +The sprint runner's job is **throughput**, not perfection. Ship fast, fix forward. + +--- + +## Integration with Existing Tools + +| Tool | How sprint uses it | +|------|-------------------| +| `./scripts/ws next` | Get next schedulable task | +| `./scripts/ws done ` | Mark task completed | +| `./scripts/ws fail ` | Mark task failed | +| `./scripts/ws orient` | Initial workspace overview | +| `./scripts/ws validate` | Pre-flight queue validation | +| `git` per project | Commit + push after each agent | +| `archeflow:run` | Only for L/XL tasks needing PDCA | + +--- + +## Error Recovery + +- **Agent crashes mid-task**: Mark task as `failed`, add error to notes, continue with next batch +- **Git push fails**: Log the error, do NOT retry. User will handle push conflicts manually. +- **Queue file corrupted**: Run `./scripts/ws validate`. If invalid, stop sprint and report. +- **Budget exceeded**: Stop sprint, report remaining tasks and estimated cost. +- **All tasks blocked**: Report dependency graph, suggest which blockers to resolve first.
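
The crash rule above and the status handling in Step 4 can be combined into one small decision function. The sketch below is illustrative only: `resolve_outcome`, `STATUS_TO_QUEUE`, and the note strings are hypothetical names, and treating a missing or unrecognized status token the same way as a crash is an assumption, not something the skill text specifies.

```python
import re

# Status tokens from the agent templates, mapped to the queue status
# that Step 4 assigns to each of them.
STATUS_TO_QUEUE = {
    "DONE": "completed",
    "DONE_WITH_CONCERNS": "completed",
    "NEEDS_CONTEXT": "pending",
    "BLOCKED": "failed",
}

# The status token is expected on the last line, e.g. "STATUS: DONE".
STATUS_LINE = re.compile(r"^STATUS:\s*([A-Z_]+)\s*$")


def resolve_outcome(agent_output):
    """Return (queue_status, note) for one agent's raw output.

    Assumption: empty output or a missing/unknown status token is handled
    like a crash, so the task is marked failed and the sprint moves on.
    """
    if not agent_output or not agent_output.strip():
        return "failed", "agent produced no output (treated as crash)"

    last_line = agent_output.strip().splitlines()[-1].strip()
    match = STATUS_LINE.match(last_line)
    if match is None or match.group(1) not in STATUS_TO_QUEUE:
        return "failed", f"no valid status token in last line: {last_line!r}"

    token = match.group(1)
    return STATUS_TO_QUEUE[token], token
```

For example, `resolve_outcome("fixed the parser\nSTATUS: DONE")` yields `("completed", "DONE")`, while `resolve_outcome(None)` yields a `failed` status with an explanatory note that can be passed to `./scripts/ws fail`.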