--- name: run description: | Start an ArcheFlow PDCA run. Usage: /af-run [--workflow fast|standard|thorough] [--dry-run] [--start-from plan|do|check|act] --- # ArcheFlow Run — PDCA Orchestration One command runs the full cycle: Plan (Explorer+Creator) -> Do (Maker in worktree) -> Check (Guardian first, then others) -> Act (collect findings, route fixes, exit or cycle). ## 0. Initialize 1. Generate run ID: `-` 2. Create artifact directory: `mkdir -p .archeflow/artifacts/` 3. Verify `./lib/archeflow-*.sh` scripts exist before proceeding 4. Inject cross-run memory: `./lib/archeflow-memory.sh inject "$DOMAIN" "" --audit "$RUN_ID"` 5. Read `.archeflow/config.yaml` models section. Resolution order: per-workflow per-archetype > per-workflow default > per-archetype > global default. 6. Emit `run.start` event ### Strategy Selection Determine strategy from CLI flag `--strategy`, config `strategy:` field, or auto-detect: | Signal | Strategy | |--------|----------| | Task contains fix/bug/patch/hotfix | `pipeline` | | Task contains refactor/redesign/review | `pdca` | | Workflow is `fast` with single file | `pipeline` | | Workflow is `thorough` | `pdca` | | Default | `pdca` | If `pipeline`, skip to the Pipeline section at the end. Otherwise continue with PDCA below. ### Workflow Selection | Signal | Workflow | Max Cycles | |--------|----------|------------| | Small fix, low risk, single concern | `fast` | 1 | | Feature, multiple files, moderate risk | `standard` | 2 | | Security-sensitive, breaking changes, public API | `thorough` | 3 | --- ## Attention Filters Each agent receives only what it needs. This is the canonical reference: | Archetype | Receives | Excludes | |-----------|----------|----------| | Explorer | Task description, codebase access | Prior proposals, reviews, diffs | | Creator | Task + Explorer output (+ feedback cycle 2+) | Raw files, diffs, reviewer outputs | | Maker | Creator's proposal (+ Maker-routed feedback cycle 2+) | Explorer research, reviewer outputs | | Guardian | Maker's diff + proposal risk section | Full proposal, Explorer research | | Skeptic | Creator's proposal (assumptions focus) | Diff details, Explorer research | | Sage | Proposal + diff + implementation summary | Explorer research, other reviews | | Trickster | Maker's diff only | Everything else | --- ## Status Token Protocol Every agent ends output with `STATUS: `. Parse it to decide the next action. | Status | Action | |--------|--------| | `DONE` | Proceed to next phase | | `DONE_WITH_CONCERNS` | Log concerns, proceed | | `NEEDS_CONTEXT` | Pause, request info from user | | `BLOCKED` | Abort phase, report blocker | If no status token found, default to `DONE`. --- ## 1. Plan Phase ### 1a. Explorer (standard/thorough only) ``` Agent( description: "Explorer: research context for ", prompt: "You are the EXPLORER archetype. Research: 1) affected files/functions, 2) dependencies, 3) test coverage, 4) codebase patterns. Write a structured research report. Be thorough but focused — no rabbit holes.", subagent_type: "Explore" ) ``` Save output to `.archeflow/artifacts//plan-explorer.md`. ### 1b. Creator Fast workflow (no Explorer): Creator must perform Mini-Reflect first: 1. Restate the task in one sentence 2. List 3 assumptions 3. Name the highest-damage risk ``` Agent( description: "Creator: design proposal for ", prompt: "You are the CREATOR archetype. > > Design a proposal: 1. Architecture decisions (with rationale) 2. Files to create/modify (exact paths, specific changes, 2-5 min per item) 3. Alternatives considered (2+, with rejection rationale) 4. Test strategy (specific test cases) 5. Confidence table (task understanding, solution completeness, risk coverage — each 0.0-1.0) 6. Risks and mitigations Be decisive — ship a clear plan, not a menu.", subagent_type: "Plan" ) ``` Save output to `.archeflow/artifacts//plan-creator.md`. ### 1c. Confidence Gate (Rule A3) Parse the `### Confidence` table from `plan-creator.md`. If unparseable, default to 0.0 (triggers gate). | Axis | Score < 0.5 | Action | |------|-------------|--------| | Task understanding | Pause | Ask user for clarification. Do not spawn Maker. | | Solution completeness | Upgrade | If fast -> standard. Spawn Explorer, re-run Creator. | | Risk coverage | Mini-Explorer | Spawn focused Explorer for risky areas (5 min max, parallel with Do prep). | Mini-Explorer prompt: "You are the EXPLORER. Risk coverage scored . Identified risks: . Research ONLY the risky areas. Is the risk real? What mitigations exist?" Save to `plan-mini-explorer.md`. --- ## 2. Do Phase ### 2a. Maker ``` Agent( description: "Maker: implement ", prompt: "You are the MAKER archetype. Implement this proposal: > Rules: 1. Follow the proposal exactly — don't redesign 2. Write tests for every behavioral change 3. Commit with descriptive messages (CRITICAL: uncommitted worktree changes are LOST) 4. Run existing tests — nothing may break 5. If unclear, implement best interpretation and note it Self-review before finishing: - All proposal files changed? Tests added? No out-of-scope files? Existing tests pass?", isolation: "worktree", mode: "bypassPermissions" ) ``` Save summary to `do-maker.md`. Save changed file list to `do-maker-files.txt`. ### 2b. Test-First Gate Check `do-maker-files.txt` for test files. If none found and domain is not `writing`: - If Creator specified a test strategy: re-run Maker with targeted test instruction (1 retry within Do, not a full cycle) - If no test strategy: emit WARNING, proceed --- ## 3. Check Phase Spawn Guardian FIRST. Evaluate Rule A2 before spawning other reviewers. ### 3a. Guardian (always first) ``` Agent( description: "Guardian: security review for ", prompt: "You are the GUARDIAN archetype. Review changes in branch: Assess: security, reliability, breaking changes, dependencies. Output: APPROVED or REJECTED with findings table. Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix | Be rigorous but practical — flag real risks, not theoretical ones." ) ``` Save to `check-guardian.md`. ### 3b. Guardian Fast-Path (Rule A2) If Guardian found **0 CRITICAL and 0 WARNING** AND workflow is not escalated AND not first cycle of thorough: - Skip remaining reviewers, proceed directly to Act phase - Log "Guardian fast-path taken" ### 3c. Remaining Reviewers (spawn in parallel) **Skeptic** (standard/thorough) — receives Creator's proposal, focus on assumptions. Save to `check-skeptic.md`. **Sage** (standard/thorough) — receives proposal + diff + implementation summary. Save to `check-sage.md`. **Trickster** (thorough only) — receives Maker's diff only. "Think like a QA engineer paid per bug." Save to `check-trickster.md`. ### 3d. Evidence Validation After all reviewers complete, scan CRITICAL/WARNING findings. Downgrade to INFO if: - **Banned phrases** without evidence: "might be", "could potentially", "appears to", "seems like", "may not" - **No evidence**: no command output, code citation, line reference, or reproduction steps Track downgrades in events (do NOT modify artifact files). Act phase excludes downgraded findings from CRITICAL tallies. --- ## 4. Act Phase ### 4a. Collect Verdicts Read all `check-*.md` artifacts. Tally CRITICAL/WARNING/INFO per reviewer. ### 4b. Escalation Check (Rule A1) If `fast` workflow and Guardian found 2+ CRITICAL: upgrade next cycle to `standard` (add Skeptic + Sage). Once escalated, stays escalated. A2 does not apply to escalated workflows. ### 4c. Convergence Check (cycle 2+ only) Compare current findings against previous cycle. Classify each as NEW / RESOLVED / PERSISTENT / REGRESSED. ``` convergence_score = resolved / (resolved + new + regressed) ``` | Score | Status | Action | |-------|--------|--------| | > 0.8 | Converging | Continue if cycles remain | | 0.5-0.8 | Stalling | Continue with caution | | < 0.5 | Diverging | STOP if 2 consecutive cycles | | 0.0 (all persistent) | Stuck | STOP | **Oscillation**: Finding present in cycle N-2, absent in N-1, back in N. Two or more oscillating findings = STOP and escalate to user. Convergence STOP overrides normal cycle-back even if cycles remain. ### 4d. All Approved 1. Run pre-merge hooks from `.archeflow/hooks.yaml` if defined 2. Merge: `./lib/archeflow-git.sh merge "$RUN_ID" --no-ff` 3. Post-merge test validation: `./lib/archeflow-rollback.sh "$RUN_ID"` (auto-reverts if tests fail) 4. If rollback triggered and cycles remain: cycle back with "integration test failure" feedback 5. Clean up worktree: `./lib/archeflow-git.sh cleanup "$RUN_ID"` 6. Proceed to Completion ### 4e. Issues Found (cycles remaining) Build `act-feedback.md` using the feedback routing table below. Archive current cycle artifacts to `cycle-/`. Increment cycle, go back to Plan. ### 4f. Max Cycles Reached Report unresolved findings. Present best implementation on its branch (not merged). Let user decide. --- ## Feedback Routing Table Route each finding to the right agent for the next cycle: | Source | Category | Routes To | Rationale | |--------|----------|-----------|-----------| | Guardian | security, breaking-change | **Creator** | Design must change | | Guardian | reliability, dependency | **Creator** | Architectural decision needed | | Skeptic | design, scalability | **Creator** | Assumptions need revision | | Sage | quality, consistency | **Maker** | Implementation refinement | | Sage | testing | **Maker** | Test gap, not design flaw | | Trickster | reliability (design flaw) | **Creator** | Needs redesign | | Trickster | reliability (test gap) | **Maker** | Needs more tests | | Trickster | testing | **Maker** | Edge case not covered | **Disambiguation**: If fix requires changing the approach -> Creator. If fix requires changing the code within the existing approach -> Maker. ### Feedback File Format `act-feedback.md` splits into `## Creator-Routed Issues` and `## Maker-Routed Issues`. Inject only the relevant section into each agent's prompt. **Same finding in 2 consecutive cycles**: escalate to user. Do not cycle again blindly. **Cross-archetype dedup**: If two reviewers raise the same issue (same file + category), merge into one finding. Route to higher-priority destination (Creator over Maker). --- ## 5. Completion 1. Emit `run.complete` event 2. Check for regressions: `./lib/archeflow-memory.sh regression-check` 3. Generate report: `./lib/archeflow-report.sh .archeflow/events/.jsonl` 4. Score effectiveness: `./lib/archeflow-score.sh extract .archeflow/events/.jsonl` 5. Append to run index: `.archeflow/events/index.jsonl` --- ## Progress Display ``` ━━━ ArcheFlow Run: ━━━━━━━━━━━━━━━━━━━ Run ID: | Workflow: | Cycle: 1/ [Plan] Explorer researching... -> done (35s) [Plan] Creator designing proposal... -> done (25s, confidence: 0.8) [Do] Maker implementing... -> done (90s, 4 files, 8 tests) [Check] Guardian reviewing... -> APPROVED [Check] Skeptic challenging... -> APPROVED (1 INFO) [Check] Sage reviewing... -> APPROVED [Act] All approved — merging... -> merged to main ━━━ Complete: 3m 10s, 1 cycle ━━━━━━━━━━━━━━━ Artifacts: .archeflow/artifacts// Report: .archeflow/events/.jsonl ``` --- ## Shadow Monitoring Quick-check after each agent completes: | Archetype | Shadow | Trigger | |-----------|--------|---------| | Explorer | Rabbit Hole | Output >2000 words without Recommendation section | | Creator | Over-Architect | >2 new abstractions for one feature | | Maker | Rogue | No tests in changeset, or files outside proposal | | Guardian | Paranoid | CRITICAL:WARNING ratio >2:1, or zero approvals | | Skeptic | Paralytic | >7 challenges, <50% have alternatives | | Trickster | False Alarm | Findings in untouched code, or >10 findings | | Sage | Bureaucrat | Review >2x code change length | On detection: apply correction prompt. On 2nd detection of same shadow: replace agent. On 3+ shadows in one cycle: escalate to user. --- ## Event Reference Emit events via `./lib/archeflow-event.sh ''`. Events are optional — never let logging block orchestration. | When | Event Type | Key Data | |------|-----------|----------| | Run starts | `run.start` | task, workflow, max_cycles | | Before agent spawn | `agent.start` | archetype, model, prompt_summary | | After agent returns | `agent.complete` | archetype, duration_ms, artifacts, summary | | Phase boundary | `phase.transition` | from, to, artifacts_so_far | | Alternative chosen | `decision` | what, chosen, alternatives, rationale | | Orchestrator decision (replay) | `decision.point` | archetype, input, decision, confidence — use `./lib/archeflow-decision.sh` | | Reviewer verdict | `review.verdict` | archetype, verdict, findings[] | | Fix addressing review | `fix.applied` | source, finding, file, line | | End of PDCA cycle | `cycle.boundary` | cycle, max_cycles, exit_condition, convergence | | Shadow triggered | `shadow.detected` | archetype, shadow, trigger, action | | Run ends | `run.complete` | status, cycles, agents_total, fixes_total | Parent rules: `run.start` has `parent: []`. Agents parent to the event that triggered them. Phase transitions fan-in from all completing events. Parallel agents share the same parent. --- ## Artifact Naming All artifacts live in `.archeflow/artifacts//`: | File | Content | |------|---------| | `plan-explorer.md` | Explorer research (missing in fast) | | `plan-creator.md` | Creator proposal with confidence | | `plan-mini-explorer.md` | Risk research (only if A3 triggers) | | `do-maker.md` | Maker implementation summary | | `do-maker-files.txt` | Changed file paths (one per line) | | `check-guardian.md` | Guardian verdict + findings | | `check-skeptic.md` | Skeptic verdict (if spawned) | | `check-sage.md` | Sage verdict (if spawned) | | `check-trickster.md` | Trickster verdict (if spawned) | | `act-feedback.md` | Structured feedback for next cycle | | `act-fixes.jsonl` | Applied fixes log | | `cycle-/` | Archived artifacts from cycle N | Always check artifact existence before injecting. Missing optional artifacts are expected — skip, don't fail. Git diff is generated on-the-fly (`git diff main...`), not saved to disk. --- ## Effectiveness Scoring After each run, `./lib/archeflow-score.sh extract` scores review archetypes on: | Dimension | Weight | |-----------|--------| | Signal-to-noise (useful / total findings) | 0.30 | | Fix rate (findings that led to fixes) | 0.25 | | Cost efficiency (useful findings per dollar) | 0.20 | | Accuracy (not contradicted by others) | 0.15 | | Cycle impact (led to cycle exit) | 0.10 | Scores stored in `.archeflow/memory/effectiveness.jsonl`. After 10+ runs, recommend model tier changes and archetype removal. --- ## Run replay (decision log + what-if) After key choices (routing, fast-path skip, escalation), emit `decision.point` via `./lib/archeflow-decision.sh` so runs can be inspected with `./lib/archeflow-replay.sh timeline|whatif|compare `. Weighted what-if helps estimate how much each review archetype swayed the effective ship/block outcome. See skill `af-replay`. --- ## Dry-Run Mode When `--dry-run`: Run Plan phase only. Display workflow, agent counts, confidence scores, cost estimate. Ask user to proceed. If yes, continue with `--start-from do`. ## Start-From Mode | Start from | Required artifacts | |------------|--------------------| | `plan` | None | | `do` | `plan-creator.md` | | `check` | `plan-creator.md`, `do-maker.md`, `do-maker-files.txt` | | `act` | All `check-*.md` files | Validate required artifacts exist. Error if missing. --- ## Error Handling - **Agent timeout (5 min)**: Emit error event, abort run. Do not retry blindly. - **Event emitter fails**: Log warning, continue. Events are observation, not control flow. - **Artifact write fails**: Blocking — abort and report. Artifacts are required for phase handoff. - **Merge conflict**: Do not force-resolve. Report conflict, leave branch intact, let user decide. --- ## Pipeline Strategy Linear flow with no cycle-back. Use for bug fixes and well-understood single-concern tasks. ``` Plan (Creator only) -> Implement (Maker) -> Spec-Review (Guardian, then Skeptic if findings) -> Quality-Review (Sage) -> Verify (tests + merge) ``` 1. **Plan**: Spawn Creator with Mini-Reflect (no Explorer). Save `plan-creator.md`. 2. **Implement**: Spawn Maker in worktree. Save `do-maker.md`. 3. **Spec-Review**: Guardian first. Skeptic only if Guardian has findings. 4. **Quality-Review**: Sage reviews proposal + diff + summary. 5. **Verify**: Run tests. If pass and 0 CRITICAL: merge. If CRITICAL: one targeted Maker fix, re-review, re-test. If still failing: abort, report branch name for manual resolution. WARNINGs are logged but do not block merge. ``` ━━━ ArcheFlow Pipeline: ━━━━━━━━━━━━━━━━ Run ID: | Strategy: pipeline [Plan] Creator designing... -> done (20s) [Implement] Maker building... -> done (60s, 3 files) [Spec] Guardian reviewing... -> APPROVED [Quality] Sage reviewing... -> APPROVED (1 WARNING) [Verify] Tests passing... -> merged to main ━━━ Complete: 2m 15s ━━━━━━━━━━━━━━━━━━━━━━━━━━ ```