Merge run + orchestration + plan-phase + do-phase + artifact-routing + process-log + attention-filters + convergence + effectiveness into a single 459-line run/SKILL.md. Before: run skill (890 lines) + 3 prerequisites (~1,300 lines) = ~2,200 lines of context. After: one self-contained skill (459 lines) with zero prerequisites. Preserved: PDCA flow, workflow selection, adaptation rules A1-A3, agent prompts, attention filters, feedback routing, convergence detection, effectiveness scoring, shadow monitoring, pipeline strategy, event reference, artifact naming. Removed: verbose bash code blocks, shell variable tracking, resolve_model() function, lib validation loops, evidence validation bash, redundant event emission blocks.
18 KiB
name, description
| name | description |
|---|---|
| run | Start an ArcheFlow PDCA run. Usage: /af-run <task description> [--workflow fast|standard|thorough] [--dry-run] [--start-from plan|do|check|act] |
ArcheFlow Run — PDCA Orchestration
One command runs the full cycle: Plan (Explorer+Creator) -> Do (Maker in worktree) -> Check (Guardian first, then others) -> Act (collect findings, route fixes, exit or cycle).
0. Initialize
- Generate run ID:
<YYYY-MM-DD>-<task-slug> - Create artifact directory:
mkdir -p .archeflow/artifacts/<run_id> - Verify
./lib/archeflow-*.shscripts exist before proceeding - Inject cross-run memory:
./lib/archeflow-memory.sh inject "$DOMAIN" "" --audit "$RUN_ID" - Read
.archeflow/config.yamlmodels section. Resolution order: per-workflow per-archetype > per-workflow default > per-archetype > global default. - Emit
run.startevent
Strategy Selection
Determine strategy from CLI flag --strategy, config strategy: field, or auto-detect:
| Signal | Strategy |
|---|---|
| Task contains fix/bug/patch/hotfix | pipeline |
| Task contains refactor/redesign/review | pdca |
Workflow is fast with single file |
pipeline |
Workflow is thorough |
pdca |
| Default | pdca |
If pipeline, skip to the Pipeline section at the end. Otherwise continue with PDCA below.
Workflow Selection
| Signal | Workflow | Max Cycles |
|---|---|---|
| Small fix, low risk, single concern | fast |
1 |
| Feature, multiple files, moderate risk | standard |
2 |
| Security-sensitive, breaking changes, public API | thorough |
3 |
Attention Filters
Each agent receives only what it needs. This is the canonical reference:
| Archetype | Receives | Excludes |
|---|---|---|
| Explorer | Task description, codebase access | Prior proposals, reviews, diffs |
| Creator | Task + Explorer output (+ feedback cycle 2+) | Raw files, diffs, reviewer outputs |
| Maker | Creator's proposal (+ Maker-routed feedback cycle 2+) | Explorer research, reviewer outputs |
| Guardian | Maker's diff + proposal risk section | Full proposal, Explorer research |
| Skeptic | Creator's proposal (assumptions focus) | Diff details, Explorer research |
| Sage | Proposal + diff + implementation summary | Explorer research, other reviews |
| Trickster | Maker's diff only | Everything else |
Status Token Protocol
Every agent ends output with STATUS: <token>. Parse it to decide the next action.
| Status | Action |
|---|---|
DONE |
Proceed to next phase |
DONE_WITH_CONCERNS |
Log concerns, proceed |
NEEDS_CONTEXT |
Pause, request info from user |
BLOCKED |
Abort phase, report blocker |
If no status token found, default to DONE.
1. Plan Phase
1a. Explorer (standard/thorough only)
Agent(
description: "Explorer: research context for <task>",
prompt: "You are the EXPLORER archetype.
<task description>
Research: 1) affected files/functions, 2) dependencies, 3) test coverage,
4) codebase patterns. Write a structured research report.
Be thorough but focused — no rabbit holes.",
subagent_type: "Explore"
)
Save output to .archeflow/artifacts/<run_id>/plan-explorer.md.
1b. Creator
Fast workflow (no Explorer): Creator must perform Mini-Reflect first:
- Restate the task in one sentence
- List 3 assumptions
- Name the highest-damage risk
Agent(
description: "Creator: design proposal for <task>",
prompt: "You are the CREATOR archetype.
<task description>
<if fast: Perform Mini-Reflect first (restate task, 3 assumptions, top risk)>
<if standard/thorough: Research findings: <plan-explorer.md contents>>
<if cycle 2+: Prior feedback: <Creator-routed section of act-feedback.md>>
Design a proposal:
1. Architecture decisions (with rationale)
2. Files to create/modify (exact paths, specific changes, 2-5 min per item)
3. Alternatives considered (2+, with rejection rationale)
4. Test strategy (specific test cases)
5. Confidence table (task understanding, solution completeness, risk coverage — each 0.0-1.0)
6. Risks and mitigations
<if cycle 2+: 7. How each prior issue was addressed (Fixed/Deferred/Accepted/Disputed)>
Be decisive — ship a clear plan, not a menu.",
subagent_type: "Plan"
)
Save output to .archeflow/artifacts/<run_id>/plan-creator.md.
1c. Confidence Gate (Rule A3)
Parse the ### Confidence table from plan-creator.md. If unparseable, default to 0.0 (triggers gate).
| Axis | Score < 0.5 | Action |
|---|---|---|
| Task understanding | Pause | Ask user for clarification. Do not spawn Maker. |
| Solution completeness | Upgrade | If fast -> standard. Spawn Explorer, re-run Creator. |
| Risk coverage | Mini-Explorer | Spawn focused Explorer for risky areas (5 min max, parallel with Do prep). |
Mini-Explorer prompt: "You are the EXPLORER. Risk coverage scored . Identified risks: . Research ONLY the risky areas. Is the risk real? What mitigations exist?"
Save to plan-mini-explorer.md.
2. Do Phase
2a. Maker
Agent(
description: "Maker: implement <task>",
prompt: "You are the MAKER archetype.
Implement this proposal: <plan-creator.md contents>
<if cycle 2+: Implementation feedback: <Maker-routed findings from act-feedback.md>>
Rules:
1. Follow the proposal exactly — don't redesign
2. Write tests for every behavioral change
3. Commit with descriptive messages (CRITICAL: uncommitted worktree changes are LOST)
4. Run existing tests — nothing may break
5. If unclear, implement best interpretation and note it
Self-review before finishing:
- All proposal files changed? Tests added? No out-of-scope files? Existing tests pass?",
isolation: "worktree",
mode: "bypassPermissions"
)
Save summary to do-maker.md. Save changed file list to do-maker-files.txt.
2b. Test-First Gate
Check do-maker-files.txt for test files. If none found and domain is not writing:
- If Creator specified a test strategy: re-run Maker with targeted test instruction (1 retry within Do, not a full cycle)
- If no test strategy: emit WARNING, proceed
3. Check Phase
Spawn Guardian FIRST. Evaluate Rule A2 before spawning other reviewers.
3a. Guardian (always first)
Agent(
description: "Guardian: security review for <task>",
prompt: "You are the GUARDIAN archetype.
Review changes in branch: <maker's branch>
<git diff from Maker's branch>
<risks section from plan-creator.md>
Assess: security, reliability, breaking changes, dependencies.
Output: APPROVED or REJECTED with findings table.
Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
Be rigorous but practical — flag real risks, not theoretical ones."
)
Save to check-guardian.md.
3b. Guardian Fast-Path (Rule A2)
If Guardian found 0 CRITICAL and 0 WARNING AND workflow is not escalated AND not first cycle of thorough:
- Skip remaining reviewers, proceed directly to Act phase
- Log "Guardian fast-path taken"
3c. Remaining Reviewers (spawn in parallel)
Skeptic (standard/thorough) — receives Creator's proposal, focus on assumptions.
Save to check-skeptic.md.
Sage (standard/thorough) — receives proposal + diff + implementation summary.
Save to check-sage.md.
Trickster (thorough only) — receives Maker's diff only. "Think like a QA engineer paid per bug."
Save to check-trickster.md.
3d. Evidence Validation
After all reviewers complete, scan CRITICAL/WARNING findings. Downgrade to INFO if:
- Banned phrases without evidence: "might be", "could potentially", "appears to", "seems like", "may not"
- No evidence: no command output, code citation, line reference, or reproduction steps
Track downgrades in events (do NOT modify artifact files). Act phase excludes downgraded findings from CRITICAL tallies.
4. Act Phase
4a. Collect Verdicts
Read all check-*.md artifacts. Tally CRITICAL/WARNING/INFO per reviewer.
4b. Escalation Check (Rule A1)
If fast workflow and Guardian found 2+ CRITICAL: upgrade next cycle to standard (add Skeptic + Sage). Once escalated, stays escalated. A2 does not apply to escalated workflows.
4c. Convergence Check (cycle 2+ only)
Compare current findings against previous cycle. Classify each as NEW / RESOLVED / PERSISTENT / REGRESSED.
convergence_score = resolved / (resolved + new + regressed)
| Score | Status | Action |
|---|---|---|
| > 0.8 | Converging | Continue if cycles remain |
| 0.5-0.8 | Stalling | Continue with caution |
| < 0.5 | Diverging | STOP if 2 consecutive cycles |
| 0.0 (all persistent) | Stuck | STOP |
Oscillation: Finding present in cycle N-2, absent in N-1, back in N. Two or more oscillating findings = STOP and escalate to user.
Convergence STOP overrides normal cycle-back even if cycles remain.
4d. All Approved
- Run pre-merge hooks from
.archeflow/hooks.yamlif defined - Merge:
./lib/archeflow-git.sh merge "$RUN_ID" --no-ff - Post-merge test validation:
./lib/archeflow-rollback.sh "$RUN_ID"(auto-reverts if tests fail) - If rollback triggered and cycles remain: cycle back with "integration test failure" feedback
- Clean up worktree:
./lib/archeflow-git.sh cleanup "$RUN_ID" - Proceed to Completion
4e. Issues Found (cycles remaining)
Build act-feedback.md using the feedback routing table below. Archive current cycle artifacts to cycle-<N>/. Increment cycle, go back to Plan.
4f. Max Cycles Reached
Report unresolved findings. Present best implementation on its branch (not merged). Let user decide.
Feedback Routing Table
Route each finding to the right agent for the next cycle:
| Source | Category | Routes To | Rationale |
|---|---|---|---|
| Guardian | security, breaking-change | Creator | Design must change |
| Guardian | reliability, dependency | Creator | Architectural decision needed |
| Skeptic | design, scalability | Creator | Assumptions need revision |
| Sage | quality, consistency | Maker | Implementation refinement |
| Sage | testing | Maker | Test gap, not design flaw |
| Trickster | reliability (design flaw) | Creator | Needs redesign |
| Trickster | reliability (test gap) | Maker | Needs more tests |
| Trickster | testing | Maker | Edge case not covered |
Disambiguation: If fix requires changing the approach -> Creator. If fix requires changing the code within the existing approach -> Maker.
Feedback File Format
act-feedback.md splits into ## Creator-Routed Issues and ## Maker-Routed Issues. Inject only the relevant section into each agent's prompt.
Same finding in 2 consecutive cycles: escalate to user. Do not cycle again blindly.
Cross-archetype dedup: If two reviewers raise the same issue (same file + category), merge into one finding. Route to higher-priority destination (Creator over Maker).
5. Completion
- Emit
run.completeevent - Check for regressions:
./lib/archeflow-memory.sh regression-check - Generate report:
./lib/archeflow-report.sh .archeflow/events/<run_id>.jsonl - Score effectiveness:
./lib/archeflow-score.sh extract .archeflow/events/<run_id>.jsonl - Append to run index:
.archeflow/events/index.jsonl
Progress Display
━━━ ArcheFlow Run: <task> ━━━━━━━━━━━━━━━━━━━
Run ID: <run_id> | Workflow: <standard> | Cycle: 1/<max>
[Plan] Explorer researching... -> done (35s)
[Plan] Creator designing proposal... -> done (25s, confidence: 0.8)
[Do] Maker implementing... -> done (90s, 4 files, 8 tests)
[Check] Guardian reviewing... -> APPROVED
[Check] Skeptic challenging... -> APPROVED (1 INFO)
[Check] Sage reviewing... -> APPROVED
[Act] All approved — merging... -> merged to main
━━━ Complete: 3m 10s, 1 cycle ━━━━━━━━━━━━━━━
Artifacts: .archeflow/artifacts/<run_id>/
Report: .archeflow/events/<run_id>.jsonl
Shadow Monitoring
Quick-check after each agent completes:
| Archetype | Shadow | Trigger |
|---|---|---|
| Explorer | Rabbit Hole | Output >2000 words without Recommendation section |
| Creator | Over-Architect | >2 new abstractions for one feature |
| Maker | Rogue | No tests in changeset, or files outside proposal |
| Guardian | Paranoid | CRITICAL:WARNING ratio >2:1, or zero approvals |
| Skeptic | Paralytic | >7 challenges, <50% have alternatives |
| Trickster | False Alarm | Findings in untouched code, or >10 findings |
| Sage | Bureaucrat | Review >2x code change length |
On detection: apply correction prompt. On 2nd detection of same shadow: replace agent. On 3+ shadows in one cycle: escalate to user.
Event Reference
Emit events via ./lib/archeflow-event.sh <run_id> <type> <phase> <agent> '<json>'. Events are optional — never let logging block orchestration.
| When | Event Type | Key Data |
|---|---|---|
| Run starts | run.start |
task, workflow, max_cycles |
| Before agent spawn | agent.start |
archetype, model, prompt_summary |
| After agent returns | agent.complete |
archetype, duration_ms, artifacts, summary |
| Phase boundary | phase.transition |
from, to, artifacts_so_far |
| Alternative chosen | decision |
what, chosen, alternatives, rationale |
| Reviewer verdict | review.verdict |
archetype, verdict, findings[] |
| Fix addressing review | fix.applied |
source, finding, file, line |
| End of PDCA cycle | cycle.boundary |
cycle, max_cycles, exit_condition, convergence |
| Shadow triggered | shadow.detected |
archetype, shadow, trigger, action |
| Run ends | run.complete |
status, cycles, agents_total, fixes_total |
Parent rules: run.start has parent: []. Agents parent to the event that triggered them. Phase transitions fan-in from all completing events. Parallel agents share the same parent.
Artifact Naming
All artifacts live in .archeflow/artifacts/<run_id>/:
| File | Content |
|---|---|
plan-explorer.md |
Explorer research (missing in fast) |
plan-creator.md |
Creator proposal with confidence |
plan-mini-explorer.md |
Risk research (only if A3 triggers) |
do-maker.md |
Maker implementation summary |
do-maker-files.txt |
Changed file paths (one per line) |
check-guardian.md |
Guardian verdict + findings |
check-skeptic.md |
Skeptic verdict (if spawned) |
check-sage.md |
Sage verdict (if spawned) |
check-trickster.md |
Trickster verdict (if spawned) |
act-feedback.md |
Structured feedback for next cycle |
act-fixes.jsonl |
Applied fixes log |
cycle-<N>/ |
Archived artifacts from cycle N |
Always check artifact existence before injecting. Missing optional artifacts are expected — skip, don't fail.
Git diff is generated on-the-fly (git diff main...<maker-branch>), not saved to disk.
Effectiveness Scoring
After each run, ./lib/archeflow-score.sh extract scores review archetypes on:
| Dimension | Weight |
|---|---|
| Signal-to-noise (useful / total findings) | 0.30 |
| Fix rate (findings that led to fixes) | 0.25 |
| Cost efficiency (useful findings per dollar) | 0.20 |
| Accuracy (not contradicted by others) | 0.15 |
| Cycle impact (led to cycle exit) | 0.10 |
Scores stored in .archeflow/memory/effectiveness.jsonl. After 10+ runs, recommend model tier changes and archetype removal.
Dry-Run Mode
When --dry-run: Run Plan phase only. Display workflow, agent counts, confidence scores, cost estimate. Ask user to proceed. If yes, continue with --start-from do.
Start-From Mode
| Start from | Required artifacts |
|---|---|
plan |
None |
do |
plan-creator.md |
check |
plan-creator.md, do-maker.md, do-maker-files.txt |
act |
All check-*.md files |
Validate required artifacts exist. Error if missing.
Error Handling
- Agent timeout (5 min): Emit error event, abort run. Do not retry blindly.
- Event emitter fails: Log warning, continue. Events are observation, not control flow.
- Artifact write fails: Blocking — abort and report. Artifacts are required for phase handoff.
- Merge conflict: Do not force-resolve. Report conflict, leave branch intact, let user decide.
Pipeline Strategy
Linear flow with no cycle-back. Use for bug fixes and well-understood single-concern tasks.
Plan (Creator only) -> Implement (Maker) -> Spec-Review (Guardian, then Skeptic if findings) -> Quality-Review (Sage) -> Verify (tests + merge)
- Plan: Spawn Creator with Mini-Reflect (no Explorer). Save
plan-creator.md. - Implement: Spawn Maker in worktree. Save
do-maker.md. - Spec-Review: Guardian first. Skeptic only if Guardian has findings.
- Quality-Review: Sage reviews proposal + diff + summary.
- Verify: Run tests. If pass and 0 CRITICAL: merge. If CRITICAL: one targeted Maker fix, re-review, re-test. If still failing: abort, report branch name for manual resolution.
WARNINGs are logged but do not block merge.
━━━ ArcheFlow Pipeline: <task> ━━━━━━━━━━━━━━━━
Run ID: <run_id> | Strategy: pipeline
[Plan] Creator designing... -> done (20s)
[Implement] Maker building... -> done (60s, 3 files)
[Spec] Guardian reviewing... -> APPROVED
[Quality] Sage reviewing... -> APPROVED (1 WARNING)
[Verify] Tests passing... -> merged to main
━━━ Complete: 2m 15s ━━━━━━━━━━━━━━━━━━━━━━━━━━