claude-archeflow-plugin/skills/run/SKILL.md

---
name: run
description: |
  Start an ArcheFlow PDCA run. Usage: /af-run <task description> [--workflow fast|standard|thorough] [--dry-run] [--start-from plan|do|check|act]
---

# ArcheFlow Run — PDCA Orchestration

One command runs the full cycle: Plan (Explorer+Creator) -> Do (Maker in worktree) -> Check (Guardian first, then others) -> Act (collect findings, route fixes, exit or cycle).

## 0. Initialize

1. Generate run ID: `<YYYY-MM-DD>-<task-slug>`
2. Create artifact directory: `mkdir -p .archeflow/artifacts/<run_id>`
3. Verify `./lib/archeflow-*.sh` scripts exist before proceeding
4. Inject cross-run memory: `./lib/archeflow-memory.sh inject "$DOMAIN" "" --audit "$RUN_ID"`
5. Read `.archeflow/config.yaml` models section. Resolution order: per-workflow per-archetype > per-workflow default > per-archetype > global default.
6. Emit `run.start` event

### Strategy Selection

Determine strategy from CLI flag `--strategy`, config `strategy:` field, or auto-detect:

| Signal | Strategy |
|--------|----------|
| Task contains fix/bug/patch/hotfix | `pipeline` |
| Task contains refactor/redesign/review | `pdca` |
| Workflow is `fast` with single file | `pipeline` |
| Workflow is `thorough` | `pdca` |
| Default | `pdca` |

If `pipeline`, skip to the Pipeline section at the end. Otherwise continue with PDCA below.

### Workflow Selection

| Signal | Workflow | Max Cycles |
|--------|----------|------------|
| Small fix, low risk, single concern | `fast` | 1 |
| Feature, multiple files, moderate risk | `standard` | 2 |
| Security-sensitive, breaking changes, public API | `thorough` | 3 |

---

## Attention Filters

Each agent receives only what it needs. This is the canonical reference:

| Archetype | Receives | Excludes |
|-----------|----------|----------|
| Explorer | Task description, codebase access | Prior proposals, reviews, diffs |
| Creator | Task + Explorer output (+ feedback cycle 2+) | Raw files, diffs, reviewer outputs |
| Maker | Creator's proposal (+ Maker-routed feedback cycle 2+) | Explorer research, reviewer outputs |
| Guardian | Maker's diff + proposal risk section | Full proposal, Explorer research |
| Skeptic | Creator's proposal (assumptions focus) | Diff details, Explorer research |
| Sage | Proposal + diff + implementation summary | Explorer research, other reviews |
| Trickster | Maker's diff only | Everything else |

---

## Status Token Protocol

Every agent ends output with `STATUS: <token>`. Parse it to decide the next action.

| Status | Action |
|--------|--------|
| `DONE` | Proceed to next phase |
| `DONE_WITH_CONCERNS` | Log concerns, proceed |
| `NEEDS_CONTEXT` | Pause, request info from user |
| `BLOCKED` | Abort phase, report blocker |

If no status token found, default to `DONE`.

---

## 1. Plan Phase

### 1a. Explorer (standard/thorough only)

```
Agent(
  description: "Explorer: research context for <task>",
  prompt: "You are the EXPLORER archetype.
    <task description>
    Research: 1) affected files/functions, 2) dependencies, 3) test coverage,
    4) codebase patterns. Write a structured research report.
    Be thorough but focused — no rabbit holes.",
  subagent_type: "Explore"
)
```

Save output to `.archeflow/artifacts/<run_id>/plan-explorer.md`.

### 1b. Creator

Fast workflow (no Explorer): Creator must perform Mini-Reflect first:
1. Restate the task in one sentence
2. List 3 assumptions
3. Name the highest-damage risk

```
Agent(
  description: "Creator: design proposal for <task>",
  prompt: "You are the CREATOR archetype.
    <task description>
    <if fast: Perform Mini-Reflect first (restate task, 3 assumptions, top risk)>
    <if standard/thorough: Research findings: <plan-explorer.md contents>>
    <if cycle 2+: Prior feedback: <Creator-routed section of act-feedback.md>>
    Design a proposal:
    1. Architecture decisions (with rationale)
    2. Files to create/modify (exact paths, specific changes, 2-5 min per item)
    3. Alternatives considered (2+, with rejection rationale)
    4. Test strategy (specific test cases)
    5. Confidence table (task understanding, solution completeness, risk coverage — each 0.0-1.0)
    6. Risks and mitigations
    <if cycle 2+: 7. How each prior issue was addressed (Fixed/Deferred/Accepted/Disputed)>
    Be decisive — ship a clear plan, not a menu.",
  subagent_type: "Plan"
)
```

Save output to `.archeflow/artifacts/<run_id>/plan-creator.md`.

### 1c. Confidence Gate (Rule A3)

Parse the `### Confidence` table from `plan-creator.md`. If unparseable, default to 0.0 (triggers gate).

| Axis | Score < 0.5 | Action |
|------|-------------|--------|
| Task understanding | Pause | Ask user for clarification. Do not spawn Maker. |
| Solution completeness | Upgrade | If fast -> standard. Spawn Explorer, re-run Creator. |
| Risk coverage | Mini-Explorer | Spawn focused Explorer for risky areas (5 min max, parallel with Do prep). |

Mini-Explorer prompt: "You are the EXPLORER. Risk coverage scored <score>. Identified risks: <risks>. Research ONLY the risky areas. Is the risk real? What mitigations exist?"
Save to `plan-mini-explorer.md`.

---

## 2. Do Phase

### 2a. Maker

```
Agent(
  description: "Maker: implement <task>",
  prompt: "You are the MAKER archetype.
    Implement this proposal: <plan-creator.md contents>
    <if cycle 2+: Implementation feedback: <Maker-routed findings from act-feedback.md>>
    Rules:
    1. Follow the proposal exactly — don't redesign
    2. Write tests for every behavioral change
    3. Commit with descriptive messages (CRITICAL: uncommitted worktree changes are LOST)
    4. Run existing tests — nothing may break
    5. If unclear, implement best interpretation and note it
    Self-review before finishing:
    - All proposal files changed? Tests added? No out-of-scope files? Existing tests pass?",
  isolation: "worktree",
  mode: "bypassPermissions"
)
```

Save summary to `do-maker.md`. Save changed file list to `do-maker-files.txt`.

### 2b. Test-First Gate

Check `do-maker-files.txt` for test files. If none found and domain is not `writing`:
- If Creator specified a test strategy: re-run Maker with targeted test instruction (1 retry within Do, not a full cycle)
- If no test strategy: emit WARNING, proceed

---

## 3. Check Phase

Spawn Guardian FIRST. Evaluate Rule A2 before spawning other reviewers.

### 3a. Guardian (always first)

```
Agent(
  description: "Guardian: security review for <task>",
  prompt: "You are the GUARDIAN archetype.
    Review changes in branch: <maker's branch>
    <git diff from Maker's branch>
    <risks section from plan-creator.md>
    Assess: security, reliability, breaking changes, dependencies.
    Output: APPROVED or REJECTED with findings table.
    Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
    Be rigorous but practical — flag real risks, not theoretical ones."
)
```

Save to `check-guardian.md`.

### 3b. Guardian Fast-Path (Rule A2)

If Guardian found **0 CRITICAL and 0 WARNING** AND workflow is not escalated AND not first cycle of thorough:
- Skip remaining reviewers, proceed directly to Act phase
- Log "Guardian fast-path taken"

### 3c. Remaining Reviewers (spawn in parallel)

**Skeptic** (standard/thorough) — receives Creator's proposal, focus on assumptions.
Save to `check-skeptic.md`.

**Sage** (standard/thorough) — receives proposal + diff + implementation summary.
Save to `check-sage.md`.

**Trickster** (thorough only) — receives Maker's diff only. "Think like a QA engineer paid per bug."
Save to `check-trickster.md`.

### 3d. Evidence Validation

After all reviewers complete, scan CRITICAL/WARNING findings. Downgrade to INFO if:
- **Banned phrases** without evidence: "might be", "could potentially", "appears to", "seems like", "may not"
- **No evidence**: no command output, code citation, line reference, or reproduction steps

Track downgrades in events (do NOT modify artifact files). Act phase excludes downgraded findings from CRITICAL tallies.

---

## 4. Act Phase

### 4a. Collect Verdicts

Read all `check-*.md` artifacts. Tally CRITICAL/WARNING/INFO per reviewer.

### 4b. Escalation Check (Rule A1)

If `fast` workflow and Guardian found 2+ CRITICAL: upgrade next cycle to `standard` (add Skeptic + Sage). Once escalated, stays escalated. A2 does not apply to escalated workflows.

### 4c. Convergence Check (cycle 2+ only)

Compare current findings against previous cycle. Classify each as NEW / RESOLVED / PERSISTENT / REGRESSED.

```
convergence_score = resolved / (resolved + new + regressed)
```

| Score | Status | Action |
|-------|--------|--------|
| > 0.8 | Converging | Continue if cycles remain |
| 0.5-0.8 | Stalling | Continue with caution |
| < 0.5 | Diverging | STOP if 2 consecutive cycles |
| 0.0 (all persistent) | Stuck | STOP |

**Oscillation**: Finding present in cycle N-2, absent in N-1, back in N. Two or more oscillating findings = STOP and escalate to user.

Convergence STOP overrides normal cycle-back even if cycles remain.

### 4d. All Approved

1. Run pre-merge hooks from `.archeflow/hooks.yaml` if defined
2. Merge: `./lib/archeflow-git.sh merge "$RUN_ID" --no-ff`
3. Post-merge test validation: `./lib/archeflow-rollback.sh "$RUN_ID"` (auto-reverts if tests fail)
4. If rollback triggered and cycles remain: cycle back with "integration test failure" feedback
5. Clean up worktree: `./lib/archeflow-git.sh cleanup "$RUN_ID"`
6. Proceed to Completion

### 4e. Issues Found (cycles remaining)

Build `act-feedback.md` using the feedback routing table below. Archive current cycle artifacts to `cycle-<N>/`. Increment cycle, go back to Plan.

### 4f. Max Cycles Reached

Report unresolved findings. Present best implementation on its branch (not merged). Let user decide.

---

## Feedback Routing Table

Route each finding to the right agent for the next cycle:

| Source | Category | Routes To | Rationale |
|--------|----------|-----------|-----------|
| Guardian | security, breaking-change | **Creator** | Design must change |
| Guardian | reliability, dependency | **Creator** | Architectural decision needed |
| Skeptic | design, scalability | **Creator** | Assumptions need revision |
| Sage | quality, consistency | **Maker** | Implementation refinement |
| Sage | testing | **Maker** | Test gap, not design flaw |
| Trickster | reliability (design flaw) | **Creator** | Needs redesign |
| Trickster | reliability (test gap) | **Maker** | Needs more tests |
| Trickster | testing | **Maker** | Edge case not covered |

**Disambiguation**: If fix requires changing the approach -> Creator. If fix requires changing the code within the existing approach -> Maker.

### Feedback File Format

`act-feedback.md` splits into `## Creator-Routed Issues` and `## Maker-Routed Issues`. Inject only the relevant section into each agent's prompt.

**Same finding in 2 consecutive cycles**: escalate to user. Do not cycle again blindly.

**Cross-archetype dedup**: If two reviewers raise the same issue (same file + category), merge into one finding. Route to higher-priority destination (Creator over Maker).

---

## 5. Completion

1. Emit `run.complete` event
2. Check for regressions: `./lib/archeflow-memory.sh regression-check`
3. Generate report: `./lib/archeflow-report.sh .archeflow/events/<run_id>.jsonl`
4. Score effectiveness: `./lib/archeflow-score.sh extract .archeflow/events/<run_id>.jsonl`
5. Append to run index: `.archeflow/events/index.jsonl`

---

## Progress Display

```
━━━ ArcheFlow Run: <task> ━━━━━━━━━━━━━━━━━━━
Run ID: <run_id> | Workflow: <standard> | Cycle: 1/<max>

[Plan]  Explorer researching...        -> done (35s)
[Plan]  Creator designing proposal...  -> done (25s, confidence: 0.8)
[Do]    Maker implementing...          -> done (90s, 4 files, 8 tests)
[Check] Guardian reviewing...          -> APPROVED
[Check] Skeptic challenging...         -> APPROVED (1 INFO)
[Check] Sage reviewing...             -> APPROVED
[Act]   All approved — merging...      -> merged to main

━━━ Complete: 3m 10s, 1 cycle ━━━━━━━━━━━━━━━
Artifacts: .archeflow/artifacts/<run_id>/
Report: .archeflow/events/<run_id>.jsonl
```

---

## Shadow Monitoring

Quick-check after each agent completes:

| Archetype | Shadow | Trigger |
|-----------|--------|---------|
| Explorer | Rabbit Hole | Output >2000 words without Recommendation section |
| Creator | Over-Architect | >2 new abstractions for one feature |
| Maker | Rogue | No tests in changeset, or files outside proposal |
| Guardian | Paranoid | CRITICAL:WARNING ratio >2:1, or zero approvals |
| Skeptic | Paralytic | >7 challenges, <50% have alternatives |
| Trickster | False Alarm | Findings in untouched code, or >10 findings |
| Sage | Bureaucrat | Review >2x code change length |

On detection: apply correction prompt. On 2nd detection of same shadow: replace agent. On 3+ shadows in one cycle: escalate to user.

---

## Event Reference

Emit events via `./lib/archeflow-event.sh <run_id> <type> <phase> <agent> '<json>'`. Events are optional — never let logging block orchestration.

| When | Event Type | Key Data |
|------|-----------|----------|
| Run starts | `run.start` | task, workflow, max_cycles |
| Before agent spawn | `agent.start` | archetype, model, prompt_summary |
| After agent returns | `agent.complete` | archetype, duration_ms, artifacts, summary |
| Phase boundary | `phase.transition` | from, to, artifacts_so_far |
| Alternative chosen | `decision` | what, chosen, alternatives, rationale |
| Orchestrator decision (replay) | `decision.point` | archetype, input, decision, confidence — use `./lib/archeflow-decision.sh` |
| Reviewer verdict | `review.verdict` | archetype, verdict, findings[] |
| Fix addressing review | `fix.applied` | source, finding, file, line |
| End of PDCA cycle | `cycle.boundary` | cycle, max_cycles, exit_condition, convergence |
| Shadow triggered | `shadow.detected` | archetype, shadow, trigger, action |
| Policy halt | `wiggum.break` | trigger, run_state, unresolved_findings, hard/soft |
| Run ends | `run.complete` | status, cycles, agents_total, fixes_total |

Parent rules: `run.start` has `parent: []`. Agents parent to the event that triggered them. Phase transitions fan-in from all completing events. Parallel agents share the same parent.

---

## Artifact Naming

All artifacts live in `.archeflow/artifacts/<run_id>/`:

| File | Content |
|------|---------|
| `plan-explorer.md` | Explorer research (missing in fast) |
| `plan-creator.md` | Creator proposal with confidence |
| `plan-mini-explorer.md` | Risk research (only if A3 triggers) |
| `do-maker.md` | Maker implementation summary |
| `do-maker-files.txt` | Changed file paths (one per line) |
| `check-guardian.md` | Guardian verdict + findings |
| `check-skeptic.md` | Skeptic verdict (if spawned) |
| `check-sage.md` | Sage verdict (if spawned) |
| `check-trickster.md` | Trickster verdict (if spawned) |
| `act-feedback.md` | Structured feedback for next cycle |
| `act-fixes.jsonl` | Applied fixes log |
| `cycle-<N>/` | Archived artifacts from cycle N |

Always check artifact existence before injecting. Missing optional artifacts are expected — skip, don't fail.

Git diff is generated on-the-fly (`git diff main...<maker-branch>`), not saved to disk.

---

## Effectiveness Scoring

After each run, `./lib/archeflow-score.sh extract` scores review archetypes on:

| Dimension | Weight |
|-----------|--------|
| Signal-to-noise (useful / total findings) | 0.30 |
| Fix rate (findings that led to fixes) | 0.25 |
| Cost efficiency (useful findings per dollar) | 0.20 |
| Accuracy (not contradicted by others) | 0.15 |
| Cycle impact (led to cycle exit) | 0.10 |

Scores stored in `.archeflow/memory/effectiveness.jsonl`. After 10+ runs, recommend model tier changes and archetype removal.

---

## Run replay (decision log + what-if)

After key choices (routing, fast-path skip, escalation), emit `decision.point` via `./lib/archeflow-decision.sh` so runs can be inspected with `./lib/archeflow-replay.sh timeline|whatif|compare <run_id>`. Weighted what-if helps estimate how much each review archetype swayed the effective ship/block outcome. See skill `af-replay`.

---

## Dry-Run Mode

When `--dry-run`: Run Plan phase only. Display workflow, agent counts, confidence scores, cost estimate. Ask user to proceed. If yes, continue with `--start-from do`.

## Start-From Mode

| Start from | Required artifacts |
|------------|--------------------|
| `plan` | None |
| `do` | `plan-creator.md` |
| `check` | `plan-creator.md`, `do-maker.md`, `do-maker-files.txt` |
| `act` | All `check-*.md` files |

Validate required artifacts exist. Error if missing.

---

## Error Handling

- **Agent timeout (5 min)**: Emit error event, abort run. Do not retry blindly.
- **Event emitter fails**: Log warning, continue. Events are observation, not control flow.
- **Artifact write fails**: Blocking — abort and report. Artifacts are required for phase handoff.
- **Merge conflict**: Do not force-resolve. Report conflict, leave branch intact, let user decide.

---

## Pipeline Strategy

Linear flow with no cycle-back. Use for bug fixes and well-understood single-concern tasks.

```
Plan (Creator only) -> Implement (Maker) -> Spec-Review (Guardian, then Skeptic if findings) -> Quality-Review (Sage) -> Verify (tests + merge)
```

1. **Plan**: Spawn Creator with Mini-Reflect (no Explorer). Save `plan-creator.md`.
2. **Implement**: Spawn Maker in worktree. Save `do-maker.md`.
3. **Spec-Review**: Guardian first. Skeptic only if Guardian has findings.
4. **Quality-Review**: Sage reviews proposal + diff + summary.
5. **Verify**: Run tests. If pass and 0 CRITICAL: merge. If CRITICAL: one targeted Maker fix, re-review, re-test. If still failing: abort, report branch name for manual resolution.

WARNINGs are logged but do not block merge.

```
━━━ ArcheFlow Pipeline: <task> ━━━━━━━━━━━━━━━━
Run ID: <run_id> | Strategy: pipeline

[Plan]      Creator designing...          -> done (20s)
[Implement] Maker building...             -> done (60s, 3 files)
[Spec]      Guardian reviewing...         -> APPROVED
[Quality]   Sage reviewing...             -> APPROVED (1 WARNING)
[Verify]    Tests passing...              -> merged to main

━━━ Complete: 2m 15s ━━━━━━━━━━━━━━━━━━━━━━━━━━
```