Files

Christian Nennemann 1e96d87f49 feat: introduce Wiggum Break as named circuit breaker

Replaces generic "circuit breaker" with "Wiggum Break" — policy enforcement
halt condition named after Chief Wiggum (policy + Ralph Loop's dad).
Hard breaks (immediate halt) and soft breaks (finish then halt) with
wiggum.break event type. Updated both papers and shadow-detection skill.

2026-04-08 05:19:35 +02:00

18 KiB

Raw Permalink Blame History

name, description

name	description
run	Start an ArcheFlow PDCA run. Usage: /af-run <task description> [--workflow fast\|standard\|thorough] [--dry-run] [--start-from plan\|do\|check\|act]

ArcheFlow Run — PDCA Orchestration

One command runs the full cycle: Plan (Explorer+Creator) -> Do (Maker in worktree) -> Check (Guardian first, then others) -> Act (collect findings, route fixes, exit or cycle).

0. Initialize

Generate run ID: <YYYY-MM-DD>-<task-slug>
Create artifact directory: mkdir -p .archeflow/artifacts/<run_id>
Verify ./lib/archeflow-*.sh scripts exist before proceeding
Inject cross-run memory: ./lib/archeflow-memory.sh inject "$DOMAIN" "" --audit "$RUN_ID"
Read .archeflow/config.yaml models section. Resolution order: per-workflow per-archetype > per-workflow default > per-archetype > global default.
Emit run.start event

Strategy Selection

Determine strategy from CLI flag --strategy, config strategy: field, or auto-detect:

Signal	Strategy
Task contains fix/bug/patch/hotfix	`pipeline`
Task contains refactor/redesign/review	`pdca`
Workflow is `fast` with single file	`pipeline`
Workflow is `thorough`	`pdca`
Default	`pdca`

If pipeline, skip to the Pipeline section at the end. Otherwise continue with PDCA below.

Workflow Selection

Signal	Workflow	Max Cycles
Small fix, low risk, single concern	`fast`	1
Feature, multiple files, moderate risk	`standard`	2
Security-sensitive, breaking changes, public API	`thorough`	3

Attention Filters

Each agent receives only what it needs. This is the canonical reference:

Archetype	Receives	Excludes
Explorer	Task description, codebase access	Prior proposals, reviews, diffs
Creator	Task + Explorer output (+ feedback cycle 2+)	Raw files, diffs, reviewer outputs
Maker	Creator's proposal (+ Maker-routed feedback cycle 2+)	Explorer research, reviewer outputs
Guardian	Maker's diff + proposal risk section	Full proposal, Explorer research
Skeptic	Creator's proposal (assumptions focus)	Diff details, Explorer research
Sage	Proposal + diff + implementation summary	Explorer research, other reviews
Trickster	Maker's diff only	Everything else

Status Token Protocol

Every agent ends output with STATUS: <token>. Parse it to decide the next action.

Status	Action
`DONE`	Proceed to next phase
`DONE_WITH_CONCERNS`	Log concerns, proceed
`NEEDS_CONTEXT`	Pause, request info from user
`BLOCKED`	Abort phase, report blocker

If no status token found, default to DONE.

1. Plan Phase

1a. Explorer (standard/thorough only)

Agent(
  description: "Explorer: research context for <task>",
  prompt: "You are the EXPLORER archetype.
    <task description>
    Research: 1) affected files/functions, 2) dependencies, 3) test coverage,
    4) codebase patterns. Write a structured research report.
    Be thorough but focused — no rabbit holes.",
  subagent_type: "Explore"
)

Save output to .archeflow/artifacts/<run_id>/plan-explorer.md.

1b. Creator

Fast workflow (no Explorer): Creator must perform Mini-Reflect first:

Restate the task in one sentence
List 3 assumptions
Name the highest-damage risk

Agent(
  description: "Creator: design proposal for <task>",
  prompt: "You are the CREATOR archetype.
    <task description>
    <if fast: Perform Mini-Reflect first (restate task, 3 assumptions, top risk)>
    <if standard/thorough: Research findings: <plan-explorer.md contents>>
    <if cycle 2+: Prior feedback: <Creator-routed section of act-feedback.md>>
    Design a proposal:
    1. Architecture decisions (with rationale)
    2. Files to create/modify (exact paths, specific changes, 2-5 min per item)
    3. Alternatives considered (2+, with rejection rationale)
    4. Test strategy (specific test cases)
    5. Confidence table (task understanding, solution completeness, risk coverage — each 0.0-1.0)
    6. Risks and mitigations
    <if cycle 2+: 7. How each prior issue was addressed (Fixed/Deferred/Accepted/Disputed)>
    Be decisive — ship a clear plan, not a menu.",
  subagent_type: "Plan"
)

Save output to .archeflow/artifacts/<run_id>/plan-creator.md.

1c. Confidence Gate (Rule A3)

Parse the ### Confidence table from plan-creator.md. If unparseable, default to 0.0 (triggers gate).

Axis	Score < 0.5	Action
Task understanding	Pause	Ask user for clarification. Do not spawn Maker.
Solution completeness	Upgrade	If fast -> standard. Spawn Explorer, re-run Creator.
Risk coverage	Mini-Explorer	Spawn focused Explorer for risky areas (5 min max, parallel with Do prep).

Mini-Explorer prompt: "You are the EXPLORER. Risk coverage scored . Identified risks: . Research ONLY the risky areas. Is the risk real? What mitigations exist?" Save to plan-mini-explorer.md.

2. Do Phase

2a. Maker

Agent(
  description: "Maker: implement <task>",
  prompt: "You are the MAKER archetype.
    Implement this proposal: <plan-creator.md contents>
    <if cycle 2+: Implementation feedback: <Maker-routed findings from act-feedback.md>>
    Rules:
    1. Follow the proposal exactly — don't redesign
    2. Write tests for every behavioral change
    3. Commit with descriptive messages (CRITICAL: uncommitted worktree changes are LOST)
    4. Run existing tests — nothing may break
    5. If unclear, implement best interpretation and note it
    Self-review before finishing:
    - All proposal files changed? Tests added? No out-of-scope files? Existing tests pass?",
  isolation: "worktree",
  mode: "bypassPermissions"
)

Save summary to do-maker.md. Save changed file list to do-maker-files.txt.

2b. Test-First Gate

Check do-maker-files.txt for test files. If none found and domain is not writing:

If Creator specified a test strategy: re-run Maker with targeted test instruction (1 retry within Do, not a full cycle)
If no test strategy: emit WARNING, proceed

3. Check Phase

Spawn Guardian FIRST. Evaluate Rule A2 before spawning other reviewers.

3a. Guardian (always first)

Agent(
  description: "Guardian: security review for <task>",
  prompt: "You are the GUARDIAN archetype.
    Review changes in branch: <maker's branch>
    <git diff from Maker's branch>
    <risks section from plan-creator.md>
    Assess: security, reliability, breaking changes, dependencies.
    Output: APPROVED or REJECTED with findings table.
    Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
    Be rigorous but practical — flag real risks, not theoretical ones."
)

Save to check-guardian.md.

3b. Guardian Fast-Path (Rule A2)

If Guardian found 0 CRITICAL and 0 WARNING AND workflow is not escalated AND not first cycle of thorough:

Skip remaining reviewers, proceed directly to Act phase
Log "Guardian fast-path taken"

3c. Remaining Reviewers (spawn in parallel)

Skeptic (standard/thorough) — receives Creator's proposal, focus on assumptions. Save to check-skeptic.md.

Sage (standard/thorough) — receives proposal + diff + implementation summary. Save to check-sage.md.

Trickster (thorough only) — receives Maker's diff only. "Think like a QA engineer paid per bug." Save to check-trickster.md.

3d. Evidence Validation

After all reviewers complete, scan CRITICAL/WARNING findings. Downgrade to INFO if:

Banned phrases without evidence: "might be", "could potentially", "appears to", "seems like", "may not"
No evidence: no command output, code citation, line reference, or reproduction steps

Track downgrades in events (do NOT modify artifact files). Act phase excludes downgraded findings from CRITICAL tallies.

4. Act Phase

4a. Collect Verdicts

Read all check-*.md artifacts. Tally CRITICAL/WARNING/INFO per reviewer.

4b. Escalation Check (Rule A1)

If fast workflow and Guardian found 2+ CRITICAL: upgrade next cycle to standard (add Skeptic + Sage). Once escalated, stays escalated. A2 does not apply to escalated workflows.

4c. Convergence Check (cycle 2+ only)

Compare current findings against previous cycle. Classify each as NEW / RESOLVED / PERSISTENT / REGRESSED.

convergence_score = resolved / (resolved + new + regressed)

Score	Status	Action
> 0.8	Converging	Continue if cycles remain
0.5-0.8	Stalling	Continue with caution
< 0.5	Diverging	STOP if 2 consecutive cycles
0.0 (all persistent)	Stuck	STOP

Oscillation: Finding present in cycle N-2, absent in N-1, back in N. Two or more oscillating findings = STOP and escalate to user.

Convergence STOP overrides normal cycle-back even if cycles remain.

4d. All Approved

Run pre-merge hooks from .archeflow/hooks.yaml if defined
Merge: ./lib/archeflow-git.sh merge "$RUN_ID" --no-ff
Post-merge test validation: ./lib/archeflow-rollback.sh "$RUN_ID" (auto-reverts if tests fail)
If rollback triggered and cycles remain: cycle back with "integration test failure" feedback
Clean up worktree: ./lib/archeflow-git.sh cleanup "$RUN_ID"
Proceed to Completion

4e. Issues Found (cycles remaining)

Build act-feedback.md using the feedback routing table below. Archive current cycle artifacts to cycle-<N>/. Increment cycle, go back to Plan.

4f. Max Cycles Reached

Report unresolved findings. Present best implementation on its branch (not merged). Let user decide.

Feedback Routing Table

Route each finding to the right agent for the next cycle:

Source	Category	Routes To	Rationale
Guardian	security, breaking-change	Creator	Design must change
Guardian	reliability, dependency	Creator	Architectural decision needed
Skeptic	design, scalability	Creator	Assumptions need revision
Sage	quality, consistency	Maker	Implementation refinement
Sage	testing	Maker	Test gap, not design flaw
Trickster	reliability (design flaw)	Creator	Needs redesign
Trickster	reliability (test gap)	Maker	Needs more tests
Trickster	testing	Maker	Edge case not covered

Disambiguation: If fix requires changing the approach -> Creator. If fix requires changing the code within the existing approach -> Maker.

Feedback File Format

act-feedback.md splits into ## Creator-Routed Issues and ## Maker-Routed Issues. Inject only the relevant section into each agent's prompt.

Same finding in 2 consecutive cycles: escalate to user. Do not cycle again blindly.

Cross-archetype dedup: If two reviewers raise the same issue (same file + category), merge into one finding. Route to higher-priority destination (Creator over Maker).

5. Completion

Emit run.complete event
Check for regressions: ./lib/archeflow-memory.sh regression-check
Generate report: ./lib/archeflow-report.sh .archeflow/events/<run_id>.jsonl
Score effectiveness: ./lib/archeflow-score.sh extract .archeflow/events/<run_id>.jsonl
Append to run index: .archeflow/events/index.jsonl

Progress Display

━━━ ArcheFlow Run: <task> ━━━━━━━━━━━━━━━━━━━
Run ID: <run_id> | Workflow: <standard> | Cycle: 1/<max>

[Plan]  Explorer researching...        -> done (35s)
[Plan]  Creator designing proposal...  -> done (25s, confidence: 0.8)
[Do]    Maker implementing...          -> done (90s, 4 files, 8 tests)
[Check] Guardian reviewing...          -> APPROVED
[Check] Skeptic challenging...         -> APPROVED (1 INFO)
[Check] Sage reviewing...             -> APPROVED
[Act]   All approved — merging...      -> merged to main

━━━ Complete: 3m 10s, 1 cycle ━━━━━━━━━━━━━━━
Artifacts: .archeflow/artifacts/<run_id>/
Report: .archeflow/events/<run_id>.jsonl

Shadow Monitoring

Quick-check after each agent completes:

Archetype	Shadow	Trigger
Explorer	Rabbit Hole	Output >2000 words without Recommendation section
Creator	Over-Architect	>2 new abstractions for one feature
Maker	Rogue	No tests in changeset, or files outside proposal
Guardian	Paranoid	CRITICAL:WARNING ratio >2:1, or zero approvals
Skeptic	Paralytic	>7 challenges, <50% have alternatives
Trickster	False Alarm	Findings in untouched code, or >10 findings
Sage	Bureaucrat	Review >2x code change length

On detection: apply correction prompt. On 2nd detection of same shadow: replace agent. On 3+ shadows in one cycle: escalate to user.

Event Reference

Emit events via ./lib/archeflow-event.sh <run_id> <type> <phase> <agent> '<json>'. Events are optional — never let logging block orchestration.

When	Event Type	Key Data
Run starts	`run.start`	task, workflow, max_cycles
Before agent spawn	`agent.start`	archetype, model, prompt_summary
After agent returns	`agent.complete`	archetype, duration_ms, artifacts, summary
Phase boundary	`phase.transition`	from, to, artifacts_so_far
Alternative chosen	`decision`	what, chosen, alternatives, rationale
Orchestrator decision (replay)	`decision.point`	archetype, input, decision, confidence — use `./lib/archeflow-decision.sh`
Reviewer verdict	`review.verdict`	archetype, verdict, findings[]
Fix addressing review	`fix.applied`	source, finding, file, line
End of PDCA cycle	`cycle.boundary`	cycle, max_cycles, exit_condition, convergence
Shadow triggered	`shadow.detected`	archetype, shadow, trigger, action
Policy halt	`wiggum.break`	trigger, run_state, unresolved_findings, hard/soft
Run ends	`run.complete`	status, cycles, agents_total, fixes_total

Parent rules: run.start has parent: []. Agents parent to the event that triggered them. Phase transitions fan-in from all completing events. Parallel agents share the same parent.

Artifact Naming

All artifacts live in .archeflow/artifacts/<run_id>/:

File	Content
`plan-explorer.md`	Explorer research (missing in fast)
`plan-creator.md`	Creator proposal with confidence
`plan-mini-explorer.md`	Risk research (only if A3 triggers)
`do-maker.md`	Maker implementation summary
`do-maker-files.txt`	Changed file paths (one per line)
`check-guardian.md`	Guardian verdict + findings
`check-skeptic.md`	Skeptic verdict (if spawned)
`check-sage.md`	Sage verdict (if spawned)
`check-trickster.md`	Trickster verdict (if spawned)
`act-feedback.md`	Structured feedback for next cycle
`act-fixes.jsonl`	Applied fixes log
`cycle-<N>/`	Archived artifacts from cycle N

Always check artifact existence before injecting. Missing optional artifacts are expected — skip, don't fail.

Git diff is generated on-the-fly (git diff main...<maker-branch>), not saved to disk.

Effectiveness Scoring

After each run, ./lib/archeflow-score.sh extract scores review archetypes on:

Dimension	Weight
Signal-to-noise (useful / total findings)	0.30
Fix rate (findings that led to fixes)	0.25
Cost efficiency (useful findings per dollar)	0.20
Accuracy (not contradicted by others)	0.15
Cycle impact (led to cycle exit)	0.10

Scores stored in .archeflow/memory/effectiveness.jsonl. After 10+ runs, recommend model tier changes and archetype removal.

Run replay (decision log + what-if)

After key choices (routing, fast-path skip, escalation), emit decision.point via ./lib/archeflow-decision.sh so runs can be inspected with ./lib/archeflow-replay.sh timeline|whatif|compare <run_id>. Weighted what-if helps estimate how much each review archetype swayed the effective ship/block outcome. See skill af-replay.

Dry-Run Mode

When --dry-run: Run Plan phase only. Display workflow, agent counts, confidence scores, cost estimate. Ask user to proceed. If yes, continue with --start-from do.

Start-From Mode

Start from	Required artifacts
`plan`	None
`do`	`plan-creator.md`
`check`	`plan-creator.md`, `do-maker.md`, `do-maker-files.txt`
`act`	All `check-*.md` files

Validate required artifacts exist. Error if missing.

Error Handling

Agent timeout (5 min): Emit error event, abort run. Do not retry blindly.
Event emitter fails: Log warning, continue. Events are observation, not control flow.
Artifact write fails: Blocking — abort and report. Artifacts are required for phase handoff.
Merge conflict: Do not force-resolve. Report conflict, leave branch intact, let user decide.

Pipeline Strategy

Linear flow with no cycle-back. Use for bug fixes and well-understood single-concern tasks.

Plan (Creator only) -> Implement (Maker) -> Spec-Review (Guardian, then Skeptic if findings) -> Quality-Review (Sage) -> Verify (tests + merge)

Plan: Spawn Creator with Mini-Reflect (no Explorer). Save plan-creator.md.
Implement: Spawn Maker in worktree. Save do-maker.md.
Spec-Review: Guardian first. Skeptic only if Guardian has findings.
Quality-Review: Sage reviews proposal + diff + summary.
Verify: Run tests. If pass and 0 CRITICAL: merge. If CRITICAL: one targeted Maker fix, re-review, re-test. If still failing: abort, report branch name for manual resolution.

WARNINGs are logged but do not block merge.

━━━ ArcheFlow Pipeline: <task> ━━━━━━━━━━━━━━━━
Run ID: <run_id> | Strategy: pipeline

[Plan]      Creator designing...          -> done (20s)
[Implement] Maker building...             -> done (60s, 3 files)
[Spec]      Guardian reviewing...         -> APPROVED
[Quality]   Sage reviewing...             -> APPROVED (1 WARNING)
[Verify]    Tests passing...              -> merged to main

━━━ Complete: 2m 15s ━━━━━━━━━━━━━━━━━━━━━━━━━━

18 KiB Raw Permalink Blame History