Files
Christian Nennemann 1e96d87f49 feat: introduce Wiggum Break as named circuit breaker
Replaces generic "circuit breaker" with "Wiggum Break" — policy enforcement
halt condition named after Chief Wiggum (policy + Ralph Loop's dad).
Hard breaks (immediate halt) and soft breaks (finish then halt) with
wiggum.break event type. Updated both papers and shadow-detection skill.
2026-04-08 05:19:35 +02:00

7.2 KiB

name, description
name description
shadow-detection Corrective action framework for agent dysfunction, system health, and operational policy. Three layers — archetype shadows, system shadows, policy boundaries — one escalation protocol.

Corrective Action Framework

Detect dysfunction. Apply corrective action. Escalate if repeated.

Three layers, one protocol:

  • Archetype Shadows — individual agent dysfunction (virtue pushed too far)
  • System Shadows — orchestration-level dysfunction (process going wrong)
  • Policy Boundaries — operational limits (time, cost, quality thresholds)

Archetype Shadows

Archetype Shadow Detect (any) Corrective Action
Explorer Rabbit Hole Output >2000w without Recommendation; >3 tangents; >15 files no patterns; no synthesis in final 25% "Summarize top 3 findings and one recommendation in 300 words."
Creator Over-Architect >2 new abstractions for one feature; "future-proof" in rationale; scope exceeds task >50%; >1 new package "Design for the current order of magnitude. Remove abstractions for hypothetical requirements."
Maker Rogue Zero test files with >=3 files changed; single monolithic commit; files outside proposal; no test run evidence "Read the proposal. Write a test. Commit. Revert out-of-scope files."
Guardian Paranoid CRITICAL:WARNING ratio >2:1 (min 3); zero APPROVED in 3+ reviews; <50% findings include fix; findings require compromised systems "For each CRITICAL: would a senior engineer block a PR? If not, downgrade. Every rejection needs a specific fix."
Skeptic Paralytic >7 challenges; <50% include alternatives; same concern 2+ times reworded; >3 findings outside scope "Rank by impact. Keep top 3 with alternatives. Delete the rest."
Trickster False Alarm Findings in untouched code; >10 findings for <5 files; impossible scenarios; >3 without repro steps "Delete findings outside the diff. Rank by likelihood x impact. Keep top 3-5."
Sage Bureaucrat Review words >2x diff lines; findings outside changeset; >2 "consider" without action; suggesting docs for trivial functions "Limit to issues affecting maintainability in 6 months. Every finding needs a specific action."

Shadow Immunity

Intensity alone is not a shadow. Shadow = behavior disconnected from the goal.

  • Explorer reading 20 files in a monorepo with scattered deps -- not rabbit hole if each is relevant
  • Guardian blocking with 2 CRITICALs -- not paranoid if both are genuine vulnerabilities
  • Trickster finding 5 edge cases -- not false alarm if all are in changed code with repro steps

System Shadows

Orchestration-level dysfunction that isn't tied to one archetype.

Shadow Detect Corrective Action
Tunnel Vision All reviewers flag same category (e.g., 4 security findings, 0 quality/testing) "Redistribute attention. Are we missing quality, testing, or design concerns?"
Echo Chamber Unanimous approval in <30s on standard/thorough workflow "Suspicious fast consensus. Re-run Guardian with adversarial prompt."
Gold Plating Maker working on INFO fixes while CRITICALs remain open "Fix CRITICALs first. Park INFO items."
Analysis Paralysis Plan phase >2x longer than Do phase; Explorer spawned 3+ times "Stop researching. Ship a proposal with known gaps."
Cargo Cult Memory lesson injected but the same finding repeats anyway "Lesson ineffective. Reword, strengthen, or remove it."
Broken Window 3+ WARNINGs deferred across consecutive runs in the same project "Accumulated tech debt. Schedule a cleanup sprint."
Scope Creep Maker changes >2x files listed in proposal "Revert to proposal scope. If more files needed, update the proposal first."

Policy Boundaries

Operational limits that protect session quality, cost, and resumability.

Checkpoint Policy

Every 45 minutes or 3 completed tasks (whichever first):

  1. Commit + push all work in progress
  2. Write handoff summary to control-center.md
  3. Log token spend so far
  4. Compare output quality: last task vs first task
  5. If quality degrading -> STOP with clean state
  6. If budget >80% spent -> STOP with clean state
  7. Otherwise -> continue

Budget Gate

Threshold Action
50% budget spent Log warning, continue
80% budget spent Downgrade models (sonnet->haiku for reviewers)
95% budget spent Complete current task, then STOP
100% budget STOP immediately, commit WIP

Wiggum Break (Circuit Breaker)

Named after Chief Wiggum — policy enforcement AND the Ralph Loop's dad. When a Wiggum Break triggers, the system halts execution, saves state, and reports to the user. "Bake 'em away, toys."

Hard breaks (halt immediately, commit WIP):

Trigger Reason
3 consecutive agent failures/timeouts Infrastructure issue, not a code problem
3 consecutive task failures in sprint Something systemic is wrong
Same shadow detected 3+ times in one cycle Task needs to be broken down or re-scoped
Test suite broken after merge Auto-revert, then halt
2+ oscillating findings (present→absent→present) Fundamental tension in review criteria

Soft breaks (finish current task, then halt):

Signal Reason
Cycle N findings identical to cycle N-1 No progress — present best result
Convergence score <0.5 for 2 consecutive cycles "This needs a different approach"
Reviewer finding count increases cycle over cycle Implementation is diverging, not converging

When a Wiggum Break fires, emit a wiggum.break event with trigger, run state, and unresolved findings. The event log makes it easy to audit why a run was halted and whether the break was warranted.

Context Pollution

Signal Action
>15 memory lessons injected into one prompt Prune to top 5 by frequency
>20 findings tracked across cycles Summarize into top 5 themes
Agent prompt exceeds estimated 50% of context window Strip examples, keep rules only

Unified Escalation Protocol

All three layers use the same escalation:

Step Archetype Shadows System Shadows Policy Boundaries
1st Apply corrective action, let agent continue Apply corrective action, continue run Apply boundary action (downgrade, checkpoint)
2nd (same issue) Replace the agent -- shadow is entrenched Pause run, report to user Force stop with clean state
3rd (pattern) Escalate to user: "task needs re-scoping" Escalate to user: "systemic issue" Escalate to user: "resource limits reached"

Integration

Shadow checks run after each agent completes during orchestration. System shadow checks run at phase boundaries. Policy checks run on a timer and at task boundaries.

The run skill references this framework at:

  • Step 3 (Check phase): archetype shadow monitoring
  • Step 4 (Act phase): convergence/diminishing returns
  • Step 5 (Completion): effectiveness scoring
  • Sprint skill: checkpoint policy between batches