Replaces generic "circuit breaker" with "Wiggum Break" — policy enforcement halt condition named after Chief Wiggum (policy + Ralph Loop's dad). Hard breaks (immediate halt) and soft breaks (finish then halt) with wiggum.break event type. Updated both papers and shadow-detection skill.
140 lines
7.2 KiB
Markdown
140 lines
7.2 KiB
Markdown
---
|
|
name: shadow-detection
|
|
description: |
|
|
Corrective action framework for agent dysfunction, system health, and operational policy.
|
|
Three layers — archetype shadows, system shadows, policy boundaries — one escalation protocol.
|
|
---
|
|
|
|
# Corrective Action Framework
|
|
|
|
Detect dysfunction. Apply corrective action. Escalate if repeated.
|
|
|
|
Three layers, one protocol:
|
|
- **Archetype Shadows** — individual agent dysfunction (virtue pushed too far)
|
|
- **System Shadows** — orchestration-level dysfunction (process going wrong)
|
|
- **Policy Boundaries** — operational limits (time, cost, quality thresholds)
|
|
|
|
---
|
|
|
|
## Archetype Shadows
|
|
|
|
| Archetype | Shadow | Detect (any) | Corrective Action |
|
|
|-----------|--------|-------------|-------------------|
|
|
| Explorer | Rabbit Hole | Output >2000w without Recommendation; >3 tangents; >15 files no patterns; no synthesis in final 25% | "Summarize top 3 findings and one recommendation in 300 words." |
|
|
| Creator | Over-Architect | >2 new abstractions for one feature; "future-proof" in rationale; scope exceeds task >50%; >1 new package | "Design for the current order of magnitude. Remove abstractions for hypothetical requirements." |
|
|
| Maker | Rogue | Zero test files with >=3 files changed; single monolithic commit; files outside proposal; no test run evidence | "Read the proposal. Write a test. Commit. Revert out-of-scope files." |
|
|
| Guardian | Paranoid | CRITICAL:WARNING ratio >2:1 (min 3); zero APPROVED in 3+ reviews; <50% findings include fix; findings require compromised systems | "For each CRITICAL: would a senior engineer block a PR? If not, downgrade. Every rejection needs a specific fix." |
|
|
| Skeptic | Paralytic | >7 challenges; <50% include alternatives; same concern 2+ times reworded; >3 findings outside scope | "Rank by impact. Keep top 3 with alternatives. Delete the rest." |
|
|
| Trickster | False Alarm | Findings in untouched code; >10 findings for <5 files; impossible scenarios; >3 without repro steps | "Delete findings outside the diff. Rank by likelihood x impact. Keep top 3-5." |
|
|
| Sage | Bureaucrat | Review words >2x diff lines; findings outside changeset; >2 "consider" without action; suggesting docs for trivial functions | "Limit to issues affecting maintainability in 6 months. Every finding needs a specific action." |
|
|
|
|
### Shadow Immunity
|
|
|
|
Intensity alone is not a shadow. **Shadow = behavior disconnected from the goal.**
|
|
|
|
- Explorer reading 20 files in a monorepo with scattered deps -- not rabbit hole if each is relevant
|
|
- Guardian blocking with 2 CRITICALs -- not paranoid if both are genuine vulnerabilities
|
|
- Trickster finding 5 edge cases -- not false alarm if all are in changed code with repro steps
|
|
|
|
---
|
|
|
|
## System Shadows
|
|
|
|
Orchestration-level dysfunction that isn't tied to one archetype.
|
|
|
|
| Shadow | Detect | Corrective Action |
|
|
|--------|--------|-------------------|
|
|
| **Tunnel Vision** | All reviewers flag same category (e.g., 4 security findings, 0 quality/testing) | "Redistribute attention. Are we missing quality, testing, or design concerns?" |
|
|
| **Echo Chamber** | Unanimous approval in <30s on standard/thorough workflow | "Suspicious fast consensus. Re-run Guardian with adversarial prompt." |
|
|
| **Gold Plating** | Maker working on INFO fixes while CRITICALs remain open | "Fix CRITICALs first. Park INFO items." |
|
|
| **Analysis Paralysis** | Plan phase >2x longer than Do phase; Explorer spawned 3+ times | "Stop researching. Ship a proposal with known gaps." |
|
|
| **Cargo Cult** | Memory lesson injected but the same finding repeats anyway | "Lesson ineffective. Reword, strengthen, or remove it." |
|
|
| **Broken Window** | 3+ WARNINGs deferred across consecutive runs in the same project | "Accumulated tech debt. Schedule a cleanup sprint." |
|
|
| **Scope Creep** | Maker changes >2x files listed in proposal | "Revert to proposal scope. If more files needed, update the proposal first." |
|
|
|
|
---
|
|
|
|
## Policy Boundaries
|
|
|
|
Operational limits that protect session quality, cost, and resumability.
|
|
|
|
### Checkpoint Policy
|
|
|
|
Every **45 minutes** or **3 completed tasks** (whichever first):
|
|
|
|
1. Commit + push all work in progress
|
|
2. Write handoff summary to `control-center.md`
|
|
3. Log token spend so far
|
|
4. Compare output quality: last task vs first task
|
|
5. If quality degrading -> STOP with clean state
|
|
6. If budget >80% spent -> STOP with clean state
|
|
7. Otherwise -> continue
|
|
|
|
### Budget Gate
|
|
|
|
| Threshold | Action |
|
|
|-----------|--------|
|
|
| 50% budget spent | Log warning, continue |
|
|
| 80% budget spent | Downgrade models (sonnet->haiku for reviewers) |
|
|
| 95% budget spent | Complete current task, then STOP |
|
|
| 100% budget | STOP immediately, commit WIP |
|
|
|
|
### Wiggum Break (Circuit Breaker)
|
|
|
|
Named after Chief Wiggum — policy enforcement AND the Ralph Loop's dad.
|
|
When a Wiggum Break triggers, the system halts execution, saves state, and
|
|
reports to the user. "Bake 'em away, toys."
|
|
|
|
**Hard breaks** (halt immediately, commit WIP):
|
|
|
|
| Trigger | Reason |
|
|
|---------|--------|
|
|
| 3 consecutive agent failures/timeouts | Infrastructure issue, not a code problem |
|
|
| 3 consecutive task failures in sprint | Something systemic is wrong |
|
|
| Same shadow detected 3+ times in one cycle | Task needs to be broken down or re-scoped |
|
|
| Test suite broken after merge | Auto-revert, then halt |
|
|
| 2+ oscillating findings (present→absent→present) | Fundamental tension in review criteria |
|
|
|
|
**Soft breaks** (finish current task, then halt):
|
|
|
|
| Signal | Reason |
|
|
|--------|--------|
|
|
| Cycle N findings identical to cycle N-1 | No progress — present best result |
|
|
| Convergence score <0.5 for 2 consecutive cycles | "This needs a different approach" |
|
|
| Reviewer finding count increases cycle over cycle | Implementation is diverging, not converging |
|
|
|
|
When a Wiggum Break fires, emit a `wiggum.break` event with trigger, run state, and unresolved findings.
|
|
The event log makes it easy to audit why a run was halted and whether the break was warranted.
|
|
|
|
### Context Pollution
|
|
|
|
| Signal | Action |
|
|
|--------|--------|
|
|
| >15 memory lessons injected into one prompt | Prune to top 5 by frequency |
|
|
| >20 findings tracked across cycles | Summarize into top 5 themes |
|
|
| Agent prompt exceeds estimated 50% of context window | Strip examples, keep rules only |
|
|
|
|
---
|
|
|
|
## Unified Escalation Protocol
|
|
|
|
All three layers use the same escalation:
|
|
|
|
| Step | Archetype Shadows | System Shadows | Policy Boundaries |
|
|
|------|-------------------|----------------|-------------------|
|
|
| **1st** | Apply corrective action, let agent continue | Apply corrective action, continue run | Apply boundary action (downgrade, checkpoint) |
|
|
| **2nd** (same issue) | Replace the agent -- shadow is entrenched | Pause run, report to user | Force stop with clean state |
|
|
| **3rd** (pattern) | Escalate to user: "task needs re-scoping" | Escalate to user: "systemic issue" | Escalate to user: "resource limits reached" |
|
|
|
|
---
|
|
|
|
## Integration
|
|
|
|
Shadow checks run **after each agent completes** during orchestration. System shadow checks run **at phase boundaries**. Policy checks run **on a timer and at task boundaries**.
|
|
|
|
The `run` skill references this framework at:
|
|
- Step 3 (Check phase): archetype shadow monitoring
|
|
- Step 4 (Act phase): convergence/diminishing returns
|
|
- Step 5 (Completion): effectiveness scoring
|
|
- Sprint skill: checkpoint policy between batches
|