Files
Christian Nennemann 130c04fa58 feat: corrective action framework + CLAUDE.md rewrite + v0.8.0 cleanup
- Extend shadow-detection with 3-layer corrective action framework:
  archetype shadows, system shadows (tunnel vision, echo chamber, etc.),
  and policy boundaries (checkpoints, budget gates, circuit breakers)
- Rewrite CLAUDE.md with proper guardrails (DO/DO NOT, skill writing rules,
  200-line max per skill, no bash pseudo-code in skills)
- Update plugin.json to v0.8.0 with consolidated 19-skill list
- Update README architecture tree and skills reference
- Update using-archeflow version string to v0.8.0 / 19 skills
- Remove 8 empty skill directories (absorbed into run skill)
2026-04-06 20:52:27 +02:00

6.6 KiB

name, description
name description
shadow-detection Corrective action framework for agent dysfunction, system health, and operational policy. Three layers — archetype shadows, system shadows, policy boundaries — one escalation protocol.

Corrective Action Framework

Detect dysfunction. Apply corrective action. Escalate if repeated.

Three layers, one protocol:

  • Archetype Shadows — individual agent dysfunction (virtue pushed too far)
  • System Shadows — orchestration-level dysfunction (process going wrong)
  • Policy Boundaries — operational limits (time, cost, quality thresholds)

Archetype Shadows

Archetype Shadow Detect (any) Corrective Action
Explorer Rabbit Hole Output >2000w without Recommendation; >3 tangents; >15 files no patterns; no synthesis in final 25% "Summarize top 3 findings and one recommendation in 300 words."
Creator Over-Architect >2 new abstractions for one feature; "future-proof" in rationale; scope exceeds task >50%; >1 new package "Design for the current order of magnitude. Remove abstractions for hypothetical requirements."
Maker Rogue Zero test files with >=3 files changed; single monolithic commit; files outside proposal; no test run evidence "Read the proposal. Write a test. Commit. Revert out-of-scope files."
Guardian Paranoid CRITICAL:WARNING ratio >2:1 (min 3); zero APPROVED in 3+ reviews; <50% findings include fix; findings require compromised systems "For each CRITICAL: would a senior engineer block a PR? If not, downgrade. Every rejection needs a specific fix."
Skeptic Paralytic >7 challenges; <50% include alternatives; same concern 2+ times reworded; >3 findings outside scope "Rank by impact. Keep top 3 with alternatives. Delete the rest."
Trickster False Alarm Findings in untouched code; >10 findings for <5 files; impossible scenarios; >3 without repro steps "Delete findings outside the diff. Rank by likelihood x impact. Keep top 3-5."
Sage Bureaucrat Review words >2x diff lines; findings outside changeset; >2 "consider" without action; suggesting docs for trivial functions "Limit to issues affecting maintainability in 6 months. Every finding needs a specific action."

Shadow Immunity

Intensity alone is not a shadow. Shadow = behavior disconnected from the goal.

  • Explorer reading 20 files in a monorepo with scattered deps -- not rabbit hole if each is relevant
  • Guardian blocking with 2 CRITICALs -- not paranoid if both are genuine vulnerabilities
  • Trickster finding 5 edge cases -- not false alarm if all are in changed code with repro steps

System Shadows

Orchestration-level dysfunction that isn't tied to one archetype.

Shadow Detect Corrective Action
Tunnel Vision All reviewers flag same category (e.g., 4 security findings, 0 quality/testing) "Redistribute attention. Are we missing quality, testing, or design concerns?"
Echo Chamber Unanimous approval in <30s on standard/thorough workflow "Suspicious fast consensus. Re-run Guardian with adversarial prompt."
Gold Plating Maker working on INFO fixes while CRITICALs remain open "Fix CRITICALs first. Park INFO items."
Analysis Paralysis Plan phase >2x longer than Do phase; Explorer spawned 3+ times "Stop researching. Ship a proposal with known gaps."
Cargo Cult Memory lesson injected but the same finding repeats anyway "Lesson ineffective. Reword, strengthen, or remove it."
Broken Window 3+ WARNINGs deferred across consecutive runs in the same project "Accumulated tech debt. Schedule a cleanup sprint."
Scope Creep Maker changes >2x files listed in proposal "Revert to proposal scope. If more files needed, update the proposal first."

Policy Boundaries

Operational limits that protect session quality, cost, and resumability.

Checkpoint Policy

Every 45 minutes or 3 completed tasks (whichever first):

  1. Commit + push all work in progress
  2. Write handoff summary to control-center.md
  3. Log token spend so far
  4. Compare output quality: last task vs first task
  5. If quality degrading -> STOP with clean state
  6. If budget >80% spent -> STOP with clean state
  7. Otherwise -> continue

Budget Gate

Threshold Action
50% budget spent Log warning, continue
80% budget spent Downgrade models (sonnet->haiku for reviewers)
95% budget spent Complete current task, then STOP
100% budget STOP immediately, commit WIP

Circuit Breaker

Trigger Action
3 consecutive agent failures/timeouts STOP. Infrastructure issue, not a code problem.
3 consecutive task failures in sprint STOP. Something systemic is wrong.
Same shadow detected 3+ times in one cycle STOP. Task needs to be broken down or re-scoped.
Test suite broken after merge Auto-revert, STOP, report.

Diminishing Returns

Signal Action
Cycle N findings identical to cycle N-1 STOP cycling. Present best result.
Convergence score <0.5 for 2 consecutive cycles STOP. "This needs a different approach."
Reviewer finding count increases cycle over cycle STOP. Implementation is diverging, not converging.

Context Pollution

Signal Action
>15 memory lessons injected into one prompt Prune to top 5 by frequency
>20 findings tracked across cycles Summarize into top 5 themes
Agent prompt exceeds estimated 50% of context window Strip examples, keep rules only

Unified Escalation Protocol

All three layers use the same escalation:

Step Archetype Shadows System Shadows Policy Boundaries
1st Apply corrective action, let agent continue Apply corrective action, continue run Apply boundary action (downgrade, checkpoint)
2nd (same issue) Replace the agent -- shadow is entrenched Pause run, report to user Force stop with clean state
3rd (pattern) Escalate to user: "task needs re-scoping" Escalate to user: "systemic issue" Escalate to user: "resource limits reached"

Integration

Shadow checks run after each agent completes during orchestration. System shadow checks run at phase boundaries. Policy checks run on a timer and at task boundaries.

The run skill references this framework at:

  • Step 3 (Check phase): archetype shadow monitoring
  • Step 4 (Act phase): convergence/diminishing returns
  • Step 5 (Completion): effectiveness scoring
  • Sprint skill: checkpoint policy between batches