Files

Christian Nennemann 130c04fa58 feat: corrective action framework + CLAUDE.md rewrite + v0.8.0 cleanup

- Extend shadow-detection with 3-layer corrective action framework:
  archetype shadows, system shadows (tunnel vision, echo chamber, etc.),
  and policy boundaries (checkpoints, budget gates, circuit breakers)
- Rewrite CLAUDE.md with proper guardrails (DO/DO NOT, skill writing rules,
  200-line max per skill, no bash pseudo-code in skills)
- Update plugin.json to v0.8.0 with consolidated 19-skill list
- Update README architecture tree and skills reference
- Update using-archeflow version string to v0.8.0 / 19 skills
- Remove 8 empty skill directories (absorbed into run skill)

2026-04-06 20:52:27 +02:00

6.6 KiB

Raw Permalink Blame History

name, description

name	description
shadow-detection	Corrective action framework for agent dysfunction, system health, and operational policy. Three layers — archetype shadows, system shadows, policy boundaries — one escalation protocol.

Corrective Action Framework

Detect dysfunction. Apply corrective action. Escalate if repeated.

Three layers, one protocol:

Archetype Shadows — individual agent dysfunction (virtue pushed too far)
System Shadows — orchestration-level dysfunction (process going wrong)
Policy Boundaries — operational limits (time, cost, quality thresholds)

Archetype Shadows

Archetype	Shadow	Detect (any)	Corrective Action
Explorer	Rabbit Hole	Output >2000w without Recommendation; >3 tangents; >15 files no patterns; no synthesis in final 25%	"Summarize top 3 findings and one recommendation in 300 words."
Creator	Over-Architect	>2 new abstractions for one feature; "future-proof" in rationale; scope exceeds task >50%; >1 new package	"Design for the current order of magnitude. Remove abstractions for hypothetical requirements."
Maker	Rogue	Zero test files with >=3 files changed; single monolithic commit; files outside proposal; no test run evidence	"Read the proposal. Write a test. Commit. Revert out-of-scope files."
Guardian	Paranoid	CRITICAL:WARNING ratio >2:1 (min 3); zero APPROVED in 3+ reviews; <50% findings include fix; findings require compromised systems	"For each CRITICAL: would a senior engineer block a PR? If not, downgrade. Every rejection needs a specific fix."
Skeptic	Paralytic	>7 challenges; <50% include alternatives; same concern 2+ times reworded; >3 findings outside scope	"Rank by impact. Keep top 3 with alternatives. Delete the rest."
Trickster	False Alarm	Findings in untouched code; >10 findings for <5 files; impossible scenarios; >3 without repro steps	"Delete findings outside the diff. Rank by likelihood x impact. Keep top 3-5."
Sage	Bureaucrat	Review words >2x diff lines; findings outside changeset; >2 "consider" without action; suggesting docs for trivial functions	"Limit to issues affecting maintainability in 6 months. Every finding needs a specific action."

Shadow Immunity

Intensity alone is not a shadow. Shadow = behavior disconnected from the goal.

Explorer reading 20 files in a monorepo with scattered deps -- not rabbit hole if each is relevant
Guardian blocking with 2 CRITICALs -- not paranoid if both are genuine vulnerabilities
Trickster finding 5 edge cases -- not false alarm if all are in changed code with repro steps

System Shadows

Orchestration-level dysfunction that isn't tied to one archetype.

Shadow	Detect	Corrective Action
Tunnel Vision	All reviewers flag same category (e.g., 4 security findings, 0 quality/testing)	"Redistribute attention. Are we missing quality, testing, or design concerns?"
Echo Chamber	Unanimous approval in <30s on standard/thorough workflow	"Suspicious fast consensus. Re-run Guardian with adversarial prompt."
Gold Plating	Maker working on INFO fixes while CRITICALs remain open	"Fix CRITICALs first. Park INFO items."
Analysis Paralysis	Plan phase >2x longer than Do phase; Explorer spawned 3+ times	"Stop researching. Ship a proposal with known gaps."
Cargo Cult	Memory lesson injected but the same finding repeats anyway	"Lesson ineffective. Reword, strengthen, or remove it."
Broken Window	3+ WARNINGs deferred across consecutive runs in the same project	"Accumulated tech debt. Schedule a cleanup sprint."
Scope Creep	Maker changes >2x files listed in proposal	"Revert to proposal scope. If more files needed, update the proposal first."

Policy Boundaries

Operational limits that protect session quality, cost, and resumability.

Checkpoint Policy

Every 45 minutes or 3 completed tasks (whichever first):

Commit + push all work in progress
Write handoff summary to control-center.md
Log token spend so far
Compare output quality: last task vs first task
If quality degrading -> STOP with clean state
If budget >80% spent -> STOP with clean state
Otherwise -> continue

Budget Gate

Threshold	Action
50% budget spent	Log warning, continue
80% budget spent	Downgrade models (sonnet->haiku for reviewers)
95% budget spent	Complete current task, then STOP
100% budget	STOP immediately, commit WIP

Circuit Breaker

Trigger	Action
3 consecutive agent failures/timeouts	STOP. Infrastructure issue, not a code problem.
3 consecutive task failures in sprint	STOP. Something systemic is wrong.
Same shadow detected 3+ times in one cycle	STOP. Task needs to be broken down or re-scoped.
Test suite broken after merge	Auto-revert, STOP, report.

Diminishing Returns

Signal	Action
Cycle N findings identical to cycle N-1	STOP cycling. Present best result.
Convergence score <0.5 for 2 consecutive cycles	STOP. "This needs a different approach."
Reviewer finding count increases cycle over cycle	STOP. Implementation is diverging, not converging.

Context Pollution

Signal	Action
>15 memory lessons injected into one prompt	Prune to top 5 by frequency
>20 findings tracked across cycles	Summarize into top 5 themes
Agent prompt exceeds estimated 50% of context window	Strip examples, keep rules only

Unified Escalation Protocol

All three layers use the same escalation:

Step	Archetype Shadows	System Shadows	Policy Boundaries
1st	Apply corrective action, let agent continue	Apply corrective action, continue run	Apply boundary action (downgrade, checkpoint)
2nd (same issue)	Replace the agent -- shadow is entrenched	Pause run, report to user	Force stop with clean state
3rd (pattern)	Escalate to user: "task needs re-scoping"	Escalate to user: "systemic issue"	Escalate to user: "resource limits reached"

Integration

Shadow checks run after each agent completes during orchestration. System shadow checks run at phase boundaries. Policy checks run on a timer and at task boundaries.

The run skill references this framework at:

Step 3 (Check phase): archetype shadow monitoring
Step 4 (Act phase): convergence/diminishing returns
Step 5 (Completion): effectiveness scoring
Sprint skill: checkpoint policy between batches

6.6 KiB Raw Permalink Blame History