refactor: simplify memory and shadow-detection skills

Trim verbose implementation details that duplicate what the bash helper scripts already handle.

- Memory skill: 278 -> 120 lines
- Shadow detection skill: 180 -> 66 lines

All essential protocols, tables, and commands preserved; removed redundant algorithm descriptions, multiple examples, and narrative prose.
2026-04-06 20:42:47 +02:00
parent 55a6ba14c9
commit 1baaa79946
2 changed files with 83 additions and 354 deletions
--- a/skills/shadow-detection/SKILL.md
+++ b/skills/shadow-detection/SKILL.md
@@ -5,176 +5,62 @@ description: Use when monitoring agent behavior for dysfunction, when an agent s

 # Shadow Detection

-Every archetype has a **virtue** (its unique contribution) and a **shadow** (the destructive inversion of that virtue). A shadow activates when the virtue is pushed too far.
+Every archetype has a virtue and a shadow (its destructive inversion). A shadow activates when the virtue is pushed too far.

-```
-Virtue (healthy)              → pushed too far →  Shadow (dysfunction)
-
-Contextual Clarity            → can't stop      → Rabbit Hole
-Decisive Framing              → over-builds      → Over-Architect
-Execution Discipline          → no guardrails    → Rogue
-Threat Intuition              → sees threats only → Paranoid
-Assumption Surfacing          → questions only    → Paralytic
-Adversarial Creativity        → noise over signal → False Alarm
-Maintainability Judgment      → reviews only      → Bureaucrat
-```
+| Archetype | Virtue | Shadow |
+|-----------|--------|--------|
+| Explorer | Contextual Clarity | Rabbit Hole |
+| Creator | Decisive Framing | Over-Architect |
+| Maker | Execution Discipline | Rogue |
+| Guardian | Threat Intuition | Paranoid |
+| Skeptic | Assumption Surfacing | Paralytic |
+| Trickster | Adversarial Creativity | False Alarm |
+| Sage | Maintainability Judgment | Bureaucrat |

 ---

-## Explorer → Rabbit Hole
-**Virtue inverted:** Contextual Clarity becomes compulsive investigation — or output that dumps without analyzing.
+### Explorer -> Rabbit Hole
+**Detect** (any): output >2000 words without a Recommendation section | >3 tangents | >15 files read with no patterns identified | no synthesis in final 25%
+**Correct**: "Summarize top 3 findings and one recommendation in under 300 words."

-**Symptoms:**
- Research output keeps growing but never synthesizes
- "I found one more thing to check" repeated 3+ times
- Reading more than 15 files without producing findings
- Output is a raw inventory of files with no analysis or recommendation
+### Creator -> Over-Architect
+**Detect** (any): >2 new abstractions for a single feature | "future-proof" in rationale | scope exceeds task by >50% | >1 new package for one feature
+**Correct**: "Design for the current order of magnitude. Remove abstractions that serve hypothetical requirements."

-**Detection Checklist** (trigger on ANY):
- [ ] Output >2000 words without a `### Recommendation` section
- [ ] >3 tangent topics not directly related to the original task
- [ ] >15 files read with no `### Patterns` identified
- [ ] No synthesis language (recommend, suggest, conclusion, finding, summary) in final 25% of output
+### Maker -> Rogue
+**Detect** (any): zero test files with >=3 files changed | single monolithic commit | diff contains files not in proposal | no evidence of running tests
+**Correct**: "Read the proposal. Write a test. Commit what you have. Revert changes to files not in the proposal."

-**Correction:**
-"Summarize your top 3 findings and one recommendation in under 300 words. If your output has no Recommendation section, add one. A dump is not research."
+### Guardian -> Paranoid
+**Detect** (any): CRITICAL:WARNING ratio >2:1 (min 3 findings) | zero APPROVED in 3+ reviews | <50% findings include a fix | findings require already-compromised systems
+**Correct**: "For each CRITICAL: would a senior engineer block a PR for this? If not, downgrade. Every rejection must include a specific fix."
+
+### Skeptic -> Paralytic
+**Detect** (any): >7 challenges in a single review | <50% include alternatives | same concern appears 2+ times reworded | >3 findings outside task scope
+**Correct**: "Rank challenges by impact. Keep top 3. Each must include a specific alternative. Delete the rest."
+
+### Trickster -> False Alarm
+**Detect** (any): findings reference code untouched by diff | >10 findings for <5 files | impossible deployment scenarios | >3 findings without repro steps
+**Correct**: "Delete findings outside the diff. Rank remaining by likelihood x impact. Keep top 3-5."
+
+### Sage -> Bureaucrat
+**Detect** (any): review words >2x diff lines | findings reference files not in changeset | >2 "consider" without concrete action | suggesting docs for <5-line functions
+**Correct**: "Limit to issues affecting maintainability in the next 6 months. Every finding must end with a specific action."

 ---

-## Creator → Over-Architect
-**Virtue inverted:** Decisive Framing becomes designing at the wrong scale.
+## Escalation Protocol

-**Symptoms:**
- Abstraction layers for one-time operations
- Future-proofing for requirements that don't exist
- Configuration systems for things that could be constants
- Proposal has more infrastructure than business logic
-
-**Detection Checklist** (trigger on ANY):
- [ ] >2 new abstractions (interfaces, base classes, factories, registries) for a single feature
- [ ] "In the future we might need..." or "future-proof" appears in rationale
- [ ] Proposal scope (files changed) exceeds original task scope by >50%
- [ ] More than 1 new package/module introduced for a single feature
-
-**Correction:**
-"Design for the current order of magnitude. If the app has 1000 users, design for 10,000 — not 10 million. Remove abstractions that serve hypothetical requirements."
-
---
-
-## Maker → Rogue
-**Virtue inverted:** Execution Discipline becomes reckless shipping — or expanding beyond the plan.
-
-**Symptoms:**
- Writing code before reading the proposal fully
- No tests, or tests written after implementation
- Large uncommitted working tree
- Files changed that aren't mentioned in the proposal
-
-**Detection Checklist** (trigger on ANY):
- [ ] Zero test files (`.test.`, `.spec.`, `_test.`) in the changeset with >=3 files changed
- [ ] Single monolithic commit instead of incremental commits
- [ ] Diff contains files not listed in the Creator's proposal `### Changes` section
- [ ] No evidence of running existing test suite before finishing
-
-**Correction:**
-"Read the proposal. Write a test. Commit what you have. Revert changes to files not in the proposal. Then continue."
-
---
-
-## Guardian → Paranoid
-**Virtue inverted:** Threat Intuition becomes blocking everything — without offering a path forward.
-
-**Symptoms:**
- Every finding marked CRITICAL
- Blocking on theoretical risks with < 1% probability
- Rejecting without suggesting how to fix
- Security concerns for internal-only code at external-API severity
-
-**Detection Checklist** (trigger on ANY):
- [ ] CRITICAL:WARNING ratio >2:1 (with minimum 3 total findings)
- [ ] Zero APPROVED verdicts in 3+ consecutive reviews
- [ ] <50% of findings include a suggested fix in the `Fix` column
- [ ] Findings reference attack scenarios that require already-compromised internal systems
-
-**Correction:**
-"For each CRITICAL finding, answer: Would a senior engineer block a PR for this? If not, downgrade. Every rejection must include a specific, implementable fix."
-
---
-
-## Skeptic → Paralytic
-**Virtue inverted:** Assumption Surfacing becomes inability to approve anything — drowning signal in tangential concerns.
-
-**Symptoms:**
- More than 7 challenges raised
- Challenges without suggested alternatives
- "What about X?" chains that drift from the task
- Restating the same concern in different words
-
-**Detection Checklist** (trigger on ANY):
- [ ] >7 findings/challenges raised in a single review
- [ ] <50% of findings include an alternative in the `Fix` column
- [ ] Same conceptual concern appears 2+ times with different wording
- [ ] >3 findings reference code or scenarios outside the task scope
-
-**Correction:**
-"Rank your challenges by impact. Keep the top 3. Each must include a specific alternative. Delete the rest."
-
---
-
-## Trickster → False Alarm
-**Virtue inverted:** Adversarial Creativity becomes noise — too many low-signal findings drowning the real issues.
-
-**Symptoms:**
- Testing code that wasn't changed
- Reporting non-bugs as bugs (unrealistic test scenarios)
- 20 findings when 3 good ones would cover the real risks
- Edge cases for edge cases (diminishing returns)
-
-**Detection Checklist** (trigger on ANY):
- [ ] Any finding references code untouched by the Maker's diff
- [ ] >10 findings for a change touching <5 files
- [ ] Findings describe scenarios requiring conditions that can't occur in the deployment context
- [ ] >3 findings without reproduction steps
-
-**Correction:**
-"Quality over quantity. Delete findings outside the Maker's diff. Rank remaining by likelihood x impact. Keep top 3-5. Three real findings beat twenty noise."
-
---
-
-## Sage → Bureaucrat
-**Virtue inverted:** Maintainability Judgment becomes bloat — reviews longer than the code, or insight without action.
-
-**Symptoms:**
- Review longer than the code change itself
- Requesting documentation for self-evident code
- Suggesting refactors unrelated to the current task
- Deep-sounding analysis that doesn't end with a specific action
-
-**Detection Checklist** (trigger on ANY):
- [ ] Review word count >2x the code change's line count (rough: review words > diff lines x 2)
- [ ] Any finding references files not in the Maker's changeset
- [ ] >2 findings use "consider" or "think about" without a concrete action in the `Fix` column
- [ ] Suggesting documentation for functions with <5 lines or self-descriptive names
-
-**Correction:**
-"Limit your review to issues that affect maintainability in the next 6 months. Every finding must end with a specific action. If you can't state the consequence of NOT fixing it, don't raise it."
-
---
-
-## Shadow Escalation Protocol
-
-1. **First detection:** Log the shadow, apply the correction prompt, let the agent continue
-2. **Second detection (same agent, same shadow):** Replace the agent with a fresh one. The shadow is entrenched.
-3. **Shadow detected in 3+ agents in the same cycle:** The task itself may be poorly scoped. Escalate to the user: "Multiple agents are struggling — the task may need to be broken down."
+1. **1st detection:** Log the shadow, apply the correction prompt, let the agent continue
+2. **2nd detection (same agent, same shadow):** Replace the agent -- the shadow is entrenched
+3. **3+ agents shadowed in same cycle:** Escalate to user -- the task may need to be broken down

 ## Shadow Immunity

-Some behaviors LOOK like shadows but aren't:
+Some behaviors look like shadows but are not. **Rule of thumb:** shadow = behavior disconnected from the goal. Intensity alone is not a shadow.

- Explorer reading 20 files in a monorepo with scattered dependencies → **not a rabbit hole** if each file is genuinely relevant
- Creator adding an abstraction → **not over-architect** if the abstraction is genuinely needed by the current task
- Guardian blocking with 2 CRITICAL findings → **not paranoid** if both are genuine security vulnerabilities
- Trickster finding 5 edge cases → **not false alarm** if all are in the changed code with reproduction steps
- Sage writing a long review → **not bureaucrat** if the change is large and every finding is actionable
-
-**Rule of thumb:** Shadow = behavior disconnected from the goal. Intensity alone is not a shadow.
+- Explorer reading 20 files in a monorepo with scattered dependencies -- not a rabbit hole if each file is genuinely relevant
+- Creator adding an abstraction -- not over-architect if the current task genuinely needs it
+- Guardian blocking with 2 CRITICALs -- not paranoid if both are genuine security vulnerabilities
+- Trickster finding 5 edge cases -- not false alarm if all are in changed code with repro steps
+- Sage writing a long review -- not bureaucrat if the change is large and every finding is actionable
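
The three-step Escalation Protocol kept by this change could be tracked roughly as follows. This is a minimal sketch and not part of the commit: `ShadowTracker`, `Action`, and the agent IDs are hypothetical names, and checking the "3+ agents" rule before the "2nd detection" rule is an assumption, since the skill does not say which takes precedence when both apply.

```python
from collections import defaultdict
from enum import Enum


class Action(Enum):
    CORRECT = "log the shadow and apply the correction prompt"
    REPLACE = "replace the agent; the shadow is entrenched"
    ESCALATE = "escalate to the user; the task may need to be broken down"


class ShadowTracker:
    """Counts shadow detections within one cycle (hypothetical helper)."""

    def __init__(self) -> None:
        # (agent_id, shadow) -> number of detections in this cycle
        self.counts: dict[tuple[str, str], int] = defaultdict(int)

    def record(self, agent_id: str, shadow: str) -> Action:
        self.counts[(agent_id, shadow)] += 1
        shadowed_agents = {agent for agent, _ in self.counts}

        # Rule 3: three or more distinct agents shadowed in the same cycle.
        # Assumption: this rule wins over rule 2 when both apply.
        if len(shadowed_agents) >= 3:
            return Action.ESCALATE
        # Rule 2: same agent, same shadow, detected a second time.
        if self.counts[(agent_id, shadow)] >= 2:
            return Action.REPLACE
        # Rule 1: first detection for this agent and shadow.
        return Action.CORRECT


if __name__ == "__main__":
    tracker = ShadowTracker()
    for agent, shadow in [
        ("explorer-1", "Rabbit Hole"),
        ("explorer-1", "Rabbit Hole"),
        ("maker-1", "Rogue"),
        ("sage-1", "Bureaucrat"),
    ]:
        print(agent, shadow, "->", tracker.record(agent, shadow).name)
```

The sketch assumes a replacement agent gets a fresh ID, so its detection count starts again from zero, and that a new tracker is created for each cycle.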