Files
claude-archeflow-plugin/skills/shadow-detection/SKILL.md
Christian Nennemann a6fa708f8b feat: ArcheFlow — multi-agent orchestration plugin for Claude Code
Zero-dependency Claude Code plugin using Jungian archetypes as
behavioral protocols for multi-agent orchestration.

- 7 archetypes (Explorer, Creator, Maker, Guardian, Skeptic, Trickster, Sage)
- ArcheHelix: rising PDCA quality spiral with feedback loops
- Shadow detection: automatic dysfunction recognition and correction
- 3 built-in workflows (fast, standard, thorough)
- Autonomous mode: unattended overnight sessions with full visibility
- Custom archetypes and workflows via markdown/YAML
- SessionStart hook for automatic bootstrap
- Examples for feature implementation and security review
2026-04-02 16:37:44 +00:00

6.9 KiB

name, description
name description
shadow-detection Use when monitoring agent behavior for dysfunction, when an agent seems stuck, or when orchestration quality is degrading. Detects and corrects Jungian shadow activation in archetypes.

Shadow Detection — The Dark Side of Strength

Every archetype has a shadow: the destructive inversion of its core strength. A shadow activates when an archetype's behavior becomes extreme, rigid, or disconnected from the team's goal.

Shadows are not bugs — they're features operating outside their healthy range. Detection and correction are part of the orchestration, not a failure.

The Seven Shadows

Explorer → The Rabbit Hole

Strength inverted: Curiosity becomes compulsive investigation.

Symptoms:

  • Research output keeps growing but never synthesizes
  • "I found one more thing to check" repeated 3+ times
  • Reading more than 15 files without producing findings
  • Output is a raw list of files/functions with no analysis or recommendation
  • Research time exceeds implementation estimate

Triggers:

  • Output length > 2000 words without a recommendation section
  • More than 3 "see also" or "related" tangents
  • No confidence score or decisive recommendation

Correction: Stop the Explorer. Require immediate synthesis: "Summarize your top 3 findings and one recommendation in under 300 words. Everything else is noise."


Creator → The Perfectionist

Strength inverted: Design excellence becomes endless refinement.

Symptoms:

  • Proposal revised 3+ times without new information driving the revision
  • Adding "nice to have" features not in the original task
  • Confidence score keeps dropping instead of stabilizing
  • Scope expanding with each revision
  • "What about..." additions that weren't in Explorer's findings

Triggers:

  • Revision count > 2 without external feedback
  • Proposal scope exceeds original task by > 50%
  • Confidence drops below 0.5

Correction: Freeze the proposal. "Ship at current state. Imperfect plans that ship beat perfect plans that don't. Note remaining concerns under 'Risks' and let the Check phase catch them."


Maker → The Cowboy

Strength inverted: Bias for action becomes reckless shipping.

Symptoms:

  • Writing code before reading the proposal fully
  • No tests, or tests written after implementation (not TDD)
  • Large uncommitted working tree ("I'll commit when it's done")
  • "Improving" code outside the proposal's scope
  • Ignoring existing patterns in favor of "better" approaches

Triggers:

  • No test files in the changeset
  • Single monolithic commit instead of incremental commits
  • Files changed that aren't mentioned in the proposal
  • No commit for > 50% of the implementation work

Correction: Halt implementation. "Read the proposal. Write a test. Commit what you have. Then continue."


Guardian → The Paranoid

Strength inverted: Risk awareness becomes blocking everything.

Symptoms:

  • Every finding marked CRITICAL
  • Blocking on theoretical risks with < 1% probability
  • Rejected 3+ proposals without offering a viable path forward
  • Security concerns for internal-only code at external-API severity
  • Requiring mitigations that cost more than the risk they address

Triggers:

  • CRITICAL:WARNING ratio > 2:1
  • Zero APPROVED verdicts in 3+ consecutive reviews
  • Findings reference threat models inappropriate to the context
  • No suggested fixes, only rejections

Correction: Recalibrate. "For each CRITICAL finding, answer: Would a senior engineer at a well-run company block a PR for this? If not, downgrade to WARNING. Provide a fix suggestion for every finding you keep as CRITICAL."


Skeptic → The Paralytic

Strength inverted: Critical thinking becomes inability to approve anything.

Symptoms:

  • More than 7 challenges raised
  • Challenges without suggested alternatives
  • Questioning requirements that are outside the task scope
  • "What if" chains more than 2 levels deep
  • Restating the same concern in different words

Triggers:

  • Challenge count > 7
  • Less than 50% of challenges include alternatives
  • Challenges reference concerns outside the task scope
  • Same conceptual concern raised multiple times

Correction: Force-rank. "Rank your challenges by impact. Keep the top 3. Each must include a specific alternative. Delete the rest."


Trickster → The Saboteur

Strength inverted: Adversarial testing becomes destructive chaos.

Symptoms:

  • Modifying code instead of testing it
  • "Testing" by breaking things outside the scope of changes
  • Finding bugs in unrelated subsystems and claiming the change caused them
  • Attacks with no constructive reporting (just "it's broken")
  • Enjoying destruction more than improving quality

Triggers:

  • Agent modifies files that aren't in the Maker's changeset
  • Findings reference code untouched by the implementation
  • No reproduction steps in findings
  • Tone shifts from analytical to gleeful

Correction: Scope enforcement. "You test the CHANGES, not the entire system. Limit attacks to files in the Maker's diff. Every finding must include exact reproduction steps."


Sage → The Bureaucrat

Strength inverted: Holistic judgment becomes documentation bloat.

Symptoms:

  • Review longer than the code change itself
  • Requesting documentation for self-evident code
  • Suggesting refactors unrelated to the current task
  • Adding "while we're here" improvement suggestions
  • Philosophical commentary that doesn't lead to actionable findings

Triggers:

  • Review word count > 2x the code change's word count
  • More than 30% of findings are INFO severity
  • Suggestions reference files not in the changeset
  • "Consider" or "think about" without specific recommendation

Correction: Focus. "Limit your review to issues that affect maintainability in the next 6 months. For each finding, state the specific consequence of NOT fixing it. If you can't, it's not worth raising."


Shadow Escalation Protocol

  1. First detection: Log the shadow, apply the correction prompt, let the agent continue
  2. Second detection (same agent, same shadow): Replace the agent with a fresh one. The shadow is entrenched.
  3. Shadow detected in 3+ agents in the same cycle: The task itself may be poorly scoped. Escalate to the user: "Multiple agents are struggling — the task may need to be broken down."

Shadow Immunity

Some behaviors LOOK like shadows but aren't:

  • Explorer reading 20 files in a monorepo with scattered dependencies → not a rabbit hole if each file is genuinely relevant
  • Creator at confidence 0.4 → not perfectionism if the task is genuinely ambiguous (flag to user instead)
  • Guardian blocking with 2 CRITICAL findings → not paranoia if both are genuine security vulnerabilities
  • Trickster finding 5 edge cases → not sabotage if all are in the changed code with reproduction steps

Rule of thumb: Shadow = behavior disconnected from the goal. Intensity alone is not a shadow.