feat: corrective action framework + CLAUDE.md rewrite + v0.8.0 cleanup

- Extend shadow-detection with 3-layer corrective action framework: archetype shadows, system shadows (tunnel vision, echo chamber, etc.), and policy boundaries (checkpoints, budget gates, circuit breakers)
- Rewrite CLAUDE.md with proper guardrails (DO/DO NOT, skill writing rules, 200-line max per skill, no bash pseudo-code in skills)
- Update plugin.json to v0.8.0 with consolidated 19-skill list
- Update README architecture tree and skills reference
- Update using-archeflow version string to v0.8.0 / 19 skills
- Remove 8 empty skill directories (absorbed into run skill)
@@ -1,7 +1,7 @@
 {
   "name": "archeflow",
   "description": "Multi-agent orchestration with Jungian archetypes. PDCA quality cycles, shadow detection, git worktree isolation. Zero dependencies — works with any Claude Code session.",
-  "version": "0.7.0",
+  "version": "0.8.0",
   "author": {
     "name": "Chris Nennemann"
   },
@@ -14,12 +14,11 @@
     "shadow-detection", "workflows"
   ],
   "skills": [
-    "run", "orchestration", "plan-phase", "do-phase", "check-phase", "act-phase",
-    "shadow-detection", "convergence", "artifact-routing",
-    "process-log", "memory", "effectiveness", "progress",
-    "colette-bridge", "git-integration", "multi-project",
-    "custom-archetypes", "workflow-design", "domains", "cost-tracking",
-    "templates", "autonomous-mode", "using-archeflow", "presence"
+    "run", "sprint", "review", "check-phase", "act-phase",
+    "shadow-detection", "memory", "progress", "presence",
+    "colette-bridge", "git-integration", "multi-project", "cost-tracking",
+    "custom-archetypes", "workflow-design", "domains",
+    "templates", "autonomous-mode", "using-archeflow"
   ],
   "hooks": "hooks/hooks.json"
 }
168
CLAUDE.md
@@ -1,71 +1,119 @@
 # archeflow — Multi-Agent Orchestration Plugin for Claude Code

-Workspace-level orchestration: parallel agent teams across project portfolios, PDCA cycles with Jungian archetype roles, sprint runner, and post-implementation review. Installed as a Claude Code plugin.
-
-## Tech Stack
-
-- **Runtime:** Bash (lib scripts) + Claude Code skill system (Markdown skills)
-- **No build step, no dependencies** — pure bash + markdown
-- **Plugin format:** Claude Code plugin (skills/, hooks/, agents/, templates/)
-
-## Key Commands
-
-```bash
-# Use via Claude Code slash commands:
-/af-sprint              # Main mode: work the queue across projects
-/af-run <task>          # Deep orchestration with PDCA cycles
-/af-review              # Post-implementation security/quality review
-/af-status              # Current run status
-/af-init                # Initialize ArcheFlow in a project
-/af-score               # Archetype effectiveness scores
-/af-memory              # Cross-run lesson memory
-/af-report              # Full process report
-/af-fanout              # Colette book fanout via agents
-```
+PDCA quality cycles with Jungian archetype roles, corrective action framework, sprint runner, and post-implementation review. Zero dependencies — pure Bash + Markdown.

 ## Architecture

 ```
-skills/              Slash command implementations (one dir per skill)
-  sprint/            /af-sprint — queue-driven parallel agent runner
-  run/               /af-run — PDCA orchestration
-  review/            /af-review — Guardian-led code review
-  plan-phase/        PDCA Plan phase
-  do-phase/          PDCA Do phase
-  check-phase/       PDCA Check phase
-  act-phase/         PDCA Act phase
-  memory/            Cross-run lessons learned
-  cost-tracking/     Token/cost awareness
-  domains/           Domain detection (code, writing, research)
-  ...                ~25 skill directories
-hooks/
-  hooks.json         Hook definitions
-  session-start/     Auto-activation on session start
-agents/              Archetype agent definitions
-  explorer.md        Divergent thinking, research
-  creator.md         Design, architecture
-  maker.md           Implementation
-  guardian.md        Security, risk, quality gates
-  sage.md            Wisdom, patterns, trade-offs
-  skeptic.md         Devil's advocate
-  trickster.md       Edge cases, unconventional approaches
-lib/                 Bash helper scripts (git, DAG, events, progress, etc.)
-templates/bundles/   Pre-configured workflow bundles
-docs/                Roadmap, dogfood notes, test reports
+skills/              Slash commands and internal protocols (one SKILL.md per dir)
+  run/               /af-run — self-contained PDCA orchestration (core skill)
+  sprint/            /af-sprint — queue-driven parallel agent dispatch
+  review/            /af-review — Guardian-led code review
+  check-phase/       Shared reviewer protocol (used by run + review)
+  act-phase/         Finding collection, fix routing, exit decisions
+  shadow-detection/  Corrective action framework (archetype + system + policy)
+  memory/            Cross-run lessons learned
+  cost-tracking/     Token/cost awareness and budget enforcement
+  domains/           Domain detection (code, writing, research)
+  colette-bridge/    Writing context loader from colette.yaml
+  multi-project/     Cross-repo orchestration with dependency DAG
+  git-integration/   Per-phase commits, branch strategy, rollback
+  templates/         Workflow/team bundle gallery
+  autonomous-mode/   Unattended session protocol
+  using-archeflow/   Session-start activation (auto-loaded via hook)
+agents/              Archetype personality definitions (one .md per archetype)
+lib/                 Bash helper scripts (events, git, memory, progress, etc.)
+hooks/               Session-start hook (injects using-archeflow)
+templates/bundles/   Pre-configured workflow bundles
 ```

-## Domain Rules
+## Commands

-- Skills are Markdown files with frontmatter — follow existing skill format exactly
-- Agents are archetype personas — maintain their distinct voice and perspective
-- Dogfood observations go to `archeflow/.archeflow/memory/lessons.jsonl`
-- Cost tracking: prefer cheap models for bulk ops, expensive for creative/review
-- PDCA cycle order is mandatory: Plan -> Do -> Check -> Act
+| Command | Purpose |
+|---------|---------|
+| `/af-run <task>` | PDCA orchestration with full agent cycle |
+| `/af-sprint` | Work the queue across projects |
+| `/af-review` | Review existing code changes |
+| `/af-status` | Current/last run status |
+| `/af-init` | Initialize ArcheFlow in a project |
+| `/af-score` | Archetype effectiveness scores |
+| `/af-memory` | Cross-run lesson memory |
+| `/af-report` | Full process report |
+| `/af-fanout` | Colette book fanout via agents |

-## Do NOT
+## Core Concepts

-- Add runtime dependencies — this must stay zero-dependency
-- Change archetype personalities without updating all referencing skills
-- Skip the Check phase in PDCA cycles (quality gate)
-- Modify hooks.json format without testing plugin reload
-- Use ArcheFlow to orchestrate simple single-file tasks (overhead not justified)
+### PDCA Cycle
+```
+Plan (Explorer + Creator) -> Do (Maker in worktree) -> Check (Guardian first, then others) -> Act (fix, merge, or cycle)
+```
+
+### Archetypes
+Explorer (research), Creator (design), Maker (implement), Guardian (security), Skeptic (assumptions), Trickster (edge cases), Sage (quality). Each has a virtue and a shadow — see `shadow-detection` skill.
+
+### Corrective Action Framework
+Three layers, one escalation protocol:
+- **Archetype shadows** — individual agent dysfunction
+- **System shadows** — orchestration-level issues (echo chamber, tunnel vision, scope creep)
+- **Policy boundaries** — operational limits (checkpoints, budgets, circuit breakers)
+
+### Workflows
+| Risk Level | Workflow | Agents |
+|------------|----------|--------|
+| Low | `fast` | Creator -> Maker -> Guardian |
+| Medium | `standard` | Explorer + Creator -> Maker -> Guardian + Skeptic + Sage |
+| High | `thorough` | Explorer + Creator -> Maker -> All 4 reviewers |
+
+## Guardrails
+
+### DO
+
+- Keep skills self-contained. The `run` skill needs zero prerequisites — it was consolidated for a reason.
+- Write skills as operational instructions Claude can follow, not software specifications.
+- Use tables for reference data, numbered steps for protocols.
+- Emit events via `./lib/archeflow-event.sh` — but never let logging block orchestration.
+- Maintain the corrective action framework when adding new agent types.
+- Test skill changes by running `/af-run --dry-run` and verifying the flow.
+- Keep archetype personalities distinct — each agent definition in `agents/` has a specific voice.
+
+### DO NOT
+
+- **Add runtime dependencies.** This must stay zero-dependency (Bash + Markdown only).
+- **Bloat skills back up.** The consolidation from 27 to ~15 skills was intentional. Do not create new skills for internal implementation details — inline them.
+- **Write bash pseudo-code in skills.** Skills are Claude instructions, not shell scripts. Use one-liner commands or lib script references, not multi-line bash blocks.
+- **Duplicate protocol definitions.** Finding format lives in `check-phase`. Routing table lives in `act-phase`. Shadow detection lives in `shadow-detection`. One source of truth per concept.
+- **Skip the Check phase** in PDCA cycles. It's the quality gate.
+- **Change archetype personalities** without updating all referencing skills and agent definitions.
+- **Use ArcheFlow for trivial tasks.** Single-file fixes, config changes, questions — just do them directly.
+- **Let skills exceed ~200 lines.** If a skill is growing past this, it probably needs splitting or the content belongs in a lib script.
+
+### Skill Writing Rules
+
+1. **Frontmatter**: `name` (kebab-case), `description` (one-liner + `<example>` tags for user-invocable skills)
+2. **Structure**: Imperative voice. Lead with what to do, not why. Tables > prose. Steps > paragraphs.
+3. **Agent templates**: Keep Agent() spawn templates concise. Include only the prompt, subagent_type, and isolation mode.
+4. **Cross-references**: Use `archeflow:<skill-name>` backtick syntax to reference other skills. Avoid circular dependencies.
+5. **Bash commands**: One-liners only in skills. Multi-step logic belongs in `lib/` scripts.
+
+### Cost Awareness
+
+- Prefer cheap models (haiku) for analytical tasks (validation, diff scoring)
+- Use capable models (sonnet/opus) for creative tasks (writing, complex design)
+- Budget enforcement via `cost-tracking` skill and `.archeflow/config.yaml`
+- Track token spend per agent in events for post-run analysis
+
+### Git Rules
+
+- Signing: `git config gpg.format ssh`, key at `~/.ssh/id_ed25519_dev.pub`
+- Push: `GIT_SSH_COMMAND="ssh -i /home/c/.ssh/id_ed25519_dev -o IdentitiesOnly=yes" git push origin main`
+- Conventional commits: `feat:`, `fix:`, `chore:`, `docs:`, `refactor:`
+- No Co-Authored-By trailers
+- All work on worktree branches until explicitly merged
+- Merges use `--no-ff` (individually revertable)
+
+## Dogfooding
+
+When using ArcheFlow to develop ArcheFlow itself:
+- Log observations to `.archeflow/memory/lessons.jsonl`
+- Note friction points, shadow false positives, skill gaps
+- Test skill changes with `/af-run --dry-run` before committing
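
Reviewer note: the Workflows table added to CLAUDE.md in this hunk maps a risk level to a workflow name. The lookup can be sketched in Bash as below. This is only an illustration: `select_workflow` is a hypothetical helper that does not appear in the commit, and the lowercase risk labels it accepts are an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): map a risk level to the
# workflow name listed in the Workflows table of CLAUDE.md.
select_workflow() {
  case "$1" in
    low)    echo "fast" ;;      # Creator -> Maker -> Guardian
    medium) echo "standard" ;;  # adds Explorer plus Skeptic and Sage reviews
    high)   echo "thorough" ;;  # all 4 reviewers
    *)
      # Assumption: unknown levels are an error rather than a silent default.
      echo "unknown risk level: $1" >&2
      return 1
      ;;
  esac
}
```

Called as `select_workflow medium`, the function prints `standard`; any other label is rejected with a non-zero status so a caller can stop instead of guessing.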
87
README.md
@@ -146,60 +146,51 @@ Shadow detection is quantitative, not vibes. Explorer output exceeding 2000 word

 ## Skills Reference

-ArcheFlow ships with 24 skills organized by function.
+ArcheFlow ships with 19 skills organized by function. The `run` skill is self-contained -- no prerequisites needed.

 ### Core Orchestration

 | Skill | Description |
 |-------|-------------|
-| `archeflow:run` | Automated PDCA execution loop -- single-command orchestration with `--start-from`, `--dry-run`, and cycle-back |
-| `archeflow:orchestration` | Step-by-step PDCA execution guide for manual orchestration |
-| `archeflow:plan-phase` | Explorer and Creator output formats and protocols |
-| `archeflow:do-phase` | Maker implementation rules and worktree commit strategy |
-| `archeflow:check-phase` | Shared reviewer protocols and output format |
-| `archeflow:act-phase` | Post-Check decision logic: collect findings, route fixes, exit or cycle |
+| `archeflow:run` | Self-contained PDCA orchestration -- Plan/Do/Check/Act with adaptation rules, pipeline strategy, and cycle-back |
+| `archeflow:sprint` | Queue-driven parallel agent dispatch across projects (primary mode) |
+| `archeflow:review` | Guardian-led code review on diff/branch/commit range |
+| `archeflow:check-phase` | Shared reviewer protocol -- finding format, evidence requirements, attention filters |
+| `archeflow:act-phase` | Finding collection, fix routing, exit decisions |

 ### Quality and Safety

 | Skill | Description |
 |-------|-------------|
-| `archeflow:shadow-detection` | Quantitative dysfunction detection and automatic correction |
-| `archeflow:convergence` | Detects convergence, stalling, and oscillation in multi-cycle runs |
-| `archeflow:artifact-routing` | Inter-phase artifact protocol -- naming, storage, routing, archiving |
-
-### Process Intelligence
-
-| Skill | Description |
-|-------|-------------|
-| `archeflow:process-log` | Event-sourced JSONL logging with DAG parent relationships |
+| `archeflow:shadow-detection` | Corrective action framework -- archetype shadows, system shadows, policy boundaries |
 | `archeflow:memory` | Cross-run memory that learns recurring findings and injects lessons |
-| `archeflow:effectiveness` | Archetype scoring on signal-to-noise, fix rate, cost efficiency |
-| `archeflow:progress` | Live progress file watchable from a second terminal |

 ### Integration

 | Skill | Description |
 |-------|-------------|
 | `archeflow:colette-bridge` | Bridges ArcheFlow with the Colette writing platform |
-| `archeflow:git-integration` | Git-per-phase commits, branch-per-run, rollback to any phase boundary |
+| `archeflow:git-integration` | Per-phase commits, branch-per-run, rollback |
 | `archeflow:multi-project` | Cross-repo orchestration with dependency DAG and shared budget |
-| `archeflow:cost-tracking` | Budget enforcement, per-agent cost aggregation, model tier recommendations |

 ### Configuration

 | Skill | Description |
 |-------|-------------|
-| `archeflow:domains` | Domain adapters for writing, research, and non-code workflows |
 | `archeflow:custom-archetypes` | Create domain-specific roles (database reviewer, compliance auditor, etc.) |
-| `archeflow:workflow-design` | Design custom workflows with per-phase archetype assignment and exit conditions |
+| `archeflow:domains` | Domain adapters for writing, research, and other non-code workflows |
+| `archeflow:cost-tracking` | Budget enforcement, per-agent cost aggregation, model tier recommendations |
+| `archeflow:workflow-design` | Design custom workflows with per-phase archetype assignment |
 | `archeflow:templates` | Template gallery for sharing workflows, teams, and setup bundles |
-| `archeflow:autonomous-mode` | Unattended overnight sessions with progress logging and safe stopping |
+| `archeflow:autonomous-mode` | Unattended sessions with corrective action checkpoints |
+| `archeflow:progress` | Live progress file watchable from a second terminal |
+| `archeflow:presence` | User-facing output format -- show outcomes, not mechanics |

 ### Meta

 | Skill | Description |
 |-------|-------------|
-| `archeflow:using-archeflow` | Session-start skill -- activation criteria, workflow selection, quick reference |
+| `archeflow:using-archeflow` | Session-start activation -- decision tree, workflow selection, commands |

 ## Library Scripts

@@ -340,46 +331,28 @@ archetypes: [explorer, creator, maker, guardian, db-specialist]

 ```
 archeflow/
-├── .claude-plugin/plugin.json   # Plugin manifest (v0.5.0)
+├── .claude-plugin/plugin.json   # Plugin manifest
 ├── agents/                      # 7 archetype personas (behavioral protocols)
-│   ├── explorer.md              # Plan: research and context mapping
-│   ├── creator.md               # Plan: solution design and proposals
-│   ├── maker.md                 # Do: implementation in isolated worktree
-│   ├── guardian.md              # Check: security and reliability review
-│   ├── skeptic.md               # Check: assumption challenging
-│   ├── trickster.md             # Check: adversarial testing
-│   └── sage.md                  # Check: holistic quality review
-├── skills/                      # 24 behavioral skills
-│   ├── run/                     # Automated PDCA loop
-│   ├── orchestration/           # Manual PDCA execution guide
-│   ├── plan-phase/              # Plan protocols
-│   ├── do-phase/                # Do protocols
-│   ├── check-phase/             # Check protocols
-│   ├── act-phase/               # Act phase decision logic
-│   ├── shadow-detection/        # Dysfunction detection
-│   ├── convergence/             # Cycle convergence detection
-│   ├── artifact-routing/        # Inter-phase artifact protocol
-│   ├── process-log/             # Event-sourced JSONL logging
+│   ├── explorer.md, creator.md  # Plan phase agents
+│   ├── maker.md                 # Do phase agent
+│   └── guardian.md, skeptic.md, # Check phase agents
+│       trickster.md, sage.md
+├── skills/                      # 19 skills (consolidated from 27)
+│   ├── run/                     # Self-contained PDCA orchestration (core)
+│   ├── sprint/                  # Queue-driven parallel agent dispatch
+│   ├── review/                  # Guardian-led code review
+│   ├── check-phase/             # Shared reviewer protocol + attention filters
+│   ├── act-phase/               # Finding collection + fix routing
+│   ├── shadow-detection/        # Corrective action framework (3 layers)
 │   ├── memory/                  # Cross-run learning
-│   ├── effectiveness/           # Archetype scoring
-│   ├── progress/                # Live progress file
-│   ├── colette-bridge/          # Colette writing platform bridge
-│   ├── git-integration/         # Per-phase git commits
-│   ├── multi-project/           # Cross-repo orchestration
-│   ├── custom-archetypes/       # Domain-specific roles
-│   ├── workflow-design/         # Custom workflow design
-│   ├── domains/                 # Domain adapters
-│   ├── cost-tracking/           # Budget and cost management
-│   ├── templates/               # Template gallery
-│   ├── autonomous-mode/         # Unattended sessions
-│   └── using-archeflow/         # Session-start activation
-├── lib/                         # 8 shell scripts (process infrastructure)
+│   └── ...                      # + 12 config/integration skills
+├── lib/                         # 10 shell scripts (events, git, memory, etc.)
 ├── hooks/                       # Auto-activation (SessionStart)
 ├── examples/                    # Walkthroughs, templates, custom archetypes
 └── docs/                        # Roadmap, changelog
 ```

-The flow: skills define behavioral rules (what agents should do), agents define personas (how they think), lib scripts handle tooling (event logging, git, reporting), and hooks wire it all together at session start. Events are emitted at every phase transition, forming a DAG that can be rendered, reported, or scored after the run.
+Skills define behavioral rules, agents define personas, lib scripts handle tooling, hooks wire it together at session start. The `run` skill is self-contained -- it absorbed 8 previously separate skills (orchestration, plan-phase, do-phase, artifact-routing, process-log, convergence, effectiveness, attention-filters) into one 459-line operational guide.

 ## Philosophy

@@ -1,66 +1,129 @@
 ---
 name: shadow-detection
-description: Use when monitoring agent behavior for dysfunction, when an agent seems stuck, or when orchestration quality is degrading. Detects and corrects Jungian shadow activation in archetypes.
+description: |
+  Corrective action framework for agent dysfunction, system health, and operational policy.
+  Three layers — archetype shadows, system shadows, policy boundaries — one escalation protocol.
 ---
-
-# Shadow Detection
+# Corrective Action Framework

-Every archetype has a virtue and a shadow (its destructive inversion). Shadow activates when the virtue is pushed too far.
+Detect dysfunction. Apply corrective action. Escalate if repeated.

-| Archetype | Virtue | Shadow |
-|-----------|--------|--------|
-| Explorer | Contextual Clarity | Rabbit Hole |
-| Creator | Decisive Framing | Over-Architect |
-| Maker | Execution Discipline | Rogue |
-| Guardian | Threat Intuition | Paranoid |
-| Skeptic | Assumption Surfacing | Paralytic |
-| Trickster | Adversarial Creativity | False Alarm |
-| Sage | Maintainability Judgment | Bureaucrat |
+Three layers, one protocol:
+- **Archetype Shadows** — individual agent dysfunction (virtue pushed too far)
+- **System Shadows** — orchestration-level dysfunction (process going wrong)
+- **Policy Boundaries** — operational limits (time, cost, quality thresholds)

 ---

-### Explorer -> Rabbit Hole
-**Detect** (any): output >2000w without Recommendation | >3 tangents | >15 files no patterns | no synthesis in final 25%
-**Correct**: "Summarize top 3 findings and one recommendation in under 300 words."
+## Archetype Shadows

-### Creator -> Over-Architect
-**Detect** (any): >2 new abstractions for a single feature | "future-proof" in rationale | scope exceeds task by >50% | >1 new package for one feature
-**Correct**: "Design for the current order of magnitude. Remove abstractions that serve hypothetical requirements."
+| Archetype | Shadow | Detect (any) | Corrective Action |
+|-----------|--------|-------------|-------------------|
+| Explorer | Rabbit Hole | Output >2000w without Recommendation; >3 tangents; >15 files no patterns; no synthesis in final 25% | "Summarize top 3 findings and one recommendation in 300 words." |
+| Creator | Over-Architect | >2 new abstractions for one feature; "future-proof" in rationale; scope exceeds task >50%; >1 new package | "Design for the current order of magnitude. Remove abstractions for hypothetical requirements." |
+| Maker | Rogue | Zero test files with >=3 files changed; single monolithic commit; files outside proposal; no test run evidence | "Read the proposal. Write a test. Commit. Revert out-of-scope files." |
+| Guardian | Paranoid | CRITICAL:WARNING ratio >2:1 (min 3); zero APPROVED in 3+ reviews; <50% findings include fix; findings require compromised systems | "For each CRITICAL: would a senior engineer block a PR? If not, downgrade. Every rejection needs a specific fix." |
+| Skeptic | Paralytic | >7 challenges; <50% include alternatives; same concern 2+ times reworded; >3 findings outside scope | "Rank by impact. Keep top 3 with alternatives. Delete the rest." |
+| Trickster | False Alarm | Findings in untouched code; >10 findings for <5 files; impossible scenarios; >3 without repro steps | "Delete findings outside the diff. Rank by likelihood x impact. Keep top 3-5." |
+| Sage | Bureaucrat | Review words >2x diff lines; findings outside changeset; >2 "consider" without action; suggesting docs for trivial functions | "Limit to issues affecting maintainability in 6 months. Every finding needs a specific action." |

-### Maker -> Rogue
-**Detect** (any): zero test files with >=3 files changed | single monolithic commit | diff contains files not in proposal | no evidence of running tests
-**Correct**: "Read the proposal. Write a test. Commit what you have. Revert changes to files not in the proposal."
+### Shadow Immunity

-### Guardian -> Paranoid
-**Detect** (any): CRITICAL:WARNING ratio >2:1 (min 3 findings) | zero APPROVED in 3+ reviews | <50% findings include a fix | findings require already-compromised systems
-**Correct**: "For each CRITICAL: would a senior engineer block a PR for this? If not, downgrade. Every rejection must include a specific fix."
+Intensity alone is not a shadow. **Shadow = behavior disconnected from the goal.**

-### Skeptic -> Paralytic
-**Detect** (any): >7 challenges in a single review | <50% include alternatives | same concern appears 2+ times reworded | >3 findings outside task scope
-**Correct**: "Rank challenges by impact. Keep top 3. Each must include a specific alternative. Delete the rest."
-
-### Trickster -> False Alarm
-**Detect** (any): findings reference code untouched by diff | >10 findings for <5 files | impossible deployment scenarios | >3 findings without repro steps
-**Correct**: "Delete findings outside the diff. Rank remaining by likelihood x impact. Keep top 3-5."
-
-### Sage -> Bureaucrat
-**Detect** (any): review words >2x diff lines | findings reference files not in changeset | >2 "consider" without concrete action | suggesting docs for <5-line functions
-**Correct**: "Limit to issues affecting maintainability in the next 6 months. Every finding must end with a specific action."
-
----
-
-## Escalation Protocol
-
-1. **1st detection:** Log the shadow, apply the correction prompt, let the agent continue
-2. **2nd detection (same agent, same shadow):** Replace the agent -- the shadow is entrenched
-3. **3+ agents shadowed in same cycle:** Escalate to user -- the task may need to be broken down
-
-## Shadow Immunity
-
-Some behaviors look like shadows but are not. **Rule of thumb:** shadow = behavior disconnected from the goal. Intensity alone is not a shadow.
-
-- Explorer reading 20 files in a monorepo with scattered dependencies -- not a rabbit hole if each file is genuinely relevant
-- Creator adding an abstraction -- not over-architect if the current task genuinely needs it
-- Guardian blocking with 2 CRITICALs -- not paranoid if both are genuine security vulnerabilities
+- Explorer reading 20 files in a monorepo with scattered deps -- not rabbit hole if each is relevant
+- Guardian blocking with 2 CRITICALs -- not paranoid if both are genuine vulnerabilities
 - Trickster finding 5 edge cases -- not false alarm if all are in changed code with repro steps
 - Sage writing a long review -- not bureaucrat if the change is large and every finding is actionable
+
+---
+
+## System Shadows
+
+Orchestration-level dysfunction that isn't tied to one archetype.
+
+| Shadow | Detect | Corrective Action |
+|--------|--------|-------------------|
+| **Tunnel Vision** | All reviewers flag same category (e.g., 4 security findings, 0 quality/testing) | "Redistribute attention. Are we missing quality, testing, or design concerns?" |
+| **Echo Chamber** | Unanimous approval in <30s on standard/thorough workflow | "Suspicious fast consensus. Re-run Guardian with adversarial prompt." |
+| **Gold Plating** | Maker working on INFO fixes while CRITICALs remain open | "Fix CRITICALs first. Park INFO items." |
+| **Analysis Paralysis** | Plan phase >2x longer than Do phase; Explorer spawned 3+ times | "Stop researching. Ship a proposal with known gaps." |
+| **Cargo Cult** | Memory lesson injected but the same finding repeats anyway | "Lesson ineffective. Reword, strengthen, or remove it." |
+| **Broken Window** | 3+ WARNINGs deferred across consecutive runs in the same project | "Accumulated tech debt. Schedule a cleanup sprint." |
+| **Scope Creep** | Maker changes >2x files listed in proposal | "Revert to proposal scope. If more files needed, update the proposal first." |
+
+---
+
+## Policy Boundaries
+
+Operational limits that protect session quality, cost, and resumability.
+
+### Checkpoint Policy
+
+Every **45 minutes** or **3 completed tasks** (whichever first):
+
+1. Commit + push all work in progress
+2. Write handoff summary to `control-center.md`
+3. Log token spend so far
+4. Compare output quality: last task vs first task
+5. If quality degrading -> STOP with clean state
+6. If budget >80% spent -> STOP with clean state
+7. Otherwise -> continue
+
+### Budget Gate
+
+| Threshold | Action |
+|-----------|--------|
+| 50% budget spent | Log warning, continue |
+| 80% budget spent | Downgrade models (sonnet->haiku for reviewers) |
+| 95% budget spent | Complete current task, then STOP |
+| 100% budget | STOP immediately, commit WIP |
+
+### Circuit Breaker
+
+| Trigger | Action |
+|---------|--------|
+| 3 consecutive agent failures/timeouts | STOP. Infrastructure issue, not a code problem. |
+| 3 consecutive task failures in sprint | STOP. Something systemic is wrong. |
+| Same shadow detected 3+ times in one cycle | STOP. Task needs to be broken down or re-scoped. |
+| Test suite broken after merge | Auto-revert, STOP, report. |
+
+### Diminishing Returns
+
+| Signal | Action |
+|--------|--------|
+| Cycle N findings identical to cycle N-1 | STOP cycling. Present best result. |
+| Convergence score <0.5 for 2 consecutive cycles | STOP. "This needs a different approach." |
+| Reviewer finding count increases cycle over cycle | STOP. Implementation is diverging, not converging. |
+
+### Context Pollution
+
+| Signal | Action |
+|--------|--------|
+| >15 memory lessons injected into one prompt | Prune to top 5 by frequency |
+| >20 findings tracked across cycles | Summarize into top 5 themes |
+| Agent prompt exceeds estimated 50% of context window | Strip examples, keep rules only |
+
+---
+
+## Unified Escalation Protocol
+
+All three layers use the same escalation:
+
+| Step | Archetype Shadows | System Shadows | Policy Boundaries |
+|------|-------------------|----------------|-------------------|
+| **1st** | Apply corrective action, let agent continue | Apply corrective action, continue run | Apply boundary action (downgrade, checkpoint) |
+| **2nd** (same issue) | Replace the agent -- shadow is entrenched | Pause run, report to user | Force stop with clean state |
+| **3rd** (pattern) | Escalate to user: "task needs re-scoping" | Escalate to user: "systemic issue" | Escalate to user: "resource limits reached" |
+
+---
+
+## Integration
+
+Shadow checks run **after each agent completes** during orchestration. System shadow checks run **at phase boundaries**. Policy checks run **on a timer and at task boundaries**.
+
+The `run` skill references this framework at:
+- Step 3 (Check phase): archetype shadow monitoring
+- Step 4 (Act phase): convergence/diminishing returns
+- Step 5 (Completion): effectiveness scoring
+- Sprint skill: checkpoint policy between batches
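
Reviewer note: the Budget Gate table in the skill above reads as a threshold ladder, checked from the highest threshold down. A minimal Bash sketch follows, assuming a hypothetical function `budget_action` that receives the integer percentage of budget already spent. Neither the function nor the action labels it prints exist in the commit; they are illustrative names only.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): return the Budget Gate action
# for a given integer percentage of budget spent.
budget_action() {
  local pct="$1"
  if [ "$pct" -ge 100 ]; then
    echo "stop-now"               # 100%: STOP immediately, commit WIP
  elif [ "$pct" -ge 95 ]; then
    echo "finish-task-then-stop"  # 95%: complete current task, then STOP
  elif [ "$pct" -ge 80 ]; then
    echo "downgrade-models"       # 80%: sonnet -> haiku for reviewers
  elif [ "$pct" -ge 50 ]; then
    echo "warn"                   # 50%: log warning, continue
  else
    echo "continue"               # below every threshold
  fi
}
```

Checking the largest threshold first keeps each branch to a single comparison, so a value such as 97 can only match the 95% row and never falls through to the 80% action.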
@@ -7,7 +7,7 @@ description: Use at session start when implementing features, reviewing code, de

 On activation, print ONE line then proceed silently:
 ```
-archeflow v0.7.0 · 25 skills · <domain> domain
+archeflow v0.8.0 · 19 skills · <domain> domain
 ```
 Domain auto-detected: `writing` if `colette.yaml` exists, `research` if paper/thesis files, `code` otherwise.

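
Reviewer note: the domain auto-detection rule in the context line above can be sketched as a short Bash function. `detect_domain` is a hypothetical name, and the skill does not say what counts as "paper/thesis files"; the sketch assumes files named `paper.*` or `thesis.*` in the project root, which is a guess rather than the skill's actual rule.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): pick the ArcheFlow domain
# for a project directory (defaults to the current directory).
detect_domain() {
  local dir="${1:-.}"
  local f
  # colette.yaml marks a Colette writing project.
  if [ -f "$dir/colette.yaml" ]; then
    echo "writing"
    return 0
  fi
  # Assumption: "paper/thesis files" means paper.* or thesis.* in the root.
  # An unmatched glob stays literal, so the -e test simply fails for it.
  for f in "$dir"/paper.* "$dir"/thesis.*; do
    if [ -e "$f" ]; then
      echo "research"
      return 0
    fi
  done
  echo "code"
}
```

The `colette.yaml` check runs first so that a writing project that also contains a `paper.tex` is still reported as `writing`, matching the order in which the rule lists the domains.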