feat: core improvements — feedback loop, attention filters, shadow heuristics, metrics, auto-activation

- Cross-cycle feedback protocol with structured finding format, routing, and resolution tracking
- Attention filter enforcement: explicit context include/exclude per archetype
- Shadow detection: quantitative checklists with concrete thresholds
- Orchestration metrics: per-phase timing, agent count, findings summary
- Autonomous mode wiring: checkpoint protocol, session log, stop conditions
- Auto-activation: SessionStart hook fires ArcheFlow for implementation tasks without user config
- Emoji avatars for all 7 archetypes
- Standardized finding format across all reviewers for cross-cycle tracking
- Persisted implementation plan in docs/
2026-04-03 06:02:10 +02:00
parent eec1fc3d82
commit d08dc657d1
14 changed files with 553 additions and 85 deletions
--- a/skills/check-phase/SKILL.md
+++ b/skills/check-phase/SKILL.md
@@ -11,12 +11,33 @@ Multiple reviewers examine the Maker's implementation in parallel. Each agent de

 1. **Read the proposal first.** Review against the intended design, not invented requirements.
 2. **Read the actual code.** Use `git diff` on the Maker's branch. Don't review descriptions alone.
-3. **Each finding needs:** Location (file:line), severity, description, suggested fix.
-4. **Severity:**
-   - **CRITICAL** — Must fix. Blocks approval.
-   - **WARNING** — Should fix. Doesn't block alone.
-   - **INFO** — Nice to have. Never blocks.
-5. **Clear verdict:** `APPROVED` or `REJECTED` with rationale.
+3. **Structured findings.** Use the standardized finding format below for every issue.
+4. **Clear verdict:** `APPROVED` or `REJECTED` with rationale.
+
+## Finding Format
+
+Every finding must use this format for cross-cycle tracking:
+
+```
+| Location | Severity | Category | Description | Fix |
+|----------|----------|----------|-------------|-----|
+| src/auth/handler.ts:48 | CRITICAL | security | Empty string bypasses validation | Add length check before processing |
+```
+
+**Severity:**
+- **CRITICAL** — Must fix. Blocks approval.
+- **WARNING** — Should fix. Doesn't block alone.
+- **INFO** — Nice to have. Never blocks.
+
+**Categories** (use consistently for cross-cycle tracking):
+- `security` — Injection, auth bypass, data exposure, secrets
+- `reliability` — Error handling, edge cases, race conditions, crashes
+- `design` — Architecture, assumptions, scalability, coupling
+- `breaking-change` — API compatibility, schema migrations, removals
+- `dependency` — New deps, version conflicts, license issues
+- `quality` — Readability, maintainability, naming, duplication
+- `testing` — Missing tests, weak assertions, untested paths
+- `consistency` — Deviates from codebase patterns

 ## Consolidated Output

@@ -26,19 +47,33 @@ After all reviewers finish, compile:
 ## Check Phase Results — Cycle N

 ### Guardian: APPROVED
- WARNING: Missing rate limit (src/auth/handler.ts:52)
+| Location | Severity | Category | Description | Fix |
+|----------|----------|----------|-------------|-----|
+| src/auth/handler.ts:52 | WARNING | security | Missing rate limit | Add rate limiter middleware |

 ### Skeptic: APPROVED
- INFO: Consider caching validated tokens
+| Location | Severity | Category | Description | Fix |
+|----------|----------|----------|-------------|-----|
+| src/auth/handler.ts:30 | INFO | design | Consider caching validated tokens | Add TTL cache for token validation |

 ### Sage: APPROVED
- WARNING: Test names could be more descriptive
+| Location | Severity | Category | Description | Fix |
+|----------|----------|----------|-------------|-----|
+| tests/auth.test.ts:15 | WARNING | testing | Test names don't describe behavior | Rename to "should reject expired tokens" |

 ### Trickster: REJECTED
- CRITICAL: Empty string bypasses validation (src/auth/handler.ts:48)
-  Reproduction: POST /auth with `{"token": ""}`
-  Expected: 400 | Actual: 500
+| Location | Severity | Category | Description | Fix |
+|----------|----------|----------|-------------|-----|
+| src/auth/handler.ts:48 | CRITICAL | reliability | Empty string bypasses validation | Add `if (!token || token.trim() === '')` guard |

 ### Verdict: REJECTED — 1 critical finding
-→ Feed back to Plan phase for cycle N+1
+→ Build cycle feedback (see orchestration skill) and feed to Plan phase
 ```
+
+## Why Structured Findings Matter
+
+The standardized format enables:
+- **Cross-cycle tracking:** Same category + location = same issue. Can detect resolution or regression.
+- **Feedback routing:** Security/design findings → Creator. Quality/testing findings → Maker.
+- **Shadow detection:** CRITICAL:WARNING ratios, finding counts, and category distributions are measurable.
+- **Metrics:** Severity counts feed into the orchestration summary.
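The cross-cycle tracking rule above ("same category + location = same issue") can be sketched in Python. This is a minimal illustration, not part of the commit: the `Finding` class and the `parse_findings` and `compare_cycles` helpers are hypothetical names, and the sketch assumes findings arrive as the five-column table shown in the Finding Format section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    location: str
    severity: str
    category: str
    description: str
    fix: str

    @property
    def key(self) -> tuple[str, str]:
        # Identity rule from the text: same category + location = same issue.
        return (self.category, self.location)


def parse_findings(table: str) -> list[Finding]:
    """Parse a five-column findings table, skipping header and separator rows."""
    findings = []
    for line in table.strip().splitlines():
        # Split at most 4 times so a literal "|" inside the Fix cell survives.
        cells = [c.strip() for c in line.strip().strip("|").split("|", 4)]
        if len(cells) != 5 or cells[0] == "Location" or set(cells[0]) <= {"-"}:
            continue
        findings.append(Finding(*cells))
    return findings


def compare_cycles(previous: list[Finding], current: list[Finding]) -> dict[str, list[Finding]]:
    """Classify findings as resolved, unresolved, or new between two cycles."""
    prev = {f.key: f for f in previous}
    curr = {f.key: f for f in current}
    return {
        "resolved": [prev[k] for k in prev if k not in curr],
        "unresolved": [curr[k] for k in curr if k in prev],
        "new": [curr[k] for k in curr if k not in prev],
    }
```

One caveat with this sketch: matching on an exact `file:line` location is fragile, because line numbers shift when the Maker re-implements in a fresh worktree. A real implementation would probably need a looser match, for example on the file path alone.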
--- a/skills/orchestration/SKILL.md
+++ b/skills/orchestration/SKILL.md
@@ -22,9 +22,13 @@ Assess the task and pick:
 Spawn agents sequentially — Creator needs Explorer's findings.

 ### Explorer (if standard or thorough)
+
+**Context to include:** Task description, relevant file paths, codebase access.
+**Context to exclude:** Prior proposals, review outputs, implementation details, feedback from previous cycles.
+
 ```
 Agent(
-  description: "Explorer: research context",
+  description: "🔍 Explorer: research context",
  prompt: "<task description>
    You are the EXPLORER archetype.
    Research the codebase to understand:
@@ -39,18 +43,24 @@ Agent(
 ```

 ### Creator
+
+**Context to include:** Task description, Explorer's research output. On cycle 2+: prior cycle's structured feedback (see Cycle Feedback Protocol).
+**Context to exclude:** Raw file contents (Explorer already summarized), git diffs, reviewer full outputs.
+
 ```
 Agent(
-  description: "Creator: design proposal",
+  description: "🏗️ Creator: design proposal",
  prompt: "<task description>
    You are the CREATOR archetype.
    Based on the research findings: <Explorer's output>
+    <if cycle 2+: Prior cycle feedback: <structured feedback — see Cycle Feedback Protocol>>
    Design a solution proposal including:
    1. Architecture decisions (with rationale)
    2. Files to create/modify (with specific changes)
    3. Test strategy
    4. Confidence score (0.0 to 1.0)
    5. Risks you foresee
+    <if cycle 2+: 6. How you addressed each unresolved issue from prior feedback>
    Be decisive. Ship a clear plan, not a menu of options.",
  subagent_type: "Plan"
 )
@@ -60,12 +70,16 @@ Agent(

 Spawn Maker in an **isolated worktree** so changes don't affect main.

+**Context to include:** Creator's proposal only. On cycle 2+: implementation-routed feedback from Sage/Trickster.
+**Context to exclude:** Explorer's research, Guardian/Skeptic findings (those go to Creator).
+
 ```
 Agent(
-  description: "Maker: implement proposal",
+  description: "⚒️ Maker: implement proposal",
  prompt: "<task description>
    You are the MAKER archetype.
    Implement this proposal: <Creator's output>
+    <if cycle 2+: Implementation feedback from prior cycle: <Sage/Trickster findings only>>
    Rules:
    1. Follow the proposal exactly — don't redesign
    2. Write tests for every behavioral change
@@ -85,9 +99,13 @@ Agent(
 Spawn reviewers **in parallel** — they read the Maker's changes independently.

 ### Guardian
+
+**Context to include:** Maker's git diff, proposal risk section only.
+**Context to exclude:** Explorer's research, full proposal, other reviewer outputs.
+
 ```
 Agent(
-  description: "Guardian: security and risk review",
+  description: "🛡️ Guardian: security and risk review",
  prompt: "You are the GUARDIAN archetype.
    Review the changes in branch: <maker's branch>
    Assess:
@@ -96,31 +114,42 @@ Agent(
    3. Breaking changes (API compatibility, schema migrations)
    4. Dependency risks (new deps, version conflicts)
    Output: APPROVED or REJECTED with specific findings.
-    Each finding needs: location, severity (critical/warning/info), description, fix suggestion.
+    Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
+    Categories: security, reliability, design, breaking-change, dependency
    Be rigorous but practical — flag real risks, not theoretical ones."
 )
 ```

 ### Skeptic (if standard or thorough)
+
+**Context to include:** Creator's proposal (focus on assumptions section).
+**Context to exclude:** Git diff details, Explorer's research, other reviewer outputs.
+
 ```
 Agent(
-  description: "Skeptic: challenge assumptions",
+  description: "🤔 Skeptic: challenge assumptions",
  prompt: "You are the SKEPTIC archetype.
-    Review the changes in branch: <maker's branch>
+    Review the proposal: <Creator's proposal>
    Challenge:
    1. Assumptions in the design — what if they're wrong?
    2. Alternative approaches not considered
    3. Edge cases not tested
    4. Scalability concerns
    Output: APPROVED or REJECTED with counterarguments.
+    Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
+    Categories: design, quality, testing
    Be constructive — every challenge must include a suggested alternative."
 )
 ```

 ### Sage (if standard or thorough)
+
+**Context to include:** Creator's proposal, Maker's git diff, implementation summary.
+**Context to exclude:** Explorer's raw research, other reviewer outputs.
+
 ```
 Agent(
-  description: "Sage: holistic quality review",
+  description: "📚 Sage: holistic quality review",
  prompt: "You are the SAGE archetype.
    Review the changes in branch: <maker's branch>
    Evaluate holistically:
@@ -129,14 +158,20 @@ Agent(
    3. Documentation (does the change need docs?)
    4. Consistency with codebase patterns
    Output: APPROVED or REJECTED with quality findings.
+    Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
+    Categories: quality, testing, design, consistency
    Judge like a senior engineer doing a PR review."
 )
 ```

 ### Trickster (if thorough only)
+
+**Context to include:** Maker's git diff only.
+**Context to exclude:** Everything else — proposal, research, other reviews.
+
 ```
 Agent(
-  description: "Trickster: adversarial testing",
+  description: "🃏 Trickster: adversarial testing",
  prompt: "You are the TRICKSTER archetype.
    Try to break the changes in branch: <maker's branch>
    Attack vectors:
@@ -145,6 +180,8 @@ Agent(
    3. Error path exploitation
    4. Dependency failure scenarios
    Output: APPROVED or REJECTED with edge cases found.
+    Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
+    Categories: security, reliability, testing
    Think like a QA engineer who gets paid per bug found."
 )
 ```
@@ -157,11 +194,12 @@ Collect all reviewer outputs and decide:
 1. Merge the Maker's worktree branch into the target branch
 2. Report: what was implemented, what was reviewed, any warnings noted
 3. Clean up the worktree
+4. Record metrics (see Orchestration Metrics)

 ### Issues Found (and cycles remaining)
-1. Collect all findings into a feedback summary
+1. Build structured feedback using the Cycle Feedback Protocol below
 2. Go back to Step 1 (Plan) with the feedback
-3. Creator revises the proposal based on reviewer findings
+3. Creator revises the proposal, addressing each unresolved issue
 4. Maker re-implements in a fresh worktree
 5. Reviewers check again

@@ -170,11 +208,156 @@ Collect all reviewer outputs and decide:
 2. Present the best implementation so far (on its branch)
 3. Let the user decide: merge as-is, fix manually, or abandon

+---
+
+## Cycle Feedback Protocol
+
+After the Check phase, build structured feedback for the next cycle. This replaces dumping raw reviewer output.
+
+### 1. Extract Findings
+
+Parse each reviewer's output into the standardized format:
+
+```markdown
+## Cycle N Feedback
+
+### Unresolved Issues
+| Source | Severity | Category | Issue | Route to |
+|--------|----------|----------|-------|----------|
+| Guardian | CRITICAL | security | SQL injection in user input | Creator |
+| Skeptic | WARNING | design | Assumes single-tenant only | Creator |
+| Sage | WARNING | quality | Test names don't describe behavior | Maker |
+| Trickster | CRITICAL | reliability | Empty string bypasses validation | Creator |
+
+### Resolved (from cycle N-1)
+| Source | Issue | Resolution |
+|--------|-------|------------|
+| Guardian | Missing rate limit | Added rate limiter middleware |
+```
+
+### 2. Route Feedback
+
+Not all findings go to the same agent:
+
+| Finding source | Routes to | Rationale |
+|----------------|-----------|-----------|
+| Guardian (security, breaking-change) | **Creator** | Design must change |
+| Skeptic (design) | **Creator** | Assumptions need revision |
+| Sage (quality, consistency) | **Maker** | Implementation refinement |
+| Trickster (reliability, testing) | **Creator** if design flaw, **Maker** if test gap | Depends on root cause |
+
+### 3. Track Resolution
+
+Compare cycle N findings against cycle N-1:
+- If a prior finding no longer appears at the same category and location → mark **resolved**
+- If a prior finding persists → it stays **unresolved** with an incremented cycle count
+- If new findings appear → add as new unresolved issues
+
+This prevents regression and gives the Creator/Maker a clear list of what to address.
+
+---
+
+## Orchestration Metrics
+
+Track lightweight metrics throughout the orchestration. No token counting (unreliable from skill layer) — just timing and outcomes.
+
+### Per-Phase Logging
+
+After each phase completes, note:
+
+```
+| Phase | Duration | Agents | Outcome |
+|-------|----------|--------|---------|
+| Plan  | 45s      | 2      | Proposal ready (confidence: 0.8) |
+| Do    | 90s      | 1      | 4 files changed, 8 tests added |
+| Check | 60s      | 3      | 1 REJECTED (Guardian), 2 APPROVED |
+| Act   | —        | —      | Cycle back → feedback built |
+```
+
+### Orchestration Summary
+
+At orchestration end, include in the report:
+
+```markdown
+## Orchestration Metrics
+| Metric | Value |
+|--------|-------|
+| Workflow | standard |
+| Cycles | 2 of 2 |
+| Total duration | 4m 30s |
+| Agents spawned | 9 |
+| Findings (total) | 5 |
+| Findings (critical) | 1 |
+| Findings (resolved) | 4 |
+| Shadow detections | 0 |
+```
+
+Use this data to calibrate future workflow selection: if `fast` workflows consistently pass on the first cycle with no revision, those tasks were well-scoped.
+
+---
+
+## Autonomous Mode
+
+When running unattended (overnight sessions, batch queues), add these behaviors to the orchestration loop:
+
+### Between-Task Checkpoint
+
+After each task completes (success or failure):
+1. **Commit and push** all changes immediately
+2. **Update session log** at `.archeflow/session-log.md` with task outcome
+3. **Check stop conditions** before starting next task:
+   - 3 consecutive failures → STOP
+   - Shadow escalation (same shadow 3+ times) → STOP
+   - Test suite broken after merge → REVERT and STOP
+   - Destructive action detected → STOP
+
+### Session Log Protocol
+
+Write to `.archeflow/session-log.md` after each task:
+
+```markdown
+## Task N: <description>
+**Workflow:** standard | **Status:** COMPLETED/FAILED
+**Cycles:** 1 of 2
+**Findings:** Guardian APPROVED, Skeptic APPROVED, Sage WARNING (test names)
+**Files changed:** 5 | **Tests added:** 12
+**Branch:** merged to main (commit abc1234) | OR: archeflow/maker-xyz (NOT merged)
+**Duration:** 8 min
+```
+
+### Safety Rules
+- Never force-push. Never modify main history.
+- All work stays on worktree branches until explicitly merged
+- Merges use `--no-ff` — individually revertable
+- Failed tasks leave branches intact for manual inspection
+
+For full autonomous mode details (task queues, overnight checklists, user controls): load the `archeflow:autonomous-mode` skill.
+
+---
+
+## Shadow Monitoring
+
+During orchestration, watch for shadow activation after each agent completes. Quick checklist:
+
+| Archetype | Shadow | Quick Check |
+|-----------|--------|-------------|
+| Explorer | Rabbit Hole | Output >2000 words without Recommendation section? |
+| Creator | Over-Architect | >2 new abstractions for one feature? |
+| Maker | Rogue | No test files in changeset? Files outside proposal? |
+| Guardian | Paranoid | CRITICAL:WARNING ratio >2:1? Zero approvals? |
+| Skeptic | Paralytic | >7 challenges? <50% have alternatives? |
+| Trickster | False Alarm | Findings in untouched code? >10 findings? |
+| Sage | Bureaucrat | Review >2x code change length? |
+
+On detection: apply correction prompt from `archeflow:shadow-detection` skill. On second detection of same shadow: replace agent. On 3+ shadows in same cycle: escalate to user.
+
+---
+
 ## Orchestration Report

 After completion, summarize:

-```
+```markdown
 ## ArcheFlow Orchestration Report
 - **Task:** <description>
 - **Workflow:** standard (2 cycles)
@@ -183,4 +366,5 @@ After completion, summarize:
 - **Files changed:** 4 files, +120 -30 lines
 - **Tests added:** 8 new tests
 - **Branch:** archeflow/maker-<id> → merged to main
+- **Metrics:** 9 agents, 4m 30s, 5 findings (4 resolved, 1 info remaining)
 ```
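The routing table in the Cycle Feedback Protocol section of this file can be sketched as a small lookup. This is an illustrative sketch only: `Route`, `route_finding`, and the `is_design_flaw` flag are hypothetical names, and the sketch assumes the orchestrator has already judged the root cause of each Trickster finding.

```python
from enum import Enum


class Route(Enum):
    CREATOR = "Creator"
    MAKER = "Maker"


def route_finding(source: str, is_design_flaw: bool = False) -> Route:
    """Pick the agent that should receive a finding in the next cycle.

    `is_design_flaw` only matters for Trickster findings, which go to the
    Creator for a design flaw and to the Maker for a test gap.
    """
    reviewer = source.strip().lower()
    if reviewer in ("guardian", "skeptic"):
        return Route.CREATOR  # design or assumptions must change
    if reviewer == "sage":
        return Route.MAKER  # implementation refinement
    if reviewer == "trickster":
        return Route.CREATOR if is_design_flaw else Route.MAKER
    raise ValueError(f"unknown reviewer: {source!r}")
```

Raising on an unknown reviewer is a design choice of the sketch; with custom archetypes, a configurable mapping would likely fit better.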
--- a/skills/plan-phase/SKILL.md
+++ b/skills/plan-phase/SKILL.md
@@ -50,3 +50,42 @@ Explorer researches, then Creator designs. Sequential — Creator needs Explorer
 ### Not Doing
 - <adjacent concerns deliberately excluded>
 ```
+
+## Creator with Prior Feedback (Cycle 2+)
+
+When the Creator receives structured feedback from a prior cycle, the proposal must include an additional section addressing each unresolved issue:
+
+```markdown
+## Proposal: <task> (Revision — Cycle N)
+**Confidence:** <0.0 to 1.0>
+
+### Prior Feedback Response
+| Issue | Source | Action | Rationale |
+|-------|--------|--------|-----------|
+| SQL injection in user input | Guardian | **Fixed** — added parameterized queries | Direct security fix |
+| Assumes single-tenant | Skeptic | **Deferred** — multi-tenant out of scope | Not in task requirements |
+| Test names unclear | Sage | **Accepted** — routed to Maker | Implementation concern |
+
+### Architecture Decision
+<revised design addressing feedback>
+
+### Changes
+<updated file list>
+
+### Test Strategy
+<updated test cases>
+
+### Risks
+<updated risks — include any new risks from the revision>
+
+### Not Doing
+<updated scope boundaries>
+```
+
+**Rules for addressing feedback:**
+- **Fixed:** Changed the design to resolve the issue. Explain how.
+- **Deferred:** Not addressing now, with explicit reason. Must not be a CRITICAL finding.
+- **Accepted:** Acknowledged and routed to Maker for implementation-level fix.
+- **Disputed:** Disagrees with the finding. Must provide evidence or reasoning.
+
+CRITICAL findings cannot be deferred or disputed — they must be fixed or the proposal will be rejected again.
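The feedback rules above lend themselves to a mechanical check before a revised proposal is accepted. The following is a minimal sketch under stated assumptions: `validate_feedback_response` is a hypothetical helper, and each row is assumed to carry the finding's severity, which the Prior Feedback Response table itself does not include and would have to be joined in from the cycle feedback.

```python
VALID_ACTIONS = {"Fixed", "Deferred", "Accepted", "Disputed"}
NOT_ALLOWED_FOR_CRITICAL = {"Deferred", "Disputed"}


def validate_feedback_response(rows: list[dict[str, str]]) -> list[str]:
    """Return one error message per row that breaks the feedback rules."""
    errors = []
    for row in rows:
        issue, severity, action = row["issue"], row["severity"], row["action"]
        if action not in VALID_ACTIONS:
            errors.append(f"{issue}: unknown action '{action}'")
        elif severity == "CRITICAL" and action in NOT_ALLOWED_FOR_CRITICAL:
            # Rule from the text: CRITICAL findings cannot be deferred or disputed.
            errors.append(f"{issue}: CRITICAL findings cannot be {action.lower()}")
    return errors
```

The text does not say whether a CRITICAL finding may be marked Accepted and routed to the Maker; this sketch allows it, which is an assumption.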
--- a/skills/shadow-detection/SKILL.md
+++ b/skills/shadow-detection/SKILL.md
@@ -30,10 +30,11 @@ Maintainability Judgment      → reviews only      → Bureaucrat
 - Reading more than 15 files without producing findings
 - Output is a raw inventory of files with no analysis or recommendation

-**Triggers:**
- Output length > 2000 words without a recommendation section
- More than 3 "see also" or "related" tangents
- No patterns or recommendation in output
+**Detection Checklist** (trigger on ANY):
+- [ ] Output >2000 words without a `### Recommendation` section
+- [ ] >3 tangent topics not directly related to the original task
+- [ ] >15 files read with no `### Patterns` identified
+- [ ] No synthesis language (recommend, suggest, conclusion, finding, summary) in final 25% of output

 **Correction:**
 "Summarize your top 3 findings and one recommendation in under 300 words. If your output has no Recommendation section, add one. A dump is not research."
@@ -49,10 +50,11 @@ Maintainability Judgment      → reviews only      → Bureaucrat
 - Configuration systems for things that could be constants
 - Proposal has more infrastructure than business logic

-**Triggers:**
- More than 2 new abstractions (interfaces, base classes, factories) for a single feature
- "In the future we might need..." appears in rationale
- Proposal scope exceeds original task by > 50%
+**Detection Checklist** (trigger on ANY):
+- [ ] >2 new abstractions (interfaces, base classes, factories, registries) for a single feature
+- [ ] "In the future we might need..." or "future-proof" appears in rationale
+- [ ] Proposal scope (files changed) exceeds original task scope by >50%
+- [ ] More than 1 new package/module introduced for a single feature

 **Correction:**
 "Design for the current order of magnitude. If the app has 1000 users, design for 10,000 — not 10 million. Remove abstractions that serve hypothetical requirements."
@@ -68,10 +70,11 @@ Maintainability Judgment      → reviews only      → Bureaucrat
 - Large uncommitted working tree
 - Files changed that aren't mentioned in the proposal

-**Triggers:**
- No test files in the changeset
- Single monolithic commit instead of incremental commits
- Diff contains files not listed in the Creator's proposal
+**Detection Checklist** (trigger on ANY):
+- [ ] Zero test files (`.test.`, `.spec.`, `_test.`) in the changeset with >=3 files changed
+- [ ] Single monolithic commit instead of incremental commits
+- [ ] Diff contains files not listed in the Creator's proposal `### Changes` section
+- [ ] No evidence of running existing test suite before finishing

 **Correction:**
 "Read the proposal. Write a test. Commit what you have. Revert changes to files not in the proposal. Then continue."
@@ -87,10 +90,11 @@ Maintainability Judgment      → reviews only      → Bureaucrat
 - Rejecting without suggesting how to fix
 - Security concerns for internal-only code at external-API severity

-**Triggers:**
- CRITICAL:WARNING ratio > 2:1
- Zero APPROVED verdicts in 3+ consecutive reviews
- Less than 50% of findings include a suggested fix
+**Detection Checklist** (trigger on ANY):
+- [ ] CRITICAL:WARNING ratio >2:1 (with minimum 3 total findings)
+- [ ] Zero APPROVED verdicts in 3+ consecutive reviews
+- [ ] <50% of findings include a suggested fix in the `Fix` column
+- [ ] Findings reference attack scenarios that require already-compromised internal systems

 **Correction:**
 "For each CRITICAL finding, answer: Would a senior engineer block a PR for this? If not, downgrade. Every rejection must include a specific, implementable fix."
@@ -106,10 +110,11 @@ Maintainability Judgment      → reviews only      → Bureaucrat
 - "What about X?" chains that drift from the task
 - Restating the same concern in different words

-**Triggers:**
- Challenge count > 7
- Less than 50% of challenges include alternatives
- Same conceptual concern raised multiple times
+**Detection Checklist** (trigger on ANY):
+- [ ] >7 findings/challenges raised in a single review
+- [ ] <50% of findings include an alternative in the `Fix` column
+- [ ] Same conceptual concern appears 2+ times with different wording
+- [ ] >3 findings reference code or scenarios outside the task scope

 **Correction:**
 "Rank your challenges by impact. Keep the top 3. Each must include a specific alternative. Delete the rest."
@@ -125,13 +130,14 @@ Maintainability Judgment      → reviews only      → Bureaucrat
 - 20 findings when 3 good ones would cover the real risks
 - Edge cases for edge cases (diminishing returns)

-**Triggers:**
- Findings reference code untouched by the implementation
- More than 10 findings for a small change
- Findings describe scenarios that can't happen in the actual deployment context
+**Detection Checklist** (trigger on ANY):
+- [ ] Any finding references code untouched by the Maker's diff
+- [ ] >10 findings for a change touching <5 files
+- [ ] Findings describe scenarios requiring conditions that can't occur in the deployment context
+- [ ] >3 findings without reproduction steps

 **Correction:**
-"Quality over quantity. Delete findings outside the Maker's diff. Rank remaining by likelihood × impact. Keep top 3-5. Three real findings beat twenty noise."
+"Quality over quantity. Delete findings outside the Maker's diff. Rank remaining by likelihood x impact. Keep top 3-5. Three real findings beat twenty noise."

 ---

@@ -144,10 +150,11 @@ Maintainability Judgment      → reviews only      → Bureaucrat
 - Suggesting refactors unrelated to the current task
 - Deep-sounding analysis that doesn't end with a specific action

-**Triggers:**
- Review word count > 2x the code change's word count
- Suggestions reference files not in the changeset
- Findings contain "consider" or "think about" without a specific action
+**Detection Checklist** (trigger on ANY):
+- [ ] Review word count >2x the code change's line count (rough: review words > diff lines x 2)
+- [ ] Any finding references files not in the Maker's changeset
+- [ ] >2 findings use "consider" or "think about" without a concrete action in the `Fix` column
+- [ ] Suggesting documentation for functions with <5 lines or self-descriptive names

 **Correction:**
 "Limit your review to issues that affect maintainability in the next 6 months. Every finding must end with a specific action. If you can't state the consequence of NOT fixing it, don't raise it."
--- a/skills/using-archeflow/SKILL.md
+++ b/skills/using-archeflow/SKILL.md
@@ -3,21 +3,50 @@ name: using-archeflow
 description: Use at session start when implementing features, reviewing code, debugging, or any task that benefits from multiple perspectives. Activates ArcheFlow multi-agent orchestration.
 ---

-# ArcheFlow
+# ArcheFlow — Active

 Multi-agent orchestration using archetypal roles and PDCA quality cycles.

+## IMPORTANT: When to Activate
+
+You MUST use ArcheFlow orchestration (load `archeflow:orchestration` skill and follow its steps) for any task that matches:
+
+- **New features** — any feature touching 2+ files
+- **Refactoring** — structural changes across modules
+- **Security-sensitive changes** — auth, encryption, input handling, API keys
+- **Bug fixes with unclear root cause** — use Explorer to investigate first
+- **Code review requests** — spawn Guardian + relevant reviewers
+- **Multi-file changes** — anything beyond a single-file edit
+
+Choose the workflow based on risk:
+
+| Signal | Workflow | Archetypes |
+|--------|----------|---------|
+| Small fix, low risk, single concern | `fast` | Creator → Maker → Guardian |
+| Feature, multiple files, moderate risk | `standard` | Explorer + Creator → Maker → Guardian + Skeptic + Sage |
+| Security-sensitive, breaking changes, public API | `thorough` | Explorer + Creator → Maker → All 4 reviewers |
+
+## When to Skip ArcheFlow
+
+Do NOT use ArcheFlow for these — just do them directly:
+
+- Single-line fixes, typos, formatting
+- Answering questions (no code changes)
+- Reading/exploring code without making changes
+- Config changes to a single file
+- Git operations (commit, push, branch)
+
 ## Archetypes

-| Archetype | Virtue | Shadow | Phase |
-|-----------|--------|--------|-------|
-| **Explorer** | Contextual Clarity | Rabbit Hole | Plan |
-| **Creator** | Decisive Framing | Over-Architect | Plan |
-| **Maker** | Execution Discipline | Rogue | Do |
-| **Guardian** | Threat Intuition | Paranoid | Check |
-| **Skeptic** | Assumption Surfacing | Paralytic | Check |
-| **Trickster** | Adversarial Creativity | False Alarm | Check |
-| **Sage** | Maintainability Judgment | Bureaucrat | Check |
+| Archetype | Avatar | Virtue | Shadow | Phase |
+|-----------|--------|--------|--------|-------|
+| **Explorer** | 🔍 | Contextual Clarity | Rabbit Hole | Plan |
+| **Creator** | 🏗️ | Decisive Framing | Over-Architect | Plan |
+| **Maker** | ⚒️ | Execution Discipline | Rogue | Do |
+| **Guardian** | 🛡️ | Threat Intuition | Paranoid | Check |
+| **Skeptic** | 🤔 | Assumption Surfacing | Paralytic | Check |
+| **Trickster** | 🃏 | Adversarial Creativity | False Alarm | Check |
+| **Sage** | 📚 | Maintainability Judgment | Bureaucrat | Check |

 ## PDCA Cycle

@@ -28,24 +57,20 @@ Check → Reviewers assess in parallel (approve/reject)
 Act   → All approved? Merge. Issues? Cycle back to Plan.
 ```

-## Workflows
+## Quick Start

-| Workflow | Archetypes | Cycles |
-|----------|-----------|:------:|
-| `fast` | Creator → Maker → Guardian | 1 |
-| `standard` | Explorer + Creator → Maker → Guardian + Skeptic + Sage | 2 |
-| `thorough` | Explorer + Creator → Maker → All 4 reviewers | 3 |
+When the user gives an implementation task:

-## When to Use
+1. Assess: does this need ArcheFlow? (see criteria above)
+2. If yes: load `archeflow:orchestration` skill
+3. Pick workflow (fast/standard/thorough)
+4. Execute the PDCA steps from the orchestration skill

-**Use** for features spanning multiple files, security-sensitive changes, or when multiple perspectives help.
-**Skip** for single-file fixes, formatting, or purely informational tasks.
+## Skills Reference

-## Skills
-
- **archeflow:orchestration** — Step-by-step execution guide
+- **archeflow:orchestration** — Step-by-step execution guide (load this to run)
 - **archeflow:plan-phase** / **do-phase** / **check-phase** — Phase protocols
- **archeflow:shadow-detection** — Recognizing dysfunction
+- **archeflow:shadow-detection** — Recognizing and correcting dysfunction
 - **archeflow:attention-filters** — What context each archetype receives
 - **archeflow:autonomous-mode** — Unattended sessions
 - **archeflow:custom-archetypes** / **workflow-design** — Extending ArcheFlow
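The workflow table in this file ("Choose the workflow based on risk") can be read as a small decision function. The sketch below is one possible encoding, not the skill's actual logic: `TaskSignals`, `pick_workflow`, and the two-file threshold for `standard` are assumptions made for illustration, and the real choice is left to the orchestrating agent's judgment.

```python
from dataclasses import dataclass


@dataclass
class TaskSignals:
    files_touched: int
    security_sensitive: bool = False
    breaking_change: bool = False
    public_api: bool = False


# Reviewer sets per workflow, taken from the workflow table.
REVIEWERS = {
    "fast": ["Guardian"],
    "standard": ["Guardian", "Skeptic", "Sage"],
    "thorough": ["Guardian", "Skeptic", "Sage", "Trickster"],
}


def pick_workflow(signals: TaskSignals) -> str:
    """Map risk signals to a workflow name, highest-risk signal first."""
    if signals.security_sensitive or signals.breaking_change or signals.public_api:
        return "thorough"
    if signals.files_touched >= 2:  # assumed cutoff for "multiple files"
        return "standard"
    return "fast"
```

Checking the high-risk signals first means a single-file change to auth code still gets all four reviewers.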