refactor: consolidate run skill — merge 8 skills into one self-contained PDCA orchestrator

Merge run + orchestration + plan-phase + do-phase + artifact-routing + process-log + attention-filters + convergence + effectiveness into a single 459-line run/SKILL.md.

Before: run skill (890 lines) + 3 prerequisites (~1,300 lines) = ~2,200 lines of context.
After: one self-contained skill (459 lines) with zero prerequisites.

Preserved: PDCA flow, workflow selection, adaptation rules A1-A3, agent prompts, attention filters, feedback routing, convergence detection, effectiveness scoring, shadow monitoring, pipeline strategy, event reference, artifact naming.

Removed: verbose bash code blocks, shell variable tracking, resolve_model() function, lib validation loops, evidence validation bash, redundant event emission blocks.
2026-04-06 20:44:46 +02:00
parent 55de51aabe
commit c8bd55d97c
8 changed files with 304 additions and 2753 deletions
--- a/skills/artifact-routing/SKILL.md
+++ b/skills/artifact-routing/SKILL.md
@@ -1,289 +0,0 @@
----
-name: artifact-routing
-description: |
-  Inter-phase artifact protocol for ArcheFlow runs. Defines how artifacts are named, stored,
-  routed between agents, and archived across PDCA cycles. Ensures each agent receives exactly
-  the context it needs — no more, no less.
-  <example>Automatically loaded by archeflow:run</example>
-  <example>User: "What does the Maker receive as context?"</example>
----
-
-# Artifact Routing — Inter-Phase Context Protocol
-
-Every ArcheFlow run produces artifacts — research notes, proposals, diffs, reviews, feedback. This skill defines how those artifacts are named, where they live, what each agent receives, and how they are preserved across cycles.
-
-## Artifact Directory Structure
-
-```
-.archeflow/artifacts/<run_id>/
-├── plan-explorer.md          # Explorer research output
-├── plan-creator.md           # Creator proposal/outline
-├── do-maker.md               # Maker implementation summary
-├── do-maker-files.txt        # List of files created/modified (one path per line)
-├── check-guardian.md          # Guardian review verdict + findings
-├── check-sage.md             # Sage review (if present)
-├── check-skeptic.md          # Skeptic review (if present)
-├── check-trickster.md        # Trickster review (if present)
-├── act-feedback.md           # Structured feedback for next cycle (Cycle Feedback Protocol)
-├── act-fixes.jsonl           # Applied fixes log (one JSON line per fix)
-├── cycle-1/                  # Archived artifacts from cycle 1
-│   ├── plan-explorer.md
-│   ├── plan-creator.md
-│   ├── do-maker.md
-│   ├── do-maker-files.txt
-│   ├── check-guardian.md
-│   ├── check-sage.md
-│   └── act-feedback.md
-└── cycle-2/                  # Archived artifacts from cycle 2 (if cycle 3 starts)
-    └── ...
-```
-
-## Naming Convention
-
-Artifacts follow the pattern: `<phase>-<agent>.<ext>`
-
-| Phase | Agent | Filename | Format |
-|-------|-------|----------|--------|
-| plan | explorer | `plan-explorer.md` | Markdown research report |
-| plan | creator | `plan-creator.md` | Markdown proposal with confidence scores |
-| plan | mini-explorer | `plan-mini-explorer.md` | Focused risk research (only if confidence gate triggers) |
-| do | maker | `do-maker.md` | Markdown implementation summary |
-| do | maker | `do-maker-files.txt` | Plain text, one file path per line |
-| check | guardian | `check-guardian.md` | Markdown verdict + findings table |
-| check | sage | `check-sage.md` | Markdown verdict + findings table |
-| check | skeptic | `check-skeptic.md` | Markdown verdict + findings table |
-| check | trickster | `check-trickster.md` | Markdown verdict + findings table |
-| act | (orchestrator) | `act-feedback.md` | Structured feedback (see Cycle Feedback Protocol) |
-| act | (orchestrator) | `act-fixes.jsonl` | JSONL fix log |
-
-**Rule:** Never invent new artifact names during a run. If a reviewer is skipped (A2 fast-path, reviewer profile), its artifact simply does not exist. Downstream phases check for file existence before reading.
-
----
-
-## Context Injection Rules
-
-Each agent receives a filtered subset of artifacts. This is the **attention filter** — it controls what context is injected into the agent's prompt.
-
-### Plan Phase
-
-| Agent | Receives | Does NOT receive |
-|-------|----------|-----------------|
-| **Explorer** | Task description, relevant file paths, codebase access | Prior proposals, review outputs, implementation details |
-| **Creator** (cycle 1) | Task description, `plan-explorer.md` (if exists) | Raw file contents (Explorer summarized them), git diffs |
-| **Creator** (cycle 2+) | Task description, `plan-explorer.md`, `act-feedback.md` (Creator-routed findings only) | Raw reviewer outputs, Maker-routed findings |
-
-**Creator context injection template (cycle 2+):**
-```markdown
-## Task
-<task description>
-
-## Research (from Explorer)
-<contents of plan-explorer.md>
-
-## Feedback from Prior Cycle
-<Creator-routed section of act-feedback.md only>
-
-Note: Address each unresolved issue listed above. Explain how your revised proposal resolves it.
-```
-
-### Do Phase
-
-| Agent | Receives | Does NOT receive |
-|-------|----------|-----------------|
-| **Maker** (cycle 1) | `plan-creator.md` (the proposal), `plan-mini-explorer.md` (if exists) | `plan-explorer.md`, reviewer outputs, raw task description |
-| **Maker** (cycle 2+) | `plan-creator.md`, `plan-mini-explorer.md` (if exists), Maker-routed findings from `act-feedback.md` | Explorer research, Guardian/Skeptic findings (those went to Creator) |
-
-**Maker context injection template (cycle 2+):**
-```markdown
-## Proposal
-<contents of plan-creator.md>
-
-## Implementation Feedback from Prior Cycle
-<Maker-routed section of act-feedback.md only>
-
-Note: The proposal has been revised to address design-level issues. Focus on the implementation
-feedback items above (code quality, test gaps, consistency).
-```
-
-**Why Maker doesn't get Explorer output:** The Creator already distilled Explorer's research into a concrete proposal. Giving Maker raw research causes scope creep and "Rogue" shadow activation.
-
-### Check Phase
-
-| Agent | Receives | Does NOT receive |
-|-------|----------|-----------------|
-| **Guardian** | Maker's git diff, risk section from `plan-creator.md` | Full proposal, Explorer research, other reviewer outputs |
-| **Skeptic** | `plan-creator.md` (assumptions focus) | Git diff details, Explorer research, other reviewer outputs |
-| **Sage** | `plan-creator.md`, Maker's git diff, `do-maker.md` | Explorer research, other reviewer outputs |
-| **Trickster** | Maker's git diff only | Everything else |
-
-**Guardian context injection template:**
-```markdown
-## Changes to Review
-<git diff from Maker's branch>
-
-## Risk Assessment (from proposal)
-<risks section extracted from plan-creator.md>
-
-Review these changes for security, reliability, breaking changes, and dependency risks.
-```
-
-**Skeptic context injection template:**
-```markdown
-## Proposal to Challenge
-<contents of plan-creator.md>
-
-Focus on assumptions, alternatives not considered, edge cases, and scalability.
-```
-
-**Sage context injection template:**
-```markdown
-## Proposal
-<contents of plan-creator.md>
-
-## Implementation Summary
-<contents of do-maker.md>
-
-## Changes
-<git diff from Maker's branch>
-
-Evaluate code quality, test coverage, documentation, and codebase consistency.
-```
-
-**Trickster context injection template:**
-```markdown
-## Changes to Attack
-<git diff from Maker's branch>
-
-Try to break this. Malformed input, boundaries, concurrency, error paths, dependency failures.
-```
-
-### Act Phase
-
-No agents are spawned in Act. The orchestrator reads all `check-*.md` artifacts directly.
-
----
-
-## Feedback Routing
-
-> **This is the canonical routing table.** Other skills (orchestration, act-phase) must match this table exactly. When updating routing rules, update this table first, then sync the others.
-
-When building `act-feedback.md` after the Check phase, route each finding to the right agent for the next cycle:
-
-| Finding Source | Finding Category | Routes To | Rationale |
-|---------------|-----------------|-----------|-----------|
-| Guardian | security, breaking-change | **Creator** | Design must change |
-| Guardian | reliability, dependency | **Creator** | Architectural decision needed |
-| Skeptic | design, scalability | **Creator** | Assumptions need revision |
-| Sage | quality, consistency | **Maker** | Implementation refinement |
-| Sage | testing | **Maker** | Test gap, not design flaw |
-| Trickster | reliability (design flaw) | **Creator** | Needs redesign |
-| Trickster | reliability (test gap) | **Maker** | Needs more tests |
-| Trickster | testing | **Maker** | Edge case not covered |
-
-**Disambiguation rule:** When in doubt: if the fix requires changing the approach, route to Creator. If it requires changing the code within the existing approach, route to Maker.
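The routing table and disambiguation rule above can be sketched as a small lookup. This is an illustrative sketch only, not the orchestrator's code: `ROUTES`, `route_finding`, and the `requires_new_approach` flag are hypothetical names, and the fallback for unlisted pairs is an assumption drawn from the disambiguation rule.

```python
# Hypothetical encoding of the routing table; keys are (source, category).
ROUTES = {
    ("guardian", "security"): "creator",
    ("guardian", "breaking-change"): "creator",
    ("guardian", "reliability"): "creator",
    ("guardian", "dependency"): "creator",
    ("skeptic", "design"): "creator",
    ("skeptic", "scalability"): "creator",
    ("sage", "quality"): "maker",
    ("sage", "consistency"): "maker",
    ("sage", "testing"): "maker",
    ("trickster", "testing"): "maker",
}


def route_finding(source, category, requires_new_approach=False):
    """Return "creator" or "maker" for one finding.

    Pairs not listed in ROUTES (for example Trickster reliability findings,
    which the table splits into design flaw vs. test gap) fall back to the
    disambiguation rule: a change of approach goes to Creator, a change
    within the existing approach goes to Maker.
    """
    key = (source.lower(), category.lower())
    if key in ROUTES:
        return ROUTES[key]
    return "creator" if requires_new_approach else "maker"
```

For example, `route_finding("trickster", "reliability", requires_new_approach=True)` returns `"creator"`, while the same finding judged to be a test gap returns `"maker"`.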
-
-### Feedback File Format
-
-`act-feedback.md` is split into two sections so each agent can be given only its portion:
-
-```markdown
-# Cycle <N> Feedback
-
-## Creator-Routed Issues
-| # | Source | Severity | Category | Issue | Suggested Fix |
-|---|--------|----------|----------|-------|---------------|
-| 1 | Guardian | CRITICAL | security | SQL injection in user input | Add parameterized queries |
-| 2 | Skeptic | WARNING | design | Assumes single-tenant only | Add tenant isolation |
-
-## Maker-Routed Issues
-| # | Source | Severity | Category | Issue | Suggested Fix |
-|---|--------|----------|----------|-------|---------------|
-| 3 | Sage | WARNING | quality | Test names don't describe behavior | Rename to describe expected outcome |
-| 4 | Sage | INFO | consistency | Import order doesn't match codebase style | Re-order imports |
-
-## Resolved (from prior cycles)
-| # | Source | Issue | Resolution | Resolved In |
-|---|--------|-------|------------|-------------|
-| 1 | Guardian | Missing rate limit | Added rate limiter middleware | Cycle 1 |
-
-## Convergence Warnings
-<any finding that appeared unresolved in 2+ consecutive cycles — requires user input>
-```
-
-When injecting feedback into Creator's prompt, include **only** the "Creator-Routed Issues" section.
-When injecting feedback into Maker's prompt, include **only** the "Maker-Routed Issues" section.
-
----
-
-## Cycle Archiving
-
-When a PDCA cycle completes and a new cycle begins, archive the current artifacts so they are preserved and the working directory is clean for the next iteration.
-
-### Archive Procedure
-
-At the end of each cycle (before starting the next):
-
-```bash
-RUN_DIR=".archeflow/artifacts/${RUN_ID}"
-ARCHIVE_DIR="${RUN_DIR}/cycle-${CYCLE}"
-
-mkdir -p "$ARCHIVE_DIR"
-
-# Copy all phase artifacts to archive
-cp "${RUN_DIR}"/plan-*.md   "$ARCHIVE_DIR/" 2>/dev/null || true
-cp "${RUN_DIR}"/do-*.md     "$ARCHIVE_DIR/" 2>/dev/null || true
-cp "${RUN_DIR}"/do-*.txt    "$ARCHIVE_DIR/" 2>/dev/null || true
-cp "${RUN_DIR}"/check-*.md  "$ARCHIVE_DIR/" 2>/dev/null || true
-cp "${RUN_DIR}"/act-feedback.md "$ARCHIVE_DIR/" 2>/dev/null || true
-```
-
-**Do NOT delete** the working-level artifacts after archiving. The next cycle's agents need `act-feedback.md` and `plan-explorer.md` (Explorer cache may reuse prior research). Old artifacts in the working directory get overwritten when the new cycle's agents produce their outputs.
-
-### Archive Access
-
-Archived artifacts are read-only references. Use them for:
-- **Resolution tracking:** Compare `cycle-1/check-guardian.md` findings against `cycle-2/check-guardian.md` to detect resolved/persisting issues
-- **Convergence detection:** Same finding in `cycle-N/act-feedback.md` and `cycle-N+1/act-feedback.md` → escalate to user
-- **Post-hoc analysis:** Understanding how a solution evolved across cycles
-
----
-
-## Artifact Existence Checks
-
-Before injecting an artifact into an agent's context, always check if the file exists. Missing artifacts are expected in certain workflows:
-
-| Artifact | Missing when |
-|----------|-------------|
-| `plan-explorer.md` | Fast workflow (no Explorer) |
-| `plan-mini-explorer.md` | Confidence gate did not trigger for risk coverage |
-| `check-skeptic.md` | Fast workflow, or A2 fast-path taken |
-| `check-sage.md` | Fast workflow, or A2 fast-path taken |
-| `check-trickster.md` | Non-thorough workflow, or A2 fast-path taken |
-| `act-feedback.md` | Cycle 1 (no prior feedback exists) |
-| `act-fixes.jsonl` | Cycle 1, or no fixes applied |
-
-**Rule:** Never fail because an optional artifact is missing. Check existence, skip injection if absent, and note what was skipped in the event data.
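The check-before-inject rule above might look like this in practice. It is a minimal sketch under assumptions: `collect_artifacts` and its return shape are hypothetical, and how the skipped names are written into the event data is left out.

```python
from pathlib import Path


def collect_artifacts(run_dir, names):
    """Read the artifacts that exist and list the ones that do not.

    Returns (present, skipped): `present` maps each existing artifact name
    to its text, `skipped` lists missing names so the caller can record
    them in the event data instead of failing.
    """
    present = {}
    skipped = []
    for name in names:
        path = Path(run_dir) / name
        if path.is_file():
            present[name] = path.read_text(encoding="utf-8")
        else:
            skipped.append(name)
    return present, skipped
```

A caller building the Maker prompt could request `["plan-creator.md", "plan-mini-explorer.md", "act-feedback.md"]` and inject only the entries found in `present`.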
-
----
-
-## Git Diff as Artifact
-
-The Maker's git diff is not saved as a file — it is generated on-the-fly from the Maker's worktree branch:
-
-```bash
-git diff main...<maker-branch>
-```
-
-This ensures reviewers always see the actual current diff, not a stale snapshot. The diff is injected directly into reviewer prompts, not saved to disk.
-
-Exception: `do-maker-files.txt` IS saved to disk (just the file list, not the full diff) for quick reference by the orchestrator and for archiving purposes.
-
----
-
-## Design Principles
-
-1. **Minimal context per agent.** Each agent gets only what it needs. Over-injection causes distraction, shadow activation, and wasted tokens.
-2. **Artifacts are the handoff mechanism.** Agents never communicate directly. All inter-agent data flows through saved artifacts.
-3. **Files over memory.** Everything is on disk. If a session crashes, artifacts survive. A `--start-from` resume reads artifacts, not session state.
-4. **Overwrite, don't accumulate.** Working-level artifacts get overwritten each cycle. Archives preserve history. This keeps the working directory simple.
-5. **Check before inject.** Always verify artifact existence. Gracefully handle missing optional artifacts.
--- a/skills/convergence/SKILL.md
+++ b/skills/convergence/SKILL.md
@@ -1,249 +0,0 @@
----
-name: convergence
-description: |
-  Detects convergence, stalling, and oscillation in multi-cycle PDCA runs. Prevents wasted cycles
-  by stopping early when findings are not being resolved or are bouncing between cycles.
-  <example>Automatically loaded during Act phase before exit decision</example>
-  <example>User: "Is the run converging?"</example>
----
-
-# Convergence Detection
-
-In multi-cycle PDCA runs, the Act phase must decide whether another cycle will help or just waste tokens. This skill provides the analysis: are findings being resolved (converging), staying the same (stalling), or bouncing back (oscillating)?
-
-## When It Runs
-
-Convergence analysis runs **after the Check phase completes and before the Act phase exit decision**. It requires at least 2 cycles of data — on cycle 1, it is skipped (no comparison baseline).
-
-```
-Check phase → Convergence Analysis → Act phase exit decision
-```
-
----
-
-## Step 1: Finding Comparison
-
-Extract findings from the current cycle and compare against the previous cycle.
-
-### Data Sources
-
-- **Current cycle findings:** Parsed from `check-*.md` artifacts in `.archeflow/artifacts/<run_id>/`
-- **Previous cycle findings:** Parsed from `check-*.md` artifacts in `.archeflow/artifacts/<run_id>/cycle-<N-1>/`
-
-Each finding is identified by a composite key: `source + category + file_location + description_keywords`.
-
-### Finding Categories
-
-Every finding from the current cycle is classified into exactly one category:
-
-| Category | Definition |
-|----------|------------|
-| **NEW** | Finding not present in any previous cycle |
-| **RESOLVED** | Was present in the previous cycle, absent in the current cycle |
-| **PERSISTENT** | Present in both the current and previous cycle (same key) |
-| **REGRESSED** | Was RESOLVED in the previous cycle (was present in N-2, absent in N-1), but returned in the current cycle |
-
-### Matching Algorithm
-
-Two findings match if:
-1. Same `source` archetype (guardian, sage, etc.)
-2. Same `category` (security, reliability, quality, etc.)
-3. Same or overlapping file location (same file, line within 10 lines)
-4. 50%+ keyword overlap in description (lowercase, strip punctuation)
-
-All four conditions must hold. This prevents false matches across unrelated findings.
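The four matching conditions can be sketched as a single predicate. Several details here are assumptions: the `Finding` fields, the function names, and in particular the overlap measure, since the text does not say what the 50% is relative to. This sketch measures shared keywords against the smaller of the two keyword sets.

```python
import string
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    source: str        # archetype, e.g. "guardian"
    category: str      # e.g. "security"
    file: str          # path of the affected file
    line: int          # line number of the finding
    description: str


def keywords(text):
    """Lowercase the text, strip punctuation, and split into a keyword set."""
    table = str.maketrans("", "", string.punctuation)
    return set(text.lower().translate(table).split())


def keyword_overlap(a, b):
    """Shared keywords as a fraction of the smaller set (assumed measure)."""
    ka, kb = keywords(a), keywords(b)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / min(len(ka), len(kb))


def findings_match(a, b, line_window=10, min_overlap=0.5):
    """True only if all four conditions hold."""
    return (
        a.source == b.source
        and a.category == b.category
        and a.file == b.file
        and abs(a.line - b.line) <= line_window
        and keyword_overlap(a.description, b.description) >= min_overlap
    )
```

Because every condition is required, a Sage finding never matches a Guardian finding even when both describe the same line with the same words.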
-
----
-
-## Step 2: Convergence Score
-
-Calculate a convergence score from the categorized findings:
-
-```
-convergence = resolved_count / (resolved_count + new_count + regressed_count)
-```
-
-If the denominator is 0 (no resolved, no new, no regressed — only persistent), the score is `0.0` (stalled, not converging).
-
-### Score Interpretation
-
-| Score Range | Status | Meaning |
-|-------------|--------|---------|
-| > 0.8 | **Converging** | Most issues being resolved, few new ones introduced |
-| 0.5 - 0.8 | **Stalling** | Fixing roughly as many as introducing |
-| < 0.5 | **Diverging** | Making things worse — more new/regressed than resolved |
-| 0.0 (all persistent) | **Stuck** | No progress in either direction |
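A short sketch of the score formula and the interpretation table follows. The function names and the `all_persistent` flag are illustrative assumptions. The boundary value 0.8 is placed in the stalling band, following the table's `0.5 - 0.8` row.

```python
def convergence_score(resolved, new, regressed):
    """resolved / (resolved + new + regressed), or 0.0 for a zero denominator."""
    denominator = resolved + new + regressed
    if denominator == 0:
        return 0.0
    return resolved / denominator


def convergence_status(score, all_persistent=False):
    """Map a score to the status labels of the interpretation table."""
    if all_persistent:
        return "stuck"
    if score > 0.8:
        return "converging"
    if score >= 0.5:
        return "stalling"
    return "diverging"
```

With 3 resolved, 1 new, and 0 regressed findings the score is 3 / 4 = 0.75, which the table places in the stalling band.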
-
----
-
-## Step 3: Oscillation Detection
-
-An oscillating finding is one that bounces between resolved and re-introduced across cycles:
-
-1. Finding was present in cycle N-2
-2. Finding was absent in cycle N-1 (resolved)
-3. Finding is present again in cycle N (regressed)
-
-This indicates the fix in cycle N-1 was undone or invalidated by other changes in cycle N.
-
-### Oscillation Rules
-
-- A single oscillating finding: **flag it** in the convergence report but continue.
-- Two or more oscillating findings: **STOP** and escalate to the user.
-- Message: `"Findings X and Y are oscillating between cycles. Manual intervention needed — the automated fixes are interfering with each other."`
-
-Oscillation tracking requires 3+ cycles of data. On cycles 1-2, oscillation detection is skipped.
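The oscillation check can be sketched with set arithmetic over the last three cycles. As a simplifying assumption, findings are compared here by exact key equality rather than by the fuzzy matching described in Step 1; the function names are hypothetical.

```python
def oscillating_findings(cycles):
    """Return keys present in cycle N-2, absent in N-1, and present again in N.

    `cycles` is a list of sets of finding keys, oldest first, with the
    current cycle last. Fewer than three cycles yields an empty set.
    """
    if len(cycles) < 3:
        return set()
    n_minus_2, n_minus_1, current = cycles[-3], cycles[-2], cycles[-1]
    return (n_minus_2 & current) - n_minus_1


def oscillation_action(oscillating):
    """Apply the rules: one finding is flagged, two or more stop the run."""
    if len(oscillating) >= 2:
        return "stop"
    if len(oscillating) == 1:
        return "flag"
    return "continue"
```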
-
---
-
-## Step 4: Early Termination Rules
-
-The convergence analysis can override the normal Act phase exit decision. If any of these conditions hold, the recommendation is **STOP**:
-
-| Condition | Threshold | Recommendation |
-|-----------|-----------|----------------|
-| Diverging | Score < 0.5 for 2 consecutive cycles | STOP — changes are making things worse |
-| Stalled | 0 findings resolved between cycles | STOP — no progress, further cycles will not help |
-| Stuck | All findings are PERSISTENT for 2 consecutive cycles | STOP — automated fixes cannot resolve these |
-| Oscillating | 2+ findings oscillating | STOP — fixes are interfering with each other |
-
-When STOP is recommended, the Act phase should:
-1. **Not** start another PDCA cycle
-2. Report all unresolved findings to the user
-3. Present the best implementation so far (on its branch, not merged)
-4. Include the convergence report explaining why the run was stopped
-
-### Override Behavior
-
-The convergence STOP recommendation overrides the normal cycle-back logic in the Act phase. Even if `CYCLE < MAX_CYCLES` and there are fixable-looking findings, if convergence says STOP, the run stops.
-
-The user can always override by explicitly requesting another cycle: `"Run one more cycle anyway"`.
-
----
-
-## Step 5: Integration with Act Phase
-
-### Event Data
-
-Convergence data is included in the `cycle.boundary` event emitted by the Act phase:
-
-```json
-{
-  "type": "cycle.boundary",
-  "phase": "act",
-  "data": {
-    "cycle": 2,
-    "max_cycles": 3,
-    "exit_condition": "convergence_stop",
-    "met": false,
-    "fixes_applied": 2,
-    "next_action": "stop",
-    "convergence": {
-      "score": 0.35,
-      "status": "diverging",
-      "resolved": 1,
-      "new": 2,
-      "regressed": 1,
-      "persistent": 3,
-      "oscillating": ["Timeline reference mismatch"],
-      "recommendation": "stop",
-      "reason": "Diverging for 2 consecutive cycles"
-    }
-  }
-}
-```
-
-### Decision Tree Update
-
-The Act phase decision tree (from `act-phase` skill Step 4) gains a new first branch:
-
-```
-┌─ Convergence analysis (cycle 2+)
-│
-├─ Convergence says STOP
-│  └─ STOP: Report to user with convergence report
-│
-├─ Convergence says CONTINUE
-│  └─ Fall through to normal exit decision logic
-│
-└─ Cycle 1 (no convergence data)
-   └─ Fall through to normal exit decision logic
-```
-
-### Act Feedback Enhancement
-
-When the Act phase builds `act-feedback.md` for the next cycle, it includes the convergence summary at the top:
-
-```markdown
-## Convergence Analysis (Cycle 1 → 2)
-
-Score: 0.75 (converging)
-Resolved: 3 | New: 1 | Regressed: 0 | Persistent: 2
-
-Recommendation: Continue — trend is positive
-
-### Finding Status
-| Finding | Status | Cycles |
-|---------|--------|--------|
-| SQL injection in user input | RESOLVED | 1 |
-| Missing rate limit | RESOLVED | 1 |
-| Test names unclear | RESOLVED | 1 |
-| Null check missing in parser | PERSISTENT | 2 |
-| Error path not tested | PERSISTENT | 2 |
-| New: Unused import introduced | NEW | 1 |
-```
-
----
-
-## Convergence Report Format
-
-The full convergence report is generated as part of the orchestration output:
-
-```markdown
-## Convergence Analysis (Cycle N-1 → N)
-
-**Score:** 0.75 (converging)
-**Resolved:** 3 | **New:** 1 | **Regressed:** 0 | **Persistent:** 2 | **Oscillating:** 0
-
-### Resolved This Cycle
-| Source | Category | Description |
-|--------|----------|-------------|
-| guardian | security | SQL injection in user input handler |
-| guardian | reliability | Missing rate limit on auth endpoint |
-| sage | quality | Test names don't describe behavior |
-
-### New This Cycle
-| Source | Category | Description |
-|--------|----------|-------------|
-| sage | quality | Unused import introduced by fix |
-
-### Persistent (unresolved across cycles)
-| Source | Category | Description | Cycles Open |
-|--------|----------|-------------|-------------|
-| trickster | reliability | Null check missing in parser | 2 |
-| sage | testing | Error path not tested | 2 |
-
-### Oscillating
-(none)
-
-**Recommendation:** Continue — trend is positive
-```
-
----
-
-## Integration with Memory Skill
-
-When convergence detects PERSISTENT findings (present for 2+ cycles), these are strong candidates for the `memory` skill's lesson extraction:
-
-- After a run that had persistent findings, `archeflow-memory.sh extract` will pick these up with higher confidence (they have been confirmed across multiple cycles within a single run).
-- Persistent findings that also appear in `lessons.jsonl` from prior runs get a double frequency boost (cross-cycle within run + cross-run pattern).
-
----
-
-## Design Principles
-
-1. **Conservative stopping.** Requires 2 consecutive data points before recommending STOP. A single bad cycle might be noise.
-2. **User has final say.** STOP is a recommendation, not an enforced shutdown. The user can override.
-3. **Cheap computation.** Keyword matching on finding descriptions, simple arithmetic on counts. No ML, no embeddings.
-4. **Bounded scope.** Only compares adjacent cycles (N vs N-1, with N-2 for oscillation). Does not attempt to model long-term trends across many cycles.
-5. **Observable.** All convergence data is included in the `cycle.boundary` event, making it available for post-hoc analysis via the process log.
--- a/skills/do-phase/SKILL.md
+++ b/skills/do-phase/SKILL.md
@@ -1,193 +0,0 @@
----
-name: do-phase
-description: Use when acting as Maker in the Do phase. Defines execution rules, worktree protocol, commit discipline, and output format.
----
-
-# Do Phase
-
-Maker implements the Creator's proposal. This skill defines the execution protocol — the agent definition (`agents/maker.md`) has the behavioral rules.
-
-## Execution Protocol
-
-### 1. Read Before Writing
-Read the Creator's proposal completely. Identify:
-- Files to create or modify (the `### Changes` section)
-- Test strategy (the `### Test Strategy` section)
-- Scope boundaries (the `### Not Doing` section)
-
-If the proposal is unclear on any point: implement your best interpretation and note the assumption in your output.
-
-### 2. Implementation Order
-For each change in the proposal:
-1. Write the test first (expect it to fail)
-2. Implement the change (make the test pass)
-3. Verify existing tests still pass
-4. Commit with a descriptive message
-
-For writing domain (stories, prose):
-1. Read the outline / scene plan
-2. Read the voice profile and character sheets
-3. Draft scene by scene, following the outline's emotional beats
-4. Self-check: does the voice hold? Does dialogue sound natural?
-5. Commit after each scene or logical section
-
-### 3. Commit Discipline
-
-**CRITICAL: Always commit before finishing.** Uncommitted worktree changes are LOST when the agent exits.
-
-Commit conventions:
-```
-feat: <what was added>           # New functionality
-fix: <what was fixed>            # Bug fix within the task
-test: <what was tested>          # Test additions
-docs: <what was documented>      # Documentation only
-```
-
-Commit frequency:
-- **Code:** After each logical step (one feature, one fix, one test suite)
-- **Writing:** After each scene or section (~500-1000 words)
-- **Never:** One big commit at the end with everything
-
-### 4. Scope Control
-
-Do exactly what the proposal says. No more, no less.
-
-**In scope:**
-- Files listed in the proposal's `### Changes` section
-- Tests specified in the `### Test Strategy` section
-- Dependencies explicitly mentioned
-
-**Out of scope (even if tempting):**
-- Refactoring code you noticed while implementing
-- Adding features not in the proposal
-- Fixing pre-existing bugs in adjacent code
-- Updating documentation beyond what the task requires
-
-If you encounter something that needs fixing but is out of scope: note it in `### Notes` for future work. Don't fix it now.
-
-### 5. Blocker Protocol
-
-If you hit a blocker (dependency missing, test infrastructure broken, proposal contradicts codebase):
-1. Document what's blocked and why
-2. Document what you completed before the block
-3. Commit what you have
-4. Stop and report — don't silently work around it
-
-## Worktree Protocol
-
-When running in an isolated git worktree (`isolation: "worktree"`):
-
-```
-main branch (untouched)
-└── archeflow/maker-<run_id> (worktree branch)
-    ├── commit: implementation step 1
-    ├── commit: implementation step 2
-    └── commit: implementation step 3 (final)
-```
-
-- All work stays on the worktree branch
-- Main branch is never modified directly
-- The branch name follows the pattern: `archeflow/maker-<run_id>`
-- After Check phase approves: the orchestrator merges (not the Maker)
-
-## Output Format
-
-```markdown
-## Implementation: <task>
-
-### Files Changed
-- `path/file.ext` — What changed (+N -M lines)
-
-### Tests
-- N new tests, all passing
-- M existing tests still passing
-
-### Commits
-1. `feat: description` (hash)
-2. `test: description` (hash)
-
-### Notes
-- Assumptions made where proposal was unclear
-- Out-of-scope issues noticed (for future work)
-
-### Branch
-`archeflow/maker-<run_id>` — ready for review
-```
-
-For writing domain:
-```markdown
-## Draft: <story/chapter title>
-
-### Scenes Written
-- Scene 1: <title> (~N words)
-- Scene 2: <title> (~N words)
-
-### Word Count
-- Target: N | Actual: M | Delta: +/-
-
-### Voice Notes
-- Dialect usage: N instances (target: moderate)
-- Essen/Trinken: present in X/Y scenes
-
-### Commits
-1. `feat: scene 1 - <title>` (hash)
-2. `feat: scene 2 - <title>` (hash)
-
-### Notes
-- Deviations from outline (with reasoning)
-```
-
-## With Prior Feedback (Cycle 2+)
-
-When the Maker receives feedback from a prior cycle's Check phase:
-
-1. Read the `act-feedback.md` — focus on the `### For Maker` section
-2. Address each finding marked as "routed to Maker"
-3. In your output, include a response table:
-
-```markdown
-### Feedback Response
-| Finding | Source | Action |
-|---------|--------|--------|
-| Test names unclear | Sage | Fixed — renamed to behavior descriptions |
-| Missing edge case | Trickster | Added test for empty input |
-```
-
-Do not address findings routed to Creator — those were handled in the revised proposal.
-
-## Quality Checklist (self-check before finishing)
-
-Before your final commit, verify:
-- [ ] All proposal changes implemented
-- [ ] All new tests pass
-- [ ] All existing tests still pass
-- [ ] No files modified outside proposal scope
-- [ ] Every logical step has its own commit
-- [ ] Output summary is complete and accurate
-- [ ] Branch name follows convention
-
-## Test-First Gate
-
-Before the Maker's output is accepted, the orchestrator validates that tests were included.
-
-### Validation Logic
-
-Read `do-maker-files.txt`. Check if any file path matches common test patterns:
-- `*test*`, `*spec*`, `*.test.*`, `*.spec.*`, `*_test.*`, `*_spec.*`
-- Files in directories named `test/`, `tests/`, `__tests__/`, `spec/`
-
-For writing domain projects, this gate is skipped.
-
-### Outcomes
-
-| Result | Action |
-|--------|--------|
-| Test files found | Pass — proceed to Check phase |
-| No test files, code domain | **Warn** — emit WARNING event, note in do-maker.md |
-| No test files + Creator specified tests | **Block** — re-run Maker with test instruction (1 retry) |
-| Writing domain | Skip gate entirely |
-
-The block case triggers a targeted re-run with prompt:
-"The proposal specified these test cases: <test strategy section>. No test files
-were found in your changes. Add the specified tests before finishing."
-This is one retry within the Do phase, not a full PDCA cycle.
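The Test-First Gate's validation logic and outcome table can be sketched as below. The names `is_test_path` and `check_test_first_gate` are hypothetical, and applying the glob patterns to the file name (rather than the whole path) is an assumption; the text only says "file path matches".

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns and directory names as listed in the validation logic.
TEST_FILE_PATTERNS = ("*test*", "*spec*", "*.test.*", "*.spec.*", "*_test.*", "*_spec.*")
TEST_DIRS = {"test", "tests", "__tests__", "spec"}


def is_test_path(path):
    """True if the path sits in a test directory or its name matches a pattern."""
    p = PurePosixPath(path)
    if any(part in TEST_DIRS for part in p.parts[:-1]):
        return True
    return any(fnmatchcase(p.name, pattern) for pattern in TEST_FILE_PATTERNS)


def check_test_first_gate(changed_files, domain="code", creator_specified_tests=False):
    """Return "skip", "pass", "warn", or "block" per the outcomes table."""
    if domain == "writing":
        return "skip"
    if any(is_test_path(f) for f in changed_files):
        return "pass"
    return "block" if creator_specified_tests else "warn"
```

Here `changed_files` stands for the lines of `do-maker-files.txt`. Note that `*test*` and `*spec*` are broad: a file such as `inspector.py` also matches, so a stricter gate may want to rely on the more specific patterns only.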
--- a/skills/effectiveness/SKILL.md
+++ b/skills/effectiveness/SKILL.md
@@ -1,200 +0,0 @@
----
-name: effectiveness
-description: |
-  Track archetype effectiveness across runs. Scores each archetype on signal-to-noise,
-  fix rate, cost efficiency, accuracy, and cycle impact. Recommends model tier changes
-  and archetype removal based on rolling averages.
-  <example>User: "Which reviewers are actually useful?"</example>
-  <example>User: "Show archetype effectiveness report"</example>
----
-
-# Agent Effectiveness Scoring
-
-Track which archetypes are most useful vs. which waste tokens. Over multiple runs, build a profile of each archetype's effectiveness and use it to optimize team composition and model selection.
-
-## Storage
-
-```
-.archeflow/memory/effectiveness.jsonl     # Per-run archetype scores (append-only)
-```
-
-## Scoring Dimensions
-
-For each archetype that participates in a run, calculate these scores:
-
-| Dimension | How Measured | Weight |
-|-----------|-------------|--------|
-| **Signal-to-noise** | useful findings / total findings | 0.30 |
-| **Fix rate** | findings that led to actual fixes / total findings | 0.25 |
-| **Cost efficiency** | useful findings per dollar spent | 0.20 |
-| **Accuracy** | findings not contradicted by other reviewers | 0.15 |
-| **Cycle impact** | did this archetype's findings lead to cycle exit? | 0.10 |
-
-### Definitions
-
-- **Useful finding**: A finding in a `review.verdict` event with `severity >= WARNING` (i.e., severity is `warning`, `bug`, or `critical`) AND `fix_required == true`.
-- **Actual fix**: A `fix.applied` event whose `source` field matches this archetype (or whose DAG `parent` chain traces back to this archetype's `review.verdict` event).
-- **Contradicted finding**: Another reviewer's `review.verdict` has `verdict == "approved"` for the same scope where this archetype flagged an issue. Approximation: if archetype A flags N findings but archetype B approves the same code with 0 findings in overlapping severity categories, A's unmatched findings are considered potentially contradicted.
-- **Cycle impact**: The archetype's findings (with `fix_required == true`) resulted in fixes that were part of the final approved cycle. Determined by checking if `fix.applied` events referencing this archetype exist before the final `cycle.boundary` with `met == true`.
-
-### Composite Score
-
-```
-composite = (signal_to_noise * 0.30)
-          + (fix_rate * 0.25)
-          + (cost_efficiency_normalized * 0.20)
-          + (accuracy * 0.15)
-          + (cycle_impact * 0.10)
-```
-
-**Cost efficiency normalization**: Raw cost efficiency is `useful_findings / cost_usd`. To normalize to 0-1 range, use: `min(1.0, raw_efficiency / 100)`. The threshold of 100 means "100 useful findings per dollar" is considered perfect efficiency (achievable with haiku on structured reviews).
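
As a rough illustration of the weighted sum and the normalization rule above, here is a minimal Python sketch. The function names and the zero-cost fallback are assumptions made for this example; the removed `lib/archeflow-score.sh` script may have computed these differently.

```python
# Weights taken from the Scoring Dimensions table.
WEIGHTS = {
    "signal_to_noise": 0.30,
    "fix_rate": 0.25,
    "cost_efficiency": 0.20,
    "accuracy": 0.15,
    "cycle_impact": 0.10,
}


def normalize_cost_efficiency(useful_findings: int, cost_usd: float) -> float:
    """Map raw efficiency (useful findings per dollar) onto 0-1, capped at 100/dollar."""
    if cost_usd <= 0:
        return 0.0  # assumption: without cost data, give no efficiency credit
    return min(1.0, (useful_findings / cost_usd) / 100)


def composite_score(signal_to_noise: float, fix_rate: float,
                    cost_efficiency_normalized: float, accuracy: float,
                    cycle_impact: bool) -> float:
    """Weighted sum of the five dimensions; cycle_impact counts as 1.0 or 0.0."""
    total = (signal_to_noise * WEIGHTS["signal_to_noise"]
             + fix_rate * WEIGHTS["fix_rate"]
             + cost_efficiency_normalized * WEIGHTS["cost_efficiency"]
             + accuracy * WEIGHTS["accuracy"]
             + (1.0 if cycle_impact else 0.0) * WEIGHTS["cycle_impact"])
    return round(total, 4)
```

Because the weights sum to 1.0, an archetype with a perfect value on every dimension scores exactly 1.0.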
-
-## Per-Run Scoring
-
-After `run.complete`, calculate scores for each archetype that participated. The `extract` command does this.
-
-### Per-Run Score Record
-
-```jsonl
-{"ts":"2026-04-03T16:00:00Z","run_id":"2026-04-03-der-huster","archetype":"guardian","signal_to_noise":0.85,"fix_rate":1.0,"cost_efficiency":42.5,"accuracy":1.0,"cycle_impact":true,"composite_score":0.91,"tokens":5000,"cost_usd":0.004,"model":"haiku","findings_total":4,"findings_useful":3,"fixes_applied":3}
-```
-
-Appended to `.archeflow/memory/effectiveness.jsonl`.
-
-### Scoring Non-Review Archetypes
-
-Only archetypes that produce `review.verdict` events are scored (Guardian, Skeptic, Sage, Trickster, and any custom review archetypes). Non-review archetypes (Explorer, Creator, Maker) are tracked by cost-tracking but not effectiveness-scored, because their output quality is measured differently (by whether the run succeeds, not by individual findings).
-
-## Aggregate Scoring
-
-Across all runs, maintain rolling averages (computed on-demand, not stored):
-
-```jsonl
-{"archetype":"guardian","runs":12,"avg_composite":0.88,"avg_signal_noise":0.82,"avg_cost_efficiency":38.2,"trend":"stable","recommendation":"keep"}
-{"archetype":"trickster","runs":8,"avg_composite":0.35,"avg_signal_noise":0.20,"avg_cost_efficiency":5.1,"trend":"declining","recommendation":"consider_removing"}
-```
-
-### Trend Calculation
-
-Compare the average composite score of the last 5 runs to the 5 runs before that:
-
- **improving**: last-5 avg > prior-5 avg + 0.05
- **declining**: last-5 avg < prior-5 avg - 0.05
- **stable**: within +/- 0.05
-
-If fewer than 10 runs exist, trend is `"insufficient_data"`.
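
The trend rule can be sketched as below. This is an illustrative example only: it assumes the composite scores for one archetype are passed as a list ordered oldest to newest, and the `trend` function name and its parameters are hypothetical.

```python
def trend(composites: list[float], window: int = 5, band: float = 0.05) -> str:
    """Compare the mean of the last `window` runs with the `window` runs before them."""
    if len(composites) < 2 * window:
        return "insufficient_data"
    last_avg = sum(composites[-window:]) / window
    prior_avg = sum(composites[-2 * window:-window]) / window
    if last_avg > prior_avg + band:
        return "improving"
    if last_avg < prior_avg - band:
        return "declining"
    return "stable"
```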
-
-### Recommendations
-
-Based on aggregate composite scores:
-
-| Composite Score | Recommendation | Meaning |
-|----------------|---------------|---------|
-| >= 0.70 | `keep` | Archetype is valuable, contributes meaningful findings |
-| 0.40 - 0.69 | `optimize` | Consider cheaper model or tighter review lens |
-| < 0.40 | `consider_removing` | Might be wasting tokens, review whether it adds value |
-
-## Integration Points
-
-### At Run Start
-
-When the `run` skill initializes, show a brief effectiveness summary for the team's archetypes:
-
-```
-Archetype effectiveness (last 10 runs):
-  guardian:  0.88 (keep)     — haiku, $0.004/run avg
-  sage:      0.72 (keep)     — sonnet, $0.08/run avg
-  skeptic:   0.45 (optimize) — haiku, $0.003/run avg
-  trickster: 0.32 (consider_removing) — haiku, $0.003/run avg
-```
-
-### Model Tier Suggestions
-
-Cross-reference effectiveness with model assignment:
-
- **High effectiveness on cheap model** (composite >= 0.7, model = haiku): "Keep cheap. Working well."
- **Low effectiveness on cheap model** (composite < 0.5, model = haiku): "Consider upgrading to sonnet — cheap model may not be capturing issues."
- **High effectiveness on expensive model** (composite >= 0.7, model = sonnet): "Try downgrading to haiku — may maintain quality at lower cost."
- **Low effectiveness on expensive model** (composite < 0.5, model = sonnet): "Consider removing — expensive and not contributing."
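
The recommendation thresholds and the model tier rules can be combined in a small lookup. The sketch below is an assumption-laden illustration: the function names are hypothetical, only `haiku` and `sonnet` are recognized because those are the only tiers the text names, and composites between 0.5 and 0.7 return no suggestion because the rules above do not cover that band.

```python
from typing import Optional


def recommendation(composite: float) -> str:
    """Map an aggregate composite score to keep / optimize / consider_removing."""
    if composite >= 0.70:
        return "keep"
    if composite >= 0.40:
        return "optimize"
    return "consider_removing"


def model_suggestion(composite: float, model: str) -> Optional[str]:
    """Cross-reference effectiveness with the model tier; None when no rule applies."""
    if model not in ("haiku", "sonnet"):
        return None  # assumption: other tiers are outside the documented rules
    cheap = model == "haiku"
    if composite >= 0.7:
        return "keep cheap model" if cheap else "try downgrading to haiku"
    if composite < 0.5:
        return "consider upgrading to sonnet" if cheap else "consider removing"
    return None  # 0.5 <= composite < 0.7 is not covered by the rules
```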
-
-### Cost-Tracking Integration
-
-Divide the composite effectiveness score by the estimated cost to get "value per dollar":
-
-```
-value_per_dollar = composite_score / cost_usd
-```
-
-This metric helps compare archetypes directly: a cheap archetype with moderate effectiveness may have higher value_per_dollar than an expensive one with high effectiveness.
-
-## Effectiveness Script
-
-**Location:** `lib/archeflow-score.sh`
-
-```
-Usage:
-  archeflow-score.sh extract <events.jsonl>     # Score archetypes from a completed run
-  archeflow-score.sh report                     # Show aggregate effectiveness report
-  archeflow-score.sh recommend <team.yaml>      # Recommend model tiers for a team
-```
-
-### `extract` Command
-
-1. Read all events from the JSONL file
-2. Verify a `run.complete` event exists (scoring incomplete runs is unreliable)
-3. For each `review.verdict` event:
-   - Count total findings and useful findings (severity >= WARNING, fix_required)
-   - Cross-reference with `fix.applied` events via the `source` field or DAG parent chain
-   - Check for contradictions from other reviewers
-   - Determine cycle impact
-4. Calculate all scoring dimensions and composite score
-5. Append per-archetype score records to `.archeflow/memory/effectiveness.jsonl`
-
-### `report` Command
-
-1. Read `.archeflow/memory/effectiveness.jsonl`
-2. Group by archetype
-3. Calculate rolling averages (last 10 runs per archetype)
-4. Calculate trends (last 5 vs. prior 5)
-5. Output a markdown table:
-
-```markdown
-# Archetype Effectiveness Report
-
-| Archetype | Runs | Avg Score | S/N | Fix Rate | Cost Eff | Accuracy | Trend | Rec |
-|-----------|------|-----------|-----|----------|----------|----------|-------|-----|
-| guardian | 12 | 0.88 | 0.82 | 0.95 | 38.2 | 0.97 | stable | keep |
-| sage | 10 | 0.72 | 0.70 | 0.80 | 12.1 | 0.88 | improving | keep |
-| skeptic | 8 | 0.45 | 0.40 | 0.50 | 22.5 | 0.60 | stable | optimize |
-| trickster | 8 | 0.35 | 0.20 | 0.30 | 5.1 | 0.55 | declining | consider_removing |
-
-**Model suggestions:**
- skeptic (haiku, score 0.45): Consider upgrading to sonnet or tightening review lens
- trickster (haiku, score 0.35): Consider removing — low signal, low fix rate
-```
-
-### `recommend` Command
-
-1. Read the team preset YAML file
-2. For each archetype in the team, look up its effectiveness from `.archeflow/memory/effectiveness.jsonl`
-3. Cross-reference current model assignment with effectiveness
-4. Output recommendations:
-
-```markdown
-# Model Recommendations for team: story-development
-
-| Archetype | Current Model | Score | Suggestion |
-|-----------|--------------|-------|------------|
-| guardian | haiku | 0.88 | Keep haiku — high effectiveness at low cost |
-| sage | sonnet | 0.72 | Keep sonnet — quality-sensitive role |
-| skeptic | haiku | 0.45 | Try sonnet — may improve signal quality |
-| trickster | haiku | 0.35 | Consider removing from team |
-```
-
-## Design Principles
-
-1. **Append-only.** Score records are immutable facts. Aggregates are computed on-demand.
-2. **Review archetypes only.** Non-review agents (Explorer, Creator, Maker) are not scored — their value is in the final product, not in individual findings.
-3. **Relative, not absolute.** Scores are meaningful in comparison (guardian vs. trickster), not as standalone numbers. The thresholds (0.7, 0.4) are starting points — calibrate after 20+ runs.
-4. **Actionable.** Every report ends with concrete recommendations (keep, optimize, remove, change model).
-5. **Cheap to compute.** One JSONL scan per report. No databases, no external services.
--- a/skills/orchestration/SKILL.md
+++ b/skills/orchestration/SKILL.md
@@ -1,634 +0,0 @@
---
-name: orchestration
-description: Use when executing a multi-agent orchestration — spawning archetype agents, managing PDCA cycles, coordinating worktrees, and merging results. This is the step-by-step execution guide.
---
-
-# Orchestration Execution
-
-This skill guides you through running a full ArcheFlow orchestration using Claude Code's native Agent tool and git worktrees.
-
-## Strategy Selection
-
-A **strategy** defines the shape of an orchestration run — which phases execute, in what order, and when to iterate. A **workflow** (fast/standard/thorough) controls the depth within a strategy.
-
-### Available Strategies
-
-| Strategy | Flow | When to Use |
-|----------|------|-------------|
-| `pdca` | Plan -> Do -> Check -> Act (cyclic) | Refactors, thorough reviews, multi-concern tasks |
-| `pipeline` | Plan -> Implement -> Spec-Review -> Quality-Review -> Verify (linear) | Bug fixes, fast patches, single-concern tasks |
-| `auto` | Selected by task analysis | Default — let ArcheFlow decide |
-
-### Strategy Interface
-
-Every strategy defines:
-
- **Phases** — ordered list of execution stages
- **Agent mapping** — which archetypes run in each phase
- **Transition rules** — conditions for moving between phases
- **Iteration model** — cyclic (PDCA) or linear (pipeline)
- **Exit conditions** — when the run terminates
-
-### PDCA Strategy
-
-The existing orchestration flow (Steps 0-4 below). Cyclic — the Act phase can feed back to Plan for another iteration. Best for tasks requiring multiple review perspectives and iterative refinement.
-
-### Pipeline Strategy
-
-Linear flow with no cycle-back. Faster for well-understood tasks where one pass is sufficient.
-
-| Phase | Agent | Purpose |
-|-------|-------|---------|
-| Plan | Creator | Design proposal |
-| Implement | Maker | Build in worktree |
-| Spec-Review | Guardian, then Skeptic | Security + assumption check (sequential) |
-| Quality-Review | Sage | Code quality review |
-| Verify | (automated) | Run tests, apply targeted fix if CRITICAL |
-
-No cycle-back — WARNINGs are logged but do not block. CRITICALs in Verify trigger a single targeted fix attempt by the Maker, not a full cycle.
-
-### Auto-Selection Rules
-
-When `strategy: auto` (default):
-
- Task contains "fix", "bug", "patch", "hotfix" → `pipeline`
- Task contains "refactor", "redesign", "review" → `pdca`
- Workflow is `thorough` → `pdca` (always)
- Workflow is `fast` with single file → `pipeline`
- Otherwise → `pdca`
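
The auto-selection rules leave their precedence implicit. The following sketch shows one possible reading, under these assumptions: `thorough` is checked first (the text says "always"), keywords are then checked in the listed order, and "contains" is interpreted as a whole-word, case-insensitive match. The `select_strategy` function and its `file_count` parameter are hypothetical.

```python
import re

PIPELINE_WORDS = {"fix", "bug", "patch", "hotfix"}
PDCA_WORDS = {"refactor", "redesign", "review"}


def select_strategy(task: str, workflow: str, file_count: int = 0) -> str:
    """Pick 'pipeline' or 'pdca' for strategy: auto."""
    words = set(re.findall(r"[a-z]+", task.lower()))
    if workflow == "thorough":
        return "pdca"  # thorough always uses PDCA
    if words & PIPELINE_WORDS:
        return "pipeline"
    if words & PDCA_WORDS:
        return "pdca"
    if workflow == "fast" and file_count == 1:
        return "pipeline"
    return "pdca"
```

A substring match instead of a whole-word match would also catch plurals such as "bugs", at the cost of false positives like "prefix".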
-
---
-
-## Step 0: Choose a Workflow
-
-If `.archeflow/teams/<name>.yaml` exists, the user can reference a team preset: `"Use the backend team"`. Load the preset's phase config instead of built-in defaults. See `archeflow:custom-archetypes` skill for preset format.
-
-Otherwise, assess the task and pick:
-
-| Signal | Workflow |
-|--------|----------|
-| Small fix, low risk, single concern | `fast` (1 cycle) |
-| Feature, multiple files, moderate risk | `standard` (2 cycles) |
-| Security-sensitive, breaking changes, public API | `thorough` (3 cycles) |
-
-## Workflow Adaptation Rules
-
-The initial workflow choice is a starting point, not a commitment. These rules adapt the workflow at runtime. Each rule specifies when it evaluates (which phase boundary).
-
-### A3: Confidence Gate (evaluates: after Plan, before Do)
-
-**When:** Creator's confidence table has any axis below 0.5.
-**Action by axis:**
-
-| Axis | Score < 0.5 Action |
-|------|-------------------|
-| Task understanding | **Pause.** Ask user to clarify before proceeding. Do not spawn Maker. |
-| Solution completeness | **Upgrade to standard.** Add Explorer before Maker starts. |
-| Risk coverage | **Spawn mini-Explorer** for the specific risky area (parallel, 5 min max). Maker can proceed. |
-
-A3 runs before any Do/Check agents spawn, so there are no cancellation issues.
-
-### A1: Conditional Escalation (evaluates: after Check, before next cycle)
-
-**When:** Guardian rejects with 2+ CRITICAL findings in a `fast` workflow.
-**Action:** Escalate to `standard` for the **next cycle** — add Skeptic + Sage to the reviewer roster.
-**Why:** If Guardian found serious issues, more perspectives help find root causes.
-**Sticky:** Once escalated, the workflow stays escalated for all remaining cycles. A2 does not apply to escalated workflows.
-
-### A2: Guardian Fast-Path (evaluates: after Guardian, before spawning other reviewers)
-
-**When:** Guardian finds 0 CRITICAL and 0 WARNING in a non-escalated `standard` or `thorough` workflow.
-**Action:** Do not spawn Skeptic, Sage, or Trickster. Proceed directly to Act phase.
-**Why:** Guardian's security review is the strictest gate. Clean pass = safe to skip additional reviewers.
-**Critical:** Evaluate A2 **after Guardian completes but before other reviewers are spawned.** Do not spawn reviewers in parallel with Guardian — spawn Guardian first, check A2, then spawn remaining reviewers only if A2 doesn't trigger.
-**Does not apply to:** Escalated workflows (A1 triggered), or first cycle of `thorough` workflows (Trickster is mandatory on first pass).
-**Log:** Note "Guardian fast-path taken" in orchestration report.
-
-### Evaluation Order
-
-```
-Plan phase completes → A3 (confidence gate)
-                     ↓
-Guardian completes  → A2 (fast-path check) → if clean, skip other reviewers
-                     ↓                       if not, spawn other reviewers
-Check phase done    → A1 (escalation check) → if 2+ CRITICALs in fast, next cycle is standard
-```
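
The A2 and A1 conditions can be expressed as two predicates. This is a minimal sketch based only on the rule text above; the function names and argument shapes are assumptions, and the orchestrator is assumed to track the `escalated` flag and the 1-based cycle number itself.

```python
def guardian_fast_path(workflow: str, escalated: bool, cycle: int,
                       criticals: int, warnings: int) -> bool:
    """A2: skip the remaining reviewers after a clean Guardian pass."""
    if escalated or workflow not in ("standard", "thorough"):
        return False
    if workflow == "thorough" and cycle == 1:
        return False  # Trickster is mandatory on the first thorough pass
    return criticals == 0 and warnings == 0


def should_escalate(workflow: str, guardian_rejected: bool, criticals: int) -> bool:
    """A1: escalate a fast workflow to standard for the next cycle."""
    return workflow == "fast" and guardian_rejected and criticals >= 2
```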
-
-## Process Logging
-
-If `.archeflow/events/` exists (or should be created), emit structured events throughout orchestration. See `archeflow:process-log` skill for full schema.
-
-**Quick reference — emit at these points:**
-
-```
-run.start        → After workflow selection, before first agent
-agent.start      → Before each Agent tool call
-agent.complete   → After each Agent returns (include duration, tokens, summary, artifacts)
-decision         → When choosing between alternatives (plot direction, approach, fix strategy)
-phase.transition → At Plan→Do, Do→Check, Check→Act boundaries
-review.verdict   → After each reviewer delivers verdict
-fix.applied      → After each edit addressing a review finding
-cycle.boundary   → End of PDCA cycle
-shadow.detected  → When shadow threshold triggers
-run.complete     → After final Act phase (include totals)
-```
-
-**Helper:** `./lib/archeflow-event.sh <run_id> <type> <phase> <agent> '<json>'`
-
-**Report:** `./lib/archeflow-report.sh .archeflow/events/<run_id>.jsonl`
-
-Events are optional — if the events dir doesn't exist, skip logging. Never let logging block orchestration.
-
---
-
-## Model Configuration
-
-Model assignment per archetype and workflow is configured in `.archeflow/config.yaml` under the `models:` section. The `archeflow:run` skill (section 0c) handles resolution with fallback chain: per-workflow per-archetype > per-workflow default > per-archetype > global default. When spawning agents manually, read the config to select the appropriate model.
-
---
-
-## Step 1: Plan Phase
-
-Spawn agents sequentially — Creator needs Explorer's findings.
-
-### Explorer (if standard or thorough)
-
-**Context to include:** Task description, relevant file paths, codebase access.
-**Context to exclude:** Prior proposals, review outputs, implementation details, feedback from previous cycles.
-
-```
-Agent(
-  description: "🔍 Explorer: research context",
-  prompt: "<task description>
-    You are the EXPLORER archetype.
-    Research the codebase to understand:
-    1. What files and functions are involved
-    2. What dependencies exist
-    3. What tests currently cover this area
-    4. What patterns the codebase uses
-    Write your findings as a structured research report.
-    Be thorough but focused — no rabbit holes.",
-  subagent_type: "Explore"
-)
-```
-
-### Creator
-
-**Context to include:** Task description, Explorer's research output. On cycle 2+: prior cycle's structured feedback (see Cycle Feedback Protocol).
-**Context to exclude:** Raw file contents (Explorer already summarized), git diffs, reviewer full outputs.
-
-**Fast workflow only (no Explorer):** The Creator must perform a Mini-Reflect before proposing:
-1. Restate the task in your own words (catch misunderstandings early)
-2. List 3 assumptions you're making
-3. Name the one risk that would cause most damage if wrong
-
-```
-Agent(
-  description: "🏗️ Creator: design proposal",
-  prompt: "<task description>
-    You are the CREATOR archetype.
-    <if fast workflow (no Explorer): Before proposing, perform a Mini-Reflect:
-      1. Restate the task in one sentence
-      2. List 3 assumptions you're making
-      3. Name the highest-damage risk
-      Then propose.>
-    <if standard/thorough: Based on the research findings: <Explorer's output>>
-    <if cycle 2+: Prior cycle feedback: <structured feedback — see Cycle Feedback Protocol>>
-    Design a solution proposal including:
-    1. Architecture decisions (with rationale)
-    2. Files to create/modify (with specific changes)
-    3. Alternatives considered (at least 2, with rejection rationale)
-    4. Test strategy
-    5. Confidence (scored by axis: task understanding, solution completeness, risk coverage)
-    6. Risks you foresee
-    <if cycle 2+: 7. How you addressed each unresolved issue from prior feedback>
-    Be decisive. Ship a clear plan, not a menu of options.",
-  subagent_type: "Plan"
-)
-```
-
-## Step 2: Do Phase
-
-Spawn Maker in an **isolated worktree** so changes don't affect main.
-
-**Context to include:** Creator's proposal only. On cycle 2+: implementation-routed feedback from Sage/Trickster.
-**Context to exclude:** Explorer's research, Guardian/Skeptic findings (those go to Creator).
-
-```
-Agent(
-  description: "⚒️ Maker: implement proposal",
-  prompt: "<task description>
-    You are the MAKER archetype.
-    Implement this proposal: <Creator's output>
-    <if cycle 2+: Implementation feedback from prior cycle: <Sage/Trickster findings only>>
-    Rules:
-    1. Follow the proposal exactly — don't redesign
-    2. Write tests for every behavioral change
-    3. Commit with descriptive messages
-    4. Run existing tests — nothing may break
-    5. If the proposal is unclear, implement your best interpretation and note it
-    Do NOT skip tests. Do NOT refactor unrelated code.
-
-    BEFORE finishing — Self-Review Checklist:
-    1. Did I change ALL files listed in the proposal's Changes section?
-    2. Did I add tests for each behavioral change?
-    3. Are there files in my diff NOT listed in the proposal? If yes, revert them.
-    4. Do all existing tests still pass?
-    Report any gaps in your Implementation summary.",
-  isolation: "worktree",
-  mode: "bypassPermissions"
-)
-```
-
-**Critical:** The Maker MUST commit its changes before finishing. Uncommitted changes in a worktree are lost.
-
-## Step 3: Check Phase
-
-Spawn Guardian **first**. After Guardian completes, check adaptation rule A2 (fast-path). If A2 triggers (0 CRITICAL, 0 WARNING, non-escalated workflow), skip remaining reviewers and proceed to Act. Otherwise, spawn remaining reviewers **in parallel**.
-
-**Reviewer spawning protocol:** The canonical sequence (Guardian first, A2 evaluation, parallel spawning, timeout handling) is defined in `archeflow:check-phase` under "Reviewer Spawning Protocol". Follow that protocol for the exact spawning order, context per reviewer, and timeout rules.
-
-### Guardian (always runs first)
-
-**Context to include:** Maker's git diff, proposal risk section only.
-**Context to exclude:** Explorer's research, full proposal, other reviewer outputs.
-
-```
-Agent(
-  description: "🛡️ Guardian: security and risk review",
-  prompt: "You are the GUARDIAN archetype.
-    Review the changes in branch: <maker's branch>
-    Assess:
-    1. Security vulnerabilities (injection, auth bypass, data exposure)
-    2. Reliability risks (error handling, edge cases, race conditions)
-    3. Breaking changes (API compatibility, schema migrations)
-    4. Dependency risks (new deps, version conflicts)
-    Output: APPROVED or REJECTED with specific findings.
-    Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
-    Categories: security, reliability, design, breaking-change, dependency
-    Be rigorous but practical — flag real risks, not theoretical ones."
-)
-```
-
-### Skeptic (if standard or thorough)
-
-**Context to include:** Creator's proposal (focus on assumptions section).
-**Context to exclude:** Git diff details, Explorer's research, other reviewer outputs.
-
-```
-Agent(
-  description: "🤔 Skeptic: challenge assumptions",
-  prompt: "You are the SKEPTIC archetype.
-    Review the proposal: <Creator's proposal>
-    Challenge:
-    1. Assumptions in the design — what if they're wrong?
-    2. Alternative approaches not considered
-    3. Edge cases not tested
-    4. Scalability concerns
-    Output: APPROVED or REJECTED with counterarguments.
-    Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
-    Categories: design, quality, testing, scalability
-    Be constructive — every challenge must include a suggested alternative."
-)
-```
-
-### Sage (if standard or thorough)
-
-**Context to include:** Creator's proposal, Maker's git diff, implementation summary.
-**Context to exclude:** Explorer's raw research, other reviewer outputs.
-
-```
-Agent(
-  description: "📚 Sage: holistic quality review",
-  prompt: "You are the SAGE archetype.
-    Review the changes in branch: <maker's branch>
-    Evaluate holistically:
-    1. Code quality (readability, maintainability, simplicity)
-    2. Test coverage (are the tests meaningful, not just present?)
-    3. Documentation (does the change need docs?)
-    4. Consistency with codebase patterns
-    Output: APPROVED or REJECTED with quality findings.
-    Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
-    Categories: quality, testing, design, consistency
-    Judge like a senior engineer doing a PR review."
-)
-```
-
-### Trickster (if thorough only)
-
-**Context to include:** Maker's git diff only.
-**Context to exclude:** Everything else — proposal, research, other reviews.
-
-```
-Agent(
-  description: "🃏 Trickster: adversarial testing",
-  prompt: "You are the TRICKSTER archetype.
-    Try to break the changes in branch: <maker's branch>
-    Attack vectors:
-    1. Malformed input, boundary values, empty/null/huge data
-    2. Concurrency and race conditions
-    3. Error path exploitation
-    4. Dependency failure scenarios
-    Output: APPROVED or REJECTED with edge cases found.
-    Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
-    Categories: security, reliability, testing
-    Think like a QA engineer who gets paid per bug found."
-)
-```
-
-## Step 4: Act Phase
-
-Collect all reviewer outputs and decide.
-
-### Completion Promise (optional)
-
-If the user defined explicit done criteria with the task, check them now:
-
-```
-Completion criteria: <test command passes> AND <Guardian approves>
-Example: "done when pytest passes and Guardian approves with 0 CRITICAL"
-```
-
-If completion criteria are defined, **all criteria must pass** — reviewer approval alone is not sufficient. If tests fail but reviewers approved, cycle back with "tests failing" as feedback to Creator.
-
-### All Approved (and completion criteria met)
-1. **Pre-merge hooks:** Check `.archeflow/hooks.yaml` for `pre-merge` hooks. Run them. If `fail_action: abort`, stop and report.
-2. Merge the Maker's worktree branch into the target branch
-3. **Post-merge hooks:** Run `post-merge` hooks from `.archeflow/hooks.yaml` if defined. Then run the project's test suite on the merged branch
-   - Tests pass → proceed to step 4
-   - Tests fail → **auto-revert** the merge commit, report the failure, and cycle back with "integration test failure on main" as feedback
-4. Report: what was implemented, what was reviewed, any warnings noted
-5. Clean up the worktree
-6. Record metrics (see Orchestration Metrics)
-
-### Issues Found (and cycles remaining)
-1. Build structured feedback using the Cycle Feedback Protocol below
-2. Go back to Step 1 (Plan) with the feedback
-3. Creator revises the proposal, addressing each unresolved issue
-4. Maker re-implements in a fresh worktree
-5. Reviewers check again
-
-### Max Cycles Reached with Unresolved Issues
-1. Report all unresolved findings to the user
-2. Present the best implementation so far (on its branch)
-3. Let the user decide: merge as-is, fix manually, or abandon
-
---
-
-## Cycle Feedback Protocol
-
-After the Check phase, build structured feedback for the next cycle. This replaces dumping raw reviewer output.
-
-### 1. Extract Findings
-
-Parse each reviewer's output into the standardized format:
-
-```markdown
-## Cycle N Feedback
-
-### Unresolved Issues
-| Source | Severity | Category | Issue | Route to |
-|--------|----------|----------|-------|----------|
-| Guardian | CRITICAL | security | SQL injection in user input | Creator |
-| Skeptic | WARNING | design | Assumes single-tenant only | Creator |
-| Sage | WARNING | quality | Test names don't describe behavior | Maker |
-| Trickster | CRITICAL | reliability | Empty string bypasses validation | Creator |
-
-### Resolved (from cycle N-1)
-| Source | Issue | Resolution |
-|--------|-------|------------|
-| Guardian | Missing rate limit | Added rate limiter middleware |
-```
-
-### 2. Route Feedback
-
-Not all findings go to the same agent:
-
-| Source | Category | Routes to | Reason |
-|--------|----------|-----------|--------|
-| Guardian | security, breaking-change | **Creator** | Design must change |
-| Guardian | reliability, dependency | **Creator** | Architectural decision needed |
-| Skeptic | design, scalability | **Creator** | Assumptions need revision |
-| Sage | quality, consistency | **Maker** | Implementation refinement |
-| Sage | testing | **Maker** | Test gap, not design flaw |
-| Trickster | reliability (design flaw) | **Creator** | Needs redesign |
-| Trickster | reliability (test gap) | **Maker** | Needs more tests |
-| Trickster | testing | **Maker** | Edge case not covered |
-
-**Disambiguation rule:** When in doubt: if the fix requires changing the approach, route to Creator. If it requires changing the code within the existing approach, route to Maker.
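
The routing table and the disambiguation rule could be encoded as a lookup with a fallback. The sketch below is hypothetical: the `route` function and its `changes_approach` flag are names invented for this example, and the flag stands in for the human or orchestrator judgment of whether a fix requires a new approach.

```python
# (source, category) pairs from the routing table that have a fixed destination.
ROUTES = {
    ("guardian", "security"): "creator",
    ("guardian", "breaking-change"): "creator",
    ("guardian", "reliability"): "creator",
    ("guardian", "dependency"): "creator",
    ("skeptic", "design"): "creator",
    ("skeptic", "scalability"): "creator",
    ("sage", "quality"): "maker",
    ("sage", "consistency"): "maker",
    ("sage", "testing"): "maker",
    ("trickster", "testing"): "maker",
}


def route(source: str, category: str, changes_approach: bool = False) -> str:
    """Return 'creator' or 'maker' for a reviewer finding."""
    key = (source.lower(), category.lower())
    if key in ROUTES:
        return ROUTES[key]
    # Trickster reliability findings and unlisted pairs use the disambiguation
    # rule: a change of approach goes to Creator, a code-level change to Maker.
    return "creator" if changes_approach else "maker"
```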
-
-### 3. Track Resolution
-
-Compare cycle N findings against cycle N-1:
- If a prior finding no longer appears in the same category → mark **resolved**
- If a prior finding persists → it stays **unresolved** with an incremented cycle count
- If new findings appear → add as new unresolved issues
-
-This prevents regression and gives the Creator/Maker a clear list of what to address.
-
-### 4. Convergence Detection
-
-If the **same finding** (same category + same file location) appears **unresolved in 2 consecutive cycles**, escalate to user:
-
-> "Finding persists across 2 cycles: [Guardian] CRITICAL security — SQL injection in src/auth.ts:48. This may need human judgment or a different approach."
-
-Do not cycle again blindly. The issue is likely structural (wrong design, not wrong implementation) and needs human input.
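
A minimal sketch of the convergence check, assuming each finding is a dict with hypothetical `category` and `location` keys (for example `"src/auth.ts:48"`), and that the unresolved findings of the previous and current cycle are available as lists:

```python
def persistent_findings(prev_cycle: list[dict], curr_cycle: list[dict]) -> list[dict]:
    """Return current findings whose (category, location) was also unresolved last cycle."""
    seen = {(f["category"], f["location"]) for f in prev_cycle}
    return [f for f in curr_cycle if (f["category"], f["location"]) in seen]
```

A non-empty result would be the signal to stop cycling and escalate to the user. Matching on exact file location is a simplification: a fix attempt that shifts line numbers would hide a persisting issue, so a real implementation may need fuzzier matching.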
-
-### 5. Cross-Archetype Dedup
-
-If two reviewers raise the same issue (same file + same category + similar description), merge into one finding in the consolidated output:
-
-```
-| Guardian + Skeptic | CRITICAL | security | Input not sanitized (src/api.ts:30) | Add validation |
-```
-
-Don't double-count in severity tallies. Route to the higher-priority destination (Creator over Maker).
-
---
-
-## Orchestration Metrics
-
-Track lightweight metrics throughout the orchestration. No token counting (unreliable from skill layer) — just timing and outcomes.
-
-### Per-Phase Logging
-
-After each phase completes, note:
-
-```
-| Phase | Duration | Agents | Outcome |
-|-------|----------|--------|---------|
-| Plan  | 45s      | 2      | Proposal ready (confidence: 0.8) |
-| Do    | 90s      | 1      | 4 files changed, 8 tests added |
-| Check | 60s      | 3      | 1 REJECTED (Guardian), 2 APPROVED |
-| Act   | —        | —      | Cycle back → feedback built |
-```
-
-### Orchestration Summary
-
-At orchestration end, include in the report:
-
-```markdown
-## Orchestration Metrics
-| Metric | Value |
-|--------|-------|
-| Workflow | standard |
-| Cycles | 2 of 2 |
-| Total duration | 4m 30s |
-| Agents spawned | 9 |
-| Findings (total) | 5 |
-| Findings (critical) | 1 |
-| Findings (resolved) | 4 |
-| Shadow detections | 0 |
-```
-
-Use this data to calibrate future workflow selection. For example, if `fast` workflows consistently finish without any revision cycle, those tasks were well-scoped and `fast` was the right choice.
-
---
-
-## Autonomous Mode
-
-When running unattended (overnight sessions, batch queues), add these behaviors to the orchestration loop:
-
-### Between-Task Checkpoint
-
-After each task completes (success or failure):
-1. **Commit and push** all changes immediately
-2. **Update session log** at `.archeflow/session-log.md` with task outcome
-3. **Check stop conditions** before starting next task:
-   - 3 consecutive failures → STOP
-   - Shadow escalation (same shadow 3+ times) → STOP
-   - Test suite broken after merge → REVERT and STOP
-   - Destructive action detected → STOP
-
-### Session Log Protocol
-
-**Primary:** Emit `run.complete` event to `.archeflow/events/<run_id>.jsonl` (see Process Logging section above). The event stream is the source of truth.
-
-**Secondary:** Also write a human-readable summary to `.archeflow/session-log.md`:
-
-```markdown
-## Task N: <description>
-**Workflow:** standard | **Status:** COMPLETED/FAILED
-**Cycles:** 1 of 2
-**Findings:** Guardian APPROVED, Skeptic APPROVED, Sage WARNING (test names)
-**Files changed:** 5 | **Tests added:** 12
-**Branch:** merged to main (commit abc1234) | OR: archeflow/maker-xyz (NOT merged)
-**Duration:** 8 min
-**Events:** `.archeflow/events/<run_id>.jsonl` (full process log)
-```
-
-Generate the full Markdown report: `./lib/archeflow-report.sh .archeflow/events/<run_id>.jsonl`
-
-### Safety Rules
- Never force-push. Never modify main history.
- All work stays on worktree branches until explicitly merged
- Merges use `--no-ff` — individually revertable
- Failed tasks leave branches intact for manual inspection
-
-For full autonomous mode details (task queues, overnight checklists, user controls): load the `archeflow:autonomous-mode` skill.
-
---
-
-## Shadow Monitoring
-
-During orchestration, watch for shadow activation after each agent completes. Quick checklist:
-
-| Archetype | Shadow | Quick Check |
-|-----------|--------|-------------|
-| Explorer | Rabbit Hole | Output >2000 words without Recommendation section? |
-| Creator | Over-Architect | >2 new abstractions for one feature? |
-| Maker | Rogue | No test files in changeset? Files outside proposal? |
-| Guardian | Paranoid | CRITICAL:WARNING ratio >2:1? Zero approvals? |
-| Skeptic | Paralytic | >7 challenges? <50% have alternatives? |
-| Trickster | False Alarm | Findings in untouched code? >10 findings? |
-| Sage | Bureaucrat | Review >2x code change length? |
-
-On detection: apply correction prompt from `archeflow:shadow-detection` skill. On second detection of same shadow: replace agent. On 3+ shadows in same cycle: escalate to user.
-
---
-
-## Parallel Team Orchestration
-
-When running multiple independent tasks, spawn parallel ArcheFlow teams. Each team runs its own PDCA cycle on a separate worktree.
-
-### Rules
-
-1. **Non-overlapping file scope:** Each team must work on different files. If two tasks touch the same file, run them sequentially.
-2. **Independent worktrees:** Each team's Maker gets its own worktree branch (`archeflow/team-1-maker`, `archeflow/team-2-maker`).
-3. **First-finished-first-merged:** Teams merge in completion order. Later teams rebase onto the updated main before their own merge.
-4. **Merge conflict handling:** If rebase fails, the later team re-runs its Check phase against the merged main. If conflicts are structural, escalate to user.
-5. **Max 3 parallel teams:** More causes diminishing returns and merge headaches.
-
-### Spawning Parallel Teams
-
-```
-# Launch 2-3 teams in a single message with multiple Agent calls:
-Agent(description: "🏗️ Team 1: pagination fix (fast)", ...)
-Agent(description: "🏗️ Team 2: JWT auth (standard)", ...)
-Agent(description: "🏗️ Team 3: logging refactor (fast)", ...)
-```
-
-Each team follows the full PDCA steps independently. The orchestrator monitors all teams and handles merges.
-
---
-
-## Reviewer Profiles
-
-Projects can configure which reviewers matter in `.archeflow/config.yaml`:
-
-```yaml
-reviewers:
-  always: [guardian]        # Always runs
-  default: [sage]           # Runs in standard+thorough
-  thorough_only: [trickster] # Only in thorough
-  skip: [skeptic]           # Never runs for this project
-```
-
-If no config exists, use the built-in workflow defaults. Profiles save tokens by not spawning reviewers that add little value for the specific project.
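As a rough sketch of how such a profile could be resolved, the function below picks reviewers for a workflow. It assumes the workflow names `fast`, `standard`, and `thorough`, and assumes that `skip` overrides every other list; `select_reviewers` is a hypothetical helper, and the profile is passed in as an already-parsed dict.

```python
from typing import Dict, List


def select_reviewers(workflow: str, profile: Dict[str, List[str]]) -> List[str]:
    """Return the reviewers to spawn for a workflow, in profile order."""
    chosen = list(profile.get("always", []))
    if workflow in ("standard", "thorough"):
        chosen += profile.get("default", [])
    if workflow == "thorough":
        chosen += profile.get("thorough_only", [])

    skip = set(profile.get("skip", []))  # assumption: skip always wins
    result: List[str] = []
    for reviewer in chosen:
        if reviewer not in skip and reviewer not in result:
            result.append(reviewer)
    return result
```

With the example profile above, a `fast` run would spawn only Guardian, while a `thorough` run would add Sage and Trickster.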
-
-## Explorer Cache
-
-If the same code area was explored recently, skip Explorer and reuse prior research:
-
-**Cache hit criteria:** Affected files overlap by >70% (by path) AND prior research is <24 hours old AND no commits to those files since the research.
-
-**On cache hit:** Show the prior research to Creator with a note: "Using cached Explorer research from [timestamp]. If the codebase changed significantly, re-run Explorer."
-
-**On cache miss:** Run Explorer normally.
-
-Cache is stored in `.archeflow/explorer-cache/` as timestamped markdown files. The orchestrator checks for matches before spawning Explorer.
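The three cache-hit criteria can be expressed as one predicate. This is a sketch under stated assumptions: "overlap" is taken to mean the share of the new task's files that the cached research already covered, and the caller supplies the timestamp of the latest commit to those files (or `None` if there is none). `is_cache_hit` is a hypothetical name.

```python
from datetime import datetime, timedelta
from typing import Optional, Set


def is_cache_hit(
    task_files: Set[str],
    cached_files: Set[str],
    researched_at: datetime,
    last_commit_at: Optional[datetime],
    now: datetime,
) -> bool:
    """True when cached Explorer research may be reused for this task."""
    if not task_files:
        return False
    overlap = len(task_files & cached_files) / len(task_files)
    fresh = now - researched_at < timedelta(hours=24)
    unchanged = last_commit_at is None or last_commit_at <= researched_at
    return overlap > 0.70 and fresh and unchanged
```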
-
-## Learning from History
-
-Track which archetypes catch real issues per project over time. After each orchestration, append to `.archeflow/metrics.jsonl`:
-
-```json
-{"task": "...", "archetype": "guardian", "findings": 2, "critical": 1, "resolved": 2, "useful": true}
-{"task": "...", "archetype": "skeptic", "findings": 3, "critical": 0, "resolved": 0, "useful": false}
-```
-
-A finding is **useful** if it was resolved (led to a code change) rather than dismissed.
-
-After 10+ orchestrations, the orchestrator can recommend reviewer profile changes:
- "Skeptic has found 0 useful issues in 8 runs — consider moving to `skip` or `thorough_only`"
- "Guardian catches critical issues in 80% of runs — confirmed as essential"
-
-This is advisory, not automatic. The user decides based on the data.
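A per-archetype summary of `metrics.jsonl` could be computed along these lines. The sketch only aggregates counts; the thresholds for an actual recommendation are not specified here, so none are invented. `summarize` and the output keys are hypothetical.

```python
import json
from collections import defaultdict
from typing import Dict


def summarize(jsonl_text: str) -> Dict[str, Dict[str, int]]:
    """Count runs, useful runs, and critical findings per archetype."""
    stats: Dict[str, Dict[str, int]] = defaultdict(
        lambda: {"runs": 0, "useful_runs": 0, "critical": 0}
    )
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        entry = stats[record["archetype"]]
        entry["runs"] += 1
        entry["useful_runs"] += 1 if record.get("useful") else 0
        entry["critical"] += record.get("critical", 0)
    return dict(stats)
```

The resulting counts are what a message such as "0 useful issues in 8 runs" would be built from; the decision stays with the user.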
-
---
-
-## Orchestration Report
-
-After completion, summarize:
-
-```markdown
-## ArcheFlow Orchestration Report
- **Task:** <description>
- **Workflow:** standard (2 cycles)
- **Cycle 1:** Guardian rejected (SQL injection in user input handler)
- **Cycle 2:** All approved after input sanitization added
- **Files changed:** 4 files, +120 -30 lines
- **Tests added:** 8 new tests
- **Branch:** archeflow/maker-<id> → merged to main
- **Metrics:** 9 agents, 4m 30s, 5 findings (4 resolved, 1 info remaining)
-```
--- a/skills/plan-phase/SKILL.md
+++ b/skills/plan-phase/SKILL.md
@@ -1,175 +0,0 @@
---
-name: plan-phase
-description: Use when acting as Explorer or Creator in the Plan phase. Defines output formats for research and proposals.
---
-
-# Plan Phase
-
-Explorer researches, then Creator designs. Sequential — Creator needs Explorer's findings.
-
-## Explorer Output Format
-
-```markdown
-## Research: <task>
-
-### Affected Code
- `path/file.ext` — description (L<start>-<end>)
-
-### Dependencies
- What depends on what, what breaks if changed
-
-### Patterns
- How the codebase solves similar problems
-
-### Risks
- What could go wrong
-
-### Recommendation
-<one paragraph: approach + rationale>
-```
-
-## Creator Output Format
-
-```markdown
-## Proposal: <task>
-
-### Mini-Reflect (fast workflow only — skip if Explorer ran)
- **Task restated:** <one sentence>
- **Assumptions:** 1) ... 2) ... 3) ...
- **Highest-damage risk:** <the one thing that would hurt most if wrong>
-
-### Architecture Decision
-<What and WHY>
-
-### Alternatives Considered
-| Approach | Why Rejected |
-|----------|-------------|
-| <option A> | <reason> |
-| <option B> | <reason> |
-
-### Changes
-1. **`path/file.ext`** — What changes and why
-2. **`path/test.ext`** — What tests to add
-
-### Test Strategy
- <specific test cases>
-
-### Confidence
-| Axis | Score | Note |
-|------|-------|------|
-| Task understanding | <0.0-1.0> | <why> |
-| Solution completeness | <0.0-1.0> | <gaps?> |
-| Risk coverage | <0.0-1.0> | <unknowns?> |
-
-### Risks
- <what could go wrong + mitigations>
-
-### Not Doing
- <adjacent concerns deliberately excluded>
-```
-
-**Confidence triggers:** If any axis scores below 0.5, flag it to the orchestrator. Low task understanding → clarify with user. Low solution completeness → consider standard workflow. Low risk coverage → spawn targeted Explorer research.
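The confidence triggers amount to a lookup from axis to follow-up action. A minimal sketch, assuming the three scores arrive as a dict keyed by hypothetical snake_case axis names:

```python
from typing import Dict, List

# Follow-up action per confidence axis, as described in the triggers above.
ACTIONS = {
    "task_understanding": "clarify with user",
    "solution_completeness": "consider standard workflow",
    "risk_coverage": "spawn targeted Explorer research",
}


def confidence_actions(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return the follow-up actions for every axis scoring below the threshold."""
    return [action for axis, action in ACTIONS.items() if scores[axis] < threshold]
```

A score of exactly 0.5 is not flagged, since the trigger is "below 0.5".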
-
-## Creator with Prior Feedback (Cycle 2+)
-
-When the Creator receives structured feedback from a prior cycle, the proposal must include an additional section addressing each unresolved issue:
-
-```markdown
-## Proposal: <task> (Revision — Cycle N)
-
-### What Changed (vs. prior proposal)
- <brief delta: what was added, removed, or redesigned>
-
-### Prior Feedback Response
-| Issue | Source | Action | Rationale |
-|-------|--------|--------|-----------|
-| SQL injection in user input | Guardian | **Fixed** — added parameterized queries | Direct security fix |
-| Assumes single-tenant | Skeptic | **Deferred** — multi-tenant out of scope | Not in task requirements |
-| Test names unclear | Sage | **Accepted** — routed to Maker | Implementation concern |
-
-### Architecture Decision
-<revised design addressing feedback>
-
-### Changes
-<updated file list>
-
-### Test Strategy
-<updated test cases>
-
-### Confidence
-| Axis | Score | Note |
-|------|-------|------|
-| Task understanding | <0.0-1.0> | <why> |
-| Solution completeness | <0.0-1.0> | <gaps?> |
-| Risk coverage | <0.0-1.0> | <unknowns?> |
-
-### Risks
-<updated risks — include any new risks from the revision>
-
-### Not Doing
-<updated scope boundaries>
-```
-
-**Rules for addressing feedback:**
- **Fixed:** Changed the design to resolve the issue. Explain how.
- **Deferred:** Not addressing now, with explicit reason. Must not be a CRITICAL finding.
- **Accepted:** Acknowledged and routed to Maker for implementation-level fix.
- **Disputed:** Disagrees with the finding. Must provide evidence or reasoning.
-
-CRITICAL findings cannot be deferred or disputed — they must be fixed or the proposal will be rejected again.
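The rules above lend themselves to a mechanical check of the Prior Feedback Response table. The sketch below assumes each table row has been parsed into a dict with hypothetical keys `issue`, `severity`, and `action`; it flags only what the rules state explicitly.

```python
from typing import Dict, List

ALLOWED_ACTIONS = {"Fixed", "Deferred", "Accepted", "Disputed"}


def feedback_violations(rows: List[Dict[str, str]]) -> List[str]:
    """List rule violations in a Prior Feedback Response table."""
    problems: List[str] = []
    for row in rows:
        action = row["action"]
        if action not in ALLOWED_ACTIONS:
            problems.append(f"{row['issue']}: unknown action {action!r}")
        elif row["severity"] == "CRITICAL" and action in {"Deferred", "Disputed"}:
            problems.append(
                f"{row['issue']}: CRITICAL findings cannot be {action.lower()}"
            )
    return problems
```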
-
-## Task Granularity
-
-Each change item in the Creator's proposal must be a **2-5 minute task** — specific enough that the Maker can implement it without interpretation.
-
-### Requirements per Change Item
-
-Every item in the `### Changes` section must include:
-
-1. **Exact file path** — `src/auth/handler.ts`, not "the auth module"
-2. **What to change** — a code block showing the target state or transformation
-3. **How to verify** — a command or check that confirms correctness
-
-### Good Example
-
-```markdown
-1. **`src/auth/handler.ts:48`** — Add input length validation before token processing
-   ```typescript
-   if (!token || token.trim().length === 0) {
-     throw new ValidationError('Token must not be empty');
-   }
-   ```
-   **Verify:** `npm test -- --grep "empty token"` passes
-```
-
-### Bad Example
-
-```markdown
-1. **Auth module** — Fix the validation logic
-```
-
-This is too vague. Which file? Which function? What does "fix" mean? The Maker will guess.
-
-### Granularity Check
-
- If a single change item would take **>5 minutes**, split it into smaller items
- If a non-trivial task has **<2 change items**, it is under-specified — the Creator missed something
- Each item should touch **1-2 files** at most. Cross-cutting changes need separate items per file.
-
---
-
-## Explorer Skip Conditions
-
-Not every task needs Explorer research. Use this decision table:
-
-| Condition | Skip Explorer? | Reason |
-|-----------|---------------|--------|
-| Task names specific files (1-2) and change is clear | **Yes** | Context is already known |
-| Bug fix with stack trace or error message | **Yes** | Root cause is locatable without research |
-| High confidence + small scope (single function/class) | **Yes** | Creator can mini-reflect instead |
-| Task contains "investigate", "research", "explore" | **No** | Explicit research request |
-| Task affects >3 files or unknown scope | **No** | Need dependency mapping |
-| Unfamiliar area of codebase (no recent commits by team) | **No** | Need pattern discovery |
-| Security-sensitive change (auth, crypto, input handling) | **No** | Need risk surface mapping |
-
-When Explorer is skipped, Creator MUST include the **Mini-Reflect** section in its proposal to compensate for missing research context.
--- a/skills/process-log/SKILL.md
+++ b/skills/process-log/SKILL.md
@@ -1,278 +0,0 @@
---
-name: process-log
-description: |
-  Event-based process logging for ArcheFlow orchestrations. Captures every phase transition,
-  agent output, decision, and fix as structured JSONL events. Enables post-hoc reports,
-  dashboards, and process archaeology.
-  <example>Automatically loaded during orchestration</example>
-  <example>User: "Show me how this story was made"</example>
---
-
-# Process Log — Event-Sourced Orchestration History
-
-Every ArcheFlow orchestration writes structured events to a JSONL file. Events are the **single source of truth** — all reports (Markdown, dashboards, timelines) are generated views.
-
-## Event Storage
-
-```
-.archeflow/events/<run-id>.jsonl     # One file per orchestration run
-.archeflow/events/index.jsonl        # Run index (one line per run, for listing)
-```
-
-**Run ID format:** `<date>-<slug>` (e.g., `2026-04-03-der-huster`)
-
-## When to Emit Events
-
-Emit an event at each of these points during orchestration:
-
-| Moment | Event Type | Trigger |
-|--------|-----------|---------|
-| Orchestration starts | `run.start` | After workflow selection, before first agent |
-| Agent spawned | `agent.start` | Before each Agent tool call |
-| Agent completes | `agent.complete` | After each Agent returns |
-| Phase transition | `phase.transition` | Plan→Do, Do→Check, Check→Act |
-| Decision made | `decision` | Plot direction chosen, fix applied, workflow adapted |
-| Review verdict | `review.verdict` | Guardian/Sage/Skeptic delivers verdict |
-| Fix applied | `fix.applied` | After each edit that addresses a review finding |
-| Cycle boundary | `cycle.boundary` | End of PDCA cycle, before next (or exit) |
-| Shadow detected | `shadow.detected` | Shadow threshold triggered |
-| Orchestration ends | `run.complete` | After final Act phase |
-
-## Event Schema
-
-Every event is one JSON line with these required fields (pretty-printed here for readability):
-
-```jsonl
-{
-  "ts": "2026-04-03T14:32:07Z",
-  "run_id": "2026-04-03-der-huster",
-  "seq": 4,
-  "parent": [2],
-  "type": "agent.complete",
-  "phase": "plan",
-  "agent": "creator",
-  "data": { ... }
-}
-```
-
-| Field | Type | Description |
-|-------|------|-------------|
-| `ts` | ISO 8601 | Timestamp |
-| `run_id` | string | Unique run identifier |
-| `seq` | integer | Monotonically increasing sequence number within run |
-| `parent` | int[] | Seq numbers of causal parent events. Forms a DAG. `[]` for root events. |
-| `type` | string | Event type (see table above) |
-| `phase` | string | Current PDCA phase: `plan`, `do`, `check`, `act` |
-| `agent` | string or null | Agent archetype that triggered the event |
-| `data` | object | Event-type-specific payload (see below) |
-
-### Parent Relationships (DAG)
-
-The `parent` field turns the flat event stream into a directed acyclic graph (agent call graph). This enables:
-
- **Causal reconstruction:** which agent output caused which downstream action
- **Parallel visualization:** agents sharing a parent ran concurrently
- **Blame tracking:** trace a fix back through review → draft → outline → research
-
-Rules:
- `run.start` has `parent: []` (root node)
- An agent has `parent: [seq of event that triggered it]`
- A phase transition has `parent: [seq of all completing events in prior phase]`
- A fix has `parent: [seq of the review that found the issue]`
- A decision has `parent: [seq of the agent that produced the alternatives]`
- Parallel agents share the same parent (fan-out), phase transitions collect them (fan-in)
-
-Example DAG from a writing workflow:
-```
-#1 run.start []
-├── #2 agent.complete (explorer)       [1]
-│   └── #3 decision (plot direction)   [2]
-├── #4 agent.complete (creator)        [2]     ← explorer informs creator
-├── #5 phase.transition (plan→do)      [3,4]   ← fan-in
-│   └── #6 agent.complete (maker)      [5]
-├── #7 phase.transition (do→check)     [6]
-│   ├── #8 review.verdict (guardian)   [7]     ← parallel (fan-out)
-│   └── #9 review.verdict (sage)       [7]     ← parallel (fan-out)
-├── #10 phase.transition (check→act)   [8,9]   ← fan-in
-├── #11 fix.applied (timeline)         [8]     ← caused by guardian
-├── #12 fix.applied (voice drift)      [9]     ← caused by sage
-└── #18 run.complete                   [17]
-```
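Blame tracking over the `parent` field is a plain ancestor walk over the DAG. A minimal sketch, assuming events have been loaded as dicts carrying at least `seq` and `parent`; `ancestors` is a hypothetical helper name.

```python
from typing import Dict, List, Set


def ancestors(events: List[Dict], seq: int) -> List[int]:
    """Return the seq numbers of all causal ancestors of event `seq`, sorted."""
    by_seq = {event["seq"]: event for event in events}
    seen: Set[int] = set()
    stack = list(by_seq[seq]["parent"])
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # fan-in: a parent can be reached via several paths
        seen.add(current)
        stack.extend(by_seq[current]["parent"])
    return sorted(seen)
```

Applied to the example DAG, the timeline fix (#11) traces back through the Guardian review, the Maker draft, the outline, and the research to `run.start`.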
-
-## Event Payloads by Type
-
-### `run.start`
-```json
-{
-  "task": "Write short story 'Der Huster'",
-  "workflow": "kurzgeschichte",
-  "team": "story-development",
-  "max_cycles": 2,
-  "config": {
-    "voice_profile": "vp-giesing-gschichten-v1",
-    "persona": "giesinger",
-    "target_words": 6000
-  }
-}
-```
-
-### `agent.start`
-```json
-{
-  "archetype": "story-explorer",
-  "model": "haiku",
-  "prompt_summary": "Research premise, find emotional core, suggest 3 plot directions"
-}
-```
-
-### `agent.complete`
-```json
-{
-  "archetype": "story-explorer",
-  "duration_ms": 87605,
-  "tokens": 21645,
-  "artifacts": ["docs/01-der-huster-research.md"],
-  "summary": "3 plot directions developed, recommended C (Mo krank + Koffer)"
-}
-```
-
-### `decision`
-```json
-{
-  "what": "plot_direction",
-  "chosen": "C — Mo krank + Koffer aus B",
-  "alternatives": [
-    {"id": "A", "label": "Mo ist weg", "reason_rejected": "Zu passiv für 6k-Story"},
-    {"id": "B", "label": "Huster gehört nicht Mo", "reason_rejected": "Zu Krimi-nah"}
-  ],
-  "rationale": "Stärkster emotionaler Kern, passt zum Voice Profile"
-}
-```
-
-### `review.verdict`
-```json
-{
-  "archetype": "guardian",
-  "verdict": "approved_with_fixes",
-  "findings": [
-    {"severity": "bug", "description": "Timeline: 'Montag' referenced but story starts Dienstag", "fix_required": true},
-    {"severity": "recommendation", "description": "Gentrification monologue too long for Alex register", "fix_required": false}
-  ]
-}
-```
-
-### `fix.applied`
-```json
-{
-  "source": "guardian",
-  "finding": "Timeline: Montag → Dienstag",
-  "file": "stories/01-der-huster.md",
-  "line": 302,
-  "before": "das Gegenteil von Montag",
-  "after": "das Gegenteil von Dienstag"
-}
-```
-
-### `phase.transition`
-```json
-{
-  "from": "plan",
-  "to": "do",
-  "artifacts_so_far": ["research.md", "outline.md"],
-  "notes": "Explorer recommended direction C, Creator produced 6-scene outline"
-}
-```
-
-### `cycle.boundary`
-```json
-{
-  "cycle": 1,
-  "max_cycles": 2,
-  "exit_condition": "all_approved",
-  "met": true,
-  "fixes_applied": 6,
-  "next_action": "complete"
-}
-```
-
-### `shadow.detected`
-```json
-{
-  "archetype": "story-explorer",
-  "shadow": "endless_research",
-  "trigger": "output >2000 words without recommendation",
-  "action": "correction_prompt_applied",
-  "occurrence": 1
-}
-```
-
-### `run.complete`
-```json
-{
-  "status": "completed",
-  "cycles": 1,
-  "agents_total": 5,
-  "fixes_total": 6,
-  "shadows": 0,
-  "duration_ms": 1295519,
-  "artifacts": [
-    "docs/01-der-huster-research.md",
-    "docs/01-der-huster-outline.md",
-    "stories/01-der-huster.md",
-    "docs/01-der-huster-guardian-review.md",
-    "docs/01-der-huster-sage-review.md",
-    "docs/01-der-huster-process.md"
-  ]
-}
-```
-
-## How to Emit Events
-
-During orchestration, write events using this pattern:
-
-```bash
-# Append one event to the run's JSONL file (uppercase names are placeholders)
-echo '{"ts":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","run_id":"RUN_ID","seq":SEQ,"parent":[PARENT_SEQS],"type":"TYPE","phase":"PHASE","agent":"AGENT","data":{...}}' >> .archeflow/events/RUN_ID.jsonl
-```
-
-Or use the helper script:
-
-```bash
-./lib/archeflow-event.sh RUN_ID TYPE PHASE AGENT '{"key":"value"}'
-```
-
-The orchestration skill should call the event emitter at each trigger point listed in the table above.
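For callers that are not shell scripts, the same append-only pattern might look like this in Python. It is an illustrative alternative, not the project's helper script: `format_event` and `append_event` are hypothetical names. The sketch writes all eight required schema fields and treats a missing events directory as "logging disabled".

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional


def format_event(
    run_id: str,
    seq: int,
    parent: List[int],
    type_: str,
    phase: str,
    agent: Optional[str],
    data: Dict[str, Any],
    ts: Optional[str] = None,
) -> str:
    """Serialize one event as a single JSON line (without trailing newline)."""
    if ts is None:
        ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    event = {
        "ts": ts, "run_id": run_id, "seq": seq, "parent": parent,
        "type": type_, "phase": phase, "agent": agent, "data": data,
    }
    return json.dumps(event, ensure_ascii=False)


def append_event(events_dir: Path, run_id: str, line: str) -> bool:
    """Append a line to <events_dir>/<run_id>.jsonl; return False if logging is off."""
    if not events_dir.is_dir():
        return False  # events are observation, not control flow
    with (events_dir / f"{run_id}.jsonl").open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return True
```

Opening the file in append mode keeps existing events untouched.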
-
-## Generating Reports
-
-After orchestration completes (or during, for live progress):
-
-```bash
-# Generate markdown process report
-./lib/archeflow-report.sh .archeflow/events/2026-04-03-der-huster.jsonl > docs/process-report.md
-
-# List all runs
-jq -r '[.run_id, .status, .task] | @tsv' .archeflow/events/index.jsonl
-```
-
-## Run Index
-
-After each `run.complete`, append a summary line to `.archeflow/events/index.jsonl`:
-
-```jsonl
-{"run_id":"2026-04-03-der-huster","ts":"2026-04-03T16:00:00Z","task":"Write Der Huster","workflow":"kurzgeschichte","status":"completed","cycles":1,"agents":5,"fixes":6,"duration_ms":1295519}
-```
-
-## Integration with Existing Skills
-
- **`orchestration`**: Emit events at phase transitions and after each agent
- **`shadow-detection`**: Emit `shadow.detected` when thresholds trigger
- **`autonomous-mode`**: Use `index.jsonl` for session summaries instead of a separate session log
- **`workflow-design`**: Custom workflows inherit logging automatically
-
-## Design Principles
-
-1. **Append-only.** Never modify or delete events. They are immutable facts.
-2. **Self-contained.** Each event has enough context to be understood alone (no forward references).
-3. **Cheap.** One `echo >>` per event. No database, no service, no dependencies.
-4. **Optional.** If the events directory doesn't exist, orchestration works fine without logging. Events are observation, not control flow.
--- a/skills/run/SKILL.md
+++ b/skills/run/SKILL.md