Compare commits: chore/trim...refactor/t
9 Commits
| SHA1 |
|---|
| 130c04fa58 |
| 752177528f |
| a1667633ad |
| d94688ca1b |
| c8bd55d97c |
| 55de51aabe |
| 1baaa79946 |
| 8837a359ac |
| af1f4e7da7 |
@@ -1,7 +1,7 @@
 {
 "name": "archeflow",
 "description": "Multi-agent orchestration with Jungian archetypes. PDCA quality cycles, shadow detection, git worktree isolation. Zero dependencies — works with any Claude Code session.",
-"version": "0.7.0",
+"version": "0.8.0",
 "author": {
 "name": "Chris Nennemann"
 },
@@ -14,12 +14,11 @@
 "shadow-detection", "workflows"
 ],
 "skills": [
-"run", "orchestration", "plan-phase", "do-phase", "check-phase", "act-phase",
-"shadow-detection", "attention-filters", "convergence", "artifact-routing",
-"process-log", "memory", "effectiveness", "progress",
-"colette-bridge", "git-integration", "multi-project",
-"custom-archetypes", "workflow-design", "domains", "cost-tracking",
-"templates", "autonomous-mode", "using-archeflow", "presence"
+"run", "sprint", "review", "check-phase", "act-phase",
+"shadow-detection", "memory", "progress", "presence",
+"colette-bridge", "git-integration", "multi-project", "cost-tracking",
+"custom-archetypes", "workflow-design", "domains",
+"templates", "autonomous-mode", "using-archeflow"
 ],
 "hooks": "hooks/hooks.json"
 }
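
The `skills` array in the manifest hunk above shrinks and reorders at the same time, which makes the net change hard to read line by line. A minimal Python sketch that recomputes it from the two lists as they appear in the hunk; `diff_skills` is a hypothetical helper written for this illustration, not part of ArcheFlow.

```python
# Skill names copied from the old and new "skills" arrays in the manifest hunk.
OLD_SKILLS = [
    "run", "orchestration", "plan-phase", "do-phase", "check-phase", "act-phase",
    "shadow-detection", "attention-filters", "convergence", "artifact-routing",
    "process-log", "memory", "effectiveness", "progress",
    "colette-bridge", "git-integration", "multi-project",
    "custom-archetypes", "workflow-design", "domains", "cost-tracking",
    "templates", "autonomous-mode", "using-archeflow", "presence",
]
NEW_SKILLS = [
    "run", "sprint", "review", "check-phase", "act-phase",
    "shadow-detection", "memory", "progress", "presence",
    "colette-bridge", "git-integration", "multi-project", "cost-tracking",
    "custom-archetypes", "workflow-design", "domains",
    "templates", "autonomous-mode", "using-archeflow",
]


def diff_skills(old, new):
    """Return (removed, added) skill names, each sorted alphabetically."""
    old_set, new_set = set(old), set(new)
    return sorted(old_set - new_set), sorted(new_set - old_set)


removed, added = diff_skills(OLD_SKILLS, NEW_SKILLS)
print(f"{len(OLD_SKILLS)} -> {len(NEW_SKILLS)} skills")
print("removed:", ", ".join(removed))
print("added:", ", ".join(added))
```

The eight removed names are the same eight that the README hunk lists as absorbed into `run`, and the new total agrees with the "19 skills" figure stated there.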
160
CLAUDE.md
@@ -1,71 +1,119 @@
 # archeflow — Multi-Agent Orchestration Plugin for Claude Code

-Workspace-level orchestration: parallel agent teams across project portfolios, PDCA cycles with Jungian archetype roles, sprint runner, and post-implementation review. Installed as a Claude Code plugin.
+PDCA quality cycles with Jungian archetype roles, corrective action framework, sprint runner, and post-implementation review. Zero dependencies — pure Bash + Markdown.

-## Tech Stack
-
-- **Runtime:** Bash (lib scripts) + Claude Code skill system (Markdown skills)
-- **No build step, no dependencies** — pure bash + markdown
-- **Plugin format:** Claude Code plugin (skills/, hooks/, agents/, templates/)
-
-## Key Commands
-
-```bash
-# Use via Claude Code slash commands:
-/af-sprint # Main mode: work the queue across projects
-/af-run <task> # Deep orchestration with PDCA cycles
-/af-review # Post-implementation security/quality review
-/af-status # Current run status
-/af-init # Initialize ArcheFlow in a project
-/af-score # Archetype effectiveness scores
-/af-memory # Cross-run lesson memory
-/af-report # Full process report
-/af-fanout # Colette book fanout via agents
-```
-
 ## Architecture

 ```
-skills/ Slash command implementations (one dir per skill)
-sprint/ /af-sprint — queue-driven parallel agent runner
-run/ /af-run — PDCA orchestration
+skills/ Slash commands and internal protocols (one SKILL.md per dir)
+run/ /af-run — self-contained PDCA orchestration (core skill)
+sprint/ /af-sprint — queue-driven parallel agent dispatch
 review/ /af-review — Guardian-led code review
-plan-phase/ PDCA Plan phase
-do-phase/ PDCA Do phase
-check-phase/ PDCA Check phase
-act-phase/ PDCA Act phase
+check-phase/ Shared reviewer protocol (used by run + review)
+act-phase/ Finding collection, fix routing, exit decisions
+shadow-detection/ Corrective action framework (archetype + system + policy)
 memory/ Cross-run lessons learned
-cost-tracking/ Token/cost awareness
+cost-tracking/ Token/cost awareness and budget enforcement
 domains/ Domain detection (code, writing, research)
-... ~25 skill directories
-hooks/
-hooks.json Hook definitions
-session-start/ Auto-activation on session start
-agents/ Archetype agent definitions
-explorer.md Divergent thinking, research
-creator.md Design, architecture
-maker.md Implementation
-guardian.md Security, risk, quality gates
-sage.md Wisdom, patterns, trade-offs
-skeptic.md Devil's advocate
-trickster.md Edge cases, unconventional approaches
-lib/ Bash helper scripts (git, DAG, events, progress, etc.)
+colette-bridge/ Writing context loader from colette.yaml
+multi-project/ Cross-repo orchestration with dependency DAG
+git-integration/ Per-phase commits, branch strategy, rollback
+templates/ Workflow/team bundle gallery
+autonomous-mode/ Unattended session protocol
+using-archeflow/ Session-start activation (auto-loaded via hook)
+agents/ Archetype personality definitions (one .md per archetype)
+lib/ Bash helper scripts (events, git, memory, progress, etc.)
+hooks/ Session-start hook (injects using-archeflow)
 templates/bundles/ Pre-configured workflow bundles
-docs/ Roadmap, dogfood notes, test reports
 ```

-## Domain Rules
+## Commands

-- Skills are Markdown files with frontmatter — follow existing skill format exactly
-- Agents are archetype personas — maintain their distinct voice and perspective
-- Dogfood observations go to `archeflow/.archeflow/memory/lessons.jsonl`
-- Cost tracking: prefer cheap models for bulk ops, expensive for creative/review
-- PDCA cycle order is mandatory: Plan -> Do -> Check -> Act
+| Command | Purpose |
+|---------|---------|
+| `/af-run <task>` | PDCA orchestration with full agent cycle |
+| `/af-sprint` | Work the queue across projects |
+| `/af-review` | Review existing code changes |
+| `/af-status` | Current/last run status |
+| `/af-init` | Initialize ArcheFlow in a project |
+| `/af-score` | Archetype effectiveness scores |
+| `/af-memory` | Cross-run lesson memory |
+| `/af-report` | Full process report |
+| `/af-fanout` | Colette book fanout via agents |

-## Do NOT
+## Core Concepts

-- Add runtime dependencies — this must stay zero-dependency
-- Change archetype personalities without updating all referencing skills
-- Skip the Check phase in PDCA cycles (quality gate)
-- Modify hooks.json format without testing plugin reload
-- Use ArcheFlow to orchestrate simple single-file tasks (overhead not justified)
+### PDCA Cycle
+```
+Plan (Explorer + Creator) -> Do (Maker in worktree) -> Check (Guardian first, then others) -> Act (fix, merge, or cycle)
+```
+
+### Archetypes
+Explorer (research), Creator (design), Maker (implement), Guardian (security), Skeptic (assumptions), Trickster (edge cases), Sage (quality). Each has a virtue and a shadow — see `shadow-detection` skill.
+
+### Corrective Action Framework
+Three layers, one escalation protocol:
+- **Archetype shadows** — individual agent dysfunction
+- **System shadows** — orchestration-level issues (echo chamber, tunnel vision, scope creep)
+- **Policy boundaries** — operational limits (checkpoints, budgets, circuit breakers)
+
+### Workflows
+| Risk Level | Workflow | Agents |
+|------------|----------|--------|
+| Low | `fast` | Creator -> Maker -> Guardian |
+| Medium | `standard` | Explorer + Creator -> Maker -> Guardian + Skeptic + Sage |
+| High | `thorough` | Explorer + Creator -> Maker -> All 4 reviewers |
+
+## Guardrails
+
+### DO
+
+- Keep skills self-contained. The `run` skill needs zero prerequisites — it was consolidated for a reason.
+- Write skills as operational instructions Claude can follow, not software specifications.
+- Use tables for reference data, numbered steps for protocols.
+- Emit events via `./lib/archeflow-event.sh` — but never let logging block orchestration.
+- Maintain the corrective action framework when adding new agent types.
+- Test skill changes by running `/af-run --dry-run` and verifying the flow.
+- Keep archetype personalities distinct — each agent definition in `agents/` has a specific voice.
+
+### DO NOT
+
+- **Add runtime dependencies.** This must stay zero-dependency (Bash + Markdown only).
+- **Bloat skills back up.** The consolidation from 27 to ~15 skills was intentional. Do not create new skills for internal implementation details — inline them.
+- **Write bash pseudo-code in skills.** Skills are Claude instructions, not shell scripts. Use one-liner commands or lib script references, not multi-line bash blocks.
+- **Duplicate protocol definitions.** Finding format lives in `check-phase`. Routing table lives in `act-phase`. Shadow detection lives in `shadow-detection`. One source of truth per concept.
+- **Skip the Check phase** in PDCA cycles. It's the quality gate.
+- **Change archetype personalities** without updating all referencing skills and agent definitions.
+- **Use ArcheFlow for trivial tasks.** Single-file fixes, config changes, questions — just do them directly.
+- **Let skills exceed ~200 lines.** If a skill is growing past this, it probably needs splitting or the content belongs in a lib script.
+
+### Skill Writing Rules
+
+1. **Frontmatter**: `name` (kebab-case), `description` (one-liner + `<example>` tags for user-invocable skills)
+2. **Structure**: Imperative voice. Lead with what to do, not why. Tables > prose. Steps > paragraphs.
+3. **Agent templates**: Keep Agent() spawn templates concise. Include only the prompt, subagent_type, and isolation mode.
+4. **Cross-references**: Use `archeflow:<skill-name>` backtick syntax to reference other skills. Avoid circular dependencies.
+5. **Bash commands**: One-liners only in skills. Multi-step logic belongs in `lib/` scripts.
+
+### Cost Awareness
+
+- Prefer cheap models (haiku) for analytical tasks (validation, diff scoring)
+- Use capable models (sonnet/opus) for creative tasks (writing, complex design)
+- Budget enforcement via `cost-tracking` skill and `.archeflow/config.yaml`
+- Track token spend per agent in events for post-run analysis
+
+### Git Rules
+
+- Signing: `git config gpg.format ssh`, key at `~/.ssh/id_ed25519_dev.pub`
+- Push: `GIT_SSH_COMMAND="ssh -i /home/c/.ssh/id_ed25519_dev -o IdentitiesOnly=yes" git push origin main`
+- Conventional commits: `feat:`, `fix:`, `chore:`, `docs:`, `refactor:`
+- No Co-Authored-By trailers
+- All work on worktree branches until explicitly merged
+- Merges use `--no-ff` (individually revertable)
+
+## Dogfooding
+
+When using ArcheFlow to develop ArcheFlow itself:
+- Log observations to `.archeflow/memory/lessons.jsonl`
+- Note friction points, shadow false positives, skill gaps
+- Test skill changes with `/af-run --dry-run` before committing
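
The Workflows table added to CLAUDE.md in the hunk above maps a risk level to a workflow name and an agent lineup. A minimal Python sketch of that lookup, under stated assumptions: `Workflow`, `select_workflow`, and `describe` are hypothetical names, and expanding "All 4 reviewers" to Guardian, Skeptic, Sage, and Trickster is inferred from the Check phase agents listed in the README tree, not stated in the table itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    name: str
    plan: tuple   # agents that run in the Plan phase
    do: tuple     # agents that run in the Do phase
    check: tuple  # reviewers in the Check phase, Guardian first


# Rows transcribed from the Workflows table; the "high" reviewer list is an assumption.
WORKFLOWS = {
    "low": Workflow("fast", ("Creator",), ("Maker",), ("Guardian",)),
    "medium": Workflow("standard", ("Explorer", "Creator"), ("Maker",),
                       ("Guardian", "Skeptic", "Sage")),
    "high": Workflow("thorough", ("Explorer", "Creator"), ("Maker",),
                     ("Guardian", "Skeptic", "Sage", "Trickster")),
}


def select_workflow(risk: str) -> Workflow:
    """Look up the workflow for a risk level, ignoring case and padding."""
    try:
        return WORKFLOWS[risk.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown risk level: {risk!r}") from None


def describe(wf: Workflow) -> str:
    """Render the lineup in the same arrow notation the table uses."""
    return " -> ".join(" + ".join(stage) for stage in (wf.plan, wf.do, wf.check))


print(describe(select_workflow("medium")))
```

Keeping the three phases as separate tuples, rather than one flat list, preserves the distinction between agents that run together (joined with `+`) and phases that run in sequence (joined with `->`).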
89
README.md
@@ -146,61 +146,51 @@ Shadow detection is quantitative, not vibes. Explorer output exceeding 2000 word

 ## Skills Reference

-ArcheFlow ships with 24 skills organized by function.
+ArcheFlow ships with 19 skills organized by function. The `run` skill is self-contained -- no prerequisites needed.

 ### Core Orchestration

 | Skill | Description |
 |-------|-------------|
-| `archeflow:run` | Automated PDCA execution loop -- single-command orchestration with `--start-from`, `--dry-run`, and cycle-back |
-| `archeflow:orchestration` | Step-by-step PDCA execution guide for manual orchestration |
-| `archeflow:plan-phase` | Explorer and Creator output formats and protocols |
-| `archeflow:do-phase` | Maker implementation rules and worktree commit strategy |
-| `archeflow:check-phase` | Shared reviewer protocols and output format |
-| `archeflow:act-phase` | Post-Check decision logic: collect findings, route fixes, exit or cycle |
+| `archeflow:run` | Self-contained PDCA orchestration -- Plan/Do/Check/Act with adaptation rules, pipeline strategy, and cycle-back |
+| `archeflow:sprint` | Queue-driven parallel agent dispatch across projects (primary mode) |
+| `archeflow:review` | Guardian-led code review on diff/branch/commit range |
+| `archeflow:check-phase` | Shared reviewer protocol -- finding format, evidence requirements, attention filters |
+| `archeflow:act-phase` | Finding collection, fix routing, exit decisions |

 ### Quality and Safety

 | Skill | Description |
 |-------|-------------|
-| `archeflow:shadow-detection` | Quantitative dysfunction detection and automatic correction |
-| `archeflow:attention-filters` | Context optimization per archetype -- each agent gets only what it needs |
-| `archeflow:convergence` | Detects convergence, stalling, and oscillation in multi-cycle runs |
-| `archeflow:artifact-routing` | Inter-phase artifact protocol -- naming, storage, routing, archiving |
-
-### Process Intelligence
-
-| Skill | Description |
-|-------|-------------|
-| `archeflow:process-log` | Event-sourced JSONL logging with DAG parent relationships |
+| `archeflow:shadow-detection` | Corrective action framework -- archetype shadows, system shadows, policy boundaries |
 | `archeflow:memory` | Cross-run memory that learns recurring findings and injects lessons |
-| `archeflow:effectiveness` | Archetype scoring on signal-to-noise, fix rate, cost efficiency |
-| `archeflow:progress` | Live progress file watchable from a second terminal |

 ### Integration

 | Skill | Description |
 |-------|-------------|
 | `archeflow:colette-bridge` | Bridges ArcheFlow with the Colette writing platform |
-| `archeflow:git-integration` | Git-per-phase commits, branch-per-run, rollback to any phase boundary |
+| `archeflow:git-integration` | Per-phase commits, branch-per-run, rollback |
 | `archeflow:multi-project` | Cross-repo orchestration with dependency DAG and shared budget |
+| `archeflow:cost-tracking` | Budget enforcement, per-agent cost aggregation, model tier recommendations |

 ### Configuration

 | Skill | Description |
 |-------|-------------|
+| `archeflow:domains` | Domain adapters for writing, research, and non-code workflows |
 | `archeflow:custom-archetypes` | Create domain-specific roles (database reviewer, compliance auditor, etc.) |
-| `archeflow:workflow-design` | Design custom workflows with per-phase archetype assignment and exit conditions |
-| `archeflow:domains` | Domain adapters for writing, research, and other non-code workflows |
-| `archeflow:cost-tracking` | Budget enforcement, per-agent cost aggregation, model tier recommendations |
+| `archeflow:workflow-design` | Design custom workflows with per-phase archetype assignment |
 | `archeflow:templates` | Template gallery for sharing workflows, teams, and setup bundles |
-| `archeflow:autonomous-mode` | Unattended overnight sessions with progress logging and safe stopping |
+| `archeflow:autonomous-mode` | Unattended sessions with corrective action checkpoints |
+| `archeflow:progress` | Live progress file watchable from a second terminal |
+| `archeflow:presence` | User-facing output format -- show outcomes, not mechanics |

 ### Meta

 | Skill | Description |
 |-------|-------------|
-| `archeflow:using-archeflow` | Session-start skill -- activation criteria, workflow selection, quick reference |
+| `archeflow:using-archeflow` | Session-start activation -- decision tree, workflow selection, commands |

 ## Library Scripts

@@ -341,47 +331,28 @@ archetypes: [explorer, creator, maker, guardian, db-specialist]

 ```
 archeflow/
-├── .claude-plugin/plugin.json # Plugin manifest (v0.5.0)
+├── .claude-plugin/plugin.json # Plugin manifest
 ├── agents/ # 7 archetype personas (behavioral protocols)
-│ ├── explorer.md # Plan: research and context mapping
-│ ├── creator.md # Plan: solution design and proposals
-│ ├── maker.md # Do: implementation in isolated worktree
-│ ├── guardian.md # Check: security and reliability review
-│ ├── skeptic.md # Check: assumption challenging
-│ ├── trickster.md # Check: adversarial testing
-│ └── sage.md # Check: holistic quality review
-├── skills/ # 24 behavioral skills
-│ ├── run/ # Automated PDCA loop
-│ ├── orchestration/ # Manual PDCA execution guide
-│ ├── plan-phase/ # Plan protocols
-│ ├── do-phase/ # Do protocols
-│ ├── check-phase/ # Check protocols
-│ ├── act-phase/ # Act phase decision logic
-│ ├── shadow-detection/ # Dysfunction detection
-│ ├── attention-filters/ # Context optimization
-│ ├── convergence/ # Cycle convergence detection
-│ ├── artifact-routing/ # Inter-phase artifact protocol
-│ ├── process-log/ # Event-sourced JSONL logging
+│ ├── explorer.md, creator.md # Plan phase agents
+│ ├── maker.md # Do phase agent
+│ └── guardian.md, skeptic.md, # Check phase agents
+│ trickster.md, sage.md
+├── skills/ # 19 skills (consolidated from 27)
+│ ├── run/ # Self-contained PDCA orchestration (core)
+│ ├── sprint/ # Queue-driven parallel agent dispatch
+│ ├── review/ # Guardian-led code review
+│ ├── check-phase/ # Shared reviewer protocol + attention filters
+│ ├── act-phase/ # Finding collection + fix routing
+│ ├── shadow-detection/ # Corrective action framework (3 layers)
 │ ├── memory/ # Cross-run learning
-│ ├── effectiveness/ # Archetype scoring
-│ ├── progress/ # Live progress file
-│ ├── colette-bridge/ # Colette writing platform bridge
-│ ├── git-integration/ # Per-phase git commits
-│ ├── multi-project/ # Cross-repo orchestration
-│ ├── custom-archetypes/ # Domain-specific roles
-│ ├── workflow-design/ # Custom workflow design
-│ ├── domains/ # Domain adapters
-│ ├── cost-tracking/ # Budget and cost management
-│ ├── templates/ # Template gallery
-│ ├── autonomous-mode/ # Unattended sessions
-│ └── using-archeflow/ # Session-start activation
-├── lib/ # 8 shell scripts (process infrastructure)
+│ └── ... # + 12 config/integration skills
+├── lib/ # 10 shell scripts (events, git, memory, etc.)
 ├── hooks/ # Auto-activation (SessionStart)
 ├── examples/ # Walkthroughs, templates, custom archetypes
 └── docs/ # Roadmap, changelog
 ```

-The flow: skills define behavioral rules (what agents should do), agents define personas (how they think), lib scripts handle tooling (event logging, git, reporting), and hooks wire it all together at session start. Events are emitted at every phase transition, forming a DAG that can be rendered, reported, or scored after the run.
+Skills define behavioral rules, agents define personas, lib scripts handle tooling, hooks wire it together at session start. The `run` skill is self-contained -- it absorbed 8 previously separate skills (orchestration, plan-phase, do-phase, artifact-routing, process-log, convergence, effectiveness, attention-filters) into one 459-line operational guide.

 ## Philosophy

@@ -1,289 +0,0 @@
---
name: artifact-routing
description: |
  Inter-phase artifact protocol for ArcheFlow runs. Defines how artifacts are named, stored,
  routed between agents, and archived across PDCA cycles. Ensures each agent receives exactly
  the context it needs — no more, no less.
  <example>Automatically loaded by archeflow:run</example>
  <example>User: "What does the Maker receive as context?"</example>
---

# Artifact Routing — Inter-Phase Context Protocol

Every ArcheFlow run produces artifacts — research notes, proposals, diffs, reviews, feedback. This skill defines how those artifacts are named, where they live, what each agent receives, and how they are preserved across cycles.

## Artifact Directory Structure

```
.archeflow/artifacts/<run_id>/
├── plan-explorer.md # Explorer research output
├── plan-creator.md # Creator proposal/outline
├── do-maker.md # Maker implementation summary
├── do-maker-files.txt # List of files created/modified (one path per line)
├── check-guardian.md # Guardian review verdict + findings
├── check-sage.md # Sage review (if present)
├── check-skeptic.md # Skeptic review (if present)
├── check-trickster.md # Trickster review (if present)
├── act-feedback.md # Structured feedback for next cycle (Cycle Feedback Protocol)
├── act-fixes.jsonl # Applied fixes log (one JSON line per fix)
├── cycle-1/ # Archived artifacts from cycle 1
│ ├── plan-explorer.md
│ ├── plan-creator.md
│ ├── do-maker.md
│ ├── do-maker-files.txt
│ ├── check-guardian.md
│ ├── check-sage.md
│ └── act-feedback.md
└── cycle-2/ # Archived artifacts from cycle 2 (if cycle 3 starts)
    └── ...
```

## Naming Convention

Artifacts follow the pattern: `<phase>-<agent>.<ext>`

| Phase | Agent | Filename | Format |
|-------|-------|----------|--------|
| plan | explorer | `plan-explorer.md` | Markdown research report |
| plan | creator | `plan-creator.md` | Markdown proposal with confidence scores |
| plan | mini-explorer | `plan-mini-explorer.md` | Focused risk research (only if confidence gate triggers) |
| do | maker | `do-maker.md` | Markdown implementation summary |
| do | maker | `do-maker-files.txt` | Plain text, one file path per line |
| check | guardian | `check-guardian.md` | Markdown verdict + findings table |
| check | sage | `check-sage.md` | Markdown verdict + findings table |
| check | skeptic | `check-skeptic.md` | Markdown verdict + findings table |
| check | trickster | `check-trickster.md` | Markdown verdict + findings table |
| act | (orchestrator) | `act-feedback.md` | Structured feedback (see Cycle Feedback Protocol) |
| act | (orchestrator) | `act-fixes.jsonl` | JSONL fix log |

**Rule:** Never invent new artifact names during a run. If a reviewer is skipped (A2 fast-path, reviewer profile), its artifact simply does not exist. Downstream phases check for file existence before reading.
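
The naming pattern and the "check for file existence before reading" rule in the removed skill can be sketched in a few lines of Python. This is an illustration under assumptions, not the skill's implementation: `artifact_path` and `existing_reviews` are hypothetical helpers, and only the directory layout and filenames come from the text.

```python
import tempfile
from pathlib import Path

ARTIFACT_ROOT = Path(".archeflow/artifacts")
REVIEWERS = ("guardian", "sage", "skeptic", "trickster")


def artifact_path(run_id: str, phase: str, agent: str, ext: str = "md") -> Path:
    """Build <root>/<run_id>/<phase>-<agent>.<ext> following the naming pattern."""
    return ARTIFACT_ROOT / run_id / f"{phase}-{agent}.{ext}"


def existing_reviews(run_dir: Path) -> dict:
    """Map reviewer name to path for the check-*.md artifacts that exist.

    A skipped reviewer has no file, so it is left out rather than treated as an error.
    """
    found = {}
    for reviewer in REVIEWERS:
        path = run_dir / f"check-{reviewer}.md"
        if path.is_file():
            found[reviewer] = path
    return found


# Demo in a throwaway directory: only Guardian produced a review.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    (run_dir / "check-guardian.md").write_text("verdict: pass\n")
    print(sorted(existing_reviews(run_dir)))
```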

---

## Context Injection Rules

Each agent receives a filtered subset of artifacts. This is the **attention filter** — it controls what context is injected into the agent's prompt.

### Plan Phase

| Agent | Receives | Does NOT receive |
|-------|----------|-----------------|
| **Explorer** | Task description, relevant file paths, codebase access | Prior proposals, review outputs, implementation details |
| **Creator** (cycle 1) | Task description, `plan-explorer.md` (if exists) | Raw file contents (Explorer summarized them), git diffs |
| **Creator** (cycle 2+) | Task description, `plan-explorer.md`, `act-feedback.md` (Creator-routed findings only) | Raw reviewer outputs, Maker-routed findings |

**Creator context injection template (cycle 2+):**
```markdown
## Task
<task description>

## Research (from Explorer)
<contents of plan-explorer.md>

## Feedback from Prior Cycle
<Creator-routed section of act-feedback.md only>

Note: Address each unresolved issue listed above. Explain how your revised proposal resolves it.
```

### Do Phase

| Agent | Receives | Does NOT receive |
|-------|----------|-----------------|
| **Maker** (cycle 1) | `plan-creator.md` (the proposal), `plan-mini-explorer.md` (if exists) | `plan-explorer.md`, reviewer outputs, raw task description |
| **Maker** (cycle 2+) | `plan-creator.md`, `plan-mini-explorer.md` (if exists), Maker-routed findings from `act-feedback.md` | Explorer research, Guardian/Skeptic findings (those went to Creator) |

**Maker context injection template (cycle 2+):**
```markdown
## Proposal
<contents of plan-creator.md>

## Implementation Feedback from Prior Cycle
<Maker-routed section of act-feedback.md only>

Note: The proposal has been revised to address design-level issues. Focus on the implementation
feedback items above (code quality, test gaps, consistency).
```

**Why Maker doesn't get Explorer output:** The Creator already distilled Explorer's research into a concrete proposal. Giving Maker raw research causes scope creep and "Rogue" shadow activation.

### Check Phase

| Agent | Receives | Does NOT receive |
|-------|----------|-----------------|
| **Guardian** | Maker's git diff, risk section from `plan-creator.md` | Full proposal, Explorer research, other reviewer outputs |
| **Skeptic** | `plan-creator.md` (assumptions focus) | Git diff details, Explorer research, other reviewer outputs |
| **Sage** | `plan-creator.md`, Maker's git diff, `do-maker.md` | Explorer research, other reviewer outputs |
| **Trickster** | Maker's git diff only | Everything else |

**Guardian context injection template:**
```markdown
## Changes to Review
<git diff from Maker's branch>

## Risk Assessment (from proposal)
<risks section extracted from plan-creator.md>

Review these changes for security, reliability, breaking changes, and dependency risks.
```

**Skeptic context injection template:**
```markdown
## Proposal to Challenge
<contents of plan-creator.md>

Focus on assumptions, alternatives not considered, edge cases, and scalability.
```

**Sage context injection template:**
```markdown
## Proposal
<contents of plan-creator.md>

## Implementation Summary
<contents of do-maker.md>

## Changes
<git diff from Maker's branch>

Evaluate code quality, test coverage, documentation, and codebase consistency.
```

**Trickster context injection template:**
```markdown
## Changes to Attack
<git diff from Maker's branch>

Try to break this. Malformed input, boundaries, concurrency, error paths, dependency failures.
```
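
The Check phase table is effectively an allow-list: each reviewer sees only the inputs named in its row. A minimal Python sketch of that filter, assuming the orchestrator holds the available context in a dictionary; the key names (`maker_diff`, `proposal`, `proposal_risks`, `maker_summary`) and the `filter_context` helper are hypothetical.

```python
# Allow-list per reviewer, transcribed from the Check phase table.
# Key names are illustrative labels, not identifiers used by ArcheFlow.
CHECK_CONTEXT = {
    "guardian": ("maker_diff", "proposal_risks"),
    "skeptic": ("proposal",),
    "sage": ("proposal", "maker_summary", "maker_diff"),
    "trickster": ("maker_diff",),
}


def filter_context(agent: str, available: dict) -> dict:
    """Return only the context entries the given reviewer is allowed to see."""
    allowed = CHECK_CONTEXT[agent]
    return {key: available[key] for key in allowed if key in available}


everything = {
    "maker_diff": "<git diff>",
    "proposal": "<plan-creator.md>",
    "proposal_risks": "<risks section>",
    "maker_summary": "<do-maker.md>",
    "explorer_research": "<plan-explorer.md>",
}
print(list(filter_context("guardian", everything)))
```

An allow-list fails closed: a context key added later reaches no reviewer until a row names it, which matches the "Does NOT receive" column more safely than a deny-list would.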

### Act Phase

No agents are spawned in Act. The orchestrator reads all `check-*.md` artifacts directly.

---

## Feedback Routing

> **This is the canonical routing table.** Other skills (orchestration, act-phase) must match this table exactly. When updating routing rules, update this table first, then sync the others.

When building `act-feedback.md` after the Check phase, route each finding to the right agent for the next cycle:

| Finding Source | Finding Category | Routes To | Rationale |
|---------------|-----------------|-----------|-----------|
| Guardian | security, breaking-change | **Creator** | Design must change |
| Guardian | reliability, dependency | **Creator** | Architectural decision needed |
| Skeptic | design, scalability | **Creator** | Assumptions need revision |
| Sage | quality, consistency | **Maker** | Implementation refinement |
| Sage | testing | **Maker** | Test gap, not design flaw |
| Trickster | reliability (design flaw) | **Creator** | Needs redesign |
| Trickster | reliability (test gap) | **Maker** | Needs more tests |
| Trickster | testing | **Maker** | Edge case not covered |

**Disambiguation rule:** When in doubt: if the fix requires changing the approach, route to Creator. If it requires changing the code within the existing approach, route to Maker.
|
|
||||||
### Feedback File Format
|
|
||||||
|
|
||||||
`act-feedback.md` is split into two sections so each agent can be given only its portion:
|
|
||||||
|
|
||||||
```markdown
# Cycle <N> Feedback

## Creator-Routed Issues
| # | Source | Severity | Category | Issue | Suggested Fix |
|---|--------|----------|----------|-------|---------------|
| 1 | Guardian | CRITICAL | security | SQL injection in user input | Add parameterized queries |
| 2 | Skeptic | WARNING | design | Assumes single-tenant only | Add tenant isolation |

## Maker-Routed Issues
| # | Source | Severity | Category | Issue | Suggested Fix |
|---|--------|----------|----------|-------|---------------|
| 3 | Sage | WARNING | quality | Test names don't describe behavior | Rename to describe expected outcome |
| 4 | Sage | INFO | consistency | Import order doesn't match codebase style | Re-order imports |

## Resolved (from prior cycles)
| # | Source | Issue | Resolution | Resolved In |
|---|--------|-------|------------|-------------|
| 1 | Guardian | Missing rate limit | Added rate limiter middleware | Cycle 1 |

## Convergence Warnings
<any finding that appeared unresolved in 2+ consecutive cycles; requires user input>
```
|
|
||||||
|
|
||||||
When injecting feedback into Creator's prompt, include **only** the "Creator-Routed Issues" section.
|
|
||||||
When injecting feedback into Maker's prompt, include **only** the "Maker-Routed Issues" section.
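
One way to pull a single routed section out of `act-feedback.md` is a small `awk` filter keyed on the `##` headings. This is a sketch under the assumption that the file follows the format shown above; the function name `extract_section` is hypothetical.

```bash
# Hypothetical helper: print one "## <heading>" section of a feedback file,
# from its heading line up to (not including) the next "## " heading.
# $1 = heading text without the leading "## ", $2 = path to act-feedback.md
extract_section() {
  awk -v heading="## $1" '
    $0 == heading { inside = 1; print; next }  # start of the wanted section
    /^## /        { inside = 0 }               # any other level-2 heading ends it
    inside        { print }
  ' "$2"
}
```

Calling `extract_section "Creator-Routed Issues" act-feedback.md` would yield only the Creator portion, which can then be placed in Creator's prompt.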
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Cycle Archiving
|
|
||||||
|
|
||||||
When a PDCA cycle completes and a new cycle begins, archive the current artifacts so they are preserved and the working directory is clean for the next iteration.
|
|
||||||
|
|
||||||
### Archive Procedure
|
|
||||||
|
|
||||||
At the end of each cycle (before starting the next):
|
|
||||||
|
|
||||||
```bash
# RUN_ID and CYCLE must be set by the orchestrator; abort early if either is missing.
RUN_DIR=".archeflow/artifacts/${RUN_ID:?RUN_ID is not set}"
ARCHIVE_DIR="${RUN_DIR}/cycle-${CYCLE:?CYCLE is not set}"

mkdir -p "$ARCHIVE_DIR"

# Copy all phase artifacts to the archive.
# A pattern with no matching files is expected in some workflows and is not an error.
cp "${RUN_DIR}"/plan-*.md "$ARCHIVE_DIR/" 2>/dev/null || true
cp "${RUN_DIR}"/do-*.md "$ARCHIVE_DIR/" 2>/dev/null || true
cp "${RUN_DIR}"/do-*.txt "$ARCHIVE_DIR/" 2>/dev/null || true
cp "${RUN_DIR}"/check-*.md "$ARCHIVE_DIR/" 2>/dev/null || true
cp "${RUN_DIR}"/act-feedback.md "$ARCHIVE_DIR/" 2>/dev/null || true
```
|
|
||||||
|
|
||||||
**Do NOT delete** the working-level artifacts after archiving. The next cycle's agents need `act-feedback.md` and `plan-explorer.md` (Explorer cache may reuse prior research). Old artifacts in the working directory get overwritten when the new cycle's agents produce their outputs.
|
|
||||||
|
|
||||||
### Archive Access
|
|
||||||
|
|
||||||
Archived artifacts are read-only references. Use them for:
|
|
||||||
- **Resolution tracking:** Compare `cycle-1/check-guardian.md` findings against `cycle-2/check-guardian.md` to detect resolved/persisting issues
|
|
||||||
- **Convergence detection:** Same finding in `cycle-N/act-feedback.md` and `cycle-N+1/act-feedback.md` → escalate to user
|
|
||||||
- **Post-hoc analysis:** Understanding how a solution evolved across cycles
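
Convergence detection across two archived feedback files could look roughly like the sketch below. It assumes the findings tables use the column order shown in the feedback file format (`# | Source | Severity | Category | Issue | Suggested Fix`) and that a persisting finding keeps the same Source and Issue text between cycles; both function names are hypothetical.

```bash
# Hypothetical helper: print "Source<TAB>Issue" for every open finding row.
# Rows are recognised by a severity value in the third table column, which
# skips header rows, separator rows, and the "Resolved" table.
open_findings() {
  awk -F'|' '
    {
      sev = $4; gsub(/^[ \t]+|[ \t]+$/, "", sev)
      if (sev == "CRITICAL" || sev == "WARNING" || sev == "INFO") {
        src = $3;   gsub(/^[ \t]+|[ \t]+$/, "", src)
        issue = $6; gsub(/^[ \t]+|[ \t]+$/, "", issue)
        print src "\t" issue
      }
    }
  ' "$1" | sort -u
}

# Hypothetical helper: print findings present in both feedback files.
# $1 = cycle-N/act-feedback.md, $2 = cycle-N+1/act-feedback.md
persisting_findings() {
  prev=$(mktemp)
  cur=$(mktemp)
  open_findings "$1" > "$prev"
  open_findings "$2" > "$cur"
  comm -12 "$prev" "$cur"   # lines common to both sorted lists
  rm -f "$prev" "$cur"
}
```

Any output from `persisting_findings` is a candidate for the "Convergence Warnings" section. Matching on exact Issue text is deliberately strict; a reworded finding would not be detected.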
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Artifact Existence Checks
|
|
||||||
|
|
||||||
Before injecting an artifact into an agent's context, always check if the file exists. Missing artifacts are expected in certain workflows:
|
|
||||||
|
|
||||||
| Artifact | Missing when |
|----------|-------------|
| `plan-explorer.md` | Fast workflow (no Explorer) |
| `plan-mini-explorer.md` | Confidence gate did not trigger for risk coverage |
| `check-skeptic.md` | Fast workflow, or A2 fast-path taken |
| `check-sage.md` | Fast workflow, or A2 fast-path taken |
| `check-trickster.md` | Non-thorough workflow, or A2 fast-path taken |
| `act-feedback.md` | Cycle 1 (no prior feedback exists) |
| `act-fixes.jsonl` | Cycle 1, or no fixes applied |
|
|
||||||
|
|
||||||
**Rule:** Never fail because an optional artifact is missing. Check existence, skip injection if absent, and note what was skipped in the event data.
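
The rule can be sketched as a guard around each injection. The function name `inject_if_present` and the plain-text skip log are assumptions; the skip log stands in for whatever event data the orchestrator actually records.

```bash
# Hypothetical helper: emit an artifact for prompt injection if it exists.
# A missing optional artifact is noted in a skip log and is never an error.
# $1 = artifact path, $2 = skip log path (assumed stand-in for the event data)
inject_if_present() {
  if [ -f "$1" ]; then
    cat "$1"
  else
    printf 'skipped\t%s\n' "$1" >> "$2"
  fi
  return 0
}
```

The function always returns success, so a missing `check-trickster.md` in a standard workflow cannot abort prompt construction.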
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Git Diff as Artifact
|
|
||||||
|
|
||||||
The Maker's git diff is not saved as a file — it is generated on-the-fly from the Maker's worktree branch:
|
|
||||||
|
|
||||||
```bash
git diff main...<maker-branch>
```
|
|
||||||
|
|
||||||
This ensures reviewers always see the actual current diff, not a stale snapshot. The diff is injected directly into reviewer prompts, not saved to disk.
|
|
||||||
|
|
||||||
Exception: `do-maker-files.txt` IS saved to disk (just the file list, not the full diff) for quick reference by the orchestrator and for archiving purposes.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Design Principles
|
|
||||||
|
|
||||||
1. **Minimal context per agent.** Each agent gets only what it needs. Over-injection causes distraction, shadow activation, and wasted tokens.
|
|
||||||
2. **Artifacts are the handoff mechanism.** Agents never communicate directly. All inter-agent data flows through saved artifacts.
|
|
||||||
3. **Files over memory.** Everything is on disk. If a session crashes, artifacts survive. A `--start-from` resume reads artifacts, not session state.
|
|
||||||
4. **Overwrite, don't accumulate.** Working-level artifacts get overwritten each cycle. Archives preserve history. This keeps the working directory simple.
|
|
||||||
5. **Check before inject.** Always verify artifact existence. Gracefully handle missing optional artifacts.
|
|
||||||
@@ -1,121 +0,0 @@
|
|||||||
---
|
|
||||||
name: attention-filters
|
|
||||||
description: Use when spawning archetype agents to decide what context each agent receives. Reduces token waste and sharpens focus by passing only relevant artifacts.
|
|
||||||
---
|
|
||||||
|
|
||||||
# Attention Filters
|
|
||||||
|
|
||||||
Each archetype needs different context. Pass only what's relevant — not everything.
|
|
||||||
|
|
||||||
| Archetype | Receives | Does NOT Receive |
|-----------|----------|-----------------|
| Explorer | Task description, codebase access | Prior proposals or reviews |
| Creator | Explorer's research + task description | Implementation details |
| Maker | Creator's proposal | Explorer's research, reviews |
| Guardian | Maker's git diff + proposal risk section | Explorer's research |
| Skeptic | Creator's proposal (focus: assumptions) | Git diff details |
| Trickster | Maker's git diff only | Everything else |
| Sage | Proposal + implementation + diff | Explorer's raw research |
|
|
||||||
|
|
||||||
## Why This Matters
|
|
||||||
|
|
||||||
- **Token cost:** A Guardian reading the Explorer's 2000-word research wastes ~2600 tokens on irrelevant context
|
|
||||||
- **Focus:** An agent with too much context drifts from its archetype's concern
|
|
||||||
- **Shadow prevention:** Over-loading context encourages rabbit-holing (Explorer) and scope creep (Maker)
|
|
||||||
|
|
||||||
## In Practice
|
|
||||||
|
|
||||||
When spawning a Check-phase agent, include only the filtered context in the prompt:
|
|
||||||
|
|
||||||
```
|
|
||||||
# Guardian receives:
|
|
||||||
"Review these changes: <git diff output>
|
|
||||||
The proposal identified these risks: <risks section only>
|
|
||||||
Verdict: APPROVED or REJECTED with findings."
|
|
||||||
|
|
||||||
# NOT:
|
|
||||||
"Here is the full research, the full proposal, the full implementation,
|
|
||||||
the full git log, and everything else we have..."
|
|
||||||
```
|
|
||||||
|
|
||||||
## Prompt Construction Templates
|
|
||||||
|
|
||||||
### Explorer
|
|
||||||
- **Receives:** Task description, file tree (max 200 lines), prior-cycle feedback (if cycle 2+)
|
|
||||||
- **Excludes:** Creator proposals, Maker diffs, reviewer outputs
|
|
||||||
- **Token target:** ~2000 tokens input
|
|
||||||
|
|
||||||
### Creator
|
|
||||||
- **Receives:** Task description, Explorer research (if available), prior-cycle feedback (if cycle 2+)
|
|
||||||
- **Excludes:** Maker diffs, reviewer outputs
|
|
||||||
- **Token target:** ~3000 tokens input
|
|
||||||
|
|
||||||
### Maker
|
|
||||||
- **Receives:** Creator's proposal (full), test strategy section, file list
|
|
||||||
- **Excludes:** Explorer research, reviewer outputs, prior-cycle feedback
|
|
||||||
- **Token target:** ~2500 tokens input
|
|
||||||
|
|
||||||
### Guardian
|
|
||||||
- **Receives:** Maker's git diff, proposal risk section, test results
|
|
||||||
- **Excludes:** Explorer research, Creator rationale, Skeptic/Sage outputs
|
|
||||||
- **Token target:** ~2000 tokens input
|
|
||||||
|
|
||||||
### Skeptic
|
|
||||||
- **Receives:** Creator's proposal (assumptions + architecture decision), confidence scores
|
|
||||||
- **Excludes:** Git diff details, Explorer raw research, other reviewer outputs
|
|
||||||
- **Token target:** ~1500 tokens input
|
|
||||||
|
|
||||||
### Trickster
|
|
||||||
- **Receives:** Maker's git diff only, attack surface summary (file types + entry points)
|
|
||||||
- **Excludes:** Proposal, research, other reviewer outputs
|
|
||||||
- **Token target:** ~1500 tokens input
|
|
||||||
|
|
||||||
### Sage
|
|
||||||
- **Receives:** Creator's proposal, Maker's implementation summary + diff, test results
|
|
||||||
- **Excludes:** Explorer raw research, other reviewer verdicts
|
|
||||||
- **Token target:** ~2500 tokens input
|
|
||||||
|
|
||||||
## Token Budget Targets
|
|
||||||
|
|
||||||
| Archetype | Fast | Standard | Thorough |
|-----------|------|----------|----------|
| Explorer | skip | 2000 | 3000 |
| Creator | 2000 | 3000 | 4000 |
| Maker | 2000 | 2500 | 3000 |
| Guardian | 1500 | 2000 | 2500 |
| Skeptic | skip | 1500 | 2000 |
| Trickster | skip | skip | 1500 |
| Sage | skip | 2500 | 3000 |
|
|
||||||
|
|
||||||
"skip" means the archetype is not spawned in that workflow tier.
|
|
||||||
|
|
||||||
## Cycle-Back Filtering
|
|
||||||
|
|
||||||
When injecting prior-cycle feedback into cycle 2+:
|
|
||||||
|
|
||||||
1. **Summary only** — pass the structured feedback table (issue, source, severity), not full reviewer artifacts
|
|
||||||
2. **Strip resolved items** — if a finding was marked Fixed in the Act phase, exclude it
|
|
||||||
3. **Compress context** — prior proposal diffs reduce to "What Changed" section only (not full re-proposal)
|
|
||||||
4. **Cap at 500 tokens** — if feedback exceeds this, summarize by severity (CRITICAL first, then WARNING, drop INFO)
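
Rule 4 can be approximated with two `grep` passes once the feedback is known to exceed the cap. This sketch assumes findings rows carry their severity as a `| CRITICAL |` or `| WARNING |` cell, as in the feedback tables; the name `compress_feedback` is hypothetical.

```bash
# Hypothetical helper: reduce a feedback table to CRITICAL rows followed by
# WARNING rows, dropping INFO rows.
# $1 = file containing findings-table rows
compress_feedback() {
  grep -F '| CRITICAL |' "$1" || true   # no CRITICAL rows is not an error
  grep -F '| WARNING |' "$1" || true    # no WARNING rows is not an error
}
```

The output keeps the original row text, so the result can be pasted under the existing table header.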
|
|
||||||
|
|
||||||
## Filter Verification Checklist
|
|
||||||
|
|
||||||
Before spawning each agent, verify:
|
|
||||||
|
|
||||||
- [ ] Prompt contains ONLY the artifacts listed in that archetype's "Receives" above
|
|
||||||
- [ ] No cross-contamination from other reviewers' outputs
|
|
||||||
- [ ] Token count is within 20% of the target for the current workflow tier
|
|
||||||
- [ ] Prior-cycle feedback (if any) is summarized, not raw
|
|
||||||
- [ ] Excluded artifacts are genuinely absent (search for keywords like file paths from excluded sources)
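
The token-count item in the checklist can be checked mechanically. The sketch below is an assumption-laden heuristic, not a tokenizer: it estimates 1.3 tokens per word, the ratio implied by this document's "2000-word research wastes ~2600 tokens" figure, and both function names are hypothetical.

```bash
# Hypothetical helper: rough token estimate for a prompt file,
# using 1.3 tokens per whitespace-separated word (integer arithmetic).
estimate_tokens() {
  words=$(wc -w < "$1")
  echo $(( words * 13 / 10 ))
}

# Hypothetical helper: succeed if a token count is within 20% of the target.
# $1 = estimated tokens, $2 = target for the archetype and workflow tier
within_target() {
  diff=$(( $1 - $2 ))
  if [ "$diff" -lt 0 ]; then
    diff=$(( 0 - diff ))
  fi
  # |tokens - target| <= target / 5, written without division
  [ $(( diff * 5 )) -le "$2" ]
}
```

For a Guardian prompt in a standard workflow, `within_target "$(estimate_tokens prompt.md)" 2000` would pass for estimates between 1600 and 2400.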
|
|
||||||
|
|
||||||
## Context Isolation
|
|
||||||
|
|
||||||
Attention filters control *what* each agent receives. Context isolation controls *how* that context is constructed — ensuring agents operate on provided facts, not ambient knowledge.
|
|
||||||
|
|
||||||
### Rules
|
|
||||||
|
|
||||||
1. **No session bleed.** Agents receive fresh context only — constructed from task description, artifact files, or extracted sections. They must not inherit session state, chat history, or prior agent prompts.
|
|
||||||
2. **No cross-agent contamination.** An agent receives another agent's output only if the attention filter table above explicitly allows it. Guardian does not see Skeptic's output. Skeptic does not see the Maker's diff. Violations produce unreliable reviews.
|
|
||||||
3. **Controller-constructed only.** All agent context is assembled by the orchestrator from: (a) the task description, (b) artifact files on disk, or (c) extracted sections of those artifacts. Agents never pull their own context.
|
|
||||||
4. **No ambient knowledge.** Agents cannot "remember" findings from prior phases or cycles unless that information is explicitly injected via the cycle-back filtering protocol above. An agent that references information not in its prompt is hallucinating.
|
|
||||||
5. **Verification.** Before spawning each agent, confirm the constructed prompt has zero references to other agents' raw outputs that are not in the "Receives" column. Search for file paths, archetype names, and finding descriptions from excluded sources.
|
|
||||||
@@ -1,221 +1,70 @@
|
|||||||
---
|
---
|
||||||
name: autonomous-mode
|
name: autonomous-mode
|
||||||
description: Use when the user wants to run ArcheFlow orchestrations unattended — overnight sessions, batch processing multiple tasks, or fully autonomous coding. Handles self-organization, progress logging, and safe stopping.
|
description: Use when the user wants to run ArcheFlow orchestrations unattended -- overnight sessions, batch processing multiple tasks, or fully autonomous coding. Handles self-organization, progress logging, and safe stopping.
|
||||||
---
|
---
|
||||||
|
|
||||||
# Autonomous Mode
|
# Autonomous Mode
|
||||||
|
|
||||||
ArcheFlow orchestrations can run fully autonomously because the archetypes self-organize through the PDCA cycle. The user sets the task queue, walks away, and reviews results later.
|
ArcheFlow orchestrations run fully autonomously through the PDCA cycle's natural quality gates. No unreviewed code reaches main.
|
||||||
|
|
||||||
## How Autonomous Mode Works
|
|
||||||
|
|
||||||
The PDCA cycle provides natural quality gates at every turn of the spiral:
|
|
||||||
- **Plan** phase produces a proposal — reviewable artifact
|
|
||||||
- **Do** phase produces committed code in a worktree — isolated, reversible
|
|
||||||
- **Check** phase produces approval/rejection — automatic quality control
|
|
||||||
- **Act** phase either merges (safe) or cycles back (self-correcting)
|
|
||||||
|
|
||||||
No unreviewed code reaches the main branch. Ever. That's what makes overnight runs safe.
|
|
||||||
|
|
||||||
## Starting an Autonomous Session
|
|
||||||
|
|
||||||
```
|
|
||||||
You are entering AUTONOMOUS MODE.
|
|
||||||
|
|
||||||
Task queue:
|
|
||||||
1. "Add input validation to all API endpoints" (thorough)
|
|
||||||
2. "Refactor auth middleware to use JWT" (standard)
|
|
||||||
3. "Fix pagination bug in search results" (fast)
|
|
||||||
4. "Add rate limiting to public endpoints" (standard)
|
|
||||||
|
|
||||||
Rules:
|
|
||||||
- Process tasks sequentially (one orchestration at a time)
|
|
||||||
- Log progress to .archeflow/session-log.md after each task
|
|
||||||
- If a task fails after max cycles: log findings, skip to next task
|
|
||||||
- If 3 consecutive tasks fail: STOP and wait for user
|
|
||||||
- Commit and push after each successful merge
|
|
||||||
- Never force-push. Never modify main history.
|
|
||||||
```
|
|
||||||
|
|
||||||
## Session Log — Full Visibility
|
|
||||||
|
|
||||||
Every autonomous session writes to `.archeflow/session-log.md`:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
# ArcheFlow Autonomous Session
|
|
||||||
**Started:** 2026-04-02 22:00 UTC
|
|
||||||
**Mode:** autonomous
|
|
||||||
**Tasks:** 4 queued
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Task 1: Add input validation to all API endpoints
|
|
||||||
**Workflow:** thorough | **Status:** COMPLETED
|
|
||||||
**Cycles:** 2 of 3
|
|
||||||
**Cycle 1:** Guardian REJECTED (missing sanitization on 2 endpoints)
|
|
||||||
**Cycle 2:** All APPROVED
|
|
||||||
**Files changed:** 8 | **Tests added:** 24
|
|
||||||
**Branch:** merged to main (commit abc1234)
|
|
||||||
**Duration:** 12 min | **Completed:** 22:12 UTC
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Task 2: Refactor auth middleware to use JWT
|
|
||||||
**Workflow:** standard | **Status:** COMPLETED
|
|
||||||
**Cycles:** 1 of 2
|
|
||||||
**Cycle 1:** All APPROVED (clean implementation)
|
|
||||||
**Files changed:** 5 | **Tests added:** 15
|
|
||||||
**Branch:** merged to main (commit def5678)
|
|
||||||
**Duration:** 8 min | **Completed:** 22:20 UTC
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Task 3: Fix pagination bug in search results
|
|
||||||
**Workflow:** fast | **Status:** COMPLETED
|
|
||||||
**Cycles:** 1 of 1
|
|
||||||
**Cycle 1:** Guardian APPROVED
|
|
||||||
**Files changed:** 2 | **Tests added:** 3
|
|
||||||
**Branch:** merged to main (commit ghi9012)
|
|
||||||
**Duration:** 4 min | **Completed:** 22:24 UTC
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Task 4: Add rate limiting to public endpoints
|
|
||||||
**Workflow:** standard | **Status:** FAILED (max cycles)
|
|
||||||
**Cycles:** 2 of 2
|
|
||||||
**Cycle 1:** Skeptic REJECTED (Redis dependency not in Docker setup)
|
|
||||||
**Cycle 2:** Guardian REJECTED (race condition in token bucket)
|
|
||||||
**Unresolved:** Race condition in concurrent token bucket decrement
|
|
||||||
**Branch:** archeflow/maker-xyz (NOT merged — available for manual review)
|
|
||||||
**Duration:** 15 min | **Completed:** 22:39 UTC
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Session Summary
|
|
||||||
**Completed:** 3 of 4 tasks
|
|
||||||
**Failed:** 1 (rate limiting — needs human input on concurrency design)
|
|
||||||
**Total duration:** 39 min
|
|
||||||
**Files changed:** 15 | **Tests added:** 42
|
|
||||||
**Ended:** 22:39 UTC
|
|
||||||
```
|
|
||||||
|
|
||||||
## Safety Mechanisms
|
|
||||||
|
|
||||||
### Automatic Stop Conditions
|
|
||||||
The session halts and waits for the user when:
|
|
||||||
- **3 consecutive failures:** Something systemic is wrong
|
|
||||||
- **Destructive action detected:** Force push, branch deletion, schema drop
|
|
||||||
- **Shadow escalation:** Same shadow detected 3+ times across tasks
|
|
||||||
- **Budget exceeded:** If cost tracking is enabled, stop at budget limit
|
|
||||||
- **Test suite broken:** If existing tests fail after merge, halt immediately and revert
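
The "3 consecutive failures" condition amounts to a counter that resets on success. A minimal sketch, assuming task results are reported as the `COMPLETED` and `FAILED` statuses used in the session log; the variable and function names are hypothetical.

```bash
# Hypothetical session-loop state: consecutive task failures so far.
consecutive_failures=0

# Record one task outcome. Returns non-zero once three tasks in a row have
# failed, which is the signal to halt and wait for the user.
# $1 = task status, COMPLETED or FAILED
record_task_result() {
  if [ "$1" = "FAILED" ]; then
    consecutive_failures=$(( consecutive_failures + 1 ))
  else
    consecutive_failures=0
  fi
  [ "$consecutive_failures" -lt 3 ]
}
```

A session loop could call `record_task_result "$status" || break` after each task. The function must run in the current shell, not in a subshell, or the counter is lost.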
|
|
||||||
|
|
||||||
### Everything is Reversible
|
|
||||||
- Code changes live on worktree branches until explicitly merged
|
|
||||||
- Merges use `--no-ff` — every merge commit is individually revertable
|
|
||||||
- The session log captures every decision for post-hoc review
|
|
||||||
- Failed tasks leave their branches intact for manual inspection
|
|
||||||
|
|
||||||
### User Controls
|
|
||||||
The user can at any time:
|
|
||||||
- **Cancel:** Kill the session. All incomplete work stays on branches.
|
|
||||||
- **Pause:** Stop after current task completes. Resume later.
|
|
||||||
- **Skip:** Skip the current task, move to the next one.
|
|
||||||
- **Review:** Read `.archeflow/session-log.md` for real-time progress.
|
|
||||||
- **Intervene:** Jump into a worktree branch and fix something manually.
|
|
||||||
|
|
||||||
## Task Queue Formats
|
## Task Queue Formats
|
||||||
|
|
||||||
### Simple (inline)
|
**Inline:**
|
||||||
```
|
```
|
||||||
Tasks:
|
|
||||||
1. "Fix the login bug" (fast)
|
1. "Fix the login bug" (fast)
|
||||||
2. "Add user profile page" (standard)
|
2. "Add user profile page" (standard)
|
||||||
```
|
```
|
||||||
|
|
||||||
### From File
|
**From file (`.archeflow/queue.md`):**
|
||||||
Create `.archeflow/queue.md`:
|
|
||||||
```markdown
|
```markdown
|
||||||
- [ ] Fix the login bug | fast
|
- [ ] Fix the login bug | fast
|
||||||
- [ ] Add user profile page | standard
|
- [ ] Add user profile page | standard | depends: fix login
|
||||||
- [ ] Security audit of payment flow | thorough
|
- [ ] Security audit | thorough | done: Guardian approves AND load_test.sh passes
|
||||||
- [x] Refactor database queries | standard (completed)
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### With Dependencies
|
Tasks with `depends:` wait for the named task to complete. Tasks with `done:` have completion criteria checked in the Act phase.
|
||||||
```markdown
|
|
||||||
- [ ] Add user model (standard)
|
|
||||||
- [ ] Add user API endpoints (standard) | depends: user model
|
|
||||||
- [ ] Add user UI (standard) | depends: user API endpoints
|
|
||||||
```
|
|
||||||
Dependencies are processed in order: a task with `depends: X` waits until X completes successfully. Tasks without dependencies or with resolved dependencies can run in parallel (see Parallel Team Orchestration in the orchestration skill).
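
Reading the pipe-separated queue format can be sketched with `awk`. This assumes the `- [ ] task | workflow | depends: name` layout from the `.archeflow/queue.md` example and ignores `done:` criteria; the function name `pending_tasks` is hypothetical.

```bash
# Hypothetical helper: list unchecked queue entries as
# "task<TAB>workflow<TAB>dependency" (dependency is empty if none is given).
# $1 = path to the queue file
pending_tasks() {
  awk -F'|' '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    /^- \[ \] / {
      task = $1
      sub(/^- \[ \] /, "", task)          # drop the checkbox prefix
      dep = ""
      for (i = 3; i <= NF; i++) {         # optional fields after the workflow
        field = trim($i)
        if (field ~ /^depends:/) {
          sub(/^depends:[ \t]*/, "", field)
          dep = field
        }
      }
      print trim(task) "\t" trim($2) "\t" dep
    }
  ' "$1"
}
```

Entries already marked `- [x]` are skipped, so re-running a session against the same queue file only picks up unfinished work.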
|
|
||||||
|
|
||||||
### With Completion Criteria
|
## Safety Mechanisms
|
||||||
```markdown
|
|
||||||
- [ ] Fix login bug | fast | done: login_test.py passes
|
### Automatic Stop Conditions
|
||||||
- [ ] Add rate limiting | standard | done: Guardian approves AND load_test.sh passes
|
|
||||||
```
|
- **3 consecutive failures:** Something systemic is wrong
|
||||||
Completion criteria are checked in the Act phase. If the test command fails even when reviewers approve, the task cycles back.
|
- **Test suite broken:** Halt immediately, revert last merge
|
||||||
|
- **Budget exceeded:** Stop at limit
|
||||||
|
- **Shadow escalation:** Same shadow detected 3+ times across tasks
|
||||||
|
- **Destructive action detected:** Force push, branch deletion, schema drop
|
||||||
|
|
||||||
|
### Everything is Reversible
|
||||||
|
|
||||||
|
- Code lives on worktree branches until explicitly merged
|
||||||
|
- Merges use `--no-ff` (individually revertable)
|
||||||
|
- Failed tasks leave branches intact for inspection
|
||||||
|
|
||||||
|
### User Controls
|
||||||
|
|
||||||
|
- **Cancel:** Kill session, incomplete work stays on branches
|
||||||
|
- **Pause:** Stop after current task, resume later
|
||||||
|
- **Skip:** Move to next task
|
||||||
|
- **Review:** Read `.archeflow/session-log.md` for progress
|
||||||
|
|
||||||
|
## Session Log
|
||||||
|
|
||||||
|
Every session writes to `.archeflow/session-log.md` with per-task entries:
|
||||||
|
- Workflow, status, cycles, reviewer verdicts
|
||||||
|
- Files changed, tests added
|
||||||
|
- Branch and commit info
|
||||||
|
- Duration and timestamps
|
||||||
|
- Session summary at the end
|
||||||
|
|
||||||
## Budget-Aware Scheduling
|
## Budget-Aware Scheduling
|
||||||
|
|
||||||
Set a token or cost budget for the session. The orchestrator tracks estimated cost per task and adapts:
|
|
||||||
|
|
||||||
```
|
|
||||||
Budget: $5.00 (or ~2M tokens)
|
|
||||||
```
|
|
||||||
|
|
||||||
| Budget Remaining | Action |
|
| Budget Remaining | Action |
|
||||||
|-----------------|--------|
|
|-----------------|--------|
|
||||||
| > 50% | Run tasks at their selected workflow level |
|
| > 50% | Run at selected workflow level |
|
||||||
| 25-50% | Downgrade `thorough` → `standard`, `standard` → `fast` |
|
| 25-50% | Downgrade thorough to standard, standard to fast |
|
||||||
| < 25% | Run remaining tasks as `fast` only |
|
| < 25% | All tasks as fast only |
|
||||||
| Exhausted | Stop. Log remaining tasks as "skipped — budget exhausted" |
|
| Exhausted | Stop, log remaining as skipped |
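
The downgrade table can be expressed as a small function. This is a sketch with assumed boundary handling: the table does not say which tier exactly 50% or exactly 25% belongs to, so both are treated here as part of the 25-50% tier. The name `effective_workflow` and the `skip` return value are hypothetical.

```bash
# Hypothetical helper: pick the workflow to run given the remaining budget.
# $1 = selected workflow (fast, standard, thorough)
# $2 = budget remaining as a whole-number percentage
effective_workflow() {
  if [ "$2" -le 0 ]; then
    echo skip                      # exhausted: log the task as skipped
  elif [ "$2" -lt 25 ]; then
    echo fast                      # below 25%: everything runs as fast
  elif [ "$2" -le 50 ]; then
    case "$1" in                   # 25-50%: downgrade one level
      thorough) echo standard ;;
      *)        echo fast ;;
    esac
  else
    echo "$1"                      # above 50%: run as selected
  fi
}
```

For example, `effective_workflow thorough 40` prints `standard`.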
|
||||||
|
|
||||||
Budget is tracked per-task in the session log. Estimated cost per agent by model tier:
|
## Auto-Resume
|
||||||
|
|
||||||
| Tier | Model | Est. Cost/Agent |
|
On interruption, save state to `.archeflow/state.json` (current task, phase, cycle, completed tasks, worktree branch). On next session start, offer to resume or start fresh.
|
||||||
|------|-------|----------------|
|
|
||||||
| cheap | Haiku | ~$0.01 |
|
|
||||||
| standard | Sonnet | ~$0.05 |
|
|
||||||
| premium | Opus | ~$0.25 |
|
|
||||||
|
|
||||||
A standard workflow (6 agents, mostly Sonnet) costs ~$0.30. A thorough workflow (8 agents) costs ~$0.50. These are rough estimates — actual cost depends on context size and output length.
|
|
||||||
|
|
||||||
## Auto-Resume on Interruption
|
|
||||||
|
|
||||||
If a session is interrupted (crash, timeout, user cancel), save state for resumption:
|
|
||||||
|
|
||||||
### On Interruption
|
|
||||||
Write `.archeflow/state.json`:
|
|
||||||
```json
{
  "session_id": "...",
  "current_task": 2,
  "current_phase": "check",
  "current_cycle": 1,
  "completed_tasks": [1],
  "queue": ["task3", "task4"],
  "worktree_branch": "archeflow/maker-abc",
  "timestamp": "2026-04-03T22:15:00Z"
}
```
|
|
||||||
|
|
||||||
### On Next Session Start
|
|
||||||
If `.archeflow/state.json` exists:
|
|
||||||
1. Report: "Found interrupted ArcheFlow session from [timestamp]. Task [N] was in [phase] phase."
|
|
||||||
2. Offer: "Resume from where we left off? Or start fresh?"
|
|
||||||
3. If resume: pick up from the saved phase. The worktree branch is still intact.
|
|
||||||
4. If fresh: clean up state file and worktrees, start over.
|
|
||||||
|
|
||||||
## Overnight Session Checklist
|
|
||||||
|
|
||||||
Before starting an autonomous overnight session:
|
|
||||||
|
|
||||||
1. **Clean working tree:** `git status` — no uncommitted changes
|
|
||||||
2. **Tests passing:** Run the full test suite. Don't start on a broken baseline.
|
|
||||||
3. **Task queue defined:** Either inline or in `.archeflow/queue.md`
|
|
||||||
4. **Workflow selected per task:** Match risk level to workflow type
|
|
||||||
5. **Budget set (optional):** If cost matters, set a token/dollar limit
|
|
||||||
6. **Push access:** Verify git push works (SSH key, auth token)
|
|
||||||
|
|
||||||
Then: set it, forget it, read the session log in the morning.
|
|
||||||
|
|||||||
@@ -1,233 +1,110 @@
|
|||||||
---
|
---
|
||||||
name: check-phase
|
name: check-phase
|
||||||
description: Use when you are acting as Guardian, Skeptic, Sage, or Trickster archetype in the Check phase. Defines shared review rules and output format.
|
description: Use when acting as Guardian, Skeptic, Sage, or Trickster in the Check phase. Defines review rules, finding format, attention filters, and spawning protocol.
|
||||||
---
|
---
|
||||||
|
|
||||||
# Check Phase
|
# Check Phase
|
||||||
|
|
||||||
Multiple reviewers examine the Maker's implementation in parallel. Each agent definition has its specific protocol — this skill defines the shared rules.
|
Reviewers examine the Maker's implementation. This skill defines shared rules, finding format, and spawning protocol.
|
||||||
|
|
||||||
## Shared Rules
|
## Shared Rules
|
||||||
|
|
||||||
1. **Read the proposal first.** Review against the intended design, not invented requirements.
|
1. Review against the proposal's intended design, not invented requirements.
|
||||||
2. **Read the actual code.** Use `git diff` on the Maker's branch. Don't review descriptions alone.
|
2. Read actual code via `git diff` on the Maker's branch.
|
||||||
3. **Structured findings.** Use the standardized finding format below for every issue.
|
3. Use the finding format below for every issue.
|
||||||
4. **Clear verdict:** `APPROVED` or `REJECTED` with rationale.
|
4. Give a clear verdict: `APPROVED` or `REJECTED` with rationale.
|
||||||
5. **Status tokens are separate from verdicts.** The `STATUS: DONE` line signals the agent finished successfully. The `APPROVED`/`REJECTED` verdict is domain output. A reviewer can be `STATUS: DONE` with verdict `REJECTED` — that is normal. Parse both independently.
|
5. `STATUS: DONE` signals agent completion. `APPROVED`/`REJECTED` is domain output. Both are parsed independently.
|
||||||
|
|
||||||
## Finding Format
|
## Finding Format
|
||||||
|
|
||||||
Every finding must use this format for cross-cycle tracking:
|
|
||||||
|
|
||||||
```
|
|
||||||
| Location | Severity | Category | Description | Fix |
|
| Location | Severity | Category | Description | Fix |
|
||||||
|----------|----------|----------|-------------|-----|
|
|----------|----------|----------|-------------|-----|
|
||||||
| src/auth/handler.ts:48 | CRITICAL | security | Empty string bypasses validation | Add length check before processing |
|
| src/auth/handler.ts:48 | CRITICAL | security | Empty string bypasses validation | Add length check |
|
||||||
```
|
|
||||||
|
|
||||||
**Severity:**
|
**Severity:** CRITICAL = must fix, blocks approval. WARNING = should fix, doesn't block alone. INFO = nice to have, never blocks.
|
||||||
- **CRITICAL** — Must fix. Blocks approval.
|
|
||||||
- **WARNING** — Should fix. Doesn't block alone.
|
|
||||||
- **INFO** — Nice to have. Never blocks.
|
|
||||||
|
|
||||||
**Categories** (use consistently for cross-cycle tracking):
|
**Categories:** `security` `reliability` `design` `breaking-change` `dependency` `quality` `testing` `consistency`
|
||||||
- `security` — Injection, auth bypass, data exposure, secrets
|
|
||||||
- `reliability` — Error handling, edge cases, race conditions, crashes
|
|
||||||
- `design` — Architecture, assumptions, scalability, coupling
|
|
||||||
- `breaking-change` — API compatibility, schema migrations, removals
|
|
||||||
- `dependency` — New deps, version conflicts, license issues
|
|
||||||
- `quality` — Readability, maintainability, naming, duplication
|
|
||||||
- `testing` — Missing tests, weak assertions, untested paths
|
|
||||||
- `consistency` — Deviates from codebase patterns
|
|
||||||
|
|
||||||
## Consolidated Output
|
## Evidence Requirements
|
||||||
|
|
||||||
After all reviewers finish, compile:
|
Every CRITICAL or WARNING must include concrete evidence. Without evidence, downgrade to INFO.
|
||||||
|
|
||||||
```markdown
|
**Valid evidence:** command output, exit codes, code citations with line numbers, git diff excerpts, reproduction steps.
|
||||||
## Check Phase Results — Cycle N
|
|
||||||
|
|
||||||
### Guardian: APPROVED
|
**Banned in CRITICAL/WARNING:** "might be", "could potentially", "appears to", "seems like", "may not". Rewrite with evidence or downgrade.
|
||||||
| Location | Severity | Category | Description | Fix |
|
|
||||||
|----------|----------|----------|-------------|-----|
|
|
||||||
| src/auth/handler.ts:52 | WARNING | security | Missing rate limit | Add rate limiter middleware |
|
|
||||||
|
|
||||||
### Skeptic: APPROVED
|
For each CRITICAL/WARNING, state: (1) what was tested, (2) what was observed, (3) what correct behavior should be.
|
||||||
| Location | Severity | Category | Description | Fix |
|
|
||||||
|----------|----------|----------|-------------|-----|
|
|
||||||
| src/auth/handler.ts:30 | INFO | design | Consider caching validated tokens | Add TTL cache for token validation |
|
|
||||||
|
|
||||||
### Sage: APPROVED
|
## Attention Filters
|
||||||
| Location | Severity | Category | Description | Fix |
|
|
||||||
|----------|----------|----------|-------------|-----|
|
|
||||||
| tests/auth.test.ts:15 | WARNING | testing | Test names don't describe behavior | Rename to "should reject expired tokens" |
|
|
||||||
|
|
||||||
### Trickster: REJECTED
|
Each archetype receives only relevant context. Do not pass everything.
|
||||||
| Location | Severity | Category | Description | Fix |
|
|
||||||
|----------|----------|----------|-------------|-----|
|
|
||||||
| src/auth/handler.ts:48 | CRITICAL | reliability | Empty string bypasses validation | Add `if (!token || token.trim() === '')` guard |
|
|
||||||
|
|
||||||
### Deduplication
|
| Archetype | Receives | Excludes |
|
||||||
If two reviewers raise the same issue (same file + same category), merge:
|
|-----------|----------|----------|
|
||||||
| Guardian + Skeptic | CRITICAL | security | Input not sanitized (src/api.ts:30) | Add validation |
|
| Guardian | Maker's git diff + proposal risk section + test results | Explorer research, Creator rationale, other reviewers |
|
||||||
|
| Skeptic | Creator's proposal (assumptions + architecture) + confidence scores | Git diff, Explorer research, other reviewers |
|
||||||
|
| Sage | Creator's proposal + Maker's diff + implementation summary + test results | Explorer raw research, other reviewer verdicts |
|
||||||
|
| Trickster | Maker's git diff + attack surface summary (file types + entry points) | Proposal, research, other reviewers |
|
||||||
|
|
||||||
Use the higher severity. Don't double-count in the verdict.
|
**Token budget targets:**
|
||||||
|
|
||||||
### Verdict: REJECTED — 1 critical finding
|
| Archetype | Fast | Standard | Thorough |
|
||||||
→ Build cycle feedback (see orchestration skill) and feed to Plan phase
|
|-----------|------|----------|----------|
|
||||||
```
|
| Guardian | 1500 | 2000 | 2500 |
|
||||||
|
| Skeptic | skip | 1500 | 2000 |
|
||||||
|
| Trickster | skip | skip | 1500 |
|
||||||
|
| Sage | skip | 2500 | 3000 |
|
||||||
|
|
||||||
|
**Context isolation:** Agents receive fresh, controller-constructed context only. No session bleed, no cross-agent contamination, no ambient knowledge. Verify zero references to excluded artifacts before spawning.
|
||||||
|
|
||||||
|
**Cycle-back filtering (cycle 2+):** Pass structured feedback table only (not full reviewer artifacts). Strip resolved items. Cap at 500 tokens — summarize by severity if exceeded.
|
||||||
|
|
||||||
## Reviewer Spawning Protocol
|
## Reviewer Spawning Protocol
|
||||||
|
|
||||||
This section defines the exact sequence for spawning reviewers in the Check phase.
|
|
||||||
|
|
||||||
### Step 1: Guardian First (mandatory)
|
### Step 1: Guardian First (mandatory)
|
||||||
|
|
||||||
Guardian always runs first, before any other reviewer. It receives the Maker's git diff and the proposal's risk section only.
|
Guardian always runs first. It receives the Maker's git diff and the proposal's risk section only.
|
||||||
|
|
||||||
**Context for Guardian:**
|
|
||||||
- `git diff main...<maker-branch>` (the actual code changes)
|
|
||||||
- Risk section from `plan-creator.md` (if present)
|
|
||||||
- Do NOT include: Explorer research, full proposal, other reviewer outputs
|
|
||||||
|
|
||||||
```
|
|
||||||
Agent(
|
|
||||||
description: "Guardian: security and risk review for <task>",
|
|
||||||
prompt: "You are the GUARDIAN archetype.
|
|
||||||
Review the diff: <maker's diff>
|
|
||||||
Proposal risks: <risk section from plan-creator.md>
|
|
||||||
Assess: security vulnerabilities, reliability risks, breaking changes, dependency risks.
|
|
||||||
Output: APPROVED or REJECTED with findings in the standardized format.
|
|
||||||
Each finding: | Location | Severity | Category | Description | Fix |",
|
|
||||||
model: <resolve_model guardian $WORKFLOW>
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
Save output to `.archeflow/artifacts/${RUN_ID}/check-guardian.md`.
|
Save output to `.archeflow/artifacts/${RUN_ID}/check-guardian.md`.
|
||||||
|
|
||||||
### Step 2: A2 Fast-Path Evaluation
|
### Step 2: A2 Fast-Path Evaluation
|
||||||
|
|
||||||
After Guardian completes, parse its output before spawning other reviewers:
|
After Guardian completes, count the CRITICAL and WARNING findings in its output. If both counts are zero, the run was not escalated, and this is not the first cycle of a `thorough` workflow, skip the remaining reviewers and proceed to the Act phase.
|
||||||
|
|
||||||
```bash
|
### Step 3: Parallel Remaining Reviewers
|
||||||
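# grep -c already prints 0 (with exit status 1) when nothing matches, so use "|| true" rather than "|| echo 0" to avoid a doubled "0"
CRITICAL_COUNT=$(grep -c "| CRITICAL |" ".archeflow/artifacts/${RUN_ID}/check-guardian.md" || true)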
|
|
||||||
WARNING_COUNT=$(grep -c "| WARNING |" ".archeflow/artifacts/${RUN_ID}/check-guardian.md" || true)
|
|
||||||
|
|
||||||
# A2 fast-path: skip remaining reviewers if Guardian is clean
|
If A2 does not trigger, spawn remaining reviewers in parallel:
|
||||||
# Exception: first cycle of thorough workflows always spawns all reviewers
|
|
||||||
if [[ "$CRITICAL_COUNT" -eq 0 && "$WARNING_COUNT" -eq 0 \
|
|
||||||
&& "$ESCALATED" != "true" \
|
|
||||||
&& ! ("$WORKFLOW" == "thorough" && "$CYCLE" -eq 1) ]]; then
|
|
||||||
echo "Guardian fast-path: 0 CRITICAL, 0 WARNING — skipping remaining reviewers."
|
|
||||||
# Proceed directly to Act phase
|
|
||||||
fi
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 3: Parallel Reviewer Spawning
|
|
||||||
|
|
||||||
If A2 does not trigger, spawn remaining reviewers in parallel based on workflow:
|
|
||||||
|
|
||||||
| Workflow | Reviewers (after Guardian) |
|
| Workflow | Reviewers (after Guardian) |
|
||||||
|----------|--------------------------|
|
|----------|--------------------------|
|
||||||
| `fast` | None (Guardian only) |
|
| `fast` | None (Guardian only) |
|
||||||
| `fast` (escalated via A1) | Skeptic + Sage |
|
| `fast` (escalated) | Skeptic + Sage |
|
||||||
| `standard` | Skeptic + Sage |
|
| `standard` | Skeptic + Sage |
|
||||||
| `thorough` | Skeptic + Sage + Trickster |
|
| `thorough` | Skeptic + Sage + Trickster |
|
||||||
|
|
||||||
Spawn all applicable reviewers in a single message with multiple Agent calls:
|
Each reviewer gets context per the attention filters above.
|
||||||
|
|
||||||
```
|
### Step 4: Collect and Consolidate
|
||||||
# Standard workflow example — spawn Skeptic and Sage in parallel:
|
|
||||||
Agent(
|
|
||||||
description: "Skeptic: challenge assumptions for <task>",
|
|
||||||
prompt: "<Skeptic prompt with Creator's proposal>",
|
|
||||||
model: <resolve_model skeptic $WORKFLOW>
|
|
||||||
)
|
|
||||||
|
|
||||||
Agent(
|
For each reviewer: save to `.archeflow/artifacts/${RUN_ID}/check-<archetype>.md`, emit `review.verdict` event, record sequence number.
|
||||||
description: "Sage: holistic quality review for <task>",
|
|
||||||
prompt: "<Sage prompt with proposal + diff + implementation summary>",
|
**Deduplication:** If two reviewers raise the same issue (same file + same category), merge into one finding using the higher severity. Don't double-count.
|
||||||
model: <resolve_model sage $WORKFLOW>
|
|
||||||
)
|
**Verdict:** Count CRITICAL findings across all reviewers (after dedup). Any CRITICAL = `REJECTED`. Otherwise `APPROVED`.
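The deduplication and verdict rules can be sketched in shell. This is a minimal illustration, not ArcheFlow's implementation: the function name `consolidate_findings` is hypothetical, and it assumes findings arrive on stdin as rows in the standardized `| Location | Severity | Category | Description | Fix |` format, with the file taken as the Location minus its trailing `:line`.

```bash
# Hypothetical helper (not part of ArcheFlow): merge duplicate findings and derive the verdict.
# stdin: table rows "| Location | Severity | Category | Description | Fix |"
consolidate_findings() {
  awk -F' *[|] *' '
    BEGIN { rank["INFO"] = 1; rank["WARNING"] = 2; rank["CRITICAL"] = 3 }
    # Only rows with a known severity are findings; header and separator rows are skipped.
    ($3 in rank) {
      file = $2
      sub(/:[0-9]+$/, "", file)            # same file, regardless of line number
      key = file SUBSEP $4                 # dedup key: file + category
      if (!(key in best)) {
        order[++n] = key; best[key] = $0; sev[key] = $3
      } else if (rank[$3] > rank[sev[key]]) {
        best[key] = $0; sev[key] = $3      # keep the higher severity
      }
    }
    END {
      for (i = 1; i <= n; i++) {
        print best[order[i]]
        if (sev[order[i]] == "CRITICAL") crit++
      }
      # Any CRITICAL after dedup rejects the cycle.
      print (crit > 0 ? "REJECTED" : "APPROVED")
    }'
}
```

Feeding it the concatenated `check-*.md` tables prints each surviving finding once, followed by a final line holding the verdict. Merging the reviewer names (for example "Guardian + Skeptic") is left out of this sketch.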
|
||||||
|
|
||||||
|
Example consolidated output:
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
## Check Phase Results — Cycle 1
|
||||||
|
### Guardian: APPROVED
|
||||||
|
| Location | Severity | Category | Description | Fix |
|
||||||
|
|----------|----------|----------|-------------|-----|
|
||||||
|
| src/auth.ts:52 | WARNING | security | Missing rate limit | Add rate limiter |
|
||||||
|
### Verdict: APPROVED — 0 critical, 1 warning
|
||||||
```
|
```
|
||||||
|
|
||||||
Each reviewer gets context per the attention filters defined in `archeflow:orchestration`:
|
## Timeout Handling
|
||||||
- **Skeptic:** Creator's proposal (assumptions section focus)
|
|
||||||
- **Sage:** Creator's proposal + Maker's diff + implementation summary
|
|
||||||
- **Trickster:** Maker's diff only
|
|
||||||
|
|
||||||
### Step 4: Collect Results
|
Each reviewer has a **5-minute timeout**. On timeout: emit `agent.complete` with `"error": true`, log WARNING, treat as no findings, proceed.
|
||||||
|
|
||||||
Wait for all spawned reviewers to return. For each:
|
**Exception:** Guardian timeout is blocking — abort Check phase and report to user.
|
||||||
1. Save output to `.archeflow/artifacts/${RUN_ID}/check-<archetype>.md`
|
|
||||||
2. Emit `review.verdict` event with findings
|
|
||||||
3. Record sequence number for DAG parent tracking
|
|
||||||
|
|
||||||
### Timeout Handling
|
|
||||||
|
|
||||||
Each reviewer has a **5-minute timeout**. If a reviewer does not return within 5 minutes:
|
|
||||||
1. Emit `agent.complete` with `"error": true, "reason": "timeout"`
|
|
||||||
2. Log a WARNING — do not block the run
|
|
||||||
3. Treat the timed-out reviewer as having delivered no findings (neither approved nor rejected)
|
|
||||||
4. Proceed with available verdicts
|
|
||||||
|
|
||||||
If Guardian times out, this is a blocking failure — abort the Check phase and report to the user.
|
|
||||||
|
|
||||||
### Re-Check Protocol (Act Phase Fixes)
|
|
||||||
|
|
||||||
When the Act phase routes findings back to the Maker and the Maker applies fixes in a subsequent cycle, the Check phase re-runs with the updated diff. Reviewers who previously rejected should focus on whether their specific findings were addressed. The structured feedback from `act-feedback.md` provides the mapping of which findings were routed where.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Evidence Requirements
|
|
||||||
|
|
||||||
Every CRITICAL or WARNING finding must include concrete evidence. Findings without evidence are downgraded to INFO.
|
|
||||||
|
|
||||||
### Evidence Types
|
|
||||||
|
|
||||||
| Type | Example | When Required |
|
|
||||||
|------|---------|---------------|
|
|
||||||
| Command output | `npm test` output showing failure | Test-related findings |
|
|
||||||
| Exit code | `exit code 1 from eslint` | Tool-based validation |
|
|
||||||
| Code citation | `src/auth.ts:48 — \`if (token) { ... }\`` | Logic or security findings |
|
|
||||||
| Git diff | `+ db.query(userInput)` (unsanitized) | Implementation review |
|
|
||||||
| Reproduction steps | "1. Send POST with empty body, 2. Observe 500" | Runtime behavior findings |
|
|
||||||
|
|
||||||
### Banned Phrases
|
|
||||||
|
|
||||||
The following phrases are not permitted in CRITICAL or WARNING findings. They indicate speculation, not evidence:
|
|
||||||
|
|
||||||
- "might be"
|
|
||||||
- "could potentially"
|
|
||||||
- "appears to"
|
|
||||||
- "seems like"
|
|
||||||
- "may not"
|
|
||||||
|
|
||||||
A finding using these phrases must either be rewritten with evidence or downgraded to INFO.
|
|
||||||
|
|
||||||
### Verification Protocol
|
|
||||||
|
|
||||||
For each CRITICAL or WARNING finding, state:
|
|
||||||
|
|
||||||
1. **What was tested** — the specific code path, input, or scenario examined
|
|
||||||
2. **What was observed** — the actual behavior or code construct found
|
|
||||||
3. **What correct behavior should be** — the expected alternative
|
|
||||||
|
|
||||||
### Downgrade Rule
|
|
||||||
|
|
||||||
If a reviewer produces a CRITICAL or WARNING finding without any of the evidence types above, the orchestrator downgrades it to INFO and emits a `decision` event:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
./lib/archeflow-event.sh "$RUN_ID" decision check "" \
|
|
||||||
'{"what":"evidence_downgrade","from":"CRITICAL","to":"INFO","finding":"<description>","reviewer":"<archetype>","reason":"no evidence provided"}'
|
|
||||||
```
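The banned-phrase rule lends itself to a mechanical pre-check before any downgrade decision. The sketch below is illustrative only: `banned_phrase_rows` is a hypothetical name, and it assumes findings are piped in as rows of the standardized findings table.

```bash
# Hypothetical helper (not part of ArcheFlow): list CRITICAL/WARNING rows
# that contain one of the banned speculative phrases.
# stdin: table rows "| Location | Severity | Category | Description | Fix |"
banned_phrase_rows() {
  grep -E '\| (CRITICAL|WARNING) \|' \
    | grep -iE 'might be|could potentially|appears to|seems like|may not' \
    || true   # no offending rows is not an error
}
```

Each row it prints is a candidate for rewriting with evidence or downgrading to INFO; INFO rows are never flagged because the rule only applies to CRITICAL and WARNING.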
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Why Structured Findings Matter
|
|
||||||
|
|
||||||
The standardized format enables:
|
|
||||||
- **Cross-cycle tracking:** Same category + location = same issue. Can detect resolution or regression.
|
|
||||||
- **Feedback routing:** Security/design findings → Creator. Quality/testing findings → Maker.
|
|
||||||
- **Shadow detection:** CRITICAL:WARNING ratios, finding counts, and category distributions are measurable.
|
|
||||||
- **Metrics:** Severity counts feed into the orchestration summary.
|
|
||||||
|
|||||||
@@ -9,384 +9,91 @@ description: |
|
|||||||
<example>User: "archeflow:run" in a project with colette.yaml</example>
|
<example>User: "archeflow:run" in a project with colette.yaml</example>
|
||||||
---
|
---
|
||||||
|
|
||||||
# Colette Bridge — Writing Context Auto-Loader
|
# Colette Bridge -- Writing Context Auto-Loader
|
||||||
|
|
||||||
When ArcheFlow detects `colette.yaml` in the project root, this skill automatically loads voice profiles, personas, character sheets, and project rules into a context bundle that every agent receives (filtered by archetype role).
|
When `colette.yaml` exists in the project root, this skill loads voice profiles, personas, character sheets, and project rules into a context bundle filtered per archetype.
|
||||||
|
|
||||||
## Prerequisites
|
## Activation
|
||||||
|
|
||||||
- `archeflow:domains` — Colette Bridge sets domain to `writing` automatically
|
At `run.start`, after domain detection but before Plan phase:
|
||||||
- `archeflow:artifact-routing` — bundle is injected via the artifact routing system
|
1. Check for `colette.yaml` in project root
|
||||||
- `archeflow:run` — bridge hooks into run initialization
|
2. If found: activate bridge, set domain to `writing`
|
||||||
|
3. If not found: skip silently
|
||||||
## Trigger
|
|
||||||
|
|
||||||
At `run.start`, after domain detection but before the Plan phase:
|
|
||||||
|
|
||||||
1. Check if `colette.yaml` exists in the project root
|
|
||||||
2. If found, activate Colette Bridge
|
|
||||||
3. If not found, skip silently (no error, no warning)
|
|
||||||
|
|
||||||
When the bridge activates, it emits a decision event:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
./lib/archeflow-event.sh "$RUN_ID" decision init "" \
|
|
||||||
'{"what":"colette_bridge","chosen":"activated","signal":"colette.yaml found","files_resolved":<count>}'
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## File Resolution
|
## File Resolution
|
||||||
|
|
||||||
Colette projects reference files by ID (e.g., `vp-giesing-gschichten-v1`) but the actual YAML files may live in different locations. The bridge resolves files using this search order:
|
Colette projects reference files by ID (e.g., `vp-giesing-gschichten-v1`). The bridge resolves them:
|
||||||
|
|
||||||
### Search Priority (highest first)
|
| Priority | Location |
|
||||||
|
|----------|----------|
|
||||||
|
| 1 | Explicit path in `colette.yaml` (has `/` or `.yaml`) |
|
||||||
|
| 2 | Project root subdirectories (`./profiles/<id>.yaml`) |
|
||||||
|
| 3 | Parent `writing.colette/` dir (`../writing.colette/profiles/<id>.yaml`) |
|
||||||
|
|
||||||
| Priority | Location | Example |
|
**What gets resolved:**
|
||||||
|----------|----------|---------|
|
|
||||||
| 1 | Explicit path in `colette.yaml` | `voice.profile: ../writing.colette/profiles/custom.yaml` |
|
|
||||||
| 2 | Project root subdirectories | `./profiles/vp-giesing-gschichten-v1.yaml` |
|
|
||||||
| 3 | Parent directory + `writing.colette/` | `../writing.colette/profiles/vp-giesing-gschichten-v1.yaml` |
|
|
||||||
|
|
||||||
### What Gets Resolved
|
| Source | colette.yaml field | Search subdirs |
|
||||||
|
|--------|-------------------|----------------|
|
||||||
| Source | colette.yaml field | Search paths |
|
| Voice profile | `voice.profile` | `profiles/` |
|
||||||
|--------|-------------------|-------------|
|
| Persona | `writing.persona` or inferred from profile | `personas/` |
|
||||||
| Voice profile | `voice.profile` | `profiles/<id>.yaml`, `../writing.colette/profiles/<id>.yaml` |
|
|
||||||
| Persona | `writing.persona` or inferred from profile | `personas/<id>.yaml`, `../writing.colette/personas/<id>.yaml` |
|
|
||||||
| Characters | Auto-discovered | `characters/*.yaml` |
|
| Characters | Auto-discovered | `characters/*.yaml` |
|
||||||
| Series config | `series` section (if present) | `colette.yaml` itself, `../writing.colette/series/<name>.yaml` |
|
| Series config | `series` section | `colette.yaml` itself |
|
||||||
| Project rules | Always | `CLAUDE.md` in project root |
|
| Project rules | Always | `CLAUDE.md` in project root |
|
||||||
|
|
||||||
### Resolution Procedure
|
Missing files emit a warning event but do not abort the run.
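The search order can be sketched as a small shell function. This is a sketch under stated assumptions: `resolve_colette_ref` is a hypothetical name, the caller passes the search subdirectory (`profiles` or `personas`), and paths are resolved relative to the project root as the current directory.

```bash
# Hypothetical helper (not part of ArcheFlow): resolve a colette.yaml reference to a file.
# $1 = reference (explicit path or bare ID), $2 = search subdirectory
# Prints the resolved path and returns 0; returns 1 if nothing is found.
resolve_colette_ref() {
  # Priority 1: explicit path (contains "/" or ends in ".yaml") is used as-is.
  case "$1" in
    */*|*.yaml)
      if [ -f "$1" ]; then printf '%s\n' "$1"; return 0; fi
      return 1 ;;
  esac
  # Priority 2: project root subdirectory. Priority 3: parent writing.colette/ directory.
  for candidate in "./$2/$1.yaml" "../writing.colette/$2/$1.yaml"; do
    if [ -f "$candidate" ]; then printf '%s\n' "$candidate"; return 0; fi
  done
  return 1
}
```

A non-zero return is where the caller would emit the warning event and continue, for example `resolve_colette_ref vp-giesing-gschichten-v1 profiles || echo "not found, skipping"`.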
|
||||||
|
|
||||||
```
|
|
||||||
for each reference in colette.yaml:
|
|
||||||
1. If the field contains a path (has / or .yaml) → use as-is, verify exists
|
|
||||||
2. If the field contains an ID (e.g., "vp-giesing-gschichten-v1"):
|
|
||||||
a. Check ./profiles/<id>.yaml (or ./personas/<id>.yaml)
|
|
||||||
b. Check ../writing.colette/profiles/<id>.yaml (or ../writing.colette/personas/<id>.yaml)
|
|
||||||
c. If not found → warn in event log, skip this file
|
|
||||||
3. For characters/ → glob characters/*.yaml in project root
|
|
||||||
4. For CLAUDE.md → check project root
|
|
||||||
```
|
|
||||||
|
|
||||||
If a referenced file cannot be found at any location, emit a warning event but do not abort:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
./lib/archeflow-event.sh "$RUN_ID" decision init "" \
|
|
||||||
'{"what":"colette_bridge_warning","chosen":"skip","file":"vp-giesing-gschichten-v1","reason":"not found in any search path"}'
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Context Bundle
|
## Context Bundle
|
||||||
|
|
||||||
The bridge generates `.archeflow/context/colette-bundle.md` — a summarized, token-efficient Markdown file that agents receive as part of their prompt context.
|
Generated at `.archeflow/context/colette-bundle.md`. Summarized, not raw YAML. Target: under 1500 tokens.
|
||||||
|
|
||||||
### Bundle Structure
|
**Summarization rules:**
|
||||||
|
- Voice dimensions: key + value (no YAML wrapper)
|
||||||
```markdown
|
- Verboten/erlaubt: bullet list, truncate items over 15 words
|
||||||
# Writing Context (auto-loaded from Colette)
|
|
||||||
|
|
||||||
## Voice Profile: <id>
|
|
||||||
**Tone:** <tone_summary from meta>
|
|
||||||
**Perspective:** <perspektive>
|
|
||||||
**Density:** <dichte>
|
|
||||||
**Attitude:** <haltung>
|
|
||||||
**Sharpness:** <schaerfe>
|
|
||||||
**Humor:** <humor>
|
|
||||||
**Tempo:** <tempo>
|
|
||||||
**Reader relationship:** <leser_beziehung>
|
|
||||||
|
|
||||||
### Forbidden
|
|
||||||
- <each item from verboten>
|
|
||||||
|
|
||||||
### Allowed
|
|
||||||
- <each item from erlaubt>
|
|
||||||
|
|
||||||
### Style models
|
|
||||||
- <each item from vorbilder, name only + one-word tag>
|
|
||||||
|
|
||||||
## Persona: <id>
|
|
||||||
**Name:** <name>
|
|
||||||
**Bio:** <bio, max 2 sentences>
|
|
||||||
**Genres:** <genres, comma-separated>
|
|
||||||
|
|
||||||
### Rules
|
|
||||||
- <each item from rules>
|
|
||||||
|
|
||||||
## Characters
|
|
||||||
### <name> (<role>)
|
|
||||||
- **Age:** <age>
|
|
||||||
- **Key traits:** <first 3 personality items>
|
|
||||||
- **Speech:** <speech_pattern, first sentence only>
|
|
||||||
- **Relationships:** <key relationships, one line each>
|
|
||||||
|
|
||||||
[Repeated for each character in characters/*.yaml]
|
|
||||||
|
|
||||||
## Series Context
|
|
||||||
[Only if series config found in colette.yaml]
|
|
||||||
- **Shared concepts:** <list>
|
|
||||||
- **Glossary:** <key terms>
|
|
||||||
- **Forbidden cross-story:** <items>
|
|
||||||
|
|
||||||
## Project Rules (from CLAUDE.md)
|
|
||||||
[Key writing rules extracted from CLAUDE.md, summarized as bullet points]
|
|
||||||
- <rule 1>
|
|
||||||
- <rule 2>
|
|
||||||
- ...
|
|
||||||
```
|
|
||||||
|
|
||||||
### Summarization Rules
|
|
||||||
|
|
||||||
The bundle is **summarized**, not a raw YAML dump. This reduces token cost:
|
|
||||||
|
|
||||||
- Voice profile dimensions: key name + value (no YAML formatting, no `dimensionen:` wrapper)
|
|
||||||
- Verboten/erlaubt: bullet list, strip explanation after the dash if over 15 words
|
|
||||||
- Characters: name, role, age, top 3 traits, first sentence of speech pattern, relationships
|
- Characters: name, role, age, top 3 traits, first sentence of speech pattern, relationships
|
||||||
- Persona bio: max 2 sentences
|
- Persona bio: max 2 sentences
|
||||||
- CLAUDE.md: extract only rules/style sections, skip meta/git/cost config
|
- CLAUDE.md: only writing rules, skip meta/git/cost config
|
||||||
- Target: bundle should be under 1500 tokens for a typical project
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Caching
|
## Caching
|
||||||
|
|
||||||
The bundle is regenerated only when source files have changed. Cache validation uses file modification times.
|
The bundle is regenerated only when at least one source file has an mtime newer than the bundle. If every source is older, the cached bundle is reused.
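A minimal sketch of the mtime check, assuming a shell whose `[` supports the `-nt` ("newer than") operator, as bash and dash do. The function name `bundle_is_stale` is hypothetical.

```bash
# Hypothetical helper (not part of ArcheFlow): exit 0 when the bundle must be regenerated.
# $1 = bundle path, remaining arguments = resolved source files
bundle_is_stale() {
  bundle="$1"; shift
  # No bundle yet: always generate.
  [ -f "$bundle" ] || return 0
  for src in "$@"; do
    # Any source modified after the bundle invalidates the cache.
    if [ "$src" -nt "$bundle" ]; then return 0; fi
  done
  return 1
}
```

Usage would look like `bundle_is_stale .archeflow/context/colette-bundle.md colette.yaml CLAUDE.md characters/*.yaml && echo regenerate`.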
|
||||||
|
|
||||||
### Cache Check Procedure
|
|
||||||
|
|
||||||
```
|
|
||||||
bundle_path = .archeflow/context/colette-bundle.md
|
|
||||||
|
|
||||||
if bundle_path does not exist → generate
|
|
||||||
if bundle_path exists:
|
|
||||||
bundle_mtime = mtime of bundle_path
|
|
||||||
for each resolved source file:
|
|
||||||
if source_mtime > bundle_mtime → regenerate, break
|
|
||||||
if no source file is newer → use cached bundle
|
|
||||||
```
|
|
||||||
|
|
||||||
When the cache is valid, emit:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
./lib/archeflow-event.sh "$RUN_ID" decision init "" \
|
|
||||||
'{"what":"colette_bundle_cache","chosen":"reuse","reason":"all sources older than bundle"}'
|
|
||||||
```
|
|
||||||
|
|
||||||
When regenerating:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
./lib/archeflow-event.sh "$RUN_ID" decision init "" \
|
|
||||||
'{"what":"colette_bundle_cache","chosen":"regenerate","reason":"<file> modified since last bundle"}'
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Per-Agent Attention Filters
|
## Per-Agent Attention Filters
|
||||||
|
|
||||||
Not every agent needs the full bundle. The bridge defines attention filters that control which sections each archetype receives. This extends the base attention filters from `archeflow:attention-filters`.
|
Not every agent needs the full bundle:
|
||||||
|
|
||||||
| Archetype | Bundle sections injected | Rationale |
|
| Archetype | Receives |
|
||||||
|-----------|------------------------|-----------|
|
|-----------|----------|
|
||||||
| **Explorer** | Full bundle | Needs all context for research — setting, characters, voice, rules |
|
| Explorer | Full bundle |
|
||||||
| **Creator** | Voice dimensions + persona rules + characters | Designs outline — needs to know who speaks how, who exists, what's allowed |
|
| Creator | Voice dimensions + persona rules + characters |
|
||||||
| **Maker** | Full bundle | Writes prose — needs voice for style, characters for dialogue, rules for guardrails |
|
| Maker | Full bundle |
|
||||||
| **Guardian** | Characters + series shared_concepts | Checks consistency — needs character facts and cross-story constraints |
|
| Guardian | Characters + series shared_concepts |
|
||||||
| **Sage** | Voice profile (full, including verboten/erlaubt) + persona rules | Checks voice drift — needs the complete voice spec and persona constraints |
|
| Sage | Full voice profile (incl. verboten/erlaubt) + persona rules |
|
||||||
| **Trickster** | Characters + series glossary | Tests continuity — needs character facts and terminology for contradiction checks |
|
| Trickster | Characters + series glossary |
|
||||||
|
|
||||||
### Filter Implementation
|
Custom archetypes inherit the filter of their closest base archetype. Override with `colette_filter` in archetype frontmatter:
|
||||||
|
|
||||||
When injecting the bundle into an agent prompt, extract only the relevant sections:
|
|
||||||
|
|
||||||
```
|
|
||||||
# For Guardian:
|
|
||||||
Extract: "## Characters" section (all characters)
|
|
||||||
Extract: "## Series Context" section (if present)
|
|
||||||
Skip: everything else
|
|
||||||
|
|
||||||
# For Sage:
|
|
||||||
Extract: "## Voice Profile" section (full, with forbidden/allowed)
|
|
||||||
Extract: "## Persona" section (rules subsection)
|
|
||||||
Skip: characters, series, project rules
|
|
||||||
|
|
||||||
# For Explorer and Maker:
|
|
||||||
Inject: full bundle as-is
|
|
||||||
```
|
|
||||||
|
|
||||||
The filtering happens at prompt assembly time, not at bundle generation time. One bundle, multiple filtered views.
|
|
||||||
|
|
||||||
### Custom Archetypes
|
|
||||||
|
|
||||||
Custom archetypes (e.g., `story-explorer`, `story-sage`) inherit the filter of their closest base archetype:
|
|
||||||
|
|
||||||
| Custom archetype | Inherits filter from | Override |
|
|
||||||
|-----------------|---------------------|----------|
|
|
||||||
| `story-explorer` | Explorer | Full bundle |
|
|
||||||
| `story-sage` | Sage | Full voice profile + persona rules |
|
|
||||||
| `story-guardian` | Guardian | Characters + series |
|
|
||||||
|
|
||||||
If a custom archetype needs a different filter, define it in the archetype's markdown frontmatter:
|
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
---
|
|
||||||
name: story-sage
|
|
||||||
colette_filter: [voice_profile, persona, characters]
|
colette_filter: [voice_profile, persona, characters]
|
||||||
---
|
|
||||||
```
|
```
|
||||||
|
|
||||||
The `colette_filter` field accepts section keys: `voice_profile`, `persona`, `characters`, `series`, `project_rules`, `full`.
|
Section keys: `voice_profile`, `persona`, `characters`, `series`, `project_rules`, `full`.
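Filtering at prompt-assembly time amounts to cutting top-level sections out of the single cached bundle. The sketch below assumes each section key maps to a `## ` heading in the bundle (for example `characters` to `## Characters`); that mapping and the name `extract_bundle_section` are assumptions made for illustration.

```bash
# Hypothetical helper (not part of ArcheFlow): print one top-level section of the bundle.
# $1 = bundle path, $2 = heading prefix, e.g. "## Characters"
extract_bundle_section() {
  awk -v want="$2" '
    # A new "## " heading starts a section; keep it only if it begins with the wanted prefix.
    /^## / { keep = (index($0, want) == 1) }
    keep { print }
  ' "$1"
}
```

A Guardian view would then be the concatenation of `extract_bundle_section "$bundle" "## Characters"` and `extract_bundle_section "$bundle" "## Series Context"`, while Explorer and Maker receive the file unchanged.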
|
||||||
|
|
||||||
---
|
## Run Integration
|
||||||
|
|
||||||
## Integration with Run Skill
|
|
||||||
|
|
||||||
The Colette Bridge hooks into `archeflow:run` initialization. The sequence is:
|
|
||||||
|
|
||||||
```
|
```
|
||||||
run.start
|
run.start
|
||||||
├── Domain detection (from archeflow:domains)
|
+-- Domain detection -> colette.yaml found -> domain = writing
|
||||||
│ └── colette.yaml found → domain = writing
|
+-- Colette Bridge activation
|
||||||
├── Colette Bridge activation
|
| +-- Resolve files
|
||||||
│ ├── Resolve files (voice profile, persona, characters, CLAUDE.md)
|
| +-- Check/refresh bundle cache
|
||||||
│ ├── Check bundle cache
|
| +-- Register bundle in artifact routing
|
||||||
│ ├── Generate/refresh bundle → .archeflow/context/colette-bundle.md
|
+-- Continue to Plan phase
|
||||||
│ └── Register bundle path in artifact routing
|
|
||||||
└── Continue to Plan phase
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Artifact Routing Registration
|
**Prompt injection order:**
|
||||||
|
1. Archetype definition
|
||||||
The bundle path is registered so that every phase's context injection includes the (filtered) bundle:
|
2. Domain-specific review focus
|
||||||
|
|
||||||
```
|
|
||||||
artifact_routing.register_context(
|
|
||||||
path = ".archeflow/context/colette-bundle.md",
|
|
||||||
inject_at = "all_phases",
|
|
||||||
filter_by = "archetype" # Apply per-agent attention filters
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
In practice, this means the run skill prepends the filtered bundle content to each agent's prompt, after the standard task description but before phase-specific artifacts.
|
|
||||||
|
|
||||||
### Prompt Injection Order
|
|
||||||
|
|
||||||
```
|
|
||||||
1. Archetype definition (from SKILL.md or custom archetype .md)
|
|
||||||
2. Domain-specific review focus (from archeflow:domains)
|
|
||||||
3. Colette bundle (filtered for this archetype)
|
3. Colette bundle (filtered for this archetype)
|
||||||
4. Task description
|
4. Task description
|
||||||
5. Phase-specific artifacts (Explorer output, Creator proposal, etc.)
|
5. Phase-specific artifacts
|
||||||
6. Cycle feedback (if cycle 2+)
|
6. Cycle feedback (if cycle 2+)
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Example: Giesing Gschichten
|
|
||||||
|
|
||||||
Given this `colette.yaml`:
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
project:
|
|
||||||
name: "Giesing Gschichten"
|
|
||||||
author: "C. Nennemann"
|
|
||||||
language: de
|
|
||||||
type: fiction
|
|
||||||
|
|
||||||
voice:
|
|
||||||
profile: vp-giesing-gschichten-v1
|
|
||||||
|
|
||||||
writing:
|
|
||||||
target_words: 6000
|
|
||||||
style: "Ich-Erzaehler, lakonisch, Eberhofer-meets-Grossstadt"
|
|
||||||
```
|
|
||||||
|
|
||||||
The bridge:
|
|
||||||
|
|
||||||
1. Reads `voice.profile: vp-giesing-gschichten-v1`
|
|
||||||
2. Searches for `./profiles/vp-giesing-gschichten-v1.yaml` — not found
|
|
||||||
3. Searches for `../writing.colette/profiles/vp-giesing-gschichten-v1.yaml` — found
|
|
||||||
4. Infers persona from voice profile ID pattern or searches `personas/` — finds `giesinger.yaml` at `../writing.colette/personas/giesinger.yaml`
|
|
||||||
5. Globs `characters/*.yaml` — finds `alex.yaml` (and others if present)
|
|
||||||
6. Reads `CLAUDE.md` for writing rules
|
|
||||||
7. Generates bundle:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
# Writing Context (auto-loaded from Colette)
|
|
||||||
|
|
||||||
## Voice Profile: vp-giesing-gschichten-v1
|
|
||||||
**Tone:** Lakonisch, warmherzig-genervt, trockener Humor
|
|
||||||
**Perspective:** Ich-Erzaehler (Alex), nah dran, subjektiv
|
|
||||||
**Density:** Alltagsdetails die Atmosphaere schaffen
|
|
||||||
**Attitude:** Lakonisch, leicht genervt, aber mit Herz
|
|
||||||
**Sharpness:** Beobachtungsscharf, sprachlich reduziert
|
|
||||||
**Humor:** Trocken, Understatement, absurde Situationen
|
|
||||||
**Tempo:** Gemaechlich mit Spannungsspitzen, Slow Burn
|
|
||||||
**Reader relationship:** Kumpel am Stammtisch
|
|
||||||
|
|
||||||
### Forbidden
|
|
||||||
- Hochdeutsch-Sterilitaet
|
|
||||||
- Krimi-Klischees (CSI, Profiler, Tatort)
|
|
||||||
- Lederhosen-Kitsch und Oktoberfest-Folklore
|
|
||||||
- Dialekt-Overkill
|
|
||||||
- Moralisieren oder Erklaeren
|
|
||||||
- Kuenstliche Spannungsaufbauten
|
|
||||||
- Adverb-Orgien und Adjektiv-Ketten
|
|
||||||
- Infodumps
|
|
||||||
|
|
||||||
### Allowed
|
|
||||||
- Bairische Einsprengsel in Hochdeutsch-Prosa
|
|
||||||
- Essen und Trinken als Leitmotiv
|
|
||||||
- Kiffer-Humor und Slow-Motion-Beobachtungen
|
|
||||||
- Gentrification-Satire
|
|
||||||
- Echte Giesinger Orte und Strassen
|
|
||||||
- Skurrile Nachbarn
|
|
||||||
- Kriminalplot aus dem Alltag
|
|
||||||
- Kurze, lakonische Dialoge
|
|
||||||
|
|
||||||
### Style models
|
|
||||||
- Rita Falk (Erzaehlton), Wolf Haas (lakonisch), Helmut Dietl (Muenchner Milieu), Friedrich Ani (duester), Bukowski (Anti-Held)
|
|
||||||
|
|
||||||
## Persona: giesinger
|
|
||||||
**Name:** Der Giesinger
|
|
||||||
**Bio:** Erzaehlt Geschichten aus Muenchen-Giesing. Eberhofer meets Grossstadt.
|
|
||||||
**Genres:** Krimi, Kurzgeschichte, Milieustudie
|
|
||||||
|
|
||||||
### Rules
|
|
||||||
- Ich-Erzaehler, immer — Alex erzaehlt
|
|
||||||
- Hauptsaechlich Hochdeutsch mit bairischen Einsprengsel
|
|
||||||
- Jede Geschichte hat einen Kriminalplot
|
|
||||||
- Essen/Trinken in jeder Geschichte
|
|
||||||
- Echte Giesinger Orte und Strassen
|
|
||||||
- Humor durch Understatement
|
|
||||||
- Alex ist kein Ermittler
|
|
||||||
- Figuren reden wie echte Menschen
|
|
||||||
|
|
||||||
## Characters
|
|
||||||
### Alex (protagonist)
|
|
||||||
- **Age:** Mitte 30
|
|
||||||
- **Key traits:** Lakonisch, funktionaler Kiffer, unmotiviert aber nicht dumm
|
|
||||||
- **Speech:** Kurze Saetze, Hochdeutsch mit bairischen Einsprengsel.
|
|
||||||
- **Relationships:** Mo — Nachbar, Kumpel und Unruhestifter
|
|
||||||
|
|
||||||
## Project Rules (from CLAUDE.md)
|
|
||||||
- Jede Geschichte beginnt mit einer Alltagsszene
|
|
||||||
- Kriminalplot ergibt sich organisch aus dem Alltag
|
|
||||||
- Essen/Trinken in jeder Geschichte
|
|
||||||
- Echte Giesinger Orte verwenden
|
|
||||||
- Kein Moralisieren, kein Erklaerbaer
|
|
||||||
- Ende muss nicht alles aufloesen
|
|
||||||
```

---

## Design Principles

1. **Summarize, don't dump.** Raw YAML wastes tokens and confuses agents. The bundle is a curated briefing.
2. **Cache aggressively.** Voice profiles and characters rarely change mid-run. Only regenerate when mtimes change.
3. **Filter per agent.** A Guardian checking plot consistency does not need the full voice profile. A Sage checking voice drift does not need character sheets.
4. **Graceful degradation.** Missing files are warned about, not fatal. A project with `colette.yaml` but no characters/ still works; the Characters section is simply empty.
5. **One bundle, filtered views.** Generate the full bundle once. Filter at injection time per archetype. This keeps caching simple.
6. **Additive to existing skills.** The bridge does not replace domain detection or artifact routing; it hooks into them. Remove the bridge and everything still works (just without auto-loaded writing context).
|
|||||||
@@ -1,249 +0,0 @@
|
|||||||
---
name: convergence
description: |
  Detects convergence, stalling, and oscillation in multi-cycle PDCA runs. Prevents wasted cycles
  by stopping early when findings are not being resolved or are bouncing between cycles.
  <example>Automatically loaded during Act phase before exit decision</example>
  <example>User: "Is the run converging?"</example>
---

# Convergence Detection

In multi-cycle PDCA runs, the Act phase must decide whether another cycle will help or just waste tokens. This skill provides the analysis: are findings being resolved (converging), staying the same (stalling), or bouncing back (oscillating)?

## When It Runs

Convergence analysis runs **after the Check phase completes and before the Act phase exit decision**. It requires at least 2 cycles of data; on cycle 1 it is skipped because there is no comparison baseline.

```
Check phase → Convergence Analysis → Act phase exit decision
```

---

## Step 1: Finding Comparison

Extract findings from the current cycle and compare against the previous cycle.

### Data Sources

- **Current cycle findings:** Parsed from `check-*.md` artifacts in `.archeflow/artifacts/<run_id>/`
- **Previous cycle findings:** Parsed from `check-*.md` artifacts in `.archeflow/artifacts/<run_id>/cycle-<N-1>/`

Each finding is identified by a composite key: `source + category + file_location + description_keywords`.

### Finding Categories

Every finding seen in the current or previous cycle is classified into exactly one category:

| Category | Definition |
|----------|------------|
| **NEW** | Finding not present in any previous cycle |
| **RESOLVED** | Was present in the previous cycle, absent in the current cycle |
| **PERSISTENT** | Present in both the current and previous cycle (same key) |
| **REGRESSED** | Was RESOLVED in the previous cycle (present in N-2, absent in N-1) but returned in the current cycle |

### Matching Algorithm

Two findings match if:
1. Same `source` archetype (guardian, sage, etc.)
2. Same `category` (security, reliability, quality, etc.)
3. Same or overlapping file location (same file, line within 10 lines)
4. 50%+ keyword overlap in description (lowercase, strip punctuation)

All four conditions must hold. This prevents false matches across unrelated findings.
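
The four matching conditions can be sketched in Python as follows. This is a minimal illustration, not the plugin's implementation: the `Finding` fields, the function names, and the choice to measure keyword overlap against the smaller of the two keyword sets are all assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; field names are assumptions."""
    source: str       # archetype that reported it, e.g. "guardian"
    category: str     # e.g. "security"
    file: str         # file the finding points at
    line: int         # line number within that file
    description: str  # free-text description


def keywords(text: str) -> set[str]:
    """Lowercase the text, strip punctuation, and return the set of words."""
    return set(re.sub(r"[^\w\s]", " ", text.lower()).split())


def keyword_overlap(a: str, b: str) -> float:
    """Share of the smaller keyword set that also appears in the other set."""
    ka, kb = keywords(a), keywords(b)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / min(len(ka), len(kb))


def findings_match(a: Finding, b: Finding,
                   max_line_gap: int = 10, min_overlap: float = 0.5) -> bool:
    """True only if all four conditions from the list above hold."""
    return (
        a.source == b.source
        and a.category == b.category
        and a.file == b.file
        and abs(a.line - b.line) <= max_line_gap
        and keyword_overlap(a.description, b.description) >= min_overlap
    )
```

Requiring all four conditions keeps the check cheap while making accidental matches between unrelated findings unlikely.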

---

## Step 2: Convergence Score

Calculate a convergence score from the categorized findings:

```
convergence = resolved_count / (resolved_count + new_count + regressed_count)
```

If the denominator is 0 (no resolved, new, or regressed findings, only persistent ones), the score is `0.0`: the run is stuck, not converging.
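
As a small sketch of the formula and its zero-denominator rule (the function name is an assumption for illustration):

```python
def convergence_score(resolved: int, new: int, regressed: int) -> float:
    """resolved / (resolved + new + regressed), or 0.0 when nothing changed."""
    denominator = resolved + new + regressed
    if denominator == 0:
        # Only persistent findings remain: no movement in either direction.
        return 0.0
    return resolved / denominator
```

For example, 3 resolved, 1 new, and 0 regressed findings give 3 / 4 = 0.75, while 1 resolved, 2 new, and 1 regressed give 1 / 4 = 0.25.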

### Score Interpretation

| Score Range | Status | Meaning |
|-------------|--------|---------|
| > 0.8 | **Converging** | Most issues being resolved, few new ones introduced |
| 0.5 - 0.8 | **Stalling** | Fixing roughly as many as introducing |
| < 0.5 | **Diverging** | Making things worse: more new/regressed than resolved |
| 0.0 (all persistent) | **Stuck** | No progress in either direction |
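
The table can be read as a small classification function. The sketch below assumes that "stuck" is decided from the counts (nothing resolved, new, or regressed) rather than from the score alone, since a score of 0.0 can also occur when findings are only being added; the function name and the lowercase status strings are illustrative.

```python
def convergence_status(score: float, resolved: int, new: int, regressed: int) -> str:
    """Map a convergence score to the status labels from the table above."""
    if resolved + new + regressed == 0:
        return "stuck"        # only persistent findings
    if score > 0.8:
        return "converging"
    if score >= 0.5:
        return "stalling"     # covers the 0.5 - 0.8 band, boundaries included
    return "diverging"
```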

---

## Step 3: Oscillation Detection

An oscillating finding is one that bounces between resolved and re-introduced across cycles:

1. Finding was present in cycle N-2
2. Finding was absent in cycle N-1 (resolved)
3. Finding is present again in cycle N (regressed)

This indicates the fix in cycle N-1 was undone or invalidated by other changes in cycle N.

### Oscillation Rules

- A single oscillating finding: **flag it** in the convergence report but continue.
- Two or more oscillating findings: **STOP** and escalate to the user.
  - Message: `"Findings X and Y are oscillating between cycles. Manual intervention needed — the automated fixes are interfering with each other."`

Oscillation tracking requires 3+ cycles of data. On cycles 1-2, oscillation detection is skipped.
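
A possible sketch of the oscillation check, assuming each cycle has already been reduced to a set of stable finding keys (the result of the matching step). The function names and the returned action labels are hypothetical.

```python
def oscillating_findings(cycles: list[set[str]]) -> set[str]:
    """cycles: finding-key sets per cycle, oldest first.

    A finding oscillates if it was present in N-2, absent in N-1,
    and present again in N. With fewer than 3 cycles nothing is reported.
    """
    if len(cycles) < 3:
        return set()
    n_minus_2, n_minus_1, n = cycles[-3], cycles[-2], cycles[-1]
    return (n_minus_2 & n) - n_minus_1


def oscillation_action(oscillating: set[str]) -> str:
    """Apply the oscillation rules: flag one, stop on two or more."""
    if len(oscillating) >= 2:
        return "stop"
    if len(oscillating) == 1:
        return "flag"
    return "continue"
```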

---

## Step 4: Early Termination Rules

The convergence analysis can override the normal Act phase exit decision. If any of these conditions hold, the recommendation is **STOP**:

| Condition | Threshold | Recommendation |
|-----------|-----------|----------------|
| Diverging | Score < 0.5 for 2 consecutive cycles | STOP: changes are making things worse |
| Stalled | 0 findings resolved between cycles | STOP: no progress, further cycles will not help |
| Stuck | All findings are PERSISTENT for 2 consecutive cycles | STOP: automated fixes cannot resolve these |
| Oscillating | 2+ findings oscillating | STOP: fixes are interfering with each other |

When STOP is recommended, the Act phase should:
1. **Not** start another PDCA cycle
2. Report all unresolved findings to the user
3. Present the best implementation so far (on its branch, not merged)
4. Include the convergence report explaining why the run was stopped
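
The four STOP conditions from the table can be sketched as one function over per-cycle summaries. This is an illustration under assumptions: `CycleSummary`, its fields, and the order in which the conditions are checked are not specified by the skill. "Stuck" is tested before "diverging" here because an all-persistent cycle also scores 0.0, and "stalled" follows the table literally (zero findings resolved).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CycleSummary:
    """Hypothetical per-cycle convergence summary."""
    score: float
    resolved: int
    new: int
    regressed: int
    persistent: int
    oscillating: int


def all_persistent(c: CycleSummary) -> bool:
    """True when the cycle has findings but none were resolved, new, or regressed."""
    return c.persistent > 0 and c.resolved == c.new == c.regressed == 0


def stop_reason(history: list[CycleSummary]) -> Optional[str]:
    """history: one summary per analysed cycle, oldest first.

    Returns the name of the first STOP condition that holds, or None to continue.
    """
    if not history:
        return None
    current = history[-1]
    last_two = history[-2:] if len(history) >= 2 else []
    if current.oscillating >= 2:
        return "oscillating"
    if last_two and all(all_persistent(c) for c in last_two):
        return "stuck"
    if last_two and all(c.score < 0.5 for c in last_two):
        return "diverging"
    if current.resolved == 0:
        return "stalled"
    return None
```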

### Override Behavior

The convergence STOP recommendation overrides the normal cycle-back logic in the Act phase. Even if `CYCLE < MAX_CYCLES` and there are fixable-looking findings, if convergence says STOP, the run stops.

The user can always override by explicitly requesting another cycle: `"Run one more cycle anyway"`.

---

## Step 5: Integration with Act Phase

### Event Data

Convergence data is included in the `cycle.boundary` event emitted by the Act phase:

```json
{
  "type": "cycle.boundary",
  "phase": "act",
  "data": {
    "cycle": 2,
    "max_cycles": 3,
    "exit_condition": "convergence_stop",
    "met": false,
    "fixes_applied": 2,
    "next_action": "stop",
    "convergence": {
      "score": 0.25,
      "status": "diverging",
      "resolved": 1,
      "new": 2,
      "regressed": 1,
      "persistent": 3,
      "oscillating": ["Timeline reference mismatch"],
      "recommendation": "stop",
      "reason": "Diverging for 2 consecutive cycles"
    }
  }
}
```

### Decision Tree Update

The Act phase decision tree (from `act-phase` skill Step 4) gains a new first branch:

```
┌─ Convergence analysis (cycle 2+)
│
├─ Convergence says STOP
│   └─ STOP: Report to user with convergence report
│
├─ Convergence says CONTINUE
│   └─ Fall through to normal exit decision logic
│
└─ Cycle 1 (no convergence data)
    └─ Fall through to normal exit decision logic
```
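
Read as code, the new first branch might look like the sketch below. The function name, the `normal_exit_decision` callback, and the `user_override` flag (standing in for an explicit "Run one more cycle anyway") are assumptions for illustration; the return values of the normal logic are whatever the Act phase already uses.

```python
from typing import Callable, Optional


def act_decision(cycle: int,
                 convergence_recommendation: Optional[str],
                 normal_exit_decision: Callable[[], str],
                 user_override: bool = False) -> str:
    """Apply the convergence branch first, then fall through to normal logic."""
    has_convergence_data = cycle >= 2 and convergence_recommendation is not None
    if has_convergence_data and convergence_recommendation == "stop" and not user_override:
        return "stop"
    # Cycle 1, a CONTINUE recommendation, or an explicit user override.
    return normal_exit_decision()
```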

### Act Feedback Enhancement

When the Act phase builds `act-feedback.md` for the next cycle, it includes the convergence summary at the top:

```markdown
## Convergence Analysis (Cycle 1 → 2)

Score: 0.75 (stalling)
Resolved: 3 | New: 1 | Regressed: 0 | Persistent: 2

Recommendation: Continue (findings are still being resolved)

### Finding Status
| Finding | Status | Cycles |
|---------|--------|--------|
| SQL injection in user input | RESOLVED | 1 |
| Missing rate limit | RESOLVED | 1 |
| Test names unclear | RESOLVED | 1 |
| Null check missing in parser | PERSISTENT | 2 |
| Error path not tested | PERSISTENT | 2 |
| Unused import introduced | NEW | 1 |
```

---

## Convergence Report Format

The full convergence report is generated as part of the orchestration output:

```markdown
## Convergence Analysis (Cycle N-1 → N)

**Score:** 0.75 (stalling)
**Resolved:** 3 | **New:** 1 | **Regressed:** 0 | **Persistent:** 2 | **Oscillating:** 0

### Resolved This Cycle
| Source | Category | Description |
|--------|----------|-------------|
| guardian | security | SQL injection in user input handler |
| guardian | reliability | Missing rate limit on auth endpoint |
| sage | quality | Test names don't describe behavior |

### New This Cycle
| Source | Category | Description |
|--------|----------|-------------|
| sage | quality | Unused import introduced by fix |

### Persistent (unresolved across cycles)
| Source | Category | Description | Cycles Open |
|--------|----------|-------------|-------------|
| trickster | reliability | Null check missing in parser | 2 |
| sage | testing | Error path not tested | 2 |

### Oscillating
(none)

**Recommendation:** Continue (findings are still being resolved)
```

---

## Integration with Memory Skill

When convergence detects PERSISTENT findings (present for 2+ cycles), these are strong candidates for the `memory` skill's lesson extraction:

- After a run that had persistent findings, `archeflow-memory.sh extract` will pick these up with higher confidence (they have been confirmed across multiple cycles within a single run).
- Persistent findings that also appear in `lessons.jsonl` from prior runs get a double frequency boost (cross-cycle within run + cross-run pattern).

---

## Design Principles

1. **Conservative stopping.** Requires 2 consecutive data points before recommending STOP. A single bad cycle might be noise.
2. **User has final say.** STOP is a recommendation, not an enforced shutdown. The user can override.
3. **Cheap computation.** Keyword matching on finding descriptions, simple arithmetic on counts. No ML, no embeddings.
4. **Bounded scope.** Only compares adjacent cycles (N vs N-1, with N-2 for oscillation). Does not attempt to model long-term trends across many cycles.
5. **Observable.** All convergence data is included in the `cycle.boundary` event, making it available for post-hoc analysis via the process log.
@@ -8,320 +8,87 @@ description: |
|
|||||||
<example>Automatically active when budget is configured</example>
|
<example>Automatically active when budget is configured</example>
|
||||||
---
|
---
|
||||||
|
|
||||||
# Cost Tracking — Budget-Aware Orchestration
|
# Cost Tracking -- Budget-Aware Orchestration
|
||||||
|
|
||||||
Every ArcheFlow orchestration consumes LLM tokens. This skill tracks costs per agent and per run, enforces budgets, and recommends cost-optimal model assignments.
|
Tracks costs per agent and per run, enforces budgets, and selects cost-optimal models.
|
||||||
|
|
||||||
## Model Pricing Table
|
## Model Pricing
|
||||||
|
|
||||||
Current pricing (update when models change):
|
| Model | Input ($/M tok) | Output ($/M tok) |
|
||||||
|
|-------|----------------:|-----------------:|
|
||||||
|
| claude-opus-4-6 | 15.00 | 75.00 |
|
||||||
|
| claude-sonnet-4-6 | 3.00 | 15.00 |
|
||||||
|
| claude-haiku-4-5 | 0.80 | 4.00 |
|
||||||
|
|
||||||
| Model | Input ($/M tokens) | Output ($/M tokens) | Notes |
|
**Prompt caching:** 90% discount on cached input tokens. Structure system prompts for cache hits.
|
||||||
|-------|--------------------:|---------------------:|-------|
|
**Batches API:** 50% discount. Use for non-time-sensitive bulk ops.
|
||||||
| `claude-opus-4-6` | 15.00 | 75.00 | Highest quality, use sparingly |
|
|
||||||
| `claude-sonnet-4-6` | 3.00 | 15.00 | Good balance of quality and cost |
|
|
||||||
| `claude-haiku-4-5` | 0.80 | 4.00 | Cheap, fast, good for structured tasks |
|
|
||||||
|
|
||||||
**Prompt caching** (when applicable): 90% discount on cached input tokens. The orchestrator should structure system prompts to maximize cache hits (archetype instructions, voice profiles, and domain context are cache-friendly since they repeat across agents in a run).
|
## Cost Calculation
|
||||||
|
|
||||||
**Batches API**: 50% discount on all tokens. Use for non-time-sensitive bulk operations (validation passes, consistency checks).
|
|
||||||
|
|
||||||
## Per-Agent Cost Tracking
|
|
||||||
|
|
||||||
Every `agent.complete` event includes cost data:
|
|
||||||
|
|
||||||
```jsonl
|
|
||||||
{
|
|
||||||
"type": "agent.complete",
|
|
||||||
"data": {
|
|
||||||
"archetype": "story-explorer",
|
|
||||||
"duration_ms": 87605,
|
|
||||||
"tokens_input": 15000,
|
|
||||||
"tokens_output": 6000,
|
|
||||||
"tokens_cache_read": 8000,
|
|
||||||
"model": "haiku",
|
|
||||||
"estimated_cost_usd": 0.02,
|
|
||||||
"summary": "3 plot directions developed, recommended C"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Cost Calculation
|
|
||||||
|
|
||||||
```
|
```
|
||||||
cost = (tokens_input - tokens_cache_read) * input_price / 1_000_000
|
cost = (input - cache_read) * input_price/1M
|
||||||
+ tokens_cache_read * input_price * 0.10 / 1_000_000
|
+ cache_read * input_price * 0.10/1M
|
||||||
+ tokens_output * output_price / 1_000_000
|
+ output * output_price/1M
|
||||||
```
|
```
|
||||||
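
The cost formula can be sketched in Python as below. The prices are copied from the pricing table in this skill; the function and constant names are assumptions, and the `chars / 4` estimate is the rough heuristic described in the text, not a tokenizer.

```python
# (input, output) price in USD per million tokens, from the pricing table.
PRICES_USD_PER_MTOK = {
    "claude-opus-4-6": (15.00, 75.00),
    "claude-sonnet-4-6": (3.00, 15.00),
    "claude-haiku-4-5": (0.80, 4.00),
}
CACHE_READ_FACTOR = 0.10  # cached input tokens cost 10% of the input price


def agent_cost_usd(model: str, tokens_input: int, tokens_output: int,
                   tokens_cache_read: int = 0) -> float:
    """Cost of one agent call: uncached input + discounted cached input + output."""
    input_price, output_price = PRICES_USD_PER_MTOK[model]
    uncached_input = tokens_input - tokens_cache_read
    return (
        uncached_input * input_price
        + tokens_cache_read * input_price * CACHE_READ_FACTOR
        + tokens_output * output_price
    ) / 1_000_000


def estimate_tokens(text: str) -> int:
    """Fallback when exact counts are unavailable: roughly 4 characters per token."""
    return len(text) // 4
```

With 15,000 input tokens (8,000 of them cache reads) and 6,000 output tokens on `claude-haiku-4-5`, this gives 0.0056 + 0.00064 + 0.024, about $0.03.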
|
|
||||||
If exact token counts are unavailable (Claude Code doesn't always expose them), estimate based on character count:
|
If exact tokens unavailable, estimate: `tokens ~= chars / 4`. Mark with `cost_estimated: true`.
|
||||||
|
|
||||||
```
|
## Default Model Assignments
|
||||||
estimated_tokens = character_count / 4 # rough heuristic
|
|
||||||
```
|
|
||||||
|
|
||||||
Mark estimated costs with `"cost_estimated": true` in the event data so reports can distinguish measured from estimated values.
|
| Archetype | Code | Writing |
|
||||||
|
|-----------|------|---------|
|
||||||
|
| Explorer | haiku | haiku |
|
||||||
|
| Creator | sonnet | sonnet |
|
||||||
|
| Maker | sonnet | **sonnet** |
|
||||||
|
| Guardian | haiku | haiku |
|
||||||
|
| Skeptic | haiku | haiku |
|
||||||
|
| Sage | sonnet | **sonnet** |
|
||||||
|
| Trickster | haiku | haiku |
|
||||||
|
|
||||||
## Run-Level Aggregation
|
Opus is user-opt-in only (team preset `model_overrides`).
|
||||||
|
|
||||||
The `run.complete` event includes cost totals:
|
**Resolution order:** team preset override > domain override > archetype default.
|
||||||
|
|
||||||
```jsonl
|
## Pre-Agent Cost Estimates
|
||||||
{
|
|
||||||
"type": "run.complete",
|
|
||||||
"data": {
|
|
||||||
"status": "completed",
|
|
||||||
"total_tokens_input": 95000,
|
|
||||||
"total_tokens_output": 33000,
|
|
||||||
"total_tokens_cache_read": 42000,
|
|
||||||
"total_cost_usd": 1.45,
|
|
||||||
"budget_usd": 10.00,
|
|
||||||
"budget_remaining_usd": 8.55,
|
|
||||||
"agents_total": 5,
|
|
||||||
"cost_by_phase": {
|
|
||||||
"plan": 0.35,
|
|
||||||
"do": 0.72,
|
|
||||||
"check": 0.38
|
|
||||||
},
|
|
||||||
"cost_by_model": {
|
|
||||||
"haiku": 0.12,
|
|
||||||
"sonnet": 1.33
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Cost Summary in Orchestration Report
|
| Archetype | Typical Input | Typical Output |
|
||||||
|
|-----------|-------------:|---------------:|
|
||||||
|
| Explorer | 8k | 4k |
|
||||||
|
| Creator | 12k | 6k |
|
||||||
|
| Maker | 15k | 12k |
|
||||||
|
| Guardian | 10k | 3k |
|
||||||
|
| Skeptic | 8k | 3k |
|
||||||
|
| Sage | 12k | 4k |
|
||||||
|
| Trickster | 8k | 4k |
|
||||||
|
|
||||||
After each orchestration, the report includes a cost section:
|
After 10+ runs, use actual averages from `metrics.jsonl` instead.
|
||||||
|
|
||||||
```markdown
|
|
||||||
## Cost Summary
|
|
||||||
| Phase | Model(s) | Tokens (in/out) | Cost |
|
|
||||||
|-------|----------|-----------------|------|
|
|
||||||
| Plan | haiku, sonnet | 32k / 12k | $0.35 |
|
|
||||||
| Do | sonnet | 40k / 15k | $0.72 |
|
|
||||||
| Check | haiku, sonnet | 23k / 6k | $0.38 |
|
|
||||||
| **Total** | | **95k / 33k** | **$1.45** |
|
|
||||||
|
|
||||||
Budget: $10.00 | Spent: $1.45 | Remaining: $8.55
|
|
||||||
```
|
|
||||||
|
|
||||||
## Budget Configuration
|
## Budget Configuration
|
||||||
|
|
||||||
Budgets are defined in team presets or `.archeflow/config.yaml`:
|
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
# .archeflow/config.yaml
|
|
||||||
budget:
|
budget:
|
||||||
per_run_usd: 10.00 # Max cost per orchestration run
|
per_run_usd: 10.00
|
||||||
per_agent_usd: 3.00 # Max cost per individual agent
|
per_agent_usd: 3.00
|
||||||
daily_usd: 50.00 # Max daily spend across all runs
|
daily_usd: 50.00
|
||||||
warn_at_percent: 75 # Warn when this % of budget is consumed
|
warn_at_percent: 75
|
||||||
```
|
```
|
||||||
|
|
||||||
```yaml
|
Team preset budget overrides global config. No budget = unlimited (costs still tracked).
|
||||||
# Team preset override
|
|
||||||
name: story-development
|
|
||||||
domain: writing
|
|
||||||
budget:
|
|
||||||
per_run_usd: 5.00 # Writing runs are usually cheaper
|
|
||||||
```
|
|
||||||
|
|
||||||
Team preset budget overrides the global config for that run.
|
|
||||||
|
|
||||||
### Budget Precedence
|
|
||||||
|
|
||||||
1. Team preset `budget` (if set)
|
|
||||||
2. `.archeflow/config.yaml` `budget`
|
|
||||||
3. No budget (unlimited) — costs are still tracked but not enforced
|
|
||||||
|
|
||||||
## Budget Enforcement
|
## Budget Enforcement
|
||||||
|
|
||||||
Budget checks happen at two points:
|
**Pre-agent:** Estimate cost. If > remaining budget: stop (autonomous) or warn (attended).
|
||||||
|
|
||||||
### 1. Pre-Agent Check (before spawning)
|
**Post-agent:** Update total. Warn at threshold. Stop if budget exceeded.
|
||||||
|
|
||||||
Before each agent is spawned, estimate its cost and check against remaining budget:
|
## Cost Optimization
|
||||||
|
|
||||||
```
|
1. **Prompt caching:** Stable content first (archetype instructions, voice profiles). Saves 30-50% on input.
|
||||||
estimated_agent_cost = estimate_tokens(archetype, task_complexity) * model_price
|
2. **Guardian fast-path (A2):** 0 issues = skip remaining reviewers. Saves $0.30-0.80/cycle.
|
||||||
remaining_budget = budget - sum(costs_so_far)
|
3. **Explorer cache:** Reuse recent research. Saves $0.02-0.05/hit.
|
||||||
|
4. **Batches API:** For autonomous/overnight review passes (50% discount).
|
||||||
if estimated_agent_cost > remaining_budget:
|
5. **Early termination:** Clean Guardian + clean Maker self-review = skip remaining cycles.
|
||||||
WARN: "Estimated cost for {archetype} (${estimated}) would exceed remaining budget (${remaining}). Continue? [y/N]"
|
|
||||||
```
|
|
||||||
|
|
||||||
**In autonomous mode**: if budget would be exceeded, STOP the run and report. Do not prompt — there is no one to answer.
|
|
||||||
|
|
||||||
**In attended mode**: warn and ask the user. They can approve the overage or stop.
|
|
||||||
|
|
||||||
### 2. Post-Agent Check (after completion)
|
|
||||||
|
|
||||||
After each agent completes, update the running total and check:
|
|
||||||
|
|
||||||
```
|
|
||||||
if total_cost > budget * warn_at_percent / 100:
|
|
||||||
WARN: "Budget ${warn_at_percent}% consumed (${total_cost} of ${budget})"
|
|
||||||
|
|
||||||
if total_cost > budget:
|
|
||||||
STOP: "Budget exceeded (${total_cost} of ${budget}). Run halted."
|
|
||||||
```
|
|
||||||
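
A runnable sketch of the two budget checks described above. The function names and the returned action labels (`"proceed"`, `"ask_user"`, `"warn"`, and so on) are hypothetical; only the comparisons come from the text.

```python
def pre_agent_check(estimated_cost: float, spent: float, budget: float,
                    autonomous: bool) -> str:
    """Before spawning: compare the estimate against the remaining budget."""
    remaining = budget - spent
    if estimated_cost <= remaining:
        return "proceed"
    # Autonomous runs have nobody to prompt, so they stop; attended runs ask.
    return "stop" if autonomous else "ask_user"


def post_agent_check(spent: float, budget: float, warn_at_percent: float = 75) -> str:
    """After completion: stop when over budget, warn past the threshold."""
    if spent > budget:
        return "stop"
    if spent > budget * warn_at_percent / 100:
        return "warn"
    return "ok"
```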
|
|
||||||
### Pre-Agent Cost Estimation
|
|
||||||
|
|
||||||
Rough token estimates by archetype (calibrate over time with actual data from `metrics.jsonl`):
|
|
||||||
|
|
||||||
| Archetype | Typical Input | Typical Output | Notes |
|
|
||||||
|-----------|-------------:|---------------:|-------|
|
|
||||||
| Explorer | 8k | 4k | Research, reads many files |
|
|
||||||
| Creator | 12k | 6k | Receives Explorer output, produces plan |
|
|
||||||
| Maker | 15k | 12k | Largest output (implementation/prose) |
|
|
||||||
| Guardian | 10k | 3k | Reads diff, structured output |
|
|
||||||
| Skeptic | 8k | 3k | Reads proposal, structured challenges |
|
|
||||||
| Sage | 12k | 4k | Reads diff + proposal |
|
|
||||||
| Trickster | 8k | 4k | Reads diff, generates test cases |
|
|
||||||
|
|
||||||
These are starting estimates. After 10+ runs, use actual averages from `metrics.jsonl` instead.
|
|
||||||
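
The starting estimates can be turned into a pre-spawn cost estimate roughly as follows. The token figures and prices are taken from the tables in this skill; the dictionary layout, the short model names as keys, and the function name are assumptions, and caching discounts are ignored to keep the estimate conservative.

```python
# Typical (input, output) token counts per archetype, from the table above.
TYPICAL_TOKENS = {
    "explorer": (8_000, 4_000),
    "creator": (12_000, 6_000),
    "maker": (15_000, 12_000),
    "guardian": (10_000, 3_000),
    "skeptic": (8_000, 3_000),
    "sage": (12_000, 4_000),
    "trickster": (8_000, 4_000),
}

# (input, output) USD per million tokens for the default model tiers.
PRICES = {"haiku": (0.80, 4.00), "sonnet": (3.00, 15.00)}


def estimate_agent_cost(archetype: str, model: str) -> float:
    """Rough pre-spawn estimate: typical tokens times the model's prices."""
    tokens_in, tokens_out = TYPICAL_TOKENS[archetype]
    price_in, price_out = PRICES[model]
    return (tokens_in * price_in + tokens_out * price_out) / 1_000_000
```

Under these numbers a haiku Explorer is estimated at about $0.02 and a sonnet Maker at about $0.23.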
|
|
||||||
## Cost-Aware Model Selection
|
|
||||||
|
|
||||||
Each archetype has a recommended model tier based on the quality requirements of its role:
|
|
||||||
|
|
||||||
### Default Model Assignments (Code Domain)
|
|
||||||
|
|
||||||
| Archetype | Model | Rationale |
|
|
||||||
|-----------|-------|-----------|
|
|
||||||
| Explorer | haiku | Research is structured extraction — cheap model handles it well |
|
|
||||||
| Creator | sonnet | Design decisions need reasoning quality |
|
|
||||||
| Maker | sonnet | Implementation needs quality to avoid rework cycles |
|
|
||||||
| Guardian | haiku | Security/risk review is checklist-driven — structured and cheap |
|
|
||||||
| Skeptic | haiku | Challenge generation follows patterns — cheap |
|
|
||||||
| Sage | sonnet | Holistic quality judgment needs nuance |
|
|
||||||
| Trickster | haiku | Adversarial testing is systematic — cheap |
|
|
||||||
|
|
||||||
### Writing Domain Overrides
|
|
||||||
|
|
||||||
Writing tasks need higher quality for prose-generating agents:
|
|
||||||
|
|
||||||
| Archetype | Model | Rationale |
|
|
||||||
|-----------|-------|-----------|
|
|
||||||
| Explorer / story-explorer | haiku | Research is still cheap |
|
|
||||||
| Creator | sonnet | Outline design needs narrative judgment |
|
|
||||||
| Maker | **sonnet** | Prose quality is the product — cannot be cheap |
|
|
||||||
| Guardian | haiku | Plot/continuity checks are structured |
|
|
||||||
| Skeptic | haiku | Premise challenges are structured |
|
|
||||||
| Sage / story-sage | **sonnet** | Voice and craft judgment need taste |
|
|
||||||
| Trickster | haiku | Reader-confusion analysis is systematic |
|
|
||||||
|
|
||||||
**When to escalate to opus**: Only for final-pass prose polishing on high-stakes content (book manuscripts, not short stories). Never for review or research agents. The user must explicitly opt in via:
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
# Team preset
|
|
||||||
model_overrides:
|
|
||||||
maker: opus # Only for final polish pass
|
|
||||||
```
|
|
||||||
|
|
||||||
### Domain-Driven Model Selection
|
|
||||||
|
|
||||||
The effective model for each agent is resolved in this order:
|
|
||||||
|
|
||||||
1. **Team preset `model_overrides`** (highest priority — explicit choice)
|
|
||||||
2. **Domain `model_overrides`** (from `.archeflow/domains/<name>.yaml`)
|
|
||||||
3. **Archetype default** (from the table above)
|
|
||||||
4. **Custom archetype `model` field** (from archetype YAML frontmatter)
|
|
||||||
|
|
||||||
Example resolution for `story-sage` in a writing run:
|
|
||||||
- Team preset says nothing about story-sage → skip
|
|
||||||
- Writing domain says `story-sage: sonnet` → **use sonnet**
|
|
||||||
- Archetype YAML says `model: sonnet` → would have been used if domain didn't specify
|
|
||||||
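
The resolution order can be sketched as a simple lookup chain. This assumes overrides are plain archetype-to-model mappings and that a custom archetype's `model` field is only consulted when nothing earlier in the list applies; the names `resolve_model` and `ARCHETYPE_DEFAULTS` are illustrative.

```python
from typing import Mapping, Optional

# Built-in defaults from the model assignment tables.
ARCHETYPE_DEFAULTS = {
    "explorer": "haiku",
    "creator": "sonnet",
    "maker": "sonnet",
    "guardian": "haiku",
    "skeptic": "haiku",
    "sage": "sonnet",
    "trickster": "haiku",
}


def resolve_model(archetype: str,
                  team_overrides: Optional[Mapping[str, str]] = None,
                  domain_overrides: Optional[Mapping[str, str]] = None,
                  custom_model: Optional[str] = None) -> str:
    """Team preset > domain override > archetype default > custom `model` field."""
    for source in (team_overrides or {}, domain_overrides or {}, ARCHETYPE_DEFAULTS):
        if archetype in source:
            return source[archetype]
    if custom_model is not None:
        return custom_model
    raise KeyError(f"no model configured for archetype {archetype!r}")
```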
|
|
||||||
## Cost Optimization Strategies
|
|
||||||
|
|
||||||
### 1. Prompt Caching
|
|
||||||
|
|
||||||
Structure prompts so that stable content comes first (maximizes cache prefix hits):
|
|
||||||
|
|
||||||
```
|
|
||||||
[System prompt — archetype instructions] ← cached across agents in same run
|
|
||||||
[Domain context — voice profile, persona] ← cached across agents in same run
|
|
||||||
[Phase context — Explorer output, proposal] ← changes per agent
|
|
||||||
[Task-specific instructions] ← changes per agent
|
|
||||||
```
|
|
||||||
|
|
||||||
Estimated savings: 30-50% on input tokens for runs with 5+ agents.
|
|
||||||
|
|
||||||
### 2. Guardian Fast-Path (A2)
|
|
||||||
|
|
||||||
When Guardian approves with 0 issues, skip Skeptic/Sage/Trickster. This saves 2-3 agent calls per cycle. See `archeflow:orchestration` skill, rule A2.
|
|
||||||
|
|
||||||
Typical savings: $0.30-0.80 per skipped cycle (depending on models).
|
|
||||||
|
|
||||||
### 3. Explorer Cache
|
|
||||||
|
|
||||||
Reuse recent Explorer research instead of re-running. See `archeflow:orchestration` skill, Explorer Cache section.
|
|
||||||
|
|
||||||
Typical savings: $0.02-0.05 per cache hit (haiku Explorer).
|
|
||||||
|
|
||||||
### 4. Batches API for Bulk Operations
|
|
||||||
|
|
||||||
When running consistency checks, validation passes, or other non-time-sensitive work across multiple files, use the Batches API (50% discount):
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
# Mark agents as batch-eligible in team presets
|
|
||||||
batch_eligible:
|
|
||||||
- guardian # Structured review, can wait
|
|
||||||
- skeptic # Challenge generation, can wait
|
|
||||||
```
|
|
||||||
|
|
||||||
Only use batches when the user is not waiting for real-time results (overnight runs, autonomous mode).
|
|
||||||
|
|
||||||
### 5. Early Termination
|
|
||||||
|
|
||||||
If the first cycle produces a clean Guardian pass (A2 fast-path) AND the Maker's self-review checklist is clean, skip the remaining cycles even if `max_cycles > 1`. This avoids spending tokens on unnecessary verification.
|
|
||||||
|
|
||||||
## Daily Cost Tracking
|
## Daily Cost Tracking
|
||||||
|
|
||||||
Across runs, maintain a daily cost ledger:
|
Ledger at `.archeflow/costs/<YYYY-MM-DD>.jsonl`. One line per run with cost, tokens, models, domain. Daily budget enforcement reads this before starting new runs.
|
||||||
|
|
||||||
```
|
|
||||||
.archeflow/costs/<YYYY-MM-DD>.jsonl
|
|
||||||
```
|
|
||||||
|
|
||||||
Each line is one run's cost summary:
|
|
||||||
|
|
||||||
```jsonl
|
|
||||||
{"run_id":"2026-04-03-der-huster","cost_usd":1.45,"tokens_input":95000,"tokens_output":33000,"models":{"haiku":2,"sonnet":3},"domain":"writing"}
|
|
||||||
{"run_id":"2026-04-03-auth-refactor","cost_usd":2.10,"tokens_input":120000,"tokens_output":45000,"models":{"haiku":3,"sonnet":2},"domain":"code"}
|
|
||||||
```
|
|
||||||
|
|
||||||
Daily budget enforcement reads this file to check `daily_usd` limits before starting new runs.
|
|
||||||
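
A minimal sketch of the daily budget check against the ledger, assuming one JSON object per line with a `cost_usd` field as in the example lines above. The function names are hypothetical, and a missing ledger file is treated as zero spend.

```python
import json
from pathlib import Path


def daily_spend_usd(ledger_path: Path) -> float:
    """Sum `cost_usd` over all runs recorded in one day's ledger file."""
    if not ledger_path.exists():
        return 0.0
    total = 0.0
    with ledger_path.open(encoding="utf-8") as ledger:
        for line in ledger:
            line = line.strip()
            if line:  # tolerate blank lines
                total += json.loads(line)["cost_usd"]
    return total


def can_start_run(ledger_path: Path, daily_budget_usd: float) -> bool:
    """Allow a new run only while today's spend is below the daily limit."""
    return daily_spend_usd(ledger_path) < daily_budget_usd
```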
|
|
||||||
### Cost Report Command
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Show today's costs
|
|
||||||
./lib/archeflow-costs.sh today
|
|
||||||
|
|
||||||
# Show costs for a date range
|
|
||||||
./lib/archeflow-costs.sh 2026-04-01 2026-04-03
|
|
||||||
|
|
||||||
# Show costs for a specific run
|
|
||||||
./lib/archeflow-costs.sh run 2026-04-03-der-huster
|
|
||||||
```
|
|
||||||
|
|
||||||
## Integration with Other Skills
|
|
||||||
|
|
||||||
- **`orchestration`**: Calls pre-agent and post-agent budget checks. Includes cost summary in orchestration report.
|
|
||||||
- **`process-log`**: Cost data is embedded in `agent.complete` and `run.complete` events. No separate cost events needed.
|
|
||||||
- **`domains`**: Reads `model_overrides` from the active domain to determine effective model per agent.
|
|
||||||
- **`autonomous-mode`**: Enforces budget strictly (no prompts — just stop on budget exceeded). Uses daily budget to limit overnight spend.
|
|
||||||
- **`workflow-design`**: Custom workflows can specify per-phase model assignments that override domain defaults.
|
|
||||||
|
|
||||||
## Design Principles
|
|
||||||
|
|
||||||
1. **Track always, enforce optionally.** Cost data is in every event regardless of whether a budget is set. Budget enforcement is opt-in.
|
|
||||||
2. **Estimate before spend.** Always estimate before spawning an agent. Surprises are worse than slightly inaccurate estimates.
|
|
||||||
3. **Cheapest model that works.** Default to haiku. Upgrade to sonnet only when the task demonstrably needs it. Opus is user-opt-in only.
|
|
||||||
4. **Transparent.** Every cost shows up in the orchestration report. No hidden token spend.
|
|
||||||
5. **Learn from history.** After enough runs, replace estimates with actual averages from `metrics.jsonl`.
|
|
||||||
|
|||||||
@@ -1,181 +1,58 @@
|
|||||||
---
|
---
|
||||||
name: custom-archetypes
|
name: custom-archetypes
|
||||||
description: Use when the user wants to create domain-specific archetypes — specialized agent roles beyond the 7 built-in ones. For example a database reviewer, compliance auditor, or accessibility tester.
|
description: Use when the user wants to create domain-specific archetypes -- specialized agent roles beyond the 7 built-in ones.
|
||||||
---
|
---
|
||||||
|
|
||||||
# Custom Archetypes
|
# Custom Archetypes
|
||||||
|
|
||||||
ArcheFlow's 7 built-in archetypes cover general software engineering. Custom archetypes add **domain expertise** — a database specialist, a compliance auditor, an accessibility reviewer.
|
Add domain expertise beyond the 7 built-ins: database specialist, compliance auditor, accessibility reviewer, etc.
|
||||||
|
|
||||||
## When to Create One
|
## When to Create
|
||||||
|
|
||||||
- A recurring review concern isn't covered by built-in archetypes
|
- A recurring review concern isn't covered by built-ins
|
||||||
- You need domain knowledge (GDPR, PCI-DSS, WCAG, SQL optimization)
|
- You need domain knowledge (GDPR, PCI-DSS, WCAG, SQL optimization)
|
||||||
- The same custom instructions are used in multiple orchestrations
|
- Same custom instructions used across multiple orchestrations
|
||||||
|
|
||||||
## Archetype Definition
|
## Definition Format
|
||||||
|
|
||||||
Create a markdown file in your project at `.archeflow/archetypes/<id>.md`:
|
Create `.archeflow/archetypes/<id>.md`:
|
||||||
|
|
||||||
```markdown
|
```markdown
|
||||||
# <Name>
|
# <Name>
|
||||||
|
|
||||||
## Identity
|
## Identity
|
||||||
**ID:** <lowercase-with-hyphens>
|
**ID:** <lowercase-with-hyphens>
|
||||||
**Role:** <one sentence — what this archetype does>
|
**Role:** <one sentence>
|
||||||
**Lens:** <the question this archetype always asks>
|
**Lens:** <the one question this archetype always asks>
|
||||||
**Model tier:** cheap | standard | premium
|
**Model tier:** cheap | standard | premium
|
||||||
|
|
||||||
## Behavior
|
## Behavior
|
||||||
<System prompt injected into the agent. Define:
|
<System prompt: what to look for, how to evaluate, output format, decision criteria>
|
||||||
- What to look for
|
|
||||||
- How to evaluate
|
|
||||||
- What output format to use
|
|
||||||
- Decision criteria for approve/reject>
|
|
||||||
|
|
||||||
## Outputs
|
## Outputs
|
||||||
<What message types this archetype produces>
|
<Message types: Research, Proposal, Challenge, RiskAssessment, QualityReport, Implementation>
|
||||||
- Research (if it gathers info)
|
|
||||||
- Proposal (if it designs)
|
|
||||||
- Challenge (if it critiques)
|
|
||||||
- RiskAssessment (if it assesses risk)
|
|
||||||
- QualityReport (if it reviews quality)
|
|
||||||
- Implementation (if it writes code)
|
|
||||||
|
|
||||||
## Shadow
|
## Shadow
|
||||||
**Name:** <the dysfunction>
|
**Name:** <dysfunction name>
|
||||||
**Strength inverted:** <how the core strength becomes destructive>
|
**Strength inverted:** <how core strength becomes destructive>
|
||||||
**Symptoms:**
|
**Symptoms:** <3 observable behaviors>
|
||||||
- <observable behavior 1>
|
|
||||||
- <observable behavior 2>
|
|
||||||
- <observable behavior 3>
|
|
||||||
**Correction:** <specific prompt to course-correct>
|
**Correction:** <specific prompt to course-correct>
|
||||||
```
|
```
|
||||||
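As a rough illustration of how a definition file in this format could be read, the sketch below pulls the `**Field:** value` lines out of the markdown. The helper name `parse_fields` and the regular expression are assumptions made for this example, not part of ArcheFlow; the sample text is taken from the Database Specialist archetype.

```python
import re

# Hypothetical helper (not an ArcheFlow API): matches lines of the form
# "**Key:** value" as used in the Identity and Shadow sections.
FIELD = re.compile(r"^\*\*(?P<key>[^*:]+):\*\*\s*(?P<value>.*)$")

def parse_fields(markdown_text: str) -> dict[str, str]:
    """Collect every '**Key:** value' line into a dict (later keys win)."""
    fields = {}
    for line in markdown_text.splitlines():
        match = FIELD.match(line.strip())
        if match:
            fields[match.group("key").strip()] = match.group("value").strip()
    return fields

SAMPLE = """# Database Specialist

## Identity
**ID:** db-specialist
**Role:** Reviews database schemas, queries, and migration safety
**Model tier:** standard
"""

identity = parse_fields(SAMPLE)
```

Because `Name` appears only in the Shadow section, a flat dict is enough here; a fuller parser would probably group fields by heading.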
|
|
||||||
## Examples
|
## Composition
|
||||||
|
|
||||||
### Database Specialist
|
Combine two archetypes into a focused super-reviewer:
|
||||||
```markdown
|
|
||||||
# Database Specialist
|
|
||||||
|
|
||||||
## Identity
|
- Max 2 archetypes combined
|
||||||
**ID:** db-specialist
|
|
||||||
**Role:** Reviews database schemas, queries, and migration safety
|
|
||||||
**Lens:** "Will this scale? Will this corrupt data?"
|
|
||||||
**Model tier:** standard
|
|
||||||
|
|
||||||
## Behavior
|
|
||||||
You review database changes for:
|
|
||||||
1. Schema design — normalization, index coverage, constraint integrity
|
|
||||||
2. Query performance — would an EXPLAIN ANALYZE show problems?
|
|
||||||
3. Migration safety — backward compatible? Zero-downtime possible?
|
|
||||||
4. Data integrity — foreign keys, unique constraints, NOT NULL where needed
|
|
||||||
|
|
||||||
Output APPROVED or REJECTED with findings including:
|
|
||||||
- Table/column/query location
|
|
||||||
- Severity (CRITICAL/WARNING/INFO)
|
|
||||||
- Specific fix
|
|
||||||
|
|
||||||
## Outputs
|
|
||||||
- Challenge
|
|
||||||
- QualityReport
|
|
||||||
|
|
||||||
## Shadow
|
|
||||||
**Name:** Schema Perfectionist
|
|
||||||
**Strength inverted:** Database expertise becomes over-normalization and premature optimization
|
|
||||||
**Symptoms:**
|
|
||||||
- Demanding 3NF for a 10-row config table
|
|
||||||
- Requiring indexes for queries that run once a day
|
|
||||||
- Blocking on theoretical scale issues for an app with 50 users
|
|
||||||
**Correction:** "Optimize for the current order of magnitude. If the app has 1000 users, design for 10,000. Not for 10 million."
|
|
||||||
```
|
|
||||||
|
|
||||||
### Compliance Auditor
|
|
||||||
```markdown
|
|
||||||
# Compliance Auditor
|
|
||||||
|
|
||||||
## Identity
|
|
||||||
**ID:** compliance-auditor
|
|
||||||
**Role:** Verifies code changes against regulatory requirements
|
|
||||||
**Lens:** "Could this get us fined?"
|
|
||||||
**Model tier:** premium
|
|
||||||
|
|
||||||
## Behavior
|
|
||||||
You audit changes against:
|
|
||||||
1. GDPR — personal data handling, consent, right to deletion
|
|
||||||
2. PCI-DSS — payment data storage, transmission, access controls
|
|
||||||
3. Logging — are sensitive fields being logged? PII in error messages?
|
|
||||||
4. Data retention — are we keeping data longer than allowed?
|
|
||||||
|
|
||||||
Reference specific regulation articles in findings.
|
|
||||||
|
|
||||||
## Outputs
|
|
||||||
- RiskAssessment
|
|
||||||
|
|
||||||
## Shadow
|
|
||||||
**Name:** Regulation Zealot
|
|
||||||
**Strength inverted:** Compliance awareness becomes impossible-to-satisfy requirements
|
|
||||||
**Symptoms:**
|
|
||||||
- Citing regulations irrelevant to the change
|
|
||||||
- Requiring legal review for non-PII code
|
|
||||||
- Blocking internal tools with customer-facing compliance standards
|
|
||||||
**Correction:** "Match the compliance level to the data classification. Internal admin tools don't need PCI-DSS Level 1 controls."
|
|
||||||
```
|
|
||||||
|
|
||||||
## Using Custom Archetypes
|
|
||||||
|
|
||||||
Reference them by ID when orchestrating:
|
|
||||||
|
|
||||||
```
|
|
||||||
# In the orchestration skill, add to Check phase:
|
|
||||||
Agent(
|
|
||||||
description: "db-specialist: review schema changes",
|
|
||||||
prompt: "<contents of .archeflow/archetypes/db-specialist.md>
|
|
||||||
Review the changes in branch: <maker's branch>
|
|
||||||
..."
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
Or in a custom workflow, include them in the check phase archetypes list.
|
|
||||||
|
|
||||||
## Archetype Composition
|
|
||||||
|
|
||||||
Combine two archetypes into a focused super-reviewer when you need a specific perspective but don't want to spawn two agents:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
# .archeflow/archetypes/security-breaker.md
|
|
||||||
|
|
||||||
## Identity
|
|
||||||
**ID:** security-breaker
|
|
||||||
**Composed of:** Guardian + Trickster
|
|
||||||
**Role:** Security review with active exploitation attempts
|
|
||||||
**Lens:** "Can I break the security model? How?"
|
|
||||||
**Model tier:** standard
|
|
||||||
|
|
||||||
## Behavior
|
|
||||||
Combine Guardian's checklist-driven security review with Trickster's
|
|
||||||
adversarial testing. For each Guardian finding, attempt to exploit it.
|
|
||||||
Only report findings you can actually reproduce.
|
|
||||||
|
|
||||||
## Shadow
|
|
||||||
**Name:** Security Theater
|
|
||||||
**Strength inverted:** Both shadows compound — paranoid blocking + noise
|
|
||||||
**Correction:** "Only report findings with reproduction steps. Max 5."
|
|
||||||
```
|
|
||||||
|
|
||||||
**Rules for composition:**
|
|
||||||
- Max 2 archetypes combined (more defeats the purpose)
|
|
||||||
- Combined shadow must address both source shadows
|
- Combined shadow must address both source shadows
|
||||||
- Use when spawning both separately would waste tokens on overlapping context
|
- Use when spawning both separately would waste tokens on overlapping context
|
||||||
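The "max 2 archetypes" rule could be checked mechanically. The sketch below assumes the `Composed of` value is a `+`-separated list, as in the `security-breaker` example; the function names are hypothetical and not part of ArcheFlow.

```python
MAX_COMPOSED = 2  # limit stated in the composition rules

def parse_composition(value: str) -> list[str]:
    """Split a 'Composed of' value such as 'Guardian + Trickster' into IDs."""
    return [part.strip().lower() for part in value.split("+") if part.strip()]

def validate_composition(value: str) -> list[str]:
    """Return the source archetype IDs, or raise if more than two are combined."""
    parts = parse_composition(value)
    if len(parts) > MAX_COMPOSED:
        raise ValueError(
            f"composition combines {len(parts)} archetypes; max is {MAX_COMPOSED}"
        )
    return parts
```

This covers only the count. Whether the combined shadow addresses both source shadows still needs a human or reviewer judgment.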
|
|
||||||
## Team Presets
|
## Team Presets
|
||||||
|
|
||||||
Save common team configurations for your project in `.archeflow/teams/`:
|
Save team configs in `.archeflow/teams/<name>.yaml`:
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
# .archeflow/teams/backend.yaml
|
|
||||||
name: backend
|
name: backend
|
||||||
description: Standard backend development team
|
|
||||||
plan: [explorer, creator]
|
plan: [explorer, creator]
|
||||||
do: [maker]
|
do: [maker]
|
||||||
check: [guardian, sage]
|
check: [guardian, sage]
|
||||||
@@ -183,23 +60,12 @@ exit: all_approved
|
|||||||
max_cycles: 2
|
max_cycles: 2
|
||||||
```
|
```
|
||||||
|
|
||||||
```yaml
|
Reference custom archetypes by ID in the `check` (or any phase) list.
|
||||||
# .archeflow/teams/security-audit.yaml
|
|
||||||
name: security-audit
|
|
||||||
description: Security-focused review team
|
|
||||||
plan: [explorer, creator]
|
|
||||||
do: [maker]
|
|
||||||
check: [guardian, trickster, compliance-auditor]
|
|
||||||
exit: all_approved
|
|
||||||
max_cycles: 3
|
|
||||||
```
|
|
||||||
|
|
||||||
Use in orchestration: `"Use the backend team preset"` or `"Run security-audit workflow on this change"`
|
## Rules
|
||||||
|
|
||||||
## Design Principles
|
1. One concern per archetype
|
||||||
|
2. Concrete shadow with observable symptoms
|
||||||
1. **One concern per archetype.** Don't make a "full-stack reviewer."
|
3. Right model tier: analytical = cheap, creative = standard, judgment = premium
|
||||||
2. **Concrete shadow.** Vague shadows don't get detected. Use observable symptoms.
|
4. Specific lens question focuses behavior
|
||||||
3. **Right model tier.** Analytical → cheap. Creative → standard. Judgment-heavy → premium.
|
5. Compose before creating from scratch
|
||||||
4. **Specific lens.** The one question the archetype asks. This focuses behavior.
|
|
||||||
5. **Composition over sprawl.** Combine before creating from scratch. 2 composed > 3 separate.
|
|
||||||
|
|||||||
@@ -1,193 +0,0 @@
|
|||||||
---
|
|
||||||
name: do-phase
|
|
||||||
description: Use when acting as Maker in the Do phase. Defines execution rules, worktree protocol, commit discipline, and output format.
|
|
||||||
---
|
|
||||||
|
|
||||||
# Do Phase
|
|
||||||
|
|
||||||
Maker implements the Creator's proposal. This skill defines the execution protocol — the agent definition (`agents/maker.md`) has the behavioral rules.
|
|
||||||
|
|
||||||
## Execution Protocol
|
|
||||||
|
|
||||||
### 1. Read Before Writing
|
|
||||||
Read the Creator's proposal completely. Identify:
|
|
||||||
- Files to create or modify (the `### Changes` section)
|
|
||||||
- Test strategy (the `### Test Strategy` section)
|
|
||||||
- Scope boundaries (the `### Not Doing` section)
|
|
||||||
|
|
||||||
If the proposal is unclear on any point: implement your best interpretation and note the assumption in your output.
|
|
||||||
|
|
||||||
### 2. Implementation Order
|
|
||||||
For each change in the proposal:
|
|
||||||
1. Write the test first (expect it to fail)
|
|
||||||
2. Implement the change (make the test pass)
|
|
||||||
3. Verify existing tests still pass
|
|
||||||
4. Commit with a descriptive message
|
|
||||||
|
|
||||||
For writing domain (stories, prose):
|
|
||||||
1. Read the outline / scene plan
|
|
||||||
2. Read the voice profile and character sheets
|
|
||||||
3. Draft scene by scene, following the outline's emotional beats
|
|
||||||
4. Self-check: does the voice hold? Does dialogue sound natural?
|
|
||||||
5. Commit after each scene or logical section
|
|
||||||
|
|
||||||
### 3. Commit Discipline
|
|
||||||
|
|
||||||
**CRITICAL: Always commit before finishing.** Uncommitted worktree changes are LOST when the agent exits.
|
|
||||||
|
|
||||||
Commit conventions:
|
|
||||||
```
|
|
||||||
feat: <what was added> # New functionality
|
|
||||||
fix: <what was fixed> # Bug fix within the task
|
|
||||||
test: <what was tested> # Test additions
|
|
||||||
docs: <what was documented> # Documentation only
|
|
||||||
```
|
|
||||||
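A minimal sketch of how these prefixes could be checked, for example in a commit hook. The function name and the exact pattern (prefix, colon, one space, non-empty description) are assumptions for illustration; the source lists only the four prefixes.

```python
import re

# Prefixes taken from the commit conventions listed above.
ALLOWED_PREFIXES = ("feat", "fix", "test", "docs")
COMMIT_PATTERN = re.compile(r"^(?:%s): \S.*$" % "|".join(ALLOWED_PREFIXES))

def follows_convention(message: str) -> bool:
    """Return True if the first line of the message uses an allowed prefix."""
    lines = message.splitlines()
    return bool(lines) and COMMIT_PATTERN.match(lines[0]) is not None
```

Only the subject line is inspected, so a multi-line body does not affect the result.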
|
|
||||||
Commit frequency:
|
|
||||||
- **Code:** After each logical step (one feature, one fix, one test suite)
|
|
||||||
- **Writing:** After each scene or section (~500-1000 words)
|
|
||||||
- **Never:** One big commit at the end with everything
|
|
||||||
|
|
||||||
### 4. Scope Control
|
|
||||||
|
|
||||||
Do exactly what the proposal says. No more, no less.
|
|
||||||
|
|
||||||
**In scope:**
|
|
||||||
- Files listed in the proposal's `### Changes` section
|
|
||||||
- Tests specified in the `### Test Strategy` section
|
|
||||||
- Dependencies explicitly mentioned
|
|
||||||
|
|
||||||
**Out of scope (even if tempting):**
|
|
||||||
- Refactoring code you noticed while implementing
|
|
||||||
- Adding features not in the proposal
|
|
||||||
- Fixing pre-existing bugs in adjacent code
|
|
||||||
- Updating documentation beyond what the task requires
|
|
||||||
|
|
||||||
If you encounter something that needs fixing but is out of scope: note it in `### Notes` for future work. Don't fix it now.
|
|
||||||
|
|
||||||
### 5. Blocker Protocol
|
|
||||||
|
|
||||||
If you hit a blocker (dependency missing, test infrastructure broken, proposal contradicts codebase):
|
|
||||||
1. Document what's blocked and why
|
|
||||||
2. Document what you completed before the block
|
|
||||||
3. Commit what you have
|
|
||||||
4. Stop and report — don't silently work around it
|
|
||||||
|
|
||||||
## Worktree Protocol
|
|
||||||
|
|
||||||
When running in an isolated git worktree (`isolation: "worktree"`):
|
|
||||||
|
|
||||||
```
|
|
||||||
main branch (untouched)
|
|
||||||
└── archeflow/maker-<run_id> (worktree branch)
|
|
||||||
├── commit: implementation step 1
|
|
||||||
├── commit: implementation step 2
|
|
||||||
└── commit: implementation step 3 (final)
|
|
||||||
```
|
|
||||||
|
|
||||||
- All work stays on the worktree branch
|
|
||||||
- Main branch is never modified directly
|
|
||||||
- The branch name follows the pattern: `archeflow/maker-<run_id>`
|
|
||||||
- After Check phase approves: the orchestrator merges (not the Maker)
|
|
||||||
|
|
||||||
## Output Format
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
## Implementation: <task>
|
|
||||||
|
|
||||||
### Files Changed
|
|
||||||
- `path/file.ext` — What changed (+N -M lines)
|
|
||||||
|
|
||||||
### Tests
|
|
||||||
- N new tests, all passing
|
|
||||||
- M existing tests still passing
|
|
||||||
|
|
||||||
### Commits
|
|
||||||
1. `feat: description` (hash)
|
|
||||||
2. `test: description` (hash)
|
|
||||||
|
|
||||||
### Notes
|
|
||||||
- Assumptions made where proposal was unclear
|
|
||||||
- Out-of-scope issues noticed (for future work)
|
|
||||||
|
|
||||||
### Branch
|
|
||||||
`archeflow/maker-<run_id>` — ready for review
|
|
||||||
```
|
|
||||||
|
|
||||||
For writing domain:
|
|
||||||
```markdown
|
|
||||||
## Draft: <story/chapter title>
|
|
||||||
|
|
||||||
### Scenes Written
|
|
||||||
- Scene 1: <title> (~N words)
|
|
||||||
- Scene 2: <title> (~N words)
|
|
||||||
|
|
||||||
### Word Count
|
|
||||||
- Target: N | Actual: M | Delta: +/-
|
|
||||||
|
|
||||||
### Voice Notes
|
|
||||||
- Dialect usage: N instances (target: moderate)
|
|
||||||
- Food and drink (Essen/Trinken): present in X/Y scenes
|
|
||||||
|
|
||||||
### Commits
|
|
||||||
1. `feat: scene 1 - <title>` (hash)
|
|
||||||
2. `feat: scene 2 - <title>` (hash)
|
|
||||||
|
|
||||||
### Notes
|
|
||||||
- Deviations from outline (with reasoning)
|
|
||||||
```
|
|
||||||
|
|
||||||
## With Prior Feedback (Cycle 2+)
|
|
||||||
|
|
||||||
When the Maker receives feedback from a prior cycle's Check phase:
|
|
||||||
|
|
||||||
1. Read the `act-feedback.md` — focus on the `### For Maker` section
|
|
||||||
2. Address each finding marked as "routed to Maker"
|
|
||||||
3. In your output, include a response table:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
### Feedback Response
|
|
||||||
| Finding | Source | Action |
|
|
||||||
|---------|--------|--------|
|
|
||||||
| Test names unclear | Sage | Fixed — renamed to behavior descriptions |
|
|
||||||
| Missing edge case | Trickster | Added test for empty input |
|
|
||||||
```
|
|
||||||
|
|
||||||
Do not address findings routed to Creator — those were handled in the revised proposal.
|
|
||||||
|
|
||||||
## Quality Checklist (self-check before finishing)
|
|
||||||
|
|
||||||
Before your final commit, verify:
|
|
||||||
- [ ] All proposal changes implemented
|
|
||||||
- [ ] All new tests pass
|
|
||||||
- [ ] All existing tests still pass
|
|
||||||
- [ ] No files modified outside proposal scope
|
|
||||||
- [ ] Every logical step has its own commit
|
|
||||||
- [ ] Output summary is complete and accurate
|
|
||||||
- [ ] Branch name follows convention
|
|
||||||
|
|
||||||
## Test-First Gate
|
|
||||||
|
|
||||||
Before the Maker's output is accepted, the orchestrator validates that tests were included.
|
|
||||||
|
|
||||||
### Validation Logic
|
|
||||||
|
|
||||||
Read `do-maker-files.txt`. Check if any file path matches common test patterns:
|
|
||||||
- `*test*`, `*spec*`, `*.test.*`, `*.spec.*`, `*_test.*`, `*_spec.*`
|
|
||||||
- Files in directories named `test/`, `tests/`, `__tests__/`, `spec/`
|
|
||||||
|
|
||||||
For writing domain projects, this gate is skipped.
|
|
||||||
|
|
||||||
### Outcomes
|
|
||||||
|
|
||||||
| Result | Action |
|
|
||||||
|--------|--------|
|
|
||||||
| Test files found | Pass — proceed to Check phase |
|
|
||||||
| No test files, code domain | **Warn** — emit WARNING event, note in do-maker.md |
|
|
||||||
| No test files + Creator specified tests | **Block** — re-run Maker with test instruction (1 retry) |
|
|
||||||
| Writing domain | Skip gate entirely |
|
|
||||||
|
|
||||||
The block case triggers a targeted re-run with prompt:
|
|
||||||
"The proposal specified these test cases: <test strategy section>. No test files
|
|
||||||
were found in your changes. Add the specified tests before finishing."
|
|
||||||
This is one retry within the Do phase, not a full PDCA cycle.
|
|
||||||
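The validation logic and outcome table can be sketched as follows. This is an illustration under assumptions, not the orchestrator's actual code: the function names are hypothetical, paths are treated as POSIX-style, and any domain other than `writing` is handled like `code`, which the source does not specify.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# File-name patterns and directory names from the validation logic above.
FILE_PATTERNS = ("*test*", "*spec*", "*.test.*", "*.spec.*", "*_test.*", "*_spec.*")
TEST_DIRS = {"test", "tests", "__tests__", "spec"}

def is_test_file(path: str) -> bool:
    """True if the file name or any parent directory looks test-related."""
    p = PurePosixPath(path)
    in_test_dir = any(part in TEST_DIRS for part in p.parts[:-1])
    return in_test_dir or any(fnmatch(p.name, pat) for pat in FILE_PATTERNS)

def gate_outcome(changed_files, domain, creator_specified_tests):
    """Map the Maker's changed files to one of: skip, pass, warn, block."""
    if domain == "writing":
        return "skip"   # gate is skipped for writing projects
    if any(is_test_file(f) for f in changed_files):
        return "pass"   # proceed to Check phase
    # No test files: block only when the proposal asked for tests.
    return "block" if creator_specified_tests else "warn"
```

Note that `*test*` is broad: a file such as `latest.py` would also match, so a stricter implementation might drop the two catch-all patterns.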
@@ -10,363 +10,92 @@ description: |
|
|||||||
|
|
||||||
# Domain Adapter System
|
# Domain Adapter System
|
||||||
|
|
||||||
ArcheFlow's PDCA pipeline and archetype system are domain-agnostic. This skill defines how to adapt them to specific domains (writing, code, research, etc.) so that events, metrics, reviews, and context use terminology that makes sense for the work being done.
|
Adapts the PDCA pipeline and archetype system to specific domains (writing, code, research) so events, metrics, reviews, and context use domain-appropriate terminology.
|
||||||
|
|
||||||
## Domain Registry
|
## Domain Registry
|
||||||
|
|
||||||
Domain definitions live in `.archeflow/domains/<name>.yaml`. Each domain maps ArcheFlow's generic concepts to domain-specific equivalents and configures what metrics to track, what reviewers should focus on, and what context agents need.
|
Domain definitions live in `.archeflow/domains/<name>.yaml`. Each maps generic concepts to domain-specific equivalents.
|
||||||
|
|
||||||
### Writing Domain
|
### Concept Mapping
|
||||||
|
|
||||||
```yaml
|
| Generic Concept | Code | Writing | Research |
|
||||||
# .archeflow/domains/writing.yaml
|
|----------------|------|---------|----------|
|
||||||
name: writing
|
| implementation | code changes | draft/prose | draft/analysis |
|
||||||
description: "Creative writing — stories, novels, non-fiction"
|
| tests | automated tests | consistency checks | citation verification |
|
||||||
|
| files_changed | files changed | word count delta | section count |
|
||||||
|
| test_coverage | test coverage % | voice drift score | source coverage |
|
||||||
|
| code_review | code review | prose review | peer review |
|
||||||
|
| build | build/compile | compile/export | compile (LaTeX/PDF) |
|
||||||
|
| deploy | deploy | publish | submit/publish |
|
||||||
|
| bug | bug | continuity error | unsupported claim |
|
||||||
|
| feature | feature | scene/chapter | section |
|
||||||
|
|
||||||
# Concept mapping — how generic ArcheFlow terms translate
|
### Metrics by Domain
|
||||||
concepts:
|
|
||||||
implementation: "draft/prose"
|
|
||||||
tests: "consistency checks"
|
|
||||||
files_changed: "word count delta"
|
|
||||||
test_coverage: "voice drift score"
|
|
||||||
code_review: "prose review"
|
|
||||||
build: "compile/export"
|
|
||||||
deploy: "publish"
|
|
||||||
refactor: "revision"
|
|
||||||
bug: "continuity error"
|
|
||||||
feature: "scene/chapter"
|
|
||||||
PR: "manuscript submission"
|
|
||||||
|
|
||||||
# Metrics — what to track instead of lines/files/tests
|
| Code | Writing | Research |
|
||||||
metrics:
|
|------|---------|----------|
|
||||||
- word_count
|
| files_changed | word_count | word_count |
|
||||||
- voice_drift_score
|
| lines_added/removed | voice_drift_score | citation_count |
|
||||||
- dialect_density
|
| tests_added | dialect_density | source_diversity |
|
||||||
- essen_count # Giesing Gschichten rule: food in every scene
|
| tests_passing | scene_count | claim_count |
|
||||||
- scene_count
|
| coverage_delta | dialogue_ratio | unsupported_claims |
|
||||||
- dialogue_ratio
|
|
||||||
|
|
||||||
# Review focus areas — override default Guardian/Sage lenses
|
### Review Focus by Domain
|
||||||
review_focus:
|
|
||||||
guardian:
|
|
||||||
- plot_coherence
|
|
||||||
- character_consistency
|
|
||||||
- timeline_accuracy
|
|
||||||
- continuity
|
|
||||||
sage:
|
|
||||||
- voice_consistency
|
|
||||||
- prose_quality
|
|
||||||
- dialect_authenticity
|
|
||||||
- forbidden_pattern_violations
|
|
||||||
skeptic:
|
|
||||||
- premise_strength
|
|
||||||
- character_motivation
|
|
||||||
- ending_satisfaction
|
|
||||||
trickster:
|
|
||||||
- reader_confusion_points
|
|
||||||
- pacing_dead_spots
|
|
||||||
- suspension_of_disbelief_breaks
|
|
||||||
|
|
||||||
# Context injection — what extra files agents should read per phase
|
| Reviewer | Code | Writing | Research |
|
||||||
context:
|
|----------|------|---------|----------|
|
||||||
always:
|
| Guardian | security, breaking changes, deps, error handling | plot coherence, character consistency, timeline, continuity | factual accuracy, citation validity, logic, methodology |
|
||||||
- "voice profile YAML (profiles/*.yaml)"
|
| Sage | code quality, coverage, docs, patterns | voice consistency, prose quality, dialect authenticity | argument structure, clarity, tone, completeness |
|
||||||
- "persona YAML (personas/*.yaml)"
|
| Skeptic | design assumptions, scalability, edge cases | premise strength, motivation, ending satisfaction | (default) |
|
||||||
- "character sheets (characters/*.yaml)"
|
| Trickster | malformed input, races, error paths, dep failures | reader confusion, pacing dead spots, disbelief breaks | (default) |
|
||||||
plan_phase:
|
|
||||||
- "series config (colette.yaml)"
|
|
||||||
- "previous stories (if series, for continuity)"
|
|
||||||
- "story brief / premise"
|
|
||||||
do_phase:
|
|
||||||
- "scene outline from Creator"
|
|
||||||
- "voice profile (for style reference)"
|
|
||||||
check_phase:
|
|
||||||
- "voice profile (for Sage drift scoring)"
|
|
||||||
- "outline (for Guardian coherence check)"
|
|
||||||
- "character sheets (for consistency)"
|
|
||||||
|
|
||||||
# Model preferences — domain-specific overrides
|
### Model Overrides
|
||||||
model_overrides:
|
|
||||||
maker: sonnet # Prose quality matters more than for code
|
|
||||||
story-sage: sonnet # Needs taste for voice evaluation
|
|
||||||
```
|
|
||||||
|
|
||||||
### Code Domain (Default)
|
Domains can override default model assignments:
|
||||||
|
|
||||||
```yaml
|
| Domain | Override | Rationale |
|
||||||
# .archeflow/domains/code.yaml
|
|--------|----------|-----------|
|
||||||
name: code
|
| Writing | maker: sonnet | Prose quality is the product |
|
||||||
description: "Software development — applications, libraries, infrastructure"
|
| Writing | story-sage: sonnet | Voice evaluation needs taste |
|
||||||
|
| Research | maker: sonnet | Analysis quality matters |
|
||||||
|
| Code | (none) | Defaults are calibrated for code |
|
||||||
|
|
||||||
concepts:
|
### Context Injection by Domain
|
||||||
implementation: "code changes"
|
|
||||||
tests: "automated tests"
|
|
||||||
files_changed: "files changed"
|
|
||||||
test_coverage: "test coverage %"
|
|
||||||
code_review: "code review"
|
|
||||||
build: "build/compile"
|
|
||||||
deploy: "deploy"
|
|
||||||
refactor: "refactor"
|
|
||||||
bug: "bug"
|
|
||||||
feature: "feature"
|
|
||||||
PR: "pull request"
|
|
||||||
|
|
||||||
metrics:
|
Domains declare which extra files agents should read per phase. Context injection is additive (on top of standard ArcheFlow context).
|
||||||
- files_changed
|
|
||||||
- lines_added
|
|
||||||
- lines_removed
|
|
||||||
- tests_added
|
|
||||||
- tests_passing
|
|
||||||
- coverage_delta
|
|
||||||
|
|
||||||
review_focus:
|
| Phase | Code | Writing |
|
||||||
guardian:
|
|-------|------|---------|
|
||||||
- security_vulnerabilities
|
| always | README.md, config.yaml | voice profile, persona, characters |
|
||||||
- breaking_changes
|
| plan | relevant source files, existing tests | series config, previous stories, brief |
|
||||||
- dependency_risks
|
| do | Creator's proposal, test fixtures | scene outline, voice profile |
|
||||||
- error_handling
|
| check | git diff, risk section | voice profile (Sage), outline (Guardian), characters |
|
||||||
sage:
|
|
||||||
- code_quality
|
|
||||||
- test_coverage
|
|
||||||
- documentation
|
|
||||||
- pattern_consistency
|
|
||||||
skeptic:
|
|
||||||
- design_assumptions
|
|
||||||
- scalability
|
|
||||||
- alternative_approaches
|
|
||||||
- edge_cases
|
|
||||||
trickster:
|
|
||||||
- malformed_input
|
|
||||||
- concurrency_races
|
|
||||||
- error_path_exploitation
|
|
||||||
- dependency_failures
|
|
||||||
|
|
||||||
context:
|
|
||||||
always:
|
|
||||||
- "README.md"
|
|
||||||
- ".archeflow/config.yaml"
|
|
||||||
plan_phase:
|
|
||||||
- "relevant source files (Explorer identifies)"
|
|
||||||
- "existing tests for affected area"
|
|
||||||
do_phase:
|
|
||||||
- "Creator's proposal"
|
|
||||||
- "test fixtures and helpers"
|
|
||||||
check_phase:
|
|
||||||
- "git diff from Maker"
|
|
||||||
- "proposal risk section"
|
|
||||||
|
|
||||||
model_overrides: {}
|
|
||||||
# Code domain uses default archetype model assignments
|
|
||||||
```
|
|
||||||
|
|
||||||
### Research Domain (Example Extension)
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
# .archeflow/domains/research.yaml
|
|
||||||
name: research
|
|
||||||
description: "Academic or technical research — papers, analysis, literature review"
|
|
||||||
|
|
||||||
concepts:
|
|
||||||
implementation: "draft/analysis"
|
|
||||||
tests: "citation verification"
|
|
||||||
files_changed: "section count"
|
|
||||||
test_coverage: "source coverage"
|
|
||||||
code_review: "peer review"
|
|
||||||
build: "compile (LaTeX/PDF)"
|
|
||||||
deploy: "submit/publish"
|
|
||||||
|
|
||||||
metrics:
|
|
||||||
- word_count
|
|
||||||
- citation_count
|
|
||||||
- source_diversity
|
|
||||||
- claim_count
|
|
||||||
- unsupported_claims
|
|
||||||
|
|
||||||
review_focus:
|
|
||||||
guardian:
|
|
||||||
- factual_accuracy
|
|
||||||
- citation_validity
|
|
||||||
- logical_coherence
|
|
||||||
- methodology_soundness
|
|
||||||
sage:
|
|
||||||
- argument_structure
|
|
||||||
- prose_clarity
|
|
||||||
- academic_tone
|
|
||||||
- completeness
|
|
||||||
|
|
||||||
context:
|
|
||||||
always:
|
|
||||||
- "bibliography/references"
|
|
||||||
- "research brief"
|
|
||||||
plan_phase:
|
|
||||||
- "prior literature notes"
|
|
||||||
- "methodology constraints"
|
|
||||||
check_phase:
|
|
||||||
- "citation database"
|
|
||||||
- "claims vs. evidence mapping"
|
|
||||||
|
|
||||||
model_overrides:
|
|
||||||
maker: sonnet # Research writing needs quality
|
|
||||||
```
|
|
||||||
|
|
||||||
## Domain Detection
|
## Domain Detection
|
||||||
|
|
||||||
ArcheFlow auto-detects the domain based on project markers. Detection runs once at `run.start` and the result is stored in the run's event stream.
|
Auto-detects at `run.start`. Result stored in event stream.
|
||||||
|
|
||||||
### Detection Priority (highest first)
|
| Priority | Signal | Domain |
|
||||||
|
|----------|--------|--------|
|
||||||
| Priority | Signal | Domain | Rationale |
|
| 1 | CLI `--domain <name>` | as specified |
|
||||||
|----------|--------|--------|-----------|
|
| 2 | Team preset `domain:` field | as specified |
|
||||||
| 1 | CLI flag `--domain <name>` | as specified | Explicit override always wins |
|
| 3 | `colette.yaml` exists | writing |
|
||||||
| 2 | Team preset has `domain: <name>` | as specified | Preset knows its domain |
|
| 4 | `*.bib` or `references/` exists | research |
|
||||||
| 3 | `colette.yaml` exists in project root | `writing` | Colette is the writing platform |
|
| 5 | `package.json`, `Cargo.toml`, `pyproject.toml`, `go.mod`, `Makefile` | code |
|
||||||
| 4 | `*.bib` or `references/` exists | `research` | Bibliography signals research |
|
| 6 | No markers | code (default) |
|
||||||
| 5 | `package.json` exists | `code` | Node.js project |
|
|
||||||
| 6 | `Cargo.toml` exists | `code` | Rust project |
|
|
||||||
| 7 | `pyproject.toml` exists | `code` | Python project |
|
|
||||||
| 8 | `go.mod` exists | `code` | Go project |
|
|
||||||
| 9 | `Makefile` or `CMakeLists.txt` exists | `code` | C/C++ project |
|
|
||||||
| 10 | No markers found | `code` | Default fallback |
|
|
||||||
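The detection priority can be read as a first-match-wins chain. The sketch below follows the condensed six-row table (so `CMakeLists.txt` from the older ten-row table is not included); the function name, parameters, and signal strings are assumptions for illustration only.

```python
from pathlib import Path

# Code-project markers from row 5 of the condensed priority table.
CODE_MARKERS = ("package.json", "Cargo.toml", "pyproject.toml", "go.mod", "Makefile")

def detect_domain(project_root, cli_domain=None, preset_domain=None):
    """Return (domain, signal), checking signals from highest priority down."""
    root = Path(project_root)
    if cli_domain:                                   # 1. explicit CLI override
        return cli_domain, "--domain flag"
    if preset_domain:                                # 2. team preset field
        return preset_domain, "team preset domain field"
    if (root / "colette.yaml").exists():             # 3. writing marker
        return "writing", "colette.yaml exists"
    if any(root.glob("*.bib")) or (root / "references").is_dir():  # 4. research
        return "research", "*.bib or references/ exists"
    for marker in CODE_MARKERS:                      # 5. code markers
        if (root / marker).exists():
            return "code", f"{marker} exists"
    return "code", "no markers (default)"            # 6. fallback
```

Returning the signal alongside the domain mirrors the `signal` field recorded in the detection event.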
|
|
||||||
### Detection in Team Presets
|
|
||||||
|
|
||||||
Team presets can declare their domain explicitly:
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
# .archeflow/teams/story-development.yaml
|
|
||||||
name: story-development
|
|
||||||
domain: writing # <-- explicit domain
|
|
||||||
description: "Short story development"
|
|
||||||
plan: [story-explorer, creator]
|
|
||||||
do: [maker]
|
|
||||||
check: [guardian, story-sage]
|
|
||||||
```
|
|
||||||
|
|
||||||
When `domain` is set in the preset, detection is skipped entirely.
|
|
||||||
|
|
||||||
### Detection Event
|
|
||||||
|
|
||||||
Domain detection emits a decision event:
|
|
||||||
|
|
||||||
```jsonl
|
|
||||||
{"ts":"...","run_id":"...","seq":1,"parent":[],"type":"decision","phase":"init","agent":null,"data":{"what":"domain_detection","chosen":"writing","signal":"colette.yaml exists","alternatives":[{"id":"code","reason_rejected":"No code project markers found"}]}}
|
|
||||||
```
|
|
||||||
|
|
||||||
## How Domains Affect Orchestration
|
|
||||||
|
|
||||||
### 1. Concept Translation in Reports
|
|
||||||
|
|
||||||
The orchestration report and session log use domain-translated terms:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
# Code domain report
|
|
||||||
- **Files changed:** 4 files, +120 -30 lines
|
|
||||||
- **Tests added:** 8 new tests
|
|
||||||
|
|
||||||
# Writing domain report (same data, different framing)
|
|
||||||
- **Word count delta:** +6004 words across 7 scenes
|
|
||||||
- **Consistency checks:** voice drift 0.12, 2 continuity fixes applied
|
|
||||||
```
|
|
||||||
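Concept translation amounts to a lookup with a fallback. The sketch below uses a few entries from the concept mapping; the function name `translate` and the choice to return the generic term unchanged when no mapping exists are assumptions, not documented ArcheFlow behavior.

```python
# A few entries from the concept mapping (code and writing columns).
CODE_CONCEPTS = {
    "files_changed": "files changed",
    "tests": "automated tests",
    "bug": "bug",
}
WRITING_CONCEPTS = {
    "files_changed": "word count delta",
    "tests": "consistency checks",
    "bug": "continuity error",
}

def translate(term, domain_concepts, defaults=CODE_CONCEPTS):
    """Look up a generic term in the domain, then in the code defaults."""
    return domain_concepts.get(term, defaults.get(term, term))
```

A report generator could call `translate("files_changed", WRITING_CONCEPTS)` when building section labels, so the same run data is framed in the domain's vocabulary.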
|
|
||||||
### 2. Domain-Specific Event Data
|
|
||||||
|
|
||||||
Events include domain-relevant metrics in their `data` payload:
|
|
||||||
|
|
||||||
```jsonl
|
|
||||||
// Writing domain — agent.complete
|
|
||||||
{"type":"agent.complete","data":{"archetype":"maker","duration_ms":180000,"word_count":6004,"voice_drift":0.12,"scenes":7,"dialogue_ratio":0.35,"essen_count":4}}
|
|
||||||
|
|
||||||
// Code domain — agent.complete
|
|
||||||
{"type":"agent.complete","data":{"archetype":"maker","duration_ms":90000,"files_changed":5,"tests_added":12,"coverage_delta":"+3%","lines_added":245,"lines_removed":80}}
|
|
||||||
|
|
||||||
// Writing domain — run.complete
|
|
||||||
{"type":"run.complete","data":{"status":"completed","word_count":6004,"voice_drift_final":0.08,"scenes":7,"dialect_density":0.15,"cycles":1}}
|
|
||||||
|
|
||||||
// Code domain — run.complete
|
|
||||||
{"type":"run.complete","data":{"status":"completed","files_changed":4,"tests_total":20,"coverage":"87%","cycles":2}}
|
|
||||||
```
|
|
||||||
|
|
||||||
### 3. Review Focus Override
|
|
||||||
|
|
||||||
When a domain defines `review_focus`, reviewers receive domain-specific instructions instead of the defaults:
|
|
||||||
|
|
||||||
```
|
|
||||||
# Without domain adapter (code defaults):
|
|
||||||
Guardian → "Check for security vulnerabilities, breaking changes..."
|
|
||||||
|
|
||||||
# With writing domain adapter:
|
|
||||||
Guardian → "Check for plot coherence, character consistency, timeline accuracy, continuity..."
|
|
||||||
```
|
|
||||||
|
|
||||||
The orchestration skill reads the domain's `review_focus` and injects it into the reviewer prompt. The archetype's base personality (virtue, shadow, lens) stays the same — only the checklist changes.
|
|
||||||
|
|
||||||
### 4. Context Injection
|
|
||||||
|
|
||||||
The domain's `context` config tells the orchestrator which additional files to pass to each agent:
|
|
||||||
|
|
||||||
```
|
|
||||||
# Plan phase in writing domain:
|
|
||||||
# Orchestrator automatically includes voice profile, persona, character sheets, series config
|
|
||||||
# alongside the standard task description and Explorer output
|
|
||||||
|
|
||||||
# Check phase in writing domain:
|
|
||||||
# Guardian gets the outline (for coherence)
|
|
||||||
# Sage gets the voice profile (for drift scoring)
|
|
||||||
```
|
|
||||||
|
|
||||||
Context injection is additive — domain context is added on top of ArcheFlow's standard context rules (task description, prior phase output, etc.).
|
|
||||||
|
|
||||||
### 5. Model Overrides
|
|
||||||
|
|
||||||
If the domain specifies `model_overrides`, those override the default model assignment for the listed archetypes:
|
|
||||||
|
|
||||||
```
|
|
||||||
# Default: Maker uses whatever the workflow assigns (often haiku for cheap tasks)
|
|
||||||
# Writing domain: Maker uses sonnet (prose quality matters)
|
|
||||||
# Research domain: Maker uses sonnet (analysis quality matters)
|
|
||||||
```
|
|
||||||
|
|
||||||
Model overrides interact with cost tracking — the cost-tracking skill reads the effective model assignment (after domain overrides) for its estimates.
|
|
||||||
|
|
||||||
## Adding a New Domain
|
## Adding a New Domain
|
||||||
|
|
||||||
1. Create `.archeflow/domains/<name>.yaml` following the schema above
|
1. Create `.archeflow/domains/<name>.yaml` with `name`, `concepts`, `metrics` (minimum required)
|
||||||
2. Add detection signals to the priority table (or rely on `--domain` / team preset)
|
2. Optionally add `review_focus`, `context`, `model_overrides`
|
||||||
3. Define custom archetypes if needed (e.g., `story-explorer` for writing)
|
3. Missing sections fall back to `code` domain defaults
|
||||||
4. Test with `--domain <name> --dry-run` to verify detection and context injection
|
4. Test with `--domain <name> --dry-run`
|
||||||
|
|
||||||
### Minimum Viable Domain
|
## How Domains Affect Orchestration
|
||||||
|
|
||||||
Only `name`, `concepts`, and `metrics` are required. Everything else has sensible defaults:
|
- **Reports** use domain-translated terms (e.g., "word count delta" instead of "files changed")
|
||||||
|
- **Events** include domain-relevant metrics in `agent.complete` and `run.complete` payloads
|
||||||
```yaml
|
- **Reviewers** receive domain-specific focus checklists (archetype personality stays the same)
|
||||||
name: legal
|
- **Context injection** adds domain-declared files to each agent's prompt
|
||||||
description: "Legal document drafting and review"
|
- **Model overrides** change which model an archetype uses (interacts with cost-tracking)
|
||||||
|
- **One domain per run.** Multi-domain projects use separate runs.
|
||||||
concepts:
|
|
||||||
implementation: "draft"
|
|
||||||
tests: "compliance checks"
|
|
||||||
code_review: "legal review"
|
|
||||||
|
|
||||||
metrics:
|
|
||||||
- clause_count
|
|
||||||
- citation_count
|
|
||||||
- compliance_score
|
|
||||||
```
|
|
||||||
|
|
||||||
Missing sections fall back to the `code` domain defaults.
|
|
||||||
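The fallback rule above can be sketched as a section-level merge. This is a minimal illustration, not the plugin's actual loader: `CODE_DEFAULTS`, `resolve_domain`, and the shape of each section are assumptions made for the example, and it assumes a section present in the domain replaces the default section wholesale rather than being deep-merged.

```python
from copy import deepcopy

# Hypothetical stand-in for the built-in `code` domain defaults; the real
# default values and their shapes are not shown in this document.
CODE_DEFAULTS = {
    "review_focus": {"guardian": "Check for security vulnerabilities, breaking changes..."},
    "context": {},
    "model_overrides": {},
}

REQUIRED = ("name", "concepts", "metrics")


def resolve_domain(domain, defaults=CODE_DEFAULTS):
    """Return the domain config with missing optional sections filled from defaults."""
    missing = [key for key in REQUIRED if key not in domain]
    if missing:
        raise ValueError(f"domain config is missing required keys: {missing}")
    resolved = deepcopy(defaults)
    resolved.update(deepcopy(domain))  # sections present in the domain win
    return resolved


# The minimum viable `legal` domain, reduced to its required keys.
legal = {
    "name": "legal",
    "concepts": {"implementation": "draft"},
    "metrics": ["clause_count", "citation_count", "compliance_score"],
}
resolved = resolve_domain(legal)
```

Here `resolved` keeps the three `legal` sections and picks up `review_focus`, `context`, and `model_overrides` from the defaults.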
|
|
||||||
## Integration with Other Skills
|
|
||||||
|
|
||||||
- **`orchestration`**: Reads domain config at `run.start`, applies concept translation, context injection, model overrides, and review focus throughout the run
|
|
||||||
- **`process-log`**: Domain-specific event data fields are included in `agent.complete` and `run.complete` payloads
|
|
||||||
- **`cost-tracking`**: Reads `model_overrides` from the active domain to calculate accurate cost estimates
|
|
||||||
- **`custom-archetypes`**: Domain-specific archetypes (e.g., `story-explorer`, `story-sage`) are defined per-project and referenced in team presets
|
|
||||||
- **`workflow-design`**: Custom workflows can reference a domain explicitly
|
|
||||||
|
|
||||||
## Design Principles
|
|
||||||
|
|
||||||
1. **Additive, not replacing.** Domains add context and translate terms. They do not change the PDCA cycle, archetype system, or event schema.
|
|
||||||
2. **Graceful degradation.** If no domain config exists, everything works as before (code domain defaults).
|
|
||||||
3. **One domain per run.** A run operates in exactly one domain. Multi-domain projects use separate runs.
|
|
||||||
4. **Domain config is data, not code.** YAML files, no scripts. Portable across projects.
|
|
||||||
|
|||||||
@@ -1,200 +0,0 @@
|
|||||||
---
|
|
||||||
name: effectiveness
|
|
||||||
description: |
|
|
||||||
Track archetype effectiveness across runs. Scores each archetype on signal-to-noise,
|
|
||||||
fix rate, cost efficiency, accuracy, and cycle impact. Recommends model tier changes
|
|
||||||
and archetype removal based on rolling averages.
|
|
||||||
<example>User: "Which reviewers are actually useful?"</example>
|
|
||||||
<example>User: "Show archetype effectiveness report"</example>
|
|
||||||
---
|
|
||||||
|
|
||||||
# Agent Effectiveness Scoring
|
|
||||||
|
|
||||||
Track which archetypes are most useful vs. which waste tokens. Over multiple runs, build a profile of each archetype's effectiveness and use it to optimize team composition and model selection.
|
|
||||||
|
|
||||||
## Storage
|
|
||||||
|
|
||||||
```
|
|
||||||
.archeflow/memory/effectiveness.jsonl # Per-run archetype scores (append-only)
|
|
||||||
```
|
|
||||||
|
|
||||||
## Scoring Dimensions
|
|
||||||
|
|
||||||
For each archetype that participates in a run, calculate these scores:
|
|
||||||
|
|
||||||
| Dimension | How Measured | Weight |
|
|
||||||
|-----------|-------------|--------|
|
|
||||||
| **Signal-to-noise** | useful findings / total findings | 0.30 |
|
|
||||||
| **Fix rate** | findings that led to actual fixes / total findings | 0.25 |
|
|
||||||
| **Cost efficiency** | useful findings per dollar spent | 0.20 |
|
|
||||||
| **Accuracy** | findings not contradicted by other reviewers | 0.15 |
|
|
||||||
| **Cycle impact** | did this archetype's findings lead to cycle exit? | 0.10 |
|
|
||||||
|
|
||||||
### Definitions
|
|
||||||
|
|
||||||
- **Useful finding**: A finding in a `review.verdict` event with `severity >= WARNING` (i.e., severity is `warning`, `bug`, or `critical`) AND `fix_required == true`.
|
|
||||||
- **Actual fix**: A `fix.applied` event whose `source` field matches this archetype (or whose DAG `parent` chain traces back to this archetype's `review.verdict` event).
|
|
||||||
- **Contradicted finding**: Another reviewer's `review.verdict` has `verdict == "approved"` for the same scope where this archetype flagged an issue. Approximation: if archetype A flags N findings but archetype B approves the same code with 0 findings in overlapping severity categories, A's unmatched findings are considered potentially contradicted.
|
|
||||||
- **Cycle impact**: The archetype's findings (with `fix_required == true`) resulted in fixes that were part of the final approved cycle. Determined by checking if `fix.applied` events referencing this archetype exist before the final `cycle.boundary` with `met == true`.
|
|
||||||
|
|
||||||
### Composite Score
|
|
||||||
|
|
||||||
```
|
|
||||||
composite = (signal_to_noise * 0.30)
|
|
||||||
+ (fix_rate * 0.25)
|
|
||||||
+ (cost_efficiency_normalized * 0.20)
|
|
||||||
+ (accuracy * 0.15)
|
|
||||||
+ (cycle_impact * 0.10)
|
|
||||||
```
|
|
||||||
|
|
||||||
**Cost efficiency normalization**: Raw cost efficiency is `useful_findings / cost_usd`. To normalize it to the 0-1 range, use `min(1.0, raw_efficiency / 100)`. The threshold of 100 means that 100 useful findings per dollar counts as perfect efficiency (achievable with haiku on structured reviews).
|
|
||||||
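As a worked illustration, the weighted sum and the normalization step might look like the sketch below. The function names are invented for the example. Mapping the boolean `cycle_impact` to 1.0 or 0.0 and returning 0.0 for a zero-cost run are assumptions, since the text does not specify either case.

```python
# Weights taken from the Scoring Dimensions table.
WEIGHTS = {
    "signal_to_noise": 0.30,
    "fix_rate": 0.25,
    "cost_efficiency": 0.20,
    "accuracy": 0.15,
    "cycle_impact": 0.10,
}


def normalize_cost_efficiency(useful_findings, cost_usd, ceiling=100.0):
    """Map useful findings per dollar onto 0-1, capped at `ceiling` per dollar."""
    if cost_usd <= 0:
        return 0.0  # assumption: avoid dividing by zero on free or unpriced runs
    return min(1.0, (useful_findings / cost_usd) / ceiling)


def composite_score(signal_to_noise, fix_rate, cost_efficiency_normalized,
                    accuracy, cycle_impact):
    """Weighted sum of the five dimensions; every input is expected in 0-1."""
    impact = 1.0 if cycle_impact else 0.0  # assumption: boolean becomes 1.0 / 0.0
    return (signal_to_noise * WEIGHTS["signal_to_noise"]
            + fix_rate * WEIGHTS["fix_rate"]
            + cost_efficiency_normalized * WEIGHTS["cost_efficiency"]
            + accuracy * WEIGHTS["accuracy"]
            + impact * WEIGHTS["cycle_impact"])
```

Because the weights sum to 1.0, a reviewer that is perfect on every dimension scores 1.0, and a reviewer with no useful findings can score at most 0.25 (accuracy plus cycle impact).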
|
|
||||||
## Per-Run Scoring
|
|
||||||
|
|
||||||
After `run.complete`, calculate scores for each archetype that participated. The `extract` command does this.
|
|
||||||
|
|
||||||
### Per-Run Score Record
|
|
||||||
|
|
||||||
```jsonl
|
|
||||||
{"ts":"2026-04-03T16:00:00Z","run_id":"2026-04-03-der-huster","archetype":"guardian","signal_to_noise":0.85,"fix_rate":1.0,"cost_efficiency":42.5,"accuracy":1.0,"cycle_impact":true,"composite_score":0.91,"tokens":5000,"cost_usd":0.004,"model":"haiku","findings_total":4,"findings_useful":3,"fixes_applied":3}
|
|
||||||
```
|
|
||||||
|
|
||||||
Appended to `.archeflow/memory/effectiveness.jsonl`.
|
|
||||||
|
|
||||||
### Scoring Non-Review Archetypes
|
|
||||||
|
|
||||||
Only archetypes that produce `review.verdict` events are scored (Guardian, Skeptic, Sage, Trickster, and any custom review archetypes). Non-review archetypes (Explorer, Creator, Maker) are tracked by cost-tracking but not effectiveness-scored, because their output quality is measured differently (by whether the run succeeds, not by individual findings).
|
|
||||||
|
|
||||||
## Aggregate Scoring
|
|
||||||
|
|
||||||
Across all runs, maintain rolling averages (computed on-demand, not stored):
|
|
||||||
|
|
||||||
```jsonl
|
|
||||||
{"archetype":"guardian","runs":12,"avg_composite":0.88,"avg_signal_noise":0.82,"avg_cost_efficiency":38.2,"trend":"stable","recommendation":"keep"}
|
|
||||||
{"archetype":"trickster","runs":8,"avg_composite":0.35,"avg_signal_noise":0.20,"avg_cost_efficiency":5.1,"trend":"declining","recommendation":"consider_removing"}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Trend Calculation
|
|
||||||
|
|
||||||
Compare the average composite score of the last 5 runs to the 5 runs before that:
|
|
||||||
|
|
||||||
- **improving**: last-5 avg > prior-5 avg + 0.05
|
|
||||||
- **declining**: last-5 avg < prior-5 avg - 0.05
|
|
||||||
- **stable**: within +/- 0.05
|
|
||||||
|
|
||||||
If fewer than 10 runs exist, trend is `"insufficient_data"`.
|
|
||||||
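The comparison above can be sketched as a small function. It assumes the composite scores for one archetype are passed in chronological order, oldest first; the function name and parameters are illustrative only.

```python
from statistics import mean


def trend(composites, window=5, tolerance=0.05):
    """Classify the trend of one archetype's composite scores (oldest first)."""
    if len(composites) < 2 * window:
        return "insufficient_data"
    last = mean(composites[-window:])
    prior = mean(composites[-2 * window:-window])
    if last > prior + tolerance:
        return "improving"
    if last < prior - tolerance:
        return "declining"
    return "stable"
```

Only the most recent ten scores influence the result, so an archetype with a long history is judged on its current behaviour rather than its lifetime average.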
|
|
||||||
### Recommendations
|
|
||||||
|
|
||||||
Based on aggregate composite scores:
|
|
||||||
|
|
||||||
| Composite Score | Recommendation | Meaning |
|
|
||||||
|----------------|---------------|---------|
|
|
||||||
| >= 0.70 | `keep` | Archetype is valuable, contributes meaningful findings |
|
|
||||||
| 0.40 - 0.69 | `optimize` | Consider cheaper model or tighter review lens |
|
|
||||||
| < 0.40 | `consider_removing` | Might be wasting tokens, review whether it adds value |
|
|
||||||
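The threshold table translates directly into a lookup. The sketch below is illustrative (the function name is an assumption) and treats the 0.40 - 0.69 band as every score from 0.40 up to, but not including, 0.70.

```python
def recommendation(avg_composite):
    """Map an aggregate composite score to the recommendation label."""
    if avg_composite >= 0.70:
        return "keep"
    if avg_composite >= 0.40:
        return "optimize"
    return "consider_removing"
```

Applied to the averages quoted elsewhere in this skill, 0.88 and 0.72 map to `keep`, 0.45 to `optimize`, and 0.35 to `consider_removing`.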
|
|
||||||
## Integration Points
|
|
||||||
|
|
||||||
### At Run Start
|
|
||||||
|
|
||||||
When the `run` skill initializes, show a brief effectiveness summary for the team's archetypes:
|
|
||||||
|
|
||||||
```
|
|
||||||
Archetype effectiveness (last 10 runs):
|
|
||||||
guardian: 0.88 (keep) — haiku, $0.004/run avg
|
|
||||||
sage: 0.72 (keep) — sonnet, $0.08/run avg
|
|
||||||
skeptic: 0.45 (optimize) — haiku, $0.003/run avg
|
|
||||||
trickster: 0.32 (consider_removing) — haiku, $0.003/run avg
|
|
||||||
```
|
|
||||||
|
|
||||||
### Model Tier Suggestions
|
|
||||||
|
|
||||||
Cross-reference effectiveness with model assignment:
|
|
||||||
|
|
||||||
- **High effectiveness on cheap model** (composite >= 0.7, model = haiku): "Keep cheap. Working well."
|
|
||||||
- **Low effectiveness on cheap model** (composite < 0.5, model = haiku): "Consider upgrading to sonnet — cheap model may not be capturing issues."
|
|
||||||
- **High effectiveness on expensive model** (composite >= 0.7, model = sonnet): "Try downgrading to haiku — may maintain quality at lower cost."
|
|
||||||
- **Low effectiveness on expensive model** (composite < 0.5, model = sonnet): "Consider removing — expensive and not contributing."
|
|
||||||
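One way to read the four rules above is as a small decision function. This is a sketch only: the short return labels are hypothetical stand-ins for the quoted messages, and the rules leave scores from 0.5 up to 0.7 (and any model other than haiku or sonnet) uncovered, so the function returns `None` there.

```python
def model_suggestion(composite, model):
    """Return a suggestion label, or None where the four rules do not apply."""
    if model not in ("haiku", "sonnet"):
        return None  # the rules only mention these two tiers
    cheap = model == "haiku"
    if composite >= 0.7:
        return "keep_cheap" if cheap else "try_downgrade"
    if composite < 0.5:
        return "try_upgrade" if cheap else "consider_removing"
    return None  # 0.5 <= composite < 0.7: no rule given
```

Returning `None` for the uncovered band keeps the sketch from inventing advice the rules do not give; a caller could fall back to the general `keep` / `optimize` / `consider_removing` recommendation in that case.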
|
|
||||||
### Cost-Tracking Integration
|
|
||||||
|
|
||||||
Divide effectiveness by estimated cost to get "value per dollar":
|
|
||||||
|
|
||||||
```
|
|
||||||
value_per_dollar = composite_score / cost_usd
|
|
||||||
```
|
|
||||||
|
|
||||||
This metric helps compare archetypes directly: a cheap archetype with moderate effectiveness may have higher value_per_dollar than an expensive one with high effectiveness.
|
|
||||||
|
|
||||||
## Effectiveness Script
|
|
||||||
|
|
||||||
**Location:** `lib/archeflow-score.sh`
|
|
||||||
|
|
||||||
```
|
|
||||||
Usage:
|
|
||||||
archeflow-score.sh extract <events.jsonl> # Score archetypes from a completed run
|
|
||||||
archeflow-score.sh report # Show aggregate effectiveness report
|
|
||||||
archeflow-score.sh recommend <team.yaml> # Recommend model tiers for a team
|
|
||||||
```
|
|
||||||
|
|
||||||
### `extract` Command
|
|
||||||
|
|
||||||
1. Read all events from the JSONL file
|
|
||||||
2. Verify a `run.complete` event exists (scoring incomplete runs is unreliable)
|
|
||||||
3. For each `review.verdict` event:
|
|
||||||
- Count total findings and useful findings (severity >= WARNING, fix_required)
|
|
||||||
- Cross-reference with `fix.applied` events via the `source` field or DAG parent chain
|
|
||||||
- Check for contradictions from other reviewers
|
|
||||||
- Determine cycle impact
|
|
||||||
4. Calculate all scoring dimensions and composite score
|
|
||||||
5. Append per-archetype score records to `.archeflow/memory/effectiveness.jsonl`
|
|
||||||
|
|
||||||
### `report` Command
|
|
||||||
|
|
||||||
1. Read `.archeflow/memory/effectiveness.jsonl`
|
|
||||||
2. Group by archetype
|
|
||||||
3. Calculate rolling averages (last 10 runs per archetype)
|
|
||||||
4. Calculate trends (last 5 vs. prior 5)
|
|
||||||
5. Output a markdown table:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
# Archetype Effectiveness Report
|
|
||||||
|
|
||||||
| Archetype | Runs | Avg Score | S/N | Fix Rate | Cost Eff | Accuracy | Trend | Rec |
|
|
||||||
|-----------|------|-----------|-----|----------|----------|----------|-------|-----|
|
|
||||||
| guardian | 12 | 0.88 | 0.82 | 0.95 | 38.2 | 0.97 | stable | keep |
|
|
||||||
| sage | 10 | 0.72 | 0.70 | 0.80 | 12.1 | 0.88 | improving | keep |
|
|
||||||
| skeptic | 8 | 0.45 | 0.40 | 0.50 | 22.5 | 0.60 | stable | optimize |
|
|
||||||
| trickster | 8 | 0.35 | 0.20 | 0.30 | 5.1 | 0.55 | declining | consider_removing |
|
|
||||||
|
|
||||||
**Model suggestions:**
|
|
||||||
- skeptic (haiku, score 0.45): Consider upgrading to sonnet or tightening review lens
|
|
||||||
- trickster (haiku, score 0.35): Consider removing — low signal, low fix rate
|
|
||||||
```
|
|
||||||
|
|
||||||
### `recommend` Command
|
|
||||||
|
|
||||||
1. Read the team preset YAML file
|
|
||||||
2. For each archetype in the team, look up its effectiveness from `.archeflow/memory/effectiveness.jsonl`
|
|
||||||
3. Cross-reference current model assignment with effectiveness
|
|
||||||
4. Output recommendations:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
# Model Recommendations for team: story-development
|
|
||||||
|
|
||||||
| Archetype | Current Model | Score | Suggestion |
|
|
||||||
|-----------|--------------|-------|------------|
|
|
||||||
| guardian | haiku | 0.88 | Keep haiku — high effectiveness at low cost |
|
|
||||||
| sage | sonnet | 0.72 | Keep sonnet — quality-sensitive role |
|
|
||||||
| skeptic | haiku | 0.45 | Try sonnet — may improve signal quality |
|
|
||||||
| trickster | haiku | 0.35 | Consider removing from team |
|
|
||||||
```
|
|
||||||
|
|
||||||
## Design Principles
|
|
||||||
|
|
||||||
1. **Append-only.** Score records are immutable facts. Aggregates are computed on-demand.
|
|
||||||
2. **Review archetypes only.** Non-review agents (Explorer, Creator, Maker) are not scored — their value is in the final product, not in individual findings.
|
|
||||||
3. **Relative, not absolute.** Scores are meaningful in comparison (guardian vs. trickster), not as standalone numbers. The thresholds (0.7, 0.4) are starting points — calibrate after 20+ runs.
|
|
||||||
4. **Actionable.** Every report ends with concrete recommendations (keep, optimize, remove, change model).
|
|
||||||
5. **Cheap to compute.** One JSONL scan per report. No databases, no external services.
|
|
||||||
@@ -6,263 +6,86 @@ description: |
|
|||||||
Enables rollback to any phase boundary and full audit trail via git history.
|
Enables rollback to any phase boundary and full audit trail via git history.
|
||||||
<example>Automatically loaded by archeflow:run when git.enabled is true</example>
|
<example>Automatically loaded by archeflow:run when git.enabled is true</example>
|
||||||
<example>User: "archeflow rollback --to plan"</example>
|
<example>User: "archeflow rollback --to plan"</example>
|
||||||
<example>User: "Show me the git history for this run"</example>
|
|
||||||
---
|
---
|
||||||
|
|
||||||
# Git Integration — Per-Phase Commit Strategy
|
# Git Integration -- Per-Phase Commit Strategy
|
||||||
|
|
||||||
Every ArcheFlow run creates a dedicated branch. Each phase transition and agent completion produces a commit. At run completion, the branch is merged back to the base branch. On failure, the branch stays intact for inspection or rollback.
|
Every run creates branch `archeflow/<run_id>`. Each phase transition and agent completion produces a commit. On success, merge back. On failure, branch stays for inspection.
|
||||||
|
|
||||||
## Prerequisites
|
|
||||||
|
|
||||||
- `archeflow:orchestration` — workflow rules and safety constraints
|
|
||||||
- `archeflow:process-log` — event schema (git events are emitted alongside process events)
|
|
||||||
- `archeflow:artifact-routing` — artifact paths that get committed
|
|
||||||
|
|
||||||
## Helper Script
|
|
||||||
|
|
||||||
All git operations go through the helper script:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
./lib/archeflow-git.sh <command> <run_id> [args...]
|
|
||||||
```
|
|
||||||
|
|
||||||
See `lib/archeflow-git.sh` for full usage. The skill describes *when* to call the script; the script handles *how*.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Branch Strategy
|
## Branch Strategy
|
||||||
|
|
||||||
```
|
```
|
||||||
main (or current base branch)
|
main
|
||||||
└── archeflow/<run_id> # Created at run.start
|
+-- archeflow/<run_id>
|
||||||
├── commit: "archeflow(plan): explorer research"
|
+-- archeflow(plan): explorer research
|
||||||
├── commit: "archeflow(plan): creator outline"
|
+-- archeflow(plan): creator outline
|
||||||
├── commit: "archeflow(plan→do): phase transition"
|
+-- archeflow(plan->do): phase transition
|
||||||
├── commit: "archeflow(do): maker draft"
|
+-- archeflow(do): maker draft
|
||||||
├── commit: "archeflow(do→check): phase transition"
|
+-- archeflow(check): guardian review
|
||||||
├── commit: "archeflow(check): guardian review"
|
+-- archeflow(act): cycle 1 complete
|
||||||
├── commit: "archeflow(check): sage review"
|
+-- archeflow(run): complete
|
||||||
├── commit: "archeflow(check→act): phase transition"
|
|
||||||
├── commit: "archeflow(act): apply 6 fixes"
|
|
||||||
├── commit: "archeflow(act): cycle 1 complete"
|
|
||||||
└── commit: "archeflow(run): complete — <summary>"
|
|
||||||
```
|
```
|
||||||
|
|
||||||
Branch naming: `archeflow/<run_id>` (e.g., `archeflow/2026-04-03-jwt-auth`).
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Commit Points
|
## Commit Points
|
||||||
|
|
||||||
| Trigger | What to commit | Message format |
|
| Trigger | Message format |
|
||||||
|---------|---------------|----------------|
|
|---------|----------------|
|
||||||
| After `agent.complete` | Agent artifacts + any created/modified files | `archeflow(<phase>): <archetype> <summary>` |
|
| `agent.complete` | `archeflow(<phase>): <archetype> <summary>` |
|
||||||
| After `phase.transition` | All artifacts from completed phase | `archeflow(<from>→<to>): phase transition` |
|
| `phase.transition` | `archeflow(<from>-><to>): phase transition` |
|
||||||
| After each `fix.applied` | The fixed file | `archeflow(fix): <source> — <finding summary>` |
|
| `fix.applied` | `archeflow(fix): <source> -- <finding>` |
|
||||||
| After `cycle.boundary` | Everything staged | `archeflow(act): cycle <N> <status>` |
|
| `cycle.boundary` | `archeflow(act): cycle <N> <status>` |
|
||||||
| After `run.complete` | Final state + process report | `archeflow(run): complete — <summary>` |
|
| `run.complete` | `archeflow(run): complete -- <summary>` |
|
||||||
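The message formats in the table can be expressed as templates keyed by event name. The sketch below uses the ASCII `->` and `--` separators from the revised table; the function and field names (`from_phase`, `to_phase`, `n`) are assumptions for illustration and are not part of `archeflow-git.sh`.

```python
# One template per commit trigger, in the revised (ASCII) message format.
TEMPLATES = {
    "agent.complete": "archeflow({phase}): {archetype} {summary}",
    "phase.transition": "archeflow({from_phase}->{to_phase}): phase transition",
    "fix.applied": "archeflow(fix): {source} -- {finding}",
    "cycle.boundary": "archeflow(act): cycle {n} {status}",
    "run.complete": "archeflow(run): complete -- {summary}",
}


def commit_message(event, **fields):
    """Format the commit subject for a run event; raises KeyError if unknown."""
    return TEMPLATES[event].format(**fields)


example = commit_message("agent.complete", phase="plan",
                         archetype="explorer", summary="research")
```

Keeping the scope inside `archeflow(...)` machine-readable is what lets a rollback locate phase boundaries by searching commit subjects.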
|
|
||||||
---
|
|
||||||
|
|
||||||
## Commit Protocol
|
## Commit Protocol
|
||||||
|
|
||||||
1. **Stage only relevant files.** Never `git add -A`. Stage:
|
- Stage only relevant files: `.archeflow/artifacts/<run_id>/`, event log, project files from maker
|
||||||
- `.archeflow/artifacts/<run_id>/` — artifacts produced by the current agent/phase
|
- Never `git add -A`
|
||||||
- `.archeflow/events/<run_id>.jsonl` — updated event log
|
- Exclude: `progress.md`, `explorer-cache/`, `session-log.md`
|
||||||
- Any project files created or modified by the current agent (from `do-maker-files.txt` or explicit file list)
|
- Use conventional commit format
|
||||||
2. **Exclude ephemeral files.** Never commit:
|
- Signing opt-in via `git.signing_key` config
|
||||||
- `.archeflow/progress.md` (live progress display, ephemeral)
|
|
||||||
- `.archeflow/explorer-cache/` (local cache, not run-specific)
|
|
||||||
- `.archeflow/session-log.md` (separate concern)
|
|
||||||
3. **Use conventional commit format:** `archeflow(<scope>): <message>`
|
|
||||||
4. **Signing:** If `git.signing_key` is configured, pass `-c user.signingkey=<key>` to `git commit`.
|
|
||||||
|
|
||||||
### Integration with the Run Skill
|
## All operations go through `./lib/archeflow-git.sh`:
|
||||||
|
|
||||||
The `archeflow:run` skill calls git operations at these points:
|
| Run event | Command |
|
||||||
|
|-----------|---------|
|
||||||
|
| `run.start` | `init <run_id>` (create+switch branch) |
|
||||||
|
| `agent.complete` | `commit <run_id> <phase> "<msg>" [files]` |
|
||||||
|
| `phase.transition` | `phase-commit <run_id> <phase>` |
|
||||||
|
| `run.complete` (ok) | `merge <run_id> [--squash|--no-ff]` |
|
||||||
|
| `run.complete` (fail) | branch preserved |
|
||||||
|
|
||||||
```
|
## Merge
|
||||||
run.start → ./lib/archeflow-git.sh init <run_id>
|
|
||||||
agent.complete → ./lib/archeflow-git.sh commit <run_id> <phase> "<archetype> <summary>" [files...]
|
|
||||||
phase.transition → ./lib/archeflow-git.sh phase-commit <run_id> <phase>
|
|
||||||
fix.applied → ./lib/archeflow-git.sh commit <run_id> fix "<source> — <finding>"
|
|
||||||
cycle.boundary → ./lib/archeflow-git.sh commit <run_id> act "cycle <N> <status>"
|
|
||||||
run.complete (ok) → ./lib/archeflow-git.sh merge <run_id> [--squash|--no-ff]
|
|
||||||
run.complete (fail) → branch preserved, not merged
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
1. Verify all changes committed
|
||||||
|
2. Switch to base branch
|
||||||
## Run Lifecycle
|
3. Merge with configured strategy (squash default)
|
||||||
|
4. Branch NOT auto-deleted (user may inspect)
|
||||||
### 1. Initialization (`run.start`)
|
|
||||||
|
|
||||||
```bash
|
|
||||||
./lib/archeflow-git.sh init <run_id>
|
|
||||||
```
|
|
||||||
|
|
||||||
This will:
|
|
||||||
1. Verify a clean working tree (or stash uncommitted changes)
|
|
||||||
2. Create branch `archeflow/<run_id>` from current HEAD
|
|
||||||
3. Switch to the new branch
|
|
||||||
|
|
||||||
### 2. During Execution (phase commits)
|
|
||||||
|
|
||||||
After each agent completes or phase transitions, the run skill calls:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# After an agent completes:
|
|
||||||
./lib/archeflow-git.sh commit <run_id> plan "explorer research" \
|
|
||||||
.archeflow/artifacts/<run_id>/plan-explorer.md
|
|
||||||
|
|
||||||
# After a phase transition:
|
|
||||||
./lib/archeflow-git.sh phase-commit <run_id> plan
|
|
||||||
```
|
|
||||||
|
|
||||||
The `commit` command stages artifact directories and event logs automatically. Additional files can be passed as trailing arguments.
|
|
||||||
|
|
||||||
The `phase-commit` command stages all artifacts matching the phase prefix and commits with a transition message.
|
|
||||||
|
|
||||||
### 3. Completion (merge)
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Success — squash merge (default):
|
|
||||||
./lib/archeflow-git.sh merge <run_id> --squash
|
|
||||||
|
|
||||||
# Success — preserve history:
|
|
||||||
./lib/archeflow-git.sh merge <run_id> --no-ff
|
|
||||||
|
|
||||||
# Failure or user abort:
|
|
||||||
# Do nothing. Branch stays for inspection.
|
|
||||||
echo "Branch archeflow/<run_id> preserved for inspection."
|
|
||||||
```
|
|
||||||
|
|
||||||
The merge command:
|
|
||||||
1. Verifies all changes on the branch are committed
|
|
||||||
2. Switches to the base branch (main or wherever the run started)
|
|
||||||
3. Merges with the chosen strategy
|
|
||||||
4. If squash: creates a single commit with `feat: <task summary>`
|
|
||||||
5. Does NOT delete the branch (user may want to inspect)
|
|
||||||
|
|
||||||
### 4. Cleanup (optional, after inspection)
|
|
||||||
|
|
||||||
```bash
|
|
||||||
./lib/archeflow-git.sh cleanup <run_id>
|
|
||||||
```
|
|
||||||
|
|
||||||
Deletes the branch after the user has confirmed the merge is correct.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Rollback
|
## Rollback
|
||||||
|
|
||||||
Roll back to the end of any completed phase:
|
`./lib/archeflow-git.sh rollback <run_id> --to <target>`
|
||||||
|
|
||||||
```bash
|
Targets: `plan`, `do`, `check`, `act`, `cycle-N`. Only works on `archeflow/<run_id>` branch. Resets to last commit for target phase and trims event JSONL.
|
||||||
./lib/archeflow-git.sh rollback <run_id> --to plan
|
|
||||||
```
|
|
||||||
|
|
||||||
This will:
|
## Post-Merge Validation
|
||||||
1. Find the last commit for the target phase by searching commit messages
|
|
||||||
2. Show the user what commits will be lost (everything after the target)
|
|
||||||
3. Perform `git reset --hard <commit>` on the branch
|
|
||||||
4. Trim the events JSONL to remove events that occurred after the rollback point
|
|
||||||
|
|
||||||
**Supported rollback targets:** `plan`, `do`, `check`, `act`, or any cycle number (`cycle-1`, `cycle-2`).
|
After merge, runs project test suite (from `test_command` in config) with 5-min timeout. If tests fail: `git revert --no-edit HEAD`.
|
||||||
|
|
||||||
**Safety:** Rollback only works on the run's branch, never on main. The script verifies you are on `archeflow/<run_id>` before proceeding.
|
|
||||||
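Step 1 of the rollback, locating the target commit from commit subjects, might be sketched as follows. This is not the script's actual logic: it assumes the log is supplied newest first as `(sha, subject)` pairs, that a phase target matches only subjects whose scope is exactly that phase (so `archeflow(plan->do)` transition commits are not counted as `plan`), and that `cycle-N` matches the `archeflow(act): cycle N ...` commit.

```python
import re

PHASES = ("plan", "do", "check", "act")


def find_rollback_commit(log, target):
    """Return the SHA of the newest commit for `target`, or None if absent.

    `log` is a list of (sha, subject) tuples ordered newest first.
    """
    cycle = re.fullmatch(r"cycle-(\d+)", target)
    if cycle:
        pattern = re.compile(rf"^archeflow\(act\): cycle {cycle.group(1)}\b")
    elif target in PHASES:
        pattern = re.compile(rf"^archeflow\({target}\): ")
    else:
        raise ValueError(f"unsupported rollback target: {target}")
    for sha, subject in log:
        if pattern.search(subject):
            return sha
    return None


log = [
    ("jkl3456", "archeflow(do): maker implementation"),
    ("ghi9012", "archeflow(plan->do): phase transition"),
    ("def5678", "archeflow(plan): creator outline"),
    ("abc1234", "archeflow(plan): explorer research"),
]
```

With this log, rolling back to `plan` selects `def5678`, the last commit made during the plan phase. Whether the transition commit should be kept instead is a design choice the text leaves open.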
|
|
||||||
---
|
|
||||||
|
|
||||||
## Status
|
|
||||||
|
|
||||||
View the git state of a run:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
./lib/archeflow-git.sh status <run_id>
|
|
||||||
```
|
|
||||||
|
|
||||||
Output:
|
|
||||||
```
|
|
||||||
Branch: archeflow/2026-04-03-jwt-auth
|
|
||||||
Base: main (3 commits ahead)
|
|
||||||
|
|
||||||
Commits:
|
|
||||||
abc1234 archeflow(plan): explorer research
|
|
||||||
def5678 archeflow(plan): creator outline
|
|
||||||
ghi9012 archeflow(plan→do): phase transition
|
|
||||||
jkl3456 archeflow(do): maker implementation
|
|
||||||
|
|
||||||
Current phase: do
|
|
||||||
Files changed (total): 8
|
|
||||||
Uncommitted changes: none
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Configuration
|
## Configuration
|
||||||
|
|
||||||
In `.archeflow/config.yaml` or a team preset:
|
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
git:
|
git:
|
||||||
enabled: true # Default: true. Set false to disable all git operations.
|
enabled: true
|
||||||
branch_prefix: "archeflow/" # Default. The run_id is appended.
|
branch_prefix: "archeflow/"
|
||||||
commit_style: conventional # conventional (archeflow(<scope>): msg) | simple (<phase>: msg)
|
|
||||||
merge_strategy: squash # squash | no-ff | rebase
|
merge_strategy: squash # squash | no-ff | rebase
|
||||||
auto_push: false # Push branch to remote after each commit
|
auto_push: false
|
||||||
signing_key: null # SSH key path for signed commits (e.g., ~/.ssh/id_ed25519.pub)
|
signing_key: null
|
||||||
```
|
```
|
||||||
|
|
||||||
The helper script reads this config if it exists. All values have sensible defaults.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Post-Merge Rollback
|
|
||||||
|
|
||||||
After merging, the run skill validates the merge by running the project's test suite. If tests fail, the merge is automatically reverted.
|
|
||||||
|
|
||||||
### Script
|
|
||||||
|
|
||||||
```bash
|
|
||||||
./lib/archeflow-rollback.sh <run_id> [--test-cmd <cmd>]
|
|
||||||
```
|
|
||||||
|
|
||||||
**Behavior:**
|
|
||||||
1. Reads `test_command` from `.archeflow/config.yaml` (or uses `--test-cmd` override)
|
|
||||||
2. Runs the test suite with a 5-minute timeout
|
|
||||||
3. If tests pass: exits 0 (merge is good)
|
|
||||||
4. If tests fail: runs `git revert --no-edit HEAD`, emits a `decision` event, exits 1
|
|
||||||
5. Before reverting, checks that HEAD is an ArcheFlow merge commit; if it is not, logs a warning and proceeds anyway
|
|
||||||
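The behavior listed above can be approximated in a few lines. This is a hedged sketch, not the contents of `archeflow-rollback.sh`: the function name and the injectable `run` parameter are assumptions, a timeout is treated as a test failure, and emitting the `decision` event and checking that HEAD is an ArcheFlow merge commit are left out.

```python
import subprocess


def validate_merge(test_cmd, timeout_s=300, run=subprocess.run):
    """Run the test suite; revert HEAD and return 1 if it fails, else return 0."""
    try:
        result = run(test_cmd, shell=True, timeout=timeout_s)
        passed = result.returncode == 0
    except subprocess.TimeoutExpired:
        passed = False  # assumption: exceeding the 5-minute limit counts as failure
    if passed:
        return 0
    # Undo the merge with a new commit rather than rewriting history.
    run(["git", "revert", "--no-edit", "HEAD"], check=True)
    return 1
```

A plain `git revert HEAD` fits the default squash merge, which produces an ordinary single-parent commit. If the run was merged with `--no-ff`, HEAD is a true merge commit and `git revert` additionally needs `-m <parent-number>` to know which side to keep.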
|
|
||||||
**Integration with run skill:** Called in section 4c (All Approved) after `archeflow-git.sh merge`. If it returns non-zero, the orchestrator cycles back with "integration test failure" feedback or reports to the user if max cycles are reached.
|
|
||||||
|
|
||||||
**Configuration:** Set `test_command` in `.archeflow/config.yaml`:
|
|
||||||
```yaml
|
|
||||||
test_command: "npm test" # or "pytest", "cargo test", etc.
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Safety Rules
|
## Safety Rules
|
||||||
|
|
||||||
These rules are inherited from `archeflow:orchestration` and reinforced here:
|
- Never force-push
|
||||||
|
- Never modify main history
|
||||||
1. **Never force-push.** No `--force`, no `--force-with-lease`. If a push fails, diagnose and fix.
|
- Branch stays intact on failure
|
||||||
2. **Never modify main history.** Merges are forward-only. No rebasing main.
|
- Clean merge or abort (no force-resolve on conflicts)
|
||||||
3. **Branch stays intact on failure.** If a run fails or is aborted, the branch is preserved for inspection. Never auto-delete failed branches.
|
- Worktree-compatible (Maker's worktree branch is sub-branch of run branch)
|
||||||
4. **All commits are individually revertable.** Each commit represents a discrete unit of work.
|
|
||||||
5. **Worktree mode compatibility.** If the Maker runs in a worktree, git-integration commits go to the worktree's branch. The merge happens at the run level, not the worktree level. The Maker's worktree branch is a sub-branch of `archeflow/<run_id>`.
|
|
||||||
6. **Clean merge or abort.** If a merge produces conflicts, do not force-resolve. Report the conflict, leave the branch intact, and let the user decide.
|
|
||||||
7. **No signing by default.** Signing is opt-in via config. If configured, all commits on the branch are signed.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Design Principles
|
|
||||||
|
|
||||||
1. **Git is the audit trail.** Every phase transition is a commit. `git log` tells the full story of a run.
|
|
||||||
2. **Rollback is cheap.** Reset to any phase boundary, re-run from there. No need to start over.
|
|
||||||
3. **Merge strategy is a project decision.** Squash for clean history, no-ff for detailed history. Both are valid.
|
|
||||||
4. **Events + git = full observability.** Process events capture *what happened* (decisions, verdicts, timing). Git captures *what changed* (files, diffs). Together they provide complete run archaeology.
|
|
||||||
5. **Fail-safe by default.** Every safety rule defaults to the conservative option. The user must explicitly opt in to destructive operations.
|
|
||||||
|
|||||||
@@ -11,21 +11,14 @@ description: |
|
|||||||
|
|
||||||
# Cross-Run Memory
|
# Cross-Run Memory
|
||||||
|
|
||||||
ArcheFlow forgets everything after each run. If Guardian repeatedly flags the same type of issue (e.g., timeline errors in fiction, missing null checks in code), the next run starts from zero. This skill fixes that by extracting lessons from completed runs and injecting them into future agent prompts.
|
ArcheFlow forgets everything after each run. This skill extracts lessons from completed runs and injects them into future agent prompts, so recurring issues (timeline errors, missing null checks) are caught proactively.
|
||||||
|
|
||||||
## Storage
|
## Storage
|
||||||
|
|
||||||
```
|
```
|
||||||
.archeflow/memory/lessons.jsonl # Append-only, one lesson per line
|
.archeflow/memory/lessons.jsonl # Append-only, one lesson per line
|
||||||
```
|
.archeflow/memory/archive.jsonl # Decayed lessons (frequency reached 0)
|
||||||
|
.archeflow/memory/audit.jsonl # Injection audit trail
|
||||||
Each lesson is a single JSON line:
|
|
||||||
|
|
||||||
```jsonl
|
|
||||||
{"id":"m-001","ts":"2026-04-03T14:00:00Z","run_id":"2026-04-03-der-huster","type":"pattern","source":"guardian","description":"Timeline references must match story start day","frequency":2,"severity":"bug","domain":"writing","tags":["continuity","timeline"],"last_seen_run":"2026-04-03-der-huster","runs_since_last_seen":0}
|
|
||||||
{"id":"m-002","ts":"2026-04-03T15:00:00Z","run_id":"2026-04-03-der-huster","type":"preference","source":"user_feedback","description":"User prefers single bundled PR over many small ones","frequency":1,"severity":"info","domain":"general","tags":["workflow"],"last_seen_run":"","runs_since_last_seen":0}
|
|
||||||
{"id":"m-003","ts":"2026-04-04T10:00:00Z","run_id":"2026-04-04-auth-fix","type":"archetype_hint","source":"sage","description":"Voice drift most common in long monologue passages","frequency":3,"severity":"warning","domain":"writing","tags":["voice","prose"],"archetype":"story-sage","last_seen_run":"2026-04-04-auth-fix","runs_since_last_seen":0}
|
|
||||||
{"id":"m-004","ts":"2026-04-04T11:00:00Z","run_id":"2026-04-04-auth-fix","type":"anti_pattern","source":"maker","description":"Splitting auth middleware into per-route handlers causes duplication","frequency":1,"severity":"warning","domain":"code","tags":["auth","middleware"],"last_seen_run":"2026-04-04-auth-fix","runs_since_last_seen":0}
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## Lesson Types
|
## Lesson Types
|
||||||
@@ -33,245 +26,95 @@ Each lesson is a single JSON line:
|
|||||||
| Type | Source | Description |
|
| Type | Source | Description |
|
||||||
|------|--------|-------------|
|
|------|--------|-------------|
|
||||||
| `pattern` | Auto-detected | Recurring finding across runs (same category + similar description) |
|
| `pattern` | Auto-detected | Recurring finding across runs (same category + similar description) |
|
||||||
| `preference` | Manual | User correction or workflow preference (added via CLI) |
|
| `preference` | Manual | User correction or workflow preference (injected immediately, skips frequency threshold) |
|
||||||
| `archetype_hint` | Auto-detected | Per-archetype insight (e.g., Sage catches voice drift in monologues) |
|
| `archetype_hint` | Auto-detected | Per-archetype insight (e.g., Sage catches voice drift in monologues) |
|
||||||
| `anti_pattern` | Manual or auto | Something that was tried and failed — avoid repeating |
|
| `anti_pattern` | Manual or auto | Something that was tried and failed -- avoid repeating |
|
||||||
|
|
||||||
## Lesson Fields
|
## Lesson JSON Fields
|
||||||
|
|
||||||
| Field | Type | Description |
|
| Field | Type | Description |
|
||||||
|-------|------|-------------|
|
|-------|------|-------------|
|
||||||
| `id` | string | Unique ID, format `m-NNN` (monotonically increasing) |
|
| `id` | string | `m-NNN` (monotonically increasing) |
|
||||||
| `ts` | ISO 8601 | When the lesson was created or last updated |
|
| `ts` | ISO 8601 | Created or last updated |
|
||||||
| `run_id` | string | Run that created or last triggered this lesson |
|
| `run_id` | string | Run that created or last triggered this lesson |
|
||||||
| `type` | string | One of: `pattern`, `preference`, `archetype_hint`, `anti_pattern` |
|
| `type` | string | `pattern`, `preference`, `archetype_hint`, `anti_pattern` |
|
||||||
| `source` | string | Archetype or `user_feedback` that originated the lesson |
|
| `source` | string | Archetype name or `user_feedback` |
|
||||||
| `description` | string | Human-readable lesson text |
|
| `description` | string | Human-readable lesson text |
|
||||||
| `frequency` | integer | How many times this lesson was triggered |
|
| `frequency` | integer | Times this lesson was triggered |
|
||||||
| `severity` | string | `bug`, `warning`, `info`, or `recommendation` |
|
| `severity` | string | `bug`, `warning`, `info`, `recommendation` |
|
||||||
| `domain` | string | `writing`, `code`, `general`, or project-specific |
|
| `domain` | string | `writing`, `code`, `general`, or project-specific |
|
||||||
| `tags` | string[] | Keywords for matching and filtering |
|
| `tags` | string[] | Keywords for matching and filtering |
|
||||||
| `archetype` | string or null | For `archetype_hint` type — which archetype this applies to |
|
| `archetype` | string? | For `archetype_hint` -- which archetype this applies to |
|
||||||
| `last_seen_run` | string | Run ID where this lesson was last matched |
|
| `last_seen_run` | string | Run ID where last matched |
|
||||||
| `runs_since_last_seen` | integer | Counter for decay — incremented each run that does NOT trigger this lesson |
|
| `runs_since_last_seen` | integer | Counter for decay |
|
||||||
|
|
||||||
|
Example:
|
||||||
|
```jsonl
|
||||||
|
{"id":"m-001","ts":"2026-04-03T14:00:00Z","run_id":"2026-04-03-der-huster","type":"pattern","source":"guardian","description":"Timeline references must match story start day","frequency":2,"severity":"bug","domain":"writing","tags":["continuity","timeline"],"last_seen_run":"2026-04-03-der-huster","runs_since_last_seen":0}
|
||||||
|
```
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Auto-Detection
|
## Auto-Detection
|
||||||
|
|
||||||
After each `run.complete`, the orchestrator runs lesson extraction:
|
After each `run.complete`, extract lessons from findings:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
./lib/archeflow-memory.sh extract .archeflow/events/<run_id>.jsonl
|
./lib/archeflow-memory.sh extract .archeflow/events/<run_id>.jsonl
|
||||||
```
|
```
|
||||||
|
|
||||||
### Extraction Algorithm
|
The script reads `review.verdict` events, matches findings against existing lessons by keyword overlap (50%+ threshold), increments frequency on matches, and creates new candidate lessons (frequency: 1) for unmatched findings with severity >= WARNING.
|
||||||
|
|
||||||
1. **Read all `review.verdict` events** from the completed run's JSONL.
|
**Promotion rule:** A finding needs `frequency >= 2` (seen in 2+ runs) before injection. This filters out one-off noise. Preferences skip this threshold.
|
||||||
2. **For each finding** in each verdict:
|
|
||||||
a. Tokenize the finding description into keywords (lowercase, strip punctuation).
|
|
||||||
b. Compare keywords against each existing lesson's description + tags.
|
|
||||||
c. **Match threshold:** 50%+ keyword overlap between finding and lesson.
|
|
||||||
3. **If match found:** Update the existing lesson:
|
|
||||||
- Increment `frequency` by 1
|
|
||||||
- Update `ts` to now
|
|
||||||
- Update `last_seen_run` to current run ID
|
|
||||||
- Reset `runs_since_last_seen` to 0
|
|
||||||
4. **If no match AND severity >= WARNING:** Add as candidate lesson with `frequency: 1`.
|
|
||||||
5. **Candidates become active** when `frequency >= 2` (triggered in a second run).
|
|
||||||
|
|
||||||
### Promotion Rule
|
|
||||||
|
|
||||||
A finding that appears in only one run stays at `frequency: 1` — it might be a one-off. Once the same pattern appears in a second run (matched by keyword overlap), it gets promoted to `frequency: 2` and becomes eligible for injection.
|
|
||||||
|
|
||||||
This prevents noise from single-run anomalies while still capturing genuine recurring issues quickly.
|
|
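The matching and promotion steps above can be sketched in Python. This is a minimal illustration and not the code in `archeflow-memory.sh`. The function names, the severity ordering, and the choice to measure overlap against the finding's keywords are assumptions made for the example.

```python
import string

# Assumed ordering; the source only says "severity >= WARNING".
SEVERITY_RANK = {"info": 0, "recommendation": 1, "warning": 2, "bug": 3}


def tokenize(text):
    """Lowercase, strip punctuation, return the set of keywords."""
    table = str.maketrans("", "", string.punctuation)
    return set(text.lower().translate(table).split())


def overlap(finding_text, lesson):
    """Share of the finding's keywords present in the lesson's description + tags."""
    words = tokenize(finding_text)
    if not words:
        return 0.0
    known = tokenize(lesson["description"]) | {t.lower() for t in lesson.get("tags", [])}
    return len(words & known) / len(words)


def record_finding(lessons, finding, run_id, now):
    """Update the best-matching lesson, or append a candidate. Returns it, or None."""
    best = max(lessons, key=lambda l: overlap(finding["description"], l), default=None)
    if best is not None and overlap(finding["description"], best) >= 0.5:
        best["frequency"] += 1
        best["ts"] = now
        best["last_seen_run"] = run_id
        best["runs_since_last_seen"] = 0
        return best
    if SEVERITY_RANK.get(finding["severity"], 0) < SEVERITY_RANK["warning"]:
        return None  # unmatched and below WARNING: ignored
    next_id = 1 + max((int(l["id"].split("-")[1]) for l in lessons), default=0)
    candidate = {
        "id": f"m-{next_id:03d}", "ts": now, "run_id": run_id, "type": "pattern",
        "source": finding.get("source", ""), "description": finding["description"],
        "frequency": 1, "severity": finding["severity"], "tags": [],
        "last_seen_run": run_id, "runs_since_last_seen": 0,
    }
    lessons.append(candidate)
    return candidate
```

A candidate created this way stays at `frequency: 1` until a later run matches it, at which point it reaches 2 and becomes eligible for injection.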
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Injection
|
## Injection
|
||||||
|
|
||||||
At run start, before spawning agents, the orchestrator injects relevant lessons:
|
Before spawning agents, inject relevant lessons:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
LESSONS=$(./lib/archeflow-memory.sh inject <domain> <archetype>)
|
LESSONS=$(./lib/archeflow-memory.sh inject <domain> <archetype>)
|
||||||
```
|
```
|
||||||
|
|
||||||
### Injection Rules
|
Rules: filters by domain (or `general`), optionally by archetype, requires `frequency >= 2`, sorts by frequency descending, caps at 10 lessons. Lessons with `frequency >= 5` are always injected regardless of filters.
|
||||||
|
|
||||||
1. Read `lessons.jsonl`.
|
Injected as a markdown section appended to the agent's system prompt:
|
||||||
2. Filter by `domain` (exact match or `general`) and optionally by `archetype`.
|
|
||||||
3. Only include lessons with `frequency >= 2` (confirmed patterns).
|
|
||||||
4. Sort by frequency descending (most common first).
|
|
||||||
5. Cap at **10 lessons** per injection.
|
|
||||||
6. Lessons with `frequency >= 5` are **always injected** regardless of domain/archetype filter (they are universal enough to matter).
|
|
||||||
|
|
||||||
### Injection Format
|
|
||||||
|
|
||||||
Append to the agent's system prompt as a structured section:
|
|
||||||
|
|
||||||
```markdown
|
```markdown
|
||||||
## Known Issues (from past runs)
|
## Known Issues (from past runs)
|
||||||
- Timeline references must match story start day [seen 3x, guardian]
|
- Timeline references must match story start day [seen 3x, guardian]
|
||||||
- Voice drift common in monologue passages >200 words [seen 2x, sage]
|
- Voice drift common in monologue passages >200 words [seen 2x, sage]
|
||||||
- Missing null checks in API response handlers [seen 5x, guardian]
|
|
||||||
```
|
```
|
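The injection rules (domain filter, frequency threshold, sort, cap) and the output format above can be sketched as follows. This is an illustrative Python model, not the shell implementation. `select_lessons` and `render` are hypothetical names, and two behaviours are assumptions: lessons without an `archetype` field pass any archetype filter, and always-injected lessons still count toward the cap of 10.

```python
def select_lessons(lessons, domain, archetype="", cap=10):
    """Pick the lessons to inject for one agent prompt."""
    def eligible(lesson):
        if lesson["frequency"] >= 5:
            return True  # always injected, regardless of domain/archetype
        # Preferences skip the frequency >= 2 threshold.
        confirmed = lesson["frequency"] >= 2 or lesson["type"] == "preference"
        in_domain = lesson["domain"] in (domain, "general")
        for_archetype = not archetype or lesson.get("archetype") in (None, "", archetype)
        return confirmed and in_domain and for_archetype

    chosen = sorted(filter(eligible, lessons), key=lambda l: l["frequency"], reverse=True)
    return chosen[:cap]


def render(lessons):
    """Format selected lessons as the markdown section appended to the prompt."""
    if not lessons:
        return ""  # nothing to inject: empty string, no error
    lines = ["## Known Issues (from past runs)"]
    lines += [f"- {l['description']} [seen {l['frequency']}x, {l['source']}]" for l in lessons]
    return "\n".join(lines)
```

Returning an empty string when nothing qualifies keeps the caller's `[[ -n "$LESSONS" ]]`-style check meaningful.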
||||||
|
|
||||||
### Integration with Run Skill
|
|
||||||
|
|
||||||
In the `run` skill, after Step 0 (Initialize) and before Step 1 (Plan Phase):
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Load cross-run memory for this domain
|
|
||||||
MEMORY_LESSONS=$(./lib/archeflow-memory.sh inject "$DOMAIN" "")
|
|
||||||
|
|
||||||
# Inject into Explorer/Creator prompts if non-empty
|
|
||||||
if [[ -n "$MEMORY_LESSONS" ]]; then
|
|
||||||
EXPLORER_PROMPT="${EXPLORER_PROMPT}
|
|
||||||
|
|
||||||
${MEMORY_LESSONS}"
|
|
||||||
CREATOR_PROMPT="${CREATOR_PROMPT}
|
|
||||||
|
|
||||||
${MEMORY_LESSONS}"
|
|
||||||
fi
|
|
||||||
```
|
|
||||||
|
|
||||||
For reviewers in the Check phase, inject archetype-specific lessons:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
GUARDIAN_LESSONS=$(./lib/archeflow-memory.sh inject "$DOMAIN" "guardian")
|
|
||||||
SAGE_LESSONS=$(./lib/archeflow-memory.sh inject "$DOMAIN" "sage")
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Decay
|
## Decay
|
||||||
|
|
||||||
Lessons that stop being relevant should fade out. After each `run.complete`, apply decay:
|
After each `run.complete`, apply decay: a lesson not seen for 10 consecutive runs loses 1 frequency and its counter resets. When frequency reaches 0, the lesson is archived.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
./lib/archeflow-memory.sh decay
|
./lib/archeflow-memory.sh decay
|
||||||
```
|
```
|
||||||
|
|
||||||
### Decay Algorithm
|
|
||||||
|
|
||||||
1. For every lesson in `lessons.jsonl`:
|
|
||||||
- If `last_seen_run` is NOT the current run → increment `runs_since_last_seen` by 1
|
|
||||||
2. If `runs_since_last_seen >= 10`:
|
|
||||||
- Decrement `frequency` by 1
|
|
||||||
- Reset `runs_since_last_seen` to 0
|
|
||||||
3. If `frequency` drops to 0:
|
|
||||||
- Move the lesson to `.archeflow/memory/archive.jsonl` (append)
|
|
||||||
- Remove from `lessons.jsonl`
|
|
||||||
|
|
||||||
This means a lesson that was seen 5 times but then stops appearing will survive 50 runs of non-triggering before being fully archived (5 decrements x 10 runs each).
|
|
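The decay steps above can be modelled in a few lines of Python. This is a sketch for illustration only: `apply_decay` is a hypothetical name, and it works on in-memory dicts rather than rewriting `lessons.jsonl` and appending to `archive.jsonl`.

```python
def apply_decay(lessons, current_run_id, threshold=10):
    """Return (active, archived) after one run's worth of decay."""
    active, archived = [], []
    for lesson in lessons:
        lesson = dict(lesson)  # work on a copy, leave the input untouched
        if lesson["last_seen_run"] != current_run_id:
            lesson["runs_since_last_seen"] += 1
        if lesson["runs_since_last_seen"] >= threshold:
            lesson["frequency"] -= 1
            lesson["runs_since_last_seen"] = 0
        # Frequency 0 means the lesson leaves the active set.
        (archived if lesson["frequency"] <= 0 else active).append(lesson)
    return active, archived
```

Calling it once per completed run with a lesson at `frequency: 5` that never triggers again archives that lesson on the 50th call, which matches the arithmetic above.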
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Manual Management
|
## Manual Management
|
||||||
|
|
||||||
### Add a lesson
|
```bash
|
||||||
|
archeflow memory add "User prefers single bundled PR" # Add preference (injected immediately)
|
||||||
|
archeflow memory list # Show all active lessons
|
||||||
|
archeflow memory forget m-002 # Archive a lesson
|
||||||
|
```
|
||||||
|
|
||||||
|
## Audit Trail
|
||||||
|
|
||||||
|
Track which lessons are injected per run and whether they were effective. Pass `--audit <run_id>` to `inject` to write an audit record. After a run, `audit-check <run_id>` compares injected lessons against review findings: no matching finding = helpful (issue likely prevented), matching finding = ineffective (issue repeated despite injection).
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
archeflow memory add "User prefers single bundled PR over many small ones"
|
./lib/archeflow-memory.sh inject "$DOMAIN" "" --audit "$RUN_ID"
|
||||||
# Internally: ./lib/archeflow-memory.sh add preference "User prefers single bundled PR over many small ones"
|
./lib/archeflow-memory.sh audit-check <run_id>
|
||||||
```
|
```
|
||||||
|
|
||||||
Manually added lessons start at `frequency: 1` but with type `preference`, which means they are injected immediately (preferences skip the frequency >= 2 threshold).
|
|
||||||
|
|
||||||
### List lessons
|
|
||||||
|
|
||||||
```bash
|
|
||||||
archeflow memory list
|
|
||||||
# Internally: ./lib/archeflow-memory.sh list
|
|
||||||
```
|
|
||||||
|
|
||||||
Output:
|
|
||||||
|
|
||||||
```
|
|
||||||
ID Freq Type Domain Description
|
|
||||||
m-001 3 pattern writing Timeline references must match story start day
|
|
||||||
m-002 1 preference general User prefers single bundled PR over many small ones
|
|
||||||
m-003 5 archetype_hint writing Voice drift most common in long monologue passages
|
|
||||||
m-004 1 anti_pattern code Splitting auth middleware causes duplication
|
|
||||||
```
|
|
||||||
|
|
||||||
### Forget a lesson
|
|
||||||
|
|
||||||
```bash
|
|
||||||
archeflow memory forget m-002
|
|
||||||
# Internally: ./lib/archeflow-memory.sh forget m-002
|
|
||||||
```
|
|
||||||
|
|
||||||
Moves the lesson to `archive.jsonl` regardless of frequency.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Integration Points
|
## Integration Points
|
||||||
|
|
||||||
| Moment | Action | Script Command |
|
| Moment | Action | Script Command |
|
||||||
|--------|--------|----------------|
|
|--------|--------|----------------|
|
||||||
| After `run.complete` | Extract lessons from findings | `archeflow-memory.sh extract <events.jsonl>` |
|
| After `run.complete` | Extract lessons from findings | `archeflow-memory.sh extract <events.jsonl>` |
|
||||||
| After extraction | Apply decay to all lessons | `archeflow-memory.sh decay` |
|
| After extraction | Apply decay to all lessons | `archeflow-memory.sh decay` |
|
||||||
| Before agent spawn (run start) | Inject relevant lessons | `archeflow-memory.sh inject <domain> <archetype>` |
|
| Before agent spawn | Inject relevant lessons | `archeflow-memory.sh inject <domain> <archetype>` |
|
||||||
| User command | Add/list/forget lessons | `archeflow-memory.sh add/list/forget` |
|
| User command | Add/list/forget lessons | `archeflow-memory.sh add/list/forget` |
|
||||||
|
|
||||||
## Audit Trail
|
|
||||||
|
|
||||||
Track which lessons are injected into each run and whether they were effective.
|
|
||||||
|
|
||||||
### Storage
|
|
||||||
|
|
||||||
```
|
|
||||||
.archeflow/memory/audit.jsonl # Append-only audit log
|
|
||||||
```
|
|
||||||
|
|
||||||
### Injection Audit Record
|
|
||||||
|
|
||||||
When `--audit <run_id>` is passed to the `inject` command, an audit record is written:
|
|
||||||
|
|
||||||
```jsonl
|
|
||||||
{"ts":"2026-04-04T10:00:00Z","run_id":"2026-04-04-auth-fix","domain":"code","archetype":"","lessons_injected":["m-001","m-003"],"lesson_count":2}
|
|
||||||
```
|
|
||||||
|
|
||||||
Usage:
|
|
||||||
```bash
|
|
||||||
./lib/archeflow-memory.sh inject "$DOMAIN" "" --audit "$RUN_ID"
|
|
||||||
```
|
|
||||||
|
|
||||||
### Effectiveness Check
|
|
||||||
|
|
||||||
After a run completes, check whether injected lessons prevented issues:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
./lib/archeflow-memory.sh audit-check <run_id>
|
|
||||||
```
|
|
||||||
|
|
||||||
This command:
|
|
||||||
1. Reads `audit.jsonl` for lessons injected in the given run
|
|
||||||
2. Reads the run's event file for `review.verdict` events
|
|
||||||
3. For each injected lesson, checks keyword overlap between the lesson's description and review findings
|
|
||||||
4. **No matching finding** = `helpful` (the lesson likely prevented the issue)
|
|
||||||
5. **Matching finding** = `ineffective` (the issue repeated despite the lesson being injected)
|
|
||||||
6. Appends effectiveness results to `audit.jsonl`
|
|
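The classification in steps 3 to 5 can be sketched as below. The record fields `type`, `lesson_id`, and `effectiveness` follow the `jq` query used elsewhere in this document; the function name, the 50% threshold, and measuring overlap against the lesson's keywords are assumptions for the example, since the source does not state them for this command.

```python
import string


def tokenize(text):
    """Lowercase, strip punctuation, return the set of keywords."""
    table = str.maketrans("", "", string.punctuation)
    return set(text.lower().translate(table).split())


def audit_check(run_id, injected, findings, threshold=0.5):
    """Classify each injected lesson as helpful or ineffective for one run."""
    finding_words = [tokenize(f) for f in findings]
    records = []
    for lesson in injected:
        words = tokenize(lesson["description"])
        # A finding that shares enough keywords means the issue came back.
        repeated = any(
            words and len(words & fw) / len(words) >= threshold
            for fw in finding_words
        )
        records.append({
            "type": "effectiveness_check",
            "run_id": run_id,
            "lesson_id": lesson["id"],
            "effectiveness": "ineffective" if repeated else "helpful",
        })
    return records
```

Note that `helpful` here only means "no matching finding this run"; it cannot prove the lesson caused the absence.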
||||||
|
|
||||||
### Effectiveness Over Time
|
|
||||||
|
|
||||||
By querying `audit.jsonl` for effectiveness records, you can measure:
|
|
||||||
- Which lessons consistently prevent issues (high `helpful` count)
|
|
||||||
- Which lessons are not working (high `ineffective` count — consider rewording or removing)
|
|
||||||
- Overall memory system ROI (ratio of helpful to ineffective across all runs)
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Count effectiveness per lesson
|
|
||||||
jq -r 'select(.type == "effectiveness_check") | [.lesson_id, .effectiveness] | @tsv' .archeflow/memory/audit.jsonl | sort | uniq -c
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Design Principles
|
|
||||||
|
|
||||||
1. **Append-only storage.** `lessons.jsonl` is append-only during writes; decay rewrites the file in place but preserves all data (archived lessons move to `archive.jsonl`).
|
|
||||||
2. **Conservative promotion.** A finding must appear in 2+ runs before injection. One-offs are noise.
|
|
||||||
3. **Graceful degradation.** If `lessons.jsonl` doesn't exist, injection returns empty — no error, no block.
|
|
||||||
4. **Cheap.** Keyword matching, not embeddings. `jq` for JSON, `grep` for matching. No external services.
|
|
||||||
5. **Bounded.** Max 10 lessons injected per prompt. Prevents context pollution.
|
|
||||||
|
|||||||
@@ -6,624 +6,138 @@ description: |
|
|||||||
and enforces a shared budget. Each sub-run uses the standard `run` skill internally.
|
and enforces a shared budget. Each sub-run uses the standard `run` skill internally.
|
||||||
<example>User: "archeflow:multi-project" with a multi-run.yaml</example>
|
<example>User: "archeflow:multi-project" with a multi-run.yaml</example>
|
||||||
<example>User: "Run this across archeflow, colette, and giesing"</example>
|
<example>User: "Run this across archeflow, colette, and giesing"</example>
|
||||||
<example>User: "archeflow:multi-project --dry-run"</example>
|
|
||||||
---
|
---
|
||||||
|
|
||||||
# Multi-Project Orchestration
|
# Multi-Project Orchestration
|
||||||
|
|
||||||
Coordinates ArcheFlow runs across multiple projects in a workspace. Each project gets its own
|
Coordinates ArcheFlow runs across multiple projects. Each project gets its own PDCA run (via `run` skill), but dependencies are respected, artifacts shared, and budget tracked globally.
|
||||||
PDCA run (via the standard `run` skill), but dependencies between projects are respected, artifacts
|
|
||||||
are shared, and budget is tracked globally.
|
|
||||||
|
|
||||||
## Prerequisites
|
|
||||||
|
|
||||||
Load these skills (they are referenced throughout):
|
|
||||||
- `archeflow:run` — single-project PDCA execution loop
|
|
||||||
- `archeflow:process-log` — event schema and DAG parent rules
|
|
||||||
- `archeflow:artifact-routing` — artifact naming, context injection, cycle archiving
|
|
||||||
- `archeflow:cost-tracking` — cost aggregation and budget enforcement
|
|
||||||
- `archeflow:domains` — domain detection per project
|
|
||||||
|
|
||||||
## Invocation
|
|
||||||
|
|
||||||
```
|
|
||||||
archeflow:multi-project # Read from .archeflow/multi-run.yaml
|
|
||||||
archeflow:multi-project --config path/to.yaml # Explicit config file
|
|
||||||
archeflow:multi-project --dry-run # Plan phase only for all projects, show cost estimate
|
|
||||||
archeflow:multi-project --resume <multi-run-id> # Resume a failed/paused multi-run
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Multi-Run Definition
|
## Multi-Run Definition
|
||||||
|
|
||||||
A multi-run is defined in YAML, either in `.archeflow/multi-run.yaml` or passed via `--config`.
|
Defined in `.archeflow/multi-run.yaml` or passed via `--config`.
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
name: "giesing-gschichten-v2"
|
name: "giesing-gschichten-v2"
|
||||||
description: "Write second story with improved ArcheFlow + Colette integration"
|
|
||||||
|
|
||||||
projects:
|
projects:
|
||||||
- id: archeflow
|
- id: archeflow
|
||||||
path: "../archeflow" # Relative to workspace root, or absolute
|
path: "../archeflow"
|
||||||
task: "Add memory injection to run skill"
|
task: "Add memory injection to run skill"
|
||||||
workflow: fast # fast | standard | thorough (optional, auto-select if omitted)
|
workflow: fast
|
||||||
domain: code # Optional, auto-detected if omitted
|
depends_on: []
|
||||||
depends_on: [] # No dependencies — can start immediately
|
|
||||||
|
|
||||||
- id: colette
|
- id: colette
|
||||||
path: "../writing.colette"
|
path: "../writing.colette"
|
||||||
task: "Add story-specific voice validation command"
|
task: "Add voice validation command"
|
||||||
workflow: standard
|
depends_on: []
|
||||||
domain: code
|
|
||||||
depends_on: [] # Independent of archeflow — runs in parallel
|
|
||||||
|
|
||||||
- id: giesing
|
- id: giesing
|
||||||
path: "."
|
path: "."
|
||||||
task: "Write story #2 using improved tools"
|
task: "Write story #2"
|
||||||
workflow: kurzgeschichte
|
workflow: kurzgeschichte
|
||||||
domain: writing
|
domain: writing
|
||||||
depends_on: [archeflow, colette] # Waits for both to complete
|
depends_on: [archeflow, colette]
|
||||||
|
|
||||||
budget:
|
budget:
|
||||||
total_usd: 15.00 # Hard cap — stops all projects when exceeded
|
total_usd: 15.00
|
||||||
per_project_usd: 10.00 # Soft cap — warns but does not stop
|
per_project_usd: 10.00
|
||||||
|
|
||||||
parallel: true # Run independent projects concurrently (default: true)
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Definition Rules
|
**Rules:** Unique `id` per project. `depends_on` references other `id` values. Cycles rejected at validation. At least one project must have empty `depends_on`. `workflow` and `domain` auto-select if omitted.
|
||||||
|
|
||||||
- `id` must be unique within the multi-run.
|
## Dependency Resolution
|
||||||
- `path` is resolved relative to the directory containing the YAML file unless absolute.
|
|
||||||
- `depends_on` references other project `id` values. Cycles are rejected at validation time.
|
|
||||||
- `workflow` and `domain` are optional. If omitted, the `run` skill auto-selects per project.
|
|
||||||
- At least one project must have an empty `depends_on` (otherwise the DAG has no entry point).
|
|
||||||
|
|
||||||
---
|
Topological sort of the project DAG determines execution order.
|
||||||
|
|
||||||
## Workspace Registry Integration
|
|
||||||
|
|
||||||
If `docs/project-registry.md` exists at the workspace root, the multi-project skill can:
|
|
||||||
|
|
||||||
1. **Auto-discover paths:** When `path` is omitted from a project entry, look up the project `id` in the registry to find its directory.
|
|
||||||
2. **Validate existence:** Before starting, verify that every project path exists on disk. Abort with a clear error if a path is missing.
|
|
||||||
3. **Show registry status:** In the progress table, include the project's current sprint goal from the registry alongside the multi-run status.
|
|
||||||
4. **Update registry:** After the multi-run completes, update each project's status in the registry if meaningful changes were made (new features, completed sprint goals).
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Execution Steps
|
|
||||||
|
|
||||||
### 0. Validate and Initialize
|
|
||||||
|
|
||||||
**0a. Parse and validate the multi-run definition:**
|
|
||||||
|
|
||||||
```
|
```
|
||||||
1. Read the YAML file.
|
|
||||||
2. Validate all required fields (name, projects with id/path/task).
|
|
||||||
3. Resolve all paths to absolute paths.
|
|
||||||
4. Verify each path exists on disk.
|
|
||||||
5. Build the dependency DAG.
|
|
||||||
6. Check for cycles — abort if any detected.
|
|
||||||
7. Identify the entry-point projects (depends_on is empty).
|
|
||||||
8. Verify at least one entry-point exists.
|
|
||||||
```
|
|
||||||
|
|
||||||
**0b. Generate multi-run ID and directory structure:**
|
|
||||||
|
|
||||||
```bash
|
|
||||||
MULTI_RUN_ID="$(date -u +%Y-%m-%d)-${name}"
|
|
||||||
|
|
||||||
# Master event file
|
|
||||||
mkdir -p .archeflow/events
|
|
||||||
touch .archeflow/events/${MULTI_RUN_ID}.jsonl
|
|
||||||
|
|
||||||
# Cross-project artifact directory
|
|
||||||
mkdir -p .archeflow/artifacts/${MULTI_RUN_ID}
|
|
||||||
for project in ${PROJECT_IDS}; do
|
|
||||||
mkdir -p .archeflow/artifacts/${MULTI_RUN_ID}/${project}
|
|
||||||
done
|
|
||||||
|
|
||||||
# Progress file
|
|
||||||
touch .archeflow/multi-progress.md
|
|
||||||
```
|
|
||||||
|
|
||||||
**0c. Emit `multi.start`:**
|
|
||||||
|
|
||||||
```jsonl
|
|
||||||
{"ts":"...","run_id":"<MULTI_RUN_ID>","seq":1,"parent":[],"type":"multi.start","phase":"init","agent":null,"data":{"name":"giesing-v2","description":"...","projects":["archeflow","colette","giesing"],"parallel":true,"budget_total_usd":15.00,"dag":{"archeflow":[],"colette":[],"giesing":["archeflow","colette"]}}}
|
|
||||||
```
|
|
||||||
|
|
||||||
**Track state throughout the multi-run:**
|
|
||||||
- `MULTI_RUN_ID` — unique multi-run identifier
|
|
||||||
- `MULTI_SEQ` — master event sequence counter
|
|
||||||
- `PROJECT_STATUS` — map of project_id to status (`pending | running | completed | failed | blocked | skipped`)
|
|
||||||
- `PROJECT_RUN_IDS` — map of project_id to its sub-run_id
|
|
||||||
- `TOTAL_COST` — running cost total across all projects
|
|
||||||
- `REMAINING_BUDGET` — budget minus total cost
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### 1. Dependency Resolution
|
|
||||||
|
|
||||||
Build a topological sort of the project DAG. This determines execution order.
|
|
||||||
|
|
||||||
```
|
|
||||||
Given:
|
|
||||||
archeflow: depends_on=[]
|
|
||||||
colette: depends_on=[]
|
|
||||||
giesing: depends_on=[archeflow, colette]
|
|
||||||
|
|
||||||
Topological layers:
|
|
||||||
Layer 0 (immediate): [archeflow, colette] # No deps, start now
|
Layer 0 (immediate): [archeflow, colette] # No deps, start now
|
||||||
Layer 1: [giesing] # Depends on Layer 0
|
Layer 1: [giesing] # Depends on Layer 0
|
||||||
```
|
```
|
||||||
|
|
||||||
**Algorithm:**
|
Independent projects in the same layer run in parallel. When a project completes, downstream projects with all deps met move to the ready queue.
|
||||||
1. Find all projects with zero unmet dependencies. These form the current layer.
|
|
||||||
2. When a project completes, remove it from the dependency lists of all downstream projects.
|
|
||||||
3. Any project whose dependency list becomes empty moves to the ready queue.
|
|
||||||
4. Repeat until all projects are complete, failed, or blocked.
|
|
||||||
|
|
||||||
**Cycle detection:** Before starting, verify the DAG is acyclic. Use Kahn's algorithm — if after processing all nodes the sorted list is shorter than the project list, there is a cycle. Report which projects form the cycle and abort.
|
Cycle detection via Kahn's algorithm. If sorted list is shorter than project list, report the cycle and abort.
|
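The layering and cycle check can be sketched with a Kahn-style pass in Python. This is an illustrative model and not the skill's actual implementation. `topological_layers` is a hypothetical name, and rejecting unknown `depends_on` ids is an added assumption.

```python
def topological_layers(dag):
    """dag maps project id -> list of ids it depends on. Returns a list of layers."""
    unknown = {d for deps in dag.values() for d in deps} - set(dag)
    if unknown:
        raise ValueError(f"depends_on references unknown projects: {sorted(unknown)}")

    unmet = {project: set(deps) for project, deps in dag.items()}
    layers = []
    while unmet:
        # Projects with no unmet dependencies form the next layer.
        ready = sorted(p for p, deps in unmet.items() if not deps)
        if not ready:
            # Nothing can start: the remaining projects are on or behind a cycle.
            raise ValueError(f"dependency cycle involving: {sorted(unmet)}")
        layers.append(ready)
        for project in ready:
            del unmet[project]
        for deps in unmet.values():
            deps.difference_update(ready)
    return layers
```

For the example definition, `topological_layers({"archeflow": [], "colette": [], "giesing": ["archeflow", "colette"]})` yields the two layers shown above, and projects within one layer are the ones that may run in parallel.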
||||||
|
|
||||||
---
|
## Parallel Execution
|
||||||
|
|
||||||
### 2. Parallel Execution
|
For each ready project, start a sub-run as a parallel subagent with `isolation: "worktree"`. Each sub-run invokes `archeflow:run` with its own run_id, workflow, domain, and budget slice.
|
||||||
|
|
||||||
For each project in the ready queue, start a sub-run. Independent projects run concurrently.
|
When `parallel: false`, run sequentially in topological order.
|
||||||
|
|
||||||
**Starting a sub-run:**
|
## Cross-Project Artifacts
|
||||||
|
|
||||||
```
|
When project B depends on A, B's Explorer receives upstream artifact summaries:
|
||||||
For each ready project:
|
- Only summaries injected (not full artifacts)
|
||||||
1. Set PROJECT_STATUS[project_id] = "running"
|
- Large artifacts (>200 lines): extract summary section only
|
||||||
2. Generate sub-run ID: MULTI_RUN_ID/project_id
|
- Cross-project injection happens only in Plan phase
|
||||||
(e.g., "2026-04-03-giesing-v2/archeflow")
|
- Downstream Explorer has filesystem access to full artifacts if needed
|
||||||
3. Emit project.start to master event file
|
|
||||||
4. cd into the project's path
|
|
||||||
5. Invoke archeflow:run with:
|
|
||||||
- run_id = MULTI_RUN_ID/project_id
|
|
||||||
- workflow = project.workflow (or auto-select)
|
|
||||||
- domain = project.domain (or auto-detect)
|
|
||||||
- budget = min(per_project_budget, remaining_total_budget)
|
|
||||||
- artifact_dir = .archeflow/artifacts/MULTI_RUN_ID/project_id/
|
|
||||||
6. The sub-run emits its own events to its own JSONL file
|
|
||||||
inside the project's directory (standard run behavior)
|
|
||||||
```
|
|
||||||
|
|
||||||
**Concurrency model:**
|
Artifact directory: `.archeflow/artifacts/<MULTI_RUN_ID>/<project_id>/`
|
||||||
|
|
||||||
When `parallel: true` (default), spawn independent projects as parallel subagents:
|
## Budget Coordination
|
||||||
|
|
||||||
```
|
| Level | Type | Behavior |
|
||||||
Agent(
|
|
||||||
description: "Multi-project sub-run: <project_id> — <task>",
|
|
||||||
prompt: "Run archeflow:run in <path> with task: <task>.
|
|
||||||
Run ID: <MULTI_RUN_ID>/<project_id>
|
|
||||||
Workflow: <workflow>
|
|
||||||
Domain: <domain>
|
|
||||||
Budget: $<per_project_budget>
|
|
||||||
Save artifacts to: .archeflow/artifacts/<MULTI_RUN_ID>/<project_id>/
|
|
||||||
When complete, report: status, cost, artifact list, and any issues.",
|
|
||||||
isolation: "worktree",
|
|
||||||
mode: "bypassPermissions"
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
Launch all Layer 0 projects simultaneously. As each completes, check if any Layer 1+ projects become unblocked.
|
|
||||||
|
|
||||||
When `parallel: false`, run projects sequentially in topological order. Still respect dependencies — a project does not start until all its dependencies have completed.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### 3. Master Events
|
|
||||||
|
|
||||||
All multi-run-level events are written to `.archeflow/events/<MULTI_RUN_ID>.jsonl`. These track the overall orchestration, not individual PDCA phases (those go to each project's own event file).
|
|
||||||
|
|
||||||
#### Master Event Types
|
|
||||||
|
|
||||||
| Event | When | Key Data |
|
|
||||||
|-------|------|----------|
|
|-------|------|----------|
|
||||||
| `multi.start` | Multi-run begins | Project list, DAG, budget |
|
| `total_usd` | Hard cap | Stops ALL projects when exceeded |
|
||||||
| `project.start` | A sub-run launches | project_id, run_id, path |
|
| `per_project_usd` | Soft cap | Warns but continues |
|
||||||
| `project.complete` | A sub-run finishes successfully | project_id, status, cost, artifacts |
|
|
||||||
| `project.failed` | A sub-run fails | project_id, error, cost_so_far |
|
|
||||||
| `project.blocked` | A dependency failed, blocking this project | project_id, blocked_by |
|
|
||||||
| `project.unblocked` | All dependencies met, project can start | project_id, unblocked_by |
|
|
||||||
| `project.skipped` | User chose to skip a blocked project | project_id, reason |
|
|
||||||
| `budget.warning` | Budget threshold crossed | spent, budget, percent |
|
|
||||||
| `budget.exceeded` | Hard budget cap hit | spent, budget, halted_projects |
|
|
||||||
| `multi.complete` | All projects done (or halted) | status, projects_completed, total_cost |
|
|
||||||
|
|
||||||
#### Example Master Event Stream
|
**Enforcement points:**
|
||||||
|
1. Before starting a sub-run: estimate cost, halt if > remaining budget
|
||||||
|
2. After each sub-run: update total, emit `budget.warning` at threshold, emit `budget.exceeded` at cap
|
||||||
|
|
||||||
```jsonl
|
Each sub-run receives `min(per_project_usd, remaining_total_budget)` as its budget.
|
||||||
{"seq":1,"type":"multi.start","phase":"init","data":{"name":"giesing-v2","projects":["archeflow","colette","giesing"],"parallel":true,"budget_total_usd":15.00}}
|
|
||||||
{"seq":2,"type":"project.start","phase":"run","data":{"project":"archeflow","run_id":"2026-04-03-giesing-v2/archeflow","path":"/home/c/projects/archeflow"}}
|
|
||||||
{"seq":3,"type":"project.start","phase":"run","data":{"project":"colette","run_id":"2026-04-03-giesing-v2/colette","path":"/home/c/projects/writing.colette"}}
|
|
||||||
{"seq":4,"type":"project.complete","phase":"run","data":{"project":"archeflow","status":"completed","run_id":"2026-04-03-giesing-v2/archeflow","cost_usd":1.20,"artifacts":["plan-explorer.md","plan-creator.md","do-maker.md","check-guardian.md"]}}
|
|
||||||
{"seq":5,"type":"project.complete","phase":"run","data":{"project":"colette","status":"completed","run_id":"2026-04-03-giesing-v2/colette","cost_usd":1.80,"artifacts":["plan-creator.md","do-maker.md","check-guardian.md","check-sage.md"]}}
|
|
||||||
{"seq":6,"type":"project.unblocked","phase":"run","data":{"project":"giesing","unblocked_by":["archeflow","colette"]}}
|
|
||||||
{"seq":7,"type":"project.start","phase":"run","data":{"project":"giesing","run_id":"2026-04-03-giesing-v2/giesing","path":"/home/c/projects/book.giesing-gschichten"}}
|
|
||||||
{"seq":8,"type":"project.complete","phase":"run","data":{"project":"giesing","status":"completed","run_id":"2026-04-03-giesing-v2/giesing","cost_usd":3.50,"artifacts":["plan-explorer.md","plan-creator.md","do-maker.md","check-guardian.md","check-sage.md"]}}
|
|
||||||
{"seq":9,"type":"multi.complete","phase":"done","data":{"status":"completed","projects_completed":3,"projects_failed":0,"total_cost_usd":6.50,"budget_remaining_usd":8.50}}
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
## Failure Handling
|
||||||
|
|
||||||
### 4. Cross-Project Artifacts
|
|
||||||
|
|
||||||
When project B depends on project A, B's agents can access A's artifacts. This is the primary mechanism for cross-project information flow.
|
|
||||||
|
|
||||||
#### Artifact Directory Layout
|
|
||||||
|
|
||||||
```
|
|
||||||
.archeflow/artifacts/<MULTI_RUN_ID>/
|
|
||||||
├── archeflow/ # Sub-run artifacts from archeflow
|
|
||||||
│ ├── plan-explorer.md
|
|
||||||
│ ├── plan-creator.md
|
|
||||||
│ ├── do-maker.md
|
|
||||||
│ ├── do-maker-files.txt
|
|
||||||
│ └── check-guardian.md
|
|
||||||
├── colette/ # Sub-run artifacts from colette
|
|
||||||
│ ├── plan-creator.md
|
|
||||||
│ ├── do-maker.md
|
|
||||||
│ └── check-sage.md
|
|
||||||
└── giesing/ # Sub-run artifacts from giesing (depends on both)
|
|
||||||
├── plan-explorer.md # Explorer can reference upstream artifacts
|
|
||||||
├── plan-creator.md
|
|
||||||
├── do-maker.md
|
|
||||||
└── check-guardian.md
|
|
||||||
```
|
|
||||||
|
|
||||||
#### Cross-Project Context Injection
|
|
||||||
|
|
||||||
When a dependent project's sub-run starts, inject upstream artifact summaries into the Explorer's prompt:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
## Upstream Project Results
|
|
||||||
|
|
||||||
### archeflow (completed)
|
|
||||||
Summary: Added memory injection to run skill.
|
|
||||||
Key artifacts:
|
|
||||||
- plan-creator.md: <first 20 lines or summary section>
|
|
||||||
- do-maker.md: <implementation summary>
|
|
||||||
|
|
||||||
### colette (completed)
|
|
||||||
Summary: Added story-specific voice validation command.
|
|
||||||
Key artifacts:
|
|
||||||
- plan-creator.md: <first 20 lines or summary section>
|
|
||||||
- do-maker.md: <implementation summary>
|
|
||||||
|
|
||||||
Use these results as context. The changes from these projects are available in their
|
|
||||||
respective directories and have been committed to their branches.
|
|
||||||
```
|
|
||||||
|
|
||||||
**Rules for cross-project injection:**
|
|
||||||
- Only inject summaries, not full artifacts (keep context small).
|
|
||||||
- If an upstream artifact is large (>200 lines), extract the summary/overview section only.
|
|
||||||
- The dependent project's Explorer has filesystem access to read full upstream artifacts if needed.
|
|
||||||
- Cross-project injection happens ONLY in the Plan phase (Explorer and Creator). The Maker works from the Creator's proposal, which already incorporates upstream context.
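
The summary-extraction rule above can be sketched as follows. This is a minimal illustration, not the plugin's implementation: the function name `summarize_artifact`, the heading names it searches for ("Summary", "Overview"), and the 20-line fallback (taken from the prompt template above) are assumptions.

```python
HEAD_LINES = 20  # fallback length, mirroring "first 20 lines" in the template


def summarize_artifact(text: str) -> str:
    """Return a short excerpt of an upstream artifact for prompt injection.

    Prefers a "Summary" or "Overview" section (assumed heading names);
    otherwise falls back to the first HEAD_LINES lines.
    """
    lines = text.splitlines()
    start = None
    for i, line in enumerate(lines):
        title = line.lstrip("#").strip().lower()
        if line.startswith("#") and title in ("summary", "overview"):
            start = i
            break
    if start is None:
        return "\n".join(lines[:HEAD_LINES]).strip()
    section = [lines[start]]
    for line in lines[start + 1:]:
        if line.startswith("#"):  # the next heading ends the section
            break
        section.append(line)
    return "\n".join(section).strip()
```

Either branch keeps the injected context small, and the Explorer can still read the full artifact from disk when the excerpt is not enough.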
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### 5. Budget Coordination
|
|
||||||
|
|
||||||
The multi-run has a shared budget across all projects.
|
|
||||||
|
|
||||||
#### Budget Hierarchy
|
|
||||||
|
|
||||||
```
|
|
||||||
total_usd: 15.00 # Hard cap — stops ALL projects when exceeded
|
|
||||||
per_project_usd: 10.00 # Soft cap — warns but does not stop individual project
|
|
||||||
```
|
|
||||||
|
|
||||||
#### Budget Tracking
|
|
||||||
|
|
||||||
Maintain a running total across all sub-runs:
|
|
||||||
|
|
||||||
```
|
|
||||||
TOTAL_COST = sum of all project costs reported in project.complete events
|
|
||||||
REMAINING = total_usd - TOTAL_COST
|
|
||||||
```
|
|
||||||
|
|
||||||
#### Budget Enforcement Points
|
|
||||||
|
|
||||||
1. **Before starting a sub-run:**
|
|
||||||
- Estimate the sub-run cost (based on workflow and domain).
|
|
||||||
- If estimated cost > REMAINING: warn and ask the user (attended) or halt (autonomous).
|
|
||||||
|
|
||||||
2. **After each sub-run completes:**
|
|
||||||
- Update TOTAL_COST with actual cost from the sub-run.
|
|
||||||
- If TOTAL_COST > total_usd * warn_at_percent / 100: emit `budget.warning`.
|
|
||||||
- If TOTAL_COST > total_usd: emit `budget.exceeded`, halt remaining projects.
|
|
||||||
|
|
||||||
3. **Per-project soft cap:**
|
|
||||||
- Each sub-run receives `min(per_project_usd, REMAINING)` as its budget.
|
|
||||||
- The `run` skill's own budget enforcement handles the per-project cap.
|
|
||||||
- If a project exceeds per_project_usd, it warns but continues (soft cap).
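
A minimal sketch of the three enforcement points, assuming `warn_at_percent` is a percentage (the default of 75 is invented for illustration; the source does not state one). The class name `MultiBudget` and its method names are hypothetical; the event field names follow the budget event examples in this section.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MultiBudget:
    total_usd: float                 # hard cap across all projects
    per_project_usd: float           # soft cap per sub-run
    warn_at_percent: float = 75.0    # assumed default, not from the source
    spent_usd: float = 0.0
    events: List[Dict] = field(default_factory=list)

    @property
    def remaining(self) -> float:
        return self.total_usd - self.spent_usd

    def sub_run_budget(self) -> float:
        """Budget handed to the next sub-run."""
        return min(self.per_project_usd, self.remaining)

    def can_start(self, estimated_usd: float) -> bool:
        """Pre-flight check before launching a sub-run."""
        return estimated_usd <= self.remaining

    def record(self, cost_usd: float, not_started: List[str]) -> bool:
        """Add a sub-run's actual cost; return True if remaining projects must halt."""
        self.spent_usd += cost_usd
        data = {"spent_usd": round(self.spent_usd, 2), "budget_usd": self.total_usd}
        if self.spent_usd > self.total_usd:
            data["halted_projects"] = list(not_started)
            self.events.append({"type": "budget.exceeded", "data": data})
            return True
        if self.spent_usd > self.total_usd * self.warn_at_percent / 100:
            data["percent"] = round(100 * self.spent_usd / self.total_usd)
            self.events.append({"type": "budget.warning", "data": data})
        return False
```

In this sketch a warning is appended on every call that leaves spending above the threshold; a real coordinator would probably emit it only once.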
|
|
||||||
|
|
||||||
#### Budget Events
|
|
||||||
|
|
||||||
```jsonl
|
|
||||||
{"seq":5,"type":"budget.warning","data":{"spent_usd":11.50,"budget_usd":15.00,"percent":77,"message":"Budget 77% consumed"}}
|
|
||||||
{"seq":8,"type":"budget.exceeded","data":{"spent_usd":15.30,"budget_usd":15.00,"halted_projects":["giesing"],"message":"Hard budget cap exceeded. Halting remaining projects."}}
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### 6. Failure Handling
|
|
||||||
|
|
||||||
Failures in one project affect downstream projects but not independent ones.
|
|
||||||
|
|
||||||
#### Failure Scenarios
|
|
||||||
|
|
||||||
| Scenario | Action |
|
| Scenario | Action |
|
||||||
|----------|--------|
|
|----------|--------|
|
||||||
| Project fails (run error, test failure, max cycles) | Mark as `failed` in master events. Independent projects continue. |
|
| Project fails | Mark `failed`. Independent projects continue. |
|
||||||
| Dependency of project X failed | Mark X as `blocked`. Do not start X. |
|
| Dependency failed | Mark downstream as `blocked`. Do not start. |
|
||||||
| Budget exceeded mid-run | Halt the current project. Mark remaining as `blocked`. |
|
| Budget exceeded | Halt current project. Skip downstream. |
|
||||||
| All entry-point projects fail | Entire multi-run fails. No downstream projects can start. |
|
| All entry-points fail | Entire multi-run fails. |
|
||||||
|
|
||||||
#### Blocked Project Resolution
|
**Blocked project resolution:**
|
||||||
|
- Autonomous mode: skip blocked projects, continue independent ones
|
||||||
|
- Attended mode: offer skip / retry / abort
|
||||||
|
|
||||||
When a project is blocked because a dependency failed, offer three options:
|
## Progress Tracking
|
||||||
|
|
||||||
1. **Skip:** Mark the blocked project as `skipped`. Continue with other independent projects.
|
Live progress at `.archeflow/multi-progress.md`, updated after every project state change:
|
||||||
2. **Retry:** Re-run the failed dependency. If it succeeds, unblock downstream projects.
|
|
||||||
3. **Abort:** Stop the entire multi-run. Report what completed and what did not.
|
|
||||||
|
|
||||||
In **autonomous mode**, the default action is `skip` — blocked projects are skipped, independent projects continue, and the multi-run completes with partial results.
|
|
||||||
|
|
||||||
In **attended mode**, prompt the user with the options above.
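
The blocking behaviour can be sketched as a walk over the dependency map. This is an illustration under assumptions: the function name `propagate_failure`, the `pending` state for projects that have not started, and the transitive blocking of indirect dependents are not spelled out in the source.

```python
from typing import Dict, List


def propagate_failure(failed: str, deps: Dict[str, List[str]],
                      status: Dict[str, str]) -> List[str]:
    """Mark `failed` as failed and block pending projects that depend on it.

    `deps` maps each project to the projects it depends on. Blocking is
    applied transitively (an assumption). Returns newly blocked projects.
    """
    status[failed] = "failed"
    blocked: List[str] = []
    frontier = [failed]
    while frontier:
        current = frontier.pop()
        for project, needs in deps.items():
            if current in needs and status.get(project) == "pending":
                status[project] = "blocked"
                blocked.append(project)
                frontier.append(project)
    return blocked
```

Independent projects are never touched, which matches the rule that only downstream dependents are affected.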
|
|
||||||
|
|
||||||
#### Failure Events
|
|
||||||
|
|
||||||
```jsonl
|
|
||||||
{"seq":4,"type":"project.failed","data":{"project":"archeflow","error":"Max cycles reached with unresolved CRITICAL findings","cost_usd":2.10}}
|
|
||||||
{"seq":5,"type":"project.blocked","data":{"project":"giesing","blocked_by":["archeflow"],"reason":"Dependency 'archeflow' failed"}}
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### 7. Progress Tracking
|
|
||||||
|
|
||||||
Maintain a live progress file at `.archeflow/multi-progress.md`. Update it after every project state change.
|
|
||||||
|
|
||||||
```markdown
|
```markdown
|
||||||
# Multi-Run: giesing-v2
|
|
||||||
Started: 2026-04-03T14:00:00Z
|
|
||||||
|
|
||||||
| Project | Status | Domain | Phase | Detail |
|
| Project | Status | Domain | Phase | Detail |
|
||||||
|---------|--------|--------|-------|--------|
|
|---------|--------|--------|-------|--------|
|
||||||
| archeflow | completed | code | -- | 1 cycle, $1.20 |
|
| archeflow | completed | code | -- | 1 cycle, $1.20 |
|
||||||
| colette | running | code | DO | maker drafting |
|
| colette | running | code | DO | maker drafting |
|
||||||
| giesing | blocked | writing | -- | waiting for colette |
|
| giesing | blocked | writing | -- | waiting for colette |
|
||||||
|
|
||||||
## Budget
|
Budget: $3.00 / $15.00 (20%)
|
||||||
| | Amount |
|
|
||||||
|---|--------|
|
|
||||||
| Spent | $3.00 |
|
|
||||||
| Budget | $15.00 |
|
|
||||||
| Remaining | $12.00 |
|
|
||||||
| Utilization | 20% |
|
|
||||||
|
|
||||||
## Dependency Graph
|
|
||||||
```
|
|
||||||
archeflow ----\
|
|
||||||
+---> giesing
|
|
||||||
colette ------/
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## Timeline
|
## Master Events
|
||||||
- 14:00:00 — Started archeflow, colette (parallel)
|
|
||||||
- 14:05:23 — archeflow completed ($1.20, 1 cycle)
|
|
||||||
- 14:06:10 — colette DO phase, maker drafting
|
|
||||||
```
|
|
||||||
|
|
||||||
Update this file after:
|
Written to `.archeflow/events/<MULTI_RUN_ID>.jsonl`:
|
||||||
- A project starts
|
|
||||||
- A project changes phase (via status polling or sub-agent reporting)
|
|
||||||
- A project completes or fails
|
|
||||||
- A project becomes unblocked
|
|
||||||
- Budget threshold is crossed
|
|
||||||
|
|
||||||
---
|
| Event | When |
|
||||||
|
|-------|------|
|
||||||
|
| `multi.start` | Multi-run begins |
|
||||||
|
| `project.start` | Sub-run launches |
|
||||||
|
| `project.complete` | Sub-run succeeds |
|
||||||
|
| `project.failed` | Sub-run fails |
|
||||||
|
| `project.blocked` | Dependency failed |
|
||||||
|
| `project.unblocked` | All deps met |
|
||||||
|
| `budget.warning` | Threshold crossed |
|
||||||
|
| `budget.exceeded` | Hard cap hit |
|
||||||
|
| `multi.complete` | All projects done |
|
||||||
|
|
||||||
### 8. Completion
|
## Dry-Run and Resume
|
||||||
|
|
||||||
When all projects are complete (or blocked/skipped with no more actionable items):
|
**`--dry-run`:** Validates DAG, runs `archeflow:run --dry-run` per project, shows cost estimate. Does not execute.
|
||||||
|
|
||||||
**8a. Emit `multi.complete`:**
|
**`--resume <id>`:** Reconstructs state from master events. Retries failed projects, starts pending ones with deps met.
|
||||||
|
|
||||||
```jsonl
|
## Workspace Registry
|
||||||
{"seq":9,"type":"multi.complete","phase":"done","data":{"status":"completed","projects_completed":3,"projects_failed":0,"projects_skipped":0,"total_cost_usd":6.50,"budget_remaining_usd":8.50,"duration_ms":600000}}
|
|
||||||
```
|
|
||||||
|
|
||||||
Status values:
|
If `docs/project-registry.md` exists: auto-discover paths by project id, validate existence, update registry after meaningful changes.
|
||||||
- `completed` — all projects finished successfully
|
|
||||||
- `partial` — some projects completed, some failed/skipped
|
|
||||||
- `failed` — no projects completed successfully
|
|
||||||
- `halted` — stopped due to budget or user abort
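
One way to derive the final status from the project counts is sketched below. The function name `multi_run_status` is hypothetical, and giving `halted` precedence over the other values is an assumption the source does not state.

```python
def multi_run_status(completed: int, failed: int, skipped: int,
                     halted: bool = False) -> str:
    """Map project counts to the `multi.complete` status value."""
    if halted:              # budget cap or user abort (assumed to win)
        return "halted"
    if completed == 0:      # nothing finished successfully
        return "failed"
    if failed or skipped:   # some finished, some did not
        return "partial"
    return "completed"
```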
|
|
||||||
|
|
||||||
**8b. Generate multi-run report:**
|
## Completion
|
||||||
|
|
||||||
```markdown
|
Status values: `completed` (all done), `partial` (some failed/skipped), `failed` (none completed), `halted` (budget/abort).
|
||||||
# Multi-Run Report: giesing-v2
|
|
||||||
|
|
||||||
## Summary
|
Final report includes per-project results, cost breakdown by phase, and dependency graph execution timeline.
|
||||||
| Metric | Value |
|
|
||||||
|--------|-------|
|
|
||||||
| Projects | 3 |
|
|
||||||
| Completed | 3 |
|
|
||||||
| Failed | 0 |
|
|
||||||
| Total cost | $6.50 / $15.00 |
|
|
||||||
| Duration | 10m 00s |
|
|
||||||
|
|
||||||
## Per-Project Results
|
|
||||||
### archeflow
|
|
||||||
- **Status:** completed
|
|
||||||
- **Task:** Add memory injection to run skill
|
|
||||||
- **Workflow:** fast (1 cycle)
|
|
||||||
- **Cost:** $1.20
|
|
||||||
- **Key artifacts:** plan-creator.md, do-maker.md
|
|
||||||
|
|
||||||
### colette
|
|
||||||
- **Status:** completed
|
|
||||||
- **Task:** Add story-specific voice validation command
|
|
||||||
- **Workflow:** standard (1 cycle)
|
|
||||||
- **Cost:** $1.80
|
|
||||||
- **Key artifacts:** plan-creator.md, do-maker.md, check-sage.md
|
|
||||||
|
|
||||||
### giesing
|
|
||||||
- **Status:** completed
|
|
||||||
- **Task:** Write story #2 using improved tools
|
|
||||||
- **Workflow:** kurzgeschichte (2 cycles)
|
|
||||||
- **Cost:** $3.50
|
|
||||||
- **Key artifacts:** plan-explorer.md, do-maker.md, check-guardian.md
|
|
||||||
|
|
||||||
## Dependency Graph Execution
|
|
||||||
archeflow (Layer 0) ----> completed
|
|
||||||
colette (Layer 0) ----> completed
|
|
||||||
giesing (Layer 1) ----> unblocked ----> completed
|
|
||||||
|
|
||||||
## Cost Breakdown
|
|
||||||
| Project | Plan | Do | Check | Total |
|
|
||||||
|---------|------|----|-------|-------|
|
|
||||||
| archeflow | $0.20 | $0.60 | $0.40 | $1.20 |
|
|
||||||
| colette | $0.30 | $0.80 | $0.70 | $1.80 |
|
|
||||||
| giesing | $0.50 | $2.00 | $1.00 | $3.50 |
|
|
||||||
| **Total** | **$1.00** | **$3.40** | **$2.10** | **$6.50** |
|
|
||||||
```
|
|
||||||
|
|
||||||
**8c. Update master event index:**
|
|
||||||
|
|
||||||
Append to `.archeflow/events/index.jsonl`:
|
|
||||||
|
|
||||||
```jsonl
|
|
||||||
{"run_id":"2026-04-03-giesing-v2","ts":"2026-04-03T14:10:00Z","type":"multi","task":"Write second story with improved ArcheFlow + Colette integration","status":"completed","projects":3,"total_cost_usd":6.50}
|
|
||||||
```
|
|
||||||
|
|
||||||
**8d. Update workspace registry (if applicable):**
|
|
||||||
|
|
||||||
If `docs/project-registry.md` exists and project statuses changed meaningfully, update the registry entries for affected projects.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Dry-Run Mode
|
|
||||||
|
|
||||||
When `--dry-run` is specified:
|
|
||||||
|
|
||||||
1. Validate the multi-run definition (DAG, paths, budget).
|
|
||||||
2. For each project (in topological order), run `archeflow:run --dry-run` to get a cost estimate and plan preview.
|
|
||||||
3. Display a summary:
|
|
||||||
|
|
||||||
```
|
|
||||||
Multi-Run Dry Run: giesing-v2
|
|
||||||
Projects: 3
|
|
||||||
Dependency layers: 2
|
|
||||||
Parallel execution: yes
|
|
||||||
|
|
||||||
Layer 0 (parallel):
|
|
||||||
archeflow — fast workflow, code domain
|
|
||||||
Estimated cost: $0.50-1.50
|
|
||||||
colette — standard workflow, code domain
|
|
||||||
Estimated cost: $1.00-3.00
|
|
||||||
|
|
||||||
Layer 1 (after Layer 0):
|
|
||||||
giesing — kurzgeschichte workflow, writing domain
|
|
||||||
Estimated cost: $2.00-5.00
|
|
||||||
|
|
||||||
Total estimated cost: $3.50-9.50
|
|
||||||
Budget: $15.00 (sufficient)
|
|
||||||
|
|
||||||
Proceed? [y/n]
|
|
||||||
```
|
|
||||||
|
|
||||||
4. Do NOT emit `multi.complete`. The multi-run is paused.
|
|
||||||
5. If the user says yes, start the full multi-run using the validated config.
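
The "dependency layers" shown in the dry-run summary can be computed with a layered topological sort, which also covers the DAG validation in step 1. The sketch below is an assumption about how this could be done; `dependency_layers` is a hypothetical name, not a function from the plugin.

```python
from typing import Dict, List


def dependency_layers(deps: Dict[str, List[str]]) -> List[List[str]]:
    """Group projects into layers; each layer depends only on earlier layers.

    `deps` maps each project to the projects it depends on.
    Raises ValueError on unknown dependencies or cycles.
    """
    for project, needs in deps.items():
        unknown = [d for d in needs if d not in deps]
        if unknown:
            raise ValueError(f"{project} depends on unknown projects: {unknown}")
    remaining = {p: set(d) for p, d in deps.items()}
    layers: List[List[str]] = []
    while remaining:
        ready = sorted(p for p, d in remaining.items() if not d)
        if not ready:
            raise ValueError(f"Dependency cycle among: {sorted(remaining)}")
        layers.append(ready)
        for p in ready:
            del remaining[p]
        for d in remaining.values():
            d.difference_update(ready)
    return layers
```

Projects within one layer have no dependencies on each other, so they are the candidates for parallel execution.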
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Resume Mode
|
|
||||||
|
|
||||||
When `--resume <multi-run-id>` is specified:
|
|
||||||
|
|
||||||
1. Read the master event file `.archeflow/events/<multi-run-id>.jsonl`.
|
|
||||||
2. Reconstruct `PROJECT_STATUS` from events (which projects completed, failed, are pending).
|
|
||||||
3. Identify resumable projects:
|
|
||||||
- `failed` projects can be retried.
|
|
||||||
- `blocked` projects whose blockers are now `completed` (e.g., after manual fix) can start.
|
|
||||||
- `pending` projects that were never started can start if their deps are met.
|
|
||||||
4. Display current state and ask for confirmation.
|
|
||||||
5. Continue the multi-run from where it left off, appending to the existing master event file.
|
|
||||||
|
|
||||||
Resume emits a `multi.resume` event:
|
|
||||||
|
|
||||||
```jsonl
|
|
||||||
{"seq":10,"type":"multi.resume","phase":"init","data":{"resumed_from":"2026-04-03-giesing-v2","projects_completed":["archeflow"],"projects_to_run":["colette","giesing"]}}
|
|
||||||
```
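
Steps 2 and 3 of the resume procedure can be sketched as a replay of the master event file. The helper names and the event-to-status mapping below are assumptions; the event types and the `project` field follow the example events in this document. Projects still marked `running` when the run was interrupted are not covered by the rules above, so the sketch leaves them alone.

```python
import json
from typing import Dict, Iterable, List

# Assumed mapping from master event type to project status.
EVENT_TO_STATUS = {
    "project.start": "running",
    "project.complete": "completed",
    "project.failed": "failed",
    "project.blocked": "blocked",
    "project.unblocked": "pending",
    "project.skipped": "skipped",
}


def reconstruct_status(lines: Iterable[str], projects: Iterable[str]) -> Dict[str, str]:
    """Replay JSONL master events; the last event per project wins."""
    status = {p: "pending" for p in projects}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        new_status = EVENT_TO_STATUS.get(event.get("type"))
        project = event.get("data", {}).get("project")
        if new_status and project in status:
            status[project] = new_status
    return status


def resumable(status: Dict[str, str], deps: Dict[str, List[str]]) -> List[str]:
    """Projects that may start now: failed ones, plus blocked/pending with deps met."""
    result = []
    for project, state in status.items():
        deps_met = all(status.get(d) == "completed" for d in deps.get(project, []))
        if state == "failed" or (state in ("blocked", "pending") and deps_met):
            result.append(project)
    return result
```

In practice `lines` would be the open master event file, for example `open(".archeflow/events/<multi-run-id>.jsonl")`.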
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Integration with Existing Skills
|
|
||||||
|
|
||||||
| Skill | Integration Point |
|
|
||||||
|-------|-------------------|
|
|
||||||
| `run` | Each sub-run is a standard `archeflow:run` invocation. The multi-project skill wraps and coordinates multiple runs. |
|
|
||||||
| `process-log` | Master events follow the same schema (ts, run_id, seq, parent, type, phase, agent, data). Sub-run events use the standard event types. |
|
|
||||||
| `artifact-routing` | Each sub-run follows standard artifact routing internally. Cross-project artifacts follow the injection rules in Section 4. |
|
|
||||||
| `cost-tracking` | Per-project costs come from sub-run `run.complete` events. The multi-project skill aggregates them and enforces the shared budget. |
|
|
||||||
| `domains` | Each project auto-detects its domain independently. Different projects in the same multi-run can have different domains. |
|
|
||||||
| `git-integration` | Each sub-run manages its own branch. The multi-project skill does not merge across repos — each project's Act phase handles its own merge. |
|
|
||||||
| `autonomous-mode` | Multi-project runs are autonomous-mode-friendly. Budget enforcement is strict (halt, don't prompt). Blocked projects are skipped. |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Progress Display
|
|
||||||
|
|
||||||
Throughout the multi-run, display live progress:
|
|
||||||
|
|
||||||
```
|
|
||||||
━━━ ArcheFlow Multi-Run: giesing-v2 ━━━━━━━━━━━━━━━━━━━
|
|
||||||
Projects: 3 | Budget: $15.00 | Parallel: yes
|
|
||||||
|
|
||||||
[archeflow] fast/code -> running (Plan: Creator designing...)
|
|
||||||
[colette] standard/code -> running (Do: Maker implementing...)
|
|
||||||
[giesing] kurzgeschichte/writing -> blocked (waiting: archeflow, colette)
|
|
||||||
|
|
||||||
Cost: $1.80 / $15.00 (12%)
|
|
||||||
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
||||||
```
|
|
||||||
|
|
||||||
Update the display when:
|
|
||||||
- A project changes state (start, phase change, complete, fail, unblock)
|
|
||||||
- Budget thresholds are crossed
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Error Handling
|
|
||||||
|
|
||||||
| Error | Response |
|
|
||||||
|-------|----------|
|
|
||||||
| YAML parse error | Abort before starting. Report the parse error with line number. |
|
|
||||||
| Dependency cycle detected | Abort. Report which projects form the cycle. |
|
|
||||||
| Project path does not exist | Abort. Report the missing path. |
|
|
||||||
| Sub-run agent fails to return | Mark project as failed (5-min timeout per the `run` skill). Continue independent projects. |
|
|
||||||
| Master event write fails | Log warning. Continue orchestration. Events are observation, not control flow. |
|
|
||||||
| Artifact directory creation fails | Abort the affected project. This is blocking for cross-project artifact sharing. |
|
|
||||||
| Budget exceeded mid-project | Halt that project immediately. Emit `budget.exceeded`. Skip downstream dependents. |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Design Principles
|
|
||||||
|
|
||||||
1. **Each project is autonomous.** Sub-runs use the standard `run` skill without modification. The multi-project skill is a coordinator, not a replacement.
|
|
||||||
2. **DAG over sequence.** Dependencies are declared, not implied by order. Independent projects always run in parallel when possible.
|
|
||||||
3. **Shared budget, independent domains.** Budget is global, but each project detects its own domain, selects its own workflow, and manages its own artifacts.
|
|
||||||
4. **Fail forward.** A failure in one project does not halt independent projects. Only downstream dependents are blocked.
|
|
||||||
5. **Artifacts are the interface.** Projects communicate through saved artifacts, not shared memory or direct agent-to-agent messaging.
|
|
||||||
6. **Resume over restart.** Multi-runs can be resumed from any point. Master events provide enough state to reconstruct progress.
|
|
||||||
7. **Registry-aware.** When a workspace registry exists, use it for discovery and keep it updated. When it does not exist, everything still works.
|
|
||||||
|
|||||||
@@ -1,634 +0,0 @@
|
|||||||
---
|
|
||||||
name: orchestration
|
|
||||||
description: Use when executing a multi-agent orchestration — spawning archetype agents, managing PDCA cycles, coordinating worktrees, and merging results. This is the step-by-step execution guide.
|
|
||||||
---
|
|
||||||
|
|
||||||
# Orchestration Execution
|
|
||||||
|
|
||||||
This skill guides you through running a full ArcheFlow orchestration using Claude Code's native Agent tool and git worktrees.
|
|
||||||
|
|
||||||
## Strategy Selection
|
|
||||||
|
|
||||||
A **strategy** defines the shape of an orchestration run — which phases execute, in what order, and when to iterate. A **workflow** (fast/standard/thorough) controls the depth within a strategy.
|
|
||||||
|
|
||||||
### Available Strategies
|
|
||||||
|
|
||||||
| Strategy | Flow | When to Use |
|
|
||||||
|----------|------|-------------|
|
|
||||||
| `pdca` | Plan -> Do -> Check -> Act (cyclic) | Refactors, thorough reviews, multi-concern tasks |
|
|
||||||
| `pipeline` | Plan -> Implement -> Spec-Review -> Quality-Review -> Verify (linear) | Bug fixes, fast patches, single-concern tasks |
|
|
||||||
| `auto` | Selected by task analysis | Default — let ArcheFlow decide |
|
|
||||||
|
|
||||||
### Strategy Interface
|
|
||||||
|
|
||||||
Every strategy defines:
|
|
||||||
|
|
||||||
- **Phases** — ordered list of execution stages
|
|
||||||
- **Agent mapping** — which archetypes run in each phase
|
|
||||||
- **Transition rules** — conditions for moving between phases
|
|
||||||
- **Iteration model** — cyclic (PDCA) or linear (pipeline)
|
|
||||||
- **Exit conditions** — when the run terminates
|
|
||||||
|
|
||||||
### PDCA Strategy
|
|
||||||
|
|
||||||
The existing orchestration flow (Steps 0-4 below). Cyclic — the Act phase can feed back to Plan for another iteration. Best for tasks requiring multiple review perspectives and iterative refinement.
|
|
||||||
|
|
||||||
### Pipeline Strategy
|
|
||||||
|
|
||||||
Linear flow with no cycle-back. Faster for well-understood tasks where one pass is sufficient.
|
|
||||||
|
|
||||||
| Phase | Agent | Purpose |
|
|
||||||
|-------|-------|---------|
|
|
||||||
| Plan | Creator | Design proposal |
|
|
||||||
| Implement | Maker | Build in worktree |
|
|
||||||
| Spec-Review | Guardian, then Skeptic | Security + assumption check (sequential) |
|
|
||||||
| Quality-Review | Sage | Code quality review |
|
|
||||||
| Verify | (automated) | Run tests, apply targeted fix if CRITICAL |
|
|
||||||
|
|
||||||
No cycle-back — WARNINGs are logged but do not block. CRITICALs in Verify trigger a single targeted fix attempt by the Maker, not a full cycle.
|
|
||||||
|
|
||||||
### Auto-Selection Rules
|
|
||||||
|
|
||||||
When `strategy: auto` (default):
|
|
||||||
|
|
||||||
- Task contains "fix", "bug", "patch", "hotfix" → `pipeline`
|
|
||||||
- Task contains "refactor", "redesign", "review" → `pdca`
|
|
||||||
- Workflow is `thorough` → `pdca` (always)
|
|
||||||
- Workflow is `fast` with single file → `pipeline`
|
|
||||||
- Otherwise → `pdca`
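
The rules above can be read as a small decision function. The sketch below makes two assumptions the source leaves open: the `thorough` rule is checked first because it is marked "always", and the remaining rules are applied in the order listed. Keyword matching is a plain case-insensitive substring test, and `select_strategy` and `file_count` are hypothetical names.

```python
from typing import Optional

PIPELINE_WORDS = ("fix", "bug", "patch", "hotfix")
PDCA_WORDS = ("refactor", "redesign", "review")


def select_strategy(task: str, workflow: str,
                    file_count: Optional[int] = None,
                    strategy: str = "auto") -> str:
    """Pick `pdca` or `pipeline`; an explicit strategy is passed through."""
    if strategy != "auto":
        return strategy
    text = task.lower()
    if workflow == "thorough":                    # "always" pdca
        return "pdca"
    if any(word in text for word in PIPELINE_WORDS):
        return "pipeline"
    if any(word in text for word in PDCA_WORDS):
        return "pdca"
    if workflow == "fast" and file_count == 1:    # single-file fast task
        return "pipeline"
    return "pdca"
```

Substring matching is deliberately literal: a task mentioning "prefix" would also match "fix", so a real implementation may want word-boundary matching instead.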
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Step 0: Choose a Workflow
|
|
||||||
|
|
||||||
If `.archeflow/teams/<name>.yaml` exists, the user can reference a team preset: `"Use the backend team"`. Load the preset's phase config instead of built-in defaults. See `archeflow:custom-archetypes` skill for preset format.
|
|
||||||
|
|
||||||
Otherwise, assess the task and pick:
|
|
||||||
|
|
||||||
| Signal | Workflow |
|
|
||||||
|--------|----------|
|
|
||||||
| Small fix, low risk, single concern | `fast` (1 cycle) |
|
|
||||||
| Feature, multiple files, moderate risk | `standard` (2 cycles) |
|
|
||||||
| Security-sensitive, breaking changes, public API | `thorough` (3 cycles) |
|
|
||||||
|
|
||||||
## Workflow Adaptation Rules
|
|
||||||
|
|
||||||
The initial workflow choice is a starting point, not a commitment. These rules adapt the workflow at runtime. Each rule specifies when it evaluates (which phase boundary).
|
|
||||||
|
|
||||||
### A3: Confidence Gate (evaluates: after Plan, before Do)
|
|
||||||
|
|
||||||
**When:** Creator's confidence table has any axis below 0.5.
|
|
||||||
**Action by axis:**
|
|
||||||
|
|
||||||
| Axis | Score < 0.5 Action |
|
|
||||||
|------|-------------------|
|
|
||||||
| Task understanding | **Pause.** Ask user to clarify before proceeding. Do not spawn Maker. |
|
|
||||||
| Solution completeness | **Upgrade to standard.** Add Explorer before Maker starts. |
|
|
||||||
| Risk coverage | **Spawn mini-Explorer** for the specific risky area (parallel, 5 min max). Maker can proceed. |
|
|
||||||
|
|
||||||
A3 runs before any Do/Check agents spawn, so there are no cancellation issues.
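
As an illustration, the A3 table can be encoded as a lookup from axis to action. The axis keys, action labels, and function names below are hypothetical; only the 0.5 threshold and the three actions come from the table above.

```python
from typing import Dict, List

A3_THRESHOLD = 0.5

# Axis -> action when the score is below the threshold (labels are illustrative).
A3_ACTIONS = {
    "task_understanding": "pause_and_ask_user",
    "solution_completeness": "upgrade_to_standard",
    "risk_coverage": "spawn_mini_explorer",
}


def confidence_gate(scores: Dict[str, float]) -> List[str]:
    """Return the A3 actions triggered by the Creator's confidence scores."""
    return [action for axis, action in A3_ACTIONS.items()
            if scores[axis] < A3_THRESHOLD]


def may_spawn_maker(actions: List[str]) -> bool:
    """Only a low task-understanding score stops the Maker from starting."""
    return "pause_and_ask_user" not in actions
```

A score of exactly 0.5 does not trigger any action, since the rule says "below 0.5".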
|
|
||||||
|
|
||||||
### A1: Conditional Escalation (evaluates: after Check, before next cycle)
|
|
||||||
|
|
||||||
**When:** Guardian rejects with 2+ CRITICAL findings in a `fast` workflow.
|
|
||||||
**Action:** Escalate to `standard` for the **next cycle** — add Skeptic + Sage to the reviewer roster.
|
|
||||||
**Why:** If Guardian found serious issues, more perspectives help find root causes.
|
|
||||||
**Sticky:** Once escalated, the workflow stays escalated for all remaining cycles. A2 does not apply to escalated workflows.
|
|
||||||
|
|
||||||
### A2: Guardian Fast-Path (evaluates: after Guardian, before spawning other reviewers)
|
|
||||||
|
|
||||||
**When:** Guardian finds 0 CRITICAL and 0 WARNING in a non-escalated `standard` or `thorough` workflow.
|
|
||||||
**Action:** Do not spawn Skeptic, Sage, or Trickster. Proceed directly to Act phase.
|
|
||||||
**Why:** Guardian's security review is the strictest gate. Clean pass = safe to skip additional reviewers.
|
|
||||||
**Critical:** Evaluate A2 **after Guardian completes but before other reviewers are spawned.** Do not spawn reviewers in parallel with Guardian — spawn Guardian first, check A2, then spawn remaining reviewers only if A2 doesn't trigger.
|
|
||||||
**Does not apply to:** Escalated workflows (A1 triggered), or first cycle of `thorough` workflows (Trickster is mandatory on first pass).
|
|
||||||
**Log:** Note "Guardian fast-path taken" in orchestration report.
|
|
||||||
|
|
||||||
### Evaluation Order
|
|
||||||
|
|
||||||
```
|
|
||||||
Plan phase completes → A3 (confidence gate)
|
|
||||||
↓
|
|
||||||
Guardian completes → A2 (fast-path check) → if clean, skip other reviewers
|
|
||||||
↓ if not, spawn other reviewers
|
|
||||||
Check phase done → A1 (escalation check) → if 2+ CRITICALs in fast, next cycle is standard
|
|
||||||
```
|
|
||||||
|
|
||||||
## Process Logging
|
|
||||||
|
|
||||||
If `.archeflow/events/` exists (or should be created), emit structured events throughout orchestration. See `archeflow:process-log` skill for full schema.
|
|
||||||
|
|
||||||
**Quick reference — emit at these points:**
|
|
||||||
|
|
||||||
```
|
|
||||||
run.start → After workflow selection, before first agent
|
|
||||||
agent.start → Before each Agent tool call
|
|
||||||
agent.complete → After each Agent returns (include duration, tokens, summary, artifacts)
|
|
||||||
decision → When choosing between alternatives (plot direction, approach, fix strategy)
|
|
||||||
phase.transition → At Plan→Do, Do→Check, Check→Act boundaries
|
|
||||||
review.verdict → After each reviewer delivers verdict
|
|
||||||
fix.applied → After each edit addressing a review finding
|
|
||||||
cycle.boundary → End of PDCA cycle
|
|
||||||
shadow.detected → When shadow threshold triggers
|
|
||||||
run.complete → After final Act phase (include totals)
|
|
||||||
```
|
|
||||||
|
|
||||||
**Helper:** `./lib/archeflow-event.sh <run_id> <type> <phase> <agent> '<json>'`
|
|
||||||
|
|
||||||
**Report:** `./lib/archeflow-report.sh .archeflow/events/<run_id>.jsonl`
|
|
||||||
|
|
||||||
Events are optional — if the events dir doesn't exist, skip logging. Never let logging block orchestration.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Model Configuration
|
|
||||||
|
|
||||||
Model assignment per archetype and workflow is configured in `.archeflow/config.yaml` under the `models:` section. The `archeflow:run` skill (section 0c) handles resolution with a fallback chain: per-workflow per-archetype > per-workflow default > per-archetype > global default. When spawning agents manually, read the config to select the appropriate model.
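
The fallback chain can be sketched as an ordered lookup. The key layout used here (`workflows`, `archetypes`, `default`) is a guess for illustration only; the actual shape of the `models:` section is not shown in this document, and the model names in the usage example are placeholders.

```python
from typing import Optional


def resolve_model(models: dict, workflow: str, archetype: str) -> Optional[str]:
    """Resolve a model name, most specific setting first.

    Order: per-workflow per-archetype > per-workflow default >
    per-archetype > global default. Returns None if nothing is configured.
    """
    wf = models.get("workflows", {}).get(workflow, {})
    candidates = [
        wf.get("archetypes", {}).get(archetype),
        wf.get("default"),
        models.get("archetypes", {}).get(archetype),
        models.get("default"),
    ]
    for candidate in candidates:
        if candidate:
            return candidate
    return None


# Usage with placeholder model names:
models = {
    "default": "model-small",
    "archetypes": {"guardian": "model-large"},
    "workflows": {"fast": {"default": "model-tiny",
                           "archetypes": {"maker": "model-medium"}}},
}
print(resolve_model(models, "fast", "maker"))
```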
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Step 1: Plan Phase
|
|
||||||
|
|
||||||
Spawn agents sequentially — Creator needs Explorer's findings.
|
|
||||||
|
|
||||||
### Explorer (if standard or thorough)
|
|
||||||
|
|
||||||
**Context to include:** Task description, relevant file paths, codebase access.
|
|
||||||
**Context to exclude:** Prior proposals, review outputs, implementation details, feedback from previous cycles.
|
|
||||||
|
|
||||||
```
|
|
||||||
Agent(
|
|
||||||
description: "🔍 Explorer: research context",
|
|
||||||
prompt: "<task description>
|
|
||||||
You are the EXPLORER archetype.
|
|
||||||
Research the codebase to understand:
|
|
||||||
1. What files and functions are involved
|
|
||||||
2. What dependencies exist
|
|
||||||
3. What tests currently cover this area
|
|
||||||
4. What patterns the codebase uses
|
|
||||||
Write your findings as a structured research report.
|
|
||||||
Be thorough but focused — no rabbit holes.",
|
|
||||||
subagent_type: "Explore"
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
### Creator
|
|
||||||
|
|
||||||
**Context to include:** Task description, Explorer's research output. On cycle 2+: prior cycle's structured feedback (see Cycle Feedback Protocol).
|
|
||||||
**Context to exclude:** Raw file contents (Explorer already summarized), git diffs, reviewer full outputs.
|
|
||||||
|
|
||||||
**Fast workflow only (no Explorer):** The Creator must perform a Mini-Reflect before proposing:
|
|
||||||
1. Restate the task in your own words (catch misunderstandings early)
|
|
||||||
2. List 3 assumptions you're making
|
|
||||||
3. Name the one risk that would cause most damage if wrong
|
|
||||||
|
|
||||||
```
|
|
||||||
Agent(
|
|
||||||
description: "🏗️ Creator: design proposal",
|
|
||||||
prompt: "<task description>
|
|
||||||
You are the CREATOR archetype.
|
|
||||||
<if fast workflow (no Explorer): Before proposing, perform a Mini-Reflect:
|
|
||||||
1. Restate the task in one sentence
|
|
||||||
2. List 3 assumptions you're making
|
|
||||||
3. Name the highest-damage risk
|
|
||||||
Then propose.>
|
|
||||||
<if standard/thorough: Based on the research findings: <Explorer's output>>
|
|
||||||
<if cycle 2+: Prior cycle feedback: <structured feedback — see Cycle Feedback Protocol>>
|
|
||||||
Design a solution proposal including:
|
|
||||||
1. Architecture decisions (with rationale)
|
|
||||||
2. Files to create/modify (with specific changes)
|
|
||||||
3. Alternatives considered (at least 2, with rejection rationale)
|
|
||||||
4. Test strategy
|
|
||||||
5. Confidence (scored by axis: task understanding, solution completeness, risk coverage)
|
|
||||||
6. Risks you foresee
|
|
||||||
<if cycle 2+: 7. How you addressed each unresolved issue from prior feedback>
|
|
||||||
Be decisive. Ship a clear plan, not a menu of options.",
|
|
||||||
subagent_type: "Plan"
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
## Step 2: Do Phase
|
|
||||||
|
|
||||||
Spawn Maker in an **isolated worktree** so changes don't affect main.
|
|
||||||
|
|
||||||
**Context to include:** Creator's proposal only. On cycle 2+: implementation-routed feedback from Sage/Trickster.
|
|
||||||
**Context to exclude:** Explorer's research, Guardian/Skeptic findings (those go to Creator).
|
|
||||||
|
|
||||||
```
|
|
||||||
Agent(
|
|
||||||
description: "⚒️ Maker: implement proposal",
|
|
||||||
prompt: "<task description>
|
|
||||||
You are the MAKER archetype.
|
|
||||||
Implement this proposal: <Creator's output>
|
|
||||||
<if cycle 2+: Implementation feedback from prior cycle: <Sage/Trickster findings only>>
|
|
||||||
Rules:
|
|
||||||
1. Follow the proposal exactly — don't redesign
|
|
||||||
2. Write tests for every behavioral change
|
|
||||||
3. Commit with descriptive messages
|
|
||||||
4. Run existing tests — nothing may break
|
|
||||||
5. If the proposal is unclear, implement your best interpretation and note it
|
|
||||||
Do NOT skip tests. Do NOT refactor unrelated code.
|
|
||||||
|
|
||||||
BEFORE finishing — Self-Review Checklist:
|
|
||||||
1. Did I change ALL files listed in the proposal's Changes section?
|
|
||||||
2. Did I add tests for each behavioral change?
|
|
||||||
3. Are there files in my diff NOT listed in the proposal? If yes, revert them.
|
|
||||||
4. Do all existing tests still pass?
|
|
||||||
Report any gaps in your Implementation summary.",
|
|
||||||
isolation: "worktree",
|
|
||||||
mode: "bypassPermissions"
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
**Critical:** The Maker MUST commit its changes before finishing. Uncommitted changes in a worktree are lost.
|
|
||||||
|
|
||||||
## Step 3: Check Phase
|
|
||||||
|
|
||||||
Spawn Guardian **first**. After Guardian completes, check adaptation rule A2 (fast-path). If A2 triggers (0 CRITICAL, 0 WARNING, non-escalated workflow), skip remaining reviewers and proceed to Act. Otherwise, spawn remaining reviewers **in parallel**.
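
The A2 condition stated above can be sketched as a small predicate. This is only an illustration: it assumes Guardian findings have already been parsed into dicts with a `severity` key, and the function name and `escalated` flag are hypothetical. The canonical rule lives in `archeflow:check-phase`.

```python
def a2_fast_path(guardian_findings, escalated):
    """Return True when the remaining reviewers can be skipped (hypothetical helper)."""
    if escalated:
        # Escalated workflows always run the full reviewer set.
        return False
    blocking = [
        f for f in guardian_findings
        if f["severity"] in ("CRITICAL", "WARNING")
    ]
    # INFO findings do not block the fast path.
    return len(blocking) == 0
```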
|
|
||||||
|
|
||||||
**Reviewer spawning protocol:** The canonical sequence (Guardian first, A2 evaluation, parallel spawning, timeout handling) is defined in `archeflow:check-phase` under "Reviewer Spawning Protocol". Follow that protocol for the exact spawning order, context per reviewer, and timeout rules.
|
|
||||||
|
|
||||||
### Guardian (always runs first)
|
|
||||||
|
|
||||||
**Context to include:** Maker's git diff, proposal risk section only.
|
|
||||||
**Context to exclude:** Explorer's research, full proposal, other reviewer outputs.
|
|
||||||
|
|
||||||
```
|
|
||||||
Agent(
|
|
||||||
description: "🛡️ Guardian: security and risk review",
|
|
||||||
prompt: "You are the GUARDIAN archetype.
|
|
||||||
Review the changes in branch: <maker's branch>
|
|
||||||
Assess:
|
|
||||||
1. Security vulnerabilities (injection, auth bypass, data exposure)
|
|
||||||
2. Reliability risks (error handling, edge cases, race conditions)
|
|
||||||
3. Breaking changes (API compatibility, schema migrations)
|
|
||||||
4. Dependency risks (new deps, version conflicts)
|
|
||||||
Output: APPROVED or REJECTED with specific findings.
|
|
||||||
Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
|
|
||||||
Categories: security, reliability, design, breaking-change, dependency
|
|
||||||
Be rigorous but practical — flag real risks, not theoretical ones."
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
### Skeptic (if standard or thorough)
|
|
||||||
|
|
||||||
**Context to include:** Creator's proposal (focus on assumptions section).
|
|
||||||
**Context to exclude:** Git diff details, Explorer's research, other reviewer outputs.
|
|
||||||
|
|
||||||
```
|
|
||||||
Agent(
|
|
||||||
description: "🤔 Skeptic: challenge assumptions",
|
|
||||||
prompt: "You are the SKEPTIC archetype.
|
|
||||||
Review the proposal: <Creator's proposal>
|
|
||||||
Challenge:
|
|
||||||
1. Assumptions in the design — what if they're wrong?
|
|
||||||
2. Alternative approaches not considered
|
|
||||||
3. Edge cases not tested
|
|
||||||
4. Scalability concerns
|
|
||||||
Output: APPROVED or REJECTED with counterarguments.
|
|
||||||
Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
|
|
||||||
Categories: design, quality, testing, scalability
|
|
||||||
Be constructive — every challenge must include a suggested alternative."
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
### Sage (if standard or thorough)
|
|
||||||
|
|
||||||
**Context to include:** Creator's proposal, Maker's git diff, implementation summary.
|
|
||||||
**Context to exclude:** Explorer's raw research, other reviewer outputs.
|
|
||||||
|
|
||||||
```
|
|
||||||
Agent(
|
|
||||||
description: "📚 Sage: holistic quality review",
|
|
||||||
prompt: "You are the SAGE archetype.
|
|
||||||
Review the changes in branch: <maker's branch>
|
|
||||||
Evaluate holistically:
|
|
||||||
1. Code quality (readability, maintainability, simplicity)
|
|
||||||
2. Test coverage (are the tests meaningful, not just present?)
|
|
||||||
3. Documentation (does the change need docs?)
|
|
||||||
4. Consistency with codebase patterns
|
|
||||||
Output: APPROVED or REJECTED with quality findings.
|
|
||||||
Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
|
|
||||||
Categories: quality, testing, design, consistency
|
|
||||||
Judge like a senior engineer doing a PR review."
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
### Trickster (if thorough only)
|
|
||||||
|
|
||||||
**Context to include:** Maker's git diff only.
|
|
||||||
**Context to exclude:** Everything else — proposal, research, other reviews.
|
|
||||||
|
|
||||||
```
|
|
||||||
Agent(
|
|
||||||
description: "🃏 Trickster: adversarial testing",
|
|
||||||
prompt: "You are the TRICKSTER archetype.
|
|
||||||
Try to break the changes in branch: <maker's branch>
|
|
||||||
Attack vectors:
|
|
||||||
1. Malformed input, boundary values, empty/null/huge data
|
|
||||||
2. Concurrency and race conditions
|
|
||||||
3. Error path exploitation
|
|
||||||
4. Dependency failure scenarios
|
|
||||||
Output: APPROVED or REJECTED with edge cases found.
|
|
||||||
Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
|
|
||||||
Categories: security, reliability, testing
|
|
||||||
Think like a QA engineer who gets paid per bug found."
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
## Step 4: Act Phase
|
|
||||||
|
|
||||||
Collect all reviewer outputs and decide.
|
|
||||||
|
|
||||||
### Completion Promise (optional)
|
|
||||||
|
|
||||||
If the user defined explicit done criteria with the task, check them now:
|
|
||||||
|
|
||||||
```
|
|
||||||
Completion criteria: <test command passes> AND <Guardian approves>
|
|
||||||
Example: "done when pytest passes and Guardian approves with 0 CRITICAL"
|
|
||||||
```
|
|
||||||
|
|
||||||
If completion criteria are defined, **all criteria must pass** — reviewer approval alone is not sufficient. If tests fail but reviewers approved, cycle back with "tests failing" as feedback to Creator.
|
|
||||||
|
|
||||||
### All Approved (and completion criteria met)
|
|
||||||
1. **Pre-merge hooks:** Check `.archeflow/hooks.yaml` for `pre-merge` hooks. Run them. If `fail_action: abort`, stop and report.
|
|
||||||
2. Merge the Maker's worktree branch into the target branch
|
|
||||||
3. **Post-merge hooks:** Run `post-merge` hooks from `.archeflow/hooks.yaml` if defined. Then run the project's test suite on the merged branch
|
|
||||||
- Tests pass → proceed to the Report step below
|
|
||||||
- Tests fail → **auto-revert** the merge commit, report the failure, and cycle back with "integration test failure on main" as feedback
|
|
||||||
4. Report: what was implemented, what was reviewed, any warnings noted
5. Clean up the worktree
6. Record metrics (see Orchestration Metrics)
|
|
||||||
|
|
||||||
### Issues Found (and cycles remaining)
|
|
||||||
1. Build structured feedback using the Cycle Feedback Protocol below
|
|
||||||
2. Go back to Step 1 (Plan) with the feedback
|
|
||||||
3. Creator revises the proposal, addressing each unresolved issue
|
|
||||||
4. Maker re-implements in a fresh worktree
|
|
||||||
5. Reviewers check again
|
|
||||||
|
|
||||||
### Max Cycles Reached with Unresolved Issues
|
|
||||||
1. Report all unresolved findings to the user
|
|
||||||
2. Present the best implementation so far (on its branch)
|
|
||||||
3. Let the user decide: merge as-is, fix manually, or abandon
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Cycle Feedback Protocol
|
|
||||||
|
|
||||||
After the Check phase, build structured feedback for the next cycle. This replaces dumping raw reviewer output.
|
|
||||||
|
|
||||||
### 1. Extract Findings
|
|
||||||
|
|
||||||
Parse each reviewer's output into the standardized format:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
## Cycle N Feedback
|
|
||||||
|
|
||||||
### Unresolved Issues
|
|
||||||
| Source | Severity | Category | Issue | Route to |
|
|
||||||
|--------|----------|----------|-------|----------|
|
|
||||||
| Guardian | CRITICAL | security | SQL injection in user input | Creator |
|
|
||||||
| Skeptic | WARNING | design | Assumes single-tenant only | Creator |
|
|
||||||
| Sage | WARNING | quality | Test names don't describe behavior | Maker |
|
|
||||||
| Trickster | CRITICAL | reliability | Empty string bypasses validation | Creator |
|
|
||||||
|
|
||||||
### Resolved (from cycle N-1)
|
|
||||||
| Source | Issue | Resolution |
|
|
||||||
|--------|-------|------------|
|
|
||||||
| Guardian | Missing rate limit | Added rate limiter middleware |
|
|
||||||
```
|
|
||||||
|
|
||||||
### 2. Route Feedback
|
|
||||||
|
|
||||||
Not all findings go to the same agent:
|
|
||||||
|
|
||||||
| Source | Category | Routes to | Reason |
|
|
||||||
|--------|----------|-----------|--------|
|
|
||||||
| Guardian | security, breaking-change | **Creator** | Design must change |
|
|
||||||
| Guardian | reliability, dependency | **Creator** | Architectural decision needed |
|
|
||||||
| Skeptic | design, scalability | **Creator** | Assumptions need revision |
|
|
||||||
| Sage | quality, consistency | **Maker** | Implementation refinement |
|
|
||||||
| Sage | testing | **Maker** | Test gap, not design flaw |
|
|
||||||
| Trickster | reliability (design flaw) | **Creator** | Needs redesign |
|
|
||||||
| Trickster | reliability (test gap) | **Maker** | Needs more tests |
|
|
||||||
| Trickster | testing | **Maker** | Edge case not covered |
|
|
||||||
|
|
||||||
**Disambiguation rule:** When in doubt: if the fix requires changing the approach, route to Creator. If it requires changing the code within the existing approach, route to Maker.
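
As a rough sketch, the routing table and the disambiguation rule can be expressed as a lookup with a fallback. The `ROUTES` mapping mirrors the table; the `route` function and its `needs_new_approach` flag are hypothetical names, and the orchestrator is assumed to supply that judgment itself.

```python
# (source, category) -> destination, taken from the routing table.
ROUTES = {
    ("guardian", "security"): "creator",
    ("guardian", "breaking-change"): "creator",
    ("guardian", "reliability"): "creator",
    ("guardian", "dependency"): "creator",
    ("skeptic", "design"): "creator",
    ("skeptic", "scalability"): "creator",
    ("sage", "quality"): "maker",
    ("sage", "consistency"): "maker",
    ("sage", "testing"): "maker",
    ("trickster", "testing"): "maker",
}


def route(source, category, needs_new_approach=False):
    """Pick Creator or Maker for one finding (illustrative only)."""
    key = (source.lower(), category.lower())
    if key in ROUTES:
        return ROUTES[key]
    # Trickster reliability findings and unlisted pairs use the
    # disambiguation rule: a change of approach goes to Creator.
    return "creator" if needs_new_approach else "maker"
```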
|
|
||||||
|
|
||||||
### 3. Track Resolution
|
|
||||||
|
|
||||||
Compare cycle N findings against cycle N-1:
|
|
||||||
- If a prior finding no longer appears in the same category → mark **resolved**
|
|
||||||
- If a prior finding persists → it stays **unresolved** with an incremented cycle count
|
|
||||||
- If new findings appear → add as new unresolved issues
|
|
||||||
|
|
||||||
This prevents regression and gives the Creator/Maker a clear list of what to address.
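
One way to picture the comparison is the sketch below. It assumes each finding is a dict with `category` and `location` keys and that this pair identifies a finding across cycles; the `cycles` counter field is a hypothetical addition.

```python
def track_resolution(previous, current):
    """Split findings into resolved and unresolved lists (illustrative)."""
    def key(finding):
        return (finding["category"], finding["location"])

    current_keys = {key(f) for f in current}
    previous_by_key = {key(f): f for f in previous}

    # Prior findings that no longer appear are resolved.
    resolved = [f for f in previous if key(f) not in current_keys]

    unresolved = []
    for f in current:
        prior = previous_by_key.get(key(f))
        # Persisting findings increment their cycle count; new ones start at 1.
        cycles = prior.get("cycles", 1) + 1 if prior else 1
        unresolved.append({**f, "cycles": cycles})
    return resolved, unresolved
```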
|
|
||||||
|
|
||||||
### 4. Convergence Detection
|
|
||||||
|
|
||||||
If the **same finding** (same category + same file location) appears **unresolved in 2 consecutive cycles**, escalate to user:
|
|
||||||
|
|
||||||
> "Finding persists across 2 cycles: [Guardian] CRITICAL security — SQL injection in src/auth.ts:48. This may need human judgment or a different approach."
|
|
||||||
|
|
||||||
Do not cycle again blindly. The issue is likely structural (wrong design, not wrong implementation) and needs human input.
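
A minimal sketch of the escalation check, under the same assumption that findings are dicts with `source`, `severity`, `category`, `issue`, and `location` fields. Both function names are hypothetical, and the message wording only approximates the quoted example.

```python
def persistent_findings(previous, current):
    """Findings present in two consecutive cycles (same category and location)."""
    seen = {(f["category"], f["location"]) for f in previous}
    return [f for f in current if (f["category"], f["location"]) in seen]


def escalation_message(finding):
    """Format the user-facing escalation text for one persistent finding."""
    return (
        "Finding persists across 2 cycles: "
        f"[{finding['source']}] {finding['severity']} {finding['category']}: "
        f"{finding['issue']} in {finding['location']}. "
        "This may need human judgment or a different approach."
    )
```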
|
|
||||||
|
|
||||||
### 5. Cross-Archetype Dedup
|
|
||||||
|
|
||||||
If two reviewers raise the same issue (same file + same category + similar description), merge into one finding in the consolidated output:
|
|
||||||
|
|
||||||
```
|
|
||||||
| Guardian + Skeptic | CRITICAL | security | Input not sanitized (src/api.ts:30) | Add validation |
|
|
||||||
```
|
|
||||||
|
|
||||||
Don't double-count in severity tallies. Route to the higher-priority destination (Creator over Maker).
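
The merge can be sketched as follows. This is a simplification: it matches on file and category only and leaves out the "similar description" comparison, and keeping the higher severity on merge is an assumption that the text does not state. Field names are hypothetical.

```python
SEVERITY_RANK = {"INFO": 0, "WARNING": 1, "CRITICAL": 2}


def dedup(findings):
    """Merge findings that share a file and category (illustrative)."""
    merged = {}
    for f in findings:
        key = (f["file"], f["category"])
        if key not in merged:
            merged[key] = {**f, "sources": [f["source"]]}
            continue
        m = merged[key]
        m["sources"].append(f["source"])
        # Assumption: the merged finding keeps the higher severity.
        if SEVERITY_RANK[f["severity"]] > SEVERITY_RANK[m["severity"]]:
            m["severity"] = f["severity"]
        # Creator takes priority over Maker as the destination.
        if f["route"] == "creator":
            m["route"] = "creator"
    return list(merged.values())
```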
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Orchestration Metrics
|
|
||||||
|
|
||||||
Track lightweight metrics throughout the orchestration. No token counting (unreliable from skill layer) — just timing and outcomes.
|
|
||||||
|
|
||||||
### Per-Phase Logging
|
|
||||||
|
|
||||||
After each phase completes, note:
|
|
||||||
|
|
||||||
```
|
|
||||||
| Phase | Duration | Agents | Outcome |
|
|
||||||
|-------|----------|--------|---------|
|
|
||||||
| Plan | 45s | 2 | Proposal ready (confidence: 0.8) |
|
|
||||||
| Do | 90s | 1 | 4 files changed, 8 tests added |
|
|
||||||
| Check | 60s | 3 | 1 REJECTED (Guardian), 2 APPROVED |
|
|
||||||
| Act | — | — | Cycle back → feedback built |
|
|
||||||
```
|
|
||||||
|
|
||||||
### Orchestration Summary
|
|
||||||
|
|
||||||
At orchestration end, include in the report:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
## Orchestration Metrics
|
|
||||||
| Metric | Value |
|
|
||||||
|--------|-------|
|
|
||||||
| Workflow | standard |
|
|
||||||
| Cycles | 2 of 2 |
|
|
||||||
| Total duration | 4m 30s |
|
|
||||||
| Agents spawned | 9 |
|
|
||||||
| Findings (total) | 5 |
|
|
||||||
| Findings (critical) | 1 |
|
|
||||||
| Findings (resolved) | 4 |
|
|
||||||
| Shadow detections | 0 |
|
|
||||||
```
|
|
||||||
|
|
||||||
Use this data to calibrate future workflow selection. For example, if fast workflows consistently finish without a revision cycle, those tasks were well-scoped for the fast path.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Autonomous Mode
|
|
||||||
|
|
||||||
When running unattended (overnight sessions, batch queues), add these behaviors to the orchestration loop:
|
|
||||||
|
|
||||||
### Between-Task Checkpoint
|
|
||||||
|
|
||||||
After each task completes (success or failure):
|
|
||||||
1. **Commit and push** all changes immediately
|
|
||||||
2. **Update session log** at `.archeflow/session-log.md` with task outcome
|
|
||||||
3. **Check stop conditions** before starting next task:
|
|
||||||
- 3 consecutive failures → STOP
|
|
||||||
- Shadow escalation (same shadow 3+ times) → STOP
|
|
||||||
- Test suite broken after merge → REVERT and STOP
|
|
||||||
- Destructive action detected → STOP
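
The stop conditions above can be sketched as a single check run between tasks. The function name, the `history` list of outcome strings, and the `shadow_counts` mapping are hypothetical; how the orchestrator collects them is not specified here.

```python
def stop_reason(history, shadow_counts, tests_broken_after_merge, destructive_action):
    """Return a reason string if the queue should stop, else None."""
    # history holds task outcomes in order, e.g. ["ok", "failed", ...].
    if len(history) >= 3 and all(o == "failed" for o in history[-3:]):
        return "3 consecutive failures"
    # shadow_counts maps a shadow name to how often it was detected.
    if any(count >= 3 for count in shadow_counts.values()):
        return "shadow escalation"
    if tests_broken_after_merge:
        return "test suite broken after merge (revert first)"
    if destructive_action:
        return "destructive action detected"
    return None
```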
|
|
||||||
|
|
||||||
### Session Log Protocol
|
|
||||||
|
|
||||||
**Primary:** Emit `run.complete` event to `.archeflow/events/<run_id>.jsonl` (see Process Logging section above). The event stream is the source of truth.
|
|
||||||
|
|
||||||
**Secondary:** Also write a human-readable summary to `.archeflow/session-log.md`:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
## Task N: <description>
|
|
||||||
**Workflow:** standard | **Status:** COMPLETED/FAILED
|
|
||||||
**Cycles:** 1 of 2
|
|
||||||
**Findings:** Guardian APPROVED, Skeptic APPROVED, Sage WARNING (test names)
|
|
||||||
**Files changed:** 5 | **Tests added:** 12
|
|
||||||
**Branch:** merged to main (commit abc1234) | OR: archeflow/maker-xyz (NOT merged)
|
|
||||||
**Duration:** 8 min
|
|
||||||
**Events:** `.archeflow/events/<run_id>.jsonl` (full process log)
|
|
||||||
```
|
|
||||||
|
|
||||||
Generate the full Markdown report: `./lib/archeflow-report.sh .archeflow/events/<run_id>.jsonl`
|
|
||||||
|
|
||||||
### Safety Rules
|
|
||||||
- Never force-push. Never modify main history.
|
|
||||||
- All work stays on worktree branches until explicitly merged
|
|
||||||
- Merges use `--no-ff` — individually revertable
|
|
||||||
- Failed tasks leave branches intact for manual inspection
|
|
||||||
|
|
||||||
For full autonomous mode details (task queues, overnight checklists, user controls): load the `archeflow:autonomous-mode` skill.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Shadow Monitoring
|
|
||||||
|
|
||||||
During orchestration, watch for shadow activation after each agent completes. Quick checklist:
|
|
||||||
|
|
||||||
| Archetype | Shadow | Quick Check |
|
|
||||||
|-----------|--------|-------------|
|
|
||||||
| Explorer | Rabbit Hole | Output >2000 words without Recommendation section? |
|
|
||||||
| Creator | Over-Architect | >2 new abstractions for one feature? |
|
|
||||||
| Maker | Rogue | No test files in changeset? Files outside proposal? |
|
|
||||||
| Guardian | Paranoid | CRITICAL:WARNING ratio >2:1? Zero approvals? |
|
|
||||||
| Skeptic | Paralytic | >7 challenges? <50% have alternatives? |
|
|
||||||
| Trickster | False Alarm | Findings in untouched code? >10 findings? |
|
|
||||||
| Sage | Bureaucrat | Review >2x code change length? |
|
|
||||||
|
|
||||||
On detection: apply correction prompt from `archeflow:shadow-detection` skill. On second detection of same shadow: replace agent. On 3+ shadows in same cycle: escalate to user.
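
The escalation ladder in the previous paragraph might look like this in code. It is a sketch only: the function name is hypothetical, `prior_detections` is assumed to count earlier detections per shadow, and `shadows_this_cycle` is assumed to include the current detection.

```python
def shadow_action(shadow, prior_detections, shadows_this_cycle):
    """Choose the response to a detected shadow (illustrative)."""
    if shadows_this_cycle >= 3:
        # Three or more shadows in one cycle goes to the user.
        return "escalate"
    if prior_detections.get(shadow, 0) >= 1:
        # Second detection of the same shadow: replace the agent.
        return "replace"
    # First detection: apply the correction prompt.
    return "correct"
```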
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Parallel Team Orchestration
|
|
||||||
|
|
||||||
When running multiple independent tasks, spawn parallel ArcheFlow teams. Each team runs its own PDCA cycle on a separate worktree.
|
|
||||||
|
|
||||||
### Rules
|
|
||||||
|
|
||||||
1. **Non-overlapping file scope:** Each team must work on different files. If two tasks touch the same file, run them sequentially.
|
|
||||||
2. **Independent worktrees:** Each team's Maker gets its own worktree branch (`archeflow/team-1-maker`, `archeflow/team-2-maker`).
|
|
||||||
3. **First-finished-first-merged:** Teams merge in completion order. Later teams rebase onto the updated main before their own merge.
|
|
||||||
4. **Merge conflict handling:** If rebase fails, the later team re-runs its Check phase against the merged main. If conflicts are structural, escalate to user.
|
|
||||||
5. **Max 3 parallel teams:** More causes diminishing returns and merge headaches.
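
A possible way to apply rules 1 and 5 together is a greedy grouping of tasks into batches with disjoint file sets. The function below is a hypothetical sketch; it assumes each task's file scope is known up front, which may not hold in practice.

```python
def plan_batches(tasks, max_parallel=3):
    """Group tasks into parallel batches with non-overlapping files.

    tasks maps a task name to the file paths it is expected to touch.
    """
    batches = []
    for name, files in tasks.items():
        files = set(files)
        for batch in batches:
            has_room = len(batch["names"]) < max_parallel
            if has_room and not (batch["files"] & files):
                batch["names"].append(name)
                batch["files"] |= files
                break
        else:
            # No compatible batch: this task runs in a later batch.
            batches.append({"names": [name], "files": files})
    return [b["names"] for b in batches]
```

Tasks that share a file land in different batches, so they run sequentially as rule 1 requires.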
|
|
||||||
|
|
||||||
### Spawning Parallel Teams
|
|
||||||
|
|
||||||
```
|
|
||||||
# Launch 2-3 teams in a single message with multiple Agent calls:
|
|
||||||
Agent(description: "🏗️ Team 1: pagination fix (fast)", ...)
|
|
||||||
Agent(description: "🏗️ Team 2: JWT auth (standard)", ...)
|
|
||||||
Agent(description: "🏗️ Team 3: logging refactor (fast)", ...)
|
|
||||||
```
|
|
||||||
|
|
||||||
Each team follows the full PDCA steps independently. The orchestrator monitors all teams and handles merges.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Reviewer Profiles
|
|
||||||
|
|
||||||
Projects can configure which reviewers matter in `.archeflow/config.yaml`:
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
reviewers:
|
|
||||||
always: [guardian] # Always runs
|
|
||||||
default: [sage] # Runs in standard+thorough
|
|
||||||
thorough_only: [trickster] # Only in thorough
|
|
||||||
skip: [skeptic] # Never runs for this project
|
|
||||||
```
|
|
||||||
|
|
||||||
If no config exists, use the built-in workflow defaults. Profiles save tokens by not spawning reviewers that add little value for the specific project.
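
To illustrate how such a profile could be resolved, here is a sketch that takes the already-parsed config as a plain dict (YAML loading is left out). The function name is hypothetical; the workflow names come from the comments in the config example.

```python
def select_reviewers(profile, workflow):
    """Return the reviewers to spawn for a workflow (illustrative)."""
    selected = list(profile.get("always", []))
    if workflow in ("standard", "thorough"):
        selected += profile.get("default", [])
    if workflow == "thorough":
        selected += profile.get("thorough_only", [])
    # Reviewers under skip never run, whatever else lists them.
    skip = set(profile.get("skip", []))
    return [r for r in selected if r not in skip]
```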
|
|
||||||
|
|
||||||
## Explorer Cache
|
|
||||||
|
|
||||||
If the same code area was explored recently, skip Explorer and reuse prior research:
|
|
||||||
|
|
||||||
**Cache hit criteria:** Same files affected (>70% overlap by path) AND prior research is <24 hours old AND no commits to those files since the research.
|
|
||||||
|
|
||||||
**On cache hit:** Show the prior research to Creator with a note: "Using cached Explorer research from [timestamp]. If the codebase changed significantly, re-run Explorer."
|
|
||||||
|
|
||||||
**On cache miss:** Run Explorer normally.
|
|
||||||
|
|
||||||
Cache is stored in `.archeflow/explorer-cache/` as timestamped markdown files. The orchestrator checks for matches before spawning Explorer.
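
The cache hit criteria can be sketched as one predicate. Two points are assumptions here: overlap is measured as the share of the task's files that the cached research covered, and the caller supplies the time of the last commit to those files (or `None` if there was none). The function name is hypothetical.

```python
from datetime import datetime, timedelta


def is_cache_hit(task_files, cached_files, researched_at, last_commit_at, now):
    """Check the three cache hit criteria (illustrative)."""
    task, cached = set(task_files), set(cached_files)
    if not task:
        return False
    overlap = len(task & cached) / len(task)
    fresh = now - researched_at < timedelta(hours=24)
    untouched = last_commit_at is None or last_commit_at <= researched_at
    return overlap > 0.7 and fresh and untouched
```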
|
|
||||||
|
|
||||||
## Learning from History
|
|
||||||
|
|
||||||
Track which archetypes catch real issues per project over time. After each orchestration, append to `.archeflow/metrics.jsonl`:
|
|
||||||
|
|
||||||
```json
|
|
||||||
{"task": "...", "archetype": "guardian", "findings": 2, "critical": 1, "resolved": 2, "useful": true}
|
|
||||||
{"task": "...", "archetype": "skeptic", "findings": 3, "critical": 0, "resolved": 0, "useful": false}
|
|
||||||
```
|
|
||||||
|
|
||||||
A finding is **useful** if it was resolved (led to a code change) rather than dismissed.
|
|
||||||
|
|
||||||
After 10+ orchestrations, the orchestrator can recommend reviewer profile changes:
|
|
||||||
- "Skeptic has found 0 useful issues in 8 runs — consider moving to `skip` or `thorough_only`"
|
|
||||||
- "Guardian catches critical issues in 80% of runs — confirmed as essential"
|
|
||||||
|
|
||||||
This is advisory, not automatic. The user decides based on the data.
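
A rough sketch of how such a recommendation could be derived from the JSONL records. It only covers the "0 useful issues" case, and applying the 10-run threshold per archetype is an assumption; the function name and message wording are hypothetical.

```python
import json
from collections import defaultdict


def recommend(lines, min_runs=10):
    """Suggest profile changes from metrics.jsonl lines (illustrative)."""
    stats = defaultdict(lambda: {"runs": 0, "useful": 0})
    for line in lines:
        record = json.loads(line)
        entry = stats[record["archetype"]]
        entry["runs"] += 1
        entry["useful"] += 1 if record["useful"] else 0

    notes = []
    for archetype, entry in sorted(stats.items()):
        if entry["runs"] >= min_runs and entry["useful"] == 0:
            notes.append(
                f"{archetype}: 0 useful findings in {entry['runs']} runs; "
                "consider skip or thorough_only"
            )
    return notes
```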
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Orchestration Report
|
|
||||||
|
|
||||||
After completion, summarize:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
## ArcheFlow Orchestration Report
|
|
||||||
- **Task:** <description>
|
|
||||||
- **Workflow:** standard (2 cycles)
|
|
||||||
- **Cycle 1:** Guardian rejected (SQL injection in user input handler)
|
|
||||||
- **Cycle 2:** All approved after input sanitization added
|
|
||||||
- **Files changed:** 4 files, +120 -30 lines
|
|
||||||
- **Tests added:** 8 new tests
|
|
||||||
- **Branch:** archeflow/maker-<id> → merged to main
|
|
||||||
- **Metrics:** 9 agents, 4m 30s, 5 findings (4 resolved, 1 info remaining)
|
|
||||||
```
|
|
||||||
@@ -1,175 +0,0 @@
|
|||||||
---
|
|
||||||
name: plan-phase
|
|
||||||
description: Use when acting as Explorer or Creator in the Plan phase. Defines output formats for research and proposals.
|
|
||||||
---
|
|
||||||
|
|
||||||
# Plan Phase
|
|
||||||
|
|
||||||
Explorer researches, then Creator designs. Sequential — Creator needs Explorer's findings.
|
|
||||||
|
|
||||||
## Explorer Output Format
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
## Research: <task>
|
|
||||||
|
|
||||||
### Affected Code
|
|
||||||
- `path/file.ext` — description (L<start>-<end>)
|
|
||||||
|
|
||||||
### Dependencies
|
|
||||||
- What depends on what, what breaks if changed
|
|
||||||
|
|
||||||
### Patterns
|
|
||||||
- How the codebase solves similar problems
|
|
||||||
|
|
||||||
### Risks
|
|
||||||
- What could go wrong
|
|
||||||
|
|
||||||
### Recommendation
|
|
||||||
<one paragraph: approach + rationale>
|
|
||||||
```
|
|
||||||
|
|
||||||
## Creator Output Format
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
## Proposal: <task>
|
|
||||||
|
|
||||||
### Mini-Reflect (fast workflow only — skip if Explorer ran)
|
|
||||||
- **Task restated:** <one sentence>
|
|
||||||
- **Assumptions:** 1) ... 2) ... 3) ...
|
|
||||||
- **Highest-damage risk:** <the one thing that would hurt most if wrong>
|
|
||||||
|
|
||||||
### Architecture Decision
|
|
||||||
<What and WHY>
|
|
||||||
|
|
||||||
### Alternatives Considered
|
|
||||||
| Approach | Why Rejected |
|
|
||||||
|----------|-------------|
|
|
||||||
| <option A> | <reason> |
|
|
||||||
| <option B> | <reason> |
|
|
||||||
|
|
||||||
### Changes
|
|
||||||
1. **`path/file.ext`** — What changes and why
|
|
||||||
2. **`path/test.ext`** — What tests to add
|
|
||||||
|
|
||||||
### Test Strategy
|
|
||||||
- <specific test cases>
|
|
||||||
|
|
||||||
### Confidence
|
|
||||||
| Axis | Score | Note |
|
|
||||||
|------|-------|------|
|
|
||||||
| Task understanding | <0.0-1.0> | <why> |
|
|
||||||
| Solution completeness | <0.0-1.0> | <gaps?> |
|
|
||||||
| Risk coverage | <0.0-1.0> | <unknowns?> |
|
|
||||||
|
|
||||||
### Risks
|
|
||||||
- <what could go wrong + mitigations>
|
|
||||||
|
|
||||||
### Not Doing
|
|
||||||
- <adjacent concerns deliberately excluded>
|
|
||||||
```
|
|
||||||
|
|
||||||
**Confidence triggers:** If any axis scores below 0.5, flag it to the orchestrator. Low task understanding → clarify with user. Low solution completeness → consider standard workflow. Low risk coverage → spawn targeted Explorer research.
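
The triggers could be checked along these lines. The axis keys are hypothetical snake_case versions of the three table rows, and treating a missing axis as 0.0 is an assumption.

```python
# Follow-up action per confidence axis, from the triggers described above.
ACTIONS = {
    "task_understanding": "clarify with user",
    "solution_completeness": "consider standard workflow",
    "risk_coverage": "spawn targeted Explorer research",
}


def confidence_flags(scores, threshold=0.5):
    """Return (axis, action) pairs for every axis below the threshold."""
    return [
        (axis, action)
        for axis, action in ACTIONS.items()
        if scores.get(axis, 0.0) < threshold
    ]
```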
|
|
||||||
|
|
||||||
## Creator with Prior Feedback (Cycle 2+)
|
|
||||||
|
|
||||||
When the Creator receives structured feedback from a prior cycle, the proposal must include an additional section addressing each unresolved issue:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
## Proposal: <task> (Revision — Cycle N)
|
|
||||||
|
|
||||||
### What Changed (vs. prior proposal)
|
|
||||||
- <brief delta: what was added, removed, or redesigned>
|
|
||||||
|
|
||||||
### Prior Feedback Response
|
|
||||||
| Issue | Source | Action | Rationale |
|
|
||||||
|-------|--------|--------|-----------|
|
|
||||||
| SQL injection in user input | Guardian | **Fixed** — added parameterized queries | Direct security fix |
|
|
||||||
| Assumes single-tenant | Skeptic | **Deferred** — multi-tenant out of scope | Not in task requirements |
|
|
||||||
| Test names unclear | Sage | **Accepted** — routed to Maker | Implementation concern |
|
|
||||||
|
|
||||||
### Architecture Decision
|
|
||||||
<revised design addressing feedback>
|
|
||||||
|
|
||||||
### Changes
|
|
||||||
<updated file list>
|
|
||||||
|
|
||||||
### Test Strategy
|
|
||||||
<updated test cases>
|
|
||||||
|
|
||||||
### Confidence
|
|
||||||
| Axis | Score | Note |
|
|
||||||
|------|-------|------|
|
|
||||||
| Task understanding | <0.0-1.0> | <why> |
|
|
||||||
| Solution completeness | <0.0-1.0> | <gaps?> |
|
|
||||||
| Risk coverage | <0.0-1.0> | <unknowns?> |
|
|
||||||
|
|
||||||
### Risks
|
|
||||||
<updated risks — include any new risks from the revision>
|
|
||||||
|
|
||||||
### Not Doing
|
|
||||||
<updated scope boundaries>
|
|
||||||
```
|
|
||||||
|
|
||||||
**Rules for addressing feedback:**
|
|
||||||
- **Fixed:** Changed the design to resolve the issue. Explain how.
|
|
||||||
- **Deferred:** Not addressing now, with explicit reason. Must not be a CRITICAL finding.
|
|
||||||
- **Accepted:** Acknowledged and routed to Maker for implementation-level fix.
|
|
||||||
- **Disputed:** Disagrees with the finding. Must provide evidence or reasoning.
|
|
||||||
|
|
||||||
CRITICAL findings cannot be deferred or disputed — they must be fixed or the proposal will be rejected again.
|
|
||||||
|
|
||||||
## Task Granularity
|
|
||||||
|
|
||||||
Each change item in the Creator's proposal must be a **2-5 minute task** — specific enough that the Maker can implement it without interpretation.
|
|
||||||
|
|
||||||
### Requirements per Change Item
|
|
||||||
|
|
||||||
Every item in the `### Changes` section must include:
|
|
||||||
|
|
||||||
1. **Exact file path** — `src/auth/handler.ts`, not "the auth module"
|
|
||||||
2. **What to change** — a code block showing the target state or transformation
|
|
||||||
3. **How to verify** — a command or check that confirms correctness
|
|
||||||
|
|
||||||
### Good Example
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
1. **`src/auth/handler.ts:48`** — Add input length validation before token processing
|
|
||||||
```typescript
|
|
||||||
if (!token || token.trim().length === 0) {
|
|
||||||
throw new ValidationError('Token must not be empty');
|
|
||||||
}
|
|
||||||
```
|
|
||||||
**Verify:** `npm test -- --grep "empty token"` passes
|
|
||||||
```
|
|
||||||
|
|
||||||
### Bad Example
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
1. **Auth module** — Fix the validation logic
|
|
||||||
```
|
|
||||||
|
|
||||||
This is too vague. Which file? Which function? What does "fix" mean? The Maker will guess.
|
|
||||||
|
|
||||||
### Granularity Check
|
|
||||||
|
|
||||||
- If a single change item would take **>5 minutes**, split it into smaller items
|
|
||||||
- If a non-trivial task has **<2 change items**, it is under-specified — the Creator missed something
|
|
||||||
- Each item should touch **1-2 files** at most. Cross-cutting changes need separate items per file.
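
The three checks above can be sketched as a small validator. It assumes each change item carries an estimated `minutes` value and a `files` list, which the proposal format does not require; the function and field names are hypothetical.

```python
def granularity_problems(items, trivial=False):
    """List granularity violations in a proposal's change items."""
    problems = []
    if not trivial and len(items) < 2:
        problems.append("fewer than 2 change items: proposal is under-specified")
    for index, item in enumerate(items, start=1):
        if item["minutes"] > 5:
            problems.append(f"item {index}: over 5 minutes, split it")
        if len(item["files"]) > 2:
            problems.append(f"item {index}: touches more than 2 files")
    return problems
```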
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Explorer Skip Conditions
|
|
||||||
|
|
||||||
Not every task needs Explorer research. Use this decision table:
|
|
||||||
|
|
||||||
| Condition | Skip Explorer? | Reason |
|
|
||||||
|-----------|---------------|--------|
|
|
||||||
| Task names specific files (1-2) and change is clear | **Yes** | Context is already known |
|
|
||||||
| Bug fix with stack trace or error message | **Yes** | Root cause is locatable without research |
|
|
||||||
| High confidence + small scope (single function/class) | **Yes** | Creator can mini-reflect instead |
|
|
||||||
| Task contains "investigate", "research", "explore" | **No** | Explicit research request |
|
|
||||||
| Task affects >3 files or unknown scope | **No** | Need dependency mapping |
|
|
||||||
| Unfamiliar area of codebase (no recent commits by team) | **No** | Need pattern discovery |
|
|
||||||
| Security-sensitive change (auth, crypto, input handling) | **No** | Need risk surface mapping |
|
|
||||||
|
|
||||||
When Explorer is skipped, Creator MUST include the **Mini-Reflect** section in its proposal to compensate for missing research context.
|
|
||||||
@@ -1,160 +1,59 @@
|
|||||||
---
|
---
|
||||||
name: presence
|
name: presence
|
||||||
description: |
|
description: |
|
||||||
Defines how ArcheFlow communicates its activity to the user — visible but not noisy.
|
Defines how ArcheFlow communicates its activity to the user -- visible but not noisy.
|
||||||
Show value, not process. Auto-loaded by the run skill.
|
Show value, not process. Auto-loaded by the run skill.
|
||||||
---
|
---
|
||||||
|
|
||||||
# ArcheFlow Presence — Visible Value, Not Noise
|
# ArcheFlow Presence -- Visible Value, Not Noise
|
||||||
|
|
||||||
ArcheFlow should feel like a skilled colleague working alongside you: you know they're there, you see results, but they don't narrate every keystroke.
|
## Output Rules
|
||||||
|
|
||||||
## Principles
|
1. Show outcomes, not mechanics
|
||||||
|
2. One line per phase, not per agent
|
||||||
1. **Show outcomes, not mechanics.** "Guardian caught a timeline bug" — good. "Spawning Guardian agent with attention filters..." — noise.
|
3. Numbers over words
|
||||||
2. **One line per phase, not per agent.** The user sees phases complete, not individual agent lifecycle.
|
4. Silence on clean passes
|
||||||
3. **Numbers over words.** "2 fixes applied" beats "We have successfully applied two fixes to the codebase."
|
5. Value summary at the end
|
||||||
4. **Silence is fine.** If a phase completes cleanly with no findings, don't announce it. Clean passes are the expected case.
|
|
||||||
5. **Value at the end.** The completion summary is the most important output — what was built, what was caught, what was fixed.
|
|
||||||
|
|
||||||
## Status Line Format
|
## Status Line Format
|
||||||
|
|
||||||
At key moments during a run, output a compact status line:
|
**Run start:**
|
||||||
|
|
||||||
### Run Start
|
|
||||||
```
|
```
|
||||||
── archeflow ── <task> ── <workflow> (<max_cycles> cycles) ──
|
-- archeflow -- <task> -- <workflow> (<max_cycles> cycles) --
|
||||||
```
|
|
||||||
Example:
|
|
||||||
```
|
|
||||||
── archeflow ── Write story "Der Huster" ── kurzgeschichte (2 cycles) ──
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Phase Complete (only if something happened worth mentioning)
|
**Phase complete (only if noteworthy):**
|
||||||
```
|
```
|
||||||
✓ plan explorer: 3 directions → chose C (Koffer) | creator: 6 scenes
|
V plan explorer: 3 directions -> chose C | creator: 6 scenes
|
||||||
✓ do 6004 words drafted
|
V do 6004 words drafted
|
||||||
△ check guardian: 1 fix needed | sage: 5 voice adjustments
|
T check guardian: 1 fix needed | sage: 5 voice adjustments
|
||||||
✓ act 6 fixes applied
|
V act 6 fixes applied
|
||||||
```
|
```
|
||||||
|
Symbols: V = clean, T = issues found, X = failed/blocked.
|
||||||
|
|
||||||
Symbols:
|
**Run complete:**
|
||||||
- `✓` — phase clean, no issues
|
|
||||||
- `△` — phase found issues (fixes needed)
|
|
||||||
- `✗` — phase failed (blocked, needs user input)
|
|
||||||
|
|
||||||
### Run Complete
|
|
||||||
```
|
```
|
||||||
── done ── 1 cycle · 5 agents · 6 fixes · ~22 min ──
|
-- done -- 1 cycle . 5 agents . 6 fixes . ~22 min --
|
||||||
```
|
|
||||||
|
|
||||||
If value was delivered, add a one-liner:
|
|
||||||
```
|
|
||||||
── done ── 1 cycle · 5 agents · 6 fixes · ~22 min ──
|
|
||||||
story drafted, reviewed, and polished. see stories/01-der-huster.md
|
story drafted, reviewed, and polished. see stories/01-der-huster.md
|
||||||
```
|
```
|
||||||
|
|
||||||
### Run Complete (with DAG, if terminal supports it)
|
**Activation indicator (session start, one line):**
|
||||||
Only show if the user explicitly asks or if `progress.dag_on_complete: true` in config:
|
|
||||||
```
|
```
|
||||||
── archeflow ── complete ──────────────────────
|
archeflow v0.7.0 . 24 skills . writing domain detected
|
||||||
#1 run.start
|
|
||||||
├── #2 explorer → #3 decision (C) → #4 creator
|
|
||||||
├── #6 maker (6004 words)
|
|
||||||
├── #8 guardian △1 · #9 sage △5
|
|
||||||
└── #12 complete [6 fixes]
|
|
||||||
───────────────────────────────────────────────
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## When to Be Silent
|
## When to Be Silent
|
||||||
|
|
||||||
- **Agent spawning/completion** — don't announce
|
- Agent spawning/completion lifecycle
|
||||||
- **Event emission** — internal bookkeeping, never visible
|
- Event emission
|
||||||
- **Artifact routing** — internal
|
- Artifact routing
|
||||||
- **Clean review passes** — if Guardian says APPROVED with 0 findings, skip it
|
- Clean review passes (0 findings)
|
||||||
- **Phase transitions** — only show if the phase produced visible output
|
- Phase transitions with no visible output
|
||||||
|
|
||||||
## When to Speak
|
## When to Speak
|
||||||
|
|
||||||
- **Run start** — always (user should know ArcheFlow activated)
|
- Run start and complete (always)
|
||||||
- **Findings found** — always (this is the value)
|
- Findings found and fixes applied
|
||||||
- **Fixes applied** — always (this is the outcome)
|
- Budget warnings
|
||||||
- **Run complete** — always (closure)
|
- Shadow detected
|
||||||
- **Budget warnings** — always (user needs to know)
|
- User decision needed
|
||||||
- **Shadow detected** — always (something went wrong)
|
|
||||||
- **User decision needed** — always (blocking)
|
|
||||||
|
|
||||||
## Activation Indicator
|
|
||||||
|
|
||||||
When ArcheFlow activates at session start (via the `using-archeflow` skill), show ONE line:
|
|
||||||
|
|
||||||
```
|
|
||||||
archeflow v0.7.0 · 24 skills · writing domain detected
|
|
||||||
```
|
|
||||||
|
|
||||||
Or for code projects:
|
|
||||||
```
|
|
||||||
archeflow v0.7.0 · 24 skills · code domain
|
|
||||||
```
|
|
||||||
|
|
||||||
If ArcheFlow decides NOT to activate (simple task, single file):
|
|
||||||
```
|
|
||||||
(nothing — silence is correct for simple tasks)
|
|
||||||
```
|
|
||||||
|
|
||||||
## Integration with Progress File
|
|
||||||
|
|
||||||
The `.archeflow/progress.md` file is the detailed view for users who want more. The status lines above are the default — brief, inline, part of the conversation flow.
|
|
||||||
|
|
||||||
Users who want the full picture: `archeflow-progress.sh <run_id> --watch` in a second terminal.
|
|
||||||
|
|
||||||
## Anti-Patterns (Don't Do This)
|
|
||||||
|
|
||||||
```
|
|
||||||
❌ "I'm now activating the ArcheFlow orchestration framework..."
|
|
||||||
❌ "Spawning Explorer agent with model haiku and attention filter..."
|
|
||||||
❌ "The Guardian archetype has completed its security review and found..."
|
|
||||||
❌ "Let me run the convergence detection algorithm to check..."
|
|
||||||
❌ "According to the ArcheFlow process-log event schema..."
|
|
||||||
```
|
|
||||||
|
|
||||||
These expose internal mechanics. The user doesn't care about archetypes, attention filters, or event schemas. They care about: what was done, what was found, what was fixed.
|
|
||||||
|
|
||||||
## Examples: Good Presence
|
|
||||||
|
|
||||||
### Example 1: Feature Implementation
|
|
||||||
```
|
|
||||||
── archeflow ── Add JWT auth ── standard (2 cycles) ──
|
|
||||||
✓ plan 3 files affected, JWT + middleware approach
|
|
||||||
✓ do implemented (auth.ts, middleware.ts, tests)
|
|
||||||
△ check guardian: missing token expiry check
|
|
||||||
✓ act 1 fix applied
|
|
||||||
── done ── 1 cycle · 4 agents · 1 fix · ~8 min ──
|
|
||||||
```
|
|
||||||
|
|
||||||
### Example 2: Story Writing
|
|
||||||
```
|
|
||||||
── archeflow ── Write "Der Huster" ── kurzgeschichte (2 cycles) ──
|
|
||||||
✓ plan 3 plot directions → chose C (Mo krank + Koffer)
|
|
||||||
✓ do 6004 words, 7 scenes
|
|
||||||
△ check 1 timeline bug, 5 voice adjustments
|
|
||||||
✓ act 6 fixes applied
|
|
||||||
── done ── 1 cycle · 5 agents · 6 fixes · ~22 min ──
|
|
||||||
stories/01-der-huster.md ready
|
|
||||||
```
|
|
||||||
|
|
||||||
### Example 3: Quick Fix (minimal output)
|
|
||||||
```
|
|
||||||
── archeflow ── Fix pagination bug ── fast ──
|
|
||||||
✓ fix applied, tests pass
|
|
||||||
── done ── 1 cycle · 3 agents · ~4 min ──
|
|
||||||
```
|
|
||||||
|
|
||||||
### Example 4: Multi-Project
|
|
||||||
```
|
|
||||||
── archeflow ── giesing-story-v2 ── 3 projects ──
|
|
||||||
✓ archeflow artifact routing improved
|
|
||||||
✓ colette voice validation added
|
|
||||||
✓ giesing story #2 drafted (5800 words)
|
|
||||||
── done ── 3 projects · 12 agents · ~35 min ──
|
|
||||||
```
|
|
||||||
|
|||||||
@@ -1,278 +0,0 @@
|
|||||||
---
|
|
||||||
name: process-log
|
|
||||||
description: |
|
|
||||||
Event-based process logging for ArcheFlow orchestrations. Captures every phase transition,
|
|
||||||
agent output, decision, and fix as structured JSONL events. Enables post-hoc reports,
|
|
||||||
dashboards, and process archaeology.
|
|
||||||
<example>Automatically loaded during orchestration</example>
|
|
||||||
<example>User: "Show me how this story was made"</example>
|
|
||||||
---
|
|
||||||
|
|
||||||
# Process Log — Event-Sourced Orchestration History
|
|
||||||
|
|
||||||
Every ArcheFlow orchestration writes structured events to a JSONL file. Events are the **single source of truth** — all reports (Markdown, dashboards, timelines) are generated views.
|
|
||||||
|
|
||||||
## Event Storage
|
|
||||||
|
|
||||||
```
|
|
||||||
.archeflow/events/<run-id>.jsonl # One file per orchestration run
|
|
||||||
.archeflow/events/index.jsonl # Run index (one line per run, for listing)
|
|
||||||
```
|
|
||||||
|
|
||||||
**Run ID format:** `<date>-<slug>` (e.g., `2026-04-03-der-huster`)
|
|
||||||
|
|
||||||
## When to Emit Events
|
|
||||||
|
|
||||||
Emit an event at each of these points during orchestration:
|
|
||||||
|
|
||||||
| Moment | Event Type | Trigger |
|
|
||||||
|--------|-----------|---------|
|
|
||||||
| Orchestration starts | `run.start` | After workflow selection, before first agent |
|
|
||||||
| Agent spawned | `agent.start` | Before each Agent tool call |
|
|
||||||
| Agent completes | `agent.complete` | After each Agent returns |
|
|
||||||
| Phase transition | `phase.transition` | Plan→Do, Do→Check, Check→Act |
|
|
||||||
| Decision made | `decision` | Plot direction chosen, fix applied, workflow adapted |
|
|
||||||
| Review verdict | `review.verdict` | Guardian/Sage/Skeptic delivers verdict |
|
|
||||||
| Fix applied | `fix.applied` | After each edit that addresses a review finding |
|
|
||||||
| Cycle boundary | `cycle.boundary` | End of PDCA cycle, before next (or exit) |
|
|
||||||
| Shadow detected | `shadow.detected` | Shadow threshold triggered |
|
|
||||||
| Orchestration ends | `run.complete` | After final Act phase |
|
|
||||||
|
|
||||||
## Event Schema
|
|
||||||
|
|
||||||
Every event is one JSON line with these required fields:
|
|
||||||
|
|
||||||
```jsonl
|
|
||||||
{
|
|
||||||
"ts": "2026-04-03T14:32:07Z",
|
|
||||||
"run_id": "2026-04-03-der-huster",
|
|
||||||
"seq": 4,
|
|
||||||
"parent": [2],
|
|
||||||
"type": "agent.complete",
|
|
||||||
"phase": "plan",
|
|
||||||
"agent": "creator",
|
|
||||||
"data": { ... }
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
| Field | Type | Description |
|
|
||||||
|-------|------|-------------|
|
|
||||||
| `ts` | ISO 8601 | Timestamp |
|
|
||||||
| `run_id` | string | Unique run identifier |
|
|
||||||
| `seq` | integer | Monotonically increasing sequence number within run |
|
|
||||||
| `parent` | int[] | Seq numbers of causal parent events. Forms a DAG. `[]` for root events. |
|
|
||||||
| `type` | string | Event type (see table above) |
|
|
||||||
| `phase` | string | Current PDCA phase: `plan`, `do`, `check`, `act` |
|
|
||||||
| `agent` | string or null | Agent archetype that triggered the event |
|
|
||||||
| `data` | object | Event-type-specific payload (see below) |
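
A minimal sketch of checking one JSONL line against the field table above. The function name `validate_event` is hypothetical, and the rule that every parent seq must be lower than the event's own seq is an assumption inferred from "monotonically increasing" and "no forward references", not something the skill states as a check.

```python
import json

# Required top-level fields, taken from the schema table above.
REQUIRED_FIELDS = ("ts", "run_id", "seq", "parent", "type", "phase", "agent", "data")
PHASES = {"plan", "do", "check", "act"}


def validate_event(line: str) -> list[str]:
    """Return a list of problems found in one JSONL line (empty list = valid)."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(event, dict):
        return ["event must be a JSON object"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in event]
    if problems:
        return problems

    seq, parent, phase = event["seq"], event["parent"], event["phase"]
    if not isinstance(seq, int) or isinstance(seq, bool):
        problems.append("seq must be an integer")
    if not (isinstance(parent, list)
            and all(isinstance(p, int) and not isinstance(p, bool) for p in parent)):
        problems.append("parent must be a list of integers")
    elif isinstance(seq, int) and any(p >= seq for p in parent):
        # Assumption: causal parents are always emitted before their children.
        problems.append("parent seq must be lower than the event's own seq")
    if not isinstance(phase, str) or phase not in PHASES:
        problems.append(f"unknown phase: {phase!r}")
    if event["agent"] is not None and not isinstance(event["agent"], str):
        problems.append("agent must be a string or null")
    if not isinstance(event["data"], dict):
        problems.append("data must be an object")
    return problems
```

Because events are observation rather than control flow, a check like this would fit better in a report or lint step than in the emit path.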
|
|
||||||
|
|
||||||
### Parent Relationships (DAG)
|
|
||||||
|
|
||||||
The `parent` field turns the flat event stream into a directed acyclic graph (agent call graph). This enables:
|
|
||||||
|
|
||||||
- **Causal reconstruction:** which agent output caused which downstream action
|
|
||||||
- **Parallel visualization:** agents sharing a parent ran concurrently
|
|
||||||
- **Blame tracking:** trace a fix back through review → draft → outline → research
|
|
||||||
|
|
||||||
Rules:
|
|
||||||
- `run.start` has `parent: []` (root node)
|
|
||||||
- An agent has `parent: [seq of event that triggered it]`
|
|
||||||
- A phase transition has `parent: [seq of all completing events in prior phase]`
|
|
||||||
- A fix has `parent: [seq of the review that found the issue]`
|
|
||||||
- A decision has `parent: [seq of the agent that produced the alternatives]`
|
|
||||||
- Parallel agents share the same parent (fan-out), phase transitions collect them (fan-in)
|
|
||||||
|
|
||||||
Example DAG from a writing workflow:
|
|
||||||
```
|
|
||||||
#1 run.start []
|
|
||||||
├── #2 agent.complete (explorer) [1]
|
|
||||||
│ └── #3 decision (plot direction) [2]
|
|
||||||
├── #4 agent.complete (creator) [2] ← explorer informs creator
|
|
||||||
├── #5 phase.transition (plan→do) [3,4] ← fan-in
|
|
||||||
│ └── #6 agent.complete (maker) [5]
|
|
||||||
├── #7 phase.transition (do→check) [6]
|
|
||||||
│ ├── #8 review (guardian) [7] ← parallel (fan-out)
|
|
||||||
│ └── #9 review (sage) [7] ← parallel (fan-out)
|
|
||||||
├── #10 phase.transition (check→act) [8,9] ← fan-in
|
|
||||||
├── #11 fix (timeline) [8] ← caused by guardian
|
|
||||||
├── #12 fix (voice drift) [9] ← caused by sage
|
|
||||||
└── #18 run.complete [17]
|
|
||||||
```
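
The fan-out and blame-tracking ideas above can be sketched with two small helpers. `build_children` and `ancestors` are hypothetical names; the event list is the example DAG reduced to `seq` and `parent` for events #1 to #11.

```python
from collections import defaultdict


def build_children(events):
    """Map each seq to the seqs of events that list it as a parent."""
    children = defaultdict(list)
    for event in events:
        for parent_seq in event["parent"]:
            children[parent_seq].append(event["seq"])
    return dict(children)


def ancestors(events, seq):
    """Return every seq reachable by following parent links upward from seq."""
    by_seq = {event["seq"]: event for event in events}
    seen, stack = set(), list(by_seq[seq]["parent"])
    while stack:
        current = stack.pop()
        if current in seen or current not in by_seq:
            continue
        seen.add(current)
        stack.extend(by_seq[current]["parent"])
    return sorted(seen)


# Events #1-#11 from the example DAG above, reduced to seq and parent.
EVENTS = [
    {"seq": 1, "parent": []},
    {"seq": 2, "parent": [1]},
    {"seq": 3, "parent": [2]},
    {"seq": 4, "parent": [2]},
    {"seq": 5, "parent": [3, 4]},
    {"seq": 6, "parent": [5]},
    {"seq": 7, "parent": [6]},
    {"seq": 8, "parent": [7]},
    {"seq": 9, "parent": [7]},
    {"seq": 10, "parent": [8, 9]},
    {"seq": 11, "parent": [8]},
]

children = build_children(EVENTS)
print(children[7])            # the reviews that fan out from the do -> check transition
print(ancestors(EVENTS, 11))  # blame chain behind the timeline fix
```

Events that share a parent (here #8 and #9 under #7) are the ones that ran concurrently; walking `ancestors` from a fix yields the review, draft, outline, and research events behind it.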
|
|
||||||
|
|
||||||
## Event Payloads by Type
|
|
||||||
|
|
||||||
### `run.start`
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"task": "Write short story 'Der Huster'",
|
|
||||||
"workflow": "kurzgeschichte",
|
|
||||||
"team": "story-development",
|
|
||||||
"max_cycles": 2,
|
|
||||||
"config": {
|
|
||||||
"voice_profile": "vp-giesing-gschichten-v1",
|
|
||||||
"persona": "giesinger",
|
|
||||||
"target_words": 6000
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### `agent.start`
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"archetype": "story-explorer",
|
|
||||||
"model": "haiku",
|
|
||||||
"prompt_summary": "Research premise, find emotional core, suggest 3 plot directions"
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### `agent.complete`
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"archetype": "story-explorer",
|
|
||||||
"duration_ms": 87605,
|
|
||||||
"tokens": 21645,
|
|
||||||
"artifacts": ["docs/01-der-huster-research.md"],
|
|
||||||
"summary": "3 plot directions developed, recommended C (Mo krank + Koffer)"
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### `decision`
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"what": "plot_direction",
|
|
||||||
"chosen": "C — Mo krank + Koffer aus B",
|
|
||||||
"alternatives": [
|
|
||||||
{"id": "A", "label": "Mo ist weg", "reason_rejected": "Too passive for a 6k-word story"},
|
|
||||||
{"id": "B", "label": "Huster gehört nicht Mo", "reason_rejected": "Too close to a crime story"}
|
|
||||||
],
|
|
||||||
"rationale": "Strongest emotional core, fits the voice profile"
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### `review.verdict`
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"archetype": "guardian",
|
|
||||||
"verdict": "approved_with_fixes",
|
|
||||||
"findings": [
|
|
||||||
{"severity": "bug", "description": "Timeline: 'Montag' referenced but story starts Dienstag", "fix_required": true},
|
|
||||||
{"severity": "recommendation", "description": "Gentrification monologue too long for Alex register", "fix_required": false}
|
|
||||||
]
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### `fix.applied`
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"source": "guardian",
|
|
||||||
"finding": "Timeline: Montag → Dienstag",
|
|
||||||
"file": "stories/01-der-huster.md",
|
|
||||||
"line": 302,
|
|
||||||
"before": "das Gegenteil von Montag",
|
|
||||||
"after": "das Gegenteil von Dienstag"
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### `phase.transition`
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"from": "plan",
|
|
||||||
"to": "do",
|
|
||||||
"artifacts_so_far": ["research.md", "outline.md"],
|
|
||||||
"notes": "Explorer recommended direction C, Creator produced 6-scene outline"
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### `cycle.boundary`
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"cycle": 1,
|
|
||||||
"max_cycles": 2,
|
|
||||||
"exit_condition": "all_approved",
|
|
||||||
"met": true,
|
|
||||||
"fixes_applied": 6,
|
|
||||||
"next_action": "complete"
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### `shadow.detected`
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"archetype": "story-explorer",
|
|
||||||
"shadow": "endless_research",
|
|
||||||
"trigger": "output >2000 words without recommendation",
|
|
||||||
"action": "correction_prompt_applied",
|
|
||||||
"occurrence": 1
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### `run.complete`
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"status": "completed",
|
|
||||||
"cycles": 1,
|
|
||||||
"agents_total": 5,
|
|
||||||
"fixes_total": 6,
|
|
||||||
"shadows": 0,
|
|
||||||
"duration_ms": 1295519,
|
|
||||||
"artifacts": [
|
|
||||||
"docs/01-der-huster-research.md",
|
|
||||||
"docs/01-der-huster-outline.md",
|
|
||||||
"stories/01-der-huster.md",
|
|
||||||
"docs/01-der-huster-guardian-review.md",
|
|
||||||
"docs/01-der-huster-sage-review.md",
|
|
||||||
"docs/01-der-huster-process.md"
|
|
||||||
]
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## How to Emit Events
|
|
||||||
|
|
||||||
During orchestration, write events using this pattern:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Append one event to the run's JSONL file
|
|
||||||
echo '{"ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","run_id":"RUN_ID","seq":SEQ,"type":"TYPE","phase":"PHASE","agent":"AGENT","data":{...}}' >> .archeflow/events/RUN_ID.jsonl
|
|
||||||
```
|
|
||||||
|
|
||||||
Or use the helper script:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
./lib/archeflow-event.sh RUN_ID TYPE PHASE AGENT '{"key":"value"}'
|
|
||||||
```
|
|
||||||
|
|
||||||
The orchestration skill should call the event emitter at each trigger point listed in the table above.
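
For callers that are not shell scripts, the same append-only pattern can be sketched in Python. This is an illustration under assumptions, not the behaviour of `archeflow-event.sh` (whose source is not shown here): `emit_event` is a hypothetical name, `seq` is derived by counting existing lines (which assumes a single writer), and `parent` is passed explicitly because the schema lists it as required.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def emit_event(events_dir, run_id, event_type, phase, agent, data, parent=()):
    """Append one event to the run's JSONL file and return its seq.

    Returns None without writing when events_dir does not exist, because
    logging is optional and must not affect orchestration.
    """
    events_dir = Path(events_dir)
    if not events_dir.is_dir():
        return None

    log_path = events_dir / f"{run_id}.jsonl"
    # Next seq = number of events already written + 1 (assumes a single writer).
    seq = 1
    if log_path.exists():
        with log_path.open(encoding="utf-8") as handle:
            seq += sum(1 for line in handle if line.strip())

    event = {
        "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "run_id": run_id,
        "seq": seq,
        "parent": list(parent),
        "type": event_type,
        "phase": phase,
        "agent": agent,
        "data": data,
    }
    # Append-only: existing lines are never rewritten.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event, ensure_ascii=False) + "\n")
    return seq
```

Serialising with `json.dumps` avoids the quoting pitfalls of building JSON by hand in a shell string, at the cost of one extra file read per event to find the next `seq`.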
|
|
||||||
|
|
||||||
## Generating Reports
|
|
||||||
|
|
||||||
After orchestration completes (or during, for live progress):
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Generate markdown process report
|
|
||||||
./lib/archeflow-report.sh .archeflow/events/2026-04-03-der-huster.jsonl > docs/process-report.md
|
|
||||||
|
|
||||||
# List all runs
|
|
||||||
jq -r '[.run_id, .status, .task] | @tsv' .archeflow/events/index.jsonl
|
|
||||||
```
|
|
||||||
|
|
||||||
## Run Index
|
|
||||||
|
|
||||||
After each `run.complete`, append a summary line to `.archeflow/events/index.jsonl`:
|
|
||||||
|
|
||||||
```jsonl
|
|
||||||
{"run_id":"2026-04-03-der-huster","ts":"2026-04-03T16:00:00Z","task":"Write Der Huster","workflow":"kurzgeschichte","status":"completed","cycles":1,"agents":5,"fixes":6,"duration_ms":1295519}
|
|
||||||
```
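
A sketch of deriving that summary line from a run's own events. The field mapping (`agents_total` to `agents`, `fixes_total` to `fixes`, `task` and `workflow` from `run.start`) is inferred from the payload examples above and is an assumption; `index_entry` is a hypothetical name.

```python
import json


def index_entry(events):
    """Build the index.jsonl summary for a finished run from its event list."""
    start = next(e for e in events if e["type"] == "run.start")
    done = next(e for e in events if e["type"] == "run.complete")
    return {
        "run_id": done["run_id"],
        "ts": done["ts"],
        "task": start["data"]["task"],
        "workflow": start["data"]["workflow"],
        "status": done["data"]["status"],
        "cycles": done["data"]["cycles"],
        "agents": done["data"]["agents_total"],
        "fixes": done["data"]["fixes_total"],
        "duration_ms": done["data"]["duration_ms"],
    }


# Two events trimmed to the fields the summary needs.
SAMPLE = [
    {"type": "run.start", "run_id": "2026-04-03-der-huster", "ts": "2026-04-03T14:32:07Z",
     "data": {"task": "Write Der Huster", "workflow": "kurzgeschichte"}},
    {"type": "run.complete", "run_id": "2026-04-03-der-huster", "ts": "2026-04-03T16:00:00Z",
     "data": {"status": "completed", "cycles": 1, "agents_total": 5,
              "fixes_total": 6, "duration_ms": 1295519}},
]

# One compact line, ready to append to .archeflow/events/index.jsonl.
print(json.dumps(index_entry(SAMPLE), separators=(",", ":")))
```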
|
|
||||||
|
|
||||||
## Integration with Existing Skills
|
|
||||||
|
|
||||||
- **`orchestration`**: Emit events at phase transitions and after each agent
|
|
||||||
- **`shadow-detection`**: Emit `shadow.detected` when thresholds trigger
|
|
||||||
- **`autonomous-mode`**: Use `index.jsonl` for session summaries instead of a separate session log
|
|
||||||
- **`workflow-design`**: Custom workflows inherit logging automatically
|
|
||||||
|
|
||||||
## Design Principles
|
|
||||||
|
|
||||||
1. **Append-only.** Never modify or delete events. They are immutable facts.
|
|
||||||
2. **Self-contained.** Each event has enough context to be understood alone (no forward references).
|
|
||||||
3. **Cheap.** One `echo >>` per event. No database, no service, no dependencies.
|
|
||||||
4. **Optional.** If the events directory doesn't exist, orchestration runs without logging. Events are observation, not control flow.
|
|
||||||
@@ -3,37 +3,20 @@ name: progress
|
|||||||
description: |
|
description: |
|
||||||
Live progress file for ArcheFlow orchestrations. Regenerates `.archeflow/progress.md`
|
Live progress file for ArcheFlow orchestrations. Regenerates `.archeflow/progress.md`
|
||||||
after every event emission, giving users real-time visibility into run status, budget
|
after every event emission, giving users real-time visibility into run status, budget
|
||||||
usage, and DAG shape — watchable from a second terminal.
|
usage, and DAG shape -- watchable from a second terminal.
|
||||||
<example>User: "What's happening with my run?"</example>
|
<example>User: "What's happening with my run?"</example>
|
||||||
<example>watch -n 2 cat .archeflow/progress.md</example>
|
<example>watch -n 2 cat .archeflow/progress.md</example>
|
||||||
---
|
---
|
||||||
|
|
||||||
# Live Progress — Real-Time Run Visibility
|
# Live Progress -- Real-Time Run Visibility
|
||||||
|
|
||||||
During long-running orchestrations (Maker drafting, parallel reviews), users have no visibility into what is happening. This skill solves that by maintaining a live progress file that is regenerated after every event.
|
Maintains `.archeflow/progress.md`, updated after every event during a run.
|
||||||
|
|
||||||
## Progress File
|
|
||||||
|
|
||||||
**Location:** `.archeflow/progress.md`
|
|
||||||
|
|
||||||
Updated after every event emission during a run. Users can watch it from a second terminal:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Simple polling
|
|
||||||
watch -n 2 cat .archeflow/progress.md
|
|
||||||
|
|
||||||
# Continuous mode (built-in)
|
|
||||||
./lib/archeflow-progress.sh <run_id> --watch
|
|
||||||
|
|
||||||
# Programmatic consumption
|
|
||||||
./lib/archeflow-progress.sh <run_id> --json
|
|
||||||
```
|
|
||||||
|
|
||||||
## Progress File Format
|
## Progress File Format
|
||||||
|
|
||||||
```markdown
|
```markdown
|
||||||
# ArcheFlow Run: 2026-04-03-der-huster
|
# ArcheFlow Run: 2026-04-03-der-huster
|
||||||
**Status:** DO phase — maker running (3/6 scenes drafted)
|
**Status:** DO phase -- maker running (3/6 scenes drafted)
|
||||||
**Started:** 14:32 | **Elapsed:** 8 min
|
**Started:** 14:32 | **Elapsed:** 8 min
|
||||||
**Budget:** $1.45 / $10.00 (14%)
|
**Budget:** $1.45 / $10.00 (14%)
|
||||||
|
|
||||||
@@ -47,145 +30,40 @@ watch -n 2 cat .archeflow/progress.md
|
|||||||
- [ ] ACT: Apply fixes
|
- [ ] ACT: Apply fixes
|
||||||
|
|
||||||
## Latest Event
|
## Latest Event
|
||||||
#6 agent.start — maker (do) — 14:40
|
#6 agent.start -- maker (do) -- 14:40
|
||||||
|
|
||||||
## DAG (so far)
|
|
||||||
#1 run.start
|
|
||||||
├── #2 story-explorer ✓
|
|
||||||
│ ├── #3 decision ✓
|
|
||||||
│ └── #4 creator ✓
|
|
||||||
├── #5 plan→do ✓
|
|
||||||
└── #6 maker ← running
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## How to Use
|
## Usage
|
||||||
|
|
||||||
### During Orchestration (run skill integration)
|
The `run` skill calls `archeflow-progress.sh` after each event emission:
|
||||||
|
|
||||||
The `run` skill should call `archeflow-progress.sh` after each event emission. This keeps progress decoupled from the event emitter itself — no modification to `archeflow-event.sh` is needed.
|
|
||||||
|
|
||||||
Add this call after every `archeflow-event.sh` invocation in the run loop:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# After emitting an event:
|
|
||||||
./lib/archeflow-event.sh "$RUN_ID" agent.complete plan explorer '{"archetype":"explorer",...}'
|
|
||||||
|
|
||||||
# Update progress:
|
|
||||||
./lib/archeflow-progress.sh "$RUN_ID"
|
|
||||||
```
|
```
|
||||||
|
|
||||||
This is a fast operation (reads JSONL, writes one markdown file) and adds negligible overhead.
|
|
||||||
|
|
||||||
### From a Second Terminal
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# One-shot: see current state
|
|
||||||
./lib/archeflow-progress.sh <run_id>
|
./lib/archeflow-progress.sh <run_id>
|
||||||
cat .archeflow/progress.md
|
|
||||||
|
|
||||||
# Continuous: auto-refresh every 2 seconds
|
|
||||||
./lib/archeflow-progress.sh <run_id> --watch
|
|
||||||
|
|
||||||
# JSON output for dashboards or scripts
|
|
||||||
./lib/archeflow-progress.sh <run_id> --json
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Reactive Mode (via JSONL tail)
|
**From a second terminal:**
|
||||||
|
- One-shot: `cat .archeflow/progress.md`
|
||||||
|
- Continuous: `./lib/archeflow-progress.sh <run_id> --watch`
|
||||||
|
- JSON output: `./lib/archeflow-progress.sh <run_id> --json`
|
||||||
|
|
||||||
```bash
|
## How the Script Works
|
||||||
tail -f .archeflow/events/<run_id>.jsonl | while read line; do
|
|
||||||
./lib/archeflow-progress.sh <run_id>
|
|
||||||
done
|
|
||||||
```
|
|
||||||
|
|
||||||
## Progress Script
|
1. Read `.archeflow/events/<run_id>.jsonl`
|
||||||
|
2. Determine current phase and active agent
|
||||||
**Location:** `lib/archeflow-progress.sh`
|
3. Build checklist from events (only started/completed agents shown)
|
||||||
|
4. Calculate budget from `agent.complete` cost data
|
||||||
```
|
5. Write `.archeflow/progress.md`
|
||||||
Usage:
|
|
||||||
archeflow-progress.sh <run_id> # Generate/update progress.md
|
|
||||||
archeflow-progress.sh <run_id> --watch # Continuous update mode (2s interval)
|
|
||||||
archeflow-progress.sh <run_id> --json # Output as JSON (for dashboards)
|
|
||||||
```
|
|
||||||
|
|
||||||
### What the Script Does
|
|
||||||
|
|
||||||
1. **Read** `.archeflow/events/<run_id>.jsonl` — the event stream for this run
|
|
||||||
2. **Determine** current phase and active agent from the latest events
|
|
||||||
3. **Build checklist** — mark completed agents with timing/cost data, show pending agents as unchecked
|
|
||||||
4. **Show partial DAG** — completed nodes with checkmarks, running node with arrow indicator
|
|
||||||
5. **Calculate budget** — sum `estimated_cost_usd` from `agent.complete` events, compare to budget from `run.start` config or `.archeflow/config.yaml`
|
|
||||||
6. **Compute elapsed time** — difference between `run.start` timestamp and now
|
|
||||||
7. **Write** to `.archeflow/progress.md`
|
|
||||||
|
|
||||||
### Output Modes
|
|
||||||
|
|
||||||
**Default (markdown):** Writes `.archeflow/progress.md` and prints the same content to stdout.
|
|
||||||
|
|
||||||
**`--watch`:** Clears the terminal every 2 seconds, re-reads the JSONL, and regenerates the display. Exits when a `run.complete` event is found.
|
|
||||||
|
|
||||||
**`--json`:** Outputs a structured JSON object to stdout (does not write progress.md):
|
|
||||||
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"run_id": "2026-04-03-der-huster",
|
|
||||||
"status": "running",
|
|
||||||
"phase": "do",
|
|
||||||
"active_agent": "maker",
|
|
||||||
"elapsed_seconds": 480,
|
|
||||||
"budget_used_usd": 1.45,
|
|
||||||
"budget_total_usd": 10.00,
|
|
||||||
"budget_percent": 14,
|
|
||||||
"completed": [
|
|
||||||
{"agent": "explorer", "phase": "plan", "duration_s": 87, "tokens": 21000, "cost_usd": 0.02},
|
|
||||||
{"agent": "creator", "phase": "plan", "duration_s": 167, "tokens": 26000, "cost_usd": 0.08}
|
|
||||||
],
|
|
||||||
"pending": ["guardian", "sage"],
|
|
||||||
"latest_event": {"seq": 6, "type": "agent.start", "agent": "maker", "phase": "do"},
|
|
||||||
"total_events": 6
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## Checklist Construction
|
## Checklist Construction
|
||||||
|
|
||||||
The progress checklist is built from events, not from a predefined workflow definition. Each event type maps to a checklist entry:
|
| Event Type | Entry |
|
||||||
|
|-----------|-------|
|
||||||
| Event Type | Checklist Entry |
|
|
||||||
|-----------|----------------|
|
|
||||||
| `agent.complete` | `- [x] PHASE: archetype (duration, tokens, cost)` |
|
| `agent.complete` | `- [x] PHASE: archetype (duration, tokens, cost)` |
|
||||||
| `agent.start` (no matching complete) | `- [ ] **PHASE: archetype** <- running (elapsed)` |
|
| `agent.start` (no complete) | `- [ ] **PHASE: archetype** <- running` |
|
||||||
| `phase.transition` | `- [x] PHASE -> PHASE transition` |
|
| `phase.transition` | `- [x] PHASE -> PHASE transition` |
|
||||||
| `review.verdict` | `- [x] CHECK: archetype -> VERDICT` |
|
|
||||||
| `fix.applied` | `- [x] ACT: Fix (source)` |
|
|
||||||
| `cycle.boundary` | `- [x] Cycle N complete` |
|
| `cycle.boundary` | `- [x] Cycle N complete` |
|
||||||
|
|
||||||
Pending agents (not yet started) are NOT shown in the checklist — only started or completed agents appear. This avoids guessing which agents will be spawned.
|
Pending (not-yet-started) agents are NOT shown to avoid guessing.
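
The mapping in the table can be sketched as a single pass over the events. This is illustrative only: `checklist` is a hypothetical name, a start is matched to its completion by `(phase, archetype)` (a simplification that would misfire if the same archetype ran twice in one phase), and cost is left out because the `agent.complete` payload shown in the process log carries no cost field.

```python
def checklist(events):
    """Render checklist lines from events; agents that never started are omitted."""
    completed = {
        (e["phase"], e["data"]["archetype"])
        for e in events if e["type"] == "agent.complete"
    }
    lines = []
    for e in events:
        phase, data = e["phase"].upper(), e["data"]
        if e["type"] == "agent.complete":
            seconds = round(data["duration_ms"] / 1000)
            lines.append(f"- [x] {phase}: {data['archetype']} ({seconds}s, {data['tokens']} tokens)")
        elif e["type"] == "agent.start":
            # Only show a start that has no matching completion yet.
            if (e["phase"], data["archetype"]) not in completed:
                lines.append(f"- [ ] **{phase}: {data['archetype']}** <- running")
        elif e["type"] == "phase.transition":
            lines.append(f"- [x] {data['from'].upper()} -> {data['to'].upper()} transition")
        elif e["type"] == "cycle.boundary":
            lines.append(f"- [x] Cycle {data['cycle']} complete")
    return lines
```

Since the list is built only from events that exist, an agent that has not been spawned never appears, which is the "no guessing" property described above.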
|
||||||
|
|
||||||
## Budget Display
|
## Budget Display
|
||||||
|
|
||||||
Budget information comes from two sources:
|
Source: `run.start` event or `.archeflow/config.yaml`. If no budget is configured, show cost only.
|
||||||
|
|
||||||
1. **`run.start` event** — may contain `config.budget_usd`
|
|
||||||
2. **`.archeflow/config.yaml`** — global `budget.per_run_usd`
|
|
||||||
|
|
||||||
If no budget is configured, the budget line shows cost only (no percentage):
|
|
||||||
|
|
||||||
```
|
|
||||||
**Cost:** $1.45 (no budget set)
|
|
||||||
```
|
|
||||||
|
|
||||||
## Integration with Other Skills
|
|
||||||
|
|
||||||
- **`run`**: Should call `archeflow-progress.sh` after each event emission
|
|
||||||
- **`process-log`**: Progress reads the same JSONL that process-log defines
|
|
||||||
- **`cost-tracking`**: Budget data and cost calculations follow cost-tracking conventions
|
|
||||||
- **`autonomous-mode`**: Progress file is useful for monitoring autonomous overnight runs
|
|
||||||
|
|
||||||
## Design Principles
|
|
||||||
|
|
||||||
1. **Read-only on events.** Progress never modifies the JSONL. It is a derived view.
|
|
||||||
2. **Fast.** One JSONL read + one markdown write. No jq streaming, no databases.
|
|
||||||
3. **Decoupled.** No hooks in `archeflow-event.sh`. The `run` skill calls progress explicitly.
|
|
||||||
4. **Optional.** If progress is never called, orchestration works fine. No side effects.
|
|
||||||
5. **Terminal-friendly.** Output is plain markdown — renders well in `cat`, `bat`, `glow`, or any terminal.
|
|
||||||
|
|||||||
1027
skills/run/SKILL.md
File diff suppressed because it is too large
@@ -1,180 +1,129 @@
|
|||||||
---
|
---
|
||||||
name: shadow-detection
|
name: shadow-detection
|
||||||
description: Use when monitoring agent behavior for dysfunction, when an agent seems stuck, or when orchestration quality is degrading. Detects and corrects Jungian shadow activation in archetypes.
|
description: |
|
||||||
|
Corrective action framework for agent dysfunction, system health, and operational policy.
|
||||||
|
Three layers — archetype shadows, system shadows, policy boundaries — one escalation protocol.
|
||||||
---
|
---
|
||||||
|
|
||||||
# Shadow Detection
|
# Corrective Action Framework
|
||||||
|
|
||||||
Every archetype has a **virtue** (its unique contribution) and a **shadow** (the destructive inversion of that virtue). A shadow activates when the virtue is pushed too far.
|
Detect dysfunction. Apply corrective action. Escalate if repeated.
|
||||||
|
|
||||||
```
|
Three layers, one protocol:
|
||||||
Virtue (healthy) → pushed too far → Shadow (dysfunction)
|
- **Archetype Shadows** — individual agent dysfunction (virtue pushed too far)
|
||||||
|
- **System Shadows** — orchestration-level dysfunction (process going wrong)
|
||||||
Contextual Clarity → can't stop → Rabbit Hole
|
- **Policy Boundaries** — operational limits (time, cost, quality thresholds)
|
||||||
Decisive Framing → over-builds → Over-Architect
|
|
||||||
Execution Discipline → no guardrails → Rogue
|
|
||||||
Threat Intuition → sees threats only → Paranoid
|
|
||||||
Assumption Surfacing → questions only → Paralytic
|
|
||||||
Adversarial Creativity → noise over signal → False Alarm
|
|
||||||
Maintainability Judgment → reviews only → Bureaucrat
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Explorer → Rabbit Hole
|
## Archetype Shadows
|
||||||
**Virtue inverted:** Contextual Clarity becomes compulsive investigation — or output that dumps without analyzing.
|
|
||||||
|
|
||||||
**Symptoms:**
|
| Archetype | Shadow | Detect (any) | Corrective Action |
|
||||||
- Research output keeps growing but never synthesizes
|
|-----------|--------|-------------|-------------------|
|
||||||
- "I found one more thing to check" repeated 3+ times
|
| Explorer | Rabbit Hole | Output >2000w without Recommendation; >3 tangents; >15 files no patterns; no synthesis in final 25% | "Summarize top 3 findings and one recommendation in 300 words." |
|
||||||
- Reading more than 15 files without producing findings
|
| Creator | Over-Architect | >2 new abstractions for one feature; "future-proof" in rationale; scope exceeds task >50%; >1 new package | "Design for the current order of magnitude. Remove abstractions for hypothetical requirements." |
|
||||||
- Output is a raw inventory of files with no analysis or recommendation
|
| Maker | Rogue | Zero test files with >=3 files changed; single monolithic commit; files outside proposal; no test run evidence | "Read the proposal. Write a test. Commit. Revert out-of-scope files." |
|
||||||
|
| Guardian | Paranoid | CRITICAL:WARNING ratio >2:1 (min 3); zero APPROVED in 3+ reviews; <50% findings include fix; findings require compromised systems | "For each CRITICAL: would a senior engineer block a PR? If not, downgrade. Every rejection needs a specific fix." |
|
||||||
|
| Skeptic | Paralytic | >7 challenges; <50% include alternatives; same concern 2+ times reworded; >3 findings outside scope | "Rank by impact. Keep top 3 with alternatives. Delete the rest." |
|
||||||
|
| Trickster | False Alarm | Findings in untouched code; >10 findings for <5 files; impossible scenarios; >3 without repro steps | "Delete findings outside the diff. Rank by likelihood x impact. Keep top 3-5." |
|
||||||
|
| Sage | Bureaucrat | Review words >2x diff lines; findings outside changeset; >2 "consider" without action; suggesting docs for trivial functions | "Limit to issues affecting maintainability in 6 months. Every finding needs a specific action." |
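
As an illustration of the "Detect (any)" column, three of the Explorer triggers can be checked mechanically; the tangent count needs judgment and is left out. The heading names and the synthesis word list follow the earlier detection checklist; `explorer_rabbit_hole` and its `files_read` parameter are hypothetical.

```python
import re

# Synthesis vocabulary from the Explorer detection checklist.
SYNTHESIS_WORDS = ("recommend", "suggest", "conclusion", "finding", "summary")


def explorer_rabbit_hole(output: str, files_read: int = 0) -> list[str]:
    """Return the Rabbit Hole triggers that fire for one Explorer output."""
    words = output.split()
    triggers = []

    has_recommendation = re.search(r"^###\s+Recommendation\b", output, re.MULTILINE) is not None
    if len(words) > 2000 and not has_recommendation:
        triggers.append("output >2000 words without Recommendation")

    has_patterns = re.search(r"^###\s+Patterns\b", output, re.MULTILINE) is not None
    if files_read > 15 and not has_patterns:
        triggers.append(">15 files read with no patterns identified")

    # Final 25% of the output, measured in words.
    tail = " ".join(words[len(words) * 3 // 4:]).lower()
    if words and not any(word in tail for word in SYNTHESIS_WORDS):
        triggers.append("no synthesis language in final 25%")
    return triggers
```

A non-empty result is only a prompt to look closer: thresholds like these measure intensity, and whether the behaviour is actually disconnected from the goal still has to be judged.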
|
||||||
|
|
||||||
**Detection Checklist** (trigger on ANY):
|
### Shadow Immunity
|
||||||
- [ ] Output >2000 words without a `### Recommendation` section
|
|
||||||
- [ ] >3 tangent topics not directly related to the original task
|
|
||||||
- [ ] >15 files read with no `### Patterns` identified
|
|
||||||
- [ ] No synthesis language (recommend, suggest, conclusion, finding, summary) in final 25% of output
|
|
||||||
|
|
||||||
**Correction:**
|
Intensity alone is not a shadow. **Shadow = behavior disconnected from the goal.**
|
||||||
"Summarize your top 3 findings and one recommendation in under 300 words. If your output has no Recommendation section, add one. A dump is not research."
|
|
||||||
|
- Explorer reading 20 files in a monorepo with scattered deps -- not rabbit hole if each is relevant
|
||||||
|
- Guardian blocking with 2 CRITICALs -- not paranoid if both are genuine vulnerabilities
|
||||||
|
- Trickster finding 5 edge cases -- not false alarm if all are in changed code with repro steps
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Creator → Over-Architect
|
## System Shadows
|
||||||
**Virtue inverted:** Decisive Framing becomes designing at the wrong scale.
|
|
||||||
|
|
||||||
**Symptoms:**
|
Orchestration-level dysfunction that isn't tied to one archetype.
|
||||||
- Abstraction layers for one-time operations
|
|
||||||
- Future-proofing for requirements that don't exist
|
|
||||||
- Configuration systems for things that could be constants
|
|
||||||
- Proposal has more infrastructure than business logic
|
|
||||||
|
|
||||||
**Detection Checklist** (trigger on ANY):
|
| Shadow | Detect | Corrective Action |
|
||||||
- [ ] >2 new abstractions (interfaces, base classes, factories, registries) for a single feature
|
|--------|--------|-------------------|
|
||||||
- [ ] "In the future we might need..." or "future-proof" appears in rationale
|
| **Tunnel Vision** | All reviewers flag same category (e.g., 4 security findings, 0 quality/testing) | "Redistribute attention. Are we missing quality, testing, or design concerns?" |
|
||||||
- [ ] Proposal scope (files changed) exceeds original task scope by >50%
|
| **Echo Chamber** | Unanimous approval in <30s on standard/thorough workflow | "Suspicious fast consensus. Re-run Guardian with adversarial prompt." |
|
||||||
- [ ] More than 1 new package/module introduced for a single feature
|
| **Gold Plating** | Maker working on INFO fixes while CRITICALs remain open | "Fix CRITICALs first. Park INFO items." |
|
||||||
|
| **Analysis Paralysis** | Plan phase >2x longer than Do phase; Explorer spawned 3+ times | "Stop researching. Ship a proposal with known gaps." |
|
||||||
**Correction:**
|
| **Cargo Cult** | Memory lesson injected but the same finding repeats anyway | "Lesson ineffective. Reword, strengthen, or remove it." |
|
||||||
"Design for the current order of magnitude. If the app has 1000 users, design for 10,000 — not 10 million. Remove abstractions that serve hypothetical requirements."
|
| **Broken Window** | 3+ WARNINGs deferred across consecutive runs in the same project | "Accumulated tech debt. Schedule a cleanup sprint." |
|
||||||
|
| **Scope Creep** | Maker changes >2x files listed in proposal | "Revert to proposal scope. If more files needed, update the proposal first." |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Maker → Rogue
|
## Policy Boundaries
|
||||||
**Virtue inverted:** Execution Discipline becomes reckless shipping — or expanding beyond the plan.
|
|
||||||
|
|
||||||
**Symptoms:**
|
Operational limits that protect session quality, cost, and resumability.
|
||||||
- Writing code before reading the proposal fully
|
|
||||||
- No tests, or tests written after implementation
|
|
||||||
- Large uncommitted working tree
|
|
||||||
- Files changed that aren't mentioned in the proposal
|
|
||||||
|
|
||||||
**Detection Checklist** (trigger on ANY):
|
### Checkpoint Policy
|
||||||
- [ ] Zero test files (`.test.`, `.spec.`, `_test.`) in the changeset with >=3 files changed
|
|
||||||
- [ ] Single monolithic commit instead of incremental commits
|
|
||||||
- [ ] Diff contains files not listed in the Creator's proposal `### Changes` section
|
|
||||||
- [ ] No evidence of running existing test suite before finishing
|
|
||||||
|
|
||||||
**Correction:**
|
Every **45 minutes** or **3 completed tasks** (whichever first):
|
||||||
"Read the proposal. Write a test. Commit what you have. Revert changes to files not in the proposal. Then continue."
|
|
||||||
|
1. Commit + push all work in progress
|
||||||
|
2. Write handoff summary to `control-center.md`
|
||||||
|
3. Log token spend so far
|
||||||
|
4. Compare output quality: last task vs first task
|
||||||
|
5. If quality degrading -> STOP with clean state
|
||||||
|
6. If budget >80% spent -> STOP with clean state
|
||||||
|
7. Otherwise -> continue
|
||||||
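The checkpoint steps above can be sketched as two small checks. This is a minimal illustration, not the plugin's implementation: the function names, the argument names, and the idea of passing a precomputed `quality_degrading` flag are all assumptions made for the example.

```python
# Hypothetical sketch of the Checkpoint Policy; names are illustrative only.

def checkpoint_due(minutes_since_last: float, tasks_since_last: int) -> bool:
    """A checkpoint is due every 45 minutes or 3 completed tasks, whichever comes first."""
    return minutes_since_last >= 45 or tasks_since_last >= 3


def checkpoint_decision(quality_degrading: bool, budget_used: float) -> str:
    """Decide what to do after the commit, handoff, and token-log steps.

    budget_used is the fraction of the budget spent so far (0.0 to 1.0).
    """
    if quality_degrading:
        return "stop"      # step 5: stop with clean state
    if budget_used > 0.80:
        return "stop"      # step 6: stop with clean state
    return "continue"      # step 7
```

How quality degradation is measured (step 4) is left open here; the sketch only shows how its result feeds the stop/continue decision.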
|
|
||||||
|
### Budget Gate
|
||||||
|
|
||||||
|
| Threshold | Action |
|
||||||
|
|-----------|--------|
|
||||||
|
| 50% budget spent | Log warning, continue |
|
||||||
|
| 80% budget spent | Downgrade models (sonnet->haiku for reviewers) |
|
||||||
|
| 95% budget spent | Complete current task, then STOP |
|
||||||
|
| 100% budget | STOP immediately, commit WIP |
|
||||||
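The thresholds in the Budget Gate table map naturally onto a single lookup function. The sketch below assumes spend and budget are tracked in the same unit (tokens or currency); the function name and the returned action labels are hypothetical.

```python
# Hypothetical sketch of the Budget Gate; action labels are illustrative only.

def budget_action(spent: float, budget: float) -> str:
    """Return the gate action for the current spend, checking the strictest threshold first."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    used = spent / budget
    if used >= 1.0:
        return "stop_now"           # STOP immediately, commit WIP
    if used >= 0.95:
        return "finish_then_stop"   # complete current task, then STOP
    if used >= 0.80:
        return "downgrade_models"   # e.g. sonnet -> haiku for reviewers
    if used >= 0.50:
        return "warn"               # log warning, continue
    return "continue"
```

Checking the highest threshold first means each spend level triggers exactly one action.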
|
|
||||||
|
### Circuit Breaker
|
||||||
|
|
||||||
|
| Trigger | Action |
|
||||||
|
|---------|--------|
|
||||||
|
| 3 consecutive agent failures/timeouts | STOP. Infrastructure issue, not a code problem. |
|
||||||
|
| 3 consecutive task failures in sprint | STOP. Something systemic is wrong. |
|
||||||
|
| Same shadow detected 3+ times in one cycle | STOP. Task needs to be broken down or re-scoped. |
|
||||||
|
| Test suite broken after merge | Auto-revert, STOP, report. |
|
||||||
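The first two Circuit Breaker triggers are both "three consecutive failures" counters. A minimal sketch of such a counter follows; the class and method names are assumptions, and the shadow-repeat and broken-test-suite triggers are not modelled.

```python
# Hypothetical sketch of a consecutive-failure circuit breaker.
from dataclasses import dataclass


@dataclass
class CircuitBreaker:
    threshold: int = 3              # consecutive failures before tripping
    consecutive_failures: int = 0

    def record(self, ok: bool) -> bool:
        """Record one agent or task result; return True when the run should STOP."""
        if ok:
            self.consecutive_failures = 0   # any success resets the streak
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold
```

One instance per trigger (agent failures, task failures) keeps the two streaks independent.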
|
|
||||||
|
### Diminishing Returns
|
||||||
|
|
||||||
|
| Signal | Action |
|
||||||
|
|--------|--------|
|
||||||
|
| Cycle N findings identical to cycle N-1 | STOP cycling. Present best result. |
|
||||||
|
| Convergence score <0.5 for 2 consecutive cycles | STOP. "This needs a different approach." |
|
||||||
|
| Reviewer finding count increases cycle over cycle | STOP. Implementation is diverging, not converging. |
|
||||||
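The Diminishing Returns signals compare the latest cycle with the one before it. The sketch below assumes each cycle is recorded as a dict with a `findings` list and a `convergence` score; that record shape and the returned labels are hypothetical.

```python
# Hypothetical sketch of the Diminishing Returns check; the cycle record shape is assumed.
from typing import Optional


def should_stop_cycling(cycles: list) -> Optional[str]:
    """Return a stop reason based on the last two cycles, or None to keep cycling.

    cycles is ordered oldest first; each entry has 'findings' (list of str)
    and 'convergence' (float).
    """
    if len(cycles) < 2:
        return None
    prev, cur = cycles[-2], cycles[-1]
    if set(cur["findings"]) == set(prev["findings"]):
        return "identical"          # present best result
    if cur["convergence"] < 0.5 and prev["convergence"] < 0.5:
        return "low_convergence"    # needs a different approach
    if len(cur["findings"]) > len(prev["findings"]):
        return "diverging"          # finding count is increasing
    return None
```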
|
|
||||||
|
### Context Pollution
|
||||||
|
|
||||||
|
| Signal | Action |
|
||||||
|
|--------|--------|
|
||||||
|
| >15 memory lessons injected into one prompt | Prune to top 5 by frequency |
|
||||||
|
| >20 findings tracked across cycles | Summarize into top 5 themes |
|
||||||
|
| Agent prompt exceeds estimated 50% of context window | Strip examples, keep rules only |
|
||||||
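The first Context Pollution rule (prune to the top 5 lessons by frequency once more than 15 would be injected) can be sketched as below. It assumes each lesson is stored as a `(text, frequency)` pair, which the source does not specify; the function name and defaults are also illustrative.

```python
# Hypothetical sketch of memory-lesson pruning; the (text, frequency) shape is assumed.

def prune_lessons(lessons: list, limit: int = 15, keep: int = 5) -> list:
    """Return lessons unchanged if within the limit, else the `keep` most frequent ones."""
    if len(lessons) <= limit:
        return list(lessons)
    # sorted() is stable, so lessons with equal frequency keep their original order.
    return sorted(lessons, key=lambda lesson: lesson[1], reverse=True)[:keep]
```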
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Guardian → Paranoid
|
## Unified Escalation Protocol
|
||||||
**Virtue inverted:** Threat Intuition becomes blocking everything — without offering a path forward.
|
|
||||||
|
|
||||||
**Symptoms:**
|
All three layers use the same escalation:
|
||||||
- Every finding marked CRITICAL
|
|
||||||
- Blocking on theoretical risks with < 1% probability
|
|
||||||
- Rejecting without suggesting how to fix
|
|
||||||
- Security concerns for internal-only code at external-API severity
|
|
||||||
|
|
||||||
**Detection Checklist** (trigger on ANY):
|
| Step | Archetype Shadows | System Shadows | Policy Boundaries |
|
||||||
- [ ] CRITICAL:WARNING ratio >2:1 (with minimum 3 total findings)
|
|------|-------------------|----------------|-------------------|
|
||||||
- [ ] Zero APPROVED verdicts in 3+ consecutive reviews
|
| **1st** | Apply corrective action, let agent continue | Apply corrective action, continue run | Apply boundary action (downgrade, checkpoint) |
|
||||||
- [ ] <50% of findings include a suggested fix in the `Fix` column
|
| **2nd** (same issue) | Replace the agent -- shadow is entrenched | Pause run, report to user | Force stop with clean state |
|
||||||
- [ ] Findings reference attack scenarios that require already-compromised internal systems
|
| **3rd** (pattern) | Escalate to user: "task needs re-scoping" | Escalate to user: "systemic issue" | Escalate to user: "resource limits reached" |
|
||||||
|
|
||||||
**Correction:**
|
|
||||||
"For each CRITICAL finding, answer: Would a senior engineer block a PR for this? If not, downgrade. Every rejection must include a specific, implementable fix."
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Skeptic → Paralytic
|
## Integration
|
||||||
**Virtue inverted:** Assumption Surfacing becomes inability to approve anything — drowning signal in tangential concerns.
|
|
||||||
|
|
||||||
**Symptoms:**
|
Archetype shadow checks run **after each agent completes** during orchestration. System shadow checks run **at phase boundaries**. Policy checks run **on a timer and at task boundaries**.
|
||||||
- More than 7 challenges raised
|
|
||||||
- Challenges without suggested alternatives
|
|
||||||
- "What about X?" chains that drift from the task
|
|
||||||
- Restating the same concern in different words
|
|
||||||
|
|
||||||
**Detection Checklist** (trigger on ANY):
|
The `run` skill references this framework at:
|
||||||
- [ ] >7 findings/challenges raised in a single review
|
- Step 3 (Check phase): archetype shadow monitoring
|
||||||
- [ ] <50% of findings include an alternative in the `Fix` column
|
- Step 4 (Act phase): convergence/diminishing returns
|
||||||
- [ ] Same conceptual concern appears 2+ times with different wording
|
- Step 5 (Completion): effectiveness scoring
|
||||||
- [ ] >3 findings reference code or scenarios outside the task scope
|
- Sprint skill: checkpoint policy between batches
|
||||||
|
|
||||||
**Correction:**
|
|
||||||
"Rank your challenges by impact. Keep the top 3. Each must include a specific alternative. Delete the rest."
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Trickster → False Alarm
|
|
||||||
**Virtue inverted:** Adversarial Creativity becomes noise — too many low-signal findings drowning the real issues.
|
|
||||||
|
|
||||||
**Symptoms:**
|
|
||||||
- Testing code that wasn't changed
|
|
||||||
- Reporting non-bugs as bugs (unrealistic test scenarios)
|
|
||||||
- 20 findings when 3 good ones would cover the real risks
|
|
||||||
- Edge cases for edge cases (diminishing returns)
|
|
||||||
|
|
||||||
**Detection Checklist** (trigger on ANY):
|
|
||||||
- [ ] Any finding references code untouched by the Maker's diff
|
|
||||||
- [ ] >10 findings for a change touching <5 files
|
|
||||||
- [ ] Findings describe scenarios requiring conditions that can't occur in the deployment context
|
|
||||||
- [ ] >3 findings without reproduction steps
|
|
||||||
|
|
||||||
**Correction:**
|
|
||||||
"Quality over quantity. Delete findings outside the Maker's diff. Rank remaining by likelihood x impact. Keep top 3-5. Three real findings beat twenty noise."
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Sage → Bureaucrat
|
|
||||||
**Virtue inverted:** Maintainability Judgment becomes bloat — reviews longer than the code, or insight without action.
|
|
||||||
|
|
||||||
**Symptoms:**
|
|
||||||
- Review longer than the code change itself
|
|
||||||
- Requesting documentation for self-evident code
|
|
||||||
- Suggesting refactors unrelated to the current task
|
|
||||||
- Deep-sounding analysis that doesn't end with a specific action
|
|
||||||
|
|
||||||
**Detection Checklist** (trigger on ANY):
|
|
||||||
- [ ] Review word count >2x the code change's line count (rough: review words > diff lines x 2)
|
|
||||||
- [ ] Any finding references files not in the Maker's changeset
|
|
||||||
- [ ] >2 findings use "consider" or "think about" without a concrete action in the `Fix` column
|
|
||||||
- [ ] Suggesting documentation for functions with <5 lines or self-descriptive names
|
|
||||||
|
|
||||||
**Correction:**
|
|
||||||
"Limit your review to issues that affect maintainability in the next 6 months. Every finding must end with a specific action. If you can't state the consequence of NOT fixing it, don't raise it."
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Shadow Escalation Protocol
|
|
||||||
|
|
||||||
1. **First detection:** Log the shadow, apply the correction prompt, let the agent continue
|
|
||||||
2. **Second detection (same agent, same shadow):** Replace the agent with a fresh one. The shadow is entrenched.
|
|
||||||
3. **Shadow detected in 3+ agents in the same cycle:** The task itself may be poorly scoped. Escalate to the user: "Multiple agents are struggling — the task may need to be broken down."
|
|
||||||
|
|
||||||
## Shadow Immunity
|
|
||||||
|
|
||||||
Some behaviors LOOK like shadows but aren't:
|
|
||||||
|
|
||||||
- Explorer reading 20 files in a monorepo with scattered dependencies → **not a rabbit hole** if each file is genuinely relevant
|
|
||||||
- Creator adding an abstraction → **not over-architect** if the abstraction is genuinely needed by the current task
|
|
||||||
- Guardian blocking with 2 CRITICAL findings → **not paranoid** if both are genuine security vulnerabilities
|
|
||||||
- Trickster finding 5 edge cases → **not false alarm** if all are in the changed code with reproduction steps
|
|
||||||
- Sage writing a long review → **not bureaucrat** if the change is large and every finding is actionable
|
|
||||||
|
|
||||||
**Rule of thumb:** Shadow = behavior disconnected from the goal. Intensity alone is not a shadow.
|
|
||||||
|
|||||||
@@ -20,16 +20,10 @@ This is the **primary operational mode** for ArcheFlow in multi-project workspac
|
|||||||
Use it when the user says "run the sprint", "work the queue", "go autonomous", or
|
Use it when the user says "run the sprint", "work the queue", "go autonomous", or
|
||||||
invokes `af-sprint`.
|
invokes `af-sprint`.
|
||||||
|
|
||||||
Do NOT use `archeflow:run` for individual tasks within a sprint — the sprint runner
|
Do NOT use `archeflow:run` for individual tasks within a sprint -- the sprint runner
|
||||||
handles task dispatch internally, using `archeflow:run` only when a task warrants
|
handles task dispatch internally, using `archeflow:run` only when a task warrants
|
||||||
full PDCA orchestration.
|
full PDCA orchestration.
|
||||||
|
|
||||||
## Prerequisites
|
|
||||||
|
|
||||||
- `docs/orchestra/queue.json` — task queue (managed by `./scripts/ws`)
|
|
||||||
- `./scripts/ws` — workspace CLI for queue operations
|
|
||||||
- Each project is a separate git repo under the workspace root
|
|
||||||
|
|
||||||
## Invocation
|
## Invocation
|
||||||
|
|
||||||
```
|
```
|
||||||
@@ -46,21 +40,12 @@ af-sprint --project writing.colette # Only process items for this project
|
|||||||
|
|
||||||
### Step 0: Orient
|
### Step 0: Orient
|
||||||
|
|
||||||
```bash
|
Load queue from `docs/orchestra/queue.json`. Check mode (`AUTONOM` / `ATTENDED` / `PAUSED`).
|
||||||
# Load queue and workspace state
|
Show one-line status: `sprint: AUTONOM | 7 pending (1xP0, 1xP2, 5xP3) | 4 slots`
|
||||||
QUEUE=$(cat docs/orchestra/queue.json)
|
|
||||||
MODE=$(echo "$QUEUE" | jq -r '.mode')
|
|
||||||
```
|
|
||||||
|
|
||||||
Check mode:
|
- `AUTONOM` -- proceed without asking
|
||||||
- `AUTONOM` → proceed without asking
|
- `ATTENDED` -- show plan, wait for user approval before each batch
|
||||||
- `ATTENDED` → show plan, wait for user approval before each batch
|
- `PAUSED` -- report status only, do not start tasks
|
||||||
- `PAUSED` → report status only, do not start tasks
|
|
||||||
|
|
||||||
Show one-line status:
|
|
||||||
```
|
|
||||||
sprint: AUTONOM · 7 pending (1×P0, 1×P2, 5×P3) · 4 slots
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 1: Select Batch
|
### Step 1: Select Batch
|
||||||
|
|
||||||
@@ -69,234 +54,111 @@ Pick tasks for the next batch. Rules:
|
|||||||
1. **Priority cascade**: P0 first, then P1, then P2. Never start P3 unless user explicitly includes it.
|
1. **Priority cascade**: P0 first, then P1, then P2. Never start P3 unless user explicitly includes it.
|
||||||
2. **Dependency check**: Skip tasks whose `depends_on` items aren't all `completed`.
|
2. **Dependency check**: Skip tasks whose `depends_on` items aren't all `completed`.
|
||||||
3. **One agent per project**: Never run two tasks on the same project simultaneously.
|
3. **One agent per project**: Never run two tasks on the same project simultaneously.
|
||||||
4. **Cost-aware concurrency**:
|
4. **Cost-aware concurrency**: L/XL tasks (expensive) max 2 concurrent. Fill remaining slots with S/M tasks. Target mix: 1-2 expensive + 2-3 cheap.
|
||||||
- Estimate task cost from `estimate` field: S=cheap, M=moderate, L=expensive, XL=very expensive
|
|
||||||
- **Expensive tasks** (L, XL): max 2 concurrent
|
|
||||||
- **Cheap tasks** (S, M): fill remaining slots
|
|
||||||
- Target mix: 1-2 expensive + 2-3 cheap = 4-5 total
|
|
||||||
5. **Slot limit**: Never exceed `--slots` (default 4).
|
5. **Slot limit**: Never exceed `--slots` (default 4).
|
||||||
|
|
||||||
```python
|
|
||||||
# Pseudocode for batch selection
|
|
||||||
batch = []
|
|
||||||
used_projects = set()
|
|
||||||
expensive_count = 0
|
|
||||||
|
|
||||||
for priority in ["P0", "P1", "P2"]:
|
|
||||||
for task in queue_items(priority, status="pending"):
|
|
||||||
if len(batch) >= MAX_SLOTS:
|
|
||||||
break
|
|
||||||
if task.project in used_projects:
|
|
||||||
continue # One agent per project
|
|
||||||
if not deps_satisfied(task):
|
|
||||||
continue
|
|
||||||
if task.estimate in ("L", "XL"):
|
|
||||||
if expensive_count >= 2:
|
|
||||||
continue
|
|
||||||
expensive_count += 1
|
|
||||||
batch.append(task)
|
|
||||||
used_projects.add(task.project)
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 2: Assess and Dispatch
|
### Step 2: Assess and Dispatch
|
||||||
|
|
||||||
For each task in the batch, decide the execution strategy:
|
For each task in the batch, decide the execution strategy:
|
||||||
|
|
||||||
| Signal | Strategy | What happens |
|
| Signal | Strategy |
|
||||||
|--------|----------|-------------|
|
|--------|----------|
|
||||||
| Estimate S, clear scope | **Direct** | Spawn Agent() with task description, no orchestration |
|
| Estimate S, clear scope | **Direct** -- Agent with task description, no orchestration |
|
||||||
| Estimate M, multi-file | **Direct+** | Spawn Agent() with task + "read code first, run tests after" |
|
| Estimate M, multi-file | **Direct+** -- Agent with "read code first, run tests after" |
|
||||||
| Estimate L/XL, code | **Feature-dev style** | Agent explores → implements → self-reviews (see below) |
|
| Estimate L/XL, code | **Feature-dev** -- Agent explores, plans, implements, tests, self-reviews, commits |
|
||||||
| Estimate L/XL, writing | **PDCA** | Use af-run with writing domain archetypes |
|
| Estimate L/XL, writing | **PDCA** -- Use af-run with writing domain archetypes |
|
||||||
| Task contains "validate", "test", "lint", "check" | **Direct** | Cheap analytical task, no orchestration |
|
| validate/test/lint/check tasks | **Direct** -- cheap analytical, no orchestration |
|
||||||
| Task contains "review", "audit", "security" | **Review** | Spawn Guardian + relevant reviewers only |
|
| review/audit/security tasks | **Review** -- spawn Guardian + relevant reviewers only |
|
||||||
|
|
||||||
### L/XL Code Task Template (feature-dev style)
|
### L/XL Code Task Template
|
||||||
|
|
||||||
For complex code tasks, give the agent a structured process instead of PDCA:
|
Give the agent a structured process:
|
||||||
|
|
||||||
```
|
```
|
||||||
Agent(
|
Agent(prompt: "You are working on <project> at <path>. Task: <description>
|
||||||
description: "<project>: <task-short>",
|
|
||||||
prompt: "You are working on project <project> at <path>.
|
|
||||||
Task: <task description>
|
|
||||||
|
|
||||||
Follow this process:
|
1. EXPLORE: Read CLAUDE.md, docs/status.md, relevant source files.
|
||||||
1. EXPLORE: Read CLAUDE.md, docs/status.md, and the relevant source files.
|
2. PLAN: Identify files to change, write brief plan (what, where, why).
|
||||||
Understand existing patterns before writing anything.
|
3. IMPLEMENT: Follow existing code patterns strictly.
|
||||||
2. PLAN: Identify 2-3 files to change. Write a brief plan (what, where, why).
|
4. TEST: Run project test suite, fix failures.
|
||||||
If ambiguous, list your assumptions.
|
5. SELF-REVIEW: Re-read diff -- error handling, protocol compliance, test coverage.
|
||||||
3. IMPLEMENT: Make the changes. Follow existing code patterns strictly.
|
|
||||||
4. TEST: Run the project's test suite. Fix any failures.
|
|
||||||
5. SELF-REVIEW: Before committing, re-read your diff. Check:
|
|
||||||
- Error handling: what happens when this fails?
|
|
||||||
- Protocol compliance: am I using the right function signatures?
|
|
||||||
- Tests: did I test the important paths?
|
|
||||||
6. COMMIT + PUSH: Conventional commits, signed, pushed.
|
6. COMMIT + PUSH: Conventional commits, signed, pushed.
|
||||||
|
|
||||||
<standard rules>
|
STATUS: DONE | DONE_WITH_CONCERNS | NEEDS_CONTEXT | BLOCKED")
|
||||||
|
|
||||||
STATUS: DONE | DONE_WITH_CONCERNS | NEEDS_CONTEXT | BLOCKED"
|
|
||||||
)
|
|
||||||
```
|
```
|
||||||
|
|
||||||
This gives the agent feature-dev's structured exploration without the multi-agent overhead.
|
### Agent Spawn Template
|
||||||
For writing/research L/XL tasks, use af-run instead — archetypes add value where linters don't exist.
|
|
||||||
|
|
||||||
**Agent spawn template:**
|
Spawn ALL batch agents in a **single message** (parallel execution). Each agent gets:
|
||||||
|
|
||||||
For each task in the batch, spawn an Agent in the SAME message (parallel dispatch):
|
|
||||||
|
|
||||||
```
|
```
|
||||||
Agent(
|
Agent(
|
||||||
description: "<project>: <task-short>",
|
description: "<project>: <task-short>",
|
||||||
prompt: "You are working on project <project> at <path>.
|
prompt: "You are working on <project> at <path>. Task: <description>
|
||||||
Task: <task description>
|
|
||||||
<notes if any>
|
|
||||||
|
|
||||||
Rules:
|
Rules:
|
||||||
- Read the project's CLAUDE.md first
|
- Read the project's CLAUDE.md first
|
||||||
- Commit with: git -c user.signingkey=/home/c/.ssh/id_ed25519_dev.pub commit
|
- Commit: git -c user.signingkey=/home/c/.ssh/id_ed25519_dev.pub commit
|
||||||
- NO Co-Authored-By trailers
|
- NO Co-Authored-By trailers, conventional commits
|
||||||
- Conventional commits
|
- Push: GIT_SSH_COMMAND='ssh -i /home/c/.ssh/id_ed25519_dev -o IdentitiesOnly=yes' git push origin main
|
||||||
- Push when done: GIT_SSH_COMMAND='ssh -i /home/c/.ssh/id_ed25519_dev -o IdentitiesOnly=yes' git push origin main
|
|
||||||
- Run tests if the project has them
|
- Run tests if the project has them
|
||||||
- Report: what you did, what changed, any blockers
|
- Report: what you did, what changed, any blockers
|
||||||
|
|
||||||
STATUS: DONE | DONE_WITH_CONCERNS | NEEDS_CONTEXT | BLOCKED",
|
STATUS: DONE | DONE_WITH_CONCERNS | NEEDS_CONTEXT | BLOCKED",
|
||||||
subagent_type: "general-purpose",
|
isolation: "worktree" # Only for L/XL tasks; S/M run directly
|
||||||
isolation: "worktree" # Only for L/XL tasks; S/M tasks run directly
|
|
||||||
)
|
)
|
||||||
```
|
```
|
||||||
|
|
||||||
**CRITICAL: Spawn all batch agents in a SINGLE message.** This enables parallel execution.
|
|
||||||
Do not spawn them sequentially.
|
|
||||||
|
|
||||||
### Step 3: Mark Running
|
### Step 3: Mark Running
|
||||||
|
|
||||||
After spawning, update the queue:
|
Update the queue after spawning:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# For each spawned task
|
./scripts/ws start <task-id> # or update queue.json status to "running" directly
|
||||||
./scripts/ws start <task-id> # or manually update queue.json status to "running"
|
|
||||||
```
|
|
||||||
|
|
||||||
If `./scripts/ws start` doesn't exist, update queue.json directly:
|
|
||||||
```python
|
|
||||||
task["status"] = "running"
|
|
||||||
# Write back to docs/orchestra/queue.json
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Step 4: Collect Results
|
### Step 4: Collect Results
|
||||||
|
|
||||||
As agents complete, process their results:
|
Parse status token from agent output. Based on status:
|
||||||
|
- `DONE` -- mark completed, note result
|
||||||
|
- `DONE_WITH_CONCERNS` -- mark completed, log concerns for user review
|
||||||
|
- `NEEDS_CONTEXT` -- mark pending, add concern to notes, skip for now
|
||||||
|
- `BLOCKED` -- mark failed, add blocker to notes
|
||||||
|
|
||||||
1. **Parse status token** from agent output (last line: `STATUS: DONE|...`)
|
Update: `./scripts/ws done <task-id> -r "<summary>"` or `./scripts/ws fail <task-id> -r "<reason>"`
|
||||||
2. **Based on status**:
|
|
||||||
- `DONE` → mark completed, note result
|
|
||||||
- `DONE_WITH_CONCERNS` → mark completed, log concerns for user review
|
|
||||||
- `NEEDS_CONTEXT` → mark pending, add concern to notes, skip for now
|
|
||||||
- `BLOCKED` → mark failed, add blocker to notes
|
|
||||||
3. **Update queue**:
|
|
||||||
```bash
|
|
||||||
./scripts/ws done <task-id> -r "<summary of what was done>"
|
|
||||||
# or
|
|
||||||
./scripts/ws fail <task-id> -r "<reason>"
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 5: Report and Loop
|
### Step 5: Report and Loop
|
||||||
|
|
||||||
After batch completes, show sprint status:
|
Show batch status, then **immediately select next batch** (no user prompt in AUTONOM mode):
|
||||||
|
|
||||||
```
|
```
|
||||||
── Sprint Batch 1 ──────────────────────────────
|
-- Sprint Batch 1 --------------------------------------------------
|
||||||
✓ writing.colette fanout run done (45s)
|
+ writing.colette fanout run done (45s)
|
||||||
✓ book.3sets validation done (30s)
|
+ book.3sets validation done (30s)
|
||||||
△ book.sos meta-book concept needs_context (missing outline)
|
! book.sos meta-book concept needs_context
|
||||||
✓ tool.archeflow af-review mode done (60s)
|
+ tool.archeflow af-review mode done (60s)
|
||||||
|
|
||||||
Queue: 3 completed, 1 blocked, 3 remaining
|
Queue: 3 completed, 1 blocked, 3 remaining
|
||||||
Next batch: 2 items ready
|
--------------------------------------------------------------------
|
||||||
────────────────────────────────────────────────
|
|
||||||
```
|
```
|
||||||
|
|
||||||
Then **immediately select and dispatch the next batch** (Step 1). Don't wait for user input in AUTONOM mode.
|
|
||||||
|
|
||||||
### Step 6: Sprint Complete
|
### Step 6: Sprint Complete
|
||||||
|
|
||||||
When no more tasks are schedulable (all done, blocked, or P3-only):
|
When no more tasks are schedulable:
|
||||||
|
|
||||||
1. Update `docs/control-center.md` Handoff section
|
1. Update `docs/control-center.md` Handoff section
|
||||||
2. Run `./scripts/ws log --summary "<sprint summary>"` if available
|
2. Run `./scripts/ws log --summary "<sprint summary>"`
|
||||||
3. Show final sprint report:
|
3. Show final report with duration, tasks completed/blocked/remaining, projects touched, commits
|
||||||
|
|
||||||
```
|
|
||||||
── Sprint Complete ─────────────────────────────
|
|
||||||
Duration: 12 min
|
|
||||||
Tasks: 5 completed, 1 blocked, 1 remaining (P3)
|
|
||||||
Projects touched: 4
|
|
||||||
Commits: 7
|
|
||||||
────────────────────────────────────────────────
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Mode Behavior
|
## Mode Behavior
|
||||||
|
|
||||||
### AUTONOM
|
| Mode | Dispatch | Between batches | Stops for |
|
||||||
- Dispatch immediately, no user confirmation
|
|------|----------|----------------|-----------|
|
||||||
- Commit + push after each agent completes
|
| **AUTONOM** | Immediate | One-line status, no pause | BLOCKED or budget exhaustion |
|
||||||
- Only pause for BLOCKED tasks or budget exhaustion
|
| **ATTENDED** | Show batch, wait for approval | Show results, ask "Continue? [y/n/edit]" | User decision |
|
||||||
- Report between batches (one-line status)
|
| **PAUSED** | No dispatch | -- | Always (status display only) |
|
||||||
|
|
||||||
### ATTENDED
|
|
||||||
- Show the selected batch before dispatching
|
|
||||||
- Wait for user to approve: "Proceed with this batch? [y/n]"
|
|
||||||
- After each batch, show results and ask: "Continue to next batch? [y/n/edit]"
|
|
||||||
- "edit" lets the user reprioritize before next batch
|
|
||||||
|
|
||||||
### PAUSED
|
|
||||||
- Show queue status only
|
|
||||||
- Do not dispatch any agents
|
|
||||||
- Useful for reviewing state between sessions
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## When to Use ArcheFlow Orchestration Within Sprint
|
|
||||||
|
|
||||||
Most sprint tasks should be **direct agent dispatch** (no PDCA/pipeline overhead).
|
|
||||||
Only escalate to full orchestration when:
|
|
||||||
|
|
||||||
| Signal | Action |
|
|
||||||
|--------|--------|
|
|
||||||
| Task is S/M, clear scope, single project | Direct dispatch |
|
|
||||||
| Task is L/XL | Use pipeline or PDCA strategy |
|
|
||||||
| Task mentions "security", "auth", "encryption" | Add Guardian review |
|
|
||||||
| Task is a review/audit | Spawn reviewers only (af-review mode) |
|
|
||||||
| Task failed in a previous sprint | Escalate to PDCA with Explorer |
|
|
||||||
|
|
||||||
The sprint runner's job is **throughput**, not perfection. Ship fast, fix forward.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Integration with Existing Tools
|
|
||||||
|
|
||||||
| Tool | How sprint uses it |
|
|
||||||
|------|-------------------|
|
|
||||||
| `./scripts/ws next` | Get next schedulable task |
|
|
||||||
| `./scripts/ws done <id>` | Mark task completed |
|
|
||||||
| `./scripts/ws fail <id>` | Mark task failed |
|
|
||||||
| `./scripts/ws orient` | Initial workspace overview |
|
|
||||||
| `./scripts/ws validate` | Pre-flight queue validation |
|
|
||||||
| `git` per project | Commit + push after each agent |
|
|
||||||
| `archeflow:run` | Only for L/XL tasks needing PDCA |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Error Recovery
|
## Error Recovery
|
||||||
|
|
||||||
- **Agent crashes mid-task**: Mark task as `failed`, add error to notes, continue with next batch
|
- **Agent crash**: Mark `failed`, continue with next batch
|
||||||
- **Git push fails**: Log the error, do NOT retry. User will handle push conflicts manually.
|
- **Git push fails**: Log error, do NOT retry -- user handles conflicts
|
||||||
- **Queue file corrupted**: Run `./scripts/ws validate`. If invalid, stop sprint and report.
|
- **Queue corrupted**: Run `./scripts/ws validate`, stop if invalid
|
||||||
- **Budget exceeded**: Stop sprint, report remaining tasks and estimated cost.
|
- **Budget exceeded**: Stop sprint, report remaining tasks and estimated cost
|
||||||
- **All tasks blocked**: Report dependency graph, suggest which blockers to resolve first.
|
- **All blocked**: Report dependency graph, suggest which blockers to resolve first
|
||||||
|
|||||||
@@ -7,316 +7,79 @@ description: |
|
|||||||
<example>User: "archeflow init writing-short-story"</example>
|
<example>User: "archeflow init writing-short-story"</example>
|
||||||
<example>User: "archeflow template save my-backend-setup"</example>
|
<example>User: "archeflow template save my-backend-setup"</example>
|
||||||
<example>User: "archeflow template list"</example>
|
<example>User: "archeflow template list"</example>
|
||||||
<example>User: "archeflow init --from ../book.giesing-gschichten"</example>
|
|
||||||
---
|
---
|
||||||
|
|
||||||
# Template Gallery — Shareable ArcheFlow Configurations
|
# Template Gallery -- Shareable ArcheFlow Configurations
|
||||||
|
|
||||||
Workflows, team presets, custom archetypes, and domain configs should be reusable across projects. This skill defines the template system that makes ArcheFlow setups portable and shareable.
|
Makes ArcheFlow setups portable and reusable across projects.
|
||||||
|
|
||||||
## Template Storage
|
## Template Storage
|
||||||
|
|
||||||
Templates live in two locations, with project-local overriding global:
|
|
||||||
|
|
||||||
| Location | Scope | Precedence |
|
| Location | Scope | Precedence |
|
||||||
|----------|-------|------------|
|
|----------|-------|------------|
|
||||||
| `.archeflow/templates/` | Project-local | Higher (checked first) |
|
| `.archeflow/templates/` | Project-local | Higher (checked first) |
|
||||||
| `~/.archeflow/templates/` | Global (user-wide) | Lower (fallback) |
|
| `~/.archeflow/templates/` | Global (user-wide) | Lower (fallback) |
|
||||||
|
|
||||||
### Directory Structure
|
Subdirectories: `workflows/`, `teams/`, `archetypes/`, `domains/`, `bundles/`.
|
||||||
|
|
||||||
```
|
## Bundles
|
||||||
~/.archeflow/templates/
|
|
||||||
├── workflows/
|
|
||||||
│ ├── kurzgeschichte.yaml
|
|
||||||
│ ├── feature-implementation.yaml
|
|
||||||
│ └── security-review.yaml
|
|
||||||
├── teams/
|
|
||||||
│ ├── story-development.yaml
|
|
||||||
│ ├── backend.yaml
|
|
||||||
│ └── fullstack.yaml
|
|
||||||
├── archetypes/
|
|
||||||
│ ├── story-explorer.md
|
|
||||||
│ ├── story-sage.md
|
|
||||||
│ └── db-specialist.md
|
|
||||||
├── domains/
|
|
||||||
│ ├── writing.yaml
|
|
||||||
│ ├── code.yaml
|
|
||||||
│ └── research.yaml
|
|
||||||
└── bundles/
|
|
||||||
├── writing-short-story/
|
|
||||||
│ ├── manifest.yaml
|
|
||||||
│ ├── team.yaml
|
|
||||||
│ ├── workflow.yaml
|
|
||||||
│ ├── archetypes/
|
|
||||||
│ │ ├── story-explorer.md
|
|
||||||
│ │ └── story-sage.md
|
|
||||||
│ └── domain.yaml
|
|
||||||
└── backend-feature/
|
|
||||||
├── manifest.yaml
|
|
||||||
├── team.yaml
|
|
||||||
├── workflow.yaml
|
|
||||||
└── domain.yaml
|
|
||||||
```
|
|
||||||
|
|
||||||
Individual templates (workflows/, teams/, archetypes/, domains/) are single files that can be used standalone. Bundles are complete setups that include everything a project needs.
|
A bundle is a complete setup (team + workflow + archetypes + domain) in one directory.
|
||||||
|
|
||||||
---
|
**Manifest (`manifest.yaml`):**
|
||||||
|
|
||||||
## Bundle Manifest
|
|
||||||
|
|
||||||
Every bundle has a `manifest.yaml` that declares what it contains, what it requires, and what variables it exposes.
|
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
name: writing-short-story
|
name: writing-short-story
|
||||||
description: "Complete setup for short fiction writing with ArcheFlow"
|
description: "Complete setup for short fiction writing"
|
||||||
version: 1
|
|
||||||
domain: writing
|
domain: writing
|
||||||
includes:
|
includes:
|
||||||
  team: story-development.yaml
|
  team: story-development.yaml
|
||||||
  workflow: kurzgeschichte.yaml
|
  workflow: kurzgeschichte.yaml
|
||||||
  archetypes:
|
  archetypes: [story-explorer.md, story-sage.md]
|
||||||
    - story-explorer.md
|
|
||||||
    - story-sage.md
|
|
||||||
  domain: writing.yaml
|
  domain: writing.yaml
|
||||||
requires:
|
requires: [colette.yaml]
|
||||||
  - colette.yaml # Project must have this file
|
|
||||||
variables:
|
|
||||||
  target_words: 6000 # Default, can be overridden at init time
|
|
||||||
  max_cycles: 2 # Default, can be overridden at init time
|
|
||||||
```
|
|
||||||
|
|
||||||
### Manifest Fields
|
|
||||||
|
|
||||||
| Field | Required | Description |
|
|
||||||
|-------|----------|-------------|
|
|
||||||
| `name` | Yes | Bundle identifier (used in `archeflow init <name>`) |
|
|
||||||
| `description` | Yes | Human-readable description |
|
|
||||||
| `version` | No | Bundle version (integer, default 1) |
|
|
||||||
| `domain` | No | Domain this bundle is designed for |
|
|
||||||
| `includes` | Yes | Map of file types to filenames within the bundle |
|
|
||||||
| `requires` | No | List of files that must exist in the target project |
|
|
||||||
| `variables` | No | Key-value pairs with defaults, overridable at init |
|
|
||||||
|
|
||||||
### Includes Types
|
|
||||||
|
|
||||||
| Key | Target location in `.archeflow/` | Accepts |
|
|
||||||
|-----|----------------------------------|---------|
|
|
||||||
| `team` | `teams/<filename>` | Single YAML file |
|
|
||||||
| `workflow` | `workflows/<filename>` | Single YAML file |
|
|
||||||
| `archetypes` | `archetypes/<filename>` | List of Markdown files |
|
|
||||||
| `domain` | `domains/<filename>` | Single YAML file |
|
|
||||||
| `hooks` | `hooks.yaml` | Single YAML file |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Operations
|
|
||||||
|
|
||||||
### `archeflow init <bundle-name>`
|
|
||||||
|
|
||||||
Initialize a project's `.archeflow/` directory from a named bundle.
|
|
||||||
|
|
||||||
**Procedure:**
|
|
||||||
|
|
||||||
1. Search for the bundle:
|
|
||||||
- `.archeflow/templates/bundles/<name>/manifest.yaml` (project-local)
|
|
||||||
- `~/.archeflow/templates/bundles/<name>/manifest.yaml` (global)
|
|
||||||
- If not found: error with list of available bundles
|
|
||||||
2. Read `manifest.yaml`
|
|
||||||
3. Check `requires`:
|
|
||||||
- For each required file, verify it exists in the project root
|
|
||||||
- If missing: error with `"Required file not found: <file>. This bundle requires it."`
|
|
||||||
4. Check for existing `.archeflow/` setup:
|
|
||||||
- If `.archeflow/teams/`, `.archeflow/workflows/`, etc. already contain files: warn and ask before overwriting
|
|
||||||
- Never silently overwrite existing configuration
|
|
||||||
5. Copy files from bundle to `.archeflow/`:
|
|
||||||
- `team` → `.archeflow/teams/<filename>`
|
|
||||||
- `workflow` → `.archeflow/workflows/<filename>`
|
|
||||||
- `archetypes` → `.archeflow/archetypes/<filename>` (each file)
|
|
||||||
- `domain` → `.archeflow/domains/<filename>`
|
|
||||||
- `hooks` → `.archeflow/hooks.yaml`
|
|
||||||
6. Create `.archeflow/config.yaml` with variables from manifest:
|
|
||||||
```yaml
|
|
||||||
# Generated by archeflow init from bundle: <name>
|
|
||||||
bundle: <name>
|
|
||||||
bundle_version: <version>
|
|
||||||
initialized: <timestamp>
|
|
||||||
variables:
|
variables:
|
||||||
  target_words: 6000
|
  target_words: 6000
|
||||||
  max_cycles: 2
|
  max_cycles: 2
|
||||||
```
|
```
|
||||||
7. Print setup summary:
|
|
||||||
```
|
|
||||||
ArcheFlow initialized from bundle: <name>
|
|
||||||
Team: <team filename> → .archeflow/teams/
|
|
||||||
Workflow: <workflow filename> → .archeflow/workflows/
|
|
||||||
Archetypes: <count> files → .archeflow/archetypes/
|
|
||||||
Domain: <domain filename> → .archeflow/domains/
|
|
||||||
Config: .archeflow/config.yaml (variables: target_words=6000, max_cycles=2)
|
|
||||||
|
|
||||||
Ready to run: archeflow:run
|
| Field | Required | Description |
|
||||||
```
|
|-------|----------|-------------|
|
||||||
|
| `name` | Yes | Bundle identifier for `archeflow init <name>` |
|
||||||
|
| `description` | Yes | Human-readable description |
|
||||||
|
| `includes` | Yes | File types to filenames within bundle |
|
||||||
|
| `requires` | No | Files that must exist in target project |
|
||||||
|
| `variables` | No | Key-value defaults, overridable at init |
|
||||||
|
|
||||||
### `archeflow init --from <project-path>`
|
## Operations
|
||||||
|
|
||||||
Clone another project's ArcheFlow setup into the current project.
|
**`archeflow init <bundle-name>`**
|
||||||
|
1. Find bundle (project-local, then global)
|
||||||
|
2. Check `requires` files exist
|
||||||
|
3. Warn before overwriting existing `.archeflow/` config
|
||||||
|
4. Copy files to `.archeflow/` (teams/, workflows/, archetypes/, domains/)
|
||||||
|
5. Generate `.archeflow/config.yaml` with variables
|
||||||
|
|
||||||
**Procedure:**
|
**`archeflow init --from <project-path>`**
|
||||||
|
- Copy teams/, workflows/, archetypes/, domains/, config.yaml, hooks.yaml
|
||||||
|
- Skip run-specific data: events/, artifacts/, context/, templates/
|
||||||
|
|
||||||
1. Verify `<project-path>/.archeflow/` exists
|
**`archeflow template save <name>`**
|
||||||
2. Copy these subdirectories (if they exist):
|
- Package current `.archeflow/` into `~/.archeflow/templates/bundles/<name>/`
|
||||||
- `teams/`
|
- Auto-generate manifest.yaml
|
||||||
- `workflows/`
|
|
||||||
- `archetypes/`
|
|
||||||
- `domains/`
|
|
||||||
- `config.yaml`
|
|
||||||
- `hooks.yaml`
|
|
||||||
3. Do NOT copy (run-specific data):
|
|
||||||
- `events/`
|
|
||||||
- `artifacts/`
|
|
||||||
- `context/` (generated by colette-bridge, project-specific)
|
|
||||||
- `templates/` (project-local templates stay local)
|
|
||||||
4. Warn if target `.archeflow/` already has files
|
|
||||||
5. Print summary of what was copied
|
|
||||||
|
|
||||||
### `archeflow template save <name>`
|
**`archeflow template list`**
|
||||||
|
- Show all bundles and individual templates (global + project-local)
|
||||||
Save the current project's `.archeflow/` setup as a reusable template bundle.
|
|
||||||
|
|
||||||
**Procedure:**
|
|
||||||
|
|
||||||
1. Verify `.archeflow/` exists and has content
|
|
||||||
2. Create bundle directory: `~/.archeflow/templates/bundles/<name>/`
|
|
||||||
- If it already exists: warn and ask before overwriting
|
|
||||||
3. Copy from `.archeflow/` to bundle:
|
|
||||||
- `teams/*.yaml` → bundle `team` (first file, or prompt if multiple)
|
|
||||||
- `workflows/*.yaml` → bundle `workflow` (first file, or prompt if multiple)
|
|
||||||
- `archetypes/*.md` → bundle `archetypes/`
|
|
||||||
- `domains/*.yaml` → bundle `domain` (first file, or prompt if multiple)
|
|
||||||
- `hooks.yaml` → bundle (if exists)
|
|
||||||
4. Generate `manifest.yaml`:
|
|
||||||
```yaml
|
|
||||||
name: <name>
|
|
||||||
description: "Saved from <project directory name>"
|
|
||||||
version: 1
|
|
||||||
domain: <from domain yaml if present>
|
|
||||||
includes:
|
|
||||||
  team: <filename>
|
|
||||||
  workflow: <filename>
|
|
||||||
  archetypes: [<filenames>]
|
|
||||||
  domain: <filename>
|
|
||||||
requires: []
|
|
||||||
variables: <from config.yaml variables section if present>
|
|
||||||
```
|
|
||||||
5. Print summary:
|
|
||||||
```
|
|
||||||
Template saved: <name>
|
|
||||||
Location: ~/.archeflow/templates/bundles/<name>/
|
|
||||||
Files: <count> files
|
|
||||||
Use with: archeflow init <name>
|
|
||||||
```
|
|
||||||
|
|
||||||
### `archeflow template list`
|
|
||||||
|
|
||||||
List all available templates — both individual files and bundles, from both global and project-local locations.
|
|
||||||
|
|
||||||
**Output format:**
|
|
||||||
|
|
||||||
```
|
|
||||||
ArcheFlow Templates
|
|
||||||
====================
|
|
||||||
|
|
||||||
Bundles:
|
|
||||||
writing-short-story Complete setup for short fiction writing [global]
|
|
||||||
backend-feature Backend feature implementation [global]
|
|
||||||
my-project-setup Saved from book.giesing-gschichten [global]
|
|
||||||
|
|
||||||
Individual Templates:
|
|
||||||
Workflows:
|
|
||||||
kurzgeschichte.yaml [global]
|
|
||||||
feature-implementation.yaml [global]
|
|
||||||
Teams:
|
|
||||||
story-development.yaml [global]
|
|
||||||
backend.yaml [global]
|
|
||||||
Archetypes:
|
|
||||||
story-explorer.md [global]
|
|
||||||
story-sage.md [global]
|
|
||||||
Domains:
|
|
||||||
writing.yaml [global]
|
|
||||||
code.yaml [global]
|
|
||||||
```
|
|
||||||
|
|
||||||
### `archeflow template share <name> <path>`
|
|
||||||
|
|
||||||
Export a template bundle to a directory for sharing (e.g., via git, email, file share).
|
|
||||||
|
|
||||||
**Procedure:**
|
|
||||||
|
|
||||||
1. Find the bundle (global or local)
|
|
||||||
2. Copy the entire bundle directory to `<path>/<name>/`
|
|
||||||
3. Print the path and a one-liner for importing:
|
|
||||||
```
|
|
||||||
Exported: <path>/<name>/
|
|
||||||
To import: cp -r <path>/<name> ~/.archeflow/templates/bundles/
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Variable Substitution
|
## Variable Substitution
|
||||||
|
|
||||||
Bundle manifests can define variables with defaults. These are stored in `.archeflow/config.yaml` after init and can be overridden:
|
Variables in manifests are stored in `.archeflow/config.yaml` after init. Substitution happens at run time, not template time.
|
||||||
|
|
||||||
- At init time: `archeflow init writing-short-story --set target_words=8000`
|
Override at init: `archeflow init writing-short-story --set target_words=8000`
|
||||||
- After init: edit `.archeflow/config.yaml` directly
|
|
||||||
|
|
||||||
Variables are available to workflows and the run skill via config:
|
## Individual Templates
|
||||||
|
|
||||||
```yaml
|
Single files can be copied directly without a bundle:
|
||||||
# In a workflow, reference variables:
|
- `~/.archeflow/templates/workflows/<name>.yaml`
|
||||||
phases:
|
- `~/.archeflow/templates/archetypes/<name>.md`
|
||||||
  do:
|
- `~/.archeflow/templates/teams/<name>.yaml`
|
||||||
    description: |
|
|
||||||
      Draft the story. Target: ${target_words} words.
|
|
||||||
```
|
|
||||||
|
|
||||||
Variable substitution happens at run time, not at init time. The workflow file contains the `${variable}` placeholder; the run skill reads `.archeflow/config.yaml` and substitutes before passing to agents.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Individual Template Usage
|
|
||||||
|
|
||||||
Not everything needs a bundle. Individual templates can be copied directly:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Copy a single workflow
|
|
||||||
cp ~/.archeflow/templates/workflows/kurzgeschichte.yaml .archeflow/workflows/
|
|
||||||
|
|
||||||
# Copy a single archetype
|
|
||||||
cp ~/.archeflow/templates/archetypes/story-explorer.md .archeflow/archetypes/
|
|
||||||
|
|
||||||
# Copy a team preset
|
|
||||||
cp ~/.archeflow/templates/teams/story-development.yaml .archeflow/teams/
|
|
||||||
```
|
|
||||||
|
|
||||||
The `archeflow init` command handles bundles. For individual files, manual copy or the helper script (`lib/archeflow-init.sh`) can be used.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Integration with Other Skills
|
|
||||||
|
|
||||||
- **`archeflow:run`** — Reads `.archeflow/config.yaml` for variables, applies them during run initialization
|
|
||||||
- **`archeflow:domains`** — Domain YAML from templates is loaded like any other domain config
|
|
||||||
- **`archeflow:custom-archetypes`** — Archetype .md files from templates work identically to hand-written ones
|
|
||||||
- **`archeflow:workflow-design`** — Workflow YAML from templates follows the same schema
|
|
||||||
- **`archeflow:colette-bridge`** — Bundle `requires: [colette.yaml]` ensures the bridge has what it needs
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Design Principles
|
|
||||||
|
|
||||||
1. **Bundles are self-contained.** Everything needed to set up a project is in the bundle directory. No external dependencies beyond `requires`.
|
|
||||||
2. **Never silently overwrite.** Init warns before replacing existing files. Templates are helpers, not bulldozers.
|
|
||||||
3. **Global + local layering.** Project-local templates override global ones. This allows per-project customization without polluting the global registry.
|
|
||||||
4. **Skip run data.** Events, artifacts, and context are run-specific. Templates carry only configuration.
|
|
||||||
5. **Variables are late-bound.** Substitution happens at run time, not template time. This keeps templates generic.
|
|
||||||
6. **Plain files, no magic.** Templates are just directories of YAML and Markdown files. No databases, no registries, no lock files.
|
|
||||||
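The first steps of `archeflow init <bundle-name>` (look in the project-local templates, fall back to the global ones, then check `requires`) can be sketched in Python. This is an illustrative sketch, not ArcheFlow's implementation: `find_bundle` and `missing_requirements` are hypothetical helper names, and the `home` parameter exists only so the lookup can be exercised without touching the real home directory.

```python
from pathlib import Path


def find_bundle(name, project_root, home=None):
    """Return the bundle directory, preferring project-local over global.

    Hypothetical helper: a bundle counts as found when its directory
    contains a manifest.yaml, mirroring the search order described above.
    """
    home = Path(home) if home is not None else Path.home()
    candidates = [
        Path(project_root) / ".archeflow" / "templates" / "bundles" / name,  # checked first
        home / ".archeflow" / "templates" / "bundles" / name,                # fallback
    ]
    for bundle_dir in candidates:
        if (bundle_dir / "manifest.yaml").is_file():
            return bundle_dir
    return None  # caller reports the error and lists available bundles


def missing_requirements(requires, project_root):
    """Return the entries of `requires` that do not exist in the project root."""
    root = Path(project_root)
    return [filename for filename in requires if not (root / filename).exists()]
```

A caller would abort when `find_bundle` returns `None` or when `missing_requirements` returns a non-empty list, before any file is copied.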
|
|||||||
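The late-binding rule for variables (the workflow file keeps its `${variable}` placeholder, and values come from `.archeflow/config.yaml` at run time) can be illustrated with a short Python sketch. It is not the run skill's actual code: `render_workflow` and `apply_overrides` are hypothetical names, the config is passed in as an already-parsed dict so no YAML library is assumed, and override values are kept as plain strings. Python's `string.Template` happens to use the same `${name}` syntax.

```python
from string import Template


def render_workflow(workflow_text, config):
    """Fill ${name} placeholders from the config's `variables` section.

    Hypothetical helper. safe_substitute leaves unknown placeholders
    untouched instead of raising, so a missing variable stays visible.
    """
    variables = config.get("variables", {})
    return Template(workflow_text).safe_substitute(variables)


def apply_overrides(config, overrides):
    """Merge `--set key=value` style overrides into a copy of the config.

    Assumption: values are stored as strings; no type inference is done.
    """
    merged = dict(config.get("variables", {}))
    for item in overrides:
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        merged[key] = value
    return {**config, "variables": merged}


if __name__ == "__main__":
    config = {"bundle": "writing-short-story",
              "variables": {"target_words": 6000, "max_cycles": 2}}
    config = apply_overrides(config, ["target_words=8000"])
    print(render_workflow("Draft the story. Target: ${target_words} words.", config))
```

Because substitution reads the config on every run, editing `.archeflow/config.yaml` after init changes the rendered text without touching the workflow file.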
@@ -5,180 +5,51 @@ description: Use at session start when implementing features, reviewing code, de
|
|||||||
|
|
||||||
# ArcheFlow -- Active
|
# ArcheFlow -- Active
|
||||||
|
|
||||||
Multi-agent orchestration using archetypal roles and PDCA quality cycles.
|
On activation, print ONE line then proceed silently:
|
||||||
|
|
||||||
## Session Start
|
|
||||||
|
|
||||||
On activation, print ONE line:
|
|
||||||
```
|
```
|
||||||
archeflow v0.7.0 · 25 skills · <domain> domain
|
archeflow v0.8.0 · 19 skills · <domain> domain
|
||||||
```
|
```
|
||||||
Where `<domain>` is auto-detected: `writing` if `colette.yaml` exists, `research` if paper/thesis files exist, `code` otherwise. Then proceed silently — no further announcement unless `archeflow:run` is invoked.
|
Domain auto-detected: `writing` if `colette.yaml` exists, `research` if paper/thesis files, `code` otherwise.
|
||||||
|
|
||||||
During runs, follow the `archeflow:presence` skill for output format: show outcomes not mechanics, one line per phase, value at the end.
|
## When to Use What
|
||||||
|
|
||||||
## IMPORTANT: When to Use What
|
| Need | Command | When |
|
||||||
|
|------|---------|------|
|
||||||
### Use `/af-sprint` (primary mode) when:
|
| **Work the queue** | `/af-sprint` | Multiple tasks pending across projects, "run the sprint" |
|
||||||
- User says "run the sprint", "work the queue", "go autonomous"
|
| **Deep orchestration** | `/af-run` | Writing/research tasks, security-sensitive code, complex multi-module refactors |
|
||||||
- Multiple tasks are pending across projects
|
| **Code review** | `/af-review` | Review diff/branch/commits before merging, security-sensitive changes |
|
||||||
- The workspace queue (docs/orchestra/queue.json) has pending items
|
| **Single feature** | `feature-dev` or direct | Clear scope, one project -- no orchestration needed |
|
||||||
|
|
||||||
### Use `/af-review` when:
|
|
||||||
- User wants to review code before merging
|
|
||||||
- A diff, branch, or commit range needs quality check
|
|
||||||
- Security-sensitive changes need Guardian analysis
|
|
||||||
|
|
||||||
### Use `/af-run` (deep orchestration) when:
|
|
||||||
- **Writing/research tasks** -- archetypes add value where linters don't exist
|
|
||||||
- **Security-sensitive code changes** -- auth, encryption, API keys
|
|
||||||
- **Complex multi-module refactors** with unclear approach
|
|
||||||
|
|
||||||
### Do NOT use ArcheFlow for:
|
|
||||||
- **Single-feature code development** -- use `feature-dev` plugin or work directly
|
|
||||||
- **Simple fixes** -- just do them
|
|
||||||
- **Questions, exploration, reading** -- no code changes needed
|
|
||||||
|
|
||||||
Choose the workflow based on risk:
|
|
||||||
|
|
||||||
| Signal | Workflow | Command |
|
|
||||||
|--------|----------|---------|
|
|
||||||
| Small fix, low risk, single concern | `fast` | Creator --> Maker --> Guardian |
|
|
||||||
| Feature, multiple files, moderate risk | `standard` | Explorer + Creator --> Maker --> Guardian + Skeptic + Sage |
|
|
||||||
| Security-sensitive, breaking changes, public API | `thorough` | Explorer + Creator --> Maker --> All 4 reviewers |
|
|
||||||
|
|
||||||
## When to Skip ArcheFlow
|
## When to Skip ArcheFlow
|
||||||
|
|
||||||
Do NOT use ArcheFlow for these -- just do them directly:
|
Do NOT use for: single-line fixes, questions, reading/exploring, config tweaks, git ops.
|
||||||
|
|
||||||
- Single-line fixes, typos, formatting
|
## Workflow Selection
|
||||||
- Answering questions (no code changes)
|
|
||||||
- Reading/exploring code without making changes
|
|
||||||
- Config changes to a single file
|
|
||||||
- Git operations (commit, push, branch)
|
|
||||||
|
|
||||||
**Mini-Reflect fallback:** Even when skipping ArcheFlow, apply a quick reflection for non-trivial single-file changes: (1) restate what you're changing, (2) name one assumption, (3) check if it could break anything. This takes ~10 seconds and catches misunderstandings before they become commits.
|
| Signal | Workflow | Pipeline |
|
||||||
|
|--------|----------|----------|
|
||||||
## Archetypes
|
| Small fix, low risk | `fast` | Creator --> Maker --> Guardian |
|
||||||
|
| Feature, multi-file, moderate risk | `standard` | Explorer + Creator --> Maker --> Guardian + Skeptic + Sage |
|
||||||
| Archetype | Avatar | Virtue | Shadow | Phase |
|
| Security, breaking changes, public API | `thorough` | Explorer + Creator --> Maker --> All 4 reviewers |
|
||||||
|-----------|--------|--------|--------|-------|
|
|
||||||
| **Explorer** | 🔍 | Contextual Clarity | Rabbit Hole | Plan |
|
|
||||||
| **Creator** | 🏗️ | Decisive Framing | Over-Architect | Plan |
|
|
||||||
| **Maker** | ⚒️ | Execution Discipline | Rogue | Do |
|
|
||||||
| **Guardian** | 🛡️ | Threat Intuition | Paranoid | Check |
|
|
||||||
| **Skeptic** | 🤔 | Assumption Surfacing | Paralytic | Check |
|
|
||||||
| **Trickster** | 🃏 | Adversarial Creativity | False Alarm | Check |
|
|
||||||
| **Sage** | 📚 | Maintainability Judgment | Bureaucrat | Check |
|
|
||||||
|
|
||||||
## PDCA Cycle
|
|
||||||
|
|
||||||
```
|
|
||||||
Plan --> Explorer researches, Creator proposes
|
|
||||||
Do --> Maker implements in isolated worktree
|
|
||||||
Check --> Reviewers assess in parallel (approve/reject)
|
|
||||||
Act --> All approved? Merge. Issues? Cycle back to Plan.
|
|
||||||
```
|
|
||||||
|
|
||||||
## Progress Indicators
|
|
||||||
|
|
||||||
During orchestration, emit phase markers so the user can track progress:
|
|
||||||
|
|
||||||
```
|
|
||||||
--- ArcheFlow: <task> -------------------------
|
|
||||||
Workflow: standard (2 cycles max)
|
|
||||||
|
|
||||||
🔍 [Plan] Explorer researching... done (35s)
|
|
||||||
🏗️ [Plan] Creator designing proposal... done (25s, confidence: 0.8)
|
|
||||||
⚒️ [Do] Maker implementing... done (90s, 4 files, 8 tests)
|
|
||||||
🛡️ [Check] Guardian reviewing... APPROVED
|
|
||||||
🤔 [Check] Skeptic challenging... APPROVED (1 INFO)
|
|
||||||
📚 [Check] Sage reviewing... APPROVED
|
|
||||||
[Act] All approved -- merging... merged to main
|
|
||||||
|
|
||||||
--- Complete: 3m 10s, 1 cycle -----------------
|
|
||||||
```
|
|
||||||
|
|
||||||
Update each line as agents complete. This gives the user real-time visibility without interrupting the flow.
|
|
||||||
|
|
||||||
## Dry-Run Mode
|
|
||||||
|
|
||||||
When the user asks "what would ArcheFlow do?" or uses `--dry-run`, show the plan without executing:
|
|
||||||
|
|
||||||
```
|
|
||||||
Dry run for: "Add JWT authentication"
|
|
||||||
Workflow: standard (2 cycles)
|
|
||||||
Agents: 🔍 Explorer --> 🏗️ Creator --> ⚒️ Maker --> 🛡️ Guardian + 🤔 Skeptic + 📚 Sage
|
|
||||||
Est. agents: 6 per cycle, 12 max
|
|
||||||
Worktree: yes (isolated branch)
|
|
||||||
Proceed? [y/n]
|
|
||||||
```
|
|
||||||
|
|
||||||
## Quick Start
|
|
||||||
|
|
||||||
When the user gives an implementation task:
|
|
||||||
|
|
||||||
1. Assess: does this need ArcheFlow? (see criteria above)
|
|
||||||
2. If yes: load `archeflow:orchestration` skill
|
|
||||||
3. Pick workflow (fast/standard/thorough)
|
|
||||||
4. Execute the PDCA steps from the orchestration skill
|
|
||||||
5. Emit progress indicators throughout (see above)
|
|
||||||
|
|
||||||
## Available Commands
|
## Available Commands
|
||||||
|
|
||||||
| Command | What it does |
|
| Command | What it does |
|
||||||
|---------|-------------|
|
|---------|-------------|
|
||||||
| `archeflow:run` | Automated PDCA loop -- single command to orchestrate a full run |
|
| `/af-sprint` | Queue-driven parallel agent runner (primary mode) |
|
||||||
| `archeflow:orchestration` | Load manual PDCA execution guide |
|
| `/af-run <task>` | PDCA orchestration loop (`--dry-run`, `--start-from`, `--workflow`) |
|
||||||
| `archeflow:shadow-detection` | Load shadow monitoring rules |
|
| `/af-review` | Guardian-led code review on diff/branch/range |
|
||||||
| `archeflow:autonomous-mode` | Load autonomous/overnight session protocol |
|
| `/af-status` | Current run state, active agents, findings |
|
||||||
| `archeflow:status` | Show current orchestration state (phase, cycle, active agents) |
|
| `/af-report` | Full process report for a run |
|
||||||
| `archeflow:history` | Show past orchestration summaries from `.archeflow/session-log.md` |
|
| `/af-init` | Initialize ArcheFlow in a project |
|
||||||
|
| `/af-score` | Archetype effectiveness scores |
|
||||||
|
| `/af-memory` | Cross-run lesson memory |
|
||||||
|
| `/af-fanout` | Colette book fanout via agents |
|
||||||
|
| `/af-dag` | DAG of current/last run |
|
||||||
|
|
||||||
### `archeflow:status`
|
## Mini-Reflect Fallback
|
||||||
Read `.archeflow/state.json` (if exists) and report:
|
|
||||||
- Current task, phase, and cycle
|
|
||||||
- Active agents and their status
|
|
||||||
- Findings so far (by severity)
|
|
||||||
- Time elapsed
|
|
||||||
|
|
||||||
### `archeflow:history`
|
Even when skipping ArcheFlow, apply for non-trivial changes:
|
||||||
Read `.archeflow/session-log.md` and show the last 5 orchestration summaries in compact format.
|
1. Restate what you're changing
|
||||||
|
2. Name one assumption
|
||||||
## Skills Reference (All 24)
|
3. Check if it could break anything
|
||||||
|
|
||||||
### Core Orchestration
|
|
||||||
- **archeflow:run** -- Automated PDCA execution loop with `--start-from` and `--dry-run`
|
|
||||||
- **archeflow:orchestration** -- Step-by-step manual execution guide
|
|
||||||
- **archeflow:plan-phase** -- Explorer and Creator output formats and protocols
|
|
||||||
- **archeflow:do-phase** -- Maker implementation rules and worktree commit strategy
|
|
||||||
- **archeflow:check-phase** -- Shared reviewer protocols and output format
|
|
||||||
- **archeflow:act-phase** -- Post-Check decision logic: collect findings, route fixes, exit or cycle
|
|
||||||
|
|
||||||
### Quality and Safety
|
|
||||||
- **archeflow:shadow-detection** -- Quantitative dysfunction detection and correction
|
|
||||||
- **archeflow:attention-filters** -- Context optimization per archetype
|
|
||||||
- **archeflow:convergence** -- Detects convergence, stalling, and oscillation in multi-cycle runs
|
|
||||||
- **archeflow:artifact-routing** -- Inter-phase artifact protocol for naming, storage, and routing
|
|
||||||
|
|
||||||
### Process Intelligence
|
|
||||||
- **archeflow:process-log** -- Event-sourced JSONL logging with DAG parent relationships
|
|
||||||
- **archeflow:memory** -- Cross-run learning from recurring findings
|
|
||||||
- **archeflow:effectiveness** -- Archetype scoring on signal-to-noise, fix rate, cost efficiency
|
|
||||||
- **archeflow:progress** -- Live progress file watchable from a second terminal
|
|
||||||
|
|
||||||
### Integration
|
|
||||||
- **archeflow:colette-bridge** -- Bridges ArcheFlow with the Colette writing platform
|
|
||||||
- **archeflow:git-integration** -- Git-per-phase commits, branch-per-run, rollback
|
|
||||||
- **archeflow:multi-project** -- Cross-repo orchestration with dependency DAG and shared budget
|
|
||||||
|
|
||||||
### Configuration
|
|
||||||
- **archeflow:custom-archetypes** -- Create domain-specific roles
|
|
||||||
- **archeflow:workflow-design** -- Design custom workflows with per-phase archetype assignment
|
|
||||||
- **archeflow:domains** -- Domain adapters for writing, research, and non-code workflows
|
|
||||||
- **archeflow:cost-tracking** -- Budget enforcement and model tier recommendations
|
|
||||||
- **archeflow:templates** -- Template gallery for sharing workflows, teams, and setup bundles
|
|
||||||
- **archeflow:autonomous-mode** -- Unattended overnight sessions
|
|
||||||
|
|
||||||
### Meta
|
|
||||||
- **archeflow:using-archeflow** -- This skill: session-start activation and quick reference
|
|
||||||
|
|||||||
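The domain auto-detection rule from the session-start text (`writing` if `colette.yaml` exists, `research` if paper/thesis files exist, `code` otherwise) could look roughly like the Python sketch below. The source does not say what counts as a "paper/thesis file", so the glob patterns here are an assumption chosen for illustration, and `detect_domain` and `activation_line` are hypothetical names.

```python
from pathlib import Path

# ASSUMPTION: the source only says "paper/thesis files"; these patterns
# are illustrative guesses, not ArcheFlow's actual detection rules.
RESEARCH_PATTERNS = ("paper*", "thesis*")


def detect_domain(project_root):
    """Pick a domain by checking the project root, in priority order."""
    root = Path(project_root)
    if (root / "colette.yaml").is_file():
        return "writing"
    if any(any(root.glob(pattern)) for pattern in RESEARCH_PATTERNS):
        return "research"
    return "code"


def activation_line(domain, version="0.8.0", skills=19):
    """Format the single line printed on activation."""
    return f"archeflow v{version} · {skills} skills · {domain} domain"
```

The checks run in priority order, so a project that contains both `colette.yaml` and a thesis file is treated as `writing`.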
@@ -1,248 +1,70 @@
|
|||||||
---
|
---
|
||||||
name: workflow-design
|
name: workflow-design
|
||||||
description: Use when designing custom orchestration workflows — choosing which archetypes run in each PDCA phase, setting exit conditions, and configuring PDCA cycles.
|
description: Use when designing custom orchestration workflows -- choosing which archetypes run in each PDCA phase, setting exit conditions, and configuring PDCA cycles.
|
||||||
---
|
---
|
||||||
|
|
||||||
# Workflow Design — PDCA Cycles
|
# Workflow Design -- PDCA Cycles
|
||||||
|
|
||||||
ArcheFlow's PDCA cycles spiral upward through iterations — each cycle incorporates feedback from the previous one, producing progressively better results. Each cycle incorporates feedback from the previous one.
|
PDCA cycles spiral upward: each cycle incorporates feedback from the previous one.
|
||||||
|
|
||||||
```
|
|
||||||
╱ Act ──────────── Done ✓
|
|
||||||
╱ ↑
|
|
||||||
╱ Check (review)
|
|
||||||
╱ ↑
|
|
||||||
╱ Do (implement)
|
|
||||||
╱ ↑
|
|
||||||
╱ Plan (design) ← Cycle 2 (with feedback from Cycle 1)
|
|
||||||
╱ ↑
|
|
||||||
╱ Act ─┘ (issues found → feed back)
|
|
||||||
│ ↑
|
|
||||||
│ Check (review)
|
|
||||||
│ ↑
|
|
||||||
│ Do (implement)
|
|
||||||
│ ↑
|
|
||||||
│ Plan (design) ← Cycle 1 (initial)
|
|
||||||
```
|
|
||||||
|
|
||||||
## Strategy vs Workflow
|
|
||||||
|
|
||||||
A **strategy** defines the execution shape: PDCA is cyclic (Plan-Do-Check-Act with feedback loops), pipeline is linear (Plan-Implement-Review-Verify, no cycle-back). A **workflow** defines the depth: fast uses fewer agents and cycles, thorough uses more. Strategy and workflow are orthogonal — you can run a `fast` workflow with either strategy, though `thorough` always uses PDCA because linear flows cannot iterate on findings.
|
|
||||||
|
|
||||||
## Built-in Workflows
|
## Built-in Workflows
|
||||||
|
|
||||||
### `fast` — Single Turn
|
| Workflow | Plan | Do | Check | Exit | Max Cycles |
|
||||||
```
|
|----------|------|----|-------|------|------------|
|
||||||
Plan: Creator designs
|
| `fast` | Creator | Maker | Guardian | approve/reject | 1 |
|
||||||
Do: Maker implements (worktree)
|
| `standard` | Explorer + Creator | Maker | Guardian + Skeptic + Sage | all_approved | 2 |
|
||||||
Check: Guardian reviews
|
| `thorough` | Explorer + Creator | Maker | Guardian + Skeptic + Sage + Trickster | all_approved | 3 |
|
||||||
Act: Approve or reject (1 cycle max)
|
|
||||||
```
|
|
||||||
**Use for:** Bug fixes, small changes, low-risk tasks.
|
|
||||||
|
|
||||||
### `standard` — Two Cycles
|
|
||||||
```
|
|
||||||
Plan: Explorer researches → Creator designs
|
|
||||||
Do: Maker implements (worktree)
|
|
||||||
Check: Guardian + Skeptic + Sage review (parallel)
|
|
||||||
Act: Approve or cycle (2 cycles max)
|
|
||||||
```
|
|
||||||
**Use for:** Features, refactors, moderate-risk changes.
|
|
||||||
|
|
||||||
### `thorough` — Three Cycles
|
|
||||||
```
|
|
||||||
Plan: Explorer researches → Creator designs
|
|
||||||
Do: Maker implements (worktree)
|
|
||||||
Check: Guardian + Skeptic + Sage + Trickster (parallel)
|
|
||||||
Act: Approve or cycle (3 cycles max)
|
|
||||||
```
|
|
||||||
**Use for:** Security-critical, public APIs, infrastructure changes.
|
|
||||||
|
|
||||||
## Designing Custom Workflows
|
## Designing Custom Workflows
|
||||||
|
|
||||||
### Step 1: Identify the Concern
|
**Step 1: Identify the concern**
|
||||||
|
|
||||||
What's the primary risk?
|
| Risk | Emphasize in Check |
|
||||||
|
|------|-------------------|
|
||||||
|
| Security | Guardian + Trickster |
|
||||||
|
| Correctness | Skeptic + Sage |
|
||||||
|
| Performance | Custom `perf-tester` |
|
||||||
|
| Compliance | Custom `compliance-auditor` |
|
||||||
|
| Data integrity | Custom `db-specialist` |
|
||||||
|
|
||||||
| Primary Risk | Emphasize |
|
**Step 2: Phase assignment rules**
|
||||||
|-------------|-----------|
|
- Plan always includes Creator
|
||||||
| Security | Guardian + Trickster in Check |
|
- Do always includes Maker
|
||||||
| Correctness | Skeptic + Sage in Check |
|
- Check needs at least one reviewer
|
||||||
| Performance | Custom `perf-tester` archetype |
|
- Max 3 archetypes per phase
|
||||||
| Compliance | Custom `compliance-auditor` archetype |
|
- Explorer goes in Plan only; Maker goes in Do only
|
||||||
| Data integrity | Custom `db-specialist` archetype |
|
|
||||||
| User experience | Custom `ux-reviewer` archetype |
|
|
||||||
|
|
||||||
### Step 2: Assign Phases
|
**Step 3: Exit conditions**
|
||||||
|
|
||||||
Rules:
|
| Condition | Cycle ends when |
|
||||||
- **Plan** always includes Creator (someone must propose)
|
|-----------|----------------|
|
||||||
- **Do** always includes Maker (someone must build)
|
| `all_approved` | Every reviewer says APPROVED |
|
||||||
- **Check** needs at least one reviewer
|
| `no_critical` | No CRITICAL findings |
|
||||||
- Max 3 archetypes per phase (diminishing returns beyond that)
|
| `convergence` | No new issues vs previous cycle |
|
||||||
- Explorer goes in Plan only (research before design)
|
| `always` | Runs all maxCycles unconditionally |
|
||||||
- Maker goes in Do only (build from plan, not from scratch)
|
|
||||||
|
|
||||||
### Step 3: Set Exit Conditions
|
**Step 4: Max cycles** -- 1 (fast), 2 (balanced), 3 (thorough). 4+ rarely useful.
|
||||||
|
|
||||||
| Condition | When Cycle Ends | Best For |
|
|
||||||
|-----------|----------------|----------|
|
|
||||||
| `all_approved` | Every Check reviewer says APPROVED | Consensus-driven (default) |
|
|
||||||
| `no_critical` | No CRITICAL findings in Check output | Speed with safety net |
|
|
||||||
| `convergence` | No new issues vs. previous cycle | Diminishing returns detection |
|
|
||||||
| `always` | Runs all maxCycles unconditionally | Research, exploration |
|
|
||||||
|
|
||||||
### Step 4: Set Max Cycles
|
|
||||||
|
|
||||||
- **1 cycle:** Fast, low-risk (fast workflow)
|
|
||||||
- **2 cycles:** Balanced — one shot + one fix (standard workflow)
|
|
||||||
- **3 cycles:** Thorough — usually converges by cycle 3
|
|
||||||
- **4+ cycles:** Rarely useful. If 3 cycles don't converge, the task needs human input.
|
|
||||||
|
|
||||||
## Example Custom Workflows
|
|
||||||
|
|
||||||
### Security-First
|
|
||||||
```
|
|
||||||
Plan: Explorer (threat modeling) → Creator
|
|
||||||
Do: Maker
|
|
||||||
Check: Guardian + Trickster (parallel)
|
|
||||||
Exit: all_approved, max 3 cycles
|
|
||||||
```
|
|
||||||
|
|
||||||
### Research-Heavy
|
|
||||||
```
|
|
||||||
Plan: Explorer (deep research) → Creator
|
|
||||||
Do: Maker
|
|
||||||
Check: Skeptic + Sage (parallel)
|
|
||||||
Exit: all_approved, max 2 cycles
|
|
||||||
```
|
|
||||||
|
|
||||||
### Domain-Specific (with custom archetypes)
|
|
||||||
```
|
|
||||||
Plan: Explorer → Creator
|
|
||||||
Do: Maker
|
|
||||||
Check: Guardian + db-specialist + compliance-auditor (parallel)
|
|
||||||
Exit: all_approved, max 2 cycles
|
|
||||||
```
|
|
||||||
|
|
||||||
### Minimal Validation
|
|
||||||
```
|
|
||||||
Plan: Creator (no research)
|
|
||||||
Do: Maker
|
|
||||||
Check: Guardian
|
|
||||||
Exit: no_critical, max 1 cycle
|
|
||||||
```
|
|
||||||
|
|
||||||
## Hook Points
|
## Hook Points
|
||||||
|
|
||||||
Add project-specific validation at key moments in the PDCA cycle. Define hooks in `.archeflow/hooks.yaml`:
|
Define in `.archeflow/hooks.yaml`:
|
||||||
|
|
||||||
```yaml
|
| Hook | When | Typical use |
|
||||||
# .archeflow/hooks.yaml
|
|
||||||
pre-plan:
|
|
||||||
- command: "npm run lint"
|
|
||||||
description: "Ensure clean baseline before planning"
|
|
||||||
fail_action: abort # abort | warn | ignore
|
|
||||||
|
|
||||||
post-check:
|
|
||||||
- command: "npm test"
|
|
||||||
description: "Run tests after review to verify reviewer suggestions"
|
|
||||||
fail_action: cycle_back
|
|
||||||
|
|
||||||
pre-merge:
|
|
||||||
- command: "./scripts/check-migrations.sh"
|
|
||||||
description: "Verify migration safety before merging"
|
|
||||||
fail_action: abort
|
|
||||||
|
|
||||||
post-merge:
|
|
||||||
- command: "npm run integration-test"
|
|
||||||
description: "Full integration test after merge"
|
|
||||||
fail_action: revert
|
|
||||||
```
|
|
||||||
|
|
||||||
**Available hook points:**
|
|
||||||
| Hook | When | Typical Use |
|
|
||||||
|------|------|-------------|
|
|------|------|-------------|
|
||||||
| `pre-plan` | Before Explorer/Creator start | Lint, ensure clean baseline |
|
| `pre-plan` | Before Explorer/Creator | Lint, clean baseline |
|
||||||
| `post-plan` | After Creator's proposal | Validate proposal against constraints |
|
| `post-plan` | After Creator's proposal | Validate constraints |
|
||||||
| `pre-do` | Before Maker starts | Check worktree setup |
|
| `pre-do` | Before Maker | Check worktree |
|
||||||
| `post-do` | After Maker commits | Quick smoke test |
|
| `post-do` | After Maker commits | Smoke test |
|
||||||
| `post-check` | After reviewers finish | Run test suite |
|
| `post-check` | After reviewers | Run test suite |
|
||||||
| `pre-merge` | Before merging to main | Migration safety, API compatibility |
|
| `pre-merge` | Before merge | Migration safety |
|
||||||
| `post-merge` | After merge completes | Integration tests, deploy checks |
|
| `post-merge` | After merge | Integration tests |
|
||||||
|
|
||||||
## Workflow Template Library
|
Each hook has `command`, `description`, and `fail_action` (abort / warn / ignore / cycle_back / revert).
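
One way to read the `fail_action` field is as a mapping from a hook's exit code to what the orchestrator does next. The sketch below is an assumption, not archeflow's implementation: `resolve_outcome` and `run_hook` are hypothetical names, and the exact semantics of each action (for example, that `warn` and `ignore` both let the cycle continue) are inferred from the action names.

```python
import subprocess

# The five values listed for `fail_action`.
FAIL_ACTIONS = {"abort", "warn", "ignore", "cycle_back", "revert"}


def resolve_outcome(exit_code: int, fail_action: str) -> str:
    """Map a hook's exit code and fail_action to the next step.

    Returns "continue" on success or for non-blocking actions; otherwise
    returns the fail_action itself ("abort", "cycle_back", or "revert").
    """
    if fail_action not in FAIL_ACTIONS:
        raise ValueError(f"unknown fail_action: {fail_action}")
    if exit_code == 0 or fail_action in ("warn", "ignore"):
        return "continue"
    return fail_action


def run_hook(hook: dict) -> str:
    """Run one hook entry (a dict with `command`, `description`, `fail_action`)."""
    result = subprocess.run(hook["command"], shell=True)
    if result.returncode != 0 and hook["fail_action"] == "warn":
        print(f"warning: hook failed: {hook.get('description', hook['command'])}")
    return resolve_outcome(result.returncode, hook["fail_action"])
```

Keeping the decision in a pure function separate from the subprocess call makes the mapping easy to test without running any commands.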
|
||||||
|
|
||||||
Pre-built workflows for common scenarios. Use as-is or as starting points for custom workflows.
|
|
||||||
|
|
||||||
### API Design
|
|
||||||
```yaml
|
|
||||||
name: api-design
|
|
||||||
description: New or changed API endpoints
|
|
||||||
plan: [explorer, creator]
|
|
||||||
do: [maker]
|
|
||||||
check: [guardian, skeptic] # Guardian for security, Skeptic for API design assumptions
|
|
||||||
exit: all_approved
|
|
||||||
max_cycles: 2
|
|
||||||
hooks:
|
|
||||||
post-check: "npm run api-compatibility-check"
|
|
||||||
```
|
|
||||||
|
|
||||||
### Database Migration
|
|
||||||
```yaml
|
|
||||||
name: migration
|
|
||||||
description: Schema changes and data migrations
|
|
||||||
plan: [explorer, creator]
|
|
||||||
do: [maker]
|
|
||||||
check: [guardian, db-specialist] # Requires custom db-specialist archetype
|
|
||||||
exit: all_approved
|
|
||||||
max_cycles: 2
|
|
||||||
hooks:
|
|
||||||
pre-merge: "./scripts/check-migration-reversibility.sh"
|
|
||||||
```
|
|
||||||
|
|
||||||
### Dependency Upgrade
|
|
||||||
```yaml
|
|
||||||
name: dep-upgrade
|
|
||||||
description: Upgrading dependencies (major versions, security patches)
|
|
||||||
plan: [creator] # No Explorer needed — changelog is the research
|
|
||||||
do: [maker]
|
|
||||||
check: [guardian]
|
|
||||||
exit: no_critical
|
|
||||||
max_cycles: 1
|
|
||||||
hooks:
|
|
||||||
post-do: "npm audit"
|
|
||||||
post-merge: "npm test && npm run e2e"
|
|
||||||
```
|
|
||||||
|
|
||||||
### Documentation Rewrite
|
|
||||||
```yaml
|
|
||||||
name: docs-rewrite
|
|
||||||
description: Major documentation changes
|
|
||||||
plan: [explorer, creator]
|
|
||||||
do: [maker]
|
|
||||||
check: [sage] # Quality/consistency only — no security review needed
|
|
||||||
exit: all_approved
|
|
||||||
max_cycles: 1
|
|
||||||
```
|
|
||||||
|
|
||||||
### Hotfix
|
|
||||||
```yaml
|
|
||||||
name: hotfix
|
|
||||||
description: Emergency production fix
|
|
||||||
plan: [creator]
|
|
||||||
do: [maker]
|
|
||||||
check: [guardian]
|
|
||||||
exit: no_critical
|
|
||||||
max_cycles: 1
|
|
||||||
hooks:
|
|
||||||
post-merge: "npm test"
|
|
||||||
```
|
|
||||||
|
|
||||||
## Anti-Patterns
|
## Anti-Patterns
|
||||||
|
|
||||||
- **Kitchen sink:** Putting all 7 archetypes in Check. Most can't add value simultaneously.
|
- All 7 archetypes in Check (diminishing returns)
|
||||||
- **Runaway cycles:** maxCycles > 4 burns tokens without convergence.
|
- maxCycles > 4 (burns tokens without convergence)
|
||||||
- **Reviewerless Do:** Skipping Check phase "to save time." You'll pay in bugs.
|
- Skipping Check phase
|
||||||
- **Maker in Plan:** Maker should implement from a proposal, not design on the fly.
|
- Maker in Plan phase
|
||||||
- **Solo orchestration:** One archetype in every phase. That's just a single agent with extra steps.
|
- One archetype in every phase (just a single agent with overhead)
|
||||||
|
|||||||