Compare commits
8 Commits
9b2b4b3527
...
6854e858a4
| Author | SHA1 | Date | |
|---|---|---|---|
| 6854e858a4 | |||
| 44f0896e3c | |||
| cfd3267272 | |||
| 29762a8464 | |||
| a6dcd2c956 | |||
| 516fe11710 | |||
| f10e853d8e | |||
| eabf13b9b0 |
@@ -1,7 +1,10 @@
|
||||
# ArcheFlow Configuration
|
||||
# Copy to your project's .archeflow/config.yaml and customize
|
||||
|
||||
version: "0.6.0"
|
||||
version: "0.7.0"
|
||||
|
||||
# Strategy — execution shape: pdca (cyclic), pipeline (linear), auto (task-based selection)
|
||||
strategy: auto
|
||||
|
||||
# Budget
|
||||
costs:
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
{
|
||||
"name": "archeflow",
|
||||
"description": "Multi-agent orchestration with Jungian archetypes. PDCA quality cycles, shadow detection, git worktree isolation. Zero dependencies — works with any Claude Code session.",
|
||||
"version": "0.6.0",
|
||||
"version": "0.7.0",
|
||||
"author": {
|
||||
"name": "Chris Nennemann"
|
||||
},
|
||||
|
||||
10
CHANGELOG.md
10
CHANGELOG.md
@@ -2,6 +2,16 @@
|
||||
|
||||
All notable changes to ArcheFlow are documented in this file.
|
||||
|
||||
## [0.7.0] -- 2026-04-04
|
||||
|
||||
### Added
|
||||
- Context isolation protocol in attention-filters skill and all 7 agent personas — agents receive only orchestrator-constructed context, no session bleed or cross-agent contamination
|
||||
- Structured status tokens (`STATUS: DONE`, `DONE_WITH_CONCERNS`, `NEEDS_CONTEXT`, `BLOCKED`) for all agents with orchestrator parsing protocol in run skill
|
||||
- Evidence-gated verification in check-phase — CRITICAL/WARNING findings require concrete evidence (command output, code citations, reproduction steps); banned speculative phrases auto-downgrade to INFO
|
||||
- Plan granularity constraint in plan-phase and Creator — each change item must be a 2-5 minute task with exact file path, code block, and verify command
|
||||
- Strategy abstraction with `pdca` (cyclic) and `pipeline` (linear) execution strategies, auto-selection by task type, and pipeline execution flow in run skill
|
||||
- Experimental status and interdisciplinary framing in README
|
||||
|
||||
## [0.6.0] -- 2026-04-04
|
||||
|
||||
### Added
|
||||
|
||||
@@ -4,6 +4,11 @@
|
||||
|
||||
Zero dependencies. No build step. Install and go.
|
||||
|
||||
> **Status: Experimental.** ArcheFlow is a research prototype exploring the intersection of
|
||||
> analytical psychology (Jungian archetypes), process engineering (Deming's PDCA cycles), and
|
||||
> multi-agent software engineering. It is functional and actively developed, but not production-ready.
|
||||
> APIs, skill formats, and orchestration behavior may change between versions.
|
||||
|
||||
## What It Does
|
||||
|
||||
Large coding tasks benefit from multiple perspectives, but "just spawn more agents" creates chaos. Agents duplicate work, miss each other's output, argue in circles, or go rogue. The problem is not intelligence -- it is coordination.
|
||||
|
||||
@@ -46,8 +46,16 @@ For the full output format (including Mini-Reflect, Alternatives Considered, and
|
||||
| <option B> | <reason> |
|
||||
|
||||
### Changes
|
||||
1. **`path/file.ext`** — What changes and why
|
||||
1. **`path/file.ext:line`** — What changes and why
|
||||
```language
|
||||
<target code state>
|
||||
```
|
||||
**Verify:** `<command to confirm correctness>`
|
||||
2. **`path/test.ext`** — What tests to add
|
||||
```language
|
||||
<test code>
|
||||
```
|
||||
**Verify:** `<test command>`
|
||||
|
||||
### Test Strategy
|
||||
- <specific test cases>
|
||||
@@ -67,11 +75,24 @@ For the full output format (including Mini-Reflect, Alternatives Considered, and
|
||||
```
|
||||
|
||||
## Rules
|
||||
- **Context isolation:** You receive only what the orchestrator provides. Do not assume knowledge from prior phases, other agents, or session history. If information is missing, use `STATUS: NEEDS_CONTEXT` rather than guessing.
|
||||
- Be decisive. One proposal, not three alternatives (but list alternatives you rejected).
|
||||
- Name every file. The Maker needs exact paths.
|
||||
- Scope ruthlessly. Adjacent problems go under "Not Doing."
|
||||
- Include test strategy. No proposal is complete without it.
|
||||
- **Granularity:** Each change item must be a 2-5 minute task with exact file path, code block showing the target state, and a verify command. If an item would take >5 minutes, split it. If a non-trivial task has <2 items, you under-specified.
|
||||
- Any Confidence axis < 0.5? Flag it — the orchestrator may pause or escalate.
|
||||
|
||||
## Status Token
|
||||
|
||||
End your output with exactly one status line:
|
||||
|
||||
- `STATUS: DONE` — proposal ready with confidence scores
|
||||
- `STATUS: DONE_WITH_CONCERNS` — proposal ready but low confidence on one or more axes
|
||||
- `STATUS: NEEDS_CONTEXT` — cannot proceed without additional information (describe what is missing)
|
||||
- `STATUS: BLOCKED` — unresolvable obstacle (describe it)
|
||||
|
||||
This line MUST be the last non-empty line of your output.
|
||||
|
||||
## Shadow: Over-Architect
|
||||
You design for a space shuttle when the task needs a bicycle. Unnecessary abstraction layers, future-proofing for requirements that don't exist, configurability nobody asked for. If the proposal has more infrastructure than business logic — simplify. Design for the current order of magnitude, not 100x.
|
||||
|
||||
@@ -45,9 +45,21 @@ You see the landscape before anyone acts. You map dependencies, spot existing pa
|
||||
```
|
||||
|
||||
## Rules
|
||||
- **Context isolation:** You receive only what the orchestrator provides. Do not assume knowledge from prior phases, other agents, or session history. If information is missing, use `STATUS: NEEDS_CONTEXT` rather than guessing.
|
||||
- Synthesize, don't dump. Raw file lists are useless.
|
||||
- Stay focused on the task. Interesting tangents go in a "See Also" footnote, not the main report.
|
||||
- Cap your research at 15 files. If you need more, the task is too broad.
|
||||
|
||||
## Status Token
|
||||
|
||||
End your output with exactly one status line:
|
||||
|
||||
- `STATUS: DONE` — research complete, findings ready
|
||||
- `STATUS: DONE_WITH_CONCERNS` — research complete but gaps remain (noted in output)
|
||||
- `STATUS: NEEDS_CONTEXT` — cannot proceed without additional information (describe what is missing)
|
||||
- `STATUS: BLOCKED` — unresolvable obstacle (describe it)
|
||||
|
||||
This line MUST be the last non-empty line of your output.
|
||||
|
||||
## Shadow: Rabbit Hole
|
||||
Your curiosity becomes compulsive investigation. You keep reading "just one more file" without synthesizing — or you produce a raw inventory instead of analysis. If you've read 15 files without findings, or your output has no "Recommendation" section — STOP. Synthesize what you have. A dump is not research. Good-enough now beats perfect never.
|
||||
|
||||
@@ -36,9 +36,22 @@ You see attack surfaces others walk past. You calibrate your response to actual
|
||||
- **INFO** — Minor hardening opportunity.
|
||||
|
||||
## Rules
|
||||
- **Context isolation:** You receive only what the orchestrator provides. Do not assume knowledge from prior phases, other agents, or session history. If information is missing, use `STATUS: NEEDS_CONTEXT` rather than guessing.
|
||||
- APPROVED = zero CRITICAL findings
|
||||
- Every finding needs a suggested fix, not just a complaint
|
||||
- **Evidence required:** Every CRITICAL or WARNING must cite a specific command output, exit code, or exact code with file path and line numbers. Findings without evidence are downgraded to INFO by the orchestrator.
|
||||
- Be rigorous but practical — flag real risks, not science fiction
|
||||
|
||||
## Status Token
|
||||
|
||||
End your output with exactly one status line:
|
||||
|
||||
- `STATUS: DONE` — review complete, verdict and findings ready
|
||||
- `STATUS: DONE_WITH_CONCERNS` — review complete but some areas could not be fully assessed
|
||||
- `STATUS: NEEDS_CONTEXT` — cannot proceed without additional information (describe what is missing)
|
||||
- `STATUS: BLOCKED` — unresolvable obstacle (describe it)
|
||||
|
||||
This line MUST be the last non-empty line of your output.
|
||||
|
||||
## Shadow: Paranoid
|
||||
Your risk awareness becomes blocking everything. Every finding is CRITICAL, every risk is existential, and you reject without suggesting how to fix it. Ask: "Would a senior engineer block this PR for this?" If no, downgrade. Every rejection MUST include a specific fix — if you can't suggest one, you don't understand the problem well enough to reject.
|
||||
|
||||
@@ -45,6 +45,7 @@ You turn plans into working, tested, committed code. Small steps, steady progres
|
||||
```
|
||||
|
||||
## Rules
|
||||
- **Context isolation:** You receive only what the orchestrator provides. Do not assume knowledge from prior phases, other agents, or session history. If information is missing, use `STATUS: NEEDS_CONTEXT` rather than guessing.
|
||||
- **Isolation:** Always spawn with `isolation: "worktree"` to work in a dedicated git worktree.
|
||||
- Follow the proposal. Don't redesign.
|
||||
- Tests before implementation. Always.
|
||||
@@ -53,5 +54,16 @@ You turn plans into working, tested, committed code. Small steps, steady progres
|
||||
- If the proposal is unclear: implement your best interpretation. Note what you assumed.
|
||||
- If you find a blocker: document it and stop. Don't silently work around it.
|
||||
|
||||
## Status Token
|
||||
|
||||
End your output with exactly one status line:
|
||||
|
||||
- `STATUS: DONE` — implementation complete, all commits made
|
||||
- `STATUS: DONE_WITH_CONCERNS` — implementation complete but assumptions were made (noted in output)
|
||||
- `STATUS: NEEDS_CONTEXT` — cannot proceed without additional information (describe what is missing)
|
||||
- `STATUS: BLOCKED` — unresolvable obstacle (describe it)
|
||||
|
||||
This line MUST be the last non-empty line of your output.
|
||||
|
||||
## Shadow: Rogue
|
||||
Your bias for action becomes reckless shipping. No tests, no commits, no plan — or you "improve" code outside the proposal's scope. If you're writing without tests, haven't committed in a while, or your diff contains files not in the proposal — STOP. Read the proposal. Write a test. Commit. Revert extras.
|
||||
|
||||
@@ -46,10 +46,23 @@ You see the forest, not just the trees. "Will a new team member understand this
|
||||
- Are existing docs/comments still accurate after the change?
|
||||
|
||||
## Rules
|
||||
- **Context isolation:** You receive only what the orchestrator provides. Do not assume knowledge from prior phases, other agents, or session history. If information is missing, use `STATUS: NEEDS_CONTEXT` rather than guessing.
|
||||
- APPROVED = code is readable, tested, consistent, and complete
|
||||
- REJECTED = significant quality issues that affect maintainability
|
||||
- **Evidence required:** Quality findings must cite specific code (file:line, exact construct) or measurable criteria. Do not raise vague suggestions — if you cannot point to the code, do not raise the finding.
|
||||
- Focus on the next 6 months. Not the next 6 years.
|
||||
- Your review should be shorter than the code change. If it's not, you're over-reviewing.
|
||||
|
||||
## Status Token
|
||||
|
||||
End your output with exactly one status line:
|
||||
|
||||
- `STATUS: DONE` — review complete, verdict and findings ready
|
||||
- `STATUS: DONE_WITH_CONCERNS` — review complete but some quality dimensions could not be assessed
|
||||
- `STATUS: NEEDS_CONTEXT` — cannot proceed without additional information (describe what is missing)
|
||||
- `STATUS: BLOCKED` — unresolvable obstacle (describe it)
|
||||
|
||||
This line MUST be the last non-empty line of your output.
|
||||
|
||||
## Shadow: Bureaucrat
|
||||
Your thoroughness becomes bloat. Your review is longer than the code change, you're suggesting improvements to untouched code, or producing deep-sounding analysis without actionable findings. If you can't state the consequence of NOT fixing it, don't raise it. If a finding doesn't end with a specific action, delete it. Insight without action is noise.
|
||||
|
||||
@@ -33,11 +33,24 @@ You make the implicit explicit. "The plan assumes X — but does X actually hold
|
||||
```
|
||||
|
||||
## Rules
|
||||
- **Context isolation:** You receive only what the orchestrator provides. Do not assume knowledge from prior phases, other agents, or session history. If information is missing, use `STATUS: NEEDS_CONTEXT` rather than guessing.
|
||||
- Every challenge MUST include an alternative. "This might not work" alone is not helpful.
|
||||
- Limit to 3-5 challenges. More than 7 is shadow behavior.
|
||||
- **Evidence required:** Every challenge must reference specific code (file:line) or describe a concrete scenario with reproduction steps. Vague concerns without evidence are downgraded to INFO by the orchestrator.
|
||||
- Stay in scope. Challenge the task's assumptions, not the universe's.
|
||||
- APPROVED = no fundamental design flaws
|
||||
- REJECTED = the approach is wrong, and you have a better one
|
||||
|
||||
## Status Token
|
||||
|
||||
End your output with exactly one status line:
|
||||
|
||||
- `STATUS: DONE` — review complete, verdict and findings ready
|
||||
- `STATUS: DONE_WITH_CONCERNS` — review complete but some assumptions could not be verified
|
||||
- `STATUS: NEEDS_CONTEXT` — cannot proceed without additional information (describe what is missing)
|
||||
- `STATUS: BLOCKED` — unresolvable obstacle (describe it)
|
||||
|
||||
This line MUST be the last non-empty line of your output.
|
||||
|
||||
## Shadow: Paralytic
|
||||
Your critical thinking becomes inability to approve anything. You list 7+ challenges, chain "what about X?" tangents, or question things outside the task — each plausible alone, none actionable together. STOP. Rank by impact. Keep top 3. Each must include an alternative. Delete the rest.
|
||||
|
||||
@@ -39,10 +39,22 @@ You think like an attacker, a clumsy user, a failing network. You find the edges
|
||||
```
|
||||
|
||||
## Rules
|
||||
- **Context isolation:** You receive only what the orchestrator provides. Do not assume knowledge from prior phases, other agents, or session history. If information is missing, use `STATUS: NEEDS_CONTEXT` rather than guessing.
|
||||
- Test ONLY the changed code, not the entire system
|
||||
- Every finding needs exact reproduction steps
|
||||
- If you can't break it after 5 serious attempts — APPROVED. The code is resilient.
|
||||
- Constructive chaos only. Your goal is quality, not destruction.
|
||||
|
||||
## Status Token
|
||||
|
||||
End your output with exactly one status line:
|
||||
|
||||
- `STATUS: DONE` — review complete, verdict and findings ready
|
||||
- `STATUS: DONE_WITH_CONCERNS` — testing complete but some attack vectors could not be exercised
|
||||
- `STATUS: NEEDS_CONTEXT` — cannot proceed without additional information (describe what is missing)
|
||||
- `STATUS: BLOCKED` — unresolvable obstacle (describe it)
|
||||
|
||||
This line MUST be the last non-empty line of your output.
|
||||
|
||||
## Shadow: False Alarm
|
||||
You flood with low-signal findings. Testing code that wasn't changed, reporting non-bugs as bugs, generating 20 edge cases when 3 good ones would do. If your findings reference files not in the Maker's diff — delete them. Quality over quantity. Three real findings beat twenty noise.
|
||||
|
||||
@@ -2,6 +2,14 @@
|
||||
|
||||
## Completed
|
||||
|
||||
### v0.7.0 (2026-04-04)
|
||||
- [x] Context isolation protocol for attention filters and all agent personas
|
||||
- [x] Structured status tokens with orchestrator parsing protocol
|
||||
- [x] Evidence-gated verification with banned phrases and auto-downgrade
|
||||
- [x] Plan granularity constraint (2-5 min tasks with file path, code block, verify command)
|
||||
- [x] Strategy abstraction (PDCA cyclic, pipeline linear, auto-selection)
|
||||
- [x] Experimental status and interdisciplinary framing in README
|
||||
|
||||
### v0.6.0 (2026-04-04)
|
||||
- [x] Expanded attention-filters skill (prompt templates, token budgets, cycle-back filtering, verification checklist)
|
||||
- [x] Explorer skip heuristic in plan-phase skill
|
||||
@@ -74,6 +82,7 @@
|
||||
|
||||
| Date | Version | Changes |
|
||||
|------|---------|---------|
|
||||
| 2026-04-04 | v0.7.0 | Process rigor: context isolation, status tokens, evidence-gated verification, plan granularity, strategy abstraction |
|
||||
| 2026-04-04 | v0.6.0 | Quality/polish: expanded attention filters, Explorer skip heuristic, agent persona normalization, quickstart example |
|
||||
| 2026-04-04 | v0.5.0 | Robustness: lib validation, hook points, phase rollback, per-workflow models, regression detection, parallel reviewers |
|
||||
| 2026-04-04 | v0.4.0 | Confidence gates, mini-Explorer, worktree merge flow, rollback script, test-first gate, memory audit |
|
||||
|
||||
@@ -107,3 +107,15 @@ Before spawning each agent, verify:
|
||||
- [ ] Token count is within 20% of the target for the current workflow tier
|
||||
- [ ] Prior-cycle feedback (if any) is summarized, not raw
|
||||
- [ ] Excluded artifacts are genuinely absent (search for keywords like file paths from excluded sources)
|
||||
|
||||
## Context Isolation
|
||||
|
||||
Attention filters control *what* each agent receives. Context isolation controls *how* that context is constructed — ensuring agents operate on provided facts, not ambient knowledge.
|
||||
|
||||
### Rules
|
||||
|
||||
1. **No session bleed.** Agents receive fresh context only — constructed from task description, artifact files, or extracted sections. They must not inherit session state, chat history, or prior agent prompts.
|
||||
2. **No cross-agent contamination.** An agent receives another agent's output only if the attention filter table above explicitly allows it. Guardian does not see Skeptic's output. Skeptic does not see the Maker's diff. Violations produce unreliable reviews.
|
||||
3. **Controller-constructed only.** All agent context is assembled by the orchestrator from: (a) the task description, (b) artifact files on disk, or (c) extracted sections of those artifacts. Agents never pull their own context.
|
||||
4. **No ambient knowledge.** Agents cannot "remember" findings from prior phases or cycles unless that information is explicitly injected via the cycle-back filtering protocol above. An agent that references information not in its prompt is hallucinating.
|
||||
5. **Verification.** Before spawning each agent, confirm the constructed prompt has zero references to other agents' raw outputs that are not in the "Receives" column. Search for file paths, archetype names, and finding descriptions from excluded sources.
|
||||
|
||||
@@ -13,6 +13,7 @@ Multiple reviewers examine the Maker's implementation in parallel. Each agent de
|
||||
2. **Read the actual code.** Use `git diff` on the Maker's branch. Don't review descriptions alone.
|
||||
3. **Structured findings.** Use the standardized finding format below for every issue.
|
||||
4. **Clear verdict:** `APPROVED` or `REJECTED` with rationale.
|
||||
5. **Status tokens are separate from verdicts.** The `STATUS: DONE` line signals the agent finished successfully. The `APPROVED`/`REJECTED` verdict is domain output. A reviewer can be `STATUS: DONE` with verdict `REJECTED` — that is normal. Parse both independently.
|
||||
|
||||
## Finding Format
|
||||
|
||||
@@ -178,6 +179,51 @@ When the Act phase routes findings back to the Maker and the Maker applies fixes
|
||||
|
||||
---
|
||||
|
||||
## Evidence Requirements
|
||||
|
||||
Every CRITICAL or WARNING finding must include concrete evidence. Findings without evidence are downgraded to INFO.
|
||||
|
||||
### Evidence Types
|
||||
|
||||
| Type | Example | When Required |
|
||||
|------|---------|---------------|
|
||||
| Command output | `npm test` output showing failure | Test-related findings |
|
||||
| Exit code | `exit code 1 from eslint` | Tool-based validation |
|
||||
| Code citation | `src/auth.ts:48 — \`if (token) { ... }\`` | Logic or security findings |
|
||||
| Git diff | `+ db.query(userInput)` (unsanitized) | Implementation review |
|
||||
| Reproduction steps | "1. Send POST with empty body, 2. Observe 500" | Runtime behavior findings |
|
||||
|
||||
### Banned Phrases
|
||||
|
||||
The following phrases are not permitted in CRITICAL or WARNING findings. They indicate speculation, not evidence:
|
||||
|
||||
- "might be"
|
||||
- "could potentially"
|
||||
- "appears to"
|
||||
- "seems like"
|
||||
- "may not"
|
||||
|
||||
A finding using these phrases must either be rewritten with evidence or downgraded to INFO.
|
||||
|
||||
### Verification Protocol
|
||||
|
||||
For each CRITICAL or WARNING finding, state:
|
||||
|
||||
1. **What was tested** — the specific code path, input, or scenario examined
|
||||
2. **What was observed** — the actual behavior or code construct found
|
||||
3. **What correct behavior should be** — the expected alternative
|
||||
|
||||
### Downgrade Rule
|
||||
|
||||
If a reviewer produces a CRITICAL or WARNING finding without any of the evidence types above, the orchestrator downgrades it to INFO and emits a `decision` event:
|
||||
|
||||
```bash
|
||||
./lib/archeflow-event.sh "$RUN_ID" decision check "" \
|
||||
'{"what":"evidence_downgrade","from":"CRITICAL","to":"INFO","finding":"<description>","reviewer":"<archetype>","reason":"no evidence provided"}'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Why Structured Findings Matter
|
||||
|
||||
The standardized format enables:
|
||||
|
||||
@@ -7,6 +7,58 @@ description: Use when executing a multi-agent orchestration — spawning archety
|
||||
|
||||
This skill guides you through running a full ArcheFlow orchestration using Claude Code's native Agent tool and git worktrees.
|
||||
|
||||
## Strategy Selection
|
||||
|
||||
A **strategy** defines the shape of an orchestration run — which phases execute, in what order, and when to iterate. A **workflow** (fast/standard/thorough) controls the depth within a strategy.
|
||||
|
||||
### Available Strategies
|
||||
|
||||
| Strategy | Flow | When to Use |
|
||||
|----------|------|-------------|
|
||||
| `pdca` | Plan -> Do -> Check -> Act (cyclic) | Refactors, thorough reviews, multi-concern tasks |
|
||||
| `pipeline` | Plan -> Implement -> Spec-Review -> Quality-Review -> Verify (linear) | Bug fixes, fast patches, single-concern tasks |
|
||||
| `auto` | Selected by task analysis | Default — let ArcheFlow decide |
|
||||
|
||||
### Strategy Interface
|
||||
|
||||
Every strategy defines:
|
||||
|
||||
- **Phases** — ordered list of execution stages
|
||||
- **Agent mapping** — which archetypes run in each phase
|
||||
- **Transition rules** — conditions for moving between phases
|
||||
- **Iteration model** — cyclic (PDCA) or linear (pipeline)
|
||||
- **Exit conditions** — when the run terminates
|
||||
|
||||
### PDCA Strategy
|
||||
|
||||
The existing orchestration flow (Steps 0-4 below). Cyclic — the Act phase can feed back to Plan for another iteration. Best for tasks requiring multiple review perspectives and iterative refinement.
|
||||
|
||||
### Pipeline Strategy
|
||||
|
||||
Linear flow with no cycle-back. Faster for well-understood tasks where one pass is sufficient.
|
||||
|
||||
| Phase | Agent | Purpose |
|
||||
|-------|-------|---------|
|
||||
| Plan | Creator | Design proposal |
|
||||
| Implement | Maker | Build in worktree |
|
||||
| Spec-Review | Guardian, then Skeptic | Security + assumption check (sequential) |
|
||||
| Quality-Review | Sage | Code quality review |
|
||||
| Verify | (automated) | Run tests, apply targeted fix if CRITICAL |
|
||||
|
||||
No cycle-back — WARNINGs are logged but do not block. CRITICALs in Verify trigger a single targeted fix attempt by the Maker, not a full cycle.
|
||||
|
||||
### Auto-Selection Rules
|
||||
|
||||
When `strategy: auto` (default):
|
||||
|
||||
- Task contains "fix", "bug", "patch", "hotfix" → `pipeline`
|
||||
- Task contains "refactor", "redesign", "review" → `pdca`
|
||||
- Workflow is `thorough` → `pdca` (always)
|
||||
- Workflow is `fast` with single file → `pipeline`
|
||||
- Otherwise → `pdca`
|
||||
|
||||
---
|
||||
|
||||
## Step 0: Choose a Workflow
|
||||
|
||||
If `.archeflow/teams/<name>.yaml` exists, the user can reference a team preset: `"Use the backend team"`. Load the preset's phase config instead of built-in defaults. See `archeflow:custom-archetypes` skill for preset format.
|
||||
|
||||
@@ -118,6 +118,46 @@ When the Creator receives structured feedback from a prior cycle, the proposal m
|
||||
|
||||
CRITICAL findings cannot be deferred or disputed — they must be fixed or the proposal will be rejected again.
|
||||
|
||||
## Task Granularity
|
||||
|
||||
Each change item in the Creator's proposal must be a **2-5 minute task** — specific enough that the Maker can implement it without interpretation.
|
||||
|
||||
### Requirements per Change Item
|
||||
|
||||
Every item in the `### Changes` section must include:
|
||||
|
||||
1. **Exact file path** — `src/auth/handler.ts`, not "the auth module"
|
||||
2. **What to change** — a code block showing the target state or transformation
|
||||
3. **How to verify** — a command or check that confirms correctness
|
||||
|
||||
### Good Example
|
||||
|
||||
```markdown
|
||||
1. **`src/auth/handler.ts:48`** — Add input length validation before token processing
|
||||
```typescript
|
||||
if (!token || token.trim().length === 0) {
|
||||
throw new ValidationError('Token must not be empty');
|
||||
}
|
||||
```
|
||||
**Verify:** `npm test -- --grep "empty token"` passes
|
||||
```
|
||||
|
||||
### Bad Example
|
||||
|
||||
```markdown
|
||||
1. **Auth module** — Fix the validation logic
|
||||
```
|
||||
|
||||
This is too vague. Which file? Which function? What does "fix" mean? The Maker will guess.
|
||||
|
||||
### Granularity Check
|
||||
|
||||
- If a single change item would take **>5 minutes**, split it into smaller items
|
||||
- If a non-trivial task has **<2 change items**, it is under-specified — the Creator missed something
|
||||
- Each item should touch **1-2 files** at most. Cross-cutting changes need separate items per file.
|
||||
|
||||
---
|
||||
|
||||
## Explorer Skip Conditions
|
||||
|
||||
Not every task needs Explorer research. Use this decision table:
|
||||
|
||||
@@ -89,12 +89,12 @@ Only show if the user explicitly asks or if `progress.dag_on_complete: true` in
|
||||
When ArcheFlow activates at session start (via the `using-archeflow` skill), show ONE line:
|
||||
|
||||
```
|
||||
archeflow v0.6.0 · 24 skills · writing domain detected
|
||||
archeflow v0.7.0 · 24 skills · writing domain detected
|
||||
```
|
||||
|
||||
Or for code projects:
|
||||
```
|
||||
archeflow v0.6.0 · 24 skills · code domain
|
||||
archeflow v0.7.0 · 24 skills · code domain
|
||||
```
|
||||
|
||||
If ArcheFlow decides NOT to activate (simple task, single file):
|
||||
|
||||
@@ -63,7 +63,40 @@ After emitting `run.start`, record `SEQ_RUN_START=1`.
|
||||
|
||||
If `--start-from` is specified, verify that the required prior artifacts exist in `.archeflow/artifacts/${RUN_ID}/` before skipping phases. If missing, abort with an error.
|
||||
|
||||
#### 0a. Lib Script Validation
|
||||
#### 0a. Strategy Resolution
|
||||
|
||||
Determine the execution strategy before proceeding. Strategy controls the overall flow shape (cyclic vs linear).
|
||||
|
||||
```bash
|
||||
# Read strategy from config or CLI flag
|
||||
STRATEGY=$(grep '^strategy:' "$CONFIG" 2>/dev/null | sed 's/strategy:\s*//' | tr -d '"' | head -1)
|
||||
STRATEGY="${STRATEGY:-auto}"
|
||||
|
||||
# CLI override: --strategy pdca|pipeline
|
||||
# (parsed from invocation args, overrides config)
|
||||
|
||||
# Auto-select logic
|
||||
if [[ "$STRATEGY" == "auto" ]]; then
|
||||
TASK_LOWER=$(echo "$TASK" | tr '[:upper:]' '[:lower:]')
|
||||
if echo "$TASK_LOWER" | grep -qE '(fix|bug|patch|hotfix)'; then
|
||||
STRATEGY="pipeline"
|
||||
elif echo "$TASK_LOWER" | grep -qE '(refactor|redesign|review)'; then
|
||||
STRATEGY="pdca"
|
||||
elif [[ "$WORKFLOW" == "fast" ]]; then
|
||||
STRATEGY="pipeline"
|
||||
elif [[ "$WORKFLOW" == "thorough" ]]; then
|
||||
STRATEGY="pdca"
|
||||
else
|
||||
STRATEGY="pdca"
|
||||
fi
|
||||
fi
|
||||
|
||||
echo "Strategy: $STRATEGY"
|
||||
```
|
||||
|
||||
**Strategy dispatch:** If `STRATEGY=pdca`, execute Steps 1-5 below (existing PDCA flow). If `STRATEGY=pipeline`, skip to the "Pipeline Strategy Execution" section at the end of this skill.
|
||||
|
||||
#### 0b. Lib Script Validation
|
||||
|
||||
Verify that all required library scripts exist and are executable before proceeding. Fail fast if any dependency is missing.
|
||||
|
||||
@@ -102,7 +135,7 @@ if ! command -v jq &>/dev/null; then
|
||||
fi
|
||||
```
|
||||
|
||||
#### 0b. Memory Injection
|
||||
#### 0c. Memory Injection
|
||||
|
||||
Load cross-run memory lessons and inject into agent prompts. Use `--audit` to track which lessons were injected for this run:
|
||||
|
||||
@@ -121,7 +154,7 @@ ${MEMORY_LESSONS}"
|
||||
fi
|
||||
```
|
||||
|
||||
#### 0c. Model Configuration
|
||||
#### 0d. Model Configuration
|
||||
|
||||
Read model assignment from `.archeflow/config.yaml` and resolve the model for each archetype based on the current workflow. Per-workflow overrides take precedence over per-archetype overrides, which take precedence over the default.
|
||||
|
||||
@@ -166,6 +199,31 @@ Use `resolve_model` when spawning each agent to pass the correct model. The reso
|
||||
|
||||
---
|
||||
|
||||
### Status Token Protocol
|
||||
|
||||
Every agent ends its output with a `STATUS:` line. The orchestrator parses this to decide the next action.
|
||||
|
||||
**Parsing:**
|
||||
|
||||
```bash
|
||||
STATUS=$(tail -20 "$AGENT_OUTPUT" | grep -oE 'STATUS: (DONE|DONE_WITH_CONCERNS|NEEDS_CONTEXT|BLOCKED)' | head -1)
|
||||
STATUS="${STATUS#STATUS: }"
|
||||
if [[ -z "$STATUS" ]]; then STATUS="DONE"; fi
|
||||
```
|
||||
|
||||
**Status to action mapping:**
|
||||
|
||||
| Status | Action |
|
||||
|--------|--------|
|
||||
| `DONE` | Proceed to next phase or agent |
|
||||
| `DONE_WITH_CONCERNS` | Log concerns in event data, proceed |
|
||||
| `NEEDS_CONTEXT` | Pause run, request missing information from user |
|
||||
| `BLOCKED` | Abort phase, report blocker to user |
|
||||
|
||||
Include the parsed status in the `agent.complete` event data: `"status":"<STATUS>"`.
|
||||
|
||||
---
|
||||
|
||||
### 1. Plan Phase
|
||||
|
||||
#### 1a. Explorer (if standard or thorough)
|
||||
@@ -452,6 +510,61 @@ Spawn all applicable reviewers in parallel (multiple Agent calls in one message)
|
||||
|
||||
After each returns, emit `review.verdict` and save artifact.
|
||||
|
||||
#### 3c-ii. Evidence Validation
|
||||
|
||||
After all reviewers complete, scan CRITICAL/WARNING findings for two conditions:
|
||||
1. **Banned phrases** — hedged language without evidence
|
||||
2. **Missing evidence** — no command output, code citation, or reproduction steps
|
||||
|
||||
Downgrade unsupported findings to INFO before proceeding to Act.
|
||||
|
||||
```bash
|
||||
BANNED_PHRASES=("might be" "could potentially" "appears to" "seems like" "may not")
|
||||
EVIDENCE_MARKERS=("exit" "output" "line [0-9]" ":[0-9]" "returned" "FAIL" "PASS" "assert")
|
||||
|
||||
for artifact in .archeflow/artifacts/${RUN_ID}/check-*.md; do
|
||||
REVIEWER=$(basename "$artifact" .md | sed 's/check-//')
|
||||
|
||||
# Read findings table rows (CRITICAL and WARNING only)
|
||||
grep -E '\| (CRITICAL|WARNING) \|' "$artifact" 2>/dev/null | while IFS= read -r line; do
|
||||
SEVERITY=$(echo "$line" | grep -oE '(CRITICAL|WARNING)' | head -1)
|
||||
DOWNGRADE_REASON=""
|
||||
|
||||
# Check 1: banned phrases
|
||||
for phrase in "${BANNED_PHRASES[@]}"; do
|
||||
if echo "$line" | grep -qi "$phrase"; then
|
||||
DOWNGRADE_REASON="banned phrase: $phrase"
|
||||
break
|
||||
fi
|
||||
done
|
||||
|
||||
# Check 2: no evidence markers (only if not already flagged)
|
||||
if [[ -z "$DOWNGRADE_REASON" ]]; then
|
||||
HAS_EVIDENCE=false
|
||||
for marker in "${EVIDENCE_MARKERS[@]}"; do
|
||||
if echo "$line" | grep -qiE "$marker"; then
|
||||
HAS_EVIDENCE=true
|
||||
break
|
||||
fi
|
||||
done
|
||||
if [[ "$HAS_EVIDENCE" == "false" ]]; then
|
||||
DOWNGRADE_REASON="no evidence cited"
|
||||
fi
|
||||
fi
|
||||
|
||||
if [[ -n "$DOWNGRADE_REASON" ]]; then
|
||||
echo "EVIDENCE DOWNGRADE: $REVIEWER $SEVERITY finding — $DOWNGRADE_REASON"
|
||||
./lib/archeflow-event.sh "$RUN_ID" decision check "" \
|
||||
'{"what":"evidence_downgrade","from":"'"$SEVERITY"'","to":"INFO","reviewer":"'"$REVIEWER"'","reason":"'"$DOWNGRADE_REASON"'"}'
|
||||
# Note: the orchestrator tracks downgraded findings separately —
|
||||
# do not modify the artifact file (avoids sed corruption on table rows)
|
||||
fi
|
||||
done
|
||||
done
|
||||
```
|
||||
|
||||
**Important:** Downgraded findings are tracked in events, NOT by modifying artifact files. The Act phase reads the decision events to know which findings were downgraded and excludes them from CRITICAL tallies.
|
||||
|
||||
#### 3d. Phase Transition: Check to Act
|
||||
|
||||
Collect all verdict seq numbers for the parent array.
|
||||
@@ -689,3 +802,89 @@ Run ID: <run_id> | Workflow: <standard> | Cycle: 1/<max>
|
||||
Artifacts: .archeflow/artifacts/<run_id>/
|
||||
Report: .archeflow/events/<run_id>.jsonl
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Pipeline Strategy Execution
|
||||
|
||||
When `STRATEGY=pipeline`, execute this linear flow instead of the PDCA cycle above.
|
||||
|
||||
### Pipeline Phases
|
||||
|
||||
```
|
||||
Plan -> Implement -> Spec-Review -> Quality-Review -> Verify
|
||||
```
|
||||
|
||||
No cycle-back. Each phase runs once.
|
||||
|
||||
### 1. Plan
|
||||
|
||||
Spawn Creator only (no Explorer). Use fast-workflow Creator prompt with Mini-Reflect.
|
||||
|
||||
Save output to `.archeflow/artifacts/${RUN_ID}/plan-creator.md`.
|
||||
|
||||
### 2. Implement
|
||||
|
||||
Spawn Maker in isolated worktree with Creator's proposal.
|
||||
|
||||
Save output to `.archeflow/artifacts/${RUN_ID}/do-maker.md`.
|
||||
|
||||
### 3. Spec-Review
|
||||
|
||||
Run Guardian and Skeptic **sequentially** (Guardian first, then Skeptic only if Guardian has findings).
|
||||
|
||||
- Guardian receives: Maker's git diff + proposal risk section
|
||||
- Skeptic receives: Creator's proposal (assumptions focus)
|
||||
|
||||
Save to `check-guardian.md` and `check-skeptic.md`.
|
||||
|
||||
### 4. Quality-Review
|
||||
|
||||
Spawn Sage with proposal + diff + implementation summary.
|
||||
|
||||
Save to `check-sage.md`.
|
||||
|
||||
### 5. Verify
|
||||
|
||||
Run the project's test suite. If tests pass and no CRITICAL findings exist:
|
||||
|
||||
1. Merge the Maker's branch
|
||||
2. Emit `run.complete`
|
||||
|
||||
If CRITICAL findings exist:
|
||||
|
||||
1. **Do NOT merge yet** — the branch remains separate
|
||||
2. Spawn Maker for a **single targeted fix** — provide only the CRITICAL findings as context
|
||||
3. Re-run the reviewer(s) that raised the CRITICAL finding(s) on just the fixed files
|
||||
4. Re-run test suite
|
||||
5. If tests pass and re-review approves: merge
|
||||
6. If still failing after this one fix attempt: **abort** — do NOT merge, report to user with the branch name for manual resolution
|
||||
|
||||
```bash
|
||||
# Pipeline verify: explicit merge guard
|
||||
if [[ "$VERIFY_PASS" == "true" ]]; then
|
||||
./lib/archeflow-git.sh merge "$RUN_ID" --no-ff
|
||||
./lib/archeflow-rollback.sh "$RUN_ID" # post-merge test validation
|
||||
else
|
||||
echo "Pipeline aborted: CRITICAL findings not resolved after 1 fix attempt."
|
||||
echo "Branch: archeflow/$RUN_ID (not merged)"
|
||||
# Emit run.complete with status: aborted
|
||||
fi
|
||||
```
|
||||
|
||||
WARNINGs are logged in the run event but do not block the merge.
|
||||
|
||||
### Pipeline Progress Display
|
||||
|
||||
```
|
||||
━━━ ArcheFlow Pipeline: <task> ━━━━━━━━━━━━━━━━
|
||||
Run ID: <run_id> | Strategy: pipeline
|
||||
|
||||
[Plan] Creator designing... -> done (20s)
|
||||
[Implement] Maker building... -> done (60s, 3 files)
|
||||
[Spec] Guardian reviewing... -> APPROVED
|
||||
[Quality] Sage reviewing... -> APPROVED (1 WARNING)
|
||||
[Verify] Tests passing... -> merged to main
|
||||
|
||||
━━━ Complete: 2m 15s ━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||
```
|
||||
|
||||
@@ -11,7 +11,7 @@ Multi-agent orchestration using archetypal roles and PDCA quality cycles.
|
||||
|
||||
On activation, print ONE line:
|
||||
```
|
||||
archeflow v0.6.0 · 25 skills · <domain> domain
|
||||
archeflow v0.7.0 · 25 skills · <domain> domain
|
||||
```
|
||||
Where `<domain>` is auto-detected: `writing` if `colette.yaml` exists, `research` if paper/thesis files exist, `code` otherwise. Then proceed silently — no further announcement unless `archeflow:run` is invoked.
|
||||
|
||||
|
||||
@@ -25,6 +25,10 @@ ArcheFlow's PDCA cycles spiral upward through iterations — each cycle incorpor
|
||||
│ Plan (design) ← Cycle 1 (initial)
|
||||
```
|
||||
|
||||
## Strategy vs Workflow
|
||||
|
||||
A **strategy** defines the execution shape: PDCA is cyclic (Plan-Do-Check-Act with feedback loops), pipeline is linear (Plan-Implement-Review-Verify, no cycle-back). A **workflow** defines the depth: fast uses fewer agents and cycles, thorough uses more. Strategy and workflow are orthogonal — you can run a `fast` workflow with either strategy, though `thorough` always uses PDCA because linear flows cannot iterate on findings.
|
||||
|
||||
## Built-in Workflows
|
||||
|
||||
### `fast` — Single Turn
|
||||
|
||||
Reference in New Issue
Block a user