fix: address Guardian review findings — sync Creator agent, wire hooks, add cost table

- Sync agents/creator.md output format with plan-phase skill (3-axis confidence, alternatives considered, mini-reflect) — fixes adaptation rule A3 dependency - Wire hook points into orchestration Act phase (pre-merge, post-merge) - Add cost-per-agent-tier table to autonomous-mode for budget scheduling - Add team preset loading reference to orchestration Step 0
2026-04-03 06:23:47 +02:00
parent 761d64b821
commit 8755d68dc9
3 changed files with 39 additions and 8 deletions
--- a/agents/creator.md
+++ b/agents/creator.md
@@ -25,13 +25,26 @@ You turn ambiguity into one clear plan. You scope ruthlessly — what's in AND w
 7. Note risks and explicitly what you're NOT doing

 ## Output Format
+
+For the full output format (including Mini-Reflect, Alternatives Considered, and structured Confidence), follow the `archeflow:plan-phase` skill. Summary:
+
 ```markdown
 ## Proposal: <task>
-**Confidence:** <0.0 to 1.0>
+
+### Mini-Reflect (fast workflow only — skip if Explorer ran)
+- **Task restated:** <one sentence>
+- **Assumptions:** 1) ... 2) ... 3) ...
+- **Highest-damage risk:** <the one thing that would hurt most if wrong>

 ### Architecture Decision
 <What and WHY>

+### Alternatives Considered
+| Approach | Why Rejected |
+|----------|-------------|
+| <option A> | <reason> |
+| <option B> | <reason> |
+
 ### Changes
 1. **`path/file.ext`** — What changes and why
 2. **`path/test.ext`** — What tests to add
@@ -39,19 +52,26 @@ You turn ambiguity into one clear plan. You scope ruthlessly — what's in AND w
 ### Test Strategy
 - <specific test cases>

+### Confidence
+| Axis | Score | Note |
+|------|-------|------|
+| Task understanding | <0.0-1.0> | <why> |
+| Solution completeness | <0.0-1.0> | <gaps?> |
+| Risk coverage | <0.0-1.0> | <unknowns?> |
+
 ### Risks
- <what could go wrong and mitigations>
+- <what could go wrong + mitigations>

 ### Not Doing
 - <adjacent concerns deliberately excluded>
 ```

 ## Rules
- Be decisive. One proposal, not three alternatives.
+- Be decisive. One proposal, not three alternatives (but list alternatives you rejected).
 - Name every file. The Maker needs exact paths.
 - Scope ruthlessly. Adjacent problems go under "Not Doing."
 - Include test strategy. No proposal is complete without it.
- Confidence < 0.5? Flag it — the task may need clarification.
+- Any Confidence axis < 0.5? Flag it — the orchestrator may pause or escalate.

 ## Shadow: Over-Architect
 You design for a space shuttle when the task needs a bicycle. Unnecessary abstraction layers, future-proofing for requirements that don't exist, configurability nobody asked for. If the proposal has more infrastructure than business logic — simplify. Design for the current order of magnitude, not 100x.
--- a/skills/autonomous-mode/SKILL.md
+++ b/skills/autonomous-mode/SKILL.md
@@ -171,7 +171,15 @@ Budget: $5.00 (or ~2M tokens)
 | < 25% | Run remaining tasks as `fast` only |
 | Exhausted | Stop. Log remaining tasks as "skipped — budget exhausted" |

-Budget is tracked per-task in the session log. Estimated cost = agents spawned x model tier pricing.
+Budget is tracked per-task in the session log. Estimated cost per agent by model tier:
+
+| Tier | Model | Est. Cost/Agent |
+|------|-------|----------------|
+| cheap | Haiku | ~$0.01 |
+| standard | Sonnet | ~$0.05 |
+| premium | Opus | ~$0.25 |
+
+A standard workflow (6 agents, mostly Sonnet) costs ~$0.30. A thorough workflow (8 agents) costs ~$0.50. These are rough estimates — actual cost depends on context size and output length.

 ## Auto-Resume on Interruption

--- a/skills/orchestration/SKILL.md
+++ b/skills/orchestration/SKILL.md
@@ -9,7 +9,9 @@ This skill guides you through running a full ArcheFlow orchestration using Claud

 ## Step 0: Choose a Workflow

-Assess the task and pick:
+If `.archeflow/teams/<name>.yaml` exists, the user can reference a team preset: `"Use the backend team"`. Load the preset's phase config instead of built-in defaults. See `archeflow:custom-archetypes` skill for preset format.
+
+Otherwise, assess the task and pick:

 | Signal | Workflow |
 |--------|----------|
@@ -252,8 +254,9 @@ Example: "done when pytest passes and Guardian approves with 0 CRITICAL"
 If completion criteria are defined, **all criteria must pass** — reviewer approval alone is not sufficient. If tests fail but reviewers approved, cycle back with "tests failing" as feedback to Creator.

 ### All Approved (and completion criteria met)
-1. Merge the Maker's worktree branch into the target branch
-2. **Post-merge verification:** Run the project's test suite on the merged branch
+1. **Pre-merge hooks:** Check `.archeflow/hooks.yaml` for `pre-merge` hooks. Run them. If `fail_action: abort`, stop and report.
+2. Merge the Maker's worktree branch into the target branch
+3. **Post-merge hooks:** Run `post-merge` hooks from `.archeflow/hooks.yaml` if defined. Then run the project's test suite on the merged branch
   - Tests pass → proceed to step 3
   - Tests fail → **auto-revert** the merge commit, report the failure, and cycle back with "integration test failure on main" as feedback
 3. Report: what was implemented, what was reviewed, any warnings noted