feat: add run replay for archetype effectiveness analysis

- archeflow-decision.sh records decision points during runs
- archeflow-replay.sh: timeline, whatif, compare commands
- What-if replay with adjustable archetype weights
- /af-replay skill for interactive use
- Tests in archeflow-replay.bats
2026-04-06 21:43:29 +02:00
parent 506143d613
commit 4f8e2a9962
8 changed files with 129 additions and 4 deletions
--- a/skills/run/SKILL.md
+++ b/skills/run/SKILL.md
@@ -352,6 +352,7 @@ Emit events via `./lib/archeflow-event.sh <run_id> <type> <phase> <agent> '<json
 | After agent returns | `agent.complete` | archetype, duration_ms, artifacts, summary |
 | Phase boundary | `phase.transition` | from, to, artifacts_so_far |
 | Alternative chosen | `decision` | what, chosen, alternatives, rationale |
+| Orchestrator decision (replay) | `decision.point` | archetype, input, decision, confidence — use `./lib/archeflow-decision.sh` |
 | Reviewer verdict | `review.verdict` | archetype, verdict, findings[] |
 | Fix addressing review | `fix.applied` | source, finding, file, line |
 | End of PDCA cycle | `cycle.boundary` | cycle, max_cycles, exit_condition, convergence |
@@ -403,6 +404,12 @@ Scores stored in `.archeflow/memory/effectiveness.jsonl`. After 10+ runs, recomm

 ---

+## Run replay (decision log + what-if)
+
+After key choices (routing, fast-path skip, escalation), emit `decision.point` via `./lib/archeflow-decision.sh` so runs can be inspected with `./lib/archeflow-replay.sh timeline|whatif|compare <run_id>`. Weighted what-if helps estimate how much each review archetype swayed the effective ship/block outcome. See skill `af-replay`.
+
+---
+
 ## Dry-Run Mode

 When `--dry-run`: Run Plan phase only. Display workflow, agent counts, confidence scores, cost estimate. Ask user to proceed. If yes, continue with `--start-from do`.
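
The weighted what-if idea from the "Run replay" section can be illustrated with a small sketch. This is not the logic of `./lib/archeflow-replay.sh`, which this diff does not show. It assumes each review verdict is a line of the form `archetype verdict` (verdict is `approve` or `block`), that every archetype has a default weight of 1, and that ties resolve to `block`. The function name `whatif_outcome` and the archetype names `reviewer_a`, `reviewer_b`, `reviewer_c` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only: NOT the actual behaviour of archeflow-replay.sh.
# stdin: one "archetype verdict" pair per line (verdict: approve | block).
# args:  optional weight overrides as name=weight (assumed default weight: 1).
whatif_outcome() {
  awk -v overrides="$*" '
    BEGIN {
      block = 0; ship = 0
      # Parse the "name=weight" overrides into a lookup table.
      n = split(overrides, pairs, " ")
      for (i = 1; i <= n; i++) {
        split(pairs[i], kv, "=")
        weight[kv[1]] = kv[2] + 0
      }
    }
    NF >= 2 {
      # Use the override when present, otherwise the default weight of 1.
      w = ($1 in weight) ? weight[$1] : 1
      if ($2 == "block") block += w
      else if ($2 == "approve") ship += w
    }
    END {
      # Assumption: a tie with at least one block vote resolves to "block".
      result = (block > 0 && block >= ship) ? "block" : "ship"
      print result
    }
  '
}

# Same three verdicts, replayed under two weightings.
verdicts='reviewer_a block
reviewer_b approve
reviewer_c approve'

printf '%s\n' "$verdicts" | whatif_outcome                # prints: ship
printf '%s\n' "$verdicts" | whatif_outcome reviewer_a=3   # prints: block
```

Replaying the same verdicts under different weights shows how far a single archetype can move the outcome: here the lone `block` vote only decides the result once its weight exceeds the combined weight of the approvals.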