- archeflow-decision.sh records decision points during runs - archeflow-replay.sh: timeline, whatif, compare commands - What-if replay with adjustable archetype weights - /af-replay skill for interactive use - Tests in archeflow-replay.bats
43 lines
2.0 KiB
Markdown
43 lines
2.0 KiB
Markdown
---
|
||
name: af-replay
|
||
description: "Replay and analyze a recorded ArcheFlow run: decision timeline and weighted what-if. Usage: /af-replay <run-id> [--timeline|--whatif|--compare] [--weights arch=w,...]"
|
||
user-invocable: true
|
||
---
|
||
|
||
# ArcheFlow Run Replay
|
||
|
||
Inspect a completed or in-progress run logged in `.archeflow/events/<run_id>.jsonl`. Use this to study which archetypes drove outcomes and to simulate **weighted** consensus (what-if).
|
||
|
||
## Recording (during PDCA)
|
||
|
||
After each meaningful orchestration choice, log a **decision point** (in addition to `review.verdict` where applicable):
|
||
|
||
```bash
|
||
./lib/archeflow-decision.sh <run_id> <phase> <archetype> '<input_summary>' '<decision>' <confidence> [parent_seq]
|
||
```
|
||
|
||
Fields stored: `phase`, `archetype`, `input`, `decision`, `confidence`, `ts` (event timestamp). The event type is `decision.point`.
|
||
|
||
Lower-level alternative:
|
||
|
||
```bash
|
||
./lib/archeflow-event.sh "$RUN_ID" decision.point check guardian \
|
||
'{"archetype":"guardian","input":"diff","decision":"needs_changes","confidence":0.85}' 7
|
||
```
|
||
|
||
## Commands (from project root)
|
||
|
||
| Action | Shell |
|
||
|--------|--------|
|
||
| Timeline | `./lib/archeflow-replay.sh timeline <run_id>` |
|
||
| What-if | `./lib/archeflow-replay.sh whatif <run_id> [--weights guardian=2,sage=0.5] [--threshold 0.5] [--json]` |
|
||
| Both | `./lib/archeflow-replay.sh compare <run_id> [--weights ...]` |
|
||
|
||
- **Timeline** lists `decision.point` rows and `review.verdict` (check phase).
|
||
- **What-if** reads the **last** `review.verdict` per archetype in check. **Original** outcome uses strict any-veto (any non-approve → BLOCK). **Replay** uses weighted mean strictness: each reviewer contributes weight × (1 if not approved, else 0); BLOCK if mean ≥ threshold (default 0.5).
|
||
- **`--json`** emits machine-readable output for dashboards or scripts.
|
||
|
||
## Learning effectiveness
|
||
|
||
Correlate `decision.point` confidence and verdicts with cycle outcomes (`cycle.boundary`, `run.complete`) and `./lib/archeflow-score.sh extract` to see which archetypes add signal for which task shapes.
|