refactor: simplify memory and shadow-detection skills

Trim verbose implementation details that duplicate what the bash helper scripts already handle. Memory skill: 278 -> 120 lines. Shadow detection skill: 180 -> 66 lines. All essential protocols, tables, and commands preserved; removed redundant algorithm descriptions, multiple examples, and narrative prose.
2026-04-06 20:42:47 +02:00
parent af1f4e7da7
commit 8837a359ac
2 changed files with 83 additions and 354 deletions
--- a/skills/memory/SKILL.md
+++ b/skills/memory/SKILL.md
@@ -11,21 +11,14 @@ description: |

 # Cross-Run Memory

-ArcheFlow forgets everything after each run. If Guardian repeatedly flags the same type of issue (e.g., timeline errors in fiction, missing null checks in code), the next run starts from zero. This skill fixes that by extracting lessons from completed runs and injecting them into future agent prompts.
+ArcheFlow forgets everything after each run. This skill extracts lessons from completed runs and injects them into future agent prompts, so recurring issues (timeline errors, missing null checks) are caught proactively.

 ## Storage

 ```
 .archeflow/memory/lessons.jsonl     # Append-only, one lesson per line
-```
-
-Each lesson is a single JSON line:
-
-```jsonl
-{"id":"m-001","ts":"2026-04-03T14:00:00Z","run_id":"2026-04-03-der-huster","type":"pattern","source":"guardian","description":"Timeline references must match story start day","frequency":2,"severity":"bug","domain":"writing","tags":["continuity","timeline"],"last_seen_run":"2026-04-03-der-huster","runs_since_last_seen":0}
-{"id":"m-002","ts":"2026-04-03T15:00:00Z","run_id":"2026-04-03-der-huster","type":"preference","source":"user_feedback","description":"User prefers single bundled PR over many small ones","frequency":1,"severity":"info","domain":"general","tags":["workflow"],"last_seen_run":"","runs_since_last_seen":0}
-{"id":"m-003","ts":"2026-04-04T10:00:00Z","run_id":"2026-04-04-auth-fix","type":"archetype_hint","source":"sage","description":"Voice drift most common in long monologue passages","frequency":3,"severity":"warning","domain":"writing","tags":["voice","prose"],"archetype":"story-sage","last_seen_run":"2026-04-04-auth-fix","runs_since_last_seen":0}
-{"id":"m-004","ts":"2026-04-04T11:00:00Z","run_id":"2026-04-04-auth-fix","type":"anti_pattern","source":"maker","description":"Splitting auth middleware into per-route handlers causes duplication","frequency":1,"severity":"warning","domain":"code","tags":["auth","middleware"],"last_seen_run":"2026-04-04-auth-fix","runs_since_last_seen":0}
+.archeflow/memory/archive.jsonl     # Decayed lessons (frequency reached 0)
+.archeflow/memory/audit.jsonl       # Injection audit trail
 ```

 ## Lesson Types
@@ -33,245 +26,95 @@ Each lesson is a single JSON line:
 | Type | Source | Description |
 |------|--------|-------------|
 | `pattern` | Auto-detected | Recurring finding across runs (same category + similar description) |
-| `preference` | Manual | User correction or workflow preference (added via CLI) |
+| `preference` | Manual | User correction or workflow preference (injected immediately, skips frequency threshold) |
 | `archetype_hint` | Auto-detected | Per-archetype insight (e.g., Sage catches voice drift in monologues) |
-| `anti_pattern` | Manual or auto | Something that was tried and failed — avoid repeating |
+| `anti_pattern` | Manual or auto | Something that was tried and failed -- avoid repeating |

-## Lesson Fields
+## Lesson JSON Fields

 | Field | Type | Description |
 |-------|------|-------------|
-| `id` | string | Unique ID, format `m-NNN` (monotonically increasing) |
-| `ts` | ISO 8601 | When the lesson was created or last updated |
+| `id` | string | `m-NNN` (monotonically increasing) |
+| `ts` | ISO 8601 | Created or last updated |
 | `run_id` | string | Run that created or last triggered this lesson |
-| `type` | string | One of: `pattern`, `preference`, `archetype_hint`, `anti_pattern` |
-| `source` | string | Archetype or `user_feedback` that originated the lesson |
+| `type` | string | `pattern`, `preference`, `archetype_hint`, `anti_pattern` |
+| `source` | string | Archetype name or `user_feedback` |
 | `description` | string | Human-readable lesson text |
-| `frequency` | integer | How many times this lesson was triggered |
-| `severity` | string | `bug`, `warning`, `info`, or `recommendation` |
+| `frequency` | integer | Times this lesson was triggered |
+| `severity` | string | `bug`, `warning`, `info`, `recommendation` |
 | `domain` | string | `writing`, `code`, `general`, or project-specific |
 | `tags` | string[] | Keywords for matching and filtering |
-| `archetype` | string or null | For `archetype_hint` type — which archetype this applies to |
-| `last_seen_run` | string | Run ID where this lesson was last matched |
-| `runs_since_last_seen` | integer | Counter for decay — incremented each run that does NOT trigger this lesson |
+| `archetype` | string? | For `archetype_hint` -- which archetype this applies to |
+| `last_seen_run` | string | Run ID where last matched |
+| `runs_since_last_seen` | integer | Counter for decay |
+
+Example:
+```jsonl
+{"id":"m-001","ts":"2026-04-03T14:00:00Z","run_id":"2026-04-03-der-huster","type":"pattern","source":"guardian","description":"Timeline references must match story start day","frequency":2,"severity":"bug","domain":"writing","tags":["continuity","timeline"],"last_seen_run":"2026-04-03-der-huster","runs_since_last_seen":0}
+```

 ---

 ## Auto-Detection

-After each `run.complete`, the orchestrator runs lesson extraction:
+After each `run.complete`, extract lessons from findings:

 ```bash
 ./lib/archeflow-memory.sh extract .archeflow/events/<run_id>.jsonl
 ```

-### Extraction Algorithm
+The script reads `review.verdict` events, matches findings against existing lessons by keyword overlap (50%+ threshold), increments frequency on matches, and creates new candidate lessons (frequency: 1) for unmatched findings with severity >= WARNING.

-1. **Read all `review.verdict` events** from the completed run's JSONL.
-2. **For each finding** in each verdict:
-   a. Tokenize the finding description into keywords (lowercase, strip punctuation).
-   b. Compare keywords against each existing lesson's description + tags.
-   c. **Match threshold:** 50%+ keyword overlap between finding and lesson.
-3. **If match found:** Update the existing lesson:
-   - Increment `frequency` by 1
-   - Update `ts` to now
-   - Update `last_seen_run` to current run ID
-   - Reset `runs_since_last_seen` to 0
-4. **If no match AND severity >= WARNING:** Add as candidate lesson with `frequency: 1`.
-5. **Candidates become active** when `frequency >= 2` (triggered in a second run).
-
-### Promotion Rule
-
-A finding that appears in only one run stays at `frequency: 1` — it might be a one-off. Once the same pattern appears in a second run (matched by keyword overlap), it gets promoted to `frequency: 2` and becomes eligible for injection.
-
-This prevents noise from single-run anomalies while still capturing genuine recurring issues quickly.
-
---
+**Promotion rule:** A finding needs `frequency >= 2` (seen in 2+ runs) before injection. This filters out one-off noise. Preferences skip this threshold.

 ## Injection

-At run start, before spawning agents, the orchestrator injects relevant lessons:
+Before spawning agents, inject relevant lessons:

 ```bash
 LESSONS=$(./lib/archeflow-memory.sh inject <domain> <archetype>)
 ```

-### Injection Rules
+Rules: filter by domain (exact match or `general`) and optionally by archetype; require `frequency >= 2`; sort by frequency descending; cap at 10 lessons. Lessons with `frequency >= 5` are always injected regardless of the domain/archetype filter.

-1. Read `lessons.jsonl`.
-2. Filter by `domain` (exact match or `general`) and optionally by `archetype`.
-3. Only include lessons with `frequency >= 2` (confirmed patterns).
-4. Sort by frequency descending (most common first).
-5. Cap at **10 lessons** per injection.
-6. Lessons with `frequency >= 5` are **always injected** regardless of domain/archetype filter (they are universal enough to matter).
-
-### Injection Format
-
-Append to the agent's system prompt as a structured section:
+Injected as a markdown section appended to the agent's system prompt:

 ```markdown
 ## Known Issues (from past runs)
 - Timeline references must match story start day [seen 3x, guardian]
 - Voice drift common in monologue passages >200 words [seen 2x, sage]
- Missing null checks in API response handlers [seen 5x, guardian]
 ```

-### Integration with Run Skill
-
-In the `run` skill, after Step 0 (Initialize) and before Step 1 (Plan Phase):
-
-```bash
-# Load cross-run memory for this domain
-MEMORY_LESSONS=$(./lib/archeflow-memory.sh inject "$DOMAIN" "")
-
-# Inject into Explorer/Creator prompts if non-empty
-if [[ -n "$MEMORY_LESSONS" ]]; then
-  EXPLORER_PROMPT="${EXPLORER_PROMPT}
-
-${MEMORY_LESSONS}"
-  CREATOR_PROMPT="${CREATOR_PROMPT}
-
-${MEMORY_LESSONS}"
-fi
-```
-
-For reviewers in the Check phase, inject archetype-specific lessons:
-
-```bash
-GUARDIAN_LESSONS=$(./lib/archeflow-memory.sh inject "$DOMAIN" "guardian")
-SAGE_LESSONS=$(./lib/archeflow-memory.sh inject "$DOMAIN" "sage")
-```
-
---
-
 ## Decay

-Lessons that stop being relevant should fade out. After each `run.complete`, apply decay:
+After each `run.complete`, apply decay: a lesson not seen for 10 consecutive runs has its `frequency` decremented by 1. When `frequency` reaches 0, the lesson is moved to `archive.jsonl`.

 ```bash
 ./lib/archeflow-memory.sh decay
 ```

-### Decay Algorithm
-
-1. For every lesson in `lessons.jsonl`:
-   - If `last_seen_run` is NOT the current run → increment `runs_since_last_seen` by 1
-2. If `runs_since_last_seen >= 10`:
-   - Decrement `frequency` by 1
-   - Reset `runs_since_last_seen` to 0
-3. If `frequency` drops to 0:
-   - Move the lesson to `.archeflow/memory/archive.jsonl` (append)
-   - Remove from `lessons.jsonl`
-
-This means a lesson that was seen 5 times but then stops appearing will survive 50 runs of non-triggering before being fully archived (5 decrements x 10 runs each).
-
---
-
 ## Manual Management

-### Add a lesson
+```bash
+archeflow memory add "User prefers single bundled PR"   # Add preference (injected immediately)
+archeflow memory list                                    # Show all active lessons
+archeflow memory forget m-002                            # Archive a lesson
+```
+
+## Audit Trail
+
+Track which lessons are injected per run and whether they were effective. Pass `--audit <run_id>` to `inject` to append an audit record to `audit.jsonl`. After a run, `audit-check <run_id>` compares injected lessons against review findings: no matching finding = helpful (issue likely prevented), matching finding = ineffective (issue repeated despite injection).

 ```bash
-archeflow memory add "User prefers single bundled PR over many small ones"
-# Internally: ./lib/archeflow-memory.sh add preference "User prefers single bundled PR over many small ones"
+./lib/archeflow-memory.sh inject "$DOMAIN" "" --audit "$RUN_ID"
+./lib/archeflow-memory.sh audit-check <run_id>
 ```

-Manually added lessons start at `frequency: 1` but with type `preference`, which means they are injected immediately (preferences skip the frequency >= 2 threshold).
-
-### List lessons
-
-```bash
-archeflow memory list
-# Internally: ./lib/archeflow-memory.sh list
-```
-
-Output:
-
-```
-ID       Freq  Type            Domain   Description
-m-001    3     pattern         writing  Timeline references must match story start day
-m-002    1     preference      general  User prefers single bundled PR over many small ones
-m-003    5     archetype_hint  writing  Voice drift most common in long monologue passages
-m-004    1     anti_pattern    code     Splitting auth middleware causes duplication
-```
-
-### Forget a lesson
-
-```bash
-archeflow memory forget m-002
-# Internally: ./lib/archeflow-memory.sh forget m-002
-```
-
-Moves the lesson to `archive.jsonl` regardless of frequency.
-
---
-
 ## Integration Points

 | Moment | Action | Script Command |
 |--------|--------|----------------|
 | After `run.complete` | Extract lessons from findings | `archeflow-memory.sh extract <events.jsonl>` |
 | After extraction | Apply decay to all lessons | `archeflow-memory.sh decay` |
-| Before agent spawn (run start) | Inject relevant lessons | `archeflow-memory.sh inject <domain> <archetype>` |
+| Before agent spawn | Inject relevant lessons | `archeflow-memory.sh inject <domain> <archetype>` |
 | User command | Add/list/forget lessons | `archeflow-memory.sh add/list/forget` |
-
-## Audit Trail
-
-Track which lessons are injected into each run and whether they were effective.
-
-### Storage
-
-```
-.archeflow/memory/audit.jsonl    # Append-only audit log
-```
-
-### Injection Audit Record
-
-When `--audit <run_id>` is passed to the `inject` command, an audit record is written:
-
-```jsonl
-{"ts":"2026-04-04T10:00:00Z","run_id":"2026-04-04-auth-fix","domain":"code","archetype":"","lessons_injected":["m-001","m-003"],"lesson_count":2}
-```
-
-Usage:
-```bash
-./lib/archeflow-memory.sh inject "$DOMAIN" "" --audit "$RUN_ID"
-```
-
-### Effectiveness Check
-
-After a run completes, check whether injected lessons prevented issues:
-
-```bash
-./lib/archeflow-memory.sh audit-check <run_id>
-```
-
-This command:
-1. Reads `audit.jsonl` for lessons injected in the given run
-2. Reads the run's event file for `review.verdict` events
-3. For each injected lesson, checks keyword overlap between the lesson's description and review findings
-4. **No matching finding** = `helpful` (the lesson likely prevented the issue)
-5. **Matching finding** = `ineffective` (the issue repeated despite the lesson being injected)
-6. Appends effectiveness results to `audit.jsonl`
-
-### Effectiveness Over Time
-
-By querying `audit.jsonl` for effectiveness records, you can measure:
- Which lessons consistently prevent issues (high `helpful` count)
- Which lessons are not working (high `ineffective` count — consider rewording or removing)
- Overall memory system ROI (ratio of helpful to ineffective across all runs)
-
-```bash
-# Count effectiveness per lesson
-jq -r 'select(.type == "effectiveness_check") | [.lesson_id, .effectiveness] | @tsv' .archeflow/memory/audit.jsonl | sort | uniq -c
-```
-
---
-
-## Design Principles
-
-1. **Append-only storage.** `lessons.jsonl` is append-only during writes; decay rewrites the file in place but preserves all data (archived lessons move to `archive.jsonl`).
-2. **Conservative promotion.** A finding must appear in 2+ runs before injection. One-offs are noise.
-3. **Graceful degradation.** If `lessons.jsonl` doesn't exist, injection returns empty — no error, no block.
-4. **Cheap.** Keyword matching, not embeddings. `jq` for JSON, `grep` for matching. No external services.
-5. **Bounded.** Max 10 lessons injected per prompt. Prevents context pollution.
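
The trimmed SKILL.md now describes lesson matching in a single sentence (keyword overlap, 50%+ threshold). For readers who want the idea spelled out, here is a minimal sketch of that matching step. It is not taken from `archeflow-memory.sh`: the function names `tokenize` and `overlap_pct` are hypothetical, and it assumes the overlap is measured as the share of the finding's keywords that also occur in the lesson text, which the diff does not specify.

```bash
#!/usr/bin/env bash
# Sketch only: tokenize() and overlap_pct() are hypothetical names,
# not functions taken from archeflow-memory.sh.

# Split text into unique lowercase keywords, one per line
# (lowercase, punctuation stripped).
tokenize() {
  printf '%s\n' "$1" \
    | LC_ALL=C tr '[:upper:]' '[:lower:]' \
    | LC_ALL=C tr -cs '[:alnum:]' '\n' \
    | LC_ALL=C sort -u \
    | sed '/^$/d'
}

# Print the percentage (integer, rounded down) of the finding's keywords
# that also appear in the lesson text.
# Assumption: the denominator is the finding's keyword count.
# $1 = finding description, $2 = lesson description (tags may be appended).
overlap_pct() {
  lesson_words=$(tokenize "$2")
  total=0
  shared=0
  # Keywords are purely alphanumeric, so word splitting is safe here.
  for word in $(tokenize "$1"); do
    total=$((total + 1))
    if printf '%s\n' "$lesson_words" | grep -Fxq -- "$word"; then
      shared=$((shared + 1))
    fi
  done
  if [ "$total" -eq 0 ]; then
    echo 0
  else
    echo $((shared * 100 / total))
  fi
}
```

With these definitions, `overlap_pct "Timeline mismatch on start day" "Timeline references must match story start day"` prints `60`: three of the finding's five keywords (`timeline`, `start`, `day`) appear in the lesson, so the pair would clear a 50% threshold. Exact-word matching is a deliberate simplification; `mismatch` does not count as a hit for `match`.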