refactor: trim 11 secondary ArcheFlow skills from 3340 to 952 lines

Remove verbose YAML examples, bash pseudo-code, tutorial prose, and motivational content from configuration/integration skills while preserving all operational protocols, reference tables, and rules. Skills trimmed: domains, colette-bridge, multi-project, cost-tracking, git-integration, custom-archetypes, workflow-design, templates, autonomous-mode, progress, presence.
2026-04-06 20:48:50 +02:00
parent c8bd55d97c
commit d94688ca1b
11 changed files with 485 additions and 2868 deletions
--- a/skills/cost-tracking/SKILL.md
+++ b/skills/cost-tracking/SKILL.md
@@ -8,320 +8,87 @@ description: |
  <example>Automatically active when budget is configured</example>
 ---

-# Cost Tracking — Budget-Aware Orchestration
+# Cost Tracking -- Budget-Aware Orchestration

-Every ArcheFlow orchestration consumes LLM tokens. This skill tracks costs per agent and per run, enforces budgets, and recommends cost-optimal model assignments.
+Tracks costs per agent and per run, enforces budgets, and selects cost-optimal models.

-## Model Pricing Table
+## Model Pricing

-Current pricing (update when models change):
+| Model | Input ($/M tok) | Output ($/M tok) |
+|-------|----------------:|-----------------:|
+| claude-opus-4-6 | 15.00 | 75.00 |
+| claude-sonnet-4-6 | 3.00 | 15.00 |
+| claude-haiku-4-5 | 0.80 | 4.00 |

-| Model | Input ($/M tokens) | Output ($/M tokens) | Notes |
-|-------|--------------------:|---------------------:|-------|
-| `claude-opus-4-6` | 15.00 | 75.00 | Highest quality, use sparingly |
-| `claude-sonnet-4-6` | 3.00 | 15.00 | Good balance of quality and cost |
-| `claude-haiku-4-5` | 0.80 | 4.00 | Cheap, fast, good for structured tasks |
+**Prompt caching:** 90% discount on cached input tokens. Structure system prompts for cache hits.
+**Batches API:** 50% discount. Use for non-time-sensitive bulk ops.

-**Prompt caching** (when applicable): 90% discount on cached input tokens. The orchestrator should structure system prompts to maximize cache hits (archetype instructions, voice profiles, and domain context are cache-friendly since they repeat across agents in a run).
-
-**Batches API**: 50% discount on all tokens. Use for non-time-sensitive bulk operations (validation passes, consistency checks).
-
-## Per-Agent Cost Tracking
-
-Every `agent.complete` event includes cost data:
-
-```jsonl
-{
-  "type": "agent.complete",
-  "data": {
-    "archetype": "story-explorer",
-    "duration_ms": 87605,
-    "tokens_input": 15000,
-    "tokens_output": 6000,
-    "tokens_cache_read": 8000,
-    "model": "haiku",
-    "estimated_cost_usd": 0.02,
-    "summary": "3 plot directions developed, recommended C"
-  }
-}
-```
-
-### Cost Calculation
+## Cost Calculation

 ```
-cost = (tokens_input - tokens_cache_read) * input_price / 1_000_000
-     + tokens_cache_read * input_price * 0.10 / 1_000_000
-     + tokens_output * output_price / 1_000_000
+cost = (input - cache_read) * input_price/1M
+     + cache_read * input_price * 0.10/1M
+     + output * output_price/1M
 ```

-If exact token counts are unavailable (Claude Code doesn't always expose them), estimate based on character count:
+If exact tokens unavailable, estimate: `tokens ~= chars / 4`. Mark with `cost_estimated: true`.

-```
-estimated_tokens = character_count / 4   # rough heuristic
-```
+## Default Model Assignments

-Mark estimated costs with `"cost_estimated": true` in the event data so reports can distinguish measured from estimated values.
+| Archetype | Code | Writing |
+|-----------|------|---------|
+| Explorer | haiku | haiku |
+| Creator | sonnet | sonnet |
+| Maker | sonnet | **sonnet** |
+| Guardian | haiku | haiku |
+| Skeptic | haiku | haiku |
+| Sage | sonnet | **sonnet** |
+| Trickster | haiku | haiku |

-## Run-Level Aggregation
+Opus is user-opt-in only (team preset `model_overrides`).

-The `run.complete` event includes cost totals:
+**Resolution order:** team preset override > domain override > archetype default.

-```jsonl
-{
-  "type": "run.complete",
-  "data": {
-    "status": "completed",
-    "total_tokens_input": 95000,
-    "total_tokens_output": 33000,
-    "total_tokens_cache_read": 42000,
-    "total_cost_usd": 1.45,
-    "budget_usd": 10.00,
-    "budget_remaining_usd": 8.55,
-    "agents_total": 5,
-    "cost_by_phase": {
-      "plan": 0.35,
-      "do": 0.72,
-      "check": 0.38
-    },
-    "cost_by_model": {
-      "haiku": 0.12,
-      "sonnet": 1.33
-    }
-  }
-}
-```
+## Pre-Agent Cost Estimates

-### Cost Summary in Orchestration Report
+| Archetype | Typical Input | Typical Output |
+|-----------|-------------:|---------------:|
+| Explorer | 8k | 4k |
+| Creator | 12k | 6k |
+| Maker | 15k | 12k |
+| Guardian | 10k | 3k |
+| Skeptic | 8k | 3k |
+| Sage | 12k | 4k |
+| Trickster | 8k | 4k |

-After each orchestration, the report includes a cost section:
-
-```markdown
-## Cost Summary
-| Phase | Model(s) | Tokens (in/out) | Cost |
-|-------|----------|-----------------|------|
-| Plan  | haiku, sonnet | 32k / 12k | $0.35 |
-| Do    | sonnet | 40k / 15k | $0.72 |
-| Check | haiku, sonnet | 23k / 6k | $0.38 |
-| **Total** | | **95k / 33k** | **$1.45** |
-
-Budget: $10.00 | Spent: $1.45 | Remaining: $8.55
-```
+After 10+ runs, use actual averages from `metrics.jsonl` instead.

 ## Budget Configuration

-Budgets are defined in team presets or `.archeflow/config.yaml`:
-
 ```yaml
-# .archeflow/config.yaml
 budget:
-  per_run_usd: 10.00        # Max cost per orchestration run
-  per_agent_usd: 3.00       # Max cost per individual agent
-  daily_usd: 50.00          # Max daily spend across all runs
-  warn_at_percent: 75       # Warn when this % of budget is consumed
+  per_run_usd: 10.00
+  per_agent_usd: 3.00
+  daily_usd: 50.00
+  warn_at_percent: 75
 ```

-```yaml
-# Team preset override
-name: story-development
-domain: writing
-budget:
-  per_run_usd: 5.00         # Writing runs are usually cheaper
-```
-
-Team preset budget overrides the global config for that run.
-
-### Budget Precedence
-
-1. Team preset `budget` (if set)
-2. `.archeflow/config.yaml` `budget`
-3. No budget (unlimited) — costs are still tracked but not enforced
+Team preset budget overrides global config. No budget = unlimited (costs still tracked).

 ## Budget Enforcement

-Budget checks happen at two points:
+**Pre-agent:** Estimate cost. If > remaining budget: stop (autonomous) or warn (attended).

-### 1. Pre-Agent Check (before spawning)
+**Post-agent:** Update total. Warn at threshold. Stop if budget exceeded.

-Before each agent is spawned, estimate its cost and check against remaining budget:
+## Cost Optimization

-```
-estimated_agent_cost = estimate_tokens(archetype, task_complexity) * model_price
-remaining_budget = budget - sum(costs_so_far)
-
-if estimated_agent_cost > remaining_budget:
-    WARN: "Estimated cost for {archetype} (${estimated}) would exceed remaining budget (${remaining}). Continue? [y/N]"
-```
-
-**In autonomous mode**: if budget would be exceeded, STOP the run and report. Do not prompt — there is no one to answer.
-
-**In attended mode**: warn and ask the user. They can approve the overage or stop.
-
-### 2. Post-Agent Check (after completion)
-
-After each agent completes, update the running total and check:
-
-```
-if total_cost > budget * warn_at_percent / 100:
-    WARN: "Budget ${warn_at_percent}% consumed (${total_cost} of ${budget})"
-
-if total_cost > budget:
-    STOP: "Budget exceeded (${total_cost} of ${budget}). Run halted."
-```
-
-### Pre-Agent Cost Estimation
-
-Rough token estimates by archetype (calibrate over time with actual data from `metrics.jsonl`):
-
-| Archetype | Typical Input | Typical Output | Notes |
-|-----------|-------------:|---------------:|-------|
-| Explorer | 8k | 4k | Research, reads many files |
-| Creator | 12k | 6k | Receives Explorer output, produces plan |
-| Maker | 15k | 12k | Largest output (implementation/prose) |
-| Guardian | 10k | 3k | Reads diff, structured output |
-| Skeptic | 8k | 3k | Reads proposal, structured challenges |
-| Sage | 12k | 4k | Reads diff + proposal |
-| Trickster | 8k | 4k | Reads diff, generates test cases |
-
-These are starting estimates. After 10+ runs, use actual averages from `metrics.jsonl` instead.
-
-## Cost-Aware Model Selection
-
-Each archetype has a recommended model tier based on the quality requirements of its role:
-
-### Default Model Assignments (Code Domain)
-
-| Archetype | Model | Rationale |
-|-----------|-------|-----------|
-| Explorer | haiku | Research is structured extraction — cheap model handles it well |
-| Creator | sonnet | Design decisions need reasoning quality |
-| Maker | sonnet | Implementation needs quality to avoid rework cycles |
-| Guardian | haiku | Security/risk review is checklist-driven — structured and cheap |
-| Skeptic | haiku | Challenge generation follows patterns — cheap |
-| Sage | sonnet | Holistic quality judgment needs nuance |
-| Trickster | haiku | Adversarial testing is systematic — cheap |
-
-### Writing Domain Overrides
-
-Writing tasks need higher quality for prose-generating agents:
-
-| Archetype | Model | Rationale |
-|-----------|-------|-----------|
-| Explorer / story-explorer | haiku | Research is still cheap |
-| Creator | sonnet | Outline design needs narrative judgment |
-| Maker | **sonnet** | Prose quality is the product — cannot be cheap |
-| Guardian | haiku | Plot/continuity checks are structured |
-| Skeptic | haiku | Premise challenges are structured |
-| Sage / story-sage | **sonnet** | Voice and craft judgment need taste |
-| Trickster | haiku | Reader-confusion analysis is systematic |
-
-**When to escalate to opus**: Only for final-pass prose polishing on high-stakes content (book manuscripts, not short stories). Never for review or research agents. The user must explicitly opt in via:
-
-```yaml
-# Team preset
-model_overrides:
-  maker: opus    # Only for final polish pass
-```
-
-### Domain-Driven Model Selection
-
-The effective model for each agent is resolved in this order:
-
-1. **Team preset `model_overrides`** (highest priority — explicit choice)
-2. **Domain `model_overrides`** (from `.archeflow/domains/<name>.yaml`)
-3. **Archetype default** (from the table above)
-4. **Custom archetype `model` field** (from archetype YAML frontmatter)
-
-Example resolution for `story-sage` in a writing run:
- Team preset says nothing about story-sage → skip
- Writing domain says `story-sage: sonnet` → **use sonnet**
- Archetype YAML says `model: sonnet` → would have been used if domain didn't specify
-
-## Cost Optimization Strategies
-
-### 1. Prompt Caching
-
-Structure prompts so that stable content comes first (maximizes cache prefix hits):
-
-```
-[System prompt — archetype instructions]     ← cached across agents in same run
-[Domain context — voice profile, persona]    ← cached across agents in same run
-[Phase context — Explorer output, proposal]  ← changes per agent
-[Task-specific instructions]                 ← changes per agent
-```
-
-Estimated savings: 30-50% on input tokens for runs with 5+ agents.
-
-### 2. Guardian Fast-Path (A2)
-
-When Guardian approves with 0 issues, skip Skeptic/Sage/Trickster. This saves 2-3 agent calls per cycle. See `archeflow:orchestration` skill, rule A2.
-
-Typical savings: $0.30-0.80 per skipped cycle (depending on models).
-
-### 3. Explorer Cache
-
-Reuse recent Explorer research instead of re-running. See `archeflow:orchestration` skill, Explorer Cache section.
-
-Typical savings: $0.02-0.05 per cache hit (haiku Explorer).
-
-### 4. Batches API for Bulk Operations
-
-When running consistency checks, validation passes, or other non-time-sensitive work across multiple files, use the Batches API (50% discount):
-
-```yaml
-# Mark agents as batch-eligible in team presets
-batch_eligible:
-  - guardian    # Structured review, can wait
-  - skeptic    # Challenge generation, can wait
-```
-
-Only use batches when the user is not waiting for real-time results (overnight runs, autonomous mode).
-
-### 5. Early Termination
-
-If the first cycle produces a clean Guardian pass (A2 fast-path) AND the Maker's self-review checklist is clean, skip the remaining cycles even if `max_cycles > 1`. This avoids spending tokens on unnecessary verification.
+1. **Prompt caching:** Stable content first (archetype instructions, voice profiles). Saves 30-50% on input.
+2. **Guardian fast-path (A2):** 0 issues = skip remaining reviewers. Saves $0.30-0.80/cycle.
+3. **Explorer cache:** Reuse recent research. Saves $0.02-0.05/hit.
+4. **Batches API:** For autonomous/overnight review passes (50% discount).
+5. **Early termination:** Clean Guardian + clean Maker self-review = skip remaining cycles.

 ## Daily Cost Tracking

-Across runs, maintain a daily cost ledger:
-
-```
-.archeflow/costs/<YYYY-MM-DD>.jsonl
-```
-
-Each line is one run's cost summary:
-
-```jsonl
-{"run_id":"2026-04-03-der-huster","cost_usd":1.45,"tokens_input":95000,"tokens_output":33000,"models":{"haiku":2,"sonnet":3},"domain":"writing"}
-{"run_id":"2026-04-03-auth-refactor","cost_usd":2.10,"tokens_input":120000,"tokens_output":45000,"models":{"haiku":3,"sonnet":2},"domain":"code"}
-```
-
-Daily budget enforcement reads this file to check `daily_usd` limits before starting new runs.
-
-### Cost Report Command
-
-```bash
-# Show today's costs
-./lib/archeflow-costs.sh today
-
-# Show costs for a date range
-./lib/archeflow-costs.sh 2026-04-01 2026-04-03
-
-# Show costs for a specific run
-./lib/archeflow-costs.sh run 2026-04-03-der-huster
-```
-
-## Integration with Other Skills
-
- **`orchestration`**: Calls pre-agent and post-agent budget checks. Includes cost summary in orchestration report.
- **`process-log`**: Cost data is embedded in `agent.complete` and `run.complete` events. No separate cost events needed.
- **`domains`**: Reads `model_overrides` from the active domain to determine effective model per agent.
- **`autonomous-mode`**: Enforces budget strictly (no prompts — just stop on budget exceeded). Uses daily budget to limit overnight spend.
- **`workflow-design`**: Custom workflows can specify per-phase model assignments that override domain defaults.
-
-## Design Principles
-
-1. **Track always, enforce optionally.** Cost data is in every event regardless of whether a budget is set. Budget enforcement is opt-in.
-2. **Estimate before spend.** Always estimate before spawning an agent. Surprises are worse than slightly inaccurate estimates.
-3. **Cheapest model that works.** Default to haiku. Upgrade to sonnet only when the task demonstrably needs it. Opus is user-opt-in only.
-4. **Transparent.** Every cost shows up in the orchestration report. No hidden token spend.
-5. **Learn from history.** After enough runs, replace estimates with actual averages from `metrics.jsonl`.
+Ledger at `.archeflow/costs/<YYYY-MM-DD>.jsonl`. One line per run with cost, tokens, models, domain. Daily budget enforcement reads this before starting new runs.