---
name: multi-project
description: |
Multi-project orchestration for workspaces with 20+ repos. Builds a dependency DAG across
projects, runs independent sub-runs in parallel, shares artifacts between dependent projects,
and enforces a shared budget. Each sub-run uses the standard `run` skill internally.
<example>User: "archeflow:multi-project" with a multi-run.yaml</example>
<example>User: "Run this across archeflow, colette, and giesing"</example>
<example>User: "archeflow:multi-project --dry-run"</example>
---
# Multi-Project Orchestration
Coordinates ArcheFlow runs across multiple projects in a workspace. Each project gets its own
PDCA run (via the standard `run` skill), but dependencies between projects are respected, artifacts
are shared, and budget is tracked globally.
## Prerequisites
Load these skills (they are referenced throughout):
- `archeflow:run` — single-project PDCA execution loop
- `archeflow:process-log` — event schema and DAG parent rules
- `archeflow:artifact-routing` — artifact naming, context injection, cycle archiving
- `archeflow:cost-tracking` — cost aggregation and budget enforcement
- `archeflow:domains` — domain detection per project
## Invocation
```
archeflow:multi-project # Read from .archeflow/multi-run.yaml
archeflow:multi-project --config path/to.yaml # Explicit config file
archeflow:multi-project --dry-run # Plan phase only for all projects, show cost estimate
archeflow:multi-project --resume <multi-run-id> # Resume a failed/paused multi-run
```
---
## Multi-Run Definition
A multi-run is defined in YAML, either in `.archeflow/multi-run.yaml` or passed via `--config`.
```yaml
name: "giesing-gschichten-v2"
description: "Write second story with improved ArcheFlow + Colette integration"
projects:
  - id: archeflow
    path: "../archeflow"              # Relative to workspace root, or absolute
    task: "Add memory injection to run skill"
    workflow: fast                    # fast | standard | thorough (optional, auto-select if omitted)
    domain: code                      # Optional, auto-detected if omitted
    depends_on: []                    # No dependencies — can start immediately
  - id: colette
    path: "../writing.colette"
    task: "Add story-specific voice validation command"
    workflow: standard
    domain: code
    depends_on: []                    # Independent of archeflow — runs in parallel
  - id: giesing
    path: "."
    task: "Write story #2 using improved tools"
    workflow: kurzgeschichte
    domain: writing
    depends_on: [archeflow, colette]  # Waits for both to complete
budget:
  total_usd: 15.00                    # Hard cap — stops all projects when exceeded
  per_project_usd: 10.00              # Soft cap — warns but does not stop
parallel: true                        # Run independent projects concurrently (default: true)
```
### Definition Rules
- `id` must be unique within the multi-run.
- `path` is resolved relative to the directory containing the YAML file unless absolute.
- `depends_on` references other project `id` values. Cycles are rejected at validation time.
- `workflow` and `domain` are optional. If omitted, the `run` skill auto-selects per project.
- At least one project must have an empty `depends_on` (otherwise the DAG has no entry point).
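
The definition rules above can be checked mechanically before any sub-run starts. The following is a minimal sketch, assuming the YAML has already been parsed into a list of dictionaries; the function name `validate_projects` and the wording of the messages are illustrative and not part of the skill. Cycle detection is left to the dependency-resolution step.

```python
def validate_projects(projects):
    """Return a list of problems found; an empty list means the rules hold.

    `projects` is the parsed `projects` list from the multi-run definition.
    """
    problems = []
    seen = set()
    for project in projects:
        pid = project.get("id")
        if pid in seen:
            problems.append(f"duplicate id: {pid}")
        seen.add(pid)
    # depends_on may only reference ids declared in the same multi-run.
    for project in projects:
        for dep in project.get("depends_on", []):
            if dep not in seen:
                problems.append(f"{project.get('id')}: unknown dependency {dep}")
    # Without a project that has no dependencies, nothing can ever start.
    if projects and all(p.get("depends_on") for p in projects):
        problems.append("no entry point: every project has dependencies")
    return problems
```

Returning a list of problems rather than raising on the first one lets the orchestrator report every issue in a single pass.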
---
## Workspace Registry Integration
If `docs/project-registry.md` exists at the workspace root, the multi-project skill can:
1. **Auto-discover paths:** When `path` is omitted from a project entry, look up the project `id` in the registry to find its directory.
2. **Validate existence:** Before starting, verify that every project path exists on disk. Abort with a clear error if a path is missing.
3. **Show registry status:** In the progress table, include the project's current sprint goal from the registry alongside the multi-run status.
4. **Update registry:** After the multi-run completes, update each project's status in the registry if meaningful changes were made (new features, completed sprint goals).
---
## Execution Steps
### 0. Validate and Initialize
**0a. Parse and validate the multi-run definition:**
```
1. Read the YAML file.
2. Validate all required fields (name, projects with id/path/task).
3. Resolve all paths to absolute paths.
4. Verify each path exists on disk.
5. Build the dependency DAG.
6. Check for cycles — abort if any detected.
7. Identify the entry-point projects (depends_on is empty).
8. Verify at least one entry-point exists.
```
**0b. Generate multi-run ID and directory structure:**
```bash
# ${name} is the name field from the multi-run definition
MULTI_RUN_ID="$(date -u +%Y-%m-%d)-${name}"
# Master event file
mkdir -p .archeflow/events
touch ".archeflow/events/${MULTI_RUN_ID}.jsonl"
# Cross-project artifact directory (one subdirectory per project id)
mkdir -p ".archeflow/artifacts/${MULTI_RUN_ID}"
for project in ${PROJECT_IDS}; do
  mkdir -p ".archeflow/artifacts/${MULTI_RUN_ID}/${project}"
done
# Progress file
touch .archeflow/multi-progress.md
```
**0c. Emit `multi.start`:**
```jsonl
{"ts":"...","run_id":"<MULTI_RUN_ID>","seq":1,"parent":[],"type":"multi.start","phase":"init","agent":null,"data":{"name":"giesing-v2","description":"...","projects":["archeflow","colette","giesing"],"parallel":true,"budget_total_usd":15.00,"dag":{"archeflow":[],"colette":[],"giesing":["archeflow","colette"]}}}
```
**Track state throughout the multi-run:**
- `MULTI_RUN_ID` — unique multi-run identifier
- `MULTI_SEQ` — master event sequence counter
- `PROJECT_STATUS` — map of project_id to status (`pending | running | completed | failed | blocked | skipped`)
- `PROJECT_RUN_IDS` — map of project_id to its sub-run_id
- `TOTAL_COST` — running cost total across all projects
- `REMAINING_BUDGET` — budget minus total cost
---
### 1. Dependency Resolution
Build a topological sort of the project DAG. This determines execution order.
```
Given:
archeflow: depends_on=[]
colette: depends_on=[]
giesing: depends_on=[archeflow, colette]
Topological layers:
Layer 0 (immediate): [archeflow, colette] # No deps, start now
Layer 1: [giesing] # Depends on Layer 0
```
**Algorithm:**
1. Find all projects with zero unmet dependencies. These form the current layer.
2. When a project completes, remove it from the dependency lists of all downstream projects.
3. Any project whose dependency list becomes empty moves to the ready queue.
4. Repeat until all projects are complete, failed, or blocked.
**Cycle detection:** Before starting, verify the DAG is acyclic. Use Kahn's algorithm — if after processing all nodes the sorted list is shorter than the project list, there is a cycle. Report which projects form the cycle and abort.
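
As an illustration of the layering and the cycle check, here is a minimal sketch of a layer-by-layer variant of Kahn's algorithm. It assumes the DAG is given as a mapping from project id to the ids it depends on (the same shape as the `dag` field in `multi.start`) and that every dependency id has already been validated; the function name `topological_layers` is hypothetical.

```python
def topological_layers(dag):
    """Group projects into layers that can run in parallel.

    Raises ValueError if a cycle prevents some projects from ever becoming
    ready. The reported ids are those in a cycle or downstream of one.
    """
    remaining = {pid: set(deps) for pid, deps in dag.items()}
    layers = []
    while remaining:
        # Projects with no unmet dependencies form the next layer.
        ready = sorted(pid for pid, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        layers.append(ready)
        for pid in ready:
            del remaining[pid]
        # Completed projects no longer count as unmet dependencies.
        for deps in remaining.values():
            deps.difference_update(ready)
    return layers
```

For the example definition this yields `[["archeflow", "colette"], ["giesing"]]`, matching Layer 0 and Layer 1 above.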
---
### 2. Parallel Execution
For each project in the ready queue, start a sub-run. Independent projects run concurrently.
**Starting a sub-run:**
```
For each ready project:
  1. Set PROJECT_STATUS[project_id] = "running"
  2. Generate sub-run ID: MULTI_RUN_ID/project_id
     (e.g., "2026-04-03-giesing-v2/archeflow")
  3. Emit project.start to master event file
  4. cd into the project's path
  5. Invoke archeflow:run with:
     - run_id = MULTI_RUN_ID/project_id
     - workflow = project.workflow (or auto-select)
     - domain = project.domain (or auto-detect)
     - budget = min(per_project_budget, remaining_total_budget)
     - artifact_dir = .archeflow/artifacts/MULTI_RUN_ID/project_id/
  6. The sub-run emits its own events to its own JSONL file
     inside the project's directory (standard run behavior)
```
**Concurrency model:**
When `parallel: true` (default), spawn independent projects as parallel subagents:
```
Agent(
description: "Multi-project sub-run: <project_id> — <task>",
prompt: "Run archeflow:run in <path> with task: <task>.
Run ID: <MULTI_RUN_ID>/<project_id>
Workflow: <workflow>
Domain: <domain>
Budget: $<min(per_project_budget, remaining_total_budget)>
Save artifacts to: .archeflow/artifacts/<MULTI_RUN_ID>/<project_id>/
When complete, report: status, cost, artifact list, and any issues.",
isolation: "worktree",
mode: "bypassPermissions"
)
```
Launch all Layer 0 projects simultaneously. As each completes, check if any Layer 1+ projects become unblocked.
When `parallel: false`, run projects sequentially in topological order. Still respect dependencies — a project does not start until all its dependencies have completed.
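
The scheduling loop can be sketched in ordinary code. The example below is only an approximation of the behavior described here: it uses threads instead of subagents, and `run_project` is a hypothetical callable standing in for a sub-run that returns `True` on success. It assumes the DAG has already been validated as acyclic.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_multi(dag, run_project, parallel=True):
    """Run projects in dependency order and return {project_id: status}."""
    status = {pid: "pending" for pid in dag}
    # A single worker serializes the runs, which models parallel: false.
    workers = max(1, len(dag)) if parallel else 1
    in_flight = {}  # Future -> project id
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            # Start every pending project whose dependencies all completed.
            for pid, deps in dag.items():
                if status[pid] == "pending" and all(
                    status[d] == "completed" for d in deps
                ):
                    status[pid] = "running"
                    in_flight[pool.submit(run_project, pid)] = pid
            if not in_flight:
                break
            done, _ = wait(list(in_flight), return_when=FIRST_COMPLETED)
            for future in done:
                pid = in_flight.pop(future)
                try:
                    ok = future.result()
                except Exception:
                    ok = False
                status[pid] = "completed" if ok else "failed"
    # Anything never started had a dependency that did not complete.
    for pid in status:
        if status[pid] == "pending":
            status[pid] = "blocked"
    return status
```

Re-scanning for ready projects after each completion, rather than waiting for a whole layer, lets a downstream project start as soon as its own dependencies are done.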
---
### 3. Master Events
All multi-run-level events are written to `.archeflow/events/<MULTI_RUN_ID>.jsonl`. These track the overall orchestration, not individual PDCA phases (those go to each project's own event file).
#### Master Event Types
| Event | When | Key Data |
|-------|------|----------|
| `multi.start` | Multi-run begins | Project list, DAG, budget |
| `project.start` | A sub-run launches | project_id, run_id, path |
| `project.complete` | A sub-run finishes successfully | project_id, status, cost, artifacts |
| `project.failed` | A sub-run fails | project_id, error, cost_so_far |
| `project.blocked` | A dependency failed, blocking this project | project_id, blocked_by |
| `project.unblocked` | All dependencies met, project can start | project_id, unblocked_by |
| `project.skipped` | User chose to skip a blocked project | project_id, reason |
| `budget.warning` | Budget threshold crossed | spent, budget, percent |
| `budget.exceeded` | Hard budget cap hit | spent, budget, halted_projects |
| `multi.complete` | All projects done (or halted) | status, projects_completed, total_cost |
#### Example Master Event Stream
```jsonl
{"seq":1,"type":"multi.start","phase":"init","data":{"name":"giesing-v2","projects":["archeflow","colette","giesing"],"parallel":true,"budget_total_usd":15.00}}
{"seq":2,"type":"project.start","phase":"run","data":{"project":"archeflow","run_id":"2026-04-03-giesing-v2/archeflow","path":"/home/c/projects/archeflow"}}
{"seq":3,"type":"project.start","phase":"run","data":{"project":"colette","run_id":"2026-04-03-giesing-v2/colette","path":"/home/c/projects/writing.colette"}}
{"seq":4,"type":"project.complete","phase":"run","data":{"project":"archeflow","status":"completed","run_id":"2026-04-03-giesing-v2/archeflow","cost_usd":1.20,"artifacts":["plan-explorer.md","plan-creator.md","do-maker.md","check-guardian.md"]}}
{"seq":5,"type":"project.complete","phase":"run","data":{"project":"colette","status":"completed","run_id":"2026-04-03-giesing-v2/colette","cost_usd":1.80,"artifacts":["plan-creator.md","do-maker.md","check-guardian.md","check-sage.md"]}}
{"seq":6,"type":"project.unblocked","phase":"run","data":{"project":"giesing","unblocked_by":["archeflow","colette"]}}
{"seq":7,"type":"project.start","phase":"run","data":{"project":"giesing","run_id":"2026-04-03-giesing-v2/giesing","path":"/home/c/projects/book.giesing-gschichten"}}
{"seq":8,"type":"project.complete","phase":"run","data":{"project":"giesing","status":"completed","run_id":"2026-04-03-giesing-v2/giesing","cost_usd":3.50,"artifacts":["plan-explorer.md","plan-creator.md","do-maker.md","check-guardian.md","check-sage.md"]}}
{"seq":9,"type":"multi.complete","phase":"done","data":{"status":"completed","projects_completed":3,"projects_failed":0,"total_cost_usd":6.50,"budget_remaining_usd":8.50}}
```
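
Writing these events amounts to appending one JSON object per line while incrementing the sequence counter. The sketch below assumes the field set listed for master events (ts, run_id, seq, parent, type, phase, agent, data); the class name `MasterLog` is hypothetical, and the `parent` value is left to the caller because the parent rules are defined by `archeflow:process-log`, not here.

```python
import json
from datetime import datetime, timezone


class MasterLog:
    """Append-only writer for the master event file (illustrative only)."""

    def __init__(self, path, run_id):
        self.path = path
        self.run_id = run_id
        self.seq = 0  # corresponds to MULTI_SEQ

    def emit(self, event_type, phase, data, parent=None):
        self.seq += 1
        event = {
            "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "run_id": self.run_id,
            "seq": self.seq,
            "parent": parent or [],
            "type": event_type,
            "phase": phase,
            "agent": None,
            "data": data,
        }
        # One JSON object per line keeps the file valid JSONL after every write.
        with open(self.path, "a", encoding="utf-8") as handle:
            handle.write(json.dumps(event) + "\n")
        return event
```

A resumed multi-run would need to initialize `seq` from the last event already in the file; that step is omitted from this sketch.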
---
### 4. Cross-Project Artifacts
When project B depends on project A, B's agents can access A's artifacts. This is the primary mechanism for cross-project information flow.
#### Artifact Directory Layout
```
.archeflow/artifacts/<MULTI_RUN_ID>/
├── archeflow/ # Sub-run artifacts from archeflow
│ ├── plan-explorer.md
│ ├── plan-creator.md
│ ├── do-maker.md
│ ├── do-maker-files.txt
│ └── check-guardian.md
├── colette/ # Sub-run artifacts from colette
│ ├── plan-creator.md
│ ├── do-maker.md
│ └── check-sage.md
└── giesing/ # Sub-run artifacts from giesing (depends on both)
    ├── plan-explorer.md      # Explorer can reference upstream artifacts
    ├── plan-creator.md
    ├── do-maker.md
    └── check-guardian.md
```
#### Cross-Project Context Injection
When a dependent project's sub-run starts, inject upstream artifact summaries into its Plan-phase prompts (Explorer and Creator):
```markdown
## Upstream Project Results
### archeflow (completed)
Summary: Added memory injection to run skill.
Key artifacts:
- plan-creator.md: <first 20 lines or summary section>
- do-maker.md: <implementation summary>
### colette (completed)
Summary: Added story-specific voice validation command.
Key artifacts:
- plan-creator.md: <first 20 lines or summary section>
- do-maker.md: <implementation summary>
Use these results as context. The changes from these projects are available in their
respective directories and have been committed to their branches.
```
**Rules for cross-project injection:**
- Only inject summaries, not full artifacts (keep context small).
- If an upstream artifact is large (>200 lines), extract the summary/overview section only.
- The dependent project's Explorer has filesystem access to read full upstream artifacts if needed.
- Cross-project injection happens ONLY in the Plan phase (Explorer and Creator). The Maker works from the Creator's proposal, which already incorporates upstream context.
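
One way to produce such a summary is sketched below. It assumes artifacts are Markdown and that a summary section, when present, is introduced by a heading named "Summary" or "Overview"; both the heading names and the function names are assumptions for illustration. If no such section exists it falls back to the first 20 lines. Headings inside fenced code blocks are not treated specially.

```python
def heading_level(line):
    """Return the Markdown heading level of a line, or 0 if it is not a heading."""
    stripped = line.lstrip()
    hashes = len(stripped) - len(stripped.lstrip("#"))
    followed_by_space = stripped[hashes:hashes + 1] in (" ", "")
    return hashes if 0 < hashes <= 6 and followed_by_space else 0


def summarize_artifact(text, head_lines=20, section_names=("summary", "overview")):
    """Return the summary section of an artifact, or its first lines."""
    lines = text.splitlines()
    for start, line in enumerate(lines):
        level = heading_level(line)
        if level and line.strip("# \t").lower() in section_names:
            # The section ends at the next heading of the same or higher rank.
            end = len(lines)
            for j in range(start + 1, len(lines)):
                if 0 < heading_level(lines[j]) <= level:
                    end = j
                    break
            return "\n".join(lines[start:end]).strip()
    return "\n".join(lines[:head_lines]).strip()
```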
---
### 5. Budget Coordination
The multi-run has a shared budget across all projects.
#### Budget Hierarchy
```
total_usd: 15.00 # Hard cap — stops ALL projects when exceeded
per_project_usd: 10.00 # Soft cap — warns but does not stop individual project
```
#### Budget Tracking
Maintain a running total across all sub-runs:
```
TOTAL_COST = sum of all project costs reported in project.complete and project.failed events
REMAINING = total_usd - TOTAL_COST
```
#### Budget Enforcement Points
1. **Before starting a sub-run:**
- Estimate the sub-run cost (based on workflow and domain).
- If estimated cost > REMAINING: warn and ask user (attended) or halt (autonomous).
2. **After each sub-run completes:**
- Update TOTAL_COST with actual cost from the sub-run.
- If TOTAL_COST exceeds `warn_at_percent` percent of total_usd: emit `budget.warning`.
- If TOTAL_COST > total_usd: emit `budget.exceeded`, halt remaining projects.
3. **Per-project soft cap:**
- Each sub-run receives `min(per_project_usd, REMAINING)` as its budget.
- The `run` skill's own budget enforcement handles the per-project cap.
- If a project exceeds per_project_usd, it warns but continues (soft cap).
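
The enforcement points above can be collected in a small tracker. This is a sketch under stated assumptions: the class name `SharedBudget` is hypothetical, and the default `warn_at_percent` of 75 is a placeholder, since the multi-run definition shown earlier does not set a warning threshold.

```python
class SharedBudget:
    """Tracks spend across sub-runs against the shared hard cap."""

    def __init__(self, total_usd, per_project_usd, warn_at_percent=75):
        self.total_usd = total_usd
        self.per_project_usd = per_project_usd
        self.warn_at_percent = warn_at_percent  # assumed default, not from the skill
        self.spent = 0.0
        self.warned = False

    @property
    def remaining(self):
        return self.total_usd - self.spent

    def sub_run_budget(self):
        """Budget handed to the next sub-run: min(per_project_usd, REMAINING)."""
        return max(0.0, min(self.per_project_usd, self.remaining))

    def can_start(self, estimated_usd):
        """Pre-start check: does the estimate fit in what is left?"""
        return estimated_usd <= self.remaining

    def record(self, cost_usd):
        """Add a sub-run's actual cost; return the event type to emit, if any."""
        self.spent += cost_usd
        if self.spent > self.total_usd:
            return "budget.exceeded"
        if not self.warned and self.spent > self.total_usd * self.warn_at_percent / 100:
            self.warned = True
            return "budget.warning"
        return None
```

With a total of $15.00, recording $11.50 crosses the assumed 75% threshold and yields `budget.warning`; the next sub-run would then receive `min(10.00, 3.50) = 3.50` as its budget.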
#### Budget Events
```jsonl
{"seq":5,"type":"budget.warning","data":{"spent_usd":11.50,"budget_usd":15.00,"percent":77,"message":"Budget 77% consumed"}}
{"seq":8,"type":"budget.exceeded","data":{"spent_usd":15.30,"budget_usd":15.00,"halted_projects":["giesing"],"message":"Hard budget cap exceeded. Halting remaining projects."}}
```
---
### 6. Failure Handling
Failures in one project affect downstream projects but not independent ones.
#### Failure Scenarios
| Scenario | Action |
|----------|--------|
| Project fails (run error, test failure, max cycles) | Mark as `failed` in master events. Independent projects continue. |
| Dependency of project X failed | Mark X as `blocked`. Do not start X. |
| Budget exceeded mid-run | Halt the current project. Mark remaining as `blocked`. |
| All entry-point projects fail | Entire multi-run fails. No downstream projects can start. |
#### Blocked Project Resolution
When a project is blocked because a dependency failed, offer three options:
1. **Skip:** Mark the blocked project as `skipped`. Continue with other independent projects.
2. **Retry:** Re-run the failed dependency. If it succeeds, unblock downstream projects.
3. **Abort:** Stop the entire multi-run. Report what completed and what did not.
In **autonomous mode**, the default action is `skip` — blocked projects are skipped, independent projects continue, and the multi-run completes with partial results.
In **attended mode**, prompt the user with the options above.
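
Blocking is transitive: if A fails, B depends on A, and C depends on B, then both B and C are blocked. The sketch below computes the blocked set and the `blocked_by` list for each project, then applies the autonomous-mode default of skipping. The function names and the treatment of `skipped` dependencies as unmet are assumptions for illustration.

```python
UNMET = ("failed", "blocked", "skipped")


def find_blocked(dag, status):
    """Return {project_id: blocked_by} for pending projects that cannot start."""
    blocked = set()
    changed = True
    while changed:  # repeat until no new project becomes blocked
        changed = False
        for pid, deps in dag.items():
            if pid in blocked or status.get(pid) != "pending":
                continue
            if any(d in blocked or status.get(d) in UNMET for d in deps):
                blocked.add(pid)
                changed = True
    return {
        pid: [d for d in dag[pid] if d in blocked or status.get(d) in UNMET]
        for pid in dag
        if pid in blocked
    }


def skip_blocked(status, blocked):
    """Autonomous-mode default: mark every blocked project as skipped."""
    return {pid: ("skipped" if pid in blocked else s) for pid, s in status.items()}
```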
#### Failure Events
```jsonl
{"seq":4,"type":"project.failed","data":{"project":"archeflow","error":"Max cycles reached with unresolved CRITICAL findings","cost_usd":2.10}}
{"seq":5,"type":"project.blocked","data":{"project":"giesing","blocked_by":["archeflow"],"reason":"Dependency 'archeflow' failed"}}
```
---
### 7. Progress Tracking
Maintain a live progress file at `.archeflow/multi-progress.md`. Update it after every project state change.
````markdown
# Multi-Run: giesing-v2
Started: 2026-04-03T14:00:00Z
| Project | Status | Domain | Phase | Detail |
|---------|--------|--------|-------|--------|
| archeflow | completed | code | -- | 1 cycle, $1.20 |
| colette | running | code | DO | maker drafting |
| giesing | blocked | writing | -- | waiting for colette |
## Budget
| | Amount |
|---|--------|
| Spent | $3.00 |
| Budget | $15.00 |
| Remaining | $12.00 |
| Utilization | 20% |
## Dependency Graph
```
archeflow ----\
               +---> giesing
colette ------/
```
## Timeline
- 14:00:00 — Started archeflow, colette (parallel)
- 14:05:23 — archeflow completed ($1.20, 1 cycle)
- 14:06:10 — colette DO phase, maker drafting
````
Update this file after:
- A project starts
- A project changes phase (via status polling or sub-agent reporting)
- A project completes or fails
- A project becomes unblocked
- Budget threshold is crossed
---
### 8. Completion
When all projects are complete (or blocked/skipped with no more actionable items):
**8a. Emit `multi.complete`:**
```jsonl
{"seq":9,"type":"multi.complete","phase":"done","data":{"status":"completed","projects_completed":3,"projects_failed":0,"projects_skipped":0,"total_cost_usd":6.50,"budget_remaining_usd":8.50,"duration_ms":600000}}
```
Status values:
- `completed` — all projects finished successfully
- `partial` — some projects completed, some failed/skipped
- `failed` — no projects completed successfully
- `halted` — stopped due to budget or user abort
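
The mapping from per-project statuses to the overall status can be written down directly. A minimal sketch, assuming `project_status` is the `PROJECT_STATUS` map and that the caller passes `halted=True` when the run stopped because of the budget or a user abort; the function name is hypothetical.

```python
def final_status(project_status, halted=False):
    """Derive the multi.complete status from the per-project statuses."""
    if halted:
        return "halted"
    statuses = list(project_status.values())
    completed = sum(1 for s in statuses if s == "completed")
    if completed == len(statuses):
        return "completed"
    if completed == 0:
        return "failed"
    return "partial"
```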
**8b. Generate multi-run report:**
```markdown
# Multi-Run Report: giesing-v2
## Summary
| Metric | Value |
|--------|-------|
| Projects | 3 |
| Completed | 3 |
| Failed | 0 |
| Total cost | $6.50 / $15.00 |
| Duration | 10m 00s |
## Per-Project Results
### archeflow
- **Status:** completed
- **Task:** Add memory injection to run skill
- **Workflow:** fast (1 cycle)
- **Cost:** $1.20
- **Key artifacts:** plan-creator.md, do-maker.md
### colette
- **Status:** completed
- **Task:** Add story-specific voice validation command
- **Workflow:** standard (1 cycle)
- **Cost:** $1.80
- **Key artifacts:** plan-creator.md, do-maker.md, check-sage.md
### giesing
- **Status:** completed
- **Task:** Write story #2 using improved tools
- **Workflow:** kurzgeschichte (2 cycles)
- **Cost:** $3.50
- **Key artifacts:** plan-explorer.md, do-maker.md, check-guardian.md
## Dependency Graph Execution
archeflow (Layer 0) ----> completed
colette (Layer 0) ----> completed
giesing (Layer 1) ----> unblocked ----> completed
## Cost Breakdown
| Project | Plan | Do | Check | Total |
|---------|------|----|-------|-------|
| archeflow | $0.20 | $0.60 | $0.40 | $1.20 |
| colette | $0.30 | $0.80 | $0.70 | $1.80 |
| giesing | $0.50 | $2.00 | $1.00 | $3.50 |
| **Total** | **$1.00** | **$3.40** | **$2.10** | **$6.50** |
```
**8c. Update master event index:**
Append to `.archeflow/events/index.jsonl`:
```jsonl
{"run_id":"2026-04-03-giesing-v2","ts":"2026-04-03T14:10:00Z","type":"multi","task":"Write second story with improved ArcheFlow + Colette integration","status":"completed","projects":3,"total_cost_usd":6.50}
```
**8d. Update workspace registry (if applicable):**
If `docs/project-registry.md` exists and project statuses changed meaningfully, update the registry entries for affected projects.
---
## Dry-Run Mode
When `--dry-run` is specified:
1. Validate the multi-run definition (DAG, paths, budget).
2. For each project (in topological order), run `archeflow:run --dry-run` to get a cost estimate and plan preview.
3. Display a summary:
```
Multi-Run Dry Run: giesing-v2
  Projects: 3
  Dependency layers: 2
  Parallel execution: yes

  Layer 0 (parallel):
    archeflow — fast workflow, code domain
      Estimated cost: $0.50-1.50
    colette — standard workflow, code domain
      Estimated cost: $1.00-3.00

  Layer 1 (after Layer 0):
    giesing — kurzgeschichte workflow, writing domain
      Estimated cost: $2.00-5.00

  Total estimated cost: $3.50-9.50
  Budget: $15.00 (sufficient)

Proceed? [y/n]
```
4. Do NOT emit `multi.complete`. The multi-run is paused.
5. If user says yes, start the full multi-run using the validated config.
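
The total in the dry-run summary is the sum of the per-project low and high estimates. A small sketch of that arithmetic follows; it assumes each sub-run's dry run yields a `(low, high)` pair in USD and treats the budget as sufficient when the high end fits. Both function names and that sufficiency criterion are assumptions.

```python
def total_estimate(estimates):
    """Sum per-project (low, high) cost estimates into one (low, high) range."""
    low = sum(lo for lo, _ in estimates.values())
    high = sum(hi for _, hi in estimates.values())
    return low, high


def budget_is_sufficient(estimates, total_usd):
    """Assumed criterion: even the worst-case estimate fits within the hard cap."""
    _, high = total_estimate(estimates)
    return high <= total_usd
```

For the example above, `(0.50, 1.50)`, `(1.00, 3.00)` and `(2.00, 5.00)` sum to `(3.50, 9.50)`, which fits within the $15.00 budget.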
---
## Resume Mode
When `--resume <multi-run-id>` is specified:
1. Read the master event file `.archeflow/events/<multi-run-id>.jsonl`.
2. Reconstruct `PROJECT_STATUS` from events (which projects completed, failed, are pending).
3. Identify resumable projects:
- `failed` projects can be retried.
- `blocked` projects whose blockers are now `completed` (e.g., after manual fix) can start.
- `pending` projects that were never started can start if their deps are met.
4. Display current state and ask for confirmation.
5. Continue the multi-run from where it left off, appending to the existing master event file.
Resume emits a `multi.resume` event:
```jsonl
{"seq":10,"type":"multi.resume","phase":"init","data":{"resumed_from":"2026-04-03-giesing-v2","projects_completed":["archeflow"],"projects_to_run":["colette","giesing"]}}
```
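
Reconstructing `PROJECT_STATUS` is a replay of the master event file in order. The sketch below assumes the event shapes shown in the examples in this document (`data.project` for project events, `data.projects` on `multi.start`); the function name is hypothetical. A project whose last event is `project.start` is left as `running`, and the caller must decide whether to treat it as interrupted.

```python
import json

EVENT_TO_STATUS = {
    "project.start": "running",
    "project.complete": "completed",
    "project.failed": "failed",
    "project.blocked": "blocked",
    "project.unblocked": "pending",
    "project.skipped": "skipped",
}


def reconstruct_status(lines):
    """Replay master events (an iterable of JSONL lines) into a status map."""
    status = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        data = event.get("data", {})
        if event["type"] == "multi.start":
            # Every declared project starts out pending.
            for pid in data.get("projects", []):
                status.setdefault(pid, "pending")
        elif event["type"] in EVENT_TO_STATUS:
            status[data["project"]] = EVENT_TO_STATUS[event["type"]]
    return status
```

Because later events overwrite earlier ones, a retried project that fails and then completes ends up as `completed`.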
---
## Integration with Existing Skills
| Skill | Integration Point |
|-------|-------------------|
| `run` | Each sub-run is a standard `archeflow:run` invocation. The multi-project skill wraps and coordinates multiple runs. |
| `process-log` | Master events follow the same schema (ts, run_id, seq, parent, type, phase, agent, data). Sub-run events use the standard event types. |
| `artifact-routing` | Each sub-run follows standard artifact routing internally. Cross-project artifacts follow the injection rules in Section 4. |
| `cost-tracking` | Per-project costs come from sub-run `run.complete` events. The multi-project skill aggregates them and enforces the shared budget. |
| `domains` | Each project auto-detects its domain independently. Different projects in the same multi-run can have different domains. |
| `git-integration` | Each sub-run manages its own branch. The multi-project skill does not merge across repos — each project's Act phase handles its own merge. |
| `autonomous-mode` | Multi-project runs are autonomous-mode-friendly. Budget enforcement is strict (halt, don't prompt). Blocked projects are skipped. |
---
## Progress Display
Throughout the multi-run, display live progress:
```
━━━ ArcheFlow Multi-Run: giesing-v2 ━━━━━━━━━━━━━━━━━━━
Projects: 3 | Budget: $15.00 | Parallel: yes
[archeflow] fast/code -> running (Plan: Creator designing...)
[colette] standard/code -> running (Do: Maker implementing...)
[giesing] kurzgeschichte/writing -> blocked (waiting: archeflow, colette)
Cost: $1.80 / $15.00 (12%)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```
Update the display when:
- A project changes state (start, phase change, complete, fail, unblock)
- Budget thresholds are crossed
---
## Error Handling
| Error | Response |
|-------|----------|
| YAML parse error | Abort before starting. Report the parse error with line number. |
| Dependency cycle detected | Abort. Report which projects form the cycle. |
| Project path does not exist | Abort. Report the missing path. |
| Sub-run agent fails to return | Mark project as failed (5-min timeout per the `run` skill). Continue independent projects. |
| Master event write fails | Log warning. Continue orchestration. Events are observation, not control flow. |
| Artifact directory creation fails | Abort the affected project. This is blocking for cross-project artifact sharing. |
| Budget exceeded mid-project | Halt that project immediately. Emit `budget.exceeded`. Skip downstream dependents. |
---
## Design Principles
1. **Each project is autonomous.** Sub-runs use the standard `run` skill without modification. The multi-project skill is a coordinator, not a replacement.
2. **DAG over sequence.** Dependencies are declared, not implied by order. Independent projects always run in parallel when possible.
3. **Shared budget, independent domains.** Budget is global, but each project detects its own domain, selects its own workflow, and manages its own artifacts.
4. **Fail forward.** A failure in one project does not halt independent projects. Only downstream dependents are blocked.
5. **Artifacts are the interface.** Projects communicate through saved artifacts, not shared memory or direct agent-to-agent messaging.
6. **Resume over restart.** Multi-runs can be resumed from any point. Master events provide enough state to reconstruct progress.
7. **Registry-aware.** When a workspace registry exists, use it for discovery and keep it updated. When it does not exist, everything still works.