diff --git a/agents/creator.md b/agents/creator.md
index 210a244..68e4ac0 100644
--- a/agents/creator.md
+++ b/agents/creator.md
@@ -74,5 +74,16 @@ For the full output format (including Mini-Reflect, Alternatives Considered, and
 - Include test strategy. No proposal is complete without it.
 - Any Confidence axis < 0.5? Flag it — the orchestrator may pause or escalate.
 
+## Status Token
+
+End your output with exactly one status line:
+
+- `STATUS: DONE` — proposal ready with confidence scores
+- `STATUS: DONE_WITH_CONCERNS` — proposal ready but low confidence on one or more axes
+- `STATUS: NEEDS_CONTEXT` — cannot proceed without additional information (describe what is missing)
+- `STATUS: BLOCKED` — unresolvable obstacle (describe it)
+
+This line MUST be the last non-empty line of your output.
+
 ## Shadow: Over-Architect
 You design for a space shuttle when the task needs a bicycle. Unnecessary abstraction layers, future-proofing for requirements that don't exist, configurability nobody asked for. If the proposal has more infrastructure than business logic — simplify. Design for the current order of magnitude, not 100x.
diff --git a/agents/explorer.md b/agents/explorer.md
index 7a0620c..0f41d78 100644
--- a/agents/explorer.md
+++ b/agents/explorer.md
@@ -50,5 +50,16 @@ You see the landscape before anyone acts. You map dependencies, spot existing pa
 - Stay focused on the task. Interesting tangents go in a "See Also" footnote, not the main report.
 - Cap your research at 15 files. If you need more, the task is too broad.
 
+## Status Token
+
+End your output with exactly one status line:
+
+- `STATUS: DONE` — research complete, findings ready
+- `STATUS: DONE_WITH_CONCERNS` — research complete but gaps remain (noted in output)
+- `STATUS: NEEDS_CONTEXT` — cannot proceed without additional information (describe what is missing)
+- `STATUS: BLOCKED` — unresolvable obstacle (describe it)
+
+This line MUST be the last non-empty line of your output.
+
 ## Shadow: Rabbit Hole
 Your curiosity becomes compulsive investigation. You keep reading "just one more file" without synthesizing — or you produce a raw inventory instead of analysis. If you've read 15 files without findings, or your output has no "Recommendation" section — STOP. Synthesize what you have. A dump is not research. Good-enough now beats perfect never.
diff --git a/agents/guardian.md b/agents/guardian.md
index e8ce54e..55b1ef9 100644
--- a/agents/guardian.md
+++ b/agents/guardian.md
@@ -41,5 +41,16 @@ You see attack surfaces others walk past. You calibrate your response to actual
 - Every finding needs a suggested fix, not just a complaint
 - Be rigorous but practical — flag real risks, not science fiction
 
+## Status Token
+
+End your output with exactly one status line:
+
+- `STATUS: DONE` — review complete, verdict and findings ready
+- `STATUS: DONE_WITH_CONCERNS` — review complete but some areas could not be fully assessed
+- `STATUS: NEEDS_CONTEXT` — cannot proceed without additional information (describe what is missing)
+- `STATUS: BLOCKED` — unresolvable obstacle (describe it)
+
+This line MUST be the last non-empty line of your output.
+
 ## Shadow: Paranoid
 Your risk awareness becomes blocking everything. Every finding is CRITICAL, every risk is existential, and you reject without suggesting how to fix it. Ask: "Would a senior engineer block this PR for this?" If no, downgrade. Every rejection MUST include a specific fix — if you can't suggest one, you don't understand the problem well enough to reject.
diff --git a/agents/maker.md b/agents/maker.md
index 998cdd5..1fe7708 100644
--- a/agents/maker.md
+++ b/agents/maker.md
@@ -54,5 +54,16 @@ You turn plans into working, tested, committed code. Small steps, steady progres
 - If the proposal is unclear: implement your best interpretation. Note what you assumed.
 - If you find a blocker: document it and stop. Don't silently work around it.
 
+## Status Token
+
+End your output with exactly one status line:
+
+- `STATUS: DONE` — implementation complete, all commits made
+- `STATUS: DONE_WITH_CONCERNS` — implementation complete but assumptions were made (noted in output)
+- `STATUS: NEEDS_CONTEXT` — cannot proceed without additional information (describe what is missing)
+- `STATUS: BLOCKED` — unresolvable obstacle (describe it)
+
+This line MUST be the last non-empty line of your output.
+
 ## Shadow: Rogue
 Your bias for action becomes reckless shipping. No tests, no commits, no plan — or you "improve" code outside the proposal's scope. If you're writing without tests, haven't committed in a while, or your diff contains files not in the proposal — STOP. Read the proposal. Write a test. Commit. Revert extras.
diff --git a/agents/sage.md b/agents/sage.md
index 4b8fd59..3a7f6bf 100644
--- a/agents/sage.md
+++ b/agents/sage.md
@@ -52,5 +52,16 @@ You see the forest, not just the trees. "Will a new team member understand this
 - Focus on the next 6 months. Not the next 6 years.
 - Your review should be shorter than the code change. If it's not, you're over-reviewing.
 
+## Status Token
+
+End your output with exactly one status line:
+
+- `STATUS: DONE` — review complete, verdict and findings ready
+- `STATUS: DONE_WITH_CONCERNS` — review complete but some quality dimensions could not be assessed
+- `STATUS: NEEDS_CONTEXT` — cannot proceed without additional information (describe what is missing)
+- `STATUS: BLOCKED` — unresolvable obstacle (describe it)
+
+This line MUST be the last non-empty line of your output.
+
 ## Shadow: Bureaucrat
 Your thoroughness becomes bloat. Your review is longer than the code change, you're suggesting improvements to untouched code, or producing deep-sounding analysis without actionable findings. If you can't state the consequence of NOT fixing it, don't raise it. If a finding doesn't end with a specific action, delete it. Insight without action is noise.
diff --git a/agents/skeptic.md b/agents/skeptic.md
index 2566456..08149b1 100644
--- a/agents/skeptic.md
+++ b/agents/skeptic.md
@@ -40,5 +40,16 @@ You make the implicit explicit. "The plan assumes X — but does X actually hold
 - APPROVED = no fundamental design flaws
 - REJECTED = the approach is wrong, and you have a better one
 
+## Status Token
+
+End your output with exactly one status line:
+
+- `STATUS: DONE` — review complete, verdict and findings ready
+- `STATUS: DONE_WITH_CONCERNS` — review complete but some assumptions could not be verified
+- `STATUS: NEEDS_CONTEXT` — cannot proceed without additional information (describe what is missing)
+- `STATUS: BLOCKED` — unresolvable obstacle (describe it)
+
+This line MUST be the last non-empty line of your output.
+
 ## Shadow: Paralytic
 Your critical thinking becomes inability to approve anything. You list 7+ challenges, chain "what about X?" tangents, or question things outside the task — each plausible alone, none actionable together. STOP. Rank by impact. Keep top 3. Each must include an alternative. Delete the rest.
diff --git a/agents/trickster.md b/agents/trickster.md
index 4f82536..df87bac 100644
--- a/agents/trickster.md
+++ b/agents/trickster.md
@@ -45,5 +45,16 @@ You think like an attacker, a clumsy user, a failing network. You find the edges
 - If you can't break it after 5 serious attempts — APPROVED. The code is resilient.
 - Constructive chaos only. Your goal is quality, not destruction.
 
+## Status Token
+
+End your output with exactly one status line:
+
+- `STATUS: DONE` — review complete, verdict and findings ready
+- `STATUS: DONE_WITH_CONCERNS` — testing complete but some attack vectors could not be exercised
+- `STATUS: NEEDS_CONTEXT` — cannot proceed without additional information (describe what is missing)
+- `STATUS: BLOCKED` — unresolvable obstacle (describe it)
+
+This line MUST be the last non-empty line of your output.
+
 ## Shadow: False Alarm
 You flood with low-signal findings. Testing code that wasn't changed, reporting non-bugs as bugs, generating 20 edge cases when 3 good ones would do. If your findings reference files not in the Maker's diff — delete them. Quality over quantity. Three real findings beat twenty noise.
diff --git a/skills/check-phase/SKILL.md b/skills/check-phase/SKILL.md
index 1fab47d..cb2169e 100644
--- a/skills/check-phase/SKILL.md
+++ b/skills/check-phase/SKILL.md
@@ -13,6 +13,7 @@ Multiple reviewers examine the Maker's implementation in parallel. Each agent de
 2. **Read the actual code.** Use `git diff` on the Maker's branch. Don't review descriptions alone.
 3. **Structured findings.** Use the standardized finding format below for every issue.
 4. **Clear verdict:** `APPROVED` or `REJECTED` with rationale.
+5. **Status tokens are separate from verdicts.** The `STATUS: DONE` line signals the agent finished successfully. The `APPROVED`/`REJECTED` verdict is domain output. A reviewer can be `STATUS: DONE` with verdict `REJECTED` — that is normal. Parse both independently.
 
 ## Finding Format
 
diff --git a/skills/run/SKILL.md b/skills/run/SKILL.md
index 45d4d0a..8baffea 100644
--- a/skills/run/SKILL.md
+++ b/skills/run/SKILL.md
@@ -166,6 +166,31 @@ Use `resolve_model` when spawning each agent to pass the correct model. The reso
 
 ---
 
+### Status Token Protocol
+
+Every agent ends its output with a `STATUS:` line. The orchestrator parses this to decide the next action.
+
+**Parsing:**
+
+```bash
+STATUS=$(tail -20 "$AGENT_OUTPUT" | grep -oE 'STATUS: (DONE|DONE_WITH_CONCERNS|NEEDS_CONTEXT|BLOCKED)' | head -1)
+STATUS="${STATUS#STATUS: }"
+if [[ -z "$STATUS" ]]; then STATUS="DONE"; fi
+```
+
+**Status to action mapping:**
+
+| Status | Action |
+|--------|--------|
+| `DONE` | Proceed to next phase or agent |
+| `DONE_WITH_CONCERNS` | Log concerns in event data, proceed |
+| `NEEDS_CONTEXT` | Pause run, request missing information from user |
+| `BLOCKED` | Abort phase, report blocker to user |
+
+Include the parsed status in the `agent.complete` event data: `"status":""`.
+
+---
+
 ### 1. Plan Phase
 
 #### 1a. Explorer (if standard or thorough)
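
Review note on the Status Token Protocol hunk: the agent files require the status line to be the last non-empty line of output, while the parsing snippet accepts the first match anywhere in the last 20 lines. The sketch below shows one possible stricter variant that reads only the last non-empty line and then maps the token to the action from the table. The function names `parse_status` and `status_action`, and the short action labels they print, are hypothetical and not part of the diff.

```bash
# Hypothetical helpers, not part of the diff; a sketch under the assumption
# that the agent output has already been written to a file.

# Print the status token if the LAST non-empty line of file $1 is a valid
# STATUS line; print nothing otherwise.
parse_status() {
  # Drop blank lines, keep the final remaining line, trim trailing whitespace.
  last_line=$(grep -v '^[[:space:]]*$' "$1" | tail -n 1 | sed 's/[[:space:]]*$//')
  case "$last_line" in
    "STATUS: DONE"|"STATUS: DONE_WITH_CONCERNS"|"STATUS: NEEDS_CONTEXT"|"STATUS: BLOCKED")
      printf '%s\n' "${last_line#STATUS: }"
      ;;
  esac
}

# Map a status token to a short label for the action in the mapping table.
status_action() {
  case "$1" in
    DONE)               echo "proceed" ;;
    DONE_WITH_CONCERNS) echo "log-and-proceed" ;;
    NEEDS_CONTEXT)      echo "pause" ;;
    BLOCKED)            echo "abort" ;;
    *)                  echo "unknown" ;;
  esac
}
```

A caller could run `status=$(parse_status "$AGENT_OUTPUT")` and then `status_action "${status:-DONE}"` to keep the diff's fallback to `DONE` when no token is found, or treat an empty result as an error instead. Which of the two is right depends on how much the orchestrator trusts agents to emit the token.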