feat: core improvements — feedback loop, attention filters, shadow heuristics, metrics, auto-activation
- Cross-cycle feedback protocol with structured finding format, routing, and resolution tracking
- Attention filter enforcement: explicit context include/exclude per archetype
- Shadow detection: quantitative checklists with concrete thresholds
- Orchestration metrics: per-phase timing, agent count, findings summary
- Autonomous mode wiring: checkpoint protocol, session log, stop conditions
- Auto-activation: SessionStart hook fires ArcheFlow for implementation tasks without user config
- Emoji avatars for all 7 archetypes
- Standardized finding format across all reviewers for cross-cycle tracking
- Persisted implementation plan in docs/
@@ -11,12 +11,33 @@ Multiple reviewers examine the Maker's implementation in parallel. Each agent de
|
||||
|
||||
1. **Read the proposal first.** Review against the intended design, not invented requirements.
|
||||
2. **Read the actual code.** Use `git diff` on the Maker's branch. Don't review descriptions alone.
|
||||
3. **Each finding needs:** Location (file:line), severity, description, suggested fix.
|
||||
4. **Severity:**
|
||||
- **CRITICAL** — Must fix. Blocks approval.
|
||||
- **WARNING** — Should fix. Doesn't block alone.
|
||||
- **INFO** — Nice to have. Never blocks.
|
||||
5. **Clear verdict:** `APPROVED` or `REJECTED` with rationale.
|
||||
3. **Structured findings.** Use the standardized finding format below for every issue.
|
||||
4. **Clear verdict:** `APPROVED` or `REJECTED` with rationale.
|
||||
|
||||
## Finding Format
|
||||
|
||||
Every finding must use this format for cross-cycle tracking:
|
||||
|
||||
```
| Location | Severity | Category | Description | Fix |
|----------|----------|----------|-------------|-----|
| src/auth/handler.ts:48 | CRITICAL | security | Empty string bypasses validation | Add length check before processing |
```
|
||||
|
||||
**Severity:**
|
||||
- **CRITICAL** — Must fix. Blocks approval.
|
||||
- **WARNING** — Should fix. Doesn't block alone.
|
||||
- **INFO** — Nice to have. Never blocks.
|
||||
|
||||
**Categories** (use consistently for cross-cycle tracking):
|
||||
- `security` — Injection, auth bypass, data exposure, secrets
|
||||
- `reliability` — Error handling, edge cases, race conditions, crashes
|
||||
- `design` — Architecture, assumptions, scalability, coupling
|
||||
- `breaking-change` — API compatibility, schema migrations, removals
|
||||
- `dependency` — New deps, version conflicts, license issues
|
||||
- `quality` — Readability, maintainability, naming, duplication
|
||||
- `testing` — Missing tests, weak assertions, untested paths
|
||||
- `consistency` — Deviates from codebase patterns
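
The format above can be sketched as a small record type, which makes it easier to validate findings before they enter cross-cycle tracking. This is a minimal illustration, not part of ArcheFlow itself: the `Finding` class and `parse_row` helper are hypothetical names, and the parser assumes table cells never contain a literal `|`.

```python
from dataclasses import dataclass

SEVERITIES = ("CRITICAL", "WARNING", "INFO")
CATEGORIES = (
    "security", "reliability", "design", "breaking-change",
    "dependency", "quality", "testing", "consistency",
)


@dataclass(frozen=True)
class Finding:
    """One reviewer finding (hypothetical helper, mirrors the table columns)."""
    location: str      # "file:line", e.g. "src/auth/handler.ts:48"
    severity: str      # one of SEVERITIES
    category: str      # one of CATEGORIES
    description: str
    fix: str

    def __post_init__(self):
        # Reject values outside the standardized vocabularies.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")

    def to_row(self) -> str:
        """Render the finding as one Markdown table row."""
        cells = (self.location, self.severity, self.category,
                 self.description, self.fix)
        return "| " + " | ".join(cells) + " |"


def parse_row(row: str) -> Finding:
    """Parse one table row into a Finding (cells must not contain '|')."""
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    if len(cells) != 5:
        raise ValueError(f"expected 5 cells, got {len(cells)}")
    return Finding(*cells)
```

Rejecting unknown severities and categories early keeps the later tracking and routing steps from silently treating a misspelled category as a new issue.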
|
||||
|
||||
## Consolidated Output
|
||||
|
||||
@@ -26,19 +47,33 @@ After all reviewers finish, compile:
|
||||
## Check Phase Results — Cycle N
|
||||
|
||||
### Guardian: APPROVED
|
||||
- WARNING: Missing rate limit (src/auth/handler.ts:52)
|
||||
| Location | Severity | Category | Description | Fix |
|----------|----------|----------|-------------|-----|
| src/auth/handler.ts:52 | WARNING | security | Missing rate limit | Add rate limiter middleware |
|
||||
|
||||
### Skeptic: APPROVED
|
||||
- INFO: Consider caching validated tokens
|
||||
| Location | Severity | Category | Description | Fix |
|----------|----------|----------|-------------|-----|
| src/auth/handler.ts:30 | INFO | design | Consider caching validated tokens | Add TTL cache for token validation |
|
||||
|
||||
### Sage: APPROVED
|
||||
- WARNING: Test names could be more descriptive
|
||||
| Location | Severity | Category | Description | Fix |
|----------|----------|----------|-------------|-----|
| tests/auth.test.ts:15 | WARNING | testing | Test names don't describe behavior | Rename to "should reject expired tokens" |
|
||||
|
||||
### Trickster: REJECTED
|
||||
- CRITICAL: Empty string bypasses validation (src/auth/handler.ts:48)
|
||||
Reproduction: POST /auth with `{"token": ""}`
|
||||
Expected: 400 | Actual: 500
|
||||
| Location | Severity | Category | Description | Fix |
|----------|----------|----------|-------------|-----|
| src/auth/handler.ts:48 | CRITICAL | reliability | Empty string bypasses validation | Add `if (!token || token.trim() === '')` guard |
|
||||
|
||||
### Verdict: REJECTED — 1 critical finding
|
||||
→ Feed back to Plan phase for cycle N+1
|
||||
→ Build cycle feedback (see orchestration skill) and feed to Plan phase
|
||||
```
|
||||
|
||||
## Why Structured Findings Matter
|
||||
|
||||
The standardized format enables:
|
||||
- **Cross-cycle tracking:** Same category + location = same issue. Can detect resolution or regression.
|
||||
- **Feedback routing:** Security/design findings → Creator. Quality/testing findings → Maker.
|
||||
- **Shadow detection:** CRITICAL:WARNING ratios, finding counts, and category distributions are measurable.
|
||||
- **Metrics:** Severity counts feed into the orchestration summary.
|
||||
|
||||
@@ -22,9 +22,13 @@ Assess the task and pick:
|
||||
Spawn agents sequentially — Creator needs Explorer's findings.
|
||||
|
||||
### Explorer (if standard or thorough)
|
||||
|
||||
**Context to include:** Task description, relevant file paths, codebase access.
|
||||
**Context to exclude:** Prior proposals, review outputs, implementation details, feedback from previous cycles.
|
||||
|
||||
```
|
||||
Agent(
|
||||
description: "Explorer: research context",
|
||||
description: "🔍 Explorer: research context",
|
||||
prompt: "<task description>
|
||||
You are the EXPLORER archetype.
|
||||
Research the codebase to understand:
|
||||
@@ -39,18 +43,24 @@ Agent(
|
||||
```
|
||||
|
||||
### Creator
|
||||
|
||||
**Context to include:** Task description, Explorer's research output. On cycle 2+: prior cycle's structured feedback (see Cycle Feedback Protocol).
|
||||
**Context to exclude:** Raw file contents (Explorer already summarized), git diffs, reviewer full outputs.
|
||||
|
||||
```
|
||||
Agent(
|
||||
description: "Creator: design proposal",
|
||||
description: "🏗️ Creator: design proposal",
|
||||
prompt: "<task description>
|
||||
You are the CREATOR archetype.
|
||||
Based on the research findings: <Explorer's output>
|
||||
<if cycle 2+: Prior cycle feedback: <structured feedback — see Cycle Feedback Protocol>>
|
||||
Design a solution proposal including:
|
||||
1. Architecture decisions (with rationale)
|
||||
2. Files to create/modify (with specific changes)
|
||||
3. Test strategy
|
||||
4. Confidence score (0.0 to 1.0)
|
||||
5. Risks you foresee
|
||||
<if cycle 2+: 6. How you addressed each unresolved issue from prior feedback>
|
||||
Be decisive. Ship a clear plan, not a menu of options.",
|
||||
subagent_type: "Plan"
|
||||
)
|
||||
@@ -60,12 +70,16 @@ Agent(
|
||||
|
||||
Spawn Maker in an **isolated worktree** so changes don't affect main.
|
||||
|
||||
**Context to include:** Creator's proposal only. On cycle 2+: implementation-routed feedback from Sage/Trickster.
|
||||
**Context to exclude:** Explorer's research, Guardian/Skeptic findings (those go to Creator).
|
||||
|
||||
```
|
||||
Agent(
|
||||
description: "Maker: implement proposal",
|
||||
description: "⚒️ Maker: implement proposal",
|
||||
prompt: "<task description>
|
||||
You are the MAKER archetype.
|
||||
Implement this proposal: <Creator's output>
|
||||
<if cycle 2+: Implementation feedback from prior cycle: <Sage/Trickster findings only>>
|
||||
Rules:
|
||||
1. Follow the proposal exactly — don't redesign
|
||||
2. Write tests for every behavioral change
|
||||
@@ -85,9 +99,13 @@ Agent(
|
||||
Spawn reviewers **in parallel** — they read the Maker's changes independently.
|
||||
|
||||
### Guardian
|
||||
|
||||
**Context to include:** Maker's git diff, proposal risk section only.
|
||||
**Context to exclude:** Explorer's research, full proposal, other reviewer outputs.
|
||||
|
||||
```
|
||||
Agent(
|
||||
description: "Guardian: security and risk review",
|
||||
description: "🛡️ Guardian: security and risk review",
|
||||
prompt: "You are the GUARDIAN archetype.
|
||||
Review the changes in branch: <maker's branch>
|
||||
Assess:
|
||||
@@ -96,31 +114,42 @@ Agent(
|
||||
3. Breaking changes (API compatibility, schema migrations)
|
||||
4. Dependency risks (new deps, version conflicts)
|
||||
Output: APPROVED or REJECTED with specific findings.
|
||||
Each finding needs: location, severity (critical/warning/info), description, fix suggestion.
|
||||
Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
|
||||
Categories: security, reliability, design, breaking-change, dependency
|
||||
Be rigorous but practical — flag real risks, not theoretical ones."
|
||||
)
|
||||
```
|
||||
|
||||
### Skeptic (if standard or thorough)
|
||||
|
||||
**Context to include:** Creator's proposal (focus on assumptions section).
|
||||
**Context to exclude:** Git diff details, Explorer's research, other reviewer outputs.
|
||||
|
||||
```
|
||||
Agent(
|
||||
description: "Skeptic: challenge assumptions",
|
||||
description: "🤔 Skeptic: challenge assumptions",
|
||||
prompt: "You are the SKEPTIC archetype.
|
||||
Review the changes in branch: <maker's branch>
|
||||
Review the proposal: <Creator's proposal>
|
||||
Challenge:
|
||||
1. Assumptions in the design — what if they're wrong?
|
||||
2. Alternative approaches not considered
|
||||
3. Edge cases not tested
|
||||
4. Scalability concerns
|
||||
Output: APPROVED or REJECTED with counterarguments.
|
||||
Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
|
||||
Categories: design, quality, testing, scalability
|
||||
Be constructive — every challenge must include a suggested alternative."
|
||||
)
|
||||
```
|
||||
|
||||
### Sage (if standard or thorough)
|
||||
|
||||
**Context to include:** Creator's proposal, Maker's git diff, implementation summary.
|
||||
**Context to exclude:** Explorer's raw research, other reviewer outputs.
|
||||
|
||||
```
|
||||
Agent(
|
||||
description: "Sage: holistic quality review",
|
||||
description: "📚 Sage: holistic quality review",
|
||||
prompt: "You are the SAGE archetype.
|
||||
Review the changes in branch: <maker's branch>
|
||||
Evaluate holistically:
|
||||
@@ -129,14 +158,20 @@ Agent(
|
||||
3. Documentation (does the change need docs?)
|
||||
4. Consistency with codebase patterns
|
||||
Output: APPROVED or REJECTED with quality findings.
|
||||
Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
|
||||
Categories: quality, testing, design, consistency
|
||||
Judge like a senior engineer doing a PR review."
|
||||
)
|
||||
```
|
||||
|
||||
### Trickster (if thorough only)
|
||||
|
||||
**Context to include:** Maker's git diff only.
|
||||
**Context to exclude:** Everything else — proposal, research, other reviews.
|
||||
|
||||
```
|
||||
Agent(
|
||||
description: "Trickster: adversarial testing",
|
||||
description: "🃏 Trickster: adversarial testing",
|
||||
prompt: "You are the TRICKSTER archetype.
|
||||
Try to break the changes in branch: <maker's branch>
|
||||
Attack vectors:
|
||||
@@ -145,6 +180,8 @@ Agent(
|
||||
3. Error path exploitation
|
||||
4. Dependency failure scenarios
|
||||
Output: APPROVED or REJECTED with edge cases found.
|
||||
Each finding: | file:line | CRITICAL/WARNING/INFO | category | description | fix |
|
||||
Categories: security, reliability, testing
|
||||
Think like a QA engineer who gets paid per bug found."
|
||||
)
|
||||
```
|
||||
@@ -157,11 +194,12 @@ Collect all reviewer outputs and decide:
|
||||
1. Merge the Maker's worktree branch into the target branch
|
||||
2. Report: what was implemented, what was reviewed, any warnings noted
|
||||
3. Clean up the worktree
|
||||
4. Record metrics (see Orchestration Metrics)
|
||||
|
||||
### Issues Found (and cycles remaining)
|
||||
1. Collect all findings into a feedback summary
|
||||
1. Build structured feedback using the Cycle Feedback Protocol below
|
||||
2. Go back to Step 1 (Plan) with the feedback
|
||||
3. Creator revises the proposal based on reviewer findings
|
||||
3. Creator revises the proposal, addressing each unresolved issue
|
||||
4. Maker re-implements in a fresh worktree
|
||||
5. Reviewers check again
|
||||
|
||||
@@ -170,11 +208,156 @@ Collect all reviewer outputs and decide:
|
||||
2. Present the best implementation so far (on its branch)
|
||||
3. Let the user decide: merge as-is, fix manually, or abandon
|
||||
|
||||
---
|
||||
|
||||
## Cycle Feedback Protocol
|
||||
|
||||
After the Check phase, build structured feedback for the next cycle. This replaces dumping raw reviewer output.
|
||||
|
||||
### 1. Extract Findings
|
||||
|
||||
Parse each reviewer's output into the standardized format:
|
||||
|
||||
```markdown
## Cycle N Feedback

### Unresolved Issues
| Source | Severity | Category | Issue | Route to |
|--------|----------|----------|-------|----------|
| Guardian | CRITICAL | security | SQL injection in user input | Creator |
| Skeptic | WARNING | design | Assumes single-tenant only | Creator |
| Sage | WARNING | quality | Test names don't describe behavior | Maker |
| Trickster | CRITICAL | reliability | Empty string bypasses validation | Creator |

### Resolved (from cycle N-1)
| Source | Issue | Resolution |
|--------|-------|------------|
| Guardian | Missing rate limit | Added rate limiter middleware |
```
|
||||
|
||||
### 2. Route Feedback
|
||||
|
||||
Not all findings go to the same agent:
|
||||
|
||||
| Finding source | Routes to | Rationale |
|----------------|-----------|-----------|
| Guardian (security, breaking-change) | **Creator** | Design must change |
| Skeptic (design, scalability) | **Creator** | Assumptions need revision |
| Sage (quality, consistency) | **Maker** | Implementation refinement |
| Trickster (reliability, testing) | **Creator** if design flaw, **Maker** if test gap | Depends on root cause |
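
The routing table can be sketched as a lookup with one special case. This is a minimal illustration: `route_finding` and its `is_design_flaw` flag are hypothetical, and deciding whether a Trickster finding is a design flaw or a test gap is assumed to be a judgment the orchestrator makes before calling it.

```python
# Reviewers whose findings always go to the same archetype.
ROUTES = {"Guardian": "Creator", "Skeptic": "Creator", "Sage": "Maker"}


def route_finding(source: str, is_design_flaw: bool = False) -> str:
    """Return the archetype that should receive a finding from `source`."""
    if source == "Trickster":
        # Trickster findings depend on root cause: design flaw vs. test gap.
        return "Creator" if is_design_flaw else "Maker"
    try:
        return ROUTES[source]
    except KeyError:
        raise ValueError(f"unknown reviewer: {source!r}") from None
```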
|
||||
|
||||
### 3. Track Resolution
|
||||
|
||||
Compare cycle N findings against cycle N-1:
|
||||
- If a prior finding (same category and location) no longer appears → mark **resolved**
|
||||
- If a prior finding persists → it stays **unresolved** with an incremented cycle count
|
||||
- If new findings appear → add as new unresolved issues
|
||||
|
||||
This makes regressions visible and gives the Creator/Maker a clear list of what to address.
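
The comparison can be sketched as a set difference keyed on category and location, the identity rule stated under "Why Structured Findings Matter". The `track_resolution` function and its data shapes are assumptions made for illustration only.

```python
def track_resolution(previous, current):
    """Compare two cycles of findings.

    previous: dict mapping (category, location) -> cycles the issue has been open
    current:  set of (category, location) keys found in this cycle
    Returns (resolved_keys, unresolved) where unresolved maps each open key
    to its updated cycle count.
    """
    # A prior issue that no longer appears is resolved.
    resolved = sorted(key for key in previous if key not in current)
    # Persisting issues get an incremented count; new issues start at 1.
    unresolved = {key: previous.get(key, 0) + 1 for key in current}
    return resolved, unresolved
```

A key that disappears in one cycle and returns in a later one starts again at count 1 in this sketch; detecting that kind of regression would need a history of resolved keys as well.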
|
||||
|
||||
---
|
||||
|
||||
## Orchestration Metrics
|
||||
|
||||
Track lightweight metrics throughout the orchestration. No token counting (unreliable from skill layer) — just timing and outcomes.
|
||||
|
||||
### Per-Phase Logging
|
||||
|
||||
After each phase completes, note:
|
||||
|
||||
```
| Phase | Duration | Agents | Outcome |
|-------|----------|--------|---------|
| Plan | 45s | 2 | Proposal ready (confidence: 0.8) |
| Do | 90s | 1 | 4 files changed, 8 tests added |
| Check | 60s | 3 | 1 REJECTED (Guardian), 2 APPROVED |
| Act | — | — | Cycle back → feedback built |
```
|
||||
|
||||
### Orchestration Summary
|
||||
|
||||
At orchestration end, include in the report:
|
||||
|
||||
```markdown
## Orchestration Metrics
| Metric | Value |
|--------|-------|
| Workflow | standard |
| Cycles | 2 of 2 |
| Total duration | 4m 30s |
| Agents spawned | 9 |
| Findings (total) | 5 |
| Findings (critical) | 1 |
| Findings (resolved) | 4 |
| Shadow detections | 0 |
```
|
||||
|
||||
Use this data to calibrate future workflow selection — if fast workflows consistently need 0 cycles of revision, the task was well-scoped.
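
Per-phase logging could be collected with a small timer like the one below. This is a sketch under stated assumptions: `PhaseLog` is a hypothetical helper, the clock is injectable only so the example can be checked without waiting, and durations are rendered in whole seconds.

```python
import time
from contextlib import contextmanager


class PhaseLog:
    """Collects (phase, duration, agents, outcome) rows for the metrics table."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self.rows = []

    @contextmanager
    def phase(self, name, agents):
        start = self._clock()
        result = {"outcome": ""}  # the caller fills in the outcome text
        try:
            yield result
        finally:
            # Record the row even if the phase raised.
            elapsed = self._clock() - start
            self.rows.append((name, elapsed, agents, result["outcome"]))

    def to_table(self):
        lines = ["| Phase | Duration | Agents | Outcome |",
                 "|-------|----------|--------|---------|"]
        for name, seconds, agents, outcome in self.rows:
            lines.append(f"| {name} | {seconds:.0f}s | {agents} | {outcome} |")
        return "\n".join(lines)
```

Usage would look like `with log.phase("Plan", agents=2) as r: r["outcome"] = "Proposal ready"`, followed by `log.to_table()` at the end of the cycle.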
|
||||
|
||||
---
|
||||
|
||||
## Autonomous Mode
|
||||
|
||||
When running unattended (overnight sessions, batch queues), add these behaviors to the orchestration loop:
|
||||
|
||||
### Between-Task Checkpoint
|
||||
|
||||
After each task completes (success or failure):
|
||||
1. **Commit and push** all changes immediately
|
||||
2. **Update session log** at `.archeflow/session-log.md` with task outcome
|
||||
3. **Check stop conditions** before starting next task:
|
||||
- 3 consecutive failures → STOP
|
||||
- Shadow escalation (same shadow 3+ times) → STOP
|
||||
- Test suite broken after merge → REVERT and STOP
|
||||
- Destructive action detected → STOP
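
The stop conditions above can be sketched as a single check run between tasks. The function name, argument shapes, and returned strings are hypothetical; how a "destructive action" or a broken test suite is detected is left to the caller.

```python
from collections import Counter


def stop_reason(outcomes, shadows, tests_pass_after_merge=True,
                destructive_action=False):
    """Return a stop message, or None if the next task may start.

    outcomes: list of "COMPLETED"/"FAILED" strings in task order
    shadows:  list of shadow names detected so far in the session
    """
    if len(outcomes) >= 3 and all(o == "FAILED" for o in outcomes[-3:]):
        return "STOP: 3 consecutive failures"
    repeated = [name for name, count in Counter(shadows).items() if count >= 3]
    if repeated:
        return f"STOP: shadow escalation ({repeated[0]})"
    if not tests_pass_after_merge:
        return "REVERT and STOP: test suite broken after merge"
    if destructive_action:
        return "STOP: destructive action detected"
    return None
```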
|
||||
|
||||
### Session Log Protocol
|
||||
|
||||
Write to `.archeflow/session-log.md` after each task:
|
||||
|
||||
```markdown
|
||||
## Task N: <description>
|
||||
**Workflow:** standard | **Status:** COMPLETED/FAILED
|
||||
**Cycles:** 1 of 2
|
||||
**Findings:** Guardian APPROVED, Skeptic APPROVED, Sage WARNING (test names)
|
||||
**Files changed:** 5 | **Tests added:** 12
|
||||
**Branch:** merged to main (commit abc1234) | OR: archeflow/maker-xyz (NOT merged)
|
||||
**Duration:** 8 min
|
||||
```
|
||||
|
||||
### Safety Rules
|
||||
- Never force-push. Never modify main history.
|
||||
- All work stays on worktree branches until explicitly merged
|
||||
- Merges use `--no-ff` — individually revertable
|
||||
- Failed tasks leave branches intact for manual inspection
|
||||
|
||||
For full autonomous mode details (task queues, overnight checklists, user controls): load the `archeflow:autonomous-mode` skill.
|
||||
|
||||
---
|
||||
|
||||
## Shadow Monitoring
|
||||
|
||||
During orchestration, watch for shadow activation after each agent completes. Quick checklist:
|
||||
|
||||
| Archetype | Shadow | Quick Check |
|-----------|--------|-------------|
| Explorer | Rabbit Hole | Output >2000 words without Recommendation section? |
| Creator | Over-Architect | >2 new abstractions for one feature? |
| Maker | Rogue | No test files in changeset? Files outside proposal? |
| Guardian | Paranoid | CRITICAL:WARNING ratio >2:1? Zero approvals? |
| Skeptic | Paralytic | >7 challenges? <50% have alternatives? |
| Trickster | False Alarm | Findings in untouched code? >10 findings? |
| Sage | Bureaucrat | Review >2x code change length? |
|
||||
|
||||
On detection: apply correction prompt from `archeflow:shadow-detection` skill. On second detection of same shadow: replace agent. On 3+ shadows in same cycle: escalate to user.
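
The escalation ladder can be sketched as follows. Note the assumption: "3+ shadows in same cycle" is read here as three or more detections of any kind within the cycle, which the text does not spell out. The `shadow_action` name and its return values are illustrative only.

```python
def shadow_action(shadow, seen_this_cycle):
    """Decide how to respond to a newly detected shadow.

    shadow:          name of the shadow just detected, e.g. "Rogue"
    seen_this_cycle: list of shadow names already detected in this cycle
    Returns "correct", "replace", or "escalate".
    """
    detections = list(seen_this_cycle) + [shadow]
    if len(detections) >= 3:
        return "escalate"   # 3+ shadows in the same cycle: hand off to the user
    if detections.count(shadow) >= 2:
        return "replace"    # second detection of the same shadow
    return "correct"        # first detection: apply the correction prompt
```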
|
||||
|
||||
---
|
||||
|
||||
## Orchestration Report
|
||||
|
||||
After completion, summarize:
|
||||
|
||||
```
|
||||
```markdown
|
||||
## ArcheFlow Orchestration Report
|
||||
- **Task:** <description>
|
||||
- **Workflow:** standard (2 cycles)
|
||||
@@ -183,4 +366,5 @@ After completion, summarize:
|
||||
- **Files changed:** 4 files, +120 -30 lines
|
||||
- **Tests added:** 8 new tests
|
||||
- **Branch:** archeflow/maker-<id> → merged to main
|
||||
- **Metrics:** 9 agents, 4m 30s, 5 findings (4 resolved, 1 info remaining)
|
||||
```
|
||||
|
||||
@@ -50,3 +50,42 @@ Explorer researches, then Creator designs. Sequential — Creator needs Explorer
|
||||
### Not Doing
|
||||
- <adjacent concerns deliberately excluded>
|
||||
```
|
||||
|
||||
## Creator with Prior Feedback (Cycle 2+)
|
||||
|
||||
When the Creator receives structured feedback from a prior cycle, the proposal must include an additional section addressing each unresolved issue:
|
||||
|
||||
```markdown
|
||||
## Proposal: <task> (Revision — Cycle N)
|
||||
**Confidence:** <0.0 to 1.0>
|
||||
|
||||
### Prior Feedback Response
|
||||
| Issue | Source | Action | Rationale |
|-------|--------|--------|-----------|
| SQL injection in user input | Guardian | **Fixed** — added parameterized queries | Direct security fix |
| Assumes single-tenant | Skeptic | **Deferred** — multi-tenant out of scope | Not in task requirements |
| Test names unclear | Sage | **Accepted** — routed to Maker | Implementation concern |
|
||||
|
||||
### Architecture Decision
|
||||
<revised design addressing feedback>
|
||||
|
||||
### Changes
|
||||
<updated file list>
|
||||
|
||||
### Test Strategy
|
||||
<updated test cases>
|
||||
|
||||
### Risks
|
||||
<updated risks — include any new risks from the revision>
|
||||
|
||||
### Not Doing
|
||||
<updated scope boundaries>
|
||||
```
|
||||
|
||||
**Rules for addressing feedback:**
|
||||
- **Fixed:** Changed the design to resolve the issue. Explain how.
|
||||
- **Deferred:** Not addressing now, with explicit reason. Must not be a CRITICAL finding.
|
||||
- **Accepted:** Acknowledged and routed to Maker for implementation-level fix.
|
||||
- **Disputed:** Disagrees with the finding. Must provide evidence or reasoning.
|
||||
|
||||
CRITICAL findings cannot be deferred or disputed — they must be fixed or the proposal will be rejected again.
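
A revised proposal could be checked against these rules before it reaches the reviewers. The sketch below is hypothetical: `invalid_responses` is not an ArcheFlow function, and it assumes each response has already been joined with the severity of the original finding, since the Prior Feedback Response table does not carry a severity column.

```python
ACTIONS = {"Fixed", "Deferred", "Accepted", "Disputed"}


def invalid_responses(responses):
    """Return a list of rule violations.

    responses: iterable of (issue, severity, action) tuples, where severity
    is taken from the original finding and action is one of ACTIONS.
    """
    problems = []
    for issue, severity, action in responses:
        if action not in ACTIONS:
            problems.append(f"{issue}: unknown action {action!r}")
        elif severity == "CRITICAL" and action in ("Deferred", "Disputed"):
            problems.append(
                f"{issue}: CRITICAL findings cannot be {action.lower()}")
    return problems
```

Whether a CRITICAL finding may be marked Accepted and routed to the Maker is not stated in the rules, so this sketch lets it through.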
|
||||
|
||||
@@ -30,10 +30,11 @@ Maintainability Judgment → reviews only → Bureaucrat
|
||||
- Reading more than 15 files without producing findings
|
||||
- Output is a raw inventory of files with no analysis or recommendation
|
||||
|
||||
**Triggers:**
|
||||
- Output length > 2000 words without a recommendation section
|
||||
- More than 3 "see also" or "related" tangents
|
||||
- No patterns or recommendation in output
|
||||
**Detection Checklist** (trigger on ANY):
|
||||
- [ ] Output >2000 words without a `### Recommendation` section
|
||||
- [ ] >3 tangent topics not directly related to the original task
|
||||
- [ ] >15 files read with no `### Patterns` identified
|
||||
- [ ] No synthesis language (recommend, suggest, conclusion, finding, summary) in final 25% of output
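
Three of the four checks above are mechanical and can be sketched in code; the tangent count needs human or model judgment and is left out. The function name, the word-splitting rule, and the keyword pattern are assumptions made for illustration, not the skill's actual implementation.

```python
import re

# Word stems treated as "synthesis language" (taken from the checklist).
SYNTHESIS = re.compile(r"\b(recommend|suggest|conclusion|finding|summary)",
                       re.IGNORECASE)


def explorer_rabbit_hole(output, files_read=0):
    """Return the checklist items that the Explorer output triggers."""
    reasons = []
    words = output.split()
    if len(words) > 2000 and "### Recommendation" not in output:
        reasons.append(">2000 words without a Recommendation section")
    if files_read > 15 and "### Patterns" not in output:
        reasons.append(">15 files read with no Patterns identified")
    # Look for synthesis language in the final 25% of the output.
    tail = " ".join(words[len(words) * 3 // 4:])
    if not SYNTHESIS.search(tail):
        reasons.append("no synthesis language in final 25% of output")
    return reasons
```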
|
||||
|
||||
**Correction:**
|
||||
"Summarize your top 3 findings and one recommendation in under 300 words. If your output has no Recommendation section, add one. A dump is not research."
|
||||
@@ -49,10 +50,11 @@ Maintainability Judgment → reviews only → Bureaucrat
|
||||
- Configuration systems for things that could be constants
|
||||
- Proposal has more infrastructure than business logic
|
||||
|
||||
**Triggers:**
|
||||
- More than 2 new abstractions (interfaces, base classes, factories) for a single feature
|
||||
- "In the future we might need..." appears in rationale
|
||||
- Proposal scope exceeds original task by > 50%
|
||||
**Detection Checklist** (trigger on ANY):
|
||||
- [ ] >2 new abstractions (interfaces, base classes, factories, registries) for a single feature
|
||||
- [ ] "In the future we might need..." or "future-proof" appears in rationale
|
||||
- [ ] Proposal scope (files changed) exceeds original task scope by >50%
|
||||
- [ ] More than 1 new package/module introduced for a single feature
|
||||
|
||||
**Correction:**
|
||||
"Design for the current order of magnitude. If the app has 1000 users, design for 10,000 — not 10 million. Remove abstractions that serve hypothetical requirements."
|
||||
@@ -68,10 +70,11 @@ Maintainability Judgment → reviews only → Bureaucrat
|
||||
- Large uncommitted working tree
|
||||
- Files changed that aren't mentioned in the proposal
|
||||
|
||||
**Triggers:**
|
||||
- No test files in the changeset
|
||||
- Single monolithic commit instead of incremental commits
|
||||
- Diff contains files not listed in the Creator's proposal
|
||||
**Detection Checklist** (trigger on ANY):
|
||||
- [ ] Zero test files (`.test.`, `.spec.`, `_test.`) in the changeset with >=3 files changed
|
||||
- [ ] Single monolithic commit instead of incremental commits
|
||||
- [ ] Diff contains files not listed in the Creator's proposal `### Changes` section
|
||||
- [ ] No evidence of running existing test suite before finishing
|
||||
|
||||
**Correction:**
|
||||
"Read the proposal. Write a test. Commit what you have. Revert changes to files not in the proposal. Then continue."
|
||||
@@ -87,10 +90,11 @@ Maintainability Judgment → reviews only → Bureaucrat
|
||||
- Rejecting without suggesting how to fix
|
||||
- Security concerns for internal-only code at external-API severity
|
||||
|
||||
**Triggers:**
|
||||
- CRITICAL:WARNING ratio > 2:1
|
||||
- Zero APPROVED verdicts in 3+ consecutive reviews
|
||||
- Less than 50% of findings include a suggested fix
|
||||
**Detection Checklist** (trigger on ANY):
|
||||
- [ ] CRITICAL:WARNING ratio >2:1 (with minimum 3 total findings)
|
||||
- [ ] Zero APPROVED verdicts in 3+ consecutive reviews
|
||||
- [ ] <50% of findings include a suggested fix in the `Fix` column
|
||||
- [ ] Findings reference attack scenarios that require already-compromised internal systems
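
The quantitative items in this checklist can be sketched as below. The last item (attack scenarios that require already-compromised systems) needs judgment and is omitted. The `guardian_paranoid` name and the input shapes are hypothetical.

```python
def guardian_paranoid(findings, recent_verdicts):
    """Return the checklist items that a Guardian review triggers.

    findings:        list of (severity, fix) pairs from the current review
    recent_verdicts: list of "APPROVED"/"REJECTED" strings, newest last
    """
    reasons = []
    critical = sum(1 for severity, _ in findings if severity == "CRITICAL")
    warning = sum(1 for severity, _ in findings if severity == "WARNING")
    # Ratio >2:1, written as a product so zero warnings does not divide by zero.
    if len(findings) >= 3 and critical > 2 * warning:
        reasons.append("CRITICAL:WARNING ratio >2:1")
    if len(recent_verdicts) >= 3 and "APPROVED" not in recent_verdicts[-3:]:
        reasons.append("zero APPROVED verdicts in 3 consecutive reviews")
    with_fix = sum(1 for _, fix in findings if fix.strip())
    if findings and with_fix / len(findings) < 0.5:
        reasons.append("<50% of findings include a fix")
    return reasons
```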
|
||||
|
||||
**Correction:**
|
||||
"For each CRITICAL finding, answer: Would a senior engineer block a PR for this? If not, downgrade. Every rejection must include a specific, implementable fix."
|
||||
@@ -106,10 +110,11 @@ Maintainability Judgment → reviews only → Bureaucrat
|
||||
- "What about X?" chains that drift from the task
|
||||
- Restating the same concern in different words
|
||||
|
||||
**Triggers:**
|
||||
- Challenge count > 7
|
||||
- Less than 50% of challenges include alternatives
|
||||
- Same conceptual concern raised multiple times
|
||||
**Detection Checklist** (trigger on ANY):
|
||||
- [ ] >7 findings/challenges raised in a single review
|
||||
- [ ] <50% of findings include an alternative in the `Fix` column
|
||||
- [ ] Same conceptual concern appears 2+ times with different wording
|
||||
- [ ] >3 findings reference code or scenarios outside the task scope
|
||||
|
||||
**Correction:**
|
||||
"Rank your challenges by impact. Keep the top 3. Each must include a specific alternative. Delete the rest."
|
||||
@@ -125,13 +130,14 @@ Maintainability Judgment → reviews only → Bureaucrat
|
||||
- 20 findings when 3 good ones would cover the real risks
|
||||
- Edge cases for edge cases (diminishing returns)
|
||||
|
||||
**Triggers:**
|
||||
- Findings reference code untouched by the implementation
|
||||
- More than 10 findings for a small change
|
||||
- Findings describe scenarios that can't happen in the actual deployment context
|
||||
**Detection Checklist** (trigger on ANY):
|
||||
- [ ] Any finding references code untouched by the Maker's diff
|
||||
- [ ] >10 findings for a change touching <5 files
|
||||
- [ ] Findings describe scenarios requiring conditions that can't occur in the deployment context
|
||||
- [ ] >3 findings without reproduction steps
|
||||
|
||||
**Correction:**
|
||||
"Quality over quantity. Delete findings outside the Maker's diff. Rank remaining by likelihood × impact. Keep top 3-5. Three real findings beat twenty noise."
|
||||
"Quality over quantity. Delete findings outside the Maker's diff. Rank remaining by likelihood x impact. Keep top 3-5. Three real findings beat twenty noise."
|
||||
|
||||
---
|
||||
|
||||
@@ -144,10 +150,11 @@ Maintainability Judgment → reviews only → Bureaucrat
|
||||
- Suggesting refactors unrelated to the current task
|
||||
- Deep-sounding analysis that doesn't end with a specific action
|
||||
|
||||
**Triggers:**
|
||||
- Review word count > 2x the code change's word count
|
||||
- Suggestions reference files not in the changeset
|
||||
- Findings contain "consider" or "think about" without a specific action
|
||||
**Detection Checklist** (trigger on ANY):
|
||||
- [ ] Review word count >2x the code change's line count (rough: review words > diff lines x 2)
|
||||
- [ ] Any finding references files not in the Maker's changeset
|
||||
- [ ] >2 findings use "consider" or "think about" without a concrete action in the `Fix` column
|
||||
- [ ] Suggesting documentation for functions with <5 lines or self-descriptive names
|
||||
|
||||
**Correction:**
|
||||
"Limit your review to issues that affect maintainability in the next 6 months. Every finding must end with a specific action. If you can't state the consequence of NOT fixing it, don't raise it."
|
||||
|
||||
@@ -3,21 +3,50 @@ name: using-archeflow
|
||||
description: Use at session start when implementing features, reviewing code, debugging, or any task that benefits from multiple perspectives. Activates ArcheFlow multi-agent orchestration.
|
||||
---
|
||||
|
||||
# ArcheFlow
|
||||
# ArcheFlow — Active
|
||||
|
||||
Multi-agent orchestration using archetypal roles and PDCA quality cycles.
|
||||
|
||||
## IMPORTANT: When to Activate
|
||||
|
||||
You MUST use ArcheFlow orchestration (load `archeflow:orchestration` skill and follow its steps) for any task that matches:
|
||||
|
||||
- **New features** — any feature touching 2+ files
|
||||
- **Refactoring** — structural changes across modules
|
||||
- **Security-sensitive changes** — auth, encryption, input handling, API keys
|
||||
- **Bug fixes with unclear root cause** — use Explorer to investigate first
|
||||
- **Code review requests** — spawn Guardian + relevant reviewers
|
||||
- **Multi-file changes** — anything beyond a single-file edit
|
||||
|
||||
Choose the workflow based on risk:
|
||||
|
||||
| Signal | Workflow | Archetypes |
|--------|----------|------------|
| Small fix, low risk, single concern | `fast` | Creator → Maker → Guardian |
| Feature, multiple files, moderate risk | `standard` | Explorer + Creator → Maker → Guardian + Skeptic + Sage |
| Security-sensitive, breaking changes, public API | `thorough` | Explorer + Creator → Maker → All 4 reviewers |
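
The selection table can be sketched as a small decision function. The signal names and the two-file threshold are assumptions chosen to match the criteria listed above; in practice the choice is a judgment call rather than a fixed rule.

```python
# Archetypes spawned by each workflow, in order of execution.
WORKFLOWS = {
    "fast": ["Creator", "Maker", "Guardian"],
    "standard": ["Explorer", "Creator", "Maker",
                 "Guardian", "Skeptic", "Sage"],
    "thorough": ["Explorer", "Creator", "Maker",
                 "Guardian", "Skeptic", "Sage", "Trickster"],
}


def pick_workflow(files_touched, security_sensitive=False,
                  breaking_change=False, public_api=False):
    """Pick a workflow name from coarse risk signals (hypothetical helper)."""
    if security_sensitive or breaking_change or public_api:
        return "thorough"
    if files_touched >= 2:
        return "standard"
    return "fast"
```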
|
||||
|
||||
## When to Skip ArcheFlow
|
||||
|
||||
Do NOT use ArcheFlow for these — just do them directly:
|
||||
|
||||
- Single-line fixes, typos, formatting
|
||||
- Answering questions (no code changes)
|
||||
- Reading/exploring code without making changes
|
||||
- Config changes to a single file
|
||||
- Git operations (commit, push, branch)
|
||||
|
||||
## Archetypes
|
||||
|
||||
| Archetype | Virtue | Shadow | Phase |
|-----------|--------|--------|-------|
| **Explorer** | Contextual Clarity | Rabbit Hole | Plan |
| **Creator** | Decisive Framing | Over-Architect | Plan |
| **Maker** | Execution Discipline | Rogue | Do |
| **Guardian** | Threat Intuition | Paranoid | Check |
| **Skeptic** | Assumption Surfacing | Paralytic | Check |
| **Trickster** | Adversarial Creativity | False Alarm | Check |
| **Sage** | Maintainability Judgment | Bureaucrat | Check |
|
||||
| Archetype | Avatar | Virtue | Shadow | Phase |
|-----------|--------|--------|--------|-------|
| **Explorer** | 🔍 | Contextual Clarity | Rabbit Hole | Plan |
| **Creator** | 🏗️ | Decisive Framing | Over-Architect | Plan |
| **Maker** | ⚒️ | Execution Discipline | Rogue | Do |
| **Guardian** | 🛡️ | Threat Intuition | Paranoid | Check |
| **Skeptic** | 🤔 | Assumption Surfacing | Paralytic | Check |
| **Trickster** | 🃏 | Adversarial Creativity | False Alarm | Check |
| **Sage** | 📚 | Maintainability Judgment | Bureaucrat | Check |
|
||||
|
||||
## PDCA Cycle
|
||||
|
||||
@@ -28,24 +57,20 @@ Check → Reviewers assess in parallel (approve/reject)
|
||||
Act → All approved? Merge. Issues? Cycle back to Plan.
|
||||
```
|
||||
|
||||
## Workflows
|
||||
## Quick Start
|
||||
|
||||
| Workflow | Archetypes | Cycles |
|----------|-----------|:------:|
| `fast` | Creator → Maker → Guardian | 1 |
| `standard` | Explorer + Creator → Maker → Guardian + Skeptic + Sage | 2 |
| `thorough` | Explorer + Creator → Maker → All 4 reviewers | 3 |
|
||||
When the user gives an implementation task:
|
||||
|
||||
## When to Use
|
||||
1. Assess: does this need ArcheFlow? (see criteria above)
|
||||
2. If yes: load `archeflow:orchestration` skill
|
||||
3. Pick workflow (fast/standard/thorough)
|
||||
4. Execute the PDCA steps from the orchestration skill
|
||||
|
||||
**Use** for features spanning multiple files, security-sensitive changes, or when multiple perspectives help.
|
||||
**Skip** for single-file fixes, formatting, or purely informational tasks.
|
||||
## Skills Reference
|
||||
|
||||
## Skills
|
||||
|
||||
- **archeflow:orchestration** — Step-by-step execution guide
|
||||
- **archeflow:orchestration** — Step-by-step execution guide (load this to run)
|
||||
- **archeflow:plan-phase** / **do-phase** / **check-phase** — Phase protocols
|
||||
- **archeflow:shadow-detection** — Recognizing dysfunction
|
||||
- **archeflow:shadow-detection** — Recognizing and correcting dysfunction
|
||||
- **archeflow:attention-filters** — What context each archetype receives
|
||||
- **archeflow:autonomous-mode** — Unattended sessions
|
||||
- **archeflow:custom-archetypes** / **workflow-design** — Extending ArcheFlow
|
||||
|
||||