feat: principles #26-#32 — PDCA, prod testing, changelog, emergency stop, guardian, debug logs, multi-auth
87
README.md
@@ -269,6 +269,93 @@ When an agent needs a tool that isn't installed, it should install it automatica
---

## Quality & Process

### 26. PDCA Every Sprint

Run a Plan-Do-Check-Act cycle after every sprint, not just at the end of the project. The Check step catches bugs before they compound.

- Plan: define features and acceptance criteria
- Do: implement with the team, commit after each feature
- Check: test in production, read debug logs, try bad inputs, verify on mobile
- Act: fix everything found before starting the next sprint
- Never skip Check. A shipped bug costs 10x more than a caught bug.

**Origin:** Sprints 1-3 each had a PDCA cycle that caught rate limiting issues, SSE race conditions, and Caddy routing gaps.

### 27. Test in Production, Not in Mocks

For single-user tools, test against the real deployment. Mocks hide integration bugs.

- `curl` against the live API after each deploy
- Try the PWA on your actual phone
- Submit real jobs through the real worker
- Read the real debug logs

**Origin:** "Commit regularly and test in production, no mocks!"
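
The post-deploy check against the live API can also be scripted. The following is a minimal sketch, not the project's actual tooling: the function names, the example paths, and the "any non-2xx status is a failure" rule are all assumptions made for illustration.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # DNS failure, refused connection, timeout, ...


def smoke_check(
    base_url: str,
    paths: Iterable[str],
    fetch: Callable[[str], int] = http_status,
) -> Dict[str, int]:
    """GET each path on the real deployment and return only the failures."""
    failures: Dict[str, int] = {}
    for path in paths:
        status = fetch(base_url.rstrip("/") + path)
        if not 200 <= status < 300:
            failures[path] = status
    return failures


if __name__ == "__main__":
    # Hypothetical host and paths; substitute the real deployment.
    print(smoke_check("https://example.invalid", ["/health", "/jobs"]))
```

Run it right after each deploy and treat a non-empty result as a failed Check step. The `fetch` parameter exists only so the function can be exercised without a network.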

### 28. Changelog as First-Class Artifact

Every project gets a `CHANGELOG.md`, updated with every sprint. The user should never have to ask "what changed?"

- Reverse-chronological, grouped by version/sprint
- Include Added/Changed/Security/Fixed sections
- Link to relevant commits if helpful
- Update it DURING the sprint, not after

**Origin:** "I need good changelogs to stay up to date on everything."

### 29. Emergency Stop (Not-Aus)

Every autonomous system needs a kill switch: one button that kills everything, with no confirmation cascade.

- Cancel all running jobs immediately
- Pause the system (workers stop polling)
- Log the event as critical
- Resume button to unpause
- Visible at all times, not buried in a menu

**Origin:** "And we need an emergency stop button ;)"
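
The server-side behaviour listed above could look roughly like this. It is a sketch under assumptions: the `EmergencyStop` class, its method names, and the use of one `threading.Event` per job are illustrative and not taken from the project.

```python
import logging
import threading
from typing import Dict, List

log = logging.getLogger("emergency_stop")


class EmergencyStop:
    """Single kill switch: pause polling, cancel all jobs, log as critical."""

    def __init__(self) -> None:
        self._paused = threading.Event()
        self._lock = threading.Lock()
        self._running: Dict[str, threading.Event] = {}

    def register(self, job_id: str) -> threading.Event:
        """Track a job; the job must exit once the returned event is set."""
        cancel = threading.Event()
        with self._lock:
            self._running[job_id] = cancel
        return cancel

    def finish(self, job_id: str) -> None:
        """Forget a job that ended on its own."""
        with self._lock:
            self._running.pop(job_id, None)

    def may_poll(self) -> bool:
        """Workers call this before fetching new work."""
        return not self._paused.is_set()

    def stop(self) -> List[str]:
        """Kill everything in one call, with no confirmation step."""
        self._paused.set()  # pause first so no new jobs start
        with self._lock:
            cancelled = list(self._running)
            for cancel in self._running.values():
                cancel.set()
            self._running.clear()
        log.critical("EMERGENCY STOP: cancelled %d job(s): %s",
                     len(cancelled), cancelled)
        return cancelled

    def resume(self) -> None:
        """Unpause the system; cancelled jobs are not restarted."""
        self._paused.clear()
        log.warning("System resumed after emergency stop")
```

Pausing before cancelling is a deliberate ordering: otherwise a worker could pick up a fresh job while the running ones are still being cancelled.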

### 30. Self-Monitoring (Guardian Pattern)

The system monitors itself. A background watchdog checks health every N minutes and logs findings.

- Check: stuck jobs, dead workers, error spikes, DB connectivity
- Log structured findings to a queryable debug_log
- Agent can read the logs to self-diagnose
- Future: alert the user via push/webhook when degraded
- Clean up old logs automatically

**Origin:** "We should have a guardian who checks every other minute what's going on."
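
A watchdog of this kind can be sketched as a pure check function plus a small loop. Everything here is an assumption for illustration: the thresholds, the `Finding` fields, and the `snapshot`/`sink` callables are hypothetical, and only the stuck-job and dead-worker checks are shown.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

STUCK_AFTER_S = 600  # assumed: a job running longer than this is "stuck"
DEAD_AFTER_S = 120   # assumed: a worker silent longer than this is "dead"


@dataclass
class Finding:
    level: str
    component: str
    message: str


def check_health(
    job_started_at: Dict[str, float],
    worker_last_seen: Dict[str, float],
    now: Optional[float] = None,
) -> List[Finding]:
    """Return structured findings for stuck jobs and dead workers."""
    now = time.time() if now is None else now
    findings: List[Finding] = []
    for job_id, started in job_started_at.items():
        if now - started > STUCK_AFTER_S:
            findings.append(Finding(
                "warning", "jobs",
                f"job {job_id} running for {int(now - started)}s"))
    for worker_id, seen in worker_last_seen.items():
        if now - seen > DEAD_AFTER_S:
            findings.append(Finding(
                "error", "workers",
                f"worker {worker_id} silent for {int(now - seen)}s"))
    return findings


State = Tuple[Dict[str, float], Dict[str, float]]


def run_guardian(
    snapshot: Callable[[], State],
    sink: Callable[[Finding], None],
    stop: threading.Event,
    interval_s: float = 120.0,  # "every other minute"
) -> None:
    """Check health, hand findings to the sink, sleep, repeat until stopped."""
    while not stop.is_set():
        jobs, workers = snapshot()
        for finding in check_health(jobs, workers):
            sink(finding)
        stop.wait(interval_s)  # returns early when stop is set
```

In practice `run_guardian` would run in a daemon thread, with `sink` writing to the debug log. Keeping `check_health` free of I/O makes it easy to test with fixed timestamps.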

### 31. Debug Logs as Agent Interface

Structured debug logs aren't just for humans. They are also an API the agent uses to understand system health.

- Queryable by level, component, time range
- Secret-safe (auto-redact tokens, keys, passwords)
- Agent reads them between sprints to catch issues
- Self-healing: agent detects error patterns and applies fixes

**Origin:** Built during dispatch development, where the agent reads `/debug/logs` to diagnose production issues.
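
The two core properties, redaction on write and filtering on read, can be sketched in a few lines. This is an in-memory illustration only: the `DebugLog` class, its field names, and the redaction pattern are assumptions, and the pattern covers only simple `key=value` or `key: value` secrets.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed pattern: a sensitive key name, then "=" or ":", then the value.
_SECRET = re.compile(
    r"\b(token|api[_-]?key|password|secret)\b(\s*[=:]\s*)(\S+)",
    re.IGNORECASE,
)


def redact(message: str) -> str:
    """Replace the value part of anything that looks like a secret."""
    return _SECRET.sub(lambda m: m.group(1) + m.group(2) + "[REDACTED]", message)


@dataclass
class Entry:
    ts: float
    level: str
    component: str
    message: str


class DebugLog:
    """In-memory stand-in for a queryable debug log."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def write(self, ts: float, level: str, component: str, message: str) -> None:
        # Redact before storing, so secrets never reach the log at all.
        self._entries.append(Entry(ts, level, component, redact(message)))

    def query(
        self,
        level: Optional[str] = None,
        component: Optional[str] = None,
        since: Optional[float] = None,
        until: Optional[float] = None,
    ) -> List[Entry]:
        """Filter by level, component, and time range; None means no filter."""
        return [
            e for e in self._entries
            if (level is None or e.level == level)
            and (component is None or e.component == component)
            and (since is None or e.ts >= since)
            and (until is None or e.ts <= until)
        ]
```

Redacting at write time rather than at read time means every consumer, human or agent, sees the same safe view. A real deployment would need a broader pattern set, for example for bearer headers.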

### 32. Multi-Layer Auth for Admin Endpoints

Regular API operations and admin/debug operations need different auth levels.

- Regular token: job CRUD, worker operations
- Admin token: debug logs, stats, worker management, emergency stop
- Rate limiting: stricter on admin endpoints
- Never share the same token for both levels

**Origin:** "I hope we have multi-layer authentication behind that..."
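
The token split can be sketched as a single authorization check. The route prefixes (other than `/debug/logs`, which the README mentions) and the function names are hypothetical, and rate limiting is not shown.

```python
import hmac
from typing import Tuple

# Assumed admin route prefixes; only /debug/ is grounded in the README.
ADMIN_PREFIXES: Tuple[str, ...] = ("/debug/", "/stats", "/admin/")


def required_level(path: str) -> str:
    """Map a request path to the auth level it needs."""
    return "admin" if path.startswith(ADMIN_PREFIXES) else "regular"


def authorize(path: str, presented: str, regular_token: str, admin_token: str) -> bool:
    """Accept the request only if the token matches the level the path needs."""
    if hmac.compare_digest(regular_token.encode(), admin_token.encode()):
        # Enforce "never share the same token for both levels".
        raise ValueError("regular and admin tokens must differ")
    if required_level(path) == "admin":
        expected = admin_token
    else:
        expected = regular_token
    # Constant-time comparison avoids leaking token contents via timing.
    return hmac.compare_digest(presented.encode(), expected.encode())
```

In this sketch the admin token does not unlock regular endpoints. That is a design choice that keeps the two levels fully separate; a deployment could instead let the admin token act as a superset.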

---

## (inbox — unsorted ideas)

- **Least-privilege agent access**: Agents should SSH as a dedicated non-root user (e.g. `deploy@`) with scoped sudo for only what they need (systemctl, caddy reload). No root SSH long-term.
- **Immutable deploy artifacts**: Agent builds a tarball/image, uploads it, runs a deploy script. Never edits files in-place on production.

_Drop new principles here. They get organized on next pass._