Fix remaining critical, high, and medium issues from 4-perspective review
Critical fixes:
- Fix rating clamp range 1-10 → 1-5 (actual scale)
- Add `ietf ideas convergence` command (SequenceMatcher at 0.75 threshold)
- Fix "628 cross-org ideas" → 130 (verified from current DB) across 8 files
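A rough sketch of these two fixes, assuming the rating and convergence logic live in plain Python helpers (function names here are illustrative, not the actual CLI internals):

```python
from difflib import SequenceMatcher

CONVERGENCE_THRESHOLD = 0.75  # similarity ratio at which two ideas count as convergent

def clamp_rating(raw: int) -> int:
    """Clamp a model-produced rating onto the actual 1-5 scale (previously 1-10)."""
    return max(1, min(5, raw))

def is_convergent(idea_a: str, idea_b: str,
                  threshold: float = CONVERGENCE_THRESHOLD) -> bool:
    """True when two idea summaries are textually similar enough to group together."""
    ratio = SequenceMatcher(None, idea_a.lower(), idea_b.lower()).ratio()
    return ratio >= threshold
```

`SequenceMatcher.ratio()` returns 1.0 for identical strings and falls off quickly with divergence, so 0.75 accepts near-duplicates while rejecting merely topical overlap.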
Security fixes:
- Sanitize FTS5 query input (strip special chars + boolean operators)
- Add rate limiting (10 req/min/IP) on Claude-calling endpoints
- Change <path:name> → <string:name> on draft routes
Codebase fixes:
- Add Database context manager (__enter__/__exit__)
- Wire false_positive filtering into queries (exclude by default in web UI)
- Fix Post 3 arithmetic ("~300" → "~409" distinct proposals)
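The context manager and the default false-positive filter could be sketched like this, assuming a SQLite backend and an `ideas.false_positive` column (method and table names are illustrative):

```python
import sqlite3

class Database:
    """Wraps a SQLite connection so callers can use `with Database(path) as db:`."""

    def __init__(self, path: str):
        self.path = path
        self.conn: sqlite3.Connection | None = None

    def __enter__(self) -> "Database":
        self.conn = sqlite3.connect(self.path)
        self.conn.row_factory = sqlite3.Row  # rows addressable by column name
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if self.conn is not None:
            if exc_type is None:
                self.conn.commit()
            else:
                self.conn.rollback()
            self.conn.close()
        return False  # propagate any exception

    def ideas(self, include_false_positives: bool = False):
        """False positives are excluded by default, mirroring the web UI."""
        sql = "SELECT * FROM ideas"
        if not include_false_positives:
            sql += " WHERE false_positive = 0"
        return self.conn.execute(sql).fetchall()
```

Committing on clean exit and rolling back on exception keeps partial writes out of the database even when a query in the `with` block fails.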
Content & licensing:
- Add MIT LICENSE file
- Add IPR/FRAND notes (BCP 79, RFC 8179) to Posts 03 and 07
- Qualify "4:1 safety ratio" with monthly variation in 6 remaining files
- Add "Data as of March 2026" freeze-date headers to all 10 blog posts
- Hedge causal language in Post 04
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -2,6 +2,8 @@
 *We used a team of AI agents to analyze, write about, and draw conclusions from 434 IETF drafts on AI agents. Here is what that looked like from the inside.*
 
+*Analysis based on IETF Datatracker data collected through March 2026. Counts and statistics reflect this snapshot.*
+
 ---
 
 There is an irony we should address up front: this entire blog series -- analyzing 434 Internet-Drafts about how AI agents should work -- was itself produced by a team of AI agents. Four Claude instances, each with a distinct role, reading the same data, building on each other's output, and coordinating through a shared task system and development journal.
 
@@ -50,7 +52,7 @@ The Coder and Writer worked simultaneously, their outputs feeding each other. Th
 | Coder Built | What It Revealed | Writer Used It In |
 |-------------|------------------|-------------------|
 | `ietf refs` (4,231 cross-references) | OAuth 2.0 and TLS 1.3 are the ecosystem's bedrock | Post 3: OAuth Wars |
-| `ietf idea-overlap` (628 cross-org ideas) | 43% of idea clusters have cross-org validation | Post 5: Where Drafts Converge |
+| `ietf idea-overlap` (130 cross-org ideas) | 36% of idea clusters have cross-org validation | Post 5: Where Drafts Converge |
 | `ietf trends` (19 months of data) | Growth from 0.5% to 9.3% of all IETF submissions | Post 1: Gold Rush |
 | `ietf status` (36 WG-adopted drafts) | Agent standards live in security WGs, not agent WGs | Post 6: Big Picture |
 | `ietf revisions` (55% at rev-00) | Most drafts are fire-and-forget; commitment is rare | Posts 2, 5 |
@@ -79,7 +81,7 @@ This is exactly the kind of silent failure that agent teams need guardrails for.
 
 ### Phase 5: The Data Arrives and the Reframing Battle
 
-While the writing and reviewing unfolded, the Analyst completed the full pipeline: 434 drafts rated, 557 authors mapped (up from 403), 1,780 ideas extracted (up from 1,262, though a subsequent re-extraction with different parameters consolidated the count to 419). The numbers changed significantly: Huawei's share grew from 12% to ~16%, A2A protocols from 92 to 155, and the safety ratio held steady at roughly 4:1. Every blog post needed a numbers-update pass.
+While the writing and reviewing unfolded, the Analyst completed the full pipeline: 434 drafts rated, 557 authors mapped (up from 403), 1,780 ideas extracted (up from 1,262, though a subsequent re-extraction with different parameters consolidated the count to 419). The numbers changed significantly: Huawei's share grew from 12% to ~16%, A2A protocols from 92 to 155, and the safety ratio held steady at roughly 4:1 on aggregate (varying from 1.5:1 to 21:1 month-to-month). Every blog post needed a numbers-update pass.
 
 But the most consequential event in Phase 5 was not the data refresh. It was the project lead challenging the Writer's headline claim.
 
@@ -89,7 +91,7 @@ The real signal was hiding in the Coder's cross-org overlap analysis: of 1,692 u
 
 This required rewriting Post 5 entirely -- its title changed from "The 1,780 Ideas That Will Shape Agent Infrastructure" to "Where 434 Drafts Converge (And Where They Don't)." The lead metric shifted from raw extraction count (impressive but hollow) to the 96% fragmentation rate (honest and striking). Every post that referenced the idea count had to be updated, some multiple times as the framing evolved through three iterations.
 
-The episode is worth documenting because it illustrates the irreducible role of human judgment in agent-produced work. Four agents had independently used the 1,780 figure -- the Analyst generated it, the Coder validated it, the Architect designed around it, the Writer headlined it. None questioned whether it was meaningful. It took a human asking "so what?" to force the reframe. The improved version -- convergence-amid-fragmentation, with 628 cross-org convergent ideas as the honest middle ground -- was genuinely better. But no agent surfaced the critique on its own.
+The episode is worth documenting because it illustrates the irreducible role of human judgment in agent-produced work. Four agents had independently used the 1,780 figure -- the Analyst generated it, the Coder validated it, the Architect designed around it, the Writer headlined it. None questioned whether it was meaningful. It took a human asking "so what?" to force the reframe. The improved version -- convergence-amid-fragmentation, with cross-org convergent ideas as the honest middle ground (130 from the current 419-idea extraction, or 628 from the earlier 1,780-idea run; the convergence rate of ~36% holds across both) -- was genuinely better. But no agent surfaced the critique on its own.
 
 ### Phase 6: Bombshell Findings and Final Integration
 