Fix remaining critical, high, and medium issues from 4-perspective review

Critical fixes:
- Fix rating clamp range 1-10 → 1-5 (actual scale)
- Add `ietf ideas convergence` command (SequenceMatcher at 0.75 threshold)
- Fix "628 cross-org ideas" → 130 (verified from current DB) across 8 files
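
Illustrative sketch of the convergence check (not the committed code): the only details taken from this commit are `difflib.SequenceMatcher` and the 0.75 threshold. The greedy clustering, the `(title, org)` input shape, and the function names `similar` and `cross_org_clusters` are assumptions made for the example.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.75  # similarity cutoff named in the commit message


def similar(a: str, b: str, threshold: float = THRESHOLD) -> bool:
    """Return True when two idea titles are near-duplicates (case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cross_org_clusters(ideas):
    """Greedily cluster (title, org) pairs; keep clusters spanning 2+ orgs.

    Hypothetical helper: the real command may cluster differently.
    """
    clusters = []  # each cluster: {"title": str, "orgs": set of org names}
    for title, org in ideas:
        for cluster in clusters:
            if similar(title, cluster["title"]):
                cluster["orgs"].add(org)
                break
        else:
            # No existing cluster matched, so this title starts a new one.
            clusters.append({"title": title, "orgs": {org}})
    return [c for c in clusters if len(c["orgs"]) >= 2]
```

A cluster counts as cross-org only when at least two distinct organizations contributed a matching title, which is the property the "cross-org convergent ideas" figure depends on.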

Security fixes:
- Sanitize FTS5 query input (strip special chars + boolean operators)
- Add rate limiting (10 req/min/IP) on Claude-calling endpoints
- Change <path:name> → <string:name> on draft routes
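
Illustrative sketch of the FTS5 input sanitization (not the committed code): it strips punctuation that FTS5 treats as query syntax and drops the uppercase operator keywords. The function name `sanitize_fts5_query` and the exact character policy are assumptions; only the "strip special chars + boolean operators" behavior comes from this commit.

```python
import re

# FTS5 recognizes these keywords as operators only when written in uppercase.
_FTS5_OPERATORS = {"AND", "OR", "NOT", "NEAR"}


def sanitize_fts5_query(raw: str) -> str:
    """Reduce user input to plain search terms (hypothetical helper)."""
    # Replace everything except word characters and whitespace with a space,
    # which removes quotes, parentheses, '*', ':', '^', '-' and similar syntax.
    cleaned = re.sub(r"[^\w\s]", " ", raw)
    # Drop operator keywords so the input cannot alter the query structure.
    tokens = [t for t in cleaned.split() if t not in _FTS5_OPERATORS]
    return " ".join(tokens)
```

The sanitized string should still be passed to SQLite as a bound parameter rather than interpolated into the SQL text.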

Codebase fixes:
- Add Database context manager (__enter__/__exit__)
- Wire false_positive filtering into queries (exclude by default in web UI)
- Fix Post 3 arithmetic ("~300" → "~409" distinct proposals)
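
Illustrative sketch of a `Database` context manager (not the committed code): the class name and the `__enter__`/`__exit__` pair come from this commit, while the constructor signature, the SQLite backend, and the commit-on-success, rollback-on-error policy are assumptions.

```python
import sqlite3


class Database:
    """Minimal wrapper that opens a connection on entry and closes it on exit."""

    def __init__(self, path: str = ":memory:"):
        self.path = path
        self.conn = None

    def __enter__(self):
        self.conn = sqlite3.connect(self.path)
        return self

    def __exit__(self, exc_type, exc, tb):
        # Commit when the block succeeded, roll back when it raised.
        if exc_type is None:
            self.conn.commit()
        else:
            self.conn.rollback()
        self.conn.close()
        self.conn = None
        return False  # never swallow the caller's exception
```

Used as `with Database(path) as db: ...`, the connection is released even when the body raises.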

Content & licensing:
- Add MIT LICENSE file
- Add IPR/FRAND notes (BCP 79, RFC 8179) to Posts 03 and 07
- Qualify "4:1 safety ratio" with monthly variation in 6 remaining files
- Add "Data as of March 2026" freeze-date headers to all 10 blog posts
- Hedge causal language in Post 04

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 12:47:47 +01:00
parent f1a0b0264c
commit e7527ad68e
40 changed files with 1005 additions and 169 deletions


@@ -4,6 +4,8 @@
*Architectural design document governing the 7-post blog series. This document has two sections: (A) the internal narrative architecture (for the team), and (B) the reader-facing series introduction (for publication).*
*Analysis based on IETF Datatracker data collected through March 2026. Counts and statistics reflect this snapshot.*
---
# PART A: NARRATIVE ARCHITECTURE (Internal)
@@ -16,9 +18,9 @@ The data tells a story in three acts:
1. **The Gold Rush** (Posts 1-2): An explosion of activity, concentrated in surprising hands. 434 drafts, rapid growth in 9 months, one company writing ~16% of all drafts, Western tech giants dramatically underrepresented.
2. **The Fragmentation** (Posts 3-4): That activity is not converging. 155 competing A2A protocols with no interoperability layer. 14 OAuth-for-agents proposals that cannot coexist. A 4:1 ratio of capability-building to safety work. Critical gaps where nobody is building at all.
2. **The Fragmentation** (Posts 3-4): That activity is not converging. 155 competing A2A protocols with no interoperability layer. 14 OAuth-for-agents proposals that cannot coexist. A ~4:1 ratio of capability-building to safety work (averaging ~4:1 but varying from 1.5:1 to 21:1 month-to-month). Critical gaps where nobody is building at all.
3. **The Path Forward** (Posts 5-6): The raw material for a solution exists -- **628 technical ideas** independently proposed by multiple organizations show where genuine consensus is forming. But convergence on components is not convergence on architecture. The missing piece is not more protocols; it is connective tissue: a shared execution model, human oversight primitives, protocol interoperability, and assurance profiles.
3. **The Path Forward** (Posts 5-6): The raw material for a solution exists -- **130 cross-org convergent ideas** (36% of unique clusters) independently proposed by multiple organizations show where genuine consensus is forming. But convergence on components is not convergence on architecture. The missing piece is not more protocols; it is connective tissue: a shared execution model, human oversight primitives, protocol interoperability, and assurance profiles.
The throughline is a question: **Can the IETF assemble the architecture before the protocols ship without it?**
@@ -37,7 +39,7 @@ TENSION
| / nobody's building) \
| Post 3 / Post 5 \
| FRAGMENTATION CONVERGENCE \
| / (escalation: (628 cross-org \
| / (escalation: (130 cross-org \
| / competing for solutions) Post 7
| / protocols) HOW WE
|/ BUILT THIS
@@ -49,7 +51,7 @@ TENSION
+-----------------------------------------------------------> TIME/POSTS
```
**The emotional arc**: Wow, this is huge (Post 1) -> Wait, who controls it? (Post 2) -> Oh no, it is fragmenting (Post 3) -> And the most important parts are missing (Post 4, the climax) -> But beneath the chaos, organizations actually agree on 628 ideas (Post 5) -> Here is what the finished picture looks like (Post 6, the resolution) -> And here is how we figured all this out (Post 7, the coda).
**The emotional arc**: Wow, this is huge (Post 1) -> Wait, who controls it? (Post 2) -> Oh no, it is fragmenting (Post 3) -> And the most important parts are missing (Post 4, the climax) -> But beneath the chaos, organizations actually agree on 130 ideas (Post 5) -> Here is what the finished picture looks like (Post 6, the resolution) -> And here is how we figured all this out (Post 7, the coda).
---
@@ -69,7 +71,7 @@ TENSION
- 10+ categories, with data formats/interop (174), A2A protocols (155), and identity/auth (152) leading
- Average quality score: ~3.27/5.0 (4-dim composite, range 1.25-4.75)
- Top-rated drafts: VOLT (4.75), DAAP (4.75), STAMP (4.5), TPM-attestation (4.5)
- 4:1 safety deficit ratio (first mention -- this becomes the recurring motif)
- ~4:1 safety deficit ratio on aggregate, varying from 1.5:1 to 21:1 by month (first mention -- this becomes the recurring motif)
**What makes it worth reading alone**: The sheer numbers. Nobody else has quantified this. The rapid growth curve is the hook.
@@ -140,14 +142,14 @@ TENSION
- **Critical Gap 3: Error Recovery and Rollback** -- only 6 ideas from 1 draft (the starkest absence in the corpus).
- **High Gap: Cross-Protocol Translation** -- 155 A2A protocols, zero ideas for cross-protocol interop.
- **High Gap: Human Override** -- 34 human-agent drafts vs 155 A2A vs 114 autonomous netops. CHEQ exists but no emergency override protocol.
- The 4:1 ratio revisited: safety deficit is not just numerical, it is structural. Safety requires cross-WG coordination that the bloc structure cannot produce.
- The ~4:1 ratio (varying 1.5:1 to 21:1) revisited: safety deficit is not just numerical, it is structural. Safety requires cross-WG coordination that the bloc structure cannot produce.
- Gap severity correlates with coordination difficulty
**For each critical gap, include a scenario**: "What goes wrong if this is never addressed?" -- make the gaps concrete and visceral.
**What makes it worth reading alone**: The fear factor. This is the "what keeps you up at night" post.
**Ends with**: "The gaps are real. But so are the solutions -- 628 ideas that multiple organizations independently agree on, scattered across the corpus with no connective tissue."
**Ends with**: "The gaps are real. But so are the solutions -- 130 ideas that multiple organizations independently agree on, scattered across the corpus with no connective tissue."
---
@@ -155,12 +157,12 @@ TENSION
**File**: `05-1262-ideas.md`
**Word count**: 2000-2500
**Key thesis**: Beneath the fragmentation, genuine consensus is forming. **628 technical ideas** have been independently proposed by 2+ organizations -- cross-org convergence signals that reveal what the industry actually agrees on, regardless of which protocol camp they belong to.
**Key thesis**: Beneath the fragmentation, genuine consensus is forming. **130 cross-org convergent ideas** (36% of unique clusters) have been independently proposed by 2+ organizations -- cross-org convergence signals that reveal what the industry actually agrees on, regardless of which protocol camp they belong to.
**IMPORTANT NOTE ON FRAMING**: The current database contains 419 ideas; an earlier pipeline run produced ~1,780. The exact count depends on extraction parameters and deduplication. The raw count is not the story. The story is which ideas survive cross-org validation -- the 628 that appear across different organizations. That is the defensible, meaningful metric. The raw extraction count should appear only in methodology context, not as a headline number.
**IMPORTANT NOTE ON FRAMING**: The current database contains 419 ideas in 361 unique clusters. Cross-org convergence analysis (SequenceMatcher at 0.75 threshold) yields 130 ideas appearing across 2+ organizations. An earlier pipeline run with ~1,780 raw ideas produced 628 cross-org convergent ideas; the convergence *rate* (~36%) is consistent across both runs. The raw count is not the story. The story is which ideas survive cross-org validation. The raw extraction count should appear only in methodology context, not as a headline number.
**Key data points to include**:
- **628 cross-org convergent ideas** (ideas in 2+ drafts from different organizations) -- the headline metric
- **130 cross-org convergent ideas** (ideas in 2+ drafts from different organizations) -- the headline metric
- Top convergence: "A2A Communication Paradigm" (8 orgs, 5 countries), "AI Agent Network Architecture" (8 orgs), "Multi-Agent Communication Protocol" (7 orgs)
- Org-pair overlap matrix: Chinese intra-bloc alignment (Huawei-China Unicom: 32 shared ideas) vs thin cross-regional signal (Ericsson-Inria: 21)
- Cross-org ideas that span Chinese-Western divide: 180 ideas (genuine cross-cultural consensus)
@@ -168,11 +170,11 @@ TENSION
- The "big 6" ambitious proposals: VOLT, ECT, CHEQ, STAMP, DAAP, ADL -- standout ideas regardless of convergence metrics
- The absent ideas: capability degradation signaling, multi-agent transaction semantics, agent migration, privacy-preserving discovery, agent cost/billing
**Structural insight**: Convergence and fragmentation coexist. Teams agree on WHAT needs building (628 ideas converge). They disagree on HOW (155 competing A2A protocols). The gap between "what" and "how" is where architecture is needed.
**Structural insight**: Convergence and fragmentation coexist. Teams agree on WHAT needs building (130 ideas converge across orgs). They disagree on HOW (155 competing A2A protocols). The gap between "what" and "how" is where architecture is needed.
**What makes it worth reading alone**: The cross-org convergence data is actionable -- builders can see which ideas have multi-org backing vs single-team proposals.
**Ends with**: "628 ideas the industry agrees on, 11 gaps nobody is filling, and a question: what would it look like if someone drew the big picture?"
**Ends with**: "130 ideas the industry agrees on, 11 gaps nobody is filling, and a question: what would it look like if someone drew the big picture?"
---
@@ -185,7 +187,7 @@ TENSION
**Key thesis**: The landscape needs not more protocols but connective tissue -- a holistic ecosystem architecture providing a shared execution model (DAGs), human oversight primitives, protocol-agnostic interoperability, and assurance profiles that work from dev to regulated production.
**Key data points to include**:
- Full synthesis: 434 drafts, 557 authors, 628 cross-org convergent ideas, 11 gaps, 18 team blocs, 42 overlap clusters
- Full synthesis: 434 drafts, 557 authors, 130 cross-org convergent ideas, 11 gaps, 18 team blocs, 42 overlap clusters
- The proposed 5-draft ecosystem: AEM (architecture), ATD (task DAG), HITL (human-in-the-loop), AEPB (protocol binding), APAE (assurance profiles)
- How this builds on existing work: SPIFFE (identity), WIMSE (security context), ECT (execution evidence)
- The dual-regime insight: same execution model must work in K8s (fast/relaxed) AND regulated environments (proofs/attestation)
@@ -222,7 +224,7 @@ TENSION
## Recurring Motifs (thread across all posts)
1. **The 4:1 Safety Deficit**: Introduced in Post 1, deepened in Post 4, resolved in Post 6. The series' signature metric.
1. **The ~4:1 Safety Deficit** (averaging ~4:1, varying from 1.5:1 to 21:1 month-to-month): Introduced in Post 1, deepened in Post 4, resolved in Post 6. The series' signature metric.
2. **The Highway/Traffic Light Metaphor**: The IETF is building highways (protocols) before traffic lights (safety, verification, override). Use sparingly but consistently.
@@ -274,7 +276,7 @@ TENSION
# PART B: READER-FACING SERIES INTRODUCTION
*What happens when the internet's standards body tries to build the rules for AI agents -- in real time, with 434 drafts, 557 authors, and a 4:1 safety deficit?*
*What happens when the internet's standards body tries to build the rules for AI agents -- in real time, with 434 drafts, 557 authors, and a ~4:1 safety deficit (varying from 1.5:1 to 21:1 by month)?*
---
@@ -288,11 +290,11 @@ This series tells the story of what we found: explosive growth, deep fragmentati
| # | Title | What You'll Learn |
|---|-------|-------------------|
| 1 | [The IETF's AI Agent Gold Rush](01-gold-rush.md) | The numbers: 434 drafts, 0.5% to 9.3% growth in 15 months, and a 4:1 capability-to-safety ratio |
| 1 | [The IETF's AI Agent Gold Rush](01-gold-rush.md) | The numbers: 434 drafts, 0.5% to 9.3% growth in 15 months, and a ~4:1 capability-to-safety ratio (varying 1.5:1 to 21:1) |
| 2 | [Who's Writing the Rules for AI Agents?](02-who-writes-the-rules.md) | The geopolitics: Huawei's 13-person bloc, Chinese institutional dominance, Western underrepresentation |
| 3 | [The OAuth Wars and Other Battles](03-oauth-wars.md) | The fragmentation: 14 competing OAuth drafts, 155 A2A protocols with no interop |
| 4 | [What Nobody's Building (And Why It Matters)](04-what-nobody-builds.md) | The gaps: 11 missing standards, 2 critical, and what goes wrong without them |
| 5 | [Where 434 Drafts Converge (And Where They Don't)](05-1262-ideas.md) | The convergence: 628 cross-org ideas reveal genuine consensus beneath the fragmentation |
| 5 | [Where 434 Drafts Converge (And Where They Don't)](05-1262-ideas.md) | The convergence: 130 cross-org ideas reveal genuine consensus beneath the fragmentation |
| 6 | [Drawing the Big Picture](06-big-picture.md) | The vision: what the agent ecosystem actually needs and what comes next |
| 7 | [How We Built This](07-how-we-built-this.md) | The methodology: analyzing 434 drafts with Claude, Ollama, and Python |
@@ -316,7 +318,7 @@ All findings come from our open-source IETF Draft Analyzer, which fetches drafts
| Drafts analyzed | 434 |
| Authors mapped | 557 |
| Organizations | 230 |
| Cross-org convergent ideas | 628 |
| Cross-org convergent ideas | 130 |
| Gaps identified | 11 (2 critical) |
| Team blocs detected | 18 |
| Analysis cost | ~$9 |