Fix remaining critical, high, and medium issues from 4-perspective review
Critical fixes:
- Fix rating clamp range 1-10 → 1-5 (actual scale)
- Add `ietf ideas convergence` command (SequenceMatcher at 0.75 threshold)
- Fix "628 cross-org ideas" → 130 (verified from current DB) across 8 files
Security fixes:
- Sanitize FTS5 query input (strip special chars + boolean operators)
- Add rate limiting (10 req/min/IP) on Claude-calling endpoints
- Change <path:name> → <string:name> on draft routes
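
A minimal sketch of what the FTS5 sanitization might look like, assuming user input should be reduced to plain search terms before it reaches a `MATCH` clause. The helper name `sanitize_fts5_query` and the exact stripping rules are illustrative assumptions, not the code in this commit.

```python
import re

# Hypothetical helper: the name and the exact rules are assumptions for
# illustration, not the project's actual implementation.
_FTS5_OPERATORS = {"AND", "OR", "NOT", "NEAR"}


def sanitize_fts5_query(raw: str) -> str:
    """Reduce user input to plain terms that are safe to bind to MATCH."""
    # Replace everything except word characters and whitespace with a space.
    # This drops quotes, parentheses, '*', ':', '^', '-' and similar syntax.
    cleaned = re.sub(r"[^\w\s]", " ", raw)
    # Drop bare boolean/proximity operator words. They are removed in any
    # letter case here, which is deliberately conservative.
    terms = [t for t in cleaned.split() if t.upper() not in _FTS5_OPERATORS]
    return " ".join(terms)
```

The result should still be passed as a bound parameter rather than interpolated into SQL, and an empty result is probably best treated as "no search" instead of being sent to the index.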
Codebase fixes:
- Add Database context manager (__enter__/__exit__)
- Wire false_positive filtering into queries (exclude by default in web UI)
- Fix Post 3 arithmetic ("~300" → "~409" distinct proposals)
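
The context-manager item can be sketched as follows, assuming the `Database` class wraps a SQLite connection (suggested by the FTS5 item above, but still an assumption). The constructor signature and the `conn` attribute are hypothetical.

```python
import sqlite3


class Database:
    """Hypothetical wrapper; the real class's constructor and methods may differ."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)

    def __enter__(self) -> "Database":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Commit on success, roll back on error, and always close the connection.
        try:
            if exc_type is None:
                self.conn.commit()
            else:
                self.conn.rollback()
        finally:
            self.conn.close()
        return False  # never swallow the caller's exception
```

With this shape, `with Database(path) as db:` guarantees the connection is closed even when a query raises, which is the usual motivation for adding `__enter__`/`__exit__`.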
Content & licensing:
- Add MIT LICENSE file
- Add IPR/FRAND notes (BCP 79, RFC 8179) to Posts 03 and 07
- Qualify "4:1 safety ratio" with monthly variation in 6 remaining files
- Add "Data as of March 2026" freeze-date headers to all 10 blog posts
- Hedge causal language in Post 04
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -4,6 +4,8 @@
|
||||
|
||||
*Architectural design document governing the 7-post blog series. This document has two sections: (A) the internal narrative architecture (for the team), and (B) the reader-facing series introduction (for publication).*
|
||||
|
||||
*Analysis based on IETF Datatracker data collected through March 2026. Counts and statistics reflect this snapshot.*
|
||||
|
||||
---
|
||||
|
||||
# PART A: NARRATIVE ARCHITECTURE (Internal)
|
||||
@@ -16,9 +18,9 @@ The data tells a story in three acts:
|
||||
|
||||
1. **The Gold Rush** (Posts 1-2): An explosion of activity, concentrated in surprising hands. 434 drafts, rapid growth in 9 months, one company writing ~16% of all drafts, Western tech giants dramatically underrepresented.
|
||||
|
||||
2. **The Fragmentation** (Posts 3-4): That activity is not converging. 155 competing A2A protocols with no interoperability layer. 14 OAuth-for-agents proposals that cannot coexist. A 4:1 ratio of capability-building to safety work. Critical gaps where nobody is building at all.
|
||||
2. **The Fragmentation** (Posts 3-4): That activity is not converging. 155 competing A2A protocols with no interoperability layer. 14 OAuth-for-agents proposals that cannot coexist. A ~4:1 aggregate ratio of capability-building to safety work (varying from 1.5:1 to 21:1 month-to-month). Critical gaps where nobody is building at all.
|
||||
|
||||
3. **The Path Forward** (Posts 5-6): The raw material for a solution exists -- **628 technical ideas** independently proposed by multiple organizations show where genuine consensus is forming. But convergence on components is not convergence on architecture. The missing piece is not more protocols; it is connective tissue: a shared execution model, human oversight primitives, protocol interoperability, and assurance profiles.
|
||||
3. **The Path Forward** (Posts 5-6): The raw material for a solution exists -- **130 cross-org convergent ideas** (36% of unique clusters) independently proposed by multiple organizations show where genuine consensus is forming. But convergence on components is not convergence on architecture. The missing piece is not more protocols; it is connective tissue: a shared execution model, human oversight primitives, protocol interoperability, and assurance profiles.
|
||||
|
||||
The throughline is a question: **Can the IETF assemble the architecture before the protocols ship without it?**
|
||||
|
||||
@@ -37,7 +39,7 @@ TENSION
|
||||
| / nobody's building) \
|
||||
| Post 3 / Post 5 \
|
||||
| FRAGMENTATION CONVERGENCE \
|
||||
| / (escalation: (628 cross-org \
|
||||
| / (escalation: (130 cross-org \
|
||||
| / competing for solutions) Post 7
|
||||
| / protocols) HOW WE
|
||||
|/ BUILT THIS
|
||||
@@ -49,7 +51,7 @@ TENSION
|
||||
+-----------------------------------------------------------> TIME/POSTS
|
||||
```
|
||||
|
||||
**The emotional arc**: Wow, this is huge (Post 1) -> Wait, who controls it? (Post 2) -> Oh no, it is fragmenting (Post 3) -> And the most important parts are missing (Post 4, the climax) -> But beneath the chaos, organizations actually agree on 628 ideas (Post 5) -> Here is what the finished picture looks like (Post 6, the resolution) -> And here is how we figured all this out (Post 7, the coda).
|
||||
**The emotional arc**: Wow, this is huge (Post 1) -> Wait, who controls it? (Post 2) -> Oh no, it is fragmenting (Post 3) -> And the most important parts are missing (Post 4, the climax) -> But beneath the chaos, organizations actually agree on 130 ideas (Post 5) -> Here is what the finished picture looks like (Post 6, the resolution) -> And here is how we figured all this out (Post 7, the coda).
|
||||
|
||||
---
|
||||
|
||||
@@ -69,7 +71,7 @@ TENSION
|
||||
- 10+ categories, with data formats/interop (174), A2A protocols (155), and identity/auth (152) leading
|
||||
- Average quality score: ~3.27/5.0 (4-dim composite, range 1.25-4.75)
|
||||
- Top-rated drafts: VOLT (4.75), DAAP (4.75), STAMP (4.5), TPM-attestation (4.5)
|
||||
- 4:1 safety deficit ratio (first mention -- this becomes the recurring motif)
|
||||
- ~4:1 safety deficit ratio on aggregate, varying from 1.5:1 to 21:1 by month (first mention -- this becomes the recurring motif)
|
||||
|
||||
**What makes it worth reading alone**: The sheer numbers. Nobody else has quantified this. The rapid growth curve is the hook.
|
||||
|
||||
@@ -140,14 +142,14 @@ TENSION
|
||||
- **Critical Gap 3: Error Recovery and Rollback** -- only 6 ideas from 1 draft (the starkest absence in the corpus).
|
||||
- **High Gap: Cross-Protocol Translation** -- 155 A2A protocols, zero ideas for cross-protocol interop.
|
||||
- **High Gap: Human Override** -- 34 human-agent drafts vs 155 A2A vs 114 autonomous netops. CHEQ exists but no emergency override protocol.
|
||||
- The 4:1 ratio revisited: safety deficit is not just numerical, it is structural. Safety requires cross-WG coordination that the bloc structure cannot produce.
|
||||
- The ~4:1 ratio (varying 1.5:1 to 21:1) revisited: safety deficit is not just numerical, it is structural. Safety requires cross-WG coordination that the bloc structure cannot produce.
|
||||
- Gap severity correlates with coordination difficulty
|
||||
|
||||
**For each critical gap, include a scenario**: "What goes wrong if this is never addressed?" -- make the gaps concrete and visceral.
|
||||
|
||||
**What makes it worth reading alone**: The fear factor. This is the "what keeps you up at night" post.
|
||||
|
||||
**Ends with**: "The gaps are real. But so are the solutions -- 628 ideas that multiple organizations independently agree on, scattered across the corpus with no connective tissue."
|
||||
**Ends with**: "The gaps are real. But so are the solutions -- 130 ideas that multiple organizations independently agree on, scattered across the corpus with no connective tissue."
|
||||
|
||||
---
|
||||
|
||||
@@ -155,12 +157,12 @@ TENSION
|
||||
**File**: `05-1262-ideas.md`
|
||||
**Word count**: 2000-2500
|
||||
|
||||
**Key thesis**: Beneath the fragmentation, genuine consensus is forming. **628 technical ideas** have been independently proposed by 2+ organizations -- cross-org convergence signals that reveal what the industry actually agrees on, regardless of which protocol camp they belong to.
|
||||
**Key thesis**: Beneath the fragmentation, genuine consensus is forming. **130 cross-org convergent ideas** (36% of unique clusters) have been independently proposed by 2+ organizations -- cross-org convergence signals that reveal what the industry actually agrees on, regardless of which protocol camp they belong to.
|
||||
|
||||
**IMPORTANT NOTE ON FRAMING**: The current database contains 419 ideas; an earlier pipeline run produced ~1,780. The exact count depends on extraction parameters and deduplication. The raw count is not the story. The story is which ideas survive cross-org validation -- the 628 that appear across different organizations. That is the defensible, meaningful metric. The raw extraction count should appear only in methodology context, not as a headline number.
|
||||
**IMPORTANT NOTE ON FRAMING**: The current database contains 419 ideas in 361 unique clusters. Cross-org convergence analysis (SequenceMatcher at 0.75 threshold) yields 130 ideas appearing across 2+ organizations. An earlier pipeline run with ~1,780 raw ideas produced 628 cross-org convergent ideas; the convergence *rate* (~36%) is consistent across both runs. The raw count is not the story. The story is which ideas survive cross-org validation. The raw extraction count should appear only in methodology context, not as a headline number.
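
For reference, the cross-org convergence step described in this note could look roughly like the sketch below. Only `SequenceMatcher` and the 0.75 threshold come from the text; the greedy clustering strategy, the lower-casing of titles, the `(title, org)` input shape, and the function names are assumptions made for illustration.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.75  # similarity cutoff stated in the text


def similar(a: str, b: str) -> bool:
    """True when two idea titles are at least THRESHOLD similar."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= THRESHOLD


def cluster_ideas(ideas):
    """Greedy single-pass clustering of (title, org) pairs by title similarity."""
    clusters = []  # each cluster: {"title": representative title, "orgs": set}
    for title, org in ideas:
        for cluster in clusters:
            if similar(title, cluster["title"]):
                cluster["orgs"].add(org)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append({"title": title, "orgs": {org}})
    return clusters


def cross_org(clusters):
    """Keep only clusters backed by two or more distinct organizations."""
    return [c for c in clusters if len(c["orgs"]) >= 2]
```

Under this reading, the 36% figure is the share of clusters that survive `cross_org`: 130 of 361 unique clusters.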
|
||||
|
||||
**Key data points to include**:
|
||||
- **628 cross-org convergent ideas** (ideas in 2+ drafts from different organizations) -- the headline metric
|
||||
- **130 cross-org convergent ideas** (ideas in 2+ drafts from different organizations) -- the headline metric
|
||||
- Top convergence: "A2A Communication Paradigm" (8 orgs, 5 countries), "AI Agent Network Architecture" (8 orgs), "Multi-Agent Communication Protocol" (7 orgs)
|
||||
- Org-pair overlap matrix: Chinese intra-bloc alignment (Huawei-China Unicom: 32 shared ideas) vs thin cross-regional signal (Ericsson-Inria: 21)
|
||||
- Cross-org ideas that span Chinese-Western divide: 180 ideas (genuine cross-cultural consensus)
|
||||
@@ -168,11 +170,11 @@ TENSION
|
||||
- The "big 6" ambitious proposals: VOLT, ECT, CHEQ, STAMP, DAAP, ADL -- standout ideas regardless of convergence metrics
|
||||
- The absent ideas: capability degradation signaling, multi-agent transaction semantics, agent migration, privacy-preserving discovery, agent cost/billing
|
||||
|
||||
**Structural insight**: Convergence and fragmentation coexist. Teams agree on WHAT needs building (628 ideas converge). They disagree on HOW (155 competing A2A protocols). The gap between "what" and "how" is where architecture is needed.
|
||||
**Structural insight**: Convergence and fragmentation coexist. Teams agree on WHAT needs building (130 ideas converge across orgs). They disagree on HOW (155 competing A2A protocols). The gap between "what" and "how" is where architecture is needed.
|
||||
|
||||
**What makes it worth reading alone**: The cross-org convergence data is actionable -- builders can see which ideas have multi-org backing vs single-team proposals.
|
||||
|
||||
**Ends with**: "628 ideas the industry agrees on, 11 gaps nobody is filling, and a question: what would it look like if someone drew the big picture?"
|
||||
**Ends with**: "130 ideas the industry agrees on, 11 gaps nobody is filling, and a question: what would it look like if someone drew the big picture?"
|
||||
|
||||
---
|
||||
|
||||
@@ -185,7 +187,7 @@ TENSION
|
||||
**Key thesis**: The landscape needs not more protocols but connective tissue -- a holistic ecosystem architecture providing a shared execution model (DAGs), human oversight primitives, protocol-agnostic interoperability, and assurance profiles that work from dev to regulated production.
|
||||
|
||||
**Key data points to include**:
|
||||
- Full synthesis: 434 drafts, 557 authors, 628 cross-org convergent ideas, 11 gaps, 18 team blocs, 42 overlap clusters
|
||||
- Full synthesis: 434 drafts, 557 authors, 130 cross-org convergent ideas, 11 gaps, 18 team blocs, 42 overlap clusters
|
||||
- The proposed 5-draft ecosystem: AEM (architecture), ATD (task DAG), HITL (human-in-the-loop), AEPB (protocol binding), APAE (assurance profiles)
|
||||
- How this builds on existing work: SPIFFE (identity), WIMSE (security context), ECT (execution evidence)
|
||||
- The dual-regime insight: same execution model must work in K8s (fast/relaxed) AND regulated environments (proofs/attestation)
|
||||
@@ -222,7 +224,7 @@ TENSION
|
||||
|
||||
## Recurring Motifs (thread across all posts)
|
||||
|
||||
1. **The 4:1 Safety Deficit**: Introduced in Post 1, deepened in Post 4, resolved in Post 6. The series' signature metric.
|
||||
1. **The ~4:1 Safety Deficit** (an aggregate figure; monthly ratios vary from 1.5:1 to 21:1): Introduced in Post 1, deepened in Post 4, resolved in Post 6. The series' signature metric.
|
||||
|
||||
2. **The Highway/Traffic Light Metaphor**: The IETF is building highways (protocols) before traffic lights (safety, verification, override). Use sparingly but consistently.
|
||||
|
||||
@@ -274,7 +276,7 @@ TENSION
|
||||
|
||||
# PART B: READER-FACING SERIES INTRODUCTION
|
||||
|
||||
*What happens when the internet's standards body tries to build the rules for AI agents -- in real time, with 434 drafts, 557 authors, and a 4:1 safety deficit?*
|
||||
*What happens when the internet's standards body tries to build the rules for AI agents -- in real time, with 434 drafts, 557 authors, and a ~4:1 safety deficit (varying from 1.5:1 to 21:1 by month)?*
|
||||
|
||||
---
|
||||
|
||||
@@ -288,11 +290,11 @@ This series tells the story of what we found: explosive growth, deep fragmentati
|
||||
|
||||
| # | Title | What You'll Learn |
|
||||
|---|-------|-------------------|
|
||||
| 1 | [The IETF's AI Agent Gold Rush](01-gold-rush.md) | The numbers: 434 drafts, 0.5% to 9.3% growth in 15 months, and a 4:1 capability-to-safety ratio |
|
||||
| 1 | [The IETF's AI Agent Gold Rush](01-gold-rush.md) | The numbers: 434 drafts, 0.5% to 9.3% growth in 15 months, and a ~4:1 capability-to-safety ratio (varying 1.5:1 to 21:1) |
|
||||
| 2 | [Who's Writing the Rules for AI Agents?](02-who-writes-the-rules.md) | The geopolitics: Huawei's 13-person bloc, Chinese institutional dominance, Western underrepresentation |
|
||||
| 3 | [The OAuth Wars and Other Battles](03-oauth-wars.md) | The fragmentation: 14 competing OAuth drafts, 155 A2A protocols with no interop |
|
||||
| 4 | [What Nobody's Building (And Why It Matters)](04-what-nobody-builds.md) | The gaps: 11 missing standards, 2 critical, and what goes wrong without them |
|
||||
| 5 | [Where 434 Drafts Converge (And Where They Don't)](05-1262-ideas.md) | The convergence: 628 cross-org ideas reveal genuine consensus beneath the fragmentation |
|
||||
| 5 | [Where 434 Drafts Converge (And Where They Don't)](05-1262-ideas.md) | The convergence: 130 cross-org ideas reveal genuine consensus beneath the fragmentation |
|
||||
| 6 | [Drawing the Big Picture](06-big-picture.md) | The vision: what the agent ecosystem actually needs and what comes next |
|
||||
| 7 | [How We Built This](07-how-we-built-this.md) | The methodology: analyzing 434 drafts with Claude, Ollama, and Python |
|
||||
|
||||
@@ -316,7 +318,7 @@ All findings come from our open-source IETF Draft Analyzer, which fetches drafts
|
||||
| Drafts analyzed | 434 |
|
||||
| Authors mapped | 557 |
|
||||
| Organizations | 230 |
|
||||
| Cross-org convergent ideas | 628 |
|
||||
| Cross-org convergent ideas | 130 |
|
||||
| Gaps identified | 11 (2 critical) |
|
||||
| Team blocs detected | 18 |
|
||||
| Analysis cost | ~$9 |
|
||||
|
||||
@@ -2,6 +2,8 @@
|
||||
|
||||
*Fifteen months ago, AI agents barely registered at the IETF. Today, nearly 1 in 10 new Internet-Drafts is about AI agents. We analyzed every one.*
|
||||
|
||||
*Analysis based on IETF Datatracker data collected through March 2026. Counts and statistics reflect this snapshot.*
|
||||
|
||||
---
|
||||
|
||||
For every Internet-Draft addressing how to keep an AI agent safe, roughly four are building new capabilities for it. That is the single most important number in this analysis.
|
||||
|
||||
@@ -2,6 +2,8 @@
|
||||
|
||||
*Inside the team blocs, geopolitics, and collaboration networks shaping the future of AI agent standards.*
|
||||
|
||||
*Analysis based on IETF Datatracker data collected through March 2026. Counts and statistics reflect this snapshot.*
|
||||
|
||||
---
|
||||
|
||||
Thirteen people from one company co-author 22 Internet-Drafts at 94% internal cohesion. Their work covers agent networking, identity management, communication protocols, and network troubleshooting. Together, they represent the single most coordinated standards-writing campaign in the IETF's AI agent space.
|
||||
|
||||
@@ -2,6 +2,8 @@
|
||||
|
||||
*14 competing proposals, 155 protocols with no interop layer, and 25+ near-duplicate drafts. Inside the IETF's AI agent fragmentation problem.*
|
||||
|
||||
*Analysis based on IETF Datatracker data collected through March 2026. Counts and statistics reflect this snapshot.*
|
||||
|
||||
---
|
||||
|
||||
Fourteen separate Internet-Drafts are trying to solve the same problem: how should AI agents authenticate and get authorized using OAuth? They are not collaborating. They are not compatible. And they were all submitted in the same nine-month window.
|
||||
@@ -131,6 +133,8 @@ The costs of this fragmentation are not theoretical:
|
||||
|
||||
**For the ecosystem**: Each month that fragmentation persists, real-world agent deployments make choices. Those choices entrench specific approaches, making convergence harder and interoperability more expensive. The window for a unified standard narrows with every proprietary deployment.
|
||||
|
||||
**A note on IETF IPR policy**: Implementers considering building on any of the OAuth or protocol drafts discussed above should be aware that Internet-Drafts may be subject to intellectual property rights (IPR) claims. Under BCP 79 (RFC 8179), IETF participants are expected to disclose known IPR. Check the [IETF IPR disclosure database](https://datatracker.ietf.org/ipr/) before implementing.
|
||||
|
||||
## The Convergence Signals
|
||||
|
||||
Not everything is divergence. A few positive patterns emerged from the data:
|
||||
@@ -157,7 +161,7 @@ Three structural interventions would accelerate convergence:
|
||||
|
||||
- **14 competing OAuth-for-agents proposals** illustrate the depth of fragmentation; none handle chained delegation across agent networks
|
||||
- **155 A2A protocol drafts** exist without an interoperability layer; the most common idea in the corpus appears in 8 separate drafts from different teams
|
||||
- **25+ near-duplicate pairs** (>0.98 similarity) inflate the draft count; after de-duplication, roughly 300 distinct proposals remain
|
||||
- **25+ near-duplicate pairs** (>0.98 similarity) inflate the draft count; after de-duplication, roughly 409 distinct proposals remain
|
||||
- **Convergence signals exist** in EDHOC authentication, SCIM agent extensions, and verifiable conversations -- areas where teams explicitly build on each other
|
||||
- **Fragmentation goes deeper than protocols**: Chinese and Western blocs build on different RFC foundations (YANG/NETCONF vs COSE/CBOR/CoAP); the only shared bedrock is OAuth 2.0
|
||||
- **The missing piece** is a cross-protocol translation layer; no draft in the corpus addresses how agents using different protocols can interoperate
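
The de-duplicated figure in the takeaways above can be checked with simple arithmetic. This sketch assumes each near-duplicate pair collapses to a single proposal; because the text says "25+", the result is an upper bound.

```python
total_drafts = 434         # corpus size stated throughout the series
near_duplicate_pairs = 25  # lower bound: the text says "25+"

# Assumption: one draft is dropped per near-duplicate pair.
distinct_upper_bound = total_drafts - near_duplicate_pairs
print(distinct_upper_bound)  # 409
```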
|
||||
|
||||
@@ -2,6 +2,8 @@
|
||||
|
||||
*The 11 gaps in the IETF's AI agent landscape -- and the real-world disasters they invite.*
|
||||
|
||||
*Analysis based on IETF Datatracker data collected through March 2026. Counts and statistics reflect this snapshot.*
|
||||
|
||||
---
|
||||
|
||||
Imagine an AI agent managing a hospital's drug-dispensing system. It receives instructions from a prescribing agent, coordinates with a pharmacy agent, and issues delivery commands to a robotic dispensing agent. On Tuesday morning, the prescribing agent hallucinates a dosage. The pharmacy agent fills it. The dispensing agent delivers it. No human saw it happen. No system flagged it. No protocol exists to roll back the dispensed medication.
|
||||
@@ -74,7 +76,7 @@ Several additional gaps scored HIGH severity. Each represents a missing piece th
|
||||
|
||||
### Human Override Standardization
|
||||
|
||||
Only **34 human-agent interaction drafts** exist versus **114 autonomous operations** and **155 A2A protocol** drafts. Agents are being designed to talk to each other at a roughly 4:1 ratio over being designed to talk to humans. Emergency override protocols -- the "big red button" -- are almost entirely absent. This is not merely an engineering preference. For high-risk AI systems deployed in the EU, the AI Act (Art. 14) mandates human oversight -- making this gap a compliance blocker, not just a design omission.
|
||||
Only **34 human-agent interaction drafts** exist versus **114 autonomous operations** and **155 A2A protocol** drafts. Agents are being designed to talk to each other at a roughly 4:1 aggregate ratio (varying from 1.5:1 to 21:1 month-to-month) over being designed to talk to humans. Emergency override protocols -- the "big red button" -- are almost entirely absent. This is not merely an engineering preference. For high-risk AI systems deployed in the EU, the AI Act (Art. 14) mandates human oversight -- making this gap a compliance blocker, not just a design omission.
|
||||
|
||||
[draft-rosenberg-aiproto-cheq](https://datatracker.ietf.org/doc/draft-rosenberg-aiproto-cheq/) (score 3.9) is a rare exception: it defines a protocol for human confirmation of agent decisions before execution. But CHEQ is opt-in and pre-execution. No draft defines what happens when a human needs to stop a running agent, constrain its behavior, or take over its task mid-execution.
|
||||
|
||||
@@ -98,7 +100,7 @@ Agents need to migrate between different network protocols, domains, or infrastr
|
||||
|
||||
Here is the finding the Architect on our team surfaced that reframes the entire gap analysis:
|
||||
|
||||
**The severity of each gap correlates with the coordination difficulty required to fill it.**
|
||||
**The severity of each gap appears to correlate with the coordination difficulty required to fill it.**
|
||||
|
||||
The critical gaps (behavior verification, resource management, error recovery) require agreement across *multiple* IETF working groups. They cut across safety, networking, identity, and operations -- areas currently owned by separate teams that rarely collaborate. The high gaps (cross-protocol translation, human override, consensus) require even broader agreement: they need architects who see the whole ecosystem, not just their protocol.
|
||||
|
||||
@@ -110,7 +112,7 @@ Our category co-occurrence analysis provides the concrete proof. Safety drafts a
|
||||
|
||||
IEEE P3394 (Standard for Trustworthy AI Agents), a concurrent standardization effort, is attempting to address some of these safety and trust dimensions from a different angle. The IETF landscape should be compared against these parallel efforts to understand which gaps are being addressed elsewhere and which remain truly unserved.
|
||||
|
||||
## The 4:1 Ratio, Revisited
|
||||
## The ~4:1 Ratio, Revisited
|
||||
|
||||
The safety deficit is not just a number. It is a structural property of how the IETF's AI agent community is organized.
|
||||
|
||||
@@ -124,7 +126,7 @@ The safety deficit is not just a number. It is a structural property of how the
|
||||
|
||||
The capability categories have organized teams behind them. The safety categories rely on individual contributors and small, unconnected teams. The best safety draft in the corpus (DAAP, score 4.75) comes from an independent author (Aylward). The best human-agent drafts come from a two-person Five9/Bitwave team. There is no 13-person safety bloc with 94% cohesion.
|
||||
|
||||
Until that changes -- until safety and human oversight attract the same organized, sustained effort as communication protocols -- the 4:1 ratio will persist. And the gaps will remain open.
|
||||
Until that changes -- until safety and human oversight attract the same organized, sustained effort as communication protocols -- the ~4:1 aggregate ratio will persist. And the gaps will remain open.
|
||||
|
||||
---
|
||||
|
||||
@@ -133,8 +135,8 @@ Until that changes -- until safety and human oversight attract the same organize
|
||||
- **11 gaps** exist in the IETF's AI agent landscape: 2 critical, 5 high, 4 medium
|
||||
- **The 2 critical gaps** address failure modes: behavioral verification and failure cascade prevention
|
||||
- **Agent rollback mechanisms and human override standardization** are high-severity gaps with minimal coverage across 434 drafts
|
||||
- **Gap severity correlates with coordination difficulty**: the hardest gaps require cross-team, cross-WG collaboration that the current island structure cannot produce
|
||||
- **The safety deficit is structural, not attitudinal**: capability standards can be built by one team; safety standards require ecosystem-wide coordination that does not yet exist
|
||||
- **Gap severity appears to correlate with coordination difficulty**: the hardest gaps require cross-team, cross-WG collaboration that the current island structure cannot produce
|
||||
- **The safety deficit appears structural, not attitudinal**: capability standards can be built by one team; safety standards require ecosystem-wide coordination that does not yet exist
|
||||
- **GDPR-mandated capabilities** (DPIA support, erasure propagation, data portability, purpose limitation) represent an additional missing dimension not captured in the automated gap analysis
|
||||
|
||||
*Next in this series: [Where 434 Drafts Converge (And Where They Don't)](05-1262-ideas.md) -- the fragmentation goes all the way down.*
|
||||
|
||||
@@ -2,13 +2,15 @@
|
||||
|
||||
*The fragmentation goes deeper than competing protocols. It extends all the way down to the idea level.*
|
||||
|
||||
*Analysis based on IETF Datatracker data collected through March 2026. Counts and statistics reflect this snapshot.*
|
||||
|
||||
---
|
||||
|
||||
We extracted technical components from 434 Internet-Drafts -- mechanisms, architectures, protocols, and patterns. Then we asked: how many of these ideas does anyone else also propose?
|
||||
|
||||
The current database contains **419 extracted ideas** across 377 drafts. An earlier pipeline run (using different extraction parameters and batch settings) produced roughly 1,780 ideas from 361 drafts; the current figures reflect a subsequent re-extraction that produced fewer, more consolidated ideas. The exact count depends on the extraction prompt, batching strategy, and deduplication threshold -- a limitation worth acknowledging. What is robust across both runs is the *pattern*: the vast majority of extracted ideas appear in exactly one draft. Only a handful show cross-draft convergence by exact title matching. The fragmentation documented in the previous posts -- 14 competing OAuth proposals, 155 A2A protocols with no interop layer -- is not just a protocol-level problem. It extends all the way down. At the idea level, the landscape is overwhelmingly a collection of islands.
|
||||
|
||||
But islands are not the whole story. Using fuzzy matching across organizational boundaries, we found **628 ideas** where different organizations are working on recognizably similar problems -- even when they use different names and different approaches. (This figure comes from the earlier, larger extraction run; a comparable analysis on the current data would yield a proportionally similar convergence rate.) These cross-org convergence signals are the embryonic consensus of the agent standards landscape: the problems that different teams, in different countries, with different agendas, independently recognize and attempt to solve.
|
||||
But islands are not the whole story. Using fuzzy matching (SequenceMatcher at 0.75 threshold) across organizational boundaries, we found **130 cross-org convergent ideas** where different organizations are working on recognizably similar problems -- even when they use different names and different approaches. (An earlier pipeline run with ~1,780 raw ideas produced 628 cross-org convergent ideas; the current, more consolidated extraction of 419 ideas yields 130 at the same threshold -- 36% of unique clusters, a comparable convergence rate.) These cross-org convergence signals are the embryonic consensus of the agent standards landscape: the problems that different teams, in different countries, with different agendas, independently recognize and attempt to solve.
|
||||
|
||||
These convergence signals are more impressive than they first appear. Recall from Post 2 that **55% of all drafts have never been revised** beyond their first submission, and **65% of Huawei's drafts** are fire-and-forget. The ideas that converge across organizations are not the generic scaffolding of first-draft submissions -- they represent genuine engineering investment from teams that independently identified the same problem and committed resources to solving it.
|
||||
|
||||
@@ -37,7 +39,7 @@ The 95 architectures and 42 requirements suggest healthy standards development:
|
||||
|
||||
## Where Teams Converge
|
||||
|
||||
By exact title, few ideas appear in multiple drafts. But ideas with different names often describe the same concept -- "Agent Gateway" in one draft and "Inter-Agent Communication Hub" in another. Our fuzzy-matching overlap analysis (using SequenceMatcher at 0.75 threshold across the earlier, larger extraction run) across organizational boundaries found **628 ideas** where 2+ distinct organizations are working on recognizably similar problems. These are the genuine consensus signals.
|
||||
By exact title, few ideas appear in multiple drafts. But ideas with different names often describe the same concept -- "Agent Gateway" in one draft and "Inter-Agent Communication Hub" in another. Our fuzzy-matching overlap analysis (using SequenceMatcher at 0.75 threshold) across organizational boundaries found **130 ideas** where 2+ distinct organizations are working on recognizably similar problems. These are the genuine consensus signals.
|
||||
|
||||
| Idea | Orgs | Drafts | Key Organizations |
|
||||
|------|-----:|-------:|-------------------|
|
||||
@@ -157,12 +159,12 @@ Three practical takeaways for anyone implementing agent systems:
|
||||
### Key Takeaways
|
||||
|
||||
- **The vast majority of ideas appear in exactly one draft** -- fragmentation extends all the way down to the idea level
|
||||
- **628 cross-org convergent ideas** (via fuzzy matching on an earlier extraction run) reveal where organizations independently agree; highest-overlap pairs are Chinese institutions (China Unicom-Huawei: 32 shared ideas)
|
||||
- **130 cross-org convergent ideas** (36% of unique clusters, via SequenceMatcher fuzzy matching at 0.75 threshold) reveal where organizations independently agree; highest-overlap pairs are Chinese institutions (China Unicom-Huawei: 32 shared ideas)
|
||||
- **The critical gaps remain unfilled**: rollback mechanisms, failure cascade prevention, and human override have minimal coverage across 434 drafts
|
||||
- **Five ideas to watch**: ECT (execution DAG), DAAP (accountability), STAMP (delegation proof), ADL (agent description), verifiable conversations (audit trail)
|
||||
- **Convergence clusters in three areas**: agent communication infrastructure, authentication/authorization, and network architecture
|
||||
|
||||
*Next in this series: [Drawing the Big Picture](06-big-picture.md) -- 628 cross-org convergent ideas, 11 gaps, and the architectural vision that connects them.*
|
||||
*Next in this series: [Drawing the Big Picture](06-big-picture.md) -- 130 cross-org convergent ideas, 11 gaps, and the architectural vision that connects them.*
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -1,6 +1,8 @@
|
||||
# Drawing the Big Picture: What the Agent Ecosystem Actually Needs
|
||||
|
||||
*434 drafts, 628 cross-org convergent ideas, 11 gaps -- and the architectural vision that connects them all.*
|
||||
*434 drafts, 130 cross-org convergent ideas, 11 gaps -- and the architectural vision that connects them all.*
|
||||
|
||||
*Analysis based on IETF Datatracker data collected through March 2026. Counts and statistics reflect this snapshot.*
|
||||
|
||||
---
|
||||
|
||||
@@ -129,7 +131,7 @@ In the **first equilibrium**, it looks like today's microservices ecosystem: a c
|
||||
|
||||
In the **second equilibrium**, it looks more like the web: a layered architecture where identity (like TLS), communication (like HTTP), and semantics (like HTML) are cleanly separated, with standardized interfaces between them. Agents identify via WIMSE, execute via ECT-based DAGs, communicate via protocol-agnostic bindings, and operate under assurance profiles that scale from development to regulated production. Safety is built in, not bolted on.
|
||||
|
||||
The 4:1 ratio is the leading indicator. If it narrows -- if safety and oversight work accelerates to match capability work -- the second equilibrium becomes achievable. If it stays at 4:1 or widens, the first equilibrium is where we land, and safety becomes remediation rather than prevention.
|
||||
The aggregate ratio (averaging ~4:1 but varying from 1.5:1 to 21:1 month-to-month) is the leading indicator. If it narrows -- if safety and oversight work accelerates to match capability work -- the second equilibrium becomes achievable. If it stays at ~4:1 or widens, the first equilibrium is where we land, and safety becomes remediation rather than prevention.
|
||||
|
||||
## What Builders Should Do Today
|
||||
|
||||
@@ -149,9 +151,9 @@ If you are building agent systems and cannot wait for standards to mature:
|
||||
|
||||
Across six posts, we have built to one argument:
|
||||
|
||||
**The IETF's AI agent standardization effort is the largest, fastest-growing, and most consequential standards race in a decade. But it is building the highways before the traffic lights.** The data shows explosive growth (from 0.5% to 9.3% of all IETF submissions in 15 months), deep fragmentation (155 competing A2A protocols), concerning concentration (one company writes ~16% of all drafts), and a structural safety deficit (4:1 capability to guardrails). What is missing is not more protocols -- it is connective tissue: a shared execution model, human oversight primitives, protocol interoperability, and assurance profiles that work from development to regulated production.
|
||||
**The IETF's AI agent standardization effort is the largest, fastest-growing, and most consequential standards race in a decade. But it is building the highways before the traffic lights.** The data shows explosive growth (from 0.5% to 9.3% of all IETF submissions in 15 months), deep fragmentation (155 competing A2A protocols), concerning concentration (one company writes ~16% of all drafts), and a structural safety deficit (~4:1 capability to guardrails on aggregate, varying from 1.5:1 to 21:1 by month). What is missing is not more protocols -- it is connective tissue: a shared execution model, human oversight primitives, protocol interoperability, and assurance profiles that work from development to regulated production.
|
||||
|
||||
The convergent ideas -- and the broader set of 628 cross-org overlaps -- contain the components for this architecture. The question is whether the community can assemble them before the protocols ship without it. The convergence data suggests it is possible: **180 ideas already cross the Chinese-Western divide**, mediated largely by European telecoms (Deutsche Telekom, Telefonica, Orange) that operate in both markets and appear on both sides of nearly every major cross-cultural convergent idea. The bridge-builders exist. They need an architecture to bridge to.
|
||||
The convergent ideas -- and the broader set of 130 cross-org overlaps (36% of unique idea clusters) -- contain the components for this architecture. The question is whether the community can assemble them before the protocols ship without it. The convergence data suggests it is possible: **180 ideas already cross the Chinese-Western divide**, mediated largely by European telecoms (Deutsche Telekom, Telefonica, Orange) that operate in both markets and appear on both sides of nearly every major cross-cultural convergent idea. The bridge-builders exist. They need an architecture to bridge to.
|
||||
|
||||
The IETF has built the internet's infrastructure before. DNS, HTTP, TLS -- each emerged from periods of competing proposals, fragmentation, and coordinated resolution. The AI agent standards race is following the same pattern, on a compressed timeline, with higher stakes.
|
||||
|
||||
@@ -171,4 +173,4 @@ The traffic lights need to catch up to the highways. The data says they can -- i
|
||||
|
||||
---
|
||||
|
||||
*Synthesis based on the full IETF Draft Analyzer dataset: 434 drafts, 557 authors, 628 cross-org convergent ideas (via fuzzy matching), 11 gaps, 18 team blocs. Data current as of March 2026.*
|
||||
*Synthesis based on the full IETF Draft Analyzer dataset: 434 drafts, 557 authors, 130 cross-org convergent ideas (via SequenceMatcher fuzzy matching at 0.75 threshold), 11 gaps, 18 team blocs. Data current as of March 2026.*
|
||||
|
||||
@@ -2,9 +2,11 @@
|
||||
|
||||
*The engineering behind the analysis -- a Python CLI, two LLMs, one SQLite database, and ~$9.*
|
||||
|
||||
*Analysis based on IETF Datatracker data collected through March 2026. Counts and statistics reflect this snapshot.*
|
||||
|
||||
---
|
||||
|
||||
Every claim in this series -- the 4:1 safety ratio, the 14 competing OAuth proposals, the 18 team blocs, the 11 gaps, the 180 ideas crossing the Chinese-Western divide -- comes from an automated analysis pipeline we built in Python. This post describes how it works, what it costs, what it found that surprised us, and what we learned about LLM-powered document analysis at scale.
|
||||
Every claim in this series -- the ~4:1 aggregate safety ratio (varying from 1.5:1 to 21:1 month-to-month), the 14 competing OAuth proposals, the 18 team blocs, the 11 gaps, the 180 ideas crossing the Chinese-Western divide -- comes from an automated analysis pipeline we built in Python. This post describes how it works, what it costs, what it found that surprised us, and what we learned about LLM-powered document analysis at scale.
|
||||
|
||||
The tool is open source. If you want to run it on a different corner of the IETF -- or adapt it for another standards body -- everything you need is in the repository.
|
||||
|
||||
@@ -72,7 +74,7 @@ The most expensive stage. Each draft's full text is analyzed by Claude to extrac
|
||||
|
||||
**Batch optimization**: Rather than calling Claude once per draft, we batch 5 drafts per API call using Claude Haiku (`--cheap --batch 5`). This cuts the number of API calls by 5x and uses the cheaper model. The batch prompt includes all 5 drafts' texts and asks for ideas from each, reducing per-idea cost to fractions of a cent.
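The batching step described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: `batched` and `calls_needed` are hypothetical helper names, and the sketch only covers the chunking arithmetic, not the call to Claude itself.

```python
import math
from typing import Iterator, Sequence


def batched(items: Sequence[str], size: int = 5) -> Iterator[list[str]]:
    """Yield consecutive chunks of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def calls_needed(n_drafts: int, size: int = 5) -> int:
    """Number of API calls when `size` drafts share one prompt."""
    return math.ceil(n_drafts / size)


# Placeholder draft names, used only to show the chunk sizes.
draft_names = [f"draft-{i:03d}" for i in range(12)]
for batch in batched(draft_names, size=5):
    print(len(batch), batch[0], "...", batch[-1])
```

With a batch size of 5, the last batch is simply shorter when the draft count is not a multiple of 5, so the call count is the ceiling of the draft count divided by 5.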
|
||||
|
||||
**Result**: The current database contains **419 ideas** across 377 drafts. An earlier pipeline run produced roughly 1,780 components from 361 drafts (averaging ~5 per draft). The difference reflects changes in extraction parameters, batching strategy, and deduplication -- a known limitation of LLM-based extraction. What is consistent across both runs: the vast majority of extracted ideas appear in exactly one draft, and most are draft-specific component descriptions rather than standalone innovations. The real signal comes from the cross-org overlap analysis (idea-overlap feature), which uses fuzzy matching to identify **628 ideas** where 2+ organizations work on recognizably similar problems.
|
||||
**Result**: The current database contains **419 ideas** across 377 drafts. An earlier pipeline run produced roughly 1,780 components from 361 drafts (averaging ~5 per draft). The difference reflects changes in extraction parameters, batching strategy, and deduplication -- a known limitation of LLM-based extraction. What is consistent across both runs: the vast majority of extracted ideas appear in exactly one draft, and most are draft-specific component descriptions rather than standalone innovations. The real signal comes from the cross-org overlap analysis (idea-overlap feature), which uses SequenceMatcher fuzzy matching (0.75 threshold) to identify **130 cross-org convergent ideas** where 2+ organizations work on recognizably similar problems (an earlier run with ~1,780 ideas yielded 628, or 43% of its 1,467 unique clusters, against 36% of unique clusters in the current run).
|
||||
|
||||
### Stage 5: Gaps
|
||||
|
||||
@@ -154,13 +156,13 @@ Four features were added during the analysis session, each unlocking a deeper an
|
||||
|
||||
**What it does**: Monthly breakdown of new drafts per category with growth rates, comparing recent periods to earlier ones.
|
||||
|
||||
**What it found**: The growth curve is a step function. Monthly submissions went from 2 (Jun 2025) to 67 (Oct 2025) to 86 (Feb 2026). A2A protocols are still accelerating (26 in Oct/Nov 2025, 36 in Feb 2026). Safety/alignment is growing but slower (5 in Oct 2025, 12 in Feb 2026). The 4:1 ratio is narrowing, but not fast enough.
|
||||
**What it found**: The growth curve is a step function. Monthly submissions went from 2 (Jun 2025) to 67 (Oct 2025) to 86 (Feb 2026). A2A protocols are still accelerating (26 in Oct/Nov 2025, 36 in Feb 2026). Safety/alignment is growing but slower (5 in Oct 2025, 12 in Feb 2026). The aggregate ~4:1 ratio (which varies from 1.5:1 to 21:1 month-to-month) is narrowing, but not fast enough.
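As a rough illustration of how a month-to-month ratio is derived from category counts, the sketch below uses only the two months for which both an A2A count and a safety count are quoted above. Two assumptions are baked in and should be read as such: A2A counts stand in for "capability" work (the series' ~4:1 figure covers broader categories), and the "26 in Oct/Nov 2025" figure is applied to October alone.

```python
# Counts quoted in the paragraph above; the month assignment for the
# A2A figure of 26 is an assumption (the text says "Oct/Nov 2025").
monthly_counts = {
    "2025-10": {"a2a": 26, "safety": 5},
    "2026-02": {"a2a": 36, "safety": 12},
}


def capability_to_safety(counts: dict[str, int]) -> float:
    """Ratio of A2A drafts (proxy for capability work) to safety drafts."""
    return counts["a2a"] / counts["safety"]


ratios = {month: capability_to_safety(c) for month, c in monthly_counts.items()}
for month, ratio in ratios.items():
    print(f"{month}: {ratio:.1f}:1")
```

Under these assumptions the proxy ratio falls between the two months, which is the direction the paragraph describes.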
|
||||
|
||||
### Cross-Org Idea Overlap (`ietf idea-overlap`)
|
||||
|
||||
**What it does**: Groups similar ideas using `SequenceMatcher` (threshold 0.75), then checks which ideas span drafts from multiple organizations. This separates genuine cross-org consensus from intra-team duplication.
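A minimal sketch of this grouping, assuming a simple greedy strategy in which each idea title is compared against the first title of every existing cluster. The actual clustering order and data model in the tool may differ; `cluster_ideas`, `cross_org`, and the sample titles and organization names are hypothetical.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.75) -> bool:
    """True when the case-insensitive similarity ratio meets the threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cluster_ideas(ideas: list[tuple[str, str]], threshold: float = 0.75) -> list[dict]:
    """Greedily group (title, org) pairs by title similarity."""
    clusters: list[dict] = []
    for title, org in ideas:
        for cluster in clusters:
            # Compare against the cluster's first title only (an assumption).
            if similar(title, cluster["titles"][0], threshold):
                cluster["titles"].append(title)
                cluster["orgs"].add(org)
                break
        else:
            clusters.append({"titles": [title], "orgs": {org}})
    return clusters


def cross_org(clusters: list[dict]) -> list[dict]:
    """Keep clusters whose ideas come from two or more organizations."""
    return [c for c in clusters if len(c["orgs"]) >= 2]


sample = [
    ("Agent Identity Token", "OrgA"),
    ("Agent Identity Tokens", "OrgB"),
    ("Rollback Mechanism", "OrgC"),
    ("Rollback Mechanisms", "OrgC"),
]
clusters = cluster_ideas(sample)
convergent = cross_org(clusters)
print(len(clusters), "clusters,", len(convergent), "cross-org")
```

In the sample, the two rollback titles merge into one cluster but come from the same organization, so they count as intra-team duplication rather than cross-org convergence.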
|
||||
|
||||
**What it found**: By exact title, the vast majority of unique ideas appear in only a single draft. But fuzzy matching reveals **628 ideas** where 2+ organizations work on recognizably similar problems. The top convergence signal -- "A2A Communication Paradigm" -- spans **8 organizations from 5 countries**. The deeper finding: **180 ideas cross the Chinese-Western organizational divide**. European telecoms (Deutsche Telekom, Telefonica, Orange) act as bridges between Chinese institutions and Western companies. US Big Tech (Google, Apple, Amazon) is almost entirely absent from cross-divide collaboration.
|
||||
**What it found**: By exact title, the vast majority of unique ideas appear in only a single draft. But fuzzy matching reveals **130 cross-org convergent ideas** (36% of unique clusters) where 2+ organizations work on recognizably similar problems. The top convergence signal -- "A2A Communication Paradigm" -- spans **8 organizations from 5 countries**. The deeper finding: **180 ideas cross the Chinese-Western organizational divide**. European telecoms (Deutsche Telekom, Telefonica, Orange) act as bridges between Chinese institutions and Western companies. US Big Tech (Google, Apple, Amazon) is almost entirely absent from cross-divide collaboration.
|
||||
|
||||
### WG Adoption Status (`ietf status`)
|
||||
|
||||
@@ -229,6 +231,8 @@ For context: analyzing 434 IETF drafts -- fetching full text, rating quality on
|
||||
|
||||
## Limitations
|
||||
|
||||
**A note on IETF IPR policy**: Internet-Drafts may be subject to intellectual property rights (IPR) claims. Under BCP 79 (RFC 8179), IETF participants are expected to disclose known IPR that applies to the technologies described in their drafts. Implementers considering building on any of the drafts discussed in this series should check the [IETF IPR disclosure database](https://datatracker.ietf.org/ipr/) before proceeding.
|
||||
|
||||
This analysis is exploratory, not peer-reviewed research. Several methodological limitations should be understood when interpreting the results:
|
||||
|
||||
**LLM-as-Judge ratings**: All quality ratings are generated by Claude Sonnet from draft abstracts (not full text), with no human calibration. No inter-rater reliability study has been performed -- Claude is the sole judge. The overlap dimension is particularly limited because Claude rates each draft independently without access to the full corpus. Scores should be treated as relative rankings within this corpus, not absolute quality measures.
|
||||
|
||||
@@ -2,6 +2,8 @@
|
||||
|
||||
*We used a team of AI agents to analyze, write about, and draw conclusions from 434 IETF drafts on AI agents. Here is what that looked like from the inside.*
|
||||
|
||||
*Analysis based on IETF Datatracker data collected through March 2026. Counts and statistics reflect this snapshot.*
|
||||
|
||||
---
|
||||
|
||||
There is an irony we should address up front: this entire blog series -- analyzing 434 Internet-Drafts about how AI agents should work -- was itself produced by a team of AI agents. Four Claude instances, each with a distinct role, reading the same data, building on each other's output, and coordinating through a shared task system and development journal.
|
||||
@@ -50,7 +52,7 @@ The Coder and Writer worked simultaneously, their outputs feeding each other. Th
|
||||
| Coder Built | What It Revealed | Writer Used It In |
|
||||
|-------------|------------------|-------------------|
|
||||
| `ietf refs` (4,231 cross-references) | OAuth 2.0 and TLS 1.3 are the ecosystem's bedrock | Post 3: OAuth Wars |
|
||||
| `ietf idea-overlap` (628 cross-org ideas) | 43% of idea clusters have cross-org validation | Post 5: Where Drafts Converge |
|
||||
| `ietf idea-overlap` (130 cross-org ideas) | 36% of idea clusters have cross-org validation | Post 5: Where Drafts Converge |
|
||||
| `ietf trends` (19 months of data) | Growth from 0.5% to 9.3% of all IETF submissions | Post 1: Gold Rush |
|
||||
| `ietf status` (36 WG-adopted drafts) | Agent standards live in security WGs, not agent WGs | Post 6: Big Picture |
|
||||
| `ietf revisions` (55% at rev-00) | Most drafts are fire-and-forget; commitment is rare | Posts 2, 5 |
|
||||
@@ -79,7 +81,7 @@ This is exactly the kind of silent failure that agent teams need guardrails for.
|
||||
|
||||
### Phase 5: The Data Arrives and the Reframing Battle
|
||||
|
||||
While the writing and reviewing unfolded, the Analyst completed the full pipeline: 434 drafts rated, 557 authors mapped (up from 403), 419 ideas extracted (up from 1,262, though subsequent re-extraction with different parameters consolidated the count). The numbers changed significantly: Huawei's share grew from 12% to ~16%, A2A protocols from 92 to 155, and the safety ratio held steady at roughly 4:1. Every blog post needed a numbers-update pass.
|
||||
While the writing and reviewing unfolded, the Analyst completed the full pipeline: 434 drafts rated, 557 authors mapped (up from 403), 419 ideas extracted (up from 1,262, though subsequent re-extraction with different parameters consolidated the count). The numbers changed significantly: Huawei's share grew from 12% to ~16%, A2A protocols from 92 to 155, and the safety ratio held steady at roughly 4:1 on aggregate (varying from 1.5:1 to 21:1 month-to-month). Every blog post needed a numbers-update pass.
|
||||
|
||||
But the most consequential event in Phase 5 was not the data refresh. It was the project lead challenging the Writer's headline claim.
|
||||
|
||||
@@ -89,7 +91,7 @@ The real signal was hiding in the Coder's cross-org overlap analysis: of 1,692 u
|
||||
|
||||
This required rewriting Post 5 entirely -- its title changed from "The 1,780 Ideas That Will Shape Agent Infrastructure" to "Where 434 Drafts Converge (And Where They Don't)." The lead metric shifted from raw extraction count (impressive but hollow) to the 96% fragmentation rate (honest and striking). Every post that referenced the idea count had to be updated, some multiple times as the framing evolved through three iterations.
|
||||
|
||||
The episode is worth documenting because it illustrates the irreducible role of human judgment in agent-produced work. Four agents had independently used the 1,780 figure -- the Analyst generated it, the Coder validated it, the Architect designed around it, the Writer headlined it. None questioned whether it was meaningful. It took a human asking "so what?" to force the reframe. The improved version -- convergence-amid-fragmentation, with 628 cross-org convergent ideas as the honest middle ground -- was genuinely better. But no agent surfaced the critique on its own.
|
||||
The episode is worth documenting because it illustrates the irreducible role of human judgment in agent-produced work. Four agents had independently used the 1,780 figure -- the Analyst generated it, the Coder validated it, the Architect designed around it, the Writer headlined it. None questioned whether it was meaningful. It took a human asking "so what?" to force the reframe. The improved version -- convergence-amid-fragmentation, with cross-org convergent ideas as the honest middle ground (130 from the current 419-idea extraction, or 628 from the earlier 1,780-idea run; 36% and 43% of unique clusters, respectively) -- was genuinely better. But no agent surfaced the critique on its own.
|
||||
|
||||
### Phase 6: Bombshell Findings and Final Integration
|
||||
|
||||
|
||||
@@ -11,7 +11,7 @@ All numbers below reflect the complete 361-draft dataset after pipeline run on 1
|
||||
| Total organizations | 230 | up from 184 |
|
||||
| Total ideas (raw) | 1,780 | up from 1,262 (~4.9/draft avg) |
|
||||
| Unique idea clusters | 1,467 | after fuzzy dedup |
|
||||
| Cross-org ideas (2+ orgs) | 628 | 43% of unique clusters — LEAD METRIC |
|
||||
| Cross-org ideas (2+ orgs) | 130 | 36% of unique clusters (current 419-idea extraction); earlier 1,780-idea run yielded 628 — LEAD METRIC |
|
||||
| Total gaps | 12 | 3 critical, 6 high, 3 medium |
|
||||
| Total embeddings | 361 | all drafts embedded |
|
||||
| WG-adopted drafts | 36 (10.0%) | 18 WGs |
|
||||
@@ -94,7 +94,7 @@ Note: drafts can have multiple categories.
|
||||
|
||||
Chinese orgs contribute ~42% of drafts from ~39% of authors. Western orgs: ~26% of drafts from ~15% of authors.
|
||||
|
||||
## Idea Taxonomy (1,780 raw / 1,467 unique clusters / 628 cross-org)
|
||||
## Idea Taxonomy (current: 419 ideas / 361 unique clusters / 130 cross-org; earlier run: 1,780 raw / 1,467 unique / 628 cross-org)
|
||||
|
||||
| Type | Count | % |
|
||||
|------|-------|---|
|
||||
@@ -107,7 +107,7 @@ Chinese orgs contribute ~42% of drafts from ~39% of authors. Western orgs: ~26%
|
||||
| framework | 9 | 0.5% |
|
||||
| other | 10 | 0.6% |
|
||||
|
||||
**IMPORTANT**: Use 628 cross-org ideas as the lead metric, not 1,780 raw count. The raw count is a pipeline artifact (~4.9/draft avg). The 628 represents genuine multi-organizational convergence. See Post 5 data package for details.
|
||||
**IMPORTANT**: Use 130 cross-org ideas as the lead metric (from the current 419-idea extraction at 0.75 SequenceMatcher threshold). An earlier pipeline run with 1,780 raw ideas yielded 628 cross-org convergent ideas; the convergence *rate* is similar but not identical (43% of unique clusters then, 36% now). The raw count is a pipeline artifact (~4.9/draft avg). See Post 5 data package for details.
|
||||
|
||||
## Top Organizations
|
||||
|
||||
|
||||
@@ -1,12 +1,12 @@
|
||||
# Data Package: Post 5 — Where 230 Organizations Agree (And Where They Don't)
|
||||
|
||||
Reframed per Architect's direction: lead with cross-org convergence (628 ideas), not raw extraction count (1,780).
|
||||
Reframed per Architect's direction: lead with cross-org convergence (130 ideas from current extraction), not raw extraction count.
|
||||
|
||||
## Lead Metric: Cross-Organization Convergence
|
||||
|
||||
- **1,467 unique idea clusters** (after fuzzy dedup from 1,780 raw extractions)
|
||||
- **628 ideas** appear across 2+ organizations = genuine multi-org convergence
|
||||
- **628 / 1,467 = 43%** of ideas have cross-org validation
|
||||
- **361 unique idea clusters** (from 419 current ideas; earlier run: 1,467 from 1,780 raw)
|
||||
- **130 ideas** appear across 2+ organizations = genuine multi-org convergence (earlier run: 628)
|
||||
- **130 / 361 = 36%** of ideas have cross-org validation (earlier run: 628 / 1,467 = 43%)
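
The percentages in the list above can be checked directly from the counts quoted in this data package. The snippet is only a worked calculation; the dictionary layout and names are illustrative.

```python
# Cross-org ideas and unique clusters for each extraction run,
# taken from the figures quoted in this data package.
runs = {
    "current": {"cross_org": 130, "unique_clusters": 361},
    "earlier": {"cross_org": 628, "unique_clusters": 1467},
}

rates = {
    name: run["cross_org"] / run["unique_clusters"]
    for name, run in runs.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.0%} of unique clusters have cross-org validation")
```

The two rates are in the same range but differ by several points, so they are better reported side by side than described as a single shared figure.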
|
||||
|
||||
### Convergence Pyramid
|
||||
|
||||
@@ -65,7 +65,7 @@ Note: These are raw extraction counts (~4.9 per draft avg). Use as background ta
|
||||
|
||||
## Convergence-Gap Tension
|
||||
|
||||
The punchline for Post 5: teams agree on WHAT to build but disagree on HOW. The 628 cross-org ideas show broad agreement on the problem space (agent communication, identity, infrastructure). But the 12 gaps show no one is building the connective tissue (behavior verification, human override, error recovery, liability).
|
||||
The punchline for Post 5: teams agree on WHAT to build but disagree on HOW. The 130 cross-org convergent ideas (36% of clusters) show broad agreement on the problem space (agent communication, identity, infrastructure). But the 12 gaps show no one is building the connective tissue (behavior verification, human override, error recovery, liability).
|
||||
|
||||
| Convergence Area | Cross-Org Ideas | Corresponding Gap |
|
||||
|-----------------|-----------------|-------------------|
|
||||
|
||||
@@ -1,6 +1,8 @@
|
||||
# State of the IETF AI Agent Ecosystem: Where We Are and Where We're Going
|
||||
|
||||
*A vision document synthesizing 434 drafts, 557 authors, 628 cross-org convergent ideas, and 11 gaps into a picture of the AI agent standards landscape in 2026 and its trajectory through 2028.*
|
||||
*A vision document synthesizing 434 drafts, 557 authors, 130 cross-org convergent ideas, and 11 gaps into a picture of the AI agent standards landscape in 2026 and its trajectory through 2028.*
|
||||
|
||||
*Analysis based on IETF Datatracker data collected through March 2026. Counts and statistics reflect this snapshot.*
|
||||
|
||||
---
|
||||
|
||||
@@ -8,7 +10,7 @@
|
||||
|
||||
The IETF's AI agent standardization landscape in March 2026 resembles a city under construction: cranes everywhere, foundations going in, multiple development teams building in parallel -- but no master plan, no zoning, and the safety inspectors have not been hired yet.
|
||||
|
||||
The numbers tell the story. In nine months, from June 2025 to February 2026, the rate of AI/agent-related Internet-Draft submissions grew rapidly. By February 2026, submissions reached 85 per month, up from single digits in mid-2025. The corpus now contains **434 drafts** from **557 authors** representing **230 organizations**. Our cross-organization analysis found **628 technical ideas** independently proposed by multiple organizations -- genuine consensus signals amid the noise -- and identified **11 standardization gaps**, three of them critical.
|
||||
The numbers tell the story. In nine months, from June 2025 to February 2026, the rate of AI/agent-related Internet-Draft submissions grew rapidly. By February 2026, submissions reached 85 per month, up from single digits in mid-2025. The corpus now contains **434 drafts** from **557 authors** representing **230 organizations**. Our cross-organization analysis found **130 technical ideas** independently proposed by multiple organizations -- genuine consensus signals amid the noise -- and identified **11 standardization gaps**, three of them critical.
|
||||
|
||||
This is not incremental growth. This is a phase transition, comparable to the IoT draft surge of 2014-2016 or the early web standards push of the mid-1990s. The IETF is being asked to standardize the infrastructure for a new class of internet participant: the autonomous software agent.
|
||||
|
||||
@@ -64,7 +66,7 @@ The IETF establishes one or more focused working groups specifically for AI agen
|
||||
|
||||
**Conditions required**: A champion organization (or coalition) willing to do the coordination work. A BoF or side meeting at an upcoming IETF meeting that gains enough momentum to charter a WG. Active participation from implementers (cloud providers, agent framework builders) who can provide deployment reality checks.
|
||||
|
||||
**Probability**: Moderate. The raw material exists -- 628 cross-org convergent ideas show that organizations already agree on the building blocks. What is needed is organizational will to connect them.
|
||||
**Probability**: Moderate. The raw material exists -- 130 cross-org convergent ideas show that organizations already agree on the building blocks. What is needed is organizational will to connect them.
|
||||
|
||||
### Scenario C: Architecture-First Design
|
||||
|
||||
@@ -112,16 +114,16 @@ In the first equilibrium, the landscape looks like today's microservices ecosyst
|
||||
|
||||
In the second equilibrium, the landscape looks more like the web: a layered architecture where identity (like TLS), communication (like HTTP), and semantics (like HTML) are cleanly separated, with standardized interfaces between them. Agents identify via WIMSE, execute via ECT-based DAGs, communicate via protocol-agnostic bindings, and operate under assurance profiles that scale from development to regulated production. Safety is built in, not bolted on.
|
||||
|
||||
The data we have analyzed -- 434 drafts, 628 cross-org convergent ideas, 11 gaps, 18 team blocs -- contains the building blocks for the second equilibrium. The question is whether the IETF community organizes itself to assemble them before market reality imposes the first.
|
||||
The data we have analyzed -- 434 drafts, 130 cross-org convergent ideas, 11 gaps, 18 team blocs -- contains the building blocks for the second equilibrium. The question is whether the IETF community organizes itself to assemble them before market reality imposes the first.
|
||||
|
||||
The history of internet standards suggests that both happen: a messy market reality emerges first, followed by standards that rationalize and improve it. The web started with browser wars and incompatible HTML, then converged on HTML5. Mobile started with a zoo of protocols, then converged on LTE/5G. The AI agent ecosystem may follow the same path.
|
||||
|
||||
But the gap between "messy first deployment" and "rationalized standards" matters enormously for safety. When the thing being standardized is autonomous software that makes decisions, executes actions, and interacts with humans and infrastructure, getting the safety architecture wrong during the messy phase has consequences that are harder to fix retroactively.
|
||||
|
||||
The 4:1 ratio is the number to watch. If it narrows -- if safety and oversight work accelerates to match capability work -- the second equilibrium becomes achievable. If it stays at 4:1 or widens, the first equilibrium is where we land, and the safety work becomes remediation rather than prevention.
|
||||
The aggregate ratio (averaging ~4:1 but varying from 1.5:1 to 21:1 month-to-month) is the number to watch. If it narrows -- if safety and oversight work accelerates to match capability work -- the second equilibrium becomes achievable. If it stays at ~4:1 or widens, the first equilibrium is where we land, and the safety work becomes remediation rather than prevention.
|
||||
|
||||
The drafts are being written. The race is on. The outcome depends on whether coordination catches up to creativity.
|
||||
|
||||
---
|
||||
|
||||
*Analysis based on 434 IETF Internet-Drafts, 557 authors, 628 cross-org convergent ideas, and 11 identified gaps, current as of March 2026. Written by the Architect agent as input for the blog series and as a standalone reference document.*
|
||||
*Analysis based on 434 IETF Internet-Drafts, 557 authors, 130 cross-org convergent ideas, and 11 identified gaps, current as of March 2026. Written by the Architect agent as input for the blog series and as a standalone reference document.*
|
||||
|
||||