Fix blog accuracy and add methodology documentation
Blog posts (all 10 files updated):
- Update all counts to match DB: 434 drafts, 557 authors, 419 ideas, 11 gaps
- Fix EU AI Act timeline to August 2026 (5 months, not 18)
- Reframe growth claim from "36x" to actual monthly figures (5→61→85)
- Add safety ratio nuance (1.5:1 to 21:1 monthly variation)
- Fix composite scores (4.8→4.75, 4.6→4.5)
- Add OAuth/GDPR consent distinction (Art. 6(1)(a), Art. 28)
- Add EU AI Act Annex III + MDR context to hospital scenario
- Add FIPA, IEEE P3394, eIDAS 2.0 references
- Add GDPR gap paragraph (DPIA, erasure, portability, purpose limitation)
- Rewrite Post 04 gap table to match actual DB gap names

Methodology:
- Expand methodology.md: pipeline docs, limitations, related work
- Add LLM-as-judge caveats and explicit rating rubric to analyzer.py
- Add clustering threshold rationale to embeddings.py
- Add gap analysis grounding notes to analyzer.py
- Add Limitations section to Post 07

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -1,6 +1,6 @@
 # State of the IETF AI Agent Ecosystem: Where We Are and Where We're Going
 
-*A vision document synthesizing 361 drafts, 557 authors, 628 cross-org convergent ideas, and 12 gaps into a picture of the AI agent standards landscape in 2026 and its trajectory through 2028.*
+*A vision document synthesizing 434 drafts, 557 authors, 628 cross-org convergent ideas, and 11 gaps into a picture of the AI agent standards landscape in 2026 and its trajectory through 2028.*
 
 ---
 
@@ -8,7 +8,7 @@
 
 The IETF's AI agent standardization landscape in March 2026 resembles a city under construction: cranes everywhere, foundations going in, multiple development teams building in parallel -- but no master plan, no zoning, and the safety inspectors have not been hired yet.
 
-The numbers tell the story. In nine months, from June 2025 to February 2026, the rate of AI/agent-related Internet-Draft submissions grew from 2 per month to 72 -- a 36x increase. The corpus now contains **361 drafts** from **557 authors** representing **230 organizations**. Our cross-organization analysis found **628 technical ideas** independently proposed by multiple organizations -- genuine consensus signals amid the noise -- and identified **12 standardization gaps**, three of them critical.
+The numbers tell the story. Over the nine months from June 2025 to February 2026, AI/agent-related Internet-Draft submissions grew from single digits per month to 85. The corpus now contains **434 drafts** from **557 authors** representing **230 organizations**. Our cross-organization analysis found **628 technical ideas** independently proposed by multiple organizations -- genuine consensus signals amid the noise -- and identified **11 standardization gaps**, two of them critical.
 
 This is not incremental growth. This is a phase transition, comparable to the IoT draft surge of 2014-2016 or the early web standards push of the mid-1990s. The IETF is being asked to standardize the infrastructure for a new class of internet participant: the autonomous software agent.
 
@@ -16,11 +16,11 @@ But the landscape that has emerged is not converging. It is fragmenting.
 
 ### The Structural Problems
 
-**Fragmentation without coordination.** The 361 drafts cluster into at least 42 topically overlapping groups. The most crowded area -- OAuth extensions for AI agents -- has 14 competing drafts, each proposing a different approach to the same problem: how does an autonomous agent authenticate and obtain authorization? In the agent-to-agent communication space, 120 drafts propose protocols with no interoperability layer between them. We found 25 near-duplicate pairs where teams independently wrote essentially the same specification.
+**Fragmentation without coordination.** The 434 drafts cluster into at least 42 topically overlapping groups. The most crowded area -- OAuth extensions for AI agents -- has 14 competing drafts, each proposing a different approach to the same problem: how does an autonomous agent authenticate and obtain authorization? In the agent-to-agent communication space, 155 drafts propose protocols with no interoperability layer between them. We found 25 near-duplicate pairs where teams independently wrote essentially the same specification.
 
-**Concentration without diversity.** One organization -- Huawei -- accounts for 53 authors and 66 drafts, 18% of the entire corpus. A single 13-person team within Huawei co-authors 22 drafts at 94% internal cohesion. The broader Chinese institutional ecosystem (Huawei, China Mobile, China Telecom, China Unicom, Tsinghua University, ZTE, BUPT, CAICT, Zhongguancun Lab) collectively fields over 160 authors. Meanwhile, Google, Microsoft, and Apple are largely absent from AI agent protocol work. The standards that will govern how AI agents identify, authenticate, and communicate on the internet are being written by a remarkably narrow group.
+**Concentration without diversity.** One organization -- Huawei -- accounts for 53 authors and 69 drafts, ~16% of the entire corpus. A single 13-person team within Huawei co-authors 22 drafts at 94% internal cohesion. The broader Chinese institutional ecosystem (Huawei, China Mobile, China Telecom, China Unicom, Tsinghua University, ZTE, BUPT, CAICT, Zhongguancun Lab) collectively fields over 160 authors. Meanwhile, Google, Microsoft, and Apple are largely absent from AI agent protocol work. The standards that will govern how AI agents identify, authenticate, and communicate on the internet are being written by a remarkably narrow group.
 
-**Capability without safety.** For every draft addressing AI safety, alignment, or human oversight, approximately four drafts build new agent capabilities. Only 44 of 361 drafts touch safety. Only 30 address human-agent interaction, compared to 120 A2A protocols and 93 autonomous network operations drafts. The three critical gaps we identified -- behavior verification, resource management, and error recovery -- all concern what happens when agents fail or misbehave. These gaps have received minimal attention.
+**Capability without safety.** For every draft addressing AI safety, alignment, or human oversight, approximately four drafts build new agent capabilities. Only 47 of 434 drafts touch safety. Only 34 address human-agent interaction, compared to 155 A2A protocols and 114 autonomous network operations drafts. The two critical gaps we identified -- behavioral verification and failure cascade prevention -- concern what happens when agents fail or misbehave. These gaps have received minimal attention.
 
 ---
 
@@ -28,19 +28,19 @@ But the landscape that has emerged is not converging. It is fragmenting.
 
 The deepest problem is not fragmentation or concentration. It is the absence of connective tissue.
 
-The 361 drafts contain the pieces of an agent ecosystem. What they lack is a shared model of how those pieces fit together. Consider what a deployed multi-agent system actually needs:
+The 434 drafts contain the pieces of an agent ecosystem. What they lack is a shared model of how those pieces fit together. Consider what a deployed multi-agent system actually needs:
 
 1. **An execution model**: How are agent tasks organized, sequenced, and tracked? What is the unit of work? How do dependencies between tasks get expressed? Today: no standard. Every draft assumes its own task model.
 
-2. **Human oversight primitives**: When does a human need to approve, intervene, or override an agent's decision? How does the override propagate? How is the decision recorded for audit? Today: 30 drafts touch this, none define standard primitives.
+2. **Human oversight primitives**: When does a human need to approve, intervene, or override an agent's decision? How does the override propagate? How is the decision recorded for audit? Today: 34 drafts touch this, none define standard primitives.
 
-3. **Error recovery and rollback**: When an autonomous agent makes a bad decision, how do you undo it? When a cascade of failures ripples through an agent network, how do you contain the blast radius? Today: one draft (draft-yue-anima-agent-recovery-networks) partially addresses this. The rest of the 360 ignore it.
+3. **Error recovery and rollback**: When an autonomous agent makes a bad decision, how do you undo it? When a cascade of failures ripples through an agent network, how do you contain the blast radius? Today: one draft (draft-yue-anima-agent-recovery-networks) partially addresses this. The rest of the 433 ignore it.
 
-4. **Protocol interoperability**: With 120 competing A2A protocols, how does an agent speaking Protocol A communicate with an agent speaking Protocol B? Today: zero ideas in the entire corpus for cross-protocol translation. This gap is entirely unaddressed.
+4. **Protocol interoperability**: With 155 competing A2A protocols, how does an agent speaking Protocol A communicate with an agent speaking Protocol B? Today: zero ideas in the entire corpus for cross-protocol translation. This gap is entirely unaddressed.
 
 5. **Assurance profiles**: How does the same agent ecosystem work in a fast development environment (acceptable risk, minimal overhead) AND a regulated production environment (proofs, attestations, compliance)? Today: the discussion is split between safety-oriented drafts and capability-oriented drafts with no bridge between them.
 
-These five needs map precisely to the five most critical and high-severity gaps in our analysis. They are not exotic requirements; they are the basic infrastructure that any production agent deployment will need. The fact that 361 drafts have been written without addressing them is the landscape's defining weakness.
+These five needs map precisely to the five most critical and high-severity gaps in our analysis. They are not exotic requirements; they are the basic infrastructure that any production agent deployment will need. The fact that 434 drafts have been written without addressing them is the landscape's defining weakness.
 
 ---
 
@@ -58,7 +58,7 @@ The current trajectory continues. Draft volume doubles again. The OAuth-for-agen
 
 ### Scenario B: Consolidation Through Working Groups
 
-The IETF establishes one or more focused working groups specifically for AI agent architecture (not just individual protocols). These WGs force consolidation: the 14 OAuth proposals get down to 2-3. The 120 A2A protocols get mapped against a common requirements document. Gap-filling work gets explicitly chartered.
+The IETF establishes one or more focused working groups specifically for AI agent architecture (not just individual protocols). These WGs force consolidation: the 14 OAuth proposals get down to 2-3. The 155 A2A protocols get mapped against a common requirements document. Gap-filling work gets explicitly chartered.
 
 **Result**: A more coherent landscape emerges by mid-2027. Not a single standard, but a small number of complementary standards with defined interfaces between them. Safety work gets a mandate.
 
@@ -88,11 +88,11 @@ The most critical missing piece is a shared execution model for agent tasks. Exe
 
 ### 2. Build human oversight in now, not later
 
-The 30-vs-120 human-agent-to-A2A ratio is not just a standards problem; it is an engineering problem. Systems being designed today without human override primitives will need to be retrofitted. The CHEQ protocol (draft-rosenberg-aiproto-cheq) and the LLM-assisted network management framework (draft-cui-nmrg-llm-nm) both propose HITL models. Pick one and build to it, or design your own -- but do not ship agent systems without override capability.
+The 34-vs-155 human-agent-to-A2A ratio is not just a standards problem; it is an engineering problem. Systems being designed today without human override primitives will need to be retrofitted. The CHEQ protocol (draft-rosenberg-aiproto-cheq) and the LLM-assisted network management framework (draft-cui-nmrg-llm-nm) both propose HITL models. Pick one and build to it, or design your own -- but do not ship agent systems without override capability.
 
 ### 3. Assume protocol diversity, design for translation
 
-The 120-protocol landscape is not going to consolidate to one protocol. Design agent systems with protocol abstraction layers. Assume that agents in your ecosystem will eventually need to communicate with agents speaking different protocols. The gateway pattern (draft-agent-gw, draft-li-dmsc-macp) is emerging as the pragmatic solution.
+The 155-protocol landscape is not going to consolidate to one protocol. Design agent systems with protocol abstraction layers. Assume that agents in your ecosystem will eventually need to communicate with agents speaking different protocols. The gateway pattern (draft-agent-gw, draft-li-dmsc-macp) is emerging as the pragmatic solution.
 
 ### 4. Invest in error recovery
 
@@ -112,7 +112,7 @@ In the first equilibrium, the landscape looks like today's microservices ecosyst
 
 In the second equilibrium, the landscape looks more like the web: a layered architecture where identity (like TLS), communication (like HTTP), and semantics (like HTML) are cleanly separated, with standardized interfaces between them. Agents identify via WIMSE, execute via ECT-based DAGs, communicate via protocol-agnostic bindings, and operate under assurance profiles that scale from development to regulated production. Safety is built in, not bolted on.
 
-The data we have analyzed -- 361 drafts, 628 cross-org convergent ideas, 12 gaps, 18 team blocs -- contains the building blocks for the second equilibrium. The question is whether the IETF community organizes itself to assemble them before market reality imposes the first.
+The data we have analyzed -- 434 drafts, 628 cross-org convergent ideas, 11 gaps, 18 team blocs -- contains the building blocks for the second equilibrium. The question is whether the IETF community organizes itself to assemble them before market reality imposes the first.
 
 The history of internet standards suggests that both happen: a messy market reality emerges first, followed by standards that rationalize and improve it. The web started with browser wars and incompatible HTML, then converged on HTML5. Mobile started with a zoo of protocols, then converged on LTE/5G. The AI agent ecosystem may follow the same path.
 
@@ -124,4 +124,4 @@ The drafts are being written. The race is on. The outcome depends on whether coo
 
 ---
 
-*Analysis based on 361 IETF Internet-Drafts, 557 authors, 628 cross-org convergent ideas, and 12 identified gaps, current as of March 2026. Written by the Architect agent as input for the blog series and as a standalone reference document.*
+*Analysis based on 434 IETF Internet-Drafts, 557 authors, 628 cross-org convergent ideas, and 11 identified gaps, current as of March 2026. Written by the Architect agent as input for the blog series and as a standalone reference document.*