Run pipeline, write Post 08, commit untracked files
Pipeline:
- Extract ideas for 38 new drafts → 462 ideas total
- Convergence analysis: 132 cross-org convergent ideas (33% rate)
- Fetch authors for 102 drafts → 709 authors (up from 403)
- Refresh gap analysis: 12 gaps across full 474-draft corpus
- Update verified counts with new totals

Post 08:
- Complete rewrite of "Agents Building the Agent Analysis" (2,953 words)
- Covers 3 phases: writing team → review cycle → fix cycle
- Meta-irony table mapping team coordination to IETF gap names
- Specific examples from dev journal (SQL injection, consent conflation, ideas mismatch)

Untracked files committed:
- scripts/: backfill-wg-names, classify-unrated, compare-classifiers, download-relevant-text, run-webui
- src/ietf_analyzer/classifier.py: two-stage Ollama classifier
- src/webui/: analytics (GDPR-compliant), auth, obsidian_export
- tests/test_obsidian_export.py (10 tests)
- data/reports/: wg-analysis, generated draft for gap #37

Housekeeping:
- .gitignore: exclude LaTeX artifacts, stale DBs, analytics.db

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
.gitignore (10 changes)
@@ -4,5 +4,15 @@ __pycache__/
 dist/
 build/
 data/config.json
+data/analytics.db
+data/ietf_drafts.db
 .claude/
 .env
+
+# LaTeX build artifacts
+paper/*.aux
+paper/*.log
+paper/*.out
+paper/*.synctex.gz
+paper/*.fls
+paper/*.fdb_latexmk
BIN data/drafts.db (binary file not shown)
@@ -1,197 +1,167 @@
# Agents Building the Agent Analysis

*We used a team of AI agents to analyze, write about, and review 434 IETF Internet-Drafts on AI agents. Here is what that looked like from the inside.*

*Analysis based on IETF Datatracker data collected through March 2026. Counts and statistics reflect this snapshot.*

---

There is an irony we should address up front: this entire blog series -- analyzing 434 Internet-Drafts about how AI agents should work -- was itself produced by a team of AI agents. Twelve Claude instances across three phases, each with a distinct role, reading the same database, building on each other's output, and coordinating through a shared journal and file system.

This post is the story of that process: what worked, what broke, what surprised us, and what it reveals about the state of AI agent coordination in practice -- which, as it happens, is exactly the problem the IETF drafts are trying to solve.

## Phase 1: The Writing Team

We started with four agents, each defined in a one-page file and grounded by a shared 3,000-word team brief:

| Agent | Role | What They Did |
|-------|------|---------------|
| **Architect** | The Big Picture | Read all reports, designed the narrative arc, wrote the vision document, reviewed every post |
| **Analyst** | The Data Whisperer | Ran the pipeline on 434 drafts, executed 20+ SQL queries, produced data packages |
| **Coder** | The Feature Builder | Implemented 7 new analysis features (refs, trends, idea-overlap, WG adoption, revisions, centrality, co-occurrence) |
| **Writer** | The Storyteller | Drafted all 8 blog posts, applied 6+ revision passes |

Each agent had access to the full project codebase, a SQLite database, and the `ietf` CLI tool. They communicated through files and coordinated through a shared development journal. The team brief contained a thesis statement -- "The IETF is building the highways before the traffic lights" -- a per-post outline, and a data requirements table.

### Parallel by default

The key design decision: agents did not wait for each other when they could work in parallel. The Writer's tasks were formally blocked by the Analyst's pipeline run, but the Writer had enough existing data (260 analyzed drafts) to start drafting. Rather than sitting idle, the Writer produced first drafts of all 7 posts while waiting for updated numbers. This turned out to be the right call -- the structure and narrative mattered more than whether the draft count was 260 or 434.

The Coder and Writer worked simultaneously, their outputs feeding each other. Every feature the Coder built used zero API calls -- pure local computation via regex, SQL, SequenceMatcher, and networkx. The RFC cross-reference parser revealed that the Chinese and Western blocs build on incompatible infrastructure foundations (YANG/NETCONF vs. COSE/CBOR), with OAuth 2.0 as the only shared bedrock. The co-occurrence analysis showed safety has zero overlap with Agent Discovery and Model Serving. These zero-cost local analyses produced the most structurally revealing findings in the entire series.

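The core of a cross-reference parser like this is easy to sketch. A minimal version of the idea, assuming plain-text drafts are already in memory -- the regex, function name, and sample texts are illustrative, not the project's actual code:

```python
import re
from collections import Counter

# RFC citations in draft text typically look like "RFC 6749" or "[RFC6749]".
RFC_PATTERN = re.compile(r"\[?RFC\s?(\d{3,5})\]?", re.IGNORECASE)

def count_rfc_refs(texts):
    """Count how many drafts cite each RFC number."""
    refs = Counter()
    for text in texts:
        # set() so one draft citing RFC 6749 five times still counts once
        refs.update(set(RFC_PATTERN.findall(text)))
    return refs

drafts = [
    "This profile extends OAuth 2.0 [RFC6749] and TLS 1.3 [RFC8446].",
    "Agents authenticate per RFC 6749; payloads use CBOR (RFC 8949).",
]
print(count_rfc_refs(drafts).most_common(1))  # → [('6749', 2)]
```

Everything downstream -- the bloc comparison, the "OAuth 2.0 as shared bedrock" finding -- is aggregation over counts like these, with no model in the loop.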
### The Architect shaped everything

The Architect produced fewer words than the Writer and fewer features than the Coder, but had disproportionate impact. Three contributions reshaped the output:

1. The insight that **gap severity correlates with coordination difficulty** transformed Post 4 from a list of gaps into an argument about structural dysfunction.

2. The **"two equilibria" framing** -- microservices chaos vs. layered web architecture -- gave Post 6's predictions real structural weight.

3. A **verification pass** that caught the Writer's revisions silently failing (logged as done, not actually persisted in the file).

That third point is worth dwelling on. The dev journal said "Post 1 revisions complete." The file still contained the pre-revision content. Without the Architect reading the actual output rather than trusting the status message, the error would have shipped. This is a small-scale version of the Behavior Verification gap the series identifies as critical -- and we will come back to it.

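The guardrail the Architect applied by hand can be automated: check the artifact, not the journal. A minimal sketch, with hypothetical file names and markers:

```python
import tempfile
from pathlib import Path

def verify_revision(path, must_contain, must_not_contain):
    """Trust the artifact, not the status message: check the file's actual
    content for the markers a revision should have added or removed."""
    text = path.read_text(encoding="utf-8")
    return (all(m in text for m in must_contain)
            and all(m not in text for m in must_not_contain))

# The journal may claim "Post 1 revisions complete"; this checks the file.
post = Path(tempfile.mkdtemp()) / "post-01.md"
post.write_text("## The Gold Rush\n\nLighter ending.\n", encoding="utf-8")
print(verify_revision(post, ["Lighter ending"], ["geopolitics"]))  # → True
```

Had a check like this run after the Writer's revision pass, the silent failure described above would have been caught mechanically instead of by a manual read.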
### The human who said "so what?"

The most consequential intervention in the entire project came not from an agent but from the human project lead. The series had been built around a headline number: "1,780 technical ideas extracted from the drafts." The project lead asked: what does that number actually mean?

The answer was uncomfortable. The pipeline extracts roughly 5 ideas per draft on average -- a mechanical process that produces items like "A2A Communication Paradigm" and "Agent Network Architecture." The raw count sounds impressive but is mostly scaffolding. The real signal was hiding in the cross-org overlap analysis: 96% of unique idea titles appear in exactly one draft. Only 75 show up in two or more. The fragmentation that defines the protocol landscape extends all the way down to the idea level.

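The overlap question itself reduces to one SQL query. A self-contained sketch against a toy `ideas` table -- the schema and rows are illustrative, not the project's actual database:

```python
import sqlite3

# One row per extracted idea title, attached to the draft it came from.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ideas (draft TEXT, title TEXT)")
con.executemany("INSERT INTO ideas VALUES (?, ?)", [
    ("draft-a", "A2A Communication Paradigm"),
    ("draft-b", "A2A Communication Paradigm"),   # appears in 2 drafts
    ("draft-a", "Agent Network Architecture"),
    ("draft-c", "Capability Token Format"),
])

# How many unique titles appear in exactly one draft?
singleton, total = con.execute("""
    SELECT SUM(n = 1), COUNT(*) FROM (
        SELECT COUNT(DISTINCT draft) AS n FROM ideas GROUP BY title
    )
""").fetchone()
print(f"{singleton}/{total} titles appear in exactly one draft")  # → 2/3
```

Run against the real corpus, this is the query behind the 96% figure: group by title, count distinct drafts, and see how many groups have size one.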
This required rewriting Post 5 entirely. Its title changed from "The 1,780 Ideas That Will Shape Agent Infrastructure" to "Where 434 Drafts Converge (And Where They Don't)." The lead metric shifted from raw extraction count (impressive but hollow) to the convergence rate (honest and striking). Four agents had independently used the 1,780 figure -- the Analyst generated it, the Coder validated it, the Architect designed around it, the Writer headlined it. None questioned whether it was meaningful.

## Phase 2: The Review Cycle

After the writing team produced 8 blog posts, a vision document, 7 new analysis features, and 30 dev-journal entries, we did something that turned out to matter more than the writing itself: we sent the entire output to four specialist reviewers, each running in parallel.

| Reviewer | Lens | Issues Found |
|----------|------|--------------|
| **Statistics** | Data integrity, sampling bias, quantitative accuracy | 3 critical, 4 important, 4 minor |
| **Legal** | German/EU internet law, GDPR, EU AI Act, eIDAS 2.0 | 3 critical, 5 regulatory gaps, 5 improvements |
| **Engineering** | Code quality, security, performance, DX | 1 critical, 1 high, 5 bugs, 6 perf issues |
| **Science** | Methodology, reproducibility, related work, hedging | 2 critical, 3 high, 4 medium |

Four agents, four completely different perspectives, run simultaneously. Together they surfaced **36 distinct issues** that the writing team had missed. The findings were often surprising.

### The statistics reviewer found the numbers did not add up

The statistical audit cross-checked every quantitative claim in the blog series against the actual database using raw SQL queries. The results were sobering. The blog claimed 361 drafts; the database held 434. The blog claimed 1,780 ideas; the database held 419. The blog claimed 12 gaps; the database held 11. Composite scores were inflated by 0.05-0.10 through rounding. The "4:1 safety ratio" varied from 1.5:1 to 21:1 by month -- a fact the flat claim obscured.

The ideas count mismatch was the most serious finding. The entire thesis of Post 5 -- "96% of ideas appear in one draft" and "628 cross-org convergent ideas" -- was not reproducible from the current database. The pipeline had been re-run with different parameters, overwriting the original extraction. Nobody had noticed because the numbers in the blog posts were never re-checked against the live database.

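An audit like this can be mechanized as a table of claims checked against the live database. A minimal sketch with one hypothetical claim (the schema and the claims list are invented for illustration):

```python
import sqlite3

# Stand-in database: 434 draft rows, as the real DB held at review time.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE drafts (name TEXT)")
con.executemany("INSERT INTO drafts VALUES (?)", [(f"d{i}",) for i in range(434)])

# Each entry: (description, value claimed in the prose, SQL for the live value).
claims = [
    ("draft count", 361, "SELECT COUNT(*) FROM drafts"),
]

for desc, claimed, sql in claims:
    actual = con.execute(sql).fetchone()[0]
    status = "OK" if actual == claimed else f"MISMATCH (db has {actual})"
    print(f"{desc}: claimed {claimed} -> {status}")
# → draft count: claimed 361 -> MISMATCH (db has 434)
```

Keeping the claims list in version control next to the prose turns "re-check the numbers" from a manual chore into a script that fails loudly when the pipeline is re-run.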
### The legal reviewer found regulatory blindspots

The legal review, written from a German/EU internet law perspective, identified three critical issues that no technically focused agent would have caught:

**Consent conflation.** The series used "consent" interchangeably across OAuth authorization flows, GDPR consent (Einwilligung under Art. 6(1)(a)), and human-in-the-loop approval gates. These are legally distinct concepts. Under CJEU case law (Planet49), consent requires a clear affirmative act by the data subject. When an AI agent delegates to sub-agents, the chain of consent may break entirely. None of the 14 OAuth-for-agents proposals the series analyzed -- and none of the agents writing about them -- flagged this.

**The hospital scenario understated regulatory reality.** Post 4's opening scenario -- an AI agent managing drug dispensing with a hallucinated dosage -- was framed as "what goes wrong if this gap is never addressed." Under EU law, it is already addressed: the EU AI Act classifies such systems as high-risk under Annex III, the revised Product Liability Directive covers AI systems explicitly, and German medical law (BGB §§ 630a ff.) places the duty of care on the provider. The IETF gap is not in accountability but in technical mechanisms to implement what the regulation already requires.

**GDPR was entirely absent from the gap analysis.** The series identified 11 standardization gaps. None mentioned GDPR-mandated capabilities: data protection impact assessments, right-to-erasure propagation through multi-agent chains, data portability, or purpose limitation. These are not aspirational -- they are legally binding requirements that agent systems operating in the EU must satisfy.

### The engineering reviewer found a SQL injection

The codebase review graded the project B+ overall -- "solid for a research tool, needs hardening for production" -- but found a critical SQL injection vulnerability in `db.py`. The `update_generation_run` method interpolated column names from `**kwargs` directly into SQL strings without validation. The Flask SECRET_KEY was hardcoded as the string `"ietf-dashboard-dev"`. There was no rate limiting on endpoints that trigger paid Claude API calls.

The engineering reviewer also noted that `cli.py` had grown to 2,995 lines with roughly 40 repetitions of the same config/db boilerplate pattern, and that test coverage for the analysis pipeline -- the core of the tool -- was exactly zero.

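The vulnerable pattern and its whitelist fix are worth spelling out. The sketch below reconstructs the idea only -- the table schema and function body are illustrative, not the project's actual `db.py`. Column names cannot be bound as SQL parameters, so they end up interpolated into the query string; that is exactly why they must be validated first:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE runs (id INTEGER PRIMARY KEY, status TEXT, cost REAL)")
con.execute("INSERT INTO runs (status, cost) VALUES ('pending', 0.0)")

ALLOWED_COLUMNS = {"status", "cost"}  # the fix: an explicit whitelist

def update_generation_run(run_id, **kwargs):
    # Reject any kwarg that is not a known column before string interpolation.
    bad = set(kwargs) - ALLOWED_COLUMNS
    if bad:
        raise ValueError(f"invalid column(s): {bad}")
    assignments = ", ".join(f"{col} = ?" for col in kwargs)  # values still bound
    con.execute(f"UPDATE runs SET {assignments} WHERE id = ?",
                (*kwargs.values(), run_id))

update_generation_run(1, status="done")  # legitimate update passes
try:
    # Without the whitelist, this key would be spliced straight into the SQL.
    update_generation_run(1, **{"status = 'x' WHERE 1=1; --": "pwn"})
except ValueError as e:
    print("blocked:", e)
```

The values themselves still travel through `?` placeholders; only the column names needed the extra guard.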
### The science reviewer questioned the methodology

The scientific review identified the central methodological weakness: the entire rating system relies on Claude as the sole judge for five dimensions, with no human calibration, no inter-rater reliability measurement, and ratings based on abstracts only (truncated to 2,000 characters), not full draft text. The clustering threshold of 0.85 was described as "empirical" with no sensitivity analysis. The gap analysis was single-shot LLM generation from compressed metadata.

One finding was particularly striking: of 434 drafts rated for relevance, the distribution was heavily right-skewed (196 at 4, 98 at 5, only 38 at 1-2). Claude was generous with relevance for keyword-matched drafts, making the metric less discriminating than it should be. Upon manual review, 73 drafts turned out to be false positives -- including `draft-ietf-hpke-hpke` (generic public key encryption, nothing to do with AI agents) rated at relevance 5.

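A sensitivity check on that clustering threshold is cheap to run. A sketch using `difflib.SequenceMatcher` -- which the pipeline reportedly uses for title clustering -- with an invented title pair; the function name and thresholds are illustrative:

```python
from difflib import SequenceMatcher

def similar(a, b, threshold):
    """Would two idea titles be clustered together at this threshold?"""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

pair = ("Agent Discovery Protocol", "Agent Discovery Mechanism")

# The one-line sensitivity analysis the review asked for: sweep the
# threshold around the chosen 0.85 and watch the clustering decision flip.
for t in (0.70, 0.80, 0.85, 0.90):
    print(t, similar(*pair, threshold=t))
```

If the headline convergence rate swings noticeably across a sweep like this, the 0.85 choice needs justification; if it barely moves, "empirical" is defensible.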
## Phase 3: The Fix Cycle

With 36 issues identified, we launched fix agents -- the Coder handling engineering and data integrity issues, an Editor handling legal and statistical corrections across the blog posts.

The fixes unfolded in three rounds, prioritized by severity:

**Round 1 -- Critical.** SQL injection patched with a column-name whitelist. Flask SECRET_KEY replaced with an `os.environ.get()` lookup falling back to `os.urandom()`. FTS5 query sanitization added to prevent search injection. False-positive column added to the ratings table; 73 drafts flagged. All blog posts updated from 361 to 434 drafts. Ideas-count discrepancy reconciled (419 current, with a methodology note explaining the 1,780 historical figure). Gap count corrected from 12 to 11, with the gap table rewritten to match database reality.

**Round 2 -- High.** Rate limiting added to Claude-calling endpoints (10 req/min/IP). Category names normalized in the database (21 legacy entries migrated). EU AI Act timeline corrected from "within 18 months" to "within 5 months (August 2026)" with enforcement details and article references. OAuth/GDPR consent distinction added. Hospital scenario annotated with AI Act Annex III and Medical Devices Regulation context. Safety ratio qualified everywhere from a flat "4:1" to "averaging ~4:1 but varying from 1.5:1 to 21:1 month-to-month."

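The rate-limiting fix deserves a concrete shape. The project's actual limiter is not shown in this commit, so the following is a minimal sliding-window sketch of "10 requests per minute per IP", framework-agnostic:

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Sliding-window limiter: at most `limit` requests per `window` seconds per key."""
    def __init__(self, limit=10, window=60.0):
        self.limit, self.window = limit, window
        self.hits = defaultdict(deque)

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[key]
        while q and now - q[0] >= self.window:  # evict hits outside the window
            q.popleft()
        if len(q) >= self.limit:
            return False                        # over budget: reject (HTTP 429)
        q.append(now)
        return True

rl = RateLimiter(limit=10, window=60.0)
results = [rl.allow("203.0.113.7", now=float(i)) for i in range(12)]
print(results.count(True))  # → 10 (requests 11 and 12 within the minute are rejected)
```

In a Flask app, a check like this would sit in a `before_request` hook keyed on the client address, guarding only the endpoints that trigger paid API calls.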
**Round 3 -- Medium.** Methodology documentation created (a comprehensive `methodology.md` covering all pipeline stages, limitations, and related work). IETF IPR notes added. Language hedged where causal claims were supported only by correlation. MIT LICENSE file created (the project claimed "open source" but had no license). FIPA, IEEE P3394, and eIDAS 2.0 references added where they naturally strengthen arguments. The Coder reduced `cli.py` by 200 lines of boilerplate, added `--dry-run` flags to destructive commands, and fixed N+1 query patterns.

In total: 14 files modified across the blog series, 7 security/quality fixes applied to the codebase, test count increased from 23 to 64, and a verified-counts document created as a single source of truth.

## What This Reveals

### Specialized perspectives catch different things

This is the headline finding from the review cycle. Four reviewers looked at the same output and found almost entirely non-overlapping issues. The statistician found number mismatches. The lawyer found consent conflation. The engineer found SQL injection. The scientist found methodological gaps. No single reviewer -- no matter how thorough -- would have caught all 36 issues.

This is not a theoretical observation about diverse review; it is an empirical result from running the experiment. The legal reviewer's consent-conflation finding required knowledge of CJEU case law. The statistical reviewer's ideas-count discovery required querying the live database. The engineering reviewer's SQL-injection finding required reading the source code line by line. These are genuinely different skills applied to the same artifact.

### The review-fix-verify pattern works

The cycle ran cleanly: four parallel reviews produced a prioritized list; fix agents resolved issues in severity order; the fixes were verified against the review documents. Three rounds (critical, high, medium) imposed natural prioritization. The entire cycle -- 4 reviews plus 3 fix rounds -- happened in a single day.

The pattern mirrors what the IETF itself does with Last Call reviews, directorate reviews, and IESG evaluation: multiple specialized perspectives, applied in sequence, with verification that issues are resolved. The difference is that our cycle took hours, not months. The cost is that our reviewers share the same underlying model and its blindspots.

### Agents modifying the same files is the hard problem

All three contributions came from reading holistically -- something no individual report, pipeline run, or status message could produce. The Architect role was fundamentally about synthesis and verification.
|
The most persistent coordination difficulty was not conceptual but logistical: multiple agents editing the same blog posts. The Writer updated Post 4's gap table. The Editor changed the safety ratio phrasing. The Coder corrected the draft count. Each edit was correct in isolation. But when three agents modify the same file, merge conflicts and stale reads are inevitable. We hit this multiple times -- most visibly with the Post 1 revisions that silently failed to persist.
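The failure mode above -- several agents writing the same file with no merge mechanism -- has a cheap mitigation even without a real locking layer. The sketch below is illustrative only (the helpers and the optimistic-write policy are mine, not part of the project's tooling): record a content hash at read time, and refuse the write if the file has changed underneath you.

```python
import hashlib
from pathlib import Path

def read_with_version(path: Path) -> tuple[str, str]:
    """Read a shared file and record a content hash as its 'version'."""
    text = path.read_text(encoding="utf-8")
    return text, hashlib.sha256(text.encode("utf-8")).hexdigest()

def write_if_unchanged(path: Path, new_text: str, expected_hash: str) -> bool:
    """Optimistic write: persist only if nobody else changed the file.

    Returns False on conflict; the caller must re-read and re-apply its edit.
    """
    current = hashlib.sha256(path.read_bytes()).hexdigest()
    if current != expected_hash:
        return False  # another agent wrote first: stale read detected
    path.write_text(new_text, encoding="utf-8")
    return True
```

A rejected write forces the agent to re-read and re-apply its change, which is exactly the conflict-detection half of the pessimistic-versus-optimistic choice discussed next.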
### The cheapest analyses were the most important
This maps directly to the IETF's Agent Execution Model gap. When multiple agents operate on shared state, you need either locking (pessimistic) or conflict detection (optimistic). We had neither. We used a file system, a dev journal, and hope.
| Component | Cost | Most Important Finding |
|-----------|-----:|----------------------|
| Claude Sonnet (ratings, gaps) | ~$8 | 4:1 safety deficit, 11 gap taxonomy |
| Claude Haiku (idea extraction) | ~$0.80 | 419 ideas (vast majority unique to single drafts) |
| Ollama embeddings | $0.00 | 25+ near-duplicate pairs |
| Coder: regex RFC parsing | $0.00 | Foundation divergence (YANG vs COSE) |
| Coder: networkx centrality | $0.00 | European telecoms as bridge-builders |
| Coder: SQL co-occurrence | $0.00 | Safety structurally isolated from protocols |
| Coder: revision counting | $0.00 | 55% fire-and-forget rate |
| **Total pipeline** | **~$9** | |

### The cheapest analyses mattered most

| Component | Cost | Key Finding |
|-----------|-----:|-------------|
| Claude Sonnet (ratings, gaps) | ~$8 | 4:1 safety deficit, 11 gaps |
| Claude Haiku (idea extraction) | ~$0.80 | 419 ideas, 96% unique to one draft |
| 4 reviewers (parallel) | ~$4 | 36 issues across 4 dimensions |
| Ollama embeddings | $0.00 | 25+ near-duplicate pairs |
| Coder: regex, SQL, networkx | $0.00 | RFC divergence, centrality, co-occurrence |
| **Total** | **~$13** | |
The pattern is consistent: Claude provided the foundation data (ratings, categories, ideas), but the structurally revealing findings came from deterministic local computation on top of that foundation. RFC cross-references (regex), author centrality (networkx), revision velocity (filename parsing), and category co-occurrence (SQL joins) -- all zero-cost, all among the most quotable findings in the series.

The LLM provided the foundation data. Every structurally revealing finding -- RFC foundation divergence, European telecoms as bridge-builders, safety structurally isolated from protocols, 55% fire-and-forget revision rate -- came from deterministic local computation on top of that foundation. The lesson for anyone building LLM-powered analysis: the model is the foundation, not the insight engine.
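To show how little machinery these zero-cost findings need, here is a hedged sketch of the RFC-reference counting idea (the regex, function names, and corpus shape are hypothetical; the project's actual code may differ):

```python
import re
from collections import Counter

# Match "RFC 9052" or "RFC9052"; the capture group is the RFC number.
RFC_REF = re.compile(r"\bRFC\s?(\d{3,5})\b")

def rfc_citations(draft_text: str) -> Counter:
    """Count RFC references in one draft's plain text."""
    return Counter(int(n) for n in RFC_REF.findall(draft_text))

def corpus_foundations(drafts: dict[str, str]) -> Counter:
    """Aggregate across a corpus: which RFCs the field builds on."""
    total: Counter = Counter()
    for text in drafts.values():
        total.update(rfc_citations(text))
    return total
```

A `Counter` over a few hundred text files is the whole analysis; the divergence finding is just the observation that different draft clusters cite disjoint foundations.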
### The development journal earned its keep

We required every agent to log milestones to a shared `dev-journal.md`. By session's end, the journal had 30 entries across all four agents -- capturing not just what was done but why, and flagging surprises that would otherwise be lost. When the Writer needed to understand what the Coder had built, the journal entry was faster and more informative than a status message. When the Architect reviewed posts, the Writer's journal entries explained editorial decisions that would otherwise be opaque.

The journal also became the source material for this post. Every "Surprise" field in the journal captured an insight -- the ideas reframing, the silent failure, the RFC divergence revelation -- that no other artifact preserves.

## What This Tells Us About Agent Teams

Six lessons from running a four-agent team on a real project:

**1. Role definitions matter more than instructions.** The one-page agent definitions were more effective than the 3,000-word team brief. Agents performed best when they had a clear identity and scope, not a detailed todo list.

**2. Shared state beats messaging.** The SQLite database, the dev journal, and the report files were more effective coordination mechanisms than direct inter-agent messages. Agents could read each other's outputs on their own schedule, without the overhead of request-response communication.

**3. Async is natural, but verification is not.** Agents working in parallel on loosely coupled tasks is a pattern that works. What does not happen naturally is output verification. The silent failure -- revisions logged but not persisted -- would have gone undetected without a deliberate verification pass. Agent teams need assurance mechanisms, not just coordination mechanisms.

**4. Humans catch category errors; agents catch consistency errors.** The Architect found a 14-vs-13 data inconsistency. The Writer applied six revision passes without introducing a single factual error. Agents are excellent at consistency within a frame. But the project lead's "so what?" about the ideas count was a category-level critique -- questioning the frame itself. That kind of challenge did not emerge from any agent.

**5. Review compounds.** The Architect reviewed the Writer's posts, the project lead reviewed the Architect's framing, and the resulting revisions cascaded through the series. Each review layer caught different things: data errors, structural problems, framing weaknesses. Multiple review passes from different perspectives produced compounding quality gains.

**6. The journal is the product.** The dev journal -- originally intended as a process artifact -- became the richest record of what happened and why. It captures decisions, surprises, and coordination moments that no other artifact preserves. For any multi-agent project, require a shared journal.
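Lesson 3 has a direct mechanical reading: treat a "done" log entry as a claim to be checked against the artifact, not as evidence. A minimal sketch, with hypothetical helper and file names:

```python
from pathlib import Path

def verify_applied(path: Path, must_contain: list[str]) -> list[str]:
    """Check that claimed revisions are actually present in a file.

    Returns the snippets that are missing; an empty list means the
    'completed' status is consistent with what is on disk.
    """
    text = path.read_text(encoding="utf-8")
    return [snippet for snippet in must_contain if snippet not in text]
```

Had a check like this run after the Writer logged its Post 1 revisions, the silent failure would have surfaced immediately instead of in a manual review pass.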
## The Meta-Irony

We built a team of AI agents to analyze 434 IETF drafts about AI agent standards. The team needed: coordination mechanisms, shared context, role-based specialization, review and quality gates, human oversight, and a way to verify that completed work was actually complete.

Every one of these needs maps to a gap in the IETF landscape:

We built a team of AI agents to analyze IETF drafts about AI agent standards. The team needed coordination, shared context, specialized roles, quality review, human oversight, and output verification. Every one of these needs maps to a gap in the IETF landscape:

| Our Team Needed | What Happened | IETF Gap |
|----------------|---------------|----------|
| Shared execution context | Agents coordinated via SQLite, files, dev journal | Agent Execution Model (no standard) |
| Quality review before publication | Architect caught data errors, structural problems | Agent Behavior Verification (critical gap) |
| Output verification | Writer's revisions silently failed; Architect caught it manually | Agent Behavioral Verification (critical) |
| Output verification | Writer's revisions silently failed; Architect caught it manually | Agent Behavior Verification (critical gap) |
| Quality review | 4 parallel reviewers found 36 issues the writing team missed | Agent Behavioral Verification (critical) |
| Error handling when agents disagreed | Ideas reframing required 3 iterations to stabilize | Agent Error Recovery (6 ideas from 1 draft) |
| Error handling | Ideas reframing required 3 iterations to stabilize numbers | Real-Time Agent Rollback (high) |
| Coordination across different approaches | RFC divergence: agents building on different foundations | Cross-Protocol Translation (zero ideas) |
| Coordination across approaches | Agents editing the same files with no merge mechanism | Cross-Protocol Agent Migration (medium) |
| Human oversight of outputs | Project lead's "so what?" redirected the entire ideas framing | Human Override and Intervention (4 ideas) |
| Human oversight | Project lead's "so what?" redirected the entire ideas framing | Human Override Standardization (high) |
| Specialized perspectives | Legal, statistical, engineering, and scientific reviewers each found unique issues | Agent Capability Negotiation (medium) |

We solved these problems ad hoc -- with a dev journal, a task board, role definitions, manual verification passes, and human review. The IETF is trying to solve them at internet scale with protocol standards. The distance between our 4-agent team and a deployed multi-agent system on the open internet is vast, but the problems are structurally identical.

We solved these problems ad hoc -- with a journal, role definitions, manual verification passes, severity-prioritized fix rounds, and human review. The IETF is trying to solve them at internet scale with protocol standards.

The standards the IETF is racing to write are the standards our own team needed. The traffic lights the highway needs are the ones we built by hand.

The distance between our 12-agent team and a deployed multi-agent system on the open internet is vast. But the problems are structurally identical. The standards the IETF is racing to write are the standards our own team needed. The traffic lights the highway needs are the ones we built by hand.
---

### Key Takeaways

- **Four agents** (Architect, Analyst, Coder, Writer) produced 8 blog posts, a vision document, 7 new analysis features, and 30 dev-journal entries from a ~$9 data pipeline
- **Twelve agents across three phases** (4 writers, 4 reviewers, 4 fixers) produced 8 blog posts, a vision document, 7 analysis features, 36 identified issues, and 64 tests -- from a ~$13 pipeline
- **The ideas reframing** -- where a human's "so what?" redirected all four agents -- was the single most consequential intervention in the project, and no agent initiated it
- **Four parallel reviewers found 36 non-overlapping issues**: a SQL injection, consent conflation with EU law, a 76% ideas count mismatch, and uncalibrated LLM-as-judge methodology. No single reviewer would have caught all of them
- **The human project lead's "so what?"** was the single most consequential intervention -- no agent questioned whether the headline metric was meaningful
- **A silent failure** (revisions logged but not persisted) demonstrated the same Behavior Verification gap the series identifies as critical in the IETF landscape
- **The cheapest analyses were the most revealing**: RFC divergence, author centrality, revision velocity, and co-occurrence patterns -- all zero-cost local computation -- produced the findings that defined the series
- **The team's coordination problems mirror the IETF's gaps**: shared state, output verification, error recovery, capability negotiation, and human oversight are needed at every scale
- **The team's coordination problems mirror the IETF's gaps**: execution model, behavior verification, error recovery, cross-protocol translation, and human oversight are needed at every scale

*This post concludes the series. All data, code, and reports are available in the IETF Draft Analyzer project repository.*
@@ -4,6 +4,41 @@
---
### 2026-03-08 ANALYST — Pipeline run: authors + gaps refresh

**What**: Ran the processing pipeline on 474-draft corpus. Fetched authors for 102 previously-unlinked drafts (113 were missing, 11 had Datatracker issues). Re-ran gap analysis with --refresh on the full corpus. Checked idea extraction status.

**Why**: After corpus expansion to 474 drafts, 113 drafts lacked author data and gap analysis needed refreshing against the full set.

**Result**: Author coverage: 463/474 drafts now have authors (up from ~350), 709 unique authors (up from 403). Gap analysis: 12 gaps identified (same count, refreshed against full corpus). All 474 drafts already rated. Idea extraction: 59 drafts have no ideas but are in the LLM cache (previously processed, yielded nothing -- 25 rated relevance 4-5, so may warrant individual re-extraction with --reextract).

**Surprise**: The `drafts_without_ideas` query checks both the ideas table AND the llm_cache table, so drafts that were batch-processed but yielded no ideas won't be retried by `--all`. To force re-extraction for high-relevance drafts without ideas, use `ietf ideas --reextract --draft <name>` individually.
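The retry-suppression behavior described in this Surprise can be illustrated with a toy schema. All table and column names below are guesses for illustration; the project's real schema may differ. The point is structural: a draft counts as "without ideas" only if it has no extracted ideas AND no cache entry, so a cached empty result permanently suppresses re-extraction.

```python
import sqlite3

# A draft is pending only when it is absent from BOTH tables.
QUERY = """
SELECT d.name
FROM drafts d
LEFT JOIN ideas i ON i.draft_name = d.name
LEFT JOIN llm_cache c ON c.draft_name = d.name
WHERE i.draft_name IS NULL
  AND c.draft_name IS NULL
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drafts (name TEXT PRIMARY KEY);
CREATE TABLE ideas (draft_name TEXT);
CREATE TABLE llm_cache (draft_name TEXT);
INSERT INTO drafts VALUES ('draft-a'), ('draft-b'), ('draft-c');
INSERT INTO ideas VALUES ('draft-a');                   -- extraction succeeded
INSERT INTO llm_cache VALUES ('draft-a'), ('draft-b');  -- b cached, zero ideas
""")
pending = [row[0] for row in conn.execute(QUERY)]
```

Only `draft-c` comes back as pending; `draft-b` is stuck behind its empty cached result, which is why a per-draft `--reextract` is needed.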
---

### 2026-03-08 WRITER — Post 08 Rewrite: "Agents Building the Agent Analysis"

**What**: Complete rewrite of Post 08, the meta post about using Claude Code agent teams to build the project. The previous draft (~3,500 words, written before the review cycle) covered only Phase 1 (the writing team). The new version (~2,800 words) covers all three phases: the 4-agent writing team, the 4-agent review cycle, and the 3-round fix cycle.

**Why**: The review cycle was the most consequential phase of the entire project -- 4 parallel reviewers found 36 issues including a SQL injection, consent conflation with EU law, a 76% ideas count mismatch, and uncalibrated methodology. This material was missing from the previous draft entirely. The post needed to tell the complete story.

**Result**: New structure: Phase 1 (writing team + parallel execution + Architect's impact + human "so what?" intervention), Phase 2 (4 parallel reviewers, specific findings per reviewer), Phase 3 (3-round fix cycle by severity), then analysis sections (specialized perspectives, review-fix-verify pattern, shared-state coordination problem, cost breakdown, meta-irony table). The meta-irony table now maps 7 team coordination needs to specific IETF gap names from the database.

**Surprise**: The post's strongest structural element is the review cycle section -- the specific examples (consent conflation, HPKE false positive, silent revision failure) are more vivid and demonstrable than the writing-phase anecdotes. The review cycle essentially proved the thesis: agents analyzing agents need the same coordination standards the agents are analyzing.

---

### 2026-03-08 CODER — Track untracked files, update .gitignore

**What**: Cleaned up untracked files in the repo. Updated `.gitignore` to exclude LaTeX build artifacts (`paper/*.aux`, `paper/*.log`, `paper/*.out`), `data/analytics.db`, and `data/ietf_drafts.db` (stale DB). Staged 12 new files for commit: 5 scripts (`backfill-wg-names.py`, `classify-unrated.py`, `compare-classifiers.py`, `download-relevant-text.py`, `run-webui.sh`), 4 source modules (`classifier.py`, `analytics.py`, `auth.py`, `obsidian_export.py`), 1 test (`test_obsidian_export.py`), 2 reports (`wg-analysis.md`, generated draft).

**Why**: These files had accumulated as untracked over several sessions. Production code, utility scripts, and analysis reports all belong in version control. Build artifacts and local DBs do not.

**Result**: 12 files staged, .gitignore updated with 6 new patterns. No commit made yet (deferred to parent process).

---

### 2026-03-08 ANALYST — Re-extract ideas and convergence analysis

**What**: Ran idea extraction pipeline for 38 drafts that were missing ideas (out of 97 initially missing — 59 remain without ideas, likely false positives or drafts without sufficient content). Then ran cross-organization convergence analysis on the full idea set.

**Why**: Ideas count was stale at 419 across 377 drafts after the DB expanded to 474 drafts. Convergence analysis needed to understand which technical ideas are independently emerging across multiple organizations.

**Result**: 462 ideas across 415 drafts. Convergence analysis found 132 cross-org convergent ideas out of 398 unique clusters (33% convergence rate). Top convergent idea: "Fully Adaptive Routing Ethernet for AI" with 14 contributing organizations. Notable: "AI Agent Protocol Framework" converges across 7 orgs and 3 separate drafts. Updated `data/reports/reviews/verified-counts.md` with new counts and convergence results.

**Cost**: 654,377 tokens in + 335,984 tokens out (Haiku, cheap mode), 8 batches of 5 drafts.

---

### 2026-03-08 CODER — TypedDicts for data layer, ethics + regulatory content in blog series

**What**: Four improvements across typing and content:
@@ -1,9 +1,9 @@
# Gap Analysis: IETF AI/Agent Draft Landscape

*Generated 2026-03-03 19:58 UTC — analyzing 361 drafts, 1780 technical ideas*

*Generated 2026-03-08 14:30 UTC — analyzing 474 drafts, 462 technical ideas*

## Overview

This report identifies **12 gaps** — areas, problems, or technical challenges not adequately addressed by the current 361 IETF AI/agent drafts. Each gap is cross-referenced with related drafts and extracted technical ideas to show partial coverage.

This report identifies **12 gaps** — areas, problems, or technical challenges not adequately addressed by the current 474 IETF AI/agent drafts. Each gap is cross-referenced with related drafts and extracted technical ideas to show partial coverage.

| Severity | Count |
|----------|------:|
@@ -13,34 +13,34 @@ This report identifies **12 gaps** — areas, problems, or technical challenges
### Safety Deficit

Only **44** of 361 drafts address AI safety/alignment, while **120** focus on A2A protocols and **93** on autonomous operations. The ratio of capability-building to safety is roughly **4:1**.

Only **46** of 474 drafts address AI safety/alignment, while **150** focus on A2A protocols and **110** on autonomous operations. The ratio of capability-building to safety is roughly **5:1**.

---
## 1. Agent Behavior Verification

## 1. Real-time Agent Behavior Verification

| | |
|---|---|
| **Severity** | CRITICAL |
| **Category** | AI safety/alignment |
| **Drafts in category** | 44 |
| **Drafts in category** | 46 |

While many drafts address agent identity and authentication, few tackle how to verify that an agent is actually behaving according to its declared capabilities and policies. There's a critical gap in runtime behavioral attestation and compliance monitoring mechanisms.

Current AI safety drafts focus on governance but lack technical protocols for real-time verification that agents are behaving according to their declared policies. There's no standard way to cryptographically prove agent actions match stated intentions.

**Evidence:** High overlap in identity/auth (108 drafts) but only 44 drafts on safety/alignment, with no specific focus on behavioral verification

**Evidence:** Only 46 safety drafts versus 474 total, with governance focus rather than technical verification

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-an-nmrg-i2icf-cits](https://datatracker.ietf.org/doc/draft-an-nmrg-i2icf-cits/) (score 3.7) — Interface to In-Network Computing Functions for Cooperative Intelligent Transpor
- [draft-zhao-detnet-enhanced-use-cases](https://datatracker.ietf.org/doc/draft-zhao-detnet-enhanced-use-cases/) (score 3.2) — Enhanced Use Cases for Scaling Deterministic Networks
- [draft-zhang-rvp-problem-statement](https://datatracker.ietf.org/doc/draft-zhang-rvp-problem-statement/) (score 3.5) — Problem Statements and Requirements of Real-Virtual Agent Protocol (RVP): Commun
- [draft-yuan-rtgwg-traffic-agent-usecase](https://datatracker.ietf.org/doc/draft-yuan-rtgwg-traffic-agent-usecase/) (score 3.7) — Use cases of the AI Network Traffic Optimization Agent
- [draft-altanai-aipref-realtime-protocol-bindings](https://datatracker.ietf.org/doc/draft-altanai-aipref-realtime-protocol-bindings/) (score 3.6) — AI Preferences for Real-Time Protocol Bindings
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-ruan-spring-priority-flow-control-sid](https://datatracker.ietf.org/doc/draft-ruan-spring-priority-flow-control-sid/) (score 3.1) — SRv6 behavior extention for Flow Control in WAN

**Top-rated in AI safety/alignment** (44 drafts):

**Top-rated in AI safety/alignment** (46 drafts):

- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
- [draft-aylward-daap-v2](https://datatracker.ietf.org/doc/draft-aylward-daap-v2/) (4.8) — Defines comprehensive protocol for AI agent accountability including authentication, monitoring, and
@@ -50,275 +50,34 @@ While many drafts address agent identity and authentication, few tackle how to v
### Partially Addressing Ideas

53 extracted ideas touch on this gap:

17 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
| Verifiable Agent Behavior Attestation | draft-birkholz-verifiable-agent-conversations | requirement |
| Distributed AI Accountability Protocol | draft-aylward-daap-v2 | protocol |
| Behavioral Trustworthiness Assessment | draft-chen-agent-decoupled-authorization-model | mechanism |
| AGENTS.TXT Policy File | draft-srijal-agents-policy | protocol |
| Multi-Vendor TEE Attestation (M-TACE) | draft-aylward-aiga-1 | mechanism |
| AI Network Security Agent | draft-yuan-rtgwg-security-agent-usecase | architecture |
| Multi-Vendor TEE Attestation (M-TACE) | draft-aylward-aiga-2 | mechanism |
| A2A Protocol Transport over MOQT | draft-a2a-moqt-transport | protocol |
| Cryptographic Identity Verification | draft-aylward-daap-v2 | mechanism |
| Behavioral Monitoring Framework | draft-aylward-daap-v2 | mechanism |
| Post-Discovery Authorization Handshake | draft-barney-caam | protocol |
| Five Enforcement Pillars with Typed Schemas | draft-berlinai-vera | pattern |
| Evidence-based Autonomy Maturity Model | draft-berlinai-vera | mechanism |
| Verifiable Agent Conversation Format | draft-birkholz-verifiable-agent-conversations | protocol |
| Intent-Based Just-in-Time Authorization | draft-chen-agent-decoupled-authorization-model | architecture |

*...and 45 more*

*...and 9 more*

---
## 2. Cross-Domain Agent Liability
|
## 2. Multi-Agent Consensus Under Byzantine Conditions
|
||||||
|
|
||||||
| | |
|
| | |
|
||||||
|---|---|
|
|---|---|
|
||||||
| **Severity** | CRITICAL |
|
| **Severity** | CRITICAL |
|
||||||
| **Category** | Policy/governance |
|
|
||||||
| **Drafts in category** | 91 |
|
|
||||||
|
|
||||||
When autonomous agents operate across organizational boundaries and cause harm or make decisions with legal implications, there's no standardized framework for liability attribution. The policy/governance drafts don't address cross-jurisdictional legal accountability.
|
|
||||||
|
|
||||||
**Evidence:** 91 policy/governance drafts but legal liability for cross-domain autonomous actions remains unaddressed
|
|
||||||
|
|
||||||
### Related Drafts
|
|
||||||
|
|
||||||
**Keyword matches** (drafts mentioning gap topic):
|
|
||||||
|
|
||||||
- [draft-diaconu-agents-authz-info-sharing](https://datatracker.ietf.org/doc/draft-diaconu-agents-authz-info-sharing/) (score 3.2) — Cross-Domain AuthZ Information sharing for Agents
|
|
||||||
- [draft-cui-dmsc-agent-cdi](https://datatracker.ietf.org/doc/draft-cui-dmsc-agent-cdi/) (score 3.0) — Cross-Domain Interoperability Framework for AI Agent Collaboration
|
|
||||||
- [draft-han-rtgwg-agent-gateway-intercomm-framework](https://datatracker.ietf.org/doc/draft-han-rtgwg-agent-gateway-intercomm-framework/) (score 3.6) — Agent Gateway Intercommunication Framework
|
|
||||||
- [draft-ni-a2a-ai-agent-security-requirements](https://datatracker.ietf.org/doc/draft-ni-a2a-ai-agent-security-requirements/) (score 3.7) — Security Requirements for AI Agents
|
|
||||||
- [draft-intellinode-ai-semantic-contract](https://datatracker.ietf.org/doc/draft-intellinode-ai-semantic-contract/) (score 3.2) — Semantic-Driven Traffic Shaping Contract for AI Networks
|
|
||||||
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
|
|
||||||
|
|
||||||
**Top-rated in Policy/governance** (91 drafts):
|
|
||||||
|
|
||||||
- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
|
|
||||||
- [draft-aylward-daap-v2](https://datatracker.ietf.org/doc/draft-aylward-daap-v2/) (4.8) — Defines comprehensive protocol for AI agent accountability including authentication, monitoring, and
|
|
||||||
- [draft-goswami-agentic-jwt](https://datatracker.ietf.org/doc/draft-goswami-agentic-jwt/) (4.5) — Extends OAuth 2.0 with Agentic JWT to address authorization challenges in autonomous AI systems. Int
|
|
||||||
- [draft-wang-cats-odsi](https://datatracker.ietf.org/doc/draft-wang-cats-odsi/) (4.5) — Specifies framework for decentralized LLM inference across untrusted participants with layer-aware e
|
|
||||||
- [draft-birkholz-verifiable-agent-conversations](https://datatracker.ietf.org/doc/draft-birkholz-verifiable-agent-conversations/) (4.5) — Defines CDDL-based data format for verifiable agent conversation records using COSE signing. Support
|
|
||||||
|
|
||||||
### Partially Addressing Ideas
|
|
||||||
|
|
||||||
26 extracted ideas touch on this gap:
|
|
||||||
|
|
||||||
| Idea | Draft | Type |
|
|
||||||
|------|-------|------|
|
|
||||||
| Cross-Domain Agent Identity Management | draft-abbey-scim-agent-extension | protocol |
|
|
||||||
| Multi-level Inference Protocol | draft-chuyi-nmrg-agentic-network-inference | protocol |
|
|
||||||
| Cross-Domain Agent Coordination | draft-chuyi-nmrg-agentic-network-inference | mechanism |
|
|
||||||
| Cross-Domain Agent Discovery | draft-cui-dmsc-agent-cdi | mechanism |
|
|
||||||
| Federated Agent Identity Framework | draft-cui-dmsc-agent-cdi | architecture |
|
|
||||||
| Agent Capability Negotiation Protocol | draft-cui-dmsc-agent-cdi | protocol |
|
|
||||||
| Federated Policy Enforcement | draft-cui-dmsc-agent-cdi | architecture |
|
|
||||||
| Cross-Domain Authorization Information Sharing | draft-diaconu-agents-authz-info-sharing | mechanism |
|
|
||||||
|
|
||||||
*...and 18 more*
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## 3. Human Override Protocols

| | |
|---|---|
| **Severity** | CRITICAL |
| **Category** | Human-agent interaction |
| **Drafts in category** | 30 |

Critical gap in standardized protocols for humans to safely interrupt, override, or take control of autonomous agents in emergency situations. Only 30 drafts address human-agent interaction, with no focus on emergency takeover procedures.

**Evidence:** Only 30 human-agent interaction drafts compared to 213+ autonomous operation drafts, with no emergency override standards

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-dhir-http-agent-profile](https://datatracker.ietf.org/doc/draft-dhir-http-agent-profile/) (score 4.2) — HTTP Agent Profile (HAP): Authenticated and Monetized Agent Traffic on the Web
- [draft-irtf-nmrg-llm-nm](https://datatracker.ietf.org/doc/draft-irtf-nmrg-llm-nm/) (score 3.5) — A Framework for LLM-Assisted Network Management with Human-in-the-Loop
- [draft-cui-nmrg-llm-nm](https://datatracker.ietf.org/doc/draft-cui-nmrg-llm-nm/) (score 4.1) — A Framework for LLM Agent-Assisted Network Management with Human-in-the-Loop
- [draft-zeng-opsawg-applicability-mcp-a2a](https://datatracker.ietf.org/doc/draft-zeng-opsawg-applicability-mcp-a2a/) (score 3.5) — When NETCONF Is Not Enough: Applicability of MCP and A2A for Advanced Network Ma
- [draft-wmz-nmrg-agent-ndt-arch](https://datatracker.ietf.org/doc/draft-wmz-nmrg-agent-ndt-arch/) (score 4.2) — Network Digital Twin and Agentic AI based Architecture for AI driven Network Ope
- [draft-ietf-suit-firmware-encryption](https://datatracker.ietf.org/doc/draft-ietf-suit-firmware-encryption/) (score 3.7) — Encrypted Payloads in SUIT Manifests

**Top-rated in Human-agent interaction** (30 drafts):

- [draft-drake-email-tpm-attestation](https://datatracker.ietf.org/doc/draft-drake-email-tpm-attestation/) (4.6) — Defines hardware attestation for email using TPM verification chains to prevent spam and provide Syb
- [draft-ietf-aipref-vocab](https://datatracker.ietf.org/doc/draft-ietf-aipref-vocab/) (4.4) — Defines a standardized vocabulary for expressing preferences about how digital assets should be used
- [draft-dhir-http-agent-profile](https://datatracker.ietf.org/doc/draft-dhir-http-agent-profile/) (4.2) — Defines HTTP Agent Profile for authenticating agent traffic, separating human from agent traffic, an
- [draft-song-tsvwg-camp](https://datatracker.ietf.org/doc/draft-song-tsvwg-camp/) (4.2) — Proposes CAMP, a multipath transport protocol for interactive multimodal LLM systems that maintains
- [draft-liu-agent-operation-authorization](https://datatracker.ietf.org/doc/draft-liu-agent-operation-authorization/) (4.1) — Specifies framework for verifiable delegation of actions from humans to AI agents using JWT tokens.

### Partially Addressing Ideas

7 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
| LLM-Human Collaborative Framework | draft-irtf-nmrg-llm-nm | architecture |
| CHEQ Protocol | draft-rosenberg-aiproto-cheq | protocol |
| Signed Confirmation Objects | draft-rosenberg-aiproto-cheq | mechanism |
| Cross-Protocol Integration Pattern | draft-rosenberg-aiproto-cheq | pattern |
| CHEQ Protocol | draft-rosenberg-cheq | protocol |
| Signed Decision Objects | draft-rosenberg-cheq | mechanism |
| Protocol Integration Pattern | draft-rosenberg-cheq | pattern |

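None of the drafts above standardize the override channel itself. As a purely illustrative sketch (the message fields, the shared-key HMAC scheme, and the 30-second freshness window are assumptions for this example, not taken from any draft), an authenticated emergency-override command might look like:

```python
import hmac, hashlib, json, time

# Shared key between operator console and agent runtime -- illustrative only;
# a real design would use asymmetric signatures.
SECRET = b"operator-console-key"

def make_override(agent_id: str, action: str) -> dict:
    """Build a signed emergency-override command (e.g. HALT, PAUSE, HANDOVER)."""
    msg = {"agent_id": agent_id, "action": action, "issued_at": time.time()}
    payload = json.dumps(msg, sort_keys=True).encode()
    msg["mac"] = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return msg

def verify_override(msg: dict, max_age_s: float = 30.0) -> bool:
    """Agent side: accept only fresh, authentic commands (mutates the dict it gets)."""
    mac = msg.pop("mac", "")
    payload = json.dumps(msg, sort_keys=True).encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    fresh = (time.time() - msg["issued_at"]) <= max_age_s
    return hmac.compare_digest(mac, expected) and fresh

cmd = make_override("agent-7", "HALT")
assert verify_override(dict(cmd))   # pass a copy; verification pops the MAC
```

The freshness check matters as much as the signature: a replayed HALT from last week should not stop an agent today.
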
---

## 4. Agent Resource Exhaustion Protection

| | |
|---|---|
| **Severity** | HIGH |
| **Category** | Autonomous netops |
| **Drafts in category** | 93 |

Missing standardized mechanisms to prevent malicious or poorly designed agents from consuming excessive network, compute, or storage resources. Current drafts focus on traffic management but not on agent-specific resource quotas and enforcement.

**Evidence:** 93 autonomous netops drafts and 73 ML traffic management drafts lack agent-specific resource protection mechanisms

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-jia-oauth-scope-aggregation](https://datatracker.ietf.org/doc/draft-jia-oauth-scope-aggregation/) (score 3.5) — OAuth 2.0 Scope Aggregation for Multi-Step AI Agent Workflows

**Top-rated in Autonomous netops** (93 drafts):

- [draft-cui-nmrg-llm-benchmark](https://datatracker.ietf.org/doc/draft-cui-nmrg-llm-benchmark/) (4.3) — Provides comprehensive evaluation framework for LLM-based network configuration agents. Includes emu
- [draft-wmz-nmrg-agent-ndt-arch](https://datatracker.ietf.org/doc/draft-wmz-nmrg-agent-ndt-arch/) (4.2) — Comprehensive architecture combining Network Digital Twin with Agentic AI for intent-based network o
- [draft-yue-anima-agent-recovery-networks](https://datatracker.ietf.org/doc/draft-yue-anima-agent-recovery-networks/) (4.1) — Defines task-oriented multi-agent framework for fault recovery in converged mobile networks. Targets
- [draft-cui-nmrg-llm-nm](https://datatracker.ietf.org/doc/draft-cui-nmrg-llm-nm/) (4.1) — Defines framework for collaborative network management between LLM agents and human operators. Intro
- [draft-jadoon-nmrg-agentic-ai-autonomous-networks](https://datatracker.ietf.org/doc/draft-jadoon-nmrg-agentic-ai-autonomous-networks/) (4.1) — Introduces architectural principles for integrating AI agents into IP protocol stack layers while pr

### Partially Addressing Ideas

40 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
| Agent Resource Type | draft-abbey-scim-agent-extension | extension |
| Agentic Application Resource Type | draft-abbey-scim-agent-extension | extension |
| Collaborative Inference Acceleration (KDN) | draft-agent-gw | mechanism |
| Data and Agent Aware-Inference and Training Network (DA-ITN) | draft-akhavain-moussa-ai-network | architecture |
| Agent-to-Agent (A2A) Communication Paradigm | draft-an-nmrg-i2icf-cits | protocol |
| Network-Level Quarantine Protocol | draft-aylward-aiga-1 | protocol |
| Agent Task Negotiation | draft-cui-ai-agent-task | protocol |
| Multi-Agent Security Protection | draft-fu-nmop-agent-communication-framework | mechanism |

*...and 32 more*

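The per-agent quota this gap calls for is most often sketched as a token bucket. A minimal illustration (the rate, capacity, and agent identity are invented for the example; no draft specifies this design):

```python
import time

class TokenBucket:
    """Per-agent rate limiter: refills at `rate` tokens/sec up to `capacity`."""
    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity          # start with a full burst allowance
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False                    # quota exceeded: reject or throttle

# One bucket per agent identity; 20 back-to-back requests against a
# 10-token burst get exactly the burst through.
bucket = TokenBucket(rate=0.5, capacity=10)
granted = sum(bucket.allow() for _ in range(20))
print(granted)  # → 10
```

The same shape works for bytes, compute seconds, or storage writes by varying `cost`; the hard standardization question is where the buckets live and how agents learn their quotas.
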
---

## 5. Agent-Generated Data Provenance

| | |
|---|---|
| **Severity** | HIGH |
| **Category** | Data formats/interop |
| **Drafts in category** | 145 |

While 145 drafts address data formats for AI interop, there's insufficient attention to tracking the provenance and lineage of data generated by agents. This creates trust and auditability issues in agent-to-agent data exchanges.

**Evidence:** 145 data format drafts with high overlap but no clear standards for agent-generated data provenance tracking

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-romanchuk-normative-admissibility](https://datatracker.ietf.org/doc/draft-romanchuk-normative-admissibility/) (score 3.4) — Normative Admissibility Framework for Agent Speech Acts
- [draft-li-semantic-routing-architecture](https://datatracker.ietf.org/doc/draft-li-semantic-routing-architecture/) (score 3.6) — Semantic Routing Architecture for AI Agents Communication
- [draft-cui-nmrg-llm-nm](https://datatracker.ietf.org/doc/draft-cui-nmrg-llm-nm/) (score 4.1) — A Framework for LLM Agent-Assisted Network Management with Human-in-the-Loop
- [draft-mpsb-agntcy-messaging](https://datatracker.ietf.org/doc/draft-mpsb-agntcy-messaging/) (score 2.6) — An Overview of Messaging Systems and Their Applicability to Agentic AI
- [draft-gaikwad-south-authorization](https://datatracker.ietf.org/doc/draft-gaikwad-south-authorization/) (score 3.7) — SOUTH: Stochastic Authorization for Agent and Service Requests
- [draft-abaris-aicdh](https://datatracker.ietf.org/doc/draft-abaris-aicdh/) (score 2.8) — AI Content Disclosure Header

**Top-rated in Data formats/interop** (145 drafts):

- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (4.6) — Defines YANG data model for hierarchical language model coordination across tiny, small, and large L
- [draft-ietf-lake-app-profiles](https://datatracker.ietf.org/doc/draft-ietf-lake-app-profiles/) (4.6) — Defines canonical CBOR representation for EDHOC application profiles and coordination mechanisms for
- [draft-chang-agent-token-efficient](https://datatracker.ietf.org/doc/draft-chang-agent-token-efficient/) (4.5) — Defines ADOL (Agentic Data Optimization Layer) to address token bloat in agent communication protoco
- [draft-birkholz-verifiable-agent-conversations](https://datatracker.ietf.org/doc/draft-birkholz-verifiable-agent-conversations/) (4.5) — Defines CDDL-based data format for verifiable agent conversation records using COSE signing. Support

### Partially Addressing Ideas

4 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
| Context-Enhanced Training Data | draft-improving-data-quality-tags | extension |
| Training Data Provenance Claims | draft-messous-eat-ai | mechanism |
| Sentinel Evidence Package | draft-reilly-sentinel-protocol | architecture |
| AI Lifecycle Provenance Tracking | draft-reilly-sentinel-protocol | architecture |

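In the spirit of the hash-chained traces summarized above for draft-cowles-volt, tamper-evident lineage for agent-generated data can be sketched in a few lines (the record fields and the SHA-256-over-sorted-JSON encoding are assumptions for illustration, not a format any draft defines):

```python
import hashlib, json

def append_record(chain: list, producer: str, payload: dict) -> list:
    """Append a provenance record linked to the previous record's hash."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    body = {"producer": producer, "payload": payload, "prev": prev}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append({**body, "hash": digest})
    return chain

def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited record or reordering breaks the chain."""
    prev = "0" * 64
    for rec in chain:
        body = {k: rec[k] for k in ("producer", "payload", "prev")}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if rec["prev"] != prev or digest != rec["hash"]:
            return False
        prev = rec["hash"]
    return True

chain = []
append_record(chain, "agent-a", {"dataset": "telemetry-v1"})
append_record(chain, "agent-b", {"derived_from": "telemetry-v1"})
print(verify_chain(chain))  # → True
```

A production format would sign each record (as the COSE-based conversation-record draft does) rather than rely on hashes alone, since hashes only prove linkage, not authorship.
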
---

## 6. Agent Capability Degradation Handling

| | |
|---|---|
| **Severity** | HIGH |
| **Category** | AI safety/alignment |
| **Drafts in category** | 44 |

No standardized approaches for detecting and handling when an agent's capabilities degrade due to model drift, data corruption, or hardware issues. Systems need graceful degradation protocols rather than silent failures.

**Evidence:** Even the 44 safety/alignment drafts don't address capability degradation, while 213+ drafts assume stable agent performance

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-li-dmsc-inf-architecture](https://datatracker.ietf.org/doc/draft-li-dmsc-inf-architecture/) (score 3.1) — Dynamic Multi-agent Secured Collaboration Infrastructure Architecture

**Top-rated in AI safety/alignment** (44 drafts):

- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
- [draft-aylward-daap-v2](https://datatracker.ietf.org/doc/draft-aylward-daap-v2/) (4.8) — Defines comprehensive protocol for AI agent accountability including authentication, monitoring, and
- [draft-guy-bary-stamp-protocol](https://datatracker.ietf.org/doc/draft-guy-bary-stamp-protocol/) (4.6) — Defines STAMP protocol for cryptographic delegation and proof in AI agent systems. Provides task-bou
- [draft-drake-email-tpm-attestation](https://datatracker.ietf.org/doc/draft-drake-email-tpm-attestation/) (4.6) — Defines hardware attestation for email using TPM verification chains to prevent spam and provide Syb
- [draft-goswami-agentic-jwt](https://datatracker.ietf.org/doc/draft-goswami-agentic-jwt/) (4.5) — Extends OAuth 2.0 with Agentic JWT to address authorization challenges in autonomous AI systems. Int

### Partially Addressing Ideas

45 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
| Semantic Routing | draft-agent-gw | mechanism |
| Semantic Routing | draft-ainp-protocol | mechanism |
| Capability-based Discovery | draft-ainp-protocol | pattern |
| Complex Delegation Relationship Management | draft-chen-ai-agent-auth-new-requirements | architecture |
| Capability-Based Discovery Mechanism | draft-cui-ai-agent-discovery-invocation | mechanism |
| Agent Capability Negotiation Protocol | draft-cui-dmsc-agent-cdi | protocol |
| Agent Capability-Based Routing | draft-du-catalist-routing-considerations | mechanism |
| Agent Monitoring and Tracking | draft-fu-nmop-agent-communication-framework | mechanism |

*...and 37 more*

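Detecting degradation, as distinct from handling it, can start as simply as a rolling success-rate check. A hedged sketch (the window size, threshold, and state names are arbitrary choices for this example; a real monitor would track latency, confidence, and drift statistics too):

```python
from collections import deque

class DegradationMonitor:
    """Flag degradation when the rolling success rate falls below a floor."""
    def __init__(self, window: int = 50, floor: float = 0.8):
        self.results = deque(maxlen=window)
        self.floor = floor

    def record(self, success: bool) -> str:
        self.results.append(success)
        window_full = len(self.results) == self.results.maxlen
        rate = sum(self.results) / len(self.results)
        if window_full and rate < self.floor:
            return "DEGRADED"   # e.g. shed load or hand off to a healthy peer
        return "OK"

mon = DegradationMonitor(window=10, floor=0.8)
states = [mon.record(ok) for ok in [True] * 8 + [False] * 4]
print(states[-1])  # window holds 6 successes / 4 failures → DEGRADED
```

The standardization gap is less the detector than what "DEGRADED" obliges an agent to tell its peers: nothing in the drafts above defines that signal.
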
---

## 7. Multi-Agent Coordination Deadlocks

| | |
|---|---|
| **Severity** | HIGH |
| **Category** | A2A protocols |
-| **Drafts in category** | 120 |
+| **Drafts in category** | 150 |

-With 120+ A2A protocol drafts, there's insufficient attention to preventing deadlock situations where multiple agents create circular dependencies or resource conflicts. Missing are standardized deadlock detection and resolution mechanisms.
+While agent discovery and A2A protocols exist, there's no framework for handling consensus when some agents may be compromised or malicious. Critical for autonomous systems making collective decisions.

-**Evidence:** 120 A2A protocol drafts with high internal overlap but no systematic deadlock prevention frameworks
+**Evidence:** Complex autonomous systems require Byzantine fault tolerance but it's absent from protocol designs

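For the deadlock side of this gap, the textbook detection mechanism is a cycle search over a wait-for graph. A minimal single-node sketch (agent names invented; a real A2A protocol would additionally need a distributed snapshot to build this graph consistently):

```python
def find_cycle(wait_for):
    """Return a deadlock cycle in an agent wait-for graph, or None.

    `wait_for` maps each agent to the agents it is blocked on.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {a: WHITE for a in wait_for}
    stack = []

    def dfs(node):
        color[node] = GRAY
        stack.append(node)
        for nxt in wait_for.get(node, []):
            if color.get(nxt, WHITE) == GRAY:       # back edge → cycle found
                return stack[stack.index(nxt):] + [nxt]
            if color.get(nxt, WHITE) == WHITE:
                found = dfs(nxt)
                if found:
                    return found
        color[node] = BLACK
        stack.pop()
        return None

    for agent in list(wait_for):
        if color[agent] == WHITE:
            found = dfs(agent)
            if found:
                return found
    return None

# agent-a waits on b, b on c, c back on a → circular dependency
graph = {"agent-a": ["agent-b"], "agent-b": ["agent-c"], "agent-c": ["agent-a"]}
print(find_cycle(graph))  # → ['agent-a', 'agent-b', 'agent-c', 'agent-a']
```

Resolution (which agent aborts, and with what compensation) is the part the gap analysis says no draft standardizes.
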
### Related Drafts

@@ -329,196 +88,88 @@ With 120+ A2A protocol drafts, there's insufficient attention to preventing dead

**Keyword matches** (drafts mentioning gap topic):

- [draft-yue-anima-agent-recovery-networks](https://datatracker.ietf.org/doc/draft-yue-anima-agent-recovery-networks/) (score 4.1) — Task-Oriented Multi-Agent Recovery Framework for High-Reliability in Converged M
- [draft-chang-agent-context-interaction](https://datatracker.ietf.org/doc/draft-chang-agent-context-interaction/) (score 2.9) — Agent Context Interaction Optimizations
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
-- [draft-cui-ai-agent-task](https://datatracker.ietf.org/doc/draft-cui-ai-agent-task/) (score 3.0) — Task-oriented Coordination Requirements for AI Agent Protocols
+- [draft-ramakrishna-satp-views-addresses](https://datatracker.ietf.org/doc/draft-ramakrishna-satp-views-addresses/) (score 3.4) — Views and View Addresses for Secure Asset Transfer

-**Top-rated in A2A protocols** (120 drafts):
+**Top-rated in A2A protocols** (150 drafts):

- [draft-guy-bary-stamp-protocol](https://datatracker.ietf.org/doc/draft-guy-bary-stamp-protocol/) (4.6) — Defines STAMP protocol for cryptographic delegation and proof in AI agent systems. Provides task-bou
- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (4.6) — Defines YANG data model for hierarchical language model coordination across tiny, small, and large L
+- [draft-ietf-lake-edhoc](https://datatracker.ietf.org/doc/draft-ietf-lake-edhoc/) (4.6) — Specifies EDHOC, a compact authenticated Diffie-Hellman key exchange protocol for constrained enviro
- [draft-chang-agent-token-efficient](https://datatracker.ietf.org/doc/draft-chang-agent-token-efficient/) (4.5) — Defines ADOL (Agentic Data Optimization Layer) to address token bloat in agent communication protoco
- [draft-chen-oauth-rar-agent-extensions](https://datatracker.ietf.org/doc/draft-chen-oauth-rar-agent-extensions/) (4.2) — Extends OAuth RAR with policy_context and lifecycle_binding members for AI agent environments. Enabl
-- [draft-mallick-muacp](https://datatracker.ietf.org/doc/draft-mallick-muacp/) (4.2) — Resource-efficient messaging protocol specifically designed for constrained IoT/Edge devices with de

### Partially Addressing Ideas

-11 extracted ideas touch on this gap:
+2 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
-| Multi-Agent Task Coordination | draft-du-ai-agent-communication-6g-aspect | mechanism |
+| ASRank Structural Vulnerability Analysis | draft-xu-sidrops-asrank-vulnerabilities | requirement |
-| AI Gateway | draft-fu-nmop-agent-communication-framework | architecture |
+| MCP and A2A Complementary Solutions for Network Management | draft-zeng-opsawg-applicability-mcp-a2a | architecture |
-| DMSC Infrastructure Architecture | draft-li-dmsc-inf-architecture | architecture |
-| Multi-agent Collaboration Protocol Suite | draft-li-dmsc-macp | protocol |
-| Task-based Multi-Agent Coordination | draft-li-dmsc-mcps-agw | pattern |
-| Cognitive Networking Substrate | draft-li-semantic-routing-architecture | architecture |
-| Agent Communication Use Cases | draft-stephan-ai-agent-6g | pattern |
-| Structured Responsibility and Traceability Architecture (SRTA) | draft-takagi-srta-trinity | architecture |

-*...and 3 more*

---

-## 8. Agent Privacy Preservation
+## 3. Emergency Agent Shutdown Coordination

| | |
|---|---|
-| **Severity** | HIGH |
+| **Severity** | CRITICAL |
-| **Category** | Agent identity/auth |
+| **Category** | AI safety/alignment |
-| **Drafts in category** | 108 |
+| **Drafts in category** | 46 |

-Agents often process sensitive data but current drafts don't adequately address privacy-preserving computation, differential privacy, or secure multi-party computation for agent interactions. This is critical for deployment in regulated industries.
+Missing protocols for coordinated emergency shutdown of autonomous agent networks when safety issues are detected. Individual agent controls exist but not network-wide coordination mechanisms.

-**Evidence:** 108 identity/auth drafts focus on authentication but lack privacy preservation mechanisms for agent data processing
+**Evidence:** Human-in-the-loop drafts exist but no emergency coordination protocols for autonomous systems

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

-- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
+- [draft-aylward-daap-v2](https://datatracker.ietf.org/doc/draft-aylward-daap-v2/) (score 4.8) — Distributed AI Accountability Protocol (DAAP) Version 2.0
+- [draft-khatri-sipcore-call-transfer-fail-response](https://datatracker.ietf.org/doc/draft-khatri-sipcore-call-transfer-fail-response/) (score 3.3) — A SIP Response Code (497) for Call Transfer Failure
+- [draft-cui-dmsc-agent-cdi](https://datatracker.ietf.org/doc/draft-cui-dmsc-agent-cdi/) (score 3.0) — Cross-Domain Interoperability Framework for AI Agent Collaboration
+- [draft-yu-ai-agent-use-cases-in-6g](https://datatracker.ietf.org/doc/draft-yu-ai-agent-use-cases-in-6g/) (score 2.5) — AI Agent Use Cases and Requirements in 6G Network
+- [draft-zhang-rvp-problem-statement](https://datatracker.ietf.org/doc/draft-zhang-rvp-problem-statement/) (score 3.5) — Problem Statements and Requirements of Real-Virtual Agent Protocol (RVP): Commun
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
-- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
-- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
-- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
-- [draft-kale-agntcy-federated-privacy](https://datatracker.ietf.org/doc/draft-kale-agntcy-federated-privacy/) (score 3.2) — Privacy-Preserving Federated Learning Architecture for Multi-Tenant AI Agent Sys

-**Top-rated in Agent identity/auth** (108 drafts):
+**Top-rated in AI safety/alignment** (46 drafts):

- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
- [draft-aylward-daap-v2](https://datatracker.ietf.org/doc/draft-aylward-daap-v2/) (4.8) — Defines comprehensive protocol for AI agent accountability including authentication, monitoring, and
- [draft-guy-bary-stamp-protocol](https://datatracker.ietf.org/doc/draft-guy-bary-stamp-protocol/) (4.6) — Defines STAMP protocol for cryptographic delegation and proof in AI agent systems. Provides task-bou
- [draft-drake-email-tpm-attestation](https://datatracker.ietf.org/doc/draft-drake-email-tpm-attestation/) (4.6) — Defines hardware attestation for email using TPM verification chains to prevent spam and provide Syb
-- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (4.6) — Defines YANG data model for hierarchical language model coordination across tiny, small, and large L
+- [draft-goswami-agentic-jwt](https://datatracker.ietf.org/doc/draft-goswami-agentic-jwt/) (4.5) — Extends OAuth 2.0 with Agentic JWT to address authorization challenges in autonomous AI systems. Int

### Partially Addressing Ideas

-11 extracted ideas touch on this gap:
+9 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
-| Agent Card Structure | draft-nandakumar-agent-sd-jwt | protocol |
+| Distributed AI Accountability Protocol | draft-aylward-daap-v2 | protocol |
-| Pseudonymous Key Generation | draft-bradleylundberg-cfrg-arkg | mechanism |
+| Agentic network architecture for multi-agent coordination | draft-chuyi-nmrg-agentic-network-inference | architecture |
-| Privacy-Preserving Human Tokens | draft-dhir-http-agent-profile | mechanism |
+| Dynamic Task Coordination Requirements for AI Agents | draft-cui-ai-agent-task | requirement |
-| Cryptographic Erasure Compliance | draft-gaikwad-aps-profile | mechanism |
+| Multi-Agent Communication Framework for AIOps | draft-fu-nmop-agent-communication-framework | architecture |
-| Privacy-Respecting Capability Attestation | draft-huang-rats-agentic-eat-cap-attest | pattern |
+| Meta-Layer Coordination Substrate | draft-meta-layer-overview | architecture |
-| Differential Privacy for Agent Models | draft-kale-agntcy-federated-privacy | mechanism |
+| Trinity Configuration for Agent Coordination | draft-takagi-srta-trinity | pattern |
-| Agent Identity Preservation | draft-liu-oauth-a2a-profile | pattern |
+| Internet of Agents Task Protocol for heterogeneous collaboration | draft-yang-dmsc-ioa-task-protocol | protocol |
-| Inference-Time Data Access Policy Claims | draft-messous-eat-ai | mechanism |
+| Task-Oriented Multi-Agent Recovery Framework | draft-yue-anima-agent-recovery-networks | architecture |

-*...and 3 more*
+*...and 1 more*

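For the emergency-shutdown gap introduced on the added side of this diff, the coordination problem reduces to fan-out, acknowledgement tracking, and escalation for stragglers. An illustrative sketch (message shape, agent names, and the escalation hook are assumptions; the quarantine escalation echoes the network-level quarantine idea from draft-aylward-aiga-1 listed under gap 4):

```python
class ShutdownCoordinator:
    """Fan out an emergency STOP and track acknowledgements per agent."""
    def __init__(self, agents):
        self.pending = set(agents)

    def broadcast_stop(self, send):
        # `send` is the transport callback: send(agent_id, message)
        for agent in list(self.pending):
            send(agent, {"type": "EMERGENCY_STOP", "scope": "network"})

    def ack(self, agent_id):
        self.pending.discard(agent_id)

    def unresponsive(self):
        # Agents still pending after a timeout get escalated, e.g. to
        # network-level quarantine rather than cooperative shutdown.
        return sorted(self.pending)

sent = []
coord = ShutdownCoordinator(["agent-a", "agent-b", "agent-c"])
coord.broadcast_stop(lambda agent, msg: sent.append(agent))
coord.ack("agent-a")
coord.ack("agent-b")
print(coord.unresponsive())  # → ['agent-c']
```

The unresolved protocol questions are exactly the ones the gap names: who is authorized to broadcast, and what a non-acknowledging agent's peers must do.
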
---

-## 9. Agent Firmware/Model Update Security
+## 4. Cross-Protocol Agent Migration

| | |
|---|---|
| **Severity** | HIGH |
| **Category** | Model serving/inference |
| **Drafts in category** | 42 |

While model serving is addressed in 42 drafts, there's insufficient focus on secure update mechanisms for agent models and firmware. Missing are standards for cryptographically verified, rollback-capable agent updates.

**Evidence:** 42 model serving drafts but no comprehensive security standards for agent software/model updates

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-ietf-tls-extended-key-update](https://datatracker.ietf.org/doc/draft-ietf-tls-extended-key-update/) (score 4.2) — Extended Key Update for Transport Layer Security (TLS) 1.3

**Top-rated in Model serving/inference** (42 drafts):

- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (4.6) — Defines YANG data model for hierarchical language model coordination across tiny, small, and large L
- [draft-chang-agent-token-efficient](https://datatracker.ietf.org/doc/draft-chang-agent-token-efficient/) (4.5) — Defines ADOL (Agentic Data Optimization Layer) to address token bloat in agent communication protoco
- [draft-calabria-bmwg-ai-fabric-inference-bench](https://datatracker.ietf.org/doc/draft-calabria-bmwg-ai-fabric-inference-bench/) (4.5) — Defines benchmarking methodology for AI inference network fabrics. Establishes KPIs and test procedu
- [draft-wang-cats-odsi](https://datatracker.ietf.org/doc/draft-wang-cats-odsi/) (4.5) — Specifies framework for decentralized LLM inference across untrusted participants with layer-aware e
- [draft-wmz-nmrg-agent-ndt-arch](https://datatracker.ietf.org/doc/draft-wmz-nmrg-agent-ndt-arch/) (4.2) — Comprehensive architecture combining Network Digital Twin with Agentic AI for intent-based network o

### Partially Addressing Ideas

79 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
| Multi-layered Security Architecture | draft-aylward-daap-v2 | architecture |
| VERA Zero Trust Reference Architecture | draft-berlinai-vera | architecture |
| Evidence-Based Maturity Runtime | draft-berlinai-vera | mechanism |
| Five Enforcement Pillars with Typed Schemas | draft-berlinai-vera | pattern |
| AI Agent Structured Threat Model | draft-berlinai-vera | requirement |
| Cryptographic Proof-Based Autonomy | draft-berlinai-vera | mechanism |
| Pseudonymous Key Generation | draft-bradleylundberg-cfrg-arkg | mechanism |
| Multi-Agent Security Protection | draft-fu-nmop-agent-communication-framework | mechanism |

*...and 71 more*

---
|
|
||||||
|
|
||||||
## 10. Real-time Agent Debugging

| | |
|---|---|
| **Severity** | MEDIUM |
| **Category** | Other AI/agent |
| **Drafts in category** | 26 |

Missing standardized protocols for debugging autonomous agents in production environments. When agents make unexpected decisions, there are no standard interfaces for real-time introspection without disrupting operations.

**Evidence:** 26 other AI/agent drafts suggest various approaches but no standardized debugging protocols for production agents

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-an-nmrg-i2icf-cits](https://datatracker.ietf.org/doc/draft-an-nmrg-i2icf-cits/) (score 3.7) — Interface to In-Network Computing Functions for Cooperative Intelligent Transpor
- [draft-zhao-detnet-enhanced-use-cases](https://datatracker.ietf.org/doc/draft-zhao-detnet-enhanced-use-cases/) (score 3.2) — Enhanced Use Cases for Scaling Deterministic Networks
- [draft-zhang-rvp-problem-statement](https://datatracker.ietf.org/doc/draft-zhang-rvp-problem-statement/) (score 3.5) — Problem Statements and Requirements of Real-Virtual Agent Protocol (RVP): Commun
- [draft-yuan-rtgwg-traffic-agent-usecase](https://datatracker.ietf.org/doc/draft-yuan-rtgwg-traffic-agent-usecase/) (score 3.7) — Use cases of the AI Network Traffic Optimization Agent
- [draft-hong-nmrg-agenticai-ps](https://datatracker.ietf.org/doc/draft-hong-nmrg-agenticai-ps/) (score 3.0) — Motivations and Problem Statement of Agentic AI for network management
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite

**Top-rated in Other AI/agent** (26 drafts):

- [draft-calabria-bmwg-ai-fabric-inference-bench](https://datatracker.ietf.org/doc/draft-calabria-bmwg-ai-fabric-inference-bench/) (4.5) — Defines benchmarking methodology for AI inference network fabrics. Establishes KPIs and test procedu
- [draft-ietf-tls-ecdhe-mlkem](https://datatracker.ietf.org/doc/draft-ietf-tls-ecdhe-mlkem/) (4.4) — Defines hybrid post-quantum key agreement mechanisms for TLS 1.3 that combine ML-KEM with traditiona
- [draft-wmz-nmrg-agent-ndt-arch](https://datatracker.ietf.org/doc/draft-wmz-nmrg-agent-ndt-arch/) (4.2) — Comprehensive architecture combining Network Digital Twin with Agentic AI for intent-based network o
- [draft-an-nmrg-i2icf-cits](https://datatracker.ietf.org/doc/draft-an-nmrg-i2icf-cits/) (3.7) — Defines framework for orchestrating In-Network Computing Functions in Cooperative Intelligent Transp
- [draft-cui-nmrg-auto-test](https://datatracker.ietf.org/doc/draft-cui-nmrg-auto-test/) (3.6) — Framework for AI-assisted network protocol testing using LLMs and automated test generation. Defines

### Partially Addressing Ideas

23 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
| A2A Protocol Transport over MOQT | draft-a2a-moqt-transport | protocol |
| QUIC-based Publish/Subscribe for AI Agents | draft-a2a-moqt-transport | mechanism |
| Streaming Capabilities Integration | draft-a2a-moqt-transport | mechanism |
| Action-Based Authorization | draft-aylward-aiga-2 | mechanism |
| Multi-layered Security Architecture | draft-aylward-daap-v2 | architecture |
| Behavioral Monitoring Framework | draft-aylward-daap-v2 | mechanism |
| Context-Aware Task Scheduling | draft-cui-ai-agent-task | mechanism |
| Real-Time Task Adaptability | draft-cui-ai-agent-task | requirement |

*...and 15 more*
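A debugging protocol for this gap would need introspection that does not pause the agent. A minimal sketch of that idea, using a bounded decision trace with copy-on-read snapshots; the class and field names here are invented for illustration, not taken from any draft:

```python
import json
import threading
from collections import deque

class IntrospectableAgent:
    """Toy agent exposing read-only snapshots of its recent decisions,
    so an operator can inspect it without stopping it."""

    def __init__(self, maxlen=100):
        self._lock = threading.Lock()
        self._trace = deque(maxlen=maxlen)  # bounded: no unbounded growth

    def decide(self, observation, action, confidence):
        with self._lock:
            self._trace.append({"observation": observation,
                                "action": action,
                                "confidence": confidence})
        return action

    def snapshot(self):
        """Copy-on-read: the debugger gets JSON, the agent keeps running."""
        with self._lock:
            return json.dumps(list(self._trace))

agent = IntrospectableAgent(maxlen=2)
agent.decide("cpu=90%", "scale_up", 0.8)
agent.decide("cpu=40%", "noop", 0.95)
agent.decide("cpu=95%", "scale_up", 0.7)  # evicts the oldest entry
snap = json.loads(agent.snapshot())
```

The lock plus serialized copy is what makes the introspection non-disruptive: the debugger never holds a reference into the agent's live state.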
---
## 11. Cross-Protocol Agent Migration

| | |
|---|---|
| **Severity** | MEDIUM |
| **Category** | A2A protocols |
| **Drafts in category** | 150 |

While A2A protocols exist, there's no standardized mechanism for agents to migrate between different protocol frameworks or service providers while maintaining state and identity. This creates vendor lock-in and limits agent portability across heterogeneous systems.

**Evidence:** 150 A2A protocol drafts with high overlap suggest fragmentation without migration solutions

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-narajala-ans](https://datatracker.ietf.org/doc/draft-narajala-ans/) (score 4.2) — Agent Name Service (ANS): A Universal Directory for Secure AI Agent Discovery an
- [draft-ietf-emu-eap-edhoc](https://datatracker.ietf.org/doc/draft-ietf-emu-eap-edhoc/) (score 3.2) — Using the Extensible Authentication Protocol (EAP) with Ephemeral Diffie-Hellman
- [draft-howe-sipcore-mcp-extension](https://datatracker.ietf.org/doc/draft-howe-sipcore-mcp-extension/) (score 3.7) — SIP Extension for Model Context Protocol (MCP)
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment

**Top-rated in A2A protocols** (150 drafts):

- [draft-guy-bary-stamp-protocol](https://datatracker.ietf.org/doc/draft-guy-bary-stamp-protocol/) (4.6) — Defines STAMP protocol for cryptographic delegation and proof in AI agent systems. Provides task-bou
- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (4.6) — Defines YANG data model for hierarchical language model coordination across tiny, small, and large L
- [draft-ietf-lake-edhoc](https://datatracker.ietf.org/doc/draft-ietf-lake-edhoc/) (4.6) — Specifies EDHOC, a compact authenticated Diffie-Hellman key exchange protocol for constrained enviro
- [draft-chang-agent-token-efficient](https://datatracker.ietf.org/doc/draft-chang-agent-token-efficient/) (4.5) — Defines ADOL (Agentic Data Optimization Layer) to address token bloat in agent communication protoco
- [draft-chen-oauth-rar-agent-extensions](https://datatracker.ietf.org/doc/draft-chen-oauth-rar-agent-extensions/) (4.2) — Extends OAuth RAR with policy_context and lifecycle_binding members for AI agent environments. Enabl
- [draft-mallick-muacp](https://datatracker.ietf.org/doc/draft-mallick-muacp/) (4.2) — Resource-efficient messaging protocol specifically designed for constrained IoT/Edge devices with de

### Partially Addressing Ideas

No directly related technical ideas found in current drafts — this gap is entirely unaddressed.
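One shape a migration standard could take is a protocol-neutral, integrity-checked state envelope that a destination runtime verifies before importing. This is a sketch under that assumption; the envelope fields (`agent_id`, `source_protocol`, `digest`) are invented for illustration:

```python
import hashlib
import json

def export_state(agent_id, protocol, state):
    """Wrap agent state in a protocol-neutral envelope with an
    integrity digest, so a different runtime can verify the import."""
    body = json.dumps({"agent_id": agent_id,
                       "source_protocol": protocol,
                       "state": state}, sort_keys=True)
    return {"body": body,
            "digest": hashlib.sha256(body.encode()).hexdigest()}

def import_state(envelope):
    """Refuse to load state whose digest does not match."""
    body = envelope["body"]
    if hashlib.sha256(body.encode()).hexdigest() != envelope["digest"]:
        raise ValueError("state corrupted in transit")
    return json.loads(body)

env = export_state("agent-42", "a2a", {"memory": ["task A done"], "step": 7})
restored = import_state(env)
```

Canonical JSON (`sort_keys=True`) keeps the digest stable across implementations, which is the part a real standard would have to pin down precisely.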
---
## 5. Agent Resource Accounting and Billing

| | |
|---|---|
| **Severity** | HIGH |
| **Category** | new |
| **Drafts in category** | 0 |

No standardized protocols exist for tracking and billing computational resources consumed by autonomous agents across distributed systems. This is essential for commercial deployment but completely unaddressed.

**Evidence:** High focus on protocols and deployment but zero drafts addressing economic models

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-jia-oauth-scope-aggregation](https://datatracker.ietf.org/doc/draft-jia-oauth-scope-aggregation/) (score 3.5) — OAuth 2.0 Scope Aggregation for Multi-Step AI Agent Workflows

### Partially Addressing Ideas

8 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
| SCIM 2.0 Extension for Agents and Agentic Applications | draft-abbey-scim-agent-extension | extension |
| Events Query Protocol | draft-gupta-httpapi-events-query | protocol |
| Micro Agent Communication Protocol (µACP) | draft-mallick-muacp | protocol |
| MOQT Binding for A2A and MCP Protocols | draft-nandakumar-ai-agent-moq-transport | extension |
| SCIM 2.0 Agent Extension | draft-scim-agent-extension | extension |
| Authorized Connection Policy Framework | draft-steckbeck-ua-conn-sec | mechanism |
| Agent Workflow Protocol Well-Known Resource | draft-vinaysingh-awp-wellknown | extension |
| AI Network Traffic Optimization Agent | draft-yuan-rtgwg-traffic-agent-usecase | architecture |
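At minimum, a resource-accounting protocol would need metered usage records per agent and a settlement rule. A toy sketch of that split, where the resource names and per-unit rates are made-up examples rather than anything from a draft:

```python
from collections import defaultdict

class UsageMeter:
    """Hypothetical per-agent resource meter: records metered units
    (tokens, CPU-seconds) and settles a bill per agent."""

    RATES = {"tokens": 0.002, "cpu_seconds": 0.05}  # illustrative prices

    def __init__(self):
        self._usage = defaultdict(lambda: defaultdict(float))

    def record(self, agent_id, resource, amount):
        if resource not in self.RATES:
            raise ValueError(f"unmetered resource: {resource}")
        self._usage[agent_id][resource] += amount

    def bill(self, agent_id):
        """Sum of rate * amount across every metered resource."""
        return round(sum(self.RATES[r] * amt
                         for r, amt in self._usage[agent_id].items()), 6)

meter = UsageMeter()
meter.record("agent-1", "tokens", 1000)
meter.record("agent-1", "cpu_seconds", 10)
```

The interoperability problem the gap describes is exactly the part this sketch glosses over: two providers would need to agree on the record schema and the set of meterable resources before bills are comparable.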
---
## 6. Agent Capability Advertisement Verification

| | |
|---|---|
| **Severity** | HIGH |
| **Category** | Agent discovery/reg |
| **Drafts in category** | 87 |

While agent discovery protocols exist, there's no way to cryptographically verify that advertised agent capabilities are accurate. Agents could falsely claim capabilities leading to system failures.

**Evidence:** 87 discovery drafts but no mention of capability verification mechanisms

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-li-dmsc-inf-architecture](https://datatracker.ietf.org/doc/draft-li-dmsc-inf-architecture/) (score 3.1) — Dynamic Multi-agent Secured Collaboration Infrastructure Architecture

**Top-rated in Agent discovery/reg** (87 drafts):

- [draft-narajala-ans](https://datatracker.ietf.org/doc/draft-narajala-ans/) (4.2) — Introduces Agent Name Service (ANS) as a DNS-based universal directory for AI agent discovery and ve
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (4.2) — Specifies a comprehensive multi-agent collaboration protocol suite using Agent Gateways for registra
- [draft-cui-dns-native-agent-naming-resolution](https://datatracker.ietf.org/doc/draft-cui-dns-native-agent-naming-resolution/) (4.1) — Specifies DNS-native naming and resolution for AI agents using FQDNs and SVCB records. Emphasizes DN
- [draft-nederveld-adl](https://datatracker.ietf.org/doc/draft-nederveld-adl/) (4.1) — Defines ADL, a JSON-based standard for describing AI agents including their capabilities, tools, per
- [draft-rosenberg-ai-protocols](https://datatracker.ietf.org/doc/draft-rosenberg-ai-protocols/) (4.1) — Establishes framework for AI agent communications on the Internet, surveying existing protocols like

### Partially Addressing Ideas

25 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
| DNS-based AI Agent Discovery | draft-mozleywilliams-dnsop-bandaid | mechanism |
| DNS namespace for AI agent discovery | draft-mozleywilliams-dnsop-dnsaid | mechanism |
| Agent Registration and Discovery Protocol | draft-pioli-agent-discovery | protocol |
| Intent-based Agent Interconnection Protocol | draft-sun-zhang-iaip | protocol |
| Capability Advertisement and Intent Resolution | draft-sz-dmsc-iaip | mechanism |
| Intelligent Agent Communication Gateway Architecture | draft-agent-gw | architecture |
| AI-Native Network Protocol (AINP) | draft-ainp-protocol | protocol |
| Distributed AI Accountability Protocol | draft-aylward-daap-v2 | protocol |

*...and 17 more*
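The core of capability verification is binding the exact advertised capability list to an issuer's signature, so an inflated claim fails the check. A minimal sketch using a shared-secret MAC as a stand-in for real issuer signatures (the key, registry, and field names are all assumptions for illustration):

```python
import hashlib
import hmac
import json

REGISTRY_KEY = b"demo-shared-secret"  # stand-in for a real issuer key

def sign_capabilities(agent_id, capabilities, key=REGISTRY_KEY):
    """Issuer signs the exact capability list; relying parties can
    detect any inflated claim because the MAC no longer matches."""
    manifest = json.dumps({"agent_id": agent_id,
                           "capabilities": sorted(capabilities)},
                          sort_keys=True)
    mac = hmac.new(key, manifest.encode(), hashlib.sha256).hexdigest()
    return {"manifest": manifest, "mac": mac}

def verify_capabilities(advert, key=REGISTRY_KEY):
    expected = hmac.new(key, advert["manifest"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, advert["mac"])

advert = sign_capabilities("agent-7", ["translate", "summarize"])
ok = verify_capabilities(advert)
# an agent later inflates its claims:
advert["manifest"] = advert["manifest"].replace("summarize", "execute_code")
forged = verify_capabilities(advert)
```

A real standard would use asymmetric signatures so verifiers need no secret, but the manifest-canonicalization problem is the same either way.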
---
## 7. Cross-Domain Agent Communication Security

| | |
|---|---|
| **Severity** | HIGH |
| **Category** | Agent identity/auth |
| **Drafts in category** | 145 |

Current identity/auth solutions don't address secure communication between agents operating in different security domains or trust boundaries. Critical for enterprise and government deployments.

**Evidence:** 145 identity drafts show awareness but cross-domain scenarios appear unaddressed

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-diaconu-agents-authz-info-sharing](https://datatracker.ietf.org/doc/draft-diaconu-agents-authz-info-sharing/) (score 3.2) — Cross-Domain AuthZ Information sharing for Agents
- [draft-cui-dmsc-agent-cdi](https://datatracker.ietf.org/doc/draft-cui-dmsc-agent-cdi/) (score 3.0) — Cross-Domain Interoperability Framework for AI Agent Collaboration
- [draft-han-rtgwg-agent-gateway-intercomm-framework](https://datatracker.ietf.org/doc/draft-han-rtgwg-agent-gateway-intercomm-framework/) (score 3.6) — Agent Gateway Intercommunication Framework
- [draft-ni-a2a-ai-agent-security-requirements](https://datatracker.ietf.org/doc/draft-ni-a2a-ai-agent-security-requirements/) (score 3.7) — Security Requirements for AI Agents
- [draft-intellinode-ai-semantic-contract](https://datatracker.ietf.org/doc/draft-intellinode-ai-semantic-contract/) (score 3.2) — Semantic-Driven Traffic Shaping Contract for AI Networks
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment

**Top-rated in Agent identity/auth** (145 drafts):

- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
- [draft-aylward-daap-v2](https://datatracker.ietf.org/doc/draft-aylward-daap-v2/) (4.8) — Defines comprehensive protocol for AI agent accountability including authentication, monitoring, and
- [draft-guy-bary-stamp-protocol](https://datatracker.ietf.org/doc/draft-guy-bary-stamp-protocol/) (4.6) — Defines STAMP protocol for cryptographic delegation and proof in AI agent systems. Provides task-bou
- [draft-drake-email-tpm-attestation](https://datatracker.ietf.org/doc/draft-drake-email-tpm-attestation/) (4.6) — Defines hardware attestation for email using TPM verification chains to prevent spam and provide Syb
- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (4.6) — Defines YANG data model for hierarchical language model coordination across tiny, small, and large L

### Partially Addressing Ideas

46 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
| Centralized Gateway for Multi-Agent Communication | draft-song-dmsc-problem-statement | architecture |
| Multi-Tenant Policy Enforcement Infrastructure | draft-song-dmsc-problem-statement | architecture |
| Intelligent Agent Communication Gateway Architecture | draft-agent-gw | architecture |
| AI-Native Network Protocol (AINP) | draft-ainp-protocol | protocol |
| Agent-to-Agent Communication in Transportation Networks | draft-an-nmrg-i2icf-cits | pattern |
| Zero Trust Runtime Agent Architecture | draft-berlinai-vera | architecture |
| Agentic Data Optimization Layer (ADOL) | draft-chang-agent-token-efficient | protocol |
| Agentic network architecture for multi-agent coordination | draft-chuyi-nmrg-agentic-network-inference | architecture |

*...and 38 more*
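Several of the ideas above converge on a boundary gateway that re-establishes trust when a message crosses domains. A toy sketch of that pattern, where per-domain keys, the `seal` envelope, and the gateway behavior are all invented for illustration:

```python
import hashlib
import hmac
import json

# per-domain trust anchors (illustrative; a real system would use PKI)
DOMAIN_KEYS = {"corp": b"key-corp", "gov": b"key-gov"}

def seal(domain, payload):
    """Bind a payload to the domain it was authenticated in."""
    body = json.dumps({"from_domain": domain, "payload": payload},
                      sort_keys=True)
    tag = hmac.new(DOMAIN_KEYS[domain], body.encode(),
                   hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}

def gateway_forward(msg, src, dst):
    """Boundary gateway: verifies the source domain's seal, then
    re-seals the payload under the destination domain's key."""
    expect = hmac.new(DOMAIN_KEYS[src], msg["body"].encode(),
                      hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expect, msg["tag"]):
        raise PermissionError("seal invalid: not trusted across boundary")
    return seal(dst, json.loads(msg["body"])["payload"])

msg = seal("corp", {"task": "audit logs"})
forwarded = gateway_forward(msg, "corp", "gov")
```

The design choice being illustrated is that neither domain's key ever leaves its own boundary; only the gateway holds both, which is why the drafts above keep proposing gateways as the enforcement point.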
---
## 8. Agent Performance Degradation Detection

| | |
|---|---|
| **Severity** | HIGH |
| **Category** | new |
| **Drafts in category** | 0 |

No standardized protocols exist for detecting when AI agents are experiencing model drift, adversarial attacks, or performance degradation. Essential for maintaining autonomous system reliability.

**Evidence:** ML traffic management exists but not agent health monitoring standards

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-xiong-rtgwg-use-cases-hp-wan](https://datatracker.ietf.org/doc/draft-xiong-rtgwg-use-cases-hp-wan/) (score 2.6) — Use Cases for High-performance Wide Area Network

### Partially Addressing Ideas

5 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
| Virtual In-Cloud Router as IPv6 Enhancement Agent | draft-he-yi-srv6ops-ipv6-enhancemnet-in-cloud-uc | architecture |
| 6G Agent Protocol Requirements and Enabling Technologies | draft-hw-ai-agent-6g | requirement |
| Comparative analysis of messaging protocols for agentic AI | draft-mpsb-agntcy-messaging | pattern |
| AI Network Security Agent | draft-yuan-rtgwg-security-agent-usecase | architecture |
| Task-Oriented Multi-Agent Recovery Framework | draft-yue-anima-agent-recovery-networks | architecture |
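One simple building block such a health-monitoring standard could define is a baseline-vs-recent comparison over a sliding window of task outcomes. A minimal sketch, assuming an invented `DriftMonitor` class with illustrative thresholds:

```python
from collections import deque

class DriftMonitor:
    """Flags degradation when the recent success rate drops more than
    `tolerance` below the agent's established baseline."""

    def __init__(self, baseline, window=50, tolerance=0.2):
        self.baseline = baseline
        self.tolerance = tolerance
        self.recent = deque(maxlen=window)  # sliding window of outcomes

    def observe(self, success):
        self.recent.append(1.0 if success else 0.0)

    def degraded(self):
        if not self.recent:
            return False
        rate = sum(self.recent) / len(self.recent)
        return rate < self.baseline - self.tolerance

mon = DriftMonitor(baseline=0.9, window=10, tolerance=0.2)
for _ in range(10):
    mon.observe(True)
healthy = mon.degraded()      # recent rate 1.0, above 0.9 - 0.2
for _ in range(10):
    mon.observe(False)        # window now holds only failures
drifted = mon.degraded()      # recent rate 0.0, below the threshold
```

A standard would mostly have to agree on what counts as an observation and how the signal is reported, not on the statistic itself, which can stay implementation-defined.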
---
## 9. Legal Liability Attribution Protocols

| | |
|---|---|
| **Severity** | HIGH |
| **Category** | Policy/governance |
| **Drafts in category** | 115 |

Missing technical protocols for creating audit trails that can determine legal liability when autonomous agents cause harm. Governance drafts exist but not technical accountability mechanisms.

**Evidence:** 115 governance drafts but legal technology gap for liability attribution

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-madhavan-aipref-displaybasedpref](https://datatracker.ietf.org/doc/draft-madhavan-aipref-displaybasedpref/) (score 2.5) — A Vocabulary for Controlling Usage of Content Collected by Search and AI Crawler
- [draft-farzdusa-aipref-enduser](https://datatracker.ietf.org/doc/draft-farzdusa-aipref-enduser/) (score 3.8) — AI Preferences Signaling: End User Impact
- [draft-kotecha-agentic-dispute-protocol](https://datatracker.ietf.org/doc/draft-kotecha-agentic-dispute-protocol/) (score 3.6) — Agentic Dispute Protocol
- [draft-cui-dmsc-agent-cdi](https://datatracker.ietf.org/doc/draft-cui-dmsc-agent-cdi/) (score 3.0) — Cross-Domain Interoperability Framework for AI Agent Collaboration
- [draft-ietf-aipref-vocab](https://datatracker.ietf.org/doc/draft-ietf-aipref-vocab/) (score 4.4) — A Vocabulary For Expressing AI Usage Preferences
- [draft-aylward-aiga-1](https://datatracker.ietf.org/doc/draft-aylward-aiga-1/) (score 4.2) — AI Governance and Accountability Protocol (AIGA)

**Top-rated in Policy/governance** (115 drafts):

- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
- [draft-aylward-daap-v2](https://datatracker.ietf.org/doc/draft-aylward-daap-v2/) (4.8) — Defines comprehensive protocol for AI agent accountability including authentication, monitoring, and
- [draft-goswami-agentic-jwt](https://datatracker.ietf.org/doc/draft-goswami-agentic-jwt/) (4.5) — Extends OAuth 2.0 with Agentic JWT to address authorization challenges in autonomous AI systems. Int
- [draft-wang-cats-odsi](https://datatracker.ietf.org/doc/draft-wang-cats-odsi/) (4.5) — Specifies framework for decentralized LLM inference across untrusted participants with layer-aware e
- [draft-birkholz-verifiable-agent-conversations](https://datatracker.ietf.org/doc/draft-birkholz-verifiable-agent-conversations/) (4.5) — Defines CDDL-based data format for verifiable agent conversation records using COSE signing. Support

### Partially Addressing Ideas

No directly related technical ideas found in current drafts — this gap is entirely unaddressed.
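The hash-chain technique mentioned in the draft-cowles-volt summary above is the natural primitive for liability-grade audit trails. A toy version (record fields invented for illustration) showing why retroactive edits are detectable:

```python
import hashlib
import json

def append_event(chain, actor, action):
    """Append a tamper-evident record: each entry commits to the hash
    of the previous one, so an edit anywhere breaks every later link."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    record = {"actor": actor, "action": action, "prev": prev}
    record["hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()).hexdigest()
    chain.append(record)
    return chain

def verify(chain):
    """Recompute every link; any mismatch means the trail was altered."""
    prev = "0" * 64
    for rec in chain:
        body = {k: v for k, v in rec.items() if k != "hash"}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if rec["prev"] != prev or rec["hash"] != expected:
            return False
        prev = rec["hash"]
    return True

chain = []
append_event(chain, "agent-3", "transfer $10")
append_event(chain, "agent-3", "send report")
ok = verify(chain)
chain[0]["action"] = "transfer $10000"   # retroactive tampering
tampered_ok = verify(chain)
```

For legal attribution the chain would additionally need signatures binding each record to a specific principal; the hash chain alone only proves ordering and integrity, not authorship.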
---
## 10. Agent Memory and State Persistence Standards

| | |
|---|---|
| **Severity** | MEDIUM |
| **Category** | Data formats/interop |
| **Drafts in category** | 165 |

No standardized formats or protocols exist for how agents should persist long-term memory, experience, and learned behaviors across system restarts or migrations. Each implementation creates proprietary solutions.

**Evidence:** 165 data format drafts focus on communication but not persistent state management

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-gaikwad-llm-benchmarking-terminology](https://datatracker.ietf.org/doc/draft-gaikwad-llm-benchmarking-terminology/) (score 2.7) — Benchmarking Terminology for Large Language Model Serving

**Top-rated in Data formats/interop** (165 drafts):

- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (4.6) — Defines YANG data model for hierarchical language model coordination across tiny, small, and large L
- [draft-ietf-lake-app-profiles](https://datatracker.ietf.org/doc/draft-ietf-lake-app-profiles/) (4.6) — Defines canonical CBOR representation for EDHOC application profiles and coordination mechanisms for
- [draft-chang-agent-token-efficient](https://datatracker.ietf.org/doc/draft-chang-agent-token-efficient/) (4.5) — Defines ADOL (Agentic Data Optimization Layer) to address token bloat in agent communication protoco
- [draft-birkholz-verifiable-agent-conversations](https://datatracker.ietf.org/doc/draft-birkholz-verifiable-agent-conversations/) (4.5) — Defines CDDL-based data format for verifiable agent conversation records using COSE signing. Support

### Partially Addressing Ideas

16 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
| Compliance-oriented agent memory model | draft-gaikwad-aps-profile | pattern |
| Zero Trust Interoperability Framework | draft-liu-saag-zt-problem-statement | requirement |
| Intelligent Agent Communication Gateway Architecture | draft-agent-gw | architecture |
| Zero Trust Runtime Agent Architecture | draft-berlinai-vera | architecture |
| Agentic Hypercall Protocol | draft-campbell-agentic-http | pattern |
| Agent Persistent State Profile | draft-gaikwad-aps-profile | architecture |
| Agentic AI for Autonomous Network Management | draft-hong-nmrg-agenticai-ps | requirement |
| LISP-based geospatial intelligence network | draft-ietf-lisp-nexagon | protocol |

*...and 8 more*
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 11. Agent-to-Human Escalation Standards

| | |
|---|---|
| **Severity** | MEDIUM |
| **Category** | Human-agent interaction |
| **Drafts in category** | 41 |

While human-in-the-loop protocols exist, there's no standardized framework for when and how agents should escalate decisions to humans based on uncertainty, risk, or ethical considerations.

**Evidence:** Only 41 human-agent interaction drafts versus complex autonomous systems requiring escalation

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (score 4.6) — Hierarchical Topology for Language Model Coordination
- [draft-ietf-websec-mime-sniff](https://datatracker.ietf.org/doc/draft-ietf-websec-mime-sniff/) (score 3.7) — Media Type Sniffing
- [draft-scrm-aiproto-usecases](https://datatracker.ietf.org/doc/draft-scrm-aiproto-usecases/) (score 4.1) — Agentic AI Use Cases
- [draft-zeng-opsawg-llm-netconf-gap](https://datatracker.ietf.org/doc/draft-zeng-opsawg-llm-netconf-gap/) (score 3.9) — Gap Analysis of Network Configuration Protocols in LLM-Driven Intent-Based Netwo
- [draft-jadoon-nmrg-agentic-ai-autonomous-networks](https://datatracker.ietf.org/doc/draft-jadoon-nmrg-agentic-ai-autonomous-networks/) (score 4.1) — Agentic AI Architectural Principles for Autonomous Computer Networks

**Top-rated in Human-agent interaction** (41 drafts):

- [draft-drake-email-tpm-attestation](https://datatracker.ietf.org/doc/draft-drake-email-tpm-attestation/) (4.6) — Defines hardware attestation for email using TPM verification chains to prevent spam and provide Syb
- [draft-ietf-aipref-vocab](https://datatracker.ietf.org/doc/draft-ietf-aipref-vocab/) (4.4) — Defines a standardized vocabulary for expressing preferences about how digital assets should be used
- [draft-dhir-http-agent-profile](https://datatracker.ietf.org/doc/draft-dhir-http-agent-profile/) (4.2) — Defines HTTP Agent Profile for authenticating agent traffic, separating human from agent traffic, an
- [draft-song-tsvwg-camp](https://datatracker.ietf.org/doc/draft-song-tsvwg-camp/) (4.2) — Proposes CAMP, a multipath transport protocol for interactive multimodal LLM systems that maintains
- [draft-liu-agent-operation-authorization](https://datatracker.ietf.org/doc/draft-liu-agent-operation-authorization/) (4.1) — Specifies framework for verifiable delegation of actions from humans to AI agents using JWT tokens.

### Partially Addressing Ideas

No directly related technical ideas found in current drafts — this gap is entirely unaddressed.

---
## 12. Federated Agent Learning Privacy

| | |
|---|---|
| **Severity** | MEDIUM |
| **Category** | new |
| **Drafts in category** | 0 |

Federated AI operations models exist but lack privacy-preserving protocols for agents learning from shared experiences without exposing sensitive data from individual deployments.

**Evidence:** Federated models mentioned but privacy-preserving learning protocols absent

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):

- [draft-kale-agntcy-federated-privacy](https://datatracker.ietf.org/doc/draft-kale-agntcy-federated-privacy/) (score 3.2) — Privacy-Preserving Federated Learning Architecture for Multi-Tenant AI Agent Sys
- [draft-cui-dmsc-agent-cdi](https://datatracker.ietf.org/doc/draft-cui-dmsc-agent-cdi/) (score 3.0) — Cross-Domain Interoperability Framework for AI Agent Collaboration
- [draft-ai-traffic](https://datatracker.ietf.org/doc/draft-ai-traffic/) (score 2.5) — Handling inter-DC/Edge AI-related network traffic: Problem statement
- [draft-aft-ai-traffic](https://datatracker.ietf.org/doc/draft-aft-ai-traffic/) (score 3.1) — Handling inter-DC/Edge AI-related network traffic: Problem statement
- [draft-aylward-aiga-1](https://datatracker.ietf.org/doc/draft-aylward-aiga-1/) (score 4.2) — AI Governance and Accountability Protocol (AIGA)
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment

### Partially Addressing Ideas

5 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
| Privacy-Preserving Federated Learning for Multi-Tenant AI Agents | draft-kale-agntcy-federated-privacy | architecture |
| Cross-Domain Agent Interoperability Framework | draft-cui-dmsc-agent-cdi | architecture |
| HTTP Agent Profile (HAP) | draft-dhir-http-agent-profile | protocol |
| AI Network Security Agent | draft-yuan-rtgwg-security-agent-usecase | architecture |
| AI Network Traffic Optimization Agent | draft-yuan-rtgwg-traffic-agent-usecase | architecture |
---

@@ -607,16 +539,14 @@ Missing standards for energy-aware agent deployment and operation. As AI workloa
 
 | Category | Drafts | Gaps | Gap Topics |
 |----------|-------:|-----:|------------|
-| a2a protocols | 120 | 2 | Multi-Agent Coordination Deadlocks; Cross-Protocol Agent Migration |
-| agent identity/auth | 108 | 1 | Agent Privacy Preservation |
-| ai safety/alignment | 44 | 2 | Agent Behavior Verification; Agent Capability Degradation Handling |
-| autonomous netops | 93 | 1 | Agent Resource Exhaustion Protection |
-| data formats/interop | 145 | 1 | Agent-Generated Data Provenance |
-| human-agent interaction | 30 | 1 | Human Override Protocols |
-| ml traffic mgmt | 73 | 1 | Agent Energy Consumption Optimization |
-| model serving/inference | 42 | 1 | Agent Firmware/Model Update Security |
-| other ai/agent | 26 | 1 | Real-time Agent Debugging |
-| policy/governance | 91 | 1 | Cross-Domain Agent Liability |
+| a2a protocols | 150 | 2 | Multi-Agent Consensus Under Byzantine Conditions; Cross-Protocol Agent Migration |
+| agent discovery/reg | 87 | 1 | Agent Capability Advertisement Verification |
+| agent identity/auth | 145 | 1 | Cross-Domain Agent Communication Security |
+| ai safety/alignment | 46 | 2 | Real-time Agent Behavior Verification; Emergency Agent Shutdown Coordination |
+| data formats/interop | 165 | 1 | Agent Memory and State Persistence Standards |
+| human-agent interaction | 41 | 1 | Agent-to-Human Escalation Standards |
+| new | 0 | 3 | Agent Resource Accounting and Billing; Agent Performance Degradation Detection; Federated Agent Learning Privacy |
+| policy/governance | 115 | 1 | Legal Liability Attribution Protocols |
 
 ## Recommendations
@@ -0,0 +1,656 @@
Internet-Draft                                               AI/Agent WG
Intended status: standards-track                              March 2026
Expires: September 08, 2026


Multi-Agent Consensus Protocol (MACP) for Distributed AI Agent Coordination
draft-ai-consensus-protocol-00

Abstract

This document defines the Multi-Agent Consensus Protocol (MACP), a
standardized framework for enabling multiple AI agents to reach
consensus on shared decisions and resolve conflicting objectives
in distributed environments. MACP addresses critical coordination
challenges in autonomous systems where agents must collaborate on
resource allocation, policy enforcement, and decision-making
across organizational and domain boundaries. The protocol
incorporates Byzantine fault tolerance mechanisms, cryptographic
verification, and hierarchical consensus structures to ensure
reliable agreement even in the presence of malicious or
malfunctioning agents. MACP defines message formats, consensus
algorithms, conflict resolution procedures, and integration
patterns with existing agent-to-agent communication protocols. The
protocol supports various consensus models including proof-of-
authority, weighted voting, and reputation-based systems, enabling
deployment across diverse use cases from IoT device coordination
to enterprise AI system orchestration. This specification aims to
reduce fragmentation in multi-agent systems and provide a
foundation for interoperable autonomous agent coordination at
scale.

Status of This Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

This document is intended to have standards-track status.
Distribution of this memo is unlimited.

Table of Contents

1. Introduction ................................................ 3
2. Terminology ................................................. 4
3. Problem Statement ........................................... 5
4. MACP Architecture and Core Components ....................... 6
5. Consensus Algorithms and Message Formats .................... 7
6. Conflict Resolution and Decision Binding .................... 8
7. Integration with Existing Agent Protocols ................... 9
8. Security Considerations ..................................... 10
9. IANA Considerations ......................................... 11

1. Introduction

The proliferation of autonomous AI agents across distributed
computing environments has created an urgent need for standardized
consensus mechanisms that enable coordinated decision-making
without centralized control. As organizations deploy increasing
numbers of intelligent agents for tasks ranging from resource
allocation and policy enforcement to complex multi-party
negotiations, the lack of interoperable consensus protocols has
resulted in fragmented implementations that cannot effectively
coordinate across organizational and domain boundaries. Current
agent-to-agent communication protocols, while addressing basic
message exchange and authentication, provide insufficient
mechanisms for achieving reliable agreement among multiple agents
with potentially conflicting objectives or incomplete information.

Existing consensus approaches in multi-agent systems typically
rely on proprietary coordination mechanisms or adapt consensus
algorithms designed for blockchain and distributed database
applications without addressing the unique requirements of AI
agent coordination. These limitations become particularly acute in
scenarios involving Byzantine fault tolerance, where agents may
exhibit malicious behavior, experience partial failures, or
operate under adversarial conditions. The heterogeneous nature of
AI agent implementations, combined with varying trust
relationships and organizational policies, further complicates the
development of effective consensus mechanisms that can operate
reliably at scale.

The Multi-Agent Consensus Protocol (MACP) addresses these
challenges by providing a standardized framework specifically
designed for AI agent coordination that incorporates proven
consensus algorithms while addressing the unique requirements of
autonomous agent systems. MACP supports multiple consensus models
including proof-of-authority, weighted voting based on agent
reputation or capabilities, and hierarchical consensus structures
that reflect organizational boundaries and trust relationships.
The protocol integrates cryptographic verification mechanisms and
Byzantine fault tolerance to ensure reliable consensus achievement
even in the presence of malicious or malfunctioning agents, while
maintaining compatibility with existing agent communication and
attestation frameworks.

The scope of MACP encompasses the definition of consensus
algorithms optimized for AI agent coordination, standardized
message formats for proposal submission and voting processes,
conflict resolution mechanisms for handling competing objectives,
and integration patterns with existing agent-to-agent protocols.
This specification aims to reduce the current fragmentation in
multi-agent coordination approaches by providing a foundation for
interoperable consensus mechanisms that can scale from small IoT
device clusters to enterprise-wide AI system orchestration. By
establishing common protocols for multi-agent consensus, MACP
enables the development of more robust and coordinated autonomous
systems while maintaining the flexibility required for diverse
deployment scenarios and organizational requirements.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14 [RFC2119] [RFC8174] when, and only when, they
appear in all capitals, as shown here.

This section establishes terminology specific to multi-agent
consensus operations and distributed AI agent coordination. These
definitions build upon established concepts from distributed
systems literature while introducing terminology specific to
autonomous agent environments. Where applicable, references are
made to related terminology from existing agent communication
protocols such as those defined in [RFC8600] and distributed
consensus literature.

A "consensus agent" is an autonomous AI entity capable of
participating in distributed decision-making processes by
submitting proposals, evaluating options, and committing to
agreed-upon outcomes. Consensus agents MUST maintain state
consistency with other participants and possess cryptographic
capabilities for message authentication and verification. An
"observer agent" is a non-participating entity that monitors
consensus processes without voting rights or decision authority.

A "coordination domain" defines the scope and boundaries within
which a group of agents operate under shared governance rules and
consensus mechanisms. Each coordination domain establishes its own
policies for membership, voting weights, and decision authority. A
"decision quorum" represents the minimum number or weight
threshold of participating agents required to reach a valid
consensus decision within a coordination domain, expressed either
as an absolute count or percentage of eligible participants.
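
Expressed in code, the count-or-percentage quorum rule reads roughly as follows. This is an illustrative Python sketch, not part of the specification; the function name and signature are invented:

```python
import math

def quorum_met(votes_received: int, eligible: int, threshold) -> bool:
    """Check a MACP-style decision quorum.

    `threshold` is either an absolute count (int) or a fraction of
    the eligible participants (float in (0, 1]) -- the two forms the
    terminology above allows.
    """
    if isinstance(threshold, float):
        # Percentage form: round up so a fractional requirement is
        # never satisfied by one vote too few.
        return votes_received >= math.ceil(eligible * threshold)
    # Absolute-count form.
    return votes_received >= threshold
```

For example, with 10 eligible agents a threshold of `0.66` requires 7 votes, the same as the absolute threshold `7`.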

"Byzantine Fault Tolerance" (BFT) refers to the system's ability
to achieve consensus despite the presence of agents that may
exhibit arbitrary failures, including malicious behavior, message
corruption, or incorrect state reporting. MACP implementations
SHOULD support Byzantine fault tolerance for up to one-third of
participating agents within any coordination domain. "Practical
Byzantine Fault Tolerance" (pBFT) describes optimized algorithms
that achieve Byzantine fault tolerance with reduced message
complexity and improved performance characteristics.

"Conflict resolution" encompasses the mechanisms and procedures
used to address competing proposals, resolve deadlocks, and handle
situations where multiple valid decisions could be reached. This
includes tie-breaking algorithms, priority-based selection, and
escalation procedures to higher-level coordination domains.
"Decision binding" refers to the enforcement mechanisms that
ensure participating agents comply with consensus outcomes and
maintain consistency with agreed-upon decisions across the
coordination domain.

3. Problem Statement

The proliferation of autonomous AI agents across distributed
systems has created an urgent need for standardized consensus
mechanisms that can coordinate decision-making without centralized
control. Current multi-agent deployments frequently encounter
scenarios where agents must collectively agree on resource
allocation, policy enforcement, task scheduling, or strategic
decisions that affect multiple stakeholders. However, existing
agent-to-agent communication protocols such as FIPA-ACL and
emerging frameworks focus primarily on message exchange and basic
coordination primitives, lacking robust consensus mechanisms
necessary for reliable distributed decision-making. This gap
becomes particularly problematic in cross-organizational
deployments where agents operate under different governance
models, trust assumptions, and operational constraints.

Network partitions and communication failures present fundamental
challenges to consensus achievement in distributed agent
environments. Unlike traditional distributed systems where nodes
typically operate within controlled network environments, AI
agents often function across heterogeneous networks with varying
reliability characteristics, from edge computing environments to
cloud infrastructure spanning multiple providers. Agents may
become temporarily or permanently unreachable, creating scenarios
where consensus decisions must proceed with incomplete information
or be safely aborted to prevent system-wide inconsistencies.
Current agent protocols provide insufficient guidance for handling
these partition scenarios, often resulting in ad-hoc
implementations that cannot guarantee safety properties or
liveness guarantees across different deployment contexts.

Conflicting objectives among participating agents introduce
additional complexity beyond traditional distributed consensus
problems. AI agents frequently operate with competing utility
functions, resource constraints, and optimization targets that may
not align with collective decision requirements. For example, in a
multi-cloud resource allocation scenario, individual agents may
prioritize cost minimization for their respective organizations
while the collective decision requires balancing performance,
security, and availability across all participants. Existing
consensus algorithms assume participants share common objectives
or can reduce decisions to simple binary choices, failing to
address the multi-dimensional optimization problems inherent in AI
agent coordination.

The presence of malicious or Byzantine actors poses significant
threats to consensus integrity in open multi-agent environments.
Unlike closed distributed systems where all participants operate
under unified security models, AI agents may need to establish
consensus across organizational boundaries where some participants
cannot be fully trusted. Malicious agents may attempt to
manipulate consensus outcomes through strategic voting, false
information injection, or coordination attacks designed to prevent
legitimate consensus achievement. Furthermore, compromised or
malfunctioning agents may exhibit Byzantine behavior without
malicious intent, requiring consensus mechanisms that can tolerate
arbitrary failures while maintaining decision quality and system
progress.

Current limitations in existing agent-to-agent protocols create
additional barriers to reliable consensus implementation. Most
protocols focus on peer-to-peer communication abstractions without
addressing the coordination complexity required for multi-party
decision-making. Authentication and authorization mechanisms are
typically designed for bilateral interactions rather than group
consensus scenarios, lacking support for quorum-based verification
or reputation systems that could improve consensus security.
Additionally, existing protocols provide limited support for
hierarchical consensus structures or delegation mechanisms that
could enable scalable decision-making in large agent populations,
forcing implementations toward flat consensus models that may not
perform well beyond small agent groups.

4. MACP Architecture and Core Components

The MACP architecture employs a distributed coordination model
consisting of three primary components: Consensus Coordinators,
Participating Agents, and Decision Domains. This architecture is
designed to scale horizontally while maintaining fault tolerance
and ensuring efficient consensus across diverse agent populations.
The system operates on the principle of domain-partitioned
consensus, where agents are organized into logical groupings based
on their functional roles, trust relationships, or organizational
boundaries. Each Decision Domain maintains its own consensus state
and can interact with other domains through well-defined inter-
domain protocols.

Consensus Coordinators serve as the orchestration layer for MACP
operations within each Decision Domain. A Consensus Coordinator
MUST be capable of maintaining the current consensus state,
managing proposal queues, and coordinating message distribution
among Participating Agents. Coordinators are responsible for
implementing the selected consensus algorithm, enforcing quorum
requirements, and ensuring message ordering consistency. In
deployments requiring high availability, multiple Consensus
Coordinators MAY operate in a redundant configuration using leader
election mechanisms similar to those defined in Raft consensus
algorithms. Coordinators MUST implement Byzantine fault tolerance
measures when operating in environments where malicious behavior
is possible, maintaining consensus integrity even when up to one-
third of coordinators exhibit arbitrary failures.

Participating Agents represent the autonomous entities that
contribute to consensus decisions within the MACP framework. Each
Participating Agent MUST maintain a unique identity within its
Decision Domain and possess the capability to generate, evaluate,
and vote on consensus proposals. Agents are classified into one of
three participation modes: Active Participants that can both
propose and vote on decisions, Voting Participants that can vote
but not propose, and Observer Participants that receive consensus
results but do not participate in the decision process.
Participating Agents MUST implement proposal validation logic
appropriate to their domain context and SHOULD incorporate
reputation tracking mechanisms to assess the trustworthiness of
proposals from other agents.

Decision Domains establish the scope and context for consensus
operations, defining the set of agents authorized to participate
in specific types of decisions. A Decision Domain MUST specify its
consensus model (proof-of-authority, weighted voting, or
reputation-based), quorum requirements, and decision binding
policies. Domains operate independently but MAY establish inter-
domain communication channels for coordinating decisions that span
multiple domains. The domain configuration MUST include conflict
resolution parameters, timeout specifications, and rollback
procedures to handle consensus failures gracefully.
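
One plausible shape for that per-domain configuration, as a hedged sketch: the text only mandates which settings must exist, and every field name below is invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class DecisionDomainConfig:
    """Hypothetical container for the settings a Decision Domain
    MUST specify: consensus model, quorum, binding policy, conflict
    resolution, timeouts, and rollback behaviour."""
    consensus_model: str       # one of the three models the text names
    quorum: float              # fraction of eligible agents in (0, 1]
    voting_timeout_s: float    # per-round voting window
    max_timeout_s: float       # cap for the retry back-off
    binding_duration_s: float  # how long a committed decision binds
    rollback_on_failure: bool = True

    def __post_init__(self):
        allowed = {"proof-of-authority", "weighted-voting",
                   "reputation-based"}
        if self.consensus_model not in allowed:
            raise ValueError(
                f"unknown consensus model: {self.consensus_model}")
```

A domain operator would validate such a record once at domain creation rather than on every consensus round.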

The interaction patterns between these components follow a
structured request-response model augmented with publish-subscribe
mechanisms for state synchronization. When a Participating Agent
initiates a consensus proposal, it MUST first submit the proposal
to its designated Consensus Coordinator, which validates the
proposal format and participant authorization. The Coordinator
then distributes the proposal to all eligible Participating Agents
within the Decision Domain, collecting votes according to the
configured consensus algorithm. Upon reaching quorum and achieving
consensus, the Coordinator publishes the binding decision to all
participants and updates the domain's consensus state. Failed
consensus attempts trigger the domain's configured rollback
procedures, allowing the system to maintain consistency despite
partial failures or network partitions.

5. Consensus Algorithms and Message Formats

MACP implements multiple consensus algorithms to accommodate
different operational requirements and network conditions. The
base specification MUST support the Practical Byzantine Fault
Tolerance (pBFT) algorithm adapted for multi-agent environments,
while implementations MAY support additional algorithms including
Raft consensus for non-Byzantine scenarios and novel reputation-
weighted consensus for environments with established agent trust
relationships. The pBFT implementation assumes a maximum of f
Byzantine agents out of 3f+1 total participating agents, providing
safety and liveness guarantees under standard network assumptions.
Each consensus algorithm is identified by a unique Algorithm
Identifier (AID) registered with IANA as specified in Section 9.
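
The 3f+1 sizing rule above fixes both a domain's fault budget and its commit quorum. A minimal sketch, assuming the standard pBFT bounds (the function names are invented):

```python
def max_byzantine_faults(n_agents: int) -> int:
    """Largest f such that n_agents >= 3*f + 1 (the pBFT bound)."""
    return max((n_agents - 1) // 3, 0)

def pbft_commit_quorum(n_agents: int) -> int:
    """Matching votes needed to commit: 2f + 1 of the n agents."""
    return 2 * max_byzantine_faults(n_agents) + 1
```

So a 4-agent domain tolerates one Byzantine agent and commits on 3 matching votes; a 10-agent domain tolerates three and commits on 7.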

The MACP consensus process follows a four-phase protocol:
Proposal, Pre-voting, Voting, and Commitment. During the Proposal
phase, any authorized agent MAY submit decision proposals to the
designated consensus coordinator for the relevant coordination
domain. The Pre-voting phase allows agents to signal their
preliminary position and identify potential conflicts or
dependencies with other pending proposals. The Voting phase
requires participating agents to submit cryptographically signed
votes within a specified timeout window, with vote validity
determined by the agent's authorization level and current
reputation score. The Commitment phase broadcasts the final
decision and requires acknowledgment from a quorum of agents
before considering the consensus binding.
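
The four phases form a strictly linear progression, so a round can be tracked with a trivial state machine. The class below is one assumed implementation shape, not anything the specification defines:

```python
# Phase names are taken from the text; the class itself is invented.
PHASES = ["PROPOSAL", "PRE_VOTING", "VOTING", "COMMITMENT"]

class ConsensusRound:
    """Track which of the four MACP phases a round is in."""

    def __init__(self):
        self.phase = "PROPOSAL"

    def advance(self) -> str:
        """Move to the next phase; a finished round stays in
        COMMITMENT rather than wrapping around."""
        i = PHASES.index(self.phase)
        if i < len(PHASES) - 1:
            self.phase = PHASES[i + 1]
        return self.phase
```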

MACP defines standardized message formats using JSON serialization
with mandatory digital signatures following RFC 7515 (JSON Web
Signature). All consensus messages MUST include a common header
containing the message type, consensus session identifier,
timestamp, originating agent identifier, and cryptographic
signature. Proposal messages additionally contain the proposed
decision payload, expected quorum size, voting timeout duration,
and conflict resolution parameters. Vote messages include the
proposal hash, agent's decision (ACCEPT, REJECT, ABSTAIN), voting
weight, and optional reasoning metadata. Commitment messages
broadcast the final consensus result, participating agent list,
vote tally, and binding duration for the agreed decision.
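
A vote message carrying the header and vote fields listed above might be serialized as a JWS compact token. The sketch below uses only the Python standard library and the shared-key HS256 algorithm for brevity; RFC 7515 also permits asymmetric algorithms, and every JSON key beyond the fields quoted in the text is an invented placeholder:

```python
import base64
import hashlib
import hmac
import json
import time

def _b64url(data: bytes) -> str:
    # base64url without padding, per the JWS compact serialization.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def signed_vote(agent_id: str, session: str, proposal_hash: str,
                decision: str, weight: int, key: bytes) -> str:
    """Serialize a MACP-style vote as a JWS compact token (HS256)."""
    assert decision in ("ACCEPT", "REJECT", "ABSTAIN")
    header = {"alg": "HS256", "typ": "JOSE"}
    payload = {
        "type": "vote",               # message type (common header)
        "session": session,           # consensus session identifier
        "agent": agent_id,            # originating agent identifier
        "timestamp": int(time.time()),
        "proposal_hash": proposal_hash,
        "decision": decision,
        "weight": weight,
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload))
    sig = hmac.new(key, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + _b64url(sig)
```

A receiver would recompute the MAC over `header.payload` and compare before trusting any field.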

Message validation requires verification of agent authorization,
signature authenticity, and temporal constraints before
processing. Agents MUST reject messages with invalid signatures,
expired timestamps beyond the configured clock skew tolerance, or
originating from agents not authorized for the specific
coordination domain. Vote aggregation follows the specified
consensus algorithm with additional validation for vote weight
consistency and duplicate vote detection. The consensus
coordinator MUST broadcast commitment messages to all
participating agents and maintain an audit log of the complete
consensus session for accountability purposes as defined in
existing agent attestation frameworks.
|
||||||
|
|
||||||
|
Timeout handling and failure recovery mechanisms ensure liveness
properties under adverse network conditions. If insufficient votes
are received within the voting timeout window, the consensus
coordinator MUST initiate a new consensus round with an
exponentially increasing timeout duration, up to a maximum
threshold defined in the coordination domain configuration.
Network partition scenarios are addressed through coordinator
election protocols that prevent split-brain consensus decisions.
Failed consensus attempts trigger rollback procedures that notify
all participating agents of the unsuccessful coordination attempt
and release any provisionally allocated resources pending the
consensus outcome.

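The exponentially increasing retry timeout reduces to one line; the 1440-second cap here is a hypothetical domain configuration value, while 180 s is the draft's default voting timeout.

```python
def next_voting_timeout(base: float, round_num: int, max_timeout: float) -> float:
    """Double the voting timeout each retry round, capped at the domain maximum."""
    return min(base * (2 ** round_num), max_timeout)

# Rounds 0..4 with a 180 s base and a hypothetical 1440 s cap
timeouts = [next_voting_timeout(180.0, r, 1440.0) for r in range(5)]
# -> 180, 360, 720, 1440, 1440
```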
6. Conflict Resolution and Decision Binding

Conflict resolution in MACP occurs when multiple competing
proposals are submitted simultaneously or when agents disagree on
the validity or priority of proposed decisions. When conflicts are
detected during the proposal phase, participating agents MUST
invoke the conflict resolution mechanism before proceeding with
the voting phase. The protocol defines three primary conflict
types: proposal conflicts (multiple proposals for the same
decision domain), timing conflicts (simultaneous submissions
within the conflict detection window), and validity conflicts
(disagreement on proposal prerequisites or constraints). Agents
MUST maintain a conflict detection buffer with a configurable
timeout period (default 30 seconds) to identify competing
proposals before initiating consensus procedures.

The tie-breaking procedure activates when voting results in
equivalent support for multiple proposals or when no proposal
achieves the required quorum threshold. MACP employs a
hierarchical tie-breaking mechanism starting with proposal
priority levels, followed by submitter reputation scores, and
finally deterministic hash-based selection using the combined hash
of conflicting proposal identifiers. Participating agents MUST
apply tie-breaking rules in the specified order and MUST reach
agreement on tie-breaking criteria during the initial coordination
domain establishment. When tie-breaking fails to resolve
conflicts, the consensus coordinator MUST initiate a cooling-off
period of at least 60 seconds before allowing resubmission of
conflicting proposals.

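The hierarchical tie-break (priority, then reputation, then a deterministic hash over the combined proposal identifiers) can be expressed as a single sort key. The proposal record fields are hypothetical; the point is that every agent computes the same winner from the same inputs.

```python
import hashlib

def break_tie(proposals: list[dict]) -> dict:
    """Pick one proposal deterministically: highest priority wins, then highest
    submitter reputation, then hash-based selection seeded by the combined
    hash of all conflicting proposal identifiers."""
    combined = hashlib.sha256(
        "".join(sorted(p["id"] for p in proposals)).encode()
    ).hexdigest()

    def key(p: dict):
        # Negate priority/reputation so higher values sort first;
        # the per-proposal hash breaks any remaining tie deterministically.
        tiebreak = hashlib.sha256((combined + p["id"]).encode()).hexdigest()
        return (-p["priority"], -p["reputation"], tiebreak)

    return sorted(proposals, key=key)[0]

winner = break_tie([
    {"id": "p1", "priority": 2, "reputation": 0.9},
    {"id": "p2", "priority": 3, "reputation": 0.4},
])
# p2 wins on priority before reputation is even consulted
```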
Decision binding ensures that consensus outcomes are enforced
across all participating agents through cryptographic commitment
and distributed verification mechanisms. Once consensus is
achieved, all participating agents MUST generate binding
commitment messages containing the decision hash, their digital
signature, and a commitment timestamp. The binding phase requires
acknowledgment from at least the same quorum that approved the
original proposal within a configurable binding timeout (default
120 seconds). Agents MUST store binding commitments in their local
decision ledger and MUST reject future proposals that violate
committed decisions unless explicitly superseded through the
decision override mechanism.

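A minimal sketch of the local decision ledger described above, assuming a per-domain decision hash and a boolean stand-in for the draft's decision override mechanism (both hypothetical simplifications):

```python
class DecisionLedger:
    """Local ledger: store binding commitments and reject violating proposals."""

    def __init__(self) -> None:
        self.commitments: dict[str, str] = {}  # decision domain -> decision hash

    def bind(self, domain: str, decision_hash: str) -> None:
        """Record a binding commitment for a decision domain."""
        self.commitments[domain] = decision_hash

    def allows(self, domain: str, decision_hash: str, supersede: bool = False) -> bool:
        """A new proposal is rejected if it conflicts with a committed decision,
        unless it arrives via the override mechanism (modeled as a flag here)."""
        committed = self.commitments.get(domain)
        return supersede or committed is None or committed == decision_hash

ledger = DecisionLedger()
ledger.bind("routing-policy", "hash-a")
```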
Timeout handling addresses scenarios where consensus cannot be
achieved within specified time bounds or when participating agents
become unresponsive during critical phases. MACP defines distinct
timeout periods for each consensus phase: proposal timeout
(default 60 seconds), voting timeout (default 180 seconds), and
binding timeout (default 120 seconds). When timeouts occur, the
consensus coordinator MUST broadcast a timeout notification and
initiate graceful degradation procedures, which may include
reducing quorum requirements, extending timeout periods, or
aborting the consensus attempt. Agents that fail to respond within
timeout periods MUST be temporarily excluded from the current
consensus round but MAY rejoin subsequent rounds.

Rollback mechanisms provide recovery capabilities when consensus
failures occur after the voting phase or when binding commitments
cannot be properly established. Rollback procedures MUST be
initiated when binding acknowledgments fall below the required
threshold, when Byzantine fault detection identifies compromised
consensus results, or when critical participating agents report
implementation failures. The rollback process requires the
consensus coordinator to broadcast rollback notifications
containing the failed consensus identifier, rollback reason code,
and reversion instructions. All participating agents MUST
acknowledge rollback notifications, remove associated decision
commitments from their local ledgers, and reset their consensus
state to allow for subsequent retry attempts with modified
parameters or participant sets.

7. Integration with Existing Agent Protocols

MACP is designed to operate as an overlay protocol that integrates
seamlessly with existing agent-to-agent communication frameworks
and infrastructure. Implementations MUST support integration with
standard authentication protocols including OAuth 2.0 [RFC6749],
OpenID Connect, and X.509 certificate-based authentication systems
commonly deployed in enterprise environments. MACP consensus
messages SHOULD leverage existing secure transport mechanisms such
as TLS 1.3 [RFC8446] or DTLS for UDP-based communications,
ensuring that consensus operations benefit from established
security practices without requiring separate cryptographic
implementations.

Integration with agent accountability frameworks requires MACP
implementations to maintain comprehensive audit trails of
consensus participation and decision outcomes. Consensus
coordinators MUST log all proposal submissions, voting records,
and final decisions in formats compatible with existing audit and
compliance systems. When operating alongside agent attestation
protocols, MACP SHOULD verify agent identity and authorization
status before allowing participation in consensus processes,
utilizing existing identity providers and policy enforcement
points where available. The protocol defines standard interfaces
for querying agent reputation scores and authorization levels from
external accountability systems.

MACP consensus operations MUST be designed to coexist with
workflow management and orchestration platforms commonly used in
distributed AI deployments. Implementations SHOULD provide APIs
and event notifications that allow workflow systems to trigger
consensus processes when collective decisions are required, and to
receive binding consensus outcomes for subsequent workflow
execution. The protocol supports asynchronous integration patterns
where consensus results can be delivered to workflow systems
through message queues, webhooks, or polling interfaces, ensuring
compatibility with diverse orchestration architectures.

For environments utilizing existing agent-to-agent communication
protocols such as FIPA-ACL or custom REST-based agent APIs, MACP
provides adapter interfaces that translate consensus-specific
messages into native communication formats. Protocol
implementations MAY offer plugin architectures that allow custom
integration modules for proprietary agent communication systems,
while maintaining core consensus algorithm integrity. Standard
message mapping templates are provided for common integration
scenarios, reducing implementation complexity for organizations
with established agent communication infrastructure.

The protocol includes provisions for gradual deployment in mixed
environments where only a subset of agents support MACP consensus
mechanisms. Non-MACP agents MAY participate in consensus processes
through proxy agents that translate between native agent protocols
and MACP message formats, though such deployments SHOULD implement
additional verification mechanisms to ensure proxy agent fidelity.
Integration guidelines specify fallback procedures for scenarios
where consensus mechanisms are unavailable, allowing graceful
degradation to existing coordination approaches while maintaining
system stability.

8. Security Considerations

Security considerations for MACP deployment are paramount given
the distributed nature of multi-agent systems and the potential
for malicious actors to compromise consensus integrity. The
protocol MUST implement comprehensive threat mitigation strategies
to address attacks specific to distributed consensus mechanisms.
Primary security concerns include Sybil attacks where malicious
actors create multiple false agent identities to gain
disproportionate voting power, coordination attacks where
compromised agents collude to manipulate consensus outcomes, and
consensus manipulation through message tampering or replay
attacks. Additionally, MACP implementations MUST consider
denial-of-service attacks targeting consensus coordinators,
eclipse attacks isolating honest agents from the consensus
network, and long-range attacks where compromised agents attempt
to rewrite historical consensus decisions.

Cryptographic requirements for MACP implementations MUST include
strong identity verification mechanisms to prevent unauthorized
participation in consensus processes. Each participating agent
MUST possess a cryptographically verifiable identity backed by
public-key infrastructure or distributed identity systems such as
those defined in [RFC6960] and emerging decentralized identity
standards. Digital signatures MUST be used for all consensus
messages including proposals, votes, and commitments, with
signature schemes providing at least 128-bit security strength as
specified in [RFC3766]. Message integrity MUST be protected
through cryptographic hash functions resistant to collision
attacks, and implementations SHOULD employ hash-based message
authentication codes (HMAC) for additional verification.
Time-based replay attack prevention MUST be implemented through
message timestamps and nonce mechanisms, with strict validation of
message freshness windows.

Identity verification mechanisms MUST prevent Sybil attacks
through robust agent authentication and reputation tracking
systems. MACP implementations SHOULD integrate with existing
Public Key Infrastructure (PKI) systems or emerging decentralized
identity frameworks to establish verifiable agent identities.
Consensus coordinators MUST maintain authoritative lists of
eligible participating agents and regularly validate agent
credentials against trusted identity providers. Multi-factor
authentication SHOULD be employed for high-stakes consensus
decisions, potentially including hardware security module (HSM)
attestation or trusted execution environment verification. Agent
reputation systems MAY be implemented to track historical behavior
and adjust voting weights based on demonstrated trustworthiness,
though such systems MUST include mechanisms to prevent reputation
manipulation attacks.

Protection mechanisms for consensus integrity MUST address both
technical and game-theoretic attack vectors inherent in
distributed decision-making systems. Byzantine Fault Tolerant
consensus algorithms MUST be configured to handle the maximum
expected number of malicious agents according to theoretical
bounds, typically supporting up to f faulty agents in a network of
3f+1 total agents. Network-level protections SHOULD include
encrypted communication channels using protocols such as TLS 1.3
[RFC8446] and distributed denial-of-service (DDoS) mitigation
strategies to ensure consensus availability. Implementations MUST
implement consensus finality mechanisms that prevent retroactive
modification of agreed-upon decisions and provide cryptographic
proofs of consensus achievement. Regular security audits and
penetration testing SHOULD be conducted on MACP implementations,
with particular attention to consensus algorithm correctness and
cryptographic implementation vulnerabilities.

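The 3f+1 bound quoted above works out as follows; this is a quick sanity check on the classic BFT arithmetic, not MACP-specific code.

```python
def max_faulty(n: int) -> int:
    """Classic BFT bound: a network of n agents tolerates f Byzantine faults
    when n >= 3f + 1, i.e. f = floor((n - 1) / 3)."""
    return (n - 1) // 3

# A 10-agent coordination domain tolerates 3 Byzantine agents;
# at least 4 agents are needed to tolerate even one.
assert max_faulty(10) == 3
assert max_faulty(4) == 1
assert max_faulty(3) == 0
```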
Economic and incentive-based security measures SHOULD be
considered to discourage malicious behavior and ensure honest
participation in consensus processes. Stake-based consensus
mechanisms MAY be implemented where agents must commit resources
or reputation to participate in decision-making, creating economic
disincentives for malicious behavior. Slashing mechanisms SHOULD
be employed to penalize agents that violate consensus rules or
demonstrate Byzantine behavior. However, such economic measures
MUST be carefully designed to prevent wealth concentration attacks
and ensure broad participation accessibility. Monitoring and
anomaly detection systems SHOULD continuously analyze consensus
patterns to identify potential coordinated attacks or unusual
voting behaviors that may indicate compromise. Emergency response
procedures MUST be established to handle detected security
incidents, including mechanisms for temporarily suspending
consensus participation of suspected malicious agents and
initiating incident response protocols.

9. IANA Considerations

This document requests IANA to create and maintain several new
registries to support the Multi-Agent Consensus Protocol (MACP)
and ensure protocol extensibility and interoperability. The
registries defined in this section will enable standardized
identification of protocol elements while allowing for future
enhancements and vendor-specific extensions without creating
conflicts or ambiguity in multi-agent consensus implementations.

IANA is requested to establish a "Multi-Agent Consensus Protocol
(MACP) Parameters" registry group containing three distinct
registries. The first registry, "MACP Message Types", SHALL
contain identifiers for all MACP message types including consensus
proposals, votes, commitments, and administrative messages.
Message type identifiers MUST be allocated as 16-bit unsigned
integers in the range 0x0000-0xFFFF, with values 0x0000-0x7FFF
reserved for IETF-defined message types and values 0x8000-0xFFFF
available for private use and experimental implementations.
Registration of new message types in the IETF range requires
Standards Action as defined in RFC 8126, and MUST include a
complete message format specification and security considerations.

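The proposed registry split of the 16-bit message type space is easy to check mechanically; the policy labels returned here are shorthand for the RFC 8126 policies named above.

```python
def message_type_registry_policy(msg_type: int) -> str:
    """Classify a 16-bit MACP message type per the proposed IANA split:
    0x0000-0x7FFF IETF-defined (Standards Action), 0x8000-0xFFFF private use."""
    if not 0x0000 <= msg_type <= 0xFFFF:
        raise ValueError("message type must fit in 16 bits")
    return "IETF (Standards Action)" if msg_type <= 0x7FFF else "Private/Experimental"

assert message_type_registry_policy(0x0001) == "IETF (Standards Action)"
assert message_type_registry_policy(0x8000) == "Private/Experimental"
```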
The second registry, "MACP Consensus Algorithm Identifiers", SHALL
contain unique identifiers for consensus algorithms supported by
MACP implementations. Algorithm identifiers MUST be allocated as
UTF-8 strings following the pattern "algorithm-name.version" with
a maximum length of 64 characters. The registry MUST include
algorithm names, version numbers, reference specifications,
security properties, and applicable use case constraints for each
entry. Initial registry entries SHALL include "pbft.1.0" for
Practical Byzantine Fault Tolerance, "poa.1.0" for Proof of
Authority, and "weighted-vote.1.0" for weighted voting consensus.
New algorithm registrations require Expert Review with designated
experts having demonstrated expertise in distributed consensus
mechanisms and multi-agent systems.

The third registry, "MACP Conflict Resolution Methods", SHALL
contain identifiers for standardized conflict resolution
procedures used when multiple competing proposals achieve similar
consensus scores. Resolution method identifiers MUST follow the
same UTF-8 string format as consensus algorithms and include
detailed descriptions of resolution logic, fairness guarantees,
and termination conditions. Registration requires Expert Review
and MUST demonstrate deterministic behavior across all
participating agents. Initial entries SHALL include
"timestamp-priority.1.0", "hash-ordering.1.0", and
"weighted-random.1.0" with complete algorithmic specifications.

IANA is further requested to establish a "MACP Extension
Parameters" registry for protocol extension identifiers used in
MACP header fields and capability negotiation. Extension
identifiers MUST be allocated as reverse DNS notation strings to
prevent conflicts and enable vendor-specific extensions while
maintaining global uniqueness. The registry SHALL operate under
First Come First Served allocation policy as defined in RFC 8126,
requiring only basic documentation of the extension purpose and
format. All registry entries MUST include contact information for
the registering organization and SHOULD reference publicly
accessible specification documents for interoperability purposes.

Author's Address

Generated by IETF Draft Analyzer
2026-03-07

@@ -12,7 +12,7 @@
 | drafts | 434 | Up from 361 after 2026-03-07 fetch |
 | ratings | 434 | 1:1 with drafts |
 | authors | 557 | Unique persons from Datatracker |
-| ideas | 419 | See "Ideas Count History" below |
+| ideas | 462 | Re-extracted 2026-03-08, see "Ideas Count History" below |
 | gaps | 11 | Not 12 -- see gap list below |
 | embeddings | 434 | 1:1 with drafts |
 | draft_authors | 1,057 | Draft-author links |
@@ -79,24 +79,25 @@ Blog posts reference 12 gaps with different names (e.g., "Agent Resource Exhaust
 ## Ideas Count History
 
-The database currently contains **419 ideas** across **377 drafts**. This is the third different count encountered:
+The database currently contains **462 ideas** across **415 drafts**. This is the fourth count encountered:
 
 | Source | Count | Date | Likely Explanation |
 |--------|-------|------|-------------------|
 | Blog post 5 filename | 1,262 | ~2026-03-03 | Pre-expansion dataset (260 drafts), before dedup |
 | Blog post 5 text / master stats | 1,780 | ~2026-03-05 | Post-expansion (361 drafts), before dedup |
-| Current database | 419 | 2026-03-08 | After `dedup_ideas` run (0.85 threshold) or re-extraction with different params |
+| Previous database | 419 | 2026-03-08 | After `dedup_ideas` run (0.85 threshold) or re-extraction with different params |
+| Current database | 462 | 2026-03-08 | After re-extraction for 38 drafts missing ideas (474 total drafts, 59 still without ideas) |
 
 ### Ideas by Type (current DB)
 
 | Type | Count |
 |------|-------|
-| protocol | 96 |
-| architecture | 95 |
-| extension | 79 |
-| mechanism | 68 |
-| requirement | 42 |
-| pattern | 35 |
+| architecture | 107 |
+| protocol | 106 |
+| extension | 84 |
+| mechanism | 74 |
+| requirement | 47 |
+| pattern | 40 |
 | framework | 3 |
 | format | 1 |
 
@@ -104,14 +105,30 @@ The database currently contains **419 ideas** across **377 drafts**. This is the
 | Ideas/Draft | Drafts |
 |-------------|--------|
-| 1 | 337 |
-| 2 | 38 |
+| 1 | 370 |
+| 2 | 43 |
 | 3 | 2 |
-| 0 (no ideas) | 57 |
+| 0 (no ideas) | 59 |
 
 The near-uniform 1-idea-per-draft (89% of drafts with ideas) suggests either aggressive dedup or a re-extraction with constrained output. The original pipeline extracted 1-4 ideas per draft, so the 1,780 figure likely reflects pre-dedup counts.
 
-Excluding false positives: 365 ideas across 326 drafts.
+### Convergence Analysis (2026-03-08)
+
+Cross-organization idea convergence analysis (threshold: 0.75 SequenceMatcher similarity):
+
+| Metric | Value |
+|--------|-------|
+| Total ideas | 462 |
+| Unique clusters | 398 |
+| Cross-org convergent ideas | 132 |
+| Convergence rate | 33% |
+
+Top convergent ideas by organization count:
+- **Fully Adaptive Routing Ethernet for AI** — 14 orgs (Baidu, Broadcom, China Mobile, etc.)
+- **AI Agent Protocol Framework** — 7 orgs, 3 drafts
+- **Natural Language Protocol for Agent Comm** — 7 orgs
+- **LISP-based geospatial intelligence network** — 6 orgs
+- **MCP-Based Network Management Plane** — 4 orgs (Deutsche Telekom, Huawei, Orange, Telefonica)
+
 ## Actions Taken (2026-03-08)
 
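The 0.75 SequenceMatcher threshold used by the convergence analysis above can be reproduced with the standard library. This is an illustrative sketch of the clustering idea, not the pipeline's actual code; the idea titles below are hypothetical stand-ins.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.75  # similarity ratio used in the convergence analysis

def similar(a: str, b: str) -> bool:
    """True when two idea titles exceed the similarity threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= THRESHOLD

def cluster(titles: list[str]) -> list[list[str]]:
    """Greedy single-pass clustering: each title joins the first cluster
    whose representative (first member) it matches."""
    clusters: list[list[str]] = []
    for t in titles:
        for c in clusters:
            if similar(t, c[0]):
                c.append(t)
                break
        else:
            clusters.append([t])
    return clusters

clusters = cluster([
    "AI Agent Protocol Framework",
    "AI Agent Protocol Framework for Networks",
    "LISP-based geospatial intelligence network",
])
# The first two titles merge into one cluster; the third stands alone.
```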
97	data/reports/wg-analysis.md	Normal file
@@ -0,0 +1,97 @@
# Working Group Analysis

*Generated 2026-03-06 21:16 UTC — 434 drafts (85 WG-adopted, 349 individual)*

## Working Group Overview

| WG | Drafts | Ideas | Novelty | Maturity | Overlap | Momentum | Relevance |
|:---|-------:|------:|--------:|---------:|--------:|---------:|----------:|
| **lake** | 11 | 10 | 3.1 | 3.8 | 2.3 | 3.6 | 3.9 |
| **lamps** | 9 | 9 | 2.7 | 3.9 | 1.7 | 3.4 | 3.6 |
| **aipref** | 9 | 10 | 3.0 | 3.2 | 3.2 | 3.3 | 4.1 |
| **emu** | 6 | 6 | 3.3 | 3.2 | 2.8 | 3.3 | 3.7 |
| **httpbis** | 5 | 5 | 2.0 | 4.8 | 3.2 | 4.2 | 3.0 |
| **tsv** | 4 | 4 | 2.8 | 3.8 | 2.2 | 3.0 | 3.0 |
| **tls** | 4 | 4 | 3.2 | 4.0 | 2.0 | 4.5 | 5.0 |
| **sshm** | 3 | 3 | 2.0 | 4.3 | 2.0 | 3.7 | 3.7 |
| **idr** | 3 | 3 | 2.7 | 3.0 | 2.7 | 3.7 | 3.0 |
| **dnsop** | 3 | 3 | 3.0 | 3.7 | 1.7 | 3.7 | 3.0 |
| **app** | 3 | 3 | 2.0 | 3.7 | 2.0 | 2.0 | 2.0 |
| **anima** | 3 | 4 | 3.0 | 4.3 | 2.3 | 3.7 | 3.7 |
| **sml** | 2 | 2 | 3.0 | 3.5 | 2.0 | 3.0 | 3.0 |
| **nmrg** | 2 | 2 | 3.0 | 3.0 | 3.5 | 3.0 | 3.5 |
| **hpke** | 2 | 2 | 3.5 | 4.5 | 2.0 | 4.5 | 5.0 |
| **dtn** | 2 | 2 | 3.0 | 4.0 | 1.0 | 3.5 | 2.5 |
| **ace** | 2 | 2 | 3.5 | 4.0 | 3.0 | 4.0 | 4.0 |
| **websec** | 1 | 1 | 3.0 | 4.0 | 2.0 | 4.0 | 4.0 |
| **vwrap** | 1 | 1 | 4.0 | 3.0 | 2.0 | 3.0 | 4.0 |
| **suit** | 1 | 1 | 3.0 | 4.0 | 2.0 | 4.0 | 4.0 |
| **sip** | 1 | 1 | 3.0 | 4.0 | 2.0 | 4.0 | 3.0 |
| **sec** | 1 | 1 | 2.0 | 5.0 | 4.0 | 3.0 | 4.0 |
| **roll** | 1 | 1 | 2.0 | 4.0 | 3.0 | 3.0 | 3.0 |
| **pim** | 1 | 2 | 2.0 | 3.0 | 3.0 | 3.0 | 2.0 |
| **netconf** | 1 | 1 | 3.0 | 4.0 | 2.0 | 4.0 | 4.0 |
| **mailmaint** | 1 | 1 | 2.0 | 4.0 | 4.0 | 2.0 | 2.0 |
| **lisp** | 1 | 1 | 4.0 | 4.0 | 2.0 | 4.0 | 4.0 |
| **grow** | 1 | 1 | 4.0 | 4.0 | 2.0 | 4.0 | 5.0 |
| **core** | 1 | 1 | 3.0 | 4.0 | 1.0 | 4.0 | 4.0 |

## Cross-WG Category Spread

Categories appearing in multiple WGs — potential coordination or alignment needed.

| Category | WG Count | Total Drafts | WGs |
|:---------|:--------:|-------------:|:----|
| Data formats/interop | 23 | 174 | aipref(8), lamps(6), lake(4), httpbis(3), sml(2), sshm(2), hpke(2), lisp(1), mailmaint(1), nmrg(1), ace(1), suit(1), tls(1), anima(1), netconf(1), pim(1), dtn(1), websec(1), app(1), emu(1), core(1), sec(1) |
| Agent identity/auth | 13 | 152 | lake(8), emu(6), anima(3), lamps(3), sshm(2), ace(2), hpke(2), sml(1), vwrap(1), aipref(1), core(1), sec(1) |
| A2A protocols | 9 | 155 | idr(3), lake(2), lisp(1), ace(1), aipref(1), sip(1), vwrap(1), dtn(1) |
| Autonomous netops | 9 | 114 | anima(2), dnsop(2), lisp(1), roll(1), nmrg(1), netconf(1), dtn(1), grow(1) |
| Policy/governance | 9 | 108 | aipref(9), lamps(2), dnsop(2), lake(1), tls(1), websec(1), httpbis(1), idr(1) |
| Agent discovery/reg | 8 | 89 | lake(2), roll(1), pim(1), sip(1), aipref(1), app(1), anima(1) |
| Other AI/agent | 6 | 34 | tsv(3), httpbis(2), tls(2), app(2), dnsop(1) |
| ML traffic mgmt | 5 | 79 | nmrg(1), tsv(1), aipref(1), grow(1) |
| Human-agent interaction | 4 | 33 | aipref(3), nmrg(1), vwrap(1) |
| AI safety/alignment | 3 | 47 | aipref(2), sml(1) |
| Model serving/inference | 2 | 42 | nmrg(1) |

## Cross-WG Idea Overlap

Same technical ideas appearing in different WGs — strongest signals for alignment.

### Hybrid Post-Quantum Cryptography for EAP-AKA' (1 WGs: emu)

- **[emu]** [draft-ar-emu-hybrid-pqc-eapaka](https://datatracker.ietf.org/doc/draft-ar-emu-hybrid-pqc-eapaka/) — Enhancing Security in EAP-AKA' with Hybrid Post-Quantum Cryptography

## Individual vs WG-Adopted Distribution

| Category | Individual | WG-Adopted | Assessment |
|:---------|----------:|-----------:|:-----------|
| A2A protocols | 144 | 11 | WG exists — individual drafts could target it |
| AI safety/alignment | 44 | 3 | WG exists — individual drafts could target it |
| Agent discovery/reg | 81 | 8 | WG exists — individual drafts could target it |
| Agent identity/auth | 121 | 31 | WG exists — individual drafts could target it |
| Autonomous netops | 104 | 10 | WG exists — individual drafts could target it |
| Data formats/interop | 132 | 42 | WG exists — individual drafts could target it |
| Human-agent interaction | 29 | 5 | WG exists — individual drafts could target it |
| ML traffic mgmt | 75 | 4 | WG exists — individual drafts could target it |
| Model serving/inference | 41 | 1 | WG exists — individual drafts could target it |
| Other AI/agent | 24 | 10 | WG exists — individual drafts could target it |
| Policy/governance | 91 | 18 | WG exists — individual drafts could target it |

## Recommended Submission Targets

For each category, the best WG to submit new work to.

| Category | Best WG | Alternatives |
|:---------|:--------|:-------------|
| Data formats/interop | **aipref** | lamps(6), lake(4) |
| Agent identity/auth | **lake** | emu(6), anima(3) |
| A2A protocols | **idr** | lake(2), lisp(1) |
| Autonomous netops | **anima** | dnsop(2), lisp(1) |
| Policy/governance | **aipref** | lamps(2), dnsop(2) |
| Agent discovery/reg | **lake** | roll(1), pim(1) |
| Other AI/agent | **tsv** | httpbis(2), tls(2) |
| ML traffic mgmt | **nmrg** | tsv(1), aipref(1) |
| Human-agent interaction | **aipref** | nmrg(1), vwrap(1) |
| AI safety/alignment | **aipref** | sml(1) |
| Model serving/inference | **nmrg** | - |
66	scripts/backfill-wg-names.py	Normal file
@@ -0,0 +1,66 @@
#!/usr/bin/env python3
"""Backfill working group names by resolving group_uri from Datatracker API."""

import sqlite3
import time

import httpx

DB_PATH = "data/drafts.db"

conn = sqlite3.connect(DB_PATH)
conn.row_factory = sqlite3.Row

# Get distinct group_uris that don't have a group name yet
rows = conn.execute("""
    SELECT DISTINCT group_uri FROM drafts
    WHERE group_uri IS NOT NULL AND group_uri != ''
      AND ("group" IS NULL OR "group" = '')
""").fetchall()

uris = [r["group_uri"] for r in rows]
print(f"Resolving {len(uris)} unique group URIs...")

client = httpx.Client(timeout=30, follow_redirects=True)
resolved = {}

for uri in uris:
    try:
        resp = client.get(f"https://datatracker.ietf.org{uri}", params={"format": "json"})
        resp.raise_for_status()
        data = resp.json()
        acronym = data.get("acronym", "")
        name = data.get("name", "")
        resolved[uri] = acronym or name or ""
        print(f"  {uri} -> {resolved[uri]} ({name})")
        time.sleep(0.3)
    except Exception as e:
        print(f"  {uri} -> ERROR: {e}")
        resolved[uri] = ""

client.close()

# Update the database
for uri, group_name in resolved.items():
    if group_name:
        conn.execute(
            'UPDATE drafts SET "group" = ? WHERE group_uri = ?',
            (group_name, uri),
        )

conn.commit()

# Show summary
rows = conn.execute("""
    SELECT "group", COUNT(*) as cnt FROM drafts
    WHERE "group" IS NOT NULL AND "group" != ''
    GROUP BY "group" ORDER BY cnt DESC
""").fetchall()

print(f"\nWorking groups resolved ({len(rows)} groups):")
for r in rows:
    print(f"  {r[0]:30s} {r[1]} drafts")

total = conn.execute("SELECT COUNT(*) FROM drafts WHERE \"group\" IS NOT NULL AND \"group\" != ''").fetchone()[0]
none_count = conn.execute("SELECT COUNT(*) FROM drafts WHERE \"group\" IS NULL OR \"group\" = ''").fetchone()[0]
print(f"\nTotal with WG: {total}, individual/unresolved: {none_count}")
conn.close()
scripts/classify-unrated.py (new file, 39 lines)
#!/usr/bin/env python3
"""Classify unrated drafts using Ollama two-stage filter."""

import sqlite3
import sys

sys.path.insert(0, "src")

from ietf_analyzer.classifier import Classifier
from ietf_analyzer.config import Config

cfg = Config.load()
conn = sqlite3.connect(cfg.db_path)
conn.row_factory = sqlite3.Row

# Get unrated drafts
rows = conn.execute("""
    SELECT name, title, abstract, source FROM drafts
    WHERE name NOT IN (SELECT draft_name FROM ratings)
    ORDER BY source, name
""").fetchall()

drafts = [dict(r) for r in rows]
print(f"Classifying {len(drafts)} unrated drafts...\n")

with Classifier(cfg) as clf:
    relevant, irrelevant = clf.classify_batch(drafts, verbose=True)

print(f"\n--- RELEVANT ({len(relevant)}) ---")
for d in relevant:
    print(f"  [{d['source']}] {d['name']}")
    print(f"    {d['title'][:100]}")

print(f"\n--- IRRELEVANT ({len(irrelevant)}) ---")
for d in irrelevant:
    print(f"  [{d['source']}] {d['name']}")
    print(f"    {d['title'][:100]}")

print(f"\nSummary: {len(relevant)} relevant, {len(irrelevant)} irrelevant out of {len(drafts)}")
conn.close()
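The `NOT IN` subquery above is what keeps the script incremental: only drafts without a row in `ratings` get classified on a rerun. A minimal, self-contained illustration of that selection, with the schema reduced to the columns involved (table contents here are made up for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drafts (name TEXT PRIMARY KEY, title TEXT);
CREATE TABLE ratings (draft_name TEXT PRIMARY KEY);
INSERT INTO drafts VALUES ('draft-a', 'A'), ('draft-b', 'B');
INSERT INTO ratings VALUES ('draft-a');  -- draft-a is already rated
""")

# Only draft-b lacks a rating, so only it comes back
unrated = [r[0] for r in conn.execute(
    "SELECT name FROM drafts WHERE name NOT IN (SELECT draft_name FROM ratings)"
)]
```

One caveat with `NOT IN` generally: if the subquery can yield NULLs, no rows match; here `draft_name` is a primary key, so that cannot happen.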
scripts/compare-classifiers.py (new file, 86 lines)
#!/usr/bin/env python3
"""Compare Ollama classifier vs Claude ratings to find disagreements."""

import sqlite3
import sys

sys.path.insert(0, "src")

from ietf_analyzer.classifier import Classifier
from ietf_analyzer.config import Config

cfg = Config.load()
conn = sqlite3.connect(cfg.db_path)
conn.row_factory = sqlite3.Row

# Get all rated drafts with their Claude ratings
rows = conn.execute("""
    SELECT d.name, d.title, d.abstract, r.relevance, r.false_positive,
           r.novelty, r.maturity, r.overlap, r.momentum,
           (r.novelty + r.maturity + (5 - r.overlap) + r.momentum + r.relevance) / 5.0 as composite
    FROM drafts d JOIN ratings r ON d.name = r.draft_name
    WHERE d.abstract IS NOT NULL AND d.abstract != ''
    ORDER BY d.name
""").fetchall()

print(f"Comparing Ollama classifier vs Claude ratings on {len(rows)} drafts...\n")

with Classifier(cfg) as clf:
    agree = 0
    disagree_ollama_yes_claude_no = []  # Ollama says relevant, Claude says FP
    disagree_ollama_no_claude_yes = []  # Ollama says irrelevant, Claude says relevant

    for i, r in enumerate(rows):
        is_rel, sim, method = clf.classify(r["title"], r["abstract"])

        # Claude's view: false_positive=1 OR relevance<=2 means "not really relevant"
        claude_relevant = not r["false_positive"] and r["relevance"] >= 3

        if is_rel == claude_relevant:
            agree += 1
        elif is_rel and not claude_relevant:
            disagree_ollama_yes_claude_no.append({
                "name": r["name"], "title": r["title"][:60],
                "sim": sim, "method": method,
                "relevance": r["relevance"], "fp": r["false_positive"],
                "composite": r["composite"],
            })
        else:
            disagree_ollama_no_claude_yes.append({
                "name": r["name"], "title": r["title"][:60],
                "sim": sim, "method": method,
                "relevance": r["relevance"], "fp": r["false_positive"],
                "composite": r["composite"],
            })

        if (i + 1) % 50 == 0:
            print(f"  Processed {i+1}/{len(rows)}...")

print(f"\n{'='*70}")
print(f"AGREEMENT: {agree}/{len(rows)} ({100*agree/len(rows):.1f}%)")
print(f"{'='*70}")

print(f"\nOllama=RELEVANT but Claude=NOT relevant ({len(disagree_ollama_yes_claude_no)}):")
print("  (These are cases where Ollama wastes Claude tokens on irrelevant drafts)")
for d in sorted(disagree_ollama_yes_claude_no, key=lambda x: x["sim"], reverse=True)[:15]:
    fp_label = " [FP]" if d["fp"] else ""
    print(f"  sim={d['sim']:.3f} ({d['method']:18s}) rel={d['relevance']}{fp_label} | {d['name']}")
    print(f"    {d['title']}")

print(f"\nOllama=IRRELEVANT but Claude=RELEVANT ({len(disagree_ollama_no_claude_yes)}):")
print("  (These are cases where Ollama would have incorrectly filtered out good drafts)")
for d in sorted(disagree_ollama_no_claude_yes, key=lambda x: x["relevance"], reverse=True)[:15]:
    print(f"  sim={d['sim']:.3f} ({d['method']:18s}) rel={d['relevance']} comp={d['composite']:.1f} | {d['name']}")
    print(f"    {d['title']}")

# Summary stats
total_fp_by_claude = sum(1 for r in rows if r["false_positive"] or r["relevance"] <= 2)
total_relevant_by_claude = len(rows) - total_fp_by_claude
print(f"\n{'='*70}")
print(f"Claude thinks: {total_relevant_by_claude} relevant, {total_fp_by_claude} not relevant")
print(f"Ollama would let through: {agree + len(disagree_ollama_yes_claude_no) - len(disagree_ollama_no_claude_yes)} (saves {len(disagree_ollama_no_claude_yes) - len(disagree_ollama_yes_claude_no)} Claude calls)")
print(f"\nToken savings if Ollama pre-filters:")
print(f"  Correctly rejected: {agree - total_relevant_by_claude + len(rows) - agree - len(disagree_ollama_yes_claude_no)} drafts")
print(f"  Incorrectly rejected (missed): {len(disagree_ollama_no_claude_yes)} drafts")
print(f"  Incorrectly passed (wasted): {len(disagree_ollama_yes_claude_no)} drafts")

conn.close()
scripts/download-relevant-text.py (new file, 65 lines)
#!/usr/bin/env python3
"""Download full text for the 9 classifier-relevant unrated drafts."""

import sqlite3
import time
import sys

sys.path.insert(0, "src")

import httpx
from ietf_analyzer.config import Config

cfg = Config.load()
conn = sqlite3.connect(cfg.db_path)
conn.row_factory = sqlite3.Row

# The 9 relevant drafts from classifier
relevant_names = [
    "draft-bondar-wca",
    "draft-latour-pre-registration",
    "draft-li-trustworthy-routing-discovery",
    "draft-scrm-aiproto-usecases",
    "draft-song-dmsc-problem-statement",
    "draft-wiethuechter-drip-det-moc",
    "draft-wiethuechter-drip-det-tada",
    "draft-zzn-dvs",
    "w3c-cuap",
]

client = httpx.Client(timeout=30, follow_redirects=True)

for name in relevant_names:
    row = conn.execute(
        "SELECT name, rev, source, source_url, full_text FROM drafts WHERE name=?", (name,)
    ).fetchone()
    if not row:
        print(f"  SKIP {name}: not in DB")
        continue
    if row["full_text"]:
        print(f"  SKIP {name}: already has text")
        continue

    if row["source"] == "w3c":
        url = row["source_url"] or ""
        if not url:
            print(f"  SKIP {name}: no source_url for W3C doc")
            continue
    else:
        rev = row["rev"] or "00"
        url = f"https://www.ietf.org/archive/id/{name}-{rev}.txt"

    print(f"  Fetching {name} from {url}...")
    try:
        resp = client.get(url)
        if resp.status_code == 200:
            text = resp.text[:500000]  # cap at 500K
            conn.execute("UPDATE drafts SET full_text=? WHERE name=?", (text, name))
            conn.commit()
            print(f"    OK ({len(text)} chars)")
        else:
            print(f"    FAIL: HTTP {resp.status_code}")
    except Exception as e:
        print(f"    ERROR: {e}")
    time.sleep(0.5)

client.close()
conn.close()
print("\nDone.")
scripts/run-webui.sh (new executable file, 8 lines)
#!/usr/bin/env bash
# Start the IETF Draft Analyzer Web Dashboard
#
# Usage:
#   ./scripts/run-webui.sh          # Production (admin disabled)
#   ./scripts/run-webui.sh --dev    # Development (admin enabled)
cd "$(dirname "$0")/.."
python src/webui/app.py "$@"
src/ietf_analyzer/classifier.py (new file, 182 lines)
"""Local AI-relevance classifier using Ollama.

Two-stage filter to avoid spending Claude tokens on irrelevant drafts:
1. Embedding similarity — fast cosine check against a reference description
2. Chat classification — small local model for borderline cases

Both stages run locally via Ollama (zero cost).
"""

from __future__ import annotations

import numpy as np
import ollama as ollama_lib
from rich.console import Console

from .config import Config

console = Console()

# Reference description of what we're looking for.
# Embedding of this text is compared against each draft's abstract.
REFERENCE_DESCRIPTION = """
AI agent protocols, autonomous agent communication, agent-to-agent interaction,
agent identity and authentication, agent authorization, agent discovery,
large language model integration with network protocols, agentic systems,
machine learning for network operations, AI safety in networked systems,
model context protocol, multi-agent coordination, agent task delegation,
generative AI infrastructure, intelligent network automation,
trustworthy AI systems, AI governance in standards.
"""

# Thresholds for the two-stage filter (calibrated against 434 drafts + 73 FPs)
# TP avg similarity: 0.685, FP avg: 0.598
SIMILARITY_ACCEPT = 0.72  # Above this: definitely relevant, skip chat
SIMILARITY_REJECT = 0.50  # Below this: definitely irrelevant, skip chat
# Between REJECT and ACCEPT: borderline, use chat model to decide

CLASSIFY_PROMPT = """\
You are classifying IETF Internet-Drafts for an AI/agent standards tracker.

A draft is RELEVANT if it relates to ANY of these topics:
- AI agents, autonomous agents, multi-agent systems
- Agent identity, authentication, authorization, discovery
- Agent-to-agent (A2A) communication protocols
- Large language models (LLMs), generative AI
- Machine learning in network operations
- AI safety, alignment, trustworthiness
- Model Context Protocol (MCP), agentic workflows
- OAuth/JWT/credentials for agents or AI systems
- Autonomous network operations using AI
- Intelligent network management or traffic handling

A draft is NOT relevant if it only covers:
- Pure cryptography without AI/agent context
- General networking protocols (BGP, DNS, TLS) without AI
- Email, HTTP, or web standards without AI/agent features

Title: {title}

Abstract: {abstract}

Is this draft relevant to AI agents or related topics? Answer ONLY "yes" or "no"."""


class Classifier:
    """Classify drafts as AI-relevant using local Ollama models."""

    def __init__(self, config: Config | None = None):
        self.config = config or Config.load()
        self.client = ollama_lib.Client(host=self.config.ollama_url)
        self._ref_embedding: np.ndarray | None = None

    def close(self) -> None:
        if hasattr(self.client, '_client'):
            self.client._client.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()

    def _get_reference_embedding(self) -> np.ndarray:
        """Get (cached) embedding of the reference AI description."""
        if self._ref_embedding is None:
            resp = self.client.embed(
                model=self.config.ollama_embed_model,
                input=REFERENCE_DESCRIPTION.strip(),
            )
            self._ref_embedding = np.array(resp["embeddings"][0], dtype=np.float32)
        return self._ref_embedding

    def _embed(self, text: str) -> np.ndarray:
        """Embed a text string."""
        resp = self.client.embed(
            model=self.config.ollama_embed_model,
            input=text[:8000],
        )
        return np.array(resp["embeddings"][0], dtype=np.float32)

    def _cosine_similarity(self, a: np.ndarray, b: np.ndarray) -> float:
        dot = np.dot(a, b)
        norm = np.linalg.norm(a) * np.linalg.norm(b)
        return float(dot / norm) if norm > 0 else 0.0

    def _chat_classify(self, title: str, abstract: str) -> bool:
        """Ask local chat model whether a draft is AI-related."""
        prompt = CLASSIFY_PROMPT.format(title=title, abstract=abstract[:2000])
        try:
            resp = self.client.chat(
                model=self.config.ollama_classify_model,
                messages=[{"role": "user", "content": prompt}],
                options={"temperature": 0.0, "num_predict": 10},
            )
            answer = resp["message"]["content"].strip().lower()
            return answer.startswith("yes")
        except Exception as e:
            console.print(f"[yellow]Chat classify failed: {e}, defaulting to relevant[/yellow]")
            return True  # err on the side of inclusion

    def classify(self, title: str, abstract: str) -> tuple[bool, float, str]:
        """Classify a draft as AI-relevant.

        Returns:
            (is_relevant, similarity_score, method)
            method is one of: "embedding_accept", "embedding_reject", "chat_yes", "chat_no"
        """
        text = f"{title}\n{abstract}"
        ref = self._get_reference_embedding()
        emb = self._embed(text)
        sim = self._cosine_similarity(emb, ref)

        if sim >= SIMILARITY_ACCEPT:
            return True, sim, "embedding_accept"
        if sim <= SIMILARITY_REJECT:
            return False, sim, "embedding_reject"

        # Borderline — ask chat model
        is_relevant = self._chat_classify(title, abstract)
        method = "chat_yes" if is_relevant else "chat_no"
        return is_relevant, sim, method

    def classify_batch(
        self, drafts: list[dict], verbose: bool = True
    ) -> tuple[list[dict], list[dict]]:
        """Classify a batch of drafts.

        Args:
            drafts: list of dicts with at least 'name', 'title', 'abstract' keys

        Returns:
            (relevant, irrelevant) — two lists of draft dicts
        """
        relevant = []
        irrelevant = []
        stats = {"embedding_accept": 0, "embedding_reject": 0, "chat_yes": 0, "chat_no": 0}

        for i, d in enumerate(drafts):
            is_rel, sim, method = self.classify(
                d.get("title", ""), d.get("abstract", "")
            )
            stats[method] += 1

            if verbose and (i + 1) % 10 == 0:
                console.print(f"  Classified {i + 1}/{len(drafts)}...")

            if is_rel:
                relevant.append(d)
            else:
                irrelevant.append(d)

        if verbose:
            console.print(
                f"\n  [green]Relevant: {len(relevant)}[/green] "
                f"[red]Irrelevant: {len(irrelevant)}[/red]\n"
                f"  Embedding accept: {stats['embedding_accept']} "
                f" Embedding reject: {stats['embedding_reject']}\n"
                f"  Chat yes: {stats['chat_yes']} "
                f" Chat no: {stats['chat_no']}"
            )

        return relevant, irrelevant
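The decision logic in `classify` can be isolated from Ollama entirely, which makes the threshold behavior easy to test. A sketch of that routing, with a stand-in callable for the chat stage (the `route` function and `chat_fallback` parameter are illustrative, not part of the module):

```python
SIMILARITY_ACCEPT = 0.72  # mirrors the module constants above
SIMILARITY_REJECT = 0.50

def route(similarity: float, chat_fallback) -> tuple[bool, str]:
    """Return (is_relevant, method) for a given embedding similarity."""
    if similarity >= SIMILARITY_ACCEPT:
        return True, "embedding_accept"
    if similarity <= SIMILARITY_REJECT:
        return False, "embedding_reject"
    # Borderline band: only here do we pay for the slower chat model
    is_relevant = chat_fallback()
    return is_relevant, "chat_yes" if is_relevant else "chat_no"
```

Given the calibration note above (TP average 0.685, FP average 0.598), most drafts land in the borderline band and the reject threshold mainly prunes clearly off-topic material.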
@@ -297,8 +297,9 @@ class Database:
     def upsert_draft(self, draft: Draft) -> None:
         self.conn.execute(
             """INSERT INTO drafts (name, rev, title, abstract, time, dt_id, pages, words,
-            "group", group_uri, expires, ad, shepherd, states, full_text, categories, tags, fetched_at)
-            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+            "group", group_uri, expires, ad, shepherd, states, full_text, categories, tags, fetched_at,
+            source, source_id, source_url, doc_status)
+            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
             ON CONFLICT(name) DO UPDATE SET
                 rev=excluded.rev, title=excluded.title, abstract=excluded.abstract,
                 time=excluded.time, dt_id=excluded.dt_id, pages=excluded.pages,
@@ -307,7 +308,9 @@ class Database:
                 states=excluded.states,
                 full_text=COALESCE(excluded.full_text, full_text),
                 categories=excluded.categories, tags=excluded.tags,
-                fetched_at=excluded.fetched_at
+                fetched_at=excluded.fetched_at,
+                source=excluded.source, source_id=excluded.source_id,
+                source_url=excluded.source_url, doc_status=excluded.doc_status
             """,
             (
                 draft.name, draft.rev, draft.title, draft.abstract, draft.time,
@@ -316,6 +319,7 @@ class Database:
                 json.dumps(draft.states), draft.full_text,
                 json.dumps(draft.categories), json.dumps(draft.tags),
                 draft.fetched_at or datetime.now(timezone.utc).isoformat(),
+                draft.source, draft.source_id, draft.source_url, draft.doc_status,
             ),
         )
         self.conn.commit()
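The `COALESCE(excluded.full_text, full_text)` clause in this upsert is what lets a metadata-only refresh run without clobbering text that was already downloaded: if the incoming row carries NULL, the stored value wins. A minimal demonstration with a reduced schema (table contents are made up for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drafts (name TEXT PRIMARY KEY, rev TEXT, full_text TEXT)")

# Initial fetch stored the full text
conn.execute("INSERT INTO drafts VALUES ('draft-x', '00', 'long body')")

# A later refresh carries NULL full_text; COALESCE keeps the stored copy
conn.execute(
    """INSERT INTO drafts VALUES ('draft-x', '01', NULL)
       ON CONFLICT(name) DO UPDATE SET
           rev=excluded.rev,
           full_text=COALESCE(excluded.full_text, full_text)"""
)
row = conn.execute("SELECT rev, full_text FROM drafts WHERE name='draft-x'").fetchone()
```

Inside the `DO UPDATE` clause, `excluded.*` refers to the row that failed to insert, while bare column names refer to the existing row, which is exactly the asymmetry COALESCE exploits here.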
src/webui/analytics.py (new file, 244 lines)
"""Lightweight, GDPR-compliant analytics using SQLite.

No cookies, no personal data, no consent needed.
Visitor uniqueness is estimated via daily-salted IP hash (not stored raw).
Data lives in a separate analytics.db to keep the main DB clean.
"""

from __future__ import annotations

import hashlib
import sqlite3
from datetime import date, timedelta
from pathlib import Path
from urllib.parse import urlparse

from flask import Flask, request, g

_SCHEMA = """
CREATE TABLE IF NOT EXISTS page_views (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    ts TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%S', 'now')),
    date TEXT NOT NULL DEFAULT (strftime('%Y-%m-%d', 'now')),
    path TEXT NOT NULL,
    referrer TEXT,
    visitor TEXT,
    ua_type TEXT
);

CREATE INDEX IF NOT EXISTS idx_pv_date ON page_views(date);
CREATE INDEX IF NOT EXISTS idx_pv_path ON page_views(path);

CREATE TABLE IF NOT EXISTS downloads (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    ts TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%S', 'now')),
    date TEXT NOT NULL DEFAULT (strftime('%Y-%m-%d', 'now')),
    file_type TEXT NOT NULL,
    visitor TEXT
);

CREATE INDEX IF NOT EXISTS idx_dl_date ON downloads(date);
"""

# Daily salt rotates so yesterday's hashes can't be correlated with today's
_daily_salt: tuple[str, str] = ("", "")


def _get_salt() -> str:
    global _daily_salt
    today = date.today().isoformat()
    if _daily_salt[0] != today:
        _daily_salt = (today, hashlib.sha256(f"ietf-analytics-{today}".encode()).hexdigest()[:16])
    return _daily_salt[1]


def _hash_visitor(ip: str) -> str:
    """Create a daily-rotating hash from IP. Cannot be reversed or correlated across days."""
    salt = _get_salt()
    return hashlib.sha256(f"{salt}:{ip}".encode()).hexdigest()[:12]


def _classify_ua(ua: str) -> str:
    """Rough bot/browser classification."""
    ua_lower = ua.lower()
    if any(b in ua_lower for b in ("bot", "spider", "crawl", "slurp", "wget", "curl", "python-requests")):
        return "bot"
    if "mobile" in ua_lower:
        return "mobile"
    return "browser"


def _get_analytics_db() -> sqlite3.Connection:
    """Get or create the analytics DB connection for this request."""
    if "analytics_db" not in g:
        # The before_request hook stores the resolved path on g; fall back to
        # a location relative to this file otherwise.
        if hasattr(g, "_analytics_db_path"):
            db_path = g._analytics_db_path
        else:
            db_path = Path(__file__).resolve().parent.parent.parent / "data" / "analytics.db"
        db_path.parent.mkdir(parents=True, exist_ok=True)
        conn = sqlite3.connect(str(db_path))
        conn.row_factory = sqlite3.Row
        conn.executescript(_SCHEMA)
        g.analytics_db = conn
    return g.analytics_db


# Paths to skip (static assets, API calls, etc.)
_SKIP_PREFIXES = ("/static/", "/api/", "/favicon", "/robots.txt", "/admin/")


def init_analytics(app: Flask, db_path: str | None = None):
    """Register analytics hooks on the Flask app."""

    _resolved_db_path = Path(db_path) if db_path else (
        Path(app.root_path).parent.parent / "data" / "analytics.db"
    )

    @app.before_request
    def track_pageview():
        path = request.path

        # Skip static/API/admin routes
        if any(path.startswith(p) for p in _SKIP_PREFIXES):
            return

        g._analytics_db_path = _resolved_db_path

        try:
            conn = _get_analytics_db()
            ip = request.remote_addr or "unknown"
            visitor = _hash_visitor(ip)
            ua = request.headers.get("User-Agent", "")
            ua_type = _classify_ua(ua)

            # Skip bots from page view counts (still track downloads)
            if ua_type == "bot" and path != "/export/obsidian":
                return

            referrer = request.headers.get("Referer", "")
            # Only keep the domain of referrer
            if referrer:
                try:
                    parsed = urlparse(referrer)
                    referrer = parsed.netloc or ""
                except Exception:
                    referrer = ""

            # Track downloads separately
            if path == "/export/obsidian":
                conn.execute(
                    "INSERT INTO downloads (file_type, visitor) VALUES (?, ?)",
                    ("obsidian", visitor),
                )
                conn.commit()

            conn.execute(
                "INSERT INTO page_views (path, referrer, visitor, ua_type) VALUES (?, ?, ?, ?)",
                (path, referrer, visitor, ua_type),
            )
            conn.commit()
        except Exception:
            pass  # Analytics should never break the app

    @app.teardown_appcontext
    def close_analytics_db(exception=None):
        conn = g.pop("analytics_db", None)
        if conn is not None:
            conn.close()


def get_analytics_data(db_path: str | Path) -> dict:
    """Query analytics data for the dashboard. Returns dicts ready for rendering."""
    conn = sqlite3.connect(str(db_path))
    conn.row_factory = sqlite3.Row
    conn.executescript(_SCHEMA)

    today = date.today()
    week_ago = (today - timedelta(days=7)).isoformat()
    month_ago = (today - timedelta(days=30)).isoformat()

    # --- Overall stats ---
    total_views = conn.execute("SELECT COUNT(*) FROM page_views").fetchone()[0]
    total_visitors = conn.execute("SELECT COUNT(DISTINCT visitor || date) FROM page_views").fetchone()[0]
    total_downloads = conn.execute("SELECT COUNT(*) FROM downloads").fetchone()[0]

    today_views = conn.execute(
        "SELECT COUNT(*) FROM page_views WHERE date = ?", (today.isoformat(),)
    ).fetchone()[0]
    today_visitors = conn.execute(
        "SELECT COUNT(DISTINCT visitor) FROM page_views WHERE date = ?", (today.isoformat(),)
    ).fetchone()[0]

    week_views = conn.execute(
        "SELECT COUNT(*) FROM page_views WHERE date >= ?", (week_ago,)
    ).fetchone()[0]
    month_views = conn.execute(
        "SELECT COUNT(*) FROM page_views WHERE date >= ?", (month_ago,)
    ).fetchone()[0]

    # --- Daily views (last 30 days) ---
    daily_rows = conn.execute(
        "SELECT date, COUNT(*) as views, COUNT(DISTINCT visitor) as visitors "
        "FROM page_views WHERE date >= ? GROUP BY date ORDER BY date",
        (month_ago,),
    ).fetchall()
    daily = {
        "dates": [r["date"] for r in daily_rows],
        "views": [r["views"] for r in daily_rows],
        "visitors": [r["visitors"] for r in daily_rows],
    }

    # --- Top pages (last 30 days) ---
    page_rows = conn.execute(
        "SELECT path, COUNT(*) as views, COUNT(DISTINCT visitor) as visitors "
        "FROM page_views WHERE date >= ? GROUP BY path ORDER BY views DESC LIMIT 20",
        (month_ago,),
    ).fetchall()
    top_pages = [{"path": r["path"], "views": r["views"], "visitors": r["visitors"]} for r in page_rows]

    # --- Top referrers (last 30 days) ---
    ref_rows = conn.execute(
        "SELECT referrer, COUNT(*) as count FROM page_views "
        "WHERE date >= ? AND referrer != '' GROUP BY referrer ORDER BY count DESC LIMIT 15",
        (month_ago,),
    ).fetchall()
    top_referrers = [{"referrer": r["referrer"], "count": r["count"]} for r in ref_rows]

    # --- Downloads over time ---
    dl_rows = conn.execute(
        "SELECT date, COUNT(*) as count FROM downloads GROUP BY date ORDER BY date"
    ).fetchall()
    downloads_daily = {
        "dates": [r["date"] for r in dl_rows],
        "counts": [r["count"] for r in dl_rows],
    }

    # --- Hourly pattern (last 7 days) ---
    hourly_rows = conn.execute(
        "SELECT CAST(strftime('%H', ts) AS INTEGER) as hour, COUNT(*) as views "
        "FROM page_views WHERE date >= ? GROUP BY hour ORDER BY hour",
        (week_ago,),
    ).fetchall()
    hourly = {r["hour"]: r["views"] for r in hourly_rows}
    hourly_full = {"hours": list(range(24)), "views": [hourly.get(h, 0) for h in range(24)]}

    conn.close()

    return {
        "stats": {
            "total_views": total_views,
            "total_visitors": total_visitors,
            "total_downloads": total_downloads,
            "today_views": today_views,
            "today_visitors": today_visitors,
            "week_views": week_views,
            "month_views": month_views,
        },
        "daily": daily,
        "top_pages": top_pages,
        "top_referrers": top_referrers,
        "downloads_daily": downloads_daily,
        "hourly": hourly_full,
    }
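The privacy property rests on the salt rotation: the same IP hashes identically within a day (so daily uniques can be counted) but differently across days. A condensed, standalone version of the `_get_salt` plus `_hash_visitor` pipeline, with the day passed in explicitly so it can be exercised directly (the function name `daily_visitor_hash` is illustrative):

```python
import hashlib

def daily_visitor_hash(ip: str, day: str) -> str:
    """Daily-salted, truncated SHA-256 of an IP; the raw IP is never stored."""
    salt = hashlib.sha256(f"ietf-analytics-{day}".encode()).hexdigest()[:16]
    return hashlib.sha256(f"{salt}:{ip}".encode()).hexdigest()[:12]

same_day_a = daily_visitor_hash("203.0.113.7", "2025-01-01")
same_day_b = daily_visitor_hash("203.0.113.7", "2025-01-01")
next_day = daily_visitor_hash("203.0.113.7", "2025-01-02")
```

Truncating to 12 hex characters keeps collisions rare at this traffic scale while making the stored value useless as an identifier outside its day.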
src/webui/auth.py (new file, 55 lines)
"""Admin authentication with two run modes.

Production (default):
    python src/webui/app.py
    All admin routes return 404. No way to access private features.

Development:
    python src/webui/app.py --dev
    Every request is auto-authenticated as admin. No login needed.

The mode is set once at startup and cannot be changed at runtime.
"""

from __future__ import annotations

from functools import wraps

from flask import abort, g

# Module-level flag set by init_auth()
_dev_mode: bool = False
_initialized: bool = False


def is_admin() -> bool:
    """Check if the current request has admin access."""
    return _dev_mode


def admin_required(f):
    """Decorator: returns 404 for non-admin users so routes stay hidden."""
    @wraps(f)
    def decorated(*args, **kwargs):
        if not is_admin():
            abort(404)
        return f(*args, **kwargs)
    return decorated


def init_auth(app, dev: bool = False):
    """Set the auth mode and register Flask hooks (once only)."""
    global _dev_mode, _initialized
    _dev_mode = dev

    if _initialized:
        return
    _initialized = True

    @app.before_request
    def set_admin_flag():
        g.is_admin = is_admin()

    @app.context_processor
    def inject_admin():
        return {"is_admin": g.get("is_admin", False)}
src/webui/obsidian_export.py (new file, 508 lines)
@@ -0,0 +1,508 @@
"""Export research data as an Obsidian-compatible vault (ZIP).

Generates interlinked markdown files with YAML frontmatter,
[[wikilinks]], #tags, and Mermaid diagrams that Obsidian renders natively.
"""

from __future__ import annotations

import io
import zipfile
from collections import Counter, defaultdict
from datetime import date

from ietf_analyzer.db import Database

from webui.data import _extract_month


def _safe_filename(name: str) -> str:
    """Sanitize a string for use as a filename."""
    return name.replace("/", "-").replace("\\", "-").replace(":", "-").replace('"', "")


def _score_bar(val: float, max_val: float = 5.0) -> str:
    """Render a simple text progress bar."""
    filled = round(val / max_val * 10)
    return "`" + "\u2588" * filled + "\u2591" * (10 - filled) + f"` {val}/{max_val}"


def _mermaid_pie(title: str, data: dict[str, int], limit: int = 12) -> str:
    """Generate a Mermaid pie chart."""
    items = list(data.items())[:limit]
    if not items:
        return ""
    lines = [f'```mermaid\npie title {title}']
    for label, count in items:
        safe_label = label.replace('"', "'")
        lines.append(f'  "{safe_label}" : {count}')
    lines.append("```")
    return "\n".join(lines)


def _mermaid_bar(title: str, data: dict[str, float], limit: int = 15) -> str:
    """Generate a Mermaid xychart bar chart."""
    items = list(data.items())[:limit]
    if not items:
        return ""
    labels = [f'"{k[:20]}"' for k, _ in items]
    values = [str(round(v, 1)) for _, v in items]
    return f"""```mermaid
xychart-beta
title "{title}"
x-axis [{", ".join(labels)}]
y-axis "Score"
bar [{", ".join(values)}]
```"""


def _mermaid_timeline_chart(monthly: dict[str, int]) -> str:
    """Generate a Mermaid xychart for submissions over time."""
    if len(monthly) < 2:
        return ""
    months = sorted(monthly.keys())
    # Show every 3rd label to avoid clutter
    labels = []
    for i, m in enumerate(months):
        if i % 3 == 0:
            labels.append(f'"{m}"')
        else:
            labels.append('" "')
    values = [str(monthly[m]) for m in months]
    return f"""```mermaid
xychart-beta
title "Draft Submissions Over Time"
x-axis [{", ".join(labels)}]
y-axis "Drafts"
bar [{", ".join(values)}]
```"""


def build_obsidian_vault(db: Database) -> bytes:
    """Build a ZIP file containing an Obsidian vault with all research data."""
    buf = io.BytesIO()
    prefix = "IETF-AI-Agent-Drafts"

    pairs = db.drafts_with_ratings(limit=2000)
    all_drafts_list = db.list_drafts(limit=2000, order_by="time DESC")
    draft_map = {d.name: d for d in all_drafts_list}
    all_ideas = db.all_ideas()
    all_authors = db.top_authors(limit=500)

    # Build lookup maps
    cat_counts: Counter = Counter()
    cat_drafts: dict[str, list[str]] = defaultdict(list)
    score_map: dict[str, float] = {}
    rating_map: dict[str, object] = {}

    for d, r in pairs:
        score_map[d.name] = r.composite_score
        rating_map[d.name] = r
        for cat in r.categories:
            cat_counts[cat] += 1
            cat_drafts[cat].append(d.name)

    # Monthly submission counts
    monthly: Counter = Counter()
    for d in all_drafts_list:
        monthly[_extract_month(d.time)] += 1

    # Ideas by draft
    ideas_by_draft: dict[str, list[dict]] = defaultdict(list)
    for idea in all_ideas:
        ideas_by_draft[idea.get("draft_name", "")].append(idea)

    # Author info by draft
    author_drafts: dict[str, list[str]] = defaultdict(list)
    author_info: dict[str, dict] = {}
    for name, aff, cnt, drafts in all_authors:
        author_info[name] = {"affiliation": aff or "", "draft_count": cnt, "drafts": drafts}
        for dn in drafts:
            author_drafts[dn].append(name)

    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:

        # --- Dashboard.md ---
        top_rated = sorted(pairs, key=lambda p: p[1].composite_score, reverse=True)[:15]
        top_table = "| Draft | Score | Category |\n|---|---|---|\n"
        for d, r in top_rated:
            score = r.composite_score
            cat = r.categories[0] if r.categories else ""
            top_table += f"| [[{d.name}]] | **{score:.2f}** | {cat} |\n"

        cat_pie = _mermaid_pie("Drafts by Category", dict(cat_counts.most_common(12)))
        timeline_chart = _mermaid_timeline_chart(dict(sorted(monthly.items())))

        # Score distribution as mermaid
        score_buckets: Counter = Counter()
        for _, r in pairs:
            bucket = f"{r.composite_score:.0f}"
            score_buckets[bucket] += 1
        score_dist = dict(sorted(score_buckets.items()))

        dashboard = f"""---
tags: [dashboard, ietf, ai-agents]
generated: {date.today().isoformat()}
---

# IETF AI/Agent Draft Analysis

> Automated analysis of {len(all_drafts_list)} Internet-Drafts on AI and agent topics.
> Generated by [IETF Draft Analyzer](https://github.com) on {date.today().isoformat()}.

## Key Stats

| Metric | Value |
|---|---|
| Total Drafts | **{len(all_drafts_list)}** |
| Rated Drafts | **{len(pairs)}** |
| Authors | **{len(all_authors)}** |
| Ideas Extracted | **{len(all_ideas)}** |
| Categories | **{len(cat_counts)}** |

## Categories

{cat_pie}

### Category Index

{chr(10).join(f"- [[{cat}]] ({count} drafts)" for cat, count in cat_counts.most_common())}

## Submissions Over Time

{timeline_chart}

## Top Rated Drafts

{top_table}

## Navigation

- **[[Categories/index|Categories]]** — Browse by topic
- **[[Authors/index|Authors]]** — Browse by author
- **[[Analysis/Score Distribution|Score Distribution]]** — Rating analytics
- **[[Analysis/Top Rated|Top Rated]]** — Highest-scored drafts
- **[[Analysis/Ideas Overview|Ideas]]** — Extracted technical ideas
- **[[Analysis/Glossary|Glossary]]** — Terms, abbreviations, and scoring methodology
"""
        zf.writestr(f"{prefix}/Dashboard.md", dashboard)

        # --- Individual Draft Notes ---
        for d_obj in all_drafts_list:
            name = d_obj.name
            draft = draft_map.get(name, d_obj)
            r = rating_map.get(name)
            ideas = ideas_by_draft.get(name, [])
            authors = author_drafts.get(name, [])
            month = _extract_month(draft.time)

            # Frontmatter
            fm_lines = [
                "---",
                f'title: "{(draft.title or name).replace(chr(34), chr(39))}"',
                f"date: {draft.time or 'unknown'}",
                f"rev: {draft.rev or '00'}",
            ]
            if r:
                fm_lines.append(f"score: {r.composite_score:.2f}")
                fm_lines.append(f"novelty: {r.novelty}")
                fm_lines.append(f"maturity: {r.maturity}")
                fm_lines.append(f"overlap: {r.overlap}")
                fm_lines.append(f"momentum: {r.momentum}")
                fm_lines.append(f"relevance: {r.relevance}")
                if r.categories:
                    fm_lines.append(f"categories: [{', '.join(r.categories)}]")
            if authors:
                fm_lines.append(f"authors: [{', '.join(a.replace(',', '') for a in authors)}]")
            fm_lines.append(f"tags: [draft, ietf, {month}]")
            fm_lines.append("---")
            frontmatter = "\n".join(fm_lines)

            # Body
            body = f"\n# {draft.title or name}\n\n"
            body += f"**{name}** | rev {draft.rev or '00'} | {draft.time or 'unknown'}\n\n"

            if authors:
                body += "## Authors\n\n"
                body += "\n".join(f"- [[{a}]]" for a in authors) + "\n\n"

            if r:
                body += "## Rating\n\n"
                body += f"**Composite Score: {r.composite_score:.2f}**\n\n"
                body += f"| Dimension | Score |\n|---|---|\n"
                body += f"| Novelty | {_score_bar(r.novelty)} |\n"
                body += f"| Maturity | {_score_bar(r.maturity)} |\n"
                body += f"| Overlap | {_score_bar(r.overlap)} |\n"
                body += f"| Momentum | {_score_bar(r.momentum)} |\n"
                body += f"| Relevance | {_score_bar(r.relevance)} |\n\n"
                if r.summary:
                    body += f"> {r.summary}\n\n"
                if r.categories:
                    body += "**Categories:** " + ", ".join(f"[[{c}]]" for c in r.categories) + "\n\n"

            if draft.abstract:
                body += "## Abstract\n\n"
                body += draft.abstract + "\n\n"

            if ideas:
                body += f"## Extracted Ideas ({len(ideas)})\n\n"
                for idea in ideas:
                    novelty = f" `N:{idea.get('novelty_score', '?')}`" if idea.get("novelty_score") else ""
                    itype = f" *{idea.get('type', '')}*" if idea.get("type") else ""
                    body += f"- **{idea.get('title', 'Untitled')}**{itype}{novelty}\n"
                    if idea.get("description"):
                        body += f"  {idea['description']}\n"
                body += "\n"

            body += "## Links\n\n"
            body += f"- [View on IETF Datatracker](https://datatracker.ietf.org/doc/{name}/)\n"
            if draft.rev:
                body += f"- [Read Full Text](https://www.ietf.org/archive/id/{name}-{draft.rev}.txt)\n"

            content = frontmatter + body
            zf.writestr(f"{prefix}/Drafts/{_safe_filename(name)}.md", content)

        # --- Author Notes ---
        author_index_lines = [
            "---\ntags: [index, authors]\n---\n",
            "# Authors\n\n",
            f"**{len(all_authors)}** authors contributing to AI/agent Internet-Drafts.\n\n",
            "| Author | Affiliation | Drafts |\n|---|---|---|\n",
        ]
        for name, aff, cnt, drafts in sorted(all_authors, key=lambda x: x[2], reverse=True):
            author_index_lines.append(f"| [[{name}]] | {aff or ''} | {cnt} |\n")
        zf.writestr(f"{prefix}/Authors/index.md", "".join(author_index_lines))

        for name, aff, cnt, drafts in all_authors:
            fm = f"---\ntags: [author]\naffiliation: \"{aff or ''}\"\ndraft_count: {cnt}\n---\n"
            body = f"\n# {name}\n\n"
            if aff:
                body += f"**Affiliation:** {aff}\n\n"
            body += f"## Drafts ({cnt})\n\n"
            for dn in drafts:
                d = draft_map.get(dn)
                title = d.title if d else dn
                score = score_map.get(dn, "")
                score_str = f" (score: {score:.2f})" if score else ""
                body += f"- [[{dn}|{title}]]{score_str}\n"

            # Co-authors
            coauthors: Counter = Counter()
            for dn in drafts:
                for other in author_drafts.get(dn, []):
                    if other != name:
                        coauthors[other] += 1
            if coauthors:
                body += f"\n## Co-authors\n\n"
                for co, shared in coauthors.most_common(20):
                    body += f"- [[{co}]] ({shared} shared)\n"

            zf.writestr(f"{prefix}/Authors/{_safe_filename(name)}.md", fm + body)

        # --- Category Notes ---
        cat_index_lines = [
            "---\ntags: [index, categories]\n---\n",
            "# Categories\n\n",
            _mermaid_pie("Draft Distribution", dict(cat_counts.most_common(12))),
            "\n\n",
        ]
        for cat, count in cat_counts.most_common():
            cat_index_lines.append(f"- [[{cat}]] — {count} drafts\n")
        zf.writestr(f"{prefix}/Categories/index.md", "".join(cat_index_lines))

        for cat, count in cat_counts.most_common():
            fm = f"---\ntags: [category]\ndraft_count: {count}\n---\n"
            body = f"\n# {cat}\n\n"
            body += f"**{count} drafts** in this category.\n\n"

            # Table of drafts sorted by score
            draft_names = cat_drafts[cat]
            scored = [(dn, score_map.get(dn, 0)) for dn in draft_names]
            scored.sort(key=lambda x: x[1], reverse=True)

            body += "| Draft | Score |\n|---|---|\n"
            for dn, score in scored:
                d = draft_map.get(dn)
                title = d.title[:60] if d else dn
                body += f"| [[{dn}|{title}]] | {score:.2f} |\n"

            zf.writestr(f"{prefix}/Categories/{_safe_filename(cat)}.md", fm + body)

        # --- Analysis Notes ---

        # Score Distribution
        score_lines = [
            "---\ntags: [analysis]\n---\n",
            "\n# Score Distribution\n\n",
            "Composite scores across all rated drafts (1.0–5.0 scale).\n\n",
        ]
        # Mermaid bar chart of score buckets
        buckets: dict[str, int] = defaultdict(int)
        for _, r in pairs:
            b = f"{r.composite_score:.1f}"
            buckets[b] += 1
        sorted_buckets = dict(sorted(buckets.items()))
        if sorted_buckets:
            labels = [f'"{k}"' for k in sorted_buckets.keys()]
            values = [str(v) for v in sorted_buckets.values()]
            score_lines.append(f"""```mermaid
xychart-beta
title "Score Distribution"
x-axis [{", ".join(labels)}]
y-axis "Count"
bar [{", ".join(values)}]
```\n\n""")

        # Dimension averages
        dims = {"Novelty": [], "Maturity": [], "Overlap": [], "Momentum": [], "Relevance": []}
        for _, r in pairs:
            dims["Novelty"].append(r.novelty)
            dims["Maturity"].append(r.maturity)
            dims["Overlap"].append(r.overlap)
            dims["Momentum"].append(r.momentum)
            dims["Relevance"].append(r.relevance)
        score_lines.append("## Dimension Averages\n\n")
        score_lines.append("| Dimension | Average | Min | Max |\n|---|---|---|---|\n")
        for dim, vals in dims.items():
            if vals:
                avg = sum(vals) / len(vals)
                score_lines.append(f"| {dim} | {avg:.2f} | {min(vals)} | {max(vals)} |\n")

        zf.writestr(f"{prefix}/Analysis/Score Distribution.md", "".join(score_lines))

        # Top Rated
        top_lines = [
            "---\ntags: [analysis]\n---\n",
            "\n# Top Rated Drafts\n\n",
            "Drafts ranked by composite score.\n\n",
            "| # | Draft | Score | Novelty | Maturity | Overlap | Momentum | Relevance | Category |\n",
            "|---|---|---|---|---|---|---|---|---|\n",
        ]
        for i, (d, r) in enumerate(top_rated[:30], 1):
            cat = r.categories[0] if r.categories else ""
            top_lines.append(
                f"| {i} | [[{d.name}|{(d.title or d.name)[:45]}]] | **{r.composite_score:.2f}** | "
                f"{r.novelty} | {r.maturity} | {r.overlap} | {r.momentum} | {r.relevance} | {cat} |\n"
            )
        zf.writestr(f"{prefix}/Analysis/Top Rated.md", "".join(top_lines))

        # Ideas Overview
        type_counts = Counter(i.get("type", "other") or "other" for i in all_ideas)
        ideas_lines = [
            "---\ntags: [analysis, ideas]\n---\n",
            f"\n# Extracted Ideas\n\n",
            f"**{len(all_ideas)}** technical ideas extracted from rated drafts.\n\n",
            _mermaid_pie("Ideas by Type", dict(type_counts.most_common(10))),
            "\n\n## By Type\n\n",
        ]
        for itype, count in type_counts.most_common():
            ideas_lines.append(f"- **{itype}**: {count} ideas\n")

        ideas_lines.append(f"\n## Recent Ideas\n\n")
        for idea in all_ideas[:50]:
            dn = idea.get("draft_name", "")
            novelty = f" `N:{idea.get('novelty_score')}`" if idea.get("novelty_score") else ""
            ideas_lines.append(f"- **{idea.get('title', 'Untitled')}**{novelty} — [[{dn}]]\n")
        if len(all_ideas) > 50:
            ideas_lines.append(f"\n*...and {len(all_ideas) - 50} more. See individual draft notes.*\n")

        zf.writestr(f"{prefix}/Analysis/Ideas Overview.md", "".join(ideas_lines))

        # Timeline
        timeline_lines = [
            "---\ntags: [analysis, timeline]\n---\n",
            "\n# Timeline\n\n",
            "Draft submission activity over time.\n\n",
            _mermaid_timeline_chart(dict(sorted(monthly.items()))),
            "\n\n## Monthly Counts\n\n",
            "| Month | Drafts |\n|---|---|\n",
        ]
        for m in sorted(monthly.keys()):
            timeline_lines.append(f"| {m} | {monthly[m]} |\n")
        zf.writestr(f"{prefix}/Analysis/Timeline.md", "".join(timeline_lines))

        # --- Glossary ---
        glossary = """---
tags: [reference, glossary]
---

# Glossary

Reference for all terms, abbreviations, and scoring dimensions used in this vault.

## Scoring Dimensions

Each draft is rated by Claude AI on five dimensions, scored from 1 (lowest) to 5 (highest).

| Dimension | Description |
|---|---|
| **Novelty** | How original is this draft? Does it introduce new ideas, or rehash existing approaches? High = genuinely new contribution. |
| **Maturity** | How complete and well-developed is the specification? High = detailed protocol, clear data formats, ready for implementation. Low = early sketch or position paper. |
| **Overlap** | How much does this draft duplicate existing work? High overlap (5) = very similar to other drafts. Low overlap (1) = unique in the landscape. *Note: In composite score, this is inverted (5 - overlap) so lower overlap contributes positively.* |
| **Momentum** | Is this draft gaining traction? High = active revisions, working group adoption, multiple authors/organizations. Low = single submission, no updates. |
| **Relevance** | How relevant is this draft to AI agent infrastructure? High = directly addresses agent-to-agent communication, identity, authorization. Low = tangentially related. |

## Composite Score

The **composite score** (1.0–5.0) is calculated as:

```
score = (novelty + maturity + (5 - overlap) + momentum + relevance) / 5
```

Overlap is inverted because a *lower* overlap is better (more unique).

## Score Bars

Score bars visualize ratings: `\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2591\u2591\u2591` = 3.5/5.0

- `\u2588` (filled) = earned score
- `\u2591` (empty) = remaining

## Other Terms

| Term | Meaning |
|---|---|
| **Draft / I-D** | Internet-Draft — a working document submitted to the IETF. Not yet an RFC (standard). |
| **RFC** | Request for Comments — a published IETF standard or informational document. |
| **Working Group (WG)** | An IETF group chartered to work on a specific topic (e.g., WIMSE, OAuth). |
| **Category** | Topic classification assigned by Claude during analysis (e.g., "A2A protocols", "AI safety/alignment"). A draft can belong to multiple categories. |
| **Idea** | A distinct technical concept extracted from a draft by Claude. Each idea has a type (protocol, mechanism, framework, etc.) and a novelty score. |
| **Novelty Score (N:1–5)** | Per-idea originality rating. Shown as `N:4` next to ideas. 5 = completely new concept, 1 = well-known approach. |
| **Gap** | An area identified where no existing draft adequately addresses a need in the AI agent ecosystem. |
| **Affiliation** | The organization an author is associated with (from IETF Datatracker records). |
| **Co-authorship** | Two authors who appear together on at least one draft. |
| **Datatracker** | The IETF's official system for tracking Internet-Drafts, RFCs, and working groups (datatracker.ietf.org). |
"""
        zf.writestr(f"{prefix}/Analysis/Glossary.md", glossary)

        # --- .obsidian settings for graph colors ---
        graph_json = """{
  "collapse-filter": false,
  "search": "",
  "showTags": true,
  "showAttachments": false,
  "hideUnresolved": false,
  "showOrphans": true,
  "collapse-color-groups": false,
  "colorGroups": [
    {"query": "path:Drafts", "color": {"a": 1, "rgb": 3444735}},
    {"query": "path:Authors", "color": {"a": 1, "rgb": 10092441}},
    {"query": "path:Categories", "color": {"a": 1, "rgb": 16744448}},
    {"query": "path:Analysis", "color": {"a": 1, "rgb": 2293541}}
  ],
  "collapse-display": false,
  "showArrow": true,
  "textFadeMultiplier": 0,
  "nodeSizeMultiplier": 1.2,
  "lineSizeMultiplier": 1,
  "collapse-forces": true,
  "centerStrength": 0.5,
  "repelStrength": 10,
  "linkStrength": 1,
  "linkDistance": 100
}"""
        zf.writestr(f"{prefix}/.obsidian/graph.json", graph_json)

    buf.seek(0)
    return buf.getvalue()
tests/test_obsidian_export.py (new file, 200 lines)
@@ -0,0 +1,200 @@
"""Tests for the Obsidian vault export.

If this test breaks, the export is out of sync with the data model.
Fix obsidian_export.py to match whatever changed.
"""

from __future__ import annotations

import io
import sys
import zipfile
from pathlib import Path

import pytest

_project_root = Path(__file__).resolve().parent.parent
if str(_project_root / "src") not in sys.path:
    sys.path.insert(0, str(_project_root / "src"))

from webui.obsidian_export import build_obsidian_vault


def test_vault_structure(seeded_db):
    """Vault ZIP should contain expected folders and key files."""
    data = build_obsidian_vault(seeded_db)
    assert len(data) > 0

    z = zipfile.ZipFile(io.BytesIO(data))
    names = z.namelist()

    # Key structural files must exist
    assert "IETF-AI-Agent-Drafts/Dashboard.md" in names
    assert "IETF-AI-Agent-Drafts/Authors/index.md" in names
    assert "IETF-AI-Agent-Drafts/Categories/index.md" in names
    assert "IETF-AI-Agent-Drafts/.obsidian/graph.json" in names

    # Should have analysis notes
    analysis = [n for n in names if "/Analysis/" in n]
    assert len(analysis) >= 3  # Score Distribution, Top Rated, Ideas Overview


def test_vault_has_all_drafts(seeded_db):
    """Every draft in the DB should have a corresponding note in the vault."""
    data = build_obsidian_vault(seeded_db)
    z = zipfile.ZipFile(io.BytesIO(data))
    draft_files = [n for n in z.namelist() if "/Drafts/" in n]

    # seeded_db has 5 drafts
    assert len(draft_files) == 5

    # Check each draft name appears
    draft_names = {Path(f).stem for f in draft_files}
    assert "draft-alpha-agent-comm" in draft_names
    assert "draft-gamma-agent-id" in draft_names


def test_draft_note_has_frontmatter(seeded_db):
    """Draft notes must have YAML frontmatter with score and categories."""
    data = build_obsidian_vault(seeded_db)
    z = zipfile.ZipFile(io.BytesIO(data))

    content = z.read("IETF-AI-Agent-Drafts/Drafts/draft-alpha-agent-comm.md").decode()

    # YAML frontmatter
    assert content.startswith("---")
    assert "score:" in content
    assert "novelty:" in content
    assert "maturity:" in content
    assert "categories:" in content
    assert "tags:" in content

    # No floating-point noise (e.g., 3.4000000000000004)
    import re
    long_floats = re.findall(r"\d+\.\d{4,}", content)
    assert len(long_floats) == 0, f"Unformatted floats found: {long_floats}"


def test_draft_note_has_wikilinks(seeded_db):
    """Draft notes should link to authors and categories with [[wikilinks]]."""
    data = build_obsidian_vault(seeded_db)
    z = zipfile.ZipFile(io.BytesIO(data))

    content = z.read("IETF-AI-Agent-Drafts/Drafts/draft-alpha-agent-comm.md").decode()

    # Should link to authors
    assert "[[Alice Researcher]]" in content
    assert "[[Bob Engineer]]" in content

    # Should link to categories
    assert "[[A2A protocols]]" in content


def test_draft_note_has_ideas(seeded_db):
    """Draft notes should include extracted ideas."""
    data = build_obsidian_vault(seeded_db)
    z = zipfile.ZipFile(io.BytesIO(data))

    content = z.read("IETF-AI-Agent-Drafts/Drafts/draft-alpha-agent-comm.md").decode()

    assert "Extracted Ideas" in content
    assert "Agent Handshake" in content
    assert "Capability Negotiation" in content


def test_draft_note_has_rating_bars(seeded_db):
    """Draft notes should include visual score bars."""
    data = build_obsidian_vault(seeded_db)
    z = zipfile.ZipFile(io.BytesIO(data))

    content = z.read("IETF-AI-Agent-Drafts/Drafts/draft-alpha-agent-comm.md").decode()

    # Score bars use block chars
    assert "\u2588" in content  # filled block
    assert "\u2591" in content  # empty block
    assert "/5.0" in content


def test_author_notes(seeded_db):
    """Author notes should list their drafts with wikilinks."""
    data = build_obsidian_vault(seeded_db)
    z = zipfile.ZipFile(io.BytesIO(data))

    content = z.read("IETF-AI-Agent-Drafts/Authors/Alice Researcher.md").decode()

    assert content.startswith("---")
    assert "affiliation:" in content
    assert "ExampleCorp" in content
    assert "[[draft-alpha-agent-comm" in content
    assert "[[draft-gamma-agent-id" in content


def test_category_notes(seeded_db):
    """Category notes should list drafts with scores."""
    data = build_obsidian_vault(seeded_db)
    z = zipfile.ZipFile(io.BytesIO(data))
    cat_files = [n for n in z.namelist() if "/Categories/" in n and "index" not in n]

    # seeded_db has 5 distinct categories
    assert len(cat_files) >= 4

    # Check one category note
    content = z.read("IETF-AI-Agent-Drafts/Categories/A2A protocols.md").decode()
    assert "[[draft-alpha-agent-comm" in content
    assert "draft_count:" in content


def test_dashboard_has_mermaid(seeded_db):
    """Dashboard should contain Mermaid chart blocks."""
    data = build_obsidian_vault(seeded_db)
    z = zipfile.ZipFile(io.BytesIO(data))

    content = z.read("IETF-AI-Agent-Drafts/Dashboard.md").decode()

    assert "```mermaid" in content
    assert "pie title" in content
    assert "Key Stats" in content
    assert "Total Drafts" in content


def test_vault_has_glossary(seeded_db):
    """Vault should contain a Glossary with scoring dimensions explained."""
    data = build_obsidian_vault(seeded_db)
    z = zipfile.ZipFile(io.BytesIO(data))

    assert "IETF-AI-Agent-Drafts/Analysis/Glossary.md" in z.namelist()
    content = z.read("IETF-AI-Agent-Drafts/Analysis/Glossary.md").decode()

    # All five dimensions must be explained
    for dim in ("Novelty", "Maturity", "Overlap", "Momentum", "Relevance"):
        assert dim in content, f"Glossary missing dimension: {dim}"

    assert "Composite Score" in content
    assert "Internet-Draft" in content


def test_top_rated_uses_full_names(seeded_db):
    """Top Rated table should use full dimension names, not abbreviations."""
    data = build_obsidian_vault(seeded_db)
    z = zipfile.ZipFile(io.BytesIO(data))

    content = z.read("IETF-AI-Agent-Drafts/Analysis/Top Rated.md").decode()

    assert "Novelty" in content
    assert "Maturity" in content
    assert "| Nov |" not in content  # no abbreviations


def test_vault_is_valid_zip(seeded_db):
    """The output should be a valid ZIP that can be extracted."""
    data = build_obsidian_vault(seeded_db)
    z = zipfile.ZipFile(io.BytesIO(data))

    # Should not raise
    bad = z.testzip()
    assert bad is None, f"Corrupt file in ZIP: {bad}"

    # All files should be decodable as UTF-8
    for name in z.namelist():
        if name.endswith(".md"):
            z.read(name).decode("utf-8")