Run pipeline, write Post 08, commit untracked files

Pipeline:
- Extract ideas for 38 new drafts → 462 ideas total
- Convergence analysis: 132 cross-org convergent ideas (33% rate)
- Fetch authors for 102 drafts → 709 authors (up from 403)
- Refresh gap analysis: 12 gaps across full 474-draft corpus
- Update verified counts with new totals

Post 08:
- Complete rewrite of "Agents Building the Agent Analysis" (2,953 words)
- Covers 3 phases: writing team → review cycle → fix cycle
- Meta-irony table mapping team coordination to IETF gap names
- Specific examples from dev journal (SQL injection, consent conflation, ideas mismatch)

Untracked files committed:
- scripts/: backfill-wg-names, classify-unrated, compare-classifiers, download-relevant-text, run-webui
- src/ietf_analyzer/classifier.py: two-stage Ollama classifier
- src/webui/: analytics (GDPR-compliant), auth, obsidian_export
- tests/test_obsidian_export.py (10 tests)
- data/reports/: wg-analysis, generated draft for gap #37

Housekeeping:
- .gitignore: exclude LaTeX artifacts, stale DBs, analytics.db

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 15:31:30 +01:00
parent 20c45a7eba
commit e247bfef8f
19 changed files with 2758 additions and 586 deletions

.gitignore

@@ -4,5 +4,15 @@ __pycache__/
dist/
build/
data/config.json
data/analytics.db
data/ietf_drafts.db
.claude/
.env
# LaTeX build artifacts
paper/*.aux
paper/*.log
paper/*.out
paper/*.synctex.gz
paper/*.fls
paper/*.fdb_latexmk


@@ -1,197 +1,167 @@
# Agents Building the Agent Analysis
*We used a team of AI agents to analyze, write about, and review 434 IETF Internet-Drafts on AI agents. Here is what that looked like from the inside.*
*Analysis based on IETF Datatracker data collected through March 2026. Counts and statistics reflect this snapshot.*
---
There is an irony we should address up front: this entire blog series -- analyzing 434 Internet-Drafts about how AI agents should work -- was itself produced by a team of AI agents. Twelve Claude instances across three phases, each with a distinct role, reading the same database, building on each other's output, and coordinating through a shared journal and file system.
This post is the story of that process: what worked, what broke, what surprised us, and what it reveals about the state of AI agent coordination in practice -- which, as it happens, is exactly the problem the IETF drafts are trying to solve.
## Phase 1: The Writing Team
We started with four agents, each defined in a one-page file and grounded by a shared 3,000-word team brief:
| Agent | Role | What They Did |
|-------|------|---------------|
| **Architect** | The Big Picture | Read all reports, designed the narrative arc, wrote the vision document, reviewed every post |
| **Analyst** | The Data Whisperer | Ran the pipeline on 434 drafts, executed 20+ SQL queries, produced data packages |
| **Coder** | The Feature Builder | Implemented 7 new analysis features (refs, trends, idea-overlap, WG adoption, revisions, centrality, co-occurrence) |
| **Writer** | The Storyteller | Drafted all 8 blog posts, applied 6+ revision passes |
Each agent had access to the full project codebase, a SQLite database, and the `ietf` CLI tool. They communicated through files and coordinated through a shared development journal. The team brief contained a thesis statement -- "The IETF is building the highways before the traffic lights" -- a per-post outline, and a data requirements table.
### Parallel by default
The key design decision: agents did not wait for each other when they could work in parallel. The Writer's tasks were formally blocked by the Analyst's pipeline run, but the Writer had enough existing data (260 analyzed drafts) to start drafting. Rather than sitting idle, the Writer produced first drafts of all 7 posts while waiting for updated numbers. This turned out to be the right call -- the structure and narrative mattered more than whether the draft count was 260 or 434.
The Coder and Writer worked simultaneously, their outputs feeding each other. Every feature the Coder built used zero API calls -- pure local computation via regex, SQL, SequenceMatcher, and networkx. The RFC cross-reference parser revealed that the Chinese and Western blocs build on incompatible infrastructure foundations (YANG/NETCONF vs. COSE/CBOR), with OAuth 2.0 as the only shared bedrock. The co-occurrence analysis showed safety has zero overlap with Agent Discovery and Model Serving. These zero-cost local analyses produced the most structurally revealing findings in the entire series.
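To give a flavor of that zero-API pattern, here is a minimal sketch -- with hypothetical draft text and names, not the project's actual parser -- of pulling RFC cross-references out of drafts with nothing but a regex and a counter:

```python
# Illustrative sketch (hypothetical draft text, not the project's parser):
# RFC cross-referencing as pure local computation -- a regex plus a Counter.
import re
from collections import Counter

draft_texts = {
    "draft-agent-auth": "Builds on OAuth 2.0 [RFC6749] and TLS 1.3 [RFC8446].",
    "draft-agent-yang": "Uses YANG [RFC7950] over NETCONF [RFC6241].",
    "draft-a2a-core":   "Token exchange per [RFC8693] extends [RFC6749].",
}

refs = Counter()
for text in draft_texts.values():
    # I-D citations look like [RFC6749]; normalize to "RFC 6749".
    for num in re.findall(r"\[RFC(\d{3,5})\]", text):
        refs[f"RFC {num}"] += 1

print(refs.most_common(1))  # [('RFC 6749', 2)] -- the shared OAuth bedrock
```

The real corpus yields thousands of such references, but the marginal cost per analysis is still zero.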
### The Architect shaped everything
The Architect produced fewer words than the Writer and fewer features than the Coder, but had disproportionate impact. Three contributions reshaped the output:
1. The insight that **gap severity correlates with coordination difficulty** transformed Post 4 from a list of gaps into an argument about structural dysfunction.
2. The **"two equilibria" framing** -- microservices chaos vs. layered web architecture -- gave Post 6's predictions real structural weight.
3. A **verification pass** that caught the Writer's revisions silently failing (logged as done, not actually persisted in the file).
That third point is worth dwelling on. The dev journal said "Post 1 revisions complete." The file still contained the pre-revision content. Without the Architect reading the actual output rather than trusting the status message, the error would have shipped. This is a small-scale version of the Behavior Verification gap the series identifies as critical -- and we will come back to it.
### The human who said "so what?"
The most consequential intervention in the entire project came not from an agent but from the human project lead. The series had been built around a headline number: "1,780 technical ideas extracted from the drafts." The project lead asked: what does that number actually mean?
The answer was uncomfortable. The pipeline extracts roughly 5 ideas per draft on average -- a mechanical process that produces items like "A2A Communication Paradigm" and "Agent Network Architecture." The raw count sounds impressive but is mostly scaffolding. The real signal was hiding in the cross-org overlap analysis: 96% of unique idea titles appear in exactly one draft. Only 75 show up in two or more. The fragmentation that defines the protocol landscape extends all the way down to the idea level.
This required rewriting Post 5 entirely. Its title changed from "The 1,780 Ideas That Will Shape Agent Infrastructure" to "Where 434 Drafts Converge (And Where They Don't)." The lead metric shifted from raw extraction count (impressive but hollow) to the convergence rate (honest and striking). Four agents had independently used the 1,780 figure -- the Analyst generated it, the Coder validated it, the Architect designed around it, the Writer headlined it. None questioned whether it was meaningful.
## Phase 2: The Review Cycle
After the writing team produced 8 blog posts, a vision document, 7 new analysis features, and 30 dev-journal entries, we did something that turned out to matter more than the writing itself: we sent the entire output to four specialist reviewers, each running in parallel.
| Reviewer | Lens | Issues Found |
|----------|------|-------------|
| **Statistics** | Data integrity, sampling bias, quantitative accuracy | 3 critical, 4 important, 4 minor |
| **Legal** | German/EU internet law, GDPR, EU AI Act, eIDAS 2.0 | 3 critical, 5 regulatory gaps, 5 improvements |
| **Engineering** | Code quality, security, performance, DX | 1 critical, 1 high, 5 bugs, 6 perf issues |
| **Science** | Methodology, reproducibility, related work, hedging | 2 critical, 3 high, 4 medium |
Four agents, four completely different perspectives, run simultaneously. Together they surfaced **36 distinct issues** that the writing team had missed. The findings were often surprising.
### The statistics reviewer found the numbers did not add up
The statistical audit cross-checked every quantitative claim in the blog series against the actual database using raw SQL queries. The results were sobering. The blog claimed 361 drafts; the database held 434. The blog claimed 1,780 ideas; the database held 419. The blog claimed 12 gaps; the database held 11. Composite scores were inflated by 0.05-0.10 through rounding. The "4:1 safety ratio" varied from 1.5:1 to 21:1 by month -- a fact the flat claim obscured.
The ideas count mismatch was the most serious finding. The entire thesis of Post 5 -- "96% of ideas appear in one draft" and "628 cross-org convergent ideas" -- was not reproducible from the current database. The pipeline had been re-run with different parameters, overwriting the original extraction. Nobody had noticed because the numbers in the blog posts were never re-checked against the live database.
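The audit pattern itself is simple enough to sketch. This is an illustrative reconstruction with a made-up schema and toy numbers, not the project's real database:

```python
# Minimal sketch of the audit: re-derive every headline number from the
# live database instead of trusting prose. Schema and data are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE drafts (name TEXT PRIMARY KEY);
    CREATE TABLE ideas (title TEXT, draft TEXT);
    INSERT INTO drafts VALUES ('draft-a'), ('draft-b'), ('draft-c');
    INSERT INTO ideas VALUES ('A2A Paradigm', 'draft-a'),
                             ('A2A Paradigm', 'draft-b'),
                             ('Agent Registry', 'draft-c');
""")

claims = {"drafts": 361, "ideas": 1780}  # numbers asserted in the prose
actual = {
    "drafts": conn.execute("SELECT COUNT(*) FROM drafts").fetchone()[0],
    "ideas": conn.execute("SELECT COUNT(DISTINCT title) FROM ideas").fetchone()[0],
}
mismatches = {k: (claims[k], actual[k]) for k in claims if claims[k] != actual[k]}
print(mismatches)  # any non-empty dict means the prose has drifted
```

Running something like this on every build would have caught the 361-vs-434 drift the day it happened.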
### The legal reviewer found regulatory blindspots
The legal review, written from a German/EU internet law perspective, identified three critical issues that no technically-focused agent would have caught:
**Consent conflation.** The series used "consent" interchangeably across OAuth authorization flows, GDPR consent (Einwilligung under Art. 6(1)(a)), and human-in-the-loop approval gates. These are legally distinct concepts. Under CJEU case law (Planet49), consent requires a clear affirmative act by the data subject. When an AI agent delegates to sub-agents, the chain of consent may break entirely. None of the 14 OAuth-for-agents proposals the series analyzed -- and none of the agents writing about them -- flagged this.
**The hospital scenario understated regulatory reality.** Post 4's opening scenario -- an AI agent managing drug dispensing with a hallucinated dosage -- was framed as "what goes wrong if this gap is never addressed." Under EU law, it is already addressed: the EU AI Act classifies such systems as high-risk under Annex III, the revised Product Liability Directive covers AI systems explicitly, and German medical law (BGB §§ 630a ff.) places duty of care on the provider. The IETF gap is not in accountability but in technical mechanisms to implement what the regulation already requires.
**GDPR was entirely absent from the gap analysis.** The series identified 11 standardization gaps. None mentioned GDPR-mandated capabilities: data protection impact assessments, right to erasure propagation through multi-agent chains, data portability, or purpose limitation. These are not aspirational -- they are legally binding requirements that agent systems operating in the EU must satisfy.
### The engineering reviewer found a SQL injection
The codebase review graded the project B+ overall -- "solid for a research tool, needs hardening for production" -- but found a critical SQL injection vulnerability in `db.py`. The `update_generation_run` method interpolated column names from `**kwargs` directly into SQL strings without validation. The Flask SECRET_KEY was hardcoded as the string `"ietf-dashboard-dev"`. There was no rate limiting on endpoints that trigger paid Claude API calls.
The engineering reviewer also noted that `cli.py` had grown to 2,995 lines with approximately 40 repetitions of the same config/db boilerplate pattern. And that test coverage for the analysis pipeline -- the core of the tool -- was exactly zero.
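The vulnerability class is worth spelling out. Here is a hedged sketch of a whitelist-style fix -- the method name comes from the review, but the table and column names are assumptions, not the project's real schema:

```python
# Hedged sketch of the fix pattern: whitelist column names before they
# reach the SQL string. Table and column names here are assumptions.
import sqlite3

ALLOWED_COLUMNS = {"status", "finished_at", "error"}

def update_generation_run(conn, run_id, **kwargs):
    # f-string interpolation of arbitrary kwargs keys was the injection
    # vector; reject anything not on the whitelist before building SQL.
    bad = set(kwargs) - ALLOWED_COLUMNS
    if bad:
        raise ValueError(f"disallowed column(s): {sorted(bad)}")
    assignments = ", ".join(f"{col} = ?" for col in kwargs)  # keys now vetted
    conn.execute(
        f"UPDATE runs SET {assignments} WHERE id = ?",
        (*kwargs.values(), run_id),  # values still go through placeholders
    )
```

Column names can never be bound as SQL parameters, so a closed whitelist (or a lookup against the live schema) is the standard remedy; the values themselves stay in `?` placeholders.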
### The science reviewer questioned the methodology
The scientific review identified the central methodological weakness: the entire rating system relies on Claude as the sole judge for five dimensions, with no human calibration, no inter-rater reliability measurement, and ratings based on abstracts only (truncated to 2,000 characters), not full draft text. The clustering threshold of 0.85 was described as "empirical" with no sensitivity analysis. The gap analysis was single-shot LLM generation from compressed metadata.
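The missing sensitivity analysis is cheap to do. A toy sketch with made-up titles (the pipeline's real clustering logic may differ) shows how the merge count moves as the threshold moves:

```python
# Toy sensitivity check with made-up titles; the pipeline's real
# clustering logic may differ. The point: report how many pairs merge
# as the 0.85 similarity threshold is varied.
from difflib import SequenceMatcher

titles = [
    "Agent Discovery Protocol",
    "Agent Discovery Mechanism",
    "A2A Communication Paradigm",
    "A2A Communication Pattern",
    "Model Serving Gateway",
]

def merged_pairs(threshold):
    """Count title pairs that would be clustered at this threshold."""
    count = 0
    for i, a in enumerate(titles):
        for b in titles[i + 1:]:
            if SequenceMatcher(None, a, b).ratio() >= threshold:
                count += 1
    return count

for t in (0.75, 0.85, 0.95):
    print(t, merged_pairs(t))  # monotonically fewer merges as t rises
```

If the headline convergence rate swings sharply between 0.75 and 0.95, the single number at 0.85 deserves an error bar.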
One finding was particularly striking: of 434 drafts rated for relevance, the distribution was heavily right-skewed (196 at 4, 98 at 5, only 38 at 1-2). Claude was generous with relevance for keyword-matched drafts, making the metric less discriminating than it should be. Upon manual review, 73 drafts turned out to be false positives -- including `draft-ietf-hpke-hpke` (generic public key encryption, nothing to do with AI agents) rated at relevance 5.
## Phase 3: The Fix Cycle
With 36 issues identified, we launched fix agents -- the Coder handling engineering and data integrity issues, an Editor handling legal and statistical corrections across the blog posts.
The fixes unfolded in three rounds, prioritized by severity:
**Round 1 -- Critical.** SQL injection patched with a column name whitelist. Flask SECRET_KEY replaced with `os.environ.get()` fallback to `os.urandom()`. FTS5 query sanitization added to prevent search injection. False-positive column added to the ratings table; 73 drafts flagged. All blog posts updated from 361 to 434 drafts. Ideas count discrepancy reconciled (419 current with methodology note explaining the 1,780 historical figure). Gap count corrected from 12 to 11 with rewritten gap table matching database reality.
**Round 2 -- High.** Rate limiting added to Claude-calling endpoints (10 req/min/IP). Category names normalized in the database (21 legacy entries migrated). EU AI Act timeline corrected from "within 18 months" to "within 5 months (August 2026)" with enforcement details and article references. OAuth/GDPR consent distinction added. Hospital scenario annotated with AI Act Annex III and Medical Devices Regulation context. Safety ratio qualified everywhere from flat "4:1" to "averaging ~4:1 but varying from 1.5:1 to 21:1 month-to-month."
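The rate-limit fix can be sketched framework-agnostically. This in-memory sliding-window limiter is illustrative, not the code that shipped:

```python
# Illustrative sliding-window limiter (10 requests/minute/IP),
# framework-agnostic; not the shipped code, and not suitable for
# multi-process deployments without shared storage (e.g. Redis).
import time
from collections import defaultdict, deque

WINDOW_S = 60
LIMIT = 10
_hits = defaultdict(deque)  # ip -> timestamps of recent requests

def allow(ip, now=None):
    now = time.monotonic() if now is None else now
    q = _hits[ip]
    while q and now - q[0] >= WINDOW_S:
        q.popleft()          # evict timestamps outside the window
    if len(q) >= LIMIT:
        return False         # over budget: caller returns HTTP 429
    q.append(now)
    return True
```

In a Flask view this would gate the endpoint before any Claude API call is made, so abuse costs nothing.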
**Round 3 -- Medium.** Methodology documentation created (comprehensive `methodology.md` covering all pipeline stages, limitations, and related work). IETF IPR notes added. Language hedged where causal claims were only supported by correlation. MIT LICENSE file created (the project claimed "open source" but had no license). FIPA, IEEE P3394, and eIDAS 2.0 references added where they naturally strengthen arguments. Coder reduced `cli.py` by 200 lines of boilerplate, added `--dry-run` flags to destructive commands, fixed N+1 query patterns.
In total: 14 files modified across the blog series, 7 security/quality fixes applied to the codebase, test count increased from 23 to 64, and a verified-counts document created as a single source of truth.
## What This Reveals
### Specialized perspectives catch different things
This is the headline finding from the review cycle. Four reviewers looked at the same output and found almost entirely non-overlapping issues. The statistician found number mismatches. The lawyer found consent conflation. The engineer found SQL injection. The scientist found methodological gaps. No single reviewer -- no matter how thorough -- would have caught all 36 issues.
This is not a theoretical observation about diverse review. It is an empirical result from running the experiment. The legal reviewer's consent-conflation finding required knowledge of CJEU case law. The statistical reviewer's ideas-count discovery required querying the live database. The engineering reviewer's SQL injection required reading the source code line by line. These are genuinely different skills applied to the same artifact.
### The review-fix-verify pattern works
The cycle ran cleanly: four parallel reviews produced a prioritized list; fix agents resolved issues in severity order; the fixes were verified against the review documents. Three rounds (critical, high, medium) imposed natural prioritization. The entire cycle -- 4 reviews plus 3 fix rounds -- happened in a single day.
The pattern mirrors what the IETF itself does with Last Call reviews, directorate reviews, and IESG evaluation. Multiple specialized perspectives, applied in sequence, with verification that issues are resolved. The difference is that our cycle took hours, not months. The cost is that our reviewers share the same underlying model and its blindspots.
### Agents modifying the same files is the hard problem
The most persistent coordination difficulty was not conceptual but logistical: multiple agents editing the same blog posts. The Writer updated Post 4's gap table. The Editor changed the safety ratio phrasing. The Coder corrected the draft count. Each edit was correct in isolation. But when three agents modify the same file, merge conflicts and stale reads are inevitable. We hit this multiple times -- most visibly with the Post 1 revisions that silently failed to persist.
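One lightweight remedy we did not have is optimistic conflict detection: record a hash of what you read, and refuse to write over something you did not read. A sketch -- illustrative, not the project's tooling:

```python
# Sketch of optimistic conflict detection for agents sharing files
# (illustrative, not the project's tooling): an agent records a hash of
# the file it read and refuses to write if the file changed underneath it.
import hashlib
from pathlib import Path

def read_with_token(path):
    text = Path(path).read_text()
    return text, hashlib.sha256(text.encode()).hexdigest()

def write_if_unchanged(path, new_text, token):
    # Compare-and-swap on content: a stale token means another agent
    # wrote in between, so the caller must re-read, re-apply, and retry.
    current = hashlib.sha256(Path(path).read_text().encode()).hexdigest()
    if current != token:
        return False
    Path(path).write_text(new_text)
    return True
```

A real implementation would need the check-and-write step to be atomic (file locking, or write-to-temp plus rename), but the shape is the point: lost updates become detectable instead of silent.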
This maps directly to the IETF's Agent Execution Model gap. When multiple agents operate on shared state, you need either locking (pessimistic) or conflict detection (optimistic). We had neither. We used a file system, a dev journal, and hope.
### The cheapest analyses mattered most
| Component | Cost | Key Finding |
|-----------|-----:|-------------|
| Claude Sonnet (ratings, gaps) | ~$8 | 4:1 safety deficit, 11 gaps |
| Claude Haiku (idea extraction) | ~$0.80 | 419 ideas, 96% unique to one draft |
| 4 reviewers (parallel) | ~$4 | 36 issues across 4 dimensions |
| Ollama embeddings | $0.00 | 25+ near-duplicate pairs |
| Coder: regex, SQL, networkx | $0.00 | RFC divergence, centrality, co-occurrence |
| **Total** | **~$13** | |
The pattern is consistent: Claude provided the foundation data (ratings, categories, ideas), but the structurally revealing findings came from deterministic local computation on top of that foundation. RFC cross-references (regex), author centrality (networkx), revision velocity (filename parsing), and category co-occurrence (SQL joins) -- all zero-cost, all among the most quotable findings in the series. The LLM provided the foundation data. Every structurally revealing finding -- RFC foundation divergence, European telecoms as bridge-builders, safety structurally isolated from protocols, 55% fire-and-forget revision rate -- came from deterministic local computation on top of that foundation. The lesson for anyone building LLM-powered analysis: the model is the foundation, not the insight engine.
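The zero-cost pattern is simple enough to show in full. A hedged sketch of the regex step (the real script's name, inputs, and output format differ): count which RFCs each draft cites, and the dominant foundation falls out of a `Counter`.

```python
import re
from collections import Counter

RFC_REF = re.compile(r"\bRFC\s?(\d{3,5})\b")

def rfc_citations(text: str) -> Counter:
    """Count the RFC numbers a draft's plain text references."""
    return Counter(int(n) for n in RFC_REF.findall(text))

# Toy stand-ins for two drafts' extracted text.
netops_draft = "Models follow RFC 7950 (YANG) and RFC 8040; see also RFC7950."
security_draft = "Records are signed with COSE per RFC 9052; RFC 9052 and RFC 8747 apply."

print(rfc_citations(netops_draft).most_common(1))    # [(7950, 2)]
print(rfc_citations(security_draft).most_common(1))  # [(9052, 2)]
```

Aggregated over hundreds of drafts, the same few lines expose which communities build on YANG-family RFCs and which on COSE-family ones -- no model call involved.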
### The development journal earned its keep
We required every agent to log milestones to a shared `dev-journal.md`. By session's end, the journal had 30 entries across all four agents -- capturing not just what was done but why, and flagging surprises that would otherwise be lost. When the Writer needed to understand what the Coder had built, the journal entry was faster and more informative than a status message. When the Architect reviewed posts, the Writer's journal entries explained editorial decisions that would otherwise be opaque.
The journal also became the source material for this post. Every "Surprise" field in the journal captured an insight -- the ideas reframing, the silent failure, the RFC divergence revelation -- that no other artifact preserves.
## What This Tells Us About Agent Teams
Six lessons from running a four-agent team on a real project:
**1. Role definitions matter more than instructions.** The one-page agent definitions were more effective than the 3,000-word team brief. Agents performed best when they had a clear identity and scope, not a detailed todo list.
**2. Shared state beats messaging.** The SQLite database, the dev journal, and the report files were more effective coordination mechanisms than direct inter-agent messages. Agents could read each other's outputs on their own schedule, without the overhead of request-response communication.
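A minimal sketch of what shared-state coordination through SQLite can look like -- the table name and columns here are illustrative, not the project's actual schema:

```python
import sqlite3

def open_board(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS journal ("
        " id INTEGER PRIMARY KEY, agent TEXT NOT NULL, note TEXT NOT NULL)"
    )
    return conn

def log_entry(conn: sqlite3.Connection, agent: str, note: str) -> None:
    # The connection context manager wraps the INSERT in a transaction;
    # a real multi-process setup would also set a busy_timeout.
    with conn:
        conn.execute("INSERT INTO journal (agent, note) VALUES (?, ?)", (agent, note))

def catch_up(conn: sqlite3.Connection, since_id: int = 0) -> list:
    # Readers poll on their own schedule -- no request-response needed.
    return conn.execute(
        "SELECT id, agent, note FROM journal WHERE id > ? ORDER BY id", (since_id,)
    ).fetchall()

board = open_board()
log_entry(board, "coder", "RFC citation counts written to reports/")
log_entry(board, "writer", "Post 4 gap table updated")
print(catch_up(board))
```

Each agent remembers the last `id` it saw and asks only for newer rows -- the pull-based equivalent of a message queue, with the database as the single source of truth.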
**3. Async is natural, but verification is not.** Agents working in parallel on loosely coupled tasks is a pattern that works. What does not happen naturally is output verification. The silent failure -- revisions logged but not persisted -- would have gone undetected without a deliberate verification pass. Agent teams need assurance mechanisms, not just coordination mechanisms.
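The assurance mechanism we converged on is almost embarrassingly simple: after every write, re-read the artifact and confirm the change is actually there. An illustrative helper (file names hypothetical):

```python
import pathlib
import tempfile

def apply_and_verify(path: pathlib.Path, old: str, new: str) -> bool:
    """Apply a text replacement, then re-read the file to confirm it stuck.
    A write that silently fails shows up here, not in a status message."""
    path.write_text(path.read_text().replace(old, new))
    return new in path.read_text()

with tempfile.TemporaryDirectory() as workdir:
    post = pathlib.Path(workdir) / "post-01.md"
    post.write_text("The safety ratio is 4:1.")
    assert apply_and_verify(post, "4:1", "roughly 4:1")
```

The point is not the three lines of code but the discipline: the verification reads the persisted state, never the agent's own report of what it did.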
**4. Humans catch category errors; agents catch consistency errors.** The Architect found a 14-vs-13 data inconsistency. The Writer applied six revision passes without introducing a single factual error. Agents are excellent at consistency within a frame. But the project lead's "so what?" about the ideas count was a category-level critique -- questioning the frame itself. That kind of challenge did not emerge from any agent.
**5. Review compounds.** The Architect reviewed the Writer's posts, the project lead reviewed the Architect's framing, and the resulting revisions cascaded through the series. Each review layer caught different things: data errors, structural problems, framing weaknesses. Multiple review passes from different perspectives produced compounding quality gains.
**6. The journal is the product.** The dev journal -- originally intended as a process artifact -- became the richest record of what happened and why. It captures decisions, surprises, and coordination moments that no other artifact preserves. For any multi-agent project, require a shared journal.
## The Meta-Irony ## The Meta-Irony
We built a team of AI agents to analyze IETF drafts about AI agent standards. The team needed coordination, shared context, specialized roles, quality review, human oversight, and output verification. Every one of these needs maps to a gap in the IETF landscape:
| Our Team Needed | What Happened | IETF Gap |
|----------------|---------------|----------|
| Shared execution context | Agents coordinated via SQLite, files, dev journal | Agent Execution Model (no standard) |
| Output verification | Writer's revisions silently failed; Architect caught it manually | Agent Behavioral Verification (critical) |
| Quality review | 4 parallel reviewers found 36 issues the writing team missed | Agent Behavioral Verification (critical) |
| Error handling | Ideas reframing required 3 iterations to stabilize numbers | Real-Time Agent Rollback (high) |
| Coordination across approaches | Agents editing the same files with no merge mechanism | Cross-Protocol Agent Migration (medium) |
| Human oversight | Project lead's "so what?" redirected the entire ideas framing | Human Override Standardization (high) |
| Specialized perspectives | Legal, statistical, engineering, and scientific reviewers each found unique issues | Agent Capability Negotiation (medium) |
We solved these problems ad hoc -- with a journal, role definitions, manual verification passes, severity-prioritized fix rounds, and human review. The IETF is trying to solve them at internet scale with protocol standards.

The distance between our 12-agent team and a deployed multi-agent system on the open internet is vast. But the problems are structurally identical. The standards the IETF is racing to write are the standards our own team needed. The traffic lights the highway needs are the ones we built by hand.
--- ---
### Key Takeaways ### Key Takeaways
- **Twelve agents across three phases** (4 writers, 4 reviewers, 4 fixers) produced 8 blog posts, a vision document, 7 analysis features, 36 identified issues, and 64 tests -- from a ~$13 pipeline
- **Four parallel reviewers found 36 non-overlapping issues**: a SQL injection, consent conflation with EU law, a 76% ideas count mismatch, and uncalibrated LLM-as-judge methodology. No single reviewer would have caught all of them
- **The human project lead's "so what?"** was the single most consequential intervention -- no agent questioned whether the headline metric was meaningful
- **A silent failure** (revisions logged but not persisted) demonstrated the same Behavior Verification gap the series identifies as critical in the IETF landscape
- **The team's coordination problems mirror the IETF's gaps**: shared state, output verification, error recovery, capability negotiation, and human oversight are needed at every scale
*This post concludes the series. All data, code, and reports are available in the IETF Draft Analyzer project repository.* *This post concludes the series. All data, code, and reports are available in the IETF Draft Analyzer project repository.*

--- ---
### 2026-03-08 ANALYST — Pipeline run: authors + gaps refresh
**What**: Ran the processing pipeline on 474-draft corpus. Fetched authors for 102 previously-unlinked drafts (113 were missing, 11 had Datatracker issues). Re-ran gap analysis with --refresh on the full corpus. Checked idea extraction status.
**Why**: After corpus expansion to 474 drafts, 113 drafts lacked author data and gap analysis needed refreshing against the full set.
**Result**: Author coverage: 463/474 drafts now have authors (up from ~350), 709 unique authors (up from 403). Gap analysis: 12 gaps identified (same count, refreshed against full corpus). All 474 drafts already rated. Idea extraction: 59 drafts have no ideas but are in the LLM cache (previously processed, yielded nothing -- 25 rated relevance 4-5, so may warrant individual re-extraction with --reextract).
**Surprise**: The `drafts_without_ideas` query checks both the ideas table AND the llm_cache table, so drafts that were batch-processed but yielded no ideas won't be retried by `--all`. To force re-extraction for high-relevance drafts without ideas, use `ietf ideas --reextract --draft <name>` individually.
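The described behavior reduces to a pair of `NOT IN` subqueries. A plausible shape of the check -- the project's actual table and column names may differ:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drafts (name TEXT PRIMARY KEY);
CREATE TABLE ideas (draft TEXT, idea TEXT);
CREATE TABLE llm_cache (draft TEXT, payload TEXT);
INSERT INTO drafts VALUES ('draft-a'), ('draft-b'), ('draft-c');
INSERT INTO ideas VALUES ('draft-a', 'some idea');   -- extraction succeeded
INSERT INTO llm_cache VALUES ('draft-b', '{}');      -- processed, yielded nothing
""")

# Pending = no extracted ideas AND never processed. draft-b is skipped
# by `--all` even though it has no ideas, because it sits in the cache.
pending = [row[0] for row in conn.execute("""
    SELECT d.name FROM drafts d
    WHERE d.name NOT IN (SELECT draft FROM ideas)
      AND d.name NOT IN (SELECT draft FROM llm_cache)
    ORDER BY d.name
""")]
print(pending)  # ['draft-c']
```

This is why high-relevance drafts that yielded nothing need the explicit `--reextract` path: the cache membership, not the ideas table, gates retries.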
---
### 2026-03-08 WRITER — Post 08 Rewrite: "Agents Building the Agent Analysis"
**What**: Complete rewrite of Post 08, the meta post about using Claude Code agent teams to build the project. The previous draft (~3,500 words, written before the review cycle) covered only Phase 1 (the writing team). The new version (~2,800 words) covers all three phases: the 4-agent writing team, the 4-agent review cycle, and the 3-round fix cycle.
**Why**: The review cycle was the most consequential phase of the entire project -- 4 parallel reviewers found 36 issues including a SQL injection, consent conflation with EU law, a 76% ideas count mismatch, and uncalibrated methodology. This material was missing from the previous draft entirely. The post needed to tell the complete story.
**Result**: New structure: Phase 1 (writing team + parallel execution + Architect's impact + human "so what?" intervention), Phase 2 (4 parallel reviewers, specific findings per reviewer), Phase 3 (3-round fix cycle by severity), then analysis sections (specialized perspectives, review-fix-verify pattern, shared-state coordination problem, cost breakdown, meta-irony table). The meta-irony table now maps 7 team coordination needs to specific IETF gap names from the database.
**Surprise**: The post's strongest structural element is the review cycle section -- the specific examples (consent conflation, HPKE false positive, silent revision failure) are more vivid and demonstrable than the writing-phase anecdotes. The review cycle essentially proved the thesis: agents analyzing agents need the same coordination standards the agents are analyzing.
---
### 2026-03-08 CODER — Track untracked files, update .gitignore
**What**: Cleaned up untracked files in the repo. Updated `.gitignore` to exclude LaTeX build artifacts (`paper/*.aux`, `paper/*.log`, `paper/*.out`), `data/analytics.db`, and `data/ietf_drafts.db` (stale DB). Staged 12 new files for commit: 5 scripts (`backfill-wg-names.py`, `classify-unrated.py`, `compare-classifiers.py`, `download-relevant-text.py`, `run-webui.sh`), 4 source modules (`classifier.py`, `analytics.py`, `auth.py`, `obsidian_export.py`), 1 test (`test_obsidian_export.py`), 2 reports (`wg-analysis.md`, generated draft).
**Why**: These files had accumulated as untracked over several sessions. Production code, utility scripts, and analysis reports all belong in version control. Build artifacts and local DBs do not.
**Result**: 12 files staged, .gitignore updated with 6 new patterns. No commit made yet (deferred to parent process).
---
### 2026-03-08 ANALYST — Re-extract ideas and convergence analysis
**What**: Ran idea extraction pipeline for 38 drafts that were missing ideas (out of 97 initially missing — 59 remain without ideas, likely false positives or drafts without sufficient content). Then ran cross-organization convergence analysis on the full idea set.
**Why**: Ideas count was stale at 419 across 377 drafts after the DB expanded to 474 drafts. Convergence analysis needed to understand which technical ideas are independently emerging across multiple organizations.
**Result**: 462 ideas across 415 drafts. Convergence analysis found 132 cross-org convergent ideas out of 398 unique clusters (33% convergence rate). Top convergent idea: "Fully Adaptive Routing Ethernet for AI" with 14 contributing organizations. Notable: "AI Agent Protocol Framework" converges across 7 orgs and 3 separate drafts. Updated `data/reports/reviews/verified-counts.md` with new counts and convergence results.
**Cost**: 654,377 tokens in + 335,984 tokens out (Haiku, cheap mode), 8 batches of 5 drafts.
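Convergence detection is, at its core, a group-by over idea clusters. A hedged sketch in which a normalized title stands in for the cluster key (the real pipeline clusters with embeddings, and the org names below are invented):

```python
from collections import defaultdict

# (idea title, authoring organization) pairs, as extracted per draft.
ideas = [
    ("AI Agent Protocol Framework", "OrgA"),
    ("AI agent protocol framework", "OrgB"),
    ("AI Agent Protocol Framework", "OrgC"),
    ("Agent Capability Negotiation", "OrgA"),
]

clusters = defaultdict(set)
for title, org in ideas:
    clusters[title.lower()].add(org)  # cluster key: normalized title

# Convergent = the same idea surfaced by two or more distinct orgs.
convergent = {key: orgs for key, orgs in clusters.items() if len(orgs) >= 2}
rate = len(convergent) / len(clusters)
print(len(convergent), round(rate, 2))  # 1 0.5
```

The 33% rate reported above is this ratio computed over 398 unique clusters, with 132 of them contributed by two or more organizations.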
---
### 2026-03-08 CODER — TypedDicts for data layer, ethics + regulatory content in blog series ### 2026-03-08 CODER — TypedDicts for data layer, ethics + regulatory content in blog series
**What**: Four improvements across typing and content: **What**: Four improvements across typing and content:

# Gap Analysis: IETF AI/Agent Draft Landscape

*Generated 2026-03-08 14:30 UTC — analyzing 474 drafts, 462 technical ideas*

## Overview

This report identifies **12 gaps** — areas, problems, or technical challenges not adequately addressed by the current 474 IETF AI/agent drafts. Each gap is cross-referenced with related drafts and extracted technical ideas to show partial coverage.

| Severity | Count |
|----------|------:|

### Safety Deficit

Only **46** of 474 drafts address AI safety/alignment, while **150** focus on A2A protocols and **110** on autonomous operations. The ratio of capability-building to safety is roughly **5:1**.

---
## 1. Real-time Agent Behavior Verification

| | |
|---|---|
| **Severity** | CRITICAL |
| **Category** | AI safety/alignment |
| **Drafts in category** | 46 |

Current AI safety drafts focus on governance but lack technical protocols for real-time verification that agents are behaving according to their declared policies. There's no standard way to cryptographically prove agent actions match stated intentions.

**Evidence:** Only 46 safety drafts versus 474 total, with governance focus rather than technical verification

### Related Drafts

**Keyword matches** (drafts mentioning gap topic):
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-an-nmrg-i2icf-cits](https://datatracker.ietf.org/doc/draft-an-nmrg-i2icf-cits/) (score 3.7) — Interface to In-Network Computing Functions for Cooperative Intelligent Transpor
- [draft-zhao-detnet-enhanced-use-cases](https://datatracker.ietf.org/doc/draft-zhao-detnet-enhanced-use-cases/) (score 3.2) — Enhanced Use Cases for Scaling Deterministic Networks
- [draft-zhang-rvp-problem-statement](https://datatracker.ietf.org/doc/draft-zhang-rvp-problem-statement/) (score 3.5) — Problem Statements and Requirements of Real-Virtual Agent Protocol (RVP): Commun
- [draft-yuan-rtgwg-traffic-agent-usecase](https://datatracker.ietf.org/doc/draft-yuan-rtgwg-traffic-agent-usecase/) (score 3.7) — Use cases of the AI Network Traffic Optimization Agent
- [draft-altanai-aipref-realtime-protocol-bindings](https://datatracker.ietf.org/doc/draft-altanai-aipref-realtime-protocol-bindings/) (score 3.6) — AI Preferences for Real-Time Protocol Bindings
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-ruan-spring-priority-flow-control-sid](https://datatracker.ietf.org/doc/draft-ruan-spring-priority-flow-control-sid/) (score 3.1) — SRv6 behavior extention for Flow Control in WAN
**Top-rated in AI safety/alignment** (46 drafts):
- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
- [draft-aylward-daap-v2](https://datatracker.ietf.org/doc/draft-aylward-daap-v2/) (4.8) — Defines comprehensive protocol for AI agent accountability including authentication, monitoring, and
### Partially Addressing Ideas

17 extracted ideas touch on this gap:

| Idea | Draft | Type |
|------|-------|------|
| Distributed AI Accountability Protocol | draft-aylward-daap-v2 | protocol |
| AGENTS.TXT Policy File | draft-srijal-agents-policy | protocol |
| AI Network Security Agent | draft-yuan-rtgwg-security-agent-usecase | architecture |
| A2A Protocol Transport over MOQT | draft-a2a-moqt-transport | protocol |
| Post-Discovery Authorization Handshake | draft-barney-caam | protocol |
| Evidence-based Autonomy Maturity Model | draft-berlinai-vera | mechanism |
| Verifiable Agent Conversation Format | draft-birkholz-verifiable-agent-conversations | protocol |
| Intent-Based Just-in-Time Authorization | draft-chen-agent-decoupled-authorization-model | architecture |

*...and 9 more*
--- ---
## 2. Cross-Domain Agent Liability

| | |
|---|---|
| **Severity** | CRITICAL |
| **Category** | Policy/governance |
| **Drafts in category** | 91 |
When autonomous agents operate across organizational boundaries and cause harm or make decisions with legal implications, there's no standardized framework for liability attribution. The policy/governance drafts don't address cross-jurisdictional legal accountability.
**Evidence:** 91 policy/governance drafts but legal liability for cross-domain autonomous actions remains unaddressed
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-diaconu-agents-authz-info-sharing](https://datatracker.ietf.org/doc/draft-diaconu-agents-authz-info-sharing/) (score 3.2) — Cross-Domain AuthZ Information sharing for Agents
- [draft-cui-dmsc-agent-cdi](https://datatracker.ietf.org/doc/draft-cui-dmsc-agent-cdi/) (score 3.0) — Cross-Domain Interoperability Framework for AI Agent Collaboration
- [draft-han-rtgwg-agent-gateway-intercomm-framework](https://datatracker.ietf.org/doc/draft-han-rtgwg-agent-gateway-intercomm-framework/) (score 3.6) — Agent Gateway Intercommunication Framework
- [draft-ni-a2a-ai-agent-security-requirements](https://datatracker.ietf.org/doc/draft-ni-a2a-ai-agent-security-requirements/) (score 3.7) — Security Requirements for AI Agents
- [draft-intellinode-ai-semantic-contract](https://datatracker.ietf.org/doc/draft-intellinode-ai-semantic-contract/) (score 3.2) — Semantic-Driven Traffic Shaping Contract for AI Networks
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
**Top-rated in Policy/governance** (91 drafts):
- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
- [draft-aylward-daap-v2](https://datatracker.ietf.org/doc/draft-aylward-daap-v2/) (4.8) — Defines comprehensive protocol for AI agent accountability including authentication, monitoring, and
- [draft-goswami-agentic-jwt](https://datatracker.ietf.org/doc/draft-goswami-agentic-jwt/) (4.5) — Extends OAuth 2.0 with Agentic JWT to address authorization challenges in autonomous AI systems. Int
- [draft-wang-cats-odsi](https://datatracker.ietf.org/doc/draft-wang-cats-odsi/) (4.5) — Specifies framework for decentralized LLM inference across untrusted participants with layer-aware e
- [draft-birkholz-verifiable-agent-conversations](https://datatracker.ietf.org/doc/draft-birkholz-verifiable-agent-conversations/) (4.5) — Defines CDDL-based data format for verifiable agent conversation records using COSE signing. Support
### Partially Addressing Ideas
26 extracted ideas touch on this gap:
| Idea | Draft | Type |
|------|-------|------|
| Cross-Domain Agent Identity Management | draft-abbey-scim-agent-extension | protocol |
| Multi-level Inference Protocol | draft-chuyi-nmrg-agentic-network-inference | protocol |
| Cross-Domain Agent Coordination | draft-chuyi-nmrg-agentic-network-inference | mechanism |
| Cross-Domain Agent Discovery | draft-cui-dmsc-agent-cdi | mechanism |
| Federated Agent Identity Framework | draft-cui-dmsc-agent-cdi | architecture |
| Agent Capability Negotiation Protocol | draft-cui-dmsc-agent-cdi | protocol |
| Federated Policy Enforcement | draft-cui-dmsc-agent-cdi | architecture |
| Cross-Domain Authorization Information Sharing | draft-diaconu-agents-authz-info-sharing | mechanism |
*...and 18 more*
---
## 3. Human Override Protocols
| | |
|---|---|
| **Severity** | CRITICAL |
| **Category** | Human-agent interaction |
| **Drafts in category** | 30 |
Critical gap in standardized protocols for humans to safely interrupt, override, or take control of autonomous agents in emergency situations. Only 30 drafts address human-agent interaction, with no focus on emergency takeover procedures.
**Evidence:** Only 30 human-agent interaction drafts compared to 213+ autonomous operation drafts, with no emergency override standards
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-dhir-http-agent-profile](https://datatracker.ietf.org/doc/draft-dhir-http-agent-profile/) (score 4.2) — HTTP Agent Profile (HAP): Authenticated and Monetized Agent Traffic on the Web
- [draft-irtf-nmrg-llm-nm](https://datatracker.ietf.org/doc/draft-irtf-nmrg-llm-nm/) (score 3.5) — A Framework for LLM-Assisted Network Management with Human-in-the-Loop
- [draft-cui-nmrg-llm-nm](https://datatracker.ietf.org/doc/draft-cui-nmrg-llm-nm/) (score 4.1) — A Framework for LLM Agent-Assisted Network Management with Human-in-the-Loop
- [draft-zeng-opsawg-applicability-mcp-a2a](https://datatracker.ietf.org/doc/draft-zeng-opsawg-applicability-mcp-a2a/) (score 3.5) — When NETCONF Is Not Enough: Applicability of MCP and A2A for Advanced Network Ma
- [draft-wmz-nmrg-agent-ndt-arch](https://datatracker.ietf.org/doc/draft-wmz-nmrg-agent-ndt-arch/) (score 4.2) — Network Digital Twin and Agentic AI based Architecture for AI driven Network Ope
- [draft-ietf-suit-firmware-encryption](https://datatracker.ietf.org/doc/draft-ietf-suit-firmware-encryption/) (score 3.7) — Encrypted Payloads in SUIT Manifests
**Top-rated in Human-agent interaction** (30 drafts):
- [draft-drake-email-tpm-attestation](https://datatracker.ietf.org/doc/draft-drake-email-tpm-attestation/) (4.6) — Defines hardware attestation for email using TPM verification chains to prevent spam and provide Syb
- [draft-ietf-aipref-vocab](https://datatracker.ietf.org/doc/draft-ietf-aipref-vocab/) (4.4) — Defines a standardized vocabulary for expressing preferences about how digital assets should be used
- [draft-dhir-http-agent-profile](https://datatracker.ietf.org/doc/draft-dhir-http-agent-profile/) (4.2) — Defines HTTP Agent Profile for authenticating agent traffic, separating human from agent traffic, an
- [draft-song-tsvwg-camp](https://datatracker.ietf.org/doc/draft-song-tsvwg-camp/) (4.2) — Proposes CAMP, a multipath transport protocol for interactive multimodal LLM systems that maintains
- [draft-liu-agent-operation-authorization](https://datatracker.ietf.org/doc/draft-liu-agent-operation-authorization/) (4.1) — Specifies framework for verifiable delegation of actions from humans to AI agents using JWT tokens.
### Partially Addressing Ideas
7 extracted ideas touch on this gap:
| Idea | Draft | Type |
|------|-------|------|
| LLM-Human Collaborative Framework | draft-irtf-nmrg-llm-nm | architecture |
| CHEQ Protocol | draft-rosenberg-aiproto-cheq | protocol |
| Signed Confirmation Objects | draft-rosenberg-aiproto-cheq | mechanism |
| Cross-Protocol Integration Pattern | draft-rosenberg-aiproto-cheq | pattern |
| CHEQ Protocol | draft-rosenberg-cheq | protocol |
| Signed Decision Objects | draft-rosenberg-cheq | mechanism |
| Protocol Integration Pattern | draft-rosenberg-cheq | pattern |
---
## 4. Agent Resource Exhaustion Protection
| | |
|---|---|
| **Severity** | HIGH |
| **Category** | Autonomous netops |
| **Drafts in category** | 93 |
Missing standardized mechanisms to prevent malicious or poorly designed agents from consuming excessive network, compute, or storage resources. Current drafts focus on traffic management but not on agent-specific resource quotas and enforcement.
**Evidence:** 93 autonomous netops drafts and 73 ML traffic management drafts lack agent-specific resource protection mechanisms
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-jia-oauth-scope-aggregation](https://datatracker.ietf.org/doc/draft-jia-oauth-scope-aggregation/) (score 3.5) — OAuth 2.0 Scope Aggregation for Multi-Step AI Agent Workflows
**Top-rated in Autonomous netops** (93 drafts):
- [draft-cui-nmrg-llm-benchmark](https://datatracker.ietf.org/doc/draft-cui-nmrg-llm-benchmark/) (4.3) — Provides comprehensive evaluation framework for LLM-based network configuration agents. Includes emu
- [draft-wmz-nmrg-agent-ndt-arch](https://datatracker.ietf.org/doc/draft-wmz-nmrg-agent-ndt-arch/) (4.2) — Comprehensive architecture combining Network Digital Twin with Agentic AI for intent-based network o
- [draft-yue-anima-agent-recovery-networks](https://datatracker.ietf.org/doc/draft-yue-anima-agent-recovery-networks/) (4.1) — Defines task-oriented multi-agent framework for fault recovery in converged mobile networks. Targets
- [draft-cui-nmrg-llm-nm](https://datatracker.ietf.org/doc/draft-cui-nmrg-llm-nm/) (4.1) — Defines framework for collaborative network management between LLM agents and human operators. Intro
- [draft-jadoon-nmrg-agentic-ai-autonomous-networks](https://datatracker.ietf.org/doc/draft-jadoon-nmrg-agentic-ai-autonomous-networks/) (4.1) — Introduces architectural principles for integrating AI agents into IP protocol stack layers while pr
### Partially Addressing Ideas
40 extracted ideas touch on this gap:
| Idea | Draft | Type |
|------|-------|------|
| Agent Resource Type | draft-abbey-scim-agent-extension | extension |
| Agentic Application Resource Type | draft-abbey-scim-agent-extension | extension |
| Collaborative Inference Acceleration (KDN) | draft-agent-gw | mechanism |
| Data and Agent Aware-Inference and Training Network (DA-ITN) | draft-akhavain-moussa-ai-network | architecture |
| Agent-to-Agent (A2A) Communication Paradigm | draft-an-nmrg-i2icf-cits | protocol |
| Network-Level Quarantine Protocol | draft-aylward-aiga-1 | protocol |
| Agent Task Negotiation | draft-cui-ai-agent-task | protocol |
| Multi-Agent Security Protection | draft-fu-nmop-agent-communication-framework | mechanism |
*...and 32 more*
---
## 5. Agent-Generated Data Provenance
| | |
|---|---|
| **Severity** | HIGH |
| **Category** | Data formats/interop |
| **Drafts in category** | 145 |
While 145 drafts address data formats for AI interop, there's insufficient attention to tracking the provenance and lineage of data generated by agents. This creates trust and auditability issues in agent-to-agent data exchanges.
**Evidence:** 145 data format drafts with high overlap but no clear standards for agent-generated data provenance tracking
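The hash-chain approach in draft-cowles-volt hints at one shape a provenance standard could take. A minimal illustrative sketch of hash-chained lineage records (the field names are hypothetical, not drawn from any draft):

```python
import hashlib
import json

def provenance_record(agent_id: str, payload: str, parent_hash: str) -> dict:
    """Append-only lineage entry: each record commits to its producing agent,
    its content, and the hash of the record it was derived from."""
    body = {"agent": agent_id, "payload": payload, "parent": parent_hash}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}

def verify_chain(records: list) -> bool:
    """Recompute every hash and check each record points at its predecessor."""
    prev = "genesis"
    for rec in records:
        body = {k: rec[k] for k in ("agent", "payload", "parent")}
        expect = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if rec["parent"] != prev or rec["hash"] != expect:
            return False
        prev = rec["hash"]
    return True

r1 = provenance_record("agent-a", "raw measurement", "genesis")
r2 = provenance_record("agent-b", "aggregated stats", r1["hash"])
```

Any tampering with an intermediate record breaks verification of the whole chain, which is what makes agent-to-agent data exchanges auditable.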
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-romanchuk-normative-admissibility](https://datatracker.ietf.org/doc/draft-romanchuk-normative-admissibility/) (score 3.4) — Normative Admissibility Framework for Agent Speech Acts
- [draft-li-semantic-routing-architecture](https://datatracker.ietf.org/doc/draft-li-semantic-routing-architecture/) (score 3.6) — Semantic Routing Architecture for AI Agents Communication
- [draft-cui-nmrg-llm-nm](https://datatracker.ietf.org/doc/draft-cui-nmrg-llm-nm/) (score 4.1) — A Framework for LLM Agent-Assisted Network Management with Human-in-the-Loop
- [draft-mpsb-agntcy-messaging](https://datatracker.ietf.org/doc/draft-mpsb-agntcy-messaging/) (score 2.6) — An Overview of Messaging Systems and Their Applicability to Agentic AI
- [draft-gaikwad-south-authorization](https://datatracker.ietf.org/doc/draft-gaikwad-south-authorization/) (score 3.7) — SOUTH: Stochastic Authorization for Agent and Service Requests
- [draft-abaris-aicdh](https://datatracker.ietf.org/doc/draft-abaris-aicdh/) (score 2.8) — AI Content Disclosure Header
**Top-rated in Data formats/interop** (145 drafts):
- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (4.6) — Defines YANG data model for hierarchical language model coordination across tiny, small, and large L
- [draft-ietf-lake-app-profiles](https://datatracker.ietf.org/doc/draft-ietf-lake-app-profiles/) (4.6) — Defines canonical CBOR representation for EDHOC application profiles and coordination mechanisms for
- [draft-chang-agent-token-efficient](https://datatracker.ietf.org/doc/draft-chang-agent-token-efficient/) (4.5) — Defines ADOL (Agentic Data Optimization Layer) to address token bloat in agent communication protoco
- [draft-birkholz-verifiable-agent-conversations](https://datatracker.ietf.org/doc/draft-birkholz-verifiable-agent-conversations/) (4.5) — Defines CDDL-based data format for verifiable agent conversation records using COSE signing. Support
### Partially Addressing Ideas
4 extracted ideas touch on this gap:
| Idea | Draft | Type |
|------|-------|------|
| Context-Enhanced Training Data | draft-improving-data-quality-tags | extension |
| Training Data Provenance Claims | draft-messous-eat-ai | mechanism |
| Sentinel Evidence Package | draft-reilly-sentinel-protocol | architecture |
| AI Lifecycle Provenance Tracking | draft-reilly-sentinel-protocol | architecture |
---
## 6. Agent Capability Degradation Handling
| | |
|---|---|
| **Severity** | HIGH |
| **Category** | AI safety/alignment |
| **Drafts in category** | 44 |
No standardized approaches for detecting and handling when an agent's capabilities degrade due to model drift, data corruption, or hardware issues. Systems need graceful degradation protocols rather than silent failures.
**Evidence:** None of the 44 safety/alignment drafts addresses capability degradation, while 213+ drafts assume stable agent performance
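Graceful degradation could be as simple as a rolling success-rate monitor that downgrades the agent's advertised state instead of failing silently. An illustrative sketch (the thresholds and state names are invented for this example):

```python
from collections import deque

class DegradationMonitor:
    """Illustrative drift detector: track a rolling window of task outcomes and
    downgrade the agent's advertised state instead of failing silently."""

    def __init__(self, window: int = 20, degraded_below: float = 0.8, halt_below: float = 0.5):
        self.outcomes = deque(maxlen=window)
        self.degraded_below = degraded_below
        self.halt_below = halt_below

    def record(self, success: bool) -> str:
        self.outcomes.append(success)
        rate = sum(self.outcomes) / len(self.outcomes)
        if rate < self.halt_below:
            return "HALTED"      # stop accepting tasks, signal peers
        if rate < self.degraded_below:
            return "DEGRADED"    # accept only low-risk tasks
        return "HEALTHY"

mon = DegradationMonitor(window=10)
```

A standard would need to define how the `DEGRADED`/`HALTED` states are advertised to peers so that coordinating agents can reroute work, which no current draft covers.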
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-li-dmsc-inf-architecture](https://datatracker.ietf.org/doc/draft-li-dmsc-inf-architecture/) (score 3.1) — Dynamic Multi-agent Secured Collaboration Infrastructure Architecture
**Top-rated in AI safety/alignment** (44 drafts):
- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
- [draft-aylward-daap-v2](https://datatracker.ietf.org/doc/draft-aylward-daap-v2/) (4.8) — Defines comprehensive protocol for AI agent accountability including authentication, monitoring, and
- [draft-guy-bary-stamp-protocol](https://datatracker.ietf.org/doc/draft-guy-bary-stamp-protocol/) (4.6) — Defines STAMP protocol for cryptographic delegation and proof in AI agent systems. Provides task-bou
- [draft-drake-email-tpm-attestation](https://datatracker.ietf.org/doc/draft-drake-email-tpm-attestation/) (4.6) — Defines hardware attestation for email using TPM verification chains to prevent spam and provide Syb
- [draft-goswami-agentic-jwt](https://datatracker.ietf.org/doc/draft-goswami-agentic-jwt/) (4.5) — Extends OAuth 2.0 with Agentic JWT to address authorization challenges in autonomous AI systems. Int
### Partially Addressing Ideas
45 extracted ideas touch on this gap:
| Idea | Draft | Type |
|------|-------|------|
| Semantic Routing | draft-agent-gw | mechanism |
| Semantic Routing | draft-ainp-protocol | mechanism |
| Capability-based Discovery | draft-ainp-protocol | pattern |
| Complex Delegation Relationship Management | draft-chen-ai-agent-auth-new-requirements | architecture |
| Capability-Based Discovery Mechanism | draft-cui-ai-agent-discovery-invocation | mechanism |
| Agent Capability Negotiation Protocol | draft-cui-dmsc-agent-cdi | protocol |
| Agent Capability-Based Routing | draft-du-catalist-routing-considerations | mechanism |
| Agent Monitoring and Tracking | draft-fu-nmop-agent-communication-framework | mechanism |
*...and 37 more*
---
## 7. Multi-Agent Coordination Deadlocks
| | |
|---|---|
| **Severity** | HIGH |
| **Category** | A2A protocols |
| **Drafts in category** | 120 |
With 120+ A2A protocol drafts, there's insufficient attention to preventing deadlock situations where multiple agents create circular dependencies or resource conflicts. Missing are standardized deadlock detection and resolution mechanisms.
**Evidence:** 120 A2A protocol drafts with high internal overlap but no systematic deadlock prevention frameworks
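Standardized deadlock detection could start from a wait-for graph over blocked agents, the classic technique from distributed databases. A minimal illustrative sketch (not taken from any draft):

```python
def find_deadlock(wait_for: dict) -> list:
    """Detect a cycle in an agent wait-for graph (agent -> agent it is blocked on).
    Returns the agents forming a cycle, or [] if none. Illustrative only."""
    for start in wait_for:
        seen, node = [], start
        while node in wait_for:
            if node in seen:
                return seen[seen.index(node):]   # the circular dependency
            seen.append(node)
            node = wait_for[node]
    return []

# agent-a waits on agent-b, which waits on agent-c, which waits on agent-a
graph = {"agent-a": "agent-b", "agent-b": "agent-c", "agent-c": "agent-a"}
```

The hard standardization problem is not the cycle check but agreeing on who collects the wait-for edges across administrative domains and who is authorized to break the cycle.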
### Related Drafts ### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-yue-anima-agent-recovery-networks](https://datatracker.ietf.org/doc/draft-yue-anima-agent-recovery-networks/) (score 4.1) — Task-Oriented Multi-Agent Recovery Framework for High-Reliability in Converged M
- [draft-chang-agent-context-interaction](https://datatracker.ietf.org/doc/draft-chang-agent-context-interaction/) (score 2.9) — Agent Context Interaction Optimizations
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-cui-ai-agent-task](https://datatracker.ietf.org/doc/draft-cui-ai-agent-task/) (score 3.0) — Task-oriented Coordination Requirements for AI Agent Protocols
**Top-rated in A2A protocols** (120 drafts):
- [draft-guy-bary-stamp-protocol](https://datatracker.ietf.org/doc/draft-guy-bary-stamp-protocol/) (4.6) — Defines STAMP protocol for cryptographic delegation and proof in AI agent systems. Provides task-bou
- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (4.6) — Defines YANG data model for hierarchical language model coordination across tiny, small, and large L
- [draft-ietf-lake-edhoc](https://datatracker.ietf.org/doc/draft-ietf-lake-edhoc/) (4.6) — Specifies EDHOC, a compact authenticated Diffie-Hellman key exchange protocol for constrained enviro
- [draft-chang-agent-token-efficient](https://datatracker.ietf.org/doc/draft-chang-agent-token-efficient/) (4.5) — Defines ADOL (Agentic Data Optimization Layer) to address token bloat in agent communication protoco
- [draft-chen-oauth-rar-agent-extensions](https://datatracker.ietf.org/doc/draft-chen-oauth-rar-agent-extensions/) (4.2) — Extends OAuth RAR with policy_context and lifecycle_binding members for AI agent environments. Enabl
- [draft-mallick-muacp](https://datatracker.ietf.org/doc/draft-mallick-muacp/) (4.2) — Resource-efficient messaging protocol specifically designed for constrained IoT/Edge devices with de
### Partially Addressing Ideas
11 extracted ideas touch on this gap:
| Idea | Draft | Type |
|------|-------|------|
| Multi-Agent Task Coordination | draft-du-ai-agent-communication-6g-aspect | mechanism |
| AI Gateway | draft-fu-nmop-agent-communication-framework | architecture |
| DMSC Infrastructure Architecture | draft-li-dmsc-inf-architecture | architecture |
| Multi-agent Collaboration Protocol Suite | draft-li-dmsc-macp | protocol |
| Task-based Multi-Agent Coordination | draft-li-dmsc-mcps-agw | pattern |
| Cognitive Networking Substrate | draft-li-semantic-routing-architecture | architecture |
| Agent Communication Use Cases | draft-stephan-ai-agent-6g | pattern |
| Structured Responsibility and Traceability Architecture (SRTA) | draft-takagi-srta-trinity | architecture |
*...and 3 more*
---
## 8. Agent Privacy Preservation
| | |
|---|---|
| **Severity** | HIGH |
| **Category** | Agent identity/auth |
| **Drafts in category** | 108 |
Agents often process sensitive data but current drafts don't adequately address privacy-preserving computation, differential privacy, or secure multi-party computation for agent interactions. This is critical for deployment in regulated industries.
**Evidence:** 108 identity/auth drafts focus on authentication but lack privacy preservation mechanisms for agent data processing
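One concrete mechanism the drafts leave out is differentially private aggregation. A minimal sketch of the standard Laplace mechanism for a count query (the epsilon value is illustrative; a count query has sensitivity 1):

```python
import math
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) via the inverse CDF."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Differentially private count: sensitivity of a count is 1, so Laplace
    noise with scale 1/epsilon yields epsilon-DP for the released value."""
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(42)
noisy = dp_count(100, epsilon=0.5, rng=rng)
```

The released value stays close to the truth on average while masking whether any single record was present, which is the property regulated deployments need.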
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-khatri-sipcore-call-transfer-fail-response](https://datatracker.ietf.org/doc/draft-khatri-sipcore-call-transfer-fail-response/) (score 3.3) — A SIP Response Code (497) for Call Transfer Failure
- [draft-cui-dmsc-agent-cdi](https://datatracker.ietf.org/doc/draft-cui-dmsc-agent-cdi/) (score 3.0) — Cross-Domain Interoperability Framework for AI Agent Collaboration
- [draft-yu-ai-agent-use-cases-in-6g](https://datatracker.ietf.org/doc/draft-yu-ai-agent-use-cases-in-6g/) (score 2.5) — AI Agent Use Cases and Requirements in 6G Network
- [draft-zhang-rvp-problem-statement](https://datatracker.ietf.org/doc/draft-zhang-rvp-problem-statement/) (score 3.5) — Problem Statements and Requirements of Real-Virtual Agent Protocol (RVP): Commun
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-kale-agntcy-federated-privacy](https://datatracker.ietf.org/doc/draft-kale-agntcy-federated-privacy/) (score 3.2) — Privacy-Preserving Federated Learning Architecture for Multi-Tenant AI Agent Sys
**Top-rated in Agent identity/auth** (108 drafts):
- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
- [draft-aylward-daap-v2](https://datatracker.ietf.org/doc/draft-aylward-daap-v2/) (4.8) — Defines comprehensive protocol for AI agent accountability including authentication, monitoring, and
- [draft-guy-bary-stamp-protocol](https://datatracker.ietf.org/doc/draft-guy-bary-stamp-protocol/) (4.6) — Defines STAMP protocol for cryptographic delegation and proof in AI agent systems. Provides task-bou
- [draft-drake-email-tpm-attestation](https://datatracker.ietf.org/doc/draft-drake-email-tpm-attestation/) (4.6) — Defines hardware attestation for email using TPM verification chains to prevent spam and provide Syb
- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (4.6) — Defines YANG data model for hierarchical language model coordination across tiny, small, and large L
### Partially Addressing Ideas
11 extracted ideas touch on this gap:
| Idea | Draft | Type |
|------|-------|------|
| Agent Card Structure | draft-nandakumar-agent-sd-jwt | protocol |
| Pseudonymous Key Generation | draft-bradleylundberg-cfrg-arkg | mechanism |
| Privacy-Preserving Human Tokens | draft-dhir-http-agent-profile | mechanism |
| Cryptographic Erasure Compliance | draft-gaikwad-aps-profile | mechanism |
| Privacy-Respecting Capability Attestation | draft-huang-rats-agentic-eat-cap-attest | pattern |
| Differential Privacy for Agent Models | draft-kale-agntcy-federated-privacy | mechanism |
| Agent Identity Preservation | draft-liu-oauth-a2a-profile | pattern |
| Inference-Time Data Access Policy Claims | draft-messous-eat-ai | mechanism |
*...and 3 more*
---
## 9. Agent Firmware/Model Update Security
| | |
|---|---|
| **Severity** | HIGH |
| **Category** | Model serving/inference |
| **Drafts in category** | 42 |
While model serving is addressed in 42 drafts, there's insufficient focus on secure update mechanisms for agent models and firmware. Missing are standards for cryptographically verified, rollback-capable agent updates.
**Evidence:** 42 model serving drafts but no comprehensive security standards for agent software/model updates
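What "cryptographically verified, rollback-capable" could mean in practice: verify a signature over the payload and refuse non-monotonic versions. The sketch below is illustrative only, and uses a symmetric HMAC as a stand-in for the asymmetric signature scheme a real standard would require:

```python
import hashlib
import hmac

SIGNING_KEY = b"vendor-secret"   # stand-in: a real scheme would use asymmetric keys

def sign_update(model_bytes: bytes, version: int) -> str:
    return hmac.new(SIGNING_KEY, model_bytes + str(version).encode(), hashlib.sha256).hexdigest()

def apply_update(current_version: int, model_bytes: bytes, version: int, sig: str) -> int:
    """Accept an update only if the signature verifies AND the version is newer,
    blocking both tampered payloads and rollback to vulnerable versions."""
    expected = sign_update(model_bytes, version)
    if not hmac.compare_digest(expected, sig):
        raise ValueError("signature mismatch")
    if version <= current_version:
        raise ValueError("rollback refused")
    return version

good_sig = sign_update(b"model-v3", 3)
```

Binding the version into the signed material is what prevents an attacker from replaying an old, validly signed model.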
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-ietf-tls-extended-key-update](https://datatracker.ietf.org/doc/draft-ietf-tls-extended-key-update/) (score 4.2) — Extended Key Update for Transport Layer Security (TLS) 1.3
**Top-rated in Model serving/inference** (42 drafts):
- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (4.6) — Defines YANG data model for hierarchical language model coordination across tiny, small, and large L
- [draft-chang-agent-token-efficient](https://datatracker.ietf.org/doc/draft-chang-agent-token-efficient/) (4.5) — Defines ADOL (Agentic Data Optimization Layer) to address token bloat in agent communication protoco
- [draft-calabria-bmwg-ai-fabric-inference-bench](https://datatracker.ietf.org/doc/draft-calabria-bmwg-ai-fabric-inference-bench/) (4.5) — Defines benchmarking methodology for AI inference network fabrics. Establishes KPIs and test procedu
- [draft-wang-cats-odsi](https://datatracker.ietf.org/doc/draft-wang-cats-odsi/) (4.5) — Specifies framework for decentralized LLM inference across untrusted participants with layer-aware e
- [draft-wmz-nmrg-agent-ndt-arch](https://datatracker.ietf.org/doc/draft-wmz-nmrg-agent-ndt-arch/) (4.2) — Comprehensive architecture combining Network Digital Twin with Agentic AI for intent-based network o
### Partially Addressing Ideas
79 extracted ideas touch on this gap:
| Idea | Draft | Type |
|------|-------|------|
| Multi-layered Security Architecture | draft-aylward-daap-v2 | architecture |
| VERA Zero Trust Reference Architecture | draft-berlinai-vera | architecture |
| Evidence-Based Maturity Runtime | draft-berlinai-vera | mechanism |
| Five Enforcement Pillars with Typed Schemas | draft-berlinai-vera | pattern |
| AI Agent Structured Threat Model | draft-berlinai-vera | requirement |
| Cryptographic Proof-Based Autonomy | draft-berlinai-vera | mechanism |
| Pseudonymous Key Generation | draft-bradleylundberg-cfrg-arkg | mechanism |
| Multi-Agent Security Protection | draft-fu-nmop-agent-communication-framework | mechanism |
*...and 71 more*
---
## 10. Real-time Agent Debugging
| | |
|---|---|
| **Severity** | MEDIUM |
| **Category** | Other AI/agent |
| **Drafts in category** | 26 |
Missing standardized protocols for debugging autonomous agents in production environments. When agents make unexpected decisions, there are no standard interfaces for real-time introspection without disrupting operations.
**Evidence:** 26 other AI/agent drafts suggest various approaches but no standardized debugging protocols for production agents
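A standardized introspection interface might expose a read-only snapshot of decision state that observers can poll without pausing the agent. An illustrative sketch (the interface is invented for illustration, not proposed by any draft):

```python
import json
import threading

class DebuggableAgent:
    """Illustrative production-debug interface: a lock-protected, read-only
    snapshot of decision state that observers poll without stopping the agent."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = {"task": None, "last_decision": None, "step": 0}

    def act(self, task: str, decision: str) -> None:
        with self._lock:
            self._state.update(task=task, last_decision=decision)
            self._state["step"] += 1

    def snapshot(self) -> str:
        """Serialize a copy of internal state; callers cannot mutate the agent."""
        with self._lock:
            return json.dumps(dict(self._state))

agent = DebuggableAgent()
agent.act("reroute-traffic", "shift 20% load to path B")
```

Returning a serialized copy rather than a reference is the non-disruptive part: debuggers observe without being able to perturb the running agent.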
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-an-nmrg-i2icf-cits](https://datatracker.ietf.org/doc/draft-an-nmrg-i2icf-cits/) (score 3.7) — Interface to In-Network Computing Functions for Cooperative Intelligent Transpor
- [draft-zhao-detnet-enhanced-use-cases](https://datatracker.ietf.org/doc/draft-zhao-detnet-enhanced-use-cases/) (score 3.2) — Enhanced Use Cases for Scaling Deterministic Networks
- [draft-zhang-rvp-problem-statement](https://datatracker.ietf.org/doc/draft-zhang-rvp-problem-statement/) (score 3.5) — Problem Statements and Requirements of Real-Virtual Agent Protocol (RVP): Commun
- [draft-yuan-rtgwg-traffic-agent-usecase](https://datatracker.ietf.org/doc/draft-yuan-rtgwg-traffic-agent-usecase/) (score 3.7) — Use cases of the AI Network Traffic Optimization Agent
- [draft-hong-nmrg-agenticai-ps](https://datatracker.ietf.org/doc/draft-hong-nmrg-agenticai-ps/) (score 3.0) — Motivations and Problem Statement of Agentic AI for network management
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
**Top-rated in Other AI/agent** (26 drafts):
- [draft-calabria-bmwg-ai-fabric-inference-bench](https://datatracker.ietf.org/doc/draft-calabria-bmwg-ai-fabric-inference-bench/) (4.5) — Defines benchmarking methodology for AI inference network fabrics. Establishes KPIs and test procedu
- [draft-ietf-tls-ecdhe-mlkem](https://datatracker.ietf.org/doc/draft-ietf-tls-ecdhe-mlkem/) (4.4) — Defines hybrid post-quantum key agreement mechanisms for TLS 1.3 that combine ML-KEM with traditiona
- [draft-wmz-nmrg-agent-ndt-arch](https://datatracker.ietf.org/doc/draft-wmz-nmrg-agent-ndt-arch/) (4.2) — Comprehensive architecture combining Network Digital Twin with Agentic AI for intent-based network o
- [draft-an-nmrg-i2icf-cits](https://datatracker.ietf.org/doc/draft-an-nmrg-i2icf-cits/) (3.7) — Defines framework for orchestrating In-Network Computing Functions in Cooperative Intelligent Transp
- [draft-cui-nmrg-auto-test](https://datatracker.ietf.org/doc/draft-cui-nmrg-auto-test/) (3.6) — Framework for AI-assisted network protocol testing using LLMs and automated test generation. Defines
### Partially Addressing Ideas
23 extracted ideas touch on this gap:
| Idea | Draft | Type |
|------|-------|------|
| A2A Protocol Transport over MOQT | draft-a2a-moqt-transport | protocol |
| QUIC-based Publish/Subscribe for AI Agents | draft-a2a-moqt-transport | mechanism |
| Streaming Capabilities Integration | draft-a2a-moqt-transport | mechanism |
| Action-Based Authorization | draft-aylward-aiga-2 | mechanism |
| Multi-layered Security Architecture | draft-aylward-daap-v2 | architecture |
| Behavioral Monitoring Framework | draft-aylward-daap-v2 | mechanism |
| Context-Aware Task Scheduling | draft-cui-ai-agent-task | mechanism |
| Real-Time Task Adaptability | draft-cui-ai-agent-task | requirement |
*...and 15 more*
---
## 11. Cross-Protocol Agent Migration
| | |
|---|---|
| **Severity** | MEDIUM |
| **Category** | A2A protocols |
| **Drafts in category** | 120 |
No standardized mechanisms for migrating agent state and context when moving between different A2A protocols or infrastructure providers. This creates vendor lock-in and limits agent mobility.
**Evidence:** 120 A2A protocol drafts with high overlap suggest competing approaches but no migration standards between them
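Any migration standard would need a protocol-neutral state envelope that separates portable state from provider internals. An illustrative sketch (the format name and field list are hypothetical):

```python
import json

PORTABLE_FIELDS = ("agent_id", "goals", "memory", "credentials_ref")

def export_state(agent: dict) -> str:
    """Serialize only protocol-neutral fields into a portable envelope;
    provider-specific fields stay behind, since they cannot be assumed to transfer."""
    envelope = {k: agent[k] for k in PORTABLE_FIELDS if k in agent}
    return json.dumps({"format": "agent-state/v0", "state": envelope})

def import_state(blob: str) -> dict:
    data = json.loads(blob)
    if data.get("format") != "agent-state/v0":
        raise ValueError("unsupported migration format")
    return data["state"]

src = {"agent_id": "a-1", "goals": ["monitor"], "memory": {"x": 1},
       "provider_handle": "vendor-internal-123"}   # non-portable, deliberately dropped
```

Agreeing on the portable field set (and on how identity and credentials survive the move) is exactly the standardization work none of the 120 drafts takes on.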
### Related Drafts ### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-narajala-ans](https://datatracker.ietf.org/doc/draft-narajala-ans/) (score 4.2) — Agent Name Service (ANS): A Universal Directory for Secure AI Agent Discovery an
- [draft-ietf-emu-eap-edhoc](https://datatracker.ietf.org/doc/draft-ietf-emu-eap-edhoc/) (score 3.2) — Using the Extensible Authentication Protocol (EAP) with Ephemeral Diffie-Hellman
- [draft-howe-sipcore-mcp-extension](https://datatracker.ietf.org/doc/draft-howe-sipcore-mcp-extension/) (score 3.7) — SIP Extension for Model Context Protocol (MCP)
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
**Top-rated in A2A protocols** (120 drafts):
- [draft-guy-bary-stamp-protocol](https://datatracker.ietf.org/doc/draft-guy-bary-stamp-protocol/) (4.6) — Defines STAMP protocol for cryptographic delegation and proof in AI agent systems. Provides task-bou
- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (4.6) — Defines YANG data model for hierarchical language model coordination across tiny, small, and large L
- [draft-ietf-lake-edhoc](https://datatracker.ietf.org/doc/draft-ietf-lake-edhoc/) (4.6) — Specifies EDHOC, a compact authenticated Diffie-Hellman key exchange protocol for constrained enviro
- [draft-chang-agent-token-efficient](https://datatracker.ietf.org/doc/draft-chang-agent-token-efficient/) (4.5) — Defines ADOL (Agentic Data Optimization Layer) to address token bloat in agent communication protoco
- [draft-chen-oauth-rar-agent-extensions](https://datatracker.ietf.org/doc/draft-chen-oauth-rar-agent-extensions/) (4.2) — Extends OAuth RAR with policy_context and lifecycle_binding members for AI agent environments. Enabl
- [draft-mallick-muacp](https://datatracker.ietf.org/doc/draft-mallick-muacp/) (4.2) — Resource-efficient messaging protocol specifically designed for constrained IoT/Edge devices with de
### Partially Addressing Ideas ### Partially Addressing Ideas
3 extracted ideas touch on this gap: No directly related technical ideas found in current drafts — this gap is entirely unaddressed.
| Idea | Draft | Type |
|------|-------|------|
| Transport-Independent Attestation Format | draft-drake-email-tpm-attestation | extension |
| Cross-Protocol Integration Pattern | draft-rosenberg-aiproto-cheq | pattern |
| Agent Mobility with IPv6 MIPv6 | draft-yc-ipv6-for-ioa | mechanism |
--- ---
## 12. Agent Energy Consumption Optimization
## 5. Agent Resource Accounting and Billing
| | |
|---|---|
| **Severity** | MEDIUM |
| **Category** | ML traffic mgmt |
| **Drafts in category** | 73 |
| **Severity** | HIGH |
| **Category** | new |
| **Drafts in category** | 0 |
Missing standards for energy-aware agent deployment and operation. As AI workloads are energy-intensive, there's no framework for agents to optimize their energy consumption or for infrastructure to enforce energy budgets.
**Evidence:** 73 ML traffic management drafts focus on performance but lack energy consumption considerations for sustainable AI deployment
No standardized protocols exist for tracking and billing computational resources consumed by autonomous agents across distributed systems. This is essential for commercial deployment but completely unaddressed.
**Evidence:** High focus on protocols and deployment but zero drafts addressing economic models
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-ahc-green-smartpdu-yang](https://datatracker.ietf.org/doc/draft-ahc-green-smartpdu-yang/) (score 2.9) — A YANG Model for SmartPDU Monitoring and Control
- [draft-jia-oauth-scope-aggregation](https://datatracker.ietf.org/doc/draft-jia-oauth-scope-aggregation/) (score 3.5) — OAuth 2.0 Scope Aggregation for Multi-Step AI Agent Workflows
**Top-rated in ML traffic mgmt** (73 drafts):
- [draft-calabria-bmwg-ai-fabric-inference-bench](https://datatracker.ietf.org/doc/draft-calabria-bmwg-ai-fabric-inference-bench/) (4.5) — Defines benchmarking methodology for AI inference network fabrics. Establishes KPIs and test procedu
- [draft-dhir-http-agent-profile](https://datatracker.ietf.org/doc/draft-dhir-http-agent-profile/) (4.2) — Defines HTTP Agent Profile for authenticating agent traffic, separating human from agent traffic, an
- [draft-calabria-bmwg-ai-fabric-terminology](https://datatracker.ietf.org/doc/draft-calabria-bmwg-ai-fabric-terminology/) (4.2) — Defines comprehensive benchmarking terminology for AI network fabrics including collective communica
- [draft-li-spring-rdma-multicast-over-srv6](https://datatracker.ietf.org/doc/draft-li-spring-rdma-multicast-over-srv6/) (4.2) — Specifies SRv6 extensions for RDMA multicast delivery with new End.MT behavior and ACK/NACK aggregat
- [draft-song-tsvwg-camp](https://datatracker.ietf.org/doc/draft-song-tsvwg-camp/) (4.2) — Proposes CAMP, a multipath transport protocol for interactive multimodal LLM systems that maintains
### Partially Addressing Ideas
17 extracted ideas touch on this gap:
8 extracted ideas touch on this gap:
| Idea | Draft | Type |
|------|-------|------|
| SmartPDU Telemetry Framework | draft-ahc-green-smartpdu-yang | mechanism |
| Agent Context Distribution | draft-chang-agent-context-interaction | mechanism |
| Context Distribution Optimization Procedures | draft-chang-agent-context-interaction | protocol |
| Schema Deduplication via JSON References | draft-chang-agent-token-efficient | mechanism |
| Agentic Data Optimization Layer (ADOL) | draft-chang-agent-token-efficient | architecture |
| Information Exchange Efficiency | draft-chuyi-nmrg-agentic-network-inference | mechanism |
| Vector Index Workload Optimization | draft-gaikwad-aps-profile | pattern |
| Collaboration Tunnel Protocol (TCT) | draft-jurkovikj-collab-tunnel | protocol |
*...and 9 more*
| SCIM 2.0 Extension for Agents and Agentic Applications | draft-abbey-scim-agent-extension | extension |
| Events Query Protocol | draft-gupta-httpapi-events-query | protocol |
| Micro Agent Communication Protocol (µACP) | draft-mallick-muacp | protocol |
| MOQT Binding for A2A and MCP Protocols | draft-nandakumar-ai-agent-moq-transport | extension |
| SCIM 2.0 Agent Extension | draft-scim-agent-extension | extension |
| Authorized Connection Policy Framework | draft-steckbeck-ua-conn-sec | mechanism |
| Agent Workflow Protocol Well-Known Resource | draft-vinaysingh-awp-wellknown | extension |
| AI Network Traffic Optimization Agent | draft-yuan-rtgwg-traffic-agent-usecase | architecture |
---
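Gap 5's billing problem is easy to make concrete. As a hedged sketch only (no draft defines such a record, and every field name here, including `unit_price_microusd`, is invented for illustration), a metered usage record an agent platform might emit could look like this:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class UsageRecord:
    """One metered unit of agent work; all field names are illustrative."""
    agent_id: str
    task_id: str
    tokens_in: int
    tokens_out: int
    cpu_seconds: float
    unit_price_microusd: int  # price per 1K tokens, in micro-USD

    def cost_microusd(self) -> int:
        # This sketch bills only token usage; CPU time is recorded but unpriced.
        total_tokens = self.tokens_in + self.tokens_out
        return (total_tokens * self.unit_price_microusd) // 1000

    def to_json(self) -> str:
        # Canonical, sorted encoding so two parties can compare records byte-for-byte.
        return json.dumps(asdict(self), sort_keys=True)

record = UsageRecord("agent-42", "task-7", tokens_in=1200, tokens_out=800,
                     cpu_seconds=3.5, unit_price_microusd=500)
```

A real standard would also have to specify signing, aggregation across hops, and dispute handling; this only shows the shape of the accounting unit.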
## 6. Agent Capability Advertisement Verification
| | |
|---|---|
| **Severity** | HIGH |
| **Category** | Agent discovery/reg |
| **Drafts in category** | 87 |
While agent discovery protocols exist, there's no way to cryptographically verify that advertised agent capabilities are accurate. Agents could falsely claim capabilities, leading to system failures.
**Evidence:** 87 discovery drafts but no mention of capability verification mechanisms
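To illustrate what capability verification could look like, the sketch below signs a canonicalized capability list so a consumer can detect tampering. It uses an HMAC with a shared registry key purely as a stand-in for the public-key signatures a real design would need; `REGISTRY_KEY` and the field names are invented for this example.

```python
import hashlib
import hmac
import json

REGISTRY_KEY = b"demo-shared-secret"  # stand-in for a registry's signing key

def sign_capabilities(agent_id: str, capabilities: list) -> dict:
    # Canonical encoding (sorted keys, sorted caps) so signer and
    # verifier hash identical bytes.
    payload = json.dumps({"agent": agent_id, "caps": sorted(capabilities)},
                         sort_keys=True).encode()
    tag = hmac.new(REGISTRY_KEY, payload, hashlib.sha256).hexdigest()
    return {"payload": payload.decode(), "tag": tag}

def verify_capabilities(adv: dict) -> bool:
    expected = hmac.new(REGISTRY_KEY, adv["payload"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, adv["tag"])

adv = sign_capabilities("agent-1", ["translate", "summarize"])
# A consumer altering the claimed capabilities invalidates the tag:
tampered = {"payload": adv["payload"].replace("translate", "transact"),
            "tag": adv["tag"]}
```

Note that signing only proves who advertised the capabilities, not that the agent can actually perform them; attested benchmarks would be needed for the latter.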
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-li-dmsc-inf-architecture](https://datatracker.ietf.org/doc/draft-li-dmsc-inf-architecture/) (score 3.1) — Dynamic Multi-agent Secured Collaboration Infrastructure Architecture
**Top-rated in Agent discovery/reg** (87 drafts):
- [draft-narajala-ans](https://datatracker.ietf.org/doc/draft-narajala-ans/) (4.2) — Introduces Agent Name Service (ANS) as a DNS-based universal directory for AI agent discovery and ve
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (4.2) — Specifies a comprehensive multi-agent collaboration protocol suite using Agent Gateways for registra
- [draft-cui-dns-native-agent-naming-resolution](https://datatracker.ietf.org/doc/draft-cui-dns-native-agent-naming-resolution/) (4.1) — Specifies DNS-native naming and resolution for AI agents using FQDNs and SVCB records. Emphasizes DN
- [draft-nederveld-adl](https://datatracker.ietf.org/doc/draft-nederveld-adl/) (4.1) — Defines ADL, a JSON-based standard for describing AI agents including their capabilities, tools, per
- [draft-rosenberg-ai-protocols](https://datatracker.ietf.org/doc/draft-rosenberg-ai-protocols/) (4.1) — Establishes framework for AI agent communications on the Internet, surveying existing protocols like
### Partially Addressing Ideas
25 extracted ideas touch on this gap:
| Idea | Draft | Type |
|------|-------|------|
| DNS-based AI Agent Discovery | draft-mozleywilliams-dnsop-bandaid | mechanism |
| DNS namespace for AI agent discovery | draft-mozleywilliams-dnsop-dnsaid | mechanism |
| Agent Registration and Discovery Protocol | draft-pioli-agent-discovery | protocol |
| Intent-based Agent Interconnection Protocol | draft-sun-zhang-iaip | protocol |
| Capability Advertisement and Intent Resolution | draft-sz-dmsc-iaip | mechanism |
| Intelligent Agent Communication Gateway Architecture | draft-agent-gw | architecture |
| AI-Native Network Protocol (AINP) | draft-ainp-protocol | protocol |
| Distributed AI Accountability Protocol | draft-aylward-daap-v2 | protocol |
*...and 17 more*
---
## 7. Cross-Domain Agent Communication Security
| | |
|---|---|
| **Severity** | HIGH |
| **Category** | Agent identity/auth |
| **Drafts in category** | 145 |
Current identity/auth solutions don't address secure communication between agents operating in different security domains or trust boundaries. Critical for enterprise and government deployments.
**Evidence:** 145 identity drafts show awareness but cross-domain scenarios appear unaddressed
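One piece of the cross-domain story is a policy gate at the trust boundary. The sketch below shows the kind of check an agent gateway might apply before forwarding a cross-domain message; the domains, assurance levels, and operation names are all invented examples, not anything a draft specifies.

```python
from dataclasses import dataclass

# Per-domain policy table; entries are hypothetical examples.
TRUSTED_DOMAINS = {
    "partner.example": {"min_assurance": 2, "allowed_ops": {"read", "query"}},
    "internal.example": {"min_assurance": 1,
                         "allowed_ops": {"read", "query", "write"}},
}

@dataclass
class CrossDomainRequest:
    source_domain: str
    assurance_level: int  # e.g. 1 = authenticated, 2 = attested
    operation: str

def admit(req: CrossDomainRequest) -> bool:
    """Gateway-side check before forwarding a cross-domain agent message."""
    policy = TRUSTED_DOMAINS.get(req.source_domain)
    if policy is None:
        return False  # unknown trust domain: deny by default
    return (req.assurance_level >= policy["min_assurance"]
            and req.operation in policy["allowed_ops"])
```

The hard unsolved part the gap points at is standardizing how the assurance level itself is established and carried across domains, which this table simply assumes.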
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-diaconu-agents-authz-info-sharing](https://datatracker.ietf.org/doc/draft-diaconu-agents-authz-info-sharing/) (score 3.2) — Cross-Domain AuthZ Information sharing for Agents
- [draft-cui-dmsc-agent-cdi](https://datatracker.ietf.org/doc/draft-cui-dmsc-agent-cdi/) (score 3.0) — Cross-Domain Interoperability Framework for AI Agent Collaboration
- [draft-han-rtgwg-agent-gateway-intercomm-framework](https://datatracker.ietf.org/doc/draft-han-rtgwg-agent-gateway-intercomm-framework/) (score 3.6) — Agent Gateway Intercommunication Framework
- [draft-ni-a2a-ai-agent-security-requirements](https://datatracker.ietf.org/doc/draft-ni-a2a-ai-agent-security-requirements/) (score 3.7) — Security Requirements for AI Agents
- [draft-intellinode-ai-semantic-contract](https://datatracker.ietf.org/doc/draft-intellinode-ai-semantic-contract/) (score 3.2) — Semantic-Driven Traffic Shaping Contract for AI Networks
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
**Top-rated in Agent identity/auth** (145 drafts):
- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
- [draft-aylward-daap-v2](https://datatracker.ietf.org/doc/draft-aylward-daap-v2/) (4.8) — Defines comprehensive protocol for AI agent accountability including authentication, monitoring, and
- [draft-guy-bary-stamp-protocol](https://datatracker.ietf.org/doc/draft-guy-bary-stamp-protocol/) (4.6) — Defines STAMP protocol for cryptographic delegation and proof in AI agent systems. Provides task-bou
- [draft-drake-email-tpm-attestation](https://datatracker.ietf.org/doc/draft-drake-email-tpm-attestation/) (4.6) — Defines hardware attestation for email using TPM verification chains to prevent spam and provide Syb
- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (4.6) — Defines YANG data model for hierarchical language model coordination across tiny, small, and large L
### Partially Addressing Ideas
46 extracted ideas touch on this gap:
| Idea | Draft | Type |
|------|-------|------|
| Centralized Gateway for Multi-Agent Communication | draft-song-dmsc-problem-statement | architecture |
| Multi-Tenant Policy Enforcement Infrastructure | draft-song-dmsc-problem-statement | architecture |
| Intelligent Agent Communication Gateway Architecture | draft-agent-gw | architecture |
| AI-Native Network Protocol (AINP) | draft-ainp-protocol | protocol |
| Agent-to-Agent Communication in Transportation Networks | draft-an-nmrg-i2icf-cits | pattern |
| Zero Trust Runtime Agent Architecture | draft-berlinai-vera | architecture |
| Agentic Data Optimization Layer (ADOL) | draft-chang-agent-token-efficient | protocol |
| Agentic network architecture for multi-agent coordination | draft-chuyi-nmrg-agentic-network-inference | architecture |
*...and 38 more*
---
## 8. Agent Performance Degradation Detection
| | |
|---|---|
| **Severity** | HIGH |
| **Category** | new |
| **Drafts in category** | 0 |
No standardized protocols exist for detecting when AI agents are experiencing model drift, adversarial attacks, or performance degradation. Essential for maintaining autonomous system reliability.
**Evidence:** ML traffic management exists but not agent health monitoring standards
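A minimal form of the missing health monitoring is a rolling comparison of task outcomes against a known baseline. The sketch below is one illustrative heuristic, not a proposed standard; the baseline, window size, and tolerance are assumptions.

```python
from collections import deque

class DriftMonitor:
    """Flags degradation when recent task success rate falls well below baseline."""

    def __init__(self, baseline: float, window: int = 50, tolerance: float = 0.15):
        self.baseline = baseline    # expected success rate, e.g. from evaluation
        self.recent = deque(maxlen=window)
        self.tolerance = tolerance  # allowed absolute drop before alerting

    def record(self, success: bool) -> None:
        self.recent.append(1.0 if success else 0.0)

    def degraded(self) -> bool:
        if len(self.recent) < self.recent.maxlen:
            return False            # not enough evidence yet
        rate = sum(self.recent) / len(self.recent)
        return rate < self.baseline - self.tolerance

monitor = DriftMonitor(baseline=0.9, window=10, tolerance=0.15)
```

A standard would need to go further: defining what counts as a "task success" across heterogeneous agents is exactly the interoperability problem the gap describes.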
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-xiong-rtgwg-use-cases-hp-wan](https://datatracker.ietf.org/doc/draft-xiong-rtgwg-use-cases-hp-wan/) (score 2.6) — Use Cases for High-performance Wide Area Network
### Partially Addressing Ideas
5 extracted ideas touch on this gap:
| Idea | Draft | Type |
|------|-------|------|
| Virtual In-Cloud Router as IPv6 Enhancement Agent | draft-he-yi-srv6ops-ipv6-enhancemnet-in-cloud-uc | architecture |
| 6G Agent Protocol Requirements and Enabling Technologies | draft-hw-ai-agent-6g | requirement |
| Comparative analysis of messaging protocols for agentic AI | draft-mpsb-agntcy-messaging | pattern |
| AI Network Security Agent | draft-yuan-rtgwg-security-agent-usecase | architecture |
| Task-Oriented Multi-Agent Recovery Framework | draft-yue-anima-agent-recovery-networks | architecture |
---
## 9. Legal Liability Attribution Protocols
| | |
|---|---|
| **Severity** | HIGH |
| **Category** | Policy/governance |
| **Drafts in category** | 115 |
Missing technical protocols for creating audit trails that can determine legal liability when autonomous agents cause harm. Governance drafts exist but not technical accountability mechanisms.
**Evidence:** 115 governance drafts but legal technology gap for liability attribution
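The hash-chain approach that drafts like draft-cowles-volt apply to execution traces is the natural building block here. As a simplified sketch (real liability attribution would also need signatures binding entries to identities, which this omits), a tamper-evident audit log can be as small as:

```python
import hashlib
import json

def append_entry(chain: list, event: dict) -> None:
    """Append a tamper-evident entry; each hash covers the previous hash."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps(event, sort_keys=True)
    digest = hashlib.sha256((prev + body).encode()).hexdigest()
    chain.append({"event": event, "prev": prev, "hash": digest})

def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited event or reordered entry fails."""
    prev = "0" * 64
    for entry in chain:
        body = json.dumps(entry["event"], sort_keys=True)
        if entry["prev"] != prev:
            return False
        if hashlib.sha256((prev + body).encode()).hexdigest() != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

log = []
append_entry(log, {"agent": "a1", "action": "transfer", "amount": 10})
append_entry(log, {"agent": "a1", "action": "notify"})
```

The chain makes after-the-fact editing detectable, which is the technical precondition for any legal attribution scheme built on top.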
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-madhavan-aipref-displaybasedpref](https://datatracker.ietf.org/doc/draft-madhavan-aipref-displaybasedpref/) (score 2.5) — A Vocabulary for Controlling Usage of Content Collected by Search and AI Crawler
- [draft-farzdusa-aipref-enduser](https://datatracker.ietf.org/doc/draft-farzdusa-aipref-enduser/) (score 3.8) — AI Preferences Signaling: End User Impact
- [draft-kotecha-agentic-dispute-protocol](https://datatracker.ietf.org/doc/draft-kotecha-agentic-dispute-protocol/) (score 3.6) — Agentic Dispute Protocol
- [draft-cui-dmsc-agent-cdi](https://datatracker.ietf.org/doc/draft-cui-dmsc-agent-cdi/) (score 3.0) — Cross-Domain Interoperability Framework for AI Agent Collaboration
- [draft-ietf-aipref-vocab](https://datatracker.ietf.org/doc/draft-ietf-aipref-vocab/) (score 4.4) — A Vocabulary For Expressing AI Usage Preferences
- [draft-aylward-aiga-1](https://datatracker.ietf.org/doc/draft-aylward-aiga-1/) (score 4.2) — AI Governance and Accountability Protocol (AIGA)
**Top-rated in Policy/governance** (115 drafts):
- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
- [draft-aylward-daap-v2](https://datatracker.ietf.org/doc/draft-aylward-daap-v2/) (4.8) — Defines comprehensive protocol for AI agent accountability including authentication, monitoring, and
- [draft-goswami-agentic-jwt](https://datatracker.ietf.org/doc/draft-goswami-agentic-jwt/) (4.5) — Extends OAuth 2.0 with Agentic JWT to address authorization challenges in autonomous AI systems. Int
- [draft-wang-cats-odsi](https://datatracker.ietf.org/doc/draft-wang-cats-odsi/) (4.5) — Specifies framework for decentralized LLM inference across untrusted participants with layer-aware e
- [draft-birkholz-verifiable-agent-conversations](https://datatracker.ietf.org/doc/draft-birkholz-verifiable-agent-conversations/) (4.5) — Defines CDDL-based data format for verifiable agent conversation records using COSE signing. Support
### Partially Addressing Ideas
No directly related technical ideas found in current drafts — this gap is entirely unaddressed.
---
## 10. Agent Memory and State Persistence Standards
| | |
|---|---|
| **Severity** | MEDIUM |
| **Category** | Data formats/interop |
| **Drafts in category** | 165 |
No standardized formats or protocols exist for how agents should persist long-term memory, experience, and learned behaviors across system restarts or migrations. Each implementation creates proprietary solutions.
**Evidence:** 165 data format drafts focus on communication but not persistent state management
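The core of any persistence standard would be a versioned, portable snapshot format so state survives restarts and migrations between implementations. A hedged sketch follows; `STATE_SCHEMA_VERSION` and the state keys are invented, and a real format would also cover embeddings, provenance, and encryption.

```python
import json

STATE_SCHEMA_VERSION = 1  # invented; a real standard would pin and evolve this

def snapshot(agent_state: dict) -> str:
    """Serialize agent memory with an explicit schema version for migration."""
    return json.dumps({"version": STATE_SCHEMA_VERSION, "state": agent_state},
                      sort_keys=True)

def restore(blob: str) -> dict:
    doc = json.loads(blob)
    if doc["version"] != STATE_SCHEMA_VERSION:
        # A real implementation would run a migration here instead of failing.
        raise ValueError("unsupported state version: %s" % doc["version"])
    return doc["state"]

saved = snapshot({"memory": ["met user"], "goals": ["finish report"]})
```

The explicit version field is the point: it is what lets a future reader distinguish "old but migratable" state from garbage, which proprietary formats today handle ad hoc.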
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
- [draft-li-dmsc-macp](https://datatracker.ietf.org/doc/draft-li-dmsc-macp/) (score 4.2) — Multi-agent Collaboration Protocol Suite
- [draft-zheng-dispatch-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-dispatch-agent-identity-management/) (score 3.3) — Agent Identity Managenment
- [draft-fu-nmop-agent-communication-framework](https://datatracker.ietf.org/doc/draft-fu-nmop-agent-communication-framework/) (score 3.0) — Agent Communication Framework for Network AIOps
- [draft-zyyhl-agent-networks-framework](https://datatracker.ietf.org/doc/draft-zyyhl-agent-networks-framework/) (score 3.6) — Framework for AI Agent Networks
- [draft-gaikwad-llm-benchmarking-terminology](https://datatracker.ietf.org/doc/draft-gaikwad-llm-benchmarking-terminology/) (score 2.7) — Benchmarking Terminology for Large Language Model Serving
**Top-rated in Data formats/interop** (165 drafts):
- [draft-cowles-volt](https://datatracker.ietf.org/doc/draft-cowles-volt/) (4.8) — Defines tamper-evident execution trace format for AI agent workflows using hash chains and cryptogra
- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (4.6) — Defines YANG data model for hierarchical language model coordination across tiny, small, and large L
- [draft-ietf-lake-app-profiles](https://datatracker.ietf.org/doc/draft-ietf-lake-app-profiles/) (4.6) — Defines canonical CBOR representation for EDHOC application profiles and coordination mechanisms for
- [draft-chang-agent-token-efficient](https://datatracker.ietf.org/doc/draft-chang-agent-token-efficient/) (4.5) — Defines ADOL (Agentic Data Optimization Layer) to address token bloat in agent communication protoco
- [draft-birkholz-verifiable-agent-conversations](https://datatracker.ietf.org/doc/draft-birkholz-verifiable-agent-conversations/) (4.5) — Defines CDDL-based data format for verifiable agent conversation records using COSE signing. Support
### Partially Addressing Ideas
16 extracted ideas touch on this gap:
| Idea | Draft | Type |
|------|-------|------|
| Compliance-oriented agent memory model | draft-gaikwad-aps-profile | pattern |
| Zero Trust Interoperability Framework | draft-liu-saag-zt-problem-statement | requirement |
| Intelligent Agent Communication Gateway Architecture | draft-agent-gw | architecture |
| Zero Trust Runtime Agent Architecture | draft-berlinai-vera | architecture |
| Agentic Hypercall Protocol | draft-campbell-agentic-http | pattern |
| Agent Persistent State Profile | draft-gaikwad-aps-profile | architecture |
| Agentic AI for Autonomous Network Management | draft-hong-nmrg-agenticai-ps | requirement |
| LISP-based geospatial intelligence network | draft-ietf-lisp-nexagon | protocol |
*...and 8 more*
---
## 11. Agent-to-Human Escalation Standards
| | |
|---|---|
| **Severity** | MEDIUM |
| **Category** | Human-agent interaction |
| **Drafts in category** | 41 |
While human-in-the-loop protocols exist, there's no standardized framework for when and how agents should escalate decisions to humans based on uncertainty, risk, or ethical considerations.
**Evidence:** Only 41 human-agent interaction drafts versus complex autonomous systems requiring escalation
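An escalation standard would ultimately reduce to a shared decision rule over uncertainty, risk, and reversibility. The sketch below is one illustrative policy; the threshold values and risk categories are assumptions, not drawn from any draft.

```python
def should_escalate(confidence: float, risk: str, reversible: bool) -> bool:
    """Escalate to a human when the agent is unsure, the stakes are high,
    or the action cannot be undone. Thresholds are illustrative."""
    # Minimum confidence required to act autonomously at each risk level.
    risk_floor = {"low": 0.5, "medium": 0.75, "high": 0.95}
    if not reversible and risk != "low":
        return True  # irreversible, non-trivial actions always go to a human
    return confidence < risk_floor[risk]
```

Standardizing this would mean agreeing on how `confidence` is calibrated across models and how `risk` is classified, which is precisely what the 41 human-agent interaction drafts do not yet cover.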
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-williams-netmod-lm-hierarchy-topology](https://datatracker.ietf.org/doc/draft-williams-netmod-lm-hierarchy-topology/) (score 4.6) — Hierarchical Topology for Language Model Coordination
- [draft-ietf-websec-mime-sniff](https://datatracker.ietf.org/doc/draft-ietf-websec-mime-sniff/) (score 3.7) — Media Type Sniffing
- [draft-scrm-aiproto-usecases](https://datatracker.ietf.org/doc/draft-scrm-aiproto-usecases/) (score 4.1) — Agentic AI Use Cases
- [draft-zeng-opsawg-llm-netconf-gap](https://datatracker.ietf.org/doc/draft-zeng-opsawg-llm-netconf-gap/) (score 3.9) — Gap Analysis of Network Configuration Protocols in LLM-Driven Intent-Based Netwo
- [draft-jadoon-nmrg-agentic-ai-autonomous-networks](https://datatracker.ietf.org/doc/draft-jadoon-nmrg-agentic-ai-autonomous-networks/) (score 4.1) — Agentic AI Architectural Principles for Autonomous Computer Networks
**Top-rated in Human-agent interaction** (41 drafts):
- [draft-drake-email-tpm-attestation](https://datatracker.ietf.org/doc/draft-drake-email-tpm-attestation/) (4.6) — Defines hardware attestation for email using TPM verification chains to prevent spam and provide Syb
- [draft-ietf-aipref-vocab](https://datatracker.ietf.org/doc/draft-ietf-aipref-vocab/) (4.4) — Defines a standardized vocabulary for expressing preferences about how digital assets should be used
- [draft-dhir-http-agent-profile](https://datatracker.ietf.org/doc/draft-dhir-http-agent-profile/) (4.2) — Defines HTTP Agent Profile for authenticating agent traffic, separating human from agent traffic, an
- [draft-song-tsvwg-camp](https://datatracker.ietf.org/doc/draft-song-tsvwg-camp/) (4.2) — Proposes CAMP, a multipath transport protocol for interactive multimodal LLM systems that maintains
- [draft-liu-agent-operation-authorization](https://datatracker.ietf.org/doc/draft-liu-agent-operation-authorization/) (4.1) — Specifies framework for verifiable delegation of actions from humans to AI agents using JWT tokens.
### Partially Addressing Ideas
No directly related technical ideas found in current drafts — this gap is entirely unaddressed.
---
## 12. Federated Agent Learning Privacy
| | |
|---|---|
| **Severity** | MEDIUM |
| **Category** | new |
| **Drafts in category** | 0 |
Federated AI operations models exist but lack privacy-preserving protocols for agents learning from shared experiences without exposing sensitive data from individual deployments.
**Evidence:** Federated models mentioned but privacy-preserving learning protocols absent
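The usual privacy-preserving ingredient here is clipping each participant's update and adding noise before sharing, as in differentially private federated averaging. The sketch below shows the mechanics only; the clip bound and noise scale are illustrative and are not calibrated to any formal privacy guarantee.

```python
import random

def privatize(update, clip=1.0, noise_scale=0.5, rng=None):
    """Clip a local model update to bounded norm, then add Gaussian noise
    before sharing. Parameter values are illustrative assumptions."""
    rng = rng or random.Random()
    norm = sum(x * x for x in update) ** 0.5
    scale = min(1.0, clip / norm) if norm > 0 else 1.0
    return [x * scale + rng.gauss(0, noise_scale) for x in update]

noisy = privatize([3.0, 4.0])
```

A protocol for this gap would standardize how these parameters are negotiated and attested between deployments, not just the arithmetic.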
### Related Drafts
**Keyword matches** (drafts mentioning gap topic):
- [draft-kale-agntcy-federated-privacy](https://datatracker.ietf.org/doc/draft-kale-agntcy-federated-privacy/) (score 3.2) — Privacy-Preserving Federated Learning Architecture for Multi-Tenant AI Agent Sys
- [draft-cui-dmsc-agent-cdi](https://datatracker.ietf.org/doc/draft-cui-dmsc-agent-cdi/) (score 3.0) — Cross-Domain Interoperability Framework for AI Agent Collaboration
- [draft-ai-traffic](https://datatracker.ietf.org/doc/draft-ai-traffic/) (score 2.5) — Handling inter-DC/Edge AI-related network traffic: Problem statement
- [draft-aft-ai-traffic](https://datatracker.ietf.org/doc/draft-aft-ai-traffic/) (score 3.1) — Handling inter-DC/Edge AI-related network traffic: Problem statement
- [draft-aylward-aiga-1](https://datatracker.ietf.org/doc/draft-aylward-aiga-1/) (score 4.2) — AI Governance and Accountability Protocol (AIGA)
- [draft-zheng-agent-identity-management](https://datatracker.ietf.org/doc/draft-zheng-agent-identity-management/) (score 3.7) — Agent Identity Managenment
### Partially Addressing Ideas
5 extracted ideas touch on this gap:
| Idea | Draft | Type |
|------|-------|------|
| Privacy-Preserving Federated Learning for Multi-Tenant AI Agents | draft-kale-agntcy-federated-privacy | architecture |
| Cross-Domain Agent Interoperability Framework | draft-cui-dmsc-agent-cdi | architecture |
| HTTP Agent Profile (HAP) | draft-dhir-http-agent-profile | protocol |
| AI Network Security Agent | draft-yuan-rtgwg-security-agent-usecase | architecture |
| AI Network Traffic Optimization Agent | draft-yuan-rtgwg-traffic-agent-usecase | architecture |
---
@@ -607,16 +539,14 @@ Missing standards for energy-aware agent deployment and operation. As AI workloa
| Category | Drafts | Gaps | Gap Topics |
|----------|-------:|-----:|------------|
| a2a protocols | 120 | 2 | Multi-Agent Coordination Deadlocks; Cross-Protocol Agent Migration |
| agent identity/auth | 108 | 1 | Agent Privacy Preservation |
| ai safety/alignment | 44 | 2 | Agent Behavior Verification; Agent Capability Degradation Handling |
| autonomous netops | 93 | 1 | Agent Resource Exhaustion Protection |
| data formats/interop | 145 | 1 | Agent-Generated Data Provenance |
| human-agent interaction | 30 | 1 | Human Override Protocols |
| ml traffic mgmt | 73 | 1 | Agent Energy Consumption Optimization |
| model serving/inference | 42 | 1 | Agent Firmware/Model Update Security |
| other ai/agent | 26 | 1 | Real-time Agent Debugging |
| policy/governance | 91 | 1 | Cross-Domain Agent Liability |
| a2a protocols | 150 | 2 | Multi-Agent Consensus Under Byzantine Conditions; Cross-Protocol Agent Migration |
| agent discovery/reg | 87 | 1 | Agent Capability Advertisement Verification |
| agent identity/auth | 145 | 1 | Cross-Domain Agent Communication Security |
| ai safety/alignment | 46 | 2 | Real-time Agent Behavior Verification; Emergency Agent Shutdown Coordination |
| data formats/interop | 165 | 1 | Agent Memory and State Persistence Standards |
| human-agent interaction | 41 | 1 | Agent-to-Human Escalation Standards |
| new | 0 | 3 | Agent Resource Accounting and Billing; Agent Performance Degradation Detection; Federated Agent Learning Privacy |
| policy/governance | 115 | 1 | Legal Liability Attribution Protocols |
## Recommendations

View File

@@ -0,0 +1,656 @@
Internet-Draft AI/Agent WG
Intended status: Standards Track March 2026
Expires: September 08, 2026
Multi-Agent Consensus Protocol (MACP) for Distributed AI Agent Coordination
draft-ai-consensus-protocol-00
Abstract
This document defines the Multi-Agent Consensus Protocol (MACP), a
standardized framework for enabling multiple AI agents to reach
consensus on shared decisions and resolve conflicting objectives
in distributed environments. MACP addresses critical coordination
challenges in autonomous systems where agents must collaborate on
resource allocation, policy enforcement, and decision-making
across organizational and domain boundaries. The protocol
incorporates Byzantine fault tolerance mechanisms, cryptographic
verification, and hierarchical consensus structures to ensure
reliable agreement even in the presence of malicious or
malfunctioning agents. MACP defines message formats, consensus
algorithms, conflict resolution procedures, and integration
patterns with existing agent-to-agent communication protocols. The
protocol supports various consensus models including proof-of-
authority, weighted voting, and reputation-based systems, enabling
deployment across diverse use cases from IoT device coordination
to enterprise AI system orchestration. This specification aims to
reduce fragmentation in multi-agent systems and provide a
foundation for interoperable autonomous agent coordination at
scale.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
This document is intended to have standards-track status.
Distribution of this memo is unlimited.
Table of Contents
1. Introduction ................................................ 3
2. Terminology ................................................. 4
3. Problem Statement ........................................... 5
4. MACP Architecture and Core Components ....................... 6
5. Consensus Algorithms and Message Formats .................... 7
6. Conflict Resolution and Decision Binding .................... 8
7. Integration with Existing Agent Protocols ................... 9
8. Security Considerations ..................................... 10
9. IANA Considerations ......................................... 11
1. Introduction
The proliferation of autonomous AI agents across distributed
computing environments has created an urgent need for standardized
consensus mechanisms that enable coordinated decision-making
without centralized control. As organizations deploy increasing
numbers of intelligent agents for tasks ranging from resource
allocation and policy enforcement to complex multi-party
negotiations, the lack of interoperable consensus protocols has
resulted in fragmented implementations that cannot effectively
coordinate across organizational and domain boundaries. Current
agent-to-agent communication protocols, while addressing basic
message exchange and authentication, provide insufficient
mechanisms for achieving reliable agreement among multiple agents
with potentially conflicting objectives or incomplete information.
Existing consensus approaches in multi-agent systems typically
rely on proprietary coordination mechanisms or adapt consensus
algorithms designed for blockchain and distributed database
applications without addressing the unique requirements of AI
agent coordination. These limitations become particularly acute in
scenarios involving Byzantine fault tolerance, where agents may
exhibit malicious behavior, experience partial failures, or
operate under adversarial conditions. The heterogeneous nature of
AI agent implementations, combined with varying trust
relationships and organizational policies, further complicates the
development of effective consensus mechanisms that can operate
reliably at scale.
The Multi-Agent Consensus Protocol (MACP) addresses these
challenges by providing a standardized framework specifically
designed for AI agent coordination that incorporates proven
consensus algorithms while addressing the unique requirements of
autonomous agent systems. MACP supports multiple consensus models
including proof-of-authority, weighted voting based on agent
reputation or capabilities, and hierarchical consensus structures
that reflect organizational boundaries and trust relationships.
The protocol integrates cryptographic verification mechanisms and
Byzantine fault tolerance to ensure reliable consensus achievement
even in the presence of malicious or malfunctioning agents, while
maintaining compatibility with existing agent communication and
attestation frameworks.
The scope of MACP encompasses the definition of consensus
algorithms optimized for AI agent coordination, standardized
message formats for proposal submission and voting processes,
conflict resolution mechanisms for handling competing objectives,
and integration patterns with existing agent-to-agent protocols.
This specification aims to reduce the current fragmentation in
multi-agent coordination approaches by providing a foundation for
interoperable consensus mechanisms that can scale from small IoT
device clusters to enterprise-wide AI system orchestration. By
establishing common protocols for multi-agent consensus, MACP
enables the development of more robust and coordinated autonomous
systems while maintaining the flexibility required for diverse
deployment scenarios and organizational requirements.
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14 [RFC2119] [RFC8174] when, and only when, they
appear in all capitals, as shown here.
This section establishes terminology specific to multi-agent
consensus operations and distributed AI agent coordination. These
definitions build upon established concepts from distributed
systems literature while introducing terminology specific to
autonomous agent environments. Where applicable, references are
made to related terminology from existing messaging and security
event exchange protocols such as [RFC8600] and distributed
consensus literature.
A "consensus agent" is an autonomous AI entity capable of
participating in distributed decision-making processes by
submitting proposals, evaluating options, and committing to
agreed-upon outcomes. Consensus agents MUST maintain state
consistency with other participants and possess cryptographic
capabilities for message authentication and verification. An
"observer agent" is a non-participating entity that monitors
consensus processes without voting rights or decision authority.
A "coordination domain" defines the scope and boundaries within
which a group of agents operate under shared governance rules and
consensus mechanisms. Each coordination domain establishes its own
policies for membership, voting weights, and decision authority. A
"decision quorum" represents the minimum number or weight
threshold of participating agents required to reach a valid
consensus decision within a coordination domain, expressed either
as an absolute count or percentage of eligible participants.
"Byzantine Fault Tolerance" (BFT) refers to the system's ability
to achieve consensus despite the presence of agents that may
exhibit arbitrary failures, including malicious behavior, message
corruption, or incorrect state reporting. MACP implementations
SHOULD tolerate up to f faulty agents out of 3f+1 participants
(strictly fewer than one-third) in any coordination domain. "Practical
Byzantine Fault Tolerance" (pBFT) describes optimized algorithms
that achieve Byzantine fault tolerance with reduced message
complexity and improved performance characteristics.
"Conflict resolution" encompasses the mechanisms and procedures
used to address competing proposals, resolve deadlocks, and handle
situations where multiple valid decisions could be reached. This
includes tie-breaking algorithms, priority-based selection, and
escalation procedures to higher-level coordination domains.
"Decision binding" refers to the enforcement mechanisms that
ensure participating agents comply with consensus outcomes and
maintain consistency with agreed-upon decisions across the
coordination domain.
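The decision quorum defined above, expressed either as an absolute
count or as a percentage of eligible participants, can be sketched in
a few lines. This is an illustrative model, not normative protocol
text; the class and field names are our own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DecisionQuorum:
    """Minimum participation for a valid consensus decision.

    Exactly one of `absolute` (agent count) or `percentage`
    (share of eligible participants, 0-100) should be set,
    mirroring the two quorum forms defined in the terminology.
    """
    absolute: Optional[int] = None
    percentage: Optional[float] = None

    def is_met(self, votes_cast: int, eligible: int) -> bool:
        if self.absolute is not None:
            return votes_cast >= self.absolute
        if self.percentage is not None:
            return votes_cast >= eligible * self.percentage / 100.0
        raise ValueError("quorum must define a count or a percentage")
```

For example, a 66% quorum over 9 eligible agents is met by 6 votes but
not by 5.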
3. Problem Statement
As the introduction notes, autonomous AI agents deployed across
distributed systems currently lack standardized consensus
mechanisms for coordinating decision-making without centralized
control. Current multi-agent deployments frequently encounter
scenarios where agents must collectively agree on resource
allocation, policy enforcement, task scheduling, or strategic
decisions that affect multiple stakeholders. However, existing
agent-to-agent communication protocols such as FIPA-ACL and
emerging frameworks focus primarily on message exchange and basic
coordination primitives, lacking robust consensus mechanisms
necessary for reliable distributed decision-making. This gap
becomes particularly problematic in cross-organizational
deployments where agents operate under different governance
models, trust assumptions, and operational constraints.
Network partitions and communication failures present fundamental
challenges to consensus achievement in distributed agent
environments. Unlike traditional distributed systems where nodes
typically operate within controlled network environments, AI
agents often function across heterogeneous networks with varying
reliability characteristics, from edge computing environments to
cloud infrastructure spanning multiple providers. Agents may
become temporarily or permanently unreachable, creating scenarios
where consensus decisions must proceed with incomplete information
or be safely aborted to prevent system-wide inconsistencies.
Current agent protocols provide insufficient guidance for handling
these partition scenarios, often resulting in ad-hoc
implementations that cannot guarantee safety properties or
liveness guarantees across different deployment contexts.
Conflicting objectives among participating agents introduce
additional complexity beyond traditional distributed consensus
problems. AI agents frequently operate with competing utility
functions, resource constraints, and optimization targets that may
not align with collective decision requirements. For example, in a
multi-cloud resource allocation scenario, individual agents may
prioritize cost minimization for their respective organizations
while the collective decision requires balancing performance,
security, and availability across all participants. Existing
consensus algorithms assume participants share common objectives
or can reduce decisions to simple binary choices, failing to
address the multi-dimensional optimization problems inherent in AI
agent coordination.
The presence of malicious or Byzantine actors poses significant
threats to consensus integrity in open multi-agent environments.
Unlike closed distributed systems where all participants operate
under unified security models, AI agents may need to establish
consensus across organizational boundaries where some participants
cannot be fully trusted. Malicious agents may attempt to
manipulate consensus outcomes through strategic voting, false
information injection, or coordination attacks designed to prevent
legitimate consensus achievement. Furthermore, compromised or
malfunctioning agents may exhibit Byzantine behavior without
malicious intent, requiring consensus mechanisms that can tolerate
arbitrary failures while maintaining decision quality and system
progress.
Current limitations in existing agent-to-agent protocols create
additional barriers to reliable consensus implementation. Most
protocols focus on peer-to-peer communication abstractions without
addressing the coordination complexity required for multi-party
decision-making. Authentication and authorization mechanisms are
typically designed for bilateral interactions rather than group
consensus scenarios, lacking support for quorum-based verification
or reputation systems that could improve consensus security.
Additionally, existing protocols provide limited support for
hierarchical consensus structures or delegation mechanisms that
could enable scalable decision-making in large agent populations,
forcing implementations toward flat consensus models that may not
perform well beyond small agent groups.
4. MACP Architecture and Core Components
The MACP architecture employs a distributed coordination model
consisting of three primary components: Consensus Coordinators,
Participating Agents, and Decision Domains. This architecture is
designed to scale horizontally while maintaining fault tolerance
and ensuring efficient consensus across diverse agent populations.
The system operates on the principle of domain-partitioned
consensus, where agents are organized into logical groupings based
on their functional roles, trust relationships, or organizational
boundaries. Each Decision Domain maintains its own consensus state
and can interact with other domains through well-defined inter-
domain protocols.
Consensus Coordinators serve as the orchestration layer for MACP
operations within each Decision Domain. A Consensus Coordinator
MUST be capable of maintaining the current consensus state,
managing proposal queues, and coordinating message distribution
among Participating Agents. Coordinators are responsible for
implementing the selected consensus algorithm, enforcing quorum
requirements, and ensuring message ordering consistency. In
deployments requiring high availability, multiple Consensus
Coordinators MAY operate in a redundant configuration using leader
election mechanisms similar to those defined in Raft consensus
algorithms. Coordinators MUST implement Byzantine fault tolerance
measures when operating in environments where malicious behavior
is possible, maintaining consensus integrity even when up to f of
3f+1 coordinators exhibit arbitrary failures.
Participating Agents represent the autonomous entities that
contribute to consensus decisions within the MACP framework. Each
Participating Agent MUST maintain a unique identity within its
Decision Domain and possess the capability to generate, evaluate,
and vote on consensus proposals. Agents are classified into one of
three participation modes: Active Participants that can both
propose and vote on decisions, Voting Participants that can vote
but not propose, and Observer Participants that receive consensus
results but do not participate in the decision process.
Participating Agents MUST implement proposal validation logic
appropriate to their domain context and SHOULD incorporate
reputation tracking mechanisms to assess the trustworthiness of
proposals from other agents.
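The three participation modes above imply a simple permission matrix.
A minimal sketch, with illustrative names:

```python
from enum import Enum


class ParticipationMode(Enum):
    ACTIVE = "active"      # may both propose and vote
    VOTING = "voting"      # may vote, but not propose
    OBSERVER = "observer"  # receives consensus results only


def may_propose(mode: ParticipationMode) -> bool:
    return mode is ParticipationMode.ACTIVE


def may_vote(mode: ParticipationMode) -> bool:
    return mode in (ParticipationMode.ACTIVE, ParticipationMode.VOTING)
```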
Decision Domains establish the scope and context for consensus
operations, defining the set of agents authorized to participate
in specific types of decisions. A Decision Domain MUST specify its
consensus model (proof-of-authority, weighted voting, or
reputation-based), quorum requirements, and decision binding
policies. Domains operate independently but MAY establish inter-
domain communication channels for coordinating decisions that span
multiple domains. The domain configuration MUST include conflict
resolution parameters, timeout specifications, and rollback
procedures to handle consensus failures gracefully.
The interaction patterns between these components follow a
structured request-response model augmented with publish-subscribe
mechanisms for state synchronization. When a Participating Agent
initiates a consensus proposal, it MUST first submit the proposal
to its designated Consensus Coordinator, which validates the
proposal format and participant authorization. The Coordinator
then distributes the proposal to all eligible Participating Agents
within the Decision Domain, collecting votes according to the
configured consensus algorithm. Upon reaching quorum and achieving
consensus, the Coordinator publishes the binding decision to all
participants and updates the domain's consensus state. Failed
consensus attempts trigger the domain's configured rollback
procedures, allowing the system to maintain consistency despite
partial failures or network partitions.
5. Consensus Algorithms and Message Formats
MACP implements multiple consensus algorithms to accommodate
different operational requirements and network conditions. The
base specification MUST support the Practical Byzantine Fault
Tolerance (pBFT) algorithm adapted for multi-agent environments,
while implementations MAY support additional algorithms including
Raft consensus for non-Byzantine scenarios and reputation-
weighted consensus for environments with established agent trust
relationships. The pBFT implementation assumes a maximum of f
Byzantine agents out of 3f+1 total participating agents, providing
safety and liveness guarantees under standard network assumptions.
Each consensus algorithm is identified by a unique Algorithm
Identifier (AID) registered with IANA as specified in Section 9.
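The f-of-3f+1 assumption above translates directly into the largest
number of Byzantine agents a domain of n participants can tolerate; a
quick sketch:

```python
def max_byzantine(n: int) -> int:
    """Largest f such that n >= 3f + 1 (the standard pBFT bound)."""
    if n < 1:
        raise ValueError("need at least one agent")
    return (n - 1) // 3
```

So 4 agents tolerate 1 faulty agent, 10 tolerate 3, and a domain of 3
tolerates none.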
The MACP consensus process follows a four-phase protocol:
Proposal, Pre-voting, Voting, and Commitment. During the Proposal
phase, any authorized agent MAY submit decision proposals to the
designated consensus coordinator for the relevant coordination
domain. The Pre-voting phase allows agents to signal their
preliminary position and identify potential conflicts or
dependencies with other pending proposals. The Voting phase
requires participating agents to submit cryptographically signed
votes within a specified timeout window, with vote validity
determined by the agent's authorization level and current
reputation score. The Commitment phase broadcasts the final
decision and requires acknowledgment from a quorum of agents
before considering the consensus binding.
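The four-phase sequence above can be sketched as a tiny state machine
that only permits the named transitions. Illustrative, not normative:

```python
# Phase order as defined by the MACP consensus process.
PHASES = ("proposal", "pre-voting", "voting", "commitment")


def next_phase(current: str) -> str:
    """Advance Proposal -> Pre-voting -> Voting -> Commitment."""
    i = PHASES.index(current)  # raises ValueError for unknown phases
    if i == len(PHASES) - 1:
        raise ValueError("commitment is the final phase")
    return PHASES[i + 1]
```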
MACP defines standardized message formats using JSON serialization
with mandatory digital signatures following RFC 7515 (JSON Web
Signature). All consensus messages MUST include a common header
containing the message type, consensus session identifier,
timestamp, originating agent identifier, and cryptographic
signature. Proposal messages additionally contain the proposed
decision payload, expected quorum size, voting timeout duration,
and conflict resolution parameters. Vote messages include the
proposal hash, agent's decision (ACCEPT, REJECT, ABSTAIN), voting
weight, and optional reasoning metadata. Commitment messages
broadcast the final consensus result, participating agent list,
vote tally, and binding duration for the agreed decision.
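A vote message carrying the common header fields above might look like
the following sketch. For testability the `sig` field here is an HMAC
stand-in; a conforming implementation would sign the payload as a JWS
per RFC 7515, and all field names are illustrative.

```python
import hashlib
import hmac
import json


def make_vote(key: bytes, session_id: str, agent_id: str,
              proposal_hash: str, decision: str, weight: int,
              timestamp: int) -> dict:
    """Build a vote message: common header (type, session, timestamp,
    originating agent) plus proposal hash, decision, and weight.

    `sig` is an HMAC stand-in for illustration only; real MACP
    messages carry a JWS (RFC 7515) digital signature.
    """
    if decision not in ("ACCEPT", "REJECT", "ABSTAIN"):
        raise ValueError("decision must be ACCEPT, REJECT, or ABSTAIN")
    body = {
        "type": "vote",
        "session": session_id,
        "agent": agent_id,
        "timestamp": timestamp,
        "proposal": proposal_hash,
        "decision": decision,
        "weight": weight,
    }
    payload = json.dumps(body, sort_keys=True).encode()
    body["sig"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return body


def verify_vote(key: bytes, msg: dict) -> bool:
    """Recompute the stand-in signature over everything except `sig`."""
    body = {k: v for k, v in msg.items() if k != "sig"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, msg["sig"])
```

Canonical serialization (`sort_keys=True`) matters: both signer and
verifier must derive byte-identical payloads from the same fields.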
Message validation requires verification of agent authorization,
signature authenticity, and temporal constraints before
processing. Agents MUST reject messages with invalid signatures,
expired timestamps beyond the configured clock skew tolerance, or
originating from agents not authorized for the specific
coordination domain. Vote aggregation follows the specified
consensus algorithm with additional validation for vote weight
consistency and duplicate vote detection. The consensus
coordinator MUST broadcast commitment messages to all
participating agents and maintain an audit log of the complete
consensus session for accountability purposes as defined in
existing agent attestation frameworks.
Timeout handling and failure recovery mechanisms ensure liveness
properties under adverse network conditions. If insufficient votes
are received within the voting timeout window, the consensus
coordinator MUST initiate a new consensus round with an
exponentially increasing timeout duration, up to a maximum
threshold defined in the coordination domain configuration.
Network partition scenarios are addressed through coordinator
election protocols that prevent split-brain consensus decisions.
Failed consensus attempts trigger rollback procedures that notify
all participating agents of the unsuccessful coordination attempt
and release any provisionally allocated resources pending the
consensus outcome.
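The exponentially increasing retry timeout described above, capped at
a domain-configured maximum, might be computed like this (the base and
cap values in the example are illustrative, not defaults from the
specification):

```python
def voting_timeout(base: float, round_number: int, cap: float) -> float:
    """Timeout for a consensus round: base * 2^(round-1), capped.

    `base` and `cap` come from the coordination domain
    configuration; round_number starts at 1 for the first attempt.
    """
    if round_number < 1:
        raise ValueError("rounds are numbered from 1")
    return min(base * 2 ** (round_number - 1), cap)
```

With a 30-second base and a 480-second cap, rounds 1 through 6 wait
30, 60, 120, 240, 480, and 480 seconds.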
6. Conflict Resolution and Decision Binding
Conflict resolution in MACP occurs when multiple competing
proposals are submitted simultaneously or when agents disagree on
the validity or priority of proposed decisions. When conflicts are
detected during the proposal phase, participating agents MUST
invoke the conflict resolution mechanism before proceeding with
the voting phase. The protocol defines three primary conflict
types: proposal conflicts (multiple proposals for the same
decision domain), timing conflicts (simultaneous submissions
within the conflict detection window), and validity conflicts
(disagreement on proposal prerequisites or constraints). Agents
MUST maintain a conflict detection buffer with a configurable
timeout period (default 30 seconds) to identify competing
proposals before initiating consensus procedures.
The tie-breaking procedure activates when voting results in
equivalent support for multiple proposals or when no proposal
achieves the required quorum threshold. MACP employs a
hierarchical tie-breaking mechanism starting with proposal
priority levels, followed by submitter reputation scores, and
finally deterministic hash-based selection using the combined hash
of conflicting proposal identifiers. Participating agents MUST
apply tie-breaking rules in the specified order and MUST reach
agreement on tie-breaking criteria during the initial coordination
domain establishment. When tie-breaking fails to resolve
conflicts, the consensus coordinator MUST initiate a cooling-off
period of at least 60 seconds before allowing resubmission of
conflicting proposals.
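The hierarchical tie-break above (priority level, then submitter
reputation, then deterministic hash-based selection over the combined
proposal identifiers) can be sketched as follows; the proposal fields
are illustrative:

```python
import hashlib


def break_tie(proposals):
    """Pick one proposal from a tied set.

    Each proposal is a dict with `id`, `priority` (higher wins), and
    `reputation` (submitter score, higher wins). If both still tie,
    fall back to a hash over the combined proposal identifiers so
    every agent selects the same winner.
    """
    combined = hashlib.sha256(
        "".join(sorted(p["id"] for p in proposals)).encode()
    ).hexdigest()

    def rank(p):
        # Hash each id together with the combined digest; the result
        # is deterministic across all participating agents.
        h = hashlib.sha256((combined + p["id"]).encode()).hexdigest()
        return (-p["priority"], -p["reputation"], h)

    return min(proposals, key=rank)
```

Determinism is the point: given the same tied set, every honest agent
computes the same ranking and therefore the same winner.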
Decision binding ensures that consensus outcomes are enforced
across all participating agents through cryptographic commitment
and distributed verification mechanisms. Once consensus is
achieved, all participating agents MUST generate binding
commitment messages containing the decision hash, their digital
signature, and a commitment timestamp. The binding phase requires
acknowledgment from at least the same quorum that approved the
original proposal within a configurable binding timeout (default
120 seconds). Agents MUST store binding commitments in their local
decision ledger and MUST reject future proposals that violate
committed decisions unless explicitly superseded through the
decision override mechanism.
Timeout handling addresses scenarios where consensus cannot be
achieved within specified time bounds or when participating agents
become unresponsive during critical phases. MACP defines distinct
timeout periods for each consensus phase: proposal timeout
(default 60 seconds), voting timeout (default 180 seconds), and
binding timeout (default 120 seconds). When timeouts occur, the
consensus coordinator MUST broadcast a timeout notification and
initiate graceful degradation procedures, which may include
reducing quorum requirements, extending timeout periods, or
aborting the consensus attempt. Agents that fail to respond within
timeout periods MUST be temporarily excluded from the current
consensus round but MAY rejoin subsequent rounds.
Rollback mechanisms provide recovery capabilities when consensus
failures occur after the voting phase or when binding commitments
cannot be properly established. Rollback procedures MUST be
initiated when binding acknowledgments fall below the required
threshold, when Byzantine fault detection identifies compromised
consensus results, or when critical participating agents report
implementation failures. The rollback process requires the
consensus coordinator to broadcast rollback notifications
containing the failed consensus identifier, rollback reason code,
and reversion instructions. All participating agents MUST
acknowledge rollback notifications, remove associated decision
commitments from their local ledgers, and reset their consensus
state to allow for subsequent retry attempts with modified
parameters or participant sets.
7. Integration with Existing Agent Protocols
MACP is designed to operate as an overlay protocol that integrates
seamlessly with existing agent-to-agent communication frameworks
and infrastructure. Implementations MUST support integration with
standard authentication protocols including OAuth 2.0 [RFC6749],
OpenID Connect, and X.509 certificate-based authentication systems
commonly deployed in enterprise environments. MACP consensus
messages SHOULD leverage existing secure transport mechanisms such
as TLS 1.3 [RFC8446] or DTLS for UDP-based communications,
ensuring that consensus operations benefit from established
security practices without requiring separate cryptographic
implementations.
Integration with agent accountability frameworks requires MACP
implementations to maintain comprehensive audit trails of
consensus participation and decision outcomes. Consensus
coordinators MUST log all proposal submissions, voting records,
and final decisions in formats compatible with existing audit and
compliance systems. When operating alongside agent attestation
protocols, MACP SHOULD verify agent identity and authorization
status before allowing participation in consensus processes,
utilizing existing identity providers and policy enforcement
points where available. The protocol defines standard interfaces
for querying agent reputation scores and authorization levels from
external accountability systems.
MACP consensus operations MUST be designed to coexist with
workflow management and orchestration platforms commonly used in
distributed AI deployments. Implementations SHOULD provide APIs
and event notifications that allow workflow systems to trigger
consensus processes when collective decisions are required, and to
receive binding consensus outcomes for subsequent workflow
execution. The protocol supports asynchronous integration patterns
where consensus results can be delivered to workflow systems
through message queues, webhooks, or polling interfaces, ensuring
compatibility with diverse orchestration architectures.
For environments utilizing existing agent-to-agent communication
protocols such as FIPA-ACL or custom REST-based agent APIs, MACP
provides adapter interfaces that translate consensus-specific
messages into native communication formats. Protocol
implementations MAY offer plugin architectures that allow custom
integration modules for proprietary agent communication systems,
while maintaining core consensus algorithm integrity. Standard
message mapping templates are provided for common integration
scenarios, reducing implementation complexity for organizations
with established agent communication infrastructure.
The protocol includes provisions for gradual deployment in mixed
environments where only a subset of agents support MACP consensus
mechanisms. Non-MACP agents MAY participate in consensus processes
through proxy agents that translate between native agent protocols
and MACP message formats, though such deployments SHOULD implement
additional verification mechanisms to ensure proxy agent fidelity.
Integration guidelines specify fallback procedures for scenarios
where consensus mechanisms are unavailable, allowing graceful
degradation to existing coordination approaches while maintaining
system stability.
8. Security Considerations
Security considerations for MACP deployment are paramount given
the distributed nature of multi-agent systems and the potential
for malicious actors to compromise consensus integrity. The
protocol MUST implement comprehensive threat mitigation strategies
to address attacks specific to distributed consensus mechanisms.
Primary security concerns include Sybil attacks where malicious
actors create multiple false agent identities to gain
disproportionate voting power, coordination attacks where
compromised agents collude to manipulate consensus outcomes, and
consensus manipulation through message tampering or replay
attacks. Additionally, MACP implementations MUST consider denial-
of-service attacks targeting consensus coordinators, eclipse
attacks isolating honest agents from the consensus network, and
long-range attacks where compromised agents attempt to rewrite
historical consensus decisions.
Cryptographic requirements for MACP implementations MUST include
strong identity verification mechanisms to prevent unauthorized
participation in consensus processes. Each participating agent
MUST possess a cryptographically verifiable identity backed by
public-key infrastructure (with certificate status checking, e.g.
OCSP [RFC6960]) or emerging decentralized identity standards.
Digital signatures MUST be used for all consensus
messages including proposals, votes, and commitments, with
signature schemes providing at least 128-bit security strength as
specified in [RFC3766]. Message integrity MUST be protected
through cryptographic hash functions resistant to collision
attacks, and implementations SHOULD employ hash-based message
authentication codes (HMAC) for additional verification. Time-
based replay attack prevention MUST be implemented through message
timestamps and nonce mechanisms, with strict validation of message
freshness windows.
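The timestamp-plus-nonce replay prevention described above might be
sketched as a small guard object; the skew window is illustrative, and
a production version would also expire old nonces rather than keep
them forever:

```python
class ReplayGuard:
    """Reject messages outside the freshness window or reusing a nonce."""

    def __init__(self, max_skew_seconds: float = 30.0):
        self.max_skew = max_skew_seconds
        self.seen_nonces = set()

    def accept(self, timestamp: float, nonce: str, now: float) -> bool:
        if abs(now - timestamp) > self.max_skew:
            return False  # stale, or too far in the future
        if nonce in self.seen_nonces:
            return False  # replayed message
        self.seen_nonces.add(nonce)
        return True
```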
Identity verification mechanisms MUST prevent Sybil attacks
through robust agent authentication and reputation tracking
systems. MACP implementations SHOULD integrate with existing
Public Key Infrastructure (PKI) systems or emerging decentralized
identity frameworks to establish verifiable agent identities.
Consensus coordinators MUST maintain authoritative lists of
eligible participating agents and regularly validate agent
credentials against trusted identity providers. Multi-factor
authentication SHOULD be employed for high-stakes consensus
decisions, potentially including hardware security module (HSM)
attestation or trusted execution environment verification. Agent
reputation systems MAY be implemented to track historical behavior
and adjust voting weights based on demonstrated trustworthiness,
though such systems MUST include mechanisms to prevent reputation
manipulation attacks.
Protection mechanisms for consensus integrity MUST address both
technical and game-theoretic attack vectors inherent in
distributed decision-making systems. Byzantine Fault Tolerant
consensus algorithms MUST be configured to handle the maximum
expected number of malicious agents according to theoretical
bounds, typically supporting up to f faulty agents in a network of
3f+1 total agents. Network-level protections SHOULD include
encrypted communication channels using protocols such as TLS 1.3
[RFC8446] and distributed denial-of-service (DDoS) mitigation
strategies to ensure consensus availability. Implementations MUST
provide consensus finality mechanisms that prevent retroactive
modification of agreed-upon decisions and provide cryptographic
proofs of consensus achievement. Regular security audits and
penetration testing SHOULD be conducted on MACP implementations,
with particular attention to consensus algorithm correctness and
cryptographic implementation vulnerabilities.
Economic and incentive-based security measures SHOULD be
considered to discourage malicious behavior and ensure honest
participation in consensus processes. Stake-based consensus
mechanisms MAY be implemented where agents must commit resources
or reputation to participate in decision-making, creating economic
disincentives for malicious behavior. Slashing mechanisms SHOULD
be employed to penalize agents that violate consensus rules or
demonstrate Byzantine behavior. However, such economic measures
MUST be carefully designed to prevent wealth concentration attacks
and ensure broad participation accessibility. Monitoring and
anomaly detection systems SHOULD continuously analyze consensus
patterns to identify potential coordinated attacks or unusual
voting behaviors that may indicate compromise. Emergency response
procedures MUST be established to handle detected security
incidents, including mechanisms for temporarily suspending
consensus participation of suspected malicious agents and
initiating incident response protocols.
9. IANA Considerations
This document requests IANA to create and maintain several new
registries to support the Multi-Agent Consensus Protocol (MACP)
and ensure protocol extensibility and interoperability. The
registries defined in this section will enable standardized
identification of protocol elements while allowing for future
enhancements and vendor-specific extensions without creating
conflicts or ambiguity in multi-agent consensus implementations.
IANA is requested to establish a "Multi-Agent Consensus Protocol
(MACP) Parameters" registry group containing three distinct
registries. The first registry, "MACP Message Types", SHALL
contain identifiers for all MACP message types including consensus
proposals, votes, commitments, and administrative messages.
Message type identifiers MUST be allocated as 16-bit unsigned
integers in the range 0x0000-0xFFFF, with values 0x0000-0x7FFF
reserved for IETF-defined message types and values 0x8000-0xFFFF
available for private use and experimental implementations.
Registration of new message types in the IETF range requires
Standards Action as defined in RFC 8126, and MUST include a
complete message format specification and security considerations.
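The 16-bit split above between IETF-defined and private-use message
types is easy to check mechanically; a sketch:

```python
def message_type_registry(msg_type: int) -> str:
    """Classify a MACP message type ID per the ranges above:
    0x0000-0x7FFF IETF-defined, 0x8000-0xFFFF private use."""
    if not 0x0000 <= msg_type <= 0xFFFF:
        raise ValueError("message type must fit in 16 bits")
    return "ietf" if msg_type <= 0x7FFF else "private-use"
```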
The second registry, "MACP Consensus Algorithm Identifiers", SHALL
contain unique identifiers for consensus algorithms supported by
MACP implementations. Algorithm identifiers MUST be allocated as
UTF-8 strings following the pattern "algorithm-name.version" with
a maximum length of 64 characters. The registry MUST include
algorithm names, version numbers, reference specifications,
security properties, and applicable use case constraints for each
entry. Initial registry entries SHALL include "pbft.1.0" for
Practical Byzantine Fault Tolerance, "poa.1.0" for Proof of
Authority, and "weighted-vote.1.0" for weighted voting consensus.
New algorithm registrations require Expert Review with designated
experts having demonstrated expertise in distributed consensus
mechanisms and multi-agent systems.
The third registry, "MACP Conflict Resolution Methods", SHALL
contain identifiers for standardized conflict resolution
procedures used when multiple competing proposals achieve similar
consensus scores. Resolution method identifiers MUST follow the
same UTF-8 string format as consensus algorithms and include
detailed descriptions of resolution logic, fairness guarantees,
and termination conditions. Registration requires Expert Review
and MUST demonstrate deterministic behavior across all
participating agents. Initial entries SHALL include "timestamp-
priority.1.0", "hash-ordering.1.0", and "weighted-random.1.0" with
complete algorithmic specifications.
IANA is further requested to establish a "MACP Extension
Parameters" registry for protocol extension identifiers used in
MACP header fields and capability negotiation. Extension
identifiers MUST be allocated as reverse DNS notation strings to
prevent conflicts and enable vendor-specific extensions while
maintaining global uniqueness. The registry SHALL operate under
First Come First Served allocation policy as defined in RFC 8126,
requiring only basic documentation of the extension purpose and
format. All registry entries MUST include contact information for
the registering organization and SHOULD reference publicly
accessible specification documents for interoperability purposes.
Author's Address
Generated by IETF Draft Analyzer
2026-03-07
@@ -12,7 +12,7 @@
 | drafts | 434 | Up from 361 after 2026-03-07 fetch |
 | ratings | 434 | 1:1 with drafts |
 | authors | 557 | Unique persons from Datatracker |
-| ideas | 419 | See "Ideas Count History" below |
+| ideas | 462 | Re-extracted 2026-03-08, see "Ideas Count History" below |
 | gaps | 11 | Not 12 -- see gap list below |
 | embeddings | 434 | 1:1 with drafts |
 | draft_authors | 1,057 | Draft-author links |
@@ -79,24 +79,25 @@ Blog posts reference 12 gaps with different names (e.g., "Agent Resource Exhaust
 ## Ideas Count History
 
-The database currently contains **419 ideas** across **377 drafts**. This is the third different count encountered:
+The database currently contains **462 ideas** across **415 drafts**. This is the fourth count encountered:
 
 | Source | Count | Date | Likely Explanation |
 |--------|-------|------|-------------------|
 | Blog post 5 filename | 1,262 | ~2026-03-03 | Pre-expansion dataset (260 drafts), before dedup |
 | Blog post 5 text / master stats | 1,780 | ~2026-03-05 | Post-expansion (361 drafts), before dedup |
-| Current database | 419 | 2026-03-08 | After `dedup_ideas` run (0.85 threshold) or re-extraction with different params |
+| Previous database | 419 | 2026-03-08 | After `dedup_ideas` run (0.85 threshold) or re-extraction with different params |
+| Current database | 462 | 2026-03-08 | After re-extraction for 38 drafts missing ideas (474 total drafts, 59 still without ideas) |
 
 ### Ideas by Type (current DB)
 
 | Type | Count |
 |------|-------|
-| protocol | 96 |
-| architecture | 95 |
-| extension | 79 |
-| mechanism | 68 |
-| requirement | 42 |
-| pattern | 35 |
+| architecture | 107 |
+| protocol | 106 |
+| extension | 84 |
+| mechanism | 74 |
+| requirement | 47 |
+| pattern | 40 |
 | framework | 3 |
 | format | 1 |
@@ -104,14 +105,30 @@ The database currently contains **419 ideas** across **377 drafts**. This is the
 | Ideas/Draft | Drafts |
 |-------------|--------|
-| 1 | 337 |
-| 2 | 38 |
+| 1 | 370 |
+| 2 | 43 |
 | 3 | 2 |
-| 0 (no ideas) | 57 |
+| 0 (no ideas) | 59 |
 
 The near-uniform 1-idea-per-draft (89% of drafts with ideas) suggests either aggressive dedup or a re-extraction with constrained output. The original pipeline extracted 1-4 ideas per draft, so the 1,780 figure likely reflects pre-dedup counts.
 
-Excluding false positives: 365 ideas across 326 drafts.
+### Convergence Analysis (2026-03-08)
+
+Cross-organization idea convergence analysis (threshold: 0.75 SequenceMatcher similarity):
+
+| Metric | Value |
+|--------|-------|
+| Total ideas | 462 |
+| Unique clusters | 398 |
+| Cross-org convergent ideas | 132 |
+| Convergence rate | 33% |
+
+Top convergent ideas by organization count:
+
+- **Fully Adaptive Routing Ethernet for AI** — 14 orgs (Baidu, Broadcom, China Mobile, etc.)
+- **AI Agent Protocol Framework** — 7 orgs, 3 drafts
+- **Natural Language Protocol for Agent Comm** — 7 orgs
+- **LISP-based geospatial intelligence network** — 6 orgs
+- **MCP-Based Network Management Plane** — 4 orgs (Deutsche Telekom, Huawei, Orange, Telefonica)
+
 ## Actions Taken (2026-03-08)
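The 0.75 SequenceMatcher cutoff used above groups near-duplicate idea titles into clusters before counting cross-org convergence. A minimal sketch of such a pass, assuming a greedy first-match clustering strategy (the grouping strategy and the sample titles are illustrative, not the pipeline's exact code):

```python
from difflib import SequenceMatcher

def cluster_ideas(titles: list[str], threshold: float = 0.75) -> list[list[str]]:
    """Greedy single-pass clustering: each title joins the first cluster
    whose representative (first member) is at least `threshold` similar."""
    clusters: list[list[str]] = []
    for title in titles:
        for cluster in clusters:
            ratio = SequenceMatcher(None, title.lower(), cluster[0].lower()).ratio()
            if ratio >= threshold:
                cluster.append(title)
                break
        else:
            clusters.append([title])
    return clusters

ideas = [
    "AI Agent Protocol Framework",
    "An AI Agent Protocol Framework",  # near-duplicate, merges with the first
    "Natural Language Protocol for Agent Communication",
]
groups = cluster_ideas(ideas)  # 2 clusters: the two framework titles merge
```

A convergent idea would then be any cluster whose member drafts span more than one author organization.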


@@ -0,0 +1,97 @@
# Working Group Analysis
*Generated 2026-03-06 21:16 UTC — 434 drafts (85 WG-adopted, 349 individual)*
## Working Group Overview
| WG | Drafts | Ideas | Novelty | Maturity | Overlap | Momentum | Relevance |
|:---|-------:|------:|--------:|---------:|--------:|---------:|----------:|
| **lake** | 11 | 10 | 3.1 | 3.8 | 2.3 | 3.6 | 3.9 |
| **lamps** | 9 | 9 | 2.7 | 3.9 | 1.7 | 3.4 | 3.6 |
| **aipref** | 9 | 10 | 3.0 | 3.2 | 3.2 | 3.3 | 4.1 |
| **emu** | 6 | 6 | 3.3 | 3.2 | 2.8 | 3.3 | 3.7 |
| **httpbis** | 5 | 5 | 2.0 | 4.8 | 3.2 | 4.2 | 3.0 |
| **tsv** | 4 | 4 | 2.8 | 3.8 | 2.2 | 3.0 | 3.0 |
| **tls** | 4 | 4 | 3.2 | 4.0 | 2.0 | 4.5 | 5.0 |
| **sshm** | 3 | 3 | 2.0 | 4.3 | 2.0 | 3.7 | 3.7 |
| **idr** | 3 | 3 | 2.7 | 3.0 | 2.7 | 3.7 | 3.0 |
| **dnsop** | 3 | 3 | 3.0 | 3.7 | 1.7 | 3.7 | 3.0 |
| **app** | 3 | 3 | 2.0 | 3.7 | 2.0 | 2.0 | 2.0 |
| **anima** | 3 | 4 | 3.0 | 4.3 | 2.3 | 3.7 | 3.7 |
| **sml** | 2 | 2 | 3.0 | 3.5 | 2.0 | 3.0 | 3.0 |
| **nmrg** | 2 | 2 | 3.0 | 3.0 | 3.5 | 3.0 | 3.5 |
| **hpke** | 2 | 2 | 3.5 | 4.5 | 2.0 | 4.5 | 5.0 |
| **dtn** | 2 | 2 | 3.0 | 4.0 | 1.0 | 3.5 | 2.5 |
| **ace** | 2 | 2 | 3.5 | 4.0 | 3.0 | 4.0 | 4.0 |
| **websec** | 1 | 1 | 3.0 | 4.0 | 2.0 | 4.0 | 4.0 |
| **vwrap** | 1 | 1 | 4.0 | 3.0 | 2.0 | 3.0 | 4.0 |
| **suit** | 1 | 1 | 3.0 | 4.0 | 2.0 | 4.0 | 4.0 |
| **sip** | 1 | 1 | 3.0 | 4.0 | 2.0 | 4.0 | 3.0 |
| **sec** | 1 | 1 | 2.0 | 5.0 | 4.0 | 3.0 | 4.0 |
| **roll** | 1 | 1 | 2.0 | 4.0 | 3.0 | 3.0 | 3.0 |
| **pim** | 1 | 2 | 2.0 | 3.0 | 3.0 | 3.0 | 2.0 |
| **netconf** | 1 | 1 | 3.0 | 4.0 | 2.0 | 4.0 | 4.0 |
| **mailmaint** | 1 | 1 | 2.0 | 4.0 | 4.0 | 2.0 | 2.0 |
| **lisp** | 1 | 1 | 4.0 | 4.0 | 2.0 | 4.0 | 4.0 |
| **grow** | 1 | 1 | 4.0 | 4.0 | 2.0 | 4.0 | 5.0 |
| **core** | 1 | 1 | 3.0 | 4.0 | 1.0 | 4.0 | 4.0 |
## Cross-WG Category Spread
Categories appearing in multiple WGs — potential coordination or alignment needed.
| Category | WG Count | Total Drafts | WGs |
|:---------|:--------:|-------------:|:----|
| Data formats/interop | 23 | 174 | aipref(8), lamps(6), lake(4), httpbis(3), sml(2), sshm(2), hpke(2), lisp(1), mailmaint(1), nmrg(1), ace(1), suit(1), tls(1), anima(1), netconf(1), pim(1), dtn(1), websec(1), app(1), emu(1), core(1), sec(1) |
| Agent identity/auth | 13 | 152 | lake(8), emu(6), anima(3), lamps(3), sshm(2), ace(2), hpke(2), sml(1), vwrap(1), aipref(1), core(1), sec(1) |
| A2A protocols | 9 | 155 | idr(3), lake(2), lisp(1), ace(1), aipref(1), sip(1), vwrap(1), dtn(1) |
| Autonomous netops | 9 | 114 | anima(2), dnsop(2), lisp(1), roll(1), nmrg(1), netconf(1), dtn(1), grow(1) |
| Policy/governance | 9 | 108 | aipref(9), lamps(2), dnsop(2), lake(1), tls(1), websec(1), httpbis(1), idr(1) |
| Agent discovery/reg | 8 | 89 | lake(2), roll(1), pim(1), sip(1), aipref(1), app(1), anima(1) |
| Other AI/agent | 6 | 34 | tsv(3), httpbis(2), tls(2), app(2), dnsop(1) |
| ML traffic mgmt | 5 | 79 | nmrg(1), tsv(1), aipref(1), grow(1) |
| Human-agent interaction | 4 | 33 | aipref(3), nmrg(1), vwrap(1) |
| AI safety/alignment | 3 | 47 | aipref(2), sml(1) |
| Model serving/inference | 2 | 42 | nmrg(1) |
## Cross-WG Idea Overlap
Same technical ideas appearing in different WGs — strongest signals for alignment.
### Hybrid Post-Quantum Cryptography for EAP-AKA' (1 WG: emu)
- **[emu]** [draft-ar-emu-hybrid-pqc-eapaka](https://datatracker.ietf.org/doc/draft-ar-emu-hybrid-pqc-eapaka/) — Enhancing Security in EAP-AKA' with Hybrid Post-Quantum Cryptography
## Individual vs WG-Adopted Distribution
| Category | Individual | WG-Adopted | Assessment |
|:---------|----------:|-----------:|:-----------|
| A2A protocols | 144 | 11 | WG exists — individual drafts could target it |
| AI safety/alignment | 44 | 3 | WG exists — individual drafts could target it |
| Agent discovery/reg | 81 | 8 | WG exists — individual drafts could target it |
| Agent identity/auth | 121 | 31 | WG exists — individual drafts could target it |
| Autonomous netops | 104 | 10 | WG exists — individual drafts could target it |
| Data formats/interop | 132 | 42 | WG exists — individual drafts could target it |
| Human-agent interaction | 29 | 5 | WG exists — individual drafts could target it |
| ML traffic mgmt | 75 | 4 | WG exists — individual drafts could target it |
| Model serving/inference | 41 | 1 | WG exists — individual drafts could target it |
| Other AI/agent | 24 | 10 | WG exists — individual drafts could target it |
| Policy/governance | 91 | 18 | WG exists — individual drafts could target it |
## Recommended Submission Targets
For each category, the best WG to submit new work to.
| Category | Best WG | Alternatives |
|:---------|:--------|:-------------|
| Data formats/interop | **aipref** | lamps(6), lake(4) |
| Agent identity/auth | **lake** | emu(6), anima(3) |
| A2A protocols | **idr** | lake(2), lisp(1) |
| Autonomous netops | **anima** | dnsop(2), lisp(1) |
| Policy/governance | **aipref** | lamps(2), dnsop(2) |
| Agent discovery/reg | **lake** | roll(1), pim(1) |
| Other AI/agent | **tsv** | httpbis(2), tls(2) |
| ML traffic mgmt | **nmrg** | tsv(1), aipref(1) |
| Human-agent interaction | **aipref** | nmrg(1), vwrap(1) |
| AI safety/alignment | **aipref** | sml(1) |
| Model serving/inference | **nmrg** | - |


@@ -0,0 +1,66 @@
#!/usr/bin/env python3
"""Backfill working group names by resolving group_uri from Datatracker API."""
import sqlite3
import time
import httpx
DB_PATH = "data/drafts.db"
conn = sqlite3.connect(DB_PATH)
conn.row_factory = sqlite3.Row
# Get distinct group_uris that don't have a group name yet
rows = conn.execute("""
SELECT DISTINCT group_uri FROM drafts
WHERE group_uri IS NOT NULL AND group_uri != ''
AND ("group" IS NULL OR "group" = '')
""").fetchall()
uris = [r["group_uri"] for r in rows]
print(f"Resolving {len(uris)} unique group URIs...")
client = httpx.Client(timeout=30, follow_redirects=True)
resolved = {}
for uri in uris:
try:
resp = client.get(f"https://datatracker.ietf.org{uri}", params={"format": "json"})
resp.raise_for_status()
data = resp.json()
acronym = data.get("acronym", "")
name = data.get("name", "")
resolved[uri] = acronym or name or ""
print(f" {uri} -> {resolved[uri]} ({name})")
time.sleep(0.3)
except Exception as e:
print(f" {uri} -> ERROR: {e}")
resolved[uri] = ""
client.close()
# Update the database
for uri, group_name in resolved.items():
if group_name:
conn.execute(
'UPDATE drafts SET "group" = ? WHERE group_uri = ?',
(group_name, uri),
)
conn.commit()
# Show summary
rows = conn.execute("""
SELECT "group", COUNT(*) as cnt FROM drafts
WHERE "group" IS NOT NULL AND "group" != ''
GROUP BY "group" ORDER BY cnt DESC
""").fetchall()
print(f"\nWorking groups resolved ({len(rows)} groups):")
for r in rows:
print(f" {r[0]:30s} {r[1]} drafts")
total = conn.execute('SELECT COUNT(*) FROM drafts WHERE "group" IS NOT NULL AND "group" != ""').fetchone()[0]
none_count = conn.execute('SELECT COUNT(*) FROM drafts WHERE "group" IS NULL OR "group" = ""').fetchone()[0]
print(f"\nTotal with WG: {total}, individual/unresolved: {none_count}")
conn.close()


@@ -0,0 +1,39 @@
#!/usr/bin/env python3
"""Classify unrated drafts using Ollama two-stage filter."""
import sqlite3
import sys
sys.path.insert(0, "src")
from ietf_analyzer.classifier import Classifier
from ietf_analyzer.config import Config
cfg = Config.load()
conn = sqlite3.connect(cfg.db_path)
conn.row_factory = sqlite3.Row
# Get unrated drafts
rows = conn.execute("""
SELECT name, title, abstract, source FROM drafts
WHERE name NOT IN (SELECT draft_name FROM ratings)
ORDER BY source, name
""").fetchall()
drafts = [dict(r) for r in rows]
print(f"Classifying {len(drafts)} unrated drafts...\n")
with Classifier(cfg) as clf:
relevant, irrelevant = clf.classify_batch(drafts, verbose=True)
print(f"\n--- RELEVANT ({len(relevant)}) ---")
for d in relevant:
print(f" [{d['source']}] {d['name']}")
print(f" {d['title'][:100]}")
print(f"\n--- IRRELEVANT ({len(irrelevant)}) ---")
for d in irrelevant:
print(f" [{d['source']}] {d['name']}")
print(f" {d['title'][:100]}")
print(f"\nSummary: {len(relevant)} relevant, {len(irrelevant)} irrelevant out of {len(drafts)}")
conn.close()


@@ -0,0 +1,86 @@
#!/usr/bin/env python3
"""Compare Ollama classifier vs Claude ratings to find disagreements."""
import sqlite3
import sys
sys.path.insert(0, "src")
from ietf_analyzer.classifier import Classifier
from ietf_analyzer.config import Config
cfg = Config.load()
conn = sqlite3.connect(cfg.db_path)
conn.row_factory = sqlite3.Row
# Get all rated drafts with their Claude ratings
rows = conn.execute("""
SELECT d.name, d.title, d.abstract, r.relevance, r.false_positive,
r.novelty, r.maturity, r.overlap, r.momentum,
(r.novelty + r.maturity + (5 - r.overlap) + r.momentum + r.relevance) / 5.0 as composite
FROM drafts d JOIN ratings r ON d.name = r.draft_name
WHERE d.abstract IS NOT NULL AND d.abstract != ''
ORDER BY d.name
""").fetchall()
print(f"Comparing Ollama classifier vs Claude ratings on {len(rows)} drafts...\n")
with Classifier(cfg) as clf:
agree = 0
disagree_ollama_yes_claude_no = [] # Ollama says relevant, Claude says FP
disagree_ollama_no_claude_yes = [] # Ollama says irrelevant, Claude says relevant
for i, r in enumerate(rows):
is_rel, sim, method = clf.classify(r["title"], r["abstract"])
# Claude's view: false_positive=1 OR relevance<=2 means "not really relevant"
claude_relevant = not r["false_positive"] and r["relevance"] >= 3
if is_rel == claude_relevant:
agree += 1
elif is_rel and not claude_relevant:
disagree_ollama_yes_claude_no.append({
"name": r["name"], "title": r["title"][:60],
"sim": sim, "method": method,
"relevance": r["relevance"], "fp": r["false_positive"],
"composite": r["composite"],
})
else:
disagree_ollama_no_claude_yes.append({
"name": r["name"], "title": r["title"][:60],
"sim": sim, "method": method,
"relevance": r["relevance"], "fp": r["false_positive"],
"composite": r["composite"],
})
if (i + 1) % 50 == 0:
print(f" Processed {i+1}/{len(rows)}...")
print(f"\n{'='*70}")
print(f"AGREEMENT: {agree}/{len(rows)} ({100*agree/len(rows):.1f}%)")
print(f"{'='*70}")
print(f"\nOllama=RELEVANT but Claude=NOT relevant ({len(disagree_ollama_yes_claude_no)}):")
print(f" (These are cases where Ollama wastes Claude tokens on irrelevant drafts)")
for d in sorted(disagree_ollama_yes_claude_no, key=lambda x: x["sim"], reverse=True)[:15]:
fp_label = " [FP]" if d["fp"] else ""
print(f" sim={d['sim']:.3f} ({d['method']:18s}) rel={d['relevance']}{fp_label} | {d['name']}")
print(f" {d['title']}")
print(f"\nOllama=IRRELEVANT but Claude=RELEVANT ({len(disagree_ollama_no_claude_yes)}):")
print(f" (These are cases where Ollama would have incorrectly filtered out good drafts)")
for d in sorted(disagree_ollama_no_claude_yes, key=lambda x: x["relevance"], reverse=True)[:15]:
print(f" sim={d['sim']:.3f} ({d['method']:18s}) rel={d['relevance']} comp={d['composite']:.1f} | {d['name']}")
print(f" {d['title']}")
# Summary stats
total_fp_by_claude = sum(1 for r in rows if r["false_positive"] or r["relevance"] <= 2)
total_relevant_by_claude = len(rows) - total_fp_by_claude
print(f"\n{'='*70}")
print(f"Claude thinks: {total_relevant_by_claude} relevant, {total_fp_by_claude} not relevant")
let_through = total_relevant_by_claude - len(disagree_ollama_no_claude_yes) + len(disagree_ollama_yes_claude_no)
print(f"Ollama would let through: {let_through} (saves {len(rows) - let_through} Claude calls)")
print(f"\nToken savings if Ollama pre-filters:")
print(f"  Correctly rejected: {total_fp_by_claude - len(disagree_ollama_yes_claude_no)} drafts")
print(f" Incorrectly rejected (missed): {len(disagree_ollama_no_claude_yes)} drafts")
print(f" Incorrectly passed (wasted): {len(disagree_ollama_yes_claude_no)} drafts")
conn.close()


@@ -0,0 +1,65 @@
#!/usr/bin/env python3
"""Download full text for the 9 classifier-relevant unrated drafts."""
import sqlite3
import time
import sys
sys.path.insert(0, "src")
import httpx
from ietf_analyzer.config import Config
cfg = Config.load()
conn = sqlite3.connect(cfg.db_path)
conn.row_factory = sqlite3.Row
# The 9 relevant drafts from classifier
relevant_names = [
"draft-bondar-wca",
"draft-latour-pre-registration",
"draft-li-trustworthy-routing-discovery",
"draft-scrm-aiproto-usecases",
"draft-song-dmsc-problem-statement",
"draft-wiethuechter-drip-det-moc",
"draft-wiethuechter-drip-det-tada",
"draft-zzn-dvs",
"w3c-cuap",
]
client = httpx.Client(timeout=30, follow_redirects=True)
for name in relevant_names:
row = conn.execute("SELECT name, rev, source, source_url, full_text FROM drafts WHERE name=?", (name,)).fetchone()
if not row:
print(f" SKIP {name}: not in DB")
continue
if row["full_text"]:
print(f" SKIP {name}: already has text")
continue
if row["source"] == "w3c":
url = row["source_url"] or ""
if not url:
print(f" SKIP {name}: no source_url for W3C doc")
continue
else:
rev = row["rev"] or "00"
url = f"https://www.ietf.org/archive/id/{name}-{rev}.txt"
print(f" Fetching {name} from {url}...")
try:
resp = client.get(url)
if resp.status_code == 200:
text = resp.text[:500000] # cap at 500K
conn.execute("UPDATE drafts SET full_text=? WHERE name=?", (text, name))
conn.commit()
print(f" OK ({len(text)} chars)")
else:
print(f" FAIL: HTTP {resp.status_code}")
except Exception as e:
print(f" ERROR: {e}")
time.sleep(0.5)
client.close()
conn.close()
print("\nDone.")

scripts/run-webui.sh Executable file

@@ -0,0 +1,8 @@
#!/usr/bin/env bash
# Start the IETF Draft Analyzer Web Dashboard
#
# Usage:
# ./scripts/run-webui.sh # Production (admin disabled)
# ./scripts/run-webui.sh --dev # Development (admin enabled)
cd "$(dirname "$0")/.."
python src/webui/app.py "$@"


@@ -0,0 +1,182 @@
"""Local AI-relevance classifier using Ollama.
Two-stage filter to avoid spending Claude tokens on irrelevant drafts:
1. Embedding similarity — fast cosine check against a reference description
2. Chat classification — small local model for borderline cases
Both stages run locally via Ollama (zero cost).
"""
from __future__ import annotations
import numpy as np
import ollama as ollama_lib
from rich.console import Console
from .config import Config
console = Console()
# Reference description of what we're looking for.
# Embedding of this text is compared against each draft's abstract.
REFERENCE_DESCRIPTION = """
AI agent protocols, autonomous agent communication, agent-to-agent interaction,
agent identity and authentication, agent authorization, agent discovery,
large language model integration with network protocols, agentic systems,
machine learning for network operations, AI safety in networked systems,
model context protocol, multi-agent coordination, agent task delegation,
generative AI infrastructure, intelligent network automation,
trustworthy AI systems, AI governance in standards.
"""
# Thresholds for the two-stage filter (calibrated against 434 drafts + 73 FPs)
# TP avg similarity: 0.685, FP avg: 0.598
SIMILARITY_ACCEPT = 0.72 # Above this: definitely relevant, skip chat
SIMILARITY_REJECT = 0.50 # Below this: definitely irrelevant, skip chat
# Between REJECT and ACCEPT: borderline, use chat model to decide
CLASSIFY_PROMPT = """\
You are classifying IETF Internet-Drafts for an AI/agent standards tracker.
A draft is RELEVANT if it relates to ANY of these topics:
- AI agents, autonomous agents, multi-agent systems
- Agent identity, authentication, authorization, discovery
- Agent-to-agent (A2A) communication protocols
- Large language models (LLMs), generative AI
- Machine learning in network operations
- AI safety, alignment, trustworthiness
- Model Context Protocol (MCP), agentic workflows
- OAuth/JWT/credentials for agents or AI systems
- Autonomous network operations using AI
- Intelligent network management or traffic handling
A draft is NOT relevant if it only covers:
- Pure cryptography without AI/agent context
- General networking protocols (BGP, DNS, TLS) without AI
- Email, HTTP, or web standards without AI/agent features
Title: {title}
Abstract: {abstract}
Is this draft relevant to AI agents or related topics? Answer ONLY "yes" or "no"."""
class Classifier:
"""Classify drafts as AI-relevant using local Ollama models."""
def __init__(self, config: Config | None = None):
self.config = config or Config.load()
self.client = ollama_lib.Client(host=self.config.ollama_url)
self._ref_embedding: np.ndarray | None = None
def close(self) -> None:
if hasattr(self.client, '_client'):
self.client._client.close()
def __enter__(self):
return self
def __exit__(self, *exc):
self.close()
def _get_reference_embedding(self) -> np.ndarray:
"""Get (cached) embedding of the reference AI description."""
if self._ref_embedding is None:
resp = self.client.embed(
model=self.config.ollama_embed_model,
input=REFERENCE_DESCRIPTION.strip(),
)
self._ref_embedding = np.array(resp["embeddings"][0], dtype=np.float32)
return self._ref_embedding
def _embed(self, text: str) -> np.ndarray:
"""Embed a text string."""
resp = self.client.embed(
model=self.config.ollama_embed_model,
input=text[:8000],
)
return np.array(resp["embeddings"][0], dtype=np.float32)
def _cosine_similarity(self, a: np.ndarray, b: np.ndarray) -> float:
dot = np.dot(a, b)
norm = np.linalg.norm(a) * np.linalg.norm(b)
return float(dot / norm) if norm > 0 else 0.0
def _chat_classify(self, title: str, abstract: str) -> bool:
"""Ask local chat model whether a draft is AI-related."""
prompt = CLASSIFY_PROMPT.format(title=title, abstract=abstract[:2000])
try:
resp = self.client.chat(
model=self.config.ollama_classify_model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.0, "num_predict": 10},
)
answer = resp["message"]["content"].strip().lower()
return answer.startswith("yes")
except Exception as e:
console.print(f"[yellow]Chat classify failed: {e}, defaulting to relevant[/yellow]")
return True # err on the side of inclusion
def classify(self, title: str, abstract: str) -> tuple[bool, float, str]:
"""Classify a draft as AI-relevant.
Returns:
(is_relevant, similarity_score, method)
method is one of: "embedding_accept", "embedding_reject", "chat_yes", "chat_no"
"""
text = f"{title}\n{abstract}"
ref = self._get_reference_embedding()
emb = self._embed(text)
sim = self._cosine_similarity(emb, ref)
if sim >= SIMILARITY_ACCEPT:
return True, sim, "embedding_accept"
if sim <= SIMILARITY_REJECT:
return False, sim, "embedding_reject"
# Borderline — ask chat model
is_relevant = self._chat_classify(title, abstract)
method = "chat_yes" if is_relevant else "chat_no"
return is_relevant, sim, method
def classify_batch(
self, drafts: list[dict], verbose: bool = True
) -> tuple[list[dict], list[dict]]:
"""Classify a batch of drafts.
Args:
drafts: list of dicts with at least 'name', 'title', 'abstract' keys
Returns:
(relevant, irrelevant) — two lists of draft dicts
"""
relevant = []
irrelevant = []
stats = {"embedding_accept": 0, "embedding_reject": 0, "chat_yes": 0, "chat_no": 0}
for i, d in enumerate(drafts):
is_rel, sim, method = self.classify(
d.get("title", ""), d.get("abstract", "")
)
stats[method] += 1
if verbose and (i + 1) % 10 == 0:
console.print(f" Classified {i + 1}/{len(drafts)}...")
if is_rel:
relevant.append(d)
else:
irrelevant.append(d)
if verbose:
console.print(
f"\n [green]Relevant: {len(relevant)}[/green] "
f"[red]Irrelevant: {len(irrelevant)}[/red]\n"
f" Embedding accept: {stats['embedding_accept']} "
f" Embedding reject: {stats['embedding_reject']}\n"
f" Chat yes: {stats['chat_yes']} "
f" Chat no: {stats['chat_no']}"
)
return relevant, irrelevant
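The threshold routing inside `classify()` can be exercised in isolation. A sketch using the module's published cutoffs, with a stub callable standing in for the local chat model (an illustration of the decision logic, not the shipped API):

```python
SIMILARITY_ACCEPT = 0.72  # at or above: relevant without a chat call
SIMILARITY_REJECT = 0.50  # at or below: irrelevant without a chat call

def route(similarity: float, chat_says_yes) -> tuple[bool, str]:
    """Two-stage decision: the embedding score settles clear cases,
    and only the borderline band pays for a chat-model call."""
    if similarity >= SIMILARITY_ACCEPT:
        return True, "embedding_accept"
    if similarity <= SIMILARITY_REJECT:
        return False, "embedding_reject"
    verdict = chat_says_yes()  # only invoked for 0.50 < similarity < 0.72
    return verdict, "chat_yes" if verdict else "chat_no"

print(route(0.75, lambda: True))   # (True, 'embedding_accept'), no chat call
print(route(0.60, lambda: False))  # (False, 'chat_no'), chat consulted
```

Since the calibration note above puts the true-positive mean (0.685) inside the borderline band, the chat stage ends up doing most of the discriminating work on real drafts.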


@@ -297,8 +297,9 @@ class Database:
     def upsert_draft(self, draft: Draft) -> None:
         self.conn.execute(
             """INSERT INTO drafts (name, rev, title, abstract, time, dt_id, pages, words,
-               "group", group_uri, expires, ad, shepherd, states, full_text, categories, tags, fetched_at)
-               VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+               "group", group_uri, expires, ad, shepherd, states, full_text, categories, tags, fetched_at,
+               source, source_id, source_url, doc_status)
+               VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
             ON CONFLICT(name) DO UPDATE SET
                 rev=excluded.rev, title=excluded.title, abstract=excluded.abstract,
                 time=excluded.time, dt_id=excluded.dt_id, pages=excluded.pages,
@@ -307,7 +308,9 @@ class Database:
                 states=excluded.states,
                 full_text=COALESCE(excluded.full_text, full_text),
                 categories=excluded.categories, tags=excluded.tags,
-                fetched_at=excluded.fetched_at
+                fetched_at=excluded.fetched_at,
+                source=excluded.source, source_id=excluded.source_id,
+                source_url=excluded.source_url, doc_status=excluded.doc_status
             """,
             (
                 draft.name, draft.rev, draft.title, draft.abstract, draft.time,
@@ -316,6 +319,7 @@ class Database:
                 json.dumps(draft.states), draft.full_text,
                 json.dumps(draft.categories), json.dumps(draft.tags),
                 draft.fetched_at or datetime.now(timezone.utc).isoformat(),
+                draft.source, draft.source_id, draft.source_url, draft.doc_status,
             ),
         )
         self.conn.commit()
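The `COALESCE(excluded.full_text, full_text)` clause above is what lets a metadata refresh run without wiping already-downloaded text. A standalone illustration against an in-memory table (schema reduced to three columns for brevity):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drafts (name TEXT PRIMARY KEY, rev TEXT, full_text TEXT)")

# First fetch stores the body; a later metadata refresh carries no text (NULL)
conn.execute("INSERT INTO drafts VALUES (?, ?, ?)", ("draft-x", "00", "original body"))
conn.execute(
    """INSERT INTO drafts (name, rev, full_text) VALUES (?, ?, ?)
       ON CONFLICT(name) DO UPDATE SET
           rev=excluded.rev,
           full_text=COALESCE(excluded.full_text, full_text)""",
    ("draft-x", "01", None),
)
row = conn.execute("SELECT rev, full_text FROM drafts WHERE name='draft-x'").fetchone()
# rev advances to "01" while the stored body survives the NULL refresh
```

`excluded` is SQLite's pseudo-table holding the row that failed to insert, so the COALESCE only falls back to the stored value when the incoming one is NULL.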

src/webui/analytics.py Normal file

@@ -0,0 +1,244 @@
"""Lightweight, GDPR-compliant analytics using SQLite.
No cookies, no personal data, no consent needed.
Visitor uniqueness is estimated via daily-salted IP hash (not stored raw).
Data lives in a separate analytics.db to keep the main DB clean.
"""
from __future__ import annotations
import hashlib
import sqlite3
from datetime import date, timedelta
from pathlib import Path
from urllib.parse import urlparse
from flask import Flask, request, g
_SCHEMA = """
CREATE TABLE IF NOT EXISTS page_views (
id INTEGER PRIMARY KEY AUTOINCREMENT,
ts TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%S', 'now')),
date TEXT NOT NULL DEFAULT (strftime('%Y-%m-%d', 'now')),
path TEXT NOT NULL,
referrer TEXT,
visitor TEXT,
ua_type TEXT
);
CREATE INDEX IF NOT EXISTS idx_pv_date ON page_views(date);
CREATE INDEX IF NOT EXISTS idx_pv_path ON page_views(path);
CREATE TABLE IF NOT EXISTS downloads (
id INTEGER PRIMARY KEY AUTOINCREMENT,
ts TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%S', 'now')),
date TEXT NOT NULL DEFAULT (strftime('%Y-%m-%d', 'now')),
file_type TEXT NOT NULL,
visitor TEXT
);
CREATE INDEX IF NOT EXISTS idx_dl_date ON downloads(date);
"""
# Daily salt rotates so yesterday's hashes can't be correlated with today's
_daily_salt: tuple[str, str] = ("", "")
def _get_salt() -> str:
global _daily_salt
today = date.today().isoformat()
if _daily_salt[0] != today:
_daily_salt = (today, hashlib.sha256(f"ietf-analytics-{today}".encode()).hexdigest()[:16])
return _daily_salt[1]
def _hash_visitor(ip: str) -> str:
"""Create a daily-rotating hash from IP. Cannot be reversed or correlated across days."""
salt = _get_salt()
return hashlib.sha256(f"{salt}:{ip}".encode()).hexdigest()[:12]
def _classify_ua(ua: str) -> str:
"""Rough bot/browser classification."""
ua_lower = ua.lower()
if any(b in ua_lower for b in ("bot", "spider", "crawl", "slurp", "wget", "curl", "python-requests")):
return "bot"
if "mobile" in ua_lower:
return "mobile"
return "browser"
def _get_analytics_db() -> sqlite3.Connection:
"""Get or create the analytics DB connection for this request."""
if "analytics_db" not in g:
# init_analytics() injects the resolved path via g before each request;
# fall back to the default repo location if it is missing
db_path = getattr(g, "_analytics_db_path", Path("data") / "analytics.db")
db_path.parent.mkdir(parents=True, exist_ok=True)
conn = sqlite3.connect(str(db_path))
conn.row_factory = sqlite3.Row
conn.executescript(_SCHEMA)
g.analytics_db = conn
return g.analytics_db
# Paths to skip (static assets, API calls, etc.)
_SKIP_PREFIXES = ("/static/", "/api/", "/favicon", "/robots.txt", "/admin/")
def init_analytics(app: Flask, db_path: str | None = None):
"""Register analytics hooks on the Flask app."""
_resolved_db_path = Path(db_path) if db_path else (
Path(app.root_path).parent.parent / "data" / "analytics.db"
)
@app.before_request
def track_pageview():
path = request.path
# Skip static/API/admin routes
if any(path.startswith(p) for p in _SKIP_PREFIXES):
return
g._analytics_db_path = _resolved_db_path
try:
conn = _get_analytics_db()
ip = request.remote_addr or "unknown"
visitor = _hash_visitor(ip)
ua = request.headers.get("User-Agent", "")
ua_type = _classify_ua(ua)
# Skip bots from page view counts (still track downloads)
if ua_type == "bot" and path != "/export/obsidian":
return
referrer = request.headers.get("Referer", "")
# Only keep the domain of referrer
if referrer:
try:
parsed = urlparse(referrer)
referrer = parsed.netloc or ""
except Exception:
referrer = ""
# Track downloads separately
if path == "/export/obsidian":
conn.execute(
"INSERT INTO downloads (file_type, visitor) VALUES (?, ?)",
("obsidian", visitor),
)
conn.commit()
conn.execute(
"INSERT INTO page_views (path, referrer, visitor, ua_type) VALUES (?, ?, ?, ?)",
(path, referrer, visitor, ua_type),
)
conn.commit()
except Exception:
pass # Analytics should never break the app
@app.teardown_appcontext
def close_analytics_db(exception=None):
conn = g.pop("analytics_db", None)
if conn is not None:
conn.close()
def get_analytics_data(db_path: str | Path) -> dict:
"""Query analytics data for the dashboard. Returns dicts ready for rendering."""
conn = sqlite3.connect(str(db_path))
conn.row_factory = sqlite3.Row
conn.executescript(_SCHEMA)
today = date.today()
week_ago = (today - timedelta(days=7)).isoformat()
month_ago = (today - timedelta(days=30)).isoformat()
# --- Overall stats ---
total_views = conn.execute("SELECT COUNT(*) FROM page_views").fetchone()[0]
total_visitors = conn.execute("SELECT COUNT(DISTINCT visitor || date) FROM page_views").fetchone()[0]
total_downloads = conn.execute("SELECT COUNT(*) FROM downloads").fetchone()[0]
today_views = conn.execute(
"SELECT COUNT(*) FROM page_views WHERE date = ?", (today.isoformat(),)
).fetchone()[0]
today_visitors = conn.execute(
"SELECT COUNT(DISTINCT visitor) FROM page_views WHERE date = ?", (today.isoformat(),)
).fetchone()[0]
week_views = conn.execute(
"SELECT COUNT(*) FROM page_views WHERE date >= ?", (week_ago,)
).fetchone()[0]
month_views = conn.execute(
"SELECT COUNT(*) FROM page_views WHERE date >= ?", (month_ago,)
).fetchone()[0]
# --- Daily views (last 30 days) ---
daily_rows = conn.execute(
"SELECT date, COUNT(*) as views, COUNT(DISTINCT visitor) as visitors "
"FROM page_views WHERE date >= ? GROUP BY date ORDER BY date",
(month_ago,),
).fetchall()
daily = {
"dates": [r["date"] for r in daily_rows],
"views": [r["views"] for r in daily_rows],
"visitors": [r["visitors"] for r in daily_rows],
}
# --- Top pages (last 30 days) ---
page_rows = conn.execute(
"SELECT path, COUNT(*) as views, COUNT(DISTINCT visitor) as visitors "
"FROM page_views WHERE date >= ? GROUP BY path ORDER BY views DESC LIMIT 20",
(month_ago,),
).fetchall()
top_pages = [{"path": r["path"], "views": r["views"], "visitors": r["visitors"]} for r in page_rows]
# --- Top referrers (last 30 days) ---
ref_rows = conn.execute(
"SELECT referrer, COUNT(*) as count FROM page_views "
"WHERE date >= ? AND referrer != '' GROUP BY referrer ORDER BY count DESC LIMIT 15",
(month_ago,),
).fetchall()
top_referrers = [{"referrer": r["referrer"], "count": r["count"]} for r in ref_rows]
# --- Downloads over time ---
dl_rows = conn.execute(
"SELECT date, COUNT(*) as count FROM downloads GROUP BY date ORDER BY date"
).fetchall()
downloads_daily = {
"dates": [r["date"] for r in dl_rows],
"counts": [r["count"] for r in dl_rows],
}
# --- Hourly pattern (last 7 days) ---
hourly_rows = conn.execute(
"SELECT CAST(strftime('%H', ts) AS INTEGER) as hour, COUNT(*) as views "
"FROM page_views WHERE date >= ? GROUP BY hour ORDER BY hour",
(week_ago,),
).fetchall()
hourly = {r["hour"]: r["views"] for r in hourly_rows}
hourly_full = {"hours": list(range(24)), "views": [hourly.get(h, 0) for h in range(24)]}
conn.close()
return {
"stats": {
"total_views": total_views,
"total_visitors": total_visitors,
"total_downloads": total_downloads,
"today_views": today_views,
"today_visitors": today_visitors,
"week_views": week_views,
"month_views": month_views,
},
"daily": daily,
"top_pages": top_pages,
"top_referrers": top_referrers,
"downloads_daily": downloads_daily,
"hourly": hourly_full,
}
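The hourly bucketing above can be exercised on a toy in-memory database. The `page_views` schema here is an assumption inferred from the queries, not the project's actual migration:

```python
import sqlite3

# Minimal sketch: the same strftime('%H') bucketing, on an in-memory DB.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE page_views (path TEXT, visitor TEXT, referrer TEXT, ts TEXT, date TEXT)"
)
rows = [
    ("/", "a", "", "2026-03-08 09:15:00", "2026-03-08"),
    ("/", "b", "", "2026-03-08 09:45:00", "2026-03-08"),
    ("/posts/08", "a", "", "2026-03-08 14:05:00", "2026-03-08"),
]
conn.executemany("INSERT INTO page_views VALUES (?, ?, ?, ?, ?)", rows)

hourly_rows = conn.execute(
    "SELECT CAST(strftime('%H', ts) AS INTEGER) as hour, COUNT(*) as views "
    "FROM page_views WHERE date >= ? GROUP BY hour ORDER BY hour",
    ("2026-03-02",),
).fetchall()
hourly = {r["hour"]: r["views"] for r in hourly_rows}
hourly_full = {"hours": list(range(24)), "views": [hourly.get(h, 0) for h in range(24)]}
print(hourly_full["views"][9], hourly_full["views"][14])  # 2 1
```

Filling the 24-slot list from a sparse dict is what lets the chart render quiet hours as zero instead of omitting them.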

src/webui/auth.py Normal file

@@ -0,0 +1,55 @@
"""Admin authentication with two run modes.
Production (default):
python src/webui/app.py
All admin routes return 404. No way to access private features.
Development:
python src/webui/app.py --dev
Every request is auto-authenticated as admin. No login needed.
The mode is set once at startup and cannot be changed at runtime.
"""
from __future__ import annotations
from functools import wraps
from flask import abort, g
# Module-level flag set by init_auth()
_dev_mode: bool = False
_initialized: bool = False
def is_admin() -> bool:
"""Check if the current request has admin access."""
return _dev_mode
def admin_required(f):
"""Decorator: returns 404 for non-admin users so routes stay hidden."""
@wraps(f)
def decorated(*args, **kwargs):
if not is_admin():
abort(404)
return f(*args, **kwargs)
return decorated
def init_auth(app, dev: bool = False):
"""Set the auth mode and register Flask hooks (once only)."""
global _dev_mode, _initialized
_dev_mode = dev
if _initialized:
return
_initialized = True
@app.before_request
def set_admin_flag():
g.is_admin = is_admin()
@app.context_processor
def inject_admin():
return {"is_admin": g.get("is_admin", False)}
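The hide-behind-404 behavior of `admin_required` can be modeled without Flask. This is a simplified sketch with `abort(404)` stubbed as an exception, not the real Flask wiring:

```python
from functools import wraps

# Simplified model of the auth module: flask.abort(404) is stood in
# for by an exception so both modes can be demonstrated standalone.
_dev_mode = False

class NotFound(Exception):
    pass

def is_admin() -> bool:
    return _dev_mode

def admin_required(f):
    @wraps(f)
    def decorated(*args, **kwargs):
        if not is_admin():
            raise NotFound(404)  # stands in for flask.abort(404)
        return f(*args, **kwargs)
    return decorated

@admin_required
def secret_dashboard():
    return "stats"

# Production mode: the route behaves as if it does not exist.
try:
    secret_dashboard()
    hidden = False
except NotFound:
    hidden = True

# Dev mode: every request is auto-authenticated as admin.
_dev_mode = True
result = secret_dashboard()
```

Returning 404 rather than 403 is the point of the design: an unauthenticated visitor cannot even learn that admin routes exist.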

src/webui/obsidian_export.py Normal file

@@ -0,0 +1,508 @@
"""Export research data as an Obsidian-compatible vault (ZIP).
Generates interlinked markdown files with YAML frontmatter,
[[wikilinks]], #tags, and Mermaid diagrams that Obsidian renders natively.
"""
from __future__ import annotations
import io
import zipfile
from collections import Counter, defaultdict
from datetime import date
from ietf_analyzer.db import Database
from webui.data import _extract_month
def _safe_filename(name: str) -> str:
"""Sanitize a string for use as a filename."""
return name.replace("/", "-").replace("\\", "-").replace(":", "-").replace('"', "")
def _score_bar(val: float, max_val: float = 5.0) -> str:
"""Render a simple text progress bar."""
filled = round(val / max_val * 10)
return "`" + "\u2588" * filled + "\u2591" * (10 - filled) + f"` {val}/{max_val}"
def _mermaid_pie(title: str, data: dict[str, int], limit: int = 12) -> str:
"""Generate a Mermaid pie chart."""
items = list(data.items())[:limit]
if not items:
return ""
lines = [f'```mermaid\npie title {title}']
for label, count in items:
safe_label = label.replace('"', "'")
lines.append(f' "{safe_label}" : {count}')
lines.append("```")
return "\n".join(lines)
def _mermaid_bar(title: str, data: dict[str, float], limit: int = 15) -> str:
"""Generate a Mermaid xychart bar chart."""
items = list(data.items())[:limit]
if not items:
return ""
labels = [f'"{k[:20]}"' for k, _ in items]
values = [str(round(v, 1)) for _, v in items]
return f"""```mermaid
xychart-beta
title "{title}"
x-axis [{", ".join(labels)}]
y-axis "Score"
bar [{", ".join(values)}]
```"""
def _mermaid_timeline_chart(monthly: dict[str, int]) -> str:
"""Generate a Mermaid xychart for submissions over time."""
if len(monthly) < 2:
return ""
months = sorted(monthly.keys())
# Show every 3rd label to avoid clutter
labels = []
for i, m in enumerate(months):
if i % 3 == 0:
labels.append(f'"{m}"')
else:
labels.append('" "')
values = [str(monthly[m]) for m in months]
return f"""```mermaid
xychart-beta
title "Draft Submissions Over Time"
x-axis [{", ".join(labels)}]
y-axis "Drafts"
bar [{", ".join(values)}]
```"""
def build_obsidian_vault(db: Database) -> bytes:
"""Build a ZIP file containing an Obsidian vault with all research data."""
buf = io.BytesIO()
prefix = "IETF-AI-Agent-Drafts"
pairs = db.drafts_with_ratings(limit=2000)
all_drafts_list = db.list_drafts(limit=2000, order_by="time DESC")
draft_map = {d.name: d for d in all_drafts_list}
all_ideas = db.all_ideas()
all_authors = db.top_authors(limit=500)
# Build lookup maps
cat_counts: Counter = Counter()
cat_drafts: dict[str, list[str]] = defaultdict(list)
score_map: dict[str, float] = {}
rating_map: dict[str, object] = {}
for d, r in pairs:
score_map[d.name] = r.composite_score
rating_map[d.name] = r
for cat in r.categories:
cat_counts[cat] += 1
cat_drafts[cat].append(d.name)
# Monthly submission counts
monthly: Counter = Counter()
for d in all_drafts_list:
monthly[_extract_month(d.time)] += 1
# Ideas by draft
ideas_by_draft: dict[str, list[dict]] = defaultdict(list)
for idea in all_ideas:
ideas_by_draft[idea.get("draft_name", "")].append(idea)
# Author info by draft
author_drafts: dict[str, list[str]] = defaultdict(list)
author_info: dict[str, dict] = {}
for name, aff, cnt, drafts in all_authors:
author_info[name] = {"affiliation": aff or "", "draft_count": cnt, "drafts": drafts}
for dn in drafts:
author_drafts[dn].append(name)
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
# --- Dashboard.md ---
top_rated = sorted(pairs, key=lambda p: p[1].composite_score, reverse=True)[:15]
top_table = "| Draft | Score | Category |\n|---|---|---|\n"
for d, r in top_rated:
score = r.composite_score
cat = r.categories[0] if r.categories else ""
top_table += f"| [[{d.name}]] | **{score:.2f}** | {cat} |\n"
cat_pie = _mermaid_pie("Drafts by Category", dict(cat_counts.most_common(12)))
timeline_chart = _mermaid_timeline_chart(dict(sorted(monthly.items())))
# Score distribution as mermaid
score_buckets: Counter = Counter()
for _, r in pairs:
bucket = f"{r.composite_score:.0f}"
score_buckets[bucket] += 1
score_dist = dict(sorted(score_buckets.items()))
dashboard = f"""---
tags: [dashboard, ietf, ai-agents]
generated: {date.today().isoformat()}
---
# IETF AI/Agent Draft Analysis
> Automated analysis of {len(all_drafts_list)} Internet-Drafts on AI and agent topics.
> Generated by [IETF Draft Analyzer](https://github.com) on {date.today().isoformat()}.
## Key Stats
| Metric | Value |
|---|---|
| Total Drafts | **{len(all_drafts_list)}** |
| Rated Drafts | **{len(pairs)}** |
| Authors | **{len(all_authors)}** |
| Ideas Extracted | **{len(all_ideas)}** |
| Categories | **{len(cat_counts)}** |
## Categories
{cat_pie}
### Category Index
{chr(10).join(f"- [[{cat}]] ({count} drafts)" for cat, count in cat_counts.most_common())}
## Submissions Over Time
{timeline_chart}
## Top Rated Drafts
{top_table}
## Navigation
- **[[Categories/index|Categories]]** — Browse by topic
- **[[Authors/index|Authors]]** — Browse by author
- **[[Analysis/Score Distribution|Score Distribution]]** — Rating analytics
- **[[Analysis/Top Rated|Top Rated]]** — Highest-scored drafts
- **[[Analysis/Ideas Overview|Ideas]]** — Extracted technical ideas
- **[[Analysis/Glossary|Glossary]]** — Terms, abbreviations, and scoring methodology
"""
zf.writestr(f"{prefix}/Dashboard.md", dashboard)
# --- Individual Draft Notes ---
for d_obj in all_drafts_list:
name = d_obj.name
draft = draft_map.get(name, d_obj)
r = rating_map.get(name)
ideas = ideas_by_draft.get(name, [])
authors = author_drafts.get(name, [])
month = _extract_month(draft.time)
# Frontmatter
fm_lines = [
"---",
f'title: "{(draft.title or name).replace(chr(34), chr(39))}"',  # chr(34)='"', chr(39)="'": swap quotes so the YAML title stays valid
f"date: {draft.time or 'unknown'}",
f"rev: {draft.rev or '00'}",
]
if r:
fm_lines.append(f"score: {r.composite_score:.2f}")
fm_lines.append(f"novelty: {r.novelty}")
fm_lines.append(f"maturity: {r.maturity}")
fm_lines.append(f"overlap: {r.overlap}")
fm_lines.append(f"momentum: {r.momentum}")
fm_lines.append(f"relevance: {r.relevance}")
if r.categories:
fm_lines.append(f"categories: [{', '.join(r.categories)}]")
if authors:
fm_lines.append(f"authors: [{', '.join(a.replace(',', '') for a in authors)}]")
fm_lines.append(f"tags: [draft, ietf, {month}]")
fm_lines.append("---")
frontmatter = "\n".join(fm_lines)
# Body
body = f"\n# {draft.title or name}\n\n"
body += f"**{name}** | rev {draft.rev or '00'} | {draft.time or 'unknown'}\n\n"
if authors:
body += "## Authors\n\n"
body += "\n".join(f"- [[{a}]]" for a in authors) + "\n\n"
if r:
body += "## Rating\n\n"
body += f"**Composite Score: {r.composite_score:.2f}**\n\n"
body += "| Dimension | Score |\n|---|---|\n"
body += f"| Novelty | {_score_bar(r.novelty)} |\n"
body += f"| Maturity | {_score_bar(r.maturity)} |\n"
body += f"| Overlap | {_score_bar(r.overlap)} |\n"
body += f"| Momentum | {_score_bar(r.momentum)} |\n"
body += f"| Relevance | {_score_bar(r.relevance)} |\n\n"
if r.summary:
body += f"> {r.summary}\n\n"
if r.categories:
body += "**Categories:** " + ", ".join(f"[[{c}]]" for c in r.categories) + "\n\n"
if draft.abstract:
body += "## Abstract\n\n"
body += draft.abstract + "\n\n"
if ideas:
body += f"## Extracted Ideas ({len(ideas)})\n\n"
for idea in ideas:
novelty = f" `N:{idea.get('novelty_score', '?')}`" if idea.get("novelty_score") else ""
itype = f" *{idea.get('type', '')}*" if idea.get("type") else ""
body += f"- **{idea.get('title', 'Untitled')}**{itype}{novelty}\n"
if idea.get("description"):
body += f" {idea['description']}\n"
body += "\n"
body += "## Links\n\n"
body += f"- [View on IETF Datatracker](https://datatracker.ietf.org/doc/{name}/)\n"
if draft.rev:
body += f"- [Read Full Text](https://www.ietf.org/archive/id/{name}-{draft.rev}.txt)\n"
content = frontmatter + body
zf.writestr(f"{prefix}/Drafts/{_safe_filename(name)}.md", content)
# --- Author Notes ---
author_index_lines = [
"---\ntags: [index, authors]\n---\n",
"# Authors\n\n",
f"**{len(all_authors)}** authors contributing to AI/agent Internet-Drafts.\n\n",
"| Author | Affiliation | Drafts |\n|---|---|---|\n",
]
for name, aff, cnt, drafts in sorted(all_authors, key=lambda x: x[2], reverse=True):
author_index_lines.append(f"| [[{name}]] | {aff or ''} | {cnt} |\n")
zf.writestr(f"{prefix}/Authors/index.md", "".join(author_index_lines))
for name, aff, cnt, drafts in all_authors:
fm = f"---\ntags: [author]\naffiliation: \"{aff or ''}\"\ndraft_count: {cnt}\n---\n"
body = f"\n# {name}\n\n"
if aff:
body += f"**Affiliation:** {aff}\n\n"
body += f"## Drafts ({cnt})\n\n"
for dn in drafts:
d = draft_map.get(dn)
title = (d.title or dn) if d else dn
score = score_map.get(dn, "")
score_str = f" (score: {score:.2f})" if score else ""
body += f"- [[{dn}|{title}]]{score_str}\n"
# Co-authors
coauthors: Counter = Counter()
for dn in drafts:
for other in author_drafts.get(dn, []):
if other != name:
coauthors[other] += 1
if coauthors:
body += "\n## Co-authors\n\n"
for co, shared in coauthors.most_common(20):
body += f"- [[{co}]] ({shared} shared)\n"
zf.writestr(f"{prefix}/Authors/{_safe_filename(name)}.md", fm + body)
# --- Category Notes ---
cat_index_lines = [
"---\ntags: [index, categories]\n---\n",
"# Categories\n\n",
_mermaid_pie("Draft Distribution", dict(cat_counts.most_common(12))),
"\n\n",
]
for cat, count in cat_counts.most_common():
cat_index_lines.append(f"- [[{cat}]] — {count} drafts\n")
zf.writestr(f"{prefix}/Categories/index.md", "".join(cat_index_lines))
for cat, count in cat_counts.most_common():
fm = f"---\ntags: [category]\ndraft_count: {count}\n---\n"
body = f"\n# {cat}\n\n"
body += f"**{count} drafts** in this category.\n\n"
# Table of drafts sorted by score
draft_names = cat_drafts[cat]
scored = [(dn, score_map.get(dn, 0)) for dn in draft_names]
scored.sort(key=lambda x: x[1], reverse=True)
body += "| Draft | Score |\n|---|---|\n"
for dn, score in scored:
d = draft_map.get(dn)
title = (d.title or dn)[:60] if d else dn
body += f"| [[{dn}|{title}]] | {score:.2f} |\n"
zf.writestr(f"{prefix}/Categories/{_safe_filename(cat)}.md", fm + body)
# --- Analysis Notes ---
# Score Distribution
score_lines = [
"---\ntags: [analysis]\n---\n",
"\n# Score Distribution\n\n",
"Composite scores across all rated drafts (1.0-5.0 scale).\n\n",
]
# Mermaid bar chart of score buckets
buckets: dict[str, int] = defaultdict(int)
for _, r in pairs:
b = f"{r.composite_score:.1f}"
buckets[b] += 1
sorted_buckets = dict(sorted(buckets.items()))
if sorted_buckets:
labels = [f'"{k}"' for k in sorted_buckets.keys()]
values = [str(v) for v in sorted_buckets.values()]
score_lines.append(f"""```mermaid
xychart-beta
title "Score Distribution"
x-axis [{", ".join(labels)}]
y-axis "Count"
bar [{", ".join(values)}]
```\n\n""")
# Dimension averages
dims = {"Novelty": [], "Maturity": [], "Overlap": [], "Momentum": [], "Relevance": []}
for _, r in pairs:
dims["Novelty"].append(r.novelty)
dims["Maturity"].append(r.maturity)
dims["Overlap"].append(r.overlap)
dims["Momentum"].append(r.momentum)
dims["Relevance"].append(r.relevance)
score_lines.append("## Dimension Averages\n\n")
score_lines.append("| Dimension | Average | Min | Max |\n|---|---|---|---|\n")
for dim, vals in dims.items():
if vals:
avg = sum(vals) / len(vals)
score_lines.append(f"| {dim} | {avg:.2f} | {min(vals)} | {max(vals)} |\n")
zf.writestr(f"{prefix}/Analysis/Score Distribution.md", "".join(score_lines))
# Top Rated
top_lines = [
"---\ntags: [analysis]\n---\n",
"\n# Top Rated Drafts\n\n",
"Drafts ranked by composite score.\n\n",
"| # | Draft | Score | Novelty | Maturity | Overlap | Momentum | Relevance | Category |\n",
"|---|---|---|---|---|---|---|---|---|\n",
]
for i, (d, r) in enumerate(top_rated[:30], 1):
cat = r.categories[0] if r.categories else ""
top_lines.append(
f"| {i} | [[{d.name}|{(d.title or d.name)[:45]}]] | **{r.composite_score:.2f}** | "
f"{r.novelty} | {r.maturity} | {r.overlap} | {r.momentum} | {r.relevance} | {cat} |\n"
)
zf.writestr(f"{prefix}/Analysis/Top Rated.md", "".join(top_lines))
# Ideas Overview
type_counts = Counter(i.get("type", "other") or "other" for i in all_ideas)
ideas_lines = [
"---\ntags: [analysis, ideas]\n---\n",
"\n# Extracted Ideas\n\n",
f"**{len(all_ideas)}** technical ideas extracted from rated drafts.\n\n",
_mermaid_pie("Ideas by Type", dict(type_counts.most_common(10))),
"\n\n## By Type\n\n",
]
for itype, count in type_counts.most_common():
ideas_lines.append(f"- **{itype}**: {count} ideas\n")
ideas_lines.append("\n## Recent Ideas\n\n")
for idea in all_ideas[:50]:
dn = idea.get("draft_name", "")
novelty = f" `N:{idea.get('novelty_score')}`" if idea.get("novelty_score") else ""
ideas_lines.append(f"- **{idea.get('title', 'Untitled')}**{novelty} — [[{dn}]]\n")
if len(all_ideas) > 50:
ideas_lines.append(f"\n*...and {len(all_ideas) - 50} more. See individual draft notes.*\n")
zf.writestr(f"{prefix}/Analysis/Ideas Overview.md", "".join(ideas_lines))
# Timeline
timeline_lines = [
"---\ntags: [analysis, timeline]\n---\n",
"\n# Timeline\n\n",
"Draft submission activity over time.\n\n",
_mermaid_timeline_chart(dict(sorted(monthly.items()))),
"\n\n## Monthly Counts\n\n",
"| Month | Drafts |\n|---|---|\n",
]
for m in sorted(monthly.keys()):
timeline_lines.append(f"| {m} | {monthly[m]} |\n")
zf.writestr(f"{prefix}/Analysis/Timeline.md", "".join(timeline_lines))
# --- Glossary ---
glossary = """---
tags: [reference, glossary]
---
# Glossary
Reference for all terms, abbreviations, and scoring dimensions used in this vault.
## Scoring Dimensions
Each draft is rated by Claude AI on five dimensions, scored from 1 (lowest) to 5 (highest).
| Dimension | Description |
|---|---|
| **Novelty** | How original is this draft? Does it introduce new ideas, or rehash existing approaches? High = genuinely new contribution. |
| **Maturity** | How complete and well-developed is the specification? High = detailed protocol, clear data formats, ready for implementation. Low = early sketch or position paper. |
| **Overlap** | How much does this draft duplicate existing work? High overlap (5) = very similar to other drafts. Low overlap (1) = unique in the landscape. *Note: In composite score, this is inverted (5 - overlap) so lower overlap contributes positively.* |
| **Momentum** | Is this draft gaining traction? High = active revisions, working group adoption, multiple authors/organizations. Low = single submission, no updates. |
| **Relevance** | How relevant is this draft to AI agent infrastructure? High = directly addresses agent-to-agent communication, identity, authorization. Low = tangentially related. |
## Composite Score
The **composite score** (1.0-5.0) is calculated as:
```
score = (novelty + maturity + (5 - overlap) + momentum + relevance) / 5
```
Overlap is inverted because a *lower* overlap is better (more unique).
## Score Bars
Score bars visualize ratings: `\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2591\u2591\u2591` = 3.5/5.0
- `\u2588` (filled) = earned score
- `\u2591` (empty) = remaining
## Other Terms
| Term | Meaning |
|---|---|
| **Draft / I-D** | Internet-Draft — a working document submitted to the IETF. Not yet an RFC (standard). |
| **RFC** | Request for Comments — a published IETF standard or informational document. |
| **Working Group (WG)** | An IETF group chartered to work on a specific topic (e.g., WIMSE, OAuth). |
| **Category** | Topic classification assigned by Claude during analysis (e.g., "A2A protocols", "AI safety/alignment"). A draft can belong to multiple categories. |
| **Idea** | A distinct technical concept extracted from a draft by Claude. Each idea has a type (protocol, mechanism, framework, etc.) and a novelty score. |
| **Novelty Score (N:1-5)** | Per-idea originality rating. Shown as `N:4` next to ideas. 5 = completely new concept, 1 = well-known approach. |
| **Gap** | An area identified where no existing draft adequately addresses a need in the AI agent ecosystem. |
| **Affiliation** | The organization an author is associated with (from IETF Datatracker records). |
| **Co-authorship** | Two authors who appear together on at least one draft. |
| **Datatracker** | The IETF's official system for tracking Internet-Drafts, RFCs, and working groups (datatracker.ietf.org). |
"""
zf.writestr(f"{prefix}/Analysis/Glossary.md", glossary)
# --- .obsidian settings for graph colors ---
graph_json = """{
"collapse-filter": false,
"search": "",
"showTags": true,
"showAttachments": false,
"hideUnresolved": false,
"showOrphans": true,
"collapse-color-groups": false,
"colorGroups": [
{"query": "path:Drafts", "color": {"a": 1, "rgb": 3444735}},
{"query": "path:Authors", "color": {"a": 1, "rgb": 10092441}},
{"query": "path:Categories", "color": {"a": 1, "rgb": 16744448}},
{"query": "path:Analysis", "color": {"a": 1, "rgb": 2293541}}
],
"collapse-display": false,
"showArrow": true,
"textFadeMultiplier": 0,
"nodeSizeMultiplier": 1.2,
"lineSizeMultiplier": 1,
"collapse-forces": true,
"centerStrength": 0.5,
"repelStrength": 10,
"linkStrength": 1,
"linkDistance": 100
}"""
zf.writestr(f"{prefix}/.obsidian/graph.json", graph_json)
buf.seek(0)
return buf.getvalue()
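The BytesIO-plus-ZipFile construction that `build_obsidian_vault` uses round-trips as plain bytes. A minimal standalone sketch with toy vault contents:

```python
import io
import zipfile

# Same construction as build_obsidian_vault, on toy content:
# write markdown files into an in-memory ZIP, then hand back the bytes.
buf = io.BytesIO()
prefix = "Toy-Vault"
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr(f"{prefix}/Dashboard.md", "# Dashboard\n\n[[draft-example]]\n")
    zf.writestr(f"{prefix}/Drafts/draft-example.md", "---\nscore: 4.20\n---\n# Example\n")
buf.seek(0)
data = buf.getvalue()

# A consumer (e.g. a download route) receives plain bytes
# and can re-open them as a valid ZIP.
z = zipfile.ZipFile(io.BytesIO(data))
names = z.namelist()
assert z.testzip() is None  # no corrupt members
```

Building the archive in memory keeps the export stateless: nothing is written to disk, so concurrent downloads cannot collide.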

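A quick standalone check of the composite-score formula documented in the Glossary note (overlap inverted so uniqueness scores higher); the function name here is illustrative, not part of the codebase:

```python
def composite_score(novelty, maturity, overlap, momentum, relevance):
    # Overlap is inverted: a lower overlap (more unique draft) raises the score.
    return (novelty + maturity + (5 - overlap) + momentum + relevance) / 5

s = composite_score(novelty=4, maturity=3, overlap=2, momentum=3, relevance=5)
# (4 + 3 + 3 + 3 + 5) / 5 = 3.6
```

A maximally unique, mature, relevant draft (all 5s, overlap 1) scores 4.8, not 5.0, because the inverted overlap term tops out at 4.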
tests/test_obsidian_export.py Normal file

@@ -0,0 +1,200 @@
"""Tests for the Obsidian vault export.
If this test breaks, the export is out of sync with the data model.
Fix obsidian_export.py to match whatever changed.
"""
from __future__ import annotations
import io
import sys
import zipfile
from pathlib import Path
import pytest
_project_root = Path(__file__).resolve().parent.parent
if str(_project_root / "src") not in sys.path:
sys.path.insert(0, str(_project_root / "src"))
from webui.obsidian_export import build_obsidian_vault
def test_vault_structure(seeded_db):
"""Vault ZIP should contain expected folders and key files."""
data = build_obsidian_vault(seeded_db)
assert len(data) > 0
z = zipfile.ZipFile(io.BytesIO(data))
names = z.namelist()
# Key structural files must exist
assert "IETF-AI-Agent-Drafts/Dashboard.md" in names
assert "IETF-AI-Agent-Drafts/Authors/index.md" in names
assert "IETF-AI-Agent-Drafts/Categories/index.md" in names
assert "IETF-AI-Agent-Drafts/.obsidian/graph.json" in names
# Should have analysis notes
analysis = [n for n in names if "/Analysis/" in n]
assert len(analysis) >= 3 # Score Distribution, Top Rated, Ideas Overview
def test_vault_has_all_drafts(seeded_db):
"""Every draft in the DB should have a corresponding note in the vault."""
data = build_obsidian_vault(seeded_db)
z = zipfile.ZipFile(io.BytesIO(data))
draft_files = [n for n in z.namelist() if "/Drafts/" in n]
# seeded_db has 5 drafts
assert len(draft_files) == 5
# Check each draft name appears
draft_names = {Path(f).stem for f in draft_files}
assert "draft-alpha-agent-comm" in draft_names
assert "draft-gamma-agent-id" in draft_names
def test_draft_note_has_frontmatter(seeded_db):
"""Draft notes must have YAML frontmatter with score and categories."""
data = build_obsidian_vault(seeded_db)
z = zipfile.ZipFile(io.BytesIO(data))
content = z.read("IETF-AI-Agent-Drafts/Drafts/draft-alpha-agent-comm.md").decode()
# YAML frontmatter
assert content.startswith("---")
assert "score:" in content
assert "novelty:" in content
assert "maturity:" in content
assert "categories:" in content
assert "tags:" in content
# No floating-point noise (e.g., 3.4000000000000004)
import re
long_floats = re.findall(r"\d+\.\d{4,}", content)
assert len(long_floats) == 0, f"Unformatted floats found: {long_floats}"
def test_draft_note_has_wikilinks(seeded_db):
"""Draft notes should link to authors and categories with [[wikilinks]]."""
data = build_obsidian_vault(seeded_db)
z = zipfile.ZipFile(io.BytesIO(data))
content = z.read("IETF-AI-Agent-Drafts/Drafts/draft-alpha-agent-comm.md").decode()
# Should link to authors
assert "[[Alice Researcher]]" in content
assert "[[Bob Engineer]]" in content
# Should link to categories
assert "[[A2A protocols]]" in content
def test_draft_note_has_ideas(seeded_db):
"""Draft notes should include extracted ideas."""
data = build_obsidian_vault(seeded_db)
z = zipfile.ZipFile(io.BytesIO(data))
content = z.read("IETF-AI-Agent-Drafts/Drafts/draft-alpha-agent-comm.md").decode()
assert "Extracted Ideas" in content
assert "Agent Handshake" in content
assert "Capability Negotiation" in content
def test_draft_note_has_rating_bars(seeded_db):
"""Draft notes should include visual score bars."""
data = build_obsidian_vault(seeded_db)
z = zipfile.ZipFile(io.BytesIO(data))
content = z.read("IETF-AI-Agent-Drafts/Drafts/draft-alpha-agent-comm.md").decode()
# Score bars use block chars
assert "\u2588" in content # filled block
assert "\u2591" in content # empty block
assert "/5.0" in content
def test_author_notes(seeded_db):
"""Author notes should list their drafts with wikilinks."""
data = build_obsidian_vault(seeded_db)
z = zipfile.ZipFile(io.BytesIO(data))
content = z.read("IETF-AI-Agent-Drafts/Authors/Alice Researcher.md").decode()
assert content.startswith("---")
assert "affiliation:" in content
assert "ExampleCorp" in content
assert "[[draft-alpha-agent-comm" in content
assert "[[draft-gamma-agent-id" in content
def test_category_notes(seeded_db):
"""Category notes should list drafts with scores."""
data = build_obsidian_vault(seeded_db)
z = zipfile.ZipFile(io.BytesIO(data))
cat_files = [n for n in z.namelist() if "/Categories/" in n and "index" not in n]
# seeded_db has 5 distinct categories, so at least 4 category notes are expected
assert len(cat_files) >= 4
# Check one category note
content = z.read("IETF-AI-Agent-Drafts/Categories/A2A protocols.md").decode()
assert "[[draft-alpha-agent-comm" in content
assert "draft_count:" in content
def test_dashboard_has_mermaid(seeded_db):
"""Dashboard should contain Mermaid chart blocks."""
data = build_obsidian_vault(seeded_db)
z = zipfile.ZipFile(io.BytesIO(data))
content = z.read("IETF-AI-Agent-Drafts/Dashboard.md").decode()
assert "```mermaid" in content
assert "pie title" in content
assert "Key Stats" in content
assert "Total Drafts" in content
def test_vault_has_glossary(seeded_db):
"""Vault should contain a Glossary with scoring dimensions explained."""
data = build_obsidian_vault(seeded_db)
z = zipfile.ZipFile(io.BytesIO(data))
assert "IETF-AI-Agent-Drafts/Analysis/Glossary.md" in z.namelist()
content = z.read("IETF-AI-Agent-Drafts/Analysis/Glossary.md").decode()
# All five dimensions must be explained
for dim in ("Novelty", "Maturity", "Overlap", "Momentum", "Relevance"):
assert dim in content, f"Glossary missing dimension: {dim}"
assert "Composite Score" in content
assert "Internet-Draft" in content
def test_top_rated_uses_full_names(seeded_db):
"""Top Rated table should use full dimension names, not abbreviations."""
data = build_obsidian_vault(seeded_db)
z = zipfile.ZipFile(io.BytesIO(data))
content = z.read("IETF-AI-Agent-Drafts/Analysis/Top Rated.md").decode()
assert "Novelty" in content
assert "Maturity" in content
assert "| Nov |" not in content # no abbreviations
def test_vault_is_valid_zip(seeded_db):
"""The output should be a valid ZIP that can be extracted."""
data = build_obsidian_vault(seeded_db)
z = zipfile.ZipFile(io.BytesIO(data))
# Should not raise
bad = z.testzip()
assert bad is None, f"Corrupt file in ZIP: {bad}"
# All files should be decodable as UTF-8
for name in z.namelist():
if name.endswith(".md"):
z.read(name).decode("utf-8")