Christian Nennemann e247bfef8f Run pipeline, write Post 08, commit untracked files

Pipeline:
- Extract ideas for 38 new drafts → 462 ideas total
- Convergence analysis: 132 cross-org convergent ideas (33% rate)
- Fetch authors for 102 drafts → 709 authors (up from 403)
- Refresh gap analysis: 12 gaps across full 474-draft corpus
- Update verified counts with new totals

Post 08:
- Complete rewrite of "Agents Building the Agent Analysis" (2,953 words)
- Covers 3 phases: writing team → review cycle → fix cycle
- Meta-irony table mapping team coordination to IETF gap names
- Specific examples from dev journal (SQL injection, consent conflation, ideas mismatch)

Untracked files committed:
- scripts/: backfill-wg-names, classify-unrated, compare-classifiers, download-relevant-text, run-webui
- src/ietf_analyzer/classifier.py: two-stage Ollama classifier
- src/webui/: analytics (GDPR-compliant), auth, obsidian_export
- tests/test_obsidian_export.py (10 tests)
- data/reports/: wg-analysis, generated draft for gap #37

Housekeeping:
- .gitignore: exclude LaTeX artifacts, stale DBs, analytics.db

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-08 15:31:30 +01:00

Verified Database Counts

Source: data/drafts.db, queried 2026-03-08
Purpose: Single source of truth for all counts, replacing inconsistent numbers across blog posts and reports.

Core Tables

Table	Count	Notes
drafts	434	Up from 361 after 2026-03-07 fetch
ratings	434	1:1 with drafts
authors	557	Unique persons from Datatracker
ideas	462	Re-extracted 2026-03-08, see "Ideas Count History" below
gaps	11	Not 12 -- see gap list below
embeddings	434	1:1 with drafts
draft_authors	1,057	Draft-author links
llm_cache	1,397	Cached API calls
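
The counts in this table can be re-queried whenever the database changes. The following is a minimal sketch, assuming the table names listed above exist as SQLite tables in data/drafts.db; the helper name `count_rows` is hypothetical and not part of the project code.

```python
import sqlite3

# Table names copied from the Core Tables list; adjust if the schema differs.
CORE_TABLES = ["drafts", "ratings", "authors", "ideas", "gaps",
               "embeddings", "draft_authors", "llm_cache"]


def count_rows(conn: sqlite3.Connection, tables: list[str]) -> dict[str, int]:
    """Return {table: row count} for every requested table that exists."""
    existing = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    counts = {}
    for table in tables:
        # Only interpolate names that sqlite_master confirmed, since table
        # names cannot be bound as SQL parameters.
        if table in existing:
            counts[table] = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    return counts


# Usage (assumes the database file is present at this path):
#   conn = sqlite3.connect("data/drafts.db")
#   for name, n in count_rows(conn, CORE_TABLES).items():
#       print(f"{name}\t{n:,}")
```

Tables missing from the database are skipped rather than raising, so a stale schema shows up as an absent key instead of a crash.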

False Positive Analysis

73 drafts are flagged with false_positive = 1 in the ratings table (column added 2026-03-08).

Criteria	Count
Relevance <= 2 (auto-flagged)	38
Relevance 3+ but clearly not AI-agent (manually reviewed)	35
Total false positives	73
Drafts excluding false positives	361
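
The breakdown above can be derived from the ratings table in one query. This is a sketch under assumptions: the `false_positive` column is taken from this document, while the `relevance` column name and the function name `false_positive_breakdown` are hypothetical.

```python
import sqlite3


def false_positive_breakdown(conn: sqlite3.Connection) -> dict[str, int]:
    """Split flagged drafts into auto-flagged (relevance <= 2) and manual (3+)."""
    # SQLite evaluates comparisons to 0 or 1, so SUM() counts matching rows.
    auto, manual, kept = conn.execute(
        """
        SELECT
            COALESCE(SUM(false_positive = 1 AND relevance <= 2), 0),
            COALESCE(SUM(false_positive = 1 AND relevance >= 3), 0),
            COALESCE(SUM(false_positive = 0), 0)
        FROM ratings
        """
    ).fetchone()
    return {"auto": auto, "manual": manual, "total": auto + manual, "kept": kept}
```

Against the figures in this section, the expected result would be 38 auto-flagged, 35 manually flagged, 73 total, and 361 kept.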

Relevance Score Distribution (all 434 drafts)

Relevance	Count
1	2
2	36
3	102
4	196
5	98

Category Counts (excluding false positives)

All categories are normalized to short-form names (21 legacy long-form entries migrated 2026-03-08).

Category	Count
Data formats/interop	146
A2A protocols	146
Agent identity/auth	127
Autonomous netops	103
Policy/governance	97
Agent discovery/reg	82
ML traffic mgmt	77
AI safety/alignment	44
Model serving/inference	42
Human-agent interaction	33
Other AI/agent	18

Note: Drafts average ~2.5 categories each (915 category assignments across 361 drafts), so these counts sum to more than 361.
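
The average can be cross-checked from the table itself: the sum of the category counts is the number of draft-category assignments, and dividing by the 361 retained drafts gives the mean. A small sketch using only the figures listed above:

```python
# Category counts copied from the table above (false positives excluded).
category_counts = {
    "Data formats/interop": 146,
    "A2A protocols": 146,
    "Agent identity/auth": 127,
    "Autonomous netops": 103,
    "Policy/governance": 97,
    "Agent discovery/reg": 82,
    "ML traffic mgmt": 77,
    "AI safety/alignment": 44,
    "Model serving/inference": 42,
    "Human-agent interaction": 33,
    "Other AI/agent": 18,
}
retained_drafts = 361  # 434 drafts minus 73 false positives

total_assignments = sum(category_counts.values())
mean_categories = total_assignments / retained_drafts

print(total_assignments)          # 915
print(round(mean_categories, 2))  # 2.53
```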

Gap List (11 gaps, not 12)

ID	Topic	Severity	Category
37	Multi-Agent Consensus Protocols	high	A2A protocols
38	Agent Behavioral Verification	critical	AI safety/alignment
39	Cross-Protocol Agent Migration	medium	Agent discovery/reg
40	Real-Time Agent Rollback Mechanisms	high	Autonomous netops
41	Agent Resource Accounting and Billing	medium	new
42	Federated Agent Learning Privacy	high	Policy/governance
43	Agent Capability Negotiation	medium	A2A protocols
44	Cross-Domain Agent Audit Trails	high	Agent identity/auth
45	Agent Failure Cascade Prevention	critical	AI safety/alignment
46	Human Override Standardization	high	Human-agent interaction
47	Agent Performance Benchmarking	medium	new

Blog posts reference 12 gaps under different names (e.g., "Agent Resource Exhaustion Protection" where the database has "Agent Resource Accounting and Billing"). The blog list appears to be an editorial rewrite rather than raw pipeline output. The missing 12th gap may be "Cross-Protocol Translation" or "Agent Data Provenance", both of which appear in blog posts but not in the database.

Ideas Count History

The database currently contains 462 ideas across 415 drafts. This is the fourth count encountered:

Source	Count	Date	Likely Explanation
Blog post 5 filename	1,262	~2026-03-03	Pre-expansion dataset (260 drafts), before dedup
Blog post 5 text / master stats	1,780	~2026-03-05	Post-expansion (361 drafts), before dedup
Previous database	419	2026-03-08	After `dedup_ideas` run (0.85 threshold) or re-extraction with different params
Current database	462	2026-03-08	After re-extraction for 38 drafts missing ideas (474 total drafts, 59 still without ideas)

Ideas by Type (current DB)

Type	Count
architecture	107
protocol	106
extension	84
mechanism	74
requirement	47
pattern	40
framework	3
format	1

Ideas per Draft Distribution

Ideas/Draft	Drafts
1	370
2	43
3	2
0 (no ideas)	59

The near-uniform one-idea-per-draft distribution (370 of the 415 drafts with ideas, 89%) suggests either aggressive dedup or a re-extraction with constrained output. The original pipeline extracted 1-4 ideas per draft, so the 1,780 figure likely reflects pre-dedup counts.
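
The distribution table can be checked against the totals quoted elsewhere in this document. The sketch below uses only the numbers from the table above and confirms that they reproduce 462 ideas, 415 drafts with ideas, and 474 drafts overall.

```python
# Ideas-per-draft histogram copied from the table above: {ideas: drafts}.
distribution = {0: 59, 1: 370, 2: 43, 3: 2}

total_drafts = sum(distribution.values())
drafts_with_ideas = total_drafts - distribution[0]
total_ideas = sum(ideas * drafts for ideas, drafts in distribution.items())
single_idea_share = distribution[1] / drafts_with_ideas

assert total_ideas == 462        # matches the ideas table count
assert drafts_with_ideas == 415  # "462 ideas across 415 drafts"
assert total_drafts == 474       # "474 total drafts, 59 still without ideas"
print(f"{single_idea_share:.0%}")  # 89%
```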

Convergence Analysis (2026-03-08)

Cross-organization idea convergence analysis (threshold: 0.75 SequenceMatcher similarity):

Metric	Value
Total ideas	462
Unique clusters	398
Cross-org convergent ideas	132
Convergence rate	33% (132 of 398 clusters)

Top convergent ideas by organization count:

Fully Adaptive Routing Ethernet for AI — 14 orgs (Baidu, Broadcom, China Mobile, etc.)
AI Agent Protocol Framework — 7 orgs, 3 drafts
Natural Language Protocol for Agent Comm — 7 orgs
LISP-based geospatial intelligence network — 6 orgs
MCP-Based Network Management Plane — 4 orgs (Deutsche Telekom, Huawei, Orange, Telefonica)
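
The 0.75 threshold refers to the similarity ratio from Python's `difflib.SequenceMatcher`. The pipeline's actual clustering code is not shown in this document, so the following is only a sketch under assumptions: ideas are (title, organization) pairs, titles are compared case-insensitively, and each idea joins the first cluster whose representative title is similar enough. The names `similar`, `cluster_ideas`, and `convergent` are hypothetical.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.75  # similarity threshold quoted in this section


def similar(a: str, b: str, threshold: float = THRESHOLD) -> bool:
    """True when the two titles meet the SequenceMatcher ratio threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cluster_ideas(ideas):
    """Greedy single-pass clustering of (title, org) pairs.

    Each idea is compared against the first title of every existing cluster
    and joins the first one that is similar; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"title": str, "orgs": set, "count": int}
    for title, org in ideas:
        for cluster in clusters:
            if similar(title, cluster["title"]):
                cluster["orgs"].add(org)
                cluster["count"] += 1
                break
        else:
            clusters.append({"title": title, "orgs": {org}, "count": 1})
    return clusters


def convergent(clusters):
    """Clusters that received ideas from more than one organization."""
    return [c for c in clusters if len(c["orgs"]) > 1]
```

Greedy assignment is order-dependent and quadratic in the worst case, which is acceptable for a few hundred ideas but is a design shortcut, not a statement about how the real analysis is implemented.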

Actions Taken (2026-03-08)

Category normalization: Updated 21 ratings rows from legacy long-form category names to canonical short forms. All 11 categories are now consistent.
False positive flagging: Added false_positive column to ratings table. Flagged 73 drafts (38 with relevance <= 2, 35 manually reviewed at relevance 3+).
Schema migration: Updated db.py schema and migration code to include false_positive column.
This document: Created as the single source of truth for counts.
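
The schema migration and the automatic part of the false positive flagging could look roughly like the following. This is a sketch, not the project's db.py code: the function names `add_false_positive_column` and `auto_flag_low_relevance` are hypothetical, and the `relevance` column name is assumed.

```python
import sqlite3


def add_false_positive_column(conn: sqlite3.Connection) -> bool:
    """Add ratings.false_positive if it is missing; return True when added."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    columns = {row[1] for row in conn.execute("PRAGMA table_info(ratings)")}
    if "false_positive" in columns:
        return False  # already migrated, so the call is safe to repeat
    conn.execute(
        "ALTER TABLE ratings ADD COLUMN false_positive INTEGER NOT NULL DEFAULT 0"
    )
    return True


def auto_flag_low_relevance(conn: sqlite3.Connection, max_relevance: int = 2) -> int:
    """Flag drafts at or below max_relevance; return the number of rows changed."""
    cursor = conn.execute(
        "UPDATE ratings SET false_positive = 1 "
        "WHERE relevance <= ? AND false_positive = 0",
        (max_relevance,),
    )
    return cursor.rowcount
```

Checking for the column first keeps the migration idempotent, since SQLite raises an error when `ALTER TABLE ... ADD COLUMN` is run for a column that already exists. The 35 manually reviewed drafts would still need to be flagged individually.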
