# IETF Draft Analyzer

Track, categorize, rate, and map AI/agent-related IETF Internet-Drafts.

**475 drafts** analyzed (361 relevant after false-positive filtering) with **713 authors**, **501 extracted ideas**, **132 cross-org convergent ideas**, and **12 identified gaps**, spanning 2024 to March 2026.

## What This Does

The IETF is experiencing an unprecedented wave of standardization activity around AI agents. This tool provides a quantitative lens on that activity:

- **Fetches** draft metadata and full text from the IETF Datatracker API
- **Rates** each draft on 5 dimensions (novelty, maturity, overlap, momentum, relevance) using Claude
- **Embeds** drafts with Ollama for pairwise similarity and clustering
- **Extracts** discrete technical ideas and identifies landscape gaps
- **Analyzes** cross-organizational convergence (SequenceMatcher at 0.75 threshold)
- **Maps** the author collaboration network and organizational affiliations
- **Generates** markdown reports and a full web dashboard
- **Filters** false positives automatically (relevance-based + manual flagging)

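The convergence step can be pictured as comparing idea titles from different organizations with Python's `difflib.SequenceMatcher` and keeping pairs whose similarity ratio reaches the 0.75 threshold. The following is a minimal sketch under stated assumptions, not the project's implementation: the `(organization, title)` input shape, the `convergent_pairs` function name, and the sample data are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def convergent_pairs(ideas, threshold=0.75):
    """Return pairs of similar idea titles that come from different organizations.

    `ideas` is a list of (organization, title) tuples; this shape is an assumption.
    """
    pairs = []
    for (org_a, title_a), (org_b, title_b) in combinations(ideas, 2):
        if org_a == org_b:
            continue  # same organization: not cross-org convergence
        ratio = SequenceMatcher(None, title_a.lower(), title_b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((title_a, title_b, round(ratio, 2)))
    return pairs


# Hypothetical sample data for illustration only.
sample = [
    ("OrgA", "Agent capability discovery protocol"),
    ("OrgB", "Agent capability discovery protocols"),
    ("OrgA", "Agent capability discovery protocol v2"),
    ("OrgC", "Post-quantum key exchange"),
]
for a, b, score in convergent_pairs(sample):
    print(f"{score:.2f}  {a!r} ~ {b!r}")
```

Comparing every pair is quadratic in the number of ideas, which is likely acceptable at the scale of a few hundred ideas but would need blocking or indexing for much larger sets.
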
## Quick Start

```bash
# Install
pip install -e .

# Set your API key (or add to .env file)
export ANTHROPIC_API_KEY=sk-ant-...

# Fetch drafts from IETF Datatracker
ietf fetch

# Rate all unrated drafts (--cheap uses Haiku for lower cost)
ietf analyze --all
ietf analyze --all --cheap   # ~10x cheaper with Haiku

# Generate embeddings (requires Ollama running locally)
ietf embed

# Extract technical ideas
ietf ideas --all --cheap --batch 5

# Analyze cross-org convergence
ietf ideas convergence

# Identify gaps in the landscape
ietf gaps

# Fetch author data
ietf authors --fetch

# Generate reports
ietf report overview
ietf report landscape
ietf report authors

# Launch the web dashboard
./scripts/run-webui.sh
```

## Web Dashboard

A full interactive dashboard runs at `http://127.0.0.1:5000`:

```bash
./scripts/run-webui.sh
# or: FLASK_APP=src/webui/app.py flask run
```

| Page | What it shows |
|------|---------------|
| **Overview** | Stat cards, score histogram, category radar, submission timeline |
| **Draft Explorer** | Searchable/filterable/sortable table with category pills and score badges |
| **Draft Detail** | Score ring, dimension bars, ideas, references, linked authors |
| **Ratings** | Score distributions, box plots, category radar, novelty vs maturity scatter |
| **Landscape** | t-SNE embedding map, quality quadrants |
| **Authors** | Co-authorship force-directed graph, organization charts |
| **Ideas** | Extracted ideas grouped by type with search |
| **Gaps** | Gap cards sorted by severity with related drafts |
| **Citations** | RFC cross-reference graph |
| **Similarity** | Draft similarity network |
| **Timeline** | Submission trends over time |
| **Monitor** | Pipeline health, API costs, processing status |

Charts are interactive (Plotly.js). Analytics are GDPR-compliant (no cookies, daily-salted IP hashing).

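One common way to count visitors without cookies is to hash each IP address with a salt that rotates every day, so the stored hashes cannot be linked across days. The sketch below illustrates that idea only; `SERVER_SECRET`, `daily_salt`, and `visitor_hash` are hypothetical names, and the project's `analytics.py` may derive its salt differently.

```python
import hashlib
import hmac
from datetime import date

# Hypothetical secret; in practice load it from configuration, never hard-code it.
SERVER_SECRET = b"change-me"


def daily_salt(day: date, secret: bytes = SERVER_SECRET) -> bytes:
    """Derive a salt that changes every calendar day."""
    return hmac.new(secret, day.isoformat().encode(), hashlib.sha256).digest()


def visitor_hash(ip: str, day: date) -> str:
    """Hash an IP address with the day's salt; the raw IP is never stored."""
    return hmac.new(daily_salt(day), ip.encode(), hashlib.sha256).hexdigest()


# The same address yields unrelated hashes on different days.
same_day = visitor_hash("203.0.113.7", date(2026, 3, 1))
next_day = visitor_hash("203.0.113.7", date(2026, 3, 2))
print(same_day != next_day)
```

Within a single day the hash is stable, so unique-visitor counts still work; across days the identifiers are unlinkable as long as the secret stays private.
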
## Blog Series

An 8-post analysis series in `data/reports/blog-series/`:

1. **The Gold Rush**: Growth from 9 drafts to 9.3% of all IETF submissions
2. **Who Writes the Rules**: Huawei's 16%, geopolitical dynamics, team blocs
3. **The OAuth Wars**: 14 competing OAuth proposals, fragmentation costs
4. **What Nobody Builds**: The safety deficit (4:1 ratio), 12 identified gaps
5. **Where Drafts Converge**: 132 cross-org convergent ideas, implicit consensus
6. **The Big Picture**: Architectural vision, EU AI Act implications
7. **How We Built This**: Methodology, cost ($9-15), limitations
8. **Agents Building the Agent Analysis**: Meta post on using Claude agent teams

## Key Findings

- **Safety deficit**: ~4:1 ratio of capability-building to safety proposals (varies 1.5:1 to 21:1 monthly)
- **Extreme fragmentation**: 155 competing A2A protocols, 42 overlap clusters
- **Organizational concentration**: Huawei ~16% of all drafts, Chinese orgs ~40%
- **Cross-org convergence**: 132 ideas (33%) independently proposed by multiple organizations
- **12 gaps identified**: 2 critical (behavior verification, human override), 5 high, 5 medium
- **Top-rated drafts**: Safety-focused proposals score highest (VOLT 4.75, DAAP 4.75)

## CLI Commands

### Core Pipeline

| Command | Description |
|---------|-------------|
| `ietf fetch` | Fetch AI/agent drafts from IETF Datatracker |
| `ietf analyze --all [--cheap] [--dry-run]` | Rate drafts using Claude |
| `ietf embed [--dry-run]` | Generate semantic embeddings via Ollama |
| `ietf ideas --all [--cheap] [--batch N] [--dry-run]` | Extract technical ideas |
| `ietf ideas convergence [--threshold 0.75]` | Cross-org convergence analysis |
| `ietf ideas dedup` | Deduplicate similar ideas |
| `ietf gaps [--dry-run]` | Identify landscape gaps |
| `ietf authors --fetch` | Fetch author/affiliation data |

### Exploration

| Command | Description |
|---------|-------------|
| `ietf list` | List tracked drafts |
| `ietf show <name>` | Show detailed info for a draft |
| `ietf search <query>` | Full-text search (FTS5) |
| `ietf similar <name>` | Find similar drafts by embedding similarity |
| `ietf clusters` | Find clusters of near-duplicate drafts |
| `ietf compare <name1> <name2>` | Compare drafts for overlap |
| `ietf authors` | Top authors and draft counts |
| `ietf network` | Organizational collaboration network |

### Reports (`ietf report`)

| Command | Description |
|---------|-------------|
| `ietf report overview` | Sortable table of all rated drafts |
| `ietf report landscape` | Category-grouped view with rankings |
| `ietf report authors` | Top authors, organizations, collaboration |
| `ietf report ideas` | Ideas by type, most common, unique |
| `ietf report gaps` | Gap analysis with severity ratings |
| `ietf report timeline` | Monthly submission trends |
| `ietf report overlap-matrix` | Similar pairs and cross-category matrix |

## Rating System

Each draft is scored 1-5 on five dimensions by Claude (LLM-as-judge, see [methodology](data/reports/methodology.md) for caveats):

| Dimension | What it measures |
|-----------|-----------------|
| **Novelty** | Originality relative to existing standards |
| **Maturity** | Completeness of specification |
| **Overlap** | Redundancy with other drafts (5 = heavily overlapping) |
| **Momentum** | Community engagement, revisions, adoption |
| **Relevance** | Importance to the AI/agent ecosystem |

**Important**: Ratings are generated from abstracts and partial full text without human calibration. They should be treated as relative rankings, not absolute quality measures.

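Because the scores are best read as relative rankings, a typical use is to validate them and sort drafts against each other. The sketch below assumes a `Rating` dataclass with one integer field per dimension and a composite that averages all five with overlap inverted (since 5 means heavily overlapping). The composite formula and the sample drafts are hypothetical and not necessarily what the project computes.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Rating:
    """Five 1-5 scores for one draft (field names follow the dimension table)."""
    novelty: int
    maturity: int
    overlap: int  # 5 = heavily overlapping, so higher is worse
    momentum: int
    relevance: int

    def __post_init__(self):
        # Reject any score outside the documented 1-5 range.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 1 <= value <= 5:
                raise ValueError(f"{f.name} must be in 1..5, got {value}")

    def composite(self) -> float:
        """Hypothetical composite: mean of all dimensions with overlap inverted."""
        parts = [self.novelty, self.maturity, 6 - self.overlap,
                 self.momentum, self.relevance]
        return sum(parts) / len(parts)


# Hypothetical drafts, used only to show relative ranking.
ratings = {
    "draft-a": Rating(5, 4, 2, 4, 5),
    "draft-b": Rating(3, 3, 4, 2, 4),
}
ranked = sorted(ratings, key=lambda name: ratings[name].composite(), reverse=True)
print(ranked)
```
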
## Tech Stack

- **Python 3.11+** with Click CLI
- **SQLite** with FTS5 full-text search and WAL mode
- **Anthropic Claude** (Sonnet/Haiku) for analysis, rating, idea extraction, gap analysis
- **Ollama** (nomic-embed-text) for local embeddings and similarity
- **Flask** with Jinja2 for the web dashboard
- **Plotly** for interactive visualizations
- **NumPy/SciPy/scikit-learn** for similarity computation and clustering

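Embedding similarity is usually measured with cosine similarity. The sketch below shows the idea with toy 3-dimensional vectors in place of real embeddings and uses only the standard library so it runs anywhere; a NumPy version would vectorize the same arithmetic. The function names and sample vectors are hypothetical, and the project's `embeddings.py` may use a different metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(name, embeddings, top_k=3):
    """Rank every other draft by cosine similarity to `name`, highest first."""
    target = embeddings[name]
    scores = [(other, cosine_similarity(target, vec))
              for other, vec in embeddings.items() if other != name]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:top_k]


# Toy vectors standing in for real embeddings (illustration only).
embeddings = {
    "draft-a": [1.0, 0.0, 0.0],
    "draft-b": [0.9, 0.1, 0.0],
    "draft-c": [0.0, 1.0, 0.0],
}
for other, score in most_similar("draft-a", embeddings):
    print(f"{other}: {score:.3f}")
```
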
## Project Structure

```
src/ietf_analyzer/
  cli.py           # Click CLI entry point (~30 commands)
  fetcher.py       # IETF Datatracker API client
  analyzer.py      # Claude-based analysis (rating, ideas, gaps)
  embeddings.py    # Ollama embeddings + similarity + clustering
  db.py            # SQLite with FTS5 (8 tables)
  models.py        # Author, Draft, Rating dataclasses
  reports.py       # Markdown report generation
  authors.py       # Author network analysis
  search.py        # Hybrid FTS5 + embedding search
  classifier.py    # Two-stage Ollama classifier
  readiness.py     # Draft readiness scoring
  config.py        # Configuration

src/webui/
  app.py           # Flask application (20 API endpoints)
  data.py          # Data access layer with TypedDicts
  auth.py          # Admin authentication
  analytics.py     # GDPR-compliant pageview tracking
  templates/       # Jinja2 templates (23 pages)

data/
  drafts.db        # SQLite database
  reports/         # Generated reports + blog series
  .cache/          # Similarity matrix cache

paper/
  main.tex         # arXiv paper
```

## Database Schema

| Table | Purpose | Records |
|-------|---------|--------:|
| `drafts` | Draft metadata + full text | 475 |
| `ratings` | 5-dimension ratings + summary + false_positive flag | 475 |
| `embeddings` | Semantic vectors (nomic-embed-text, 768-dim) | 475 |
| `llm_cache` | Claude API response cache (SHA-256 dedup) | ~1,500 |
| `authors` | Person records from Datatracker | 713 |
| `draft_authors` | Author-draft relationships | ~1,400 |
| `ideas` | Extracted + deduplicated technical ideas | 501 |
| `gaps` | Gap analysis results | 12 |

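A response cache keyed by a SHA-256 digest avoids paying twice for the same prompt. The sketch below shows one way such a cache could work with an in-memory SQLite table. The `key` and `response` column names, the `cache_key` and `cached_call` helpers, and the stand-in `fake_llm` function are all hypothetical; the real `llm_cache` table may be laid out differently.

```python
import hashlib
import json
import sqlite3


def cache_key(model: str, prompt: str) -> str:
    """Hash the model name and prompt into a stable SHA-256 hex digest."""
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_call(conn, model, prompt, call_fn):
    """Return the cached response if present; otherwise call `call_fn` and store it."""
    key = cache_key(model, prompt)
    row = conn.execute(
        "SELECT response FROM llm_cache WHERE key = ?", (key,)
    ).fetchone()
    if row is not None:
        return row[0]
    response = call_fn(model, prompt)
    conn.execute(
        "INSERT INTO llm_cache (key, response) VALUES (?, ?)", (key, response)
    )
    conn.commit()
    return response


# Hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE llm_cache (key TEXT PRIMARY KEY, response TEXT NOT NULL)")

calls = []


def fake_llm(model, prompt):
    """Stand-in for a real API call; records each invocation."""
    calls.append(prompt)
    return f"response to: {prompt}"


first = cached_call(conn, "example-model", "Rate this draft", fake_llm)
second = cached_call(conn, "example-model", "Rate this draft", fake_llm)
print(first == second, len(calls))
```

Including the model name in the hashed payload keeps responses from different models from colliding under the same prompt.
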
## Cost

Full pipeline for 475 drafts: ~$9-15 USD total

- Sonnet for rating + gap analysis (~$3)
- Haiku for bulk idea extraction (~$1)
- Ollama embeddings: free (local)

## License

MIT; see [LICENSE](LICENSE)