Christian Nennemann 9a0dc899a8 feat: add ACT+ECT over MCP demo with LangGraph agent

End-to-end PoC demonstrating Agent Context Token authorization and
Execution Context Token accountability over MCP tool calls, using a
LangGraph agent with ES256-signed JWT tokens and DAG verification.

2026-04-12 12:43:22 +00:00

.github/workflows

Complete remaining medium/low issues: performance, CLI, types, CI, tests

2026-03-08 14:06:54 +01:00

data

feat: add draft data, gap analysis report, and workspace config

2026-04-06 18:47:15 +02:00

demo/act-ect-mcp

feat: add ACT+ECT over MCP demo with LangGraph agent

2026-04-12 12:43:22 +00:00

docs

fix: refimpl hash format aligned to -01 spec, draft rebuilt

2026-04-11 17:51:50 +02:00

paper

feat: add IETF landscape paper source (LaTeX + BibTeX + Makefile)

2026-04-12 12:43:15 +00:00

scripts

feat: refresh pipeline, reorganize draft generation, polish public pages

2026-03-09 05:39:13 +01:00

src

feat: refresh pipeline, reorganize draft generation, polish public pages

2026-03-09 05:39:13 +01:00

tests

Add test coverage for CLI commands, Flask routes, and shared DB methods

2026-03-09 03:49:09 +01:00

workspace

feat: unified drafts/ structure with PDF outputs for ACT and ECT

2026-04-12 14:01:57 +02:00

.dockerignore

Platform upgrade: semantic search, citations, readiness, tests, Docker

2026-03-07 20:52:56 +01:00

.gitignore

feat: add IETF landscape paper source (LaTeX + BibTeX + Makefile)

2026-04-12 12:43:15 +00:00

CLAUDE.md

Enforce public/private visibility for web UI pages

2026-03-08 20:52:43 +01:00

CONTRIBUTING.md

v0.3.0: Publication-ready release with blog site, paper update, and polish

2026-03-08 17:54:43 +01:00

docker-compose.yml

Platform upgrade: semantic search, citations, readiness, tests, Docker

2026-03-07 20:52:56 +01:00

Dockerfile

Platform upgrade: semantic search, citations, readiness, tests, Docker

2026-03-07 20:52:56 +01:00

FEATURE_BACKLOG.md

fix: dev mode auth regression from blueprint refactor

2026-03-09 03:52:02 +01:00

LICENSE

Fix remaining critical, high, and medium issues from 4-perspective review

2026-03-08 12:47:47 +01:00

master-prompt.md

IETF Draft Analyzer v0.1.0 — track, categorize, and rate AI/agent drafts

2026-02-28 00:36:45 +01:00

pyproject.toml

v0.3.0: Publication-ready release with blog site, paper update, and polish

2026-03-08 17:54:43 +01:00

README.md

v0.3.0: Publication-ready release with blog site, paper update, and polish

2026-03-08 17:54:43 +01:00

README.md

IETF Draft Analyzer

Track, categorize, rate, and map AI/agent-related IETF Internet-Drafts.

475 drafts analyzed (361 relevant after false-positive filtering) with 713 authors, 501 extracted ideas, 132 cross-org convergent ideas, and 12 identified gaps — spanning 2024 to March 2026.

What This Does

The IETF is experiencing an unprecedented wave of standardization activity around AI agents. This tool provides a quantitative lens on that activity:

Fetches draft metadata and full text from the IETF Datatracker API
Rates each draft on 5 dimensions (novelty, maturity, overlap, momentum, relevance) using Claude
Embeds drafts with Ollama for pairwise similarity and clustering
Extracts discrete technical ideas and identifies landscape gaps
Analyzes cross-organizational convergence (SequenceMatcher at 0.75 threshold)
Maps the author collaboration network and organizational affiliations
Generates markdown reports and a full web dashboard
Filters false positives automatically (relevance-based + manual flagging)
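The convergence step above names `SequenceMatcher` with a 0.75 threshold. As a rough illustration only, the sketch below compares idea titles from different organizations using Python's standard `difflib.SequenceMatcher`. The function names, the choice to compare lowercased titles, and the sample data are assumptions made for this example, not the project's actual code.

```python
from difflib import SequenceMatcher
from itertools import combinations

THRESHOLD = 0.75  # threshold quoted in the feature list above


def title_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two idea titles (case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def convergent_pairs(ideas, threshold=THRESHOLD):
    """Return (title_a, title_b, score) for similar ideas from different orgs.

    `ideas` is a list of (organization, title) tuples; this shape is hypothetical.
    """
    pairs = []
    for (org_a, title_a), (org_b, title_b) in combinations(ideas, 2):
        if org_a == org_b:
            continue  # convergence only counts across organizations
        score = title_similarity(title_a, title_b)
        if score >= threshold:
            pairs.append((title_a, title_b, round(score, 2)))
    return pairs


# Hypothetical sample data
ideas = [
    ("OrgA", "Agent identity attestation"),
    ("OrgB", "agent identity attestation"),
    ("OrgC", "Congestion-aware model routing"),
]
pairs = convergent_pairs(ideas)
print(pairs)
```

A string-ratio comparison like this is cheap and deterministic, but it only catches ideas that are worded similarly; ideas that are conceptually close yet phrased differently would need an embedding-based comparison.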

Quick Start

# Install
pip install -e .

# Set your API key (or add to .env file)
export ANTHROPIC_API_KEY=sk-ant-...

# Fetch drafts from IETF Datatracker
ietf fetch

# Rate all unrated drafts (--cheap uses Haiku for lower cost)
ietf analyze --all
ietf analyze --all --cheap    # ~10x cheaper with Haiku

# Generate embeddings (requires Ollama running locally)
ietf embed

# Extract technical ideas
ietf ideas --all --cheap --batch 5

# Analyze cross-org convergence
ietf ideas convergence

# Identify gaps in the landscape
ietf gaps

# Fetch author data
ietf authors --fetch

# Generate reports
ietf report overview
ietf report landscape
ietf report authors

# Launch the web dashboard
./scripts/run-webui.sh

Web Dashboard

A full interactive dashboard is served at http://127.0.0.1:5000:

./scripts/run-webui.sh
# or: FLASK_APP=src/webui/app.py flask run

| Page | What it shows |
| --- | --- |
| Overview | Stat cards, score histogram, category radar, submission timeline |
| Draft Explorer | Searchable/filterable/sortable table with category pills and score badges |
| Draft Detail | Score ring, dimension bars, ideas, references, linked authors |
| Ratings | Score distributions, box plots, category radar, novelty vs maturity scatter |
| Landscape | t-SNE embedding map, quality quadrants |
| Authors | Co-authorship force-directed graph, organization charts |
| Ideas | Extracted ideas grouped by type with search |
| Gaps | Gap cards sorted by severity with related drafts |
| Citations | RFC cross-reference graph |
| Similarity | Draft similarity network |
| Timeline | Submission trends over time |
| Monitor | Pipeline health, API costs, processing status |

Charts are interactive (Plotly.js). Analytics are GDPR-compliant: no cookies are set, and visitor IP addresses are hashed with a salt that changes daily.
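To make the daily-salted hashing idea concrete, here is a minimal sketch using only the standard library. It assumes a random salt kept in memory and replaced when the date changes, combined with the IP address via HMAC-SHA256. The names `daily_salt` and `visitor_id` are hypothetical and the real `analytics.py` may work differently.

```python
import hashlib
import hmac
import secrets
from datetime import date

# Holds at most one entry: the salt for the most recently seen day.
_salts: dict[date, bytes] = {}


def daily_salt(day: date) -> bytes:
    """Return the random salt for `day`, discarding any salt from another day."""
    if day not in _salts:
        _salts.clear()  # old salts are dropped so old hashes cannot be recomputed
        _salts[day] = secrets.token_bytes(32)
    return _salts[day]


def visitor_id(ip: str, day: date) -> str:
    """Pseudonymous visitor identifier: HMAC-SHA256 of the IP under the day's salt."""
    return hmac.new(daily_salt(day), ip.encode("utf-8"), hashlib.sha256).hexdigest()


# Example with a documentation-range IP address
day_one = date(2026, 3, 8)
day_two = date(2026, 3, 9)
first = visitor_id("203.0.113.7", day_one)
again = visitor_id("203.0.113.7", day_one)
next_day = visitor_id("203.0.113.7", day_two)
print(first == again, first == next_day)
```

Within one day the same address maps to the same identifier, so unique visitors can be counted. Once the salt rotates, identifiers from different days can no longer be linked to each other or back to the address.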

Blog Series

An 8-post analysis series in data/reports/blog-series/:

The Gold Rush — Growth from 9 drafts to 9.3% of all IETF submissions
Who Writes the Rules — Huawei's 16%, geopolitical dynamics, team blocs
The OAuth Wars — 14 competing OAuth proposals, fragmentation costs
What Nobody Builds — The safety deficit (4:1 ratio), 12 identified gaps
Where Drafts Converge — 132 cross-org convergent ideas, implicit consensus
The Big Picture — Architectural vision, EU AI Act implications
How We Built This — Methodology, cost ($9-15), limitations
Agents Building the Agent Analysis — Meta post on using Claude agent teams

Key Findings

Safety deficit: ~4:1 ratio of capability-building to safety proposals (varies 1.5:1 to 21:1 monthly)
Extreme fragmentation: 155 competing A2A protocols, 42 overlap clusters
Organizational concentration: Huawei ~16% of all drafts, Chinese orgs ~40%
Cross-org convergence: 132 ideas (33%) independently proposed by multiple organizations
12 gaps identified: 2 critical (behavior verification, human override), 5 high, 5 medium
Top-rated drafts: Safety-focused proposals score highest (VOLT 4.75, DAAP 4.75)

CLI Commands

Core Pipeline

| Command | Description |
| --- | --- |
| `ietf fetch` | Fetch AI/agent drafts from IETF Datatracker |
| `ietf analyze --all [--cheap] [--dry-run]` | Rate drafts using Claude |
| `ietf embed [--dry-run]` | Generate semantic embeddings via Ollama |
| `ietf ideas --all [--cheap] [--batch N] [--dry-run]` | Extract technical ideas |
| `ietf ideas convergence [--threshold 0.75]` | Cross-org convergence analysis |
| `ietf ideas dedup` | Deduplicate similar ideas |
| `ietf gaps [--dry-run]` | Identify landscape gaps |
| `ietf authors --fetch` | Fetch author/affiliation data |

Exploration

| Command | Description |
| --- | --- |
| `ietf list` | List tracked drafts |
| `ietf show <name>` | Show detailed info for a draft |
| `ietf search <query>` | Full-text search (FTS5) |
| `ietf similar <name>` | Find similar drafts by embedding similarity |
| `ietf clusters` | Find clusters of near-duplicate drafts |
| `ietf compare <name1> <name2>` | Compare drafts for overlap |
| `ietf authors` | Top authors and draft counts |
| `ietf network` | Organizational collaboration network |
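For readers unfamiliar with embedding similarity, the lookup behind a command like `ietf similar` can be sketched as a nearest-neighbour search by cosine similarity. This is an assumption about the metric, since the README does not say which one is used. The draft names and three-dimensional vectors below are toy values, and the code uses plain Python rather than NumPy to stay self-contained.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(name, vectors, top_n=3):
    """Rank all other drafts by similarity to `name`, most similar first."""
    query = vectors[name]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != name
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Toy vectors; real embeddings would have many more dimensions
vectors = {
    "draft-a": [1.0, 0.0, 0.0],
    "draft-b": [0.9, 0.1, 0.0],
    "draft-c": [0.0, 1.0, 0.0],
}
neighbours = most_similar("draft-a", vectors)
print(neighbours)
```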

Reports (`ietf report`)

| Command | Description |
| --- | --- |
| `ietf report overview` | Sortable table of all rated drafts |
| `ietf report landscape` | Category-grouped view with rankings |
| `ietf report authors` | Top authors, organizations, collaboration |
| `ietf report ideas` | Ideas by type, most common, unique |
| `ietf report gaps` | Gap analysis with severity ratings |
| `ietf report timeline` | Monthly submission trends |
| `ietf report overlap-matrix` | Similar pairs and cross-category matrix |

Rating System

Each draft is scored 1-5 on five dimensions by Claude (LLM-as-judge, see methodology for caveats):

| Dimension | What it measures |
| --- | --- |
| Novelty | Originality relative to existing standards |
| Maturity | Completeness of specification |
| Overlap | Redundancy with other drafts (5 = heavily overlapping) |
| Momentum | Community engagement, revisions, adoption |
| Relevance | Importance to the AI/agent ecosystem |

Important: Ratings are generated from abstracts and partial full text without human calibration. They should be treated as relative rankings, not absolute quality measures.
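One way to act on the "relative rankings" advice is to compare drafts by their order on a dimension rather than by the raw number. The sketch below is illustrative only: the `Rating` fields mirror the five dimensions listed above, but the class layout and the `rank_by` helper are hypothetical and are not taken from `models.py`.

```python
from dataclasses import dataclass

DIMENSIONS = ("novelty", "maturity", "overlap", "momentum", "relevance")


@dataclass(frozen=True)
class Rating:
    """Hypothetical container for one draft's five 1-5 scores."""
    draft: str
    novelty: int
    maturity: int
    overlap: int
    momentum: int
    relevance: int

    def __post_init__(self):
        # Reject scores outside the documented 1-5 range.
        for dim in DIMENSIONS:
            value = getattr(self, dim)
            if not 1 <= value <= 5:
                raise ValueError(f"{dim} must be between 1 and 5, got {value}")


def rank_by(ratings, dimension):
    """Return draft names ordered from highest to lowest score on one dimension.

    Ties are broken alphabetically so the ordering is deterministic.
    """
    if dimension not in DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    ordered = sorted(ratings, key=lambda r: (-getattr(r, dimension), r.draft))
    return [r.draft for r in ordered]


# Hypothetical scores
ratings = [
    Rating("draft-a", novelty=3, maturity=4, overlap=2, momentum=3, relevance=5),
    Rating("draft-b", novelty=5, maturity=2, overlap=4, momentum=2, relevance=4),
    Rating("draft-c", novelty=3, maturity=3, overlap=1, momentum=4, relevance=3),
]
print(rank_by(ratings, "novelty"))
```

Note that for Overlap a high score means more redundancy, so "highest first" on that dimension lists the most overlapping drafts first rather than the best ones.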

Tech Stack

Python 3.11+ with Click CLI
SQLite with FTS5 full-text search and WAL mode
Anthropic Claude (Sonnet/Haiku) for analysis, rating, idea extraction, gap analysis
Ollama (nomic-embed-text) for local embeddings and similarity
Flask with Jinja2 for the web dashboard
Plotly for interactive visualizations
NumPy/SciPy/scikit-learn for similarity computation and clustering
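As a quick illustration of the SQLite pieces of the stack, the sketch below opens a database in WAL mode and runs a full-text query against an FTS5 virtual table using Python's built-in `sqlite3` module. The table name `drafts_fts`, its columns, and the sample rows are invented for the example and do not reflect the project's real schema. It also assumes the local SQLite build includes FTS5, which is typical but not guaranteed.

```python
import os
import sqlite3
import tempfile


def open_db(path):
    """Open (or create) a database file in WAL mode with a small FTS5 index."""
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while a writer is active; needs a file-backed DB.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS drafts_fts USING fts5(name, abstract)"
    )
    return conn


def search(conn, query):
    """Return draft names matching an FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT name FROM drafts_fts WHERE drafts_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]


# Demo on a throwaway database file with hypothetical rows
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = open_db(db_path)
conn.executemany(
    "INSERT INTO drafts_fts (name, abstract) VALUES (?, ?)",
    [
        ("draft-example-agent-auth", "Delegated authorization for AI agents"),
        ("draft-example-agent-discovery", "Discovery of agent endpoints"),
    ],
)
conn.commit()
hits = search(conn, "authorization")
print(hits)
```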

Project Structure

src/ietf_analyzer/
    cli.py          # Click CLI entry point (~30 commands)
    fetcher.py      # IETF Datatracker API client
    analyzer.py     # Claude-based analysis (rating, ideas, gaps)
    embeddings.py   # Ollama embeddings + similarity + clustering
    db.py           # SQLite with FTS5 (8 tables)
    models.py       # Author, Draft, Rating dataclasses
    reports.py      # Markdown report generation
    authors.py      # Author network analysis
    search.py       # Hybrid FTS5 + embedding search
    classifier.py   # Two-stage Ollama classifier
    readiness.py    # Draft readiness scoring
    config.py       # Configuration

src/webui/
    app.py          # Flask application (20 API endpoints)
    data.py         # Data access layer with TypedDicts
    auth.py         # Admin authentication
    analytics.py    # GDPR-compliant pageview tracking
    templates/      # Jinja2 templates (23 pages)

data/
    drafts.db       # SQLite database
    reports/        # Generated reports + blog series
    .cache/         # Similarity matrix cache

paper/
    main.tex        # arXiv paper

Database Schema

| Table | Purpose | Records |
| --- | --- | --- |
| `drafts` | Draft metadata + full text | 475 |
| `ratings` | 5-dimension ratings + summary + false_positive flag | 475 |
| `embeddings` | Semantic vectors (nomic-embed-text, 768-dim) | 475 |
| `llm_cache` | Claude API response cache (SHA-256 dedup) | ~1,500 |
| `authors` | Person records from Datatracker | 713 |
| `draft_authors` | Author-draft relationships | ~1,400 |
| `ideas` | Extracted + deduplicated technical ideas | 501 |
| `gaps` | Gap analysis results | 12 |
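The `llm_cache` row mentions SHA-256 deduplication. A plausible reading is that each request is hashed and the response is stored under that hash, so an identical request is never sent twice. The sketch below shows that pattern with an in-memory dictionary; the key layout (model plus prompt), the `LLMCache` class, and the fake model call are all assumptions for illustration.

```python
import hashlib
import json


def cache_key(model: str, prompt: str) -> str:
    """Stable SHA-256 key derived from the request's model name and prompt."""
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class LLMCache:
    """In-memory stand-in for a response cache keyed by request hash."""

    def __init__(self):
        self._store = {}

    def get_or_call(self, model, prompt, call):
        """Return the cached response, invoking `call` only on a cache miss."""
        key = cache_key(model, prompt)
        if key not in self._store:
            self._store[key] = call(model, prompt)
        return self._store[key]


# Fake model call that counts how often it is invoked (no network access)
calls = []


def fake_llm(model, prompt):
    calls.append((model, prompt))
    return f"response to: {prompt}"


cache = LLMCache()
first = cache.get_or_call("model-a", "Rate this draft", fake_llm)
second = cache.get_or_call("model-a", "Rate this draft", fake_llm)
third = cache.get_or_call("model-b", "Rate this draft", fake_llm)
print(len(calls))
```

Including the model name in the key keeps responses from different models apart, which matters when the same prompt is run once with a cheaper model and once with a stronger one.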

Cost

Full pipeline for 475 drafts: ~$9-15 USD total

Sonnet for rating + gap analysis (~$3)
Haiku for bulk idea extraction (~$1)
Ollama embeddings: free (local)

License

MIT — see LICENSE
