ietf-draft-analyzer/master-prompt.md
Chris Nennemann 6771a4c235 IETF Draft Analyzer v0.1.0 — track, categorize, and rate AI/agent drafts
Python CLI tool that fetches AI/agent-related Internet-Drafts from the IETF
Datatracker, rates them using Claude, generates embeddings via Ollama for
similarity/clustering, and produces markdown reports.

Features:
- Fetch drafts by keyword from Datatracker API with full text download
- Batch analysis with Claude (token-optimized, responses cached in SQLite)
- Embedding-based similarity search and overlap cluster detection
- Reports: overview, landscape by category, overlap clusters, weekly digest
- SQLite with FTS5 for full-text search across 260 tracked drafts

Initial analysis of 260 drafts reveals OAuth agent auth (13 drafts) and
agent gateway/collaboration (10 drafts) as the most crowded clusters,
while AI safety/alignment is underserved with the highest quality scores.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 00:36:45 +01:00

IETF Draft Analyzer — Master Prompt

Vision

A tool to track, categorize, compare, and rate the growing flood of IETF Internet-Drafts related to AI and autonomous agents — helping an informed reader stay on top of novel ideas, spot overlaps, and identify gaps worth filling.

Problem

The IETF is seeing a surge of AI/agent-related drafts. Many overlap significantly, some introduce genuinely novel concepts, and it's hard to maintain a mental map of the landscape. Manual tracking doesn't scale.

Primary Data Sources

  • IETF Datatracker (draft metadata via the public API, plus full draft text downloads)

Core Features

1. Draft Ingestion & Tracking

  • Fetch drafts from Datatracker (API + scraping where needed)
  • Track new drafts, revisions, and status changes over time
  • Store metadata: title, authors, abstract, WG, dates, status, keywords
  • Download and parse full draft text for deeper analysis
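A minimal sketch of the ingestion query, assuming the Datatracker's tastypie-style JSON API: the endpoint path and the filter names (`type`, `title__icontains`, `limit`, `offset`) follow that convention but should be verified against the live API before relying on them.

```python
from urllib.parse import urlencode

DATATRACKER_API = "https://datatracker.ietf.org/api/v1/doc/document/"

def search_url(keyword: str, limit: int = 50, offset: int = 0) -> str:
    """Build a Datatracker search URL for Internet-Drafts whose
    title contains `keyword`. Paging uses limit/offset."""
    params = {
        "type": "draft",                # restrict to Internet-Drafts
        "title__icontains": keyword,    # case-insensitive title match
        "format": "json",
        "limit": limit,
        "offset": offset,
    }
    return DATATRACKER_API + "?" + urlencode(params)
```

Keeping URL construction separate from the HTTP call makes it easy to unit-test paging and filtering without touching the network.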

2. Categorization & Tagging

  • Auto-categorize drafts into topic clusters, e.g.:
    • Agent-to-agent communication protocols
    • AI safety / guardrails / alignment in networking
    • ML-based traffic management / optimization
    • Autonomous network operations (intent-based, closed-loop)
    • Identity / authentication for AI agents
    • Data formats / semantics for AI interop
    • Policy / governance / ethical frameworks
  • Support user-defined tags and manual overrides
  • Detect which IETF working groups / areas are involved
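Before any LLM-based categorization, a cheap keyword pre-filter can cut the candidate set. A sketch, with an illustrative (not exhaustive) keyword list; word-boundary matching avoids false hits like "ai" inside "maintenance":

```python
import re

# Hypothetical starter list; extend from observed draft titles.
AI_KEYWORDS = {"agent", "agents", "ai", "llm", "machine learning",
               "autonomous", "inference"}

def looks_ai_related(title: str, abstract: str) -> bool:
    """Cheap triage: True if any keyword appears as a whole word."""
    text = f"{title} {abstract}".lower()
    return any(re.search(rf"\b{re.escape(kw)}\b", text) for kw in AI_KEYWORDS)
```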

3. Overlap & Novelty Detection

  • Similarity analysis between drafts (abstract-level and full-text)
    • Semantic similarity (embeddings-based)
    • Structural similarity (do they define similar mechanisms?)
  • Flag clusters of highly overlapping drafts
  • Highlight novel contributions — ideas that don't appear elsewhere
  • Track how ideas evolve across draft revisions
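The embeddings-based similarity check can be sketched with plain cosine similarity over whatever vectors the embedding model returns; the 0.85 threshold is an illustrative default to tune against real draft pairs.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def overlap_pairs(embeddings: dict, threshold: float = 0.85):
    """Return (draft_a, draft_b, similarity) for every pair of drafts
    whose embedding similarity meets the threshold."""
    names = sorted(embeddings)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            sim = cosine(embeddings[a], embeddings[b])
            if sim >= threshold:
                pairs.append((a, b, sim))
    return pairs
```

Flagged pairs can then be chained into clusters (any connected component of the pair graph) to surface crowded areas.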

4. Brief Rating / Assessment

  • Per-draft rating along dimensions like:
    • Novelty — How new/unique is the core idea?
    • Maturity — How complete and well-specified is the draft?
    • Overlap — How much does it duplicate existing work?
    • Momentum — Author track record, WG adoption, revision frequency
    • Relevance — How central is it to the AI/agent topic?
  • Generate a short (2–4 sentence) AI summary + assessment for each draft
  • Optional: composite score for quick sorting
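The optional composite could be a weighted sum of the five dimensions. A sketch assuming 1–10 scores and illustrative weights, with overlap inverted so heavy duplication lowers the score:

```python
# Illustrative weights; must sum to 1.0.
WEIGHTS = {"novelty": 0.3, "maturity": 0.2, "overlap": 0.2,
           "momentum": 0.15, "relevance": 0.15}

def composite_score(ratings: dict) -> float:
    """Weighted composite on a 1-10 scale. Overlap counts inversely:
    a high overlap rating (heavy duplication) drags the score down."""
    adjusted = dict(ratings)
    adjusted["overlap"] = 11 - ratings["overlap"]  # 10 -> 1, 1 -> 10
    return round(sum(WEIGHTS[k] * adjusted[k] for k in WEIGHTS), 2)
```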

5. Overview & Visualization

  • Dashboard view — sortable/filterable table of tracked drafts
  • Landscape map — visual clustering of drafts by topic similarity
  • Timeline — when drafts appeared, how the space is evolving
  • Diff view — compare two drafts side-by-side (key claims, mechanisms)
  • Overlap matrix — heatmap showing which drafts cover similar ground

6. Staying Informed

  • Watch list — mark drafts of special interest
  • Change alerts — notify on new drafts matching keywords, new revisions, status changes
  • Periodic digest — weekly/on-demand summary of what's new in the space
  • Gap analysis — suggest areas not yet covered by any draft
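Change alerts reduce to diffing two snapshots of the tracked set. A minimal sketch, assuming each snapshot maps draft name to its current revision string:

```python
def detect_changes(previous: dict, current: dict) -> dict:
    """Compare two {draft_name: revision} snapshots and report
    newly appeared drafts and drafts whose revision changed."""
    new = sorted(set(current) - set(previous))
    updated = sorted(n for n in current
                     if n in previous and current[n] != previous[n])
    return {"new": new, "updated": updated}
```

The same diff output can feed both the alerting path and the weekly digest.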

Stretch / Nice-to-Have Features

  • Idea extraction — pull out discrete technical ideas/mechanisms from each draft, track them independently
  • Cross-reference with RFCs — link drafts to existing standards they build on or conflict with
  • Author network — who collaborates with whom, which orgs are active
  • Meeting tracking — link to relevant IETF meeting agenda items, minutes, slides
  • Export — generate reports (markdown, PDF) for sharing or personal reference
  • Personal notes — attach annotations and thoughts to any draft

Design Principles

  • Local-first — runs on the user's machine, data stored locally
  • CLI + optional web UI — start simple (CLI/TUI), add a local web dashboard later
  • LLM-assisted but transparent — use AI for summarization/rating but always show reasoning
  • Incremental — can start with a small set of drafts and scale up
  • Open data — all source data comes from public IETF resources

LLM Strategy

Two viable options — use the best tool for each job:

Option A: Claude API (via subscription)

  • Pro: Superior reasoning for summarization, rating, novelty detection, and comparative analysis of dense technical text. IETF drafts are complex — quality matters here.
  • Pro: Better at structured output (JSON ratings, consistent categorization)
  • Con: API costs / rate limits (though subscription helps)
  • Best for: Summarization, rating, categorization, overlap analysis, gap detection

Option B: Local Ollama

  • Pro: Free, private, no rate limits, works offline
  • Pro: Good enough for embeddings (e.g. nomic-embed-text, mxbai-embed-large)
  • Con: Smaller models struggle with nuanced technical assessment of IETF-grade content
  • Best for: Embeddings for similarity/clustering, bulk preprocessing, quick triage

| Task | Model |
| --- | --- |
| Embeddings (similarity, clustering) | Ollama local (nomic-embed-text or similar) |
| Quick triage (is this draft AI-related?) | Ollama local (fast, cheap) |
| Summarization & rating | Claude (better quality on technical text) |
| Overlap/novelty analysis | Claude (needs strong reasoning) |
| Gap analysis & insights | Claude (creative + analytical) |

This keeps costs down (embeddings are the bulk operation) while using Claude where quality actually matters. The tool should support both backends with a simple config switch, so you can fall back to all-local or all-Claude as needed.
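The config switch could be a single mode flag that either honors the per-task routing above or forces one backend everywhere. A sketch with hypothetical task names:

```python
from dataclasses import dataclass

# Default per-task routing in hybrid mode (task names are illustrative).
TASK_BACKENDS = {
    "embed": "ollama", "triage": "ollama",
    "summarize": "claude", "rate": "claude",
    "overlap": "claude", "gaps": "claude",
}

@dataclass
class LLMConfig:
    mode: str = "hybrid"  # "hybrid", "local", or "claude"

    def backend_for(self, task: str) -> str:
        """Pick the backend for a task, honoring all-local/all-Claude overrides."""
        if self.mode == "hybrid":
            return TASK_BACKENDS[task]
        return "ollama" if self.mode == "local" else "claude"
```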

Decisions Made

  • Tech stack: Python
  • Storage: SQLite (single DB file, FTS5 for full-text search)
  • Scope: AI/agent-focused first, generalizable later
  • Interface: CLI + Markdown report output (v1)
  • LLM: Hybrid — Ollama for embeddings/triage, Claude for analysis/rating
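
The FTS5 decision can be exercised in a few lines; FTS5 ships with the SQLite bundled in standard CPython builds, though custom builds may omit it. A sketch with a hypothetical schema and sample rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Contentless-by-default FTS5 virtual table over the searchable fields.
con.execute("CREATE VIRTUAL TABLE drafts USING fts5(name, title, abstract)")
con.executemany(
    "INSERT INTO drafts VALUES (?, ?, ?)",
    [
        ("draft-x-agent-auth-00", "OAuth for Agents",
         "Delegated authorization for AI agents."),
        ("draft-y-tcp-tuning-02", "TCP Tuning",
         "Congestion control parameters."),
    ],
)
# MATCH runs a full-text query across all indexed columns, ranked by relevance.
hits = con.execute(
    "SELECT name FROM drafts WHERE drafts MATCH ? ORDER BY rank", ("agents",)
).fetchall()
```

Note that FTS5's default tokenizer does no stemming, so "agents" will not match "agent"; a stemming tokenizer or query expansion would be needed for that.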