IETF Draft Analyzer — Master Prompt
Vision
A tool to track, categorize, compare, and rate the growing flood of IETF Internet-Drafts related to AI and autonomous agents — helping an informed reader stay on top of novel ideas, spot overlaps, and identify gaps worth filling.
Problem
The IETF is seeing a surge of AI/agent-related drafts. Many overlap significantly, some introduce genuinely novel concepts, and it's hard to maintain a mental map of the landscape. Manual tracking doesn't scale.
Primary Data Sources
- Recent drafts: https://datatracker.ietf.org/doc/recent
- Keyword search (e.g. "agent"): https://datatracker.ietf.org/doc/search?name=agent&sort=&rfcs=on&activedrafts=on&by=group&group=
- Datatracker API: https://datatracker.ietf.org/api/ (machine-readable metadata)
- Additional keyword searches:
  ai, llm, autonomous, machine-learning, ml, intelligent, inference, etc.
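A minimal fetch sketch against the Datatracker's REST API. The endpoint path and tastypie-style pagination (`meta` / `objects`) match the public API, but the filter names (`name__contains`, `type`) are assumptions to verify against https://datatracker.ietf.org/api/:

```python
import requests

API = "https://datatracker.ietf.org/api/v1/doc/document/"

def fetch_drafts(keyword: str, page_size: int = 100) -> list[dict]:
    """Page through Datatracker documents whose name contains `keyword`."""
    drafts, offset = [], 0
    while True:
        resp = requests.get(API, params={
            "name__contains": keyword,  # Django/tastypie-style filter (assumption)
            "type": "draft",
            "format": "json",
            "limit": page_size,
            "offset": offset,
        }, timeout=30)
        resp.raise_for_status()
        page = resp.json()
        drafts.extend(page["objects"])
        if page["meta"].get("next") is None:  # tastypie pagination metadata
            return drafts
        offset += page_size

# Full draft text lives outside the API, e.g.:
# https://www.ietf.org/archive/id/<name>-<rev>.txt
```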
Core Features
1. Draft Ingestion & Tracking
- Fetch drafts from Datatracker (API + scraping where needed)
- Track new drafts, revisions, and status changes over time
- Store metadata: title, authors, abstract, WG, dates, status, keywords
- Download and parse full draft text for deeper analysis
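A sketch of the storage layer: one metadata table plus an FTS5 index over the searchable text. Column names are illustrative, not final:

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS drafts (
    name     TEXT PRIMARY KEY,  -- e.g. draft-foo-agent-auth
    rev      TEXT,
    title    TEXT,
    authors  TEXT,              -- JSON list
    wg       TEXT,
    status   TEXT,
    updated  TEXT,              -- ISO date of last revision
    abstract TEXT,
    fulltext TEXT
);
CREATE VIRTUAL TABLE IF NOT EXISTS drafts_fts USING fts5(
    name UNINDEXED, title, abstract, fulltext
);
"""

con = sqlite3.connect("drafts.db")
con.executescript(SCHEMA)

# Full-text query across everything we have ingested:
hits = con.execute(
    "SELECT name FROM drafts_fts WHERE drafts_fts MATCH ?",
    ("oauth AND agent",),
).fetchall()
```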
2. Categorization & Tagging
- Auto-categorize drafts into topic clusters, e.g.:
- Agent-to-agent communication protocols
- AI safety / guardrails / alignment in networking
- ML-based traffic management / optimization
- Autonomous network operations (intent-based, closed-loop)
- Identity / authentication for AI agents
- Data formats / semantics for AI interop
- Policy / governance / ethical frameworks
- Support user-defined tags and manual overrides
- Detect which IETF working groups / areas are involved
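To keep the taxonomy user-extensible, the clusters can live as data rather than code, with manual overrides always winning over the auto-assigned category. A minimal sketch (slugs and the override mechanism are placeholders):

```python
# Topic clusters as data, so users can add or rename them without code changes.
CATEGORIES = {
    "a2a-protocols":  "Agent-to-agent communication protocols",
    "ai-safety":      "AI safety / guardrails / alignment in networking",
    "ml-traffic":     "ML-based traffic management / optimization",
    "autonomous-ops": "Autonomous network operations (intent-based, closed-loop)",
    "agent-identity": "Identity / authentication for AI agents",
    "ai-interop":     "Data formats / semantics for AI interop",
    "governance":     "Policy / governance / ethical frameworks",
}

def category_for(draft_name: str, auto: str, overrides: dict[str, str]) -> str:
    """A user's manual override beats whatever the auto-categorizer said."""
    return overrides.get(draft_name, auto)
```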
3. Overlap & Novelty Detection
- Similarity analysis between drafts (abstract-level and full-text)
- Semantic similarity (embeddings-based)
- Structural similarity (do they define similar mechanisms?)
- Flag clusters of highly overlapping drafts
- Highlight novel contributions — ideas that don't appear elsewhere
- Track how ideas evolve across draft revisions
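A sketch of the overlap pass, assuming each draft already has an embedding vector and an illustrative 0.80 similarity threshold: compute pairwise cosine similarity, then merge high-similarity pairs into clusters with a union-find:

```python
import numpy as np

def overlap_clusters(names: list[str], vecs: np.ndarray, threshold: float = 0.80):
    """Group drafts whose embedding cosine similarity exceeds `threshold`."""
    unit = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    sim = unit @ unit.T  # pairwise cosine similarity matrix

    parent = list(range(len(names)))  # union-find over draft indices
    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if sim[i, j] >= threshold:
                parent[find(i)] = find(j)  # union the two drafts' clusters

    clusters: dict[int, list[str]] = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    return [c for c in clusters.values() if len(c) > 1]  # overlap = cluster size > 1
```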
4. Brief Rating / Assessment
- Per-draft rating along dimensions like:
- Novelty — How new/unique is the core idea?
- Maturity — How complete and well-specified is the draft?
- Overlap — How much does it duplicate existing work?
- Momentum — Author track record, WG adoption, revision frequency
- Relevance — How central is it to the AI/agent topic?
- Generate a short (2–4 sentence) AI summary + assessment for each draft
- Optional: composite score for quick sorting
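A hedged sketch of the rating call using the anthropic Python SDK. The model id is a placeholder to pick per cost/quality, the rubric wording is illustrative, and results should be cached so a draft is rated at most once per revision:

```python
import json
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

RUBRIC = (
    "Rate this Internet-Draft 1-5 on novelty, maturity, overlap, momentum, "
    "and relevance, then add a 2-4 sentence assessment. Reply with JSON only: "
    '{"novelty": n, "maturity": n, "overlap": n, "momentum": n, '
    '"relevance": n, "assessment": "..."}'
)

def rate_draft(title: str, abstract: str) -> dict:
    msg = client.messages.create(
        model="claude-sonnet-4-20250514",  # placeholder model id
        max_tokens=500,
        messages=[{"role": "user",
                   "content": f"{RUBRIC}\n\nTitle: {title}\n\nAbstract: {abstract}"}],
    )
    # Validate keys before storing; models occasionally wrap JSON in prose.
    return json.loads(msg.content[0].text)
```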
5. Overview & Visualization
- Dashboard view — sortable/filterable table of tracked drafts
- Landscape map — visual clustering of drafts by topic similarity
- Timeline — when drafts appeared, how the space is evolving
- Diff view — compare two drafts side-by-side (key claims, mechanisms)
- Overlap matrix — heatmap showing which drafts cover similar ground
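Since v1 output is markdown, the dashboard can start life as a generated table sorted by whatever dimension matters most. A sketch, assuming rated rows with a composite score:

```python
def overview_table(rows: list[dict]) -> str:
    """Render rated drafts as a markdown table, highest composite score first."""
    lines = ["| Draft | Category | Novelty | Overlap | Score |",
             "|---|---|---|---|---|"]
    for r in sorted(rows, key=lambda r: r["score"], reverse=True):
        lines.append(f"| {r['name']} | {r['category']} | {r['novelty']} "
                     f"| {r['overlap']} | {r['score']:.2f} |")
    return "\n".join(lines)
```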
6. Staying Informed
- Watch list — mark drafts of special interest
- Change alerts — notify on new drafts matching keywords, new revisions, status changes
- Periodic digest — weekly/on-demand summary of what's new in the space
- Gap analysis — suggest areas not yet covered by any draft
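The digest can fall out of the same database: record when the last digest ran, then report everything newer. A sketch against the illustrative `updated` column from the schema above:

```python
import sqlite3

def weekly_digest(con: sqlite3.Connection, since_iso: str) -> list[tuple]:
    """Drafts added or revised since the last digest run (ISO-8601 comparison)."""
    return con.execute(
        "SELECT name, rev, title, updated FROM drafts "
        "WHERE updated > ? ORDER BY updated DESC",
        (since_iso,),
    ).fetchall()
```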
Stretch / Nice-to-Have Features
- Idea extraction — pull out discrete technical ideas/mechanisms from each draft, track them independently
- Cross-reference with RFCs — link drafts to existing standards they build on or conflict with
- Author network — who collaborates with whom, which orgs are active
- Meeting tracking — link to relevant IETF meeting agenda items, minutes, slides
- Export — generate reports (markdown, PDF) for sharing or personal reference
- Personal notes — attach annotations and thoughts to any draft
Design Principles
- Local-first — runs on the user's machine, data stored locally
- CLI + optional web UI — start simple (CLI/TUI), add a local web dashboard later
- LLM-assisted but transparent — use AI for summarization/rating but always show reasoning
- Incremental — can start with a small set of drafts and scale up
- Open data — all source data comes from public IETF resources
LLM Strategy
Two viable options — use the best tool for each job:
Option A: Claude API (via subscription)
- Pro: Superior reasoning for summarization, rating, novelty detection, and comparative analysis of dense technical text. IETF drafts are complex — quality matters here.
- Pro: Better at structured output (JSON ratings, consistent categorization)
- Con: API costs / rate limits (though subscription helps)
- Best for: Summarization, rating, categorization, overlap analysis, gap detection
Option B: Local Ollama
- Pro: Free, private, no rate limits, works offline
- Pro: Good enough for embeddings (e.g. nomic-embed-text, mxbai-embed-large)
- Con: Smaller models struggle with nuanced technical assessment of IETF-grade content
- Best for: Embeddings for similarity/clustering, bulk preprocessing, quick triage
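A sketch of the embedding call against a local Ollama instance, assuming the `/api/embeddings` endpoint on the default port and that the model has been pulled:

```python
import requests

def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    """Get an embedding vector from a local Ollama server."""
    resp = requests.post(
        "http://localhost:11434/api/embeddings",
        json={"model": model, "prompt": text},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["embedding"]
```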
Recommended: Hybrid Approach
| Task | Model |
|---|---|
| Embeddings (similarity, clustering) | Ollama local (nomic-embed-text or similar) |
| Quick triage (is this draft AI-related?) | Ollama local (fast, cheap) |
| Summarization & rating | Claude (better quality on technical text) |
| Overlap/novelty analysis | Claude (needs strong reasoning) |
| Gap analysis & insights | Claude (creative + analytical) |
This keeps costs down (embeddings are the bulk operation) while using Claude where quality actually matters. The tool should support both backends with a simple config switch, so you can fall back to all-local or all-Claude as needed.
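The config switch can be a single per-task routing table consulted at dispatch time; a minimal sketch with invented setting names, where flipping every entry to "ollama" gives the fully local fallback:

```python
from dataclasses import dataclass, field

@dataclass
class LLMConfig:
    # Per-task backend routing: bulk work stays local, analysis goes to Claude.
    backends: dict[str, str] = field(default_factory=lambda: {
        "embed":   "ollama",
        "triage":  "ollama",
        "rate":    "claude",
        "overlap": "claude",
        "gaps":    "claude",
    })

    def backend_for(self, task: str) -> str:
        return self.backends[task]
```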
Decisions Made
- Tech stack: Python
- Storage: SQLite (single DB file, FTS5 for full-text search)
- Scope: AI/agent-focused first, generalizable later
- Interface: CLI + Markdown report output (v1)
- LLM: Hybrid — Ollama for embeddings/triage, Claude for analysis/rating