ietf-draft-analyzer/master-prompt.md
Chris Nennemann 6771a4c235 IETF Draft Analyzer v0.1.0 — track, categorize, and rate AI/agent drafts
Python CLI tool that fetches AI/agent-related Internet-Drafts from the IETF
Datatracker, rates them using Claude, generates embeddings via Ollama for
similarity/clustering, and produces markdown reports.

Features:
- Fetch drafts by keyword from Datatracker API with full text download
- Batch analysis with Claude (token-optimized, responses cached in SQLite)
- Embedding-based similarity search and overlap cluster detection
- Reports: overview, landscape by category, overlap clusters, weekly digest
- SQLite with FTS5 for full-text search across 260 tracked drafts

Initial analysis of 260 drafts reveals OAuth agent auth (13 drafts) and
agent gateway/collaboration (10 drafts) as the most crowded clusters,
while AI safety/alignment is underserved with the highest quality scores.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 00:36:45 +01:00

IETF Draft Analyzer — Master Prompt

Vision

A tool to track, categorize, compare, and rate the growing flood of IETF Internet-Drafts related to AI and autonomous agents — helping an informed reader stay on top of novel ideas, spot overlaps, and identify gaps worth filling.

Problem

The IETF is seeing a surge of AI/agent-related drafts. Many overlap significantly, some introduce genuinely novel concepts, and it's hard to maintain a mental map of the landscape. Manual tracking doesn't scale.

Primary Data Sources

  • IETF Datatracker (draft metadata via the public API, plus full draft text downloads)

Core Features

1. Draft Ingestion & Tracking

  • Fetch drafts from Datatracker (API + scraping where needed)
  • Track new drafts, revisions, and status changes over time
  • Store metadata: title, authors, abstract, WG, dates, status, keywords
  • Download and parse full draft text for deeper analysis
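A minimal sketch of the ingestion query, assuming the Datatracker's tastypie-style JSON API: the endpoint path and the filter names (`type`, `title__icontains`, `limit`, `offset`) follow that convention but should be verified against the live API before relying on them.

```python
from urllib.parse import urlencode

DATATRACKER_API = "https://datatracker.ietf.org/api/v1/doc/document/"

def search_url(keyword: str, limit: int = 50, offset: int = 0) -> str:
    """Build a Datatracker search URL for Internet-Drafts whose
    title contains `keyword`. Paging uses limit/offset."""
    params = {
        "type": "draft",                # restrict to Internet-Drafts
        "title__icontains": keyword,    # case-insensitive title match
        "format": "json",
        "limit": limit,
        "offset": offset,
    }
    return DATATRACKER_API + "?" + urlencode(params)
```

Keeping URL construction separate from the HTTP call makes it easy to unit-test paging and filtering without touching the network.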

2. Categorization & Tagging

  • Auto-categorize drafts into topic clusters, e.g.:
    • Agent-to-agent communication protocols
    • AI safety / guardrails / alignment in networking
    • ML-based traffic management / optimization
    • Autonomous network operations (intent-based, closed-loop)
    • Identity / authentication for AI agents
    • Data formats / semantics for AI interop
    • Policy / governance / ethical frameworks
  • Support user-defined tags and manual overrides
  • Detect which IETF working groups / areas are involved
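Before any LLM-based categorization, a cheap keyword pre-filter can cut the candidate set. A sketch, with an illustrative (not exhaustive) keyword list; word-boundary matching avoids false hits like "ai" inside "maintenance":

```python
import re

# Hypothetical starter list; extend from observed draft titles.
AI_KEYWORDS = {"agent", "agents", "ai", "llm", "machine learning",
               "autonomous", "inference"}

def looks_ai_related(title: str, abstract: str) -> bool:
    """Cheap triage: True if any keyword appears as a whole word."""
    text = f"{title} {abstract}".lower()
    return any(re.search(rf"\b{re.escape(kw)}\b", text) for kw in AI_KEYWORDS)
```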

3. Overlap & Novelty Detection

  • Similarity analysis between drafts (abstract-level and full-text)
    • Semantic similarity (embeddings-based)
    • Structural similarity (do they define similar mechanisms?)
  • Flag clusters of highly overlapping drafts
  • Highlight novel contributions — ideas that don't appear elsewhere
  • Track how ideas evolve across draft revisions
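The embeddings-based similarity check can be sketched with plain cosine similarity over whatever vectors the embedding model returns; the 0.85 threshold is an illustrative default to tune against real draft pairs.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def overlap_pairs(embeddings: dict, threshold: float = 0.85):
    """Return (draft_a, draft_b, similarity) for every pair of drafts
    whose embedding similarity meets the threshold."""
    names = sorted(embeddings)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            sim = cosine(embeddings[a], embeddings[b])
            if sim >= threshold:
                pairs.append((a, b, sim))
    return pairs
```

Flagged pairs can then be chained into clusters (any connected component of the pair graph) to surface crowded areas.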

4. Brief Rating / Assessment

  • Per-draft rating along dimensions like:
    • Novelty — How new/unique is the core idea?
    • Maturity — How complete and well-specified is the draft?
    • Overlap — How much does it duplicate existing work?
    • Momentum — Author track record, WG adoption, revision frequency
    • Relevance — How central is it to the AI/agent topic?
  • Generate a short (2–4 sentence) AI summary + assessment for each draft
  • Optional: composite score for quick sorting
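The optional composite could be a weighted sum of the five dimensions. A sketch assuming 1–10 scores and illustrative weights, with overlap inverted so heavy duplication lowers the score:

```python
# Illustrative weights; must sum to 1.0.
WEIGHTS = {"novelty": 0.3, "maturity": 0.2, "overlap": 0.2,
           "momentum": 0.15, "relevance": 0.15}

def composite_score(ratings: dict) -> float:
    """Weighted composite on a 1-10 scale. Overlap counts inversely:
    a high overlap rating (heavy duplication) drags the score down."""
    adjusted = dict(ratings)
    adjusted["overlap"] = 11 - ratings["overlap"]  # 10 -> 1, 1 -> 10
    return round(sum(WEIGHTS[k] * adjusted[k] for k in WEIGHTS), 2)
```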

5. Overview & Visualization

  • Dashboard view — sortable/filterable table of tracked drafts
  • Landscape map — visual clustering of drafts by topic similarity
  • Timeline — when drafts appeared, how the space is evolving
  • Diff view — compare two drafts side-by-side (key claims, mechanisms)
  • Overlap matrix — heatmap showing which drafts cover similar ground

6. Staying Informed

  • Watch list — mark drafts of special interest
  • Change alerts — notify on new drafts matching keywords, new revisions, status changes
  • Periodic digest — weekly/on-demand summary of what's new in the space
  • Gap analysis — suggest areas not yet covered by any draft
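Change alerts reduce to diffing two snapshots of the tracked set. A minimal sketch, assuming each snapshot maps draft name to its current revision string:

```python
def detect_changes(previous: dict, current: dict) -> dict:
    """Compare two {draft_name: revision} snapshots and report
    newly appeared drafts and drafts whose revision changed."""
    new = sorted(set(current) - set(previous))
    updated = sorted(n for n in current
                     if n in previous and current[n] != previous[n])
    return {"new": new, "updated": updated}
```

The same diff output can feed both the alerting path and the weekly digest.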

Stretch / Nice-to-Have Features

  • Idea extraction — pull out discrete technical ideas/mechanisms from each draft, track them independently
  • Cross-reference with RFCs — link drafts to existing standards they build on or conflict with
  • Author network — who collaborates with whom, which orgs are active
  • Meeting tracking — link to relevant IETF meeting agenda items, minutes, slides
  • Export — generate reports (markdown, PDF) for sharing or personal reference
  • Personal notes — attach annotations and thoughts to any draft

Design Principles

  • Local-first — runs on the user's machine, data stored locally
  • CLI + optional web UI — start simple (CLI/TUI), add a local web dashboard later
  • LLM-assisted but transparent — use AI for summarization/rating but always show reasoning
  • Incremental — can start with a small set of drafts and scale up
  • Open data — all source data comes from public IETF resources

LLM Strategy

Two viable options — use the best tool for each job:

Option A: Claude API (via subscription)

  • Pro: Superior reasoning for summarization, rating, novelty detection, and comparative analysis of dense technical text. IETF drafts are complex — quality matters here.
  • Pro: Better at structured output (JSON ratings, consistent categorization)
  • Con: API costs / rate limits (though subscription helps)
  • Best for: Summarization, rating, categorization, overlap analysis, gap detection

Option B: Local Ollama

  • Pro: Free, private, no rate limits, works offline
  • Pro: Good enough for embeddings (e.g. nomic-embed-text, mxbai-embed-large)
  • Con: Smaller models struggle with nuanced technical assessment of IETF-grade content
  • Best for: Embeddings for similarity/clustering, bulk preprocessing, quick triage

| Task | Model |
| --- | --- |
| Embeddings (similarity, clustering) | Ollama local (nomic-embed-text or similar) |
| Quick triage (is this draft AI-related?) | Ollama local (fast, cheap) |
| Summarization & rating | Claude (better quality on technical text) |
| Overlap/novelty analysis | Claude (needs strong reasoning) |
| Gap analysis & insights | Claude (creative + analytical) |

This keeps costs down (embeddings are the bulk operation) while using Claude where quality actually matters. The tool should support both backends with a simple config switch, so you can fall back to all-local or all-Claude as needed.
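The config switch could be a single mode flag that either honors the per-task routing above or forces one backend everywhere. A sketch with hypothetical task names:

```python
from dataclasses import dataclass

# Default per-task routing in hybrid mode (task names are illustrative).
TASK_BACKENDS = {
    "embed": "ollama", "triage": "ollama",
    "summarize": "claude", "rate": "claude",
    "overlap": "claude", "gaps": "claude",
}

@dataclass
class LLMConfig:
    mode: str = "hybrid"  # "hybrid", "local", or "claude"

    def backend_for(self, task: str) -> str:
        """Pick the backend for a task, honoring all-local/all-Claude overrides."""
        if self.mode == "hybrid":
            return TASK_BACKENDS[task]
        return "ollama" if self.mode == "local" else "claude"
```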

Decisions Made

  • Tech stack: Python
  • Storage: SQLite (single DB file, FTS5 for full-text search)
  • Scope: AI/agent-focused first, generalizable later
  • Interface: CLI + Markdown report output (v1)
  • LLM: Hybrid — Ollama for embeddings/triage, Claude for analysis/rating
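
The FTS5 decision can be exercised in a few lines; FTS5 ships with the SQLite bundled in standard CPython builds, though custom builds may omit it. A sketch with a hypothetical schema and sample rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Contentless-by-default FTS5 virtual table over the searchable fields.
con.execute("CREATE VIRTUAL TABLE drafts USING fts5(name, title, abstract)")
con.executemany(
    "INSERT INTO drafts VALUES (?, ?, ?)",
    [
        ("draft-x-agent-auth-00", "OAuth for Agents",
         "Delegated authorization for AI agents."),
        ("draft-y-tcp-tuning-02", "TCP Tuning",
         "Congestion control parameters."),
    ],
)
# MATCH runs a full-text query across all indexed columns, ranked by relevance.
hits = con.execute(
    "SELECT name FROM drafts WHERE drafts MATCH ? ORDER BY rank", ("agents",)
).fetchall()
```

Note that FTS5's default tokenizer does no stemming, so "agents" will not match "agent"; a stemming tokenizer or query expansion would be needed for that.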