IETF Draft Analyzer v0.1.0 — track, categorize, and rate AI/agent drafts

Python CLI tool that fetches AI/agent-related Internet-Drafts from the IETF
Datatracker, rates them using Claude, generates embeddings via Ollama for
similarity/clustering, and produces markdown reports.

Features:
- Fetch drafts by keyword from Datatracker API with full text download
- Batch analysis with Claude (token-optimized, responses cached in SQLite)
- Embedding-based similarity search and overlap cluster detection
- Reports: overview, landscape by category, overlap clusters, weekly digest
- SQLite with FTS5 for full-text search across 260 tracked drafts
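
A minimal sketch of the FTS5 search mentioned above; the table and column names (`drafts_fts`, `name`, `abstract`) are illustrative assumptions, not necessarily the tool's actual schema:

```python
import sqlite3

def search_drafts(conn: sqlite3.Connection, query: str) -> list[tuple[str, str]]:
    """Full-text search over draft names and abstracts, best match first."""
    return conn.execute(
        "SELECT name, abstract FROM drafts_fts WHERE drafts_fts MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE drafts_fts USING fts5(name, abstract)")
conn.executemany(
    "INSERT INTO drafts_fts VALUES (?, ?)",
    [
        ("draft-example-agent-auth", "OAuth flows for autonomous agents"),
        ("draft-example-routing", "Segment routing extensions"),
    ],
)
# FTS5's default tokenizer splits on hyphens, so "agent" matches the
# "agent" token inside "draft-example-agent-auth".
print(search_drafts(conn, "agent"))
```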

Initial analysis of the 260 drafts shows OAuth agent authentication (13 drafts)
and agent gateway/collaboration (10 drafts) as the most crowded clusters, while
AI safety/alignment is underserved despite earning the highest average quality
scores.
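
The overlap clusters above come from embedding similarity; a toy sketch of the idea using greedy single-link clustering (the 0.8 threshold and 2-d vectors are illustrative assumptions, not the tool's actual parameters):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def overlap_clusters(
    embeddings: dict[str, list[float]], threshold: float = 0.8
) -> list[set[str]]:
    """Greedy single-link clustering: a draft joins the first cluster
    containing any member it is similar enough to."""
    clusters: list[set[str]] = []
    for name, vec in embeddings.items():
        for cluster in clusters:
            if any(cosine(vec, embeddings[m]) >= threshold for m in cluster):
                cluster.add(name)
                break
        else:
            clusters.append({name})
    return clusters

# "a" and "b" point in nearly the same direction, so they cluster together.
print(overlap_clusters({"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}))
```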

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
commit 6771a4c235 (2026-02-28 00:36:45 +01:00)
17 changed files with 2823 additions and 0 deletions


@@ -0,0 +1,44 @@
"""Configuration management."""
from __future__ import annotations
import json
from dataclasses import dataclass, field, asdict
from pathlib import Path
DEFAULT_DATA_DIR = Path(__file__).resolve().parent.parent.parent / "data"
CONFIG_FILE = DEFAULT_DATA_DIR / "config.json"
DEFAULT_KEYWORDS = [
"agent",
"ai-agent",
"llm",
"autonomous",
"machine-learning",
"artificial-intelligence",
]
@dataclass
class Config:
data_dir: str = str(DEFAULT_DATA_DIR)
db_path: str = str(DEFAULT_DATA_DIR / "drafts.db")
ollama_url: str = "http://localhost:11434"
ollama_embed_model: str = "nomic-embed-text"
claude_model: str = "claude-sonnet-4-20250514"
search_keywords: list[str] = field(default_factory=lambda: list(DEFAULT_KEYWORDS))
# Only fetch drafts newer than this (ISO date string)
fetch_since: str = "2024-01-01"
# Polite delay between API requests (seconds)
fetch_delay: float = 0.5
def save(self) -> None:
Path(self.data_dir).mkdir(parents=True, exist_ok=True)
CONFIG_FILE.write_text(json.dumps(asdict(self), indent=2))
@classmethod
def load(cls) -> Config:
if CONFIG_FILE.exists():
data = json.loads(CONFIG_FILE.read_text())
return cls(**{k: v for k, v in data.items() if k in cls.__dataclass_fields__})
return cls()
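
The key-filtering in `load()` makes old config files forward-compatible: unknown keys are silently dropped instead of raising `TypeError`. A standalone round-trip sketch of that pattern, using a throwaway dataclass and a temp directory rather than the real `Config`:

```python
import json
import tempfile
from dataclasses import dataclass, asdict, fields
from pathlib import Path

@dataclass
class Demo:
    model: str = "nomic-embed-text"
    delay: float = 0.5

path = Path(tempfile.mkdtemp()) / "config.json"
# Simulate a config written by another version that has an extra key.
path.write_text(json.dumps({**asdict(Demo()), "obsolete_option": True}))

data = json.loads(path.read_text())
known = {f.name for f in fields(Demo)}
# Passing data directly would raise TypeError; filtering drops the extra key.
cfg = Demo(**{k: v for k, v in data.items() if k in known})
print(cfg)
```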