Christian Nennemann ae5e5f8cbf Enforce public/private visibility for web UI pages
Dev-only pages (sources, trends, complexity, idea-analysis, false-positives,
similarity, landscape, export) now require @admin_required and are hidden
from nav in production mode. Citations page keeps the graph public but
hides influence/BCP tabs behind --dev flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 20:52:43 +01:00

# IETF Draft Analyzer — Project Instructions
## What This Is
Python CLI tool (`ietf`) that tracks, categorizes, rates, and maps IETF Internet-Drafts on AI/agent topics: currently 361 drafts, 403 authors, 1,262 ideas, 12 gaps. Uses Claude for analysis, Ollama for embeddings, and SQLite for storage.
## Key Paths
- Source: `src/ietf_analyzer/`
- Database: `data/drafts.db` (NOT `data/ietf_drafts.db`)
- Reports: `data/reports/`
- Blog series: `data/reports/blog-series/`
- Agent definitions: `.claude/agents/`
- Team prompt: `scripts/agent-team-prompt.md`
- Scripts: `scripts/`
## Development Journal
**Every agent and every session MUST log development milestones to `data/reports/dev-journal.md`.**
This journal serves two purposes:
1. Track progress across sessions so nothing gets lost
2. Source material for the meta blog post about using Claude agent teams to build this project
### What to Log
Append entries in this format:
```markdown
### [DATE] [AGENT/SESSION] — [SHORT TITLE]
**What**: [What was done — features built, analyses run, posts written]
**Why**: [The reasoning or decision behind it]
**Result**: [Outcome, key numbers, links to artifacts]
**Surprise**: [Optional — anything unexpected, a lesson learned, a tool limitation hit]
**Cost**: [Optional — API tokens, time taken, model used]
```
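The entry format above can be appended programmatically. A minimal sketch (the `log_milestone` helper and its signature are hypothetical, not part of the project's codebase):

```python
from datetime import date
from pathlib import Path

def log_milestone(agent: str, title: str, what: str, why: str, result: str,
                  surprise: str = "", cost: str = "",
                  journal: Path = Path("data/reports/dev-journal.md")) -> str:
    """Append a dev-journal entry in the project's standard format."""
    lines = [
        f"### {date.today().isoformat()} {agent} — {title}",
        f"**What**: {what}",
        f"**Why**: {why}",
        f"**Result**: {result}",
    ]
    # Optional fields are only emitted when provided.
    if surprise:
        lines.append(f"**Surprise**: {surprise}")
    if cost:
        lines.append(f"**Cost**: {cost}")
    entry = "\n".join(lines) + "\n\n"
    journal.parent.mkdir(parents=True, exist_ok=True)
    with journal.open("a", encoding="utf-8") as f:
        f.write(entry)
    return entry
```

Append-only writes keep the journal safe for concurrent agents, since each entry is a self-contained block.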
### Examples of What to Log
- Pipeline runs (how many drafts processed, cost, any failures)
- New features implemented (what, why, how it changed the analysis)
- Blog posts drafted or revised (key editorial decisions)
- Architectural decisions (why we structured something a certain way)
- Agent coordination moments (when one agent's output changed another's direction)
- Surprises in the data (unexpected findings that shifted the narrative)
- Tool/infra issues (things that broke, workarounds found)
### What NOT to Log
- Routine file reads or searches
- Minor formatting fixes
- Anything already captured in git commits
## Agent Team Conventions
When working as a team:
1. **Architect** designs the narrative arc and reviews everything for coherence
2. **Analyst** runs the pipeline, queries the DB, provides data packages
3. **Coder** implements new features following existing patterns (Click CLI, SQLite, rich output)
4. **Writer** produces the blog series from data packages and architectural guidance
**Always launch agents in parallel when possible.** If agents have independent tasks (e.g., Analyst querying data while Writer drafts from existing material, or Coder implementing features that don't depend on each other), spawn them concurrently in a single message rather than sequentially. Only run agents sequentially when one genuinely depends on another's output.
All agents should:
- Read `scripts/agent-team-prompt.md` for the full brief
- Log milestones to `data/reports/dev-journal.md`
- Write blog posts to `data/reports/blog-series/`
- Save reusable scripts to `scripts/`
- Follow existing code patterns (don't over-engineer)
## Blog Series
7 posts planned in `data/reports/blog-series/` (01 through 07), plus:
- **Post 8: "Agents Building the Agent Analysis"** — Meta post about using Claude Code agent teams to analyze and write about IETF agent standards. The dev-journal.md is the source material for this post.
## Code Conventions
- CLI: add Click commands in `cli.py` with `@click.option()` decorators
- DB: define tables in `ensure_tables()` in `db.py`; add queries as methods on `DraftDB`
- Reports: register report types in `generate_report()` in `reports.py`
- Always cache Claude API calls via `llm_cache` table
- Use `rich` for console output
- Save multi-step workflows as scripts in `scripts/`
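The `llm_cache` convention can be sketched as a check-before-call wrapper. The column names below are an assumption, not the project's actual schema:

```python
import hashlib
import json
import sqlite3

def cached_llm_call(conn: sqlite3.Connection, prompt: str, model: str, call_fn):
    """Return a cached Claude response if present; otherwise call and cache.

    Assumed schema: llm_cache(key TEXT PRIMARY KEY, model TEXT, response TEXT).
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS llm_cache ("
        "key TEXT PRIMARY KEY, model TEXT, response TEXT)"
    )
    # Key on model + prompt so a model upgrade invalidates old entries.
    key = hashlib.sha256(f"{model}:{prompt}".encode()).hexdigest()
    row = conn.execute(
        "SELECT response FROM llm_cache WHERE key = ?", (key,)
    ).fetchone()
    if row is not None:
        return json.loads(row[0])  # cache hit: no API spend
    response = call_fn(prompt)     # cache miss: real API call
    conn.execute(
        "INSERT INTO llm_cache (key, model, response) VALUES (?, ?, ?)",
        (key, model, json.dumps(response)),
    )
    conn.commit()
    return response
```

This makes pipeline re-runs idempotent: re-analyzing 361 drafts only pays for prompts not seen before.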
## Web UI: Public vs Dev-Only Pages
The web dashboard runs in two modes: production (default) and dev (`--dev` flag).
**When adding new pages, always decide which mode they belong to.**
Use the `@admin_required` decorator on dev-only routes, and wrap their nav links in `{% if is_admin %}` in `base.html`.
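A framework-agnostic sketch of what `@admin_required` could look like. The `CONFIG` dict and `Forbidden` exception are placeholders; the real decorator lives in the web app and would return the framework's 403 response and also check login state:

```python
import functools

CONFIG = {"dev_mode": False}  # flipped to True when run with --dev

class Forbidden(Exception):
    """Placeholder for the framework's 403 response."""

def admin_required(view):
    """Reject requests to dev-only routes unless dev mode is active."""
    @functools.wraps(view)
    def wrapper(*args, **kwargs):
        if not CONFIG["dev_mode"]:
            raise Forbidden("dev-only page")
        return view(*args, **kwargs)
    return wrapper
```

`functools.wraps` preserves the view's name and docstring, which most routing layers rely on for URL registration.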
### Public pages (visible to everyone)
Pages showing **publicly available data** or **high-level results** that are defensible:
- Overview, Draft Explorer, Draft Detail — browsable catalog
- Authors — public data from Datatracker
- Citations — public citation data
- Ratings — score distributions (aggregate, not per-draft methodology)
- Timeline — submission trends (factual)
- Search — search functionality
- About, Impressum, Datenschutz — legal/info pages
### Dev-only pages (`@admin_required`, `--dev` mode)
Pages exposing **internal methodology**, **LLM judgments**, **cost data**, or **debatable analysis**:
- Gap Explorer, Gap Generation — internal gap analysis, draft generation
- Monitor — pipeline health, API costs, token usage
- Analytics — pageview tracking
- Compare — side-by-side draft comparison (uses Claude)
- AI Ask/Synthesize — Claude-powered Q&A (costs tokens)
- Annotations — internal notes
- False Positives — exposes filtering methodology, raw LLM judgment calls
- Complexity — correlations between LLM ratings and structural metrics (methodologically debatable)
- Idea Analysis — LLM-generated novelty scores could be challenged
- Trends — safety ratio uses internal category mappings
- Sources — rating comparisons across standards bodies (could offend orgs)
- Similarity — embedding-based methodology
- Landscape — t-SNE map (methodology-dependent)
- Obsidian Export — internal tool
### Decision criteria for new pages
- **Public if**: data comes from public sources (Datatracker, standards body websites), or shows aggregate statistics without exposing LLM methodology
- **Dev-only if**: page reveals how Claude rates/classifies things, shows internal cost/token data, compares organizations in potentially sensitive ways, or uses methodology that could be questioned without full context
## Current Status (2026-03-03)
- v0.2.0, 361 drafts (101 new, unprocessed)
- 101 new drafts need: analyze, authors, ideas, embed, gaps
- Blog series: planned, not yet written
- Agent team: defined in `.claude/agents/`, ready to launch