
Sprint Plan: Next 3 Sprints

Created: 2026-03-03 | Status: Ready for execution

Decisions made:

  • Publication platform: GitHub Pages (Markdown-native, pairs with open-source release)
  • Publication cadence: Staggered, 1 post per day (Post 1 + series overview on day 1)
  • License: MIT

Current State (Sprint 0 — Where We Are)

| Asset | Status | Details |
|---|---|---|
| Blog series (8 posts + overview) | Final polish | ~22K words across 10 files. Writer doing final editorial pass. All numbers updated to 361-draft dataset. |
| CLI tool | Working | 24+ commands, 6,882 lines Python, 13 source files. 7 new features built this session (refs, trends, idea-overlap, status, revisions, centrality, co-occurrence). |
| Database | Fully processed | 361 drafts, 557 authors, 230 orgs, 1,780 ideas (628 cross-org convergent), 12 gaps. Pipeline complete on all drafts. |
| Reports | Fresh | All 15 report types regenerated from full dataset. 7 data packages in data/reports/blog-series/data/. |
| arXiv paper | Draft v1 | 13 pages at paper/main.tex, based on 260-draft dataset — needs update to 361+ with new analyses. |
| Packaging | Minimal | pyproject.toml exists (version 0.1.0). No tests, no CI, no LICENSE, no CONTRIBUTING.md. |
| README | Outdated | References 260 drafts, version 0.1.0, missing new commands and features. |
| Open source readiness | Not ready | No tests, no CI, no contribution guide, stale README. Tool works but isn't packaged for others. |

Sprint 1: Publish the Blog Series + Begin Outreach (Highest Value, Time-Sensitive)

Theme: Get the analysis in front of people while the landscape is still moving fast. IETF 122 is upcoming and the data changes monthly, so publishing now maximizes relevance. The pipeline is complete and the blog series is in final editorial polish; this sprint is about getting it live and in front of the right people.

Duration: 3-5 days

NOTE: Tasks 1.1-1.3 from the original plan are ALREADY DONE:

  • Pipeline ran on all 361 drafts (557 authors, 1,780 ideas, 628 cross-org convergent, 12 gaps)
  • All 15 reports regenerated
  • All 8 blog posts updated with expanded-dataset numbers and verified for consistency
  • Two deep analysis rounds completed (revision velocity, safety ratio trends, RFC divergence, co-occurrence, IETF meeting timing)
  • Seven new CLI features implemented (refs, trends, idea-overlap, status, revisions, centrality, co-occurrence)

Task 1.1: Complete Final Editorial Polish

Agent: Writer | Effort: S

What: Finish the ongoing editorial pass. The Writer is currently on this task. Once complete, the series is publication-ready.

Acceptance criteria: All 8 posts finalized. Writer confirms no further changes needed. Series overview (Post 00) is consistent with all posts.

Dependencies: None (in progress now)

Task 1.2: Choose a Publication Platform and Publish

Agent: Writer + Architect | Effort: M

What: Decide where to publish the blog series and get posts live. Options (in order of recommendation):

  1. GitHub Pages (fastest, free, Markdown-native) — create a simple static site with Jekyll/Hugo from the blog-series directory. Can publish straight from the repo.
  2. Substack / Medium / dev.to — wider audience, built-in discovery, but requires manual formatting and copy-paste from Markdown.
  3. Personal blog — if the author already has one, lowest friction.

Key decisions:

  • Publish all 8 at once (series dump) or staggered (1 per day / every 2 days)?
    • Recommendation: Publish Post 1 + Post 00 (series overview) on day 1, then one post per day. This builds anticipation and makes each post shareable individually.
  • Interlink: each post should link to the previous and next post in the series.
  • If GitHub Pages: set up a minimal Jekyll/Hugo site with posts from data/reports/blog-series/.

Acceptance criteria: Post 1 is live and publicly accessible. Publication schedule for posts 2-8 is set.

Dependencies: Task 1.1

Task 1.3: Social Media and IETF Community Outreach

Agent: Architect (drafts messaging) + Writer (final copy) | Effort: M

What: Write announcement copy and identify channels:

  • Short-form posts (Twitter/X, LinkedIn, Mastodon, Bluesky): 3-4 variations highlighting different angles:
    • The growth stat + safety deficit angle ("IETF AI agent drafts went from 0.5% to 9.3% of all submissions in 15 months — with a 4:1 capability-to-safety ratio")
    • The geopolitics angle ("One company writes 18% of all AI agent standards drafts")
    • The fragmentation angle ("120 A2A protocols, 14 competing OAuth proposals, zero interop layer")
    • The methodology angle ("We analyzed 361 IETF drafts with Claude for ~$9 and found the standards world is building highways before traffic lights")
  • IETF mailing lists: Post to relevant lists (art@ietf.org, agents@ietf.org if it exists, or general discussion) with a brief summary and link to Post 1
  • Hacker News / Reddit /r/networking /r/machinelearning: Submit Post 1 or Post 4 (gaps — the most newsworthy finding)
  • Direct outreach: Identify 5-10 people who would care most (IETF area directors, authors of top-ranked drafts like DAAP and VOLT, AI standards researchers, MCP/A2A protocol teams) and share directly

Acceptance criteria: Announcement text ready for 3+ platforms. List of 5-10 direct outreach targets with contact method. Post submitted to at least one community (HN or IETF list).

Dependencies: Task 1.2

Task 1.4: Fetch Latest Drafts (IETF 122 Prep)

Agent: Analyst | Effort: S

What: Run ietf fetch to pick up any new drafts submitted since the last fetch (2026-03-03). IETF meetings drive submission spikes — capture anything new before the paper update. Process any new drafts through the full pipeline (analyze, authors, ideas, embed, gaps); a sketch of the end-to-end run follows below.

Acceptance criteria: Any new drafts since 2026-03-03 are fetched, stored, and fully processed. Count reported.

Dependencies: None (can run in parallel with publishing tasks)
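
A minimal sketch of that run, assuming the `ietf` CLI is on PATH and uses the subcommand names listed in this plan; any flags the subcommands take are omitted, and stop-on-first-failure is an assumption:

```python
# Sketch: run fetch plus the full processing pipeline in order,
# stopping at the first failing step. Subcommand names come from this
# plan; exact flags are not shown.
import subprocess
import sys

PIPELINE = ["fetch", "analyze", "authors", "ideas", "embed", "gaps"]

def run_pipeline() -> None:
    for step in PIPELINE:
        print(f"==> ietf {step}")
        result = subprocess.run(["ietf", step])
        if result.returncode != 0:
            sys.exit(f"pipeline failed at 'ietf {step}' (exit {result.returncode})")

if __name__ == "__main__":
    run_pipeline()
```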

Task 1.5: Commit All Current Work

Agent: Coder | Effort: S

What: The git status shows extensive uncommitted work: modified source files (cli.py, db.py, analyzer.py, reports.py, visualize.py, models.py, config.py), new files (CLAUDE.md, orgs.py, scripts/, blog series), new reports, agent definitions. Create a clean commit (or a few topical commits) capturing the current state: v0.3.0 with expanded dataset, 8-post blog series, 7 new features, agent team artifacts.

Acceptance criteria: All meaningful changes committed. Working tree clean. Version bumped to 0.3.0 in pyproject.toml and cli.py.

Dependencies: Task 1.1 (want final polished blog posts in the commit)

Sprint 1 Success Criteria

  • Blog series editorial polish complete
  • Blog Post 1 is live and publicly accessible
  • Publication schedule set for remaining 7 posts
  • Announcement posted to at least 2 channels
  • Latest drafts fetched for IETF 122 coverage
  • All current work committed to git (v0.3.0)

Sprint 2: Update the arXiv Paper + Deepen Analysis

Theme: Turn the expanded dataset into a credible academic contribution. The paper at paper/main.tex is solid but frozen at 260 drafts — updating it to 361+ with the new analyses makes it publishable.

Duration: 5-7 days

Task 2.1: Run Pipeline on Any Newly Fetched Drafts

Agent: Analyst | Effort: S-M

What: Process any new drafts picked up in Sprint 1 Task 1.4 (IETF 122 fetch). Run the full pipeline: analyze, authors, ideas, embed, gaps. Regenerate reports.

Acceptance criteria: All drafts in the database are fully analyzed. Reports refreshed.

Dependencies: Sprint 1 Task 1.4

Task 2.2: Generate Updated Figures for the Paper

Agent: Analyst + Coder | Effort: M

What: The paper has placeholder figures ([TIMELINE], [QUALITY], [RADAR], [NETWORK]). Run ietf viz all to regenerate the HTML visualizations, then use paper/export_figures.py (or write one if the existing script doesn't work) to export publication-quality PNG/PDF versions. A sketch of an export helper follows the figure list below.

Figures needed:

  • Figure 1: Monthly submission timeline (stacked area by category) — with 361+ drafts
  • Figure 2: Rating distributions by category (violin plots) — updated
  • Figure 3: Similarity heatmap (now 361x361) — updated
  • Figure 4: Quality vs. uniqueness scatter — updated
  • Figure 5: Radar charts for top categories — updated
  • Figure 6: Author collaboration network — updated with expanded dataset
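
Since the visualizations render to HTML, a plausible export path is Plotly's static-image support. A minimal sketch, assuming the figures are Plotly objects and that a hypothetical build_figures() helper in the visualize module returns them by name; static export also requires the kaleido package:

```python
# Sketch: export every figure as print-quality PDF and PNG.
# ASSUMPTIONS: figures are Plotly objects; `build_figures()` is a
# hypothetical helper returning {name: figure}; write_image() needs kaleido.
from pathlib import Path

from ietf_analyzer.visualize import build_figures  # assumed import path

OUT = Path("paper/figures")
OUT.mkdir(parents=True, exist_ok=True)

for name, fig in build_figures().items():
    for ext in ("pdf", "png"):
        target = OUT / f"{name}.{ext}"
        fig.write_image(str(target), scale=2)  # scale=2 for print resolution
        print(f"wrote {target}")
```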

Acceptance criteria: All 6 figures exported as PDF/PNG in paper/figures/. Each figure reflects the full expanded dataset.

Dependencies: Task 2.1

Task 2.3: Update Paper Text and Tables

Agent: Writer + Architect | Effort: L

What: Update paper/main.tex throughout:

  • Title: Change "260" to the current count
  • Abstract: Update all numbers (drafts, authors, ideas, similarity pairs, tokens)
  • Section 3 (Methodology): Add the 6 new keywords (mcp, agentic, inference, generative, intelligent, aipref). Update seed keyword list from 6 to 12.
  • Section 4 (Dataset Overview): Update Table 1 with new numbers. Time span now extends further.
  • Section 5 (Findings): All tables and stats need updating with 361+ draft data. Particularly:
    • Table 2 (categories) — will likely have new entries from expanded keywords
    • Table 4 (organizations) — author/org counts will change
    • Table 5 (top-ranked drafts) — new top drafts may emerge
    • All prose stats (percentages, counts, growth rates)
  • Section 6 (Discussion): Revisit conclusions — do they still hold with 40% more data? Any new insights?
  • Section 7 (Future Work): Several items are now done (citation network via refs, gap-driven standardization). Update.
  • Section 8 (Conclusion): Update closing numbers
  • Appendix A: Full category list will likely expand
  • Repository URL: Fill in the [TODO] with actual URL after open-source release (or leave as "available upon publication")
  • References: Add citations for MCP, any new relevant work

Acceptance criteria: Paper compiles cleanly with make (or pdflatex). All numbers match the current database. No [TODO] placeholders remain (except possibly author name/email if not yet decided).

Dependencies: Tasks 2.1, 2.2

Task 2.4: Add New Analysis Sections to Paper

Agent: Architect + Analyst | Effort: M

What: The blog series uncovered findings that aren't in the paper. Consider adding:

  • Cross-organization convergence: The idea-overlap analysis showing which technical ideas appear across multiple independent organizations (convergence signals vs. within-org duplication)
  • Team bloc analysis: The 33 detected collaboration clusters and what they mean for coordination
  • RFC cross-reference graph: Which foundational RFCs underpin the agent ecosystem (if ietf refs data is available)
  • Gap evolution: Whether any of the 12 identified gaps are being filled by newer submissions

Pick the 1-2 most compelling additions that strengthen the paper's contribution without bloating it past the 16-page limit. A sketch of the cross-org convergence check appears below.
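
For the convergence candidate, the underlying check is to find idea pairs whose embeddings are highly similar but whose authors sit in different organizations. A minimal sketch, assuming ideas are available as (org, embedding) pairs; the function name, data shapes, and 0.85 threshold are illustrative, not the tool's actual API:

```python
# Sketch: separate cross-org convergence from within-org duplication.
# Data shapes, names, and the similarity threshold are illustrative.
import numpy as np

def convergent_pairs(orgs: list[str], vecs: np.ndarray, threshold: float = 0.85):
    """Yield (i, j, similarity) for cross-org idea pairs above threshold."""
    unit = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)  # L2-normalize
    sims = unit @ unit.T  # cosine similarity matrix
    for i in range(len(orgs)):
        for j in range(i + 1, len(orgs)):
            if sims[i, j] >= threshold and orgs[i] != orgs[j]:
                yield i, j, float(sims[i, j])
```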

Acceptance criteria: 1-2 new subsections added. Paper is <= 16 pages. New sections have supporting data tables/figures.

Dependencies: Task 2.3

Task 2.5: Submit to arXiv

Agent: Architect | Effort: S

What: Prepare the arXiv submission package:

  • Compile the final PDF
  • Ensure all figures are included
  • Write the arXiv abstract (may differ slightly from paper abstract)
  • Choose categories: cs.AI, cs.SE, or cs.CY (Computers and Society) — recommend cs.AI as primary, cs.SE as secondary
  • Upload to arXiv

Acceptance criteria: Paper submitted to arXiv with no compilation errors. arXiv ID received.

Dependencies: Tasks 2.3, 2.4

Task 2.6: Continue Blog Publication (Posts 2-8)

Agent: Writer | Effort: S (per post, ongoing)

What: Continue publishing the remaining blog posts on the schedule set in Sprint 1. For each post: format for the platform, add interlinks, publish, announce on social media.

Acceptance criteria: All 8 posts published by end of Sprint 2. Each post has at least one social media announcement.

Dependencies: Sprint 1 Task 1.2

Sprint 2 Success Criteria

  • arXiv paper updated with 361+ draft data and submitted
  • All 8 blog posts published
  • At least one new analysis (cross-org convergence or RFC cross-refs) added to paper
  • Paper figures reflect the expanded dataset

Sprint 3: Open-Source Release + Tool Polish

Theme: Make the tool usable by others. The analysis is interesting, but the methodology — combining LLM rating with embedding similarity at scale — is the reusable contribution. A clean open-source release turns a one-off project into a tool others can use to monitor IETF standardization activity.

Duration: 7-10 days

Task 3.1: Add Tests

Agent: Coder | Effort: L

What: Currently zero tests. Add a test suite covering the critical paths:

  • tests/test_db.py — Database operations (upsert, query, FTS search, embedding storage)
  • tests/test_config.py — Config load/save, defaults, overrides
  • tests/test_models.py — Dataclass creation, serialization
  • tests/test_fetcher.py — Mock Datatracker API responses, parse draft metadata
  • tests/test_analyzer.py — Mock Claude API, parse ratings/ideas from responses
  • tests/test_cli.py — Click CLI integration tests (using CliRunner)
  • tests/test_reports.py — Report generation produces valid markdown

Use pytest. Mock all external APIs (Datatracker, Claude, Ollama). Include a small test fixture database with ~5 drafts.
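
A minimal sketch of the CLI-test pattern, assuming the Click group is importable as cli from the package and that the Datatracker call lives in a fetcher module; both import paths and the fetch_drafts name are assumptions to adjust against the real code:

```python
# tests/test_cli.py (sketch) -- exercises the Click CLI without network access.
# ASSUMPTIONS: `ietf_analyzer.cli:cli` and `ietf_analyzer.fetcher.fetch_drafts`
# are placeholder paths; swap in the real module names.
from click.testing import CliRunner

from ietf_analyzer.cli import cli  # assumed import path

def test_help_lists_commands():
    result = CliRunner().invoke(cli, ["--help"])
    assert result.exit_code == 0
    assert "fetch" in result.output  # command name from this plan

def test_fetch_runs_offline(monkeypatch):
    # Replace the network call so the test never touches the Datatracker.
    monkeypatch.setattr(
        "ietf_analyzer.fetcher.fetch_drafts",  # placeholder target
        lambda *a, **kw: [],                   # pretend: no new drafts
        raising=False,
    )
    result = CliRunner().invoke(cli, ["fetch"])
    assert result.exit_code == 0
```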

Add to pyproject.toml:

```toml
[project.optional-dependencies]
dev = ["pytest>=7.0", "pytest-cov", "responses>=0.25"]
```

Acceptance criteria: pytest passes with >= 70% coverage on db.py, config.py, models.py. All tests run without network access.

Dependencies: None

Task 3.2: Set Up CI/CD (GitHub Actions)

Agent: Coder | Effort: M

What: Create .github/workflows/ci.yml:

  • Trigger on push to main and PRs
  • Python 3.11+ matrix
  • Install deps, run pytest
  • Run linting (ruff or flake8)
  • Optionally: build the package and check it installs cleanly

Acceptance criteria: CI green on current main. PRs get automatic test runs. Badge in README.

Dependencies: Task 3.1

Task 3.3: Clean Up Code for Public Release

Agent: Coder | Effort: M

What: Audit and fix:

  • Version: Update from 0.1.0 to 0.3.0 (0.2.0 was the viz release, 0.3.0 is the expanded+blog release) in both pyproject.toml and cli.py
  • Config portability: config.py uses Path(__file__).resolve().parent.parent.parent / "data" as default — this only works in dev mode. For installed packages, default to ~/.ietf-analyzer/ or use XDG conventions. Keep current behavior as fallback for dev. (See the sketch after this list.)
  • Remove stale files: data/ietf_drafts.db (empty/stale per MEMORY.md). Clean up any other debris.
  • Error handling at boundaries: Ensure graceful failures when:
    • Ollama is not running (ietf embed should say "Ollama not reachable" not crash)
    • ANTHROPIC_API_KEY is not set (ietf analyze should say "Set ANTHROPIC_API_KEY" not crash)
    • Network is unavailable (ietf fetch should handle timeouts)
  • Type hints: Ensure public API functions have type hints (don't add to internal helpers)
  • .env support: Already uses python-dotenv — verify it works from any working directory
  • Remove any debug prints or commented-out code
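
A minimal sketch of the two behaviors above (portable data-dir default and a friendly failure when Ollama is down); the XDG handling, default URL, and httpx dependency are illustrative assumptions, not the tool's current code:

```python
# Sketch: portable data directory + graceful external-service failure.
# ASSUMPTIONS: XDG_DATA_HOME handling, the default Ollama URL, and the
# httpx dependency are illustrative choices.
import os
from pathlib import Path

import httpx

def default_data_dir() -> Path:
    """Dev checkouts keep ./data; installed runs fall back to a home dir."""
    repo_data = Path(__file__).resolve().parent.parent.parent / "data"
    if repo_data.is_dir():  # current behavior, kept as the dev-mode fallback
        return repo_data
    xdg = os.environ.get("XDG_DATA_HOME")
    base = Path(xdg) if xdg else Path.home() / ".ietf-analyzer"
    base.mkdir(parents=True, exist_ok=True)
    return base

def require_ollama(url: str = "http://localhost:11434") -> None:
    """Exit with a clear message instead of a traceback when Ollama is down."""
    try:
        httpx.get(url, timeout=2.0)
    except httpx.HTTPError:
        raise SystemExit(f"Ollama not reachable at {url}; start it or skip `ietf embed`.")
```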

Acceptance criteria: Tool installs cleanly via pip install . in a fresh venv. Commands that need external services give clear error messages when those services are unavailable.

Dependencies: None

Task 3.4: Write Proper README and Docs

Agent: Writer + Coder | Effort: M

What: Rewrite README.md for an open-source audience:

  • Project description with the hook (what it does, why it matters)
  • Updated stats (361+ drafts, 557+ authors, 1,780+ ideas)
  • Installation: pip install, API key setup, Ollama setup
  • Quick start: the 6-command pipeline
  • Full CLI reference (all 24 commands with options)
  • Configuration: explain config.json, environment variables, defaults
  • Architecture overview: how the modules fit together (fetcher -> db -> analyzer -> reports/viz)
  • Screenshots/examples of output (report excerpts, viz screenshots)
  • Contributing guide (CONTRIBUTING.md)
  • Link to the blog series and arXiv paper
  • License badge, CI badge

Acceptance criteria: A newcomer can go from git clone to running their first analysis by following the README. No broken links.

Dependencies: Task 3.3

Task 3.5: Add LICENSE

Agent: Coder | Effort: S

What: Add the license (MIT, per the decisions above). Options considered:

  • MIT — simplest, most permissive, standard for tools like this
  • Apache 2.0 — if patent protection matters

Add to pyproject.toml:

```toml
license = {text = "MIT"}
```

Acceptance criteria: LICENSE file in repo root. License field in pyproject.toml. License header convention documented.

Dependencies: None

Task 3.6: Prepare PyPI Package

Agent: Coder | Effort: M

What: Make the package installable from PyPI:

  • Update pyproject.toml with full metadata (author, URL, classifiers, keywords, long_description from README)
  • Ensure python -m build produces a clean sdist and wheel
  • Test install from the built wheel in a fresh venv
  • Write publish script: scripts/publish.sh using twine
  • Register the package name on PyPI (or Test PyPI first)

```toml
[project]
name = "ietf-draft-analyzer"
version = "0.3.0"
description = "Track, categorize, and rate AI/agent-related IETF Internet-Drafts"
readme = "README.md"
license = {text = "MIT"}
keywords = ["ietf", "internet-draft", "ai-agents", "standards", "nlp"]
classifiers = [
    "Development Status :: 3 - Alpha",
    "Environment :: Console",
    "Intended Audience :: Science/Research",
    "Topic :: Scientific/Engineering :: Artificial Intelligence",
    "Programming Language :: Python :: 3.11",
]
```

Acceptance criteria: pip install ietf-draft-analyzer works from PyPI (or Test PyPI). The ietf command is available after install.

Dependencies: Tasks 3.3, 3.4, 3.5

Task 3.7: Public Repository Setup

Agent: Coder + Architect | Effort: S

What: Prepare the GitHub repository for public visibility:

  • Ensure .gitignore covers: .env, __pycache__, *.egg-info, data/config.json, data/drafts.db (the actual user data shouldn't be in the repo — but the schema should be recreatable)
  • Add CONTRIBUTING.md with: how to set up dev environment, how to run tests, code style, PR process
  • Add issue templates (bug report, feature request)
  • Update paper [TODO] repository URL
  • Decide: include data/drafts.db in repo (useful for reproducibility) or .gitignore it (cleaner)?
    • Recommendation: Include a snapshot as a GitHub Release asset, not in the git tree. Add a scripts/download-data.sh that fetches it.
  • Set repo topics: ietf, internet-drafts, ai-agents, standardization, nlp, cli

Acceptance criteria: Repo is ready to be made public. No secrets or personal data in git history. README renders correctly on GitHub.

Dependencies: Tasks 3.4, 3.5, 3.6

Sprint 3 Success Criteria

  • Test suite with >= 70% coverage on core modules
  • CI/CD running on GitHub Actions
  • Tool installable via pip install ietf-draft-analyzer
  • README guides a newcomer from clone to first analysis
  • LICENSE file present (MIT)
  • Repository is public-ready (no secrets, clean history, proper .gitignore)

Cross-Sprint Dependencies

```text
Sprint 1                    Sprint 2                    Sprint 3
────────                    ────────                    ────────
1.1 Final polish ─────────> 1.2 Publish
1.2 Publish ──────────────> 2.6 Continue publishing
1.4 Fetch latest ─────────> 2.1 Process new drafts
1.5 Commit v0.3.0 ────────> 3.3 Code cleanup
                            2.1 Process new ──────────> 2.2 Figures
                            2.2 Figures ──────────────> 2.3 Paper text
                            2.3 Paper ────────────────> 2.4 New sections
                            2.4 New sections ─────────> 2.5 Submit arXiv
                            2.5 Submit arXiv ─────────> 3.7 Add arXiv URL to README
                            2.2 Figures ──────────────> 3.4 Screenshots in README
                                                        3.1 Tests ──────> 3.2 CI/CD
                                                        3.3 Cleanup ────> 3.6 PyPI
                                                        3.5 License ────> 3.6 PyPI
```

Parallelization Plan

Sprint 1 (3-5 days)

  • Day 1: Writer finishes editorial polish (1.1). Analyst fetches latest drafts (1.4, parallel). Coder prepares commit (1.5, parallel).
  • Day 2: Architect + Writer choose publication platform and set up (1.2). Architect drafts social media copy (1.3, start).
  • Day 3-4: Publish Post 1 (1.2). Finalize social media copy (1.3). Begin announcing.
  • Day 5: First post live, schedule confirmed, announcements out.

Sprint 2 (5-7 days)

  • Day 1-2: Analyst processes new drafts (2.1) + generates figures (2.2). Writer publishes posts 2-4 (2.6).
  • Day 3-5: Writer + Architect update paper text (2.3). Architect + Analyst add new sections (2.4). Writer publishes posts 5-7 (2.6).
  • Day 6-7: Architect prepares arXiv submission (2.5). Writer publishes post 8 (2.6).

Sprint 3 (7-10 days)

  • Day 1-3: Coder writes tests (3.1) and cleans code (3.3) in parallel. Coder adds LICENSE (3.5).
  • Day 4-5: Coder sets up CI (3.2). Writer + Coder write README/docs (3.4).
  • Day 6-8: Coder prepares PyPI package (3.6). Coder + Architect prepare public repo (3.7).
  • Day 9-10: Final testing, publish to PyPI, make repo public.

Risk Register

| Risk | Impact | Mitigation |
|---|---|---|
| Blog stats shift significantly with 101 new drafts | M — requires rewriting parts of posts | Run the fetch/pipeline (1.4) first; check numbers before publishing |
| Ollama not available for embeddings | M — blocks pipeline | Can proceed without embeddings; analyze + ideas work independently |
| arXiv paper exceeds page limit after updates | L — need to trim | Set hard limit at 16 pages; prioritize 1-2 new sections max |
| PyPI name ietf-draft-analyzer taken | L — just rename | Check name availability early in Sprint 3 |
| IETF 122 generates a burst of new drafts mid-sprint | L — good problem to have | Fetch and process incrementally; blog posts note the snapshot date |

What Value Comes When

| Milestone | Sprint | Impact |
|---|---|---|
| Blog Post 1 live | Sprint 1 | First public visibility; test the messaging |
| Full blog series published | Sprint 2 | Complete narrative in front of the community |
| arXiv paper submitted | Sprint 2 | Academic credibility; citeable artifact |
| Open-source tool on PyPI | Sprint 3 | Others can reproduce and extend the analysis |
| Public GitHub repo | Sprint 3 | Community contributions; methodology visible |

The highest-value action is publishing the blog series NOW (Sprint 1). The landscape is moving — every week of delay makes the analysis slightly less current. The paper and open-source release build credibility and longevity but can follow.