# Sprint Plan: Next 3 Sprints

*Created: 2026-03-03 | Status: Ready for execution*

**Decisions made:**

- **Publication platform**: GitHub Pages (Markdown-native, pairs with open-source release)
- **Publication cadence**: Staggered, 1 post per day (Post 1 + series overview on day 1)
- **License**: MIT

---

## Current State (Sprint 0 — Where We Are)

| Asset | Status | Details |
|-------|--------|---------|
| Blog series (8 posts + overview) | Final polish | ~22K words across 10 files. Writer doing final editorial pass. All numbers updated to 361-draft dataset. |
| CLI tool | Working | 24+ commands, 6,882 lines Python, 13 source files. 7 new features built this session (refs, trends, idea-overlap, status, revisions, centrality, co-occurrence). |
| Database | Fully processed | 361 drafts, 557 authors, 230 orgs, 1,780 ideas (628 cross-org convergent), 12 gaps. Pipeline complete on all drafts. |
| Reports | Fresh | All 15 report types regenerated from full dataset. 7 data packages in `data/reports/blog-series/data/`. |
| arXiv paper | Draft v1 | 13 pages at `paper/main.tex`, based on 260-draft dataset — needs update to 361+ with new analyses. |
| Packaging | Minimal | pyproject.toml exists (version 0.1.0). No tests, no CI, no LICENSE, no CONTRIBUTING.md. |
| README | Outdated | References 260 drafts, version 0.1.0, missing new commands and features. |
| Open source readiness | Not ready | No tests, no CI, no contribution guide, stale README. Tool works but isn't packaged for others. |

---

## Sprint 1: Publish the Blog Series + Begin Outreach (Highest Value, Time-Sensitive)

**Theme**: Get the analysis in front of people while the landscape is still moving fast. IETF 122 is upcoming and the data changes monthly — publishing now maximizes relevance. The pipeline is complete and the blog series is in final editorial polish; this sprint is about getting it live and in front of the right people.
**Duration**: 3-5 days

**NOTE**: Tasks 1.1-1.3 from the original plan are ALREADY DONE:

- Pipeline ran on all 361 drafts (557 authors, 1,780 ideas, 628 cross-org convergent, 12 gaps)
- All 15 reports regenerated
- All 8 blog posts updated with expanded-dataset numbers and verified for consistency
- Two deep analysis rounds completed (revision velocity, safety ratio trends, RFC divergence, co-occurrence, IETF meeting timing)
- Seven new CLI features implemented (refs, trends, idea-overlap, status, revisions, centrality, co-occurrence)

### Task 1.1: Complete Final Editorial Polish

**Agent**: Writer
**Effort**: S
**What**: Finish the ongoing editorial pass. The Writer is currently on this task. Once complete, the series is publication-ready.
**Acceptance criteria**: All 8 posts finalized. Writer confirms no further changes needed. Series overview (Post 00) is consistent with all posts.
**Dependencies**: None (in progress now)

### Task 1.2: Choose a Publication Platform and Publish

**Agent**: Writer + Architect
**Effort**: M
**What**: Decide where to publish the blog series and get posts live. Options (in order of recommendation):

1. **GitHub Pages** (fastest, free, Markdown-native) — create a simple static site with Jekyll/Hugo from the blog-series directory. Can publish straight from the repo.
2. **Substack / Medium / dev.to** — wider audience, built-in discovery, but requires manual formatting and copy-paste from Markdown.
3. **Personal blog** — if the author already has one, lowest friction.

Key decisions:

- Publish all 8 at once (series dump) or staggered (1 per day / every 2 days)?
  - **Recommendation**: Publish Post 1 + Post 00 (series overview) on day 1, then one post per day. This builds anticipation and makes each post shareable individually.
- Interlink: each post should link to the previous and next post in the series.
- If GitHub Pages: set up a minimal Jekyll/Hugo site with posts from `data/reports/blog-series/`.
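The interlinking step can be scripted rather than done by hand. A minimal sketch, assuming the posts live as `01-*.md` through `08-*.md` in a single directory (the filename pattern and function name are assumptions, not the actual layout):

```python
from pathlib import Path

def add_nav_links(series_dir: str) -> None:
    """Append prev/next navigation links to each post in a series.

    Hypothetical helper: assumes posts sort lexically as 01-*.md .. 08-*.md;
    adjust the glob for the real blog-series layout.
    """
    posts = sorted(Path(series_dir).glob("0[1-8]-*.md"))
    for i, post in enumerate(posts):
        nav = []
        if i > 0:
            nav.append(f"[← Previous]({posts[i - 1].name})")
        if i < len(posts) - 1:
            nav.append(f"[Next →]({posts[i + 1].name})")
        # Separate the nav footer from the body with a horizontal rule
        post.write_text(post.read_text() + "\n\n---\n" + " | ".join(nav) + "\n")
```

Running this once after the final editorial pass keeps the links consistent if post order ever changes.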
**Acceptance criteria**: Post 1 is live and publicly accessible. Publication schedule for posts 2-8 is set.
**Dependencies**: Task 1.1

### Task 1.3: Social Media and IETF Community Outreach

**Agent**: Architect (drafts messaging) + Writer (final copy)
**Effort**: M
**What**: Write announcement copy and identify channels:

- **Short-form posts** (Twitter/X, LinkedIn, Mastodon, Bluesky): 3-4 variations highlighting different angles:
  - The growth stat + safety deficit angle ("IETF AI agent drafts went from 0.5% to 9.3% of all submissions in 15 months — with a 4:1 capability-to-safety ratio")
  - The geopolitics angle ("One company writes 18% of all AI agent standards drafts")
  - The fragmentation angle ("120 A2A protocols, 14 competing OAuth proposals, zero interop layer")
  - The methodology angle ("We analyzed 361 IETF drafts with Claude for ~$9 and found the standards world is building highways before traffic lights")
- **IETF mailing lists**: Post to relevant lists (art@ietf.org, agents@ietf.org if it exists, or general discussion) with a brief summary and link to Post 1
- **Hacker News / Reddit /r/networking /r/machinelearning**: Submit Post 1 or Post 4 (gaps — the most newsworthy finding)
- **Direct outreach**: Identify 5-10 people who would care most (IETF area directors, authors of top-ranked drafts like DAAP and VOLT, AI standards researchers, MCP/A2A protocol teams) and share directly

**Acceptance criteria**: Announcement text ready for 3+ platforms. List of 5-10 direct outreach targets with contact method. Post submitted to at least one community (HN or IETF list).
**Dependencies**: Task 1.2

### Task 1.4: Fetch Latest Drafts (IETF 122 Prep)

**Agent**: Analyst
**Effort**: S
**What**: Run `ietf fetch` to pick up any new drafts submitted since the last fetch (2026-03-03). IETF meetings drive submission spikes — capture anything new before the paper update. Process any new drafts through the full pipeline (analyze, authors, ideas, embed, gaps).
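The fetch-then-process sequence can be driven from a small script; a hedged sketch (the wrapper itself is hypothetical; only the stage names come from this plan):

```python
import subprocess

# Pipeline stages named in this task, in order. Whether each accepts
# extra flags is not assumed here — stages are invoked bare.
STAGES = ["fetch", "analyze", "authors", "ideas", "embed", "gaps"]

def run_pipeline() -> None:
    """Run each ietf CLI stage in sequence, stopping on the first failure."""
    for stage in STAGES:
        # check=True raises CalledProcessError if a stage exits non-zero,
        # so later stages never run against a partially processed database.
        subprocess.run(["ietf", stage], check=True)
```

Failing fast matters here: embeddings and gap detection assume the analysis stage completed for every draft.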
**Acceptance criteria**: Any new drafts since 2026-03-03 are fetched, stored, and fully processed. Count reported.
**Dependencies**: None (can run in parallel with publishing tasks)

### Task 1.5: Commit All Current Work

**Agent**: Coder
**Effort**: S
**What**: The git status shows extensive uncommitted work: modified source files (cli.py, db.py, analyzer.py, reports.py, visualize.py, models.py, config.py), new files (CLAUDE.md, orgs.py, scripts/, blog series), new reports, agent definitions. Create a clean commit (or a few topical commits) capturing the current state: v0.3.0 with expanded dataset, 8-post blog series, 7 new features, agent team artifacts.
**Acceptance criteria**: All meaningful changes committed. Working tree clean. Version bumped to 0.3.0 in pyproject.toml and cli.py.
**Dependencies**: Task 1.1 (want final polished blog posts in the commit)

### Sprint 1 Success Criteria

- [ ] Blog series editorial polish complete
- [ ] Blog Post 1 is live and publicly accessible
- [ ] Publication schedule set for remaining 7 posts
- [ ] Announcement posted to at least 2 channels
- [ ] Latest drafts fetched for IETF 122 coverage
- [ ] All current work committed to git (v0.3.0)

---

## Sprint 2: Update the arXiv Paper + Deepen Analysis

**Theme**: Turn the expanded dataset into a credible academic contribution. The paper at `paper/main.tex` is solid but frozen at 260 drafts — updating it to 361+ with the new analyses makes it publishable.

**Duration**: 5-7 days

### Task 2.1: Run Pipeline on Any Newly Fetched Drafts

**Agent**: Analyst
**Effort**: S-M
**What**: Process any new drafts picked up in Sprint 1 Task 1.4 (IETF 122 fetch). Run the full pipeline: analyze, authors, ideas, embed, gaps. Regenerate reports.
**Acceptance criteria**: All drafts in the database are fully analyzed. Reports refreshed.
**Dependencies**: Sprint 1 Task 1.4

### Task 2.2: Generate Updated Figures for the Paper

**Agent**: Analyst + Coder
**Effort**: M
**What**: The paper has placeholder figures ([TIMELINE], [QUALITY], [RADAR], [NETWORK]). Run `ietf viz all` to regenerate the HTML visualizations, then use `paper/export_figures.py` (or write one if it doesn't work) to export publication-quality PNG/PDF versions. Figures needed:

- Figure 1: Monthly submission timeline (stacked area by category) — with 361+ drafts
- Figure 2: Rating distributions by category (violin plots) — updated
- Figure 3: Similarity heatmap (now 361x361) — updated
- Figure 4: Quality vs. uniqueness scatter — updated
- Figure 5: Radar charts for top categories — updated
- Figure 6: Author collaboration network — updated with expanded dataset

**Acceptance criteria**: All 6 figures exported as PDF/PNG in `paper/figures/`. Each figure reflects the full expanded dataset.
**Dependencies**: Task 2.1

### Task 2.3: Update Paper Text and Tables

**Agent**: Writer + Architect
**Effort**: L
**What**: Update `paper/main.tex` throughout:

- **Title**: Change "260" to the current count
- **Abstract**: Update all numbers (drafts, authors, ideas, similarity pairs, tokens)
- **Section 3 (Methodology)**: Add the 6 new keywords (mcp, agentic, inference, generative, intelligent, aipref). Update seed keyword list from 6 to 12.
- **Section 4 (Dataset Overview)**: Update Table 1 with new numbers. Time span now extends further.
- **Section 5 (Findings)**: All tables and stats need updating with 361+ draft data. Particularly:
  - Table 2 (categories) — will likely have new entries from expanded keywords
  - Table 4 (organizations) — author/org counts will change
  - Table 5 (top-ranked drafts) — new top drafts may emerge
  - All prose stats (percentages, counts, growth rates)
- **Section 6 (Discussion)**: Revisit conclusions — do they still hold with 40% more data? Any new insights?
- **Section 7 (Future Work)**: Several items are now done (citation network via refs, gap-driven standardization). Update.
- **Section 8 (Conclusion)**: Update closing numbers
- **Appendix A**: Full category list will likely expand
- **Repository URL**: Fill in the [TODO] with actual URL after open-source release (or leave as "available upon publication")
- **References**: Add citations for MCP, any new relevant work

**Acceptance criteria**: Paper compiles cleanly with `make` (or `pdflatex`). All numbers match the current database. No [TODO] placeholders remain (except possibly author name/email if not yet decided).
**Dependencies**: Tasks 2.1, 2.2

### Task 2.4: Add New Analysis Sections to Paper

**Agent**: Architect + Analyst
**Effort**: M
**What**: The blog series uncovered findings that aren't in the paper. Consider adding:

- **Cross-organization convergence**: The idea-overlap analysis showing which technical ideas appear across multiple independent organizations (convergence signals vs. within-org duplication)
- **Team bloc analysis**: The 33 detected collaboration clusters and what they mean for coordination
- **RFC cross-reference graph**: Which foundational RFCs underpin the agent ecosystem (if `ietf refs` data is available)
- **Gap evolution**: Whether any of the 12 identified gaps are being filled by newer submissions

Pick the 1-2 most compelling additions that strengthen the paper's contribution without bloating it past 15 pages.

**Acceptance criteria**: 1-2 new subsections added. Paper is <= 16 pages. New sections have supporting data tables/figures.
**Dependencies**: Task 2.3

### Task 2.5: Submit to arXiv

**Agent**: Architect
**Effort**: S
**What**: Prepare the arXiv submission package:

- Compile the final PDF
- Ensure all figures are included
- Write the arXiv abstract (may differ slightly from paper abstract)
- Choose categories: cs.AI, cs.SE, or cs.CY (Computers and Society) — recommend cs.AI as primary, cs.SE as secondary
- Upload to arXiv

**Acceptance criteria**: Paper submitted to arXiv with no compilation errors. arXiv ID received.
**Dependencies**: Tasks 2.3, 2.4

### Task 2.6: Continue Blog Publication (Posts 2-8)

**Agent**: Writer
**Effort**: S (per post, ongoing)
**What**: Continue publishing the remaining blog posts on the schedule set in Sprint 1. For each post: format for the platform, add interlinks, publish, announce on social media.
**Acceptance criteria**: All 8 posts published by end of Sprint 2. Each post has at least one social media announcement.
**Dependencies**: Sprint 1 Task 1.2

### Sprint 2 Success Criteria

- [ ] arXiv paper updated with 361+ draft data and submitted
- [ ] All 8 blog posts published
- [ ] At least one new analysis (cross-org convergence or RFC cross-refs) added to paper
- [ ] Paper figures reflect the expanded dataset

---

## Sprint 3: Open-Source Release + Tool Polish

**Theme**: Make the tool usable by others. The analysis is interesting, but the *methodology* — combining LLM rating with embedding similarity at scale — is the reusable contribution. A clean open-source release turns a one-off project into a tool others can use to monitor IETF standardization activity.

**Duration**: 7-10 days

### Task 3.1: Add Tests

**Agent**: Coder
**Effort**: L
**What**: Currently zero tests.
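One pattern worth fixing before any test files exist: every external call goes through a mockable seam, so the suite never touches the network. A minimal stdlib-only sketch (the `parse_draft` function and the response shape are illustrative, not the tool's real API):

```python
from unittest.mock import Mock

def parse_draft(record: dict) -> dict:
    """Hypothetical stand-in for the real metadata parser:
    turns one Datatracker-style record into the fields we store."""
    return {
        "name": record["name"],
        "rev": record["rev"],
        "title": record.get("title", ""),
    }

def test_parse_draft_from_mocked_api():
    # The API client is a Mock, so the "response" is injected, not fetched.
    api = Mock()
    api.get_document.return_value = {"name": "draft-foo-agents-00", "rev": "00"}
    draft = parse_draft(api.get_document("draft-foo-agents-00"))
    assert draft["name"] == "draft-foo-agents-00"
    assert draft["title"] == ""
```

The same shape applies to the Claude and Ollama calls: inject the client, stub its return value, assert on the parsed result.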
Add a test suite covering the critical paths:

- `tests/test_db.py` — Database operations (upsert, query, FTS search, embedding storage)
- `tests/test_config.py` — Config load/save, defaults, overrides
- `tests/test_models.py` — Dataclass creation, serialization
- `tests/test_fetcher.py` — Mock Datatracker API responses, parse draft metadata
- `tests/test_analyzer.py` — Mock Claude API, parse ratings/ideas from responses
- `tests/test_cli.py` — Click CLI integration tests (using CliRunner)
- `tests/test_reports.py` — Report generation produces valid markdown

Use pytest. Mock all external APIs (Datatracker, Claude, Ollama). Include a small test fixture database with ~5 drafts. Add to pyproject.toml:

```toml
[project.optional-dependencies]
dev = ["pytest>=7.0", "pytest-cov", "responses>=0.25"]
```

**Acceptance criteria**: `pytest` passes with >= 70% coverage on db.py, config.py, models.py. All tests run without network access.
**Dependencies**: None

### Task 3.2: Set Up CI/CD (GitHub Actions)

**Agent**: Coder
**Effort**: M
**What**: Create `.github/workflows/ci.yml`:

- Trigger on push to main and PRs
- Python 3.11+ matrix
- Install deps, run pytest
- Run linting (ruff or flake8)
- Optionally: build the package and check it installs cleanly

**Acceptance criteria**: CI green on current main. PRs get automatic test runs. Badge in README.
**Dependencies**: Task 3.1

### Task 3.3: Clean Up Code for Public Release

**Agent**: Coder
**Effort**: M
**What**: Audit and fix:

- **Version**: Update from 0.1.0 to 0.3.0 (0.2.0 was the viz release, 0.3.0 is the expanded+blog release) in both `pyproject.toml` and `cli.py`
- **Config portability**: `config.py` uses `Path(__file__).resolve().parent.parent.parent / "data"` as default — this only works in dev mode. For installed packages, default to `~/.ietf-analyzer/` or use XDG conventions. Keep current behavior as fallback for dev.
- **Remove stale files**: `data/ietf_drafts.db` (empty/stale per MEMORY.md). Clean up any other debris.
- **Error handling at boundaries**: Ensure graceful failures when:
  - Ollama is not running (`ietf embed` should say "Ollama not reachable", not crash)
  - ANTHROPIC_API_KEY is not set (`ietf analyze` should say "Set ANTHROPIC_API_KEY", not crash)
  - Network is unavailable (`ietf fetch` should handle timeouts)
- **Type hints**: Ensure public API functions have type hints (don't add to internal helpers)
- **.env support**: Already uses python-dotenv — verify it works from any working directory
- **Debug leftovers**: Remove any debug prints or commented-out code

**Acceptance criteria**: Tool installs cleanly via `pip install .` in a fresh venv. Commands that need external services give clear error messages when those services are unavailable.
**Dependencies**: None

### Task 3.4: Write Proper README and Docs

**Agent**: Writer + Coder
**Effort**: M
**What**: Rewrite `README.md` for an open-source audience:

- Project description with the hook (what it does, why it matters)
- Updated stats (361+ drafts, 557+ authors, 1,780+ ideas)
- Installation: pip install, API key setup, Ollama setup
- Quick start: the 6-command pipeline
- Full CLI reference (all 24 commands with options)
- Configuration: explain config.json, environment variables, defaults
- Architecture overview: how the modules fit together (fetcher -> db -> analyzer -> reports/viz)
- Screenshots/examples of output (report excerpts, viz screenshots)
- Contributing guide (CONTRIBUTING.md)
- Link to the blog series and arXiv paper
- License badge, CI badge

**Acceptance criteria**: A newcomer can go from `git clone` to running their first analysis by following the README. No broken links.
**Dependencies**: Task 3.3

### Task 3.5: Add LICENSE

**Agent**: Coder
**Effort**: S
**What**: Choose and add a license.
Recommendations:

- **MIT** — simplest, most permissive, standard for tools like this
- **Apache 2.0** — if patent protection matters

Add to pyproject.toml:

```toml
license = {text = "MIT"}
```

**Acceptance criteria**: LICENSE file in repo root. License field in pyproject.toml. License header convention documented.
**Dependencies**: None

### Task 3.6: Prepare PyPI Package

**Agent**: Coder
**Effort**: M
**What**: Make the package installable from PyPI:

- Update pyproject.toml with full metadata (author, URL, classifiers, keywords, long_description from README)
- Ensure `python -m build` produces a clean sdist and wheel
- Test install from the built wheel in a fresh venv
- Write a publish script: `scripts/publish.sh` using twine
- Register the package name on PyPI (or Test PyPI first)

```toml
[project]
name = "ietf-draft-analyzer"
version = "0.3.0"
description = "Track, categorize, and rate AI/agent-related IETF Internet-Drafts"
readme = "README.md"
license = {text = "MIT"}
keywords = ["ietf", "internet-draft", "ai-agents", "standards", "nlp"]
classifiers = [
    "Development Status :: 3 - Alpha",
    "Environment :: Console",
    "Intended Audience :: Science/Research",
    "Topic :: Scientific/Engineering :: Artificial Intelligence",
    "Programming Language :: Python :: 3.11",
]
```

**Acceptance criteria**: `pip install ietf-draft-analyzer` works from PyPI (or Test PyPI). The `ietf` command is available after install.
**Dependencies**: Tasks 3.3, 3.4, 3.5

### Task 3.7: Public Repository Setup

**Agent**: Coder + Architect
**Effort**: S
**What**: Prepare the GitHub repository for public visibility:

- Ensure `.gitignore` covers: `.env`, `__pycache__`, `*.egg-info`, `data/config.json`, `data/drafts.db` (the actual user data shouldn't be in the repo — but the schema should be recreatable)
- Add `CONTRIBUTING.md` with: how to set up the dev environment, how to run tests, code style, PR process
- Add issue templates (bug report, feature request)
- Update the paper's [TODO] repository URL
- Decide: include `data/drafts.db` in the repo (useful for reproducibility) or .gitignore it (cleaner)?
  - **Recommendation**: Include a snapshot as a GitHub Release asset, not in the git tree. Add a `scripts/download-data.sh` that fetches it.
- Set repo topics: ietf, internet-drafts, ai-agents, standardization, nlp, cli

**Acceptance criteria**: Repo is ready to be made public. No secrets or personal data in git history. README renders correctly on GitHub.
**Dependencies**: Tasks 3.4, 3.5, 3.6

### Sprint 3 Success Criteria

- [ ] Test suite with >= 70% coverage on core modules
- [ ] CI/CD running on GitHub Actions
- [ ] Tool installable via `pip install ietf-draft-analyzer`
- [ ] README guides a newcomer from clone to first analysis
- [ ] LICENSE file present (MIT)
- [ ] Repository is public-ready (no secrets, clean history, proper .gitignore)

---

## Cross-Sprint Dependencies

```
Sprint 1                     Sprint 2                     Sprint 3
────────                     ────────                     ────────
1.1 Final polish ──────────> 1.2 Publish
1.2 Publish ───────────────> 2.6 Continue publishing
1.4 Fetch latest ──────────> 2.1 Process new drafts
1.5 Commit v0.3.0 ─────────> 3.3 Code cleanup
2.1 Process new ───────────> 2.2 Figures
2.2 Figures ───────────────> 2.3 Paper text
2.3 Paper ─────────────────> 2.4 New sections
2.4 New sections ──────────> 2.5 Submit arXiv
2.5 Submit arXiv ──────────> 3.7 Add arXiv URL to README
2.2 Figures ───────────────> 3.4 Screenshots in README
3.1 Tests ─────────────────> 3.2 CI/CD
3.3 Cleanup ───────────────> 3.6 PyPI
3.5 License ───────────────> 3.6 PyPI
```

## Parallelization Plan

### Sprint 1 (3-5 days)

- **Day 1**: Writer finishes editorial polish (1.1). Analyst fetches latest drafts (1.4, parallel). Coder prepares commit (1.5, parallel).
- **Day 2**: Architect + Writer choose publication platform and set up (1.2). Architect drafts social media copy (1.3, start).
- **Day 3-4**: Publish Post 1 (1.2). Finalize social media copy (1.3). Begin announcing.
- **Day 5**: First post live, schedule confirmed, announcements out.

### Sprint 2 (5-7 days)

- **Day 1-2**: Analyst processes new drafts (2.1) + generates figures (2.2). Writer publishes posts 2-4 (2.6).
- **Day 3-5**: Writer + Architect update paper text (2.3). Architect + Analyst add new sections (2.4). Writer publishes posts 5-7 (2.6).
- **Day 6-7**: Architect prepares arXiv submission (2.5). Writer publishes post 8 (2.6).

### Sprint 3 (7-10 days)

- **Day 1-3**: Coder writes tests (3.1) and cleans code (3.3) in parallel.
Coder adds LICENSE (3.5).
- **Day 4-5**: Coder sets up CI (3.2). Writer + Coder write README/docs (3.4).
- **Day 6-8**: Coder prepares PyPI package (3.6). Coder + Architect prepare public repo (3.7).
- **Day 9-10**: Final testing, publish to PyPI, make repo public.

---

## Risk Register

| Risk | Impact | Mitigation |
|------|--------|------------|
| Blog stats shift significantly with 101 new drafts | M — requires rewriting parts of posts | Run the fetch-and-process pipeline first (1.4), check before publishing |
| Ollama not available for embeddings | M — blocks pipeline | Can proceed without embeddings; analyze + ideas work independently |
| arXiv paper exceeds page limit after updates | L — need to trim | Set hard limit at 16 pages; prioritize 1-2 new sections max |
| PyPI name `ietf-draft-analyzer` taken | L — just rename | Check name availability early in Sprint 3 |
| IETF 122 generates a burst of new drafts mid-sprint | L — good problem to have | Fetch and process incrementally; blog posts note the snapshot date |

---

## What Value Comes When

| Milestone | Sprint | Impact |
|-----------|--------|--------|
| Blog Post 1 live | Sprint 1 | First public visibility; test the messaging |
| Full blog series published | Sprint 2 | Complete narrative in front of the community |
| arXiv paper submitted | Sprint 2 | Academic credibility; citeable artifact |
| Open-source tool on PyPI | Sprint 3 | Others can reproduce and extend the analysis |
| Public GitHub repo | Sprint 3 | Community contributions; methodology visible |

The highest-value action is publishing the blog series NOW (Sprint 1). The landscape is moving — every week of delay makes the analysis slightly less current. The paper and open-source release build credibility and longevity but can follow.