Platform upgrade: semantic search, citations, readiness, tests, Docker

Major features added by 5 parallel agent teams:
- Semantic "Ask" (NL queries via FTS5 + embeddings + Claude synthesis)
- Global search across drafts, ideas, authors, gaps
- REST API expansion (14 endpoints, up from 3) with CSV/JSON export
- Citation graph visualization (D3.js, 440 nodes, 2422 edges)
- Standards readiness scoring (0-100 composite from 6 factors)
- Side-by-side draft comparison view with shared/unique analysis
- Annotation system (notes + tags per draft, DB-persisted)
- Docker deployment (Dockerfile + docker-compose with Ollama)
- Scheduled updates (cron script with log rotation)
- Pipeline health dashboard (stage progress bars, cost tracking)
- Test suite foundation (54 pytest tests covering DB, models, web data)
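
The readiness score is described here only as a 0-100 composite from 6 factors; the factors and their weights are not named. As a minimal sketch of how such a composite could be computed, assuming six hypothetical factors that are each normalized to the range 0 to 1 and weighted equally (the names and weights below are illustrative, not taken from the code):

```python
from dataclasses import dataclass, fields


# Hypothetical factor names; the commit message only says "6 factors".
@dataclass(frozen=True)
class ReadinessFactors:
    revision_count: float
    author_diversity: float
    citation_support: float
    rating_score: float
    wg_adoption: float
    recency: float


# Hypothetical equal weights (an assumption); they sum to 1.0.
WEIGHTS = {f.name: 1 / 6 for f in fields(ReadinessFactors)}


def readiness_score(factors: ReadinessFactors) -> int:
    """Weighted mean of the six factors, each clamped to [0, 1], scaled to 0-100."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(max(getattr(factors, name), 0.0), 1.0)
        total += weight * value
    return round(total * 100)
```

Clamping each factor before weighting keeps the result inside 0-100 even if an upstream normalization step produces an out-of-range value.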

Fixes: compare_drafts() stubbed→working, get_authors_for_draft() bug,
source-aware analysis prompts, config env var overrides + validation,
resilient batch error handling with --retry-failed, observatory --dry-run
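
The "config env var overrides + validation" fix could look roughly like the following. This is a sketch under stated assumptions: the `IETF_` prefix, the `Config` fields other than the `data/drafts.db` path, and the `load_config` helper are hypothetical and are not confirmed by this commit.

```python
import os
from dataclasses import dataclass


@dataclass
class Config:
    # Hypothetical fields and defaults, for illustration only.
    db_path: str = "data/drafts.db"
    batch_size: int = 10


def load_config(env=None, prefix="IETF_"):
    """Start from defaults, apply PREFIX-named overrides from env, then validate."""
    env = os.environ if env is None else env
    cfg = Config()

    raw = env.get(prefix + "DB_PATH")
    if raw is not None:
        cfg.db_path = raw

    raw = env.get(prefix + "BATCH_SIZE")
    if raw is not None:
        try:
            cfg.batch_size = int(raw)
        except ValueError:
            raise ValueError(
                f"{prefix}BATCH_SIZE must be an integer, got {raw!r}"
            ) from None

    # Validate after all overrides so bad defaults are caught as well.
    if not cfg.db_path:
        raise ValueError("db_path must not be empty")
    if cfg.batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    return cfg
```

Passing a plain dict as `env` makes the override logic easy to exercise in pytest without touching the real environment.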

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 20:52:56 +01:00
parent da2a989744
commit 757b781c67
33 changed files with 4253 additions and 170 deletions


@@ -45,8 +45,34 @@ ietf viz all
# Open the interactive browser
xdg-open data/figures/browser.html
# Launch the web dashboard
./scripts/run-webui.sh
```
## Web Dashboard
A full interactive dashboard at `http://127.0.0.1:5000` with 8 pages:
```bash
# Start the dashboard
./scripts/run-webui.sh
# or: python src/webui/app.py
```
| Page | What it shows |
|------|---------------|
| **Overview** | Stat cards, score histogram, category donut, submission timeline, category radar |
| **Draft Explorer** | Searchable/filterable/sortable table of all drafts with category pills and score badges |
| **Draft Detail** | Individual draft view with score ring, dimension bars, ideas, references, and linked authors |
| **Ratings** | Score distributions, dimension box plots, category radar, novelty vs maturity scatter, top-20 leaderboard |
| **Landscape** | t-SNE embedding map, quality quadrants, violin plots by category |
| **Authors** | Co-authorship force-directed graph, organization charts, cross-org collaboration |
| **Ideas** | Extracted ideas grouped by type with search |
| **Gaps** | Gap cards sorted by severity with links to related drafts |
Charts are interactive (Plotly.js): click a data point to navigate to that draft's detail page, or click a category to filter.
## CLI Commands
### Core Pipeline
@@ -182,6 +208,7 @@ The safety deficit is the most striking finding — only **12.3%** of categorize
- **SQLite** with FTS5 full-text search and WAL mode
- **Anthropic Claude** (Sonnet 4) for analysis, rating, idea extraction, gap analysis
- **Ollama** (nomic-embed-text) for local embeddings and similarity
- **Flask** with Jinja2 for the interactive web dashboard
- **Plotly** for interactive HTML visualizations
- **Matplotlib/Seaborn** for publication-ready static figures
- **NetworkX** for author collaboration graph analysis
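
As an illustration of the SQLite entry in the list above, here is a minimal sketch of opening a database in WAL mode and querying an FTS5 index with Python's standard `sqlite3` module. The table name `drafts_fts`, its columns, and the helper functions are hypothetical and are not taken from the project's schema. The sketch also assumes the local SQLite build includes FTS5, which is common but not guaranteed.

```python
import sqlite3


def open_db(path):
    """Open a SQLite database, request WAL mode, and ensure an FTS5 index exists."""
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while a writer is active (file-backed databases only).
    conn.execute("PRAGMA journal_mode=WAL")
    # Hypothetical table and column names, for illustration only.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS drafts_fts "
        "USING fts5(name, title, abstract)"
    )
    return conn


def search(conn, query, limit=10):
    """Return (name, title) rows matching an FTS5 query, best match first."""
    return conn.execute(
        "SELECT name, title FROM drafts_fts "
        "WHERE drafts_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
```

Binding the query as a parameter avoids SQL injection, though the string is still interpreted as FTS5 query syntax, so user input containing operators or quotes may need escaping.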
@@ -203,6 +230,11 @@ src/ietf_analyzer/
draftgen.py # Internet-Draft generation from gap analysis
config.py # Configuration with defaults
src/webui/
app.py # Flask application with all routes
data.py # Data access layer (stats, filtering, t-SNE, network graphs)
templates/ # Jinja2 templates (base + 8 page templates)
data/
drafts.db # SQLite database (all analysis data)
reports/ # Generated markdown reports