Commit Graph

6 Commits

Author SHA1 Message Date
e7527ad68e Fix remaining critical, high, and medium issues from 4-perspective review
Critical fixes:
- Fix rating clamp range 1-10 → 1-5 (actual scale)
- Add `ietf ideas convergence` command (SequenceMatcher at 0.75 threshold)
- Fix "628 cross-org ideas" → 130 (verified from current DB) across 8 files

Security fixes:
- Sanitize FTS5 query input (strip special chars + boolean operators)
- Add rate limiting (10 req/min/IP) on Claude-calling endpoints
- Change <path:name> → <string:name> on draft routes

Codebase fixes:
- Add Database context manager (__enter__/__exit__)
- Wire false_positive filtering into queries (exclude by default in web UI)
- Fix Post 3 arithmetic ("~300" → "~409" distinct proposals)

Content & licensing:
- Add MIT LICENSE file
- Add IPR/FRAND notes (BCP 79, RFC 8179) to Posts 03 and 07
- Qualify "4:1 safety ratio" with monthly variation in 6 remaining files
- Add "Data as of March 2026" freeze-date headers to all 10 blog posts
- Hedge causal language in Post 04

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 12:47:47 +01:00
f1a0b0264c Fix blog accuracy and add methodology documentation
Blog posts (all 10 files updated):
- Update all counts to match DB: 434 drafts, 557 authors, 419 ideas, 11 gaps
- Fix EU AI Act timeline to August 2026 (5 months, not 18)
- Reframe growth claim from "36x" to actual monthly figures (5→61→85)
- Add safety ratio nuance (1.5:1 to 21:1 monthly variation)
- Fix composite scores (4.8→4.75, 4.6→4.5)
- Add OAuth/GDPR consent distinction (Art. 6(1)(a), Art. 28)
- Add EU AI Act Annex III + MDR context to hospital scenario
- Add FIPA, IEEE P3394, eIDAS 2.0 references
- Add GDPR gap paragraph (DPIA, erasure, portability, purpose limitation)
- Rewrite Post 04 gap table to match actual DB gap names

Methodology:
- Expand methodology.md: pipeline docs, limitations, related work
- Add LLM-as-judge caveats and explicit rating rubric to analyzer.py
- Add clustering threshold rationale to embeddings.py
- Add gap analysis grounding notes to analyzer.py
- Add Limitations section to Post 07

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 11:04:40 +01:00
439424bd04 Fix security, data integrity, and accuracy issues from 4-perspective review
Security fixes:
- Fix SQL injection in db.py:update_generation_run (column name whitelist)
- Flask SECRET_KEY from env var instead of hardcoded
- Add LLM rating bounds validation (_clamp_rating, 1-10)
- Fix JSON extraction trailing whitespace handling

Data integrity:
- Normalize 21 legacy category names to 11 canonical short forms
- Add false_positive column, flag 73 non-AI drafts (361 relevant remain)
- Document verified counts: 434 total/361 relevant drafts, 557 authors, 419 ideas, 11 gaps

Code quality:
- Fix version string 0.1.0 → 0.2.0
- Add close()/context manager to Embedder class
- Dynamic matrix size instead of hardcoded "260x260"

Blog accuracy:
- Fix EU AI Act timeline (enforcement Aug 2026, not "18 months")
- Distinguish OAuth consent from GDPR Einwilligung
- Add EU AI Act Annex III context to hospital scenario
- Add FIPA, eIDAS 2.0 references where relevant

Methodology:
- Add methodology.md documenting pipeline, limitations, rating rubric
- Add LLM-as-judge caveats to analyzer.py
- Document clustering threshold rationale

Reviews from: legal (German/EU law), statistics, development, science perspectives.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 10:52:33 +01:00
757b781c67 Platform upgrade: semantic search, citations, readiness, tests, Docker
Major features added by 5 parallel agent teams:
- Semantic "Ask" (NL queries via FTS5 + embeddings + Claude synthesis)
- Global search across drafts, ideas, authors, gaps
- REST API expansion (14 endpoints, up from 3) with CSV/JSON export
- Citation graph visualization (D3.js, 440 nodes, 2422 edges)
- Standards readiness scoring (0-100 composite from 6 factors)
- Side-by-side draft comparison view with shared/unique analysis
- Annotation system (notes + tags per draft, DB-persisted)
- Docker deployment (Dockerfile + docker-compose with Ollama)
- Scheduled updates (cron script with log rotation)
- Pipeline health dashboard (stage progress bars, cost tracking)
- Test suite foundation (54 pytest tests covering DB, models, web data)

Fixes: compare_drafts() stubbed→working, get_authors_for_draft() bug,
source-aware analysis prompts, config env var overrides + validation,
resilient batch error handling with --retry-failed, observatory --dry-run

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 20:52:56 +01:00
6e3a387778 Idea quality pipeline, web UI features, academic paper
- Tighten idea extraction prompts (1-4 ideas, no sub-features) reducing
  1,907 ideas to 468 across 434 drafts (78% reduction)
- Add embedding-based dedup (ietf dedup-ideas) for same-draft similarity
- Add novelty scoring (ietf ideas score) and filtering (ietf ideas filter)
  using Claude to rate ideas 1-5, removing 49 generic building blocks
- Final count: 419 high-quality ideas (avg 1.1/draft)
- Web UI: gap explorer with live draft generation and pre-generated demos
- Web UI: D3.js author collaboration network (498 nodes, 1142 edges,
  68 clusters, org filtering, interactive zoom/pan)
- Academic paper: 15-page LaTeX workshop paper analyzing the 434-draft
  AI agent standards landscape
- Save improvement ideas backlog to data/reports/improvement-ideas.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 22:17:57 +01:00
d6beb9c0a0 v0.3.0: Gap-to-Draft pipeline, Living Standards Observatory, blog series
Gap-to-Draft Pipeline (ietf pipeline):
- Context builder assembles ideas, RFC foundations, similar drafts, ecosystem vision
- Generator produces outlines + sections using rich context with Claude
- Quality gates: novelty (embedding similarity), references, format, self-rating
- Family coordinator generates 5-draft ecosystem (AEM/ATD/HITL/AEPB/APAE)
- I-D formatter with proper headers, references, 72-char wrapping

Living Standards Observatory (ietf observatory):
- Source abstraction with IETF + W3C fetchers
- 7-step update pipeline: snapshot, fetch, analyze, embed, ideas, gaps, record
- Static GitHub Pages dashboard (explorer, gap tracker, timeline)
- Weekly CI/CD automation via GitHub Actions

Also includes:
- 361 drafts (expanded from 260 with 6 new keywords), 403 authors, 1,262 ideas, 12 gaps
- Blog series (8 posts planned), reports, arXiv paper figures
- Agent team infrastructure (CLAUDE.md, scripts, dev journal)
- 5 new DB tables, schema migration, ~15 new query methods

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 00:48:57 +01:00