e7527ad68e
Fix remaining critical, high, and medium issues from 4-perspective review
...
Critical fixes:
- Fix rating clamp range 1-10 → 1-5 (actual scale)
- Add `ietf ideas convergence` command (SequenceMatcher at 0.75 threshold)
- Fix "628 cross-org ideas" → 130 (verified from current DB) across 8 files
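For reference, the convergence check named above can be sketched with the standard library's `difflib.SequenceMatcher`. This is only an illustration under assumptions: the `(org, title)` input shape, the `normalize` helper, and the function name `convergent_pairs` are hypothetical; only the 0.75 threshold comes from the commit message.

```python
from difflib import SequenceMatcher
from itertools import combinations

THRESHOLD = 0.75  # threshold named in the commit message


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(title.lower().split())


def convergent_pairs(ideas, threshold=THRESHOLD):
    """Return (org_a, org_b, title_a, title_b, ratio) for similar cross-org ideas.

    `ideas` is a list of (org, title) tuples; this shape is an assumption.
    """
    pairs = []
    for (org_a, title_a), (org_b, title_b) in combinations(ideas, 2):
        if org_a == org_b:
            continue  # only cross-organization matches count as convergence
        ratio = SequenceMatcher(None, normalize(title_a), normalize(title_b)).ratio()
        if ratio >= threshold:
            pairs.append((org_a, org_b, title_a, title_b, round(ratio, 2)))
    return pairs
```

Pairwise comparison is quadratic in the number of ideas, which is likely acceptable at a few hundred ideas but would need blocking or indexing at larger scale.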
Security fixes:
- Sanitize FTS5 query input (strip special chars + boolean operators)
- Add rate limiting (10 req/min/IP) on Claude-calling endpoints
- Change <path:name> → <string:name> on draft routes
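A minimal sketch of the FTS5 sanitization described above, assuming the goal is to reduce user input to plain search terms. The function name `sanitize_fts5_query` and the exact character policy are assumptions, not the project's actual code. Operators are dropped regardless of case, which is stricter than FTS5 itself requires.

```python
import re

# Bare words that FTS5 query syntax treats as operators.
_FTS5_OPERATORS = {"AND", "OR", "NOT", "NEAR"}


def sanitize_fts5_query(raw: str) -> str:
    """Reduce user input to plain search terms that are safe to pass to MATCH."""
    # Replace everything except word characters and whitespace with a space;
    # this drops quotes, parentheses, '*', ':', '^', '-' and similar syntax.
    cleaned = re.sub(r"[^\w\s]", " ", raw)
    # Drop operator keywords (case-insensitively, a deliberately strict choice).
    terms = [t for t in cleaned.split() if t.upper() not in _FTS5_OPERATORS]
    return " ".join(terms)
```

The sanitized string would still be passed as a bound parameter (for example `... MATCH ?`) rather than interpolated into the SQL text.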
Codebase fixes:
- Add Database context manager (__enter__/__exit__)
- Wire false_positive filtering into queries (exclude by default in web UI)
- Fix Post 3 arithmetic ("~300" → "~409" distinct proposals)
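The context-manager change above might look roughly like the following. This is a sketch only: the real `Database` class is not shown in the commit, so the constructor signature and the commit-or-rollback policy on exit are assumptions.

```python
import sqlite3


class Database:
    """Thin wrapper around a SQLite connection that closes itself on exit."""

    def __init__(self, path: str = ":memory:"):
        self.conn = sqlite3.connect(path)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Commit on a clean exit, roll back if the block raised, always close.
        try:
            if exc_type is None:
                self.conn.commit()
            else:
                self.conn.rollback()
        finally:
            self.conn.close()
        return False  # do not swallow exceptions
```

With this in place, callers can write `with Database(path) as db: ...` and no longer need to remember an explicit `close()`.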
Content & licensing:
- Add MIT LICENSE file
- Add IPR/FRAND notes (BCP 79, RFC 8179) to Posts 03 and 07
- Qualify "4:1 safety ratio" with monthly variation in 6 remaining files
- Add "Data as of March 2026" freeze-date headers to all 10 blog posts
- Hedge causal language in Post 04
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 12:47:47 +01:00
439424bd04
Fix security, data integrity, and accuracy issues from 4-perspective review
...
Security fixes:
- Fix SQL injection in db.py:update_generation_run (column name whitelist)
- Read Flask SECRET_KEY from env var instead of hardcoding it
- Add LLM rating bounds validation (_clamp_rating, 1-10)
- Fix JSON extraction trailing whitespace handling
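The column-name whitelist fix can be illustrated as follows. Column names cannot be bound as SQL parameters, so they are checked against a fixed set before being placed in the statement. The table name `generation_runs` and the column names here are hypothetical; the real schema and signature of `update_generation_run` are not shown in the commit.

```python
import sqlite3

# Hypothetical column names; the real schema is not shown in the commit.
ALLOWED_COLUMNS = {"status", "quality_score", "output_path"}


def update_generation_run(conn, run_id, **fields):
    """Update only whitelisted columns; values always go through placeholders."""
    unknown = set(fields) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unknown column(s): {sorted(unknown)}")
    if not fields:
        return
    # Safe to interpolate: every column name was checked against the whitelist.
    assignments = ", ".join(f"{col} = ?" for col in fields)
    conn.execute(
        f"UPDATE generation_runs SET {assignments} WHERE id = ?",
        (*fields.values(), run_id),
    )
```

Rejecting unknown names outright, rather than silently dropping them, makes a caller bug or an injection attempt visible immediately.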
Data integrity:
- Normalize 21 legacy category names to 11 canonical short forms
- Add false_positive column, flag 73 non-AI drafts (361 relevant remain)
- Document verified counts: 434 total/361 relevant drafts, 557 authors, 419 ideas, 11 gaps
Code quality:
- Fix version string 0.1.0 → 0.2.0
- Add close()/context manager to Embedder class
- Compute matrix size dynamically instead of hardcoding "260x260"
Blog accuracy:
- Fix EU AI Act timeline (enforcement Aug 2026, not "18 months")
- Distinguish OAuth consent from GDPR Einwilligung
- Add EU AI Act Annex III context to hospital scenario
- Add FIPA, eIDAS 2.0 references where relevant
Methodology:
- Add methodology.md documenting pipeline, limitations, rating rubric
- Add LLM-as-judge caveats to analyzer.py
- Document clustering threshold rationale
Reviews from: legal (German/EU law), statistics, development, science perspectives.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 10:52:33 +01:00
6e3a387778
Idea quality pipeline, web UI features, academic paper
...
- Tighten idea extraction prompts (1-4 ideas, no sub-features) reducing
1,907 ideas to 468 across 434 drafts (75% reduction)
- Add embedding-based dedup (ietf dedup-ideas) for same-draft similarity
- Add novelty scoring (ietf ideas score) and filtering (ietf ideas filter)
using Claude to rate ideas 1-5, removing 49 generic building blocks
- Final count: 419 high-quality ideas (avg 1.1/draft)
- Web UI: gap explorer with live draft generation and pre-generated demos
- Web UI: D3.js author collaboration network (498 nodes, 1142 edges,
68 clusters, org filtering, interactive zoom/pan)
- Academic paper: 15-page LaTeX workshop paper analyzing the 434-draft
AI agent standards landscape
- Save improvement ideas backlog to data/reports/improvement-ideas.md
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 22:17:57 +01:00
404092b938
Generate 5-draft ecosystem family, fix formatter markdown stripping
...
Pipeline output:
- ABVP: Agent Behavior Verification Protocol (quality 3.0/5)
- AEM: Privacy-Preserving Agent Learning Protocol (quality 2.1/5)
- ATD: Agent Task DAG Framework (quality 2.5/5)
- HITL: Human-in-the-Loop Primitives (quality 2.4/5)
- AEPB: Real-Time Agent Rollback Protocol (quality 2.5/5)
- APAE: Agent Provenance Assurance Ecosystem (quality 2.5/5)
Quality gates: all pass novelty + references, format gate improved
with markdown stripping (_strip_markdown) and dynamic header padding.
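As an illustration of the markdown-stripping step mentioned above, a regex-based pass could look like this. It is a sketch under assumptions: the real `_strip_markdown` is not shown, and the set of constructs handled here (headings, bold, italics, inline code, links) is a guess at what LLM output typically contains.

```python
import re


def strip_markdown(text: str) -> str:
    """Remove common Markdown markup so the text reads as plain prose."""
    # ATX headings: drop the leading '#' run at the start of a line.
    text = re.sub(r"^#{1,6}[ \t]+", "", text, flags=re.MULTILINE)
    # Bold first, then italics, so '**x**' is not half-matched as '*x*'.
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)
    text = re.sub(r"\*(\S[^*\n]*)\*", r"\1", text)
    # Inline code: keep the content, drop the backticks.
    text = re.sub(r"`([^`]*)`", r"\1", text)
    # Links: keep the label, drop the target.
    text = re.sub(r"\[([^\]]+)\]\([^)]+\)", r"\1", text)
    return text
```

A regex pass like this is not a full Markdown parser; nested or unusual markup may survive and would need a real parser to handle reliably.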
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 01:42:30 +01:00
7a1aa346b9
Observatory update: 434 docs, fix W3C fetcher, regenerate dashboard
...
- Fixed W3C fetcher to paginate /specifications endpoint (group
endpoints use type prefixes like cg/, wg/ that weren't in config)
- Fetched 72 new IETF drafts + 1 W3C spec, all analyzed and embedded
- Regenerated dashboard with updated data
- Total: 434 docs, 11 gaps, 1907 ideas
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 01:09:30 +01:00
d6beb9c0a0
v0.3.0: Gap-to-Draft pipeline, Living Standards Observatory, blog series
...
Gap-to-Draft Pipeline (ietf pipeline):
- Context builder assembles ideas, RFC foundations, similar drafts, ecosystem vision
- Generator produces outlines + sections using rich context with Claude
- Quality gates: novelty (embedding similarity), references, format, self-rating
- Family coordinator generates 5-draft ecosystem (AEM/ATD/HITL/AEPB/APAE)
- I-D formatter with proper headers, references, 72-char wrapping
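The novelty gate listed above (embedding similarity) can be sketched as a cosine-similarity check against existing draft embeddings. The cutoff value, the function names, and the plain-list vector representation are all assumptions; the commit does not state how the pipeline implements or tunes this gate.

```python
import math

# Hypothetical cutoff: the commit does not state the value the pipeline uses.
MAX_SIMILARITY = 0.85


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def novelty_gate(candidate, existing, max_similarity=MAX_SIMILARITY):
    """Pass when the candidate is not too close to any existing draft embedding.

    Returns (passed, highest_similarity).
    """
    highest = max((cosine(candidate, e) for e in existing), default=0.0)
    return highest < max_similarity, highest
```

Returning the highest similarity alongside the pass/fail flag lets a report show how close a generated draft came to existing work, not just whether it cleared the gate.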
Living Standards Observatory (ietf observatory):
- Source abstraction with IETF + W3C fetchers
- 7-step update pipeline: snapshot, fetch, analyze, embed, ideas, gaps, record
- Static GitHub Pages dashboard (explorer, gap tracker, timeline)
- Weekly CI/CD automation via GitHub Actions
Also includes:
- 361 drafts (expanded from 260 with 6 new keywords), 403 authors, 1,262 ideas, 12 gaps
- Blog series (8 posts planned), reports, arXiv paper figures
- Agent team infrastructure (CLAUDE.md, scripts, dev journal)
- 5 new DB tables, schema migration, ~15 new query methods
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 00:48:57 +01:00
be9cf9c5d9
v0.2.0: visualizations, interactive browser, arXiv paper, gap analysis
...
New features:
- 12 interactive visualizations (ietf viz): t-SNE landscape, similarity
heatmap, score distributions, timeline, bubble explorer, radar charts,
author network graph, category treemap, quality vs overlap, org bar chart,
ideas chart, and interactive draft browser
- Interactive draft browser (browser.html): filterable by category, keyword,
score sliders with sortable table and expandable detail rows
- arXiv paper (paper/main.tex): 13-page manuscript with all findings
- Gap analysis: 12 identified under-addressed areas
- Author network: collaboration graph, org contributions, cross-org analysis
- Draft generation from gaps (ietf draft-gen)
- Auto-load .env for API keys (python-dotenv)
New modules: visualize.py, authors.py, draftgen.py
New reports: timeline, overlap-matrix, authors, gaps
New deps: plotly, matplotlib, seaborn, scipy, scikit-learn, networkx, python-dotenv
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 13:37:55 +01:00
f44f9265bd
Add SQLite database with 260 analyzed drafts
...
Includes all draft metadata, full text, Claude ratings (cached),
and nomic-embed-text embeddings. This is the expensive data:
~114k tokens of Claude analysis plus 260 Ollama embeddings.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 00:49:18 +01:00