Fix security, data integrity, and accuracy issues from 4-perspective review

Security fixes: - Fix SQL injection in db.py:update_generation_run (column name whitelist) - Flask SECRET_KEY from env var instead of hardcoded - Add LLM rating bounds validation (_clamp_rating, 1-10) - Fix JSON extraction trailing whitespace handling Data integrity: - Normalize 21 legacy category names to 11 canonical short forms - Add false_positive column, flag 73 non-AI drafts (361 relevant remain) - Document verified counts: 434 total/361 relevant drafts, 557 authors, 419 ideas, 11 gaps Code quality: - Fix version string 0.1.0 → 0.2.0 - Add close()/context manager to Embedder class - Dynamic matrix size instead of hardcoded "260x260" Blog accuracy: - Fix EU AI Act timeline (enforcement Aug 2026, not "18 months") - Distinguish OAuth consent from GDPR Einwilligung - Add EU AI Act Annex III context to hospital scenario - Add FIPA, eIDAS 2.0 references where relevant Methodology: - Add methodology.md documenting pipeline, limitations, rating rubric - Add LLM-as-judge caveats to analyzer.py - Document clustering threshold rationale Reviews from: legal (German/EU law), statistics, development, science perspectives. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 10:52:33 +01:00
parent a386d0bb1a
commit 439424bd04
19 changed files with 1745 additions and 126 deletions
--- a/data/reports/blog-series/06-big-picture.md
+++ b/data/reports/blog-series/06-big-picture.md
@@ -105,7 +105,7 @@ Each draft addresses specific gaps. Together, they provide the connective tissue

 ## Traction vs. Aspiration

-A reality check: of the 361 drafts, only **36 (10%)** have been adopted by IETF working groups. The rest are individual submissions -- proposals without institutional backing. The WG-adopted drafts score higher on average (**3.54 vs. 3.31**), particularly on maturity (+1.28) and momentum (+0.98), but lower on novelty (-0.45). The WGs that have adopted the most agent-relevant drafts are security-focused: **lamps** (6 drafts), **lake** (5), **tls** (3), **emu** (3). Agent-specific WGs like `aipref` have adopted only 2 drafts.
+A reality check: of the 361 drafts, only **36 (10%)** have been adopted by IETF working groups. The rest are individual submissions -- proposals without institutional backing. The WG-adopted drafts score higher on average (**3.54 vs. 3.31**), particularly on maturity (+1.28) and momentum (+0.98), but lower on novelty (-0.45). *(Note: scores are LLM-generated relative rankings from abstracts; see [Methodology](../methodology.md).)* The WGs that have adopted the most agent-relevant drafts are security-focused: **lamps** (6 drafts), **lake** (5), **tls** (3), **emu** (3). Agent-specific WGs like `aipref` have adopted only 2 drafts.

 This reveals a structural insight: the IETF is not building agent standards from scratch. It is **retrofitting security standards for agents**. The agent architecture we propose above would need to work within this reality -- building on the security WGs' infrastructure rather than competing with it.