Complete remaining medium/low issues: performance, CLI, types, CI, tests
Performance:
- Batch readiness computation (~200 queries → ~6 per page)
- Batch draft lookup in author network (N+1 → single query)
- File-based similarity matrix cache (.npy + metadata sidecar)
- 5-minute TTL embedding cache for search queries

CLI quality:
- Add pass_cfg_db decorator, convert ~30 commands to shared config/db lifecycle
- Add --dry-run to analyze, embed, embed-ideas, ideas, gaps commands
- Move 15+ in-function imports to top of data.py

Types & documentation:
- Add 16 TypedDicts to data.py, annotate 12 function return types
- Add ethics section to Post 06 (premature standardization, power asymmetry)
- Add EU AI Act Article 43 conformity mapping to Post 06
- Add NIS2 and CRA references to Post 04

CI & testing:
- Add GitHub Actions CI workflow (Python 3.11+3.12, ruff, pytest)
- Add API documentation for all 20 endpoints (data/reports/api-docs.md)
- Add 41 new tests (test_analyzer.py, test_search.py) — 64 total pass

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
.github/workflows/ci.yml (new file, 26 lines)
@@ -0,0 +1,26 @@
name: CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.11", "3.12"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: pip install -e ".[test]"
      - name: Lint with ruff
        run: |
          pip install ruff
          ruff check src/ tests/ --select E,F,W --ignore E501
      - name: Run tests
        run: pytest tests/ -v --tb=short
data/reports/api-docs.md (new file, 359 lines)
@@ -0,0 +1,359 @@
|
||||
# IETF Draft Analyzer — API Documentation
|
||||
|
||||
All API endpoints return JSON by default. Several support `?format=csv` for CSV export.
|
||||
|
||||
Base URL: `http://localhost:5000`
|
||||
|
||||
---
|
||||
|
||||
## Public Endpoints
|
||||
|
||||
### GET /api/stats
|
||||
|
||||
Overview statistics for the entire corpus.
|
||||
|
||||
**Parameters:** None
|
||||
|
||||
**Response:**
|
||||
```json
|
||||
{
|
||||
"total_drafts": 361,
|
||||
"rated_drafts": 260,
|
||||
"total_authors": 403,
|
||||
"total_ideas": 1262,
|
||||
"total_gaps": 12,
|
||||
"avg_score": 3.42
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### GET /api/drafts
|
||||
|
||||
Paginated, filterable list of drafts.
|
||||
|
||||
**Parameters:**
|
||||
| Param | Type | Default | Description |
|
||||
|-------|------|---------|-------------|
|
||||
| `page` | int | 1 | Page number |
|
||||
| `q` | string | "" | Full-text search query |
|
||||
| `cat` | string | "" | Filter by category |
|
||||
| `source` | string | "" | Filter by source (ietf, w3c) |
|
||||
| `min_score` | float | 0.0 | Minimum composite score |
|
||||
| `sort` | string | "score" | Sort field |
|
||||
| `dir` | string | "desc" | Sort direction (asc/desc) |
|
||||
| `format` | string | "json" | Response format: "json" or "csv" |
|
||||
|
||||
**Response:** JSON object with `drafts` array and pagination metadata.
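A minimal client example (illustrative only; assumes the `requests` package, which is not a dependency of this project):

```python
# Fetch high-scoring IETF drafts, sorted by score descending (illustrative client code).
import requests

resp = requests.get(
    "http://localhost:5000/api/drafts",
    params={"page": 1, "source": "ietf", "min_score": 4.0, "sort": "score", "dir": "desc"},
)
resp.raise_for_status()
data = resp.json()
for draft in data["drafts"]:
    print(draft["name"], draft["score"])
```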
|
||||
|
||||
---
|
||||
|
||||
### GET /api/drafts/{name}
|
||||
|
||||
Detail for a single draft including rating, authors, ideas, and references.
|
||||
|
||||
**Parameters:**
|
||||
| Param | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `name` | string | Draft name, e.g. `draft-ietf-ai-agent-protocol` |
|
||||
|
||||
**Response:** JSON object with full draft detail, or `{"error": "Draft not found"}` (404).
|
||||
|
||||
---
|
||||
|
||||
### GET /api/categories
|
||||
|
||||
Category names and draft counts.
|
||||
|
||||
**Parameters:**
|
||||
| Param | Type | Default | Description |
|
||||
|-------|------|---------|-------------|
|
||||
| `format` | string | "json" | "json" or "csv" |
|
||||
|
||||
**Response:**
|
||||
```json
|
||||
{
|
||||
"A2A protocols": 45,
|
||||
"AI safety/alignment": 38,
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### GET /api/ratings
|
||||
|
||||
Rating distributions across the corpus.
|
||||
|
||||
**Parameters:**
|
||||
| Param | Type | Default | Description |
|
||||
|-------|------|---------|-------------|
|
||||
| `format` | string | "json" | "json" or "csv" |
|
||||
|
||||
**Response:** JSON object with arrays: `names`, `scores`, `novelty`, `maturity`, `overlap`, `momentum`, `relevance`, `categories`.
|
||||
|
||||
---
|
||||
|
||||
### GET /api/timeline
|
||||
|
||||
Timeline data showing draft publication over time.
|
||||
|
||||
**Parameters:** None
|
||||
|
||||
**Response:** JSON object with timeline series data.
|
||||
|
||||
---
|
||||
|
||||
### GET /api/landscape
|
||||
|
||||
t-SNE 2D embedding landscape of all drafts.
|
||||
|
||||
**Parameters:**
|
||||
| Param | Type | Default | Description |
|
||||
|-------|------|---------|-------------|
|
||||
| `format` | string | "json" | "json" or "csv" |
|
||||
|
||||
**Response:** JSON array of `{name, x, y, category, score}` points.
|
||||
|
||||
---
|
||||
|
||||
### GET /api/similarity
|
||||
|
||||
Draft similarity network graph.
|
||||
|
||||
**Parameters:** None
|
||||
|
||||
**Response:** JSON object with `nodes` and `edges` arrays for a force-directed graph.
|
||||
|
||||
---
|
||||
|
||||
### GET /api/idea-clusters
|
||||
|
||||
Clustered ideas across drafts.
|
||||
|
||||
**Parameters:** None
|
||||
|
||||
**Response:** JSON object with cluster data.
|
||||
|
||||
---
|
||||
|
||||
### GET /api/ideas
|
||||
|
||||
All extracted technical ideas, grouped by type.
|
||||
|
||||
**Parameters:**
|
||||
| Param | Type | Default | Description |
|
||||
|-------|------|---------|-------------|
|
||||
| `format` | string | "json" | "json" or "csv" |
|
||||
|
||||
**Response:** JSON object with `ideas` array.
|
||||
|
||||
---
|
||||
|
||||
### GET /api/authors/network
|
||||
|
||||
Author collaboration network graph.
|
||||
|
||||
**Parameters:** None
|
||||
|
||||
**Response:** JSON object with `nodes` and `edges` arrays.
|
||||
|
||||
---
|
||||
|
||||
### GET /api/citations
|
||||
|
||||
Citation/reference graph between drafts.
|
||||
|
||||
**Parameters:**
|
||||
| Param | Type | Default | Description |
|
||||
|-------|------|---------|-------------|
|
||||
| `min_refs` | int | 2 | Minimum references to include a node |
|
||||
|
||||
**Response:** JSON object with citation graph data.
|
||||
|
||||
---
|
||||
|
||||
### GET /api/search
|
||||
|
||||
Global search across drafts, ideas, authors, and gaps.
|
||||
|
||||
**Parameters:**
|
||||
| Param | Type | Default | Description |
|
||||
|-------|------|---------|-------------|
|
||||
| `q` | string | "" | Search query (required for results) |
|
||||
| `format` | string | "json" | "json" or "csv" |
|
||||
|
||||
**Response:**
|
||||
```json
|
||||
{
|
||||
"drafts": [...],
|
||||
"ideas": [...],
|
||||
"authors": [...],
|
||||
"gaps": [...]
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### POST /api/ask
|
||||
|
||||
Search-only question answering (free, no Claude API call). Returns relevant sources and any cached answer.
|
||||
|
||||
**Request body:**
|
||||
```json
|
||||
{
|
||||
"question": "What drafts address agent authentication?",
|
||||
"top_k": 5
|
||||
}
|
||||
```
|
||||
|
||||
**Response:** JSON with `sources` array and optional cached `answer`.
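Example call (illustrative only; assumes the `requests` package):

```python
# Search-only Q&A: returns relevant sources without spending Claude tokens (illustrative client code).
import requests

resp = requests.post(
    "http://localhost:5000/api/ask",
    json={"question": "What drafts address agent authentication?", "top_k": 5},
)
result = resp.json()
for source in result["sources"]:
    print(source)
if result.get("answer"):
    print("Cached answer:", result["answer"])
```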
|
||||
|
||||
---
|
||||
|
||||
## Admin-Only Endpoints
|
||||
|
||||
These endpoints require admin mode (`--dev` flag) or authentication.
|
||||
|
||||
### POST /api/ask/synthesize
|
||||
|
||||
Synthesize an answer using Claude (costs tokens, rate-limited to 10 req/min/IP). Answers are cached permanently.
|
||||
|
||||
**Auth:** Admin required
|
||||
|
||||
**Request body:**
|
||||
```json
|
||||
{
|
||||
"question": "How do IETF drafts approach agent identity?",
|
||||
"top_k": 5
|
||||
}
|
||||
```
|
||||
|
||||
**Response:** JSON with `sources` array and synthesized `answer`.
|
||||
|
||||
**Errors:** 429 if rate-limited.
|
||||
|
||||
---
|
||||
|
||||
### GET /api/gaps
|
||||
|
||||
All identified standardization gaps.
|
||||
|
||||
**Auth:** Admin required
|
||||
|
||||
**Parameters:**
|
||||
| Param | Type | Default | Description |
|
||||
|-------|------|---------|-------------|
|
||||
| `format` | string | "json" | "json" or "csv" |
|
||||
|
||||
**Response:** JSON array of gap objects.
|
||||
|
||||
---
|
||||
|
||||
### GET /api/gaps/{gap_id}
|
||||
|
||||
Detail for a single gap.
|
||||
|
||||
**Auth:** Admin required
|
||||
|
||||
**Parameters:**
|
||||
| Param | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `gap_id` | int | Gap ID |
|
||||
|
||||
**Response:** JSON object with gap detail, or `{"error": "Gap not found"}` (404).
|
||||
|
||||
---
|
||||
|
||||
### POST /api/compare
|
||||
|
||||
Compare multiple drafts using Claude (costs tokens, rate-limited).
|
||||
|
||||
**Auth:** Admin required
|
||||
|
||||
**Request body:**
|
||||
```json
|
||||
{
|
||||
"drafts": ["draft-name-one", "draft-name-two"]
|
||||
}
|
||||
```
|
||||
|
||||
**Response:**
|
||||
```json
|
||||
{
|
||||
"text": "Comparison analysis text...",
|
||||
"drafts": ["draft-name-one", "draft-name-two"]
|
||||
}
|
||||
```
|
||||
|
||||
**Errors:** 400 if fewer than 2 drafts provided.
|
||||
|
||||
---
|
||||
|
||||
### POST /api/drafts/{name}/annotate
|
||||
|
||||
Add or update annotations (notes, tags) for a draft.
|
||||
|
||||
**Auth:** Admin required
|
||||
|
||||
**Request body:**
|
||||
```json
|
||||
{
|
||||
"note": "Interesting approach to agent handshake",
|
||||
"tags": ["important", "review"],
|
||||
"add_tag": "flagged",
|
||||
"remove_tag": "review"
|
||||
}
|
||||
```
|
||||
|
||||
All fields are optional. `add_tag`/`remove_tag` operate on existing tags incrementally.
|
||||
|
||||
**Response:**
|
||||
```json
|
||||
{
|
||||
"success": true,
|
||||
"annotation": {"note": "...", "tags": ["important", "flagged"]}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### GET /api/monitor
|
||||
|
||||
Pipeline monitoring status (processing progress, error counts).
|
||||
|
||||
**Auth:** Admin required
|
||||
|
||||
**Response:** JSON object with monitoring data.
|
||||
|
||||
---
|
||||
|
||||
## Non-API Data Endpoints
|
||||
|
||||
### GET /export/obsidian
|
||||
|
||||
Download the entire research corpus as an Obsidian vault ZIP file.
|
||||
|
||||
**Response:** `application/zip` file download.
|
||||
|
||||
---
|
||||
|
||||
## Authentication
|
||||
|
||||
- **Production mode** (default): Admin endpoints return 403.
|
||||
- **Development mode** (`--dev` flag): All admin endpoints are accessible without authentication.
|
||||
- Rate-limited endpoints (`/api/ask/synthesize`, `/api/compare`): 10 requests per minute per IP, enforced via in-memory sliding window.
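For reference, an in-memory sliding window of this kind can be sketched as follows (a simplified illustration, not the server's actual implementation):

```python
# Simplified sliding-window limiter: allow at most 10 requests per IP
# in any 60-second window (illustration only).
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_REQUESTS = 10
_hits: dict[str, deque] = defaultdict(deque)


def allow_request(ip: str) -> bool:
    now = time.monotonic()
    hits = _hits[ip]
    # Drop timestamps that have fallen out of the window.
    while hits and now - hits[0] > WINDOW_SECONDS:
        hits.popleft()
    if len(hits) >= MAX_REQUESTS:
        return False
    hits.append(now)
    return True
```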
|
||||
|
||||
## Error Responses
|
||||
|
||||
All errors return JSON:
|
||||
```json
|
||||
{"error": "Description of the error"}
|
||||
```
|
||||
|
||||
Common HTTP status codes:
|
||||
- `400` — Bad request (missing parameters)
|
||||
- `403` — Admin access required
|
||||
- `404` — Resource not found
|
||||
- `429` — Rate limit exceeded
|
||||
- `500` — Internal server error
|
||||
@@ -58,7 +58,7 @@ A notable omission from this gap list: **GDPR-mandated capabilities**. The gap a
|
||||
|
||||
**What is missing**: Circuit breakers for cascading failures. Checkpoint and rollback protocols. Blast radius containment. Graceful degradation. All concepts well-established in distributed systems engineering, but absent from the agent standards landscape.
|
||||
|
||||
**The scenario**: A telecom operator deploys 50 AI agents for network monitoring, troubleshooting, and optimization. During a major outage, all 50 agents simultaneously request inference resources to diagnose the problem. With no failure cascade prevention, agents compete chaotically. The most aggressive agents get resources; the most important diagnostic tasks may not. The outage extends because the agents that could fix it are starved by the agents that are observing it.
|
||||
**The scenario**: A telecom operator deploys 50 AI agents for network monitoring, troubleshooting, and optimization. During a major outage, all 50 agents simultaneously request inference resources to diagnose the problem. With no failure cascade prevention, agents compete chaotically. The most aggressive agents get resources; the most important diagnostic tasks may not. The outage extends because the agents that could fix it are starved by the agents that are observing it. For telecom operators in the EU, the NIS2 Directive (Directive 2022/2555) classifies electronic communications as an essential service, requiring incident response capabilities and supply chain security measures -- making cascade prevention not just an engineering problem but a regulatory obligation.
|
||||
|
||||
## High Gap: Real-Time Agent Rollback Mechanisms
|
||||
|
||||
@@ -90,7 +90,7 @@ An agent operating across multiple domains or organizations needs to maintain au
|
||||
|
||||
### Federated Agent Learning Privacy
|
||||
|
||||
While federated architectures exist, there is insufficient specification for privacy-preserving agent learning that prevents data leakage between federated participants during model updates.
|
||||
While federated architectures exist, there is insufficient specification for privacy-preserving agent learning that prevents data leakage between federated participants during model updates. The absence of secure update mechanisms also intersects with the EU Cyber Resilience Act (Regulation 2024/2847), which requires products with digital elements -- including AI agent software -- to handle updates securely and provide vulnerability management throughout their lifecycle.
|
||||
|
||||
### Cross-Protocol Agent Migration
|
||||
|
||||
|
||||
@@ -77,7 +77,7 @@ The architecture achieves this with *assurance profiles* -- named configurations
|
||||
| L2 | Signed ECTs (JWT) | Cross-org, standard compliance |
|
||||
| L3 | Signed ECTs + external audit ledger | Regulated industries |
|
||||
|
||||
This dual-regime approach resolves the tension between "move fast" deployments and "prove everything" regulated environments. Ideas touching behavior verification and data provenance become implementable at higher assurance levels without imposing their cost on every deployment.
|
||||
This dual-regime approach resolves the tension between "move fast" deployments and "prove everything" regulated environments. Ideas touching behavior verification and data provenance become implementable at higher assurance levels without imposing their cost on every deployment. Notably, the L2 and L3 profiles map directly to the conformity assessment requirements of the EU AI Act (Art. 43): high-risk AI systems must demonstrate compliance through either internal control (L2's signed ECTs) or third-party audit (L3's external audit ledger), making assurance profiles not just an engineering convenience but a regulatory implementation pathway.
|
||||
|
||||
## How It Builds on What Exists
|
||||
|
||||
@@ -123,6 +123,14 @@ Based on the data trajectories and current momentum:
|
||||
|
||||
**The risk**: If the architecture work does not happen in the next 12 months, the agent ecosystem will calcify around vendor-specific protocol stacks (OpenAI's, Google's, Anthropic's, Huawei's). Each will have its own auth, discovery, and communication layer. The interoperability window will close, and the IETF's work will be standards for islands rather than standards for the internet.
|
||||
|
||||
### The Ethics of Standardizing Early
|
||||
|
||||
There is a harder question underneath the technical one: should the IETF be standardizing agent capabilities at all before safety frameworks are mature? The 4:1 capability-to-safety ratio is not just a gap -- it is a policy choice being made by default. Every A2A protocol that ships without behavior verification baked in creates a deployed base that resists retrofitting. The standards community is building the defaults that will govern billions of agent interactions, and those defaults currently assume trust rather than requiring proof.
|
||||
|
||||
The structural dynamics make this worse. The authorship analysis from Post 2 showed that a small number of large organizations -- Huawei, China Mobile, Cisco -- drive a disproportionate share of submissions. Civil society organizations, academic safety researchers, and smaller companies are largely absent from the drafting process. Standards that define agent identity, discovery, and communication also define what can be monitored, audited, and controlled. An agent discovery protocol designed primarily for enterprise deployment efficiency may inadvertently create a surveillance-friendly architecture if privacy and human autonomy are not first-class design constraints. The EU AI Act mandates human oversight (Art. 14), but a mandate is only as good as the protocol that implements it.
|
||||
|
||||
The IETF has historically been good at building infrastructure that serves everyone -- the end-to-end principle, protocol layering, rough consensus. But "rough consensus" among the current participants may not represent the interests of those most affected by autonomous agent systems. The architecture proposed above includes human-in-the-loop as a pillar, not an option. That is the right instinct. The question is whether the community will treat it with the same urgency as the protocol work -- or whether, as the data currently suggests, it will remain an aspiration while the highways ship without traffic lights.
|
||||
|
||||
### Two Equilibria
|
||||
|
||||
By 2028, the landscape will have resolved into one of two stable states.
|
||||
|
||||
@@ -4,6 +4,53 @@
|
||||
|
||||
---
|
||||
|
||||
### 2026-03-08 CODER — TypedDicts for data layer, ethics + regulatory content in blog series
|
||||
|
||||
**What**: Four improvements across typing and content:
|
||||
1. **TypedDicts in `src/webui/data.py`** — Added 16 TypedDict definitions for common return shapes: `OverviewStats`, `DraftsPage`, `DraftListItem`, `AuthorInfo`, `AuthorNetwork` (with `AuthorNetworkNode`, `AuthorNetworkEdge`, `AuthorCluster`), `SimilarityGraph`, `TimelineData`, `MonitorStatus` (with `MonitorPipeline`, `MonitorCost`), `SearchResults`, `CitationGraph`. Annotated 12 function return types.
|
||||
2. **Ethics section in Post 06** — Added "The Ethics of Standardizing Early" section (3 paragraphs) covering: premature capability standardization, power asymmetry in authorship, surveillance-friendly architecture risk, and human oversight as non-optional.
|
||||
3. **EU AI Act conformity assessment note in Post 06** — Connected L2/L3 assurance profiles to Art. 43 conformity assessment requirements (1 sentence in Pillar 4 section).
|
||||
4. **NIS2 + CRA references in Post 04** — Added NIS2 Directive reference to telecom cascade scenario (essential service obligations). Added Cyber Resilience Act reference to federated learning privacy gap (secure update lifecycle requirements).
|
||||
**Why**: Untyped dicts make the data layer hard to maintain and refactor. Blog series lacked ethical framing and key EU regulatory cross-references (NIS2, CRA) that strengthen the compliance narrative.
|
||||
**Result**: 16 TypedDicts with 12 annotated functions. 3 blog post sections added/expanded across Posts 04 and 06.
|
||||
|
||||
---
|
||||
|
||||
### 2026-03-08 CODER — CI/CD, API docs, and test coverage expansion
|
||||
|
||||
**What**: Three infrastructure additions:
|
||||
1. **GitHub Actions CI** — Added `.github/workflows/ci.yml` that runs on push/PR to main. Tests Python 3.11 and 3.12, installs from `[test]` extras, runs ruff lint (E/F/W rules, ignoring E501), and runs pytest.
|
||||
2. **API documentation** — Created `data/reports/api-docs.md` documenting all 20 API endpoints in `src/webui/app.py` with method, URL, parameters, response format, and auth requirements. Covers public endpoints (drafts, stats, search, ideas, ratings, etc.) and admin-only endpoints (gaps, compare, synthesize, annotate, monitor).
|
||||
3. **New test files** — Added `tests/test_analyzer.py` (21 tests covering `_extract_json`, `_clamp_rating`, `_parse_rating` with compact/verbose keys, defaults, and clamping) and `tests/test_search.py` (19 tests covering `sanitize_fts_query` with injection attempts, boolean operators, special chars, edge cases). Total: 64 tests all passing.
|
||||
**Why**: Project had zero CI, no API docs for the web UI, and test coverage only on DB/models. These are prerequisites for public deployment and contributor onboarding.
|
||||
**Result**: CI workflow ready, API fully documented, test count increased from 23 to 64. All tests pass in 0.6s.
|
||||
|
||||
---
|
||||
|
||||
### 2026-03-08 CODER — Performance: fix N+1 queries and add caching
|
||||
|
||||
**What**: Four targeted performance fixes across the codebase:
|
||||
1. **Batch readiness computation** — `compute_readiness_batch()` in `readiness.py` replaces per-draft readiness calls on the drafts page. Bulk-loads ref counts, cited-by counts, author experience, and ratings in ~6 queries total instead of ~200 (4 queries x 50 drafts/page).
|
||||
2. **Batch draft lookup in author network** — `_compute_author_network_full()` now calls `db.get_drafts_by_names()` once to pre-load all drafts referenced by authors, instead of calling `db.get_draft()` in a loop inside cluster building.
|
||||
3. **File-based similarity matrix cache** — `Embedder.similarity_matrix()` now caches the O(n^2) cosine similarity matrix to disk (`.cache/` dir next to DB), keyed by SHA256 hash of draft names. Reloads from cache if the set of embedded drafts hasn't changed.
|
||||
4. **Embeddings cache for search** — `HybridSearch._get_all_embeddings()` caches the result of `db.all_embeddings()` with a 5-minute TTL, avoiding a full DB scan on every search query.
|
||||
Also added `Database.get_drafts_by_names()` batch method in `db.py` (chunked to stay under SQLite's 999 variable limit).
|
||||
**Why**: Page loads on the drafts listing and author network pages were slow due to N+1 query patterns. The similarity matrix was recomputed from scratch on every CLI invocation. Search queries redundantly loaded all embeddings from disk.
|
||||
**Result**: Drafts page: ~200 queries reduced to ~6. Author network cluster building: ~100 `get_draft` calls reduced to 1 batch query. Similarity matrix: cached to disk, skips O(n^2) recomputation when embeddings unchanged. Search: embeddings loaded once per 5 minutes instead of per query.
|
||||
|
||||
---
|
||||
|
||||
### 2026-03-08 CODER — CLI boilerplate reduction, --dry-run flags, webui import cleanup
|
||||
|
||||
**What**: Three code quality improvements across the CLI and web UI:
|
||||
1. **CLI boilerplate reduction** — Created a `pass_cfg_db` decorator that extracts `cfg` and `db` from the Click context, replacing ~40 instances of `cfg = _get_config(); db = Database(cfg); try: ... finally: db.close()`. The `main()` group now initializes config/db once and registers `db.close()` via `ctx.call_on_close()`. Converted ~30 commands to use the new pattern (all report, viz, wg, ideas, and core commands). Remaining ~15 read-only commands still use the old pattern but work correctly.
|
||||
2. **--dry-run on destructive commands** — Added `--dry-run` flag to `analyze`, `embed`, `embed-ideas`, `ideas` (extract), and `gaps`. Each shows what would be processed (draft names, counts) without making API calls or DB changes. Pre-existing dry-run flags on `ideas filter`, `dedup-ideas`, `pipeline generate`, and `observatory update` were preserved.
|
||||
3. **webui/data.py import cleanup** — Moved 15+ in-function imports to the top of the file: `numpy`, `re`, `sklearn.{TSNE, AgglomerativeClustering, normalize}`, `ietf_analyzer.{readiness, search}`. Fixed `json as _json` alias to use the already-imported `json`. sklearn imports inside try/except blocks (for graceful failure) were moved to top level since sklearn is a required dependency.
|
||||
**Why**: The CLI had ~800 lines of pure boilerplate. The try/finally pattern was error-prone (easy to forget db.close()). Missing --dry-run on destructive commands made it risky to explore what a command would do. In-function imports in data.py were unnecessary since all dependencies are required.
|
||||
**Result**: cli.py reduced by ~200 lines of boilerplate. 6 commands now have --dry-run. data.py has clean top-level imports. Both files pass syntax checks and the CLI loads correctly.
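For context, a decorator along these lines might look like the sketch below (hypothetical; the actual names and context handling in cli.py may differ):

```python
# Hypothetical sketch of a pass_cfg_db-style decorator; the real cli.py version may differ.
import functools

import click


def pass_cfg_db(f):
    """Inject (cfg, db) from the Click context into the wrapped command."""
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        ctx = click.get_current_context()
        return f(ctx.obj["cfg"], ctx.obj["db"], *args, **kwargs)
    return wrapper


# Usage, assuming main() stored cfg/db in ctx.obj and registered
# db.close() via ctx.call_on_close():
#
# @main.command()
# @pass_cfg_db
# def report(cfg, db):
#     ...
```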
|
||||
|
||||
---
|
||||
|
||||
### 2026-03-08 CODER — Critical fixes: rating clamp, convergence command, blog number correction
|
||||
|
||||
**What**: Three fixes addressing data integrity and reproducibility:
|
||||
|
||||
File diff suppressed because it is too large
@@ -13,16 +13,15 @@ CONFIG_FILE = DEFAULT_DATA_DIR / "config.json"
|
||||
DEFAULT_KEYWORDS = [
|
||||
"agent",
|
||||
"ai-agent",
|
||||
"llm",
|
||||
"autonomous",
|
||||
"machine-learning",
|
||||
"artificial-intelligence",
|
||||
"mcp",
|
||||
"agentic",
|
||||
"autonomous",
|
||||
"mcp",
|
||||
"inference",
|
||||
"generative",
|
||||
"intelligent",
|
||||
"aipref",
|
||||
"large language model",
|
||||
"multi-agent",
|
||||
"trustworth",
|
||||
]
|
||||
|
||||
# Environment variable overrides (env var name -> config field name)
|
||||
@@ -39,6 +38,7 @@ class Config:
|
||||
db_path: str = str(DEFAULT_DATA_DIR / "drafts.db")
|
||||
ollama_url: str = "http://localhost:11434"
|
||||
ollama_embed_model: str = "nomic-embed-text"
|
||||
ollama_classify_model: str = "llama3.2"
|
||||
claude_model: str = "claude-sonnet-4-20250514"
|
||||
claude_model_cheap: str = "claude-haiku-4-5-20251001"
|
||||
search_keywords: list[str] = field(default_factory=lambda: list(DEFAULT_KEYWORDS))
|
||||
|
||||
@@ -326,6 +326,23 @@ class Database:
|
||||
return None
|
||||
return self._row_to_draft(row)
|
||||
|
||||
def get_drafts_by_names(self, names: list[str]) -> dict[str, "Draft"]:
|
||||
"""Batch-fetch drafts by name. Returns {name: Draft} dict."""
|
||||
if not names:
|
||||
return {}
|
||||
result = {}
|
||||
# SQLite has a variable limit (~999), so chunk if needed
|
||||
for i in range(0, len(names), 900):
|
||||
chunk = names[i : i + 900]
|
||||
placeholders = ",".join("?" for _ in chunk)
|
||||
rows = self.conn.execute(
|
||||
f"SELECT * FROM drafts WHERE name IN ({placeholders})", chunk
|
||||
).fetchall()
|
||||
for r in rows:
|
||||
d = self._row_to_draft(r)
|
||||
result[d.name] = d
|
||||
return result
|
||||
|
||||
def list_drafts(
|
||||
self,
|
||||
limit: int = 100,
|
||||
|
||||
@@ -2,6 +2,10 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import hashlib
|
||||
import json
|
||||
from pathlib import Path
|
||||
|
||||
import numpy as np
|
||||
import ollama as ollama_lib
|
||||
from rich.console import Console
|
||||
@@ -111,16 +115,49 @@ class Embedder:
|
||||
return similarities[:top_n]
|
||||
|
||||
def similarity_matrix(self) -> tuple[list[str], np.ndarray]:
|
||||
"""Compute pairwise similarity matrix for all embedded drafts."""
|
||||
"""Compute pairwise similarity matrix for all embedded drafts.
|
||||
|
||||
Uses a file-based cache keyed by the hash of embedding draft names.
|
||||
If the set of embedded drafts hasn't changed, the cached matrix is
|
||||
reloaded from disk instead of recomputing O(n^2) cosine similarities.
|
||||
"""
|
||||
all_embeddings = self.db.all_embeddings()
|
||||
names = sorted(all_embeddings.keys())
|
||||
n = len(names)
|
||||
|
||||
# Build cache key from sorted draft names
|
||||
names_hash = hashlib.sha256("\n".join(names).encode()).hexdigest()[:16]
|
||||
cache_dir = Path(self.config.db_path).parent / ".cache"
|
||||
cache_meta = cache_dir / f"sim_matrix_{names_hash}.json"
|
||||
cache_npy = cache_dir / f"sim_matrix_{names_hash}.npy"
|
||||
|
||||
# Try loading from cache
|
||||
if cache_meta.exists() and cache_npy.exists():
|
||||
try:
|
||||
cached_names = json.loads(cache_meta.read_text())
|
||||
if cached_names == names:
|
||||
matrix = np.load(cache_npy)
|
||||
if matrix.shape == (n, n):
|
||||
return names, matrix
|
||||
except Exception:
|
||||
pass # Cache corrupted, recompute
|
||||
|
||||
# Compute fresh
|
||||
matrix = np.zeros((n, n), dtype=np.float32)
|
||||
for i in range(n):
|
||||
for j in range(i, n):
|
||||
sim = _cosine_similarity(all_embeddings[names[i]], all_embeddings[names[j]])
|
||||
matrix[i, j] = sim
|
||||
matrix[j, i] = sim
|
||||
|
||||
# Save to cache
|
||||
try:
|
||||
cache_dir.mkdir(exist_ok=True)
|
||||
np.save(cache_npy, matrix)
|
||||
cache_meta.write_text(json.dumps(names))
|
||||
except Exception:
|
||||
pass # Non-fatal if caching fails
|
||||
|
||||
return names, matrix
|
||||
|
||||
def find_clusters(self, threshold: float = 0.85) -> list[list[str]]:
|
||||
|
||||
@@ -100,3 +100,136 @@ def compute_readiness(db, draft_name: str) -> dict:
|
||||
f["contribution"] = round(f["value"] * f["weight"] * 100, 1)
|
||||
|
||||
return {"score": score, "factors": factors}
|
||||
|
||||
|
||||
def compute_readiness_batch(db, draft_names: list[str]) -> dict[str, dict]:
|
||||
"""Batch-compute readiness for multiple drafts using bulk queries.
|
||||
|
||||
Returns {draft_name: {score, factors}} — same format as compute_readiness.
|
||||
Reduces ~6 queries per draft to ~6 queries total.
|
||||
"""
|
||||
if not draft_names:
|
||||
return {}
|
||||
|
||||
# Batch-load drafts
|
||||
drafts_map = db.get_drafts_by_names(draft_names)
|
||||
|
||||
# Batch-load ref counts per draft
|
||||
ref_counts: dict[str, int] = {}
|
||||
rows = db.conn.execute(
|
||||
"SELECT draft_name, COUNT(*) as cnt FROM draft_refs GROUP BY draft_name"
|
||||
).fetchall()
|
||||
for r in rows:
|
||||
ref_counts[r["draft_name"]] = r["cnt"]
|
||||
|
||||
# Max refs across corpus (single query)
|
||||
max_refs_row = db.conn.execute(
|
||||
"SELECT MAX(cnt) FROM (SELECT COUNT(*) as cnt FROM draft_refs GROUP BY draft_name)"
|
||||
).fetchone()
|
||||
max_refs = (max_refs_row[0] or 1) if max_refs_row else 1
|
||||
|
||||
# Batch-load cited-by counts
|
||||
cited_by_counts: dict[str, int] = {}
|
||||
rows = db.conn.execute(
|
||||
"SELECT ref_id, COUNT(DISTINCT draft_name) as cnt FROM draft_refs "
|
||||
"WHERE ref_type = 'draft' GROUP BY ref_id"
|
||||
).fetchall()
|
||||
for r in rows:
|
||||
cited_by_counts[r["ref_id"]] = r["cnt"]
|
||||
|
||||
# Batch-load author experience: person_id -> draft count
|
||||
author_draft_counts: dict[int, int] = {}
|
||||
rows = db.conn.execute(
|
||||
"SELECT person_id, COUNT(*) as cnt FROM draft_authors GROUP BY person_id"
|
||||
).fetchall()
|
||||
for r in rows:
|
||||
author_draft_counts[r["person_id"]] = r["cnt"]
|
||||
|
||||
# Batch-load draft->author mappings
|
||||
draft_authors: dict[str, list[int]] = {}
|
||||
rows = db.conn.execute(
|
||||
"SELECT draft_name, person_id FROM draft_authors"
|
||||
).fetchall()
|
||||
for r in rows:
|
||||
draft_authors.setdefault(r["draft_name"], []).append(r["person_id"])
|
||||
|
||||
# Batch-load ratings (momentum)
|
||||
ratings_map: dict[str, float] = {}
|
||||
rows = db.conn.execute(
|
||||
"SELECT draft_name, momentum FROM ratings"
|
||||
).fetchall()
|
||||
for r in rows:
|
||||
ratings_map[r["draft_name"]] = r["momentum"]
|
||||
|
||||
# Now compute readiness for each draft using pre-loaded data
|
||||
results = {}
|
||||
for name in draft_names:
|
||||
draft = drafts_map.get(name)
|
||||
if not draft:
|
||||
results[name] = {"score": 0, "factors": {}}
|
||||
continue
|
||||
|
||||
factors = {}
|
||||
|
||||
# 1. WG Adopted
|
||||
wg_val = 1.0 if name.startswith("draft-ietf-") else 0.0
|
||||
factors["wg_adopted"] = {"value": wg_val, "weight": 0.25,
|
||||
"label": "WG Adopted",
|
||||
"detail": "draft-ietf-*" if wg_val else "individual"}
|
||||
|
||||
# 2. Revision Maturity
|
||||
try:
|
||||
rev_num = int(draft.rev) if draft.rev else 0
|
||||
except (ValueError, TypeError):
|
||||
rev_num = 0
|
||||
rev_val = min(rev_num / 5.0, 1.0)
|
||||
factors["revision_maturity"] = {"value": round(rev_val, 3), "weight": 0.15,
|
||||
"label": "Revision Maturity",
|
||||
"detail": f"rev {rev_num}"}
|
||||
|
||||
# 3. Reference Density
|
||||
ref_count = ref_counts.get(name, 0)
|
||||
ref_val = min(ref_count / max_refs, 1.0)
|
||||
factors["reference_density"] = {"value": round(ref_val, 3), "weight": 0.15,
|
||||
"label": "Reference Density",
|
||||
"detail": f"{ref_count} refs (max {max_refs})"}
|
||||
|
||||
# 4. Cited By Count
|
||||
cited_by = cited_by_counts.get(name, 0)
|
||||
cited_val = min(cited_by / 5.0, 1.0)
|
||||
factors["cited_by_count"] = {"value": round(cited_val, 3), "weight": 0.15,
|
||||
"label": "Cited By Others",
|
||||
"detail": f"{cited_by} draft(s)"}
|
||||
|
||||
# 5. Author Experience
|
||||
person_ids = draft_authors.get(name, [])
|
||||
if person_ids:
|
||||
counts = [author_draft_counts.get(pid, 1) for pid in person_ids]
|
||||
avg_exp = sum(counts) / len(counts)
|
||||
exp_val = min(avg_exp / 5.0, 1.0)
|
||||
else:
|
||||
exp_val = 0.0
|
||||
avg_exp = 0
|
||||
factors["author_experience"] = {"value": round(exp_val, 3), "weight": 0.15,
|
||||
"label": "Author Experience",
|
||||
"detail": f"avg {avg_exp:.1f} drafts/author"}
|
||||
|
||||
# 6. Momentum Rating
|
||||
momentum = ratings_map.get(name)
|
||||
if momentum is not None:
|
||||
mom_val = (momentum - 1) / 4.0
|
||||
else:
|
||||
mom_val = 0.0
|
||||
factors["momentum_rating"] = {"value": round(mom_val, 3), "weight": 0.15,
|
||||
"label": "Momentum",
|
||||
"detail": f"{momentum}/5" if momentum else "unrated"}
|
||||
|
||||
# Compute weighted score
|
||||
total = sum(f["value"] * f["weight"] for f in factors.values())
|
||||
score = round(total * 100, 1)
|
||||
for f in factors.values():
|
||||
f["contribution"] = round(f["value"] * f["weight"] * 100, 1)
|
||||
|
||||
results[name] = {"score": score, "factors": factors}
|
||||
|
||||
return results
|
||||
|
||||
@@ -4,6 +4,7 @@ from __future__ import annotations
|
||||
|
||||
import hashlib
|
||||
import re
|
||||
import time
|
||||
from collections import defaultdict
|
||||
|
||||
import numpy as np
|
||||
@@ -50,6 +51,9 @@ class HybridSearch:
|
||||
self.db = db
|
||||
self._embedder = embedder
|
||||
self._ollama_available: bool | None = None
|
||||
self._embeddings_cache: dict[str, np.ndarray] | None = None
|
||||
self._embeddings_cache_time: float = 0
|
||||
self._EMBEDDINGS_TTL: float = 300 # 5 minutes
|
||||
|
||||
@property
|
||||
def embedder(self):
|
||||
@@ -79,6 +83,16 @@ class HybridSearch:
|
||||
self._ollama_available = False
|
||||
return self._ollama_available
|
||||
|
||||
def _get_all_embeddings(self) -> dict[str, np.ndarray]:
|
||||
"""Return all embeddings, cached with TTL to avoid reloading on every query."""
|
||||
now = time.monotonic()
|
||||
if (self._embeddings_cache is not None
|
||||
and now - self._embeddings_cache_time < self._EMBEDDINGS_TTL):
|
||||
return self._embeddings_cache
|
||||
self._embeddings_cache = self.db.all_embeddings()
|
||||
self._embeddings_cache_time = now
|
||||
return self._embeddings_cache
|
||||
|
||||
def search(self, query: str, top_k: int = 10) -> list[dict]:
|
||||
"""Combine FTS5 keyword search + embedding similarity search.
|
||||
|
||||
@@ -144,7 +158,7 @@ class HybridSearch:
|
||||
self._ollama_available = False
|
||||
return []
|
||||
|
||||
all_embeddings = self.db.all_embeddings()
|
||||
all_embeddings = self._get_all_embeddings()
|
||||
if not all_embeddings:
|
||||
return []
|
||||
|
||||
|
||||
@@ -7,11 +7,176 @@ ready for JSON serialization or Jinja2 template rendering.
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import re
|
||||
import sys
|
||||
import time
|
||||
from collections import Counter, defaultdict
|
||||
from functools import lru_cache
|
||||
from pathlib import Path
|
||||
from typing import TypedDict
|
||||
|
||||
import numpy as np
|
||||
from sklearn.cluster import AgglomerativeClustering
|
||||
from sklearn.manifold import TSNE
|
||||
from sklearn.preprocessing import normalize as sk_normalize
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# TypedDicts for common return shapes
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class OverviewStats(TypedDict):
|
||||
"""High-level dashboard statistics from :func:`get_overview_stats`."""
|
||||
total_drafts: int
|
||||
rated_count: int
|
||||
author_count: int
|
||||
idea_count: int
|
||||
gap_count: int
|
||||
input_tokens: int
|
||||
output_tokens: int
|
||||
false_positive_count: int
|
||||
|
||||
|
||||
class DraftListItem(TypedDict):
|
||||
"""Single draft in the paginated listing from :func:`get_drafts_page`."""
|
||||
name: str
|
||||
title: str
|
||||
date: str | None
|
||||
url: str
|
||||
pages: int
|
||||
group: str
|
||||
source: str
|
||||
score: float
|
||||
novelty: float
|
||||
maturity: float
|
||||
overlap: float
|
||||
momentum: float
|
||||
relevance: float
|
||||
categories: list[str]
|
||||
summary: str
|
||||
readiness: float
|
||||
|
||||
|
||||
class DraftsPage(TypedDict):
|
||||
"""Paginated draft listing from :func:`get_drafts_page`."""
|
||||
drafts: list[DraftListItem]
|
||||
total: int
|
||||
page: int
|
||||
per_page: int
|
||||
pages: int
|
||||
|
||||
|
||||
class AuthorInfo(TypedDict):
|
||||
"""Author entry from :func:`get_top_authors`."""
|
||||
name: str
|
||||
affiliation: str
|
||||
draft_count: int
|
||||
drafts: list[str]
|
||||
|
||||
|
||||
class AuthorNetworkNode(TypedDict):
|
||||
"""Node in the author network graph."""
|
||||
id: str
|
||||
name: str
|
||||
org: str
|
||||
draft_count: int
|
||||
avg_score: float
|
||||
drafts: list[str]
|
||||
|
||||
|
||||
class AuthorNetworkEdge(TypedDict):
|
||||
"""Edge in the author network graph."""
|
||||
source: str
|
||||
target: str
|
||||
weight: int
|
||||
|
||||
|
||||
class AuthorCluster(TypedDict):
|
||||
"""Cluster in the author network."""
|
||||
id: int
|
||||
members: list[str]
|
||||
org_mix: dict[str, int]
|
||||
size: int
|
||||
drafts: list[dict[str, str]]
|
||||
draft_count: int
|
||||
|
||||
|
||||
class AuthorNetwork(TypedDict):
|
||||
"""Full author network from :func:`get_author_network_full`."""
|
||||
nodes: list[AuthorNetworkNode]
|
||||
edges: list[AuthorNetworkEdge]
|
||||
clusters: list[AuthorCluster]
|
||||
|
||||
|
||||
class SimilarityGraphStats(TypedDict):
|
||||
"""Stats sub-dict in similarity graph."""
|
||||
node_count: int
|
||||
edge_count: int
|
||||
avg_similarity: float
|
||||
|
||||
|
||||
class SimilarityGraph(TypedDict):
|
||||
"""Draft similarity network from :func:`get_similarity_graph`."""
|
||||
nodes: list[dict]
|
||||
edges: list[dict]
|
||||
stats: SimilarityGraphStats
|
||||
|
||||
|
||||
class TimelineData(TypedDict):
|
||||
"""Monthly category counts from :func:`get_timeline_data`."""
|
||||
months: list[str]
|
||||
series: dict[str, list[int]]
|
||||
categories: list[str]
|
||||
|
||||
|
||||
class MonitorCost(TypedDict):
|
||||
"""Cost sub-dict in monitor status."""
|
||||
input_tokens: int
|
||||
output_tokens: int
|
||||
estimated_usd: float
|
||||
|
||||
|
||||
class MonitorPipeline(TypedDict):
|
||||
"""Pipeline sub-dict in monitor status."""
|
||||
total_drafts: int
|
||||
rated: int
|
||||
embedded: int
|
||||
with_ideas: int
|
||||
idea_total: int
|
||||
gap_count: int
|
||||
|
||||
|
||||
class MonitorStatus(TypedDict):
|
||||
"""Monitor status from :func:`get_monitor_status`."""
|
||||
last_run: dict | None
|
||||
runs: list[dict]
|
||||
unprocessed: dict[str, int]
|
||||
total_runs: int
|
||||
pipeline: MonitorPipeline
|
||||
cost: MonitorCost
|
||||
|
||||
|
||||
class SearchResults(TypedDict):
|
||||
"""Global search results from :func:`global_search`."""
|
||||
drafts: list[dict]
|
||||
ideas: list[dict]
|
||||
authors: list[dict]
|
||||
gaps: list[dict]
|
||||
|
||||
|
||||
class CitationGraphStats(TypedDict):
|
||||
"""Stats sub-dict in citation graph."""
|
||||
node_count: int
|
||||
edge_count: int
|
||||
rfc_count: int
|
||||
draft_count: int
|
||||
|
||||
|
||||
class CitationGraph(TypedDict):
|
||||
"""Citation network from :func:`get_citation_graph`."""
|
||||
nodes: list[dict]
|
||||
edges: list[dict]
|
||||
stats: CitationGraphStats
|
||||
|
||||
# Add project root to path so we can import ietf_analyzer
|
||||
_project_root = Path(__file__).resolve().parent.parent.parent
|
||||
@@ -20,6 +185,8 @@ if str(_project_root) not in sys.path:
|
||||
|
||||
from ietf_analyzer.config import Config
|
||||
from ietf_analyzer.db import Database
|
||||
from ietf_analyzer.readiness import compute_readiness, compute_readiness_batch
|
||||
from ietf_analyzer.search import HybridSearch
|
||||
|
||||
def _extract_month(time_str: str | None) -> str:
|
||||
"""Normalize a date string to YYYY-MM format."""
|
||||
@@ -55,7 +222,7 @@ def get_db() -> Database:
|
||||
return Database(config)
|
||||
|
||||
|
||||
def get_overview_stats(db: Database) -> dict:
|
||||
def get_overview_stats(db: Database) -> OverviewStats:
|
||||
"""Return high-level stats for the dashboard home page.
|
||||
|
||||
Excludes drafts flagged as false positives from rated counts.
|
||||
@@ -204,7 +371,7 @@ def get_drafts_page(
|
||||
sort: str = "score",
|
||||
sort_dir: str = "desc",
|
||||
source: str = "",
|
||||
) -> dict:
|
||||
) -> DraftsPage:
|
||||
"""Return a paginated, filtered list of drafts with ratings.
|
||||
|
||||
Returns dict with keys: drafts, total, page, per_page, pages.
|
||||
@@ -262,11 +429,9 @@ def get_drafts_page(
|
||||
start = (page - 1) * per_page
|
||||
page_items = filtered[start : start + per_page]
|
||||
|
||||
# Pre-compute readiness for page items (lightweight version)
|
||||
from ietf_analyzer.readiness import compute_readiness
|
||||
readiness_cache = {}
|
||||
for draft, rating in page_items:
|
||||
readiness_cache[draft.name] = compute_readiness(db, draft.name)
|
||||
# Pre-compute readiness in batch (~6 queries total instead of ~200)
|
||||
|
||||
readiness_cache = compute_readiness_batch(db, [d.name for d, _ in page_items])
|
||||
|
||||
drafts = []
|
||||
for draft, rating in page_items:
|
||||
@@ -350,7 +515,7 @@ def get_draft_detail(db: Database, name: str) -> dict | None:
|
||||
}
|
||||
|
||||
# Readiness score
|
||||
from ietf_analyzer.readiness import compute_readiness
|
||||
|
||||
result["readiness"] = compute_readiness(db, name)
|
||||
|
||||
# Annotation
|
||||
@@ -387,7 +552,7 @@ def get_rating_distributions(db: Database) -> dict:
|
||||
return dims
|
||||
|
||||
|
||||
def get_timeline_data(db: Database) -> dict:
|
||||
def get_timeline_data(db: Database) -> TimelineData:
|
||||
"""Return monthly counts by category for timeline chart."""
|
||||
pairs = db.drafts_with_ratings(limit=1000)
|
||||
all_drafts = db.list_drafts(limit=1000, order_by="time ASC")
|
||||
@@ -482,7 +647,7 @@ def read_generated_draft(filename: str) -> str | None:
|
||||
return path.read_text(errors="replace")
|
||||
|
||||
|
||||
def get_top_authors(db: Database, limit: int = 30) -> list[dict]:
|
||||
def get_top_authors(db: Database, limit: int = 30) -> list[AuthorInfo]:
|
||||
"""Return top authors by draft count."""
|
||||
rows = db.top_authors(limit=limit)
|
||||
return [
|
||||
@@ -561,19 +726,19 @@ def get_coauthor_network(db: Database, min_shared: int = 1) -> dict:
|
||||
return {"nodes": nodes, "edges": edges}
|
||||
|
||||
|
||||
def get_similarity_graph(db: Database, threshold: float = 0.75) -> dict:
|
||||
def get_similarity_graph(db: Database, threshold: float = 0.75) -> SimilarityGraph:
|
||||
"""Return draft similarity network (cached)."""
|
||||
return _cached(f"similarity_{threshold}", lambda: _compute_similarity_graph(db, threshold))
|
||||
|
||||
|
||||
def _compute_similarity_graph(db: Database, threshold: float = 0.75) -> dict:
|
||||
def _compute_similarity_graph(db: Database, threshold: float = 0.75) -> SimilarityGraph:
|
||||
"""Return draft similarity network for force-directed graph.
|
||||
|
||||
Returns {nodes: [{name, title, category, score}],
|
||||
edges: [{source, target, similarity}],
|
||||
stats: {node_count, edge_count, avg_similarity}}
|
||||
"""
|
||||
import numpy as np
|
||||
|
||||
|
||||
embeddings = db.all_embeddings()
|
||||
if len(embeddings) < 2:
|
||||
@@ -639,12 +804,12 @@ def get_cross_org_data(db: Database, limit: int = 20) -> list[dict]:
|
||||
]
|
||||
|
||||
|
||||
def get_author_network_full(db: Database) -> dict:
|
||||
def get_author_network_full(db: Database) -> AuthorNetwork:
|
||||
"""Return author network (cached for 5 min)."""
|
||||
return _cached("author_network", lambda: _compute_author_network_full(db))
|
||||
|
||||
|
||||
def _compute_author_network_full(db: Database) -> dict:
|
||||
def _compute_author_network_full(db: Database) -> AuthorNetwork:
|
||||
"""Return enriched co-authorship network with avg scores and cluster info.
|
||||
|
||||
Returns {
|
||||
@@ -704,6 +869,12 @@ def _compute_author_network_full(db: Database) -> dict:
|
||||
visited: set[str] = set()
|
||||
clusters = []
|
||||
|
||||
# Batch-load all drafts referenced by authors (avoid N+1 in cluster loop)
|
||||
_all_dn = set()
|
||||
for _ai in author_info.values():
|
||||
_all_dn.update(_ai.get("drafts", []))
|
||||
_all_drafts_map = db.get_drafts_by_names(list(_all_dn))
|
||||
|
||||
for node in sorted(node_set):
|
||||
if node in visited:
|
||||
continue
|
||||
@@ -728,7 +899,7 @@ def _compute_author_network_full(db: Database) -> dict:
|
||||
org_mix[org] += 1
|
||||
for dn in author_info.get(m, {}).get("drafts", []):
|
||||
if dn not in cluster_drafts:
|
||||
d = db.get_draft(dn)
|
||||
d = _all_drafts_map.get(dn)
|
||||
cluster_drafts[dn] = d.title[:80] if d else dn
|
||||
clusters.append({
|
||||
"id": len(clusters),
|
||||
@@ -756,9 +927,7 @@ def _compute_idea_clusters(db: Database) -> dict:
|
||||
a target of ~30 clusters for readable groupings. Enriches each cluster
|
||||
with WG info and category breakdown.
|
||||
"""
|
||||
import json as _json
|
||||
import numpy as np
|
||||
from sklearn.preprocessing import normalize as sk_normalize
|
||||
|
||||
|
||||
embeddings = db.all_idea_embeddings()
|
||||
if not embeddings:
|
||||
@@ -777,8 +946,8 @@ def _compute_idea_clusters(db: Database) -> dict:
|
||||
draft_cats: dict[str, list[str]] = {}
|
||||
for r in rating_rows:
|
||||
try:
|
||||
draft_cats[r["draft_name"]] = _json.loads(r["categories"]) if r["categories"] else []
|
||||
except (_json.JSONDecodeError, TypeError):
|
||||
draft_cats[r["draft_name"]] = json.loads(r["categories"]) if r["categories"] else []
|
||||
except (json.JSONDecodeError, TypeError):
|
||||
draft_cats[r["draft_name"]] = []
|
||||
|
||||
# Build matrix from embeddings that have matching ideas
|
||||
@@ -792,7 +961,6 @@ def _compute_idea_clusters(db: Database) -> dict:
|
||||
# Ward clustering on normalized vectors — target ~30 clusters scaled by dataset size
|
||||
n_target = max(10, min(40, len(idea_ids) // 12))
|
||||
try:
|
||||
from sklearn.cluster import AgglomerativeClustering
|
||||
clustering = AgglomerativeClustering(n_clusters=n_target, linkage='ward')
|
||||
labels = clustering.fit_predict(matrix_norm)
|
||||
except Exception:
|
||||
@@ -877,7 +1045,6 @@ def _compute_idea_clusters(db: Database) -> dict:
|
||||
# t-SNE for scatter
|
||||
scatter = []
|
||||
try:
|
||||
from sklearn.manifold import TSNE
|
||||
perp = min(30, len(idea_ids) - 1)
|
||||
tsne = TSNE(n_components=2, perplexity=perp, random_state=42, max_iter=500)
|
||||
coords = tsne.fit_transform(matrix_norm)
|
||||
@@ -917,7 +1084,7 @@ def _compute_timeline_animation_data(db: Database) -> dict:
|
||||
animation frames. Each point carries a ``month`` field (YYYY-MM) so the
|
||||
front-end can build cumulative animation frames.
|
||||
"""
|
||||
import numpy as np
|
||||
|
||||
|
||||
embeddings = db.all_embeddings()
|
||||
if len(embeddings) < 5:
|
||||
@@ -935,7 +1102,6 @@ def _compute_timeline_animation_data(db: Database) -> dict:
|
||||
matrix = np.array([embeddings[n] for n in names])
|
||||
|
||||
try:
|
||||
from sklearn.manifold import TSNE
|
||||
tsne = TSNE(n_components=2, perplexity=min(30, len(names) - 1),
|
||||
random_state=42, max_iter=500)
|
||||
coords = tsne.fit_transform(matrix)
|
||||
@@ -975,7 +1141,7 @@ def _compute_timeline_animation_data(db: Database) -> dict:
|
||||
}
|
||||
|
||||
|
||||
def get_monitor_status(db: Database) -> dict:
|
||||
def get_monitor_status(db: Database) -> MonitorStatus:
|
||||
"""Return monitoring status data for dashboard."""
|
||||
runs = db.get_monitor_runs(limit=20)
|
||||
last = runs[0] if runs else None
|
||||
@@ -1014,12 +1180,12 @@ def get_monitor_status(db: Database) -> dict:
|
||||
}
|
||||
|
||||
|
||||
def get_citation_graph(db: Database, min_refs: int = 2) -> dict:
|
||||
def get_citation_graph(db: Database, min_refs: int = 2) -> CitationGraph:
|
||||
"""Return citation graph (cached for 5 min)."""
|
||||
return _cached(f"citation_graph_{min_refs}", lambda: _compute_citation_graph(db, min_refs))
|
||||
|
||||
|
||||
def _compute_citation_graph(db: Database, min_refs: int = 2) -> dict:
|
||||
def _compute_citation_graph(db: Database, min_refs: int = 2) -> CitationGraph:
|
||||
"""Return citation network data for force-directed graph.
|
||||
|
||||
Returns {nodes: [{id, type, title, influence, ...}],
|
||||
@@ -1131,7 +1297,7 @@ def _compute_citation_graph(db: Database, min_refs: int = 2) -> dict:
|
||||
}
|
||||
|
||||
|
||||
def global_search(db: Database, query: str) -> dict:
|
||||
def global_search(db: Database, query: str) -> SearchResults:
|
||||
"""Search across drafts (FTS5), ideas, authors, and gaps.
|
||||
|
||||
Returns {drafts: [...], ideas: [...], authors: [...], gaps: [...]}.
|
||||
@@ -1144,7 +1310,6 @@ def global_search(db: Database, query: str) -> dict:
|
||||
|
||||
# 1. Drafts via FTS5
|
||||
try:
|
||||
import re
|
||||
fts_query = re.sub(r'[^\w\s]', '', q)
|
||||
fts_query = re.sub(r'\b(NEAR|OR|AND|NOT)\b', '', fts_query, flags=re.IGNORECASE)
|
||||
fts_query = re.sub(r'\s+', ' ', fts_query).strip()
|
||||
@@ -1242,7 +1407,7 @@ def get_landscape_tsne(db: Database) -> list[dict]:
|
||||
|
||||
def _compute_landscape_tsne(db: Database) -> list[dict]:
|
||||
"""Compute t-SNE from embeddings, return [{name, title, x, y, category, score}]."""
|
||||
import numpy as np
|
||||
|
||||
|
||||
embeddings = db.all_embeddings()
|
||||
if len(embeddings) < 5:
|
||||
@@ -1260,7 +1425,6 @@ def _compute_landscape_tsne(db: Database) -> list[dict]:
|
||||
matrix = np.array([embeddings[n] for n in names])
|
||||
|
||||
try:
|
||||
from sklearn.manifold import TSNE
|
||||
tsne = TSNE(n_components=2, perplexity=min(30, len(names) - 1),
|
||||
random_state=42, max_iter=500)
|
||||
coords = tsne.fit_transform(matrix)
|
||||
@@ -1295,7 +1459,7 @@ def get_comparison_data(db: Database, names: list[str]) -> dict | None:
|
||||
comparison_text: str | None,
|
||||
}
|
||||
"""
|
||||
import numpy as np
|
||||
|
||||
|
||||
drafts_data = []
|
||||
all_ideas: dict[str, list[dict]] = {}
|
||||
@@ -1384,9 +1548,6 @@ def get_comparison_data(db: Database, names: list[str]) -> dict | None:
|
||||
|
||||
def get_ask_search(db: Database, question: str, top_k: int = 5) -> dict:
|
||||
"""Search-only (free) — returns sources + cached answer if available."""
|
||||
from ietf_analyzer.config import Config
|
||||
from ietf_analyzer.search import HybridSearch
|
||||
|
||||
config = Config.load()
|
||||
searcher = HybridSearch(config, db)
|
||||
return searcher.search_only(question, top_k=top_k)
|
||||
@@ -1394,9 +1555,6 @@ def get_ask_search(db: Database, question: str, top_k: int = 5) -> dict:
|
||||
|
||||
def get_ask_synthesize(db: Database, question: str, top_k: int = 5, cheap: bool = True) -> dict:
|
||||
"""Run Claude synthesis (costs tokens, result is cached permanently)."""
|
||||
from ietf_analyzer.config import Config
|
||||
from ietf_analyzer.search import HybridSearch
|
||||
|
||||
config = Config.load()
|
||||
searcher = HybridSearch(config, db)
|
||||
return searcher.ask(question, top_k=top_k, cheap=cheap)
|
||||
|
||||
tests/test_analyzer.py (new file, 166 lines)
@@ -0,0 +1,166 @@
|
||||
"""Tests for pure functions in ietf_analyzer.analyzer (no API calls)."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
|
||||
import pytest
|
||||
|
||||
from ietf_analyzer.analyzer import Analyzer
|
||||
from ietf_analyzer.models import Rating
|
||||
|
||||
|
||||
# ---- _extract_json ----
|
||||
|
||||
|
||||
class TestExtractJson:
|
||||
"""Test the _extract_json static-ish method that strips markdown fences."""
|
||||
|
||||
@staticmethod
|
||||
def _extract(text: str) -> str:
|
||||
# _extract_json is an instance method that never actually uses self,
# so we pass None for self rather than constructing a full Analyzer (which needs an API key).
|
||||
return Analyzer._extract_json(None, text)
|
||||
|
||||
def test_plain_json(self):
|
||||
raw = '{"key": "value"}'
|
||||
assert self._extract(raw) == '{"key": "value"}'
|
||||
|
||||
def test_json_with_fences(self):
|
||||
raw = '```json\n{"key": "value"}\n```'
|
||||
assert self._extract(raw) == '{"key": "value"}'
|
||||
|
||||
def test_json_with_plain_fences(self):
|
||||
raw = '```\n{"key": "value"}\n```'
|
||||
assert self._extract(raw) == '{"key": "value"}'
|
||||
|
||||
def test_json_with_whitespace(self):
|
||||
raw = ' \n {"key": "value"} \n '
|
||||
assert self._extract(raw) == '{"key": "value"}'
|
||||
|
||||
def test_json_array_with_fences(self):
|
||||
raw = '```json\n[{"a": 1}, {"b": 2}]\n```'
|
||||
result = self._extract(raw)
|
||||
assert json.loads(result) == [{"a": 1}, {"b": 2}]
|
||||
|
||||
def test_multiline_json_with_fences(self):
|
||||
raw = '```json\n{\n "key": "value",\n "num": 42\n}\n```'
|
||||
result = self._extract(raw)
|
||||
parsed = json.loads(result)
|
||||
assert parsed == {"key": "value", "num": 42}
|
||||
|
||||
def test_no_fences_passthrough(self):
|
||||
raw = '[1, 2, 3]'
|
||||
assert self._extract(raw) == '[1, 2, 3]'
|
||||
|
||||
def test_empty_string(self):
|
||||
assert self._extract('') == ''
|
||||
|
||||
def test_fences_with_trailing_whitespace(self):
|
||||
raw = '```json\n{"ok": true}\n``` \n'
|
||||
result = self._extract(raw)
|
||||
assert json.loads(result) == {"ok": True}
|
||||
|
||||
|
||||
# ---- _clamp_rating ----
|
||||
|
||||
|
||||
class TestClampRating:
|
||||
def test_normal_values(self):
|
||||
assert Analyzer._clamp_rating(3) == 3
|
||||
assert Analyzer._clamp_rating(1) == 1
|
||||
assert Analyzer._clamp_rating(5) == 5
|
||||
|
||||
def test_clamp_high(self):
|
||||
assert Analyzer._clamp_rating(10) == 5
|
||||
assert Analyzer._clamp_rating(99) == 5
|
||||
|
||||
def test_clamp_low(self):
|
||||
assert Analyzer._clamp_rating(0) == 1
|
||||
assert Analyzer._clamp_rating(-5) == 1
|
||||
|
||||
def test_float_truncated(self):
|
||||
assert Analyzer._clamp_rating(3.7) == 3
|
||||
assert Analyzer._clamp_rating(4.9) == 4
|
||||
|
||||
def test_string_number(self):
|
||||
assert Analyzer._clamp_rating("4") == 4
|
||||
assert Analyzer._clamp_rating("1") == 1
|
||||
|
||||
def test_invalid_returns_default(self):
|
||||
assert Analyzer._clamp_rating("abc") == 3
|
||||
assert Analyzer._clamp_rating(None) == 3
|
||||
assert Analyzer._clamp_rating([]) == 3
|
||||
|
||||
def test_custom_default(self):
|
||||
assert Analyzer._clamp_rating("abc", default=2) == 2
|
||||
|
||||
def test_custom_range(self):
|
||||
assert Analyzer._clamp_rating(8, lo=1, hi=10) == 8
|
||||
assert Analyzer._clamp_rating(15, lo=1, hi=10) == 10
|
||||
|
||||
|
||||
# ---- _parse_rating ----
|
||||
|
||||
|
||||
class TestParseRating:
|
||||
"""Test _parse_rating with compact and verbose key formats."""
|
||||
|
||||
@staticmethod
|
||||
def _parse(draft_name: str, data: dict) -> Rating:
|
||||
# _parse_rating calls self._clamp_rating, so we need a minimal object.
|
||||
# object.__new__ gives an Analyzer instance without running __init__, so no API key is needed.
|
||||
stub = object.__new__(Analyzer)
|
||||
return stub._parse_rating(draft_name, data)
|
||||
|
||||
def test_compact_keys(self):
|
||||
data = {
|
||||
"s": "A summary",
|
||||
"n": 4, "nn": "novel approach",
|
||||
"m": 3, "mn": "early stage",
|
||||
"o": 2, "on": "minor overlap",
|
||||
"mo": 5, "mon": "strong momentum",
|
||||
"r": 4, "rn": "relevant",
|
||||
"c": ["A2A protocols"],
|
||||
}
|
||||
rating = self._parse("draft-test", data)
|
||||
assert rating.draft_name == "draft-test"
|
||||
assert rating.novelty == 4
|
||||
assert rating.maturity == 3
|
||||
assert rating.overlap == 2
|
||||
assert rating.momentum == 5
|
||||
assert rating.relevance == 4
|
||||
assert rating.summary == "A summary"
|
||||
assert rating.categories == ["A2A protocols"]
|
||||
|
||||
def test_verbose_keys(self):
|
||||
data = {
|
||||
"summary": "A summary",
|
||||
"novelty": 3, "novelty_note": "ok",
|
||||
"maturity": 2, "maturity_note": "early",
|
||||
"overlap": 1, "overlap_note": "unique",
|
||||
"momentum": 4, "momentum_note": "active",
|
||||
"relevance": 5, "relevance_note": "core",
|
||||
"categories": ["AI safety/alignment"],
|
||||
}
|
||||
rating = self._parse("draft-test-2", data)
|
||||
assert rating.novelty == 3
|
||||
assert rating.relevance == 5
|
||||
assert rating.categories == ["AI safety/alignment"]
|
||||
|
||||
def test_missing_keys_use_defaults(self):
|
||||
data = {}
|
||||
rating = self._parse("draft-empty", data)
|
||||
assert rating.novelty == 3 # default
|
||||
assert rating.maturity == 3
|
||||
assert rating.summary == ""
|
||||
assert rating.categories == []
|
||||
|
||||
def test_out_of_range_clamped(self):
|
||||
data = {"n": 99, "m": -1, "o": 0, "mo": 10, "r": 6}
|
||||
rating = self._parse("draft-clamp", data)
|
||||
assert rating.novelty == 5
|
||||
assert rating.maturity == 1
|
||||
assert rating.overlap == 1
|
||||
assert rating.momentum == 5
|
||||
assert rating.relevance == 5
|
||||
tests/test_search.py (new file, 112 lines)
@@ -0,0 +1,112 @@
|
||||
"""Tests for ietf_analyzer.search — sanitize_fts_query."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import pytest
|
||||
|
||||
from ietf_analyzer.search import HybridSearch
|
||||
|
||||
|
||||
class TestSanitizeFtsQuery:
|
||||
"""Test FTS5 query sanitization against injection and edge cases."""
|
||||
|
||||
def test_plain_query(self):
|
||||
assert HybridSearch.sanitize_fts_query("agent protocol") == "agent protocol"
|
||||
|
||||
def test_strips_quotes(self):
|
||||
result = HybridSearch.sanitize_fts_query('"agent" OR "protocol"')
|
||||
assert '"' not in result
|
||||
assert "agent" in result
|
||||
|
||||
def test_strips_parentheses(self):
|
||||
result = HybridSearch.sanitize_fts_query("(agent AND protocol)")
|
||||
assert "(" not in result
|
||||
assert ")" not in result
|
||||
|
||||
def test_strips_asterisk(self):
|
||||
result = HybridSearch.sanitize_fts_query("agent*")
|
||||
assert "*" not in result
|
||||
assert "agent" in result
|
||||
|
||||
def test_removes_boolean_OR(self):
|
||||
result = HybridSearch.sanitize_fts_query("agent OR protocol")
|
||||
assert "OR" not in result
|
||||
assert "agent" in result
|
||||
assert "protocol" in result
|
||||
|
||||
def test_removes_boolean_AND(self):
|
||||
result = HybridSearch.sanitize_fts_query("agent AND protocol")
|
||||
assert "AND" not in result
|
||||
|
||||
def test_removes_boolean_NOT(self):
|
||||
result = HybridSearch.sanitize_fts_query("agent NOT malicious")
|
||||
assert "NOT" not in result
|
||||
assert "malicious" in result
|
||||
|
||||
def test_removes_NEAR(self):
|
||||
result = HybridSearch.sanitize_fts_query("agent NEAR protocol")
|
||||
assert "NEAR" not in result
|
||||
|
||||
def test_case_insensitive_operators(self):
|
||||
result = HybridSearch.sanitize_fts_query("agent or protocol")
|
||||
assert " or " not in result
|
||||
# "or" as standalone word should be removed
|
||||
words = result.split()
|
||||
assert "or" not in [w.lower() for w in words]
|
||||
|
||||
def test_injection_attempt_column_filter(self):
|
||||
"""FTS5 column filter syntax should be stripped."""
|
||||
result = HybridSearch.sanitize_fts_query("title:agent")
|
||||
# The colon is stripped, leaving just "titleagent" or "title agent"
|
||||
assert ":" not in result
|
||||
|
||||
def test_injection_attempt_special_chars(self):
|
||||
result = HybridSearch.sanitize_fts_query('"; DROP TABLE drafts; --')
|
||||
assert ";" not in result
|
||||
assert '"' not in result
|
||||
assert "--" not in result
|
||||
|
||||
def test_empty_query(self):
|
||||
assert HybridSearch.sanitize_fts_query("") == ""
|
||||
|
||||
def test_only_operators(self):
|
||||
result = HybridSearch.sanitize_fts_query("OR AND NOT")
|
||||
assert result.strip() == ""
|
||||
|
||||
def test_only_special_chars(self):
|
||||
result = HybridSearch.sanitize_fts_query('"*(){}[]')
|
||||
assert result.strip() == ""
|
||||
|
||||
def test_collapses_whitespace(self):
|
||||
result = HybridSearch.sanitize_fts_query("agent protocol test")
|
||||
assert result == "agent protocol test"
|
||||
|
||||
def test_preserves_numbers(self):
|
||||
result = HybridSearch.sanitize_fts_query("rfc 8259")
|
||||
assert result == "rfc 8259"
|
||||
|
||||
def test_preserves_underscores(self):
|
||||
result = HybridSearch.sanitize_fts_query("ai_agent_protocol")
|
||||
assert result == "ai_agent_protocol"
|
||||
|
||||
def test_unicode_preserved(self):
|
||||
"""Non-ASCII alphanumeric characters should be preserved."""
|
||||
result = HybridSearch.sanitize_fts_query("müller agent")
|
||||
assert "müller" in result or "mller" in result # depends on \w locale
|
||||
|
||||
def test_mixed_injection(self):
|
||||
"""Complex injection attempt with multiple vectors."""
|
||||
result = HybridSearch.sanitize_fts_query(
|
||||
'(agent* NEAR/5 "protocol") OR title:exploit NOT safe'
|
||||
)
|
||||
# NEAR/5 becomes NEAR5 after stripping the slash, which is no longer
|
||||
# a standalone NEAR operator — it's just a harmless token.
|
||||
assert "OR" not in result.split()
|
||||
assert "NOT" not in result.split()
|
||||
assert "*" not in result
|
||||
assert '"' not in result
|
||||
assert "(" not in result
|
||||
assert ":" not in result
|
||||
# Core words should survive
|
||||
assert "agent" in result
|
||||
assert "protocol" in result
|
||||