Christian Nennemann 20c45a7eba Complete remaining medium/low issues: performance, CLI, types, CI, tests
Performance:
- Batch readiness computation (~200 queries → ~6 per page)
- Batch draft lookup in author network (N+1 → single query)
- File-based similarity matrix cache (.npy + metadata sidecar)
- 5-minute TTL embedding cache for search queries

CLI quality:
- Add pass_cfg_db decorator, convert ~30 commands to shared config/db lifecycle
- Add --dry-run to analyze, embed, embed-ideas, ideas, gaps commands
- Move 15+ in-function imports to top of data.py

Types & documentation:
- Add 16 TypedDicts to data.py, annotate 12 function return types
- Add ethics section to Post 06 (premature standardization, power asymmetry)
- Add EU AI Act Article 43 conformity mapping to Post 06
- Add NIS2 and CRA references to Post 04

CI & testing:
- Add GitHub Actions CI workflow (Python 3.11+3.12, ruff, pytest)
- Add API documentation for all 20 endpoints (data/reports/api-docs.md)
- Add 41 new tests (test_analyzer.py, test_search.py) — 64 total pass

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 14:06:54 +01:00


# IETF Draft Analyzer — API Documentation
All API endpoints return JSON by default. Several support `?format=csv` for CSV export.
Base URL: `http://localhost:5000`
---
## Public Endpoints
### GET /api/stats
Overview statistics for the entire corpus.
**Parameters:** None
**Response:**
```json
{
"total_drafts": 361,
"rated_drafts": 260,
"total_authors": 403,
"total_ideas": 1262,
"total_gaps": 12,
"avg_score": 3.42
}
```
---
### GET /api/drafts
Paginated, filterable list of drafts.
**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `page` | int | 1 | Page number |
| `q` | string | "" | Full-text search query |
| `cat` | string | "" | Filter by category |
| `source` | string | "" | Filter by source (ietf, w3c) |
| `min_score` | float | 0.0 | Minimum composite score |
| `sort` | string | "score" | Sort field |
| `dir` | string | "desc" | Sort direction (asc/desc) |
| `format` | string | "json" | Response format: "json" or "csv" |
**Response:** JSON object with `drafts` array and pagination metadata.
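
The parameters above combine into an ordinary query string. A minimal sketch of building such a URL with the Python standard library; the helper name `drafts_url` is hypothetical, and only the parameter names and base URL come from this document:

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:5000"  # base URL as stated in this document


def drafts_url(**params):
    """Build a GET /api/drafts URL, skipping parameters left as None.

    Hypothetical helper for illustration; omitted parameters fall back
    to the server-side defaults listed in the table above.
    """
    query = {key: value for key, value in params.items() if value is not None}
    suffix = f"?{urlencode(query)}" if query else ""
    return f"{BASE_URL}/api/drafts{suffix}"


# Second page of drafts matching a query, with a minimum composite score.
url = drafts_url(q="agent identity", min_score=3.5, sort="score", dir="desc", page=2)
print(url)
```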
---
### GET /api/drafts/{name}
Detail for a single draft including rating, authors, ideas, and references.
**Parameters:**
| Param | Type | Description |
|-------|------|-------------|
| `name` | string | Draft name, e.g. `draft-ietf-ai-agent-protocol` |
**Response:** JSON object with full draft detail, or `{"error": "Draft not found"}` (404).
---
### GET /api/categories
Category names and draft counts.
**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `format` | string | "json" | "json" or "csv" |
**Response:**
```json
{
"A2A protocols": 45,
"AI safety/alignment": 38,
...
}
```
---
### GET /api/ratings
Rating distributions across the corpus.
**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `format` | string | "json" | "json" or "csv" |
**Response:** JSON object with arrays: `names`, `scores`, `novelty`, `maturity`, `overlap`, `momentum`, `relevance`, `categories`.
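
Clients that want one record per draft can zip the arrays back together. The sketch below assumes the arrays are index-aligned (entry `i` of each array describes the draft at `names[i]`), which this document does not state explicitly; the sample values are made up:

```python
# Array keys listed in the response description above.
RATING_FIELDS = ["scores", "novelty", "maturity", "overlap",
                 "momentum", "relevance", "categories"]


def ratings_to_rows(payload):
    """Turn the column-oriented /api/ratings payload into per-draft dicts.

    Assumes every array has the same length and ordering as `names`.
    """
    rows = []
    for i, name in enumerate(payload["names"]):
        row = {"name": name}
        for field in RATING_FIELDS:
            row[field] = payload[field][i]
        rows.append(row)
    return rows


# Illustrative payload only; draft names and values are invented.
sample = {
    "names": ["draft-example-a", "draft-example-b"],
    "scores": [3.5, 4.0],
    "novelty": [3, 4],
    "maturity": [2, 5],
    "overlap": [1, 2],
    "momentum": [4, 3],
    "relevance": [5, 4],
    "categories": ["A2A protocols", "AI safety/alignment"],
}
rows = ratings_to_rows(sample)
```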
---
### GET /api/timeline
Timeline data showing draft publication over time.
**Parameters:** None
**Response:** JSON object with timeline series data.
---
### GET /api/landscape
t-SNE 2D embedding landscape of all drafts.
**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `format` | string | "json" | "json" or "csv" |
**Response:** JSON array of `{name, x, y, category, score}` points.
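
As one example of consuming these points, the sketch below averages the coordinates per category, for instance to place category labels on a scatter plot. The function name and sample points are hypothetical; only the field names come from the response description above:

```python
from collections import defaultdict


def category_centroids(points):
    """Average the x/y coordinates of landscape points per category."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # running x sum, y sum, count
    for point in points:
        acc = sums[point["category"]]
        acc[0] += point["x"]
        acc[1] += point["y"]
        acc[2] += 1
    return {cat: (sx / n, sy / n) for cat, (sx, sy, n) in sums.items()}


# Illustrative points only; names and coordinates are invented.
sample = [
    {"name": "draft-example-a", "x": 1.0, "y": 2.0,
     "category": "A2A protocols", "score": 3.5},
    {"name": "draft-example-b", "x": 3.0, "y": 4.0,
     "category": "A2A protocols", "score": 4.0},
    {"name": "draft-example-c", "x": -1.0, "y": 0.5,
     "category": "AI safety/alignment", "score": 3.0},
]
centroids = category_centroids(sample)
```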
---
### GET /api/similarity
Draft similarity network graph.
**Parameters:** None
**Response:** JSON object with `nodes` and `edges` arrays for a force-directed graph.
---
### GET /api/idea-clusters
Clustered ideas across drafts.
**Parameters:** None
**Response:** JSON object with cluster data.
---
### GET /api/ideas
All extracted technical ideas, grouped by type.
**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `format` | string | "json" | "json" or "csv" |
**Response:** JSON object with `ideas` array.
---
### GET /api/authors/network
Author collaboration network graph.
**Parameters:** None
**Response:** JSON object with `nodes` and `edges` arrays.
---
### GET /api/citations
Citation/reference graph between drafts.
**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `min_refs` | int | 2 | Minimum references to include a node |
**Response:** JSON object with citation graph data.
---
### GET /api/search
Global search across drafts, ideas, authors, and gaps.
**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `q` | string | "" | Search query (required for results) |
| `format` | string | "json" | "json" or "csv" |
**Response:**
```json
{
"drafts": [...],
"ideas": [...],
"authors": [...],
"gaps": [...]
}
```
---
### POST /api/ask
Search-only question answering (free, no Claude API call). Returns relevant sources and any cached answer.
**Request body:**
```json
{
"question": "What drafts address agent authentication?",
"top_k": 5
}
```
**Response:** JSON with `sources` array and optional cached `answer`.
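
A minimal sketch of preparing this request with the Python standard library. The helper name `ask_request` is hypothetical, and sending a `Content-Type: application/json` header is an assumption about what the server expects. The request is only constructed here; nothing is sent until it is passed to `urllib.request.urlopen`:

```python
import json
from urllib.request import Request


def ask_request(question, top_k=5, base_url="http://localhost:5000"):
    """Build (but do not send) a POST /api/ask request.

    The JSON content type is assumed, not confirmed by this document.
    """
    body = json.dumps({"question": question, "top_k": top_k}).encode("utf-8")
    return Request(
        f"{base_url}/api/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = ask_request("What drafts address agent authentication?")
# With the server running: urllib.request.urlopen(req) returns the response.
```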
---
## Admin-Only Endpoints
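These endpoints require admin access. In development mode (`--dev` flag) they are accessible without authentication; in production mode they return 403 (see Authentication).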
### POST /api/ask/synthesize
Synthesize an answer using Claude (costs tokens, rate-limited to 10 req/min/IP). Answers are cached permanently.
**Auth:** Admin required
**Request body:**
```json
{
"question": "How do IETF drafts approach agent identity?",
"top_k": 5
}
```
**Response:** JSON with `sources` array and synthesized `answer`.
**Errors:** 429 if rate-limited.
---
### GET /api/gaps
All identified standardization gaps.
**Auth:** Admin required
**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `format` | string | "json" | "json" or "csv" |
**Response:** JSON array of gap objects.
---
### GET /api/gaps/{gap_id}
Detail for a single gap.
**Auth:** Admin required
**Parameters:**
| Param | Type | Description |
|-------|------|-------------|
| `gap_id` | int | Gap ID |
**Response:** JSON object with gap detail, or `{"error": "Gap not found"}` (404).
---
### POST /api/compare
Compare two or more drafts using Claude (costs tokens, rate-limited to 10 req/min/IP).
**Auth:** Admin required
**Request body:**
```json
{
"drafts": ["draft-name-one", "draft-name-two"]
}
```
**Response:**
```json
{
"text": "Comparison analysis text...",
"drafts": ["draft-name-one", "draft-name-two"]
}
```
**Errors:** 400 if fewer than 2 drafts provided; 429 if rate-limited.
---
### POST /api/drafts/{name}/annotate
Add or update annotations (notes, tags) for a draft.
**Auth:** Admin required
**Request body:**
```json
{
"note": "Interesting approach to agent handshake",
"tags": ["important", "review"],
"add_tag": "flagged",
"remove_tag": "review"
}
```
All fields are optional. `add_tag` and `remove_tag` each add or remove a single tag on the existing tag list rather than replacing it.
**Response:**
```json
{
"success": true,
"annotation": {"note": "...", "tags": ["important", "flagged"]}
}
```
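
One way to read the request and response above: `note` and `tags` replace the stored values, then `add_tag` and `remove_tag` are applied in that order. This ordering is an assumption that happens to reproduce the example response, not a description of the server code, and `apply_annotation` is a hypothetical name:

```python
def apply_annotation(current, update):
    """Apply an annotate request body to a stored annotation.

    Assumed semantics: `note` and `tags` overwrite, then `add_tag` is
    appended if absent, then `remove_tag` is dropped if present.
    """
    result = {
        "note": current.get("note", ""),
        "tags": list(current.get("tags", [])),
    }
    if "note" in update:
        result["note"] = update["note"]
    if "tags" in update:
        result["tags"] = list(update["tags"])
    add = update.get("add_tag")
    if add and add not in result["tags"]:
        result["tags"].append(add)
    remove = update.get("remove_tag")
    if remove in result["tags"]:
        result["tags"].remove(remove)
    return result


# The request body from the example above, applied to an empty annotation.
updated = apply_annotation({}, {
    "note": "Interesting approach to agent handshake",
    "tags": ["important", "review"],
    "add_tag": "flagged",
    "remove_tag": "review",
})
```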
---
### GET /api/monitor
Pipeline monitoring status (processing progress, error counts).
**Auth:** Admin required
**Response:** JSON object with monitoring data.
---
## Non-API Data Endpoints
### GET /export/obsidian
Download the entire research corpus as an Obsidian vault ZIP file.
**Response:** `application/zip` file download.
---
## Authentication
- **Production mode** (default): Admin endpoints return 403.
- **Development mode** (`--dev` flag): All admin endpoints are accessible without authentication.
- Rate-limited endpoints (`/api/ask/synthesize`, `/api/compare`): 10 requests per minute per IP, enforced via in-memory sliding window.
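
For readers unfamiliar with the technique, an in-memory sliding-window limiter can be sketched as below. This is an illustration under stated assumptions, not the project's implementation: the class name, the injectable clock, and the use of a deque per IP are all hypothetical. Only the limit (10) and window (one minute) come from this document.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each key."""

    def __init__(self, limit=10, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested
        self.hits = defaultdict(deque)  # key (e.g. client IP) -> timestamps

    def allow(self, key):
        now = self.clock()
        timestamps = self.hits[key]
        # Discard requests that have slid out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False  # caller would respond with 429
        timestamps.append(now)
        return True


# Demonstration with a fake clock instead of real time.
fake_now = [0.0]
limiter = SlidingWindowLimiter(clock=lambda: fake_now[0])
results = [limiter.allow("203.0.113.7") for _ in range(11)]
```

Because the state lives in process memory, counters in a sketch like this reset on restart and are not shared between worker processes.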
## Error Responses
All errors return JSON:
```json
{"error": "Description of the error"}
```
Common HTTP status codes:
- `400` — Bad request (missing parameters)
- `403` — Admin access required
- `404` — Resource not found
- `429` — Rate limit exceeded
- `500` — Internal server error
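
A client can turn these responses into exceptions. The sketch below is hypothetical (`ApiError` and `raise_for_error` are illustrative names); it relies only on the `{"error": ...}` shape and the status codes listed above, and falls back to the generic meaning when the body is not the expected JSON:

```python
import json

# Fallback descriptions, taken from the status code list above.
STATUS_MEANINGS = {
    400: "Bad request (missing parameters)",
    403: "Admin access required",
    404: "Resource not found",
    429: "Rate limit exceeded",
    500: "Internal server error",
}


class ApiError(Exception):
    """Raised for any response with an HTTP status of 400 or above."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def raise_for_error(status, body_text):
    """Raise ApiError for error statuses; return None otherwise."""
    if status < 400:
        return
    try:
        message = json.loads(body_text)["error"]
    except (ValueError, KeyError, TypeError):
        # Body was not JSON or lacked an "error" field.
        message = STATUS_MEANINGS.get(status, "Unknown error")
    raise ApiError(status, message)
```

For example, `raise_for_error(404, '{"error": "Draft not found"}')` raises an `ApiError` whose `message` is `"Draft not found"`.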