Complete remaining medium/low issues: performance, CLI, types, CI, tests

Performance:
- Batch readiness computation (~200 queries → ~6 per page)
- Batch draft lookup in author network (N+1 → single query)
- File-based similarity matrix cache (.npy + metadata sidecar)
- 5-minute TTL embedding cache for search queries

CLI quality:
- Add pass_cfg_db decorator, convert ~30 commands to shared config/db lifecycle
- Add --dry-run to analyze, embed, embed-ideas, ideas, gaps commands
- Move 15+ in-function imports to top of data.py

Types & documentation:
- Add 16 TypedDicts to data.py, annotate 12 function return types
- Add ethics section to Post 06 (premature standardization, power asymmetry)
- Add EU AI Act Article 43 conformity mapping to Post 06
- Add NIS2 and CRA references to Post 04

CI & testing:
- Add GitHub Actions CI workflow (Python 3.11+3.12, ruff, pytest)
- Add API documentation for all 20 endpoints (data/reports/api-docs.md)
- Add 41 new tests (test_analyzer.py, test_search.py); 64 total pass

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
359
data/reports/api-docs.md
Normal file
@@ -0,0 +1,359 @@
# IETF Draft Analyzer — API Documentation

All API endpoints return JSON by default. Several support `?format=csv` for CSV export.

Base URL: `http://localhost:5000`

---

## Public Endpoints

### GET /api/stats

Overview statistics for the entire corpus.

**Parameters:** None

**Response:**
```json
{
  "total_drafts": 361,
  "rated_drafts": 260,
  "total_authors": 403,
  "total_ideas": 1262,
  "total_gaps": 12,
  "avg_score": 3.42
}
```

---

### GET /api/drafts

Paginated, filterable list of drafts.

**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `page` | int | 1 | Page number |
| `q` | string | "" | Full-text search query |
| `cat` | string | "" | Filter by category |
| `source` | string | "" | Filter by source (ietf, w3c) |
| `min_score` | float | 0.0 | Minimum composite score |
| `sort` | string | "score" | Sort field |
| `dir` | string | "desc" | Sort direction (asc/desc) |
| `format` | string | "json" | Response format: "json" or "csv" |

**Response:** JSON object with `drafts` array and pagination metadata.
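
The parameters in the table above combine into an ordinary query string. A minimal sketch of building such a URL with the standard library, assuming the base URL given at the top of this document; the helper name `drafts_url` is hypothetical and not part of the project:

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:5000"  # base URL stated at the top of this document


def drafts_url(**params):
    """Build a GET /api/drafts URL, dropping parameters left at None.

    Hypothetical helper for illustration; it only assembles the URL and
    does not contact the server.
    """
    query = {key: value for key, value in params.items() if value is not None}
    if not query:
        return f"{BASE_URL}/api/drafts"
    return f"{BASE_URL}/api/drafts?{urlencode(query)}"


# Second page of IETF drafts with a composite score of at least 3.5, best first.
url = drafts_url(page=2, source="ietf", min_score=3.5, sort="score", dir="desc")
print(url)
```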

---

### GET /api/drafts/{name}

Detail for a single draft including rating, authors, ideas, and references.

**Parameters:**
| Param | Type | Description |
|-------|------|-------------|
| `name` | string | Draft name, e.g. `draft-ietf-ai-agent-protocol` |

**Response:** JSON object with full draft detail, or `{"error": "Draft not found"}` (404).

---

### GET /api/categories

Category names and draft counts.

**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `format` | string | "json" | "json" or "csv" |

**Response:**
```json
{
  "A2A protocols": 45,
  "AI safety/alignment": 38,
  ...
}
```

---

### GET /api/ratings

Rating distributions across the corpus.

**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `format` | string | "json" | "json" or "csv" |

**Response:** JSON object with arrays: `names`, `scores`, `novelty`, `maturity`, `overlap`, `momentum`, `relevance`, `categories`.
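
A response made of separate arrays is often easier to work with as one record per draft. The sketch below assumes the eight arrays are index-aligned (entry `i` of each array describes the same draft), which this document does not state explicitly; the sample values are invented for illustration and `ratings_to_records` is a hypothetical helper.

```python
RATING_KEYS = ["names", "scores", "novelty", "maturity", "overlap",
               "momentum", "relevance", "categories"]


def ratings_to_records(payload):
    """Zip the arrays from /api/ratings into one dict per draft.

    Assumption: the arrays are parallel, so position i in every array
    refers to the same draft. A length mismatch is treated as an error.
    """
    lengths = {len(payload[key]) for key in RATING_KEYS}
    if len(lengths) > 1:
        raise ValueError("arrays differ in length; they are not index-aligned")
    columns = (payload[key] for key in RATING_KEYS)
    return [dict(zip(RATING_KEYS, row)) for row in zip(*columns)]


# Invented sample payload; real values come from the server.
sample = {
    "names": ["draft-example-one", "draft-example-two"],
    "scores": [3.8, 2.9],
    "novelty": [4, 3],
    "maturity": [3, 2],
    "overlap": [2, 4],
    "momentum": [4, 3],
    "relevance": [5, 3],
    "categories": ["A2A protocols", "AI safety/alignment"],
}
records = ratings_to_records(sample)
```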

---

### GET /api/timeline

Timeline data showing draft publication over time.

**Parameters:** None

**Response:** JSON object with timeline series data.

---

### GET /api/landscape

t-SNE 2D embedding landscape of all drafts.

**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `format` | string | "json" | "json" or "csv" |

**Response:** JSON array of `{name, x, y, category, score}` points.
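
For plotting, the points are typically split into one series per category. A minimal sketch, assuming each point carries exactly the five fields listed above; the sample points are invented and `group_by_category` is a hypothetical helper:

```python
from collections import defaultdict


def group_by_category(points):
    """Group landscape points into {category: [(x, y, name), ...]}."""
    groups = defaultdict(list)
    for point in points:
        groups[point["category"]].append((point["x"], point["y"], point["name"]))
    return dict(groups)


# Invented sample points; real coordinates come from the t-SNE projection.
points = [
    {"name": "draft-example-one", "x": 1.5, "y": -0.5, "category": "A2A protocols", "score": 3.8},
    {"name": "draft-example-two", "x": -2.0, "y": 0.25, "category": "AI safety/alignment", "score": 2.9},
    {"name": "draft-example-three", "x": 1.0, "y": -1.0, "category": "A2A protocols", "score": 4.1},
]
series = group_by_category(points)
```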

---

### GET /api/similarity

Draft similarity network graph.

**Parameters:** None

**Response:** JSON object with `nodes` and `edges` arrays for a force-directed graph.

---

### GET /api/idea-clusters

Clustered ideas across drafts.

**Parameters:** None

**Response:** JSON object with cluster data.

---

### GET /api/ideas

All extracted technical ideas, grouped by type.

**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `format` | string | "json" | "json" or "csv" |

**Response:** JSON object with `ideas` array.

---

### GET /api/authors/network

Author collaboration network graph.

**Parameters:** None

**Response:** JSON object with `nodes` and `edges` arrays.

---

### GET /api/citations

Citation/reference graph between drafts.

**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `min_refs` | int | 2 | Minimum references to include a node |

**Response:** JSON object with citation graph data.

---

### GET /api/search

Global search across drafts, ideas, authors, and gaps.

**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `q` | string | "" | Search query (required for results) |
| `format` | string | "json" | "json" or "csv" |

**Response:**
```json
{
  "drafts": [...],
  "ideas": [...],
  "authors": [...],
  "gaps": [...]
}
```
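
Because results arrive grouped under four keys, a client can summarize a search before rendering it. A small sketch, assuming only the four top-level keys shown above; the shape of the individual items is not specified here, so the sample uses placeholder strings, and `count_hits` is a hypothetical helper:

```python
SEARCH_GROUPS = ("drafts", "ideas", "authors", "gaps")


def count_hits(results):
    """Return the number of hits per result group, treating a missing group as empty."""
    return {group: len(results.get(group, [])) for group in SEARCH_GROUPS}


# Placeholder items; the real item structure is not documented in this file.
sample_results = {
    "drafts": ["draft-example-one", "draft-example-two"],
    "ideas": ["example idea"],
    "authors": [],
    "gaps": [],
}
hits = count_hits(sample_results)
total = sum(hits.values())
```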

---

### POST /api/ask

Search-only question answering (free, no Claude API call). Returns relevant sources and any cached answer.

**Request body:**
```json
{
  "question": "What drafts address agent authentication?",
  "top_k": 5
}
```

**Response:** JSON with `sources` array and optional cached `answer`.
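
The request body above can be sent with the standard library alone. The sketch below only constructs the request object, so it runs without a server. It assumes the endpoint expects a `Content-Type: application/json` header, and the default `top_k=5` simply mirrors the example body; neither is confirmed by this document. `build_ask_request` is a hypothetical helper.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:5000"  # base URL stated at the top of this document


def build_ask_request(question, top_k=5):
    """Build (but do not send) a POST /api/ask request.

    Assumptions: JSON content type is required, and 5 is a sensible top_k
    because it matches the example body in this document.
    """
    body = json.dumps({"question": question, "top_k": top_k}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_ask_request("What drafts address agent authentication?")
# With the server running, urllib.request.urlopen(request) would send it.
```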

---

## Admin-Only Endpoints

These endpoints require admin mode (`--dev` flag) or authentication.

### POST /api/ask/synthesize

Synthesize an answer using Claude (costs tokens, rate-limited to 10 req/min/IP). Answers are cached permanently.

**Auth:** Admin required

**Request body:**
```json
{
  "question": "How do IETF drafts approach agent identity?",
  "top_k": 5
}
```

**Response:** JSON with `sources` array and synthesized `answer`.

**Errors:** 429 if rate-limited.

---

### GET /api/gaps

All identified standardization gaps.

**Auth:** Admin required

**Parameters:**
| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `format` | string | "json" | "json" or "csv" |

**Response:** JSON array of gap objects.

---

### GET /api/gaps/{gap_id}

Detail for a single gap.

**Auth:** Admin required

**Parameters:**
| Param | Type | Description |
|-------|------|-------------|
| `gap_id` | int | Gap ID |

**Response:** JSON object with gap detail, or `{"error": "Gap not found"}` (404).

---

### POST /api/compare

Compare multiple drafts using Claude (costs tokens, rate-limited to 10 req/min/IP).

**Auth:** Admin required

**Request body:**
```json
{
  "drafts": ["draft-name-one", "draft-name-two"]
}
```

**Response:**
```json
{
  "text": "Comparison analysis text...",
  "drafts": ["draft-name-one", "draft-name-two"]
}
```

**Errors:** 400 if fewer than 2 drafts provided; 429 if rate-limited.

---

### POST /api/drafts/{name}/annotate

Add or update annotations (notes, tags) for a draft.

**Auth:** Admin required

**Request body:**
```json
{
  "note": "Interesting approach to agent handshake",
  "tags": ["important", "review"],
  "add_tag": "flagged",
  "remove_tag": "review"
}
```

All fields are optional. `add_tag`/`remove_tag` operate on existing tags incrementally.

**Response:**
```json
{
  "success": true,
  "annotation": {"note": "...", "tags": ["important", "flagged"]}
}
```
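
The example request and response above are consistent with one plausible reading of the field semantics: `tags` replaces the tag list, then `add_tag` appends, then `remove_tag` deletes. The sketch below models that reading locally. The order of operations is an assumption inferred from the example, not the server's confirmed behavior, and `apply_annotation` is a hypothetical function.

```python
def apply_annotation(current, payload):
    """Return the annotation that results from applying a request body.

    Assumed order (inferred from the documented example, not confirmed):
    1. `note` and `tags`, if present, replace the stored values.
    2. `add_tag` appends one tag unless it is already there.
    3. `remove_tag` deletes one tag if present.
    """
    note = payload.get("note", current.get("note", ""))
    tags = list(payload.get("tags", current.get("tags", [])))
    add_tag = payload.get("add_tag")
    if add_tag and add_tag not in tags:
        tags.append(add_tag)
    remove_tag = payload.get("remove_tag")
    if remove_tag in tags:
        tags.remove(remove_tag)
    return {"note": note, "tags": tags}


# The request body from the example above, applied to an empty annotation.
result = apply_annotation(
    {},
    {
        "note": "Interesting approach to agent handshake",
        "tags": ["important", "review"],
        "add_tag": "flagged",
        "remove_tag": "review",
    },
)
```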

---

### GET /api/monitor

Pipeline monitoring status (processing progress, error counts).

**Auth:** Admin required

**Response:** JSON object with monitoring data.

---

## Non-API Data Endpoints

### GET /export/obsidian

Download the entire research corpus as an Obsidian vault ZIP file.

**Response:** `application/zip` file download.

---

## Authentication

- **Production mode** (default): Admin endpoints return 403.
- **Development mode** (`--dev` flag): All admin endpoints are accessible without authentication.
- Rate-limited endpoints (`/api/ask/synthesize`, `/api/compare`): 10 requests per minute per IP, enforced via in-memory sliding window.
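
An in-memory sliding window of the kind described in the last bullet can be sketched as follows. This is an illustrative version under stated assumptions, not the project's actual code: the class name, the per-IP `deque` of timestamps, and the injectable clock are all choices made for this example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client IP."""

    def __init__(self, limit=10, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.hits = defaultdict(deque)  # ip -> timestamps of recently accepted requests

    def allow(self, ip):
        now = self.clock()
        recent = self.hits[ip]
        # Drop timestamps that have slid out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False  # a server would answer 429 here
        recent.append(now)
        return True


# Demonstration with a fake clock and a documentation-range IP address.
fake_now = [0.0]
limiter = SlidingWindowLimiter(clock=lambda: fake_now[0])
burst = [limiter.allow("203.0.113.7") for _ in range(11)]
```

Because the state lives in process memory, limits in a design like this reset on restart and are not shared between worker processes.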

## Error Responses

All errors return JSON:
```json
{"error": "Description of the error"}
```

Common HTTP status codes:
- `400` — Bad request (missing parameters)
- `403` — Admin access required
- `404` — Resource not found
- `429` — Rate limit exceeded
- `500` — Internal server error
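
Since every error shares the `{"error": ...}` shape, a client can funnel all failures through one check. A minimal sketch, assuming the caller already has the HTTP status code and the raw response body; `ApiError` and `raise_for_error` are hypothetical names, and treating 429 as "retry later" is a client-side convention rather than something this document prescribes.

```python
import json


class ApiError(Exception):
    """Raised for any response with a 4xx or 5xx status."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message
        self.retry_later = status == 429  # rate limit exceeded


def raise_for_error(status, body):
    """Raise ApiError if `status` signals an error; otherwise do nothing."""
    if status < 400:
        return
    try:
        message = json.loads(body)["error"]
    except (ValueError, KeyError, TypeError):
        # Body was not JSON, or not the documented {"error": ...} object.
        message = "unparseable error body"
    raise ApiError(status, message)


try:
    raise_for_error(404, '{"error": "Draft not found"}')
except ApiError as exc:
    caught = exc
```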