ietf-draft-analyzer/data/reports/architecture-assessment-2026-03-09.md
Christian Nennemann 61cdab16b9 fix: dev mode auth regression from blueprint refactor
The _initialized singleton in auth.py prevented hooks from registering
on the correct app instance when create_app() was called twice (once
eagerly at import, once from __main__). Removed the guard and made
the module-level app lazy. Also adds feature backlog and architecture
assessment from the review team.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 03:52:02 +01:00


IETF Draft Analyzer — Architectural Assessment

Date: 2026-03-09
Scope: Core source code analysis (src/, tests/)
Project Size: ~7.6 MB, 19,662 lines of Python


1. File Sizes and Complexity

God Files (largest, highest complexity risk)

| File | LOC | Severity | Issue |
| --- | --- | --- | --- |
| webui/data.py | 4,360 | HIGH | Service/data access layer doing too much |
| cli.py | 3,438 | MEDIUM | 96 functions, 40+ Click commands, hard to navigate |
| reports.py | 2,739 | MEDIUM | Single Reporter class with many report generation methods |
| db.py | 1,690 | MEDIUM | 100+ methods, schema + CRUD + business logic mixed |

Healthy Modules (< 500 LOC, focused)

  • models.py (104 LOC) — Domain models only
  • config.py (108 LOC) — Configuration with env overrides
  • embeddings.py (205 LOC) — Ollama embedding wrapper
  • authors.py (137 LOC) — Author network fetching
  • fetcher.py (204 LOC) — Datatracker API client

2. Module Boundaries: Core vs. Web

Clean Separation ✓

  • Core layer (src/ietf_analyzer/): Self-contained, no Flask dependencies
  • Web layer (src/webui/): Depends on core, not vice-versa
  • No circular imports detected

Problem: webui/data.py Violates Single Responsibility

What it does:

  1. Wraps Database (4,360 lines!)
  2. Implements domain logic (clustering, readiness scoring, similarity graphs)
  3. Prepares data for JSON/Jinja2 serialization
  4. Defines TypedDicts for response shapes
  5. Calls sklearn for TSNE/hierarchical clustering
  6. Builds visualization data (radar, histogram, network graphs)

Risk: Only test_web_data.py exercises this module — domain logic is hard to regression-test while it is mixed with the presentation layer.
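As a sketch of the extraction direction recommended later in this report, domain logic such as readiness scoring can be pulled into a pure function that both the CLI and the web layer call. All names and weights below are hypothetical, not taken from webui/data.py:

```python
from dataclasses import dataclass

# Hypothetical extraction: readiness scoring as a pure domain object,
# independent of Flask and of JSON/Jinja2 serialization.
@dataclass
class ReadinessScore:
    draft_name: str
    score: float  # 0.0 to 1.0

def score_readiness(citations: int, revisions: int, wg_adopted: bool,
                    draft_name: str = "") -> ReadinessScore:
    """Toy weighting for illustration only; the real rule lives in webui/data.py."""
    score = min(1.0, 0.02 * citations + 0.05 * revisions)
    if wg_adopted:
        score = min(1.0, score + 0.3)
    return ReadinessScore(draft_name=draft_name, score=round(score, 2))

def readiness_to_json(r: ReadinessScore) -> dict:
    """The presentation layer shrinks to serialization only."""
    return {"draft": r.draft_name, "readiness": r.score}
```

Once the logic is a pure function, it can be unit-tested without a Flask app or database fixture.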


3. Flask Structure (app.py)

Routes Count

  • 72 functions (includes helpers)
  • ~40+ @app.route() handlers
  • No blueprints — monolithic Flask app

Route Categories

  1. Overview pages (5) — /, /landscape, /timeline, /idea-clusters, /ratings
  2. Detail pages (6) — /drafts, /drafts/<name>, /gaps, /gaps/<id>, etc.
  3. Feature pages (8) — /search, /ask, /compare, /monitor, /admin/analytics, etc.
  4. API endpoints (20+) — /api/drafts, /api/stats, /api/search, /api/ask, etc.
  5. Helpers (5+) — auth, rate limiting, CSV export, DB context

Issues

| Issue | Effort | Impact |
| --- | --- | --- |
| No blueprint organization — pages, APIs, and admin concerns mixed in one file | SMALL | Makes navigation hard |
| Tight coupling to data.py — 50 imports from data.py | SMALL | Hard to refactor data layer |
| Mixed JSON/HTML rendering — some routes render both based on the Accept header | SMALL | Should be separate APIs |
| Admin functions inline — /admin/analytics uses the @admin_required decorator | SMALL | Should be a separate blueprint |

Recommendation: Split into 4 blueprints:

  • blueprints/pages.py — HTML pages
  • blueprints/api.py — JSON endpoints
  • blueprints/admin.py — Admin routes
  • blueprints/helpers.py — Shared utilities
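A minimal sketch of the blueprint wiring, assuming a hypothetical create_app() factory and a placeholder route (the real handler bodies stay in app.py until migrated):

```python
from flask import Blueprint, Flask, jsonify

# Hypothetical layout: each suggested blueprints/*.py file would define one of these.
api = Blueprint("api", __name__, url_prefix="/api")

@api.route("/stats")
def stats():
    # Placeholder payload; the real handler would query webui/data.py.
    return jsonify({"drafts": 0})

def create_app() -> Flask:
    app = Flask(__name__)
    app.register_blueprint(api)  # pages, admin, helpers registered the same way
    return app
```

Because each blueprint is self-contained, route groups can be migrated one at a time without touching the others.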

4. Database Layer (db.py)

Structure: Single Database Class

100+ methods doing:

  1. Schema definition — SCHEMA constant, ensure_tables()
  2. CRUD operations — add_draft(), update_draft(), get_draft(), delete_draft()
  3. Bulk operations — add_drafts(), update_ratings()
  4. Complex queries — get_drafts_by_category(), search_fts(), most_cited(), co_authors()
  5. Business logic — rating aggregations, clustering, similarity ranking
  6. Cache management — llm_cache table operations
  7. Stats — count_drafts(), count_by_source(), aggregations

Issues

| Issue | Evidence | Refactor Effort |
| --- | --- | --- |
| Mixed concerns | Methods scattered: schema, CRUD, queries, business logic | LARGE |
| No transaction support | add_draft() does 3+ INSERT statements without an explicit transaction | MEDIUM |
| Hard to unit test | Database class touches 8+ tables; each needs fixtures | MEDIUM |
| Tight coupling to models | Direct Author, Draft, Rating dataclass dependencies | SMALL |
| No query builders | Raw SQL in 20+ methods (injection risk if not careful) | MEDIUM |
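The missing transaction support can be addressed with a small context manager around sqlite3, so the multiple INSERTs of a hypothetical add_draft() succeed or fail as a unit. This sketch uses toy tables, not the project's schema:

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def transaction(conn: sqlite3.Connection):
    """Commit all enclosed statements together, or roll everything back on error."""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise

# Toy schema standing in for the real drafts/authors tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drafts (name TEXT)")
conn.execute("CREATE TABLE authors (draft TEXT, author TEXT)")

# The related INSERTs now run atomically.
with transaction(conn):
    conn.execute("INSERT INTO drafts VALUES ('draft-x-00')")
    conn.execute("INSERT INTO authors VALUES ('draft-x-00', 'Alice')")
```

If any statement inside the `with` block raises, the rollback leaves the database exactly as it was before the block started.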

Refactoring Path (proposed)

```
# Current: db.Database (100+ methods)
#
# Refactored:
# - db.Schema — @dataclass fixtures, schema def (10 methods)
# - db.Repository — CRUD base class (15 methods)
# - db.DraftRepository, AuthorRepository, etc. — domain-specific CRUD
# - db.Queries — Complex queries as static methods or separate class
# - db.Cache — LLM cache operations (10 methods)
```
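The Repository split above could start from a minimal base class; the API below (add, get, count) is hypothetical and only illustrates the shape:

```python
import sqlite3
from typing import Optional

class Repository:
    """Minimal base class: a shared connection plus generic helpers."""
    table = ""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn

    def count(self) -> int:
        return self.conn.execute(f"SELECT COUNT(*) FROM {self.table}").fetchone()[0]

class DraftRepository(Repository):
    """Domain-specific CRUD; RatingRepository, AuthorRepository follow the same shape."""
    table = "drafts"

    def add(self, name: str) -> None:
        self.conn.execute("INSERT INTO drafts (name) VALUES (?)", (name,))

    def get(self, name: str) -> Optional[str]:
        row = self.conn.execute(
            "SELECT name FROM drafts WHERE name = ?", (name,)
        ).fetchone()
        return row[0] if row else None
```

Each repository can then be unit-tested against a single table instead of fixtures for all 8+.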

Effort: LARGE (4+ hours)
Benefit: Testability, reusability, transaction support, easier migrations


5. Pipeline Architecture (pipeline/)

Structure: Modular design ✓

  • context.py — ContextBuilder (domain logic for draft generation)
  • generator.py — DraftGenerator (Claude-based content generation)
  • family.py — Family/relationships logic
  • formatter.py — Output formatting
  • quality.py — Quality checks
  • prompts.py — System prompts
  • PROMPTS constant — Shared across modules

Assessment

  • Good separation — each module has a single responsibility ✓
  • Testable — pure functions + dependency injection via Config/Database ✓
  • Extensible — can add new stages without touching existing code ✓

No refactoring needed for pipeline itself.


6. Sources Architecture (sources/)

Structure: Plugin pattern ✓

  • base.py — SourceDocument dataclass, SourceFetcher protocol
  • ietf.py, w3c.py, etsi.py, itu.py, iso.py, nist.py — Concrete fetchers

Assessment

  • Excellent separation — base protocol + concrete implementations ✓
  • Testable — mock fetchers are easy to create ✓
  • Extensible — a new source = a new file, no changes to the orchestrator ✓
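The pattern can be sketched with typing.Protocol; the field names on SourceDocument here are assumptions, not base.py's actual definition:

```python
from dataclasses import dataclass
from typing import Protocol

# Assumed shapes; base.py's actual fields may differ.
@dataclass
class SourceDocument:
    source: str
    doc_id: str
    title: str

class SourceFetcher(Protocol):
    name: str
    def fetch(self) -> list[SourceDocument]: ...

class MockFetcher:
    """Test double: satisfies the protocol without any HTTP."""
    name = "mock"

    def fetch(self) -> list[SourceDocument]:
        return [SourceDocument(source="mock", doc_id="d1", title="Example Spec")]

def fetch_all(fetchers: list[SourceFetcher]) -> list[SourceDocument]:
    # The orchestrator stays unchanged when a new source is added.
    return [doc for f in fetchers for doc in f.fetch()]
```

Because the protocol is structural, MockFetcher needs no inheritance, which is what makes test doubles cheap here.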

No refactoring needed.


7. Config Management (config.py)

Structure

  • Single Config dataclass (40 fields)
  • load() class method with env var override support
  • save() method to persist to JSON
  • Validation in _validate()

Assessment

  • Clean — single responsibility ✓
  • Testable — no I/O except file read/write ✓
  • Env support — _ENV_OVERRIDES dict maps env vars to config fields ✓

Minor issue: Could use structured logging of which env vars override config (currently silent).
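A possible fix for the silent overrides, assuming a hypothetical apply_env_overrides() helper and a two-entry stand-in for the _ENV_OVERRIDES mapping:

```python
import logging
import os

logger = logging.getLogger("config")

# Hypothetical two-entry subset standing in for config.py's _ENV_OVERRIDES.
_ENV_OVERRIDES = {"IETF_DB_PATH": "db_path", "IETF_OLLAMA_URL": "ollama_url"}

def apply_env_overrides(values: dict) -> dict:
    """Apply env var overrides and log which ones took effect."""
    applied = []
    for env_var, field in _ENV_OVERRIDES.items():
        if env_var in os.environ:
            values[field] = os.environ[env_var]
            applied.append(env_var)
    if applied:
        logger.info("config fields overridden by env: %s", ", ".join(applied))
    return values
```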

No refactoring needed unless config grows beyond 50 fields.


8. CLI Structure (cli.py)

Count: 96 functions

Command groups:

  • fetch — Datatracker + multi-source fetching
  • classify — Ollama-based pre-filtering
  • list, search, show, annotate — Draft browsing
  • analyze — Claude analysis (rate, ideas, gaps)
  • ask, compare — Interactive queries
  • embed, embed-ideas — Ollama embeddings
  • similar, clusters — Embedding-based search
  • report (group) → overview, landscape, digest, timeline, etc.
  • monitor — Background pipeline automation
  • pipeline, pipeline-status, pipeline-auto-heal — Orchestration
  • observatory — Multi-source dashboard
  • readiness — Release readiness analysis
  • export — Generate drafts from gaps
  • web — Flask app launcher

Issues

| Issue | Severity | Solution |
| --- | --- | --- |
| 96 functions in one file | MEDIUM | Hard to navigate; split into subcommand groups or files |
| Long help text inline | LOW | Could use .help-txt files for docstrings |
| Late imports | LOW | Some imports live inside @pass_cfg_db functions to save startup time (OK pattern) |
| Global console object | LOW | OK for Click, allows colorized output |
Proposed split:

```
# cli.py (main entry, 200 lines)
# └─ cli/fetching.py (fetch, classify) — 400 lines
# └─ cli/analysis.py (analyze, ask, compare) — 600 lines
# └─ cli/reporting.py (report *, export, observatory) — 800 lines
# └─ cli/admin.py (monitor, pipeline, web) — 400 lines
```

Effort: SMALL (1 hour)
Benefit: Easier to navigate, faster to find commands, clearer testing boundaries.


9. Circular Dependencies

Check Results

✓ No circular imports detected

  • cli.py → db, config, models
  • app.py → webui.data, webui.auth, ietf_analyzer.*
  • webui.data → ietf_analyzer.db, ietf_analyzer.config
  • db.py → models, config
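The check can be kept honest in CI with a small cycle detector over a hand-maintained import graph (module names taken from the edges listed above, simplified):

```python
# Hand-maintained import graph: module -> modules it imports.
IMPORT_GRAPH = {
    "cli": ["db", "config", "models"],
    "app": ["webui.data", "webui.auth"],
    "webui.data": ["db", "config"],
    "webui.auth": [],
    "db": ["models", "config"],
    "models": [],
    "config": [],
}

def has_cycle(graph: dict) -> bool:
    """Depth-first search; a node seen while still 'visiting' is a back edge."""
    visiting: set = set()
    done: set = set()

    def visit(node: str) -> bool:
        if node in done:
            return False
        if node in visiting:
            return True  # back edge: cycle found
        visiting.add(node)
        if any(visit(dep) for dep in graph.get(node, [])):
            return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(node) for node in graph)
```

A one-line assertion in the test suite (`assert not has_cycle(IMPORT_GRAPH)`) would flag a future regression the moment a cycle is introduced.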

10. Test Structure

Test files: 8 modules, including:

  • test_db.py — Database operations
  • test_analyzer.py — Claude analysis
  • test_search.py — Similarity + FTS
  • test_web_data.py — Data layer for web
  • test_models.py — Domain models
  • test_obsidian_export.py — Export

Coverage gaps:

  • No tests for cli.py (big commands, hard to test without mocking db)
  • No tests for app.py routes (would need Flask test client + fixtures)
  • No tests for pipeline modules (context, generator, family)
  • No tests for sources (would need HTTP mocks)
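Route tests would follow the standard Flask test-client pattern; the create_app() factory and /api/drafts payload below are stand-ins for the real app, which would also need data-layer fixtures:

```python
from flask import Flask, jsonify

# Stand-in for webui.app; the real create_app() also wires data.py and auth.
def create_app() -> Flask:
    app = Flask(__name__)

    @app.route("/api/drafts")
    def api_drafts():
        return jsonify([])  # the real route returns draft summaries

    return app

# Flask's built-in test client needs no running server.
client = create_app().test_client()
response = client.get("/api/drafts")
```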

Effort to add 30% CLI coverage: MEDIUM (2-3 hours)


Summary: Refactoring Roadmap

| Priority | Module | Issue | Effort | Benefit |
| --- | --- | --- | --- | --- |
| HIGH | webui/data.py | 4,360 LOC, mixed concerns (CRUD + domain logic + presentation) | LARGE | Separates presentation from domain, enables better testing |
| MEDIUM | db.py | 100+ methods, mixed schema/CRUD/logic | LARGE | Testability, transaction support, query builders |
| MEDIUM | cli.py | 96 functions, hard to navigate | SMALL | Better organization, easier to find commands |
| MEDIUM | app.py | 40+ routes, no blueprints | SMALL | Clearer structure, easier to refactor |
| LOW | Config | 40 fields, working well | NONE | Monitor for growth beyond 50 fields |
| LOW | Pipeline/Sources | Well-structured, testable | NONE | No changes needed |

Recommendations (Priority Order)

1. Extract webui/data.py → presentation + domain (4 hours)

```
webui/
├── data.py (current 4,360 → 1,000) — Only JSON serialization
└── services/
    ├── drafts.py (200) — Draft filtering/sorting logic
    ├── analytics.py (400) — Dashboard stats + visualizations
    ├── search.py (300) — Search + clustering
    └── readiness.py (200) — Readiness scoring
```

Cost: 4 hours
Benefit: Can reuse analytics for CLI reports, easier to test

2. Refactor db.py → Repository pattern (4 hours)

```
db/
├── __init__.py (exports Database facade)
├── schema.py (100) — Schema definition
├── repository.py (200) — Base CRUD class
├── drafts.py (300) — DraftRepository (get, add, update, delete drafts)
├── ratings.py (200) — RatingRepository
├── authors.py (150) — AuthorRepository
├── queries.py (400) — Complex queries (search, similarity, aggregations)
└── cache.py (150) — LLM cache operations
```

Cost: 4 hours
Benefit: 80% reduction in method count per class, transaction support, testability

3. Split cli.py into subcommand groups (1 hour)

```
cli/
├── __init__.py (main entry, ~200 lines)
├── fetching.py (fetch, classify)
├── analysis.py (analyze, ask, compare)
├── reporting.py (report *, export, observatory)
└── admin.py (monitor, pipeline, web)
```

Cost: 1 hour
Benefit: Easier to navigate, clearer boundaries, faster to find commands

4. Convert Flask app to blueprints (1.5 hours)

```
webui/
├── app.py (core Flask setup, ~100 lines)
└── blueprints/
    ├── pages.py (HTML routes: /, /drafts, /landscape, etc.)
    ├── api.py (JSON endpoints: /api/*)
    ├── admin.py (/admin/*)
    └── helpers.py (rate limiting, auth, CSV export)
```

Cost: 1.5 hours
Benefit: Clearer separation of concerns, easier to add/remove features

5. Add CLI tests (3 hours)

  • Mock database for each command
  • Test success paths + error cases
  • Quick smoke tests for all 15+ major commands
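With Click, smoke tests need no subprocess: click.testing.CliRunner invokes commands in-process and captures output. The show command below is a stub standing in for a real cli.py command, which would additionally need a mocked database:

```python
import click
from click.testing import CliRunner

# Stub standing in for one of cli.py's 40+ commands; the real one hits the DB.
@click.command()
@click.argument("name")
def show(name: str) -> None:
    """Print a one-line draft summary."""
    click.echo(f"draft: {name}")

runner = CliRunner()
result = runner.invoke(show, ["draft-example-00"])
```

Checking `result.exit_code` and `result.output` per command gives the quick smoke coverage described above.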

Cost: 3 hours
Benefit: Catch regressions early, safe refactoring


Current State Assessment

| Dimension | Score | Notes |
| --- | --- | --- |
| Modularity | 7/10 | Good separation of concerns; core vs. web clean; webui/data.py is the weak point |
| Testability | 6/10 | DB layer hard to unit test; pipeline/sources good; no CLI tests |
| Maintainability | 6/10 | Many large files; well-documented; consistent patterns throughout |
| Extensibility | 8/10 | Plugin pattern for sources; easy to add new reports; CLI is open-ended |
| Performance | 8/10 | Caching, FTS5, lazy imports in CLI; no N+1 queries detected |

Overall: 7/10 — Production-ready but showing signs of technical debt accumulation.

The project is well-organized at a high level but needs refactoring of large monolithic files to stay maintainable as it grows. The webui/data.py and db.py files should be prioritized first.