Compare commits

..

7 Commits

Author SHA1 Message Date
e83491cb91 chore: add uv.lock for ACT+ECT demo reproducibility
Some checks failed
CI / test (3.11) (push) Failing after 8s
CI / test (3.12) (push) Failing after 9s
Observatory Update / update (push) Failing after 10s
2026-04-12 12:43:40 +00:00
316fdefcd7 fix: rename ACT from "Compact" to "Context" token, add pyproject.toml and PoC plan
Corrects the ACT acronym expansion to "Agent Context Token" in the
reference implementation. Adds proper pyproject.toml for the act package
and the MCP+LangGraph PoC planning document.
2026-04-12 12:43:31 +00:00
9a0dc899a8 feat: add ACT+ECT over MCP demo with LangGraph agent
End-to-end PoC demonstrating Agent Context Token authorization and
Execution Context Token accountability over MCP tool calls, using a
LangGraph agent with ES256-signed JWT tokens and DAG verification.
2026-04-12 12:43:22 +00:00
45cb13fbe8 feat: add IETF landscape paper source (LaTeX + BibTeX + Makefile)
New LaTeX paper analyzing the AI-agent standardization landscape across
IETF Internet-Drafts. Includes bibliography, updated Makefile for
pdflatex+bibtex build, and gitignore entries for build artifacts.
2026-04-12 12:43:15 +00:00
56f2ce669c feat: unified drafts/ structure with PDF outputs for ACT and ECT
Both drafts now live in workspace/drafts/ as siblings:
  drafts/
  ├── act/                       (ACT -01, native to parent repo)
  │   ├── draft-nennemann-act-01.md     kramdown-rfc source
  │   ├── draft-nennemann-act-01.{xml,txt,html,pdf}
  │   ├── .refcache/             bibxml cache
  │   └── build.sh
  ├── ietf-wimse-ect/            (ECT -02, submodule, PDF added)
  │   └── ...
  └── README-pdf.md              PDF toolchain docs

ACT kramdown-rfc conversion:
- full YAML frontmatter (title, author, refs)
- section structure matches kramdown-rfc conventions
- {{REF}} citation syntax, auto-numbered sections
- references auto-built from normative/informative blocks
- removed manual TOC (kramdown-rfc generates)
- builds cleanly: 133K XML, 89K TXT, 208K HTML, 167K PDF

PDF toolchain:
- xml2rfc --pdf via weasyprint<60 + pydyf<0.10 injected into xml2rfc pipx venv
- both build.sh scripts now produce PDF as Step 4
- README-pdf.md documents the setup for new machines

Submodule: bump ietf-wimse-ect pointer for build.sh PDF step
2026-04-12 14:01:57 +02:00
37859beef6 feat: interop test package + session handoff doc
Cross-spec interop validation between ietf-act and ietf-ect:
- new packages/interop/ sibling package (ietf-act-ect-interop)
- 32 tests pass: shared claims, algorithm matrix, DAG structure,
  divergence handling, anti-goals
- documents ES256 raw signature wire-compatibility
- documents airtight typ separation (act+jwt vs exec+jwt)

Hazards surfaced:
- ACTLedger.append() silently accepts ECT Payload via duck-typing
  (both have .jti) — documented in interop README as a production
  hazard requiring external isinstance checks

Session handoff:
- SESSION-2026-04-12.md — snapshot of decisions, artifacts, open
  actions, and next-session starting points

Also: session-end commit of hash-format fix propagation to
packages/ect/ (the fix was applied to the old refimpl location
but did not propagate through the parallel package-move agent).
2026-04-12 07:39:41 +02:00
3a139dfc7e feat: ACT/ECT strategy, package restructure, draft -01/-02 prep
Strategic work for IETF submission of draft-nennemann-act-01 and
draft-nennemann-wimse-ect-02:

Package restructure:
- move ACT and ECT refimpls to workspace/packages/{act,ect}/
- ietf-act and ietf-ect distribution names (sibling packages)
- cross-spec interop test plan (INTEROP-TEST-PLAN.md)

ACT draft -01 revisions:
- rename 'par' claim to 'pred' (align with ECT)
- rename 'Agent Compact Token' to 'Agent Context Token' (semantic
  alignment with ECT family)
- add Applicability section (MCP, OpenAI, LangGraph, A2A, CrewAI)
- add DAG vs Linear Delegation Chains section (differentiator vs
  txn-tokens-for-agents actchain, Agentic JWT, AIP/IBCTs)
- add Related Work: AIP, SentinelAgent, Agentic JWT, txn-tokens-for-agents,
  HDP, SCITT-AI-agent-execution
- pin SCITT arch to -22, note AUTH48 status

Outreach drafts:
- Emirdag liaison email (SCITT-AI coordination)
- OAuth ML response on txn-tokens-for-agents-06

Strategy document:
- STRATEGY.md with phased action plan, risk register, timeline

Submodule:
- update workspace/drafts/ietf-wimse-ect pointer to -02 commit
2026-04-12 07:33:08 +02:00
108 changed files with 24341 additions and 13 deletions

4
.gitignore vendored
View File

@@ -16,3 +16,7 @@ paper/*.out
paper/*.synctex.gz
paper/*.fls
paper/*.fdb_latexmk
paper/*.bbl
paper/*.blg
paper/*.pdf
paper/*.toc

9
demo/act-ect-mcp/.gitignore vendored Normal file
View File

@@ -0,0 +1,9 @@
keys/*.pem
keys/*.json
!keys/.gitkeep
__pycache__/
*.egg-info/
.pytest_cache/
.venv/
build/
dist/

159
demo/act-ect-mcp/README.md Normal file
View File

@@ -0,0 +1,159 @@
# ACT + ECT + MCP + LangGraph — end-to-end PoC
A working demonstration of `draft-nennemann-act-01` (Agent Context Token)
and `draft-nennemann-wimse-ect-01` (Execution Context Token) in a realistic
agent stack:
* a **LangGraph** ReAct agent driven by a local **Ollama** LLM;
* talking over **MCP** streamable-HTTP to a FastMCP server;
* every request carries an **ACT** mandate, a per-call **ECT**, and an
**RFC 9421** HTTP signature with the `wimse-aud` parameter from
`draft-ietf-wimse-http-signature-03`;
* the server rejects any request where ACT / ECT / HTTP-signature /
capability / body-hash binding fails;
* a verifier CLI replays the run's ledger, re-runs the two refimpls, and
prints the resulting DAG.
## Why this exists
The two drafts (ACT and ECT) claim to fit together — ACT giving the
lifecycle (mandate → execution record) and ECT giving the per-call
execution context on the wire. This PoC proves the claim end to end:
the same refimpls that ship in `workspace/packages/{act,ect}/` are the
only crypto/verification layer used here. There is no token forgery
shortcut.
## Requirements
* `uv` ([install](https://docs.astral.sh/uv/))
* Local Ollama with a chat model pulled (`qwen3:8b` by default)
* Python 3.11+ (uv will fetch one if missing)
## Run the demo
```bash
./demo.sh
```
This script:
1. Syncs deps (uv installs the sibling `ietf-act` and `ietf-ect`
packages in editable mode, plus `mcp`, `langgraph`,
`langchain-ollama`, `langchain-mcp-adapters`, `fastapi`, …).
2. Launches the MCP server on `127.0.0.1:8765`.
3. Runs the agent (`poc-agent`) against it with a canned research task.
4. Runs the verifier (`poc-verify`) over the ledger the agent emitted.
5. Shows the last five server-audit entries.
Expected tail of output:
```
mandate verified jti=64f5ec87
ects verified n=7 (tool-calls=2, session=5)
record verified jti=64f5ec87 status=completed
ect-dag wellformed every pred is the mandate or a prior ECT
Run
===
mandate 64f5ec87 task='Search for quantum entanglement, …'
iss=user sub=agent aud=mcp-server
cap=['mcp.session.initialize', 'mcp.session.list_tools', 'mcp.session.other', 'mcp.search', 'mcp.summarize']
Tool-call ECT DAG:
ect 73af4cd3 exec_act=mcp.search pred=['64f5ec87']
ect 0e3ffa01 exec_act=mcp.summarize pred=['64f5ec87', '73af4cd3']
ACT Phase 2 record:
jti=64f5ec87 exec_act=mcp.summarize
status=completed pred=[]
inp_hash=… out_hash=…
```
The mandate jti and the record jti are identical — this is ACT §3.2:
the Phase 2 token records the same task it started as. The tool-call
ECT DAG captures the per-HTTP-request ordering.
## Architecture
```
user ──(mints ACT mandate)──► agent
│ create_react_agent(ChatOllama, tools)
┌──────────────────────────────┐
│ LangGraph ReAct loop │
│ LLM decides tools to call │
│ │
│ langchain-mcp-adapters │
│ ↑ streamable_http session │
└──────────┬───────────────────┘
httpx.AsyncClient with event hooks
┌───────────▼─────────────┐
│ on_request: │
│ - mint ECT(inp_hash) │
│ - RFC 9421 sign │
│ - attach Authorization │
│ + Wimse-ECT + sig │
└───────────┬──────────────┘
POST /mcp
┌───────────▼──────────────┐
│ FastMCP streamable-http │
│ + ActEctAuthMiddleware │
│ verifies: │
│ ACT mandate │
│ ECT │
│ HTTP-signature │
│ inp_hash == body │
│ exec_act in cap │
│ ECT.iss == sub │
│ then dispatches tool │
└──────────────────────────┘
```
## Files
* `src/poc/keys.py` — ES256 keys for the three PoC identities
(`user`, `agent`, `mcp-server`).
* `src/poc/tokens.py` — thin wrappers around `ietf-act` and `ietf-ect`
that fix the PoC's shape.
* `src/poc/http_sig.py` — minimal RFC 9421 signer/verifier covering
`@method`, `@target-uri`, `content-digest`, `wimse-ect`, with the
`wimse-aud` metadata parameter from http-signature-03.
* `src/poc/server.py` — FastMCP server with ACT + ECT + signature
middleware. Writes `keys/server-audit.jsonl`.
* `src/poc/agent.py` — LangGraph + Ollama agent. Writes
`keys/ledger.jsonl` — one mandate, N ECTs, one final ACT record.
* `src/poc/verify_cli.py` — ledger verifier, prints the DAG.
* `tests/` — pytest suite (see below).
## Non-goals
* No real LLM API costs (Ollama is local).
* No distributed SCITT anchoring — server audit log is a plain JSONL.
* No Go-side client in this PoC; Python ↔ Python. Go refimpl lives in
`workspace/drafts/ietf-wimse-ect/refimpl/go-lang/`.
* The PoC uses a single mandate per run and a single Phase 2 record
(ACT §3.2 jti preservation). If you want a multi-record DAG of ACT
tasks, you'd need to exercise the delegation machinery
(`act.delegation.create_delegated_mandate`) — this PoC does not, to
keep the wire story tight.
## Tests
```bash
uv run pytest
```
Covers:
* Minting + round-trip verification of each token type.
* HTTP-signature round-trip.
* Middleware rejection paths (missing headers, wrong audience, stolen
ECT on a mutated body).
* End-to-end: launches an in-process server via httpx's ASGI transport
and runs the agent's token-injection hooks over it (no Ollama
required — uses a fake LLM).

55
demo/act-ect-mcp/demo.sh Executable file
View File

@@ -0,0 +1,55 @@
#!/usr/bin/env bash
# End-to-end demo: start MCP server, run the LangGraph agent, verify the DAG.
#
# Requirements:
# - uv installed (https://docs.astral.sh/uv/)
# - Ollama running locally with `qwen3:8b` pulled
# (override via POC_MODEL and OLLAMA_HOST env vars)
#
# The script is idempotent — keys and ledgers are written under ./keys/.
set -euo pipefail
cd "$(dirname "$0")"
POC_MODEL="${POC_MODEL:-qwen3:8b}"
POC_PORT="${POC_PORT:-8765}"
POC_PURPOSE="${POC_PURPOSE:-Search for quantum entanglement, then summarise the top result.}"
mkdir -p keys
rm -f keys/ledger.jsonl keys/server-audit.jsonl
echo "==> syncing dependencies"
uv sync --quiet
echo "==> starting MCP server on 127.0.0.1:${POC_PORT}"
uv run python -m poc.server --port "${POC_PORT}" >/tmp/poc-server.log 2>&1 &
SERVER_PID=$!
trap 'kill "$SERVER_PID" 2>/dev/null || true' EXIT
for _ in $(seq 1 25); do
if curl -sSf -o /dev/null -X POST "http://127.0.0.1:${POC_PORT}/mcp" \
-H 'content-type: application/json' \
--data '{"jsonrpc":"2.0","id":0,"method":"initialize"}' 2>/dev/null \
|| curl -sS "http://127.0.0.1:${POC_PORT}/mcp" -o /dev/null; then
break
fi
sleep 0.2
done
echo "==> running agent (model=${POC_MODEL})"
uv run poc-agent \
--purpose "${POC_PURPOSE}" \
--model "${POC_MODEL}" \
--mcp-url "http://127.0.0.1:${POC_PORT}/mcp"
echo
echo "==> verifying ledger"
uv run poc-verify
echo
echo "==> server audit log (last 5 lines)"
tail -n 5 keys/server-audit.jsonl || true
echo
echo "demo OK"

View File

View File

@@ -0,0 +1,13 @@
{"kind":"mandate","jti":"2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7","compact":"eyJhbGciOiJFUzI1NiIsInR5cCI6ImFjdCtqd3QiLCJraWQiOiJraWQ6dXNlcjp2MSJ9.eyJpc3MiOiJ1c2VyIiwic3ViIjoiYWdlbnQiLCJhdWQiOiJtY3Atc2VydmVyIiwiaWF0IjoxNzc1OTc2NzY5LCJleHAiOjE3NzU5Nzc2NjksImp0aSI6IjJhOTJjNWI3LWRkMGYtNDRmNS04NzY4LWQxYzNiZjhmMWNmNyIsInRhc2siOnsicHVycG9zZSI6IlNlYXJjaCBmb3IgcXVhbnR1bSBlbnRhbmdsZW1lbnQsIHRoZW4gc3VtbWFyaXNlIHRoZSB0b3AgcmVzdWx0LiIsImNyZWF0ZWRfYnkiOiJ1c2VyIn0sImNhcCI6W3siYWN0aW9uIjoibWNwLnNlc3Npb24uaW5pdGlhbGl6ZSJ9LHsiYWN0aW9uIjoibWNwLnNlc3Npb24ubGlzdF90b29scyJ9LHsiYWN0aW9uIjoibWNwLnNlc3Npb24ub3RoZXIifSx7ImFjdGlvbiI6Im1jcC5zZWFyY2gifSx7ImFjdGlvbiI6Im1jcC5zdW1tYXJpemUifV0sIndpZCI6ImFnZW50In0.X6uhGB4FYY4PsS7np1GFL-4z-lhMdClKhq5G8T9Nic4DUxUeFKwW6aaqYxf20TjbhLfs23Zwq_Nv4jRy8KjTZA","metadata":{"iss":"user","sub":"agent","aud":"mcp-server","task":{"purpose":"Search for quantum entanglement, then summarise the top result.","created_by":"user"},"cap":[{"action":"mcp.session.initialize"},{"action":"mcp.session.list_tools"},{"action":"mcp.session.other"},{"action":"mcp.search"},{"action":"mcp.summarize"}]}}
{"kind":"ect","jti":"c5655719-7187-47db-8080-d0ed95ad1740","compact":"eyJhbGciOiJFUzI1NiIsImtpZCI6ImtpZDphZ2VudDp2MSIsInR5cCI6ImV4ZWMrand0In0.eyJpc3MiOiJhZ2VudCIsImF1ZCI6Im1jcC1zZXJ2ZXIiLCJpYXQiOjE3NzU5NzY3NjksImV4cCI6MTc3NTk3NzA2OSwianRpIjoiYzU2NTU3MTktNzE4Ny00N2RiLTgwODAtZDBlZDk1YWQxNzQwIiwiZXhlY19hY3QiOiJtY3Auc2Vzc2lvbi5pbml0aWFsaXplIiwicHJlZCI6WyIyYTkyYzViNy1kZDBmLTQ0ZjUtODc2OC1kMWMzYmY4ZjFjZjciXSwid2lkIjoiYWdlbnQiLCJpbnBfaGFzaCI6ImxmUDBFbWUwcDBjeVNUa01fbzFfc1ZGNzhRQnBOdF9LVjRXRGtjdE1oZkUifQ.9Gy14pINnRn5WYvz2XW4LdDvV_G8ZqJb5TVd-hpO_q0aTbL0HrgNL5bK_zKf_FoFYU9DTcWkd_ukJPhd_eF11A","metadata":{"method":"initialize","exec_act":"mcp.session.initialize","session_only":true}}
{"kind":"ect","jti":"9c31c1b8-0b50-44e9-a58f-a751a659a16a","compact":"eyJhbGciOiJFUzI1NiIsImtpZCI6ImtpZDphZ2VudDp2MSIsInR5cCI6ImV4ZWMrand0In0.eyJpc3MiOiJhZ2VudCIsImF1ZCI6Im1jcC1zZXJ2ZXIiLCJpYXQiOjE3NzU5NzY3NjksImV4cCI6MTc3NTk3NzA2OSwianRpIjoiOWMzMWMxYjgtMGI1MC00NGU5LWE1OGYtYTc1MWE2NTlhMTZhIiwiZXhlY19hY3QiOiJtY3Auc2Vzc2lvbi5vdGhlciIsInByZWQiOlsiMmE5MmM1YjctZGQwZi00NGY1LTg3NjgtZDFjM2JmOGYxY2Y3Il0sIndpZCI6ImFnZW50IiwiaW5wX2hhc2giOiJHWkZFREhRemRjd19jUjdEZUl4ekxQXzNHWVNUT292M053SF9WalFuaGcwIn0.dSNnWYdk35hiz_7rVa8BOrz2fRtLDonGBlGCIdprdVUF7qD1pLnZJY6koG04gE2Ayn5yXnN_jAr4e_KRYhfyNQ","metadata":{"method":"notifications/initialized","exec_act":"mcp.session.other","session_only":true}}
{"kind":"ect","jti":"f6d45c98-fede-4d85-9fd6-39993a68cb42","compact":"eyJhbGciOiJFUzI1NiIsImtpZCI6ImtpZDphZ2VudDp2MSIsInR5cCI6ImV4ZWMrand0In0.eyJpc3MiOiJhZ2VudCIsImF1ZCI6Im1jcC1zZXJ2ZXIiLCJpYXQiOjE3NzU5NzY3NjksImV4cCI6MTc3NTk3NzA2OSwianRpIjoiZjZkNDVjOTgtZmVkZS00ZDg1LTlmZDYtMzk5OTNhNjhjYjQyIiwiZXhlY19hY3QiOiJtY3Auc2Vzc2lvbi5saXN0X3Rvb2xzIiwicHJlZCI6WyIyYTkyYzViNy1kZDBmLTQ0ZjUtODc2OC1kMWMzYmY4ZjFjZjciXSwid2lkIjoiYWdlbnQiLCJpbnBfaGFzaCI6ImRmREhZbm8xeC1scUNvaElKaUtnMGV4LWdjRHJWQlNvaTZlcXVVVU05enMifQ.5xqO6a4OEliM2QXB1ZnVIPSdJVZtffwcdILnbnKsnMFI5lRTScE2DGBjMC_fo67dxgmfbR950wE-BpkHga03Ig","metadata":{"method":"tools/list","exec_act":"mcp.session.list_tools","session_only":true}}
{"kind":"ect","jti":"eeca8478-4df0-46a5-8c80-6f31913226e3","compact":"eyJhbGciOiJFUzI1NiIsImtpZCI6ImtpZDphZ2VudDp2MSIsInR5cCI6ImV4ZWMrand0In0.eyJpc3MiOiJhZ2VudCIsImF1ZCI6Im1jcC1zZXJ2ZXIiLCJpYXQiOjE3NzU5NzY3NzIsImV4cCI6MTc3NTk3NzA3MiwianRpIjoiZWVjYTg0NzgtNGRmMC00NmE1LThjODAtNmYzMTkxMzIyNmUzIiwiZXhlY19hY3QiOiJtY3Auc2Vzc2lvbi5pbml0aWFsaXplIiwicHJlZCI6WyIyYTkyYzViNy1kZDBmLTQ0ZjUtODc2OC1kMWMzYmY4ZjFjZjciXSwid2lkIjoiYWdlbnQiLCJpbnBfaGFzaCI6ImxmUDBFbWUwcDBjeVNUa01fbzFfc1ZGNzhRQnBOdF9LVjRXRGtjdE1oZkUifQ.vnxBJGIq25E6kQpwkmkG3AyiXYXTCumuyVq_BNJ4fA61xaR4kO2_zNokxo3uJ9hBP6JDCEXpTGlvoLHgtgcTEA","metadata":{"method":"initialize","exec_act":"mcp.session.initialize","session_only":true}}
{"kind":"ect","jti":"c359aebf-d687-44f6-8c94-12493b387969","compact":"eyJhbGciOiJFUzI1NiIsImtpZCI6ImtpZDphZ2VudDp2MSIsInR5cCI6ImV4ZWMrand0In0.eyJpc3MiOiJhZ2VudCIsImF1ZCI6Im1jcC1zZXJ2ZXIiLCJpYXQiOjE3NzU5NzY3NzIsImV4cCI6MTc3NTk3NzA3MiwianRpIjoiYzM1OWFlYmYtZDY4Ny00NGY2LThjOTQtMTI0OTNiMzg3OTY5IiwiZXhlY19hY3QiOiJtY3Auc2Vzc2lvbi5vdGhlciIsInByZWQiOlsiMmE5MmM1YjctZGQwZi00NGY1LTg3NjgtZDFjM2JmOGYxY2Y3Il0sIndpZCI6ImFnZW50IiwiaW5wX2hhc2giOiJHWkZFREhRemRjd19jUjdEZUl4ekxQXzNHWVNUT292M053SF9WalFuaGcwIn0.3baO9jqD74qXgb70Ffh0FjmhYX4Kc974La6uu__PNv15KydtaOoo530NlKhCEJ5Y-B10eoeXd51P9emtXGNlxg","metadata":{"method":"notifications/initialized","exec_act":"mcp.session.other","session_only":true}}
{"kind":"ect","jti":"7ac2034b-ac27-42b8-bb9c-1d790ac9a4f0","compact":"eyJhbGciOiJFUzI1NiIsImtpZCI6ImtpZDphZ2VudDp2MSIsInR5cCI6ImV4ZWMrand0In0.eyJpc3MiOiJhZ2VudCIsImF1ZCI6Im1jcC1zZXJ2ZXIiLCJpYXQiOjE3NzU5NzY3NzIsImV4cCI6MTc3NTk3NzA3MiwianRpIjoiN2FjMjAzNGItYWMyNy00MmI4LWJiOWMtMWQ3OTBhYzlhNGYwIiwiZXhlY19hY3QiOiJtY3Auc2VhcmNoIiwicHJlZCI6WyIyYTkyYzViNy1kZDBmLTQ0ZjUtODc2OC1kMWMzYmY4ZjFjZjciXSwid2lkIjoiYWdlbnQiLCJpbnBfaGFzaCI6IlFrVUZBUmJVTDJMQzhMb2VDcTItVWpldFlfWG5oWWs1UjRYRUZOZG1SQlkifQ.u0bwRvmuUC1IDv7fmBy0pFb1ozeDnJHbnUr3X4jd3ClbeFX2aSZOVodB7HO97MBJdiavFebo01ZAQXJsbhb6MQ","metadata":{"method":"tools/call","tool_name":"search","exec_act":"mcp.search","pred":["2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7"]}}
{"kind":"ect","jti":"295f6c79-df73-4851-abeb-07e5c9282268","compact":"eyJhbGciOiJFUzI1NiIsImtpZCI6ImtpZDphZ2VudDp2MSIsInR5cCI6ImV4ZWMrand0In0.eyJpc3MiOiJhZ2VudCIsImF1ZCI6Im1jcC1zZXJ2ZXIiLCJpYXQiOjE3NzU5NzY3NzIsImV4cCI6MTc3NTk3NzA3MiwianRpIjoiMjk1ZjZjNzktZGY3My00ODUxLWFiZWItMDdlNWM5MjgyMjY4IiwiZXhlY19hY3QiOiJtY3Auc2Vzc2lvbi5saXN0X3Rvb2xzIiwicHJlZCI6WyIyYTkyYzViNy1kZDBmLTQ0ZjUtODc2OC1kMWMzYmY4ZjFjZjciXSwid2lkIjoiYWdlbnQiLCJpbnBfaGFzaCI6IkZBb1FWQkpXZTdMWlJUMFJfS0RNMDBWUk9hdmpCcjVveTJGblVUaDh4VzgifQ.oy7Ayuz-xlwx6E-VWi5-71vhUwtPzjl-SejZRiO0vxO4rxO8K5Qua2IjN4D1w77VZcSq-3EhWOuu-BZ0eSKYvg","metadata":{"method":"tools/list","exec_act":"mcp.session.list_tools","session_only":true}}
{"kind":"ect","jti":"d242256f-f17c-4b3d-995d-60f13beacc25","compact":"eyJhbGciOiJFUzI1NiIsImtpZCI6ImtpZDphZ2VudDp2MSIsInR5cCI6ImV4ZWMrand0In0.eyJpc3MiOiJhZ2VudCIsImF1ZCI6Im1jcC1zZXJ2ZXIiLCJpYXQiOjE3NzU5NzY3NzcsImV4cCI6MTc3NTk3NzA3NywianRpIjoiZDI0MjI1NmYtZjE3Yy00YjNkLTk5NWQtNjBmMTNiZWFjYzI1IiwiZXhlY19hY3QiOiJtY3Auc2Vzc2lvbi5pbml0aWFsaXplIiwicHJlZCI6WyIyYTkyYzViNy1kZDBmLTQ0ZjUtODc2OC1kMWMzYmY4ZjFjZjciXSwid2lkIjoiYWdlbnQiLCJpbnBfaGFzaCI6ImxmUDBFbWUwcDBjeVNUa01fbzFfc1ZGNzhRQnBOdF9LVjRXRGtjdE1oZkUifQ.otfuVqTG-nmSpv_WOA9G-1BoJqs_Hpv-Y6BmwRFBgxMvhM0e4KhfUKqh0BPFnweQNqoJvoLW_jP2uh6wepBkxg","metadata":{"method":"initialize","exec_act":"mcp.session.initialize","session_only":true}}
{"kind":"ect","jti":"2eb7eef6-6195-44a1-a88f-78be075cdf17","compact":"eyJhbGciOiJFUzI1NiIsImtpZCI6ImtpZDphZ2VudDp2MSIsInR5cCI6ImV4ZWMrand0In0.eyJpc3MiOiJhZ2VudCIsImF1ZCI6Im1jcC1zZXJ2ZXIiLCJpYXQiOjE3NzU5NzY3NzcsImV4cCI6MTc3NTk3NzA3NywianRpIjoiMmViN2VlZjYtNjE5NS00NGExLWE4OGYtNzhiZTA3NWNkZjE3IiwiZXhlY19hY3QiOiJtY3Auc2Vzc2lvbi5vdGhlciIsInByZWQiOlsiMmE5MmM1YjctZGQwZi00NGY1LTg3NjgtZDFjM2JmOGYxY2Y3Il0sIndpZCI6ImFnZW50IiwiaW5wX2hhc2giOiJHWkZFREhRemRjd19jUjdEZUl4ekxQXzNHWVNUT292M053SF9WalFuaGcwIn0.qVPXigUqNu_bUMhYJrJQW6teu626-AZANpX7m-4o42ZsKAoNYE9D-xnOKdRnkcMybUVVKJlICPMWO9EU9ds_Hw","metadata":{"method":"notifications/initialized","exec_act":"mcp.session.other","session_only":true}}
{"kind":"ect","jti":"27c2682f-3b0a-4658-8710-113ae45864c1","compact":"eyJhbGciOiJFUzI1NiIsImtpZCI6ImtpZDphZ2VudDp2MSIsInR5cCI6ImV4ZWMrand0In0.eyJpc3MiOiJhZ2VudCIsImF1ZCI6Im1jcC1zZXJ2ZXIiLCJpYXQiOjE3NzU5NzY3NzcsImV4cCI6MTc3NTk3NzA3NywianRpIjoiMjdjMjY4MmYtM2IwYS00NjU4LTg3MTAtMTEzYWU0NTg2NGMxIiwiZXhlY19hY3QiOiJtY3Auc3VtbWFyaXplIiwicHJlZCI6WyIyYTkyYzViNy1kZDBmLTQ0ZjUtODc2OC1kMWMzYmY4ZjFjZjciLCI3YWMyMDM0Yi1hYzI3LTQyYjgtYmI5Yy0xZDc5MGFjOWE0ZjAiXSwid2lkIjoiYWdlbnQiLCJpbnBfaGFzaCI6Ilg5b2VtUFlUTUx3anZCclBYSTZfQ0s0WWZ5T01KemZYTnRpTk5FY3FHTU0ifQ.kjapcaU417lSbAxjL9qzhKJdq3WxK5AwytOfAItWZxhTOnQ11HgGQ6nGmwdKxRv47mYxVo8xe5XsMSuwF9mLqw","metadata":{"method":"tools/call","tool_name":"summarize","exec_act":"mcp.summarize","pred":["2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7","7ac2034b-ac27-42b8-bb9c-1d790ac9a4f0"]}}
{"kind":"ect","jti":"e220a1d8-ec56-43f8-9952-a1f4387f607b","compact":"eyJhbGciOiJFUzI1NiIsImtpZCI6ImtpZDphZ2VudDp2MSIsInR5cCI6ImV4ZWMrand0In0.eyJpc3MiOiJhZ2VudCIsImF1ZCI6Im1jcC1zZXJ2ZXIiLCJpYXQiOjE3NzU5NzY3NzcsImV4cCI6MTc3NTk3NzA3NywianRpIjoiZTIyMGExZDgtZWM1Ni00M2Y4LTk5NTItYTFmNDM4N2Y2MDdiIiwiZXhlY19hY3QiOiJtY3Auc2Vzc2lvbi5saXN0X3Rvb2xzIiwicHJlZCI6WyIyYTkyYzViNy1kZDBmLTQ0ZjUtODc2OC1kMWMzYmY4ZjFjZjciXSwid2lkIjoiYWdlbnQiLCJpbnBfaGFzaCI6IkZBb1FWQkpXZTdMWlJUMFJfS0RNMDBWUk9hdmpCcjVveTJGblVUaDh4VzgifQ.10ySm9JKxbH1ytgl0R-9NfApjED8qJw5awRUWwpzRVuA8CU5nsiyni0VYuyaOstfsrTUKtb71NmRLjcPBaSF1Q","metadata":{"method":"tools/list","exec_act":"mcp.session.list_tools","session_only":true}}
{"kind":"record","jti":"2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7","compact":"eyJhbGciOiJFUzI1NiIsInR5cCI6ImFjdCtqd3QiLCJraWQiOiJraWQ6YWdlbnQ6djEifQ.eyJpc3MiOiJ1c2VyIiwic3ViIjoiYWdlbnQiLCJhdWQiOiJtY3Atc2VydmVyIiwiaWF0IjoxNzc1OTc2NzY5LCJleHAiOjE3NzU5Nzc2NjksImp0aSI6IjJhOTJjNWI3LWRkMGYtNDRmNS04NzY4LWQxYzNiZjhmMWNmNyIsInRhc2siOnsicHVycG9zZSI6IlNlYXJjaCBmb3IgcXVhbnR1bSBlbnRhbmdsZW1lbnQsIHRoZW4gc3VtbWFyaXNlIHRoZSB0b3AgcmVzdWx0LiIsImNyZWF0ZWRfYnkiOiJ1c2VyIn0sImNhcCI6W3siYWN0aW9uIjoibWNwLnNlc3Npb24uaW5pdGlhbGl6ZSJ9LHsiYWN0aW9uIjoibWNwLnNlc3Npb24ubGlzdF90b29scyJ9LHsiYWN0aW9uIjoibWNwLnNlc3Npb24ub3RoZXIifSx7ImFjdGlvbiI6Im1jcC5zZWFyY2gifSx7ImFjdGlvbiI6Im1jcC5zdW1tYXJpemUifV0sImV4ZWNfYWN0IjoibWNwLnN1bW1hcml6ZSIsInByZWQiOltdLCJleGVjX3RzIjoxNzc1OTc2NzgxLCJzdGF0dXMiOiJjb21wbGV0ZWQiLCJ3aWQiOiJhZ2VudCIsImlucF9oYXNoIjoiLWJMNF9Wb2JDWkNxZnFIc2RqN1hCR2tyeFpYaGE5VVp2YURCNkVTNGFYVSIsIm91dF9oYXNoIjoicFhXdjdKSjVtSmVSRDk3czlrMk41bkF4cVo0ajJPc3ZPNFM1WmJHNzVFdyJ9.PXoZHHCBizZiTLB_NL5o1j2O-wCEt6Px7syWP53pAaAUohdzNsbUGVgKR4LcXriP98yG8JnvbfPSa7tG1rA6qQ","metadata":{"exec_act":"mcp.summarize","status":"completed","pred":[],"inp_hash":"-bL4_VobCZCqfqHsdj7XBGkrxZXha9UZvaDB6ES4aXU","out_hash":"pXWv7JJ5mJeRD97s9k2N5nAxqZ4j2OsvO4S5ZbG75Ew","n_tool_ects":2}}

View File

@@ -0,0 +1,11 @@
{"ts":1775976769,"mandate_jti":"2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7","ect_jti":"c5655719-7187-47db-8080-d0ed95ad1740","exec_act":"mcp.session.initialize","pred":["2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7"],"inp_hash":"lfP0Eme0p0cySTkM_o1_sVF78QBpNt_KV4WDkctMhfE","wimse_aud":"mcp-server","keyid":"kid:agent:v1"}
{"ts":1775976769,"mandate_jti":"2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7","ect_jti":"9c31c1b8-0b50-44e9-a58f-a751a659a16a","exec_act":"mcp.session.other","pred":["2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7"],"inp_hash":"GZFEDHQzdcw_cR7DeIxzLP_3GYSTOov3NwH_VjQnhg0","wimse_aud":"mcp-server","keyid":"kid:agent:v1"}
{"ts":1775976769,"mandate_jti":"2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7","ect_jti":"f6d45c98-fede-4d85-9fd6-39993a68cb42","exec_act":"mcp.session.list_tools","pred":["2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7"],"inp_hash":"dfDHYno1x-lqCohIJiKg0ex-gcDrVBSoi6equUUM9zs","wimse_aud":"mcp-server","keyid":"kid:agent:v1"}
{"ts":1775976772,"mandate_jti":"2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7","ect_jti":"eeca8478-4df0-46a5-8c80-6f31913226e3","exec_act":"mcp.session.initialize","pred":["2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7"],"inp_hash":"lfP0Eme0p0cySTkM_o1_sVF78QBpNt_KV4WDkctMhfE","wimse_aud":"mcp-server","keyid":"kid:agent:v1"}
{"ts":1775976772,"mandate_jti":"2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7","ect_jti":"c359aebf-d687-44f6-8c94-12493b387969","exec_act":"mcp.session.other","pred":["2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7"],"inp_hash":"GZFEDHQzdcw_cR7DeIxzLP_3GYSTOov3NwH_VjQnhg0","wimse_aud":"mcp-server","keyid":"kid:agent:v1"}
{"ts":1775976772,"mandate_jti":"2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7","ect_jti":"7ac2034b-ac27-42b8-bb9c-1d790ac9a4f0","exec_act":"mcp.search","pred":["2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7"],"inp_hash":"QkUFARbUL2LC8LoeCq2-UjetY_XnhYk5R4XEFNdmRBY","wimse_aud":"mcp-server","keyid":"kid:agent:v1"}
{"ts":1775976772,"mandate_jti":"2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7","ect_jti":"295f6c79-df73-4851-abeb-07e5c9282268","exec_act":"mcp.session.list_tools","pred":["2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7"],"inp_hash":"FAoQVBJWe7LZRT0R_KDM00VROavjBr5oy2FnUTh8xW8","wimse_aud":"mcp-server","keyid":"kid:agent:v1"}
{"ts":1775976777,"mandate_jti":"2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7","ect_jti":"d242256f-f17c-4b3d-995d-60f13beacc25","exec_act":"mcp.session.initialize","pred":["2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7"],"inp_hash":"lfP0Eme0p0cySTkM_o1_sVF78QBpNt_KV4WDkctMhfE","wimse_aud":"mcp-server","keyid":"kid:agent:v1"}
{"ts":1775976777,"mandate_jti":"2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7","ect_jti":"2eb7eef6-6195-44a1-a88f-78be075cdf17","exec_act":"mcp.session.other","pred":["2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7"],"inp_hash":"GZFEDHQzdcw_cR7DeIxzLP_3GYSTOov3NwH_VjQnhg0","wimse_aud":"mcp-server","keyid":"kid:agent:v1"}
{"ts":1775976777,"mandate_jti":"2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7","ect_jti":"27c2682f-3b0a-4658-8710-113ae45864c1","exec_act":"mcp.summarize","pred":["2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7","7ac2034b-ac27-42b8-bb9c-1d790ac9a4f0"],"inp_hash":"X9oemPYTMLwjvBrPXI6_CK4YfyOMJzfXNtiNNEcqGMM","wimse_aud":"mcp-server","keyid":"kid:agent:v1"}
{"ts":1775976777,"mandate_jti":"2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7","ect_jti":"e220a1d8-ec56-43f8-9952-a1f4387f607b","exec_act":"mcp.session.list_tools","pred":["2a92c5b7-dd0f-44f5-8768-d1c3bf8f1cf7"],"inp_hash":"FAoQVBJWe7LZRT0R_KDM00VROavjBr5oy2FnUTh8xW8","wimse_aud":"mcp-server","keyid":"kid:agent:v1"}

View File

@@ -0,0 +1,48 @@
[build-system]
requires = ["setuptools>=68"]
build-backend = "setuptools.build_meta"
[project]
name = "act-ect-poc"
version = "0.1.0"
description = "End-to-end PoC: LangGraph agent calling MCP tools with ACT mandate + ECT execution context"
requires-python = ">=3.11"
dependencies = [
# Our refimpls (installed from sibling packages)
"ietf-act",
"ietf-ect",
# MCP server + client
"mcp>=1.6.0",
# LangGraph + LLM plumbing
"langgraph>=0.3.0",
"langchain>=0.3.0",
"langchain-core>=0.3.0",
"langchain-ollama>=0.2.0",
"langchain-mcp-adapters>=0.1.0",
# HTTP server + client
"fastapi>=0.110",
"uvicorn[standard]>=0.29",
"httpx>=0.27",
# Crypto / JWT
"cryptography>=42.0",
"PyJWT>=2.8.0",
]
[project.optional-dependencies]
dev = ["pytest>=8.0", "pytest-asyncio>=0.23"]
[project.scripts]
poc-server = "poc.server:main"
poc-agent = "poc.agent:main"
poc-verify = "poc.verify_cli:main"
[tool.setuptools.packages.find]
where = ["src"]
[tool.uv.sources]
ietf-act = { path = "../../workspace/packages/act", editable = true }
ietf-ect = { path = "../../workspace/packages/ect", editable = true }
[tool.pytest.ini_options]
testpaths = ["tests"]
asyncio_mode = "auto"

View File

@@ -0,0 +1,3 @@
"""ACT + ECT + MCP + LangGraph end-to-end PoC."""
__version__ = "0.1.0"

View File

@@ -0,0 +1,447 @@
"""LangGraph ReAct agent that calls MCP tools with ACT + ECT on every request.
Flow per run
------------
1. ``mint_mandate`` — user issues a Phase 1 ACT mandate that authorises the
agent to use ``mcp.search``, ``mcp.summarize``, plus session-level actions.
2. ``MultiServerMCPClient`` opens a streamable-HTTP session to the MCP
server. The session's ``httpx.AsyncClient`` has event hooks installed
(``_install_ect_hooks``) that, on every outgoing POST to /mcp:
* build an ECT over the request body (inp_hash),
* sign the request per RFC 9421 with ``wimse-aud=mcp-server``,
* attach ``Authorization: Bearer <ACT>``, ``Wimse-ECT: <ect>``,
``Content-Digest``, ``Signature-Input`` and ``Signature``.
Each ECT's ``pred`` chains to the mandate plus all prior tool-call ECTs
in this run, so the ECT DAG captures the per-tool-call ordering.
3. ``create_react_agent`` runs a LangGraph ReAct loop with ChatOllama; the
LLM decides when/what to call. The token plumbing is transparent to
the model.
4. After the agent finishes its response, a single Phase 2 ACT execution
record is minted that summarises the run (ACT §3.2: one mandate → one
record; jti preserved). The record's ``inp_hash`` covers the task
purpose and ``out_hash`` covers the final assistant message.
"""
from __future__ import annotations
import argparse
import asyncio
import hashlib
import json
import logging
import os
import time
from contextlib import asynccontextmanager
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, AsyncIterator
import httpx
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain_ollama import ChatOllama
from langgraph.prebuilt import create_react_agent
from act.crypto import b64url_sha256
from .http_sig import SignedRequest, content_digest, sign_request
from .keys import Identity, load_identities
from .tokens import (
MintedMandate,
MintedRecord,
MintedECT,
exec_act_for_rpc_method,
mint_ect,
mint_exec_record,
mint_mandate,
)
LOG = logging.getLogger("poc.agent")
SERVER_IDENTITY_NAME = "mcp-server"
# ---- Session ledger ---------------------------------------------------------
@dataclass
class LedgerEntry:
kind: str # "mandate" | "ect" | "record"
compact: str
jti: str
metadata: dict[str, Any] = field(default_factory=dict)
def to_json(self) -> str:
return json.dumps(
{
"kind": self.kind,
"jti": self.jti,
"compact": self.compact,
"metadata": self.metadata,
},
separators=(",", ":"),
)
@dataclass
class SessionLedger:
"""Mutable per-run state: mandate + growing chain of ECT tool invocations.
The ECT ``pred`` set grows with each successful tool call, giving a
DAG of execution contexts. There is exactly one ACT Phase 2 record per
run (minted at the end), whose jti equals the mandate jti per ACT §3.2.
"""
path: Path
mandate: MintedMandate
tool_ects: list[MintedECT] = field(default_factory=list)
final_record: MintedRecord | None = None
def write_entry(self, entry: LedgerEntry) -> None:
self.path.parent.mkdir(parents=True, exist_ok=True)
with self.path.open("a", encoding="utf-8") as fh:
fh.write(entry.to_json() + "\n")
def tool_ect_pred(self) -> list[str]:
"""pred list for the *next* tool-call ECT: mandate + prior tool ECTs."""
return [self.mandate.mandate.jti] + [e.payload.jti for e in self.tool_ects]
# ---- httpx event hooks ------------------------------------------------------
def _rpc_method_and_tool(body: bytes) -> tuple[str | None, str | None]:
"""Sniff a JSON-RPC request body for (method, tool_name)."""
try:
obj = json.loads(body.decode("utf-8"))
except Exception:
return None, None
if not isinstance(obj, dict):
return None, None
method = obj.get("method")
if not isinstance(method, str):
return None, None
tool_name = None
if method == "tools/call":
params = obj.get("params") or {}
name = params.get("name") if isinstance(params, dict) else None
if isinstance(name, str):
tool_name = name
return method, tool_name
def _install_ect_hooks(
client: httpx.AsyncClient,
*,
agent: Identity,
audience: str,
ledger: SessionLedger,
mcp_path: str = "/mcp",
) -> None:
"""Attach request/response event hooks that inject ACT+ECT+sig headers."""
state_key = "_poc_ect_state"
async def on_request(request: httpx.Request) -> None:
if not request.url.path.endswith(mcp_path):
return
# httpx may have already serialized body into request.content.
body = request.content or b""
method, tool_name = _rpc_method_and_tool(body)
if method is None:
# Not JSON-RPC — still attach mandate so middleware can 403
# rather than 401, but skip ECT/record minting. The PoC never
# triggers this path; keep it permissive to ease debugging.
request.headers["authorization"] = f"Bearer {ledger.mandate.compact}"
return
try:
exec_act = exec_act_for_rpc_method(method, tool_name)
except ValueError:
LOG.warning("unknown tool in tools/call: %r", tool_name)
return
# Session-setup calls (initialize, tools/list, ping, …) don't grow
# the tool-call DAG — they point only at the mandate. Tool-call
# ECTs chain off the mandate plus every prior tool-call ECT.
is_tool_call = method == "tools/call"
if is_tool_call:
pred_jtis = ledger.tool_ect_pred()
else:
pred_jtis = [ledger.mandate.mandate.jti]
ect = mint_ect(
agent=agent,
audience=audience,
exec_act=exec_act,
pred_jtis=pred_jtis,
inp_body=body,
)
signed: SignedRequest = sign_request(
method=request.method,
target_uri=str(request.url),
body=body,
wimse_ect=ect.compact,
wimse_aud=audience,
keyid=agent.kid,
private_key=agent.private_key,
)
request.headers["authorization"] = f"Bearer {ledger.mandate.compact}"
request.headers["wimse-ect"] = ect.compact
request.headers["content-digest"] = signed.content_digest
request.headers["signature-input"] = signed.signature_input
request.headers["signature"] = signed.signature
# Stash so response hook can mint the exec record correlating the
# HTTP exchange with the ECT we just sent.
setattr(request, state_key, {
"ect": ect,
"exec_act": exec_act,
"method": method,
"tool_name": tool_name,
"inp_hash": b64url_sha256(body),
"pred_jtis": pred_jtis,
"request_body": body,
})
async def on_response(response: httpx.Response) -> None:
request = response.request
st = getattr(request, state_key, None)
if not st:
return
method: str = st["method"]
ect = st["ect"]
if method == "tools/call":
ledger.tool_ects.append(ect)
ledger.write_entry(
LedgerEntry(
kind="ect",
compact=ect.compact,
jti=ect.payload.jti,
metadata={
"method": method,
"tool_name": st["tool_name"],
"exec_act": st["exec_act"],
"pred": list(ect.payload.pred),
},
)
)
else:
ledger.write_entry(
LedgerEntry(
kind="ect",
compact=ect.compact,
jti=ect.payload.jti,
metadata={
"method": method,
"exec_act": st["exec_act"],
"session_only": True,
},
)
)
client.event_hooks["request"].append(on_request)
client.event_hooks["response"].append(on_response)
# ---- MCP client factory -----------------------------------------------------
def make_httpx_client_factory(agent: Identity, audience: str, ledger: SessionLedger):
"""Return an httpx_client_factory that installs our hooks on each client."""
from mcp.shared._httpx_utils import (
MCP_DEFAULT_SSE_READ_TIMEOUT,
MCP_DEFAULT_TIMEOUT,
)
def factory(
headers: dict[str, str] | None = None,
timeout: httpx.Timeout | None = None,
auth: httpx.Auth | None = None,
) -> httpx.AsyncClient:
kwargs: dict[str, Any] = {"follow_redirects": True}
if timeout is None:
kwargs["timeout"] = httpx.Timeout(
MCP_DEFAULT_TIMEOUT, read=MCP_DEFAULT_SSE_READ_TIMEOUT
)
else:
kwargs["timeout"] = timeout
if headers is not None:
kwargs["headers"] = headers
if auth is not None:
kwargs["auth"] = auth
client = httpx.AsyncClient(**kwargs)
_install_ect_hooks(client, agent=agent, audience=audience, ledger=ledger)
return client
return factory
# ---- Run an agent turn ------------------------------------------------------
@asynccontextmanager
async def open_mcp_client(
*, agent: Identity, audience: str, ledger: SessionLedger, url: str
) -> AsyncIterator[MultiServerMCPClient]:
factory = make_httpx_client_factory(agent, audience, ledger)
client = MultiServerMCPClient(
{
"poc": {
"transport": "streamable_http",
"url": url,
"httpx_client_factory": factory,
}
}
)
try:
yield client
finally:
# MultiServerMCPClient does not expose an explicit close in 0.2.x;
# sessions are closed per get_tools() call. Nothing to do here.
pass
async def run_once(
*,
purpose: str,
model: str,
mcp_url: str,
keys_dir: str,
ledger_path: str,
ollama_host: str | None,
) -> dict[str, Any]:
identities = load_identities(keys_dir)
user = identities["user"]
agent = identities["agent"]
mandate = mint_mandate(
user=user,
agent=agent,
audience=SERVER_IDENTITY_NAME,
purpose=purpose,
)
ledger = SessionLedger(path=Path(ledger_path), mandate=mandate)
ledger.write_entry(
LedgerEntry(
kind="mandate",
compact=mandate.compact,
jti=mandate.mandate.jti,
metadata={
"iss": mandate.mandate.iss,
"sub": mandate.mandate.sub,
"aud": mandate.mandate.aud,
"task": mandate.mandate.task.to_dict(),
"cap": [c.to_dict() for c in mandate.mandate.cap],
},
)
)
async with open_mcp_client(
agent=agent, audience=SERVER_IDENTITY_NAME, ledger=ledger, url=mcp_url
) as client:
tools = await client.get_tools()
LOG.info("loaded %d MCP tools: %s", len(tools), [t.name for t in tools])
llm_kwargs: dict[str, Any] = {"model": model, "temperature": 0.0}
if ollama_host:
llm_kwargs["base_url"] = ollama_host
llm = ChatOllama(**llm_kwargs)
graph = create_react_agent(llm, tools)
system = SystemMessage(
content=(
"You are a research assistant with access to two tools: "
"search(query) and summarize(text). "
"For the user's task, first call search to gather material, "
"then call summarize on the joined results. "
"After the summary, reply with the summary and stop."
)
)
human = HumanMessage(content=purpose)
result = await graph.ainvoke({"messages": [system, human]})
final_msg = result["messages"][-1]
final_text = getattr(final_msg, "content", str(final_msg))
if isinstance(final_text, list):
final_text = json.dumps(final_text, sort_keys=True)
# ACT §3.2: one mandate → one Phase 2 record (jti preserved). The
# record summarises the whole invocation; per-tool-call DAG structure
# lives in the ECTs we already logged.
final_record = mint_exec_record(
agent=agent,
mandate=mandate.mandate,
exec_act="mcp.summarize", # terminal exec_act; picked from cap
pred_jtis=[], # root task within this run's ACT view
inp_body=purpose.encode("utf-8"),
out_body=final_text.encode("utf-8"),
)
ledger.final_record = final_record
ledger.write_entry(
LedgerEntry(
kind="record",
compact=final_record.compact,
jti=final_record.record.jti,
metadata={
"exec_act": final_record.record.exec_act,
"status": final_record.record.status,
"pred": list(final_record.record.pred),
"inp_hash": final_record.record.inp_hash,
"out_hash": final_record.record.out_hash,
"n_tool_ects": len(ledger.tool_ects),
},
)
)
return {
"mandate_jti": mandate.mandate.jti,
"record_jti": final_record.record.jti,
"tool_ects": [e.payload.jti for e in ledger.tool_ects],
"final_message": final_text,
}
def main() -> None:
logging.basicConfig(
level=os.environ.get("POC_LOG_LEVEL", "INFO"),
format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
parser = argparse.ArgumentParser(description="ACT/ECT MCP PoC agent")
parser.add_argument(
"--purpose",
default="Summarise recent research on agent authorization tokens.",
help="High-level task the mandate authorises.",
)
parser.add_argument("--model", default=os.environ.get("POC_MODEL", "qwen3:8b"))
parser.add_argument(
"--mcp-url", default=os.environ.get("POC_MCP_URL", "http://127.0.0.1:8765/mcp")
)
parser.add_argument("--keys-dir", default=os.environ.get("POC_KEYS_DIR", "keys"))
parser.add_argument(
"--ledger", default=os.environ.get("POC_LEDGER", "keys/ledger.jsonl")
)
parser.add_argument("--ollama-host", default=os.environ.get("OLLAMA_HOST"))
args = parser.parse_args()
summary = asyncio.run(
run_once(
purpose=args.purpose,
model=args.model,
mcp_url=args.mcp_url,
keys_dir=args.keys_dir,
ledger_path=args.ledger,
ollama_host=args.ollama_host,
)
)
print(json.dumps(summary, indent=2))
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,238 @@
"""Minimal RFC 9421 HTTP Message Signatures for the PoC.
Covers just enough of RFC 9421 + draft-ietf-wimse-http-signature-03 to
bind an ECT-bearing request to its method, target URI, body digest, and
the Wimse-ECT header itself. Not a general-purpose implementation — the
signed-component serialization follows RFC 9421 §2.3 for this fixed set
of components only.
Signature metadata parameters per draft-ietf-wimse-http-signature-03:
keyid — kid of the signing workload
alg — "ecdsa-p256-sha256"
created — NumericDate seconds
wimse-aud — target audience workload identity (new in -03, replaces
the removed Wimse-Audience HTTP header)
Format:
Signature-Input: sig1=(\"@method\" \"@target-uri\" \"content-digest\" \
\"wimse-ect\");created=...;keyid=\"...\";\
alg=\"ecdsa-p256-sha256\";wimse-aud=\"...\"
Signature: sig1=:<base64>:
"""
from __future__ import annotations
import base64
import hashlib
import time
from dataclasses import dataclass
from typing import Iterable
from urllib.parse import urlsplit
from cryptography.hazmat.primitives.asymmetric.ec import (
EllipticCurvePrivateKey,
EllipticCurvePublicKey,
)
from act.crypto import sign as act_sign, verify as act_verify
from act.errors import ACTSignatureError
COVERED_COMPONENTS: tuple[str, ...] = (
"@method",
"@target-uri",
"content-digest",
"wimse-ect",
)
SIG_ALG = "ecdsa-p256-sha256"
def content_digest(body: bytes) -> str:
"""RFC 9530 Content-Digest header value using sha-256."""
digest = hashlib.sha256(body).digest()
return "sha-256=:" + base64.b64encode(digest).decode("ascii") + ":"
def _serialize_components(
*,
method: str,
target_uri: str,
content_digest_hdr: str,
wimse_ect_hdr: str,
params: str,
) -> bytes:
"""RFC 9421 §2.3 signature base for the fixed PoC component set."""
lines = [
f'"@method": {method.upper()}',
f'"@target-uri": {target_uri}',
f'"content-digest": {content_digest_hdr}',
f'"wimse-ect": {wimse_ect_hdr}',
f'"@signature-params": {params}',
]
return "\n".join(lines).encode("ascii")
def _signature_params(
*,
created: int,
keyid: str,
wimse_aud: str,
) -> str:
"""Render the signature-params string (quoted inner-list + params)."""
components = " ".join(f'"{c}"' for c in COVERED_COMPONENTS)
return (
f"({components})"
f";created={created}"
f';keyid="{keyid}"'
f';alg="{SIG_ALG}"'
f';wimse-aud="{wimse_aud}"'
)
@dataclass
class SignedRequest:
content_digest: str
signature_input: str
signature: str
def sign_request(
*,
method: str,
target_uri: str,
body: bytes,
wimse_ect: str,
wimse_aud: str,
keyid: str,
private_key: EllipticCurvePrivateKey,
created: int | None = None,
) -> SignedRequest:
"""Produce the three headers needed to send a signed PoC request."""
created = int(created if created is not None else time.time())
cd = content_digest(body)
params = _signature_params(created=created, keyid=keyid, wimse_aud=wimse_aud)
base = _serialize_components(
method=method,
target_uri=target_uri,
content_digest_hdr=cd,
wimse_ect_hdr=wimse_ect,
params=params,
)
sig = act_sign(private_key, base)
sig_b64 = base64.b64encode(sig).decode("ascii")
return SignedRequest(
content_digest=cd,
signature_input=f"sig1={params}",
signature=f"sig1=:{sig_b64}:",
)
@dataclass
class ParsedSignature:
covered: tuple[str, ...]
created: int
keyid: str
alg: str
wimse_aud: str
raw_params: str
signature_b64: str
def _parse_signature_input(value: str) -> ParsedSignature:
"""Parse 'sig1=(...);created=...;keyid="...";alg="...";wimse-aud="..."'."""
if "=" not in value:
raise ValueError("signature-input: missing label")
_label, _, inner = value.partition("=")
# inner: '("a" "b" ...);created=...;keyid="...";...'
if not inner.startswith("("):
raise ValueError("signature-input: missing covered components list")
close = inner.index(")")
components_raw = inner[1:close]
covered = tuple(
part.strip().strip('"') for part in components_raw.split() if part.strip()
)
rest = inner[close + 1 :]
params: dict[str, str] = {}
for part in rest.split(";"):
part = part.strip()
if not part or "=" not in part:
continue
k, _, v = part.partition("=")
params[k.strip()] = v.strip().strip('"')
return ParsedSignature(
covered=covered,
created=int(params["created"]),
keyid=params["keyid"],
alg=params["alg"],
wimse_aud=params["wimse-aud"],
raw_params=inner, # full params string (for sig base reconstruction)
signature_b64="",
)
def _parse_signature(value: str) -> str:
"""Parse 'sig1=:<base64>:' → base64 string."""
if "=" not in value:
raise ValueError("signature: missing label")
_label, _, inner = value.partition("=")
inner = inner.strip()
if not (inner.startswith(":") and inner.endswith(":")):
raise ValueError("signature: expected byte-sequence form :...:" )
return inner[1:-1]
def verify_request(
*,
method: str,
target_uri: str,
body: bytes,
wimse_ect_header: str,
content_digest_header: str,
signature_input_header: str,
signature_header: str,
expected_audience: str,
public_key: EllipticCurvePublicKey,
max_age_seconds: int = 300,
now: int | None = None,
) -> ParsedSignature:
"""Verify the signature covers the expected components and matches.
Returns the parsed signature metadata (keyid, alg, wimse-aud, created)
so the caller can cross-check against ECT/ACT claims.
"""
parsed = _parse_signature_input(signature_input_header)
if parsed.covered != COVERED_COMPONENTS:
raise ACTSignatureError(
f"signed components {parsed.covered!r} differ from expected "
f"{COVERED_COMPONENTS!r}"
)
if parsed.alg != SIG_ALG:
raise ACTSignatureError(f"unexpected alg {parsed.alg!r}")
if parsed.wimse_aud != expected_audience:
raise ACTSignatureError(
f"wimse-aud {parsed.wimse_aud!r} != expected {expected_audience!r}"
)
current = int(now if now is not None else time.time())
if current - parsed.created > max_age_seconds:
raise ACTSignatureError(
f"signature too old: created={parsed.created}, now={current}"
)
expected_digest = content_digest(body)
if content_digest_header != expected_digest:
raise ACTSignatureError("content-digest does not match body")
sig_b64 = _parse_signature(signature_header)
sig = base64.b64decode(sig_b64)
base = _serialize_components(
method=method,
target_uri=target_uri,
content_digest_hdr=content_digest_header,
wimse_ect_hdr=wimse_ect_header,
params=parsed.raw_params,
)
act_verify(public_key, sig, base) # raises ACTSignatureError on bad sig
parsed.signature_b64 = sig_b64
return parsed

View File

@@ -0,0 +1,97 @@
"""Key material for the three PoC identities.
The PoC uses three ES256 (P-256) keys — the common algorithm for both
ACT and ECT per draft-nennemann-act-01 §5 and draft-nennemann-wimse-ect-01 §5.
Identities:
user — issues the ACT mandate (iss in Phase 1)
agent — subject of the mandate, signs Phase 2 record, signs ECT on
every MCP tool call (sub in ACT, iss in ECT)
mcp-server — audience / verifier (aud in both ACT and ECT)
Keys are written to ``keys/`` as PEM files on first run; subsequent runs
load them. This mimics pre-shared-key deployment per ACT §5.2 Tier 1.
"""
from __future__ import annotations
from dataclasses import dataclass
from pathlib import Path
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ec import (
EllipticCurvePrivateKey,
EllipticCurvePublicKey,
)
from act.crypto import KeyRegistry, generate_p256_keypair
IDENTITIES = ("user", "agent", "mcp-server")
@dataclass
class Identity:
name: str
kid: str
private_key: EllipticCurvePrivateKey
public_key: EllipticCurvePublicKey
def _pem_paths(keys_dir: Path, name: str) -> tuple[Path, Path]:
return keys_dir / f"{name}.priv.pem", keys_dir / f"{name}.pub.pem"
def _load_or_generate(keys_dir: Path, name: str) -> Identity:
priv_path, pub_path = _pem_paths(keys_dir, name)
if priv_path.exists() and pub_path.exists():
priv_bytes = priv_path.read_bytes()
priv = serialization.load_pem_private_key(priv_bytes, password=None)
assert isinstance(priv, EllipticCurvePrivateKey), (
f"{name}.priv.pem is not a P-256 private key"
)
pub = priv.public_key()
else:
priv, pub = generate_p256_keypair()
keys_dir.mkdir(parents=True, exist_ok=True)
priv_path.write_bytes(
priv.private_bytes(
encoding=serialization.Encoding.PEM,
format=serialization.PrivateFormat.PKCS8,
encryption_algorithm=serialization.NoEncryption(),
)
)
pub_path.write_bytes(
pub.public_bytes(
encoding=serialization.Encoding.PEM,
format=serialization.PublicFormat.SubjectPublicKeyInfo,
)
)
kid = f"kid:{name}:v1"
return Identity(name=name, kid=kid, private_key=priv, public_key=pub)
def load_identities(keys_dir: str | Path = "keys") -> dict[str, Identity]:
"""Load all three PoC identities, generating key material if missing."""
keys_dir = Path(keys_dir)
return {name: _load_or_generate(keys_dir, name) for name in IDENTITIES}
def build_key_registry(identities: dict[str, Identity]) -> KeyRegistry:
"""Assemble an ACT KeyRegistry with every identity's public key."""
reg = KeyRegistry()
for ident in identities.values():
reg.register(ident.kid, ident.public_key)
return reg
def build_ect_key_resolver(identities: dict[str, Identity]):
"""Return an ECT KeyResolver callable that maps kid → public key."""
kid_to_pub: dict[str, EllipticCurvePublicKey] = {
ident.kid: ident.public_key for ident in identities.values()
}
def _resolve(kid: str) -> EllipticCurvePublicKey | None:
return kid_to_pub.get(kid)
return _resolve

View File

@@ -0,0 +1,300 @@
"""MCP server with ACT + ECT + HTTP-signature enforcement middleware.
The server exposes two tools via FastMCP streamable-HTTP:
search(query: str) — returns a list of fake hits (str[])
summarize(text: str) — returns a short synthetic summary (str)
Every POST to /mcp goes through ``ActEctAuthMiddleware`` which:
1. parses ``Authorization: Bearer <act-mandate>`` and verifies the
Phase 1 mandate (ACT §8.1);
2. parses ``Wimse-ECT: <ect-compact>`` and verifies the ECT
(draft-nennemann-wimse-ect-01 §7);
3. verifies the RFC 9421 HTTP-signature over
@method/@target-uri/content-digest/wimse-ect with
wimse-aud=mcp-server (per draft-ietf-wimse-http-signature-03);
4. cross-checks exec_act ∈ mandate.cap[].action, mandate.sub == ECT.iss,
ECT.inp_hash == sha256(body).
On failure any check returns HTTP 401/403 before the request reaches
FastMCP. Successful audits are appended to ``AUDIT_LOG_PATH`` so the
verifier CLI can reconstruct the server's view of the DAG.
"""
from __future__ import annotations
import argparse
import base64
import hashlib
import json
import logging
import os
import time
from pathlib import Path
from typing import Any
import uvicorn
from mcp.server.fastmcp import FastMCP
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import JSONResponse, Response
from act.crypto import ACTKeyResolver, b64url_sha256
from act.errors import ACTError, ACTSignatureError
from act.verify import ACTVerifier
import ect as ect_pkg # noqa: F401 (ensures package import)
from ect.verify import verify as ect_verify, VerifyOptions
from .http_sig import verify_request
from .keys import Identity, build_ect_key_resolver, build_key_registry, load_identities
LOG = logging.getLogger("poc.server")
AUDIT_LOG_PATH = Path(os.environ.get("POC_AUDIT_LOG", "keys/server-audit.jsonl"))
KEYS_DIR = os.environ.get("POC_KEYS_DIR", "keys")
SERVER_IDENTITY_NAME = "mcp-server"
AGENT_IDENTITY_NAME = "agent"
# ---- FastMCP tools ----------------------------------------------------------
def _build_mcp() -> FastMCP:
"""Fresh FastMCP instance with the two PoC tools registered.
A new instance per ``build_app`` call keeps ``StreamableHTTPSessionManager``
usable in tests that start the Starlette lifespan multiple times (the
session manager is a single-use object).
"""
from mcp.server.transport_security import TransportSecuritySettings
mcp = FastMCP(
"act-ect-poc",
transport_security=TransportSecuritySettings(
# Allow the test hostname ``testserver`` in addition to loopback.
allowed_hosts=[
"127.0.0.1:*", "localhost:*", "[::1]:*", "testserver",
],
allowed_origins=[
"http://127.0.0.1:*", "http://localhost:*",
"http://[::1]:*", "http://testserver",
],
),
)
@mcp.tool()
def search(query: str) -> list[str]:
"""Return three fake search hits for the query."""
q = query.strip() or "empty"
return [f"[{q}] result {i + 1}: lorem ipsum about {q}" for i in range(3)]
@mcp.tool()
def summarize(text: str) -> str:
"""Return a deterministic one-line summary of the text."""
snippet = text.strip().replace("\n", " ")
if len(snippet) > 120:
snippet = snippet[:117] + "..."
return f"Summary: {snippet}"
return mcp
# ---- Auth middleware --------------------------------------------------------
class ActEctAuthMiddleware(BaseHTTPMiddleware):
"""Enforce ACT mandate + ECT + HTTP signature on every tool invocation."""
def __init__(self, app, identities: dict[str, Identity]) -> None:
super().__init__(app)
self._identities = identities
server = identities[SERVER_IDENTITY_NAME]
agent = identities[AGENT_IDENTITY_NAME]
registry = build_key_registry(identities)
resolver = ACTKeyResolver(registry=registry)
self._act_verifier = ACTVerifier(
resolver,
verifier_id=SERVER_IDENTITY_NAME,
trusted_issuers={ident.name for ident in identities.values()},
)
self._ect_resolver = build_ect_key_resolver(identities)
self._agent_public_key = agent.public_key
self._server_name = server.name
self._agent_name = agent.name
async def dispatch(self, request: Request, call_next):
# Only enforce on the MCP endpoint; let everything else through.
if request.url.path != "/mcp" or request.method.upper() != "POST":
return await call_next(request)
try:
auth_ctx = await self._authorize(request)
except _AuthFailure as e:
LOG.warning("auth rejected: %s", e.detail)
return JSONResponse({"error": e.detail}, status_code=e.status)
# Pass the verified context downstream so a handler could read it.
request.state.act_ect = auth_ctx
response: Response = await call_next(request)
_append_audit(AUDIT_LOG_PATH, auth_ctx)
return response
async def _authorize(self, request: Request) -> dict[str, Any]:
body = await request.body()
# 1. Authorization: Bearer <act-mandate-compact>
auth_hdr = request.headers.get("authorization", "")
if not auth_hdr.lower().startswith("bearer "):
raise _AuthFailure(401, "missing Authorization: Bearer")
act_compact = auth_hdr[len("bearer ") :].strip()
# 2. Wimse-ECT: <ect-compact>
ect_compact = request.headers.get("wimse-ect", "").strip()
if not ect_compact:
raise _AuthFailure(401, "missing Wimse-ECT header")
# 3. RFC 9421 HTTP signature (over method/target/content-digest/wimse-ect)
sig_input = request.headers.get("signature-input", "")
sig = request.headers.get("signature", "")
content_digest_hdr = request.headers.get("content-digest", "")
if not (sig_input and sig and content_digest_hdr):
raise _AuthFailure(
401, "missing Signature / Signature-Input / Content-Digest"
)
# Starlette's request.url gives us a URL; canonicalize target URI.
# Use the path+query as the target since scheme/host differs behind
# reverse proxies. For the PoC we match client and server on the
# exact full URL the client signed.
target_uri = str(request.url)
try:
parsed_sig = verify_request(
method=request.method,
target_uri=target_uri,
body=body,
wimse_ect_header=ect_compact,
content_digest_header=content_digest_hdr,
signature_input_header=sig_input,
signature_header=sig,
expected_audience=self._server_name,
public_key=self._agent_public_key,
)
except ACTSignatureError as e:
raise _AuthFailure(401, f"http-signature failed: {e}") from e
# 4. ACT mandate
try:
mandate = self._act_verifier.verify_mandate(
act_compact, check_sub=False
)
except ACTError as e:
raise _AuthFailure(401, f"ACT mandate rejected: {e}") from e
if mandate.sub != self._agent_name:
raise _AuthFailure(
403, f"mandate.sub {mandate.sub!r} != agent {self._agent_name!r}"
)
# 5. ECT
try:
ect_opts = VerifyOptions(
verifier_id=self._server_name,
resolve_key=self._ect_resolver,
)
parsed_ect = ect_verify(ect_compact, ect_opts)
except Exception as e: # ECT refimpl raises ValueError subclasses
raise _AuthFailure(401, f"ECT rejected: {e}") from e
if parsed_sig.keyid != parsed_ect.header.get("kid"):
raise _AuthFailure(
401,
f"http-sig keyid {parsed_sig.keyid!r} != ect kid "
f"{parsed_ect.header.get('kid')!r}",
)
# 6. Cross-check inp_hash binds to this body
body_hash = b64url_sha256(body)
if parsed_ect.payload.inp_hash and parsed_ect.payload.inp_hash != body_hash:
raise _AuthFailure(401, "ECT.inp_hash does not match request body")
# 7. exec_act must be authorised by mandate.cap
cap_actions = {c.action for c in mandate.cap}
if parsed_ect.payload.exec_act not in cap_actions:
raise _AuthFailure(
403,
f"exec_act {parsed_ect.payload.exec_act!r} not in "
f"mandate.cap {sorted(cap_actions)!r}",
)
# 8. ECT issuer should equal mandate subject (the executing agent)
if parsed_ect.payload.iss != mandate.sub:
raise _AuthFailure(
403,
f"ECT.iss {parsed_ect.payload.iss!r} != mandate.sub "
f"{mandate.sub!r}",
)
return {
"ts": int(time.time()),
"mandate_jti": mandate.jti,
"ect_jti": parsed_ect.payload.jti,
"exec_act": parsed_ect.payload.exec_act,
"pred": list(parsed_ect.payload.pred),
"inp_hash": body_hash,
"wimse_aud": parsed_sig.wimse_aud,
"keyid": parsed_sig.keyid,
}
class _AuthFailure(Exception):
def __init__(self, status: int, detail: str) -> None:
super().__init__(detail)
self.status = status
self.detail = detail
def _append_audit(path: Path, entry: dict[str, Any]) -> None:
path.parent.mkdir(parents=True, exist_ok=True)
with path.open("a", encoding="utf-8") as fh:
fh.write(json.dumps(entry, separators=(",", ":")) + "\n")
# ---- ASGI assembly ----------------------------------------------------------
def build_app(identities: dict[str, Identity]):
mcp = _build_mcp()
app = mcp.streamable_http_app()
app.add_middleware(ActEctAuthMiddleware, identities=identities)
return app
def main() -> None:
logging.basicConfig(
level=os.environ.get("POC_LOG_LEVEL", "INFO"),
format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
parser = argparse.ArgumentParser(description="ACT/ECT MCP PoC server")
parser.add_argument("--host", default="127.0.0.1")
parser.add_argument("--port", type=int, default=8765)
parser.add_argument("--keys-dir", default=KEYS_DIR)
args = parser.parse_args()
identities = load_identities(args.keys_dir)
app = build_app(identities)
LOG.info(
"serving MCP at http://%s:%d/mcp; audit=%s",
args.host,
args.port,
AUDIT_LOG_PATH,
)
uvicorn.run(app, host=args.host, port=args.port, log_level="info")
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,197 @@
"""Mint and verify ACT mandates/records and ECT payloads for the PoC.
Thin convenience wrappers around ietf-act and ietf-ect that fix the
PoC's shape (one user issues to one agent, one MCP-server audience)
while leaving all cryptography and claim validation to the refimpls.
"""
from __future__ import annotations
import time
import uuid
from dataclasses import dataclass
from typing import Optional
from act.crypto import b64url_sha256, sign as act_sign
from act.token import (
ACTMandate,
ACTRecord,
Capability,
TaskClaim,
encode_jws,
)
import ect
from ect.create import CreateOptions, create as ect_create
from ect.types import Payload as ECTPayload
from .keys import Identity
# Capability action names the agent may exercise on the MCP server.
# mcp.session.* covers JSON-RPC plumbing (initialize, tools/list, ping, …)
# that precedes real tool invocations.
MCP_CAPS = (
"mcp.session.initialize",
"mcp.session.list_tools",
"mcp.session.other",
"mcp.search",
"mcp.summarize",
)
# Map tool-name → exec_act for real tool calls.
TOOL_ACTION = {
"search": "mcp.search",
"summarize": "mcp.summarize",
}
def exec_act_for_rpc_method(method: str, tool_name: str | None = None) -> str:
"""Return the ACT/ECT exec_act string for a JSON-RPC call.
Rules:
* ``tools/call`` dispatches to ``TOOL_ACTION[tool_name]``
* ``initialize`` and ``tools/list`` have dedicated ``mcp.session.*`` actions
* anything else collapses to ``mcp.session.other`` so the mandate can
still authorise session-level JSON-RPC without enumerating every
method on both sides.
"""
if method == "tools/call":
if tool_name is None or tool_name not in TOOL_ACTION:
raise ValueError(f"unknown tool: {tool_name!r}")
return TOOL_ACTION[tool_name]
if method == "initialize":
return "mcp.session.initialize"
if method == "tools/list":
return "mcp.session.list_tools"
return "mcp.session.other"
@dataclass
class MintedMandate:
compact: str
mandate: ACTMandate
def mint_mandate(
*,
user: Identity,
agent: Identity,
audience: str,
purpose: str,
ttl_seconds: int = 900,
now: Optional[int] = None,
) -> MintedMandate:
"""User issues a Phase 1 ACT mandate to the agent.
Reference: ACT §3.1, §4.2.
"""
iat = int(now if now is not None else time.time())
mandate = ACTMandate(
alg="ES256",
kid=user.kid,
iss=user.name,
sub=agent.name,
aud=audience,
iat=iat,
exp=iat + ttl_seconds,
jti=str(uuid.uuid4()),
wid=agent.name,
task=TaskClaim(purpose=purpose, created_by=user.name),
cap=[Capability(action=a) for a in MCP_CAPS],
)
mandate.validate()
signature = act_sign(user.private_key, mandate.signing_input())
compact = encode_jws(mandate, signature)
return MintedMandate(compact=compact, mandate=mandate)
@dataclass
class MintedRecord:
compact: str
record: ACTRecord
def mint_exec_record(
*,
agent: Identity,
mandate: ACTMandate,
exec_act: str,
pred_jtis: list[str],
inp_body: bytes,
out_body: bytes,
now: Optional[int] = None,
status: str = "completed",
) -> MintedRecord:
"""Agent mints a Phase 2 ACT execution record after a tool call.
``pred_jtis`` should include the mandate jti and any prior exec record
jtis for this run (DAG semantics, ACT §4.3 ``pred``).
Reference: ACT §3.2, §4.3.
"""
exec_ts = int(now if now is not None else time.time())
record = ACTRecord.from_mandate(
mandate,
kid=agent.kid,
exec_act=exec_act,
pred=pred_jtis,
exec_ts=exec_ts,
status=status,
inp_hash=b64url_sha256(inp_body),
out_hash=b64url_sha256(out_body),
)
# The record's jti MUST equal the mandate's jti per ACT §3.2 / §4 (the
# Phase 2 token records the same task). ``from_mandate`` already copies
# mandate.jti, so no override here.
record.validate()
signature = act_sign(agent.private_key, record.signing_input())
compact = encode_jws(record, signature)
return MintedRecord(compact=compact, record=record)
@dataclass
class MintedECT:
compact: str
payload: ECTPayload
def mint_ect(
*,
agent: Identity,
audience: str,
exec_act: str,
pred_jtis: list[str],
inp_body: bytes,
now: Optional[int] = None,
ttl_seconds: int = 300,
) -> MintedECT:
"""Agent mints an ECT binding the MCP request body to the execution.
The ECT's ``jti`` identifies this specific invocation; ``pred`` links
back to the mandate (and earlier ECT invocations, if any); the request
body is hashed into ``inp_hash``.
Reference: draft-nennemann-wimse-ect-01 §4.
"""
iat = int(now if now is not None else time.time())
payload = ECTPayload(
iss=agent.name,
aud=[audience],
iat=iat,
exp=iat + ttl_seconds,
jti=str(uuid.uuid4()),
exec_act=exec_act,
pred=pred_jtis,
wid=agent.name,
inp_hash=b64url_sha256(inp_body),
)
# ECT refimpl wants a real ES256 key and the same kid shape as ACT.
compact = ect_create(
payload,
agent.private_key,
CreateOptions(key_id=agent.kid),
)
# Reflect the defaulted iat/exp back onto our payload copy for logging.
return MintedECT(compact=compact, payload=payload)

View File

@@ -0,0 +1,188 @@
"""Replay an agent run from ledger.jsonl, verify every token, print the DAG.
Model (spec-consistent)
-----------------------
Per run:
* 1 ACT Phase 1 mandate (user → agent)
* N ECT tokens — one per outgoing MCP HTTP request. Tool-call ECTs form
a DAG via their ``pred`` field; session ECTs (initialize/tools-list)
only point at the mandate.
* 1 ACT Phase 2 record summarising the run (jti = mandate.jti per
ACT §3.2).
The verifier re-runs the ietf-act and ietf-ect refimpls on each compact
form and prints both the coarse ACT summary and the fine-grained ECT DAG.
Run ``poc-verify [--ledger keys/ledger.jsonl] [--keys-dir keys]``.
"""
from __future__ import annotations
import argparse
import json
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Any
from act.crypto import ACTKeyResolver
from act.errors import ACTError
from act.verify import ACTVerifier
from ect.verify import verify as ect_verify, VerifyOptions
from .keys import build_ect_key_resolver, build_key_registry, load_identities
SERVER_IDENTITY_NAME = "mcp-server"
@dataclass
class Row:
kind: str
jti: str
compact: str
metadata: dict[str, Any]
def _read_ledger(path: Path) -> list[Row]:
rows: list[Row] = []
for line in path.read_text().splitlines():
if not line.strip():
continue
obj = json.loads(line)
rows.append(
Row(
kind=obj["kind"],
jti=obj["jti"],
compact=obj["compact"],
metadata=obj.get("metadata", {}),
)
)
return rows
def _fmt_jti(jti: str) -> str:
return jti.split("-")[0]
def run(ledger_path: Path, keys_dir: Path) -> int:
identities = load_identities(keys_dir)
registry = build_key_registry(identities)
resolver = ACTKeyResolver(registry=registry)
ect_resolver = build_ect_key_resolver(identities)
trusted_issuers = {ident.name for ident in identities.values()}
verifier = ACTVerifier(
resolver,
verifier_id=SERVER_IDENTITY_NAME,
trusted_issuers=trusted_issuers,
)
rows = _read_ledger(ledger_path)
mandates = [r for r in rows if r.kind == "mandate"]
records = [r for r in rows if r.kind == "record"]
ect_rows = [r for r in rows if r.kind == "ect"]
if len(mandates) != 1:
raise SystemExit(
f"expected exactly one mandate, got {len(mandates)} in {ledger_path}"
)
if len(records) != 1:
raise SystemExit(
f"expected exactly one record, got {len(records)} in {ledger_path}"
)
try:
mandate = verifier.verify_mandate(mandates[0].compact, check_sub=False)
except ACTError as e:
raise SystemExit(f"mandate verification failed: {e}")
print(f"mandate verified jti={_fmt_jti(mandate.jti)}")
# ECT verification — includes refimpl DAG walk when we supply a store.
# We don't supply one here because ECTStore would need cross-run scoping.
# Each ECT still passes its own Section-7 verification individually.
ect_parsed: list[Any] = []
ect_sessions = 0
ect_tool_calls = 0
for row in ect_rows:
parsed = ect_verify(
row.compact,
VerifyOptions(verifier_id=SERVER_IDENTITY_NAME, resolve_key=ect_resolver),
)
ect_parsed.append(parsed)
if row.metadata.get("session_only"):
ect_sessions += 1
else:
ect_tool_calls += 1
print(
f"ects verified n={len(ect_parsed)} "
f"(tool-calls={ect_tool_calls}, session={ect_sessions})"
)
# Final ACT record — verify without the DAG store (pred=[] for our model).
try:
record = verifier.verify_record(records[0].compact, store=None)
except ACTError as e:
raise SystemExit(f"record verification failed: {e}")
if record.jti != mandate.jti:
raise SystemExit(
f"record.jti {record.jti!r} != mandate.jti {mandate.jti!r} "
"— ACT §3.2 violation"
)
print(f"record verified jti={_fmt_jti(record.jti)} status={record.status}")
# Cross-check: ECT DAG well-formedness within this run.
known_jtis = {mandate.jti} | {p.payload.jti for p in ect_parsed}
dangling = 0
for p in ect_parsed:
for pred in p.payload.pred:
if pred not in known_jtis:
dangling += 1
if dangling:
raise SystemExit(f"ECT DAG has {dangling} dangling predecessor ref(s)")
print("ect-dag wellformed every pred is the mandate or a prior ECT")
# ---- Render ------------------------------------------------------------
print()
print("Run")
print("===")
print(f" mandate {_fmt_jti(mandate.jti)} task={mandate.task.purpose!r}")
print(f" iss={mandate.iss} sub={mandate.sub} aud={mandate.aud}")
print(f" cap={[c.action for c in mandate.cap]}")
print()
print("Tool-call ECT DAG:")
tool_only = [
p
for p, row in zip(ect_parsed, ect_rows)
if not row.metadata.get("session_only")
]
if not tool_only:
print(" (none — model called no tools)")
for p in tool_only:
preds = [_fmt_jti(x) for x in p.payload.pred]
print(
f" ect {_fmt_jti(p.payload.jti)} exec_act={p.payload.exec_act} "
f"pred={preds}"
)
print()
print("ACT Phase 2 record:")
print(f" jti={_fmt_jti(record.jti)} exec_act={record.exec_act}")
print(f" status={record.status} pred={list(record.pred)}")
print(f" inp_hash={record.inp_hash}")
print(f" out_hash={record.out_hash}")
return 0
def main() -> None:
parser = argparse.ArgumentParser(description="Verify ACT+ECT PoC ledger")
parser.add_argument(
"--ledger", default=os.environ.get("POC_LEDGER", "keys/ledger.jsonl")
)
parser.add_argument("--keys-dir", default=os.environ.get("POC_KEYS_DIR", "keys"))
args = parser.parse_args()
raise SystemExit(run(Path(args.ledger), Path(args.keys_dir)))
if __name__ == "__main__":
main()

View File

View File

@@ -0,0 +1,17 @@
import shutil
from pathlib import Path
import pytest
@pytest.fixture
def tmp_keys_dir(tmp_path) -> Path:
d = tmp_path / "keys"
d.mkdir()
return d
@pytest.fixture
def identities(tmp_keys_dir):
from poc.keys import load_identities
return load_identities(tmp_keys_dir)

View File

@@ -0,0 +1,84 @@
"""RFC-9421-shaped HTTP signature round-trip and tamper-detection."""
from __future__ import annotations
import pytest
from act.errors import ACTSignatureError
from poc.http_sig import sign_request, verify_request
def _sign_verify_ok(identities, body: bytes):
agent = identities["agent"]
target = "http://127.0.0.1:8765/mcp"
signed = sign_request(
method="POST",
target_uri=target,
body=body,
wimse_ect="ect.placeholder.compact",
wimse_aud=identities["mcp-server"].name,
keyid=agent.kid,
private_key=agent.private_key,
)
parsed = verify_request(
method="POST",
target_uri=target,
body=body,
wimse_ect_header="ect.placeholder.compact",
content_digest_header=signed.content_digest,
signature_input_header=signed.signature_input,
signature_header=signed.signature,
expected_audience=identities["mcp-server"].name,
public_key=agent.public_key,
)
return signed, parsed
def test_signature_round_trips(identities):
signed, parsed = _sign_verify_ok(identities, body=b'{"method":"tools/call"}')
assert parsed.keyid == identities["agent"].kid
assert parsed.wimse_aud == "mcp-server"
assert parsed.alg == "ecdsa-p256-sha256"
def test_signature_fails_on_tampered_body(identities):
agent = identities["agent"]
signed, _ = _sign_verify_ok(identities, body=b"original")
with pytest.raises(ACTSignatureError):
verify_request(
method="POST",
target_uri="http://127.0.0.1:8765/mcp",
body=b"tampered", # different body → different digest → no match
wimse_ect_header="ect.placeholder.compact",
content_digest_header=signed.content_digest,
signature_input_header=signed.signature_input,
signature_header=signed.signature,
expected_audience="mcp-server",
public_key=agent.public_key,
)
def test_signature_fails_on_wrong_audience(identities):
agent = identities["agent"]
signed = sign_request(
method="POST",
target_uri="http://example/mcp",
body=b"{}",
wimse_ect="ect.placeholder",
wimse_aud="the-wrong-workload", # signed for the wrong audience
keyid=agent.kid,
private_key=agent.private_key,
)
with pytest.raises(ACTSignatureError):
verify_request(
method="POST",
target_uri="http://example/mcp",
body=b"{}",
wimse_ect_header="ect.placeholder",
content_digest_header=signed.content_digest,
signature_input_header=signed.signature_input,
signature_header=signed.signature,
expected_audience="mcp-server",
public_key=agent.public_key,
)

View File

@@ -0,0 +1,191 @@
"""In-process tests that exercise the server's auth middleware via ASGI.
Uses ``httpx.AsyncClient`` with ``ASGITransport`` so no uvicorn / network is
required. Validates that a request forged with the real token-minting
pipeline reaches the FastMCP layer, and that tampering with any of the
pieces is rejected with 4xx.
"""
from __future__ import annotations
import json
import httpx
import pytest
from poc.http_sig import sign_request
from poc.server import build_app
from poc.tokens import mint_ect, mint_mandate
pytestmark = pytest.mark.asyncio
def _headers_for(
identities,
*,
body: bytes,
audience: str,
exec_act: str,
tamper_body: bool = False,
tamper_aud: bool = False,
) -> tuple[dict[str, str], bytes]:
"""Build a full set of ACT+ECT+signature headers for one request."""
agent = identities["agent"]
user = identities["user"]
mandate = mint_mandate(
user=user, agent=agent, audience=audience, purpose="test"
)
ect = mint_ect(
agent=agent,
audience=audience,
exec_act=exec_act,
pred_jtis=[mandate.mandate.jti],
inp_body=body,
)
sign_body = b"tampered" if tamper_body else body
sign_aud = "wrong-audience" if tamper_aud else audience
signed = sign_request(
method="POST",
target_uri="http://testserver/mcp",
body=sign_body,
wimse_ect=ect.compact,
wimse_aud=sign_aud,
keyid=agent.kid,
private_key=agent.private_key,
)
headers = {
"content-type": "application/json",
"accept": "application/json, text/event-stream",
"authorization": f"Bearer {mandate.compact}",
"wimse-ect": ect.compact,
"content-digest": signed.content_digest,
"signature-input": signed.signature_input,
"signature": signed.signature,
}
return headers, body
from contextlib import asynccontextmanager
@asynccontextmanager
async def _client_for(identities):
"""Return an httpx client wired to the ASGI app with lifespan started.
FastMCP's streamable-HTTP transport allocates a task group during
``lifespan.startup``; ``ASGITransport`` does not run lifespan by
default, so we manage it explicitly here.
"""
app = build_app(identities)
async with app.router.lifespan_context(app):
transport = httpx.ASGITransport(app=app)
async with httpx.AsyncClient(
transport=transport, base_url="http://testserver"
) as client:
yield client
async def test_no_auth_headers_returns_401(identities):
async with _client_for(identities) as c:
r = await c.post("/mcp", content=b'{"jsonrpc":"2.0","method":"initialize"}')
assert r.status_code == 401
assert "Authorization" in r.text
async def test_valid_initialize_request_is_accepted(identities, tmp_path, monkeypatch):
monkeypatch.setenv("POC_AUDIT_LOG", str(tmp_path / "audit.jsonl"))
body = json.dumps({
"jsonrpc": "2.0",
"id": 1,
"method": "initialize",
"params": {
"protocolVersion": "2025-03-26",
"capabilities": {},
"clientInfo": {"name": "poc-test", "version": "0"},
},
}).encode("utf-8")
headers, _ = _headers_for(
identities, body=body, audience="mcp-server",
exec_act="mcp.session.initialize",
)
async with _client_for(identities) as c:
r = await c.post("/mcp", content=body, headers=headers)
# FastMCP may respond 200 with a session id, or 202, or stream SSE.
assert r.status_code < 400, f"unexpected status {r.status_code}: {r.text}"
async def test_tampered_body_is_rejected(identities):
body = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "initialize"}).encode()
headers, _ = _headers_for(
identities, body=body, audience="mcp-server",
exec_act="mcp.session.initialize", tamper_body=True,
)
async with _client_for(identities) as c:
r = await c.post("/mcp", content=body, headers=headers)
assert r.status_code == 401
assert "content-digest" in r.text.lower() or "http-signature" in r.text.lower()
async def test_wrong_wimse_aud_is_rejected(identities):
body = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "initialize"}).encode()
headers, _ = _headers_for(
identities, body=body, audience="mcp-server",
exec_act="mcp.session.initialize", tamper_aud=True,
)
async with _client_for(identities) as c:
r = await c.post("/mcp", content=body, headers=headers)
assert r.status_code == 401
async def test_unauthorised_tool_is_rejected(identities):
"""A tools/call whose exec_act is not in mandate.cap → 403."""
body = json.dumps({
"jsonrpc": "2.0", "id": 1, "method": "tools/call",
"params": {"name": "search", "arguments": {"query": "x"}},
}).encode()
# Craft a mandate where we strip search out of cap by going through
# the token API: we fabricate headers with the right exec_act but a
# mandate whose cap doesn't contain it.
from poc.keys import Identity # noqa
from poc.tokens import MCP_CAPS, mint_mandate
agent = identities["agent"]
user = identities["user"]
# Replace MCP_CAPS monkey-patch-free: build mandate directly.
from act.token import ACTMandate, Capability, TaskClaim
from act.crypto import sign as act_sign
from act.token import encode_jws
import time, uuid
iat = int(time.time())
mandate = ACTMandate(
alg="ES256", kid=user.kid, iss="user", sub="agent",
aud="mcp-server", iat=iat, exp=iat + 600, jti=str(uuid.uuid4()),
wid="agent",
task=TaskClaim(purpose="p", created_by="user"),
cap=[Capability(action="mcp.summarize")], # search MISSING
)
mandate.validate()
mandate_compact = encode_jws(mandate, act_sign(user.private_key, mandate.signing_input()))
ect = mint_ect(
agent=agent, audience="mcp-server",
exec_act="mcp.search", pred_jtis=[mandate.jti], inp_body=body,
)
signed = sign_request(
method="POST", target_uri="http://testserver/mcp",
body=body, wimse_ect=ect.compact, wimse_aud="mcp-server",
keyid=agent.kid, private_key=agent.private_key,
)
headers = {
"content-type": "application/json",
"authorization": f"Bearer {mandate_compact}",
"wimse-ect": ect.compact,
"content-digest": signed.content_digest,
"signature-input": signed.signature_input,
"signature": signed.signature,
}
async with _client_for(identities) as c:
r = await c.post("/mcp", content=body, headers=headers)
assert r.status_code == 403
assert "exec_act" in r.text or "cap" in r.text

View File

@@ -0,0 +1,136 @@
"""Token minting + round-trip verification for all three PoC token types."""
from __future__ import annotations
import pytest
from poc.keys import build_ect_key_resolver, build_key_registry
from poc.tokens import mint_ect, mint_exec_record, mint_mandate
from act.crypto import ACTKeyResolver
from act.errors import ACTError
from act.verify import ACTVerifier
from ect.verify import verify as ect_verify, VerifyOptions
SERVER = "mcp-server"
def _act_verifier(identities) -> ACTVerifier:
reg = build_key_registry(identities)
return ACTVerifier(
ACTKeyResolver(registry=reg),
verifier_id=SERVER,
trusted_issuers={i.name for i in identities.values()},
)
def test_mandate_round_trips(identities):
m = mint_mandate(
user=identities["user"],
agent=identities["agent"],
audience=SERVER,
purpose="research task",
)
v = _act_verifier(identities).verify_mandate(m.compact, check_sub=False)
assert v.jti == m.mandate.jti
assert v.iss == "user"
assert v.sub == "agent"
assert {c.action for c in v.cap} >= {"mcp.search", "mcp.summarize"}
def test_record_preserves_mandate_jti(identities):
"""ACT §3.2: Phase 2 record carries the mandate's jti."""
m = mint_mandate(
user=identities["user"],
agent=identities["agent"],
audience=SERVER,
purpose="research task",
)
rec = mint_exec_record(
agent=identities["agent"],
mandate=m.mandate,
exec_act="mcp.search",
pred_jtis=[],
inp_body=b"input",
out_body=b"output",
)
assert rec.record.jti == m.mandate.jti
vr = _act_verifier(identities).verify_record(rec.compact)
assert vr.jti == m.mandate.jti
assert vr.exec_act == "mcp.search"
assert vr.status == "completed"
def test_record_rejects_unauthorised_exec_act(identities):
"""Verifier must raise ACTCapabilityError when exec_act ∉ cap."""
from act.errors import ACTCapabilityError
from act.token import Capability
m = mint_mandate(
user=identities["user"],
agent=identities["agent"],
audience=SERVER,
purpose="p",
)
# Narrow the mandate to only mcp.search so mcp.summarize is unauthorised.
m.mandate.cap = [Capability(action="mcp.search")]
# Build the record locally so we can bypass the local validate() guard
# and produce a compact that only the verifier can spot as malformed.
rec = mint_exec_record(
agent=identities["agent"],
mandate=m.mandate,
exec_act="mcp.search",
pred_jtis=[],
inp_body=b"i",
out_body=b"o",
)
# Swap exec_act *after* signing to simulate a forged record. The
# verifier should reject it on capability-consistency grounds (ACT §7.1).
import act.crypto as _crypto
from act.token import encode_jws
rec.record.exec_act = "mcp.summarize"
rec.record.cap = [Capability(action="mcp.search")]
tampered = encode_jws(
rec.record,
_crypto.sign(identities["agent"].private_key, rec.record.signing_input()),
)
with pytest.raises(ACTCapabilityError):
_act_verifier(identities).verify_record(tampered)
def test_ect_round_trips(identities):
et = mint_ect(
agent=identities["agent"],
audience=SERVER,
exec_act="mcp.search",
pred_jtis=["some-prior-jti"],
inp_body=b'{"query":"x"}',
)
parsed = ect_verify(
et.compact,
VerifyOptions(
verifier_id=SERVER,
resolve_key=build_ect_key_resolver(identities),
),
)
assert parsed.payload.iss == "agent"
assert parsed.payload.exec_act == "mcp.search"
assert parsed.payload.pred == ["some-prior-jti"]
assert parsed.payload.inp_hash # present
def test_wrong_audience_rejected_by_act_verifier(identities):
m = mint_mandate(
user=identities["user"],
agent=identities["agent"],
audience="some-other-workload",
purpose="p",
)
# mcp-server is not the mandate's aud → verifier MUST refuse.
verifier = _act_verifier(identities)
with pytest.raises(ACTError):
verifier.verify_mandate(m.compact, check_sub=False)

1749
demo/act-ect-mcp/uv.lock generated Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -1,16 +1,26 @@
# Paper build targets
TEX = pdflatex
BIB = bibtex
MAIN = ietf-landscape
SOURCES = $(MAIN).tex ietf-refs.bib
.PHONY: all figures pdf clean
.PHONY: all clean watch
all: figures pdf
all: $(MAIN).pdf
figures:
python3 export_figures.py
pdf: figures
pdflatex -interaction=nonstopmode main.tex
pdflatex -interaction=nonstopmode main.tex # second pass for references
$(MAIN).pdf: $(SOURCES)
$(TEX) $(MAIN)
$(BIB) $(MAIN)
$(TEX) $(MAIN)
$(TEX) $(MAIN)
clean:
rm -f main.aux main.log main.out main.bbl main.blg main.pdf
rm -rf figures/
rm -f $(MAIN).aux $(MAIN).bbl $(MAIN).blg $(MAIN).log \
$(MAIN).out $(MAIN).pdf $(MAIN).toc $(MAIN).fls \
$(MAIN).fdb_latexmk $(MAIN).synctex.gz
watch:
@echo "Rebuilding on change..."
@while true; do \
inotifywait -q -e modify $(SOURCES) 2>/dev/null || sleep 2; \
$(MAKE) all; \
done

899
paper/ietf-landscape.tex Normal file
View File

@@ -0,0 +1,899 @@
\documentclass[11pt,a4paper]{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\usepackage[margin=2.5cm]{geometry}
\usepackage{amsmath,amssymb}
\usepackage{graphicx}
\usepackage{booktabs}
\usepackage{tabularx}
\usepackage{hyperref}
\usepackage{xcolor}
\usepackage{natbib}
\usepackage{enumitem}
\usepackage{float}
\usepackage{caption}
\hypersetup{
colorlinks=true,
linkcolor=blue!60!black,
citecolor=green!50!black,
urlcolor=blue!70!black,
}
\setlength{\parskip}{0.4em}
\setlength{\parindent}{0em}
\title{%
Mapping the AI-Agent Standardization Landscape:\\
An LLM-Assisted Analysis of IETF Internet-Drafts%
}
\author{
Christian Nennemann\\
Independent Researcher\\
\texttt{write@nennemann.de}
}
\date{April 2026}
\begin{document}
\maketitle
\begin{abstract}
The Internet Engineering Task Force (IETF) is experiencing an unprecedented
surge in standardization activity around AI agents. Between January~2024 and
March~2026, AI- and agent-related Internet-Drafts grew from 0.5\% to 9.3\%
of all IETF submissions. We present a systematic, LLM-assisted analysis of
this landscape, covering 475 drafts from 713 authors across more than 230
organizations. Our pipeline combines keyword-based corpus construction from
the IETF Datatracker API, multi-dimensional quality rating via Claude
(Anthropic) as an LLM-as-judge, semantic embedding and clustering via a
local embedding model (nomic-embed-text), LLM-based extraction of 501
discrete technical ideas, and gap analysis against the assembled corpus.
Key findings include: (1)~a persistent capability-to-safety deficit, with
roughly four capability-building drafts for every safety-oriented one;
(2)~extreme protocol fragmentation, including 14~competing OAuth-for-agents
proposals and 155~agent-to-agent protocol drafts with no interoperability
layer; (3)~high organizational concentration, with a single vendor
contributing approximately 16\% of all drafts; (4)~132 cross-organization
convergent ideas independently proposed by multiple organizations, signaling
latent consensus beneath the fragmentation; and (5)~11 identified
standardization gaps, three rated critical, centered on behavioral
verification, capability degradation detection, and emergency override
protocols. The total analysis cost approximately \$9--15\,USD in API fees.
We discuss implications for AI-agent standardization strategy, the
limitations of LLM-as-judge methodologies applied to technical document
corpora, and organizational dynamics shaping the standards landscape.
\end{abstract}
\textbf{Keywords:} IETF, Internet-Drafts, AI agents, standardization,
LLM-as-judge, landscape analysis, multi-agent systems, protocol
fragmentation
% =========================================================================
\section{Introduction}
\label{sec:intro}
% =========================================================================
The deployment of autonomous AI agents---software systems that perceive
their environment, make decisions, and take actions with limited human
supervision---has accelerated dramatically since 2023. Commercial
offerings from Anthropic, Google, OpenAI, and others have moved AI agents
from research prototypes to production systems that browse the web,
execute code, manage cloud infrastructure, and interact with external
services on behalf of users. This proliferation raises fundamental
questions about identity, authentication, delegation, safety, and
interoperability that fall squarely within the purview of Internet
standards bodies.
The IETF, responsible for the core protocols of the Internet, has
responded with an extraordinary burst of activity. In 2024, just 9
AI- or agent-related Internet-Drafts were submitted---0.5\% of all
submissions. By the first quarter of 2026, that figure reached 9.3\%:
nearly one in ten new drafts addressed AI agents in some capacity.
Monthly submissions surged from 5 in June~2025 to 85 in February~2026,
a growth rate without precedent in the IETF's recent history.
This rapid expansion creates an analytical challenge. The volume of
drafts, the diversity of working groups involved, the overlapping scope
of competing proposals, and the speed of new submissions make manual
tracking infeasible. A standards participant seeking to understand the
landscape---which problems are being addressed, which are being
neglected, where proposals converge and where they conflict---faces a
corpus of hundreds of technical documents evolving on a weekly basis.
We address this challenge with an LLM-assisted analysis pipeline that
automates the collection, rating, clustering, idea extraction, and gap
identification for the full corpus of AI-agent-related IETF
Internet-Drafts. The pipeline combines three complementary analytical
approaches: (1)~LLM-as-judge rating of drafts on five quality
dimensions, using Claude (Anthropic) with structured prompts;
(2)~embedding-based semantic similarity and clustering, using a locally
hosted nomic-embed-text model via Ollama; and (3)~LLM-based extraction
of discrete technical ideas and identification of landscape gaps.
Our contributions are:
\begin{itemize}[nosep]
\item A comprehensive, quantitative map of the IETF's AI-agent
standardization landscape as of March~2026, covering 475 drafts,
713 authors, 501 extracted technical ideas, and 11 identified gaps.
\item A replicable, cost-effective methodology for LLM-assisted
standards corpus analysis (\$9--15 total), with explicit
documentation of limitations and methodological caveats.
\item Empirical findings on organizational concentration,
protocol fragmentation, cross-organization convergence, and
the capability-to-safety imbalance in the current landscape.
\item An open-source tool (the IETF Draft Analyzer) that makes the
pipeline, database, and all derived reports available for
independent verification and extension.
\end{itemize}
The remainder of this paper is organized as follows.
Section~\ref{sec:related} reviews related work on standards landscape
analysis, NLP for technical documents, and technology mapping.
Section~\ref{sec:method} describes the data collection and analysis
pipeline in detail. Section~\ref{sec:results} presents our findings
across five analytical dimensions. Section~\ref{sec:discussion}
discusses implications, limitations, and organizational dynamics.
Section~\ref{sec:conclusion} concludes.
% =========================================================================
\section{Related Work}
\label{sec:related}
% =========================================================================
Our work sits at the intersection of three research areas: standards
ecosystem analysis, NLP applied to technical document corpora, and
technology landscape mapping.
\subsection{Standards Analysis}
The economics and dynamics of technical standardization have been
studied extensively. \citet{simcoe2012} analyzes consensus governance
in standard-setting committees, showing how committee structure
influences the trajectory of shared technology platforms.
\citet{blind2017} examine the impact of standards and regulation on
innovation in uncertain markets, a framing directly applicable to the
nascent AI-agent ecosystem where both the technology and the regulatory
environment are in flux. \citet{lerner2014} study standard-essential
patents, a concern that is beginning to surface in the AI-agent space
as organizations file IPR declarations on agent-related protocols.
Prior quantitative analyses of IETF activity have typically focused on
participation patterns, working group dynamics, or the trajectory of
individual RFCs through the standards process. Our work differs in
scope: rather than analyzing the IETF as an institution, we analyze a
specific cross-cutting topic (AI agents) that spans multiple working
groups and is evolving too rapidly for traditional manual survey methods.
\subsection{NLP for Technical Documents}
The application of natural language processing to technical and legal
document corpora has expanded significantly with the advent of large
language models. \citet{devlin2019} introduced BERT-based approaches
that enabled transfer learning for domain-specific text
classification. More recently, \citet{brown2020} demonstrated that
large language models exhibit strong few-shot and zero-shot performance
on diverse text understanding tasks, opening the possibility of using
LLMs as automated annotators for technical documents.
The ``LLM-as-judge'' paradigm---using language models to evaluate or
rate text artifacts---has been systematically studied by
\citet{zheng2023}, who introduced MT-Bench and Chatbot Arena to
evaluate LLM judges against human preferences. Their work establishes
both the promise (high correlation with human judgment on structured
evaluation tasks) and the limitations (position bias, verbosity bias,
self-enhancement bias) of LLM-based evaluation. Our use of Claude as a
rater for IETF drafts follows this paradigm, with the specific
limitation that no human calibration study has been performed on our
rating outputs (see Section~\ref{sec:limitations}).
Embedding-based document similarity using models such as
Sentence-BERT~\citep{nussbaumer2024} and its successors has become
standard practice for document clustering and retrieval. We use
nomic-embed-text~\citep{nomic2024}, a general-purpose text embedding
model, for computing pairwise cosine similarity across the draft corpus.
The resulting similarity matrix enables both cluster detection and
visualization via t-SNE~\citep{vandermaaten2008}.
\subsection{Technology Landscape Surveys}
Technology landscape mapping---the systematic identification and
organization of technical activities within a domain---has a long
history in foresight and innovation studies.
\citet{porter2005} introduced ``tech mining'' as a methodology for
extracting competitive intelligence from patent and publication
databases. \citet{roper2011} extended these methods to broader
technology management contexts. Our work adapts these approaches to
the standards domain, replacing patent databases with the IETF
Datatracker and augmenting keyword-based search with LLM-driven
semantic analysis.
The AI agent research community has produced several recent surveys.
\citet{wang2024} and \citet{xi2023} survey the rapidly growing
literature on LLM-based autonomous agents, covering architectures,
capabilities, and evaluation. These academic surveys focus on
research contributions; our work complements them by mapping the
parallel standardization effort, where research ideas meet the
engineering constraints of Internet protocol design.
The multi-agent systems (MAS) research tradition, surveyed
comprehensively by \citet{wooldridge2009} and \citet{dorri2018},
provides historical context. The FIPA Agent Communication
Language~\citep{fipa-acl} and Agent Management
Specification~\citep{fipa-ams}, developed between 1996 and 2005,
addressed many of the same problems---agent discovery, communication
protocols, platform interoperability---that the current IETF drafts
tackle. The near-complete absence of FIPA references in the
contemporary IETF corpus suggests limited awareness of this prior art,
a finding we quantify in Section~\ref{sec:results}.
% =========================================================================
\section{Methodology}
\label{sec:method}
% =========================================================================
The analysis pipeline consists of six sequential stages, each building
on the output of the previous. All intermediate results are stored in
a SQLite database (28\,MB) with FTS5 full-text search, enabling both
pipeline idempotency and ad-hoc querying. The complete pipeline is
implemented as a Python CLI tool (approximately 6,100 lines across 12
modules) using Click, httpx, the Anthropic SDK, and Ollama.
\subsection{Data Collection}
\label{sec:datacollection}
\subsubsection{Corpus Construction}
Drafts were retrieved from the IETF Datatracker
API\footnote{\url{https://datatracker.ietf.org/api/v1/doc/document/}}
using keyword search across both draft names
(\texttt{name\_\_contains}) and abstracts
(\texttt{abstract\_\_contains}). Twelve search terms were used:
\textit{agent}, \textit{ai-agent}, \textit{agentic},
\textit{autonomous}, \textit{mcp}, \textit{inference},
\textit{generative}, \textit{intelligent}, \textit{large language
model}, \textit{multi-agent}, and \textit{trustworth}.
Only drafts with \texttt{type\_\_slug=draft} and submission date
$\geq$~2024-01-01 were included. Full text was downloaded from the
IETF archive.\footnote{\url{https://www.ietf.org/archive/id/}}
The keyword set was expanded iteratively. An initial set of 6 keywords
yielded 260 drafts; adding 6 further terms captured 174 additional
drafts in categories initially underrepresented, including MCP-related
work, generative AI infrastructure, and the nascent \texttt{aipref}
working group. A polite delay of 0.5\,seconds was applied between API
requests.
The resulting corpus contains 475 drafts. After false-positive
filtering (removing drafts about ``user agents,'' ``autonomous
systems'' in routing, and other non-AI uses of matched keywords), 361
drafts were retained as AI/agent-relevant based on a relevance
rating threshold.
\subsubsection{Supplementary Standards Bodies}
To contextualize the IETF landscape, we ingested a supplementary
corpus of standards and specifications from five additional bodies:
ISO/IEC (including ISO~22989~\citep{iso22989} and
ISO~42001~\citep{iso42001}), ITU-T (including
Y.3172~\citep{itu-y3172}), ETSI (ENI, ZSM), W3C (Web of Things,
Verifiable Credentials, WebNN), and NIST (AI RMF~\citep{nist-ai-rmf}).
These documents were included in the gap analysis (Section~\ref{sec:gaps})
to identify areas where non-IETF bodies provide coverage that the IETF
corpus lacks, and vice versa.
\subsubsection{Author and Affiliation Data}
Author records were fetched from the Datatracker's
\texttt{documentauthor} and \texttt{person} endpoints. Organizational
affiliations were normalized using a hand-curated alias table of 40+
mappings (e.g., ``Huawei Technologies Co., Ltd.''
$\rightarrow$~``Huawei'') supplemented by automatic suffix stripping
for common corporate suffixes.
\subsection{LLM-Based Analysis}
\label{sec:llm-analysis}
\subsubsection{Multi-Dimensional Rating}
Each draft was rated by Claude (Anthropic; Sonnet model) on five
dimensions using a structured prompt containing the draft's name,
title, submission date, page count, and abstract (truncated to 2,000
characters). The five rating dimensions are:
\begin{itemize}[nosep]
\item \textbf{Novelty} (1--5): Originality relative to existing
standards and proposals.
\item \textbf{Maturity} (1--5): Completeness of the technical
specification.
\item \textbf{Overlap} (1--5): Redundancy with other known drafts
(5 indicates near-duplication).
\item \textbf{Momentum} (1--5): Community engagement, revisions,
and working group adoption signals.
\item \textbf{Relevance} (1--5): Importance to the AI/agent
ecosystem specifically.
\end{itemize}
The prompt instructs Claude to return structured JSON with integer
scores and brief justification notes for each dimension, plus a 2--3
sentence summary and one or more category labels drawn from a
predefined taxonomy of 11 categories (Table~\ref{tab:categories}).
A composite quality score is computed as the arithmetic mean of
novelty, maturity, momentum, and relevance (excluding overlap, which
measures redundancy rather than quality).
To reduce API costs, drafts were rated in batches of five using a
batch prompt variant. Each draft's abstract was truncated to 1,500
characters in batch mode. All API responses were cached in an
\texttt{llm\_cache} table keyed by SHA-256 hash of the full prompt,
making the pipeline idempotent on re-runs.
\subsubsection{Idea Extraction}
Discrete technical ideas---mechanisms, protocols, architectural
patterns, extensions, and requirements---were extracted from each
draft using Claude. For individual extraction, the prompt included
the abstract and the first 3,000 characters of full text (Sonnet
model). For batch extraction, groups of five drafts were processed
per API call using the cheaper Haiku model with abstracts truncated
to 800 characters. The prompt requested 1--4 top-level novel
contributions per draft, with explicit instructions to merge
sub-features into parent ideas and to return an empty array for
drafts lacking substantive technical content.
Extracted ideas were deduplicated within each draft using
embedding-based cosine similarity (threshold~0.85), removing ideas
that were restatements of the same concept. Cross-draft idea overlap
was analyzed using Python's \texttt{SequenceMatcher} with a fuzzy
matching threshold of~0.75 on idea titles, enabling detection of
convergent ideas across organizational boundaries.
\subsubsection{Gap Analysis}
A single Claude Sonnet call received a compressed landscape summary
containing category distribution counts, the 20 most frequently
occurring idea titles, overlap cluster statistics, and summaries of
relevant non-IETF standards. The prompt instructed the model to
identify 8--15 standardization gaps---areas, problems, or technical
challenges not adequately addressed by the existing corpus---with
structured output including topic, description, severity rating
(critical/high/medium/low), evidence, and partial coverage from
existing standards.
\subsection{Embedding and Clustering}
\label{sec:embedding}
Vector embeddings were generated locally using Ollama with the
nomic-embed-text model~\citep{nomic2024}. For each draft, the input
combined the title, abstract, and first 4,000 characters of full text
(when available), producing a 768-dimensional vector stored as a
binary blob in SQLite.
Pairwise cosine similarity was computed across all embedded drafts,
producing an $n \times n$ similarity matrix (cached to disk as a
NumPy array). Clustering used a greedy single-linkage algorithm: for
each unvisited draft, all unvisited drafts with cosine similarity
$\geq \tau$ to the seed were added to its cluster. Three empirically
determined thresholds were applied:
\begin{itemize}[nosep]
\item $\tau = 0.85$: Topically overlapping drafts (42 clusters).
\item $\tau = 0.90$: Near-duplicates or same-author variants (34
clusters).
\item $\tau = 0.98$: Functionally identical drafts (25+ pairs).
\end{itemize}
These thresholds were selected by manual inspection of draft pairs at
each level; no systematic sensitivity analysis was performed (see
Section~\ref{sec:limitations}).
\subsection{Supplementary Analyses}
Three additional analysis passes operate on the stored data with zero
API cost:
\begin{enumerate}[nosep]
\item \textbf{RFC cross-references}: Regex-based extraction of
RFC, BCP, and draft citations from full text, yielding 4,231
cross-references across 360 drafts.
\item \textbf{Category trends}: SQL-based monthly breakdown of new
drafts per category with growth rates.
\item \textbf{Co-authorship network}: Team bloc detection via
pairwise author overlap ($\geq$70\% shared drafts, $\geq$2 shared
drafts), with connected components forming blocs.
\end{enumerate}
\subsection{Cost}
Table~\ref{tab:cost} summarizes the total pipeline cost for 475 drafts.
\begin{table}[H]
\centering
\caption{Pipeline cost breakdown.}
\label{tab:cost}
\begin{tabular}{llrr}
\toprule
\textbf{Stage} & \textbf{Model} & \textbf{Items} & \textbf{Cost (USD)} \\
\midrule
Rating & Claude Sonnet & 475 drafts & \$5.50--8.00 \\
Idea extract. & Claude Haiku & 475 drafts & \$0.80 \\
Gap analysis & Claude Sonnet & 1 call & \$0.20 \\
Embeddings & Ollama (local) & 475 drafts & \$0.00 \\
RFC refs & Regex (local) & 475 drafts & \$0.00 \\
Trends & SQL (local) & 475 drafts & \$0.00 \\
Idea overlap & SequenceMatcher & 501 ideas & \$0.00 \\
\midrule
\textbf{Total} & & & \textbf{\$6.50--9.00} \\
\bottomrule
\end{tabular}
\end{table}
% =========================================================================
\section{Results}
\label{sec:results}
% =========================================================================
\subsection{Corpus Overview and Growth Trajectory}
The final corpus comprises 475 Internet-Drafts submitted between
January~2024 and March~2026. After false-positive filtering (drafts
with relevance score $\leq$~2 or manually flagged), 361 drafts were
retained as substantively related to AI agents.
The growth trajectory is striking. In 2024, 9 AI/agent drafts were
submitted (0.5\% of 1,651 total IETF drafts). In 2025, 190 were
submitted (7.0\% of 2,696). In Q1~2026 alone, 162 were submitted
(9.3\% of 1,748). Monthly submissions followed a step function:
5~drafts in June~2025, 61 in October~2025, 85 in February~2026.
The acceleration has not plateaued as of March~2026.
\begin{table}[H]
\centering
\caption{Growth of AI/agent-related IETF Internet-Drafts.}
\label{tab:growth}
\begin{tabular}{rrrr}
\toprule
\textbf{Year} & \textbf{Total IETF} & \textbf{AI/Agent} & \textbf{Share (\%)} \\
\midrule
2024 & 1,651 & 9 & 0.5 \\
2025 & 2,696 & 190 & 7.0 \\
2026 (Q1) & 1,748 & 162 & 9.3 \\
\bottomrule
\end{tabular}
\end{table}
\subsection{Thematic Distribution}
\label{sec:categories}
Drafts were classified into 11 non-exclusive categories
(Table~\ref{tab:categories}). A single draft may belong to multiple
categories; percentages therefore exceed 100\%.
\begin{table}[H]
\centering
\caption{Category distribution across 475 drafts. Drafts may appear in
multiple categories.}
\label{tab:categories}
\begin{tabular}{lrr}
\toprule
\textbf{Category} & \textbf{Drafts} & \textbf{Share (\%)} \\
\midrule
Data formats / interoperability & 214 & 45 \\
Policy / governance & 214 & 45 \\
Agent identity / authentication & 160 & 34 \\
A2A protocols & 157 & 33 \\
Autonomous network operations & 124 & 26 \\
Agent discovery / registration & 89 & 19 \\
ML traffic management & 79 & 17 \\
Human--agent interaction & 57 & 12 \\
AI safety / alignment & 112 & 24 \\
Model serving / inference & 42 & 9 \\
Other AI/agent & -- & -- \\
\bottomrule
\end{tabular}
\end{table}
The dominance of infrastructure categories---data formats, identity,
communication protocols---is expected for an early-stage standards
effort. The comparatively low representation of safety/alignment and
human--agent interaction categories is a structural finding we examine
in Section~\ref{sec:safety-deficit}.
\subsection{The Capability-to-Safety Deficit}
\label{sec:safety-deficit}
The ratio of capability-building drafts (A2A protocols, autonomous
network operations, agent discovery, model serving) to safety-oriented
drafts (AI safety/alignment, human--agent interaction) is
approximately 4:1 on aggregate. This ratio varies significantly by
month, ranging from 1.5:1 in months with concentrated safety
submissions to over 20:1 in months dominated by protocol proposals.
The drafts that do address safety are among the highest-rated in the
corpus. The Verifiable Observation Logging for Transparency
(VOLT)~\citep{draft-cowles-volt} protocol scored 4.75/5.0 on the
four-dimension composite (excluding overlap), as did the Distributed
AI Accountability Protocol (DAAP)~\citep{draft-aylward-daap}. The
STAMP protocol~\citep{draft-guy-bary-stamp} for cryptographic
delegation and proof scored 4.5. The quality of safety-focused work
is high; the quantity is not.
An analysis of RFC cross-references reinforces this finding. Across
4,231 parsed citations, the most-referenced standards after the
boilerplate RFC~2119/8174 conventions are TLS~1.3~\citep{rfc8446}
(42 citations), OAuth~2.0~\citep{rfc6749} (36), HTTP
Semantics~\citep{rfc9110} (34), and JWT~\citep{rfc7519} (22). The
agent standards ecosystem is being constructed on the web's existing
security infrastructure---OAuth, TLS, HTTP, JWT---yet the safety
layer that should accompany this security foundation remains
underdeveloped.
\subsection{Protocol Fragmentation}
\label{sec:fragmentation}
Embedding-based similarity analysis reveals extensive duplication and
fragmentation across the corpus.
\subsubsection{Near-Duplicates}
At the 0.98 cosine similarity threshold, 25+ draft pairs are
functionally identical---the same proposal submitted under different
names, to different working groups, or as renamed revisions. A
taxonomy of near-duplicates includes: same draft submitted to
different working groups (14 pairs), renamed drafts (5), evolutionary
versions (3), and genuinely competing proposals from different
organizations (2+).
\subsubsection{Competing Clusters}
At the 0.85 threshold, 42 topical clusters emerge. The most crowded
is OAuth for AI agents, with 14 distinct proposals all addressing
how AI agents authenticate and receive authorization via the OAuth
framework. These range from broad profile proposals to narrow scope
extensions to comprehensive accountability systems. None are
interoperable.
The A2A protocol space encompasses 157 drafts with no
interoperability layer. The most common technical idea in the entire
extracted corpus---``Multi-Agent Communication Protocol''---appears
independently in 8 drafts from different teams. A 10-draft cluster
addresses agent gateway and multi-agent collaboration, with
approaches ranging from semantic routing gateways to cross-domain
interoperability frameworks.
\subsubsection{Causes of Fragmentation}
The data distinguishes three causes: (1)~working group shopping, where
authors submit the same draft to multiple working groups seeking
adoption; (2)~parallel invention, where isolated teams independently
solve the same problem; and (3)~strategic surface-area expansion,
where organizations submit multiple related drafts to maximize
presence in the standards landscape.
\subsection{Organizational Dynamics}
\label{sec:orgs}
\subsubsection{Concentration}
Authorship is heavily concentrated. Huawei leads with 53 authors
contributing to 69 drafts---approximately 16\% of the entire corpus
across all Huawei entities. China Mobile (24~authors, 35~drafts),
Cisco (24~authors, 26~drafts), and China Telecom (24~authors,
24~drafts) follow. Chinese-linked institutions (Huawei, China
Mobile, China Telecom, China Unicom, Tsinghua University, ZTE, BUPT,
and associated laboratories) collectively account for over 160
authors.
Western technology companies are dramatically underrepresented
relative to their market positions. Google is present with 5 authors
on 9 drafts. Microsoft, Apple, and Meta have minimal direct
participation. Amazon's 6 authors focus on post-quantum cryptography
rather than agent-specific work.
\subsubsection{Team Blocs}
Co-authorship analysis identifies 18 team blocs among the 713 authors,
covering approximately 25\% of all authors. The largest bloc is a
13-person Huawei team sharing 22 drafts with 94\% average cohesion
(measured as pairwise overlap of draft portfolios). The team's core
of 7 members each appear on 13--23 drafts.
Cross-organizational collaboration is sparse. The most productive
cross-team pair shares only 3 drafts. Chinese organizations form a
tightly linked ecosystem: Huawei--China Unicom shares 6 drafts,
Tsinghua--Zhongguancun Lab shares 5, China Mobile--ZTE shares 4.
European telecoms (Deutsche Telekom, Telef\'onica, Orange) act as
bridges between Chinese and Western institutions.
\subsection{Cross-Organization Convergence}
\label{sec:convergence}
Despite the fragmentation, significant latent consensus exists. Using
fuzzy title matching (\texttt{SequenceMatcher} at 0.75 threshold) on
the 501 extracted ideas, 132 ideas (approximately 33\% of unique idea
clusters) have been independently proposed by two or more organizations.
The strongest convergence signals include ``A2A Communication
Paradigm'' (proposed by 8 organizations from 5 countries),
``AI Agent Network Architecture'' (8 organizations), and
``Multi-Agent Communication Protocol'' (7 organizations). An
examination of organizational pairs reveals that 180 convergent ideas
cross the boundary between Chinese-linked and Western organizations,
indicating genuine cross-cultural consensus on technical directions
despite the sparse direct collaboration noted in
Section~\ref{sec:orgs}.
The coexistence of convergence and fragmentation has a specific
structure: organizations agree on \textit{what} needs building (the
convergent ideas) but disagree on \textit{how} to build it (the
competing protocol proposals). This gap between problem consensus and
solution divergence is where architectural coordination is most needed.
\subsection{Gap Analysis}
\label{sec:gaps}
The gap analysis identified 11 standardization gaps, distributed across
severity levels as shown in Table~\ref{tab:gaps}.
\begin{table}[H]
\centering
\caption{Identified standardization gaps by severity.}
\label{tab:gaps}
\begin{tabularx}{\textwidth}{llX}
\toprule
\textbf{Severity} & \textbf{Topic} & \textbf{Description} \\
\midrule
Critical & Agent legal liability &
No standard addresses liability assignment when autonomous agents
cause harm or make binding commitments across creators, operators,
and users. \\
Critical & Capability degradation detection &
No standard defines detection mechanisms for gradual capability
degradation due to concept drift, adversarial inputs, or model
corruption. \\
Critical & Emergency override protocols &
No standard defines distributed emergency-stop mechanisms for
autonomous agents exhibiting dangerous behavior across
multi-system deployments. \\
\midrule
High & Cross-domain identity portability &
Agents cannot maintain consistent identity across organizational
domains with different identity systems. \\
High & Real-time behavior explanation &
No standard for interactive, real-time explanations of agent
decision-making during operation. \\
High & Multi-agent conflict resolution &
No protocol for resolving conflicts when multiple agents have
competing objectives or contend for shared resources. \\
High & Inter-standards-body bridging &
Protocols from IETF, ITU-T, and ISO cannot interoperate, creating
silos across network, internet, and industrial domains. \\
High & Behavioral audit trails &
Missing standards for immutable, decision-level audit logs
supporting forensic analysis and regulatory compliance. \\
\midrule
Medium & Resource consumption limits &
No self-regulation standards for agent computational, network, and
energy resource usage. \\
Medium & Training data provenance &
Missing standards for tracking data lineage as it flows between
agents in federated learning scenarios. \\
Medium & Content attribution &
No cryptographic attribution standards for agent-generated content.\\
\bottomrule
\end{tabularx}
\end{table}
The three critical gaps share a common theme: they address what happens
when autonomous agents fail or misbehave. The capability-building
majority of the corpus assumes cooperative, well-functioning agent
systems; the critical gaps expose the absence of standards for the
adversarial, degraded, and emergency cases that inevitably arise in
production deployment.
Cross-referencing gaps with extracted ideas quantifies the coverage
deficit. The ``emergency override'' gap has only 15 partially
addressing ideas across the corpus. The ``multi-agent conflict
resolution'' and ``inter-standards-body bridging'' gaps have zero
directly related extracted ideas---they are entirely unaddressed.
% =========================================================================
\section{Discussion}
\label{sec:discussion}
% =========================================================================
\subsection{Implications for Standardization Strategy}
The landscape reveals a standards ecosystem in a characteristic
early-stage pattern: rapid expansion, parallel invention, and
insufficient coordination. The IETF has navigated such patterns
before---the early web, IoT, DNS security---and the historical
resolution involves convergence of competing proposals, working group
consolidation, and the emergence of a small number of lasting
standards from a large initial field.
Three strategic priorities emerge from the data:
\textbf{Safety-first coordination.} The 4:1 capability-to-safety
ratio is a structural risk. The critical gaps---behavioral verification,
capability degradation detection, emergency override---are precisely
the areas where standardization failure has the highest real-world
consequence. Unlike protocol fragmentation, which causes confusion and
implementation cost, safety gaps create liability and harm. The
EU AI Act~\citep{eu-ai-act}, which mandates real-time explainability
and human oversight for high-risk AI systems, will make several of
these gaps regulatory obligations rather than optional best practices.
\textbf{Architectural connective tissue.} The landscape needs not more
protocols but a shared execution model. The convergence data shows that
organizations agree on the components; they disagree on the
integration. Proposals like VOLT~\citep{draft-cowles-volt} (execution
traces), DAAP~\citep{draft-aylward-daap} (accountability),
STAMP~\citep{draft-guy-bary-stamp} (cryptographic delegation), and
Verifiable Agent Conversations~\citep{draft-birkholz-vac} (signed
conversation records) address complementary parts of the same
architectural problem. An overarching agent execution architecture
that composes these components would accelerate convergence more
effectively than continued parallel invention.
\textbf{Cross-organization coordination.} The team bloc structure
produces drafts that are internally consistent but externally
incompatible. The 18 detected blocs function as islands; the bridges
between them are thin. Mechanisms that encourage cross-bloc
collaboration---joint design teams, interop testing events,
shared reference implementations---are more likely to produce lasting
standards than the current pattern of parallel submission.
\subsection{Relationship to Prior Agent Standards}
A notable finding is the near-complete absence of references to FIPA
(Foundation for Intelligent Physical Agents) in the contemporary IETF
corpus. FIPA's Agent Communication Language~\citep{fipa-acl} and Agent
Management Specification~\citep{fipa-ams}, developed between 1996 and
2005, addressed agent discovery, communication, platform
interoperability, and interaction protocols---the same problem space
that the current wave of IETF drafts tackles.
The absence of FIPA references does not necessarily indicate ignorance;
the web-native technical context of 2025 differs substantially from the
Java/CORBA context of 2002. However, the recurrence of problems
FIPA addressed (agent naming, message semantics, directory services,
interaction protocols) suggests that explicit engagement with the
FIPA legacy could help the IETF community avoid re-learning lessons
from two decades ago.
\subsection{Limitations}
\label{sec:limitations}
The methodology has several limitations that affect the confidence and
generalizability of the findings.
\textbf{LLM-as-judge validity.} All quality ratings are generated by a
single LLM (Claude Sonnet) from draft abstracts truncated to 2,000
characters. No human calibration study has been performed; no
inter-rater reliability is established. The ratings should be treated
as relative rankings within this corpus, not absolute quality measures.
Maturity scores are particularly affected by abstract-only input, as
abstracts may not convey the full technical depth of a specification.
The overlap dimension is limited because Claude rates each draft
independently without access to the full corpus, meaning it reflects
the model's general knowledge rather than corpus-specific similarity.
A validation study using domain expert ratings on a sample of 25--30
drafts would substantially strengthen confidence.
\textbf{Corpus selection bias.} Keyword-based selection introduces both
false positives (``agent'' matching ``user agent,'' ``autonomous''
matching ``autonomous systems'' in routing) and false negatives
(relevant drafts using terminology outside the keyword set). We
estimate 30--50 false positives remain despite relevance filtering.
The temporal cutoff of January~2024 excludes earlier foundational work.
\textbf{Clustering thresholds.} The similarity thresholds (0.85, 0.90,
0.98) are empirically chosen by manual inspection, not derived from
principled analysis. The embedding model (nomic-embed-text) is a
general-purpose model not fine-tuned for standards document similarity.
Sensitivity analysis across thresholds and comparison with alternative
clustering methods (DBSCAN, hierarchical agglomerative) would
strengthen the clustering results.
\textbf{Gap analysis methodology.} Gap identification relies on a
single-shot LLM analysis of compressed landscape statistics, not
systematic comparison against a reference taxonomy. A rigorous
approach would compare the corpus against an explicit reference
architecture such as NIST AI RMF~\citep{nist-ai-rmf}, the FIPA agent
platform model, or a purpose-built agent ecosystem reference model.
Gap severity is assigned by Claude without defined quantitative
thresholds.
\textbf{Idea extraction consistency.} Batch extraction using Haiku
with abstract-only input produces different results from individual
extraction using Sonnet with full text. No precision/recall measurement
has been performed. The extraction prompt limits output to 1--4 ideas
per draft, potentially under-counting contributions from comprehensive
specifications.
\textbf{Organizational normalization.} Cross-organization analysis
depends on the accuracy of a hand-curated alias table. Boundary cases
(e.g., joint ventures, university--industry affiliations, subsidiary
relationships) introduce judgment calls that affect concentration
statistics.
Despite these limitations, the findings are robust in their broad
contours: the growth trajectory, the safety deficit, the protocol
fragmentation, and the organizational concentration are visible
across multiple analytical methods and are not sensitive to the
specific threshold or model choices within reasonable ranges.
\subsection{Reproducibility and Openness}
The complete pipeline, database, and derived reports are released as
open-source software (the IETF Draft Analyzer). The SQLite database
contains all raw data, ratings, embeddings, ideas, gaps, author
records, and cached LLM responses, enabling independent verification
of every finding reported in this paper. The caching mechanism ensures
that re-running the pipeline produces identical results without
additional API cost.
% =========================================================================
\section{Conclusion}
\label{sec:conclusion}
% =========================================================================
We have presented a systematic, LLM-assisted analysis of the IETF's
AI-agent standardization landscape, covering 475 Internet-Drafts from
713 authors across more than 230 organizations. The analysis reveals a
standards ecosystem experiencing unprecedented growth---from 0.5\% to
9.3\% of all IETF submissions in fifteen months---accompanied by
significant structural challenges.
The capability-to-safety ratio of approximately 4:1, the extreme
protocol fragmentation (14 competing OAuth proposals, 155 A2A drafts
with no interoperability layer), and the concentration of authorship
(one vendor contributing $\sim$16\% of all drafts) are findings that
have direct implications for the trajectory of AI-agent
standardization. The 11 identified gaps, with three critical gaps
centered on what happens when agents fail, highlight the areas where
standardization effort is most urgently needed.
At the same time, the 132 cross-organization convergent ideas
demonstrate that latent consensus exists beneath the fragmentation.
Organizations agree on the problems; they disagree on the solutions.
This gap between problem consensus and solution divergence defines the
current phase of the standards race and points toward the needed
intervention: not more protocol proposals, but architectural
connective tissue that composes the existing high-quality components
into a coherent ecosystem.
The methodology itself contributes a replicable, cost-effective
approach to standards landscape analysis. At \$9--15 total, the
pipeline demonstrates that LLM-assisted document analysis at scale is
practical for research and policy applications. The explicit
documentation of limitations---no human calibration, empirical
thresholds, single-judge ratings---provides a template for the
responsible use of LLM-as-judge methodologies in technical document
analysis.
The IETF has navigated standardization sprints before, and the lasting
standards have consistently emerged from efforts that prioritized
interoperability and safety alongside capability. Whether the current
AI-agent wave follows this historical pattern depends on whether the
community can shift from parallel invention to coordinated
architecture before the capability work ships without the safety work
that should accompany it.
% =========================================================================
% References
% =========================================================================
\bibliographystyle{plainnat}
\bibliography{ietf-refs}
\end{document}

334
paper/ietf-refs.bib Normal file
View File

@@ -0,0 +1,334 @@
% =========================================================================
% Bibliography — IETF AI-Agent Landscape Paper
% =========================================================================
% --- IETF RFCs and Internet-Drafts ---
@techreport{rfc6749,
author = {Dick Hardt},
title = {{The OAuth 2.0 Authorization Framework}},
institution = {IETF},
type = {RFC},
number = {6749},
year = {2012},
doi = {10.17487/RFC6749},
}
@techreport{rfc7519,
author = {Michael Jones and John Bradley and Nat Sakimura},
title = {{JSON Web Token (JWT)}},
institution = {IETF},
type = {RFC},
number = {7519},
year = {2015},
doi = {10.17487/RFC7519},
}
@techreport{rfc8446,
author = {Eric Rescorla},
title = {{The Transport Layer Security (TLS) Protocol Version 1.3}},
institution = {IETF},
type = {RFC},
number = {8446},
year = {2018},
doi = {10.17487/RFC8446},
}
@techreport{rfc9110,
author = {Roy T. Fielding and Mark Nottingham and Julian Reschke},
title = {{HTTP Semantics}},
institution = {IETF},
type = {RFC},
number = {9110},
year = {2022},
doi = {10.17487/RFC9110},
}
@misc{draft-cowles-volt,
author = {Colin Cowles},
title = {{Verifiable Observation Logging for Transparency (VOLT)}},
howpublished = {Internet-Draft},
year = {2026},
note = {Work in progress},
}
@misc{draft-aylward-daap,
author = {Ryan Aylward},
title = {{Distributed AI Accountability Protocol (DAAP) Version 2.0}},
howpublished = {Internet-Draft},
year = {2026},
note = {Work in progress},
}
@misc{draft-guy-bary-stamp,
author = {Guy Bary},
title = {{Secure Task Authentication and Monitoring Protocol (STAMP)}},
howpublished = {Internet-Draft},
year = {2026},
note = {Work in progress},
}
@misc{draft-birkholz-vac,
author = {Henk Birkholz},
title = {{Verifiable Agent Conversations}},
howpublished = {Internet-Draft},
year = {2026},
note = {Work in progress},
}
@misc{draft-rosenberg-cheq,
author = {Jonathan Rosenberg},
title = {{CHEQ: Constrained Human-Engaged Queries for AI Agents}},
howpublished = {Internet-Draft},
year = {2025},
note = {Work in progress},
}
@misc{draft-williams-lm-hierarchy,
author = {Brandon Williams},
title = {{YANG Data Model for Hierarchical Language Model Coordination}},
howpublished = {Internet-Draft},
year = {2026},
note = {Work in progress},
}
@misc{draft-ietf-lake-edhoc,
title = {{Ephemeral Diffie-Hellman Over COSE (EDHOC)}},
howpublished = {Internet-Draft (IETF LAKE WG)},
year = {2025},
note = {Work in progress},
}
% --- Standards bodies ---
@techreport{iso22989,
author = {{ISO/IEC}},
title = {{Information technology --- Artificial intelligence --- Artificial intelligence concepts and terminology}},
institution = {ISO/IEC},
number = {22989:2022},
year = {2022},
}
@techreport{iso42001,
author = {{ISO/IEC}},
title = {{Information technology --- Artificial intelligence --- Management system}},
institution = {ISO/IEC},
number = {42001:2023},
year = {2023},
}
@techreport{itu-y3172,
author = {{ITU-T}},
title = {{Architectural framework for machine learning in future networks including IMT-2020}},
institution = {ITU-T},
number = {Y.3172},
year = {2019},
}
@techreport{nist-ai-rmf,
author = {{National Institute of Standards and Technology}},
title = {{Artificial Intelligence Risk Management Framework (AI RMF 1.0)}},
institution = {NIST},
number = {AI 100-1},
year = {2023},
doi = {10.6028/NIST.AI.100-1},
}
@misc{eu-ai-act,
author = {{European Parliament and Council of the European Union}},
title = {{Regulation (EU) 2024/1689 --- Artificial Intelligence Act}},
howpublished = {Official Journal of the European Union},
year = {2024},
}
% --- FIPA ---
@techreport{fipa-acl,
author = {{Foundation for Intelligent Physical Agents}},
title = {{FIPA ACL Message Structure Specification}},
institution = {FIPA},
number = {SC00061G},
year = {2002},
}
@techreport{fipa-ams,
author = {{Foundation for Intelligent Physical Agents}},
title = {{FIPA Agent Management Specification}},
institution = {FIPA},
number = {SC00023K},
year = {2004},
}
% --- Multi-agent systems ---
@book{wooldridge2009,
author = {Michael Wooldridge},
title = {{An Introduction to MultiAgent Systems}},
publisher = {John Wiley \& Sons},
edition = {2nd},
year = {2009},
}
@article{dorri2018,
author = {Ali Dorri and Salil S. Kanhere and Raja Jurdak},
title = {{Multi-Agent Systems: A Survey}},
journal = {IEEE Access},
volume = {6},
pages = {28573--28593},
year = {2018},
doi = {10.1109/ACCESS.2018.2831228},
}
@inproceedings{shoham2008,
author = {Yoav Shoham and Kevin Leyton-Brown},
title = {{Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations}},
booktitle = {Cambridge University Press},
year = {2008},
}
% --- NLP and text analysis ---
@inproceedings{devlin2019,
author = {Jacob Devlin and Ming-Wei Chang and Kenton Lee and Kristina Toutanova},
title = {{BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding}},
booktitle = {Proceedings of NAACL-HLT},
pages = {4171--4186},
year = {2019},
}
@article{zheng2023,
author = {Lianmin Zheng and Wei-Lin Chiang and Ying Sheng and Siyuan Zhuang and Zhanghao Wu and Yonghao Zhuang and Zi Lin and Zhuohan Li and Dacheng Li and Eric P. Xing and Hao Zhang and Joseph E. Gonzalez and Ion Stoica},
title = {{Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena}},
journal = {Advances in Neural Information Processing Systems},
volume = {36},
year = {2023},
}
@article{brown2020,
author = {Tom Brown and Benjamin Mann and Nick Ryder and Melanie Subbiah and Jared Kaplan and Prafulla Dhariwal and Arvind Neelakantan and Pranav Shyam and Girish Sastry and Amanda Askell and others},
title = {{Language Models are Few-Shot Learners}},
journal = {Advances in Neural Information Processing Systems},
volume = {33},
pages = {1877--1901},
year = {2020},
}
% --- Embeddings and clustering ---
@article{nussbaumer2024,
author = {Nils Reimers and Iryna Gurevych},
title = {{Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks}},
journal = {Proceedings of EMNLP-IJCNLP},
pages = {3982--3992},
year = {2019},
}
@article{nomic2024,
author = {Zach Nussbaumer and John X. Morris and Brandon Duderstadt},
title = {{Nomic Embed: Training a Reproducible Long Context Text Embedder}},
journal = {arXiv preprint arXiv:2402.01613},
year = {2024},
}
@article{vandermaaten2008,
author = {Laurens van der Maaten and Geoffrey Hinton},
title = {{Visualizing Data using t-SNE}},
journal = {Journal of Machine Learning Research},
volume = {9},
pages = {2579--2605},
year = {2008},
}
% --- Technology landscape analysis ---
@article{martin2016,
author = {Ben R. Martin},
title = {{Technology Foresight in a Rapidly Globalizing Economy}},
journal = {International Journal of Foresight and Innovation Policy},
volume = {4},
number = {1/2},
year = {2016},
}
@book{porter2005,
author = {Alan L. Porter and Scott W. Cunningham},
title = {{Tech Mining: Exploiting New Technologies for Competitive Advantage}},
publisher = {John Wiley \& Sons},
year = {2005},
}
@book{roper2011,
author = {A. Thomas Roper and Scott W. Cunningham and Alan L. Porter and Thomas W. Mason and Frederick A. Rossini and Jerry Banks},
title = {{Forecasting and Management of Technology}},
publisher = {John Wiley \& Sons},
edition = {2nd},
year = {2011},
}
% --- Standards analysis ---
@article{blind2017,
author = {Knut Blind and Sören S. Petersen and Cesare A.F. Riillo},
title = {{The Impact of Standards and Regulation on Innovation in Uncertain Markets}},
journal = {Research Policy},
volume = {46},
number = {1},
pages = {249--264},
year = {2017},
doi = {10.1016/j.respol.2016.11.003},
}
@article{simcoe2012,
author = {Timothy Simcoe},
title = {{Standard Setting Committees: Consensus Governance for Shared Technology Platforms}},
journal = {American Economic Review},
volume = {102},
number = {1},
pages = {305--336},
year = {2012},
doi = {10.1257/aer.102.1.305},
}
@article{lerner2014,
author = {Josh Lerner and Jean Tirole},
title = {{Standard-Essential Patents}},
journal = {Journal of Political Economy},
volume = {123},
number = {3},
pages = {547--586},
year = {2015},
doi = {10.1086/680995},
}
% --- Agent protocols ---
@misc{anthropic-mcp,
author = {{Anthropic}},
title = {{Model Context Protocol (MCP) Specification}},
year = {2024},
howpublished = {\url{https://modelcontextprotocol.io}},
}
@misc{google-a2a,
author = {{Google}},
title = {{Agent-to-Agent (A2A) Protocol}},
year = {2025},
howpublished = {\url{https://github.com/google/A2A}},
}
@article{wang2024,
author = {Lei Wang and Chen Ma and Xueyang Feng and Zeyu Zhang and Hao Yang and Jingsen Zhang and Zhiyuan Chen and Jiakai Tang and Xu Chen and Yankai Lin and Wayne Xin Zhao and Zhewei Wei and Ji-Rong Wen},
title = {{A Survey on Large Language Model Based Autonomous Agents}},
journal = {Frontiers of Computer Science},
volume = {18},
number = {6},
year = {2024},
doi = {10.1007/s11704-024-40231-1},
}
@article{xi2023,
author = {Zhiheng Xi and Wenxiang Chen and Xin Guo and Wei He and Yiwen Ding and Boyang Hong and Ming Zhang and Junzhe Wang and Senjie Jin and Enyu Zhou and others},
title = {{The Rise and Potential of Large Language Model Based Agents: A Survey}},
journal = {arXiv preprint arXiv:2309.07864},
year = {2023},
}

View File

@@ -0,0 +1,171 @@
# Session Handoff — 2026-04-12 (ACT / ECT IETF Strategy)
**Purpose**: Cold-start snapshot for the next session. Read this plus
`workspace/STRATEGY.md` to pick up without re-discovery.
---
## 1. Session Date & Context
- **Date**: 2026-04-12
- **Trigger**: Between 2026-04-07 and 2026-04-11, 14+ competing IETF
individual drafts and 7+ high-relevance arXiv papers appeared in the
agent-authorization / execution-accountability space. The window to
plant ACT + ECT as the standards-track home for this family is now
narrow. IETF 123 (July 2026) is the landing target.
- **Work mode**: Strategic consolidation — renames, restructure, diff
documents, outreach drafts, interop plan. No new normative content
added beyond what the existing drafts already carried.
---
## 2. Key Decisions Made This Session
1. **`par``pred` claim rename** (ACT). Aligns the ACT predecessor
claim with ECT's identical `pred` claim so both specs use the same
wire token for DAG parent references. Applied across draft text,
refimpl, tests.
2. **"Agent Compact Token" → "Agent Context Token" rename** (ACT).
Preserves the "ACT" acronym. "Context" better describes what the
token carries (invocation context — DAG refs, task metadata,
capabilities, delegation chain, oversight) and creates a clean
semantic pair with ECT (Execution Context Token).
3. **Package split into sibling packages `ietf-act` + `ietf-ect`** (no
shared core). Moved from monorepo with shared module to
`workspace/packages/act/` and `workspace/packages/ect/` as
independent packages. Decision: do not build a shared-core library
— the two specs have intentionally different scope, and forcing
shared abstractions obscures that. Cross-spec guarantees are
documented and tested via a separate `packages/interop/` plan.
4. **Position Option B chosen**: ECT is a WIMSE profile that
**normatively references** ACT. ACT is the general-purpose
primitive, ECT is the WIMSE-identity-bound execution profile. This
lets ECT be submitted for WIMSE WG adoption while ACT stays on the
independent-submission path (no WG dependency), and they cite each
other cleanly.
5. **`inp_hash` / `out_hash` format divergence acknowledged, not
unified this session**. ACT emits plain base64url; ECT validator
requires `sha-256:<b64url>` prefix. Documented as expected-xfail in
the interop test plan rather than forced into alignment — spec-level
decision deferred.
---
## 3. Artifacts Produced / Updated This Session
| Path (relative to `workspace/`) | Purpose |
|---|---|
| `STRATEGY.md` | Master strategy doc — landscape, positioning, phased action plan, risk register, success criteria |
| `packages/act/draft-nennemann-act-01.md` | ACT -01 draft; new §1.4.1 Related Work (concurrent proposals), new §1.5 Applicability (MCP/OpenAI/LangGraph/A2A/CrewAI/ECT), new §7.3 DAG vs Linear Delegation Chains |
| `packages/act/` (`ietf-act` refimpl) | 103 tests passing after `pred` and Context rename |
| `packages/ect/` (`ietf-ect` refimpl) | 56 tests passing; `inp_hash` bug fixed (removed stale `sha-256:` transformation in emitter path) |
| `packages/INTEROP-TEST-PLAN.md` | Planned `packages/interop/tests/test_interop.py` structure — shared-claim consistency, algorithm matrix, DAG cross-reference, claim divergence, anti-goals; user-facing compatibility matrix |
| `drafts/ietf-wimse-ect/draft-nennemann-wimse-ect.md` | ECT -02 (docname `draft-nennemann-wimse-ect-02`); normative ref to `I-D.nennemann-act` added |
| `drafts/ietf-wimse-ect/DIFF-vs-txn-tokens-for-agents.md` | ~1235-word factual diff doc vs `draft-oauth-transaction-tokens-for-agents-06` — claim-level matrix, lifecycle comparison, composition scenarios |
| `drafts/ietf-wimse-ect/wimse-intro-email.md` | ~390-word introduction email for wimse@ietf.org |
| `drafts/ietf-wimse-ect/ietf123-slides-outline.md` | 10-minute WIMSE slot outline: 10 slides, pacing plan, Mermaid diagrams for WIT/WPT/ECT layering and DAG-vs-linear, speaker notes, timing-discipline cuts |
| `drafts/outreach/emirdag-liaison-email.md` | Liaison email to Dr. Emirdag on SCITT-AI-agent-execution overlap; proposes cross-citation, claim alignment, possible joint IETF 123 slot |
| `drafts/outreach/oauth-ml-response.md` | Short oauth@ietf.org response to Txn-Tokens-for-Agents-06; frames ACT `pred` DAG as generalization of Raut et al.'s linear `actchain` |
---
## 4. Open Action Items (what the user does next)
Phase A items from `STRATEGY.md` still require execution:
- [ ] **A1**: Update ECT HTTP header section — replace `Wimse-Audience`
header with `wimse-aud` signature metadata parameter per
`draft-ietf-wimse-http-signature-03` (breaking change upstream,
published 2026-04-07).
- [ ] **A2**: Update SCITT refs in ACT to `draft-ietf-scitt-architecture-22`
(AUTH48); note "to be RFC-XXXX upon publication".
- [ ] **A3**: Lock Txn-Tokens refs in ACT/ECT to
`draft-ietf-oauth-transaction-tokens-08`.
- [ ] **A7**: Commit workspace + `research.ietf` subrepo changes.
Phase B outreach items (drafted but not sent):
- [ ] **B1**: Send Emirdag liaison email (`drafts/outreach/emirdag-liaison-email.md`).
- [ ] **B2**: Submit ACT -01 to datatracker.
- [ ] **B3**: Submit ECT -02 to datatracker.
- [ ] **B4**: Post ECT intro email to wimse@ietf.org, link DIFF doc.
- [ ] **B5**: Post OAuth ML response to oauth@ietf.org.
- [ ] **B6**: Request 10-min WIMSE slot at IETF 123.
- [ ] **B7**: Watch DAWN WG charter formation.
---
## 5. Pending Decisions (need user input)
- **Emirdag engagement depth**: liaison citation only, co-authorship
offer on a joint anchoring section, or just passive cross-citation?
The drafted email leaves all three doors open — pick one before
sending.
- **Refimpl publication to PyPI**: package names `ietf-act` and
`ietf-ect` are reserved but not published. User approval required
before any `twine upload`.
- **Repo strategy**: single monorepo for both drafts, or split into
separate Git repos so each draft has its own "home" for
kramdown-rfc / datatracker watchers? Current state is monorepo.
- **IETF 123 travel**: in person (Madrid) or remote? Affects
slide-prep cadence and whether to plan side meetings with Emirdag
/ Bertocci.
- **Hash encoding alignment** (ACT plain b64url vs ECT `sha-256:`
prefix): decide which spec moves, or keep divergence documented.
Interop plan currently pins it as xfail.
---
## 6. Known Landscape Threats (top 3)
1. **`draft-oauth-transaction-tokens-for-agents-06`** (Raut / Amazon,
2026-04-11). Linear `actchain` at OAuth AS layer — directly in the
same conceptual neighborhood as ACT. If this gets OAuth WG adoption
before ACT is visible, ACT has to position as "DAG generalization /
no-AS variant" instead of the default.
2. **`draft-emirdag-scitt-ai-agent-execution-00`** (VERIDIC, 2026-04-07).
AIR (AgentInteractionRecord) as SCITT payload. Not a direct
competitor but overlapping on input/output hashing, reasoning
capture, causality. Risk: if adopted first, ACT looks redundant
unless positioned as the *lifecycle* that AIR *anchors*.
3. **arXiv 2603.24775 (AIP / IBCTs)**. Closest *technical* competitor —
JWT + Biscuit/Datalog, exposes auth gap on ~2000 MCP servers, same
peer-to-peer-without-AS story. Not an IETF draft so no WG adoption
risk, but could become the citation of record in academic /
industry press if ACT is not visible fast.
(Full landscape table: `STRATEGY.md` §3.)
---
## 7. Next Session Starting Points (first 3 things)
1. **Read** `docs/control-center.md` (workspace root) and
`workspace/STRATEGY.md`; confirm Phase A is still the active phase.
2. **Execute A1** — patch ECT's HTTP header section to the
`wimse-aud` signature parameter form. This is the most urgent
technical fix; it's a breaking upstream change from
`draft-ietf-wimse-http-signature-03` and blocks ECT -02 submission.
3. **Execute A2 + A3** — refresh SCITT and Txn-Tokens reference
versions in both drafts so the submission snapshot is current.
After A1A3, move to Phase B (submissions + outreach sends).
---
## 8. Reference State Snapshot
- ACT refimpl: `packages/act/` — 103 tests pass, `pred` + Context
rename done, EdDSA + ES256 both supported.
- ECT refimpl: `packages/ect/` — 56 tests pass, `inp_hash` fix
applied, ES256 only, `exec+jwt` typ (legacy `wimse-exec+jwt` still
accepted).
- Interop package: not yet created; plan is in
`packages/INTEROP-TEST-PLAN.md`.
- Draft versions: ACT at `-01`, ECT at `-02` (`docname:
draft-nennemann-wimse-ect-02`).
- No submissions on datatracker yet this session (pending Phase A
completion).

224
workspace/STRATEGY.md Normal file
View File

@@ -0,0 +1,224 @@
# ACT + ECT IETF Strategy
**Author**: Christian Nennemann
**Date**: 2026-04-12
**Status**: Active
---
## 1. Executive Summary
Two Internet-Drafts, one strategy: **ACT** (general) + **ECT** (WIMSE profile) as a complementary spec family for AI agent authorization and execution accountability.
**The window**: In the last 8 weeks, 14+ competing IETF individual drafts and 7+ high-relevance arXiv papers appeared. The space is crowding fast. **Ship -01/-02 within 2 weeks**; establish IETF 123 (July 2026) as the landing point.
**The position**: ACT is the only spec combining (a) two-phase JWT lifecycle, (b) DAG-based DAG predecessor structure, and (c) standards-track independence from proprietary agent frameworks. ECT is the only WIMSE-aligned execution-context spec.
---
## 2. Current State (What We Have)
### Artifacts in place
| Artifact | Location | Status |
|---|---|---|
| ACT draft | `packages/act/draft-nennemann-act-01.md` | -01, ready to review |
| ECT draft | `drafts/ietf-wimse-ect/draft-nennemann-wimse-ect.md` | -02, needs HTTP header update |
| ACT refimpl | `packages/act/` (ietf-act) | 103 tests pass, `pred` + Context rename done |
| ECT refimpl | `packages/ect/` (ietf-ect) | 56 tests pass, `inp_hash` bug fixed |
| ACT applicability section | In draft §1.5 | MCP, OpenAI, LangGraph, A2A, CrewAI, WIMSE-ECT |
| Diff doc vs Txn-Agents | `drafts/ietf-wimse-ect/DIFF-vs-txn-tokens-for-agents.md` | Done, ~1235 words |
| WIMSE mailing list email | `drafts/ietf-wimse-ect/wimse-intro-email.md` | Done, ~390 words |
### Recent completed work
- `par``pred` rename across ACT (spec alignment with ECT)
- "Agent Compact Token" → "Agent Context Token" rename (semantic alignment with ECT)
- Package restructure to `workspace/packages/{act,ect}/`
- ECT `inp_hash` format bug fix (removed `sha-256:` prefix)
---
## 3. Landscape (What Just Happened)
### Critical drafts published April 711, 2026
| Draft | Impact | Response |
|---|---|---|
| `draft-emirdag-scitt-ai-agent-execution-00` | SCITT profile for AgentInteractionRecord (AIR) | **Propose liaison**: ACT = lifecycle, AIR = anchor payload |
| `draft-oauth-transaction-tokens-for-agents-06` | Amazon's `actchain` competes with ACT's DAG | **Differentiate**: linear chain vs DAG (fork/join) |
| `draft-ietf-wimse-http-signature-03` | `Wimse-Audience` header **removed**`wimse-aud` param | **Breaking change — fix ECT immediately** |
| `draft-ietf-oauth-transaction-tokens-08` | In WG Last Call → RFC imminent | Lock references before publication |
| `draft-ietf-scitt-architecture-22` | In AUTH48 → RFC imminent | Update SCITT refs to RFC number |
### Competitive arXiv papers (MarApr 2026)
- **2603.24775 (AIP/IBCTs)** — closest technical competitor, JWT + Biscuit/Datalog, zero auth on ~2000 MCP servers
- **2604.02767 (SentinelAgent)** — formal Delegation Chain Calculus
- **2509.13597 (Agentic JWT)** — prior linear chain JWT
- **2603.23801 (AgentRFC — Composition Safety)** — theoretical grounding for DAG-level tracking
### Strategic openings
- `draft-ietf-wimse-arch-07 §3.3.9` — WG arch doc **already names AI/ML intermediaries as workloads**; ECT fills this gap
- **DAWN potential new WG** (`draft-king-dawn-requirements-00`, 2026-04-11) — agent discovery; ACT identity claims are natural payload
- **NIST/NCCoE Concept Paper** — US government validation of standards-first agent identity approach
---
## 4. Positioning Strategy
### The three-sentence pitch
> ACT is a two-phase JWT lifecycle — the authorization mandate transitions to a tamper-evident execution record, producing a cryptographically verifiable DAG of agent invocations. ECT is the WIMSE profile that binds ACT-style execution records to workload identity with assurance levels. Together they close the agent accountability gap that OAuth/WIMSE/SCITT leave partially open.
### Differentiation matrix
| Against | How ACT/ECT differ |
|---|---|
| `draft-oauth-transaction-tokens-for-agents` | Two-phase lifecycle (authorization → proof-of-execution), DAG (not linear `actchain`), works without AuthZ server |
| `draft-emirdag-scitt-ai-agent-execution` | Lifecycle layer complement, not competitor; ACT produces what AIR anchors |
| AIP/IBCTs (arXiv 2603.24775) | Standards-track IETF home; JWT-only (no Biscuit/Datalog complexity) |
| `draft-helixar-hdp-agentic-delegation` | JWT/JOSE-standard (vs raw JSON), DAG (vs linear), IETF path |
| SentinelAgent (arXiv 2604.02767) | Standards deployability (vs formal calculus) |
| Agentic JWT (arXiv 2509.13597) | Two-phase lifecycle; DAG vs linear chain |
### Non-goals (say this explicitly)
- ACT does not replace WIMSE WIT/WPT — it sits above
- ACT does not replace OAuth/Txn-Tokens — it profiles for agent semantics
- ACT does not require SCITT — but integrates cleanly with it
- ECT does not carry identity — it carries execution context
---
## 5. Action Plan
### Phase A — Urgent technical updates (this week)
- [ ] **A1**: Update ECT HTTP header section — replace `Wimse-Audience` with `wimse-aud` signature metadata parameter per `draft-ietf-wimse-http-signature-03`
- [ ] **A2**: Update SCITT references in ACT — point to `draft-ietf-scitt-architecture-22` (AUTH48); note RFC-to-be
- [ ] **A3**: Update Txn-Tokens references in ACT/ECT — lock to `draft-ietf-oauth-transaction-tokens-08`
- [ ] **A4**: Add "DAG vs linear chain" section to ACT — key technical differentiator
- [ ] **A5**: Add Related Work additions to ACT:
- AIP/IBCTs (arXiv 2603.24775)
- SentinelAgent (arXiv 2604.02767)
- Agentic JWT (arXiv 2509.13597)
- Txn-Tokens-for-Agents-06
- HDP (`draft-helixar-hdp-agentic-delegation`)
- [ ] **A6**: Add Related Work additions to ECT:
- WIMSE arch §3.3.9 (explicit)
- Composition Safety (arXiv 2603.23801)
- MIGT taxonomy (arXiv 2604.06148)
- NIST/NCCoE Concept Paper
- [ ] **A7**: Commit all current work to git (workspace + research.ietf subrepo)
### Phase B — External engagement (next 12 weeks)
- [ ] **B1**: Email Emirdag (VERIDIC) — propose SCITT-AI + ACT liaison; coordinate AIR payload format with ACT execution-phase claims
- [ ] **B2**: Submit ACT -01 to datatracker
- [ ] **B3**: Submit ECT -02 to datatracker
- [ ] **B4**: Post ECT intro email to wimse@ietf.org with diff doc link
- [ ] **B5**: Post short response to OAuth WG on Txn-Tokens-for-Agents-06 — compare `actchain` (linear) vs ACT `pred` (DAG), offer as complementary not competitive
- [ ] **B6**: Request 10-min slot at IETF 123 WIMSE session (July 2026)
- [ ] **B7**: Track DAWN WG charter formation — if charters, submit positioning comment on how ACT identity claims serve discovery
### Phase C — IETF 123 preparation (MayJune 2026)
- [ ] **C1**: Iterate ACT/ECT based on mailing list feedback
- [ ] **C2**: Prepare 10-min WIMSE slides (focus on: gap filled, relationship to adopted drafts, ECT's role in execution context propagation)
- [ ] **C3**: Prepare 5-min OAuth slot request if Txn-Tokens-for-Agents discussion opens
- [ ] **C4**: Reference implementation hardening: test vectors, interop with at least one other implementation
### Phase D — Post-IETF 123 (August 2026+)
- [ ] **D1**: Based on WIMSE reception: either iterate toward WG adoption or pivot to BoF-style workshop
- [ ] **D2**: If SCITT-AI liaison forms: draft joint implementation report
- [ ] **D3**: If DAWN charters: submit ACT positioning statement
---
## 6. Timeline
```
2026-04-12 Strategy finalized (today)
2026-04-12 Phase A starts
2026-04-19 Phase A complete, ACT-01 + ECT-02 submitted
2026-04-20 Phase B starts (WIMSE ML post + Emirdag outreach)
2026-05-01 All external engagement initiated
2026-07-xx IETF 123 (target: WIMSE 10-min slot)
2026-08-xx Post-IETF 123 review, decide WG adoption strategy
```
---
## 7. Risk Register
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| WIMSE WG rejects ECT as out-of-charter | Medium | High | Cite arch §3.3.9 explicitly; frame as charter-aligned |
| Amazon Txn-Tokens-for-Agents gets OAuth WG adoption first | High | Medium | Differentiate at DAG/lifecycle level; position as complementary layer |
| SCITT-AI (Emirdag) adopted, ACT seen as redundant | Medium | High | Proactive liaison; position as lifecycle vs anchoring |
| DAWN charters without ACT positioning | Medium | Medium | Submit positioning statement during charter review |
| 14+ competing drafts fragment the space | High | Medium | Focus on ACT's unique two-phase lifecycle; cite competitors as related work |
| Independent-submission path stalls for ACT | Medium | Medium | Keep ECT on WG-adoption path; ACT can stay independent longer if needed |
---
## 8. Success Criteria
### 30-day criteria
- ACT-01 + ECT-02 on datatracker
- WIMSE mailing list engagement (≥3 replies from chairs/contributors)
- Emirdag liaison conversation started
### 90-day criteria (IETF 123 timing)
- 10-minute WIMSE agenda slot secured
- ≥1 independent implementation of ACT or ECT outside our refimpl
- Referenced by at least 2 other drafts
### 180-day criteria
- WIMSE WG adoption call for ECT (or clear path to it)
- SCITT-AI joint profile or explicit coordination
- ACT independent submission moving toward RFC Editor queue
---
## 9. Dependencies and Open Decisions
### External dependencies
- `draft-ietf-scitt-architecture` → RFC (timing unknown, AUTH48 now)
- `draft-ietf-oauth-transaction-tokens-08` → RFC (WG Last Call now)
- `draft-ietf-wimse-http-signature` → needs breaking change propagated
- WIMSE WG charter interpretation (chairs' call)
### Open decisions (need user input)
- Approach to Emirdag: liaison email, co-authorship offer, or just citation?
- Publish refimpls to PyPI? (currently package names `ietf-act`/`ietf-ect` reserved but not published — **no publishing without explicit user approval**)
- Repo strategy: single monorepo, or split ACT/ECT into separate Git repos for separate draft homes?
- IETF 123 travel: attend in person or remote?
---
## 10. References
### Our work
- `packages/act/draft-nennemann-act-01.md`
- `drafts/ietf-wimse-ect/draft-nennemann-wimse-ect.md` (docname -02)
- `drafts/ietf-wimse-ect/DIFF-vs-txn-tokens-for-agents.md`
- `drafts/ietf-wimse-ect/wimse-intro-email.md`
### Key competing/complementary drafts
- draft-oauth-transaction-tokens-for-agents-06 (Raut/Amazon)
- draft-emirdag-scitt-ai-agent-execution-00 (VERIDIC)
- draft-helixar-hdp-agentic-delegation-00
- draft-king-dawn-requirements-00 (potential new WG)
- draft-ietf-wimse-arch-07 (cite §3.3.9)
- draft-ietf-wimse-http-signature-03 (breaking change)
### Key arXiv references
- 2603.24775 — AIP / IBCTs
- 2604.02767 — SentinelAgent
- 2603.23801 — AgentRFC (Composition Safety)
- 2509.13597 — Agentic JWT
- 2604.06148 — MIGT taxonomy

1
workspace/act/MOVED.md Normal file
View File

@@ -0,0 +1 @@
Canonical location moved to workspace/packages/act/

View File

@@ -1,4 +1,4 @@
"""Agent Compact Token (ACT) — Reference Implementation.
"""Agent Context Token (ACT) — Reference Implementation.
A JWT-based format for autonomous AI agents that unifies authorization
and execution accountability in a single token lifecycle.

View File

@@ -0,0 +1,23 @@
[build-system]
requires = ["setuptools>=68.0"]
build-backend = "setuptools.build_meta"
[project]
name = "act"
version = "0.1.0"
description = "Agent Context Token (ACT) — JWT-based authorization and execution accountability for AI agents"
requires-python = ">=3.11"
dependencies = [
"cryptography>=42.0",
]
[project.optional-dependencies]
dev = [
"pytest>=8.0",
]
[tool.setuptools.packages.find]
where = ["."]
[tool.pytest.ini_options]
testpaths = ["tests"]

View File

@@ -0,0 +1,79 @@
# PoC Plan — ACT + ECT over MCP with LangGraph
**Target**: end-to-end working demo for IETF 123 preparation and draft credibility.
**Location**: `demo/act-ect-mcp/` (moved out of `workspace/poc/` to be a first-class peer of `paper/` and `workspace/`)
**Date**: 2026-04-12
## Scenario
A user issues a **mandate** (ACT authorization token) to a LangGraph agent: "research topic X, produce summary." The agent, running in a LangGraph `StateGraph`, calls two tools exposed by an MCP server:
1. `search(query)` — returns fake hits
2. `summarize(text)` — returns fake summary
Every MCP tool call is authenticated by:
- **ACT mandate** in `Authorization: Bearer <act-jwt>` header (capability check: `cap=["mcp.search","mcp.summarize"]`)
- **ECT execution context** in a signed HTTP header per `draft-ietf-wimse-http-signature-03` using the `wimse-aud` signature parameter
- Tool-call body hashed into ECT `inp_hash`
On tool success, the agent mints an **ACT execution record** with:
- `status="completed"`
- `pred=[mandate.jti]` for the first call, `pred=[mandate.jti, prev_exec.jti]` for subsequent calls (DAG, not linear)
- `inp_hash` / `out_hash` bound to request/response bodies
A standalone `verify` CLI walks the resulting ACT ledger + ECT store and prints the DAG.
## Components
```
poc/mcp-langgraph/
├── README.md # how to run, what it proves
├── pyproject.toml # depends on ietf-act, ietf-ect, mcp, langgraph, fastapi
├── keys/ # generated ES256 keys (gitignored)
├── src/
│ ├── keys.py # key gen + JWKS loader
│ ├── tokens.py # ACT mandate/exec minters, ECT header builder
│ ├── http_sig.py # minimal http-signature-03 signer/verifier
│ ├── server.py # MCP server (FastMCP streamable-http) + ECT/ACT middleware
│ ├── agent.py # LangGraph agent + MCP client, injects ACT+ECT
│ └── verify_cli.py # walks ledger, prints DAG
├── demo.sh # end-to-end: start server, run agent, run verifier
└── tests/
├── test_token_flow.py # one tool call → one exec record linked via pred
└── test_dag_shape.py # two tool calls → DAG with expected pred edges
```
## Build sequence (verifiable increments)
1. **Skeleton + deps**: `pyproject.toml`, package layout, install works.
2. **Keys + tokens**: `keys.py`, `tokens.py`. Unit test: mint mandate, mint exec, verify both.
3. **HTTP signature**: `http_sig.py`. Unit test: sign request, verify (round-trip).
4. **MCP server**: FastMCP with two fake tools + ASGI middleware that verifies ACT+ECT on every call. Manual curl test.
5. **LangGraph agent**: StateGraph, `langchain-mcp-adapters` with custom headers callback. No LLM — scripted node sequence that calls both tools.
6. **Verifier CLI**: prints the DAG (mandate → exec1 → exec2).
7. **End-to-end `demo.sh`**: spawns server, runs agent, runs verifier. Green = PoC done.
## Explicit non-goals
- Real LLM via Ollama (local, zero API cost). `create_react_agent` decides which MCP tools to call. Token flow is deterministic regardless of LLM output.
- No registration / discovery of agents — assume pre-shared JWKS.
- No distributed SCITT anchoring — in-memory ledger only.
- No Go interop in this PoC (Python + Python). Go path tracked separately.
## Tradeoffs
**Real MCP vs. MCP-shaped**: use real MCP SDK (FastMCP server + `streamablehttp_client`). More credible for IETF reviewers; adds dep weight. If the MCP SDK blocks us from injecting ECT on the HTTP layer, fall back to FastAPI endpoints that mirror the MCP JSON-RPC shape.
**LLM**: real. Ollama `qwen3:8b` as default (local, free, reproducible per CLAUDE.md cost policy). Swap to Anthropic/OpenAI via env var. LangGraph `create_react_agent` wires LLM + MCP tools.
## Success criteria
- `demo.sh` exits 0, verifier prints a DAG with 1 mandate + 2 execs + correct `pred` edges.
- All unit tests green.
- README shows a reviewer can reproduce in under 5 minutes (`uv sync && ./demo.sh`).
## Questions before coding
- OK to put PoC under `workspace/poc/mcp-langgraph/` (new dir), not under `packages/`?
- OK to pin `mcp>=1.0`, `langgraph>=0.2`, `langchain-mcp-adapters` as deps?
- LangGraph without an LLM is fine for v1 of the PoC, right?

View File

@@ -0,0 +1,67 @@
# PDF Generation for IETF Drafts
## Status
`xml2rfc --pdf` is wired up on this machine. PDFs are generated automatically
as part of `build.sh` in each draft directory.
## How it works
`xml2rfc` is installed via `pipx` in its own venv
(`/home/c/.local/share/pipx/venvs/xml2rfc/`). The `--pdf` switch requires
several extra dependencies that had to be injected into that venv:
```bash
pipx inject xml2rfc "weasyprint<60" pycairo pangocffi
/home/c/.local/share/pipx/venvs/xml2rfc/bin/python -m pip install "pydyf<0.10"
```
Version pins matter:
- `weasyprint 59.x` — xml2rfc's `--pdf` code path calls weasyprint's
`write_pdf(target, stylesheets=[...], presentational_hints=True)`.
Newer weasyprint (60+) changes the signature.
- `pydyf <0.10` — weasyprint 59 calls `pydyf.PDF(version, identifier)` with
two positional args. pydyf 0.10+ removed those.
System libs used via ctypes: `pango`, `pangocairo`, `cairo`, `harfbuzz`
(all already present via Fedora packages).
Fonts: xml2rfc uses Noto + Roboto Mono embedded in the weasyprint output.
Not installed system-wide but weasyprint fonttools handles them.
## Build step (pattern for build.sh)
After the HTML step, add:
```bash
# Step 4: XML -> PDF
echo "Generating PDF output..."
if "$XML2RFC" "$DIR/$DRAFT.xml" --pdf --quiet 2>/dev/null; then
PDF_OK=1
else
echo " xml2rfc --pdf failed; falling back to weasyprint on HTML"
if command -v weasyprint >/dev/null 2>&1; then
weasyprint "$DIR/$DRAFT.html" "$DIR/$DRAFT.pdf" >/dev/null 2>&1 \
&& PDF_OK=1 || PDF_OK=0
else
PDF_OK=0
fi
fi
```
## Verification
`ietf-wimse-ect/draft-nennemann-wimse-ect-02.pdf` — 178 KB, generated via
`xml2rfc --pdf` (IETF-idiomatic layout with Noto fonts, title page,
bookmarks, and proper TOC). The fallback `weasyprint html->pdf` produces a
172 KB PDF that works but renders the html2rfc template instead of the
official IETF print layout; use it only if xml2rfc --pdf is unavailable.
## Reinstallation checklist (on a new machine)
1. `pipx install xml2rfc`
2. `pipx inject xml2rfc "weasyprint<60" pycairo pangocffi`
3. `/home/c/.local/share/pipx/venvs/xml2rfc/bin/python -m pip install "pydyf<0.10"`
4. Install system libs: `dnf install pango cairo harfbuzz` (Fedora) or
`apt install libpango-1.0-0 libpangoft2-1.0-0 libharfbuzz0b` (Debian)
5. Test: `xml2rfc some-draft.xml --pdf`

View File

@@ -0,0 +1,35 @@
<?xml version="1.0" encoding="UTF-8"?>
<reference anchor="I-D.ietf-scitt-architecture" target="https://datatracker.ietf.org/doc/html/draft-ietf-scitt-architecture-22">
<front>
<title>An Architecture for Trustworthy and Transparent Digital Supply Chains</title>
<author initials="H." surname="Birkholz" fullname="Henk Birkholz">
<organization>Fraunhofer SIT</organization>
</author>
<author initials="A." surname="Delignat-Lavaud" fullname="Antoine Delignat-Lavaud">
<organization>Microsoft Research</organization>
</author>
<author initials="C." surname="Fournet" fullname="Cedric Fournet">
<organization>Microsoft Research</organization>
</author>
<author initials="Y." surname="Deshpande" fullname="Yogesh Deshpande">
<organization>ARM</organization>
</author>
<author initials="S." surname="Lasker" fullname="Steve Lasker">
</author>
<date month="October" day="10" year="2025" />
<abstract>
<t> Traceability in supply chains is a growing security concern. While
verifiable data structures have addressed specific issues, such as
equivocation over digital certificates, they lack a universal
architecture for all supply chains. This document defines such an
architecture for single-issuer signed statement transparency. It
ensures extensibility, interoperability between different
transparency services, and compliance with various auditing
procedures and regulatory requirements.
</t>
</abstract>
</front>
<seriesInfo name="Internet-Draft" value="draft-ietf-scitt-architecture-22" />
</reference>

View File

@@ -0,0 +1,13 @@
<reference anchor="RFC2119" target="https://www.rfc-editor.org/info/rfc2119">
<front>
<title>Key words for use in RFCs to Indicate Requirement Levels</title>
<author fullname="S. Bradner" initials="S." surname="Bradner"/>
<date month="March" year="1997"/>
<abstract>
<t>In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
</abstract>
</front>
<seriesInfo name="BCP" value="14"/>
<seriesInfo name="RFC" value="2119"/>
<seriesInfo name="DOI" value="10.17487/RFC2119"/>
</reference>

View File

@@ -0,0 +1,14 @@
<reference anchor="RFC7009" target="https://www.rfc-editor.org/info/rfc7009">
<front>
<title>OAuth 2.0 Token Revocation</title>
<author fullname="T. Lodderstedt" initials="T." role="editor" surname="Lodderstedt"/>
<author fullname="S. Dronia" initials="S." surname="Dronia"/>
<author fullname="M. Scurtescu" initials="M." surname="Scurtescu"/>
<date month="August" year="2013"/>
<abstract>
<t>This document proposes an additional endpoint for OAuth authorization servers, which allows clients to notify the authorization server that a previously obtained refresh or access token is no longer needed. This allows the authorization server to clean up security credentials. A revocation request will invalidate the actual token and, if applicable, other tokens based on the same authorization grant.</t>
</abstract>
</front>
<seriesInfo name="RFC" value="7009"/>
<seriesInfo name="DOI" value="10.17487/RFC7009"/>
</reference>

View File

@@ -0,0 +1,14 @@
<reference anchor="RFC7515" target="https://www.rfc-editor.org/info/rfc7515">
<front>
<title>JSON Web Signature (JWS)</title>
<author fullname="M. Jones" initials="M." surname="Jones"/>
<author fullname="J. Bradley" initials="J." surname="Bradley"/>
<author fullname="N. Sakimura" initials="N." surname="Sakimura"/>
<date month="May" year="2015"/>
<abstract>
<t>JSON Web Signature (JWS) represents content secured with digital signatures or Message Authentication Codes (MACs) using JSON-based data structures. Cryptographic algorithms and identifiers for use with this specification are described in the separate JSON Web Algorithms (JWA) specification and an IANA registry defined by that specification. Related encryption capabilities are described in the separate JSON Web Encryption (JWE) specification.</t>
</abstract>
</front>
<seriesInfo name="RFC" value="7515"/>
<seriesInfo name="DOI" value="10.17487/RFC7515"/>
</reference>

View File

@@ -0,0 +1,12 @@
<reference anchor="RFC7517" target="https://www.rfc-editor.org/info/rfc7517">
<front>
<title>JSON Web Key (JWK)</title>
<author fullname="M. Jones" initials="M." surname="Jones"/>
<date month="May" year="2015"/>
<abstract>
<t>A JSON Web Key (JWK) is a JavaScript Object Notation (JSON) data structure that represents a cryptographic key. This specification also defines a JWK Set JSON data structure that represents a set of JWKs. Cryptographic algorithms and identifiers for use with this specification are described in the separate JSON Web Algorithms (JWA) specification and IANA registries established by that specification.</t>
</abstract>
</front>
<seriesInfo name="RFC" value="7517"/>
<seriesInfo name="DOI" value="10.17487/RFC7517"/>
</reference>

View File

@@ -0,0 +1,12 @@
<reference anchor="RFC7518" target="https://www.rfc-editor.org/info/rfc7518">
<front>
<title>JSON Web Algorithms (JWA)</title>
<author fullname="M. Jones" initials="M." surname="Jones"/>
<date month="May" year="2015"/>
<abstract>
<t>This specification registers cryptographic algorithms and identifiers to be used with the JSON Web Signature (JWS), JSON Web Encryption (JWE), and JSON Web Key (JWK) specifications. It defines several IANA registries for these identifiers.</t>
</abstract>
</front>
<seriesInfo name="RFC" value="7518"/>
<seriesInfo name="DOI" value="10.17487/RFC7518"/>
</reference>

View File

@@ -0,0 +1,14 @@
<reference anchor="RFC7519" target="https://www.rfc-editor.org/info/rfc7519">
<front>
<title>JSON Web Token (JWT)</title>
<author fullname="M. Jones" initials="M." surname="Jones"/>
<author fullname="J. Bradley" initials="J." surname="Bradley"/>
<author fullname="N. Sakimura" initials="N." surname="Sakimura"/>
<date month="May" year="2015"/>
<abstract>
<t>JSON Web Token (JWT) is a compact, URL-safe means of representing claims to be transferred between two parties. The claims in a JWT are encoded as a JSON object that is used as the payload of a JSON Web Signature (JWS) structure or as the plaintext of a JSON Web Encryption (JWE) structure, enabling the claims to be digitally signed or integrity protected with a Message Authentication Code (MAC) and/or encrypted.</t>
</abstract>
</front>
<seriesInfo name="RFC" value="7519"/>
<seriesInfo name="DOI" value="10.17487/RFC7519"/>
</reference>

View File

@@ -0,0 +1,12 @@
<reference anchor="RFC8037" target="https://www.rfc-editor.org/info/rfc8037">
<front>
<title>CFRG Elliptic Curve Diffie-Hellman (ECDH) and Signatures in JSON Object Signing and Encryption (JOSE)</title>
<author fullname="I. Liusvaara" initials="I." surname="Liusvaara"/>
<date month="January" year="2017"/>
<abstract>
<t>This document defines how to use the Diffie-Hellman algorithms "X25519" and "X448" as well as the signature algorithms "Ed25519" and "Ed448" from the IRTF CFRG elliptic curves work in JSON Object Signing and Encryption (JOSE).</t>
</abstract>
</front>
<seriesInfo name="RFC" value="8037"/>
<seriesInfo name="DOI" value="10.17487/RFC8037"/>
</reference>

View File

@@ -0,0 +1,13 @@
<reference anchor="RFC8174" target="https://www.rfc-editor.org/info/rfc8174">
<front>
<title>Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words</title>
<author fullname="B. Leiba" initials="B." surname="Leiba"/>
<date month="May" year="2017"/>
<abstract>
<t>RFC 2119 specifies common key words that may be used in protocol specifications. This document aims to reduce the ambiguity by clarifying that only UPPERCASE usage of the key words have the defined special meanings.</t>
</abstract>
</front>
<seriesInfo name="BCP" value="14"/>
<seriesInfo name="RFC" value="8174"/>
<seriesInfo name="DOI" value="10.17487/RFC8174"/>
</reference>

View File

@@ -0,0 +1,16 @@
<reference anchor="RFC8693" target="https://www.rfc-editor.org/info/rfc8693">
<front>
<title>OAuth 2.0 Token Exchange</title>
<author fullname="M. Jones" initials="M." surname="Jones"/>
<author fullname="A. Nadalin" initials="A." surname="Nadalin"/>
<author fullname="B. Campbell" initials="B." role="editor" surname="Campbell"/>
<author fullname="J. Bradley" initials="J." surname="Bradley"/>
<author fullname="C. Mortimore" initials="C." surname="Mortimore"/>
<date month="January" year="2020"/>
<abstract>
<t>This specification defines a protocol for an HTTP- and JSON-based Security Token Service (STS) by defining how to request and obtain security tokens from OAuth 2.0 authorization servers, including security tokens employing impersonation and delegation.</t>
</abstract>
</front>
<seriesInfo name="RFC" value="8693"/>
<seriesInfo name="DOI" value="10.17487/RFC8693"/>
</reference>

View File

@@ -0,0 +1,16 @@
<reference anchor="RFC9110" target="https://www.rfc-editor.org/info/rfc9110">
<front>
<title>HTTP Semantics</title>
<author fullname="R. Fielding" initials="R." role="editor" surname="Fielding"/>
<author fullname="M. Nottingham" initials="M." role="editor" surname="Nottingham"/>
<author fullname="J. Reschke" initials="J." role="editor" surname="Reschke"/>
<date month="June" year="2022"/>
<abstract>
<t>The Hypertext Transfer Protocol (HTTP) is a stateless application-level protocol for distributed, collaborative, hypertext information systems. This document describes the overall architecture of HTTP, establishes common terminology, and defines aspects of the protocol that are shared by all versions. In this definition are core protocol elements, extensibility mechanisms, and the "http" and "https" Uniform Resource Identifier (URI) schemes.</t>
<t>This document updates RFC 3864 and obsoletes RFCs 2818, 7231, 7232, 7233, 7235, 7538, 7615, 7694, and portions of 7230.</t>
</abstract>
</front>
<seriesInfo name="STD" value="97"/>
<seriesInfo name="RFC" value="9110"/>
<seriesInfo name="DOI" value="10.17487/RFC9110"/>
</reference>

View File

@@ -0,0 +1,24 @@
<reference anchor="RFC9562" target="https://www.rfc-editor.org/info/rfc9562">
<front>
<title>Universally Unique IDentifiers (UUIDs)</title>
<author fullname="K. Davis" initials="K." surname="Davis"/>
<author fullname="B. Peabody" initials="B." surname="Peabody"/>
<author fullname="P. Leach" initials="P." surname="Leach"/>
<date month="May" year="2024"/>
<abstract>
<t>This specification defines UUIDs (Universally Unique IDentifiers) --
also known as GUIDs (Globally Unique IDentifiers) -- and a Uniform
Resource Name namespace for UUIDs. A UUID is 128 bits long and is
intended to guarantee uniqueness across space and time. UUIDs were
originally used in the Apollo Network Computing System (NCS), later
in the Open Software Foundation's (OSF's) Distributed Computing
Environment (DCE), and then in Microsoft Windows platforms.</t>
<t>This specification is derived from the OSF DCE specification with the
kind permission of the OSF (now known as "The Open Group"). Information from earlier versions of the OSF DCE specification have
been incorporated into this document. This document obsoletes RFC
4122.</t>
</abstract>
</front>
<seriesInfo name="RFC" value="9562"/>
<seriesInfo name="DOI" value="10.17487/RFC9562"/>
</reference>

69
workspace/drafts/act/build.sh Executable file
View File

@@ -0,0 +1,69 @@
#!/bin/bash
set -e
DIR="$(cd "$(dirname "$0")" && pwd)"
SRC="$DIR/draft-nennemann-act-01.md"
# Extract docname from YAML front matter
DRAFT=$(grep '^docname:' "$SRC" | head -1 | awk '{print $2}')
if [ -z "$DRAFT" ]; then
echo "Error: could not extract docname from $SRC"
exit 1
fi
# Tool paths
KRAMDOWN="$(which kramdown-rfc2629 2>/dev/null)"
XML2RFC="$(which xml2rfc 2>/dev/null)"
if [ -z "$KRAMDOWN" ]; then
echo "Error: kramdown-rfc2629 not found. Install with: gem install kramdown-rfc2629"
exit 1
fi
if [ -z "$XML2RFC" ]; then
echo "Error: xml2rfc not found. Install with: pip install xml2rfc"
exit 1
fi
export PYTHONWARNINGS="ignore::UserWarning"
echo "Building: $DRAFT"
echo "Using kramdown-rfc2629: $KRAMDOWN"
echo "Using xml2rfc: $XML2RFC"
echo ""
# Step 1: Markdown -> XML
echo "Converting markdown to XML..."
"$KRAMDOWN" "$SRC" > "$DIR/$DRAFT.xml"
# Step 2: XML -> TXT
echo "Generating text output..."
"$XML2RFC" "$DIR/$DRAFT.xml" --text --quiet 2>/dev/null
# Step 3: XML -> HTML
echo "Generating HTML output..."
"$XML2RFC" "$DIR/$DRAFT.xml" --html --quiet 2>/dev/null
# Step 4: XML -> PDF (requires weasyprint + pangocffi + pycairo injected into xml2rfc venv
# and pydyf<0.10 pinned; see /home/c/projects/research.ietf/workspace/drafts/README-pdf.md)
echo "Generating PDF output..."
if "$XML2RFC" "$DIR/$DRAFT.xml" --pdf --quiet 2>/dev/null; then
PDF_OK=1
else
echo " xml2rfc --pdf failed; falling back to weasyprint on HTML"
if command -v weasyprint >/dev/null 2>&1; then
weasyprint "$DIR/$DRAFT.html" "$DIR/$DRAFT.pdf" >/dev/null 2>&1 && PDF_OK=1 || PDF_OK=0
else
PDF_OK=0
fi
fi
echo ""
echo "Build complete:"
echo " $DRAFT.xml (submit this to datatracker)"
echo " $DRAFT.txt"
echo " $DRAFT.html"
if [ "${PDF_OK:-0}" = "1" ]; then
echo " $DRAFT.pdf"
else
echo " (PDF generation skipped — missing deps)"
fi

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

Binary file not shown.

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

Submodule workspace/drafts/ietf-wimse-ect updated: ba38569319...3d01cb32b6

View File

@@ -0,0 +1,30 @@
Dear Dr. Emirdag,
Congratulations on the publication of draft-emirdag-scitt-ai-agent-execution-00 earlier today. I came across it while tracking SCITT-adjacent work on AI agent accountability, and I wanted to reach out because the positioning looks genuinely complementary to a pair of drafts I have been developing.
Brief introduction: I am Christian Nennemann, an independent researcher working on execution-context and lifecycle tokens for agentic systems. My current IETF work consists of:
- draft-nennemann-act-01 (Agent Context Token): a JWT-based two-phase lifecycle — a pre-execution Mandate token carrying authorization, scope, and input commitments, followed by a post-execution Record token committing to outputs and linking back via `pred`. Multiple Records form a DAG, signed with Ed25519 or ES256.
- draft-nennemann-wimse-ect-02 (Execution Context Token): a WIMSE profile with three assurance levels and identity binding for the workload that produced a given execution.
Reading your AIR specification, the layering seems fairly clean: ACT defines *what* is being anchored — the lifecycle token with its authorization proof, input/output commitments, and causal predecessor links — while AIR defines *how* it is anchored on a SCITT transparency service as a COSE_Sign1 payload with its hash-chain, four-step verification, and EU AI Act / NIST AI RMF mappings. There is real conceptual overlap on input/output hashing, reasoning capture, identity, timing, and causality, which suggests that coordinating now would save both of us retrofitting later.
A few concrete options, in rough order of effort:
(a) Cross-citations in both drafts, establishing the "ACT record → AIR payload → SCITT receipt" flow as the intended pipeline.
(b) A short shared section on "Anchoring ACT Records in SCITT" — either folded into ACT-02 or as a small companion draft if you prefer neutral ground.
(c) Aligning claim semantics where they overlap — in particular input/output hash representation (I currently use `inp_hash` / `out_hash`, JWT-side) so that translation to AIR is lossless.
(d) If we both attend IETF 123, a joint slot in SCITT or a side meeting could make the layering concrete for the WG.
I would be happy to send you the current ACT and ECT drafts and to review yours in detail before either of us adds formal cross-references. Low-pressure — mainly wanted to flag the alignment while the drafts are still malleable.
Looking forward to your thoughts.
Best regards,
Christian Nennemann
Independent Researcher
[contact details]
---
**Suggested subject line:** Liaison proposal: ACT/ECT lifecycle tokens and SCITT-AI AIR — complementary layering

View File

@@ -0,0 +1,71 @@
**To:** oauth@ietf.org
**From:** Christian Nennemann <ietf@nennemann.de>
**Subject:** draft-oauth-transaction-tokens-for-agents-06: complementary work on DAG-based delegation (draft-nennemann-act)
Hi all,
I noticed the publication of draft-oauth-transaction-tokens-for-agents-06
(Raut et al., 2026-04-11) and wanted to share some complementary work that
addresses an adjacent slice of the agent-delegation problem space. The
Amazon draft fills a real gap at the OAuth authorization-server layer, and
I think there is useful coordination potential rather than overlap.
# Technical difference in one paragraph
draft-oauth-transaction-tokens-for-agents introduces `actchain` as an
ordered array documenting delegation history, plus `agentic_ctx` carrying
type/version/intent/operational constraints, with a split between
principal-initiated and autonomous flow types. Our work
(draft-nennemann-act-01) models delegation history as a DAG through a
`pred` (predecessor) claim that is itself an array of parent token
references. A linear `actchain` is a special case of the DAG form where
every node has exactly one predecessor.
# Why a DAG, concretely
Consider an agent that fans out to N parallel sub-agents (e.g. one per
data source) and then synthesizes a single response from their results.
The synthesis step has N predecessors, not one. A linear `actchain`
cannot express this fan-in; you would have to either linearize artificially
(losing causality) or emit N parallel chains (losing the join). With a
DAG-valued `pred`, the synthesis token references all N predecessor tokens
directly, and a verifier can walk the graph to check that each parallel
branch was authorized and unexpired. Fork, join, and diamond topologies
fall out of the same structure.
# Layering, not competition
These two drafts sit at different layers:
- Txn-Tokens-for-Agents is anchored at an OAuth authorization server:
the AS mints and validates tokens, and `actchain` is read in the
context of an AS-issued transaction token.
- ACT is designed for peer-to-peer agent orchestration without
requiring an AS in the hot path — useful for multi-vendor agent
meshes where no single AS is authoritative. It is transport-agnostic
and leans on JWS for provenance.
An AS-issued txn-token could carry an ACT-shaped `pred` graph
internally, or an ACT chain could terminate at an AS that upgrades it
into a txn-token for a specific resource. The two seem stackable.
# Offer
Happy to compare test vectors, especially around:
- claim naming: `agentic_ctx` (Raut) vs ACT's `task` claim — is there
an opportunity to align on a shared intent/constraint shape so
downstream verifiers don't have to parse both?
- linear-subset interop: confirming that a degenerate DAG (each node
one parent) round-trips cleanly to/from `actchain`.
- autonomous-flow semantics: how ACT's unattended-delegation marker
maps onto Raut's autonomous flow type.
ACT draft: https://datatracker.ietf.org/doc/draft-nennemann-act/
Feedback welcome, on- or off-list.
Best,
Christian Nennemann
Independent Researcher
ietf@nennemann.de

View File

@@ -0,0 +1,113 @@
# ACT / ECT Cross-Spec Interop Test Plan
**Status**: Draft (Task C4 preparation — planning only, not yet implemented)
**Scope**: Python refimpls `ietf-act` (Phase 1/2, 103 tests) and `ietf-ect` (single-phase, 56 tests)
**Deliverable**: `packages/interop/tests/test_interop.py` + compatibility matrix docs
## 1. Goals and Non-Goals
### Goals
- Empirically document which shared claims round-trip cleanly between refimpls.
- Surface real format-level incompatibilities (hash encoding, typ header, algorithm support) rather than assume the spec-level claim overlap implies wire interop.
- Produce a user-facing **compatibility matrix** that implementers can rely on when building bridges between Phase 2 ACT Records and ECT payloads.
- Provide executable regression tests so future changes to either refimpl cannot silently break the documented interop level without CI noticing.
### Non-Goals
- Propose spec unification or new shared claim registries.
- Build a lossy translator/bridge between ACT Records and ECT payloads.
- Test `typ` cross-acceptance — `act+jwt` vs `exec+jwt` MUST remain distinct token types.
- Forge one token type as the other.
- Add new crypto backends (e.g., Ed25519 support) to ECT as part of this work.
## 2. Known Shape of the Problem
Shared claims (by name): `jti`, `wid`, `iat`, `exp`, `aud`, `exec_act`, `pred`, `inp_hash`, `out_hash`.
Confirmed divergences discovered while reading the code:
- **Hash encoding mismatch**: ACT `b64url_sha256()` emits plain base64url (e.g. `n4bQgYhMfWWaL-qgxVrQFaO_TxsrC4Is0V1sFbDwCgg`). ECT `validate_hash_format()` requires `alg:base64url` form (e.g. `sha-256:...`) and raises on plain b64url. The briefing says this was "recently fixed to match ACT's plain base64url format" but the ECT validator still requires the prefix — plan must include a reproducer.
- **Algorithm**: ACT supports `EdDSA` + `ES256`; ECT hard-codes `ES256` (see `ect/verify.py`, line 59, `"ect: expected ES256"`).
- **Typ header**: ACT requires `act+jwt`; ECT requires `exec+jwt` (with legacy `wimse-exec+jwt`). Neither accepts the other — and per anti-goals, neither should.
- **aud shape**: ACT stores `aud` as `str | list[str]`; ECT normalises to `list[str]` via `_audience_deserialize`.
- **Claims unique to ACT**: `sub`, `iss` (required string), `task`, `cap`, `del`, `oversight`, `exec_ts`, `status`, `err`.
- **Claims unique to ECT**: `ect_ext`, `inp_classification`, and policy claims inside `ect_ext` (`pol`, `pol_decision`, `compensation_required`).
## 3. Test Categories
### 3.1 Shared claim consistency (`TestSharedClaims`)
- `test_jti_format_roundtrips`: UUID-v4 jti accepted by both refimpls; non-UUID jti accepted by ACT (no UUID check) but only by ECT when `validate_uuids=False` (document the asymmetry).
- `test_wid_shared_semantics`: same wid value on an ACT Record and an ECT payload — both accept.
- `test_iat_exp_numericdate`: identical integer NumericDate accepted by both (ACT uses strict `> 0`, ECT uses `int(claims["iat"])`).
- `test_aud_string_vs_list`: string `aud` preserved by ACT, coerced to list by ECT; list form is lossless on both.
- `test_exec_act_string_both_sides`: same `exec_act` value (e.g. `read.data`) serialises identically; ACT additionally validates ABNF grammar — test that ECT accepts an ACT-grammar-legal value unchanged.
- `test_pred_array_shape`: `pred=[]`, `pred=[jti1]`, `pred=[jti1, jti2]` — both refimpls serialise/deserialise identically.
- `test_inp_hash_format_divergence` (**expected xfail/documented**): feed ACT's plain b64url output into ECT validator — expect `ValueError("ect: inp_hash/out_hash must be algorithm:base64url...")`. This pins the incompatibility so a future fix flips the test green.
- `test_inp_hash_prefixed_form`: `sha-256:<b64url>` value accepted by ECT; ACT treats it as opaque string (no validation), roundtrips without error.
- `test_out_hash_same_as_inp`: mirror the above for `out_hash`.
### 3.2 Algorithm compatibility (`TestAlgorithmMatrix`)
- `test_es256_act_record_signature_verifies_with_ect_key_resolver`: build a Phase 2 ACTRecord, sign with ES256 P-256 key. Feed the compact JWS bytes *and an ECT-shaped resolver* through `ect.verify`. Expect `ValueError("ect: invalid typ parameter")` because typ is `act+jwt`. Document: JWS/ES256 signature layer is compatible, but typ gate prevents verifier reuse as-is.
- `test_eddsa_act_record_rejected_by_ect`: Phase 2 ACTRecord signed EdDSA. ECT must reject at alg gate (`"ect: expected ES256"`). Documents the ES256-only limitation.
- `test_ect_payload_signature_verifies_with_act_crypto`: sign an ECT payload (ES256), strip to raw JWS, feed signature bytes through `act.crypto.verify` with the ECT public key. Expect success — proves the ES256 primitive is wire-compatible at the raw-sig level.
### 3.3 DAG cross-reference (`TestDagInterop`)
- `test_pred_array_referenceable_both_ways`: construct ACT Record with `pred=[ect_jti]` and an ECT payload with `pred=[act_jti]`. Both refimpls accept the arrays structurally (they're opaque strings).
- `test_mixed_dag_is_out_of_scope`: document and assert that `ACTStore` only stores ACT records and `ECTStore` only stores ECT payloads; neither is designed to resolve a `pred` jti from the other type. A bridging verifier would have to walk both stores — out of scope for refimpls.
- `test_jti_collision_across_types`: the same UUID used as `jti` in an ACT Record and an unrelated ECT payload — both refimpls accept independently; document that jti uniqueness is scoped per-token-type in the refimpls.
### 3.4 Semantic divergence (`TestClaimDivergence`)
- `test_ect_ignores_act_only_claims`: ECT `Payload.from_claims` is called on a dict that includes `sub`, `task`, `cap`, `oversight`, `exec_ts`, `status`. Expect: silently ignored (no error, no retention). Document as "ECT is lenient on unknown top-level claims".
- `test_act_ignores_ect_only_claims`: feed `ACTRecord.from_claims` a claim dict with `ect_ext`, `inp_classification`. Expect: silently ignored and not retained.
- `test_exec_act_not_validated_against_cap_in_ect`: ACT Record with `exec_act="read.data"` and `cap=[{"action":"write.result"}]` → ACT verifier raises `ACTCapabilityError`. Same `exec_act` in an ECT payload with no `cap` → ECT accepts. Documents the cap-validation asymmetry; guards against anyone accidentally copy-pasting cap logic into ECT.
- `test_act_requires_status_ect_does_not`: ACTRecord without `status``ACTValidationError`. ECT without `status` → accepted.
### 3.5 Anti-goals (encoded as negative tests)
- `test_act_jwt_typ_rejected_by_ect`: ACT compact with `typ=act+jwt` fed to `ect.verify` → MUST raise "invalid typ parameter".
- `test_exec_jwt_typ_rejected_by_act`: ECT compact with `typ=exec+jwt` fed to `act.decode_jws` → MUST raise `ACTValidationError` on typ check.
- `test_no_forgery_as_other_type`: explicit comment-only placeholder asserting we do not re-encode one type as the other; kept as a doc anchor.
## 4. Expected Compatibility Matrix (user-facing)
| Layer | Direction | Status | Notes |
|---|---|---|---|
| ES256 raw signature | ACT ↔ ECT | Compatible | Same JWS/ES256 primitive |
| EdDSA signature | ACT → ECT | Incompatible | ECT is ES256-only |
| `typ` header | ACT ↔ ECT | Strictly separated | By design |
| `jti`, `wid`, `iat`, `exp`, `aud`, `exec_act`, `pred` | Shared | Compatible | Identical wire shapes |
| `inp_hash`/`out_hash` | ACT → ECT | **Incompatible today** | ACT emits plain b64url, ECT requires `sha-256:<b64url>` |
| `inp_hash`/`out_hash` | ECT → ACT | Compatible | ACT treats as opaque string |
| `cap` / `exec_act` coupling | ACT-only | N/A | ECT does not enforce |
| DAG `pred` traversal | Separate stores | Manual bridging required | Refimpls do not cross-resolve |
## 5. Dependencies and Structure
Both packages must be importable in a single venv:
```
pip install -e packages/act packages/ect packages/interop[dev]
```
Proposed layout:
```
packages/
act/ …
ect/ …
interop/
pyproject.toml # declares ietf-act, ietf-ect as deps
tests/
__init__.py
conftest.py # shared ES256 keypair + resolver fixtures
test_interop.py # classes Test{SharedClaims,AlgorithmMatrix,DagInterop,ClaimDivergence,AntiGoals}
README.md # published compatibility matrix
```
`conftest.py` exposes fixtures: `es256_keypair`, `act_record_builder`, `ect_payload_builder`, `dual_resolver` (one kid → same ES256 pubkey for both refimpls).
## 6. What the Compatibility Matrix Docs Should Tell Users
- **Do** reuse ES256 key material across ACT and ECT deployments — the signing primitive is identical.
- **Do not** feed ACT compact tokens to an ECT verifier or vice versa; `typ` gates are deliberate.
- **Do** treat `jti`, `wid`, `pred`, `exec_act` as semantically aligned when building cross-type audit logs.
- **Do not** rely on `inp_hash`/`out_hash` being portable today — raise a spec issue if portability matters for your deployment.
- **Do not** expect ECT to enforce ACT's `cap`/`exec_act` coupling — authorization remains an ACT concern.
- **Open question for spec editors**: align hash encoding (plain b64url vs prefixed), and decide whether Ed25519 should be optional-to-support for ECT.

View File

@@ -0,0 +1,119 @@
"""Agent Context Token (ACT) — Reference Implementation.
A JWT-based format for autonomous AI agents that unifies authorization
and execution accountability in a single token lifecycle.
Reference: draft-nennemann-act-01.
"""
from .errors import (
ACTAudienceMismatchError,
ACTCapabilityError,
ACTDAGError,
ACTDelegationError,
ACTError,
ACTExpiredError,
ACTKeyResolutionError,
ACTLedgerImmutabilityError,
ACTPhaseError,
ACTPrivilegeEscalationError,
ACTSignatureError,
ACTValidationError,
)
from .token import (
ACTMandate,
ACTRecord,
Capability,
Delegation,
DelegationEntry,
ErrorClaim,
Oversight,
TaskClaim,
decode_jws,
encode_jws,
parse_token,
)
from .crypto import (
ACTKeyResolver,
KeyRegistry,
PublicKey,
PrivateKey,
X509TrustStore,
b64url_sha256,
compute_sha256,
did_key_from_ed25519,
generate_ed25519_keypair,
generate_p256_keypair,
resolve_did_key,
sign,
verify,
)
from .lifecycle import transition_to_record
from .delegation import (
create_delegated_mandate,
verify_capability_subset,
verify_delegation_chain,
)
from .dag import validate_dag, ACTStore
from .ledger import ACTLedger
from .verify import ACTVerifier
from .vectors import generate_vectors, validate_vectors
__all__ = [
# Errors
"ACTError",
"ACTValidationError",
"ACTSignatureError",
"ACTExpiredError",
"ACTAudienceMismatchError",
"ACTCapabilityError",
"ACTDelegationError",
"ACTDAGError",
"ACTPhaseError",
"ACTKeyResolutionError",
"ACTLedgerImmutabilityError",
"ACTPrivilegeEscalationError",
# Token structures
"ACTMandate",
"ACTRecord",
"TaskClaim",
"Capability",
"Delegation",
"DelegationEntry",
"Oversight",
"ErrorClaim",
# Token serialization
"encode_jws",
"decode_jws",
"parse_token",
# Crypto
"generate_ed25519_keypair",
"generate_p256_keypair",
"sign",
"verify",
"compute_sha256",
"b64url_sha256",
"resolve_did_key",
"did_key_from_ed25519",
"KeyRegistry",
"X509TrustStore",
"ACTKeyResolver",
"PublicKey",
"PrivateKey",
# Lifecycle
"transition_to_record",
# Delegation
"create_delegated_mandate",
"verify_capability_subset",
"verify_delegation_chain",
# DAG
"validate_dag",
"ACTStore",
# Ledger
"ACTLedger",
# Verify
"ACTVerifier",
# Vectors
"generate_vectors",
"validate_vectors",
]

View File

@@ -0,0 +1,467 @@
"""ACT cryptographic primitives and key management.
Provides sign/verify operations and key resolution across all three
ACT trust tiers:
- Tier 1: Pre-shared Ed25519 and P-256 keys
- Tier 2: PKI / X.509 certificate chains
- Tier 3: DID (did:key self-contained, did:web via resolver callback)
Reference: ACT §5 (Trust Model), §8 (Verification Procedure).
"""
from __future__ import annotations
import base64
import hashlib
import re
from typing import Any, Callable, Protocol
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ec import (
ECDSA,
SECP256R1,
EllipticCurvePrivateKey,
EllipticCurvePublicKey,
generate_private_key as ec_generate_private_key,
)
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
Ed25519PrivateKey,
Ed25519PublicKey,
)
from cryptography.hazmat.primitives.hashes import SHA256
from cryptography.x509 import (
Certificate,
load_der_x509_certificate,
)
from .errors import (
ACTKeyResolutionError,
ACTSignatureError,
ACTValidationError,
)
# Type aliases for public/private keys supported by ACT.
PublicKey = Ed25519PublicKey | EllipticCurvePublicKey
PrivateKey = Ed25519PrivateKey | EllipticCurvePrivateKey
# Callback type for DID:web resolution.
DIDResolver = Callable[[str], PublicKey | None]
def generate_ed25519_keypair() -> tuple[Ed25519PrivateKey, Ed25519PublicKey]:
"""Generate an Ed25519 key pair for ACT signing.
Returns a (private_key, public_key) tuple. The private key object
carries its associated public key per ACT security requirements.
Reference: ACT §5.2 (Tier 1 pre-shared keys).
"""
private_key = Ed25519PrivateKey.generate()
return private_key, private_key.public_key()
def generate_p256_keypair() -> tuple[EllipticCurvePrivateKey, EllipticCurvePublicKey]:
"""Generate a P-256 (ES256) key pair for ACT signing.
Returns a (private_key, public_key) tuple.
Reference: ACT §5.2 (Tier 1 pre-shared keys).
"""
private_key = ec_generate_private_key(SECP256R1())
return private_key, private_key.public_key()
def sign(private_key: PrivateKey, data: bytes) -> bytes:
"""Sign data using the appropriate algorithm for the key type.
Uses Ed25519 for Ed25519PrivateKey, ECDSA with SHA-256 for P-256.
Returns raw signature bytes (for Ed25519: 64 bytes; for ES256:
raw r||s format per RFC 7518 §3.4).
Reference: ACT §5, RFC 7515 §5.1.
Raises:
ACTValidationError: If the key type is not supported.
"""
if isinstance(private_key, Ed25519PrivateKey):
return private_key.sign(data)
elif isinstance(private_key, EllipticCurvePrivateKey):
from cryptography.hazmat.primitives.asymmetric.utils import (
decode_dss_signature,
)
# Sign with DER-encoded signature, then convert to raw r||s
der_sig = private_key.sign(data, ECDSA(SHA256()))
r, s = decode_dss_signature(der_sig)
# P-256 uses 32-byte integers
return r.to_bytes(32, "big") + s.to_bytes(32, "big")
else:
raise ACTValidationError(f"Unsupported key type: {type(private_key)}")
def verify(public_key: PublicKey, signature: bytes, data: bytes) -> None:
"""Verify a signature against the given public key and data.
Reference: ACT §8.1 step 5.
Raises:
ACTSignatureError: If the signature is invalid.
ACTValidationError: If the key type is not supported.
"""
try:
if isinstance(public_key, Ed25519PublicKey):
public_key.verify(signature, data)
elif isinstance(public_key, EllipticCurvePublicKey):
from cryptography.hazmat.primitives.asymmetric.utils import (
encode_dss_signature,
)
# Convert raw r||s back to DER
r = int.from_bytes(signature[:32], "big")
s = int.from_bytes(signature[32:], "big")
der_sig = encode_dss_signature(r, s)
public_key.verify(der_sig, data, ECDSA(SHA256()))
else:
raise ACTValidationError(
f"Unsupported key type: {type(public_key)}"
)
except InvalidSignature as e:
raise ACTSignatureError("Signature verification failed") from e
def compute_sha256(data: bytes) -> bytes:
"""Compute SHA-256 hash of data.
Used for delegation chain signatures and inp_hash/out_hash claims.
Reference: ACT §6.1 (delegation sig), §4.3 (inp_hash, out_hash).
"""
return hashlib.sha256(data).digest()
def b64url_sha256(data: bytes) -> str:
"""Compute base64url(SHA-256(data)) without padding.
Used for inp_hash and out_hash claims.
Reference: ACT §4.3.
"""
digest = compute_sha256(data)
return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
def x509_kid(cert_der: bytes) -> str:
"""Compute the Tier 2 kid: SHA-256 thumbprint of DER certificate.
Reference: ACT §5.3 (Tier 2 kid format).
"""
return hashlib.sha256(cert_der).hexdigest()
class KeyRegistry:
"""Tier 1 pre-shared key registry.
Maps kid strings to public keys. Configured at initialization time
with no external resolution needed.
Reference: ACT §5.2 (Tier 1 Pre-Shared Keys).
"""
def __init__(self) -> None:
self._keys: dict[str, PublicKey] = {}
def register(self, kid: str, public_key: PublicKey) -> None:
"""Register a public key under the given kid.
Reference: ACT §5.2.
"""
self._keys[kid] = public_key
def get(self, kid: str) -> PublicKey | None:
"""Retrieve the public key for a kid, or None if not found."""
return self._keys.get(kid)
def __contains__(self, kid: str) -> bool:
return kid in self._keys
def __len__(self) -> int:
return len(self._keys)
class X509TrustStore:
"""Tier 2 PKI/X.509 trust store.
Holds trusted CA certificates and resolves kid (certificate
thumbprint) to public keys. Supports x5c header chain validation.
Reference: ACT §5.3 (Tier 2 PKI).
"""
def __init__(self) -> None:
self._trusted_certs: dict[str, Certificate] = {}
def add_trusted_cert(self, cert: Certificate) -> str:
"""Add a trusted certificate to the store.
Returns the kid (SHA-256 thumbprint of DER encoding).
Reference: ACT §5.3.
"""
from cryptography.hazmat.primitives.serialization import Encoding
der_bytes = cert.public_bytes(Encoding.DER)
kid = x509_kid(der_bytes)
self._trusted_certs[kid] = cert
return kid
def resolve(self, kid: str) -> PublicKey | None:
"""Resolve kid to a public key from a trusted certificate.
Reference: ACT §5.3, §8.1 step 4.
"""
cert = self._trusted_certs.get(kid)
if cert is None:
return None
pub = cert.public_key()
if isinstance(pub, (Ed25519PublicKey, EllipticCurvePublicKey)):
return pub
return None
def resolve_x5c(self, x5c: list[str]) -> PublicKey | None:
"""Resolve public key from x5c certificate chain.
The first entry in x5c is the end-entity certificate.
Validates that the chain terminates in a trusted CA.
Reference: ACT §4.1 (x5c header), §5.3.
"""
if not x5c:
return None
try:
# Decode certificates from base64 DER
certs = [
load_der_x509_certificate(base64.b64decode(c)) for c in x5c
]
except Exception:
return None
# Check if any cert in the chain is in our trust store
from cryptography.hazmat.primitives.serialization import Encoding
for cert in certs:
der_bytes = cert.public_bytes(Encoding.DER)
kid = x509_kid(der_bytes)
if kid in self._trusted_certs:
# End-entity cert is the first one
ee_pub = certs[0].public_key()
if isinstance(ee_pub, (Ed25519PublicKey, EllipticCurvePublicKey)):
return ee_pub
return None
return None
# --- Tier 3: DID Support ---
# Multicodec prefixes for did:key
_ED25519_MULTICODEC = b"\xed\x01"
_P256_MULTICODEC = b"\x80\x24"
def _multibase_decode(encoded: str) -> bytes:
"""Decode a multibase-encoded string (base58btc 'z' prefix).
Reference: ACT §5.4 (Tier 3 DID:key).
"""
if not encoded.startswith("z"):
raise ACTKeyResolutionError(
f"Unsupported multibase encoding prefix: {encoded[0]!r}"
)
return _base58btc_decode(encoded[1:])
def _base58btc_decode(s: str) -> bytes:
"""Decode a base58btc string."""
alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
n = 0
for ch in s:
idx = alphabet.index(ch)
n = n * 58 + idx
# Compute byte length
byte_length = (n.bit_length() + 7) // 8
result = n.to_bytes(byte_length, "big") if byte_length > 0 else b""
# Preserve leading zeros
leading_zeros = len(s) - len(s.lstrip("1"))
return b"\x00" * leading_zeros + result
def resolve_did_key(did: str) -> PublicKey:
"""Resolve a did:key identifier to a public key.
Supports Ed25519 and P-256 key types. The did:key method is
self-contained — no external resolution is needed.
Reference: ACT §5.4 (Tier 3 DID:key).
Raises:
ACTKeyResolutionError: If the DID cannot be resolved.
"""
# Strip fragment if present (e.g., did:key:z6Mk...#z6Mk...)
did_base = did.split("#")[0]
if not did_base.startswith("did:key:"):
raise ACTKeyResolutionError(
f"Not a did:key identifier: {did!r}"
)
multibase_value = did_base[len("did:key:"):]
try:
decoded = _multibase_decode(multibase_value)
except Exception as e:
raise ACTKeyResolutionError(
f"Failed to decode did:key multibase value: {e}"
) from e
if decoded[:2] == _ED25519_MULTICODEC:
raw_key = decoded[2:]
if len(raw_key) != 32:
raise ACTKeyResolutionError(
f"Ed25519 public key must be 32 bytes, got {len(raw_key)}"
)
return Ed25519PublicKey.from_public_bytes(raw_key)
elif decoded[:2] == _P256_MULTICODEC:
raw_key = decoded[2:]
from cryptography.hazmat.primitives.asymmetric.ec import (
EllipticCurvePublicKey as ECPub,
)
from cryptography.hazmat.primitives.serialization import (
load_der_public_key,
)
# P-256 compressed point (33 bytes) or uncompressed (65 bytes)
# Wrap in SubjectPublicKeyInfo for loading
try:
return EllipticCurvePublicKey.from_encoded_point(
SECP256R1(), raw_key
)
except Exception as e:
raise ACTKeyResolutionError(
f"Failed to load P-256 key from did:key: {e}"
) from e
else:
raise ACTKeyResolutionError(
f"Unsupported multicodec prefix in did:key: {decoded[:2]!r}"
)
def did_key_from_ed25519(public_key: Ed25519PublicKey) -> str:
"""Create a did:key identifier from an Ed25519 public key.
Reference: ACT §5.4 (Tier 3 DID:key).
"""
from cryptography.hazmat.primitives.serialization import (
Encoding,
PublicFormat,
)
raw = public_key.public_bytes(Encoding.Raw, PublicFormat.Raw)
multicodec = _ED25519_MULTICODEC + raw
encoded = "z" + _base58btc_encode(multicodec)
return f"did:key:{encoded}"
def _base58btc_encode(data: bytes) -> str:
"""Encode bytes as base58btc."""
alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
# Count leading zeros
leading_zeros = 0
for b in data:
if b == 0:
leading_zeros += 1
else:
break
n = int.from_bytes(data, "big")
if n == 0:
return "1" * leading_zeros
chars: list[str] = []
while n > 0:
n, remainder = divmod(n, 58)
chars.append(alphabet[remainder])
return "1" * leading_zeros + "".join(reversed(chars))
class ACTKeyResolver:
"""Unified key resolver across all trust tiers.
Tries Tier 1 (pre-shared), then Tier 2 (X.509), then Tier 3 (DID)
to resolve a kid to a public key.
Reference: ACT §5 (Trust Model), §8.1 step 4.
"""
def __init__(
self,
registry: KeyRegistry | None = None,
x509_store: X509TrustStore | None = None,
did_web_resolver: DIDResolver | None = None,
) -> None:
self._registry = registry or KeyRegistry()
self._x509_store = x509_store or X509TrustStore()
self._did_web_resolver = did_web_resolver
@property
def registry(self) -> KeyRegistry:
"""Access the Tier 1 key registry."""
return self._registry
@property
def x509_store(self) -> X509TrustStore:
"""Access the Tier 2 X.509 trust store."""
return self._x509_store
def resolve(
self,
kid: str,
header: dict[str, Any] | None = None,
) -> PublicKey:
"""Resolve a kid to a public key, trying all configured tiers.
Resolution order:
1. Tier 1: Pre-shared key registry lookup by kid
2. Tier 2: X.509 certificate lookup by kid (thumbprint)
or x5c header chain validation
3. Tier 3: DID resolution (did:key or did:web)
Reference: ACT §5 (Trust Model), §8.1 step 4.
Raises:
ACTKeyResolutionError: If no key can be resolved for the kid.
"""
header = header or {}
# Tier 1: Pre-shared keys
key = self._registry.get(kid)
if key is not None:
return key
# Tier 2: X.509
key = self._x509_store.resolve(kid)
if key is not None:
return key
# Tier 2: x5c chain in header
x5c = header.get("x5c")
if x5c:
key = self._x509_store.resolve_x5c(x5c)
if key is not None:
return key
# Tier 3: DID
did_value = header.get("did") or kid
if did_value.startswith("did:key:"):
try:
return resolve_did_key(did_value)
except ACTKeyResolutionError:
pass
if did_value.startswith("did:web:") and self._did_web_resolver:
resolved = self._did_web_resolver(did_value)
if resolved is not None:
return resolved
raise ACTKeyResolutionError(
f"Cannot resolve kid {kid!r} to a public key via any trust tier"
)

View File

@@ -0,0 +1,136 @@
"""ACT DAG validation for Phase 2 execution records.
Validates the directed acyclic graph formed by pred (predecessor) references
in Phase 2 ACTs, ensuring uniqueness, predecessor existence, temporal ordering,
acyclicity, and capability consistency.
Reference: ACT §7 (DAG Structure and Causal Ordering).
"""
from __future__ import annotations
from typing import Protocol
from .errors import ACTCapabilityError, ACTDAGError
from .token import ACTRecord
# Maximum ancestor traversal limit for cycle detection — ACT §7.1 step 4.
MAX_TRAVERSAL_LIMIT: int = 10_000
# Clock skew tolerance for temporal ordering — ACT §7.1 step 3.
DAG_CLOCK_SKEW_TOLERANCE: int = 30
class ACTStore(Protocol):
"""Protocol for an ACT store used in DAG validation.
Any object implementing get() and has() can serve as the store.
The ACTLedger in ledger.py implements this protocol.
"""
def get(self, jti: str) -> ACTRecord | None:
"""Retrieve a Phase 2 ACT record by jti."""
...
def validate_dag(
record: ACTRecord,
store: ACTStore,
*,
clock_skew_tolerance: int = DAG_CLOCK_SKEW_TOLERANCE,
) -> None:
"""Validate the DAG constraints for a Phase 2 execution record.
Performs all five DAG validation checks defined in ACT §7.1:
1. jti uniqueness within wid scope (or globally)
2. Predecessor existence in store
3. Temporal ordering with clock skew tolerance
4. Acyclicity (max traversal limit)
5. Capability consistency (exec_act matches cap[].action)
Reference: ACT §7.1 (DAG Validation).
Args:
record: The Phase 2 ACTRecord to validate.
store: An ACT store providing get() for predecessor lookup.
clock_skew_tolerance: Seconds of allowed clock skew (default 30).
Raises:
ACTDAGError: If any DAG constraint is violated.
ACTCapabilityError: If exec_act does not match cap actions.
"""
# Step 1: jti uniqueness — ACT §7.1 step 1
existing = store.get(record.jti)
if existing is not None:
raise ACTDAGError(
f"Duplicate jti {record.jti!r} already exists in store"
)
# Step 5: Capability consistency — ACT §7.1 step 5
cap_actions = {c.action for c in record.cap}
if record.exec_act not in cap_actions:
raise ACTCapabilityError(
f"exec_act {record.exec_act!r} does not match any "
f"cap[].action: {sorted(cap_actions)}"
)
# Step 2 & 3: Predecessor existence and temporal ordering
for pred_jti in record.pred:
parent = store.get(pred_jti)
if parent is None:
raise ACTDAGError(
f"Predecessor jti {pred_jti!r} not found in store"
)
# Temporal ordering: predecessor.exec_ts < child.exec_ts + tolerance
if parent.exec_ts >= record.exec_ts + clock_skew_tolerance:
raise ACTDAGError(
f"Temporal ordering violation: predecessor {pred_jti!r} "
f"exec_ts={parent.exec_ts} >= child exec_ts="
f"{record.exec_ts} + tolerance={clock_skew_tolerance}"
)
# Step 4: Acyclicity — ACT §7.1 step 4
_check_acyclicity(record.jti, record.pred, store)
def _check_acyclicity(
current_jti: str,
pred_jtis: list[str],
store: ACTStore,
) -> None:
"""Check that following pred references does not lead back to current_jti.
Uses breadth-first traversal with a maximum node limit.
Reference: ACT §7.1 step 4.
Raises:
ACTDAGError: If a cycle is detected or traversal limit exceeded.
"""
visited: set[str] = set()
queue: list[str] = list(pred_jtis)
nodes_visited = 0
while queue:
if nodes_visited >= MAX_TRAVERSAL_LIMIT:
raise ACTDAGError(
f"DAG traversal limit ({MAX_TRAVERSAL_LIMIT}) exceeded; "
f"possible cycle or excessively deep DAG"
)
jti = queue.pop(0)
if jti == current_jti:
raise ACTDAGError(
f"DAG cycle detected: jti {current_jti!r} appears in "
f"its own ancestor chain"
)
if jti in visited:
continue
visited.add(jti)
nodes_visited += 1
parent = store.get(jti)
if parent is not None:
queue.extend(parent.pred)

View File

@@ -0,0 +1,333 @@
"""ACT delegation chain construction and verification.
Handles peer-to-peer delegation where Agent A authorizes Agent B
with reduced privileges, building a cryptographic chain of authority.
Reference: ACT §6 (Delegation Chain).
"""
from __future__ import annotations
import base64
from typing import Any
from .crypto import (
PrivateKey,
PublicKey,
compute_sha256,
sign as crypto_sign,
verify as crypto_verify,
)
from .errors import (
ACTDelegationError,
ACTPrivilegeEscalationError,
ACTValidationError,
)
from .token import (
ACTMandate,
Capability,
Delegation,
DelegationEntry,
_b64url_encode,
_b64url_decode,
encode_jws,
)
def create_delegated_mandate(
parent_mandate: ACTMandate,
parent_compact: str,
delegator_private_key: PrivateKey,
*,
sub: str,
kid: str,
iss: str,
aud: str | list[str],
iat: int,
exp: int,
jti: str,
cap: list[Capability],
task: Any,
alg: str = "EdDSA",
wid: str | None = None,
max_depth: int | None = None,
oversight: Any | None = None,
) -> tuple[ACTMandate, str]:
"""Create a delegated ACT mandate from a parent mandate.
Agent A (delegator) creates a new mandate for Agent B (sub) with
reduced privileges. The delegation chain is extended with a new
entry linking back to the parent ACT.
Reference: ACT §6.1 (Peer-to-Peer Delegation).
Args:
parent_mandate: The parent ACT that authorizes delegation.
parent_compact: JWS compact serialization of the parent ACT.
delegator_private_key: The delegator's private key for chain sig.
sub: Target agent identifier.
kid: Key identifier for the new mandate's signing key.
iss: Issuer identifier (the delegator).
aud: Audience for the new mandate.
iat: Issuance time.
exp: Expiration time.
jti: Unique identifier for the new mandate.
cap: Capabilities (must be subset of parent).
task: TaskClaim for the new mandate.
alg: Algorithm (default EdDSA).
wid: Workflow identifier (optional).
max_depth: Max delegation depth (must be <= parent's).
oversight: Oversight claim (optional).
Returns:
Tuple of (ACTMandate, needs to be signed by delegator).
Raises:
ACTDelegationError: If delegation depth would exceed max_depth.
ACTPrivilegeEscalationError: If cap exceeds parent capabilities.
"""
# Determine parent delegation state
if parent_mandate.delegation is not None:
parent_depth = parent_mandate.delegation.depth
parent_max_depth = parent_mandate.delegation.max_depth
parent_chain = list(parent_mandate.delegation.chain)
else:
# Root mandate without del claim — delegation not permitted
raise ACTDelegationError(
"Parent mandate has no 'del' claim; delegation is not permitted"
)
new_depth = parent_depth + 1
# Validate depth constraints — ACT §6.3 step 3
if new_depth > parent_max_depth:
raise ACTDelegationError(
f"Delegation depth {new_depth} exceeds max_depth {parent_max_depth}"
)
# Validate max_depth — ACT §6.1 step 4
if max_depth is None:
effective_max_depth = parent_max_depth
else:
if max_depth > parent_max_depth:
raise ACTDelegationError(
f"Requested max_depth {max_depth} exceeds parent max_depth "
f"{parent_max_depth}"
)
effective_max_depth = max_depth
# Validate capability subset — ACT §6.2
verify_capability_subset(parent_mandate.cap, cap)
# Compute chain entry signature — ACT §6.1 step 5
parent_hash = compute_sha256(parent_compact.encode("utf-8"))
chain_sig = crypto_sign(delegator_private_key, parent_hash)
chain_sig_b64 = _b64url_encode(chain_sig)
# Build new chain entry
new_entry = DelegationEntry(
delegator=iss,
jti=parent_mandate.jti,
sig=chain_sig_b64,
)
# Extend chain — ordered root → immediate parent
new_chain = parent_chain + [new_entry]
delegation = Delegation(
depth=new_depth,
max_depth=effective_max_depth,
chain=new_chain,
)
mandate = ACTMandate(
alg=alg,
kid=kid,
iss=iss,
sub=sub,
aud=aud,
iat=iat,
exp=exp,
jti=jti,
wid=wid if wid is not None else parent_mandate.wid,
task=task,
cap=cap,
delegation=delegation,
oversight=oversight,
)
return mandate, ""
def verify_capability_subset(
parent_caps: list[Capability],
child_caps: list[Capability],
) -> None:
"""Verify that child capabilities are a subset of parent capabilities.
Each child capability action must exist in the parent. Constraints
must be at least as restrictive.
Reference: ACT §6.2 (Privilege Reduction Requirements).
Raises:
ACTPrivilegeEscalationError: If child cap exceeds parent cap.
"""
parent_actions = {c.action: c for c in parent_caps}
for child_cap in child_caps:
if child_cap.action not in parent_actions:
raise ACTPrivilegeEscalationError(
f"Capability action {child_cap.action!r} not present in "
f"parent capabilities: {sorted(parent_actions.keys())}"
)
parent_cap = parent_actions[child_cap.action]
_verify_constraints_subset(
parent_cap.constraints, child_cap.constraints, child_cap.action
)
def _verify_constraints_subset(
parent_constraints: dict[str, Any] | None,
child_constraints: dict[str, Any] | None,
action: str,
) -> None:
"""Verify child constraints are at least as restrictive as parent.
Reference: ACT §6.2 (Privilege Reduction Requirements).
Rules:
- Numeric values: child must be <= parent (lower = more restrictive)
- data_sensitivity enum: child must be >= parent in ordering
- Unknown/domain-specific: must be byte-for-byte identical
Raises:
ACTPrivilegeEscalationError: If child constraint is less restrictive.
"""
if parent_constraints is None:
# Parent has no constraints — child may add constraints (more restrictive)
return
if child_constraints is None:
# Parent has constraints but child does not — escalation
raise ACTPrivilegeEscalationError(
f"Capability {action!r}: parent has constraints but child does not"
)
# Sensitivity ordering per ACT §6.2
_SENSITIVITY_ORDER = {
"public": 0,
"internal": 1,
"confidential": 2,
"restricted": 3,
}
for key, parent_val in parent_constraints.items():
if key not in child_constraints:
# Missing constraint in child = less restrictive
raise ACTPrivilegeEscalationError(
f"Capability {action!r}: constraint {key!r} present in "
f"parent but missing in child"
)
child_val = child_constraints[key]
if key == "data_sensitivity" or key == "data_classification_max":
# Enum comparison — higher = more restrictive
p_ord = _SENSITIVITY_ORDER.get(parent_val)
c_ord = _SENSITIVITY_ORDER.get(child_val)
if p_ord is not None and c_ord is not None:
if c_ord < p_ord:
raise ACTPrivilegeEscalationError(
f"Capability {action!r}: constraint {key!r} "
f"value {child_val!r} is less restrictive than "
f"parent value {parent_val!r}"
)
continue
if isinstance(parent_val, (int, float)) and isinstance(child_val, (int, float)):
# Numeric: lower/equal = more restrictive
if child_val > parent_val:
raise ACTPrivilegeEscalationError(
f"Capability {action!r}: numeric constraint {key!r} "
f"value {child_val} exceeds parent value {parent_val}"
)
continue
# Unknown/domain-specific: must be identical — ACT §6.2
if child_val != parent_val:
raise ACTPrivilegeEscalationError(
f"Capability {action!r}: constraint {key!r} value "
f"{child_val!r} differs from parent value {parent_val!r} "
f"(non-comparable constraints must be identical)"
)
def verify_delegation_chain(
mandate: ACTMandate,
resolve_key: Any,
resolve_parent_compact: Any | None = None,
) -> None:
"""Verify the delegation chain of a mandate.
Reference: ACT §6.3 (Delegation Verification).
Args:
mandate: The ACT mandate to verify.
resolve_key: Callable(delegator_id: str) -> PublicKey to resolve
the public key of a delegator.
resolve_parent_compact: Optional callable(jti: str) -> str|None
to retrieve the parent ACT compact form.
Required for full chain sig verification.
Raises:
ACTDelegationError: If the chain is structurally invalid.
ACTPrivilegeEscalationError: If capabilities were escalated.
"""
if mandate.delegation is None:
# No delegation — root mandate, nothing to verify
return
delegation = mandate.delegation
# Step 3: depth <= max_depth
if delegation.depth > delegation.max_depth:
raise ACTDelegationError(
f"Delegation depth {delegation.depth} exceeds "
f"max_depth {delegation.max_depth}"
)
# Step 4: chain length == depth
if len(delegation.chain) != delegation.depth:
raise ACTDelegationError(
f"Delegation chain length {len(delegation.chain)} does not "
f"match depth {delegation.depth}"
)
# Step 2: verify each chain entry
for i, entry in enumerate(delegation.chain):
# Step 2a: resolve delegator's public key
try:
pub_key = resolve_key(entry.delegator)
except Exception as e:
raise ACTDelegationError(
f"Cannot resolve key for delegator {entry.delegator!r} "
f"at chain index {i}: {e}"
) from e
# Step 2b: verify signature if parent compact is available
if resolve_parent_compact is not None:
parent_compact = resolve_parent_compact(entry.jti)
if parent_compact is not None:
parent_hash = compute_sha256(
parent_compact.encode("utf-8")
)
sig_bytes = _b64url_decode(entry.sig)
try:
crypto_verify(pub_key, sig_bytes, parent_hash)
except Exception as e:
raise ACTDelegationError(
f"Chain entry signature verification failed at "
f"index {i} (delegator={entry.delegator!r}): {e}"
) from e

View File

@@ -0,0 +1,131 @@
"""ACT-specific exception types.
All exceptions defined in this module correspond to specific failure
modes in the Agent Context Token lifecycle as defined in
draft-nennemann-act-01.
Reference: ACT §8 (Verification Procedure), §6 (Delegation Chain),
§7 (DAG Structure), §10 (Audit Ledger Interface).
"""
from __future__ import annotations
class ACTError(Exception):
"""Base exception for all ACT operations.
All ACT-specific exceptions inherit from this class, allowing
callers to catch any ACT error with a single except clause.
Reference: draft-nennemann-act-01.
"""
class ACTValidationError(ACTError):
"""Malformed token structure or invalid field values.
Raised when an ACT fails structural validation: missing required
claims, invalid claim types, unsupported algorithm ("none", HS*),
or invalid typ header.
Reference: ACT §4 (Token Format), §8.1 steps 2-3, 11.
"""
class ACTSignatureError(ACTError):
"""Signature verification failed.
Raised when a JWS signature cannot be verified against the
resolved public key, or when a Phase 2 token is signed by the
wrong key (e.g., iss key instead of sub key).
Reference: ACT §8.1 step 5, §8.2 step 17.
"""
class ACTExpiredError(ACTError):
"""Token has expired.
Raised when the current time exceeds exp + clock_skew_tolerance.
The default clock skew tolerance is 300 seconds.
Reference: ACT §8.1 step 6.
"""
class ACTAudienceMismatchError(ACTError):
"""The aud claim does not contain the verifier's identity.
Reference: ACT §8.1 step 8.
"""
class ACTCapabilityError(ACTError):
"""No matching capability or exec_act not in cap actions.
Raised when exec_act does not match any cap[].action value,
or when a requested action is not authorized by any capability.
Reference: ACT §8.2 step 13, §4.2.2 (cap).
"""
class ACTDelegationError(ACTError):
"""Delegation chain is invalid.
Raised when delegation chain verification fails: depth > max_depth,
chain length != depth, or any chain entry signature fails.
Reference: ACT §6 (Delegation Chain), §8.1 step 12.
"""
class ACTDAGError(ACTError):
"""DAG validation failed.
Raised on cycle detection, missing parent jti, temporal ordering
violations, or traversal limit exceeded.
Reference: ACT §7 (DAG Structure and Causal Ordering).
"""
class ACTPhaseError(ACTError):
"""Wrong phase for the requested operation.
Raised when a mandate is used where a record is expected, or
vice versa. Phase is determined by the presence of exec_act.
Reference: ACT §3 (Lifecycle), §8.
"""
class ACTKeyResolutionError(ACTError):
"""Cannot resolve kid to a public key.
Raised when the kid in the JOSE header cannot be resolved to a
public key via any configured trust tier (pre-shared, PKI, DID).
Reference: ACT §5 (Trust Model), §8.1 step 4.
"""
class ACTLedgerImmutabilityError(ACTError):
"""Attempt to modify or delete a ledger record.
The audit ledger enforces append-only semantics. Once appended,
a record cannot be modified or deleted.
Reference: ACT §10 (Audit Ledger Interface).
"""
class ACTPrivilegeEscalationError(ACTError):
"""Delegated capability exceeds parent capability.
Raised when a delegated ACT contains actions not present in the
parent ACT's cap array, or when constraints are less restrictive
than the parent's constraints.
Reference: ACT §6.2 (Privilege Reduction Requirements).
"""

View File

@@ -0,0 +1,152 @@
"""ACT in-memory append-only audit ledger.
Provides an in-memory reference implementation of the audit ledger
interface. Enforces append-only semantics and hash-chain integrity.
Reference: ACT §10 (Audit Ledger Interface).
"""
from __future__ import annotations
import hashlib
import json
from typing import Any
from .errors import ACTLedgerImmutabilityError
from .token import ACTRecord
class ACTLedger:
"""In-memory append-only audit ledger for ACT execution records.
Records are stored in insertion order with monotonically increasing
sequence numbers. A hash chain provides integrity verification.
Reference: ACT §10.
This is a reference implementation suitable for testing. Production
deployments should use a persistent backend implementing the same
interface.
"""
def __init__(self) -> None:
self._records: list[tuple[int, ACTRecord, str]] = []
# jti → index mapping for efficient lookup
self._jti_index: dict[str, int] = {}
# wid → list of indices for workflow queries
self._wid_index: dict[str | None, list[int]] = {}
self._seq_counter: int = 0
# Hash chain: each entry's hash includes the previous hash
self._chain_hashes: list[bytes] = []
def append(self, act_record: ACTRecord) -> int:
"""Append an execution record to the ledger.
Returns the sequence number assigned to the record.
Reference: ACT §10, requirement 1 (append-only), requirement 2 (ordering).
Raises:
ACTLedgerImmutabilityError: If a record with the same jti
already exists.
"""
if act_record.jti in self._jti_index:
raise ACTLedgerImmutabilityError(
f"Record with jti {act_record.jti!r} already exists in ledger"
)
seq = self._seq_counter
self._seq_counter += 1
# Compute hash chain entry
record_hash = self._hash_record(act_record, seq)
if self._chain_hashes:
chained = hashlib.sha256(
self._chain_hashes[-1] + record_hash
).digest()
else:
chained = record_hash
self._chain_hashes.append(chained)
idx = len(self._records)
self._records.append((seq, act_record, act_record.jti))
self._jti_index[act_record.jti] = idx
wid = act_record.wid
if wid not in self._wid_index:
self._wid_index[wid] = []
self._wid_index[wid].append(idx)
return seq
def get(self, jti: str) -> ACTRecord | None:
"""Retrieve a record by jti.
Reference: ACT §10, requirement 3 (lookup).
"""
idx = self._jti_index.get(jti)
if idx is None:
return None
return self._records[idx][1]
def list(self, wid: str | None = None) -> list[ACTRecord]:
"""List records, optionally filtered by workflow id.
If wid is None, returns all records. If wid is a string,
returns only records with that wid value.
Reference: ACT §10.
"""
if wid is None:
return [r[1] for r in self._records]
indices = self._wid_index.get(wid, [])
return [self._records[i][1] for i in indices]
def verify_integrity(self) -> bool:
"""Verify the hash chain integrity of the ledger.
Recomputes the hash chain from scratch and compares against
stored chain hashes. Returns True if all hashes match.
Reference: ACT §10, requirement 4 (integrity).
"""
if not self._records:
return True
prev_hash: bytes | None = None
for i, (seq, record, _jti) in enumerate(self._records):
record_hash = self._hash_record(record, seq)
if prev_hash is not None:
expected = hashlib.sha256(prev_hash + record_hash).digest()
else:
expected = record_hash
if i >= len(self._chain_hashes):
return False
if self._chain_hashes[i] != expected:
return False
prev_hash = expected
return True
def __len__(self) -> int:
return len(self._records)
def _hash_record(self, record: ACTRecord, seq: int) -> bytes:
"""Compute a deterministic hash of a record for chain integrity."""
claims = record.to_claims()
# Include sequence number in hash for ordering integrity
claims["_seq"] = seq
canonical = json.dumps(claims, sort_keys=True, separators=(",", ":"))
return hashlib.sha256(canonical.encode("utf-8")).digest()
def _immutable_guard(self) -> None:
"""Internal method — not callable externally.
The ledger has no update/delete methods by design.
This exists to make the intent explicit.
"""
raise ACTLedgerImmutabilityError(
"Ledger records cannot be modified or deleted"
)

View File

@@ -0,0 +1,96 @@
"""ACT Phase 1 to Phase 2 transition logic.
Handles the transition from Authorization Mandate to Execution Record,
including re-signing by the executing agent (sub).
Reference: ACT §3.2, §3.3 (Lifecycle State Machine).
"""
from __future__ import annotations
import time
from typing import Any
from .crypto import PrivateKey, sign as crypto_sign
from .errors import ACTCapabilityError, ACTPhaseError, ACTValidationError
from .token import (
ACTMandate,
ACTRecord,
ErrorClaim,
encode_jws,
)
def transition_to_record(
mandate: ACTMandate,
*,
sub_kid: str,
sub_private_key: PrivateKey,
exec_act: str,
pred: list[str] | None = None,
exec_ts: int | None = None,
status: str = "completed",
inp_hash: str | None = None,
out_hash: str | None = None,
err: ErrorClaim | None = None,
) -> tuple[ACTRecord, str]:
"""Transition a Phase 1 mandate to a Phase 2 execution record.
The executing agent (sub) adds execution claims and re-signs the
complete token with its own private key. The kid in the Phase 2
JOSE header MUST reference sub's key, not iss's key.
All Phase 1 claims are preserved unchanged in the Phase 2 token.
Reference: ACT §3.2, §8.2 step 17.
Args:
mandate: The Phase 1 ACTMandate to transition.
sub_kid: The kid for the sub agent's signing key.
sub_private_key: The sub agent's private key for re-signing.
exec_act: The action actually performed (must match a cap[].action).
pred: Predecessor task jti values (DAG dependencies). Empty list for root tasks.
exec_ts: Execution timestamp (defaults to current time).
status: Execution status: "completed", "failed", or "partial".
inp_hash: Base64url SHA-256 hash of input data (optional).
out_hash: Base64url SHA-256 hash of output data (optional).
err: Error details when status is "failed" or "partial".
Returns:
Tuple of (ACTRecord, JWS compact serialization string).
Raises:
ACTPhaseError: If the mandate is already a Phase 2 token.
ACTCapabilityError: If exec_act does not match any cap[].action.
ACTValidationError: If the resulting record fails validation.
"""
if mandate.is_phase2():
raise ACTPhaseError("Cannot transition: token is already Phase 2")
# Verify exec_act matches a capability
cap_actions = {c.action for c in mandate.cap}
if exec_act not in cap_actions:
raise ACTCapabilityError(
f"exec_act {exec_act!r} does not match any cap[].action: "
f"{sorted(cap_actions)}"
)
record = ACTRecord.from_mandate(
mandate,
kid=sub_kid,
exec_act=exec_act,
pred=pred if pred is not None else [],
exec_ts=exec_ts if exec_ts is not None else int(time.time()),
status=status,
inp_hash=inp_hash,
out_hash=out_hash,
err=err,
)
record.validate()
# Re-sign with sub's private key
signature = crypto_sign(sub_private_key, record.signing_input())
compact = encode_jws(record, signature)
return record, compact

View File

@@ -0,0 +1,734 @@
"""ACT token structures and JWS Compact Serialization.
Defines ACTMandate (Phase 1) and ACTRecord (Phase 2) dataclasses,
plus JWS encoding/decoding primitives for ACT tokens.
Reference: ACT §3 (Lifecycle), §4 (Token Format).
"""
from __future__ import annotations
import base64
import json
import re
import time
import uuid
from dataclasses import dataclass, field
from typing import Any
from .errors import ACTPhaseError, ACTValidationError
# Allowed algorithms per ACT §4.1 — symmetric and "none" are forbidden.
ALLOWED_ALGORITHMS: frozenset[str] = frozenset({"EdDSA", "ES256"})
# Forbidden algorithm prefixes/values per ACT §4.1.
_FORBIDDEN_ALGORITHMS: frozenset[str] = frozenset({
"none", "HS256", "HS384", "HS512",
})
# Required typ value per ACT §4.1.
ACT_TYP: str = "act+jwt"
# ABNF for action names: component *("." component)
# component = ALPHA *(ALPHA / DIGIT / "-" / "_")
_ACTION_RE = re.compile(
r"^[A-Za-z][A-Za-z0-9\-_]*(?:\.[A-Za-z][A-Za-z0-9\-_]*)*$"
)
def _b64url_encode(data: bytes) -> str:
"""Base64url encode without padding (RFC 7515 §2)."""
return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")
def _b64url_decode(s: str) -> bytes:
"""Base64url decode with padding restoration."""
s = s + "=" * (-len(s) % 4)
return base64.urlsafe_b64decode(s)
def validate_action_name(action: str) -> None:
"""Validate an action name against ACT ABNF grammar.
Reference: ACT §4.2.2 (cap action names).
Raises:
ACTValidationError: If action does not match the required grammar.
"""
if not _ACTION_RE.match(action):
raise ACTValidationError(
f"Action name {action!r} does not conform to ACT ABNF grammar"
)
@dataclass(frozen=True)
class TaskClaim:
"""The 'task' claim object.
Reference: ACT §4.2.2.
"""
purpose: str
data_sensitivity: str | None = None
created_by: str | None = None
expires_at: int | None = None
def to_dict(self) -> dict[str, Any]:
d: dict[str, Any] = {"purpose": self.purpose}
if self.data_sensitivity is not None:
d["data_sensitivity"] = self.data_sensitivity
if self.created_by is not None:
d["created_by"] = self.created_by
if self.expires_at is not None:
d["expires_at"] = self.expires_at
return d
@classmethod
def from_dict(cls, d: dict[str, Any]) -> TaskClaim:
if "purpose" not in d:
raise ACTValidationError("task.purpose is required")
return cls(
purpose=d["purpose"],
data_sensitivity=d.get("data_sensitivity"),
created_by=d.get("created_by"),
expires_at=d.get("expires_at"),
)
@dataclass(frozen=True)
class Capability:
"""A single capability entry in the 'cap' array.
Reference: ACT §4.2.2.
"""
action: str
constraints: dict[str, Any] | None = None
def __post_init__(self) -> None:
validate_action_name(self.action)
def to_dict(self) -> dict[str, Any]:
d: dict[str, Any] = {"action": self.action}
if self.constraints is not None:
d["constraints"] = self.constraints
return d
@classmethod
def from_dict(cls, d: dict[str, Any]) -> Capability:
if "action" not in d:
raise ACTValidationError("cap[].action is required")
return cls(
action=d["action"],
constraints=d.get("constraints"),
)
@dataclass(frozen=True)
class DelegationEntry:
"""A single entry in del.chain.
Reference: ACT §4.2.2 (del), §6 (Delegation Chain).
"""
delegator: str
jti: str
sig: str
def to_dict(self) -> dict[str, str]:
return {"delegator": self.delegator, "jti": self.jti, "sig": self.sig}
@classmethod
def from_dict(cls, d: dict[str, Any]) -> DelegationEntry:
for key in ("delegator", "jti", "sig"):
if key not in d:
raise ACTValidationError(f"del.chain[].{key} is required")
return cls(
delegator=d["delegator"], jti=d["jti"], sig=d["sig"]
)
@dataclass(frozen=True)
class Delegation:
"""The 'del' claim object.
Reference: ACT §4.2.2 (del), §6 (Delegation Chain).
"""
depth: int
max_depth: int
chain: list[DelegationEntry] = field(default_factory=list)
def to_dict(self) -> dict[str, Any]:
return {
"depth": self.depth,
"max_depth": self.max_depth,
"chain": [e.to_dict() for e in self.chain],
}
@classmethod
def from_dict(cls, d: dict[str, Any]) -> Delegation:
for key in ("depth", "max_depth"):
if key not in d:
raise ACTValidationError(f"del.{key} is required")
chain_raw = d.get("chain", [])
chain = [DelegationEntry.from_dict(e) for e in chain_raw]
return cls(depth=d["depth"], max_depth=d["max_depth"], chain=chain)
@dataclass(frozen=True)
class Oversight:
"""The 'oversight' claim object.
Reference: ACT §4.2.2 (oversight).
"""
requires_approval_for: list[str] = field(default_factory=list)
approval_ref: str | None = None
def to_dict(self) -> dict[str, Any]:
d: dict[str, Any] = {
"requires_approval_for": self.requires_approval_for
}
if self.approval_ref is not None:
d["approval_ref"] = self.approval_ref
return d
@classmethod
def from_dict(cls, d: dict[str, Any]) -> Oversight:
return cls(
requires_approval_for=d.get("requires_approval_for", []),
approval_ref=d.get("approval_ref"),
)
@dataclass(frozen=True)
class ErrorClaim:
"""The 'err' claim object for failed/partial execution.
Reference: ACT §4.3.
"""
code: str
detail: str
def to_dict(self) -> dict[str, str]:
return {"code": self.code, "detail": self.detail}
@classmethod
def from_dict(cls, d: dict[str, Any]) -> ErrorClaim:
for key in ("code", "detail"):
if key not in d:
raise ACTValidationError(f"err.{key} is required")
return cls(code=d["code"], detail=d["detail"])
@dataclass
class ACTMandate:
"""Phase 1 Authorization Mandate.
Represents a signed authorization from an issuing agent to a target
agent, encoding capabilities, constraints, and delegation provenance.
Reference: ACT §3.1, §4.1, §4.2.
"""
# JOSE header fields
alg: str
kid: str
x5c: list[str] | None = None
did: str | None = None
# Required JWT claims
iss: str = ""
sub: str = ""
aud: str | list[str] = ""
iat: int = 0
exp: int = 0
jti: str = field(default_factory=lambda: str(uuid.uuid4()))
# Optional standard claims
wid: str | None = None
# Required ACT claims
task: TaskClaim = field(default_factory=lambda: TaskClaim(purpose=""))
cap: list[Capability] = field(default_factory=list)
# Optional ACT claims
delegation: Delegation | None = None
oversight: Oversight | None = None
def validate(self) -> None:
"""Validate structural correctness of this mandate.
Reference: ACT §4.1, §4.2, §8.1 step 11.
Raises:
ACTValidationError: If any required field is missing or invalid.
"""
_validate_algorithm(self.alg)
if not self.kid:
raise ACTValidationError("kid is required in JOSE header")
for claim_name in ("iss", "sub", "aud", "jti"):
val = getattr(self, claim_name)
if not val:
raise ACTValidationError(f"{claim_name} claim is required")
if self.iat <= 0:
raise ACTValidationError("iat must be a positive NumericDate")
if self.exp <= 0:
raise ACTValidationError("exp must be a positive NumericDate")
if not self.task.purpose:
raise ACTValidationError("task.purpose is required")
if not self.cap:
raise ACTValidationError("cap must contain at least one capability")
def to_header(self) -> dict[str, Any]:
"""Build JOSE header dict.
Reference: ACT §4.1.
"""
h: dict[str, Any] = {
"alg": self.alg,
"typ": ACT_TYP,
"kid": self.kid,
}
if self.x5c is not None:
h["x5c"] = self.x5c
if self.did is not None:
h["did"] = self.did
return h
def to_claims(self) -> dict[str, Any]:
"""Build JWT claims dict (Phase 1 claims only).
Reference: ACT §4.2.
"""
c: dict[str, Any] = {
"iss": self.iss,
"sub": self.sub,
"aud": self.aud,
"iat": self.iat,
"exp": self.exp,
"jti": self.jti,
"task": self.task.to_dict(),
"cap": [cap.to_dict() for cap in self.cap],
}
if self.wid is not None:
c["wid"] = self.wid
if self.delegation is not None:
c["del"] = self.delegation.to_dict()
if self.oversight is not None:
c["oversight"] = self.oversight.to_dict()
return c
def signing_input(self) -> bytes:
"""Compute the JWS signing input (header.payload) as bytes.
Reference: RFC 7515 §5.1.
"""
header_b64 = _b64url_encode(
json.dumps(self.to_header(), separators=(",", ":")).encode()
)
payload_b64 = _b64url_encode(
json.dumps(self.to_claims(), separators=(",", ":")).encode()
)
return f"{header_b64}.{payload_b64}".encode("ascii")
def is_phase2(self) -> bool:
"""Return False; mandates are always Phase 1."""
return False
@classmethod
def from_claims(
cls,
header: dict[str, Any],
claims: dict[str, Any],
) -> ACTMandate:
"""Construct an ACTMandate from parsed header and claims dicts.
Reference: ACT §4.1, §4.2.
Raises:
ACTValidationError: If required fields are missing.
ACTPhaseError: If exec_act is present (this is a Phase 2 token).
"""
if "exec_act" in claims:
raise ACTPhaseError(
"Token contains exec_act; use ACTRecord.from_claims instead"
)
del_raw = claims.get("del")
delegation = Delegation.from_dict(del_raw) if del_raw else None
oversight_raw = claims.get("oversight")
oversight_obj = Oversight.from_dict(oversight_raw) if oversight_raw else None
task_raw = claims.get("task")
if task_raw is None:
raise ACTValidationError("task claim is required")
cap_raw = claims.get("cap")
if cap_raw is None:
raise ACTValidationError("cap claim is required")
return cls(
alg=header.get("alg", ""),
kid=header.get("kid", ""),
x5c=header.get("x5c"),
did=header.get("did"),
iss=claims.get("iss", ""),
sub=claims.get("sub", ""),
aud=claims.get("aud", ""),
iat=claims.get("iat", 0),
exp=claims.get("exp", 0),
jti=claims.get("jti", ""),
wid=claims.get("wid"),
task=TaskClaim.from_dict(task_raw),
cap=[Capability.from_dict(c) for c in cap_raw],
delegation=delegation,
oversight=oversight_obj,
)
@dataclass
class ACTRecord:
"""Phase 2 Execution Record.
Contains all Phase 1 claims preserved unchanged, plus execution
claims added by the executing agent. Re-signed by sub's key.
Reference: ACT §3.2, §4.3.
"""
# JOSE header fields (Phase 2 header uses sub's kid)
alg: str
kid: str
x5c: list[str] | None = None
did: str | None = None
# Phase 1 claims (preserved)
iss: str = ""
sub: str = ""
aud: str | list[str] = ""
iat: int = 0
exp: int = 0
jti: str = ""
wid: str | None = None
task: TaskClaim = field(default_factory=lambda: TaskClaim(purpose=""))
cap: list[Capability] = field(default_factory=list)
delegation: Delegation | None = None
oversight: Oversight | None = None
# Phase 2 claims (execution)
exec_act: str = ""
pred: list[str] = field(default_factory=list)
exec_ts: int = 0
status: str = ""
inp_hash: str | None = None
out_hash: str | None = None
err: ErrorClaim | None = None
def validate(self) -> None:
"""Validate structural correctness of this record.
Reference: ACT §4.3, §8.2 steps 13-16.
Raises:
ACTValidationError: If any required field is missing or invalid.
"""
_validate_algorithm(self.alg)
if not self.kid:
raise ACTValidationError("kid is required in JOSE header")
for claim_name in ("iss", "sub", "aud", "jti"):
val = getattr(self, claim_name)
if not val:
raise ACTValidationError(f"{claim_name} claim is required")
if self.iat <= 0:
raise ACTValidationError("iat must be a positive NumericDate")
if self.exp <= 0:
raise ACTValidationError("exp must be a positive NumericDate")
if not self.task.purpose:
raise ACTValidationError("task.purpose is required")
if not self.cap:
raise ACTValidationError("cap must contain at least one capability")
if not self.exec_act:
raise ACTValidationError("exec_act is required in Phase 2")
validate_action_name(self.exec_act)
if self.exec_ts <= 0:
raise ACTValidationError("exec_ts must be a positive NumericDate")
if self.status not in ("completed", "failed", "partial"):
raise ACTValidationError(
f"status must be one of completed/failed/partial, got {self.status!r}"
)
def to_header(self) -> dict[str, Any]:
"""Build JOSE header dict for Phase 2.
In Phase 2, kid MUST reference the sub agent's key.
Reference: ACT §4.1, §8.2 step 17.
"""
h: dict[str, Any] = {
"alg": self.alg,
"typ": ACT_TYP,
"kid": self.kid,
}
if self.x5c is not None:
h["x5c"] = self.x5c
if self.did is not None:
h["did"] = self.did
return h
def to_claims(self) -> dict[str, Any]:
"""Build JWT claims dict (Phase 1 + Phase 2 claims).
Reference: ACT §4.2, §4.3.
"""
c: dict[str, Any] = {
"iss": self.iss,
"sub": self.sub,
"aud": self.aud,
"iat": self.iat,
"exp": self.exp,
"jti": self.jti,
"task": self.task.to_dict(),
"cap": [cap.to_dict() for cap in self.cap],
"exec_act": self.exec_act,
"pred": self.pred,
"exec_ts": self.exec_ts,
"status": self.status,
}
if self.wid is not None:
c["wid"] = self.wid
if self.delegation is not None:
c["del"] = self.delegation.to_dict()
if self.oversight is not None:
c["oversight"] = self.oversight.to_dict()
if self.inp_hash is not None:
c["inp_hash"] = self.inp_hash
if self.out_hash is not None:
c["out_hash"] = self.out_hash
if self.err is not None:
c["err"] = self.err.to_dict()
return c
def signing_input(self) -> bytes:
"""Compute the JWS signing input (header.payload) as bytes.
Reference: RFC 7515 §5.1.
"""
header_b64 = _b64url_encode(
json.dumps(self.to_header(), separators=(",", ":")).encode()
)
payload_b64 = _b64url_encode(
json.dumps(self.to_claims(), separators=(",", ":")).encode()
)
return f"{header_b64}.{payload_b64}".encode("ascii")
def is_phase2(self) -> bool:
"""Return True; records are always Phase 2."""
return True
@classmethod
def from_mandate(
cls,
mandate: ACTMandate,
*,
kid: str,
exec_act: str,
pred: list[str] | None = None,
exec_ts: int | None = None,
status: str = "completed",
inp_hash: str | None = None,
out_hash: str | None = None,
err: ErrorClaim | None = None,
) -> ACTRecord:
"""Create an ACTRecord by transitioning a mandate to Phase 2.
The kid MUST be the sub agent's key identifier.
Reference: ACT §3.2, §4.3.
"""
return cls(
alg=mandate.alg,
kid=kid,
x5c=mandate.x5c,
did=mandate.did,
iss=mandate.iss,
sub=mandate.sub,
aud=mandate.aud,
iat=mandate.iat,
exp=mandate.exp,
jti=mandate.jti,
wid=mandate.wid,
task=mandate.task,
cap=mandate.cap,
delegation=mandate.delegation,
oversight=mandate.oversight,
exec_act=exec_act,
pred=pred if pred is not None else [],
exec_ts=exec_ts if exec_ts is not None else int(time.time()),
status=status,
inp_hash=inp_hash,
out_hash=out_hash,
err=err,
)
@classmethod
def from_claims(
cls,
header: dict[str, Any],
claims: dict[str, Any],
) -> ACTRecord:
"""Construct an ACTRecord from parsed header and claims dicts.
Reference: ACT §4.1, §4.2, §4.3.
Raises:
ACTValidationError: If required fields are missing.
ACTPhaseError: If exec_act is absent (this is a Phase 1 token).
"""
if "exec_act" not in claims:
raise ACTPhaseError(
"Token does not contain exec_act; use ACTMandate.from_claims instead"
)
del_raw = claims.get("del")
delegation = Delegation.from_dict(del_raw) if del_raw else None
oversight_raw = claims.get("oversight")
oversight_obj = Oversight.from_dict(oversight_raw) if oversight_raw else None
task_raw = claims.get("task")
if task_raw is None:
raise ACTValidationError("task claim is required")
cap_raw = claims.get("cap")
if cap_raw is None:
raise ACTValidationError("cap claim is required")
err_raw = claims.get("err")
err_obj = ErrorClaim.from_dict(err_raw) if err_raw else None
return cls(
alg=header.get("alg", ""),
kid=header.get("kid", ""),
x5c=header.get("x5c"),
did=header.get("did"),
iss=claims.get("iss", ""),
sub=claims.get("sub", ""),
aud=claims.get("aud", ""),
iat=claims.get("iat", 0),
exp=claims.get("exp", 0),
jti=claims.get("jti", ""),
wid=claims.get("wid"),
task=TaskClaim.from_dict(task_raw),
cap=[Capability.from_dict(c) for c in cap_raw],
delegation=delegation,
oversight=oversight_obj,
exec_act=claims["exec_act"],
pred=claims.get("pred", []),
exec_ts=claims.get("exec_ts", 0),
status=claims.get("status", ""),
inp_hash=claims.get("inp_hash"),
out_hash=claims.get("out_hash"),
err=err_obj,
)
# --- JWS Compact Serialization ---
def encode_jws(
token: ACTMandate | ACTRecord,
signature: bytes,
) -> str:
"""Encode a token and signature as JWS Compact Serialization.
Returns header.payload.signature (three base64url segments).
Reference: RFC 7515 §3.1, ACT §4.
"""
signing_input = token.signing_input().decode("ascii")
sig_b64 = _b64url_encode(signature)
return f"{signing_input}.{sig_b64}"
def decode_jws(compact: str) -> tuple[dict[str, Any], dict[str, Any], bytes, bytes]:
"""Decode a JWS Compact Serialization string.
Returns (header_dict, claims_dict, signature_bytes, signing_input_bytes).
Reference: RFC 7515 §5.2, ACT §4.
Raises:
ACTValidationError: If the token is malformed.
"""
parts = compact.split(".")
if len(parts) != 3:
raise ACTValidationError(
f"JWS Compact Serialization requires 3 parts, got {len(parts)}"
)
try:
header = json.loads(_b64url_decode(parts[0]))
except (json.JSONDecodeError, Exception) as e:
raise ACTValidationError(f"Invalid JOSE header: {e}") from e
try:
claims = json.loads(_b64url_decode(parts[1]))
except (json.JSONDecodeError, Exception) as e:
raise ACTValidationError(f"Invalid JWT claims: {e}") from e
try:
signature = _b64url_decode(parts[2])
except Exception as e:
raise ACTValidationError(f"Invalid signature encoding: {e}") from e
signing_input = f"{parts[0]}.{parts[1]}".encode("ascii")
# Validate header requirements per ACT §4.1
typ = header.get("typ")
if typ != ACT_TYP:
raise ACTValidationError(
f"typ must be {ACT_TYP!r}, got {typ!r}"
)
alg = header.get("alg", "")
_validate_algorithm(alg)
if "kid" not in header:
raise ACTValidationError("kid is required in JOSE header")
return header, claims, signature, signing_input
def parse_token(compact: str) -> ACTMandate | ACTRecord:
"""Parse a JWS compact string into an ACTMandate or ACTRecord.
Determines phase by presence of exec_act claim.
Reference: ACT §3 (phase determination).
Returns:
ACTMandate for Phase 1, ACTRecord for Phase 2.
"""
header, claims, _, _ = decode_jws(compact)
if "exec_act" in claims:
return ACTRecord.from_claims(header, claims)
return ACTMandate.from_claims(header, claims)
def _validate_algorithm(alg: str) -> None:
"""Check algorithm is allowed per ACT §4.1.
Raises:
ACTValidationError: If algorithm is forbidden or unsupported.
"""
if alg in _FORBIDDEN_ALGORITHMS or alg.upper() in _FORBIDDEN_ALGORITHMS:
raise ACTValidationError(
f"Algorithm {alg!r} is forbidden by ACT specification"
)
if alg not in ALLOWED_ALGORITHMS:
raise ACTValidationError(
f"Unsupported algorithm {alg!r}; allowed: {sorted(ALLOWED_ALGORITHMS)}"
)

View File

@@ -0,0 +1,639 @@
"""ACT Appendix B test vectors.
Generates and validates all 15 test vectors from Appendix B of
draft-nennemann-act-01. Each vector includes description, input
parameters, and expected output or exception.
Reference: ACT Appendix B (Test Vectors).
"""
from __future__ import annotations
import time
import uuid
from dataclasses import dataclass, field
from typing import Any
from .crypto import (
ACTKeyResolver,
KeyRegistry,
PrivateKey,
PublicKey,
b64url_sha256,
compute_sha256,
generate_ed25519_keypair,
sign as crypto_sign,
verify as crypto_verify,
)
from .dag import validate_dag
from .delegation import create_delegated_mandate, verify_capability_subset
from .errors import (
ACTAudienceMismatchError,
ACTCapabilityError,
ACTDAGError,
ACTDelegationError,
ACTExpiredError,
ACTPrivilegeEscalationError,
ACTSignatureError,
ACTValidationError,
)
from .ledger import ACTLedger
from .lifecycle import transition_to_record
from .token import (
ACTMandate,
ACTRecord,
Capability,
Delegation,
DelegationEntry,
ErrorClaim,
Oversight,
TaskClaim,
_b64url_encode,
decode_jws,
encode_jws,
)
from .verify import ACTVerifier
@dataclass
class TestVector:
"""A single test vector."""
id: str
description: str
valid: bool
expected_exception: type[Exception] | None = None
compact: str = ""
record: ACTMandate | ACTRecord | None = None
def generate_vectors() -> tuple[list[TestVector], dict[str, Any]]:
"""Generate all Appendix B test vectors.
Returns a list of TestVector objects and a context dict containing
keys and other state needed for validation.
Reference: ACT Appendix B.
"""
# Fixed timestamp for deterministic vectors
base_time = 1772064000
# Generate key pairs for test agents
iss_priv, iss_pub = generate_ed25519_keypair()
sub_priv, sub_pub = generate_ed25519_keypair()
agent_c_priv, agent_c_pub = generate_ed25519_keypair()
# Fixed JTIs for cross-referencing
jti_b1 = "550e8400-e29b-41d4-a716-446655440001"
jti_b2 = "550e8400-e29b-41d4-a716-446655440002"
jti_b3_parent1 = "550e8400-e29b-41d4-a716-446655440003"
jti_b3_parent2 = "550e8400-e29b-41d4-a716-446655440004"
jti_b3 = "550e8400-e29b-41d4-a716-446655440005"
jti_b4 = "550e8400-e29b-41d4-a716-446655440006"
jti_b5 = "550e8400-e29b-41d4-a716-446655440007"
wid = "a0b1c2d3-e4f5-6789-abcd-ef0123456789"
# Key registry
registry = KeyRegistry()
registry.register("iss-key", iss_pub)
registry.register("sub-key", sub_pub)
registry.register("agent-c-key", agent_c_pub)
resolver = ACTKeyResolver(registry=registry)
vectors: list[TestVector] = []
compacts: dict[str, str] = {} # jti → compact for delegation refs
# --- B.1: Phase 1 — Root mandate, Tier 1, Ed25519, no delegation ---
mandate_b1 = ACTMandate(
alg="EdDSA",
kid="iss-key",
iss="agent-issuer",
sub="agent-subject",
aud=["agent-subject", "https://ledger.example.com"],
iat=base_time,
exp=base_time + 900,
jti=jti_b1,
wid=wid,
task=TaskClaim(
purpose="validate_data",
data_sensitivity="restricted",
),
cap=[
Capability(action="read.data", constraints={"max_records": 10}),
Capability(action="write.result"),
],
delegation=Delegation(depth=0, max_depth=2, chain=[]),
)
mandate_b1.validate()
sig_b1 = crypto_sign(iss_priv, mandate_b1.signing_input())
compact_b1 = encode_jws(mandate_b1, sig_b1)
compacts[jti_b1] = compact_b1
vectors.append(TestVector(
id="B.1",
description="Phase 1 ACT — root mandate, Tier 1 (Ed25519), no delegation",
valid=True,
compact=compact_b1,
record=mandate_b1,
))
# --- B.2: Phase 2 — Completed execution from B.1 ---
record_b2, compact_b2 = transition_to_record(
mandate_b1,
sub_kid="sub-key",
sub_private_key=sub_priv,
exec_act="read.data",
pred=[],
exec_ts=base_time + 300,
status="completed",
inp_hash=b64url_sha256(b"test input data"),
out_hash=b64url_sha256(b"test output data"),
)
compacts[jti_b2] = compact_b2
vectors.append(TestVector(
id="B.2",
description="Phase 2 ACT — completed execution, transition from B.1 mandate",
valid=True,
compact=compact_b2,
record=record_b2,
))
# --- B.3: Phase 2 — Fan-in, two parent jti values ---
# Create two parent records first
parent1_mandate = ACTMandate(
alg="EdDSA", kid="iss-key",
iss="agent-issuer", sub="agent-subject",
aud="agent-subject",
iat=base_time, exp=base_time + 900,
jti=jti_b3_parent1, wid=wid,
task=TaskClaim(purpose="branch_a"),
cap=[Capability(action="compute.result")],
delegation=Delegation(depth=0, max_depth=1, chain=[]),
)
sig_p1 = crypto_sign(iss_priv, parent1_mandate.signing_input())
compact_p1 = encode_jws(parent1_mandate, sig_p1)
parent1_record, parent1_compact = transition_to_record(
parent1_mandate, sub_kid="sub-key", sub_private_key=sub_priv,
exec_act="compute.result", pred=[], exec_ts=base_time + 100,
status="completed",
)
parent2_mandate = ACTMandate(
alg="EdDSA", kid="iss-key",
iss="agent-issuer", sub="agent-subject",
aud="agent-subject",
iat=base_time, exp=base_time + 900,
jti=jti_b3_parent2, wid=wid,
task=TaskClaim(purpose="branch_b"),
cap=[Capability(action="compute.result")],
delegation=Delegation(depth=0, max_depth=1, chain=[]),
)
sig_p2 = crypto_sign(iss_priv, parent2_mandate.signing_input())
compact_p2 = encode_jws(parent2_mandate, sig_p2)
parent2_record, parent2_compact = transition_to_record(
parent2_mandate, sub_kid="sub-key", sub_private_key=sub_priv,
exec_act="compute.result", pred=[], exec_ts=base_time + 150,
status="completed",
)
# Fan-in record depends on both parents
fanin_mandate = ACTMandate(
alg="EdDSA", kid="iss-key",
iss="agent-issuer", sub="agent-subject",
aud="agent-subject",
iat=base_time, exp=base_time + 900,
jti=jti_b3, wid=wid,
task=TaskClaim(purpose="merge_results"),
cap=[Capability(action="compute.result")],
delegation=Delegation(depth=0, max_depth=1, chain=[]),
)
sig_fi = crypto_sign(iss_priv, fanin_mandate.signing_input())
fanin_record, fanin_compact = transition_to_record(
fanin_mandate, sub_kid="sub-key", sub_private_key=sub_priv,
exec_act="compute.result",
pred=[jti_b3_parent1, jti_b3_parent2],
exec_ts=base_time + 200,
status="completed",
)
vectors.append(TestVector(
id="B.3",
description="Phase 2 ACT — fan-in, two predecessor jti values from parallel branches",
valid=True,
compact=fanin_compact,
record=fanin_record,
))
# --- B.4: Phase 1 — Delegated mandate (depth=1) ---
delegated_b4, _ = create_delegated_mandate(
parent_mandate=mandate_b1,
parent_compact=compact_b1,
delegator_private_key=iss_priv,
sub="agent-c",
kid="iss-key",
iss="agent-issuer",
aud="agent-c",
iat=base_time + 10,
exp=base_time + 600,
jti=jti_b4,
cap=[Capability(action="read.data", constraints={"max_records": 5})],
task=TaskClaim(purpose="delegated_read"),
)
sig_b4 = crypto_sign(iss_priv, delegated_b4.signing_input())
compact_b4 = encode_jws(delegated_b4, sig_b4)
compacts[jti_b4] = compact_b4
vectors.append(TestVector(
id="B.4",
description="Phase 1 ACT — delegated mandate (depth=1), chain entry with sig",
valid=True,
compact=compact_b4,
record=delegated_b4,
))
# --- B.5: Phase 2 — Delegated execution record ---
record_b5, compact_b5 = transition_to_record(
delegated_b4,
sub_kid="agent-c-key",
sub_private_key=agent_c_priv,
exec_act="read.data",
pred=[],
exec_ts=base_time + 350,
status="completed",
)
vectors.append(TestVector(
id="B.5",
description="Phase 2 ACT — delegated execution record",
valid=True,
compact=compact_b5,
record=record_b5,
))
# --- B.6: del.depth > del.max_depth → ACTDelegationError ---
bad_depth_mandate = ACTMandate(
alg="EdDSA", kid="iss-key",
iss="agent-issuer", sub="agent-subject",
aud="agent-subject",
iat=base_time, exp=base_time + 900,
jti=str(uuid.uuid4()),
task=TaskClaim(purpose="bad_depth"),
cap=[Capability(action="read.data")],
delegation=Delegation(depth=3, max_depth=2, chain=[
DelegationEntry(delegator="a", jti="j1", sig="sig1"),
DelegationEntry(delegator="b", jti="j2", sig="sig2"),
DelegationEntry(delegator="c", jti="j3", sig="sig3"),
]),
)
sig_b6 = crypto_sign(iss_priv, bad_depth_mandate.signing_input())
compact_b6 = encode_jws(bad_depth_mandate, sig_b6)
vectors.append(TestVector(
id="B.6",
description="del.depth > del.max_depth → ACTDelegationError",
valid=False,
expected_exception=ACTDelegationError,
compact=compact_b6,
))
# --- B.7: cap escalation in delegated ACT → ACTPrivilegeEscalationError ---
vectors.append(TestVector(
id="B.7",
description="cap escalation in delegated ACT → ACTPrivilegeEscalationError",
valid=False,
expected_exception=ACTPrivilegeEscalationError,
))
# --- B.8: exec_act not in cap → ACTCapabilityError ---
bad_exec_mandate = ACTMandate(
alg="EdDSA", kid="iss-key",
iss="agent-issuer", sub="agent-subject",
aud="agent-subject",
iat=base_time, exp=base_time + 900,
jti=str(uuid.uuid4()),
task=TaskClaim(purpose="bad_exec"),
cap=[Capability(action="read.data")],
delegation=Delegation(depth=0, max_depth=1, chain=[]),
)
sig_b8m = crypto_sign(iss_priv, bad_exec_mandate.signing_input())
# Manually construct Phase 2 with wrong exec_act
bad_exec_record = ACTRecord(
alg="EdDSA", kid="sub-key",
iss="agent-issuer", sub="agent-subject",
aud="agent-subject",
iat=base_time, exp=base_time + 900,
jti=bad_exec_mandate.jti,
task=TaskClaim(purpose="bad_exec"),
cap=[Capability(action="read.data")],
exec_act="delete.everything",
pred=[], exec_ts=base_time + 100, status="completed",
)
sig_b8 = crypto_sign(sub_priv, bad_exec_record.signing_input())
compact_b8 = encode_jws(bad_exec_record, sig_b8)
vectors.append(TestVector(
id="B.8",
description="exec_act not in cap → ACTCapabilityError",
valid=False,
expected_exception=ACTCapabilityError,
compact=compact_b8,
))
# --- B.9: DAG cycle (pred references own jti) → ACTDAGError ---
cycle_jti = str(uuid.uuid4())
cycle_record = ACTRecord(
alg="EdDSA", kid="sub-key",
iss="agent-issuer", sub="agent-subject",
aud="agent-subject",
iat=base_time, exp=base_time + 900,
jti=cycle_jti,
task=TaskClaim(purpose="cycle_test"),
cap=[Capability(action="read.data")],
exec_act="read.data",
pred=[cycle_jti],
exec_ts=base_time + 100, status="completed",
)
sig_b9 = crypto_sign(sub_priv, cycle_record.signing_input())
compact_b9 = encode_jws(cycle_record, sig_b9)
vectors.append(TestVector(
id="B.9",
description="DAG cycle (pred references own jti) → ACTDAGError",
valid=False,
expected_exception=ACTDAGError,
compact=compact_b9,
))
# --- B.10: Missing parent jti in DAG → ACTDAGError ---
missing_parent_record = ACTRecord(
alg="EdDSA", kid="sub-key",
iss="agent-issuer", sub="agent-subject",
aud="agent-subject",
iat=base_time, exp=base_time + 900,
jti=str(uuid.uuid4()),
task=TaskClaim(purpose="missing_parent"),
cap=[Capability(action="read.data")],
exec_act="read.data",
pred=["nonexistent-parent-jti"],
exec_ts=base_time + 100, status="completed",
)
sig_b10 = crypto_sign(sub_priv, missing_parent_record.signing_input())
compact_b10 = encode_jws(missing_parent_record, sig_b10)
vectors.append(TestVector(
id="B.10",
description="Missing parent jti in DAG → ACTDAGError",
valid=False,
expected_exception=ACTDAGError,
compact=compact_b10,
))
# --- B.11: Tampered payload (bit flip) → ACTSignatureError ---
# Take a valid compact and flip a byte in the payload
parts = compact_b1.split(".")
payload_bytes = bytearray(parts[1].encode("ascii"))
# Flip a character in the payload
flip_idx = len(payload_bytes) // 2
payload_bytes[flip_idx] = (payload_bytes[flip_idx] + 1) % 128
if payload_bytes[flip_idx] == 0:
payload_bytes[flip_idx] = 65 # 'A'
tampered_compact = f"{parts[0]}.{payload_bytes.decode('ascii')}.{parts[2]}"
vectors.append(TestVector(
id="B.11",
description="Tampered payload (bit flip in claims) → ACTSignatureError",
valid=False,
expected_exception=ACTSignatureError,
compact=tampered_compact,
))
# --- B.12: Expired token → ACTExpiredError ---
expired_mandate = ACTMandate(
alg="EdDSA", kid="iss-key",
iss="agent-issuer", sub="agent-subject",
aud="agent-subject",
iat=base_time - 3600,
exp=base_time - 2700, # expired 45 minutes ago
jti=str(uuid.uuid4()),
task=TaskClaim(purpose="expired_test"),
cap=[Capability(action="read.data")],
)
sig_b12 = crypto_sign(iss_priv, expired_mandate.signing_input())
compact_b12 = encode_jws(expired_mandate, sig_b12)
vectors.append(TestVector(
id="B.12",
description="Expired token → ACTExpiredError",
valid=False,
expected_exception=ACTExpiredError,
compact=compact_b12,
))
# --- B.13: Wrong audience → ACTAudienceMismatchError ---
wrong_aud_mandate = ACTMandate(
alg="EdDSA", kid="iss-key",
iss="agent-issuer", sub="wrong-agent",
aud="wrong-agent",
iat=base_time, exp=base_time + 900,
jti=str(uuid.uuid4()),
task=TaskClaim(purpose="wrong_aud_test"),
cap=[Capability(action="read.data")],
)
sig_b13 = crypto_sign(iss_priv, wrong_aud_mandate.signing_input())
compact_b13 = encode_jws(wrong_aud_mandate, sig_b13)
vectors.append(TestVector(
id="B.13",
description="Wrong audience → ACTAudienceMismatchError",
valid=False,
expected_exception=ACTAudienceMismatchError,
compact=compact_b13,
))
# --- B.14: Phase 2 re-signed by iss key instead of sub → ACTSignatureError ---
record_b14 = ACTRecord.from_mandate(
mandate_b1,
kid="sub-key", # claims to be sub's key
exec_act="read.data",
pred=[], exec_ts=base_time + 300, status="completed",
)
# But signed with ISS's private key (wrong signer)
sig_b14 = crypto_sign(iss_priv, record_b14.signing_input())
compact_b14 = encode_jws(record_b14, sig_b14)
vectors.append(TestVector(
id="B.14",
description="Phase 2 re-signed by iss key instead of sub → ACTSignatureError",
valid=False,
expected_exception=ACTSignatureError,
compact=compact_b14,
))
# --- B.15: Algorithm "none" → ACTValidationError ---
# Manually construct a JWS with alg: none
import json
import base64
none_header = base64.urlsafe_b64encode(
json.dumps({"alg": "none", "typ": "act+jwt", "kid": "k"}, separators=(",", ":")).encode()
).rstrip(b"=").decode()
none_payload = base64.urlsafe_b64encode(
json.dumps({"iss": "a", "sub": "b"}, separators=(",", ":")).encode()
).rstrip(b"=").decode()
compact_b15 = f"{none_header}.{none_payload}."
vectors.append(TestVector(
id="B.15",
description='Algorithm "none" → ACTValidationError',
valid=False,
expected_exception=ACTValidationError,
compact=compact_b15,
))
context = {
"iss_priv": iss_priv,
"iss_pub": iss_pub,
"sub_priv": sub_priv,
"sub_pub": sub_pub,
"agent_c_priv": agent_c_priv,
"agent_c_pub": agent_c_pub,
"registry": registry,
"resolver": resolver,
"base_time": base_time,
"compacts": compacts,
"parent1_record": parent1_record,
"parent2_record": parent2_record,
"mandate_b1": mandate_b1,
}
return vectors, context
def validate_vectors() -> bool:
"""Run all test vectors and validate results.
Returns True if all vectors pass.
Reference: ACT Appendix B.
"""
vectors, ctx = generate_vectors()
resolver = ctx["resolver"]
base_time = ctx["base_time"]
verifier = ACTVerifier(
key_resolver=resolver,
verifier_id="agent-subject",
trusted_issuers={"agent-issuer"},
)
passed = 0
failed = 0
for v in vectors:
try:
if v.id == "B.7":
# Special case: test cap escalation during delegation creation
try:
from .delegation import verify_capability_subset
verify_capability_subset(
[Capability(action="read.data", constraints={"max_records": 10})],
[Capability(action="read.data", constraints={"max_records": 100})],
)
print(f" FAIL {v.id}: Expected {v.expected_exception.__name__}")
failed += 1
except ACTPrivilegeEscalationError:
print(f" PASS {v.id}: {v.description}")
passed += 1
continue
if v.valid:
# Valid vectors: should parse and verify without error
header, claims, sig, si = decode_jws(v.compact)
kid = header["kid"]
pub = resolver.resolve(kid, header=header)
crypto_verify(pub, sig, si)
print(f" PASS {v.id}: {v.description}")
passed += 1
else:
# Invalid vectors: should raise the expected exception
try:
if v.expected_exception == ACTDelegationError:
header, claims, sig, si = decode_jws(v.compact)
kid = header["kid"]
pub = resolver.resolve(kid, header=header)
crypto_verify(pub, sig, si)
# Parse and check delegation
from .token import ACTMandate as _M
m = _M.from_claims(header, claims)
from .delegation import verify_delegation_chain
verify_delegation_chain(m, lambda d: resolver.resolve(d))
print(f" FAIL {v.id}: Expected {v.expected_exception.__name__}")
failed += 1
elif v.expected_exception == ACTCapabilityError:
header, claims, sig, si = decode_jws(v.compact)
kid = header["kid"]
pub = resolver.resolve(kid, header=header)
crypto_verify(pub, sig, si)
r = ACTRecord.from_claims(header, claims)
cap_actions = {c.action for c in r.cap}
if r.exec_act not in cap_actions:
raise ACTCapabilityError("exec_act mismatch")
print(f" FAIL {v.id}: Expected {v.expected_exception.__name__}")
failed += 1
elif v.expected_exception == ACTDAGError:
header, claims, sig, si = decode_jws(v.compact)
kid = header["kid"]
pub = resolver.resolve(kid, header=header)
crypto_verify(pub, sig, si)
r = ACTRecord.from_claims(header, claims)
ledger = ACTLedger()
validate_dag(r, ledger)
print(f" FAIL {v.id}: Expected {v.expected_exception.__name__}")
failed += 1
elif v.expected_exception == ACTExpiredError:
verifier.verify_mandate(v.compact, check_sub=False)
print(f" FAIL {v.id}: Expected {v.expected_exception.__name__}")
failed += 1
elif v.expected_exception == ACTAudienceMismatchError:
verifier.verify_mandate(
v.compact,
now=base_time + 100,
check_sub=False,
)
print(f" FAIL {v.id}: Expected {v.expected_exception.__name__}")
failed += 1
elif v.expected_exception == ACTSignatureError:
header, claims, sig, si = decode_jws(v.compact)
kid = header["kid"]
pub = resolver.resolve(kid, header=header)
crypto_verify(pub, sig, si)
print(f" FAIL {v.id}: Expected {v.expected_exception.__name__}")
failed += 1
elif v.expected_exception == ACTValidationError:
decode_jws(v.compact)
print(f" FAIL {v.id}: Expected {v.expected_exception.__name__}")
failed += 1
else:
print(f" SKIP {v.id}: Unknown expected exception type")
failed += 1
except Exception as e:
if isinstance(e, v.expected_exception):
print(f" PASS {v.id}: {v.description}")
passed += 1
else:
print(f" FAIL {v.id}: Expected {v.expected_exception.__name__}, "
f"got {type(e).__name__}: {e}")
failed += 1
except Exception as e:
print(f" FAIL {v.id}: Unexpected error: {type(e).__name__}: {e}")
failed += 1
print(f"\nResults: {passed} passed, {failed} failed out of {len(vectors)}")
return failed == 0

View File

@@ -0,0 +1,323 @@
"""ACT unified verification entry point.
Provides ACTVerifier with verify_mandate (Phase 1) and verify_record
(Phase 2) methods implementing the full verification procedures.
Reference: ACT §8 (Verification Procedure).
"""
from __future__ import annotations
import logging
import time
from typing import Any
from .crypto import ACTKeyResolver, PublicKey, verify as crypto_verify
from .dag import ACTStore, validate_dag
from .delegation import verify_delegation_chain
from .errors import (
ACTAudienceMismatchError,
ACTCapabilityError,
ACTExpiredError,
ACTPhaseError,
ACTSignatureError,
ACTValidationError,
)
from .token import (
ACTMandate,
ACTRecord,
decode_jws,
)
logger = logging.getLogger(__name__)
# Default clock skew tolerance for exp check — ACT §8.1 step 6.
DEFAULT_EXP_CLOCK_SKEW: int = 300 # 5 minutes
# Default clock skew tolerance for iat future check — ACT §8.1 step 7.
DEFAULT_IAT_FUTURE_TOLERANCE: int = 30 # 30 seconds
class ACTVerifier:
"""Unified ACT verification entry point.
Implements the full verification procedure for both Phase 1
(Authorization Mandate) and Phase 2 (Execution Record) tokens.
Reference: ACT §8.
"""
def __init__(
self,
key_resolver: ACTKeyResolver,
*,
verifier_id: str | None = None,
trusted_issuers: set[str] | None = None,
exp_clock_skew: int = DEFAULT_EXP_CLOCK_SKEW,
iat_future_tolerance: int = DEFAULT_IAT_FUTURE_TOLERANCE,
resolve_parent_compact: Any | None = None,
) -> None:
"""Initialize the verifier.
Args:
key_resolver: Key resolver for all trust tiers.
verifier_id: This verifier's own identifier (for aud/sub checks).
trusted_issuers: Set of trusted issuer identifiers.
If None, iss check is skipped.
exp_clock_skew: Maximum clock skew for expiration (seconds).
iat_future_tolerance: Maximum future iat tolerance (seconds).
resolve_parent_compact: Callback to resolve parent ACT compact
form by jti (for delegation chain).
"""
self._key_resolver = key_resolver
self._verifier_id = verifier_id
self._trusted_issuers = trusted_issuers
self._exp_clock_skew = exp_clock_skew
self._iat_future_tolerance = iat_future_tolerance
self._resolve_parent_compact = resolve_parent_compact
def verify_mandate(
self,
compact: str,
*,
now: int | None = None,
check_aud: bool = True,
check_sub: bool = True,
) -> ACTMandate:
"""Verify a Phase 1 Authorization Mandate.
Implements ACT §8.1 verification steps 1-13.
Args:
compact: JWS Compact Serialization of the Phase 1 ACT.
now: Current time override (for testing). Defaults to time.time().
check_aud: Whether to check aud contains verifier_id.
check_sub: Whether to check sub matches verifier_id.
Returns:
Verified ACTMandate.
Raises:
ACTValidationError: Malformed token (steps 2-3, 11).
ACTSignatureError: Signature failure (step 5).
ACTExpiredError: Token expired (step 6).
ACTAudienceMismatchError: Wrong audience (step 8).
ACTDelegationError: Invalid delegation chain (step 12).
"""
current_time = now if now is not None else int(time.time())
# Step 1: Parse JWS Compact Serialization
header, claims, signature, signing_input = decode_jws(compact)
# Steps 2-3: typ and alg checked by decode_jws
# Phase check: must NOT have exec_act
if "exec_act" in claims:
raise ACTPhaseError(
"Token contains exec_act — this is a Phase 2 token, "
"not a Phase 1 mandate"
)
# Step 4: Resolve public key for kid
kid = header["kid"]
public_key = self._key_resolver.resolve(kid, header=header)
# Step 5: Verify JWS signature
crypto_verify(public_key, signature, signing_input)
# Build mandate object for claim validation
mandate = ACTMandate.from_claims(header, claims)
# Step 6: Check exp not passed
if current_time > mandate.exp + self._exp_clock_skew:
raise ACTExpiredError(
f"Token expired: exp={mandate.exp}, "
f"now={current_time}, skew={self._exp_clock_skew}"
)
# Step 7: Check iat not unreasonably future
if mandate.iat > current_time + self._iat_future_tolerance:
raise ACTValidationError(
f"Token iat is too far in the future: iat={mandate.iat}, "
f"now={current_time}, tolerance={self._iat_future_tolerance}"
)
# Step 8: Check aud contains verifier's identity
if check_aud and self._verifier_id is not None:
aud = mandate.aud
if isinstance(aud, str):
aud_list = [aud]
else:
aud_list = aud
if self._verifier_id not in aud_list:
raise ACTAudienceMismatchError(
f"Verifier id {self._verifier_id!r} not in aud: {aud_list}"
)
# Step 9: Check iss is trusted
if self._trusted_issuers is not None:
if mandate.iss not in self._trusted_issuers:
raise ACTValidationError(
f"Issuer {mandate.iss!r} is not trusted"
)
# Step 10: Check sub matches verifier's identity
if check_sub and self._verifier_id is not None:
if mandate.sub != self._verifier_id:
raise ACTValidationError(
f"sub {mandate.sub!r} does not match verifier id "
f"{self._verifier_id!r}"
)
# Step 11: Check all required claims (done by from_claims + validate)
mandate.validate()
# Step 12: Verify delegation chain
if mandate.delegation is not None and mandate.delegation.chain:
def _resolve_key(delegator_id: str) -> PublicKey:
return self._key_resolver.resolve(delegator_id)
verify_delegation_chain(
mandate,
resolve_key=_resolve_key,
resolve_parent_compact=self._resolve_parent_compact,
)
return mandate
def verify_record(
self,
compact: str,
store: ACTStore | None = None,
*,
now: int | None = None,
check_aud: bool = True,
) -> ACTRecord:
"""Verify a Phase 2 Execution Record.
Implements all Phase 1 steps (§8.1) plus Phase 2 steps (§8.2).
Args:
compact: JWS Compact Serialization of the Phase 2 ACT.
store: ACT store for DAG validation. If None, DAG checks
are limited to capability consistency only.
now: Current time override (for testing).
check_aud: Whether to check aud contains verifier_id.
Returns:
Verified ACTRecord.
Raises:
ACTValidationError: Malformed token.
ACTSignatureError: Signature failure or wrong signer.
ACTExpiredError: Token expired.
ACTAudienceMismatchError: Wrong audience.
ACTCapabilityError: exec_act not in cap.
ACTDAGError: DAG validation failure.
"""
current_time = now if now is not None else int(time.time())
# Step 1: Parse JWS
header, claims, signature, signing_input = decode_jws(compact)
# Phase check
if "exec_act" not in claims:
raise ACTPhaseError(
"Token does not contain exec_act — this is a Phase 1 "
"mandate, not a Phase 2 record"
)
# Step 4: Resolve key — in Phase 2, kid MUST be sub's key
kid = header["kid"]
public_key = self._key_resolver.resolve(kid, header=header)
# Step 5: Verify JWS signature (Step 17: by sub's key)
crypto_verify(public_key, signature, signing_input)
# Build record
record = ACTRecord.from_claims(header, claims)
# Step 6: Check exp
if current_time > record.exp + self._exp_clock_skew:
raise ACTExpiredError(
f"Token expired: exp={record.exp}, "
f"now={current_time}, skew={self._exp_clock_skew}"
)
# Step 7: iat future check
if record.iat > current_time + self._iat_future_tolerance:
raise ACTValidationError(
f"Token iat is too far in the future: iat={record.iat}"
)
# Step 8: aud check
if check_aud and self._verifier_id is not None:
aud = record.aud
if isinstance(aud, str):
aud_list = [aud]
else:
aud_list = aud
if self._verifier_id not in aud_list:
raise ACTAudienceMismatchError(
f"Verifier id {self._verifier_id!r} not in aud: {aud_list}"
)
# Step 9: iss trust check
if self._trusted_issuers is not None:
if record.iss not in self._trusted_issuers:
raise ACTValidationError(
f"Issuer {record.iss!r} is not trusted"
)
# Step 11: required claims validation
record.validate()
# Step 12: delegation chain
if record.delegation is not None and record.delegation.chain:
def _resolve_key(delegator_id: str) -> PublicKey:
return self._key_resolver.resolve(delegator_id)
# Reuse verify_delegation_chain with ACTRecord fields
# (it accesses .delegation which exists on ACTRecord too)
from .delegation import verify_delegation_chain as _vdc
# Create a temporary mandate-like view — delegation chain
# verification only needs delegation and cap fields
mandate_view = ACTMandate(
alg=record.alg, kid=record.kid,
iss=record.iss, sub=record.sub, aud=record.aud,
iat=record.iat, exp=record.exp, jti=record.jti,
task=record.task, cap=record.cap,
delegation=record.delegation,
)
_vdc(
mandate_view,
resolve_key=_resolve_key,
resolve_parent_compact=self._resolve_parent_compact,
)
# Phase 2 step 13: exec_act matches cap[].action
cap_actions = {c.action for c in record.cap}
if record.exec_act not in cap_actions:
raise ACTCapabilityError(
f"exec_act {record.exec_act!r} does not match any "
f"cap[].action: {sorted(cap_actions)}"
)
# Phase 2 step 14: DAG validation
if store is not None:
validate_dag(record, store)
# Phase 2 step 15: exec_ts checks
if record.exec_ts < record.iat:
raise ACTValidationError(
f"exec_ts {record.exec_ts} is before iat {record.iat}"
)
if record.exec_ts > record.exp:
logger.warning(
"exec_ts %d is after exp %d — execution after mandate expiry",
record.exec_ts, record.exp,
)
# Phase 2 step 16: status validation (done by record.validate())
return record

View File

@@ -0,0 +1,174 @@
"""ACT performance benchmarks.
Measures Phase 1 creation time (construct + sign + encode) against
the 500µs target from the specification.
"""
import time
import uuid
import statistics
from act import (
ACTMandate,
ACTRecord,
Capability,
TaskClaim,
encode_jws,
decode_jws,
generate_ed25519_keypair,
generate_p256_keypair,
sign,
verify,
transition_to_record,
)
def bench_phase1_ed25519(n: int = 10000) -> None:
"""Benchmark Phase 1 creation with Ed25519."""
priv, pub = generate_ed25519_keypair()
# Warmup
for _ in range(100):
m = ACTMandate(
alg="EdDSA", kid="k", iss="a", sub="b", aud="b",
iat=1772064000, exp=1772064900, jti=str(uuid.uuid4()),
task=TaskClaim(purpose="t"), cap=[Capability(action="x.y")],
)
sig = sign(priv, m.signing_input())
encode_jws(m, sig)
times = []
for _ in range(n):
start = time.perf_counter()
m = ACTMandate(
alg="EdDSA", kid="k", iss="a", sub="b", aud="b",
iat=1772064000, exp=1772064900, jti=str(uuid.uuid4()),
task=TaskClaim(purpose="benchmark"),
cap=[Capability(action="read.data")],
)
sig = sign(priv, m.signing_input())
encode_jws(m, sig)
elapsed = time.perf_counter() - start
times.append(elapsed * 1_000_000) # µs
mean = statistics.mean(times)
median = statistics.median(times)
p99 = sorted(times)[int(n * 0.99)]
print(f"Phase 1 Ed25519 (n={n}):")
print(f" Mean: {mean:.1f} µs")
print(f" Median: {median:.1f} µs")
print(f" P99: {p99:.1f} µs")
print(f" Target: <= 500 µs {'PASS' if mean <= 500 else 'FAIL'}")
print()
def bench_phase1_p256(n: int = 5000) -> None:
"""Benchmark Phase 1 creation with P-256."""
priv, pub = generate_p256_keypair()
for _ in range(50):
m = ACTMandate(
alg="ES256", kid="k", iss="a", sub="b", aud="b",
iat=1772064000, exp=1772064900, jti=str(uuid.uuid4()),
task=TaskClaim(purpose="t"), cap=[Capability(action="x.y")],
)
sig = sign(priv, m.signing_input())
encode_jws(m, sig)
times = []
for _ in range(n):
start = time.perf_counter()
m = ACTMandate(
alg="ES256", kid="k", iss="a", sub="b", aud="b",
iat=1772064000, exp=1772064900, jti=str(uuid.uuid4()),
task=TaskClaim(purpose="benchmark"),
cap=[Capability(action="read.data")],
)
sig = sign(priv, m.signing_input())
encode_jws(m, sig)
elapsed = time.perf_counter() - start
times.append(elapsed * 1_000_000)
mean = statistics.mean(times)
median = statistics.median(times)
p99 = sorted(times)[int(n * 0.99)]
print(f"Phase 1 ES256 (n={n}):")
print(f" Mean: {mean:.1f} µs")
print(f" Median: {median:.1f} µs")
print(f" P99: {p99:.1f} µs")
print()
def bench_phase2_transition(n: int = 5000) -> None:
"""Benchmark Phase 1 -> Phase 2 transition."""
iss_priv, _ = generate_ed25519_keypair()
sub_priv, _ = generate_ed25519_keypair()
mandate = ACTMandate(
alg="EdDSA", kid="k", iss="a", sub="b", aud="b",
iat=1772064000, exp=1772064900, jti=str(uuid.uuid4()),
task=TaskClaim(purpose="t"), cap=[Capability(action="x.y")],
)
# Warmup
for _ in range(50):
transition_to_record(
mandate, sub_kid="sk", sub_private_key=sub_priv,
exec_act="x.y", pred=[], status="completed",
)
times = []
for _ in range(n):
start = time.perf_counter()
transition_to_record(
mandate, sub_kid="sk", sub_private_key=sub_priv,
exec_act="x.y", pred=[], status="completed",
)
elapsed = time.perf_counter() - start
times.append(elapsed * 1_000_000)
mean = statistics.mean(times)
median = statistics.median(times)
print(f"Phase 2 Transition (n={n}):")
print(f" Mean: {mean:.1f} µs")
print(f" Median: {median:.1f} µs")
print()
def bench_verify(n: int = 5000) -> None:
"""Benchmark JWS decode + verify."""
priv, pub = generate_ed25519_keypair()
m = ACTMandate(
alg="EdDSA", kid="k", iss="a", sub="b", aud="b",
iat=1772064000, exp=1772064900, jti=str(uuid.uuid4()),
task=TaskClaim(purpose="t"), cap=[Capability(action="x.y")],
)
sig = sign(priv, m.signing_input())
compact = encode_jws(m, sig)
# Warmup
for _ in range(50):
_, _, s, si = decode_jws(compact)
verify(pub, s, si)
times = []
for _ in range(n):
start = time.perf_counter()
_, _, s, si = decode_jws(compact)
verify(pub, s, si)
elapsed = time.perf_counter() - start
times.append(elapsed * 1_000_000)
mean = statistics.mean(times)
median = statistics.median(times)
print(f"Decode + Verify (n={n}):")
print(f" Mean: {mean:.1f} µs")
print(f" Median: {median:.1f} µs")
print()
if __name__ == "__main__":
bench_phase1_ed25519()
bench_phase1_p256()
bench_phase2_transition()
bench_verify()

View File

@@ -0,0 +1,194 @@
# Section 1.5: Applicability (for draft-nennemann-act-01)
Insert after Section 1.4 (Relationship to Related Work).
---
### 1.5. Applicability
ACT is designed as a general-purpose primitive for AI agent
authorization and execution accountability. While a sibling
specification [I-D.nennemann-wimse-ect] profiles execution context
tokens specifically for the WIMSE working group's workload identity
infrastructure, ACT operates without any shared identity plane. This
section identifies deployment contexts where ACT applies independently
of WIMSE, and clarifies how ACT complements — rather than competes
with — ecosystem-specific agent protocols.
#### 1.5.1. Model Context Protocol (MCP) Tool-Use Flows
The Model Context Protocol [MCP-SPEC] defines a client-server
interface by which LLM hosts invoke external tools via structured
JSON-RPC calls. MCP 2025-11-25 mandates OAuth 2.1 for transport-layer
authentication, but provides no mechanism for carrying per-invocation
authorization constraints or for producing a tamper-evident record
of what arguments were passed and what result was returned.
ACT addresses this gap as follows: when an MCP host is about to
dispatch a tool call on behalf of an agent, it SHOULD issue a Phase 1
ACT Mandate encoding the permitted tool name (e.g., as a capability
constraint), the declaring scope, and any parameter-level constraints
applicable to that invocation. The MCP server, upon receiving the
request, MAY validate the ACT Mandate and, upon completing the tool
execution, SHOULD transition the token to Phase 2 by appending
SHA-256 hashes of the serialized input arguments and the JSON
response, then re-sign. The resulting Phase 2 ACT constitutes an
unforgeable record that a specific tool was called with specific
arguments and returned a specific result, independently of MCP's
OAuth layer.
This integration requires no modification to MCP transport; the ACT
SHOULD be carried in the `ACT-Mandate` and `ACT-Record` HTTP headers
defined in Section 9.1 of this document.
#### 1.5.2. OpenAI Agents SDK and Function Calling
The OpenAI Agents SDK [OPENAI-AGENTS-SDK] enables composition of
agents via handoffs — structured transfers of control from one agent
to another, each potentially invoking registered function tools. The
SDK provides no built-in mechanism for a receiving agent to verify
that the handoff was authorized by a named principal, nor for the
invoking agent to produce a verifiable record of what functions it
called.
ACT is applicable at the handoff boundary: the orchestrating agent
SHOULD issue a Phase 1 ACT Mandate to the receiving agent at the
moment of handoff, encoding the permitted function set as
capability constraints and the maximum privilege the receiving agent
MAY exercise. The receiving agent SHOULD attach its Phase 2 ACT
Record to any callback or downstream response, providing the
orchestrator with cryptographic evidence of the actions taken. In
multi-turn chains involving multiple handoffs, the DAG linkage
(Section 7) allows each handoff to be expressed as a parent-child
edge, preserving the full causal ordering of the agent invocation
sequence.
Implementations that use the OpenAI function calling API directly,
without the Agents SDK, MAY apply ACT at the application layer: the
calling process issues a Phase 1 ACT before the function call
parameter block is finalized, and the receiving function handler
returns a Phase 2 ACT alongside its JSON result.
#### 1.5.3. LangGraph and LangChain Agent Graphs
LangGraph [LANGGRAPH] models agent workflows as typed StateGraphs in
which nodes represent agent invocations or tool calls and edges
represent conditional transitions. The DAG structure of ACT (Section
7) is a natural fit for this model: each LangGraph node that performs
an observable action corresponds to exactly one ACT task identifier
(`tid`), and directed edges in the LangGraph correspond to `pred`
(predecessor) references in successor ACTs.
ACT is applicable at the node boundary: when a LangGraph node
dispatches a sub-agent or invokes a tool with side effects, it SHOULD
issue a Phase 1 ACT Mandate encoding the node's permitted actions
before any external call is made. Upon transition out of the node,
a Phase 2 ACT Record SHOULD be produced and attached to the
LangGraph state object alongside the node's output. Downstream nodes
that fan-in from multiple predecessors MAY retrieve the set of parent
ACT identifiers from the shared state to populate their `pred` array,
thereby expressing LangGraph's fan-in semantics within the ACT DAG
without any additional infrastructure.
In contrast to LangGraph's built-in state audit trail, which is
mutable in-process memory, Phase 2 ACTs are cryptographically signed
and portable: they can be exported from a LangGraph run and
submitted to an external audit ledger, satisfying compliance
requirements that cannot be met by in-process logging alone.
#### 1.5.4. Google Agent2Agent (A2A) Protocol
The Agent2Agent protocol [A2A-SPEC] defines a task-oriented JSON-RPC
interface for inter-agent communication, with authentication
delegated to OAuth 2.0 or API key schemes declared in each agent's
Agent Card. A2A provides no mechanism for a receiving agent to
verify the authorization provenance of a task request beyond the
transport-layer credential, and produces no token that represents
the execution of the task in a verifiable, portable form.
ACT is applicable as a session-layer accountability complement to
A2A: a client agent SHOULD include a Phase 1 ACT Mandate in the
`metadata` field of the A2A Task object, encoding the task type as
a capability constraint and the delegating agent's identity as the
ACT issuer. The receiving agent SHOULD validate the Mandate before
beginning task execution and SHOULD return a Phase 2 ACT Record
as an artifact in the A2A TaskResult, enabling the client agent to
retain cryptographic proof of what was executed on its behalf.
This integration does not require modification to A2A's transport or
authentication scheme; ACT and A2A's OAuth credentials operate at
independent layers and are not redundant. A2A's credential answers
"is this client permitted to contact this server?"; the ACT Mandate
answers "is this agent permitted to request this specific task
under these constraints?".
#### 1.5.5. Enterprise Orchestration Without WIMSE (CrewAI, AutoGen)
Enterprise orchestration frameworks such as CrewAI [CREWAI] and
AutoGen [AUTOGEN] deploy multi-agent systems within a single
organizational boundary, typically without SPIFFE/SPIRE workload
identity infrastructure. In these environments, OAuth Authorization
Servers are often unavailable or impractical to deploy for intra-
process agent communication.
ACT is applicable in this context via its Tier 1 (pre-shared key)
trust model (Section 5.2): each agent role in a CrewAI Crew or
AutoGen ConversableAgent graph is assigned an Ed25519 keypair at
instantiation time. The orchestrating agent issues Phase 1 Mandates
to worker agents before delegating tasks, constraining each worker
to only the tools and actions relevant to its role. Worker agents
produce Phase 2 Records on task completion. The resulting ACT chain
is exportable as a structured audit trail that satisfies the
per-action logging requirements of DORA [DORA] and EU AI Act
Article 12 [EUAIA] without requiring shared infrastructure beyond
the ability to exchange public keys at deployment time.
Implementations SHOULD NOT use ACT's self-assertion mode (where an
agent issues and records its own mandate without external sign-off)
in regulated workflows; at minimum, the orchestrating agent MUST
sign the initial Mandate so that accountability is anchored to a
principal outside the executing agent.
#### 1.5.6. Relationship to WIMSE ECT
Where WIMSE infrastructure is deployed, ACT and the WIMSE Execution
Context Token [I-D.nennemann-wimse-ect] serve complementary and
non-overlapping functions. The ECT records workload-level execution
in WIMSE terms — which SPIFFE workload executed, in which trust
domain, against which service. ACT records the authorization
provenance — which agent was permitted to request which action,
under what capability constraints, by whose authority — and
transitions that authorization record into an execution record upon
task completion.
In mixed environments, both tokens SHOULD be carried simultaneously:
the `Workload-Identity` header carries the WIMSE ECT; the
`ACT-Record` header carries the ACT. Verifiers MAY correlate the
two by matching the ACT `tid` claim against application-layer
identifiers present in the ECT's task context. Neither token is a
profile or extension of the other; they operate at different
abstraction layers and their co-presence is additive.
---
## Informative References to Add
```
[MCP-SPEC] Model Context Protocol Specification, 2025-11-25,
<https://modelcontextprotocol.io/specification/2025-11-25>
[OPENAI-AGENTS-SDK] OpenAI, "Agents SDK",
<https://openai.github.io/openai-agents-python/>
[LANGGRAPH] LangChain, "LangGraph Documentation",
<https://langchain-ai.github.io/langgraph/>
[A2A-SPEC] Google, "Agent2Agent (A2A) Protocol",
<https://github.com/a2aproject/A2A>
[CREWAI] CrewAI, "CrewAI Documentation",
<https://docs.crewai.com/>
[AUTOGEN] Microsoft, "AutoGen Documentation",
<https://microsoft.github.io/autogen/>
```

View File

@@ -0,0 +1,23 @@
[build-system]
requires = ["setuptools>=68.0"]
build-backend = "setuptools.build_meta"
[project]
name = "ietf-act"
version = "0.1.0"
description = "Agent Context Token (ACT) — JWT-based authorization and execution accountability for AI agents"
requires-python = ">=3.11"
dependencies = [
"cryptography>=42.0",
]
[project.optional-dependencies]
dev = [
"pytest>=8.0",
]
[tool.setuptools.packages.find]
where = ["."]
[tool.pytest.ini_options]
testpaths = ["tests"]

View File

@@ -0,0 +1,145 @@
"""Tests for act.crypto module."""
import pytest
from act.crypto import (
ACTKeyResolver,
KeyRegistry,
X509TrustStore,
b64url_sha256,
compute_sha256,
did_key_from_ed25519,
generate_ed25519_keypair,
generate_p256_keypair,
resolve_did_key,
sign,
verify,
)
from act.errors import ACTKeyResolutionError, ACTSignatureError
class TestEd25519:
def test_generate_keypair(self):
priv, pub = generate_ed25519_keypair()
assert priv is not None
assert pub is not None
def test_sign_verify(self):
priv, pub = generate_ed25519_keypair()
data = b"test data"
sig = sign(priv, data)
verify(pub, sig, data)
def test_verify_wrong_data(self):
priv, pub = generate_ed25519_keypair()
sig = sign(priv, b"correct data")
with pytest.raises(ACTSignatureError):
verify(pub, sig, b"wrong data")
def test_verify_wrong_key(self):
priv1, pub1 = generate_ed25519_keypair()
_, pub2 = generate_ed25519_keypair()
sig = sign(priv1, b"data")
with pytest.raises(ACTSignatureError):
verify(pub2, sig, b"data")
class TestP256:
def test_generate_keypair(self):
priv, pub = generate_p256_keypair()
assert priv is not None
assert pub is not None
def test_sign_verify(self):
priv, pub = generate_p256_keypair()
data = b"test data for p256"
sig = sign(priv, data)
assert len(sig) == 64 # r||s, 32 bytes each
verify(pub, sig, data)
def test_verify_wrong_data(self):
priv, pub = generate_p256_keypair()
sig = sign(priv, b"correct")
with pytest.raises(ACTSignatureError):
verify(pub, sig, b"wrong")
class TestSHA256:
def test_compute(self):
h = compute_sha256(b"hello")
assert len(h) == 32
def test_b64url(self):
result = b64url_sha256(b"hello world")
assert "=" not in result
assert isinstance(result, str)
class TestKeyRegistry:
def test_register_and_get(self):
reg = KeyRegistry()
_, pub = generate_ed25519_keypair()
reg.register("key-1", pub)
assert reg.get("key-1") is pub
assert "key-1" in reg
assert len(reg) == 1
def test_missing_key(self):
reg = KeyRegistry()
assert reg.get("missing") is None
assert "missing" not in reg
class TestDIDKey:
def test_ed25519_roundtrip(self):
_, pub = generate_ed25519_keypair()
did = did_key_from_ed25519(pub)
assert did.startswith("did:key:z6Mk")
resolved = resolve_did_key(did)
# Verify same key by signing/verifying
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat
original_bytes = pub.public_bytes(Encoding.Raw, PublicFormat.Raw)
resolved_bytes = resolved.public_bytes(Encoding.Raw, PublicFormat.Raw)
assert original_bytes == resolved_bytes
def test_invalid_prefix(self):
with pytest.raises(ACTKeyResolutionError):
resolve_did_key("did:web:example.com")
def test_with_fragment(self):
_, pub = generate_ed25519_keypair()
did = did_key_from_ed25519(pub)
did_with_fragment = f"{did}#{did.split(':')[2]}"
resolved = resolve_did_key(did_with_fragment)
assert resolved is not None
class TestACTKeyResolver:
def test_tier1_resolution(self):
reg = KeyRegistry()
_, pub = generate_ed25519_keypair()
reg.register("my-key", pub)
resolver = ACTKeyResolver(registry=reg)
assert resolver.resolve("my-key") is pub
def test_tier3_did_key(self):
_, pub = generate_ed25519_keypair()
did = did_key_from_ed25519(pub)
resolver = ACTKeyResolver()
resolved = resolver.resolve(did)
assert resolved is not None
def test_unresolvable(self):
resolver = ACTKeyResolver()
with pytest.raises(ACTKeyResolutionError):
resolver.resolve("unknown-kid")
def test_did_web_resolver_callback(self):
_, pub = generate_ed25519_keypair()
def resolver_cb(did: str):
if did == "did:web:example.com":
return pub
return None
resolver = ACTKeyResolver(did_web_resolver=resolver_cb)
result = resolver.resolve("did:web:example.com")
assert result is pub

View File

@@ -0,0 +1,103 @@
"""Tests for act.dag module."""
import time
import pytest
from act.dag import validate_dag
from act.errors import ACTCapabilityError, ACTDAGError
from act.ledger import ACTLedger
from act.token import ACTRecord, Capability, TaskClaim
def make_record(jti, pred=None, exec_act="do.thing", exec_ts=None, cap=None):
"""Helper to create a minimal ACTRecord."""
return ACTRecord(
alg="EdDSA", kid="k", iss="a", sub="b", aud="b",
iat=1772064000, exp=1772064900,
jti=jti,
task=TaskClaim(purpose="t"),
cap=cap or [Capability(action="do.thing")],
exec_act=exec_act,
pred=pred or [],
exec_ts=exec_ts or 1772064100,
status="completed",
)
class TestDAGValidation:
def test_root_task(self):
ledger = ACTLedger()
r = make_record("root-1")
validate_dag(r, ledger)
def test_child_with_parent(self):
ledger = ACTLedger()
parent = make_record("parent-1", exec_ts=1772064050)
ledger.append(parent)
child = make_record("child-1", pred=["parent-1"], exec_ts=1772064100)
validate_dag(child, ledger)
def test_fan_in(self):
ledger = ACTLedger()
p1 = make_record("p1", exec_ts=1772064050)
p2 = make_record("p2", exec_ts=1772064060)
ledger.append(p1)
ledger.append(p2)
child = make_record("child", pred=["p1", "p2"], exec_ts=1772064100)
validate_dag(child, ledger)
def test_duplicate_jti(self):
ledger = ACTLedger()
r = make_record("dup-1")
ledger.append(r)
r2 = make_record("dup-1")
with pytest.raises(ACTDAGError, match="Duplicate"):
validate_dag(r2, ledger)
def test_missing_parent(self):
ledger = ACTLedger()
r = make_record("orphan", pred=["nonexistent"])
with pytest.raises(ACTDAGError, match="not found"):
validate_dag(r, ledger)
def test_self_cycle(self):
ledger = ACTLedger()
r = make_record("cycle", pred=["cycle"])
with pytest.raises(ACTDAGError, match="cycle"):
validate_dag(r, ledger)
def test_indirect_cycle(self):
ledger = ACTLedger()
# a -> b -> a would be a cycle
a = make_record("a", pred=["b"], exec_ts=1772064100)
b = make_record("b", pred=["a"], exec_ts=1772064100)
ledger.append(b)
# When validating a, following pred leads to b,
# which has pred=["a"] — cycle!
with pytest.raises(ACTDAGError, match="cycle"):
validate_dag(a, ledger)
def test_temporal_ordering_violation(self):
ledger = ACTLedger()
parent = make_record("parent", exec_ts=1772064200)
ledger.append(parent)
# Child's exec_ts is way before parent
child = make_record("child", pred=["parent"], exec_ts=1772064100)
with pytest.raises(ACTDAGError, match="Temporal"):
validate_dag(child, ledger)
def test_temporal_within_tolerance(self):
ledger = ACTLedger()
parent = make_record("parent", exec_ts=1772064120)
ledger.append(parent)
# Child exec_ts is slightly before parent but within 30s tolerance
child = make_record("child", pred=["parent"], exec_ts=1772064100)
validate_dag(child, ledger)
def test_bad_exec_act(self):
ledger = ACTLedger()
r = make_record("bad", exec_act="not.authorized",
cap=[Capability(action="do.thing")])
with pytest.raises(ACTCapabilityError):
validate_dag(r, ledger)

View File

@@ -0,0 +1,229 @@
"""Tests for act.delegation module."""
import time
import uuid
import pytest
from act.crypto import generate_ed25519_keypair, sign, verify, compute_sha256
from act.delegation import (
create_delegated_mandate,
verify_capability_subset,
verify_delegation_chain,
)
from act.errors import (
ACTDelegationError,
ACTPrivilegeEscalationError,
)
from act.token import (
ACTMandate,
Capability,
Delegation,
DelegationEntry,
TaskClaim,
_b64url_decode,
encode_jws,
)
@pytest.fixture
def parent_setup():
iss_priv, iss_pub = generate_ed25519_keypair()
mandate = ACTMandate(
alg="EdDSA", kid="iss-key",
iss="agent-a", sub="agent-b", aud="agent-b",
iat=1772064000, exp=1772064900,
jti="parent-jti-1",
task=TaskClaim(purpose="parent_task"),
cap=[
Capability(action="read.data", constraints={"max_records": 10}),
Capability(action="write.result"),
],
delegation=Delegation(depth=0, max_depth=3, chain=[]),
)
sig = sign(iss_priv, mandate.signing_input())
compact = encode_jws(mandate, sig)
return mandate, compact, iss_priv, iss_pub
class TestCreateDelegatedMandate:
def test_basic_delegation(self, parent_setup):
mandate, compact, priv, _ = parent_setup
delegated, _ = create_delegated_mandate(
parent_mandate=mandate, parent_compact=compact,
delegator_private_key=priv,
sub="agent-c", kid="key-b", iss="agent-a", aud="agent-c",
iat=1772064010, exp=1772064600,
jti="child-jti-1",
cap=[Capability(action="read.data", constraints={"max_records": 5})],
task=TaskClaim(purpose="child_task"),
)
assert delegated.delegation.depth == 1
assert len(delegated.delegation.chain) == 1
assert delegated.delegation.chain[0].delegator == "agent-a"
def test_depth_exceeded(self, parent_setup):
mandate, compact, priv, _ = parent_setup
# Set parent to max depth
mandate.delegation = Delegation(depth=3, max_depth=3, chain=[
DelegationEntry(delegator="x", jti="j", sig="s")
for _ in range(3)
])
with pytest.raises(ACTDelegationError, match="exceeds max_depth"):
create_delegated_mandate(
parent_mandate=mandate, parent_compact=compact,
delegator_private_key=priv,
sub="c", kid="k", iss="a", aud="c",
iat=1, exp=2, jti="j",
cap=[Capability(action="read.data", constraints={"max_records": 5})],
task=TaskClaim(purpose="t"),
)
def test_no_del_claim(self):
priv, _ = generate_ed25519_keypair()
mandate = ACTMandate(
alg="EdDSA", kid="k", iss="a", sub="b", aud="b",
iat=1, exp=2,
task=TaskClaim(purpose="t"),
cap=[Capability(action="x.y")],
delegation=None, # no del claim
)
with pytest.raises(ACTDelegationError, match="not permitted"):
create_delegated_mandate(
parent_mandate=mandate, parent_compact="compact",
delegator_private_key=priv,
sub="c", kid="k", iss="a", aud="c",
iat=1, exp=2, jti="j",
cap=[Capability(action="x.y")],
task=TaskClaim(purpose="t"),
)
def test_max_depth_reduction(self, parent_setup):
mandate, compact, priv, _ = parent_setup
delegated, _ = create_delegated_mandate(
parent_mandate=mandate, parent_compact=compact,
delegator_private_key=priv,
sub="c", kid="k", iss="a", aud="c",
iat=1, exp=2, jti="j",
cap=[Capability(action="read.data", constraints={"max_records": 5})],
task=TaskClaim(purpose="t"),
max_depth=2,
)
assert delegated.delegation.max_depth == 2
def test_max_depth_escalation(self, parent_setup):
mandate, compact, priv, _ = parent_setup
with pytest.raises(ACTDelegationError, match="exceeds parent max_depth"):
create_delegated_mandate(
parent_mandate=mandate, parent_compact=compact,
delegator_private_key=priv,
sub="c", kid="k", iss="a", aud="c",
iat=1, exp=2, jti="j",
cap=[Capability(action="read.data", constraints={"max_records": 5})],
task=TaskClaim(purpose="t"),
max_depth=10,
)
class TestCapabilitySubset:
def test_valid_subset(self):
parent = [Capability(action="read.data", constraints={"max_records": 10})]
child = [Capability(action="read.data", constraints={"max_records": 5})]
verify_capability_subset(parent, child)
def test_extra_action(self):
parent = [Capability(action="read.data")]
child = [Capability(action="delete.data")]
with pytest.raises(ACTPrivilegeEscalationError):
verify_capability_subset(parent, child)
def test_numeric_escalation(self):
parent = [Capability(action="read.data", constraints={"max_records": 10})]
child = [Capability(action="read.data", constraints={"max_records": 100})]
with pytest.raises(ACTPrivilegeEscalationError):
verify_capability_subset(parent, child)
def test_sensitivity_escalation(self):
parent = [Capability(action="read.data",
constraints={"data_sensitivity": "confidential"})]
child = [Capability(action="read.data",
constraints={"data_sensitivity": "internal"})]
with pytest.raises(ACTPrivilegeEscalationError):
verify_capability_subset(parent, child)
def test_sensitivity_more_restrictive(self):
parent = [Capability(action="read.data",
constraints={"data_sensitivity": "internal"})]
child = [Capability(action="read.data",
constraints={"data_sensitivity": "restricted"})]
verify_capability_subset(parent, child) # should pass
def test_missing_constraint(self):
parent = [Capability(action="read.data",
constraints={"max_records": 10, "scope": "local"})]
child = [Capability(action="read.data",
constraints={"max_records": 5})]
with pytest.raises(ACTPrivilegeEscalationError, match="missing"):
verify_capability_subset(parent, child)
def test_domain_specific_identical(self):
parent = [Capability(action="read.data",
constraints={"custom": "value_a"})]
child = [Capability(action="read.data",
constraints={"custom": "value_a"})]
verify_capability_subset(parent, child)
def test_domain_specific_different(self):
parent = [Capability(action="read.data",
constraints={"custom": "value_a"})]
child = [Capability(action="read.data",
constraints={"custom": "value_b"})]
with pytest.raises(ACTPrivilegeEscalationError, match="identical"):
verify_capability_subset(parent, child)
class TestVerifyDelegationChain:
def test_chain_sig_verification(self, parent_setup):
mandate, compact, priv, pub = parent_setup
delegated, _ = create_delegated_mandate(
parent_mandate=mandate, parent_compact=compact,
delegator_private_key=priv,
sub="c", kid="k", iss="agent-a", aud="c",
iat=1, exp=2, jti="j",
cap=[Capability(action="read.data", constraints={"max_records": 5})],
task=TaskClaim(purpose="t"),
)
# Verify the chain
def resolve_key(delegator_id):
return pub
def resolve_compact(jti):
if jti == "parent-jti-1":
return compact
return None
verify_delegation_chain(delegated, resolve_key, resolve_compact)
def test_no_delegation(self):
mandate = ACTMandate(
alg="EdDSA", kid="k", iss="a", sub="b", aud="b",
iat=1, exp=2,
task=TaskClaim(purpose="t"),
cap=[Capability(action="x.y")],
)
verify_delegation_chain(mandate, lambda x: None) # no-op
def test_depth_exceeds_max(self):
mandate = ACTMandate(
alg="EdDSA", kid="k", iss="a", sub="b", aud="b",
iat=1, exp=2,
task=TaskClaim(purpose="t"),
cap=[Capability(action="x.y")],
delegation=Delegation(depth=5, max_depth=3, chain=[
DelegationEntry(delegator="x", jti="j", sig="s")
for _ in range(5)
]),
)
with pytest.raises(ACTDelegationError, match="exceeds"):
verify_delegation_chain(mandate, lambda x: None)

View File

@@ -0,0 +1,84 @@
"""Tests for act.ledger module."""
import pytest
from act.errors import ACTLedgerImmutabilityError
from act.ledger import ACTLedger
from act.token import ACTRecord, Capability, TaskClaim
def make_record(jti, wid=None):
return ACTRecord(
alg="EdDSA", kid="k", iss="a", sub="b", aud="b",
iat=1772064000, exp=1772064900,
jti=jti, wid=wid,
task=TaskClaim(purpose="t"),
cap=[Capability(action="do.thing")],
exec_act="do.thing", pred=[], exec_ts=1772064100,
status="completed",
)
class TestACTLedger:
def test_append_and_get(self):
ledger = ACTLedger()
r = make_record("jti-1")
seq = ledger.append(r)
assert seq == 0
assert ledger.get("jti-1") is r
def test_sequential_ordering(self):
ledger = ACTLedger()
for i in range(5):
seq = ledger.append(make_record(f"jti-{i}"))
assert seq == i
def test_duplicate_rejected(self):
ledger = ACTLedger()
ledger.append(make_record("jti-1"))
with pytest.raises(ACTLedgerImmutabilityError):
ledger.append(make_record("jti-1"))
def test_get_missing(self):
ledger = ACTLedger()
assert ledger.get("missing") is None
def test_list_all(self):
ledger = ACTLedger()
ledger.append(make_record("a"))
ledger.append(make_record("b"))
records = ledger.list()
assert len(records) == 2
def test_list_by_wid(self):
ledger = ACTLedger()
ledger.append(make_record("a", wid="w1"))
ledger.append(make_record("b", wid="w2"))
ledger.append(make_record("c", wid="w1"))
assert len(ledger.list("w1")) == 2
assert len(ledger.list("w2")) == 1
assert len(ledger.list("w3")) == 0
def test_verify_integrity_empty(self):
ledger = ACTLedger()
assert ledger.verify_integrity() is True
def test_verify_integrity_with_records(self):
ledger = ACTLedger()
for i in range(10):
ledger.append(make_record(f"jti-{i}"))
assert ledger.verify_integrity() is True
def test_verify_integrity_tampered(self):
ledger = ACTLedger()
ledger.append(make_record("jti-1"))
ledger.append(make_record("jti-2"))
# Tamper with chain hash
ledger._chain_hashes[0] = b"\x00" * 32
assert ledger.verify_integrity() is False
def test_len(self):
ledger = ACTLedger()
assert len(ledger) == 0
ledger.append(make_record("a"))
assert len(ledger) == 1

View File

@@ -0,0 +1,103 @@
"""Tests for act.lifecycle module."""
import time
import uuid
import pytest
from act.crypto import generate_ed25519_keypair, sign
from act.errors import ACTCapabilityError, ACTPhaseError
from act.lifecycle import transition_to_record
from act.token import (
ACTMandate,
ACTRecord,
Capability,
Delegation,
ErrorClaim,
TaskClaim,
decode_jws,
encode_jws,
)
from act.crypto import verify
@pytest.fixture
def keys():
iss_priv, iss_pub = generate_ed25519_keypair()
sub_priv, sub_pub = generate_ed25519_keypair()
return iss_priv, iss_pub, sub_priv, sub_pub
@pytest.fixture
def mandate(keys):
iss_priv, _, _, _ = keys
m = ACTMandate(
alg="EdDSA", kid="iss-key",
iss="agent-a", sub="agent-b", aud="agent-b",
iat=1772064000, exp=1772064900,
jti=str(uuid.uuid4()),
task=TaskClaim(purpose="test"),
cap=[Capability(action="read.data"), Capability(action="write.result")],
delegation=Delegation(depth=0, max_depth=2, chain=[]),
)
return m
class TestTransitionToRecord:
def test_basic_transition(self, mandate, keys):
_, _, sub_priv, sub_pub = keys
record, compact = transition_to_record(
mandate, sub_kid="sub-key", sub_private_key=sub_priv,
exec_act="read.data", pred=[], status="completed",
)
assert isinstance(record, ACTRecord)
assert record.exec_act == "read.data"
assert record.kid == "sub-key"
assert record.iss == mandate.iss # preserved
# Verify signature
_, _, sig, si = decode_jws(compact)
verify(sub_pub, sig, si)
def test_with_hashes(self, mandate, keys):
_, _, sub_priv, _ = keys
record, _ = transition_to_record(
mandate, sub_kid="k", sub_private_key=sub_priv,
exec_act="write.result", pred=[], status="completed",
inp_hash="abc", out_hash="def",
)
assert record.inp_hash == "abc"
assert record.out_hash == "def"
def test_with_error(self, mandate, keys):
_, _, sub_priv, _ = keys
record, _ = transition_to_record(
mandate, sub_kid="k", sub_private_key=sub_priv,
exec_act="read.data", pred=[], status="failed",
err=ErrorClaim(code="timeout", detail="request timed out"),
)
assert record.status == "failed"
assert record.err is not None
assert record.err.code == "timeout"
def test_rejects_bad_exec_act(self, mandate, keys):
_, _, sub_priv, _ = keys
with pytest.raises(ACTCapabilityError):
transition_to_record(
mandate, sub_kid="k", sub_private_key=sub_priv,
exec_act="delete.everything", pred=[],
)
def test_preserves_phase1_claims(self, mandate, keys):
_, _, sub_priv, _ = keys
record, _ = transition_to_record(
mandate, sub_kid="k", sub_private_key=sub_priv,
exec_act="read.data", pred=[], status="completed",
)
assert record.iss == mandate.iss
assert record.sub == mandate.sub
assert record.aud == mandate.aud
assert record.iat == mandate.iat
assert record.exp == mandate.exp
assert record.jti == mandate.jti
assert record.task == mandate.task
assert record.cap == mandate.cap

View File

@@ -0,0 +1,244 @@
"""Tests for act.token module."""
import json
import time
import uuid
import pytest
from act.token import (
ACTMandate,
ACTRecord,
Capability,
Delegation,
DelegationEntry,
ErrorClaim,
Oversight,
TaskClaim,
_b64url_decode,
_b64url_encode,
decode_jws,
encode_jws,
parse_token,
validate_action_name,
)
from act.errors import ACTPhaseError, ACTValidationError
@pytest.fixture
def base_time():
return 1772064000
@pytest.fixture
def mandate(base_time):
return ACTMandate(
alg="EdDSA",
kid="test-key",
iss="agent-a",
sub="agent-b",
aud="agent-b",
iat=base_time,
exp=base_time + 900,
jti=str(uuid.uuid4()),
task=TaskClaim(purpose="test_task"),
cap=[Capability(action="read.data")],
)
class TestBase64url:
def test_roundtrip(self):
data = b"hello world"
assert _b64url_decode(_b64url_encode(data)) == data
def test_no_padding(self):
encoded = _b64url_encode(b"test")
assert "=" not in encoded
class TestActionNameValidation:
def test_valid_simple(self):
validate_action_name("read")
def test_valid_dotted(self):
validate_action_name("read.data")
def test_valid_with_hyphens(self):
validate_action_name("read-write.data_item")
def test_invalid_starts_with_digit(self):
with pytest.raises(ACTValidationError):
validate_action_name("1read")
def test_invalid_empty(self):
with pytest.raises(ACTValidationError):
validate_action_name("")
def test_invalid_double_dot(self):
with pytest.raises(ACTValidationError):
validate_action_name("read..data")
class TestTaskClaim:
def test_roundtrip(self):
t = TaskClaim(purpose="test", data_sensitivity="restricted")
d = t.to_dict()
t2 = TaskClaim.from_dict(d)
assert t == t2
def test_missing_purpose(self):
with pytest.raises(ACTValidationError):
TaskClaim.from_dict({})
class TestCapability:
def test_roundtrip(self):
c = Capability(action="read.data", constraints={"max": 10})
d = c.to_dict()
c2 = Capability.from_dict(d)
assert c == c2
def test_validates_action(self):
with pytest.raises(ACTValidationError):
Capability(action="")
class TestDelegation:
def test_roundtrip(self):
d = Delegation(
depth=1,
max_depth=3,
chain=[DelegationEntry(delegator="a", jti="j1", sig="sig1")],
)
as_dict = d.to_dict()
d2 = Delegation.from_dict(as_dict)
assert d.depth == d2.depth
assert len(d2.chain) == 1
class TestACTMandate:
def test_validate_success(self, mandate):
mandate.validate()
def test_validate_missing_iss(self, base_time):
m = ACTMandate(
alg="EdDSA", kid="k", iss="", sub="b", aud="b",
iat=base_time, exp=base_time + 900,
task=TaskClaim(purpose="t"), cap=[Capability(action="x.y")],
)
with pytest.raises(ACTValidationError, match="iss"):
m.validate()
def test_validate_forbidden_alg(self, base_time):
m = ACTMandate(
alg="HS256", kid="k", iss="a", sub="b", aud="b",
iat=base_time, exp=base_time + 900,
task=TaskClaim(purpose="t"), cap=[Capability(action="x.y")],
)
with pytest.raises(ACTValidationError):
m.validate()
def test_validate_alg_none(self, base_time):
m = ACTMandate(
alg="none", kid="k", iss="a", sub="b", aud="b",
iat=base_time, exp=base_time + 900,
task=TaskClaim(purpose="t"), cap=[Capability(action="x.y")],
)
with pytest.raises(ACTValidationError):
m.validate()
def test_to_claims_includes_optional(self, base_time):
m = ACTMandate(
alg="EdDSA", kid="k", iss="a", sub="b", aud="b",
iat=base_time, exp=base_time + 900,
task=TaskClaim(purpose="t"), cap=[Capability(action="x.y")],
wid="w-1",
oversight=Oversight(requires_approval_for=["x.y"]),
)
claims = m.to_claims()
assert claims["wid"] == "w-1"
assert "oversight" in claims
def test_is_phase2(self, mandate):
assert mandate.is_phase2() is False
def test_from_claims_rejects_phase2(self):
with pytest.raises(ACTPhaseError):
ACTMandate.from_claims(
{"alg": "EdDSA", "typ": "act+jwt", "kid": "k"},
{"exec_act": "x", "iss": "a", "sub": "b", "aud": "b",
"iat": 1, "exp": 2, "jti": "j",
"task": {"purpose": "t"}, "cap": [{"action": "x"}]},
)
class TestACTRecord:
def test_from_mandate(self, mandate):
r = ACTRecord.from_mandate(
mandate, kid="sub-key", exec_act="read.data",
pred=[], status="completed",
)
assert r.iss == mandate.iss
assert r.exec_act == "read.data"
assert r.kid == "sub-key"
def test_validate_bad_status(self, mandate):
r = ACTRecord.from_mandate(
mandate, kid="k", exec_act="read.data",
pred=[], exec_ts=mandate.iat + 100, status="invalid",
)
with pytest.raises(ACTValidationError, match="status"):
r.validate()
def test_is_phase2(self, mandate):
r = ACTRecord.from_mandate(
mandate, kid="k", exec_act="read.data",
pred=[], status="completed",
)
assert r.is_phase2() is True
def test_from_claims_rejects_phase1(self):
with pytest.raises(ACTPhaseError):
ACTRecord.from_claims(
{"alg": "EdDSA", "typ": "act+jwt", "kid": "k"},
{"iss": "a", "sub": "b", "aud": "b",
"iat": 1, "exp": 2, "jti": "j",
"task": {"purpose": "t"}, "cap": [{"action": "x"}]},
)
class TestJWSSerialization:
def test_decode_invalid_parts(self):
with pytest.raises(ACTValidationError):
decode_jws("only.two")
def test_decode_invalid_header(self):
with pytest.raises(ACTValidationError):
decode_jws("!!!.cGF5bG9hZA.c2ln")
def test_decode_wrong_typ(self):
header = _b64url_encode(json.dumps({"alg": "EdDSA", "typ": "jwt", "kid": "k"}).encode())
payload = _b64url_encode(json.dumps({"iss": "a"}).encode())
sig = _b64url_encode(b"sig")
with pytest.raises(ACTValidationError, match="typ"):
decode_jws(f"{header}.{payload}.{sig}")
def test_parse_token_phase1(self, mandate):
from act.crypto import generate_ed25519_keypair, sign
priv, pub = generate_ed25519_keypair()
sig = sign(priv, mandate.signing_input())
compact = encode_jws(mandate, sig)
parsed = parse_token(compact)
assert isinstance(parsed, ACTMandate)
def test_parse_token_phase2(self, mandate):
from act.crypto import generate_ed25519_keypair, sign
priv, pub = generate_ed25519_keypair()
record = ACTRecord.from_mandate(
mandate, kid="k", exec_act="read.data",
pred=[], status="completed",
)
sig = sign(priv, record.signing_input())
compact = encode_jws(record, sig)
parsed = parse_token(compact)
assert isinstance(parsed, ACTRecord)

View File

@@ -0,0 +1,35 @@
"""Tests for act.vectors module — Appendix B test vectors."""
import pytest
from act.vectors import generate_vectors, validate_vectors
class TestVectorGeneration:
def test_generates_15_vectors(self):
vectors, ctx = generate_vectors()
assert len(vectors) == 15
def test_vector_ids(self):
vectors, _ = generate_vectors()
ids = [v.id for v in vectors]
expected = [f"B.{i}" for i in range(1, 16)]
assert ids == expected
def test_valid_vectors_have_compact(self):
vectors, _ = generate_vectors()
for v in vectors:
if v.valid and v.id != "B.7":
assert v.compact, f"{v.id} should have compact"
def test_invalid_vectors_have_exception(self):
vectors, _ = generate_vectors()
for v in vectors:
if not v.valid:
assert v.expected_exception is not None, \
f"{v.id} should have expected_exception"
class TestVectorValidation:
def test_all_vectors_pass(self):
assert validate_vectors() is True

View File

@@ -0,0 +1,191 @@
"""Tests for act.verify module."""
import time
import uuid
import pytest
from act.crypto import (
ACTKeyResolver,
KeyRegistry,
generate_ed25519_keypair,
sign,
)
from act.errors import (
ACTAudienceMismatchError,
ACTCapabilityError,
ACTExpiredError,
ACTPhaseError,
ACTSignatureError,
ACTValidationError,
)
from act.ledger import ACTLedger
from act.lifecycle import transition_to_record
from act.token import (
ACTMandate,
ACTRecord,
Capability,
Delegation,
TaskClaim,
encode_jws,
)
from act.verify import ACTVerifier
@pytest.fixture
def setup():
iss_priv, iss_pub = generate_ed25519_keypair()
sub_priv, sub_pub = generate_ed25519_keypair()
registry = KeyRegistry()
registry.register("iss-key", iss_pub)
registry.register("sub-key", sub_pub)
resolver = ACTKeyResolver(registry=registry)
base_time = 1772064000
return {
"iss_priv": iss_priv, "iss_pub": iss_pub,
"sub_priv": sub_priv, "sub_pub": sub_pub,
"registry": registry, "resolver": resolver,
"base_time": base_time,
}
def make_mandate(setup, **overrides):
bt = setup["base_time"]
defaults = dict(
alg="EdDSA", kid="iss-key",
iss="agent-issuer", sub="agent-subject",
aud="agent-subject",
iat=bt, exp=bt + 900,
jti=str(uuid.uuid4()),
task=TaskClaim(purpose="test"),
cap=[Capability(action="read.data")],
)
defaults.update(overrides)
return ACTMandate(**defaults)
def sign_mandate(mandate, priv_key):
sig = sign(priv_key, mandate.signing_input())
return encode_jws(mandate, sig)
class TestVerifyMandate:
def test_valid_mandate(self, setup):
verifier = ACTVerifier(
setup["resolver"],
verifier_id="agent-subject",
trusted_issuers={"agent-issuer"},
)
mandate = make_mandate(setup)
compact = sign_mandate(mandate, setup["iss_priv"])
result = verifier.verify_mandate(compact, now=setup["base_time"] + 100)
assert result.iss == "agent-issuer"
def test_expired(self, setup):
verifier = ACTVerifier(setup["resolver"], verifier_id="agent-subject")
mandate = make_mandate(setup)
compact = sign_mandate(mandate, setup["iss_priv"])
with pytest.raises(ACTExpiredError):
verifier.verify_mandate(compact, now=setup["base_time"] + 2000)
def test_wrong_audience(self, setup):
verifier = ACTVerifier(
setup["resolver"], verifier_id="other-agent",
trusted_issuers={"agent-issuer"},
)
mandate = make_mandate(setup)
compact = sign_mandate(mandate, setup["iss_priv"])
with pytest.raises(ACTAudienceMismatchError):
verifier.verify_mandate(
compact, now=setup["base_time"] + 100, check_sub=False,
)
def test_untrusted_issuer(self, setup):
verifier = ACTVerifier(
setup["resolver"], verifier_id="agent-subject",
trusted_issuers={"trusted-only"},
)
mandate = make_mandate(setup)
compact = sign_mandate(mandate, setup["iss_priv"])
with pytest.raises(ACTValidationError, match="not trusted"):
verifier.verify_mandate(compact, now=setup["base_time"] + 100)
def test_signature_failure(self, setup):
verifier = ACTVerifier(setup["resolver"], verifier_id="agent-subject")
mandate = make_mandate(setup)
compact = sign_mandate(mandate, setup["iss_priv"])
# Tamper with signature
parts = compact.split(".")
parts[2] = parts[2][:-4] + "XXXX"
tampered = ".".join(parts)
with pytest.raises(ACTSignatureError):
verifier.verify_mandate(tampered, now=setup["base_time"] + 100)
def test_phase2_as_mandate(self, setup):
verifier = ACTVerifier(setup["resolver"])
mandate = make_mandate(setup)
record, compact = transition_to_record(
mandate, sub_kid="sub-key", sub_private_key=setup["sub_priv"],
exec_act="read.data", pred=[], status="completed",
exec_ts=setup["base_time"] + 100,
)
with pytest.raises(ACTPhaseError):
verifier.verify_mandate(compact, now=setup["base_time"] + 100)
def test_future_iat(self, setup):
verifier = ACTVerifier(setup["resolver"], verifier_id="agent-subject")
bt = setup["base_time"]
mandate = make_mandate(setup, iat=bt + 1000, exp=bt + 2000)
compact = sign_mandate(mandate, setup["iss_priv"])
with pytest.raises(ACTValidationError, match="future"):
verifier.verify_mandate(compact, now=bt)
class TestVerifyRecord:
def test_valid_record(self, setup):
verifier = ACTVerifier(
setup["resolver"],
verifier_id="agent-subject",
trusted_issuers={"agent-issuer"},
)
mandate = make_mandate(setup)
record, compact = transition_to_record(
mandate, sub_kid="sub-key", sub_private_key=setup["sub_priv"],
exec_act="read.data", pred=[],
exec_ts=setup["base_time"] + 100, status="completed",
)
result = verifier.verify_record(
compact, now=setup["base_time"] + 200, check_aud=False,
)
assert result.exec_act == "read.data"
def test_wrong_signer(self, setup):
verifier = ACTVerifier(setup["resolver"])
mandate = make_mandate(setup)
record = ACTRecord.from_mandate(
mandate, kid="sub-key", exec_act="read.data",
pred=[], exec_ts=setup["base_time"] + 100, status="completed",
)
# Sign with iss key instead of sub key
sig = sign(setup["iss_priv"], record.signing_input())
compact = encode_jws(record, sig)
with pytest.raises(ACTSignatureError):
verifier.verify_record(compact, now=setup["base_time"] + 200)
def test_with_dag_validation(self, setup):
verifier = ACTVerifier(
setup["resolver"], verifier_id="agent-subject",
trusted_issuers={"agent-issuer"},
)
ledger = ACTLedger()
mandate = make_mandate(setup)
record, compact = transition_to_record(
mandate, sub_kid="sub-key", sub_private_key=setup["sub_priv"],
exec_act="read.data", pred=[],
exec_ts=setup["base_time"] + 100, status="completed",
)
result = verifier.verify_record(
compact, store=ledger,
now=setup["base_time"] + 200, check_aud=False,
)
assert result.status == "completed"

Binary file not shown.

View File

@@ -0,0 +1,111 @@
# WIMSE ECT — Python Reference Implementation
Python reference implementation of [Execution Context Tokens (ECTs)](../../draft-nennemann-wimse-execution-context-01.txt) for WIMSE. Implements ECT creation (ES256), verification (Section 7), DAG validation (Section 6), and an in-memory audit ledger (Section 9).
## Layout
```
python/
├── pyproject.toml
├── ect/ # library
│ ├── __init__.py
│ ├── types.py # Payload, constants
│ ├── create.py # create(), generate_key()
│ ├── verify.py # parse(), verify(), VerifyOptions
│ ├── dag.py # validate_dag(), ECTStore, DAGConfig
│ ├── ledger.py # Ledger, MemoryLedger
│ ├── config.py # Config, load_config_from_env()
│ ├── jti_cache.py # JTICache for replay protection
│ └── validate.py # validate_ext, valid_uuid, validate_hash_format
├── tests/
│ ├── test_create.py
│ └── test_dag.py
├── testdata/
│ └── valid_root_ect_payload.json
└── demo.py # two-agent workflow demo
```
## Install
```bash
cd refimpl/python && pip install -e .
```
## Usage
```python
from ect import (
Payload,
create,
generate_key,
CreateOptions,
verify,
VerifyOptions,
MemoryLedger,
)
cfg = load_config_from_env()
key = generate_key()
payload = Payload(
iss="spiffe://example.com/agent/a",
aud=["spiffe://example.com/agent/b"],
iat=int(time.time()),
exp=int(time.time()) + 600,
jti="550e8400-e29b-41d4-a716-446655440000",
exec_act="review_spec",
pred=[],
ext={
"pol": "policy_v1",
"pol_decision": "approved",
},
)
compact = create(payload, key, cfg.create_options("agent-a-key"))
store = MemoryLedger()
opts = cfg.verify_options()
opts.verifier_id = "spiffe://example.com/agent/b"
opts.resolve_key = lambda kid: key.public_key() if kid == "agent-a-key" else None
opts.store = store
parsed = verify(compact, opts)
store.append(compact, parsed.payload)
```
## Demo
```bash
cd refimpl/python && python3 demo.py
```
## Tests
```bash
cd refimpl/python && python3 -m pytest tests/ -v
```
Unit tests require **90% coverage** minimum (`pytest` is configured with `--cov-fail-under=90` in `pyproject.toml`). Install dev deps: `pip install -e ".[dev]"`. Uncovered lines are mainly abstract base methods and a few verify branches that need manually built tokens.
## draft-01 claim changes
| -00 (previous) | -01 (current) | Notes |
|----------------|---------------|-------|
| `par` | `pred` | Predecessor task IDs |
| `pol`, `pol_decision` | removed (use `ect_ext`) | Policy claims moved to extension object |
| `sub` | not defined | Standard JWT claim, not part of ECT spec |
| `typ: wimse-exec+jwt` | `typ: exec+jwt` (preferred) | Both accepted for backward compat |
| `max_par_length` | `max_pred_length` | Renamed to match `pred` claim |
## Production configuration (environment)
Same env vars as the Go refimpl: `ECT_IAT_MAX_AGE_MINUTES`, `ECT_IAT_MAX_FUTURE_SEC`, `ECT_DEFAULT_EXPIRY_MIN`, `ECT_JTI_REPLAY_CACHE_SIZE`, `ECT_JTI_REPLAY_TTL_MIN`.
### Replay cache (multi-instance)
The provided JTI cache is in-memory only. For multiple verifier instances, use a shared store (Redis, DB) and pass a `jti_seen` callable that checks/records JTIs there. See refimpl/README for an overview.
## Dependencies
- PyJWT, cryptography (ES256).
## License
Same as the Internet-Draft (IETF Trust). Code under Revised BSD per BCP 78/79.

View File

@@ -0,0 +1,102 @@
#!/usr/bin/env python3
"""Two-agent ECT workflow demo: Agent A creates root ECT, Agent B verifies and creates child."""
import time
from ect import (
Payload,
create,
generate_key,
CreateOptions,
verify,
VerifyOptions,
MemoryLedger,
)
def main():
ledger = MemoryLedger()
now = int(time.time())
key_a = generate_key()
agent_a = "spiffe://example.com/agent/spec-reviewer"
agent_b = "spiffe://example.com/agent/implementer"
kid_a = "agent-a-key"
# 1) Agent A creates root ECT (task id = jti per spec)
root_jti = "550e8400-e29b-41d4-a716-446655440001"
payload_a = Payload(
iss=agent_a,
aud=[agent_b],
iat=now,
exp=now + 600,
jti=root_jti,
wid="wf-demo-001",
exec_act="review_requirements_spec",
pred=[],
ext={
"pol": "spec_review_policy_v2",
"pol_decision": "approved",
},
)
ect_a = create(payload_a, key_a, CreateOptions(key_id=kid_a))
print("Agent A created root ECT (jti=550e8400-..., review_requirements_spec)")
# 2) Agent B verifies
def resolve_key(kid):
if kid == kid_a:
return key_a.public_key()
return None
opts = VerifyOptions(
verifier_id=agent_b,
resolve_key=resolve_key,
store=ledger,
now=now,
)
parsed = verify(ect_a, opts)
ledger.append(ect_a, parsed.payload)
print("Agent B verified root ECT and appended to ledger")
# 3) Agent B creates child ECT (pred contains predecessor jti values per spec)
key_b = generate_key()
kid_b = "agent-b-key"
child_jti = "550e8400-e29b-41d4-a716-446655440002"
payload_b = Payload(
iss=agent_b,
aud=["spiffe://example.com/system/ledger"],
iat=now + 1,
exp=now + 600,
jti=child_jti,
wid="wf-demo-001",
exec_act="implement_module",
pred=[root_jti],
ext={
"pol": "coding_standards_v3",
"pol_decision": "approved",
},
)
ect_b = create(payload_b, key_b, CreateOptions(key_id=kid_b))
print("Agent B created child ECT (jti=550e8400-...002, implement_module, pred=[predecessor jti])")
# 4) Verify child ECT with DAG
def resolver_b(kid):
if kid == kid_b:
return key_b.public_key()
if kid == kid_a:
return key_a.public_key()
return None
opts_b = VerifyOptions(
verifier_id="spiffe://example.com/system/ledger",
resolve_key=resolver_b,
store=ledger,
now=now + 2,
)
parsed_b = verify(ect_b, opts_b)
ledger.append(ect_b, parsed_b.payload)
print("Verified child ECT with DAG validation and appended to ledger")
print(f"Ledger entries: {parsed.payload.jti} ({parsed.payload.exec_act}), {parsed_b.payload.jti} ({parsed_b.payload.exec_act})")
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,55 @@
# WIMSE Execution Context Tokens (ECT) — Python reference implementation
# draft-nennemann-wimse-execution-context-01
from ect.types import (
ECT_TYPE,
ECT_TYPE_LEGACY,
Payload,
)
from ect.create import create, generate_key, CreateOptions, default_create_options
from ect.verify import (
ParsedECT,
parse,
verify,
VerifyOptions,
default_verify_options,
KeyResolver,
)
from ect.dag import (
ECTStore,
DAGConfig,
default_dag_config,
validate_dag,
)
from ect.ledger import Ledger, MemoryLedger, LedgerEntry, ErrTaskIDExists
from ect.config import Config, default_config, load_config_from_env
from ect.jti_cache import JTICache, new_jti_cache
__all__ = [
"ECT_TYPE",
"ECT_TYPE_LEGACY",
"Payload",
"create",
"generate_key",
"CreateOptions",
"default_create_options",
"ParsedECT",
"parse",
"verify",
"VerifyOptions",
"default_verify_options",
"KeyResolver",
"ECTStore",
"DAGConfig",
"default_dag_config",
"validate_dag",
"Ledger",
"MemoryLedger",
"LedgerEntry",
"ErrTaskIDExists",
"Config",
"default_config",
"load_config_from_env",
"JTICache",
"new_jti_cache",
]

View File

@@ -0,0 +1,61 @@
"""Production config from environment."""
from __future__ import annotations
import os
from dataclasses import dataclass
ENV_IAT_MAX_AGE_MINUTES = "ECT_IAT_MAX_AGE_MINUTES"
ENV_IAT_MAX_FUTURE_SEC = "ECT_IAT_MAX_FUTURE_SEC"
ENV_DEFAULT_EXPIRY_MIN = "ECT_DEFAULT_EXPIRY_MIN"
ENV_JTI_REPLAY_CACHE_SIZE = "ECT_JTI_REPLAY_CACHE_SIZE"
ENV_JTI_REPLAY_TTL_MIN = "ECT_JTI_REPLAY_TTL_MIN"
@dataclass
class Config:
iat_max_age_sec: int = 900
iat_max_future_sec: int = 30
default_expiry_sec: int = 600
jti_replay_size: int = 0
jti_replay_ttl_sec: int = 3600
def create_options(self, key_id: str) -> "CreateOptions":
from ect.create import CreateOptions
return CreateOptions(
key_id=key_id,
default_expiry_sec=self.default_expiry_sec,
)
def verify_options(self) -> "VerifyOptions":
from ect.verify import VerifyOptions
from ect.dag import default_dag_config
return VerifyOptions(
iat_max_age_sec=self.iat_max_age_sec,
iat_max_future_sec=self.iat_max_future_sec,
dag=default_dag_config(),
)
def default_config() -> Config:
return Config()
def _int_env(name: str, default: int) -> int:
v = os.environ.get(name)
if v is None or v == "":
return default
try:
return int(v)
except ValueError:
return default
def load_config_from_env() -> Config:
c = default_config()
c.iat_max_age_sec = _int_env(ENV_IAT_MAX_AGE_MINUTES, 15) * 60
c.iat_max_future_sec = _int_env(ENV_IAT_MAX_FUTURE_SEC, 30)
c.default_expiry_sec = _int_env(ENV_DEFAULT_EXPIRY_MIN, 10) * 60
c.jti_replay_size = _int_env(ENV_JTI_REPLAY_CACHE_SIZE, 0)
c.jti_replay_ttl_sec = _int_env(ENV_JTI_REPLAY_TTL_MIN, 60) * 60
return c

View File

@@ -0,0 +1,104 @@
"""ECT creation: build and sign JWT with ES256."""
from __future__ import annotations
import copy
import time
from dataclasses import dataclass
from typing import Optional
import jwt
from cryptography.hazmat.primitives.asymmetric.ec import EllipticCurvePrivateKey
from ect.types import ECT_TYPE, Payload
from ect.validate import (
DEFAULT_MAX_PRED_LENGTH,
validate_ext,
validate_hash_format,
valid_uuid,
)
@dataclass
class CreateOptions:
key_id: str
iat_max_age_sec: int = 900 # 15 min
default_expiry_sec: int = 600 # 10 min
validate_uuids: bool = False
max_pred_length: int = 0 # 0 = no limit; use DEFAULT_MAX_PRED_LENGTH for 100
def default_create_options() -> CreateOptions:
return CreateOptions(key_id="")
def _validate_payload(p: Payload, opts: CreateOptions) -> None:
if not p.iss:
raise ValueError("ect: iss required")
if not p.aud:
raise ValueError("ect: aud required")
if not p.jti:
raise ValueError("ect: jti required")
if not p.exec_act:
raise ValueError("ect: exec_act required")
if opts.validate_uuids:
if not valid_uuid(p.jti):
raise ValueError("ect: jti must be UUID format")
if p.wid and not valid_uuid(p.wid):
raise ValueError("ect: wid must be UUID format when set")
max_pred = opts.max_pred_length or 0
if max_pred > 0 and len(p.pred) > max_pred:
raise ValueError("ect: pred exceeds max length")
if p.inp_hash:
validate_hash_format(p.inp_hash)
if p.out_hash:
validate_hash_format(p.out_hash)
validate_ext(p.ext)
# compensation in ext per spec
if p.ext and p.ext.get("compensation_reason") and not p.ext.get("compensation_required"):
raise ValueError("ect: ext.compensation_reason requires ext.compensation_required true")
def create(
payload: Payload,
private_key: EllipticCurvePrivateKey,
opts: CreateOptions,
) -> str:
"""Build and sign an ECT. Payload must have required claims; iat/exp can be 0 for defaults.
create() works on a deep copy so the caller's payload is not modified.
"""
if not opts.key_id:
raise ValueError("ect: KeyID required")
# Work on a copy so we do not mutate the caller's payload.
payload = copy.deepcopy(payload)
now = int(time.time())
if payload.iat == 0:
payload.iat = now
if payload.exp == 0:
payload.exp = now + (opts.default_expiry_sec or 600)
if payload.pred is None:
payload.pred = []
_validate_payload(payload, opts)
claims = payload.to_claims()
headers = {
"typ": ECT_TYPE,
"alg": "ES256",
"kid": opts.key_id,
}
return jwt.encode(
claims,
private_key,
algorithm="ES256",
headers=headers,
)
def generate_key() -> EllipticCurvePrivateKey:
"""Create an ECDSA P-256 key for ES256 (testing/demo)."""
from cryptography.hazmat.primitives.asymmetric import ec
return ec.generate_private_key(ec.SECP256R1())

View File

@@ -0,0 +1,96 @@
"""DAG validation per Section 6."""
from __future__ import annotations
from abc import ABC, abstractmethod
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from ect.types import Payload
from ect.validate import DEFAULT_MAX_PRED_LENGTH
DEFAULT_CLOCK_SKEW_TOLERANCE = 30
DEFAULT_MAX_ANCESTOR_LIMIT = 10000
class ECTStore(ABC):
"""Lookup of ECTs by task ID for DAG validation."""
@abstractmethod
def get_by_tid(self, tid: str) -> "Payload | None":
pass
@abstractmethod
def contains(self, tid: str, wid: str) -> bool:
pass
class DAGConfig:
def __init__(
self,
clock_skew_tolerance: int = DEFAULT_CLOCK_SKEW_TOLERANCE,
max_ancestor_limit: int = DEFAULT_MAX_ANCESTOR_LIMIT,
max_pred_length: int = 0,
):
self.clock_skew_tolerance = clock_skew_tolerance or DEFAULT_CLOCK_SKEW_TOLERANCE
self.max_ancestor_limit = max_ancestor_limit or DEFAULT_MAX_ANCESTOR_LIMIT
self.max_pred_length = max_pred_length or 0
def default_dag_config() -> DAGConfig:
return DAGConfig()
def _has_cycle(
target_tid: str,
pred_ids: list[str],
store: ECTStore,
visited: set[str],
max_depth: int,
) -> bool:
if len(visited) >= max_depth:
return True
for pred_id in pred_ids:
if pred_id == target_tid:
return True
if pred_id in visited:
continue
visited.add(pred_id)
pred = store.get_by_tid(pred_id)
if pred is not None:
if _has_cycle(target_tid, pred.pred, store, visited, max_depth):
return True
return False
def validate_dag(
payload: "Payload",
store: ECTStore,
cfg: DAGConfig,
) -> None:
"""Section 6.2: uniqueness (by jti), predecessor existence, temporal ordering, acyclicity, predecessor policy."""
if cfg.max_pred_length > 0 and len(payload.pred) > cfg.max_pred_length:
raise ValueError("ect: pred exceeds max length")
if store.contains(payload.jti, payload.wid or ""):
raise ValueError(f"ect: task ID (jti) already exists: {payload.jti}")
for pred_id in payload.pred:
pred = store.get_by_tid(pred_id)
if pred is None:
raise ValueError(f"ect: predecessor task not found: {pred_id}")
if pred.iat >= payload.iat + cfg.clock_skew_tolerance:
raise ValueError(f"ect: predecessor task not earlier than current: {pred_id}")
visited: set[str] = set()
if _has_cycle(payload.jti, payload.pred, store, visited, cfg.max_ancestor_limit):
raise ValueError("ect: circular dependency or depth limit exceeded")
# Predecessor policy decision: only when predecessor has policy claims in ext per -01
for pred_id in payload.pred:
pred = store.get_by_tid(pred_id)
if pred and pred.has_policy_claims() and pred.pol_decision() in ("rejected", "pending_human_review"):
if not payload.compensation_required():
raise ValueError(
"ect: predecessor has non-approved pol_decision; current ECT must be compensation/remediation or have ext.compensation_required true"
)

View File

@@ -0,0 +1,52 @@
"""JTI replay cache for production verification."""
from __future__ import annotations
import threading
import time
from abc import ABC, abstractmethod
class JTICache(ABC):
@abstractmethod
def seen(self, jti: str) -> bool:
pass
@abstractmethod
def add(self, jti: str) -> None:
pass
class _MemoryJTICache(JTICache):
def __init__(self, max_size: int, ttl_sec: int) -> None:
self._max_size = max_size
self._ttl_sec = ttl_sec
self._by_jti: dict[str, float] = {}
self._lock = threading.RLock()
def seen(self, jti: str) -> bool:
with self._lock:
exp = self._by_jti.get(jti)
if exp is None:
return False
if time.time() > exp:
del self._by_jti[jti]
return False
return True
def add(self, jti: str) -> None:
with self._lock:
now = time.time()
for k, exp in list(self._by_jti.items()):
if now > exp:
del self._by_jti[k]
if self._max_size > 0 and len(self._by_jti) >= self._max_size and jti not in self._by_jti:
# evict one
for k in self._by_jti:
del self._by_jti[k]
break
self._by_jti[jti] = now + self._ttl_sec
def new_jti_cache(max_size: int, ttl_sec: int) -> JTICache:
return _MemoryJTICache(max_size, ttl_sec)

View File

@@ -0,0 +1,97 @@
"""Audit ledger per Section 9."""
from __future__ import annotations
import time
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import TYPE_CHECKING
from ect.types import Payload
if TYPE_CHECKING:
pass
class ErrTaskIDExists(Exception):
"""Raised when appending an ECT whose tid already exists."""
@dataclass
class LedgerEntry:
ledger_sequence: int
task_id: str
agent_id: str
action: str
predecessors: list[str]
ect_jws: str
signature_verified: bool
verification_timestamp: float
stored_timestamp: float
class Ledger(ABC):
"""Append-only audit ledger; lookup by task id (jti)."""
@abstractmethod
def append(self, ect_jws: str, payload: Payload) -> int:
"""Returns new ledger sequence number."""
pass
@abstractmethod
def get_by_tid(self, tid: str) -> Payload | None:
pass
@abstractmethod
def contains(self, tid: str, wid: str) -> bool:
pass
class MemoryLedger(Ledger):
"""In-memory append-only ECT store implementing Ledger and ECTStore."""
def __init__(self) -> None:
self._seq = 0
self._by_tid: dict[str, "Payload"] = {}
self._entries: list[LedgerEntry] = []
self._lock = __import__("threading").Lock()
def append(self, ect_jws: str, payload: Payload) -> int:
if payload is None:
return 0
with self._lock:
wid = payload.wid or ""
if self._contains_locked(payload.jti, wid):
raise ErrTaskIDExists("ect: task ID (jti) already exists in ledger")
self._seq += 1
now = time.time()
entry = LedgerEntry(
ledger_sequence=self._seq,
task_id=payload.jti,
agent_id=payload.iss,
action=payload.exec_act,
predecessors=list(payload.pred) if payload.pred else [],
ect_jws=ect_jws,
signature_verified=True,
verification_timestamp=now,
stored_timestamp=now,
)
self._by_tid[payload.jti] = payload
self._entries.append(entry)
return self._seq
def get_by_tid(self, tid: str) -> Payload | None:
with self._lock:
return self._by_tid.get(tid)
def contains(self, tid: str, wid: str) -> bool:
with self._lock:
return self._contains_locked(tid, wid)
def _contains_locked(self, tid: str, wid: str) -> bool:
p = self._by_tid.get(tid)
if p is None:
return False
if not wid:
return True
return (p.wid or "") == wid

View File

@@ -0,0 +1,106 @@
"""ECT payload and claim types per draft-nennemann-wimse-ect-01 Section 4."""
from __future__ import annotations
import json
from dataclasses import dataclass, field
from typing import Any
# Preferred typ per -01; legacy accepted for backward compatibility.
ECT_TYPE = "exec+jwt"
ECT_TYPE_LEGACY = "wimse-exec+jwt"
def _audience_serialize(aud: list[str]) -> str | list[str]:
if len(aud) == 1:
return aud[0]
return aud
def _audience_deserialize(raw: Any) -> list[str]:
if isinstance(raw, list):
return [str(x) for x in raw]
if isinstance(raw, str):
return [raw]
raise ValueError("aud must be string or array of strings")
@dataclass
class Payload:
"""ECT JWT claims per Section 4. Task identity is jti only; no separate tid per spec."""
iss: str
aud: list[str]
iat: int
exp: int
jti: str
exec_act: str
pred: list[str] # predecessor jti values (renamed from par in -01)
wid: str = ""
inp_hash: str = ""
out_hash: str = ""
inp_classification: str = ""
ext: dict[str, Any] = field(default_factory=dict)
def to_claims(self) -> dict[str, Any]:
"""Export as JWT claims. Policy and compensation in ext per -01 spec."""
out: dict[str, Any] = {
"iss": self.iss,
"aud": _audience_serialize(self.aud),
"iat": self.iat,
"exp": self.exp,
"jti": self.jti,
"exec_act": self.exec_act,
"pred": self.pred,
}
if self.wid:
out["wid"] = self.wid
if self.inp_hash:
out["inp_hash"] = self.inp_hash
if self.out_hash:
out["out_hash"] = self.out_hash
if self.inp_classification:
out["inp_classification"] = self.inp_classification
if self.ext:
out["ect_ext"] = dict(self.ext)
return out
@classmethod
def from_claims(cls, claims: dict[str, Any]) -> Payload:
"""Build Payload from JWT claims. Policy claims read from ext per -01 spec."""
ext = claims.get("ect_ext") or {}
return cls(
iss=claims["iss"],
aud=_audience_deserialize(claims["aud"]),
iat=int(claims["iat"]),
exp=int(claims["exp"]),
jti=claims["jti"],
exec_act=claims["exec_act"],
pred=claims.get("pred") or [],
wid=claims.get("wid", ""),
inp_hash=claims.get("inp_hash", ""),
out_hash=claims.get("out_hash", ""),
inp_classification=claims.get("inp_classification", ""),
ext=ext,
)
def contains_audience(self, verifier_id: str) -> bool:
return verifier_id in self.aud
def compensation_required(self) -> bool:
"""Per spec: compensation_required is in ext."""
if not self.ext:
return False
return bool(self.ext.get("compensation_required"))
def has_policy_claims(self) -> bool:
"""True if both pol and pol_decision are present in ext (per -01, moved to extension)."""
if not self.ext:
return False
return bool(self.ext.get("pol")) and bool(self.ext.get("pol_decision"))
def pol_decision(self) -> str:
"""Return pol_decision from ext, or empty string."""
if not self.ext:
return ""
return str(self.ext.get("pol_decision", ""))

View File

@@ -0,0 +1,62 @@
"""Validation helpers: ext size/depth, UUID, inp_hash/out_hash format."""
from __future__ import annotations
import base64
import json
import re
from typing import Any
EXT_MAX_SIZE = 4096
EXT_MAX_DEPTH = 5
DEFAULT_MAX_PRED_LENGTH = 100
_UUID_RE = re.compile(
r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)
def _json_depth(obj: Any, depth: int = 0) -> int:
if depth > EXT_MAX_DEPTH:
return depth
if isinstance(obj, dict):
return max((_json_depth(v, depth + 1) for v in obj.values()), default=depth + 1)
if isinstance(obj, list):
return max((_json_depth(x, depth + 1) for x in obj), default=depth + 1)
return depth
def validate_ext(ext: dict[str, Any] | None) -> None:
"""Raise ValueError if ext exceeds EXT_MAX_SIZE or nesting depth EXT_MAX_DEPTH."""
if not ext:
return
raw = json.dumps(ext)
if len(raw.encode("utf-8")) > EXT_MAX_SIZE:
raise ValueError("ect: ext exceeds max size (4096 bytes)")
if _json_depth(ext) > EXT_MAX_DEPTH:
raise ValueError("ect: ext exceeds max nesting depth (5)")
def valid_uuid(s: str) -> bool:
"""Return True if s is a UUID string (RFC 9562)."""
return bool(_UUID_RE.match(s))
def validate_hash_format(s: str) -> None:
"""Raise ValueError if s is non-empty and not plain base64url per RFC 9449 / ECT spec.
The ECT spec (draft-nennemann-wimse-ect-01) and RFC 9449 specify
``base64url(SHA-256(data))`` — a plain base64url string without any
algorithm prefix. This matches how ACT handles hashes.
"""
if not s:
return
# Reject strings containing non-base64url characters.
# base64url alphabet: A-Z a-z 0-9 - _ (no padding '=' expected)
if not re.fullmatch(r"[A-Za-z0-9_-]+", s):
raise ValueError("ect: inp_hash/out_hash must be plain base64url (no prefix)")
# Verify it actually decodes.
pad = 4 - len(s) % 4
padded = s + "=" * pad if pad != 4 else s
try:
base64.urlsafe_b64decode(padded)
except Exception:
raise ValueError("ect: inp_hash/out_hash must be plain base64url (no prefix)") from None

View File

@@ -0,0 +1,154 @@
"""ECT verification per Section 7."""
from __future__ import annotations
import hmac
import time
from dataclasses import dataclass
from typing import Callable, Optional
import jwt
from cryptography.hazmat.primitives.asymmetric.ec import EllipticCurvePublicKey
from ect.types import ECT_TYPE, ECT_TYPE_LEGACY, Payload
from ect.dag import ECTStore, DAGConfig, validate_dag
from ect.validate import validate_ext, validate_hash_format, valid_uuid
@dataclass
class ParsedECT:
header: dict
payload: Payload
raw: str
KeyResolver = Callable[[str], Optional[EllipticCurvePublicKey]]
@dataclass
class VerifyOptions:
verifier_id: str = ""
resolve_key: Optional[KeyResolver] = None
store: Optional[ECTStore] = None
dag: Optional[DAGConfig] = None
now: Optional[int] = None # unix seconds; None = time.time()
iat_max_age_sec: int = 900
iat_max_future_sec: int = 30
jti_seen: Optional[Callable[[str], bool]] = None
wit_subject: str = ""
validate_uuids: bool = False
max_pred_length: int = 0 # 0 = no limit
on_verify_attempt: Optional[Callable[[str, Optional[Exception]], None]] = None # (jti, err) for observability
def default_verify_options() -> VerifyOptions:
from ect.dag import default_dag_config
return VerifyOptions(dag=default_dag_config())
def parse(compact: str) -> ParsedECT:
"""Parse compact JWS and return header + payload without verification."""
try:
unverified = jwt.decode(
compact,
options={"verify_signature": False, "verify_exp": False},
)
except Exception as e:
raise ValueError(f"ect: parse failed: {e}") from e
header = jwt.get_unverified_header(compact)
if header.get("alg") != "ES256":
raise ValueError("ect: expected ES256")
payload = Payload.from_claims(unverified)
return ParsedECT(header=header, payload=payload, raw=compact)
def verify(compact: str, opts: VerifyOptions) -> ParsedECT:
"""Full Section 7 verification and optional DAG validation."""
log_jti: list[str] = [""] # use list so callback sees updated jti
def set_log_jti(jti: str) -> None:
log_jti[0] = jti
err: Optional[Exception] = None
try:
return _verify_impl(compact, opts, set_log_jti)
except Exception as e:
err = e
raise
finally:
if opts.on_verify_attempt is not None:
opts.on_verify_attempt(log_jti[0], err)
def _verify_impl(compact: str, opts: VerifyOptions, set_log_jti: Callable[[str], None]) -> ParsedECT:
header = jwt.get_unverified_header(compact)
typ = header.get("typ") or ""
# Constant-time comparison for typ; accept both preferred and legacy values
if not hmac.compare_digest(typ, ECT_TYPE) and not hmac.compare_digest(typ, ECT_TYPE_LEGACY):
raise ValueError("ect: invalid typ parameter")
alg = header.get("alg")
if alg in ("none", "HS256", "HS384", "HS512"):
raise ValueError("ect: prohibited algorithm")
kid = header.get("kid")
if not kid:
raise ValueError("ect: missing kid")
if not opts.resolve_key:
raise ValueError("ect: ResolveKey required")
pub = opts.resolve_key(kid)
if pub is None:
raise ValueError("ect: unknown key identifier")
try:
claims = jwt.decode(
compact,
pub,
algorithms=["ES256"],
options={"verify_exp": False, "verify_aud": False, "verify_iat": False},
)
except jwt.InvalidSignatureError as e:
raise ValueError(f"ect: invalid signature: {e}") from e
except Exception as e:
raise ValueError(f"ect: verify failed: {e}") from e
payload = Payload.from_claims(claims)
set_log_jti(payload.jti)
validate_ext(payload.ext)
if opts.max_pred_length > 0 and len(payload.pred) > opts.max_pred_length:
raise ValueError("ect: pred exceeds max length")
if opts.validate_uuids:
if not valid_uuid(payload.jti):
raise ValueError("ect: jti must be UUID format")
if payload.wid and not valid_uuid(payload.wid):
raise ValueError("ect: wid must be UUID format when set")
if payload.inp_hash:
validate_hash_format(payload.inp_hash)
if payload.out_hash:
validate_hash_format(payload.out_hash)
if opts.wit_subject and payload.iss != opts.wit_subject:
raise ValueError("ect: issuer does not match WIT subject")
if opts.verifier_id and not payload.contains_audience(opts.verifier_id):
raise ValueError("ect: audience does not include verifier")
now = opts.now if opts.now is not None else int(time.time())
if now > payload.exp:
raise ValueError("ect: token expired")
if now - payload.iat > opts.iat_max_age_sec:
raise ValueError("ect: iat too far in the past")
if payload.iat > now + opts.iat_max_future_sec:
raise ValueError("ect: iat in the future")
# Required claims per spec: jti, exec_act, pred. pred may be set to [] when missing (from_claims already uses []).
if not payload.jti or not payload.exec_act:
raise ValueError("ect: missing required claims (jti, exec_act, pred)")
if payload.pred is None:
payload.pred = []
if opts.store is not None and opts.dag is not None:
validate_dag(payload, opts.store, opts.dag)
if opts.jti_seen is not None and opts.jti_seen(payload.jti):
raise ValueError("ect: jti already seen (replay)")
return ParsedECT(header=header, payload=payload, raw=compact)

View File

@@ -0,0 +1,25 @@
[build-system]
requires = ["setuptools>=61", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "ietf-ect"
version = "0.1.0"
description = "WIMSE Execution Context Tokens (ECT) reference implementation"
requires-python = ">=3.9"
license = "BSD-3-Clause"
dependencies = [
"PyJWT>=2.8.0",
"cryptography>=42.0.0",
]
[project.optional-dependencies]
dev = ["pytest>=7.0", "pytest-cov>=4.0"]
[tool.pytest.ini_options]
testpaths = ["tests"]
pythonpath = ["."]
addopts = "--cov=ect --cov-report=term-missing --cov-fail-under=90 -v"
[tool.setuptools.packages.find]
include = ["ect*"]

View File

@@ -0,0 +1 @@
{"iss":"spiffe://example.com/agent/clinical","aud":"spiffe://example.com/agent/safety","iat":1772064150,"exp":1772064750,"jti":"7f3a8b2c-d1e4-4f56-9a0b-c3d4e5f6a7b8","wid":"a0b1c2d3-e4f5-6789-abcd-ef0123456789","exec_act":"recommend_treatment","pred":[],"ect_ext":{"pol":"clinical_reasoning_policy_v2","pol_decision":"approved"}}

View File

@@ -0,0 +1 @@
# Tests package

View File

@@ -0,0 +1,49 @@
"""Tests for config module."""
import os
import pytest
from ect import default_config, load_config_from_env
from ect.config import ENV_IAT_MAX_AGE_MINUTES, ENV_JTI_REPLAY_CACHE_SIZE
def test_default_config():
c = default_config()
assert c.iat_max_age_sec == 900
assert c.jti_replay_size == 0
def test_load_config_from_env():
os.environ[ENV_IAT_MAX_AGE_MINUTES] = "20"
os.environ[ENV_JTI_REPLAY_CACHE_SIZE] = "500"
try:
c = load_config_from_env()
assert c.iat_max_age_sec == 20 * 60
assert c.jti_replay_size == 500
finally:
os.environ.pop(ENV_IAT_MAX_AGE_MINUTES, None)
os.environ.pop(ENV_JTI_REPLAY_CACHE_SIZE, None)
def test_config_create_options():
c = default_config()
opts = c.create_options("my-kid")
assert opts.key_id == "my-kid"
assert opts.default_expiry_sec == c.default_expiry_sec
def test_config_verify_options():
c = default_config()
opts = c.verify_options()
assert opts.iat_max_age_sec == c.iat_max_age_sec
assert opts.dag is not None
def test_load_config_invalid_int():
os.environ[ENV_IAT_MAX_AGE_MINUTES] = "bad"
try:
c = load_config_from_env()
assert c.iat_max_age_sec == 900
finally:
os.environ.pop(ENV_IAT_MAX_AGE_MINUTES, None)

View File

@@ -0,0 +1,74 @@
"""Tests for ECT creation and roundtrip."""
import json
import os
import time
import pytest
from ect import (
Payload,
create,
generate_key,
CreateOptions,
verify,
VerifyOptions,
)
def test_create_roundtrip():
key = generate_key()
now = int(time.time())
payload = Payload(
iss="spiffe://example.com/agent/a",
aud=["spiffe://example.com/agent/b"],
iat=now,
exp=now + 600,
jti="e4f5a6b7-c8d9-0123-ef01-234567890abc",
exec_act="review_spec",
pred=[],
)
compact = create(payload, key, CreateOptions(key_id="agent-a-key-1"))
assert compact
def resolver(kid):
if kid == "agent-a-key-1":
return key.public_key()
return None
opts = VerifyOptions(
verifier_id="spiffe://example.com/agent/b",
resolve_key=resolver,
now=now,
)
parsed = verify(compact, opts)
assert parsed.payload.jti == payload.jti
assert parsed.payload.exec_act == payload.exec_act
def test_create_with_test_vector():
path = os.path.join(os.path.dirname(__file__), "..", "testdata", "valid_root_ect_payload.json")
if not os.path.exists(path):
pytest.skip(f"test vector not found: {path}")
with open(path) as f:
data = json.load(f)
payload = Payload.from_claims(data)
key = generate_key()
now = int(time.time())
payload.iat = now
payload.exp = now + 600
compact = create(payload, key, CreateOptions(key_id="test-kid"))
assert compact
def resolver(kid):
if kid == "test-kid":
return key.public_key()
return None
opts = VerifyOptions(
verifier_id=payload.aud[0],
resolve_key=resolver,
now=now,
)
verify(compact, opts)

View File

@@ -0,0 +1,94 @@
"""Additional tests for create module."""
import time
import pytest
from ect import Payload, create, generate_key, CreateOptions, default_create_options
def test_default_create_options():
opts = default_create_options()
assert opts.key_id == ""
def test_create_errors():
key = generate_key()
p = Payload(iss="i", aud=["a"], iat=1, exp=2, jti="j", exec_act="e", pred=[])
with pytest.raises(ValueError, match="KeyID|required"):
create(p, key, CreateOptions(key_id=""))
with pytest.raises((ValueError, TypeError, AttributeError)):
create(None, key, CreateOptions(key_id="k"))
def test_create_optional_pol():
key = generate_key()
now = int(time.time())
p = Payload(
iss="iss", aud=["a"], iat=now, exp=now + 3600,
jti="jti-nopol", exec_act="act", pred=[],
)
compact = create(p, key, CreateOptions(key_id="kid"))
assert compact
def test_create_validation_errors():
key = generate_key()
base = dict(iss="i", aud=["a"], iat=1, exp=2, jti="j", exec_act="e", pred=[])
with pytest.raises(ValueError, match="iss"):
create(Payload(**{**base, "iss": ""}), key, CreateOptions(key_id="k"))
with pytest.raises(ValueError, match="aud"):
create(Payload(**{**base, "aud": []}), key, CreateOptions(key_id="k"))
with pytest.raises(ValueError, match="jti"):
create(Payload(**{**base, "jti": ""}), key, CreateOptions(key_id="k"))
with pytest.raises(ValueError, match="exec_act"):
create(Payload(**{**base, "exec_act": ""}), key, CreateOptions(key_id="k"))
def test_create_ext_compensation_reason_requires_required():
key = generate_key()
p = Payload(
iss="i", aud=["a"], iat=1, exp=2, jti="j", exec_act="e", pred=[],
ext={"compensation_reason": "rollback", "compensation_required": False},
)
with pytest.raises(ValueError, match="compensation_required"):
create(p, key, CreateOptions(key_id="k"))
def test_create_zero_expiry_uses_default():
key = generate_key()
p = Payload(iss="i", aud=["a"], iat=0, exp=0, jti="j", exec_act="e", pred=[])
compact = create(p, key, CreateOptions(key_id="k", default_expiry_sec=300))
assert compact
# create() works on a copy; decode the token to verify defaults were applied
import jwt
claims = jwt.decode(compact, options={"verify_signature": False})
assert claims["exp"] > claims["iat"]
def test_create_validate_uuids_rejects_non_uuid_jti():
key = generate_key()
now = int(time.time())
p = Payload(iss="i", aud=["a"], iat=now, exp=now + 3600, jti="not-a-uuid", exec_act="e", pred=[])
with pytest.raises(ValueError, match="jti must be UUID"):
create(p, key, CreateOptions(key_id="k", validate_uuids=True))
def test_create_max_pred_length():
key = generate_key()
now = int(time.time())
p = Payload(iss="i", aud=["a"], iat=now, exp=now + 3600, jti="550e8400-e29b-41d4-a716-446655440000", exec_act="e", pred=["p1", "p2"])
with pytest.raises(ValueError, match="pred exceeds max length"):
create(p, key, CreateOptions(key_id="k", max_pred_length=1))
def test_create_ext_size_rejected():
from ect.validate import EXT_MAX_SIZE
key = generate_key()
now = int(time.time())
p = Payload(
iss="i", aud=["a"], iat=now, exp=now + 3600, jti="550e8400-e29b-41d4-a716-446655440000", exec_act="e", pred=[],
ext={"x": "y" * (EXT_MAX_SIZE - 5)},
)
with pytest.raises(ValueError, match="ext exceeds max size"):
create(p, key, CreateOptions(key_id="k"))

View File

@@ -0,0 +1,111 @@
"""Tests for DAG validation."""
import time
import pytest
from ect import Payload, MemoryLedger, validate_dag, default_dag_config
def test_validate_dag_root():
store = MemoryLedger()
payload = Payload(
iss="",
aud=[],
iat=0,
exp=0,
jti="jti-001",
exec_act="",
pred=[],
wid="wf-1",
)
validate_dag(payload, store, default_dag_config())
def test_validate_dag_duplicate_jti():
store = MemoryLedger()
p = Payload(
iss="x",
aud=["y"],
iat=0,
exp=0,
jti="jti-001",
exec_act="a",
pred=[],
wid="wf-1",
)
store.append("dummy-jws", p)
payload = Payload(
iss="",
aud=[],
iat=0,
exp=0,
jti="jti-001",
exec_act="",
pred=[],
wid="wf-1",
)
with pytest.raises(ValueError, match="task ID.*already exists"):
validate_dag(payload, store, default_dag_config())
def test_validate_dag_pred_exists():
store = MemoryLedger()
now = int(time.time())
p = Payload(
iss="x",
aud=["y"],
iat=now - 60,
exp=now + 600,
jti="jti-001",
exec_act="a",
pred=[],
wid="wf-1",
)
store.append("jws1", p)
payload = Payload(
iss="",
aud=[],
iat=now,
exp=now + 600,
jti="jti-002",
exec_act="b",
pred=["jti-001"],
wid="wf-1",
)
validate_dag(payload, store, default_dag_config())
def test_validate_dag_pred_not_found():
store = MemoryLedger()
now = int(time.time())
payload = Payload(
iss="",
aud=[],
iat=now,
exp=now + 600,
jti="jti-002",
exec_act="",
pred=["jti-missing"],
)
with pytest.raises(ValueError, match="predecessor task not found"):
validate_dag(payload, store, default_dag_config())
def test_validate_dag_pred_policy_rejected_requires_compensation():
store = MemoryLedger()
now = int(time.time())
p = Payload(
iss="x", aud=["y"], iat=now - 60, exp=now + 600,
jti="jti-rej", exec_act="a", pred=[], wid="wf-1",
ext={"pol": "p", "pol_decision": "rejected"},
)
store.append("jws1", p)
payload = Payload(
iss="", aud=[], iat=now, exp=now + 600,
jti="jti-child", exec_act="b", pred=["jti-rej"], wid="wf-1",
)
with pytest.raises(ValueError, match="compensation"):
validate_dag(payload, store, default_dag_config())
payload.ext = {"compensation_required": True}
validate_dag(payload, store, default_dag_config())

View File

@@ -0,0 +1,40 @@
"""Tests for JTI replay cache."""
import time
import pytest
from ect import new_jti_cache
def test_jti_cache_seen_and_add():
cache = new_jti_cache(10, 60)
assert cache.seen("jti-1") is False
cache.add("jti-1")
assert cache.seen("jti-1") is True
assert cache.seen("jti-2") is False
cache.add("jti-2")
assert cache.seen("jti-2") is True
def test_jti_cache_expiry():
cache = new_jti_cache(10, 1) # 1 second TTL
cache.add("jti-1")
assert cache.seen("jti-1") is True
time.sleep(1.1)
assert cache.seen("jti-1") is False
def test_jti_cache_max_size_eviction():
cache = new_jti_cache(2, 60)
cache.add("jti-1")
cache.add("jti-2")
cache.add("jti-3")
assert cache.seen("jti-3") is True
def test_jti_cache_add_when_already_present():
cache = new_jti_cache(2, 60)
cache.add("jti-1")
cache.add("jti-1")
assert cache.seen("jti-1") is True

View File

@@ -0,0 +1,38 @@
"""Additional tests for ledger module."""
import time
import pytest
from ect import Payload, MemoryLedger, ErrTaskIDExists
def test_ledger_append_and_get():
m = MemoryLedger()
p = Payload(iss="i", aud=["a"], iat=1, exp=2, jti="j1", exec_act="act", pred=[])
seq = m.append("jws1", p)
assert seq == 1
assert m.get_by_tid("j1").jti == "j1"
def test_ledger_err_task_id_exists():
m = MemoryLedger()
p = Payload(iss="i", aud=["a"], iat=1, exp=2, jti="j-dup", exec_act="e", pred=[])
m.append("jws1", p)
with pytest.raises(ErrTaskIDExists):
m.append("jws2", p)
def test_ledger_contains_wid():
m = MemoryLedger()
p = Payload(iss="i", aud=["a"], iat=1, exp=2, jti="j1", exec_act="e", pred=[], wid="wf1")
m.append("jws", p)
assert m.contains("j1", "") is True
assert m.contains("j1", "wf1") is True
assert m.contains("j1", "wf2") is False
def test_ledger_append_none():
m = MemoryLedger()
seq = m.append("jws", None)
assert seq == 0

View File

@@ -0,0 +1,64 @@
"""Additional tests for types module."""
import pytest
from ect import Payload
def test_payload_contains_audience():
p = Payload(iss="", aud=["a", "b"], iat=0, exp=0, jti="", exec_act="", pred=[])
assert p.contains_audience("a") is True
assert p.contains_audience("c") is False
def test_payload_compensation_required():
p = Payload(iss="", aud=[], iat=0, exp=0, jti="", exec_act="", pred=[])
assert p.compensation_required() is False
p.ext = {"compensation_required": True}
assert p.compensation_required() is True
def test_payload_has_policy_claims():
p = Payload(iss="", aud=[], iat=0, exp=0, jti="", exec_act="", pred=[],
ext={"pol": "p", "pol_decision": "approved"})
assert p.has_policy_claims() is True
p.ext = {"pol_decision": "approved"}
assert p.has_policy_claims() is False
p.ext = None
assert p.has_policy_claims() is False
def test_payload_pol_decision():
p = Payload(iss="", aud=[], iat=0, exp=0, jti="", exec_act="", pred=[],
ext={"pol_decision": "rejected"})
assert p.pol_decision() == "rejected"
p.ext = None
assert p.pol_decision() == ""
def test_payload_to_claims_optional():
p = Payload(iss="i", aud=["a"], iat=1, exp=2, jti="j", exec_act="e", pred=[], wid="wf")
claims = p.to_claims()
assert claims["wid"] == "wf"
assert "ect_ext" not in claims or not claims.get("ect_ext")
def test_payload_from_claims_aud_string():
claims = {"iss": "i", "aud": "single", "iat": 1, "exp": 2, "jti": "j", "exec_act": "e", "pred": []}
p = Payload.from_claims(claims)
assert p.aud == ["single"]
def test_payload_to_claims_all_optional():
p = Payload(
iss="i", aud=["a"], iat=1, exp=2, jti="j", exec_act="e", pred=[],
wid="w", inp_hash="h", out_hash="o", inp_classification="c",
ext={"pol": "p", "pol_decision": "approved"},
)
claims = p.to_claims()
assert claims["wid"] == "w"
assert claims["inp_hash"] == "h"
assert claims["out_hash"] == "o"
assert claims["inp_classification"] == "c"
assert claims["ect_ext"]["pol"] == "p"
assert claims["ect_ext"]["pol_decision"] == "approved"

View File

@@ -0,0 +1,64 @@
"""Tests for validate module."""
import json
import pytest
from ect.validate import (
EXT_MAX_DEPTH,
EXT_MAX_SIZE,
validate_ext,
validate_hash_format,
valid_uuid,
)
def test_valid_uuid():
assert valid_uuid("550e8400-e29b-41d4-a716-446655440000") is True
assert valid_uuid("00000000-0000-0000-0000-000000000000") is True
assert valid_uuid("") is False
assert valid_uuid("not-a-uuid") is False
assert valid_uuid("550e8400e29b41d4a716446655440000") is False # no dashes
def test_validate_ext_none():
validate_ext(None)
validate_ext({})
def test_validate_ext_size():
# Serialized JSON must exceed EXT_MAX_SIZE (4096) bytes
big = {"x": "y" * (EXT_MAX_SIZE - 2)} # "{\"x\":\"...\"}" + payload
raw = json.dumps(big)
assert len(raw.encode("utf-8")) > EXT_MAX_SIZE
with pytest.raises(ValueError, match="max size"):
validate_ext(big)
def test_validate_ext_depth():
deep = {"a": 1}
for _ in range(EXT_MAX_DEPTH):
deep = {"n": deep}
with pytest.raises(ValueError, match="depth"):
validate_ext(deep)
def test_validate_hash_format_empty():
validate_hash_format("")
def test_validate_hash_format_ok():
# Plain base64url per RFC 9449 / ECT spec (no algorithm prefix)
validate_hash_format("YQ")
validate_hash_format("dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk")
validate_hash_format("abc123-_XYZ")
def test_validate_hash_format_bad():
# Colon is not valid base64url — rejects old prefixed format
with pytest.raises(ValueError, match="plain base64url"):
validate_hash_format("sha-256:YQ")
with pytest.raises(ValueError, match="plain base64url"):
validate_hash_format("not valid!!")
# Null byte in payload
with pytest.raises(ValueError, match="plain base64url"):
validate_hash_format("YQ\x00")

View File

@@ -0,0 +1,194 @@
"""Tests for verify module."""
import time
import pytest
from ect import (
Payload,
create,
generate_key,
CreateOptions,
parse,
verify,
VerifyOptions,
default_verify_options,
)
def test_parse():
key = generate_key()
now = int(time.time())
p = Payload(
iss="iss", aud=["a"], iat=now, exp=now + 3600,
jti="jti-parse", exec_act="act", pred=[],
)
compact = create(p, key, CreateOptions(key_id="kid"))
parsed = parse(compact)
assert parsed.payload.jti == "jti-parse"
assert parsed.raw == compact
def test_default_verify_options():
opts = default_verify_options()
assert opts.dag is not None
assert opts.iat_max_age_sec == 900
def test_verify_expired():
key = generate_key()
now = int(time.time())
p = Payload(
iss="iss", aud=["v"], iat=now - 3600, exp=now - 60,
jti="jti-exp", exec_act="act", pred=[],
)
compact = create(p, key, CreateOptions(key_id="kid"))
resolver = lambda kid: key.public_key() if kid == "kid" else None
with pytest.raises(ValueError, match="expired"):
verify(compact, VerifyOptions(verifier_id="v", resolve_key=resolver, now=now))
def test_verify_replay():
key = generate_key()
now = int(time.time())
p = Payload(
iss="iss", aud=["v"], iat=now, exp=now + 3600,
jti="jti-replay", exec_act="act", pred=[],
)
compact = create(p, key, CreateOptions(key_id="kid"))
resolver = lambda kid: key.public_key() if kid == "kid" else None
with pytest.raises(ValueError, match="replay"):
verify(compact, VerifyOptions(
verifier_id="v", resolve_key=resolver, now=now,
jti_seen=lambda j: j == "jti-replay",
))
def test_verify_invalid_typ():
import jwt as jwt_lib
with pytest.raises((ValueError, jwt_lib.exceptions.DecodeError)):
verify("not-a-jws", VerifyOptions())
def test_verify_audience_mismatch():
key = generate_key()
now = int(time.time())
p = Payload(
iss="iss", aud=["other"], iat=now, exp=now + 3600,
jti="jti-a", exec_act="act", pred=[],
)
compact = create(p, key, CreateOptions(key_id="kid"))
resolver = lambda kid: key.public_key() if kid == "kid" else None
with pytest.raises(ValueError, match="audience"):
verify(compact, VerifyOptions(verifier_id="verifier", resolve_key=resolver, now=now))
def test_verify_wit_subject_mismatch():
key = generate_key()
now = int(time.time())
p = Payload(
iss="wrong-iss", aud=["v"], iat=now, exp=now + 3600,
jti="jti-w", exec_act="act", pred=[],
)
compact = create(p, key, CreateOptions(key_id="kid"))
resolver = lambda kid: key.public_key() if kid == "kid" else None
with pytest.raises(ValueError, match="WIT subject"):
verify(compact, VerifyOptions(
verifier_id="v", resolve_key=resolver, now=now, wit_subject="correct-iss",
))
def test_verify_iat_too_old():
key = generate_key()
now = int(time.time())
p = Payload(
iss="iss", aud=["v"], iat=now - 2000, exp=now + 3600,
jti="jti-old", exec_act="act", pred=[],
)
compact = create(p, key, CreateOptions(key_id="kid"))
resolver = lambda kid: key.public_key() if kid == "kid" else None
with pytest.raises(ValueError, match="iat"):
verify(compact, VerifyOptions(
verifier_id="v", resolve_key=resolver, now=now, iat_max_age_sec=900,
))
def test_verify_unknown_key():
key = generate_key()
now = int(time.time())
p = Payload(
iss="iss", aud=["v"], iat=now, exp=now + 3600,
jti="jti-k", exec_act="act", pred=[],
)
compact = create(p, key, CreateOptions(key_id="kid"))
resolver = lambda kid: None # unknown key
with pytest.raises(ValueError, match="unknown key"):
verify(compact, VerifyOptions(verifier_id="v", resolve_key=resolver, now=now))
def test_verify_resolve_key_required():
key = generate_key()
now = int(time.time())
p = Payload(
iss="iss", aud=["v"], iat=now, exp=now + 3600,
jti="jti-r", exec_act="act", pred=[],
)
compact = create(p, key, CreateOptions(key_id="kid"))
with pytest.raises(ValueError, match="ResolveKey"):
verify(compact, VerifyOptions(verifier_id="v", resolve_key=None))
def test_verify_with_dag():
from ect import MemoryLedger
key = generate_key()
ledger = MemoryLedger()
now = int(time.time())
root = Payload(
iss="iss", aud=["v"], iat=now, exp=now + 3600,
jti="jti-root", exec_act="act", pred=[],
)
compact_root = create(root, key, CreateOptions(key_id="kid"))
resolver = lambda kid: key.public_key() if kid == "kid" else None
opts = VerifyOptions(verifier_id="v", resolve_key=resolver, store=ledger, now=now)
parsed = verify(compact_root, opts)
ledger.append(compact_root, parsed.payload)
child = Payload(
iss="iss", aud=["v"], iat=now + 1, exp=now + 3600,
jti="jti-child", exec_act="act2", pred=["jti-root"],
)
compact_child = create(child, key, CreateOptions(key_id="kid"))
parsed2 = verify(compact_child, opts)
assert parsed2.payload.jti == "jti-child"
def test_on_verify_attempt_callback():
"""Observability: on_verify_attempt is called with jti and error (or None)."""
key = generate_key()
now = int(time.time())
p = Payload(iss="i", aud=["v"], iat=now, exp=now + 3600, jti="jti-obs", exec_act="a", pred=[])
compact = create(p, key, CreateOptions(key_id="kid"))
resolver = lambda k: key.public_key() if k == "kid" else None
seen = []
def hook(jti, err):
seen.append((jti, err))
opts = VerifyOptions(verifier_id="v", resolve_key=resolver, on_verify_attempt=hook)
result = verify(compact, opts)
assert result.payload.jti == "jti-obs"
assert len(seen) == 1
assert seen[0][0] == "jti-obs"
assert seen[0][1] is None
def test_on_verify_attempt_called_on_failure():
key = generate_key()
now = int(time.time())
p = Payload(iss="i", aud=["v"], iat=now, exp=now - 1, jti="jti-fail", exec_act="a", pred=[])
compact = create(p, key, CreateOptions(key_id="kid"))
resolver = lambda k: key.public_key() if k == "kid" else None
seen = []
opts = VerifyOptions(verifier_id="v", resolve_key=resolver, now=now, on_verify_attempt=lambda jti, err: seen.append((jti, err)))
with pytest.raises(ValueError, match="expired"):
verify(compact, opts)
assert len(seen) == 1
assert seen[0][0] == "jti-fail"
assert seen[0][1] is not None

View File

@@ -0,0 +1,87 @@
# ietf-act-ect-interop
Cross-spec interop tests between the `ietf-act` (Agent Context Token,
draft-nennemann-act-01) and `ietf-ect` (Execution Context Tokens,
draft-nennemann-wimse-ect-01) Python reference implementations.
The purpose of this package is not to ship runtime code — it is to
**document empirically** which shared claims, algorithms, and
structures round-trip cleanly between the two refimpls, so implementers
building bridges or shared tooling know what they can rely on.
## Compatibility matrix
Observed as of the commit that produced these 32 passing tests:
| Layer | Direction | Status | Evidence |
|---|---|---|---|
| ES256 raw JWS signature | ACT ↔ ECT | Compatible | `test_es256_primitive_is_wire_compatible_at_raw_sig_level` — ACT's `crypto.verify` accepts an ECT-signed compact's r\|\|s bytes |
| EdDSA signature | ACT → ECT | Incompatible | ECT verify refuses EdDSA at the alg gate (ES256-only) |
| `typ` header | ACT ↔ ECT | Strictly separated (by design) | `act+jwt` vs `exec+jwt`; each verifier rejects the other |
| `jti` format | Shared | Compatible | Same UUID string accepted by both |
| `wid` | Shared | Compatible | Preserved on both sides |
| `iat` / `exp` (NumericDate) | Shared | Compatible | Integer seconds on both |
| `aud` (string form) | ACT → ECT | Compatible (lossy round-trip) | ACT stores `str \| list[str]`; ECT coerces to `list[str]` via `_audience_deserialize` |
| `exec_act` | Shared | Compatible | ACT ABNF-legal values pass through ECT unchanged |
| `pred` array | Shared | Compatible | Same topology, same wire shape |
| `inp_hash` / `out_hash` | Shared | Compatible **now** | Both specs use plain base64url (ECT was aligned — the prefixed `sha-256:` form is now rejected by ECT) |
| `cap``exec_act` coupling | ACT-only | Divergent | ACT verifier raises `ACTCapabilityError`; ECT does not enforce |
| `status` claim | ACT-only | Divergent | Required in ACT Phase 2; absent in ECT |
| `sub`, `task`, `iss` required | ACT-only | Divergent | ECT `Payload.from_claims` silently drops them |
| `ect_ext`, `inp_classification` | ECT-only | Divergent | ACT `ACTRecord.from_claims` silently drops them |
| DAG cross-resolution | Separate stores | **Not supported** | `ECTStore` is keyed on `Payload`; `ACTLedger` on `ACTRecord`; no refimpl bridges the two |
### Hazard flag (found while writing these tests)
`ACTLedger.append()` does **not** perform a runtime `isinstance` check
— it relies on duck typing. An ECT `Payload` object has a `.jti`
attribute, and will therefore be **silently accepted** by the ACT
ledger. This is an implementation hazard, not a spec-level guarantee:
production bridges must enforce explicit type checks outside both
refimpls. Pinned in `test_act_ledger_does_not_type_check_ect_payload`.
## Do / Do not
**Do**
- Reuse ES256 (P-256) key material across ACT and ECT deployments — the signing primitive is byte-identical.
- Treat `jti`, `wid`, `pred`, `exec_act` as semantically aligned when building cross-type audit views.
- Rely on `inp_hash` / `out_hash` being byte-portable between refimpls today.
**Do not**
- Feed an ACT compact token to an ECT verifier or vice versa. The `typ` gates are deliberate and permanent.
- Forge one token type as the other. This is a first-class anti-goal.
- Expect ECT to enforce ACT's `cap` / `exec_act` coupling. Authorization stays in ACT.
- Use EdDSA-signed ACT tokens in an ECT-only deployment. ECT is ES256-only.
- Cross-insert objects between `ACTLedger` and `ECTStore` / `MemoryLedger`. Python's duck typing will let some of these through; that's a bug waiting to happen.
## Install and run
From the workspace root:
```bash
pip install -e packages/act
pip install -e packages/ect
pip install -e packages/interop
cd packages/interop
python -m pytest tests/ -v
```
Expected: **32 passed**.
## Test file map
| File | Focus |
|---|---|
| `tests/test_shared_claims.py` | `pred`, `jti`/`wid`, hash format, `exec_act` string shape |
| `tests/test_algorithm_matrix.py` | ES256 ↔ EdDSA × verifier compatibility |
| `tests/test_dag_structure.py` | `pred`-array topology equivalence; store separation |
| `tests/test_divergence.py` | claims each parser ignores; `typ` separation; `cap` coupling; `status` requirement |
| `tests/test_anti_goals.py` | cross-type forgery rejection; cross-type store hazard |
## Open questions for spec editors
- Should ECT optionally accept Ed25519? Today it is ES256-only.
- Should the refimpls enforce type-level rejection of cross-type objects passed into their stores/ledgers, or is that outside scope?

View File

@@ -0,0 +1,22 @@
[build-system]
requires = ["setuptools>=68.0"]
build-backend = "setuptools.build_meta"
[project]
name = "ietf-act-ect-interop"
version = "0.1.0"
description = "Cross-spec interop tests between ietf-act and ietf-ect reference implementations"
requires-python = ">=3.11"
license = "BSD-3-Clause"
dependencies = [
"ietf-act",
"ietf-ect",
"pytest>=8.0",
]
[tool.setuptools.packages.find]
where = ["."]
include = ["tests*"]
[tool.pytest.ini_options]
testpaths = ["tests"]

Some files were not shown because too many files have changed in this diff Show More