# Holistic Agent Ecosystem: Analysis and Draft Outlines

*Derived from IETF draft analyzer gaps (12), overlap matrix (260 drafts), and 1,262 extracted ideas. Goal: a unified agent ecosystem with DAG orchestration and HITL built in, agnostic and extensible, applicable to both fast/relaxed (e.g. K8s) and regulated/proof-heavy environments.*

---

## 1. Vision Summary

| Pillar | Meaning | Gap / Overlap Evidence |
|--------|---------|------------------------|
| **DAG** | Task/workflow as directed acyclic graph: dependencies, execution order, checkpoints, rollback along the graph. | **Gap 3** (Error Recovery/Rollback), **Gap 1** (Resource Management — scheduling/quotas). Ideas: "Task-Oriented Multi-Agent Recovery", "Working Memory", "Execution Context Token (ECT)". No single draft defines *agent task DAG* as a first-class construct. |
| **HITL** | Human-in-the-loop as a first-class primitive: approval gates, escalation, emergency override, explainability. | **Gap 7** (Human Override); only 22 human-agent drafts vs 60 autonomous netops. Ideas: CHEQ (confirmation), "Human Oversight Requirements", "Level 4 Autonomous Network Architecture". |
| **Agnostic + extensible** | Protocol- and transport-agnostic; works over any A2A protocol; extensible via profiles. | **Gap 4** (Cross-Protocol Translation) — 92 A2A drafts, no universal translation/negotiation. Overlap matrix: high within-category similarity (0.75+) but no interoperability layer. |
| **Dual regime** | Same model works in "fast/relaxed" (K8s, dev) and "regulated" (proofs, attestation, audit). | **Gaps 2, 8, 9, 12**: Behavior verification, cross-domain security, dynamic trust, data provenance. Ideas: STAMP, DAAP, verifiable conversations, ECT — all additive assurance. |

---

## 2. How This Fits With WIMSE and ECT (Differentiation)

**SPIFFE** (CNCF) defines the *identifier* for a workload (`spiffe://trust-domain/path`) and its ecosystem (SVIDs, etc.). So **"who" at the identifier level is already SPIFFE** (or similar URI schemes).

**WIMSE** (Workload Identity in *Multi System Environments*; [draft-ietf-wimse-arch](https://datatracker.ietf.org/doc/draft-ietf-wimse-arch/)) is the **IETF architecture** for how workload identity and **security context** are conveyed and used across systems. It is **not only "who"**. It covers:

- **Workload identity**: identifier (WIMSE uses a URI; SPIFFE ID is one conforming example), credentials (WIT, X.509), trust domains.
- **Security context**: "information needed for a workload to perform its function" — authorization, accounting, auditing, user info, what processing has already happened, propagation along the call chain.
- **Identity proxy**: inspect, replace, or augment identity and context (e.g. at gateways).
- **Use cases**: bootstrapping, service auth, **authorization**, **audit trails**, **security context establishment and propagation**, delegation, cross-boundary, **AI/ML intermediaries**.

So: **SPIFFE = identifier (who). WIMSE = architecture for conveying identity + security context in protocols (who + context + propagation + authz + audit).**

**ECT** (Execution Context Tokens), in [draft-nennemann-wimse-ect](https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/), is a **JWT-based extension** that records *what* each task did: each ECT is a signed record of one task, linked to predecessors via a DAG (`par` / `jti`). ECT reuses the WIMSE signing model (same key as WIT) and adds: token format, HTTP transport (Execution-Context header), DAG validation, audit-ledger interface. So **ECT = execution evidence (what happened)** built on WIMSE identity/signing.
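
To make the linkage concrete, the following is a minimal sketch of how decoded ECT payloads might reference predecessors through `par` and `jti`, with a basic DAG check. Only the claim names `jti`, `par`, `iss`, and `exec_act` come from this report; the payload values, the assumption that `par` is a list of parent `jti` values, and the `validate_dag` helper are illustrative and are not taken from draft-nennemann-wimse-ect. Signature verification is omitted.

```python
from typing import Dict, List

# Hypothetical decoded ECT payloads (signatures already checked elsewhere).
ects: List[dict] = [
    {"jti": "t1", "iss": "spiffe://example.org/agent/planner", "exec_act": "plan", "par": []},
    {"jti": "t2", "iss": "spiffe://example.org/agent/worker", "exec_act": "fetch", "par": ["t1"]},
    {"jti": "t3", "iss": "spiffe://example.org/agent/worker", "exec_act": "summarize", "par": ["t1", "t2"]},
]


def validate_dag(tokens: List[dict]) -> List[str]:
    """Return a list of problems; an empty list means the linkage forms a DAG."""
    by_id: Dict[str, dict] = {}
    problems: List[str] = []
    for tok in tokens:
        if tok["jti"] in by_id:
            problems.append(f"duplicate jti: {tok['jti']}")
        by_id[tok["jti"]] = tok
    for tok in tokens:
        for parent in tok.get("par", []):
            if parent not in by_id:
                problems.append(f"{tok['jti']}: unknown parent {parent}")

    # Depth-first search with node colouring to detect cycles.
    state: Dict[str, int] = {}  # 1 = being visited, 2 = finished

    def visit(jti: str) -> None:
        if state.get(jti) == 2:
            return
        if state.get(jti) == 1:
            problems.append(f"cycle through {jti}")
            return
        state[jti] = 1
        for parent in by_id[jti].get("par", []):
            if parent in by_id:
                visit(parent)
        state[jti] = 2

    for jti in by_id:
        visit(jti)
    return problems


print(validate_dag(ects))
```

A real validator would also verify each token's signature against the issuer's key before trusting any of the claims.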

### Fit in (recommended)

Our ecosystem drafts **do not compete with SPIFFE, WIMSE, or ECT**. They **build on** them:

| Layer | SPIFFE / WIMSE / ECT | Our drafts (AEM, AERR, HEOP, DATS, CPAT, etc.) |
|-------|----------------------|-----------------------------------------------|
| **Identifier** | SPIFFE (or URI): "who" | We use existing identifiers (e.g. `iss` / SPIFFE ID in ECT). |
| **Identity + context** | WIMSE: credentials, security context, propagation, authz, audit | We assume WIMSE (or equivalent) for auth and context; we do not redefine it. |
| **Evidence** | ECT: token format, DAG linkage, signing, audit | We **use ECT as the carrier**: checkpoints, errors, overrides, trust assertions, and translation hops are **new ECT node types** (new `exec_act` values and optional claims). |
| **Semantics** | ECT: “a task happened, here are parents” | We define **orchestration and operations**: dependencies, checkpoints, rollback protocol, HITL points, resource hints, assurance profiles, protocol binding. |

Concretely:

- **AERR** (Error Recovery/Rollback): Checkpoints, errors, and rollback results are **ECTs** with specific `exec_act` and extension claims. Rollback walks the **ECT DAG**; no second DAG format.
- **HEOP** (Human Emergency Override): Override and acknowledgment are **ECTs** that link into the same ECT DAG for audit.
- **DATS** (Dynamic Trust): Trust events are derived from **ECT outcomes**; trust assertions are **ECTs**.
- **CPAT** (Cross-Protocol Translation): Each translation hop produces an **ECT**, so the cross-protocol path is one continuous ECT DAG.
- **Agent DAG HITL** ([draft-nennemann-agent-dag-hitl-safety](https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/)): Policy for when HITL is required; decisions and overrides still record as ECTs.

So the **execution model** (DAG of tasks, checkpoints, rollback, HITL) is **implemented using ECT** as the token and DAG format. We add **semantics and protocols** on top, not a new token or DAG structure.
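
As an illustration of "rollback walks the ECT DAG", here is a minimal sketch that follows `par` links backwards from a failed node until it reaches a checkpoint on every path. The `exec_act` values (`"task"`, `"checkpoint"`, `"error"`), the sample run, and the `rollback_plan` helper are hypothetical; AERR would define the actual node types and procedure.

```python
from collections import deque
from typing import Dict, Set, Tuple


def rollback_plan(ects: Dict[str, dict], failed_jti: str) -> Tuple[Set[str], Set[str]]:
    """Walk `par` links backwards from a failed node.

    Returns (affected nodes, checkpoints to restore). The walk stops at the
    first checkpoint found on each path, so earlier work is left untouched.
    """
    affected: Set[str] = set()
    checkpoints: Set[str] = set()
    queue = deque([failed_jti])
    while queue:
        jti = queue.popleft()
        if jti in affected or jti in checkpoints:
            continue
        if ects[jti]["exec_act"] == "checkpoint":
            checkpoints.add(jti)
            continue
        affected.add(jti)
        queue.extend(ects[jti].get("par", []))
    return affected, checkpoints


# Hypothetical run: task -> checkpoint -> task -> task -> error
run = {
    "a": {"exec_act": "task", "par": []},
    "cp1": {"exec_act": "checkpoint", "par": ["a"]},
    "b": {"exec_act": "task", "par": ["cp1"]},
    "c": {"exec_act": "task", "par": ["b"]},
    "err": {"exec_act": "error", "par": ["c"]},
}

affected, restore = rollback_plan(run, "err")
print(sorted(affected), sorted(restore))
```

In this sketch node `a` is never visited because the walk stops at `cp1`; the set of affected nodes is one possible reading of the "rollback set" term used in the ATD outline.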

### Differentiate (what we add)

| Concern | WIMSE/ECT | Our ecosystem |
|---------|-----------|----------------|
| **DAG** | ECT defines *how* nodes link (`par`, `jti`, validation). | We define *what* a node means (task, checkpoint, error, override, trust, translation), *when* to create them, and *how* to act on them (rollback, circuit breaker, HITL flow). |
| **Orchestration** | Out of scope for ECT. | Execution order, resource hints, scheduling, lifecycle; can be described in a **declarative workflow** (e.g. JSON) that is *realized* as ECTs at runtime. |
| **Recovery** | Not in ECT. | AERR: checkpoint placement, error propagation, rollback protocol, circuit breaker. |
| **HITL** | Not in ECT. | HEOP + Agent DAG HITL: approval gates, override, escalation; all recorded as ECTs. |
| **Trust / assurance** | ECT provides signed, linked evidence. | DATS, APAE: how to *derive* trust from ECT outcomes; assurance levels and profiles. |
| **Interop** | Single token format (ECT over HTTP). | CPAT, AEPB: translation between *protocols*; each hop still emits ECTs so the cross-protocol run is one ECT DAG. |

In other words: **SPIFFE = who (identifier). WIMSE = identity + security context + propagation + authz + audit. ECT = execution evidence (DAG of signed records). Our work = orchestration, recovery, HITL, trust, and interop that consume and produce that evidence.**

### Implications for the draft family

- **Draft A (AEM)**: Should state that ECT is the **standard representation of the execution model**: “The ecosystem uses Execution Context Tokens (ECT) [I-D.nennemann-wimse-ect] as the standard representation of task execution and DAG linkage; extensions (e.g. AERR, HEOP) define additional ECT node types and procedures.”
- **Draft B (ATD)**: Either (1) **ATD = abstract model** (nodes, edges, checkpoints, rollback set) with “ECT is one binding” and optional JSON/CBOR for *declarative* workflow definition, or (2) **ATD = semantics of ECT usage** (when to emit which ECT types, execution semantics, resource hints) without a second wire format. Prefer (2) to avoid overlap with ECT and keep one DAG format.
- **Drafts C, D, E (HITL, AEPB, APAE)**: Same pattern: define procedures and semantics; **record all significant events as ECTs** so the full run is one auditable ECT DAG. Reference SPIFFE/WIMSE for identity and context, and ECT for format and validation.

### One-sentence positioning

**“SPIFFE gives the identifier; WIMSE gives the architecture for identity and security context; ECT gives the DAG evidence format. Our drafts specify how to use that format for orchestration, recovery, human oversight, trust, and cross-protocol interop, so the same stack works from relaxed to fully regulated.”**

---

## 3. How the Data Supports This

### From gap analysis
- **Critical**: Resource management (scheduling/quotas for DAG nodes), behavior verification (runtime proofs in regulated mode), error recovery/rollback (DAG-based undo).
- **High**: Cross-protocol translation (agnostic layer), human override (HITL), lifecycle (versioning/retirement of workflows), multi-agent consensus (coordination in DAG), cross-domain security and dynamic trust (regulated regime).
- **Medium**: Monitoring, explainability (HITL), provenance (regulated regime).

### From overlap
- **A2A protocols** (92 drafts, average pairwise similarity 0.76): Heavy duplication; a thin *ecosystem layer* on top of "any A2A" would reduce friction.
- **Agent discovery/registration** (57) and **identity/auth** (98): Discovery and identity are shared concerns; the ecosystem draft can reference ANS, ADL, OAuth RAR, etc., without mandating one.
- **Human-agent** (22) is underweight; HITL should be a first-class extension point in the ecosystem document.

### From ideas (sample)
- **DAG/context**: "Working Memory", "Execution Context Token", "Task-Oriented Multi-Agent Recovery Framework", "State Consistency Management", "Checkpoint".
- **HITL**: "CHEQ protocol", "Human Oversight Requirements", "Human-in-the-Loop", "Emergency Override".
- **Agnostic**: "Automated Protocol Adaptation", "Semantic Routing", "Protocol Adapter Layer", "Cross-Protocol Translation".
- **Dual regime**: "Cryptographic Proof-Based Autonomy", "Verifiable Agent Behavior Attestation", "Trust Scoring", "Behavioral Monitoring", "Data Provenance".

---

## 4. Proposed Draft Family

Five drafts together define the holistic ecosystem. Each fills gaps and references existing work (see landscape) to avoid duplication.

---

### Draft A: **Agent Ecosystem Model (AEM) — Architecture and Terminology**

**Role**: Informational. Single source of concepts (DAG, HITL, assurance levels, protocol agnosticism) so other drafts and WGs share vocabulary.

**Gaps addressed**: Cross-cutting; reduces overlap by establishing a common model.

**Outline**:

1. **Introduction** — Need for a unified model across 260+ drafts; scope (orchestration, control, assurance); non-goals (no new wire protocol).
2. **Terminology** — Agent, task, workflow, DAG, node, checkpoint, HITL point, assurance level, ecosystem layer, protocol binding.
3. **Architectural Model** — Ecosystem layer above protocol bindings (A2A, MCP, etc.); DAG as the execution model; HITL as optional nodes; assurance as an orthogonal axis.
4. **Assurance Levels** — Level 0 (best-effort, no proofs), Level 1 (audit trail), Level 2 (attestation/verification), Level 3 (full provenance and compliance). Same DAG/HITL model at every level.
5. **Protocol Agnosticism** — How the ecosystem layer binds to existing protocols (reference ANS, ADL, MCP, STAMP, DAAP, etc.); extension points.
6. **Security Considerations** — Trust boundaries; what each assurance level guarantees.
7. **IANA Considerations** — None or registry for assurance level identifiers.

**Target**: Individual or NMOP; **Status**: Informational.
**Related drafts**: draft-rosenberg-aiproto-framework, draft-zyyhl-agent-networks-framework, draft-nennemann-wimse-ect, draft-aylward-daap-v2.
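
The four assurance levels could be modelled as an ordered enumeration so that policies can compare them. This is a minimal sketch: the member names and the `hitl_required` helper are hypothetical, and the rule it encodes is the one stated in the Draft C outline (HITL optional at Level 0, required at Level 2+ for critical paths). Treating Level 1 as "not required" is an assumption, since the outlines do not specify it.

```python
from enum import IntEnum


class AssuranceLevel(IntEnum):
    BEST_EFFORT = 0      # no proofs
    AUDIT_TRAIL = 1      # audit trail
    ATTESTATION = 2      # attestation/verification
    FULL_PROVENANCE = 3  # full provenance and compliance


def hitl_required(level: AssuranceLevel, critical_path: bool) -> bool:
    """HITL is mandatory only on critical paths at Level 2 and above.

    Assumption: Level 1 behaves like Level 0 here (HITL optional).
    """
    return critical_path and level >= AssuranceLevel.ATTESTATION


print(hitl_required(AssuranceLevel.BEST_EFFORT, True))
print(hitl_required(AssuranceLevel.FULL_PROVENANCE, True))
```

Using an ordered type keeps the "same DAG/HITL model at every level" property: the workflow does not change, only the checks applied to it.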

---

### Draft B: **Agent Task DAG (ATD) — Execution Model and Checkpoints**

**Role**: Define the *semantics* of the DAG execution model (when to emit which ECT types, execution order, checkpoints, rollback) **using ECT as the token and DAG format**. Does not define a second wire format; avoids overlap with [draft-nennemann-wimse-ect](https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/).

**Gaps addressed**: **Gap 1** (Resource Management — scheduling/quotas per node), **Gap 3** (Error Recovery and Rollback).

**Outline**:

1. **Introduction** — Why DAG (ordering, dependencies, safe rollback); relationship to AEM; dependency on ECT for token format and DAG structure.
2. **Terminology** — Node, edge, root, leaf, checkpoint, rollback set, blast radius (aligned with ECT and AERR).
3. **Execution Semantics** — Topological order; node states (pending, running, done, failed, rolled back); when agents MUST/SHOULD emit ECTs; checkpoint placement (per ECT or per subgraph, per AERR).
4. **Resource Hints and Quotas** — Optional resource claims (e.g. in ECT `ext` or workflow descriptor); integration with scheduling (K8s, etc.); fair allocation when agents compete (gap 1).
5. **Checkpoint and Rollback** — Reference AERR for checkpoint/error/rollback ECT types; rollback protocol: walk ECT DAG backwards; circuit-breaker to contain cascading failure.
6. **Optional: Declarative Workflow** — JSON/CBOR for *pre-run* workflow definition (nodes, dependencies, HITL slots, resource hints); at runtime this is realized as ECTs. Enables tooling and portability without replacing ECT.
7. **Security Considerations** — Integrity of ECT DAG and checkpoints; who may trigger rollback.
8. **IANA Considerations** — None if ECT registry is used for node/action types; or registry for workflow descriptor media type.

**Target**: NMOP or individual; **Status**: Standards-track or Experimental.
**Related drafts**: draft-nennemann-wimse-ect, draft-aerr-agent-error-recovery-rollback, draft-yue-anima-agent-recovery-networks, draft-li-dmsc-macp.
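
The optional declarative workflow and the topological-order requirement can be sketched together. The descriptor below is hypothetical (the field names `depends_on`, `hitl`, and `resources` are illustrative, not a proposed schema); the ordering uses `graphlib.TopologicalSorter` from the Python standard library (3.9+). The node state names follow the outline above.

```python
import json
from graphlib import CycleError, TopologicalSorter

# Hypothetical pre-run workflow descriptor; field names are illustrative only.
descriptor = json.loads("""
{
  "nodes": {
    "fetch":   {"depends_on": []},
    "analyze": {"depends_on": ["fetch"]},
    "approve": {"depends_on": ["analyze"], "hitl": "approval"},
    "deploy":  {"depends_on": ["approve"], "resources": {"cpu": "500m"}}
  }
}
""")


def execution_order(desc: dict) -> list:
    """Return one valid topological order; raises CycleError if not a DAG."""
    graph = {name: set(spec["depends_on"]) for name, spec in desc["nodes"].items()}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(descriptor)

# Every node starts as "pending"; a runtime would move it through
# running -> done | failed -> rolled back, emitting an ECT per transition.
states = {name: "pending" for name in order}
print(order)
```

At runtime each node in `order` would be realized as one or more ECTs, so the descriptor stays a planning artifact and ECT remains the only DAG format on the wire.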

---

### Draft C: **Human-in-the-Loop (HITL) Primitives for Agent Ecosystems**

**Role**: Standardize HITL as first-class: approval gates, escalation, emergency override, and explainability hooks.

**Gaps addressed**: **Gap 7** (Human Override and Intervention), **Gap 11** (Agent Explainability).

**Outline**:

1. **Introduction** — Need for HITL in autonomous systems; design goals (minimal friction in relaxed mode, mandatory in regulated).
2. **Terminology** — HITL point, approval gate, escalation, override, explainability token.
3. **HITL Point Model** — Placement in DAG (before/after node, or as a node); types: approval required, notification only, override-only (emergency).
4. **Approval and Escalation** — Request/response format (reference OAuth, CHEQ); timeouts and escalation paths; revocation of approval.
5. **Emergency Override** — Signal to halt or rollback; scope (single node, subgraph, full DAG); authentication and audit (reference DAAP, STAMP).
6. **Explainability** — Optional explainability token (summary, evidence, link to verifiable conversation); required at higher assurance levels.
7. **Binding to AEM** — How HITL integrates with Draft A assurance levels; optional in Level 0, required in Level 2+ for critical paths.
8. **Security Considerations** — Who can approve/override; replay and revocation.
9. **IANA Considerations** — None or registry for HITL point types.

**Target**: NMOP or OPS; **Status**: Standards-track or Experimental.
**Related drafts**: draft-rosenberg-aiproto-cheq, draft-rosenberg-cheq, draft-cui-nmrg-llm-nm, draft-irtf-nmrg-llm-nm, draft-aap-oauth-profile, draft-birkholz-verifiable-agent-conversations.
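
The approval, timeout, and escalation behaviour in the outline can be pictured as a small state machine. This is a minimal sketch under stated assumptions: the `ApprovalGate` class, its fields, the role names, and the returned status strings are all hypothetical, and it is not modelled on CHEQ or any OAuth profile.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ApprovalGate:
    node: str                    # DAG node the gate protects
    timeout_s: float             # how long each approver has to respond
    escalation_path: List[str]   # ordered approver roles (hypothetical)
    requested_at: float = 0.0
    level: int = 0
    decision: Optional[str] = None  # "approved" or "denied" once decided

    def current_approver(self) -> str:
        return self.escalation_path[self.level]

    def tick(self, now: float) -> str:
        """Return the decision if one exists, else 'waiting', 'escalated', or 'expired'."""
        if self.decision is not None:
            return self.decision
        if now - self.requested_at < self.timeout_s:
            return "waiting"
        if self.level + 1 < len(self.escalation_path):
            self.level += 1           # hand over to the next approver
            self.requested_at = now   # restart the timeout window
            return "escalated"
        return "expired"

    def decide(self, approver: str, decision: str) -> None:
        if approver != self.current_approver():
            raise PermissionError(f"{approver} is not the current approver")
        self.decision = decision


gate = ApprovalGate("deploy", timeout_s=30.0, escalation_path=["on-call", "manager"])
print(gate.tick(10.0), gate.tick(31.0), gate.current_approver())
```

In the ecosystem described here, each request, escalation, and decision would additionally be recorded as an ECT so the approval history is part of the same auditable DAG.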

---

### Draft D: **Agent Ecosystem Protocol Binding (AEPB) — Agnostic Interop Layer**

**Role**: Define how the ecosystem layer (AEM + ATD + HITL) binds to existing A2A and discovery protocols; translation/negotiation so different protocols interoperate.

**Gaps addressed**: **Gap 4** (Cross-Protocol Translation), **Gap 5** (Lifecycle — versioning/retirement).

**Outline**:

1. **Introduction** — Proliferation of A2A protocols (92 drafts); need for a binding layer that preserves ecosystem semantics (DAG, HITL) over any protocol.
2. **Terminology** — Protocol binding, translation, negotiation, capability advertisement.
3. **Capability Advertisement** — How agents/gateways advertise support for AEM/ATD/HITL and assurance levels (reference ADL, ANS, DNS-SD).
4. **Binding Requirements** — What a protocol must provide: task invocation, dependency ordering, checkpoint/rollback signals, HITL callbacks. Mapping to MCP, A2A over HTTP, A2A over MOQT, etc.
5. **Translation and Negotiation** — When two agents speak different protocols: gateway or negotiation (e.g. common subset); minimal translation schema (intent, result, error).
6. **Lifecycle** — Versioning of DAG and agents; graceful shutdown and drain; retirement without breaking dependents (gap 5).
7. **Security Considerations** — Trust of translators; integrity across protocol boundaries.
8. **IANA Considerations** — Registry for protocol binding identifiers.

**Target**: NMOP or individual; **Status**: Experimental.
**Related drafts**: draft-agent-gw, draft-narajala-ans, draft-nederveld-adl, draft-ainp-protocol, draft-mallick-muacp, draft-a2a-moqt-transport.
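
The "minimal translation schema (intent, result, error)" can be illustrated with a neutral message type and two adapters. Everything below is a hypothetical sketch: `NeutralMessage`, the two message shapes ("protocol A" and "protocol B"), and their field names are invented for illustration and do not correspond to MCP, A2A, or any other real wire format.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class NeutralMessage:
    kind: str                        # "intent" | "result" | "error"
    task_id: str
    body: Any
    error_code: Optional[str] = None


def from_proto_a(msg: dict) -> NeutralMessage:
    """Parse a message in hypothetical protocol A into the neutral form."""
    if "fault" in msg:
        return NeutralMessage("error", msg["id"], msg["fault"]["text"], msg["fault"]["code"])
    kind = "intent" if msg["type"] == "call" else "result"
    return NeutralMessage(kind, msg["id"], msg["payload"])


def to_proto_b(n: NeutralMessage) -> dict:
    """Render the neutral form as a message in hypothetical protocol B."""
    out = {"task": n.task_id, "op": n.kind.upper(), "data": n.body}
    if n.kind == "error":
        out["code"] = n.error_code
    return out


def translate_a_to_b(msg: dict) -> dict:
    # A gateway would also emit an ECT for this hop (not shown).
    return to_proto_b(from_proto_a(msg))


print(translate_a_to_b({"id": "t1", "type": "call", "payload": {"x": 1}}))
```

Routing every translation through one neutral form means N protocols need N adapters rather than one translator per protocol pair.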

---

### Draft E: **Assurance Profiles for Agent Ecosystems (APAE) — Dual Regime**

**Role**: Define how the same ecosystem runs in "relaxed" vs "regulated" mode: which proofs, attestations, and provenance are required.

**Gaps addressed**: **Gap 2** (Behavior Verification), **Gap 8** (Cross-Domain Security), **Gap 9** (Dynamic Trust), **Gap 12** (Data Provenance).

**Outline**:

1. **Introduction** — Same DAG/HITL model in dev (K8s) vs regulated (finance, healthcare); assurance profile = dial for proof level.
2. **Terminology** — Assurance profile, proof, attestation, provenance, trust boundary.
3. **Profiles** — **Relaxed**: best-effort, optional audit; **Standard**: audit trail + optional attestation; **Regulated**: attestation per critical node, provenance chain, behavior verification (reference STAMP, DAAP, verifiable conversations).
4. **Behavior Verification** — How runtime behavior is checked against declared policy (reference DAAP, RATS, EAT); evidence format.
5. **Cross-Domain and Trust** — Trust boundaries between domains; dynamic trust scoring (reference Cosmos, SEAT); revocation.
6. **Provenance** — Data lineage along the DAG; format for provenance records (reference verifiable conversation format); retention.
7. **Security Considerations** — What each profile guarantees and does not guarantee.
8. **IANA Considerations** — Registry for assurance profile identifiers.

**Target**: NMOP or LAKE/RATS; **Status**: Informational or Experimental.
**Related drafts**: draft-guy-bary-stamp-protocol, draft-aylward-daap-v2, draft-birkholz-verifiable-agent-conversations, draft-jiang-seat-dynamic-attestation, draft-cosmos-protocol-specification.
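
A profile can be read as a list of evidence that must accompany a node. The sketch below encodes the three profiles from the outline; the evidence labels, the `critical` flag, and the `missing_evidence` helper are hypothetical, and treating the regulated profile as a superset that also requires the audit trail is an assumption rather than something the outline states.

```python
from typing import Dict, List

# Evidence each profile requires. Optional items (audit in "relaxed",
# attestation in "standard") are simply not listed.
REQUIRED_EVIDENCE: Dict[str, List[str]] = {
    "relaxed": [],
    "standard": ["audit"],
    # Assumption: "regulated" also requires the audit trail.
    "regulated": ["audit", "attestation", "provenance", "behavior_verification"],
}


def missing_evidence(profile: str, node: dict) -> List[str]:
    """Return the evidence types a node still lacks under the given profile."""
    required = REQUIRED_EVIDENCE[profile]
    if profile == "regulated" and not node.get("critical", False):
        # Attestation is required per *critical* node only.
        required = [r for r in required if r != "attestation"]
    present = set(node.get("evidence", []))
    return [r for r in required if r not in present]


node = {"critical": True, "evidence": ["audit"]}
for profile in ("relaxed", "standard", "regulated"):
    print(profile, missing_evidence(profile, node))
```

The same node passes under the relaxed and standard profiles and fails under the regulated one, which is the "dial for proof level" idea: the workflow is unchanged, only the required evidence grows.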

---

## 5. Dependency Graph of Drafts

```
  [A: AEM - Architecture & Terminology]
                    |
    +---------+-----+----+----------+
    |         |          |          |
    v         v          v          v
[B: ATD]  [C: HITL]  [D: AEPB]  [E: APAE]
   DAG      Human     Binding   Assurance
    |         |          |          |
    +---------+-----+----+----------+
                    |
    Implementations: same ecosystem in K8s (relaxed) or regulated (proofs)
```

- **A** is the foundation; B, C, D, E reference A and each other where needed.
- **B** (ATD) and **C** (HITL) are the core execution model; **D** (AEPB) makes it agnostic; **E** (APAE) makes it dual-regime.

---

## 6. How to Use This With Your Analyzer

- **Generate outlines via CLI**: Use `ietf draft-gen <topic>` with topics: "Agent Ecosystem Model", "Agent Task DAG", "Human-in-the-Loop primitives", "Agent Ecosystem Protocol Binding", "Assurance Profiles for Agent Ecosystems". Feed the gap context from this file or from `data/reports/gaps.md`.
- **Cross-reference**: When writing each draft, cite the related drafts listed above; the analyzer’s `ietf similar` and `ietf compare` can find more.
- **Ideas database**: Search `ideas` for mechanism/architecture/protocol types (e.g. "checkpoint", "override", "translation") to pull concrete mechanisms into sections.

---

## 7. Deeper Datatracker Analysis — What to Run

To keep refining these drafts from the data, use your analyzer as follows:

| Goal | Commands / data |
|------|------------------|
| **Find drafts that touch DAG/workflow** | `ietf search "task graph"`, `ietf search "workflow"`, `ietf search "checkpoint"`, `ietf search "rollback"`; then `ietf similar <name>` for each. |
| **Find HITL-related work** | `ietf search "human-in-the-loop"`, `ietf search "override"`, `ietf search "approval"`; check `data/reports/ideas.md` for "CHEQ", "human", "override". |
| **Cross-protocol / interop** | `ietf report overlap-matrix`; focus on A2A vs other categories; `ietf similar draft-agent-gw` (protocol adaptation). |
| **Assurance / proofs** | Search ideas for "attestation", "verifiable", "provenance", "trust"; list drafts in AI safety/alignment and identity/auth. |
| **Gap → draft outline** | For each gap in `gaps.md`, run `ietf draft-gen "<gap topic>"` (e.g. "Agent Error Recovery and Rollback") and merge with the outlines above. |
| **Cluster overlap** | `ietf clusters --threshold 0.85` to see near-duplicates; use one as canonical and reference others to reduce fragmentation. |

Refreshing the pipeline periodically (`ietf fetch`, `ietf analyze --all`, `ietf ideas --all`, `ietf gaps`) keeps gaps and ideas aligned with the latest datatracker activity.
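
For intuition about the "Cluster overlap" row, the sketch below shows one generic way a similarity threshold can group near-duplicates: treat every pair at or above the threshold as linked and take connected components. This is not a description of how `ietf clusters` is implemented; the function, the draft names, and the similarity values are hypothetical.

```python
from typing import Dict, List, Tuple


def near_duplicate_groups(sims: Dict[Tuple[str, str], float], threshold: float) -> List[List[str]]:
    """Group items whose pairwise similarity chains meet the threshold (union-find)."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), score in sims.items():
        root_a, root_b = find(a), find(b)
        if score >= threshold:
            parent[root_a] = root_b

    groups: Dict[str, List[str]] = {}
    for item in parent:
        groups.setdefault(find(item), []).append(item)
    # Only groups with more than one member count as near-duplicates.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Hypothetical similarity scores between placeholder drafts.
sims = {("d1", "d2"): 0.90, ("d2", "d3"): 0.86, ("d3", "d4"): 0.50}
print(near_duplicate_groups(sims, 0.85))
```

Note that linking is transitive in this sketch: `d1` and `d3` end up in one group through `d2` even if their direct similarity is below the threshold, which is worth keeping in mind when choosing a canonical draft per cluster.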

---

## 8. One-Page Pitch (Elevator Version)

**Problem**: 260+ IETF drafts on AI agents; no single model for orchestration (DAG), human oversight (HITL), or for running the same system in both fast and regulated environments.

**Proposal**: Five coordinated drafts — (A) shared architecture and terminology, (B) DAG execution with checkpoints and rollback, (C) HITL primitives (approval, override, explainability), (D) protocol-agnostic binding so any A2A protocol can participate, (E) assurance profiles so the same stack works in K8s or in a proof-heavy regulated regime.

**Outcome**: One holistic agent ecosystem: DAG + HITL built in, agnostic and extensible, applicable everywhere from relaxed to fully proven.