feat: add draft data, gap analysis report, and workspace config

2026-04-06 18:47:15 +02:00
parent 4f310407b0
commit 2506b6325a
189 changed files with 62649 additions and 0 deletions
--- a/workspace/drafts/new-drafts/draft-a-aem-agent-ecosystem-model-00.md
+++ b/workspace/drafts/new-drafts/draft-a-aem-agent-ecosystem-model-00.md
@@ -0,0 +1,289 @@
+---
+title: "Agent Ecosystem Model (AEM): Architecture and Terminology"
+abbrev: "AEM"
+category: info
+docname: draft-aem-agent-ecosystem-model-00
+submissiontype: IETF
+number:
+date:
+v: 3
+area: "OPS"
+workgroup: "NMOP"
+keyword:
+  - agent ecosystem
+  - DAG
+  - HITL
+  - agentic workflows
+
+author:
+  -
+    fullname: TBD
+    organization: Independent
+    email: placeholder@example.com
+
+normative:
+  RFC2119:
+  RFC8174:
+
+informative:
+  I-D.nennemann-wimse-ect:
+    title: "Execution Context Tokens for Distributed Agentic Workflows"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
+  I-D.nennemann-agent-dag-hitl-safety:
+    title: "Agent Context Policy Token: DAG Delegation with Human Override"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
+
+--- abstract
+
+This document defines the Agent Ecosystem Model (AEM), a shared
+architecture and terminology for building interoperable agent
+systems that incorporate DAG-based execution, human-in-the-loop
+safety, and graduated assurance levels.  AEM is not a protocol.
+It is a reference model that establishes common vocabulary and
+architectural concepts so that companion specifications (ATD,
+HITL, AEPB, APAE) and implementors share a consistent frame of
+reference.  The model builds on Execution Context Tokens (ECT)
+for execution evidence and ACP-DAG-HITL for delegation policy.
+
+--- middle
+
+# Introduction
+
+The IETF AI/agent landscape includes over 260 drafts proposing
+protocols for agent communication, identity, safety, and
+operations.  These drafts share many implicit concepts — tasks,
+delegation, workflows, safety checks — but use inconsistent
+terminology and incompatible models.
+
+AEM provides a single reference architecture so that:
+
+- Companion drafts (ATD, HITL, AEPB, APAE) share vocabulary.
+- Implementors understand how the pieces compose.
+- New proposals can position themselves within an existing model
+  rather than inventing another one.
+
+AEM is deliberately not a protocol.  It defines no wire formats,
+no endpoints, and no new token types.  It is the map; the
+companion drafts are the territory.
+
+## Design Principles
+
+1. **ECT is the execution backbone.**  All significant agent
+   actions produce Execution Context Tokens
+   {{I-D.nennemann-wimse-ect}}.  The ecosystem does not define a
+   second DAG or audit format.
+
+2. **ACP-DAG-HITL is the policy backbone.**
+   {{I-D.nennemann-agent-dag-hitl-safety}} defines delegation
+   DAGs and HITL rules.  The ecosystem extends these with
+   operational semantics, not replacement structures.
+
+3. **Same model, different assurance.**  The architecture works
+   identically from a relaxed K8s dev cluster (ECT L1) to a
+   regulated healthcare environment (ECT L3 with audit ledger).
+
+4. **Protocol-agnostic.**  The ecosystem sits above any A2A
+   protocol.  Agents may speak different protocols and still
+   participate through translation.
+
+# Conventions and Definitions
+
+{::boilerplate bcp14-tagged}
+
+# Terminology {#terminology}
+
+Agent:
+: An autonomous software entity that performs tasks, makes
+  decisions, and communicates with other agents or humans.
+
+Task:
+: A discrete unit of work performed by an agent, recorded as a
+  single ECT node.
+
+Workflow:
+: A set of tasks linked by dependencies, forming a DAG.
+  Identified by the ECT `wid` claim.
+
+DAG (Directed Acyclic Graph):
+: The execution graph formed by ECT parent references (`par`
+  claims).  Also used in ACP-DAG-HITL for delegation structure.
+
+Checkpoint:
+: An ECT node recording agent state before a consequential
+  action, enabling rollback.
+
+HITL Point:
+: A position in the workflow where human intervention is
+  required or available, governed by ACP-DAG-HITL rules.
+
+Override:
+: A human-initiated command that alters an agent's autonomous
+  operation, taking precedence over the agent's own decisions.
+
+Trust Score:
+: A floating-point value in \[0.0, 1.0\] representing one
+  agent's assessment of another agent's reliability.
+
+Protocol Binding:
+: The mapping between ecosystem semantics and a specific A2A
+  communication protocol.
+
+Assurance Level:
+: The degree of cryptographic and audit protection applied to
+  ECTs: L1 (unsigned JSON), L2 (signed JWT), L3 (signed +
+  audit ledger).  Defined by {{I-D.nennemann-wimse-ect}}.
+
+# Architectural Model {#architecture}
+
+The ecosystem is organized in four layers:
+
+~~~
+┌─────────────────────────────────────────────────────┐
+│                  Policy Layer                        │
+│  ACP-DAG-HITL: delegation DAG, HITL rules,          │
+│  node constraints, trust thresholds                  │
+├─────────────────────────────────────────────────────┤
+│               Semantics Layer                        │
+│  ATD: execution order, checkpoints, rollback,        │
+│       circuit breakers, resource hints                │
+│  HITL: override levels, approval gates, escalation   │
+│  AEPB: capability ads, negotiation, translation      │
+│  APAE: trust scoring, behavior verification,         │
+│        provenance, assurance profiles                 │
+├─────────────────────────────────────────────────────┤
+│               Evidence Layer                         │
+│  ECT: DAG of execution records (L1/L2/L3)           │
+│  inp_hash/out_hash, ext claims, audit ledger         │
+├─────────────────────────────────────────────────────┤
+│               Identity Layer                         │
+│  WIMSE / X.509 / OAuth / JWK: agent identity         │
+└─────────────────────────────────────────────────────┘
+~~~
+{: #fig-stack title="Ecosystem Layer Stack"}
+
+Identity Layer:
+: Answers "who is this agent?"  AEM does not define identity
+  mechanisms; it assumes WIMSE, X.509, OAuth, or equivalent.
+
+Evidence Layer:
+: Answers "what did this agent do?"  ECT provides per-task
+  records linked into a DAG, at three assurance levels (signed
+  at L2 and L3).
+
+Semantics Layer:
+: Answers "what does it mean and what to do about it?"  The
+  four companion drafts define operational semantics on top of
+  ECT:
+
+  - **ATD** (Agent Task DAG): execution order, checkpoints,
+    rollback, circuit breakers, resource hints.
+  - **HITL** (Human-in-the-Loop): override levels, approval
+    gates, escalation paths, explainability.
+  - **AEPB** (Agent Ecosystem Protocol Binding): capability
+    advertisement, protocol negotiation, translation gateways,
+    agent lifecycle.
+  - **APAE** (Assurance Profiles): dynamic trust scoring,
+    behavior verification, data provenance, assurance profiles.
+
+Policy Layer:
+: Answers "what's allowed?"  ACP-DAG-HITL defines delegation
+  constraints and HITL trigger rules.  Companion drafts extend
+  `constraints` with draft-specific fields (trust thresholds,
+  checkpoint policies, protocol restrictions).
+
+## How ECT Extensions Work
+
+Each companion draft defines `ext` claim namespaces on ECT:
+
+| Draft | `ext` prefix | Example claims |
+|-------|-------------|----------------|
+| ATD | `atd.*` | `atd.reversible`, `atd.severity`, `atd.circuit_state` |
+| HITL | `hitl.*` | `hitl.level`, `hitl.operator_id`, `hitl.prior_state` |
+| AEPB | `aepb.*` | `aepb.source_protocol`, `aepb.dest_protocol` |
+| APAE | `apae.*` | `apae.trust_score`, `apae.confidence`, `apae.hops` |
+{: #fig-ext title="ECT Extension Namespaces"}
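+
+As an illustrative sketch (not taken from the ECT specification),
+the payload below shows how a single ECT might carry claims from
+two namespaces at once.  The flat dotted keys inside `ext`, the
+use of `jti` as the node identifier, and every value shown are
+assumptions made for this example; the companion drafts define
+the actual encoding.
+
+~~~ json
+{
+  "jti": "ect-0042",
+  "wid": "wf-example-1",
+  "par": ["ect-0041"],
+  "ext": {
+    "atd.reversible": true,
+    "atd.severity": "high",
+    "apae.trust_score": 0.82,
+    "apae.confidence": 0.9
+  }
+}
+~~~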
+
+## How Policy Extensions Work
+
+Each companion draft defines `constraints` fields on
+ACP-DAG-HITL DAG nodes:
+
+| Draft | Constraint fields |
+|-------|------------------|
+| ATD | `atd.checkpoint_policy`, `atd.circuit_threshold` |
+| HITL | (uses HITL rules directly) |
+| AEPB | `aepb.allowed_protocols`, `aepb.max_translation_hops` |
+| APAE | `apae.min_trust`, `apae.min_confidence`, `apae.assurance_profile` |
+{: #fig-constraints title="ACP-DAG-HITL Node Constraint Extensions"}
+
+# Assurance as an Orthogonal Axis {#assurance}
+
+The entire semantics layer operates identically at all ECT
+assurance levels.  The DAG structure, HITL processing, trust
+scoring, and protocol translation are the same whether the ECT
+is unsigned JSON (L1) or a ledger-committed signed JWT (L3).
+
+What changes across levels is the security envelope:
+
+| Property | L1 | L2 | L3 |
+|----------|----|----|-----|
+| Structured execution records | Yes | Yes | Yes |
+| DAG validation | Yes | Yes | Yes |
+| Non-repudiation | No | Yes | Yes |
+| Tamper detection | Transport only | Signature | Signature + ledger |
+| Regulatory audit trail | No | No | Yes |
+{: #fig-assurance title="Assurance Level Properties"}
+
+A deployment MAY use different levels for different workflows.
+Internal dev pipelines might use L1; cross-org integrations L2;
+regulated clinical workflows L3.
+
+# Protocol Agnosticism {#agnosticism}
+
+The ecosystem layer sits above any A2A communication protocol.
+Agents communicate via their native protocol (A2A, MCP, SLIM,
+uACP, etc.) while the `Execution-Context` HTTP header
+{{I-D.nennemann-wimse-ect}} carries ECTs alongside protocol
+messages.
+
+When two agents speak different protocols, a translation gateway
+(defined by AEPB) converts between protocols while preserving
+ECT DAG continuity.  The translation hop is itself an ECT node,
+so the cross-protocol path is one auditable DAG.
+
+# Companion Draft Summary {#companions}
+
+| Draft | Abbrev | Concern | Gaps Addressed |
+|-------|--------|---------|----------------|
+| Agent Task DAG | ATD | Execution, checkpoints, rollback | #1 Resource Mgmt, #3 Error Recovery |
+| Human-in-the-Loop | HITL | Override, approval, escalation | #7 Human Override, #11 Explainability |
+| Protocol Binding | AEPB | Interop, translation, lifecycle | #4 Cross-Protocol, #5 Lifecycle |
+| Assurance Profiles | APAE | Trust, verification, provenance | #2 Behavior Verification, #8 Cross-Domain, #9 Dynamic Trust, #12 Provenance |
+{: #fig-companions title="Companion Draft Family"}
+
+The four companion drafts, together with ECT (evidence) and
+ACP-DAG-HITL (policy), form a set of six documents that covers
+all 3 critical and 6 high-severity gaps identified in the IETF
+AI/agent draft landscape.
+
+# Security Considerations
+
+AEM defines no protocol mechanisms and therefore introduces no
+direct security considerations.  Security properties are
+inherited from the evidence layer (ECT assurance levels) and
+the policy layer (ACP-DAG-HITL validation).
+
+Implementors MUST ensure that all layers are consistently
+configured: an L3 ECT deployment provides no additional
+assurance if the policy layer accepts unsigned tokens.
+
+# IANA Considerations
+
+This document has no IANA actions.
+
+--- back
+
+# Acknowledgments
+{:numbered="false"}
+
+This architecture builds on the Execution Context Token
+specification {{I-D.nennemann-wimse-ect}} and the Agent Context
+Policy Token {{I-D.nennemann-agent-dag-hitl-safety}}.
--- a/workspace/drafts/new-drafts/draft-a-aem-agent-ecosystem-model-01.md
+++ b/workspace/drafts/new-drafts/draft-a-aem-agent-ecosystem-model-01.md
@@ -0,0 +1,461 @@
+---
+title: "Agent Ecosystem Model (AEM): Architecture and Terminology"
+abbrev: "AEM"
+category: info
+docname: draft-aem-agent-ecosystem-model-01
+submissiontype: IETF
+number:
+date:
+v: 3
+area: "OPS"
+workgroup: "NMOP"
+keyword:
+  - agent ecosystem
+  - DAG
+  - HITL
+  - agentic workflows
+
+author:
+  -
+    fullname: TBD
+    organization: Independent
+    email: placeholder@example.com
+
+normative:
+  RFC2119:
+  RFC8174:
+  I-D.nennemann-wimse-ect:
+    title: "Execution Context Tokens for Distributed Agentic Workflows"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
+  I-D.nennemann-agent-dag-hitl-safety:
+    title: "Agent Context Policy Token: DAG Delegation with Human Override"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
+
+informative:
+  RFC9334:
+  RFC7519:
+  RFC8615:
+
+--- abstract
+
+This document defines the Agent Ecosystem Model (AEM), a shared
+architecture and terminology for building interoperable agent
+systems that incorporate DAG-based execution, human-in-the-loop
+safety, and graduated assurance levels.  AEM is not a protocol.
+It is a reference model that establishes common vocabulary and
+architectural concepts so that companion specifications (ATD,
+HITL, AEPB, APAE) and implementors share a consistent frame of
+reference.  The model builds on Execution Context Tokens (ECT)
+for execution evidence and ACP-DAG-HITL for delegation policy.
+
+--- middle
+
+# Introduction
+
+The IETF AI/agent landscape includes over 260 drafts proposing
+protocols for agent communication, identity, safety, and
+operations.  These drafts share many implicit concepts — tasks,
+delegation, workflows, safety checks — but use inconsistent
+terminology and incompatible models.
+
+AEM provides a single reference architecture so that:
+
+- Companion drafts (ATD, HITL, AEPB, APAE) share vocabulary.
+- Implementors understand how the pieces compose.
+- New proposals can position themselves within an existing model
+  rather than inventing another one.
+
+AEM is deliberately not a protocol.  It defines no wire formats,
+no endpoints, and no new token types.  It is the map; the
+companion drafts are the territory.
+
+## Design Principles
+
+1. **ECT is the execution backbone.**  All significant agent
+   actions produce Execution Context Tokens
+   {{I-D.nennemann-wimse-ect}}.  The ecosystem does not define a
+   second DAG or audit format.
+
+2. **ACP-DAG-HITL is the policy backbone.**
+   {{I-D.nennemann-agent-dag-hitl-safety}} defines delegation
+   DAGs and HITL rules.  The ecosystem extends these with
+   operational semantics, not replacement structures.
+
+3. **Same model, different assurance.**  The architecture works
+   identically from a relaxed K8s dev cluster (ECT L1) to a
+   regulated healthcare environment (ECT L3 with audit ledger).
+
+4. **Protocol-agnostic.**  The ecosystem sits above any A2A
+   protocol.  Agents may speak different protocols and still
+   participate through translation.
+
+# Conventions and Definitions
+
+{::boilerplate bcp14-tagged}
+
+# Terminology {#terminology}
+
+Agent:
+: An autonomous software entity that performs tasks, makes
+  decisions, and communicates with other agents or humans.
+
+Task:
+: A discrete unit of work performed by an agent, recorded as a
+  single ECT node.
+
+Workflow:
+: A set of tasks linked by dependencies, forming a DAG.
+  Identified by the ECT `wid` claim {{I-D.nennemann-wimse-ect}}.
+
+DAG (Directed Acyclic Graph):
+: The execution graph formed by ECT parent references (`par`
+  claims).  Also used in ACP-DAG-HITL for delegation structure.
+
+Checkpoint:
+: An ECT node recording agent state before a consequential
+  action, enabling rollback.  Fully specified in ATD.
+
+HITL Point:
+: A position in the workflow where human intervention is
+  required or available, governed by ACP-DAG-HITL rules.
+
+Override:
+: A human-initiated command that alters an agent's autonomous
+  operation, taking precedence over the agent's own decisions.
+  Fully specified in HITL.
+
+Trust Score:
+: A floating-point value in \[0.0, 1.0\] representing one
+  agent's assessment of another agent's reliability.  Updated
+  using an additive-increase/multiplicative-decrease (AIMD)
+  model; fully specified in APAE.
+
+Protocol Binding:
+: The mapping between ecosystem semantics and a specific A2A
+  communication protocol.  Fully specified in AEPB.
+
+Assurance Level:
+: The degree of cryptographic and audit protection applied to
+  ECTs, defined in {{I-D.nennemann-wimse-ect}}:
+
+  | Level | ECT Format | Non-repudiation | Tamper detection | Audit ledger |
+  |-------|-----------|----------------|-----------------|-------------|
+  | L1 | Unsigned JSON | No | Transport only | No |
+  | L2 | Signed JWT | Yes | Signature | No |
+  | L3 | Signed JWT | Yes | Signature | Yes (ledger-committed) |
+  {: #fig-levels title="ECT Assurance Levels"}
+
+Assurance Profile:
+: A named configuration (Relaxed, Standard, Regulated) selecting
+  which mechanisms are required at a given deployment.  Fully
+  specified in APAE.
+
+Blast Radius:
+: The set of agents and systems affected by a single failure.
+
+Translation Gateway:
+: A service converting messages between two agent protocols,
+  recording each hop as an ECT DAG node.  Fully specified in AEPB.
+
+# Architectural Model {#architecture}
+
+The ecosystem is organized in four layers:
+
+~~~
+┌─────────────────────────────────────────────────────┐
+│                  Policy Layer                        │
+│  ACP-DAG-HITL: delegation DAG, HITL rules,          │
+│  node constraints, trust thresholds                  │
+├─────────────────────────────────────────────────────┤
+│               Semantics Layer                        │
+│  ATD: execution order, checkpoints, rollback,        │
+│       circuit breakers, resource hints                │
+│  HITL: override levels, approval gates, escalation   │
+│  AEPB: capability ads, negotiation, translation      │
+│  APAE: trust scoring, behavior verification,         │
+│        provenance, assurance profiles                 │
+├─────────────────────────────────────────────────────┤
+│               Evidence Layer                         │
+│  ECT: DAG of execution records (L1/L2/L3)           │
+│  inp_hash/out_hash, ext claims, audit ledger         │
+├─────────────────────────────────────────────────────┤
+│               Identity Layer                         │
+│  WIMSE / X.509 / OAuth / JWK: agent identity         │
+└─────────────────────────────────────────────────────┘
+~~~
+{: #fig-stack title="Ecosystem Layer Stack"}
+
+Identity Layer:
+: Answers "who is this agent?"  AEM does not define identity
+  mechanisms; it assumes WIMSE, X.509, OAuth, or equivalent.
+
+Evidence Layer:
+: Answers "what did this agent do?"  ECT provides per-task
+  records linked into a DAG, at three assurance levels (signed
+  at L2 and L3).
+
+Semantics Layer:
+: Answers "what does it mean and what to do about it?"  The
+  four companion drafts define operational semantics on top of
+  ECT:
+
+  - **ATD** (Agent Task DAG): execution order, checkpoints,
+    rollback, circuit breakers, resource hints.
+  - **HITL** (Human-in-the-Loop): override levels, approval
+    gates, escalation paths, explainability.
+  - **AEPB** (Agent Ecosystem Protocol Binding): capability
+    advertisement, protocol negotiation, translation gateways,
+    agent lifecycle.
+  - **APAE** (Assurance Profiles): dynamic trust scoring,
+    behavior verification, data provenance, assurance profiles.
+
+Policy Layer:
+: Answers "what's allowed?"  ACP-DAG-HITL defines delegation
+  constraints and HITL trigger rules.  Companion drafts extend
+  `constraints` with draft-specific fields (trust thresholds,
+  checkpoint policies, protocol restrictions).
+
+## How ECT Extensions Work {#ect-ext}
+
+Each companion draft defines `ext` claim namespaces on ECT:
+
+| Draft | `ext` prefix | Example claims |
+|-------|-------------|----------------|
+| ATD | `atd.*` | `atd.reversible`, `atd.severity`, `atd.circuit_state` |
+| HITL | `hitl.*` | `hitl.level`, `hitl.operator_id`, `hitl.prior_state` |
+| AEPB | `aepb.*` | `aepb.source_protocol`, `aepb.dest_protocol` |
+| APAE | `apae.*` | `apae.trust_score`, `apae.confidence`, `apae.hops` |
+{: #fig-ext title="ECT Extension Namespaces"}
+
+A draft MUST NOT use another draft's `ext` namespace unless it
+includes a normative reference to that draft.
+
+## How Policy Extensions Work {#policy-ext}
+
+Each companion draft defines `constraints` fields on
+ACP-DAG-HITL DAG nodes:
+
+| Draft | Constraint fields |
+|-------|------------------|
+| ATD | `atd.checkpoint_policy`, `atd.circuit_threshold` |
+| HITL | (uses ACP-DAG-HITL HITL rule fields directly) |
+| AEPB | `aepb.allowed_protocols`, `aepb.max_translation_hops` |
+| APAE | `apae.min_trust`, `apae.min_confidence`, `apae.assurance_profile` |
+{: #fig-constraints title="ACP-DAG-HITL Node Constraint Extensions"}
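+
+A hypothetical delegation node carrying such constraints might
+look as follows.  Only the `agent` and `constraints` field names
+and the constraint keys come from this document; the node `id`,
+the SPIFFE URI, and every value are placeholders, and the real
+node schema is defined by
+{{I-D.nennemann-agent-dag-hitl-safety}}.
+
+~~~ json
+{
+  "id": "node-2",
+  "agent": "spiffe://example.org/agent/firewall-updater",
+  "constraints": {
+    "atd.checkpoint_policy": "before_consequential",
+    "atd.circuit_threshold": 3,
+    "aepb.allowed_protocols": ["A2A", "MCP"],
+    "aepb.max_translation_hops": 2,
+    "apae.min_trust": 0.7,
+    "apae.assurance_profile": "Standard"
+  }
+}
+~~~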
+
+# Assurance as an Orthogonal Axis {#assurance}
+
+The entire semantics layer operates identically at all ECT
+assurance levels.  The DAG structure, HITL processing, trust
+scoring, and protocol translation are the same whether the ECT
+is unsigned JSON (L1) or a ledger-committed signed JWT (L3).
+
+What changes across levels is the security envelope (see
+{{fig-levels}}).  A deployment MAY use different levels for
+different workflows.  Internal dev pipelines might use L1;
+cross-org integrations L2; regulated clinical workflows L3.
+
+Implementations MUST ensure consistency across layers: an L3
+evidence configuration provides no additional assurance if the
+policy layer accepts unsigned tokens.
+
+# Protocol Agnosticism {#agnosticism}
+
+The ecosystem layer sits above any A2A communication protocol.
+Agents communicate via their native protocol (A2A, MCP, SLIM,
+uACP, etc.) while the `Execution-Context` HTTP header
+{{I-D.nennemann-wimse-ect}} carries ECTs alongside protocol
+messages.
+
+When two agents speak different protocols, a translation gateway
+(defined by AEPB) converts between protocols while preserving
+ECT DAG continuity.  The translation hop is itself an ECT node,
+so the cross-protocol path is one auditable DAG.
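+
+To illustrate, a translated exchange could yield three ECT
+payloads chained by `par`, sketched here as a JSON array.  The
+`jti` identifiers, the `exec_act` strings, and the protocol
+labels are invented for this example; AEPB and the ECT
+specification define the actual values.
+
+~~~ json
+[
+  { "jti": "ect-1", "wid": "wf-7",
+    "exec_act": "send_request", "par": [] },
+  { "jti": "ect-2", "wid": "wf-7",
+    "exec_act": "translate", "par": ["ect-1"],
+    "ext": { "aepb.source_protocol": "A2A",
+             "aepb.dest_protocol": "MCP" } },
+  { "jti": "ect-3", "wid": "wf-7",
+    "exec_act": "handle_request", "par": ["ect-2"] }
+]
+~~~
+
+Because the gateway's node (`ect-2`) sits between sender and
+receiver in the same workflow, an auditor can follow `par` from
+`ect-3` back to `ect-1` without leaving the DAG.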
+
+# Relationship to Existing Standards {#standards}
+
+The ecosystem builds on existing IETF and industry standards.
+It does not replace any of them.
+
+| Standard | Scope | Relationship to AEM |
+|----------|-------|---------------------|
+| WIMSE (draft-ietf-wimse-arch) | Workload identity and security context propagation | Identity Layer; AEM assumes WIMSE for agent credentials and context propagation. |
+| ECT (I-D.nennemann-wimse-ect) | JWT-based execution evidence; DAG linkage via `par` | Evidence Layer; every significant action in the ecosystem produces an ECT. |
+| ACP-DAG-HITL (I-D.nennemann-agent-dag-hitl-safety) | Delegation DAG policy; HITL trigger rules | Policy Layer; ATD/HITL/AEPB/APAE extend the `constraints` fields rather than replacing the policy language. |
+| OAuth 2.0 / RAR (RFC9396) | Authorization for API access | Identity Layer; operators and agents authenticate to HITL endpoints and capability documents via OAuth. |
+| RATS (RFC9334) | Remote attestation architecture for appraising a peer's trustworthiness, including evidence freshness | Informative to APAE Regulated profile; behavior verification attestations are RATS-compatible. |
+| SPIFFE/SPIRE | Workload identity URI scheme (`spiffe://`) | Identity Layer; agent identities in ECT `sub` and ACP-DAG-HITL node `agent` fields use SPIFFE URIs by convention. |
+{: #fig-standards title="Relationship to Existing Standards"}
+
+## Working Group Targets
+
+| Companion Draft | Suggested WG | Rationale |
+|----------------|-------------|-----------|
+| AEM (this document) | NMOP | Informational reference model for network operations automation. |
+| ATD | NMOP | Execution semantics and error recovery for network agent workflows. |
+| HITL | NMOP or OPS | Human override for autonomous network management. |
+| AEPB | DISPATCH or ART | Protocol binding and interoperability layer; dispatch to appropriate WG. |
+| APAE | RATS or Security Dispatch | Attestation-based trust and assurance profiles for agents. |
+{: #fig-wgs title="Suggested Working Group Targets"}
+
+# Companion Draft Summary {#companions}
+
+| Draft | Abbrev | Concern | Gaps Addressed | Normative/Informative |
+|-------|--------|---------|----------------|----------------------|
+| Agent Task DAG | ATD | Execution, checkpoints, rollback, circuit breakers | #1 Resource Mgmt, #3 Error Recovery | Normative |
+| Human-in-the-Loop | HITL | Override, approval, escalation, explainability | #7 Human Override, #11 Explainability | Normative |
+| Protocol Binding | AEPB | Interop, translation, lifecycle | #4 Cross-Protocol, #5 Lifecycle | Normative |
+| Assurance Profiles | APAE | Trust, verification, provenance, dual-regime | #2 Behavior Verification, #8 Cross-Domain, #9 Dynamic Trust, #12 Provenance | Informative/Normative |
+{: #fig-companions title="Companion Draft Family"}
+
+The four companion drafts, together with ECT (evidence) and
+ACP-DAG-HITL (policy), form a set of six documents that covers
+all 3 critical and 6 high-severity gaps identified in the IETF
+AI/agent draft landscape analysis.
+
+# Implementation Guidance {#implementation}
+
+## Choosing an Assurance Level
+
+Operators select the assurance level based on deployment context:
+
+Relaxed (L1):
+: Appropriate for internal development, testing, and
+  observability pipelines.  No cryptographic overhead.
+  Operators SHOULD NOT use L1 where ECT records could be
+  relied upon as evidence in disputes.
+
+Standard (L2):
+: Appropriate for production cross-organization deployments.
+  Signed ECTs provide non-repudiation.  RECOMMENDED as the
+  default for any deployment where agents cross trust domains.
+
+Regulated (L3):
+: Required for deployments subject to regulatory audit
+  requirements (healthcare, finance, critical infrastructure).
+  ECTs are committed to an append-only audit ledger.
+  Operators MUST use L3 when a regulatory framework mandates
+  tamper-evident audit trails.
+
+## Minimum Viable Implementation
+
+An implementation is AEM-compliant if it satisfies all of the
+following:
+
+1. **Evidence**: Emits ECTs for all consequential actions.
+   MAY use L1 initially.
+
+2. **Policy**: Evaluates ACP-DAG-HITL node constraints before
+   delegating tasks.
+
+3. **Checkpoints**: Implements ATD §4 (checkpoints before
+   consequential actions).  MUST declare `atd.reversible`.
+
+4. **HITL endpoint**: Implements HITL `/.well-known/hitl/override`
+   and responds within 1 second.
+
+5. **Capability document**: Serves AEPB `/.well-known/aepb` so
+   peers can discover protocol support.
+
+The following are OPTIONAL at L1 but REQUIRED at L2+:
+
+- Cryptographic signing of ECTs.
+- APAE trust scoring.
+- Behavior verification.
+
+The following are REQUIRED only at L3 (Regulated profile):
+
+- Audit ledger commitment.
+- Continuous behavior verification.
+- Provenance claims on data-transforming ECT nodes.
+
+## Upgrade Path
+
+Upgrading from L1 to L2:
+: Add a signing key (WIMSE WIT or X.509).  Update ECT emission
+  to sign tokens.  Update all agents to verify signatures.
+  No protocol changes needed; ECT format is compatible.
+
+Upgrading from L2 to L3:
+: Configure an audit ledger endpoint.  Update ECT emission to
+  commit each ECT.  Enable APAE continuous behavior
+  verification.  Enable provenance claims.
+
+Operators MUST NOT downgrade the assurance level during an
+active workflow.
+
+# Security Considerations
+
+## Threat Model
+
+The AEM threat model covers the following adversary classes:
+
+**Compromised Agent**: An agent that emits false ECTs, fabricates
+errors, or attempts unauthorized rollbacks.  Mitigated by ECT
+signature verification (L2+) and ACP-DAG-HITL policy validation.
+
+**Rogue Operator**: A human who issues unauthorized overrides.
+Mitigated by HITL authentication requirements (signed JWTs,
+mutual TLS) and multi-operator approval for Level 4 TAKEOVER.
+
+**Translation Gateway Attack**: A malicious or compromised
+gateway that alters message content in transit.  Mitigated by
+ECT `inp_hash`/`out_hash` integrity checks; receivers MUST
+detect hash mismatches.
+
+**Trust Score Manipulation**: An agent accumulates high trust
+through benign behavior, then executes a malicious action.
+Mitigated by APAE double-penalty for `policy_violation` events
+and anomaly detection.
+
+**Downgrade Attack**: An attacker forces use of L1 ECTs where
+L2+ is required.  Mitigated by explicit assurance level checks
+in ACP-DAG-HITL constraints (`apae.assurance_profile` field).
+
+## Layer Consistency Requirement
+
+Implementations MUST configure the semantics, evidence, and
+policy layers consistently.  Specifically:
+
+- An L3 evidence deployment MUST NOT accept L1 ECTs as proof
+  of action in audit or policy decisions.
+- A Regulated assurance profile MUST be paired with L3 ECTs.
+- HITL interactions at Level 2 or above (approval required) MUST
+  be authenticated.
+
+## Translation Gateway Supply Chain
+
+Translation gateways are privileged intermediaries: they have
+access to plaintext message content and can inject ECT nodes.
+Operators MUST:
+
+- Authenticate gateways using the same identity mechanisms as
+  agents (WIMSE/SPIFFE).
+- Audit gateway ECT nodes at L2+ for tamper detection.
+- Limit `aepb.max_translation_hops` to prevent unbounded
+  delegation chains through untrusted gateways.
+
+# IANA Considerations
+
+## AEM Ecosystem Extension Registry
+
+This document requests that IANA create the "AEM Ecosystem
+Extension Registry".  This registry collects:
+
+1. **ECT Extension Namespaces**: Companion draft `ext` claim
+   prefixes (see {{fig-ext}}).
+2. **ACP-DAG-HITL Constraint Field Namespaces**: Companion draft
+   `constraints` field prefixes (see {{fig-constraints}}).
+3. **ECT `exec_act` Values**: All `exec_act` strings registered
+   by companion drafts (see each companion's IANA section).
+
+Registration policy: Specification Required.
+
+Initial entries are those defined in {{fig-ext}},
+{{fig-constraints}}, and the companion draft `exec_act`
+registrations.
+
+--- back
+
+# Acknowledgments
+{:numbered="false"}
+
+This architecture builds on the Execution Context Token
+specification {{I-D.nennemann-wimse-ect}} and the Agent Context
+Policy Token {{I-D.nennemann-agent-dag-hitl-safety}}.  The
+working group targets in {{fig-wgs}} reflect the current IETF
+AI/agent draft landscape analysis.
--- a/workspace/drafts/new-drafts/draft-aerr-agent-error-recovery-rollback-00.md
+++ b/workspace/drafts/new-drafts/draft-aerr-agent-error-recovery-rollback-00.md
@@ -0,0 +1,397 @@
+---
+title: "Agent Error Recovery and Rollback (AERR)"
+abbrev: "AERR"
+category: std
+docname: draft-aerr-agent-error-recovery-rollback-00
+submissiontype: IETF
+number:
+date:
+v: 3
+area: "OPS"
+workgroup: "NMOP"
+keyword:
+  - error recovery
+  - rollback
+  - circuit breaker
+  - agentic workflows
+  - execution context
+
+author:
+  -
+    fullname: Generated by IETF Draft Analyzer
+    organization: Independent
+    email: placeholder@example.com
+
+normative:
+  RFC7519:
+  RFC7515:
+  RFC9110:
+  I-D.nennemann-wimse-ect:
+    title: "Execution Context Tokens for Distributed Agentic Workflows"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
+
+informative:
+  I-D.nennemann-agent-dag-hitl-safety:
+    title: "Agent Context Policy Token: DAG Delegation with Human Override"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
+
+--- abstract
+
+This document defines the Agent Error Recovery and Rollback (AERR)
+protocol, a standard for handling errors, cascading failures, and
+rollback in multi-agent systems.  AERR defines three mechanisms:
+state checkpoints recorded as Execution Context Token (ECT) DAG
+nodes, a circuit breaker pattern to contain cascading failures,
+and a rollback protocol that walks the ECT DAG backwards to revert
+agent actions to a known-good state.  By building on ECT, AERR
+inherits cryptographic audit trails, assurance levels, and DAG
+validation without inventing parallel infrastructure.
+
+--- middle
+
+# Introduction
+
+The IETF AI/agent landscape includes 60 drafts on autonomous
+network operations but none that standardize error recovery.  When
+an autonomous agent misconfigures a router, allocates resources
+incorrectly, or triggers a cascade of failures across a multi-agent
+system, there is no standard mechanism for detecting the failure,
+containing its blast radius, or reverting to a safe state.
+
+AERR borrows proven patterns from distributed systems -- checkpoints
+from database transactions, circuit breakers from microservice
+architectures, rollback from version control -- and adapts them for
+AI agent workflows.  Rather than inventing its own audit and
+tracing layer, AERR records all checkpoints, errors, and rollbacks
+as ECT DAG nodes {{I-D.nennemann-wimse-ect}}, giving every
+recovery action a cryptographic proof chain.
+
+Design principles:
+
+1. Agents that take consequential actions MUST be able to undo
+   them, or MUST declare them irreversible upfront.
+2. Failure containment takes priority over failure diagnosis.
+3. The protocol adds minimal overhead to the happy path.
+
+# Conventions and Definitions
+
+{::boilerplate bcp14-tagged}
+
+Checkpoint:
+: An ECT recording an agent's state hash before a consequential
+  action, providing a restore point for rollback.
+
+Circuit Breaker:
+: A mechanism that stops an agent from propagating requests to a
+  failing downstream agent, preventing cascading failures.
+
+Rollback:
+: The process of reverting an agent's actions and state to a
+  previously recorded checkpoint, walking the ECT DAG backwards.
+
+Blast Radius:
+: The set of agents and systems affected by a single agent's
+  failure, determinable by traversing the ECT DAG forward from the
+  failing node.
+
+# Problem Statement
+
+Consider a network operations scenario: Agent A instructs Agent B
+to update firewall rules, which causes Agent C's traffic monitoring
+to fail, which causes Agent D to misclassify traffic.  Today each
+agent handles errors independently.  There is no standard way for
+Agent D to signal that the root cause is upstream, for the cascade
+to be halted, or for the chain of actions to be rolled back.
+
+The ECT DAG {{I-D.nennemann-wimse-ect}} already records causal
+ordering of agent actions via `par` references.  AERR adds
+checkpoint semantics, error propagation, and rollback operations
+on top of this existing structure.
+
+# Checkpoint Mechanism {#checkpoints}
+
+An AERR-compliant agent MUST create a checkpoint ECT before any
+action it classifies as consequential.  An action is consequential
+if it modifies external state (e.g., network config, database
+records, API calls with side effects).
+
+## Checkpoint as ECT
+
+A checkpoint is an ECT with:
+
+- `exec_act`: `"aerr:checkpoint"`
+- `par`: the `jti` of the preceding task ECT in the workflow
+- `out_hash`: SHA-256 hash of the agent's state snapshot at
+  checkpoint time (for rollback integrity verification)
+
+The `ext` claim carries AERR-specific metadata:
+
+~~~json
+{
+  "ext": {
+    "aerr.action_type": "config_update",
+    "aerr.target": "router-07.example.com",
+    "aerr.reversible": true,
+    "aerr.rollback_uri": "https://agent-b.example.com/aerr/rollback",
+    "aerr.ttl": 86400
+  }
+}
+~~~
+{: #fig-checkpoint title="Checkpoint ECT Extension Claims"}
+
+The `aerr.reversible` field MUST be present.  If `false`, the
+agent declares that this action cannot be automatically undone
+and rollback requests MUST be escalated to a human operator via
+the HITL mechanism {{I-D.nennemann-agent-dag-hitl-safety}}.
+
+Agents MAY create hierarchical checkpoints using the ECT DAG: a
+parent checkpoint ECT with `par` references to multiple child
+checkpoint ECTs.  Rolling back the parent rolls back all children.
+
+## Checkpoint Storage
+
+Checkpoint ECTs MUST be stored for at least the duration specified
+by `aerr.ttl` (in seconds).  At assurance level L3
+{{I-D.nennemann-wimse-ect}}, checkpoints are automatically
+preserved in the audit ledger.  At L1 and L2, agents MUST store
+checkpoints in durable local storage that survives agent restarts.
+
+# Error Signaling {#error-signals}
+
+When an agent detects an error, it MUST produce an error ECT and
+propagate it to affected agents in the DAG.
+
+## Error ECT
+
+An error signal is an ECT with:
+
+- `exec_act`: `"aerr:error"`
+- `par`: the `jti` of the checkpoint ECT associated with the
+  failing action
+
+The `ext` claim carries error details:
+
+~~~json
+{
+  "ext": {
+    "aerr.severity": "critical",
+    "aerr.error_type": "action_failed",
+    "aerr.description": "BGP session did not establish",
+    "aerr.checkpoint_id": "550e8400-e29b-41d4-a716-446655440001",
+    "aerr.upstream_errors": []
+  }
+}
+~~~
+{: #fig-error title="Error ECT Extension Claims"}
+
+Severity levels: `info`, `warning`, `error`, `critical`.
+
+Error types: `action_failed`, `timeout`, `constraint_violation`,
+`resource_exhausted`, `upstream_cascade`, `circuit_open`, `unknown`.
+
+## Error Propagation via DAG
+
+When an agent receives an error ECT caused by an action it
+initiated, it MUST either:
+
+(a) Attempt automatic rollback of its checkpoint ({{rollback}}), or
+
+(b) Escalate to its operator if the action was irreversible.
+
+The `aerr.upstream_errors` array allows agents to chain error
+context by referencing `jti` values of predecessor error ECTs,
+building a causal trace from symptom to root cause through the
+DAG.
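+
+As an illustrative sketch, in the scenario from the Problem
+Statement, Agent D's error ECT might chain to Agent C's error ECT
+as shown below.  The `jti` values and the description are made up
+for this example, and showing `jti`, `exec_act`, and `par` (as an
+array) next to `ext` in one object is an assumption about the ECT
+claim layout, not something this document defines.
+
+~~~json
+{
+  "jti": "550e8400-e29b-41d4-a716-446655440012",
+  "exec_act": "aerr:error",
+  "par": ["550e8400-e29b-41d4-a716-446655440011"],
+  "ext": {
+    "aerr.severity": "error",
+    "aerr.error_type": "upstream_cascade",
+    "aerr.description": "Traffic misclassified after monitoring failure",
+    "aerr.checkpoint_id": "550e8400-e29b-41d4-a716-446655440011",
+    "aerr.upstream_errors": ["550e8400-e29b-41d4-a716-446655440010"]
+  }
+}
+~~~
+{: #fig-error-chain title="Chained Error ECT (Illustrative)"}
+
+Following `aerr.upstream_errors` from this ECT leads to Agent C's
+error ECT, whose own array would in turn reference the error ECT
+for Agent B's firewall change.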
+
+## HITL Escalation
+
+When an error requires human intervention, the error ECT SHOULD
+trigger a HITL rule per {{I-D.nennemann-agent-dag-hitl-safety}}.
+Example policy:
+
+~~~json
+{
+  "hitl": {
+    "rules": [{
+      "id": "r-critical-error",
+      "trigger": {
+        "kind": "keyword_match",
+        "op": "eq",
+        "value": "critical",
+        "input_ref": "ext.aerr.severity"
+      },
+      "required_role": "operator:oncall",
+      "action": "escalate",
+      "allow_override": true,
+      "override_action": "continue"
+    }]
+  }
+}
+~~~
+{: #fig-hitl-error title="HITL Policy for Critical Errors"}
+
+# Circuit Breaker Pattern {#circuit-breaker}
+
+Each agent MUST implement a circuit breaker for every downstream
+agent it communicates with.
+
+## States
+
+CLOSED (normal):
+: Requests flow through.  The agent tracks the error rate over a
+  sliding window (default: 60 seconds).
+
+OPEN (failure detected):
+: When the error rate exceeds a threshold (default: 50% over the
+  window), the breaker opens.  All requests to the downstream
+  agent are immediately rejected with `aerr.error_type`:
+  `circuit_open`.  The agent MUST produce an error ECT and emit
+  it to upstream peers.
+
+HALF-OPEN (recovery probe):
+: After a cooldown period (default: 30 seconds), the breaker
+  allows a single probe request.  If it succeeds, the breaker
+  returns to CLOSED.  If it fails, it returns to OPEN with doubled
+  cooldown (exponential backoff, max 300 seconds).
+
+## State Change ECTs
+
+Each circuit breaker state change MUST produce an ECT:
+
+- `exec_act`: `"aerr:circuit_open"`, `"aerr:circuit_half_open"`,
+  or `"aerr:circuit_closed"`
+- `par`: the `jti` of the error ECT that triggered the transition
+
+This records the health topology of the agent network in the ECT
+DAG, queryable from the audit ledger at L3.
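+
+A transition to OPEN could be recorded as in the following sketch.
+The `aerr.downstream_agent` and `aerr.error_rate` claim names are
+hypothetical, chosen to mirror the fields of the `/aerr/circuits`
+response ({{fig-circuits}}); this document does not define `ext`
+claims for state change ECTs.  The `jti` is made up, `par` is shown
+as an array by assumption, and its value reuses the
+`last_failure_ect` identifier from that response.
+
+~~~json
+{
+  "jti": "550e8400-e29b-41d4-a716-446655440100",
+  "exec_act": "aerr:circuit_open",
+  "par": ["550e8400-e29b-41d4-a716-446655440099"],
+  "ext": {
+    "aerr.downstream_agent": "spiffe://example.com/agent/router-mgr",
+    "aerr.error_rate": 0.75
+  }
+}
+~~~
+{: #fig-circuit-ect title="Circuit Breaker State Change ECT (Illustrative)"}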
+
+## Observability
+
+Agents MUST expose circuit breaker state at:
+
+~~~
+GET /aerr/circuits
+~~~
+
+Response:
+
+~~~json
+{
+  "circuits": [{
+    "downstream_agent": "spiffe://example.com/agent/router-mgr",
+    "state": "open",
+    "error_rate": 0.75,
+    "last_failure_ect": "550e8400-e29b-41d4-a716-446655440099",
+    "cooldown_remaining_s": 22
+  }]
+}
+~~~
+{: #fig-circuits title="Circuit Breaker Status"}
+
+# Rollback Protocol {#rollback}
+
+## Rollback Request
+
+A rollback is initiated by sending an HTTP POST to the target
+agent's rollback endpoint:
+
+~~~
+POST /aerr/rollback HTTP/1.1
+Host: agent-b.example.com
+Content-Type: application/json
+Execution-Context: <rollback-request-ECT>
+
+{
+  "rollback_id": "urn:uuid:...",
+  "checkpoint_id": "550e8400-e29b-41d4-a716-446655440001",
+  "reason": "Upstream action caused cascading failure",
+  "cascade": true
+}
+~~~
+{: #fig-rollback-req title="Rollback Request"}
+
+The request MUST include an ECT in the Execution-Context header
+with `exec_act`: `"aerr:rollback_request"` and `par` referencing
+the error ECT that motivated the rollback.
+
+When `cascade` is `true`, the receiving agent MUST also initiate
+rollback of any downstream checkpoints created as a consequence
+of the checkpointed action.  The ECT DAG's `par` chain identifies
+these downstream actions.
+
+## Rollback Response
+
+The agent produces a rollback result ECT with:
+
+- `exec_act`: `"aerr:rollback_complete"` (or `"aerr:rollback_escalated"`)
+- `par`: the `jti` of the rollback request ECT
+- `out_hash`: SHA-256 hash of the agent's state after rollback
+
+~~~json
+{
+  "ext": {
+    "aerr.rollback_id": "urn:uuid:...",
+    "aerr.status": "completed",
+    "aerr.state_hash_before": "sha256:...",
+    "aerr.state_hash_after": "sha256:...",
+    "aerr.cascaded": [
+      {"agent": "spiffe://example.com/agent/monitor", "status": "completed"},
+      {"agent": "spiffe://example.com/agent/classify", "status": "escalated"}
+    ]
+  }
+}
+~~~
+{: #fig-rollback-resp title="Rollback Result ECT"}
+
+Status values: `completed`, `partial`, `escalated`, `failed`.
+
+`escalated` means the action was irreversible and a human operator
+has been notified via HITL.  `partial` means some but not all
+downstream rollbacks succeeded.
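+
+When the checkpointed action was declared irreversible, the result
+ECT might look like the following sketch.  Only the claims relevant
+to escalation are shown.  The `jti` and `par` values are made up,
+and `par` is shown as an array by assumption.
+
+~~~json
+{
+  "jti": "550e8400-e29b-41d4-a716-446655440021",
+  "exec_act": "aerr:rollback_escalated",
+  "par": ["550e8400-e29b-41d4-a716-446655440020"],
+  "ext": {
+    "aerr.rollback_id": "urn:uuid:...",
+    "aerr.status": "escalated",
+    "aerr.cascaded": []
+  }
+}
+~~~
+{: #fig-rollback-escalated title="Escalated Rollback Result ECT (Illustrative)"}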
+
+## Idempotency
+
+Agents MUST implement idempotent rollback: receiving the same
+`rollback_id` twice MUST return the same result without
+re-executing the rollback.
+
+# Security Considerations
+
+Rollback requests are sensitive operations.  Agents MUST
+authenticate rollback requests via the ECT signature chain.  Agents
+SHOULD authorize rollback requests only from agents whose ECTs
+appear in the same workflow DAG (identified by `wid`).
+
+Checkpoint ECTs contain `out_hash` of agent state but not the
+state itself.  Agents MUST encrypt stored state snapshots at rest.
+
+Circuit breaker status exposes system health topology.  The
+`/aerr/circuits` endpoint SHOULD be access-controlled.
+
+Malicious agents could emit false error ECTs to trigger rollbacks.
+Agents SHOULD verify that error ECTs reference valid checkpoint
+`jti` values from their own workflow DAG before initiating
+rollback.  At L2 and L3, ECT signatures prevent forgery.
+
+# IANA Considerations
+
+This document requests the following IANA registrations:
+
+1. An "AERR Error Type" registry under Specification Required
+   policy.  Initial entries: `action_failed`, `timeout`,
+   `constraint_violation`, `resource_exhausted`,
+   `upstream_cascade`, `circuit_open`, `unknown`.
+
+2. Registration of `exec_act` values `aerr:checkpoint`,
+   `aerr:error`, `aerr:rollback_request`, `aerr:rollback_complete`,
+   `aerr:rollback_escalated`, `aerr:circuit_open`,
+   `aerr:circuit_half_open`, `aerr:circuit_closed` in a future ECT
+   action type registry.
+
+--- back
+
+# Acknowledgments
+{:numbered="false"}
+
+This document builds on the Execution Context Token specification
+{{I-D.nennemann-wimse-ect}} for DAG-based audit trails and the
+Agent Context Policy Token {{I-D.nennemann-agent-dag-hitl-safety}}
+for HITL escalation of irreversible actions.
--- a/workspace/drafts/new-drafts/draft-aerr-agent-error-recovery-rollback-00.txt
+++ b/workspace/drafts/new-drafts/draft-aerr-agent-error-recovery-rollback-00.txt
@@ -0,0 +1,309 @@
+Internet-Draft                                           AI/Agent WG
+Intended status: Standards Track                          March 2026
+Expires: September 15, 2026
+
+
+         Agent Error Recovery and Rollback (AERR)
+         draft-aerr-agent-error-recovery-rollback-00
+
+Abstract
+
+   This document defines the Agent Error Recovery and Rollback
+   (AERR) protocol, a lightweight standard for handling errors,
+   cascading failures, and rollback in multi-agent systems.
+   Autonomous AI agents increasingly make unsupervised decisions,
+   yet no standard exists for how agents checkpoint state, signal
+   errors to peers, contain cascading failures, or roll back
+   autonomous decisions gone wrong.  AERR defines three mechanisms:
+   state checkpoints that agents create before consequential
+   actions, a circuit breaker pattern to contain cascading failures
+   across agent networks, and a rollback protocol for reverting
+   agent actions to a known-good state.  The protocol is transport-
+   agnostic and builds on JSON and standard HTTP semantics.
+
+Status of This Memo
+
+   This Internet-Draft is submitted in full conformance with the
+   provisions of BCP 78 and BCP 79.
+
+   This document is intended to have Standards Track status.
+   Distribution of this memo is unlimited.
+
+Table of Contents
+
+   1.  Introduction
+   2.  Terminology
+   3.  Problem Statement
+   4.  Checkpoint Mechanism
+   5.  Error Signaling
+   6.  Circuit Breaker Pattern
+   7.  Rollback Protocol
+   8.  Security Considerations
+   9.  IANA Considerations
+
+1.  Introduction
+
+   The IETF AI/agent landscape includes 60 drafts on autonomous
+   network operations but none that standardize error recovery.
+   When an autonomous agent misconfigures a router, allocates
+   resources incorrectly, or triggers an unintended cascade of
+   actions across a multi-agent system, there is currently no
+   standard mechanism for detecting the failure, containing its
+   blast radius, or reverting to a safe state.
+
+   AERR borrows proven patterns from distributed systems:
+   checkpoints from database transactions, circuit breakers from
+   microservice architectures, and rollback from version control.
+   It adapts these patterns to the specific needs of AI agents,
+   where actions may be partially reversible and where the agent
+   that caused the error may not be the best one to fix it.
+
+   Design principles:
+   1. Agents that take consequential actions MUST be able to undo
+      them, or MUST declare them irreversible upfront.
+   2. Failure containment takes priority over failure diagnosis.
+   3. The protocol adds minimal overhead to the happy path.
+
+2.  Terminology
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
+   "MAY", and "OPTIONAL" in this document are to be interpreted as
+   described in BCP 14 [RFC2119] [RFC8174] when, and only when, they
+   appear in all capitals, as shown here.
+
+   Checkpoint: A snapshot of an agent's state and the external
+   effects of its actions at a point in time, sufficient to
+   restore the system to that state.
+
+   Circuit Breaker: A mechanism that stops an agent from
+   propagating requests to a failing downstream agent, preventing
+   cascading failures.
+
+   Rollback: The process of reverting an agent's actions and state
+   to a previously recorded checkpoint.
+
+   Blast Radius: The set of agents and systems affected by a
+   single agent's failure.
+
+3.  Problem Statement
+
+   Consider a network operations scenario: Agent A instructs
+   Agent B to update firewall rules, which causes Agent C's
+   traffic monitoring to fail, which causes Agent D to
+   misclassify traffic patterns.  Today each agent handles errors
+   independently with no coordination.  There is no standard way
+   for Agent D to signal that the root cause is upstream, for the
+   cascade to be halted, or for the chain of actions to be rolled
+   back.
+
+   The only existing draft that partially addresses this space
+   (draft-yue-anima-agent-recovery-networks) focuses on mobile
+   network fault recovery and does not provide general-purpose
+   error recovery primitives usable across agent types.
+
+4.  Checkpoint Mechanism
+
+   An AERR-compliant agent MUST create a checkpoint before any
+   action it classifies as "consequential."  An action is
+   consequential if it modifies external state (e.g., network
+   config, database records, API calls with side effects).
+
+   A checkpoint is a JSON object:
+
+      {
+        "checkpoint_id": "urn:uuid:...",
+        "agent_id": "urn:uuid:...",
+        "timestamp": "2026-03-01T12:00:00Z",
+        "action": {
+          "type": "config_update",
+          "target": "router-07.example.com",
+          "description": "Update BGP peer config"
+        },
+        "reversible": true,
+        "rollback_procedure": {
+          "method": "POST",
+          "uri": "https://agent-b.example.com/aerr/rollback",
+          "payload_ref": "urn:uuid:...prior-config-snapshot"
+        },
+        "state_hash": "sha256:abcdef...",
+        "ttl": 86400
+      }
+
+   The "reversible" field MUST be present.  If false, the agent
+   declares that this action cannot be automatically undone and
+   rollback requests for this checkpoint MUST be escalated to a
+   human operator.
+
+   The "state_hash" provides integrity verification: the agent
+   hashes its relevant state at checkpoint time so that rollback
+   can verify it is restoring to an authentic prior state.
+
+   Checkpoints MUST be stored for at least the duration specified
+   by "ttl" (seconds).  Agents SHOULD store checkpoints in durable
+   storage that survives agent restarts.
+
+   Agents MAY create hierarchical checkpoints where a parent
+   checkpoint groups multiple child checkpoints from a multi-step
+   operation.  Rolling back the parent rolls back all children.
+
+5.  Error Signaling
+
+   When an agent detects an error, it MUST emit an AERR error
+   signal to all agents in the current action chain.  The error
+   signal is an HTTP POST to each peer's AERR endpoint:
+
+      POST /aerr/error HTTP/1.1
+      Content-Type: application/json
+
+      {
+        "error_id": "urn:uuid:...",
+        "source_agent": "urn:uuid:...",
+        "severity": "critical",
+        "checkpoint_id": "urn:uuid:...",
+        "error_type": "action_failed",
+        "description": "BGP session did not establish after config update",
+        "timestamp": "2026-03-01T12:05:00Z",
+        "upstream_errors": []
+      }
+
+   Severity levels: "info", "warning", "error", "critical".
+
+   Error types: "action_failed", "timeout", "constraint_violation",
+   "resource_exhausted", "upstream_cascade", "circuit_open",
+   "unknown".
+
+   When an agent receives an error signal caused by an action it
+   initiated, it MUST either:
+   (a) Attempt automatic rollback of its checkpoint, or
+   (b) Escalate to its operator if the action was irreversible.
+
+   The "upstream_errors" array allows agents to chain error
+   context, building a causal trace from the symptom back to the
+   root cause.
+
+6.  Circuit Breaker Pattern
+
+   Each agent MUST implement a circuit breaker for every downstream
+   agent it communicates with.  The circuit breaker has three
+   states:
+
+   CLOSED (normal operation): Requests flow through.  The agent
+   tracks the error rate over a sliding window (default: 60s).
+
+   OPEN (failure detected): When the error rate exceeds a
+   threshold (default: 50% over the window), the circuit breaker
+   opens.  All requests to the downstream agent are immediately
+   rejected with error_type "circuit_open".  The agent MUST emit
+   an error signal to upstream peers.
+
+   HALF-OPEN (recovery probe): After a cooldown period (default:
+   30s), the circuit breaker allows a single probe request.  If it
+   succeeds, the breaker returns to CLOSED.  If it fails, it
+   returns to OPEN with a doubled cooldown (exponential backoff,
+   max 300s).
+
+   Agents MUST expose circuit breaker state at:
+
+      GET /aerr/circuits
+
+   Response:
+      {
+        "circuits": [
+          {
+            "downstream_agent": "urn:uuid:...",
+            "state": "open",
+            "error_rate": 0.75,
+            "last_failure": "2026-03-01T12:05:00Z",
+            "cooldown_remaining_s": 22
+          }
+        ]
+      }
+
+   This enables monitoring systems and upstream agents to
+   understand the health topology of the agent network.
+
+7.  Rollback Protocol
+
+   A rollback is initiated by sending an HTTP POST to the target
+   agent's rollback endpoint:
+
+      POST /aerr/rollback HTTP/1.1
+      Host: agent-b.example.com
+      Content-Type: application/json
+
+      {
+        "rollback_id": "urn:uuid:...",
+        "checkpoint_id": "urn:uuid:...",
+        "reason": "Upstream action caused cascading failure",
+        "initiator": "urn:uuid:...",
+        "cascade": true
+      }
+
+   When "cascade" is true, the receiving agent MUST also initiate
+   rollback of any downstream checkpoints that were created as a
+   consequence of the checkpointed action.  This enables a single
+   rollback request to unwind an entire chain of agent actions.
+
+   The agent MUST respond with a rollback result:
+
+      {
+        "rollback_id": "urn:uuid:...",
+        "status": "completed",
+        "checkpoint_id": "urn:uuid:...",
+        "state_hash_before": "sha256:...",
+        "state_hash_after": "sha256:...",
+        "cascaded_rollbacks": [
+          {"agent_id": "urn:uuid:...", "status": "completed"},
+          {"agent_id": "urn:uuid:...", "status": "escalated"}
+        ]
+      }
+
+   Rollback status values: "completed", "partial", "escalated",
+   "failed".
+
+   "escalated" means the action was irreversible and a human
+   operator has been notified.  "partial" means some but not all
+   downstream rollbacks succeeded.
+
+   Agents MUST implement idempotent rollback: receiving the same
+   rollback_id twice MUST return the same result without re-
+   executing the rollback.
+
+8.  Security Considerations
+
+   Rollback requests are sensitive operations.  Agents MUST
+   authenticate rollback requests using mutual TLS or signed JWTs.
+   Agents SHOULD authorize rollback requests only from agents in
+   the same action chain (identified by checkpoint lineage).
+
+   Checkpoint data may contain sensitive system state.  Agents
+   MUST encrypt stored checkpoints at rest and MUST NOT include
+   checkpoint contents in error signals.
+
+   Circuit breaker state is observable information about system
+   health.  The /aerr/circuits endpoint SHOULD be access-
+   controlled to prevent adversaries from mapping system topology.
+
+   Malicious agents could send false error signals to trigger
+   unnecessary rollbacks.  Agents SHOULD verify that error signals
+   reference valid checkpoint IDs from their own action chains
+   before initiating rollback.
+
+9.  IANA Considerations
+
+   This document requests IANA establish the following:
+
+   1. An "AERR Error Type" registry under Specification Required
+      policy.  Initial entries: "action_failed", "timeout",
+      "constraint_violation", "resource_exhausted",
+      "upstream_cascade", "circuit_open", "unknown".
+
+   2. An "AERR Severity Level" registry under Specification
+      Required policy.  Initial entries: "info", "warning",
+      "error", "critical".
+
+   3. Well-known URI registrations for "aerr/error",
+      "aerr/rollback", and "aerr/circuits" per RFC 8615.
+
+Author's Address
+
+   Generated by IETF Draft Analyzer
+   2026-03-01
--- a/workspace/drafts/new-drafts/draft-b-atd-agent-task-dag-00.md
+++ b/workspace/drafts/new-drafts/draft-b-atd-agent-task-dag-00.md
@@ -0,0 +1,386 @@
+---
+title: "Agent Task DAG (ATD): Execution Model, Checkpoints, and Recovery"
+abbrev: "ATD"
+category: std
+docname: draft-atd-agent-task-dag-00
+submissiontype: IETF
+number:
+date:
+v: 3
+area: "OPS"
+workgroup: "NMOP"
+keyword:
+  - agent DAG
+  - checkpoint
+  - rollback
+  - error recovery
+  - circuit breaker
+
+author:
+  -
+    fullname: TBD
+    organization: Independent
+    email: placeholder@example.com
+
+normative:
+  RFC2119:
+  RFC8174:
+  RFC8446:
+  I-D.nennemann-wimse-ect:
+    title: "Execution Context Tokens for Distributed Agentic Workflows"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
+  I-D.nennemann-agent-dag-hitl-safety:
+    title: "Agent Context Policy Token: DAG Delegation with Human Override"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
+
+informative:
+
+--- abstract
+
+This document defines the Agent Task DAG (ATD) specification:
+execution semantics, checkpoints, error signaling, circuit
+breakers, and rollback for agent workflows.  ATD does not define a
+new DAG or token format.  It defines when agents MUST emit ECT
+nodes, what those nodes mean, and how to recover when things go
+wrong.  Checkpoints, errors, and rollback results are ECT nodes
+with specific `exec_act` values and `ext` claims.  Rollback walks
+the ECT DAG backwards.  Circuit breakers contain cascading
+failures.  Resource hints enable scheduling.  The protocol is
+transport-agnostic and builds on ECT for evidence and ACP-DAG-HITL
+for policy.
+
+--- middle
+
+# Introduction
+
+Autonomous agents increasingly make unsupervised decisions, yet no
+standard exists for how agents checkpoint state, signal errors to
+peers, contain cascading failures, or roll back decisions gone
+wrong.
+
+ATD borrows proven patterns from distributed systems: checkpoints
+from database transactions, circuit breakers from microservice
+architectures, and rollback from version control.  It adapts these
+to agent workflows where actions may be partially reversible and
+where the agent that caused the error may not be the best one to
+fix it.
+
+ATD does not define a new DAG format.  The ECT DAG
+{{I-D.nennemann-wimse-ect}} is itself the execution graph; ATD
+defines the semantics of specific node types within that graph.
+
+Design principles:
+
+1. Agents that take consequential actions MUST be able to undo
+   them, or MUST declare them irreversible upfront.
+2. Failure containment takes priority over failure diagnosis.
+3. The protocol adds minimal overhead to the happy path.
+
+# Conventions and Definitions
+
+{::boilerplate bcp14-tagged}
+
+Checkpoint:
+: An ECT node recording agent state before a consequential action,
+  sufficient to restore the system to that state.
+
+Circuit Breaker:
+: A mechanism that stops an agent from propagating requests to a
+  failing downstream agent, preventing cascading failures.
+
+Rollback:
+: The process of reverting an agent's actions and state to a
+  previously recorded checkpoint.
+
+Blast Radius:
+: The set of agents and systems affected by a single failure.
+
+# Node States {#node-states}
+
+Each task node in the ECT DAG has an implicit state derived from
+subsequent ECT nodes:
+
+- **pending**: A delegation node exists in ACP-DAG-HITL but no
+  corresponding ECT has been emitted.
+- **running**: An ECT with `exec_act` matching the task type has
+  been emitted but no completion or error ECT follows.
+- **done**: A completion ECT (or the next `par`-linked ECT) exists.
+- **failed**: An `atd:error` ECT references this node.
+- **rolled_back**: An `atd:rollback_result` ECT references this
+  node's checkpoint.
+
+# Checkpoint Mechanism {#checkpoints}
+
+An ATD-compliant agent MUST create a checkpoint before any action
+it classifies as consequential.  An action is consequential if it
+modifies external state (network config, database records, API
+calls with side effects).
+
+A checkpoint is an ECT with:
+
+- `exec_act`: `"atd:checkpoint"`
+- `par`: the `jti` of the ECT for the action being checkpointed
+
+~~~json
+{
+  "jti": "ckpt-uuid",
+  "exec_act": "atd:checkpoint",
+  "par": ["action-ect-uuid"],
+  "out_hash": "sha256-of-agent-state-snapshot",
+  "ext": {
+    "atd.reversible": true,
+    "atd.rollback_uri": "https://agent-b.example.com/atd/rollback",
+    "atd.target": "router-07.example.com",
+    "atd.description": "Update BGP peer config",
+    "atd.ttl": 86400
+  }
+}
+~~~
+{: #fig-checkpoint title="Checkpoint ECT"}
+
+The `atd.reversible` field MUST be present.  If `false`, the agent
+declares that this action cannot be automatically undone and
+rollback requests MUST be escalated per the ACP-DAG-HITL
+`unreachable_human` policy.
+
+The `out_hash` provides integrity verification: the agent hashes
+its state at checkpoint time so that rollback can verify it is
+restoring to an authentic prior state.
+
+Checkpoints MUST be stored for at least `atd.ttl` seconds.  Agents
+SHOULD store checkpoints in durable storage that survives restarts.
+
+## Hierarchical Checkpoints
+
+Agents MAY create hierarchical checkpoints where a parent groups
+multiple child checkpoints from a multi-step operation.  Rolling
+back the parent rolls back all children.  The parent checkpoint's
+`par` array references all child checkpoint `jti` values.
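+
+A parent checkpoint for a two-step operation might look like the
+following sketch.  The `jti` values are placeholders in the style
+of {{fig-checkpoint}}, and the description text is made up for this
+example.
+
+~~~json
+{
+  "jti": "parent-ckpt-uuid",
+  "exec_act": "atd:checkpoint",
+  "par": ["child-ckpt-uuid-1", "child-ckpt-uuid-2"],
+  "out_hash": "sha256-of-agent-state-snapshot",
+  "ext": {
+    "atd.reversible": true,
+    "atd.rollback_uri": "https://agent-b.example.com/atd/rollback",
+    "atd.description": "Two-step BGP peer config update",
+    "atd.ttl": 86400
+  }
+}
+~~~
+{: #fig-checkpoint-parent title="Parent Checkpoint ECT (Illustrative)"}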
+
+# Error Signaling {#errors}
+
+When an agent detects an error, it MUST emit an error ECT:
+
+- `exec_act`: `"atd:error"`
+- `par`: the `jti` of the ECT for the failed action
+
+~~~json
+{
+  "jti": "error-uuid",
+  "exec_act": "atd:error",
+  "par": ["failed-action-ect-uuid"],
+  "ext": {
+    "atd.severity": "critical",
+    "atd.error_type": "action_failed",
+    "atd.description": "BGP session did not establish",
+    "atd.checkpoint_id": "ckpt-uuid",
+    "atd.upstream_errors": []
+  }
+}
+~~~
+{: #fig-error title="Error ECT"}
+
+Severity levels: `info`, `warning`, `error`, `critical`.
+
+Error types: `action_failed`, `timeout`, `constraint_violation`,
+`resource_exhausted`, `upstream_cascade`, `unknown`.
+
+When an agent receives an error signal caused by an action it
+initiated, it MUST either:
+
+(a) Attempt automatic rollback of its checkpoint, or
+
+(b) Escalate per the ACP-DAG-HITL rules if the action was
+    irreversible.
+
+The `atd.upstream_errors` array allows agents to chain error
+context, building a causal trace from symptom to root cause.
+
+## HITL Escalation on Error
+
+Error ECTs MAY trigger ACP-DAG-HITL rules.  A deployment can
+define HITL rules such as:
+
+~~~json
+{
+  "id": "r-critical-error",
+  "trigger": {
+    "kind": "keyword_match",
+    "op": "eq",
+    "value": "critical",
+    "input_ref": "atd.severity"
+  },
+  "required_role": "operator:oncall",
+  "action": "escalate",
+  "allow_override": true,
+  "override_action": "continue"
+}
+~~~
+{: #fig-error-hitl title="HITL Rule for Critical Errors"}
+
+# Circuit Breaker Pattern {#circuit-breaker}
+
+Each agent MUST implement a circuit breaker for every downstream
+agent it communicates with.  The circuit breaker has three states:
+
+CLOSED (normal):
+: Requests flow through.  The agent tracks the error rate over a
+  sliding window (default: 60 seconds).
+
+OPEN (failure detected):
+: When the error rate exceeds a threshold (default: 50%), the
+  breaker opens.  All requests are immediately rejected.  The
+  agent MUST emit a circuit breaker ECT:
+
+~~~json
+{
+  "exec_act": "atd:circuit_open",
+  "ext": {
+    "atd.downstream_agent": "spiffe://example.com/agent/b",
+    "atd.error_rate": 0.75,
+    "atd.window_s": 60
+  }
+}
+~~~
+{: #fig-circuit title="Circuit Breaker ECT"}
+
+HALF-OPEN (recovery probe):
+: After a cooldown period (default: 30s), the breaker allows one
+  probe request.  If it succeeds, the breaker returns to CLOSED.
+  If it fails, it returns to OPEN with doubled cooldown
+  (exponential backoff, max 300s).
+
+Circuit breaker thresholds can be configured as ACP-DAG-HITL
+node constraints:
+
+~~~json
+{
+  "constraints": {
+    "atd.circuit_threshold": 0.5,
+    "atd.circuit_window_s": 60
+  }
+}
+~~~
+{: #fig-circuit-policy title="Circuit Breaker Policy"}
+
+# Rollback Protocol {#rollback}
+
+A rollback is initiated by emitting a rollback request ECT and
+sending an HTTP POST to the target agent's rollback endpoint:
+
+~~~
+POST /atd/rollback HTTP/1.1
+Host: agent-b.example.com
+Content-Type: application/json
+Execution-Context: <rollback-request-ect>
+~~~
+
+The rollback request ECT has:
+
+- `exec_act`: `"atd:rollback_request"`
+- `par`: the `jti` of the checkpoint ECT to roll back to
+
+~~~json
+{
+  "exec_act": "atd:rollback_request",
+  "par": ["ckpt-uuid"],
+  "ext": {
+    "atd.reason": "Upstream action caused cascading failure",
+    "atd.cascade": true
+  }
+}
+~~~
+{: #fig-rollback-req title="Rollback Request ECT"}
+
+When `atd.cascade` is `true`, the receiving agent MUST also
+initiate rollback of any downstream checkpoints created as a
+consequence of the checkpointed action.
+
+The agent MUST respond with a rollback result ECT:
+
+- `exec_act`: `"atd:rollback_result"`
+- `par`: the `jti` of the rollback request ECT
+
+~~~json
+{
+  "exec_act": "atd:rollback_result",
+  "par": ["rollback-request-uuid"],
+  "out_hash": "sha256-of-restored-state",
+  "ext": {
+    "atd.status": "completed",
+    "atd.checkpoint_id": "ckpt-uuid",
+    "atd.cascaded": [
+      {"agent": "spiffe://example.com/agent/c", "status": "completed"},
+      {"agent": "spiffe://example.com/agent/d", "status": "escalated"}
+    ]
+  }
+}
+~~~
+{: #fig-rollback-result title="Rollback Result ECT"}
+
+Status values: `completed`, `partial`, `escalated`, `failed`.
+
+`escalated` means the action was irreversible and a human operator
+has been notified per ACP-DAG-HITL `unreachable_human` policy.
+
+Agents MUST implement idempotent rollback: receiving the same
+rollback request ECT `jti` twice MUST return the same result.
+
+# Resource Hints {#resources}
+
+Agents MAY declare resource requirements as ECT extension claims
+or ACP-DAG-HITL node constraints:
+
+~~~json
+{
+  "constraints": {
+    "atd.resource_cpu": "2",
+    "atd.resource_memory_mb": 4096,
+    "atd.resource_timeout_s": 300,
+    "atd.resource_priority": "high"
+  }
+}
+~~~
+{: #fig-resources title="Resource Hints as Node Constraints"}
+
+Orchestrators (e.g., Kubernetes schedulers, agent gateways) MAY
+use these hints for scheduling and quota enforcement.  Resource
+hints are advisory; agents MUST NOT depend on them for
+correctness.
+
+# Security Considerations
+
+Rollback requests are sensitive operations.  Agents MUST
+authenticate rollback requests using the ECT identity binding
+(L2/L3).  Only agents in the same workflow (`wid`) with
+checkpoint lineage in the DAG SHOULD be authorized to request
+rollback.
+
+Checkpoint data may contain sensitive system state.  Agents MUST
+encrypt stored checkpoints at rest and MUST NOT include checkpoint
+contents in error ECTs.
+
+Circuit breaker state reveals system health topology.  The
+`atd:circuit_open` ECT is part of the audit trail; access to the
+audit ledger SHOULD be controlled.
+
+Malicious agents could send false error ECTs to trigger
+unnecessary rollbacks.  Agents SHOULD verify that error ECTs
+reference valid `par` values within their own workflow DAG.
+
+# IANA Considerations
+
+This document requests registration of the following `exec_act`
+values in a future ECT action type registry:
+
+- `atd:checkpoint`
+- `atd:error`
+- `atd:circuit_open`
+- `atd:rollback_request`
+- `atd:rollback_result`
+
+--- back
+
+# Acknowledgments
+{:numbered="false"}
+
+ATD builds on ECT {{I-D.nennemann-wimse-ect}} for execution
+evidence and ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}}
+for delegation policy.  The circuit breaker pattern is adapted
+from microservice architecture best practices.
--- a/workspace/drafts/new-drafts/draft-b-atd-agent-task-dag-01.md
+++ b/workspace/drafts/new-drafts/draft-b-atd-agent-task-dag-01.md
@@ -0,0 +1,725 @@
+---
+title: "Agent Task DAG (ATD): Execution Model, Checkpoints, and Recovery"
+abbrev: "ATD"
+category: std
+docname: draft-atd-agent-task-dag-01
+submissiontype: IETF
+number:
+date:
+v: 3
+area: "OPS"
+workgroup: "NMOP"
+keyword:
+  - agent DAG
+  - checkpoint
+  - rollback
+  - error recovery
+  - circuit breaker
+
+author:
+  -
+    fullname: TBD
+    organization: Independent
+    email: placeholder@example.com
+
+normative:
+  RFC2119:
+  RFC8174:
+  RFC8446:
+  RFC9110:
+  RFC8615:
+  I-D.nennemann-wimse-ect:
+    title: "Execution Context Tokens for Distributed Agentic Workflows"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
+  I-D.nennemann-agent-dag-hitl-safety:
+    title: "Agent Context Policy Token: DAG Delegation with Human Override"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
+
+informative:
+
+--- abstract
+
+This document defines the Agent Task DAG (ATD) specification:
+execution semantics, checkpoints, error signaling, circuit
+breakers, and rollback for agent workflows.  ATD does not define a
+new DAG or token format.  It defines when agents MUST emit ECT
+nodes, what those nodes mean, and how to recover when things go
+wrong.  Checkpoints, errors, and rollback results are ECT nodes
+with specific `exec_act` values and `ext` claims.  Rollback walks
+the ECT DAG backwards.  Circuit breakers contain cascading
+failures.  Resource hints enable scheduling.  The protocol is
+transport-agnostic and builds on ECT for evidence and ACP-DAG-HITL
+for policy.
+
+--- middle
+
+# Introduction
+
+Autonomous agents increasingly make unsupervised decisions, yet no
+standard exists for how agents checkpoint state, signal errors to
+peers, contain cascading failures, or roll back decisions gone
+wrong.
+
+ATD borrows proven patterns from distributed systems: checkpoints
+from database transactions, circuit breakers from microservice
+architectures, and rollback from version control.  It adapts these
+to agent workflows where actions may be partially reversible and
+where the agent that caused the error may not be the best one to
+fix it.
+
+ATD does not define a new DAG format.  The ECT DAG
+{{I-D.nennemann-wimse-ect}} IS the execution graph.  ATD defines
+the semantics of specific node types within that graph.
+
+Design principles:
+
+1. Agents that take consequential actions MUST be able to undo
+   them, or MUST declare them irreversible upfront.
+2. Failure containment takes priority over failure diagnosis.
+3. The protocol adds minimal overhead to the happy path.
+
+# Conventions and Definitions
+
+{::boilerplate bcp14-tagged}
+
+Checkpoint:
+: An ECT node recording agent state before a consequential action,
+  sufficient to restore the system to that state.
+
+Circuit Breaker:
+: A mechanism that stops an agent from propagating requests to a
+  failing downstream agent, preventing cascading failures.
+
+Rollback:
+: The process of reverting an agent's actions and state to a
+  previously recorded checkpoint.
+
+Blast Radius:
+: The set of agents and systems affected by a single failure.
+
+Consequential Action:
+: An action that modifies external state (network configuration,
+  database records, API calls with side effects) such that
+  reversal requires explicit effort.
+
+# Execution Semantics {#execution}
+
+## Topological Order
+
+Tasks in the ECT DAG MUST execute in topological order: a task
+MUST NOT begin execution until all tasks referenced by its ECT
+`par` claims are in state `done`.
+
+Two tasks MAY execute concurrently when neither is an ancestor of
+the other in the DAG (no `par` path connects them).  Orchestrators
+SHOULD exploit this parallelism for performance.
+
+Circular dependencies are prohibited.  Agents MUST reject
+ACP-DAG-HITL delegation DAGs containing cycles.
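+
+The ordering and cycle rules can be sketched in Python as follows.
+This is an illustration under stated assumptions, not part of the
+specification: `execution_batches` and its `par_map` input (each
+task ID mapped to the task IDs in its `par` claim) are hypothetical
+names, and the standard-library `graphlib` module stands in for
+whatever graph machinery an orchestrator actually uses.
+

```python
from graphlib import CycleError, TopologicalSorter


def execution_batches(par_map):
    """Group task IDs into batches that respect `par` dependencies.

    par_map (hypothetical input shape) maps each task ID to the list of
    task IDs named in its `par` claim. Raises ValueError on a cycle.
    """
    sorter = TopologicalSorter(par_map)
    try:
        sorter.prepare()  # raises CycleError if the graph contains a cycle
    except CycleError as exc:
        raise ValueError(f"delegation DAG contains a cycle: {exc.args[1]}") from exc

    batches = []
    while sorter.is_active():
        # get_ready() only returns tasks whose parents are all marked done.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches
```

+
+Tasks that land in the same batch have no dependency path between
+them, so an orchestrator could run them concurrently.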
+
+## Workflow Boundary ECTs
+
+When a workflow begins, the initiating agent MUST emit:
+
+~~~json
+{
+  "exec_act": "atd:workflow_start",
+  "ext": {
+    "atd.wf_id": "wf-uuid",
+    "atd.description": "BGP failover workflow",
+    "atd.node_count": 5
+  }
+}
+~~~
+{: #fig-wf-start title="Workflow Start ECT"}
+
+When the workflow reaches a terminal state (all leaf nodes
+complete or any node failed with no rollback path), the
+orchestrator MUST emit:
+
+~~~json
+{
+  "exec_act": "atd:workflow_complete",
+  "par": ["wf-start-ect-uuid"],
+  "ext": {
+    "atd.wf_id": "wf-uuid",
+    "atd.terminal_status": "success",
+    "atd.elapsed_s": 42
+  }
+}
+~~~
+{: #fig-wf-complete title="Workflow Complete ECT"}
+
+Terminal status values: `success`, `partial`, `failed`,
+`rolled_back`, `escalated`.
+
+# Node States {#node-states}
+
+Each task node in the ECT DAG has an implicit state derived from
+subsequent ECT nodes:
+
+- **pending**: A delegation node exists in ACP-DAG-HITL but no
+  corresponding ECT has been emitted.
+- **running**: An ECT matching the task type has been emitted
+  but no completion or error ECT follows.
+- **done**: A completion ECT exists, or a later ECT references
+  this node in its `par` claim.
+- **failed**: An `atd:error` ECT references this node.
+- **rolled_back**: An `atd:rollback_result` ECT references this
+  node's checkpoint.
+- **escalated**: The task failed and a human has been notified
+  per HITL escalation rules.
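+
+One way to derive these states from an ECT log is sketched below.
+Several points are assumptions made for illustration only: the
+function name `derive_state`, the representation of ECTs as
+dictionaries, the precedence order among states, and the use of a
+rollback result with `atd.status` of `escalated` as the signal for
+the **escalated** state.
+

```python
def derive_state(task_jti, ects):
    """Return the implicit ATD state of the task ECT identified by task_jti.

    ects is a list of dicts carrying `jti`, `exec_act`, `par` and `ext`.
    The precedence used here (escalated, rolled_back, failed, done,
    running) is an assumption of this sketch.
    """
    if all(e["jti"] != task_jti for e in ects):
        return "pending"  # delegated, but no ECT has been emitted yet

    # ECTs that name this task in their `par` claim.
    children = [e for e in ects if task_jti in e.get("par", [])]
    checkpoint_ids = {e["jti"] for e in children if e["exec_act"] == "atd:checkpoint"}

    # Rollback results that refer to one of this task's checkpoints.
    results = [
        e for e in ects
        if e["exec_act"] == "atd:rollback_result"
        and e.get("ext", {}).get("atd.checkpoint_id") in checkpoint_ids
    ]
    statuses = {e["ext"].get("atd.status") for e in results}

    if "escalated" in statuses:
        return "escalated"
    if "completed" in statuses:
        return "rolled_back"
    if any(e["exec_act"] == "atd:error" for e in children):
        return "failed"
    if any(e["exec_act"] != "atd:checkpoint" for e in children):
        return "done"  # a later ECT is linked to this task via `par`
    return "running"
```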
+
+# Checkpoint Mechanism {#checkpoints}
+
+## Checkpoint Placement Policy
+
+An ATD-compliant agent MUST create a checkpoint before any action
+it classifies as consequential.  The following actions are always
+consequential and MUST be checkpointed:
+
+1. Any modification to network device configuration.
+2. Any write to a shared database or external data store.
+3. Any API call with side effects (for HTTP, any method that is
+   not safe per {{RFC9110}}, such as POST, PUT, or DELETE).
+4. Any delegation to another agent that will itself take
+   consequential actions.
+
+The following SHOULD be checkpointed:
+
+1. Computations expected to run longer than
+   `atd.resource_timeout_s`.
+2. Actions that cannot be verified without external state.
+
+The following are exempt from checkpoint requirements:
+
+1. Read-only queries.
+2. Sending notifications with no side effects.
+3. Internal state computations with no external observable effect.
+
+## Checkpoint ECT Format
+
+A checkpoint is an ECT with:
+
+- `exec_act`: `"atd:checkpoint"`
+- `par`: the ECT of the action being checkpointed
+
+~~~json
+{
+  "jti": "ckpt-uuid",
+  "exec_act": "atd:checkpoint",
+  "par": ["action-ect-uuid"],
+  "out_hash": "sha256-of-agent-state-snapshot",
+  "ext": {
+    "atd.reversible": true,
+    "atd.rollback_uri": "https://agent-b.example.com/.well-known/atd/rollback",
+    "atd.target": "router-07.example.com",
+    "atd.description": "Update BGP peer config",
+    "atd.ttl": 86400
+  }
+}
+~~~
+{: #fig-checkpoint title="Checkpoint ECT"}
+
+The `atd.reversible` field MUST be present.  If `false`, the agent
+declares that this action cannot be automatically undone and
+rollback requests MUST be escalated per the ACP-DAG-HITL
+`unreachable_human` policy.
+
+The `out_hash` provides integrity verification: the agent hashes
+its state at checkpoint time so that rollback can verify it is
+restoring to an authentic prior state.
+
+Checkpoints MUST be stored for at least `atd.ttl` seconds.  Agents
+SHOULD store checkpoints in durable storage that survives restarts.
+
+The rollback URI MUST be a well-known URI per {{RFC8615}} at the
+path `/.well-known/atd/rollback`.
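+
+The role of `out_hash` can be illustrated with a short Python
+sketch.  The serialization (JSON with sorted keys) and the
+hex-encoded SHA-256 digest are assumptions chosen for the example;
+the actual encoding of `out_hash` is governed by the ECT
+specification.  The names `snapshot_hash` and `verify_restore` are
+hypothetical.
+

```python
import hashlib
import hmac
import json


def snapshot_hash(state):
    """Hash an agent state snapshot (assumed here to be JSON-serializable)."""
    # Sorting keys makes the hash independent of dictionary ordering.
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_restore(restored_state, out_hash):
    """Check that a restored state matches the hash recorded at checkpoint time."""
    return hmac.compare_digest(snapshot_hash(restored_state), out_hash)
```

+
+An agent following this sketch would compute `snapshot_hash` when
+it emits the checkpoint ECT and call `verify_restore` before
+reporting a rollback as `completed`.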
+
+## Hierarchical Checkpoints
+
+Agents MAY create hierarchical checkpoints where a parent groups
+multiple child checkpoints from a multi-step operation.  Rolling
+back the parent rolls back all children.  The parent checkpoint's
+`par` array references all child checkpoint `jti` values.
+
+## ATD `exec_act` Summary
+
+| `exec_act` value | When emitted | Required `ext` fields |
+|-----------------|-------------|----------------------|
+| `atd:checkpoint` | Before consequential action | `atd.reversible`, `atd.rollback_uri`, `atd.ttl` |
+| `atd:error` | On failure detection | `atd.severity`, `atd.error_type`, `atd.checkpoint_id` |
+| `atd:circuit_open` | When error rate exceeds threshold | `atd.downstream_agent`, `atd.error_rate`, `atd.window_s` |
+| `atd:circuit_close` | When probe succeeds in HALF-OPEN | `atd.downstream_agent`, `atd.cooldown_s` |
+| `atd:rollback_request` | To initiate rollback | `atd.reason`, `atd.cascade` |
+| `atd:rollback_result` | Rollback complete or failed | `atd.status`, `atd.checkpoint_id`, `atd.cascaded` |
+| `atd:workflow_start` | Workflow begins | `atd.wf_id`, `atd.description` |
+| `atd:workflow_complete` | Workflow terminal | `atd.wf_id`, `atd.terminal_status` |
+{: #fig-actions title="ATD exec_act Values"}
+
+# Error Signaling {#errors}
+
+When an agent detects an error, it MUST emit an error ECT:
+
+- `exec_act`: `"atd:error"`
+- `par`: the ECT of the failed action
+
+~~~json
+{
+  "jti": "error-uuid",
+  "exec_act": "atd:error",
+  "par": ["failed-action-ect-uuid"],
+  "ext": {
+    "atd.severity": "critical",
+    "atd.error_type": "action_failed",
+    "atd.description": "BGP session did not establish",
+    "atd.checkpoint_id": "ckpt-uuid",
+    "atd.upstream_errors": []
+  }
+}
+~~~
+{: #fig-error title="Error ECT"}
+
+Severity levels (in increasing order): `info`, `warning`,
+`error`, `critical`.
+
+Error types: `action_failed`, `timeout`, `constraint_violation`,
+`resource_exhausted`, `upstream_cascade`, `unknown`.
+
+When an agent receives an error signal caused by an action it
+initiated, it MUST do one of the following:
+
+- Attempt automatic rollback of its checkpoint.
+- Escalate per ACP-DAG-HITL rules if the action was irreversible.
+
+The `atd.upstream_errors` array allows agents to chain error
+context, building a causal trace from symptom to root cause.
+
+## HITL Escalation on Error
+
+Error ECTs with severity `critical` SHOULD trigger HITL
+escalation.  Deployments SHOULD define ACP-DAG-HITL rules such
+as:
+
+~~~json
+{
+  "id": "r-critical-error",
+  "trigger": {
+    "kind": "keyword_match",
+    "op": "eq",
+    "value": "critical",
+    "input_ref": "atd.severity"
+  },
+  "required_role": "operator:oncall",
+  "action": "escalate",
+  "allow_override": true,
+  "override_action": "continue"
+}
+~~~
+{: #fig-error-hitl title="HITL Rule for Critical Errors"}
+
+# Circuit Breaker Pattern {#circuit-breaker}
+
+Each agent MUST implement a circuit breaker for every downstream
+agent it communicates with.  The circuit breaker has three states:
+
+CLOSED (normal):
+: Requests flow through.  The agent tracks the error rate over a
+  sliding window (default: 60 seconds).
+
+OPEN (failure detected):
+: When the error rate exceeds a threshold (default: 50%), the
+  breaker opens.  All requests are immediately rejected.  The
+  agent MUST emit a circuit breaker open ECT:
+
+~~~json
+{
+  "exec_act": "atd:circuit_open",
+  "ext": {
+    "atd.downstream_agent": "spiffe://example.com/agent/b",
+    "atd.error_rate": 0.75,
+    "atd.window_s": 60
+  }
+}
+~~~
+{: #fig-circuit-open title="Circuit Breaker Open ECT"}
+
+HALF-OPEN (recovery probe):
+: After a cooldown period (default: 30s), the breaker allows one
+  probe request.  If the probe succeeds, the breaker returns to
+  CLOSED.  If the probe fails, the breaker returns to OPEN with
+  doubled cooldown (exponential backoff, max 300s).
+
+When the probe succeeds, the agent MUST emit a circuit breaker
+close ECT:
+
+~~~json
+{
+  "exec_act": "atd:circuit_close",
+  "ext": {
+    "atd.downstream_agent": "spiffe://example.com/agent/b",
+    "atd.cooldown_s": 30
+  }
+}
+~~~
+{: #fig-circuit-close title="Circuit Breaker Close ECT"}
+
+## Circuit Breaker State Machine
+
+~~~
+         error_rate > threshold
+CLOSED ─────────────────────────► OPEN
+  ▲                                  │
+  │ probe success                    │ cooldown expires
+  │                                  ▼
+  └────────────────────────── HALF-OPEN
+         probe failure ──► OPEN (cooldown * 2)
+~~~
+{: #fig-fsm title="Circuit Breaker State Machine"}
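+
+A minimal Python sketch of this state machine follows.  It is
+illustrative rather than normative: the class and method names are
+hypothetical, the error rate is computed over request outcomes
+inside the sliding window, and `record` returns the `exec_act`
+value to emit (or `None`).  Whether a failed probe should emit a
+new `atd:circuit_open` ECT is not stated in the text, so the sketch
+emits nothing in that case.
+

```python
import time
from collections import deque


class CircuitBreaker:
    """Per-downstream-agent breaker using the defaults given in the text."""

    def __init__(self, threshold=0.5, window_s=60, cooldown_s=30,
                 max_cooldown_s=300, clock=time.monotonic):
        self.threshold = threshold
        self.window_s = window_s
        self.base_cooldown_s = cooldown_s
        self.cooldown_s = cooldown_s
        self.max_cooldown_s = max_cooldown_s
        self.clock = clock            # injectable so tests can control time
        self.state = "CLOSED"
        self.opened_at = None
        self.samples = deque()        # (timestamp, ok) pairs in the window

    def allow_request(self):
        """Return True if a request (or the single probe) may be sent."""
        if self.state == "OPEN" and self.clock() - self.opened_at >= self.cooldown_s:
            self.state = "HALF_OPEN"  # cooldown expired: allow one probe
            return True
        return self.state == "CLOSED"

    def record(self, ok):
        """Record a request outcome; return the exec_act to emit, if any."""
        now = self.clock()
        if self.state == "HALF_OPEN":
            if ok:
                self.state = "CLOSED"
                self.cooldown_s = self.base_cooldown_s
                self.samples.clear()
                return "atd:circuit_close"
            # Probe failed: back to OPEN with doubled cooldown, capped.
            self.cooldown_s = min(self.cooldown_s * 2, self.max_cooldown_s)
            self.state, self.opened_at = "OPEN", now
            return None
        self.samples.append((now, ok))
        while self.samples and now - self.samples[0][0] > self.window_s:
            self.samples.popleft()    # drop outcomes older than the window
        failures = sum(1 for _, sample_ok in self.samples if not sample_ok)
        if self.state == "CLOSED" and failures / len(self.samples) > self.threshold:
            self.state, self.opened_at = "OPEN", now
            return "atd:circuit_open"
        return None
```

+
+A caller would check `allow_request()` before contacting the
+downstream agent and pass the outcome to `record()`, emitting an
+ECT whenever `record()` returns a value.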
+
+## Coordinated Circuit Breaking
+
+When multiple agents share a downstream dependency, each maintains
+its own circuit breaker independently.  However, agents SHOULD
+publish circuit breaker state via their ECT stream so peers can
+observe the signal.
+
+If an orchestrator observes several circuit breakers (a
+deployment-defined count N) opening for the same downstream agent
+within a deployment-defined window, it SHOULD initiate a HITL
+escalation rather than allowing N parallel recovery probes.
+
+## Circuit Breaker Policy Configuration
+
+Circuit breaker thresholds can be configured as ACP-DAG-HITL
+node constraints:
+
+~~~json
+{
+  "constraints": {
+    "atd.circuit_threshold": 0.5,
+    "atd.circuit_window_s": 60
+  }
+}
+~~~
+{: #fig-circuit-policy title="Circuit Breaker Policy"}
+
+# Rollback Protocol {#rollback}
+
+## Basic Rollback
+
+A rollback is initiated by emitting a rollback request ECT and
+sending an HTTP POST to the target agent's rollback endpoint:
+
+~~~
+POST /.well-known/atd/rollback HTTP/1.1
+Content-Type: application/json
+Execution-Context: <rollback-request-ect>
+~~~
+
+- `exec_act`: `"atd:rollback_request"`
+- `par`: the checkpoint ECT to roll back to
+
+~~~json
+{
+  "exec_act": "atd:rollback_request",
+  "par": ["ckpt-uuid"],
+  "ext": {
+    "atd.reason": "Upstream action caused cascading failure",
+    "atd.cascade": true
+  }
+}
+~~~
+{: #fig-rollback-req title="Rollback Request ECT"}
+
+When `atd.cascade` is `true`, the receiving agent MUST also
+initiate rollback of any downstream checkpoints created as a
+consequence of the checkpointed action.
+
+The agent MUST respond with a rollback result ECT:
+
+~~~json
+{
+  "exec_act": "atd:rollback_result",
+  "par": ["rollback-request-uuid"],
+  "out_hash": "sha256-of-restored-state",
+  "ext": {
+    "atd.status": "completed",
+    "atd.checkpoint_id": "ckpt-uuid",
+    "atd.cascaded": [
+      {"agent": "spiffe://example.com/agent/c", "status": "completed"},
+      {"agent": "spiffe://example.com/agent/d", "status": "escalated"}
+    ]
+  }
+}
+~~~
+{: #fig-rollback-result title="Rollback Result ECT"}
+
+Status values: `completed`, `partial`, `escalated`, `failed`.
+
+`escalated` means the action was irreversible and a human operator
+has been notified per ACP-DAG-HITL `unreachable_human` policy.
+
+## Partial Rollback and Blast Radius Containment
+
+When a failure occurs in the middle of a DAG, it is often
+undesirable to roll back the entire workflow.  ATD defines
+partial rollback as rolling back the failed subgraph while
+preserving completed sibling branches.
+
+Partial rollback MUST only proceed if:
+
+1. The checkpoints to be rolled back are in the same workflow
+   (`atd.wf_id`).
+2. No completed sibling task depends on the output of the
+   failed task (verified by walking the DAG forward from the
+   checkpoint).
+
+The blast radius is the set of agents holding checkpoints that
+are descendants of the failed node.  Orchestrators SHOULD
+compute blast radius before initiating cascade rollback to
+avoid unnecessary disruption.
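+
+Computing the blast radius amounts to a forward walk of the DAG
+from the failed node.  The Python sketch below assumes ECTs are
+available as dictionaries and that the emitting agent is identified
+by the JWT `iss` claim; both are assumptions of this example, as is
+the function name `blast_radius`.
+

```python
from collections import deque


def blast_radius(failed_jti, ects):
    """Return the agents holding checkpoints that descend from failed_jti."""
    # Index ECTs by parent so the DAG can be walked forward.
    children = {}
    for ect in ects:
        for parent in ect.get("par", []):
            children.setdefault(parent, []).append(ect)

    agents = set()
    seen = {failed_jti}
    queue = deque([failed_jti])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child["jti"] in seen:
                continue  # already visited via another parent
            seen.add(child["jti"])
            queue.append(child["jti"])
            if child["exec_act"] == "atd:checkpoint":
                agents.add(child["iss"])  # `iss` as agent ID is an assumption
    return agents
```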
+
+## Rollback Timeout and Escalation
+
+Each rollback request carries an implicit timeout derived from
+the original checkpoint's `atd.ttl`.  If rollback is not
+completed within `atd.ttl / 2` seconds, the agent MUST:
+
+1. Emit an `atd:error` ECT with `atd.error_type: "timeout"` and
+   an `atd.description` noting the rollback timeout.
+2. Escalate to HITL per {{hitl-escalation}}.
+
+Agents MUST implement idempotent rollback: receiving the same
+rollback request ECT `jti` twice MUST return the same result.
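+
+Idempotency can be achieved by remembering the result produced for
+each request `jti`.  The sketch below is a hypothetical
+illustration: `RollbackHandler` and the `restore` callback are
+invented names, and the in-memory dictionary stands in for the
+durable storage a real agent would need.
+

```python
class RollbackHandler:
    """Replays the stored result when a rollback request jti is seen again."""

    def __init__(self, restore):
        self._restore = restore  # hypothetical callable: checkpoint jti -> status
        self._results = {}       # request jti -> rollback result ECT

    def handle(self, request):
        jti = request["jti"]
        if jti in self._results:
            return self._results[jti]  # duplicate request: do not restore again

        checkpoint_id = request["par"][0]
        result = {
            "exec_act": "atd:rollback_result",
            "par": [jti],
            "ext": {
                "atd.status": self._restore(checkpoint_id),
                "atd.checkpoint_id": checkpoint_id,
            },
        }
        self._results[jti] = result
        return result
```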
+
+## Rollback Authorization {#rollback-authz}
+
+Only agents within the same workflow (`wid`) with checkpoint
+lineage in the DAG SHOULD be authorized to request rollback.
+Rollback requests from outside the originating workflow MUST be
+rejected with HTTP 403.
+
+# Interaction with HITL {#hitl-escalation}
+
+ATD escalates to HITL in the following scenarios:
+
+1. **Irreversible action failure**: An error ECT whose checkpoint
+   has `atd.reversible: false` MUST trigger HITL Level 2
+   (CONSTRAIN) per the companion HITL specification.
+
+2. **Rollback failure**: A rollback result with `atd.status:
+   "failed"` MUST trigger HITL Level 3 (STOP) on the workflow.
+
+3. **Cascaded rollback of critical nodes**: When a rollback with
+   `atd.cascade: true` propagates to a node marked
+   `atd.severity: critical`, HITL SHOULD be triggered at Level 1
+   (PAUSE) to allow human review before proceeding.
+
+4. **Circuit breaker stuck open**: If a circuit breaker re-opens
+   after 3 consecutive failed HALF-OPEN probes, HITL Level 2
+   escalation SHOULD be triggered.
+
+ATD-to-HITL escalation is recorded as an ECT linked to both
+the triggering error ECT and the HITL override ECT, preserving
+the causal chain in the audit DAG.
+
+# Resource Hints {#resources}
+
+## Resource Claim Format
+
+Agents MAY declare resource requirements as ACP-DAG-HITL node
+constraints:
+
+~~~json
+{
+  "constraints": {
+    "atd.resource_cpu": "2",
+    "atd.resource_memory_mb": 4096,
+    "atd.resource_timeout_s": 300,
+    "atd.resource_priority": "high",
+    "atd.resource_gpu": "0",
+    "atd.resource_network_mbps": 100
+  }
+}
+~~~
+{: #fig-resources title="Resource Hints as Node Constraints"}
+
+## Priority Levels
+
+The `atd.resource_priority` field MUST be one of: `critical`,
+`high`, `normal`, `low`.  Orchestrators SHOULD map these to
+scheduling priority classes (e.g., Kubernetes QoS classes:
+`critical` → Guaranteed, `high`/`normal` → Burstable, `low`
+→ BestEffort).
+
+## Fair-Share Scheduling
+
+When multiple agents compete for a shared resource pool,
+orchestrators SHOULD implement fair-share scheduling:
+
+1. Each active workflow receives an equal base allocation.
+2. Unused allocation from `low` priority agents is redistributed
+   to `high`/`critical` agents within the same scheduling cycle.
+3. Starvation prevention: `low` priority agents MUST be scheduled
+   within a configurable maximum wait (default: 300s).
+
+## Unsatisfiable Resource Hints
+
+Resource hints are advisory; agents MUST NOT depend on them for
+correctness.  When resource hints cannot be satisfied:
+
+- If `atd.resource_priority` is `critical`: the orchestrator
+  SHOULD pre-empt lower-priority tasks.
+- If `critical` tasks still cannot be scheduled within 60s: emit
+  an `atd:error` ECT with `atd.error_type: "resource_exhausted"`
+  and escalate to HITL.
+- All other priorities: proceed with degraded resources and log a
+  warning via an `atd:error` ECT with `atd.severity: "warning"`.
+
+# Optional Declarative Workflow Format {#workflow-format}
+
+To support pre-run planning and tooling, ATD defines an optional
+declarative workflow descriptor.  This is a planning artifact
+only; at runtime it is realized as ECTs per this specification.
+
+~~~json
+{
+  "wf_id": "bgp-failover-v2",
+  "description": "BGP peer failover with validation",
+  "nodes": [
+    {
+      "id": "n1",
+      "label": "validate-config",
+      "reversible": true,
+      "hitl_required": false,
+      "resource_hints": {
+        "priority": "normal",
+        "timeout_s": 30
+      }
+    },
+    {
+      "id": "n2",
+      "label": "update-bgp-peer",
+      "reversible": true,
+      "hitl_required": true,
+      "resource_hints": {
+        "priority": "critical",
+        "timeout_s": 120
+      }
+    },
+    {
+      "id": "n3",
+      "label": "verify-session",
+      "reversible": false,
+      "hitl_required": false,
+      "resource_hints": {
+        "priority": "high",
+        "timeout_s": 60
+      }
+    }
+  ],
+  "edges": [
+    {"from": "n1", "to": "n2"},
+    {"from": "n2", "to": "n3"}
+  ]
+}
+~~~
+{: #fig-workflow title="Declarative Workflow Descriptor"}
+
+The workflow descriptor media type is
+`application/atd-workflow+json`.  Orchestrators MAY store and
+version workflow descriptors independently of their ECT runtime
+realization.
+
+When `hitl_required` is `true`, the node MUST have an approval
+gate as defined in the companion HITL specification.
+
+# Security Considerations
+
+## Rollback Authorization
+
+Rollback requests are high-privilege operations.  Agents MUST
+authenticate rollback requests using the ECT identity binding
+(L2/L3).  The rollback endpoint MUST require mutual TLS or a
+signed JWT from an agent within the same workflow DAG.
+
+Only agents that are ancestors in the ECT DAG of the checkpoint
+being rolled back SHOULD be authorized to request that rollback.
+
+## Checkpoint Confidentiality
+
+Checkpoint data may contain sensitive system state (API keys,
+session tokens, configuration).  Agents:
+
+- MUST encrypt stored checkpoints at rest.
+- MUST reference checkpoint state in ECTs only via `out_hash`.
+- MUST NOT include checkpoint contents in error ECTs.
+
+## False Error Injection
+
+A malicious agent could send false `atd:error` ECTs to trigger
+unnecessary rollbacks and disrupt workflows.  Mitigation:
+
+- Agents SHOULD verify that error ECTs reference valid `par`
+  values within their own workflow DAG (`wid` claim).
+- Rollback MUST require authentication (see {{rollback-authz}}).
+- L2/L3 ECT signing prevents unauthenticated error injection.
+
+## Checkpoint Flooding
+
+An adversary could exhaust checkpoint storage by triggering
+many checkpoints.  Mitigation:
+
+- Agents SHOULD enforce a maximum checkpoint count per workflow.
+- Expired checkpoints (past `atd.ttl`) MUST be purged.
+- Checkpoint creation rate SHOULD be rate-limited per calling
+  workflow.
+
+## Circuit Breaker State Leakage
+
+The `atd:circuit_open` ECT reveals system health topology.  The
+audit ledger SHOULD enforce access controls: only agents within
+the same workflow or authorized operators SHOULD be able to query
+circuit breaker history.
+
+# IANA Considerations
+
+This document requests registration of the following values in
+the AEM Ecosystem Extension Registry established by
+draft-aem-agent-ecosystem-model:
+
+## `exec_act` Values
+
+| Value | Description | Reference |
+|-------|-------------|-----------|
+| `atd:checkpoint` | State snapshot before consequential action | This document |
+| `atd:error` | Error signal with severity and type | This document |
+| `atd:circuit_open` | Circuit breaker opened to downstream agent | This document |
+| `atd:circuit_close` | Circuit breaker returned to CLOSED state | This document |
+| `atd:rollback_request` | Initiate rollback to named checkpoint | This document |
+| `atd:rollback_result` | Result of rollback attempt | This document |
+| `atd:workflow_start` | Workflow began execution | This document |
+| `atd:workflow_complete` | Workflow reached terminal state | This document |
+{: #fig-iana-actions title="ATD exec_act Registrations"}
+
+## Well-Known URI
+
+This document requests registration of `atd` as a well-known URI
+suffix per {{RFC8615}}.  The rollback endpoint is the `rollback`
+path beneath it (`/.well-known/atd/rollback`).
+
+## Media Type
+
+This document requests registration of
+`application/atd-workflow+json` for the declarative workflow
+descriptor format defined in {{workflow-format}}.
+
+--- back
+
+# Acknowledgments
+{:numbered="false"}
+
+ATD builds on ECT {{I-D.nennemann-wimse-ect}} for execution
+evidence and ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}}
+for delegation policy.  The circuit breaker pattern is adapted
+from microservice architecture best practices.  The declarative
+workflow format is inspired by workflow description languages
+(BPEL, BPMN) adapted for lightweight agent coordination.
--- a/workspace/drafts/new-drafts/draft-c-hitl-human-in-the-loop-00.md
+++ b/workspace/drafts/new-drafts/draft-c-hitl-human-in-the-loop-00.md
@@ -0,0 +1,368 @@
+---
+title: "Human-in-the-Loop (HITL) Primitives for Agent Ecosystems"
+abbrev: "HITL"
+category: std
+docname: draft-hitl-human-in-the-loop-00
+submissiontype: IETF
+number:
+date:
+v: 3
+area: "OPS"
+workgroup: "NMOP"
+keyword:
+  - human override
+  - HITL
+  - emergency stop
+  - agentic safety
+
+author:
+  -
+    fullname: TBD
+    organization: Independent
+    email: placeholder@example.com
+
+normative:
+  RFC2119:
+  RFC8174:
+  RFC7519:
+  RFC8446:
+  RFC8615:
+  I-D.nennemann-wimse-ect:
+    title: "Execution Context Tokens for Distributed Agentic Workflows"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
+  I-D.nennemann-agent-dag-hitl-safety:
+    title: "Agent Context Policy Token: DAG Delegation with Human Override"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
+
+informative:
+
+--- abstract
+
+This document defines runtime HITL (Human-in-the-Loop) primitives
+for agent ecosystems: four escalating override levels, approval
+gates, escalation paths, and explainability hooks.  ACP-DAG-HITL
+defines WHEN humans must intervene (policy rules and triggers).
+This specification defines HOW the intervention actually happens at
+the protocol level: the HTTP endpoints, override semantics, agent
+compliance requirements, and acknowledgment flows.  All overrides
+and decisions produce ECT nodes, making human interventions part of
+the same auditable DAG as agent actions.
+
+--- middle
+
+# Introduction
+
+The current ratio of autonomous capability drafts to human
+oversight drafts in the IETF is roughly 7:1.  Agents can act but
+humans cannot reliably stop them.
+
+ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}} defines the
+policy: trigger conditions, required roles, and actions (`pause`,
+`escalate`, `abort`).  But it deliberately defers the runtime
+protocol — how does an operator actually send a stop command?  How
+does the agent acknowledge it?  What happens if the operator is
+unreachable?
+
+This specification fills that gap.  It is the runtime enforcement
+companion to ACP-DAG-HITL, inspired by industrial safety systems:
+the e-stop button on factory equipment, the circuit breaker in
+electrical systems, and the kill switch in robotics.
+
+HITL is deliberately not a governance framework, policy language,
+or accountability protocol.  It is a panic button with a
+well-defined interface.
+
+# Conventions and Definitions
+
+{::boilerplate bcp14-tagged}
+
+Override:
+: A human-initiated command that alters an agent's autonomous
+  operation, taking precedence over the agent's own decisions.
+
+Operator:
+: A human user authorized to issue override commands.
+
+Approval Gate:
+: A DAG node that blocks workflow progression until a human
+  approves or rejects continuation.
+
+# Relationship to ACP-DAG-HITL {#mapping}
+
+ACP-DAG-HITL defines three HITL actions.  This specification
+maps each to a runtime override level and adds a fourth level,
+CONSTRAIN (partial restriction), which has no ACP-DAG-HITL
+equivalent:
+
+| ACP-DAG-HITL action | HITL Override Level | Behavior |
+|---------------------|---------------------|----------|
+| `pause` | Level 1: PAUSE | Suspend autonomous actions, hold state |
+| (no equivalent) | Level 2: CONSTRAIN | Restrict to an allowlist of actions |
+| `abort` | Level 3: STOP | Cease all actions, enter inert state |
+| `escalate` | Level 4: TAKEOVER | Transfer control to human operator |
+{: #fig-mapping title="ACP-DAG-HITL to HITL Level Mapping"}
+
+When ACP-DAG-HITL rules trigger, the runtime system uses the
+corresponding HITL level to enforce the action.
+
+# Override Levels {#levels}
+
+## Level 1: PAUSE
+
+The agent MUST suspend all autonomous actions and hold current
+state.  It MUST NOT initiate new actions but MAY complete
+in-progress actions if stopping mid-execution would cause harm
+(e.g., an in-flight database transaction).  The agent resumes
+when a RESUME command is received.
+
+## Level 2: CONSTRAIN
+
+The agent MUST restrict its actions to a specified subset.  The
+override command includes an allowlist of permitted action types.
+The agent MUST reject any action not on the allowlist.
+
+## Level 3: STOP
+
+The agent MUST immediately cease all autonomous actions and enter
+an inert state.  It MUST NOT take any autonomous actions until
+explicitly restarted.  This is the e-stop.
+
+## Level 4: TAKEOVER
+
+The agent MUST transfer operational control to the human operator.
+It enters a pass-through mode where it executes only explicit
+operator commands.  The agent's sensors and outputs remain
+available to the operator as tools.
+
+# Override Protocol {#protocol}
+
+## Override Command
+
+Override commands are sent as HTTP POST to the agent's well-known
+endpoint:
+
+~~~
+POST /.well-known/hitl/override HTTP/1.1
+Content-Type: application/json
+Authorization: Bearer <operator-jwt>
+Execution-Context: <override-ect>
+~~~
+
+The override ECT MUST contain:
+
+- `exec_act`: `"hitl:override"`
+- `par`: the most recent ECT from the agent being overridden
+  (linking the override into the workflow DAG)
+
+~~~json
+{
+  "exec_act": "hitl:override",
+  "par": ["agent-last-action-ect"],
+  "ext": {
+    "hitl.level": 3,
+    "hitl.reason": "Agent blocking legitimate traffic",
+    "hitl.operator_id": "user:alice",
+    "hitl.scope": "*",
+    "hitl.constraints": null,
+    "hitl.ttl": null
+  }
+}
+~~~
+{: #fig-override title="Override ECT"}
+
+Field definitions:
+
+- `hitl.level`: Integer 1-4. MUST be present.
+- `hitl.reason`: Human-readable text. MUST be logged.
+- `hitl.scope`: `"*"` for all functions, or an array of function
+  IDs for partial override.
+- `hitl.constraints`: For Level 2 only. Array of permitted action
+  types.
+- `hitl.ttl`: Duration in seconds. If set, override auto-expires.
+  If null, persists until explicitly lifted.
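+
+The field rules above can be checked mechanically.  The Python
+sketch below is a hypothetical validator, not a normative one: the
+function name `validate_override` is invented, and treating a
+missing `hitl.scope` or a non-null `hitl.constraints` outside
+Level 2 as errors is an assumption about how strictly an agent
+chooses to validate.
+

```python
def validate_override(ext):
    """Return a list of problems found in an override ECT's `ext` claims."""
    problems = []

    level = ext.get("hitl.level")
    if type(level) is not int or not 1 <= level <= 4:
        problems.append("hitl.level must be an integer from 1 to 4")

    scope = ext.get("hitl.scope")
    scope_is_list = isinstance(scope, list) and all(isinstance(s, str) for s in scope)
    if scope != "*" and not scope_is_list:
        problems.append('hitl.scope must be "*" or an array of function IDs')

    constraints = ext.get("hitl.constraints")
    if level == 2:
        if not isinstance(constraints, list):
            problems.append("Level 2 requires hitl.constraints to be an array")
    elif constraints is not None:
        problems.append("hitl.constraints is only used at Level 2")

    ttl = ext.get("hitl.ttl")
    ttl_is_number = isinstance(ttl, (int, float)) and not isinstance(ttl, bool)
    if ttl is not None and (not ttl_is_number or ttl <= 0):
        problems.append("hitl.ttl must be a positive number of seconds or null")

    return problems
```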
+
+## Acknowledgment
+
+The agent MUST respond with an acknowledgment ECT:
+
+- `exec_act`: `"hitl:ack"`
+- `par`: the override ECT
+
+~~~json
+{
+  "exec_act": "hitl:ack",
+  "par": ["override-ect-uuid"],
+  "ext": {
+    "hitl.status": "accepted",
+    "hitl.prior_state": "autonomous",
+    "hitl.current_state": "stopped",
+    "hitl.effective_at": "2026-03-01T12:00:00.123Z"
+  }
+}
+~~~
+{: #fig-ack title="Acknowledgment ECT"}
+
+The override/ack ECT pair serves as the Decision Record defined
+in ACP-DAG-HITL Section 6.5.  No separate audit mechanism is
+needed.
+
+## Resume and Lift
+
+To resume from PAUSE:
+
+~~~
+POST /.well-known/hitl/resume HTTP/1.1
+Execution-Context: <resume-ect with exec_act="hitl:resume">
+~~~
+
+To lift any override:
+
+~~~
+POST /.well-known/hitl/lift HTTP/1.1
+Execution-Context: <lift-ect with exec_act="hitl:lift">
+~~~
+
+Both produce ECTs linked to the original override ECT via `par`.
+
+# Agent Compliance Requirements {#compliance}
+
+Every HITL-compliant agent MUST:
+
+1. Implement the `/.well-known/hitl/override` endpoint.
+
+2. Process override commands within 1 second of receipt.  The
+   override path MUST be independent of the agent's main
+   processing loop.
+
+3. Acknowledge every override with an ECT response.
+
+4. Accept every override.  The agent MUST NOT respond with status
+   `rejected`; overrides are mandatory.  If the agent cannot fully
+   comply, it MUST respond with status `partial` and describe what
+   it could not do.
+
+5. Expose current override status at:
+
+~~~
+GET /.well-known/hitl/status
+~~~
+
+~~~json
+{
+  "agent_id": "spiffe://example.com/agent/firewall",
+  "override_active": true,
+  "current_level": 3,
+  "override_ect": "override-ect-uuid",
+  "since": "2026-03-01T12:00:00Z",
+  "operator_id": "user:alice"
+}
+~~~
+{: #fig-status title="Override Status Response"}
+
+# Approval Gates {#approval-gates}
+
+An approval gate is a DAG node that blocks workflow progression
+until a human approves.  Unlike overrides (which interrupt running
+agents), approval gates are planned checkpoints in the workflow.
+
+Approval gates are defined as ACP-DAG-HITL nodes with HITL rules:
+
+~~~json
+{
+  "dag": {
+    "nodes": [
+      {
+        "id": "n-approve",
+        "type": "hitl:approval_gate",
+        "agent": "system:hitl-gateway",
+        "constraints": {
+          "hitl.required_role": "clinician:oncall",
+          "hitl.timeout_s": 300,
+          "hitl.timeout_action": "safe_pause"
+        }
+      }
+    ]
+  }
+}
+~~~
+{: #fig-gate title="Approval Gate as DAG Node"}
+
+When the workflow reaches an approval gate, the system:
+
+1. Emits an ECT with `exec_act: "hitl:approval_request"`
+2. Notifies the required human role
+3. Waits for approval (ECT: `"hitl:approval_granted"`) or
+   rejection (ECT: `"hitl:approval_denied"`)
+4. On timeout, applies `hitl.timeout_action`
+
+# Broadcast Override {#broadcast}
+
+For environments with many agents, an operator MAY send a
+broadcast override to a management endpoint:
+
+~~~
+POST /hitl/broadcast HTTP/1.1
+Execution-Context: <broadcast-override-ect>
+
+{
+  "targets": ["spiffe://example.com/agent/a",
+               "spiffe://example.com/agent/b"],
+  "level": 3,
+  "reason": "Coordinated emergency stop"
+}
+~~~
+
+The broadcast endpoint fans out individual override ECTs to each
+target and returns per-agent results.
+
+# Dead Man's Switch {#dead-man}
+
+For maximum reliability, agents SHOULD implement a heartbeat
+mechanism: the agent periodically pings an operator heartbeat
+endpoint.  If the heartbeat is missed for a configurable duration,
+the agent automatically enters Level 1 (PAUSE).
+
+This provides a safety net when network connectivity to the
+operator is lost.  The `unreachable_human` policy from
+ACP-DAG-HITL governs behavior when the dead man's switch
+activates: either `abort` or `safe_pause`.
+
+# Security Considerations
+
+Override commands are high-privilege operations.  All override
+endpoints MUST require authentication via mutual TLS or signed
+JWTs.
+
+Override ECTs MUST be signed at L2 or L3.  Agents MUST verify
+signatures before processing.
+
+To prevent replay attacks, agents MUST reject override ECTs with
+`iat` more than 30 seconds in the past.  The `jti` MUST be unique;
+agents MUST reject duplicate `jti` values.
+
+Deployments SHOULD implement multi-operator approval for Level 4
+(TAKEOVER), requiring two independent operator identities.
+
+The override endpoint SHOULD be served on a separate port or
+network interface from the agent's main API to ensure availability
+during overload.
+
+# IANA Considerations
+
+This document requests the following registrations:
+
+1. Well-known URI registrations for `hitl/override`,
+   `hitl/resume`, `hitl/lift`, and `hitl/status` per {{RFC8615}}.
+
+2. Registration of `exec_act` values: `hitl:override`,
+   `hitl:ack`, `hitl:resume`, `hitl:lift`,
+   `hitl:approval_request`, `hitl:approval_granted`,
+   `hitl:approval_denied` in a future ECT action type registry.
+
+--- back
+
+# Acknowledgments
+{:numbered="false"}
+
+This specification is the runtime enforcement companion to
+ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}}.  Override
+design is inspired by industrial safety systems (IEC 62061,
+ISO 13849).
--- a/workspace/drafts/new-drafts/draft-c-hitl-human-in-the-loop-01.md
+++ b/workspace/drafts/new-drafts/draft-c-hitl-human-in-the-loop-01.md
@@ -0,0 +1,612 @@
+---
+title: "Human-in-the-Loop (HITL) Primitives for Agent Ecosystems"
+abbrev: "HITL"
+category: std
+docname: draft-hitl-human-in-the-loop-01
+submissiontype: IETF
+number:
+date:
+v: 3
+area: "OPS"
+workgroup: "NMOP"
+keyword:
+  - human override
+  - HITL
+  - emergency stop
+  - agentic safety
+  - explainability
+
+author:
+  -
+    fullname: TBD
+    organization: Independent
+    email: placeholder@example.com
+
+normative:
+  RFC2119:
+  RFC8174:
+  RFC7519:
+  RFC8446:
+  RFC8615:
+  RFC9110:
+  I-D.nennemann-wimse-ect:
+    title: "Execution Context Tokens for Distributed Agentic Workflows"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
+  I-D.nennemann-agent-dag-hitl-safety:
+    title: "Agent Context Policy Token: DAG Delegation with Human Override"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
+
+informative:
+
+--- abstract
+
+This document defines runtime HITL (Human-in-the-Loop) primitives
+for agent ecosystems: four escalating override levels, approval
+gates, timeout and fallback policies, and explainability hooks.
+ACP-DAG-HITL defines WHEN humans must intervene (policy rules and
+triggers).  This specification defines HOW the intervention
+actually happens at the protocol level: the HTTP endpoints,
+override semantics, agent compliance requirements,
+acknowledgment flows, and explainability tokens that allow
+operators to make informed decisions.  All overrides and decisions
+produce ECT nodes, making human interventions part of the same
+auditable DAG as agent actions.
+
+--- middle
+
+# Introduction
+
+The current ratio of autonomous capability drafts to human
+oversight drafts in the IETF is roughly 7:1.  Agents can act but
+humans cannot reliably stop them.
+
+ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}} defines the
+policy: trigger conditions, required roles, and actions (`pause`,
+`escalate`, `abort`).  But it deliberately defers the runtime
+protocol — how does an operator actually send a stop command?  How
+does the agent acknowledge it?  What happens if the operator is
+unreachable?
+
+This specification fills that gap.  It is the runtime enforcement
+companion to ACP-DAG-HITL, inspired by industrial safety systems:
+the e-stop button on factory equipment, the circuit breaker in
+electrical systems, and the kill switch in robotics.
+
+HITL is deliberately not a governance framework, policy language,
+or accountability protocol.  It is a panic button with a
+well-defined interface.
+
+# Conventions and Definitions
+
+{::boilerplate bcp14-tagged}
+
+Override:
+: A human-initiated command that alters an agent's autonomous
+  operation, taking precedence over the agent's own decisions.
+
+Operator:
+: A human user authorized to issue override commands.
+
+Approval Gate:
+: A DAG node that blocks workflow progression until a human
+  approves or rejects continuation.
+
+HITL Intensity Level:
+: A deployment-wide configuration of how actively human oversight
+  is required.  Distinct from override levels (which are runtime
+  commands).
+
+# HITL Intensity Levels {#intensity}
+
+A deployment configures a HITL intensity level that determines
+the baseline human oversight requirement.  This is orthogonal to
+the four runtime override levels ({{levels}}): intensity levels
+govern planning; override levels govern runtime intervention.
+
+| Intensity | Label | Human requirement | When to use |
+|-----------|-------|-------------------|-------------|
+| I0 | Autonomous | No HITL required by default | Dev/test; fully trusted agents |
+| I1 | Advisory | Notifications; no blocking | Monitoring-only production deployments |
+| I2 | Selective | Approval required on critical paths only | Standard production cross-org deployments |
+| I3 | Mandatory | Approval required on every consequential action | Regulated environments; EU AI Act high-risk systems |
+{: #fig-intensity title="HITL Intensity Levels"}
+
+Intensity levels are declared in ACP-DAG-HITL workflow policy and
+map to AEM assurance levels (see {{assurance-binding}}):
+
+| HITL Intensity | Minimum AEM Assurance Level |
+|---------------|----------------------------|
+| I0 | L1 |
+| I1 | L1 |
+| I2 | L2 |
+| I3 | L3 |
+{: #fig-intensity-assurance title="Intensity to Assurance Level Mapping"}
+
+# Relationship to ACP-DAG-HITL {#mapping}
+
+ACP-DAG-HITL defines three HITL actions.  This specification maps
+each of them to a runtime override level and adds a fourth level,
+CONSTRAIN (partial restriction), which has no ACP-DAG-HITL
+equivalent:
+
+| ACP-DAG-HITL action | HITL Override Level | Behavior |
+|---------------------|---------------------|----------|
+| `pause` | Level 1: PAUSE | Suspend autonomous actions, hold state |
+| (no equivalent) | Level 2: CONSTRAIN | Restrict to an allowlist of actions |
+| `abort` | Level 3: STOP | Cease all actions, enter inert state |
+| `escalate` | Level 4: TAKEOVER | Transfer control to human operator |
+{: #fig-mapping title="ACP-DAG-HITL to HITL Level Mapping"}
+
+When ACP-DAG-HITL rules trigger, the runtime system uses the
+corresponding HITL level to enforce the action.
+
+# Override Levels {#levels}
+
+## Level 1: PAUSE
+
+The agent MUST suspend all autonomous actions and hold current
+state.  It MUST NOT initiate new actions but MAY complete
+in-progress actions if stopping mid-execution would cause harm
+(e.g., an in-flight database transaction).  The agent resumes
+when a RESUME command is received.
+
+## Level 2: CONSTRAIN
+
+The agent MUST restrict its actions to a specified subset.  The
+override command includes an allowlist of permitted action types.
+The agent MUST reject any action not on the allowlist, responding
+with HTTP 403 and an ECT noting the constraint violation.
+
+## Level 3: STOP
+
+The agent MUST immediately cease all autonomous actions and enter
+an inert state.  It MUST NOT take any autonomous actions until
+explicitly restarted.  This is the e-stop.  Any in-progress
+consequential actions MUST be abandoned; if abandonment would
+leave external state inconsistent, the agent MUST emit an
+`atd:error` ECT and the ATD rollback protocol applies.
+
+## Level 4: TAKEOVER
+
+The agent MUST transfer operational control to the human operator.
+It enters a pass-through mode where it executes only explicit
+operator commands.  The agent's sensors and outputs remain
+available to the operator as tools.  Deployments SHOULD require
+two-operator authorization for TAKEOVER (see {{security}}).
+
+# Override Protocol {#protocol}
+
+## Override Command
+
+Override commands are sent as HTTP POST to the agent's well-known
+endpoint:
+
+~~~
+POST /.well-known/hitl/override HTTP/1.1
+Content-Type: application/json
+Authorization: Bearer <operator-jwt>
+Execution-Context: <override-ect>
+~~~
+
+The override ECT MUST contain:
+
+- `exec_act`: `"hitl:override"`
+- `par`: the most recent ECT from the agent being overridden
+  (linking the override into the workflow DAG)
+
+~~~json
+{
+  "exec_act": "hitl:override",
+  "par": ["agent-last-action-ect"],
+  "ext": {
+    "hitl.level": 3,
+    "hitl.reason": "Agent blocking legitimate traffic",
+    "hitl.operator_id": "user:alice",
+    "hitl.scope": "*",
+    "hitl.constraints": null,
+    "hitl.ttl": null,
+    "hitl.nonce": "a3f8b2c1"
+  }
+}
+~~~
+{: #fig-override title="Override ECT"}
+
+Field definitions:
+
+- `hitl.level`: Integer 1-4. MUST be present.
+- `hitl.reason`: Human-readable text. MUST be logged.
+- `hitl.scope`: `"*"` for all functions, or an array of function
+  IDs for partial override.
+- `hitl.constraints`: For Level 2 only. Array of permitted action
+  types.
+- `hitl.ttl`: Duration in seconds. If set, override auto-expires.
+  If null, persists until explicitly lifted.
+- `hitl.nonce`: REQUIRED. A random value to prevent replay attacks.
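+
+As a non-normative illustration of the fields above, a Level 2
+(CONSTRAIN) override might look as follows.  The action type names
+in `hitl.constraints`, the function ID in `hitl.scope`, and the
+nonce value are hypothetical and are not defined by this
+specification; the example assumes a 15-minute TTL.
+
+~~~json
+{
+  "exec_act": "hitl:override",
+  "par": ["agent-last-action-ect"],
+  "ext": {
+    "hitl.level": 2,
+    "hitl.reason": "Limit agent to read-only actions during incident review",
+    "hitl.operator_id": "user:alice",
+    "hitl.scope": ["firewall:rule-management"],
+    "hitl.constraints": ["read_config", "report_status"],
+    "hitl.ttl": 900,
+    "hitl.nonce": "5d1e9c47"
+  }
+}
+~~~
+
+With `hitl.ttl` set to 900, this override would auto-expire after
+15 minutes unless it is lifted earlier.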
+
+## Acknowledgment
+
+The agent MUST respond within 1 second with an acknowledgment ECT:
+
+- `exec_act`: `"hitl:ack"`
+- `par`: the override ECT
+
+~~~json
+{
+  "exec_act": "hitl:ack",
+  "par": ["override-ect-uuid"],
+  "ext": {
+    "hitl.status": "accepted",
+    "hitl.prior_state": "autonomous",
+    "hitl.current_state": "stopped",
+    "hitl.effective_at": "2026-03-01T12:00:00.123Z"
+  }
+}
+~~~
+{: #fig-ack title="Acknowledgment ECT"}
+
+The override/ack ECT pair serves as the Decision Record defined
+in ACP-DAG-HITL Section 6.5.  No separate audit mechanism is
+needed.
+
+## Resume and Lift
+
+To resume from PAUSE:
+
+~~~
+POST /.well-known/hitl/resume HTTP/1.1
+Execution-Context: <resume-ect with exec_act="hitl:resume">
+~~~
+
+To lift any override:
+
+~~~
+POST /.well-known/hitl/lift HTTP/1.1
+Execution-Context: <lift-ect with exec_act="hitl:lift">
+~~~
+
+Both produce ECTs linked to the original override ECT via `par`.
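+
+A lift ECT could, for example, look as follows.  Only `exec_act`
+and the `par` link to the override ECT come from the text above;
+the `ext` fields shown are assumptions modeled on the override
+ECT in {{fig-override}}.
+
+~~~json
+{
+  "exec_act": "hitl:lift",
+  "par": ["override-ect-uuid"],
+  "ext": {
+    "hitl.reason": "Root cause fixed; returning agent to autonomous operation",
+    "hitl.operator_id": "user:alice"
+  }
+}
+~~~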
+
+# Agent Compliance Requirements {#compliance}
+
+Every HITL-compliant agent MUST:
+
+1. Implement the `/.well-known/hitl/override` endpoint per
+   {{RFC8615}}.
+
+2. Process override commands within 1 second of receipt.  The
+   override path MUST be independent of the agent's main
+   processing loop and MUST NOT be blocked by ongoing tasks.
+
+3. Acknowledge every override with an ECT response.
+
+4. Accept every override.  The agent MUST NOT respond with status
+   `rejected`; overrides are mandatory.  If the agent cannot fully
+   comply, it MUST respond with status `partial` and describe what
+   it could not do.
+
+5. Expose current override status at:
+
+~~~
+GET /.well-known/hitl/status
+~~~
+
+~~~json
+{
+  "agent_id": "spiffe://example.com/agent/firewall",
+  "override_active": true,
+  "current_level": 3,
+  "override_ect": "override-ect-uuid",
+  "since": "2026-03-01T12:00:00Z",
+  "operator_id": "user:alice"
+}
+~~~
+{: #fig-status title="Override Status Response"}
+
+In addition, the override endpoint SHOULD be served on a separate
+port or network interface from the agent's main API so that it
+remains available under load.
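+
+As a sketch of requirement 4, an acknowledgment with status
+`partial` might look as follows.  The `hitl.unfulfilled` field
+name and its contents are hypothetical; this specification does
+not define how the agent describes what it could not do.
+
+~~~json
+{
+  "exec_act": "hitl:ack",
+  "par": ["override-ect-uuid"],
+  "ext": {
+    "hitl.status": "partial",
+    "hitl.prior_state": "autonomous",
+    "hitl.current_state": "stopped",
+    "hitl.effective_at": "2026-03-01T12:00:00.123Z",
+    "hitl.unfulfilled": [
+      "In-flight configuration push to router-07 could not be abandoned"
+    ]
+  }
+}
+~~~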
+
+# Approval Gates {#approval-gates}
+
+An approval gate is a DAG node that blocks workflow progression
+until a human approves.  Unlike overrides (which interrupt running
+agents), approval gates are planned checkpoints in the workflow.
+
+Approval gates are defined as ACP-DAG-HITL nodes with HITL rules:
+
+~~~json
+{
+  "dag": {
+    "nodes": [
+      {
+        "id": "n-approve",
+        "type": "hitl:approval_gate",
+        "agent": "system:hitl-gateway",
+        "constraints": {
+          "hitl.required_role": "clinician:oncall",
+          "hitl.timeout_s": 300,
+          "hitl.timeout_action": "fail-closed"
+        }
+      }
+    ]
+  }
+}
+~~~
+{: #fig-gate title="Approval Gate as DAG Node"}
+
+When the workflow reaches an approval gate, the system:
+
+1. Emits an ECT with `exec_act: "hitl:approval_request"`.
+2. Notifies the required human role with an explainability
+   token (see {{explainability}}).
+3. Waits for approval (ECT: `"hitl:approval_granted"`) or
+   rejection (ECT: `"hitl:approval_denied"`).
+4. On timeout, applies `hitl.timeout_action` per {{timeout}}.
+
+## Approval Request and Response ECTs
+
+~~~json
+{
+  "exec_act": "hitl:approval_request",
+  "par": ["pre-gate-ect-uuid"],
+  "ext": {
+    "hitl.required_role": "clinician:oncall",
+    "hitl.context": "Medication dosage adjustment for patient P-1042",
+    "hitl.timeout_s": 300,
+    "hitl.explainability_ref": "expl-ect-uuid"
+  }
+}
+~~~
+{: #fig-approval-req title="Approval Request ECT"}
+
+~~~json
+{
+  "exec_act": "hitl:approval_granted",
+  "par": ["approval-request-ect-uuid"],
+  "ext": {
+    "hitl.operator_id": "user:dr-jones",
+    "hitl.scope": "medication:adjust",
+    "hitl.expires": "2026-03-01T13:00:00Z"
+  }
+}
+~~~
+{: #fig-approval-grant title="Approval Granted ECT"}
+
+~~~json
+{
+  "exec_act": "hitl:approval_denied",
+  "par": ["approval-request-ect-uuid"],
+  "ext": {
+    "hitl.operator_id": "user:dr-jones",
+    "hitl.reason": "Dosage exceeds safe maximum for patient weight",
+    "hitl.alternative": "Use standard protocol dosage"
+  }
+}
+~~~
+{: #fig-approval-deny title="Approval Denied ECT"}
+
+# Timeout and Fallback Policy {#timeout}
+
+When a human does not respond within `hitl.timeout_s`, the
+agent applies `hitl.timeout_action`.  Three policies are
+supported:
+
+fail-closed:
+: Abort the workflow.  The agent emits `atd:error` with
+  `error_type: "timeout"` and the ATD rollback protocol
+  applies.  Use when safety requires no action over wrong action.
+
+fail-open:
+: Continue as if approved, recording an audit ECT that no human
+  approved.  Use only when workflow continuity is more important
+  than human review (I0/I1 intensity deployments).
+
+escalate:
+: Move the approval request to the next operator in the
+  escalation chain (defined in ACP-DAG-HITL policy).  If the
+  escalation chain is exhausted, fall back to `fail-closed`.
+
+The timeout policy is set in ACP-DAG-HITL node constraints:
+
+~~~json
+{
+  "constraints": {
+    "hitl.timeout_s": 300,
+    "hitl.timeout_action": "escalate"
+  }
+}
+~~~
+{: #fig-timeout title="Timeout Policy as Node Constraint"}
+
+Timeout policy MUST be `fail-closed` at HITL intensity I3.
+Timeout policy MUST NOT be `fail-open` when assurance level is L3.
+
+# Explainability {#explainability}
+
+When a HITL point is triggered, the agent SHOULD provide an
+explainability token that allows the operator to make an informed
+decision.  At AEM assurance L2+, explainability is REQUIRED for
+approval gate requests.
+
+An explainability token is an ECT:
+
+- `exec_act`: `"hitl:explanation"`
+
+~~~json
+{
+  "exec_act": "hitl:explanation",
+  "par": ["last-agent-action-ect"],
+  "ext": {
+    "hitl.summary": "Agent proposes to reroute BGP traffic from AS64496 to AS64497 due to packet loss exceeding 15% threshold over 5-minute window.",
+    "hitl.proposed_action": "update-bgp-peer router-07 neighbor 198.51.100.1 remove-private-as",
+    "hitl.evidence_ects": [
+      "snmp-poll-1-ect-uuid",
+      "snmp-poll-2-ect-uuid",
+      "loss-calc-ect-uuid"
+    ],
+    "hitl.confidence": 0.91,
+    "hitl.risk_level": "medium",
+    "hitl.reversible": true
+  }
+}
+~~~
+{: #fig-explanation title="Explainability Token ECT"}
+
+Field definitions:
+
+- `hitl.summary`: Human-readable description of what the agent
+  was doing and why HITL was reached.  REQUIRED.
+- `hitl.proposed_action`: What the agent proposes to do.
+  REQUIRED.
+- `hitl.evidence_ects`: Array of `jti` values from prior ECTs
+  that support the proposal.  SHOULD be present.
+- `hitl.confidence`: Float 0.0-1.0; agent's self-assessed
+  confidence in the proposed action.  SHOULD be present.
+- `hitl.risk_level`: One of `low`, `medium`, `high`, `critical`.
+  SHOULD be present.
+- `hitl.reversible`: Whether the proposed action can be rolled
+  back.  REQUIRED.
+
+The `hitl.explainability_ref` field in the approval request ECT
+({{fig-approval-req}}) references the `jti` of this ECT.
+
+# Binding to AEM Assurance Levels {#assurance-binding}
+
+HITL requirements vary by AEM assurance level.  The following
+table is normative:
+
+| AEM Level | Required HITL Intensity | Override signing | Explainability |
+|-----------|------------------------|-----------------|----------------|
+| L1 | I0 (optional) | Optional | Optional |
+| L2 | I2 or higher | REQUIRED (signed JWT) | REQUIRED for I2+ |
+| L3 | I3 | REQUIRED (signed JWT, L3 ECT) | REQUIRED |
+{: #fig-assurance-hitl title="HITL Requirements by Assurance Level"}
+
+At L3, approval gate responses (`hitl:approval_granted`) MUST be
+committed to the audit ledger.
+
+# Broadcast Override {#broadcast}
+
+For environments with many agents, an operator MAY send a
+broadcast override to a management endpoint:
+
+~~~
+POST /hitl/broadcast HTTP/1.1
+Execution-Context: <broadcast-override-ect>
+
+{
+  "targets": ["spiffe://example.com/agent/a",
+               "spiffe://example.com/agent/b"],
+  "level": 3,
+  "reason": "Coordinated emergency stop"
+}
+~~~
+
+The broadcast endpoint fans out individual override ECTs to each
+target and returns per-agent results.  Each fan-out is itself an
+ECT linked to the broadcast override ECT.
+
+Broadcast overrides MUST be authenticated at L2 or higher.
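+
+The shape of the per-agent results is not specified here.  As one
+possible sketch, a response body might pair each target with the
+status and `jti` of its acknowledgment ECT; the `results`,
+`target`, `status`, and `ack_ect` member names are hypothetical.
+
+~~~json
+{
+  "results": [
+    {
+      "target": "spiffe://example.com/agent/a",
+      "status": "accepted",
+      "ack_ect": "ack-ect-uuid-a"
+    },
+    {
+      "target": "spiffe://example.com/agent/b",
+      "status": "partial",
+      "ack_ect": "ack-ect-uuid-b"
+    }
+  ]
+}
+~~~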
+
+# Dead Man's Switch {#dead-man}
+
+For maximum reliability, agents SHOULD implement a heartbeat
+mechanism: the agent periodically pings an operator heartbeat
+endpoint.  If the heartbeat is missed for a configurable duration,
+the agent automatically enters Level 1 (PAUSE).
+
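+The heartbeat interval SHOULD be 30 seconds.  The trigger
+threshold SHOULD be 3 missed heartbeats, so with these defaults
+the switch activates after roughly 90 seconds without a
+successful heartbeat.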
+
+This provides a safety net when network connectivity to the
+operator is lost.  The `unreachable_human` policy from
+ACP-DAG-HITL governs behavior when the dead man's switch
+activates: either `abort` (→ Level 3) or `safe_pause` (→ Level 1).
+
+# Security Considerations {#security}
+
+## Authentication of Override Commands
+
+All override endpoints MUST require authentication via mutual
+TLS ({{RFC8446}}) or signed JWTs ({{RFC7519}}).  The JWT MUST
+contain the operator's identity and be signed by a trusted key
+(per ACP-DAG-HITL operator role configuration).
+
+## Replay Prevention
+
+To prevent replay attacks, agents MUST:
+
+1. Reject override ECTs with `iat` more than 30 seconds in the
+   past.
+2. Reject override ECTs whose `jti` has already been seen.
+3. Require the `hitl.nonce` field in override ECTs.
+
+## Impersonation
+
+Override commands carry high privilege.  Agents MUST verify:
+
+- The operator JWT is signed by a trusted key in the ACP-DAG-HITL
+  operator registry.
+- The operator role matches the `required_role` in the triggering
+  HITL rule.
+
+## Two-Operator Approval for TAKEOVER
+
+Deployments SHOULD implement multi-operator approval for Level 4
+(TAKEOVER), requiring two independent operator identities.  The
+two approval ECTs MUST both appear in the `par` array of the
+TAKEOVER override ECT.
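+
+For illustration, a TAKEOVER override ECT under two-operator
+approval might reference the agent's most recent ECT together
+with both approval ECTs.  The identifiers are placeholders, and
+placing the agent ECT alongside the approvals in `par` is an
+assumption based on {{protocol}}.
+
+~~~json
+{
+  "exec_act": "hitl:override",
+  "par": [
+    "agent-last-action-ect",
+    "approval-ect-operator-1",
+    "approval-ect-operator-2"
+  ],
+  "ext": {
+    "hitl.level": 4,
+    "hitl.reason": "Manual control required during failover",
+    "hitl.operator_id": "user:alice",
+    "hitl.scope": "*",
+    "hitl.constraints": null,
+    "hitl.ttl": null,
+    "hitl.nonce": "c7a90e13"
+  }
+}
+~~~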
+
+## HITL Bypass Prevention
+
+Agents that claim a HITL gate was satisfied MUST provide the
+`jti` of the corresponding `hitl:approval_granted` ECT in the
+ECT that follows the gate.  Agents MUST NOT proceed past an
+approval gate without a valid signed approval ECT.
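+
+As a non-normative sketch, the ECT that follows the gate in
+{{fig-gate}} could carry the approval's `jti` in `par`.  Using
+`par` for this reference and the `exec_act` value shown are
+assumptions; the identifier is a placeholder.
+
+~~~json
+{
+  "exec_act": "medication:adjust",
+  "par": ["approval-granted-ect-uuid"]
+}
+~~~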
+
+## Escalation Chain Integrity
+
+The escalation chain in ACP-DAG-HITL policy defines which roles
+receive escalations.  This chain MUST be signed as part of the
+policy token to prevent tampering.  Agents MUST NOT follow
+escalation chains from unsigned or unverified policy tokens.
+
+# IANA Considerations
+
+## Well-Known URI Registrations
+
+This document requests the following registrations per {{RFC8615}}:
+
+| URI Suffix | Purpose |
+|------------|---------|
+| `hitl/override` | Override command endpoint |
+| `hitl/resume` | Resume from PAUSE |
+| `hitl/lift` | Lift any active override |
+| `hitl/status` | Override status query |
+{: #fig-wellknown title="Well-Known URI Registrations"}
+
+## `exec_act` Values
+
+This document requests registration in the AEM Ecosystem
+Extension Registry:
+
+| Value | Description | Reference |
+|-------|-------------|-----------|
+| `hitl:override` | Human override command | This document |
+| `hitl:ack` | Agent acknowledgment of override | This document |
+| `hitl:resume` | Resume from PAUSE state | This document |
+| `hitl:lift` | Lift any active override | This document |
+| `hitl:approval_request` | Workflow blocked at approval gate | This document |
+| `hitl:approval_granted` | Human approved continuation | This document |
+| `hitl:approval_denied` | Human denied continuation | This document |
+| `hitl:explanation` | Explainability token for HITL decision | This document |
+{: #fig-iana-actions title="HITL exec_act Registrations"}
+
+--- back
+
+# Acknowledgments
+{:numbered="false"}
+
+This specification is the runtime enforcement companion to
+ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}}.  Override
+design is inspired by industrial safety systems (IEC 62061,
+ISO 13849).  The explainability token design is informed by
+EU AI Act Article 13 transparency requirements.
--- a/workspace/drafts/new-drafts/draft-cpat-cross-protocol-agent-translation-00.md
+++ b/workspace/drafts/new-drafts/draft-cpat-cross-protocol-agent-translation-00.md
@@ -0,0 +1,354 @@
+---
+title: "Cross-Protocol Agent Translation (CPAT)"
+abbrev: "CPAT"
+category: std
+docname: draft-cpat-cross-protocol-agent-translation-00
+submissiontype: IETF
+number:
+date:
+v: 3
+area: "ART"
+workgroup: "DISPATCH"
+keyword:
+  - agent interoperability
+  - protocol translation
+  - agentic workflows
+  - execution context
+
+author:
+  -
+    fullname: Generated by IETF Draft Analyzer
+    organization: Independent
+    email: placeholder@example.com
+
+normative:
+  RFC7519:
+  RFC7515:
+  RFC9110:
+  RFC8615:
+  I-D.nennemann-wimse-ect:
+    title: "Execution Context Tokens for Distributed Agentic Workflows"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
+
+informative:
+  I-D.nennemann-agent-dag-hitl-safety:
+    title: "Agent Context Policy Token: DAG Delegation with Human Override"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
+
+--- abstract
+
+This document defines the Cross-Protocol Agent Translation (CPAT)
+framework, a mechanism enabling AI agents using different
+communication protocols to interoperate.  With over 90 competing
+agent-to-agent protocol drafts and no interoperability standard,
+protocol fragmentation is the primary barrier to multi-vendor agent
+ecosystems.  CPAT defines capability advertisement, protocol
+negotiation, and translation gateways.  Translation hops are
+recorded as Execution Context Token (ECT) DAG nodes, giving every
+cross-protocol interaction a cryptographic audit trail without
+inventing a parallel tracing mechanism.
+
+--- middle
+
+# Introduction
+
+The IETF AI/agent landscape includes over 90 drafts proposing
+agent-to-agent communication protocols, yet no standard exists
+for agents using different protocols to exchange messages.
+
+CPAT takes a pragmatic approach: rather than mandating a single
+protocol, it defines the minimum machinery for agents to discover
+each other's protocol support, agree on a common format, and fall
+back to translation gateways when no common protocol exists.
+
+CPAT builds on Execution Context Tokens
+{{I-D.nennemann-wimse-ect}} as its audit and tracing backbone.
+Every translation hop produces an ECT, linking into the workflow
+DAG alongside the source and destination agents.  This eliminates
+the need for a separate tracing or provenance mechanism -- the ECT
+DAG already provides it.
+
+Design principles:
+
+1. Reuse existing standards (HTTP, JSON, TLS, ECT) wherever
+   possible.
+2. Keep the core mechanism small enough to implement in a day.
+3. Do not require agents to support any protocol beyond their own
+   plus CPAT negotiation.
+
+# Conventions and Definitions
+
+{::boilerplate bcp14-tagged}
+
+The following terms are used in this document:
+
+Agent Protocol:
+: A communication protocol used by an AI agent for peer-to-peer
+  message exchange (e.g., A2A, MCP, SLIM, uACP).
+
+Capability Document:
+: A JSON object describing the protocols an agent supports, served
+  at a well-known URI.
+
+Translation Gateway:
+: A service that converts messages between two agent protocols,
+  recording each translation as an ECT DAG node.
+
+# Problem Statement
+
+Consider three agents: Agent A speaks Protocol X, Agent B speaks
+Protocol Y, and Agent C speaks both X and Z.  Today there is no
+standard way for A to discover that B uses a different protocol,
+negotiate a common format, or route through a translator.
+
+Existing work on Agent Name Service (ANS) and agent discovery
+addresses finding agents but not protocol compatibility.  CPAT
+fills the gap between discovery and communication.
+
+# Protocol Capability Advertisement {#capability-ad}
+
+Each CPAT-compliant agent MUST serve a capability document at the
+well-known URI `/.well-known/cpat` {{RFC8615}}.  The document is a
+JSON object:
+
+~~~json
+{
+  "cpat_version": "1.0",
+  "agent_id": "spiffe://example.com/agent/pricing",
+  "protocols": [
+    {
+      "id": "a2a-v1",
+      "version": "1.0",
+      "endpoint": "https://agent.example.com/a2a",
+      "priority": 10
+    },
+    {
+      "id": "mcp-v1",
+      "version": "2025-03-26",
+      "endpoint": "https://agent.example.com/mcp",
+      "priority": 20
+    }
+  ],
+  "translation_gateways": [
+    "https://gateway.example.com/cpat/translate"
+  ],
+  "ect_assurance_level": "L2"
+}
+~~~
+{: #fig-capability title="Capability Document Example"}
+
+The `protocols` array MUST contain at least one entry.  Each entry
+MUST include `id` (a registered protocol identifier), `version`,
+and `endpoint`.  The `priority` field is OPTIONAL; lower values
+indicate higher preference.
+
+The `ect_assurance_level` field declares the minimum ECT assurance
+level the agent requires for interactions.  This enables gateways
+to produce ECTs at the correct level.
+
+Agents SHOULD also advertise their capability document URI in DNS
+SVCB records.  The owner name prefix `_cpat._tcp` SHOULD be used.
+
+# Negotiation Handshake {#negotiation}
+
+When Agent A wants to communicate with Agent B:
+
+Step 1:
+: Agent A fetches Agent B's capability document from B's
+  well-known CPAT URI over HTTPS.
+
+Step 2:
+: Agent A computes the intersection of its own protocol list with
+  Agent B's.  If the intersection is non-empty, the protocol with
+  the lowest combined priority score is selected.  Communication
+  proceeds directly using that protocol.
+
+Step 3:
+: If no common protocol exists, Agent A checks whether any
+  translation gateway listed by either agent supports both
+  protocols.  Agent A queries the gateway:
+
+~~~
+GET /.well-known/cpat/gateway?from=a2a-v1&to=slim-v1
+~~~
+
+The gateway responds with 200 OK if it supports the pair, or
+404 if not.
+
+Step 4:
+: If a suitable gateway is found, Agent A sends its message to the
+  gateway, which translates and forwards it to Agent B.  The
+  gateway records the translation as an ECT (see {{ect-integration}}).
+
+Step 5:
+: If no gateway supports the required pair, Agent A returns an
+  error to its caller with error code `no_translation_path`.
+
+The entire negotiation is stateless and cacheable.  Agents SHOULD
+cache capability documents for the duration indicated by HTTP
+Cache-Control headers, defaulting to 3600 seconds.
+
+# ECT Integration {#ect-integration}
+
+Every translation hop produces an ECT {{I-D.nennemann-wimse-ect}}
+that links into the workflow DAG.  This provides cryptographic
+proof of protocol translation without a separate tracing mechanism.
+
+## Translation ECT Claims
+
+A gateway producing a translation ECT MUST set:
+
+- `exec_act`: `"cpat:translate"`
+- `par`: array containing the `jti` of the source agent's ECT
+- `wid`: the workflow identifier from the source ECT (preserving
+  workflow continuity across protocol boundaries)
+
+The `ext` claim carries CPAT-specific metadata:
+
+~~~json
+{
+  "ext": {
+    "cpat.source_protocol": "a2a-v1",
+    "cpat.dest_protocol": "slim-v1",
+    "cpat.gateway_id": "spiffe://gw.example.com/cpat",
+    "cpat.translation_warnings": []
+  }
+}
+~~~
+{: #fig-translation-ect title="Translation ECT Extension Claims"}
+
+The `inp_hash` claim MUST contain the SHA-256 hash of the source
+protocol message.  The `out_hash` claim MUST contain the SHA-256
+hash of the translated message.  This allows verifiers to confirm
+that a specific input produced a specific output without accessing
+the message content.
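
A verifier holding both messages could check the two hash claims roughly as below. The sketch assumes the hashes are carried as lowercase hexadecimal strings; the actual encoding is governed by the ECT specification and may differ. The helper names are hypothetical.

```python
import hashlib
import hmac


def message_hash(message: bytes) -> str:
    """SHA-256 of the raw message bytes, hex-encoded (encoding is an assumption)."""
    return hashlib.sha256(message).hexdigest()


def verify_translation(claims: dict, source_msg: bytes,
                       translated_msg: bytes) -> bool:
    """Check inp_hash and out_hash against the messages a verifier holds."""
    return (
        hmac.compare_digest(claims["inp_hash"], message_hash(source_msg))
        and hmac.compare_digest(claims["out_hash"], message_hash(translated_msg))
    )


source = b'{"task": "quote", "sku": "A-1"}'
translated = b'{"op": "quote", "item": "A-1"}'
claims = {
    "exec_act": "cpat:translate",
    "inp_hash": message_hash(source),
    "out_hash": message_hash(translated),
}

ok = verify_translation(claims, source, translated)
tampered = verify_translation(claims, source, translated + b" ")
```

A party that holds only the hashes can still compare them across ECTs without ever seeing the message content.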
+
+## Assurance Level Inheritance
+
+The gateway MUST produce ECTs at the higher of:
+
+- The source agent's declared `ect_assurance_level`
+- The destination agent's declared `ect_assurance_level`
+
+At L3, the translation ECT MUST be recorded in the audit ledger
+before the translated message is forwarded to the destination agent.
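
The inheritance rule reduces to taking the maximum over an ordered set of levels. The sketch below assumes three levels named `L1`, `L2`, and `L3` in ascending order; only `L2` and `L3` appear in this document, so the full list is an assumption.

```python
# Assumed ordering of assurance levels, lowest first.
LEVELS = ("L1", "L2", "L3")


def gateway_assurance_level(source_level: str, dest_level: str) -> str:
    """Return the higher of the two declared ect_assurance_level values."""
    # LEVELS.index raises ValueError for an unknown level, which is the
    # desired behaviour for a sketch: fail loudly rather than guess.
    return max(source_level, dest_level, key=LEVELS.index)


def must_record_before_forwarding(level: str) -> bool:
    """At L3 the ECT is written to the audit ledger before forwarding."""
    return level == "L3"


level = gateway_assurance_level("L2", "L3")
ledger_first = must_record_before_forwarding(level)
```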
+
+## DAG Continuity
+
+The translation creates a three-node subgraph in the workflow DAG:
+
+~~~
+Source Agent ECT (exec_act: "send_task")
+      |
+      v  [par reference]
+Gateway ECT (exec_act: "cpat:translate")
+      |
+      v  [par reference]
+Dest Agent ECT (exec_act: "receive_task")
+~~~
+{: #fig-dag-continuity title="Translation DAG Subgraph"}
+
+The Execution-Context HTTP header {{I-D.nennemann-wimse-ect}}
+survives protocol translation: the gateway includes the
+translation ECT in the Execution-Context header of the forwarded
+request to the destination agent.
+
+# Translation Gateway Requirements {#gateway-reqs}
+
+A CPAT translation gateway MUST:
+
+1. Serve a capability document listing all supported protocol
+   pairs at `/.well-known/cpat/gateway`.
+
+2. Accept messages via HTTP POST at its translate endpoint.
+
+3. Produce an ECT for every translation per {{ect-integration}}.
+
+4. Preserve message semantics: the intent, core payload content,
+   and metadata MUST survive translation.  Fields with no
+   equivalent in the destination protocol SHOULD be carried in a
+   protocol-specific extension field or dropped with a warning
+   recorded in `cpat.translation_warnings`.
+
+5. Return the translated message in the response body, or forward
+   it directly to the destination agent.
+
+A gateway MUST NOT modify payload semantics during translation.
+
+Gateways MUST require TLS 1.3 for all connections and SHOULD
+implement rate limiting per source agent.
+
+# Policy Integration {#policy-integration}
+
+When used with the Agent Context Policy Token
+{{I-D.nennemann-agent-dag-hitl-safety}}, CPAT-related policies
+can be expressed as DAG node constraints:
+
+~~~json
+{
+  "dag": {
+    "nodes": [
+      {
+        "id": "n-translate",
+        "type": "cpat:translate",
+        "agent": "spiffe://gw.example.com/cpat",
+        "constraints": {
+          "allowed_source_protocols": ["a2a-v1", "mcp-v1"],
+          "allowed_dest_protocols": ["slim-v1"],
+          "max_translation_hops": 2
+        }
+      }
+    ]
+  }
+}
+~~~
+{: #fig-policy title="CPAT Policy as DAG Node Constraints"}
+
+The `max_translation_hops` constraint prevents messages from being
+translated through an excessive number of gateways.  Agents
+receiving a message SHOULD reject it if the ECT DAG contains more
+translation hops than allowed by policy.
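
A receiving agent could enforce this limit by walking the `par` references of the incoming ECT. The sketch below is illustrative only: it models ECTs as plain dictionaries keyed by `jti`, and it assumes the limit applies to the number of distinct translation ECTs in the ancestry rather than to the longest single path.

```python
def count_translation_hops(ects: dict, leaf_jti: str) -> int:
    """Count distinct ancestors (including the leaf) with exec_act cpat:translate."""
    seen = set()
    stack = [leaf_jti]
    hops = 0
    while stack:
        jti = stack.pop()
        if jti in seen or jti not in ects:
            continue  # already visited, or parent not available locally
        seen.add(jti)
        node = ects[jti]
        if node.get("exec_act") == "cpat:translate":
            hops += 1
        stack.extend(node.get("par", []))
    return hops


def within_policy(ects: dict, leaf_jti: str, max_translation_hops: int) -> bool:
    return count_translation_hops(ects, leaf_jti) <= max_translation_hops


# Hypothetical workflow: source -> gateway 1 -> gateway 2 -> destination.
dag = {
    "src": {"exec_act": "send_task", "par": []},
    "gw1": {"exec_act": "cpat:translate", "par": ["src"]},
    "gw2": {"exec_act": "cpat:translate", "par": ["gw1"]},
    "dst": {"exec_act": "receive_task", "par": ["gw2"]},
}

hops = count_translation_hops(dag, "dst")
accepted = within_policy(dag, "dst", max_translation_hops=2)
rejected = not within_policy(dag, "dst", max_translation_hops=1)
```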
+
+# Security Considerations
+
+Capability documents are served over HTTPS, ensuring transport
+security.  Agents SHOULD verify TLS certificates before trusting
+capability documents.
+
+Gateways are trusted intermediaries with access to message content
+during translation.  For end-to-end confidentiality, agents MAY
+encrypt the message payload using a shared key established out of
+band; the gateway translates only the protocol framing, not the
+encrypted content.
+
+The ECT audit trail ({{ect-integration}}) enables detection of:
+
+- Unauthorized gateways (unexpected `cpat.gateway_id` in the DAG)
+- Content tampering (mismatched `inp_hash`/`out_hash` relative to
+  message content)
+- Routing loops (repeated gateway IDs in the DAG ancestry)
+
+At L3, the audit ledger provides tamper-evident proof of all
+translations for regulatory compliance.
+
+# IANA Considerations
+
+This document requests the following IANA registrations:
+
+1. A "CPAT Protocol Identifier" registry under Expert Review
+   policy.  Initial entries: "a2a-v1", "mcp-v1", "slim-v1",
+   "uacp-v1", "ainp-v1".
+
+2. A well-known URI registration for "cpat" per {{RFC8615}}.
+
+3. Registration of the `exec_act` value "cpat:translate" in a
+   future ECT action type registry.
+
+--- back
+
+# Acknowledgments
+{:numbered="false"}
+
+This document builds on the Execution Context Token specification
+{{I-D.nennemann-wimse-ect}} and the Agent Context Policy Token
+{{I-D.nennemann-agent-dag-hitl-safety}}.
--- a/workspace/drafts/new-drafts/draft-cpat-cross-protocol-agent-translation-00.txt
+++ b/workspace/drafts/new-drafts/draft-cpat-cross-protocol-agent-translation-00.txt
@@ -0,0 +1,281 @@
+Internet-Draft                                           AI/Agent WG
+Intended status: Standards Track                          March 2026
+Expires: September 15, 2026
+
+
+         Cross-Protocol Agent Translation (CPAT)
+         draft-cpat-cross-protocol-agent-translation-00
+
+Abstract
+
+   This document defines the Cross-Protocol Agent Translation (CPAT)
+   framework, a lightweight mechanism enabling AI agents using
+   different communication protocols to interoperate through
+   capability advertisement and message translation.  With over 90
+   competing agent-to-agent (A2A) protocol drafts and no
+   interoperability standard, protocol fragmentation is the primary
+   barrier to multi-vendor agent ecosystems.  CPAT defines three
+   components: a capability advertisement format for agents to
+   declare supported protocols, a negotiation handshake to select a
+   common protocol or translation path, and a canonical envelope
+   format that enables translation gateways to convert messages
+   between incompatible protocols.  CPAT reuses existing HTTP
+   content negotiation patterns and builds on JSON for simplicity.
+
+Status of This Memo
+
+   This Internet-Draft is submitted in full conformance with the
+   provisions of BCP 78 and BCP 79.
+
+   This document is intended to have Standards Track status.
+   Distribution of this memo is unlimited.
+
+Table of Contents
+
+   1.  Introduction
+   2.  Terminology
+   3.  Problem Statement
+   4.  Protocol Capability Advertisement
+   5.  Negotiation Handshake
+   6.  Canonical Envelope Format
+   7.  Translation Gateway Requirements
+   8.  Security Considerations
+   9.  IANA Considerations
+
+1.  Introduction
+
+   The IETF AI/agent landscape includes over 90 drafts proposing
+   agent-to-agent communication protocols, yet no standard exists
+   for agents using different protocols to exchange messages.  This
+   fragmentation mirrors the early days of instant messaging, where
+   users on different networks could not communicate until gateway
+   and federation standards emerged.
+
+   CPAT takes a pragmatic approach: rather than mandating a single
+   protocol, it defines the minimum machinery for agents to
+   discover each other's protocol support, agree on a common
+   format, and fall back to translation gateways when no common
+   protocol exists.  The design follows three principles:
+
+   1. Reuse existing standards (HTTP, JSON, TLS) wherever possible.
+   2. Keep the core mechanism small enough to implement in a day.
+   3. Do not require agents to support any protocol beyond their own
+      plus CPAT negotiation.
+
+2.  Terminology
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
+   "OPTIONAL" in this document are to be interpreted as described
+   in RFC 2119 [RFC2119].
+
+   Agent Protocol: A communication protocol used by an AI agent for
+   peer-to-peer message exchange (e.g., A2A, MCP, SLIM, uACP).
+
+   Capability Document: A JSON object describing the protocols an
+   agent supports, served at a well-known URI.
+
+   Translation Gateway: A service that converts messages between
+   two agent protocols using the CPAT canonical envelope as an
+   intermediate representation.
+
+3.  Problem Statement
+
+   Consider three agents: Agent A speaks Protocol X, Agent B speaks
+   Protocol Y, and Agent C speaks both X and Y.  Today there is no
+   standard way for A to discover that B uses a different protocol,
+   negotiate a common format, or route through a translator.  Each
+   protocol defines its own discovery and messaging layer, creating
+   isolated silos.
+
+   Existing work on Agent Name Service (ANS) and agent discovery
+   addresses finding agents but not protocol compatibility.  The
+   ADOL draft addresses token efficiency within a single protocol
+   but not cross-protocol translation.  CPAT fills the gap between
+   discovery and communication.
+
+4.  Protocol Capability Advertisement
+
+   Each CPAT-compliant agent MUST serve a capability document at
+   the well-known URI /.well-known/cpat.  The document is a JSON
+   object with the following structure:
+
+      {
+        "cpat_version": "1.0",
+        "agent_id": "urn:uuid:550e8400-e29b-41d4-a716-446655440000",
+        "protocols": [
+          {
+            "id": "a2a-v1",
+            "version": "1.0",
+            "endpoint": "https://agent.example.com/a2a",
+            "priority": 10
+          },
+          {
+            "id": "mcp-v1",
+            "version": "2025-03-26",
+            "endpoint": "https://agent.example.com/mcp",
+            "priority": 20
+          }
+        ],
+        "translation_gateways": [
+          "https://gateway.example.com/cpat/translate"
+        ],
+        "envelope_formats": ["cpat-envelope-v1"]
+      }
+
+   The "protocols" array MUST contain at least one entry.  Each
+   entry MUST include "id" (a registered protocol identifier),
+   "version", and "endpoint".  The "priority" field is OPTIONAL;
+   lower values indicate higher preference.
+
+   Agents SHOULD also advertise their capability document URI in
+   DNS SRV or SVCB records for automated discovery.  The records
+   SHOULD be published under the "_cpat._tcp" service label.
+
+5.  Negotiation Handshake
+
+   When Agent A wants to communicate with Agent B, the following
+   negotiation procedure applies:
+
+   Step 1: Agent A fetches Agent B's capability document from
+   B's well-known CPAT URI over HTTPS.
+
+   Step 2: Agent A computes the intersection of its own protocol
+   list with Agent B's.  If the intersection is non-empty, the
+   protocol with the lowest combined priority score is selected.
+   Communication proceeds directly using that protocol.
+
+   Step 3: If no common protocol exists, Agent A checks whether
+   any translation gateway listed by either agent supports both
+   protocols.  Agent A queries the gateway's capability endpoint
+   at /.well-known/cpat/gateway:
+
+      GET /.well-known/cpat/gateway?from=a2a-v1&to=slim-v1
+
+   The gateway responds with 200 OK and a translation descriptor
+   if it supports the pair, or 404 if not.
+
+   Step 4: If a suitable gateway is found, Agent A sends its
+   message wrapped in a CPAT envelope (Section 6) to the gateway,
+   which translates and forwards it to Agent B.
+
+   Step 5: If no gateway supports the required pair, Agent A
+   SHOULD return an error to its caller indicating protocol
+   incompatibility, using the CPAT error code "no_translation_path".
+
+   The entire negotiation is stateless and cacheable.  Agents
+   SHOULD cache capability documents for the duration indicated by
+   HTTP Cache-Control headers, defaulting to 3600 seconds.
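
The gateway lookup in Step 3 can be sketched without any network code. The example assumes that the gateway's capability endpoint lives on the same origin as the translate URL listed in the capability document; the draft does not say so explicitly. Both function names are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

GATEWAY_PATH = "/.well-known/cpat/gateway"


def gateway_query_url(gateway_endpoint: str, src: str, dst: str) -> str:
    """Build the Step 3 lookup URL from a listed translation gateway URL."""
    parts = urlsplit(gateway_endpoint)
    query = urlencode({"from": src, "to": dst})
    # Keep scheme and host, replace path and query, drop any fragment.
    return urlunsplit((parts.scheme, parts.netloc, GATEWAY_PATH, query, ""))


def supports_pair(status_code: int) -> bool:
    """Interpret the gateway's answer: 200 means supported, 404 means not."""
    if status_code == 200:
        return True
    if status_code == 404:
        return False
    raise ValueError(f"unexpected gateway status: {status_code}")


url = gateway_query_url(
    "https://gateway.example.com/cpat/translate", "a2a-v1", "slim-v1"
)
```

Any other status code is treated as an error here rather than as "unsupported", so that a failing gateway is not silently mistaken for a missing translation path.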
+
+6.  Canonical Envelope Format
+
+   The CPAT envelope wraps a protocol-specific message in a
+   standard container for gateway translation.  The envelope is a
+   JSON object:
+
+      {
+        "cpat_version": "1.0",
+        "message_id": "urn:uuid:6ba7b810-9dad-11d1-80b4-00c04fd430c8",
+        "timestamp": "2026-03-01T12:00:00Z",
+        "source": {
+          "agent_id": "urn:uuid:...",
+          "protocol": "a2a-v1"
+        },
+        "destination": {
+          "agent_id": "urn:uuid:...",
+          "protocol": "slim-v1"
+        },
+        "intent": "task_request",
+        "payload": {
+          "content_type": "application/json",
+          "body": "...base64-encoded protocol-specific message..."
+        },
+        "trace": ["urn:uuid:...source", "urn:uuid:...gateway"]
+      }
+
+   The "intent" field MUST be one of: "task_request",
+   "task_response", "notification", "error", "capability_query".
+   This allows gateways to perform semantic translation even when
+   protocol message structures differ significantly.
+
+   The "trace" array provides a simple provenance chain of all
+   agents and gateways that have handled the message.  Each
+   intermediary MUST append its own identifier.
+
+   The "payload.body" field contains the original protocol message,
+   base64-encoded.  Gateways translate by decoding the source
+   protocol message, mapping it to the CPAT semantic model (intent
+   + standard fields), and re-encoding in the destination protocol.
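
Building an envelope and extending its trace might look like the sketch below. It assumes standard (not URL-safe) base64 for "payload.body" and assumes the trace starts with the source agent's identifier, as the example above suggests. The helper names `wrap` and `append_trace` are illustrative.

```python
import base64
import uuid
from datetime import datetime, timezone

INTENTS = {"task_request", "task_response", "notification",
           "error", "capability_query"}


def wrap(raw_message: bytes, intent: str, source: dict, destination: dict) -> dict:
    """Wrap a protocol-specific message in a CPAT envelope."""
    if intent not in INTENTS:
        raise ValueError(f"unknown intent: {intent}")
    return {
        "cpat_version": "1.0",
        "message_id": f"urn:uuid:{uuid.uuid4()}",
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "source": source,
        "destination": destination,
        "intent": intent,
        "payload": {
            "content_type": "application/json",
            "body": base64.b64encode(raw_message).decode("ascii"),
        },
        "trace": [source["agent_id"]],  # assumption: sender is the first entry
    }


def append_trace(envelope: dict, intermediary_id: str) -> dict:
    """Return a copy of the envelope with the intermediary appended to trace."""
    updated = dict(envelope)
    updated["trace"] = [*envelope["trace"], intermediary_id]
    return updated


src = {"agent_id": "urn:uuid:source-agent", "protocol": "a2a-v1"}
dst = {"agent_id": "urn:uuid:dest-agent", "protocol": "slim-v1"}
envelope = wrap(b'{"task": "quote"}', "task_request", src, dst)
forwarded = append_trace(envelope, "urn:uuid:gateway")
```

Returning a copy from `append_trace` keeps the sender's original envelope unchanged, which makes it easier to compare what was sent with what was forwarded.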
+
+7.  Translation Gateway Requirements
+
+   A CPAT translation gateway MUST:
+
+   1. Serve a capability document listing all supported protocol
+      pairs at /.well-known/cpat/gateway.
+
+   2. Accept CPAT envelopes via HTTP POST at its translate endpoint.
+
+   3. Validate envelope integrity before translation.
+
+   4. Preserve message semantics: the intent, core payload content,
+      and metadata MUST survive translation.  Fields with no
+      equivalent in the destination protocol SHOULD be carried in
+      a protocol-specific extension field or dropped with a warning.
+
+   5. Return the translated envelope in the response body, or
+      forward it directly to the destination agent.
+
+   6. Log all translations with source, destination, and timestamp
+      for audit purposes.
+
+   A gateway MUST NOT modify the payload semantics during
+   translation.  If exact translation is not possible, the gateway
+   MUST include a "translation_warnings" array in the envelope
+   listing fields that were approximated or dropped.
+
+   Gateways SHOULD implement rate limiting per source agent and
+   MUST require TLS 1.3 [RFC8446] for all connections.
+
+8.  Security Considerations
+
+   Capability documents are served over HTTPS, ensuring transport
+   security.  Agents SHOULD verify the TLS certificate of peers
+   before trusting their capability documents.
+
+   CPAT envelopes in transit through gateways are visible to the
+   gateway operator.  For end-to-end confidentiality, agents MAY
+   encrypt the payload.body field using a shared key established
+   out of band.  The envelope metadata (intent, agent IDs,
+   timestamps) remains visible to enable routing.
+
+   Gateways are trusted intermediaries.  Deployments SHOULD use
+   gateways operated by mutually trusted parties or verified
+   through attestation mechanisms such as those in
+   draft-aylward-daap-v2.
+
+   The trace array enables detection of routing loops and
+   unauthorized intermediaries.  Agents SHOULD reject messages
+   with unexpected entries in the trace.
+
+   Denial-of-service attacks against gateways are mitigated by
+   rate limiting (Section 7) and standard HTTP-layer protections.
+
+9.  IANA Considerations
+
+   This document requests IANA establish the following:
+
+   1. A "CPAT Protocol Identifier" registry under Expert Review
+      policy.  Initial entries: "a2a-v1", "mcp-v1", "slim-v1",
+      "uacp-v1", "ainp-v1".
+
+   2. A "CPAT Intent Type" registry under Specification Required
+      policy.  Initial entries: "task_request", "task_response",
+      "notification", "error", "capability_query".
+
+   3. A well-known URI registration for "cpat" per RFC 8615.
+
+Author's Address
+
+   Generated by IETF Draft Analyzer
+   2026-03-01
--- a/workspace/drafts/new-drafts/draft-d-aepb-protocol-binding-00.md
+++ b/workspace/drafts/new-drafts/draft-d-aepb-protocol-binding-00.md
@@ -0,0 +1,315 @@
+---
+title: "Agent Ecosystem Protocol Binding (AEPB): Interop and Lifecycle"
+abbrev: "AEPB"
+category: std
+docname: draft-aepb-agent-ecosystem-protocol-binding-00
+submissiontype: IETF
+number:
+date:
+v: 3
+area: "ART"
+workgroup: "DISPATCH"
+keyword:
+  - agent interoperability
+  - protocol translation
+  - lifecycle
+  - agentic workflows
+
+author:
+  -
+    fullname: TBD
+    organization: Independent
+    email: placeholder@example.com
+
+normative:
+  RFC2119:
+  RFC8174:
+  RFC8446:
+  RFC8615:
+  RFC9110:
+  I-D.nennemann-wimse-ect:
+    title: "Execution Context Tokens for Distributed Agentic Workflows"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
+  I-D.nennemann-agent-dag-hitl-safety:
+    title: "Agent Context Policy Token: DAG Delegation with Human Override"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
+
+informative:
+
+--- abstract
+
+This document defines the Agent Ecosystem Protocol Binding (AEPB),
+the interoperability and lifecycle layer of the agent ecosystem.
+With over 90 competing A2A protocol drafts and no interoperability
+standard, AEPB defines capability advertisement, protocol
+negotiation, translation gateways, and agent lifecycle management
+(versioning, graceful shutdown, retirement).  Translation hops
+produce ECT nodes, preserving DAG continuity across protocol
+boundaries.  Protocol constraints are expressed as ACP-DAG-HITL
+node constraints.
+
+--- middle
+
+# Introduction
+
+The IETF AI/agent landscape includes over 90 drafts proposing
+agent-to-agent communication protocols.  No standard exists for
+agents using different protocols to exchange messages, and no
+standard exists for how agents evolve, get replaced, or retire
+without disrupting dependent services.
+
+AEPB addresses both gaps with a pragmatic approach: rather than
+mandating a single protocol, it defines the minimum machinery for
+agents to discover each other's protocol support, agree on a
+common format, fall back to translation gateways, and manage their
+lifecycle.
+
+AEPB builds on ECT {{I-D.nennemann-wimse-ect}} for audit (every
+translation hop is a DAG node) and ACP-DAG-HITL
+{{I-D.nennemann-agent-dag-hitl-safety}} for policy (protocol
+constraints as node constraints).
+
+# Conventions and Definitions
+
+{::boilerplate bcp14-tagged}
+
+Agent Protocol:
+: A communication protocol used by an AI agent for peer-to-peer
+  message exchange (e.g., A2A, MCP, SLIM, uACP).
+
+Capability Document:
+: A JSON object describing the protocols an agent supports.
+
+Translation Gateway:
+: A service that converts messages between two agent protocols,
+  recording each translation as an ECT DAG node.
+
+# Capability Advertisement {#capability}
+
+Each AEPB-compliant agent MUST serve a capability document at
+`/.well-known/aepb` {{RFC8615}}:
+
+~~~json
+{
+  "aepb_version": "1.0",
+  "agent_id": "spiffe://example.com/agent/pricing",
+  "protocols": [
+    {
+      "id": "a2a-v1",
+      "version": "1.0",
+      "endpoint": "https://agent.example.com/a2a",
+      "priority": 10
+    },
+    {
+      "id": "mcp-v1",
+      "version": "2025-03-26",
+      "endpoint": "https://agent.example.com/mcp",
+      "priority": 20
+    }
+  ],
+  "translation_gateways": [
+    "https://gateway.example.com/aepb/translate"
+  ],
+  "ect_assurance_level": "L2",
+  "lifecycle": {
+    "status": "active",
+    "version": "2.1.0",
+    "deprecated_at": null,
+    "sunset_at": null,
+    "successor": null
+  }
+}
+~~~
+{: #fig-capability title="Capability Document"}
+
+The `protocols` array MUST contain at least one entry.  `priority`
+is OPTIONAL; lower values indicate higher preference.
+
+The `lifecycle` object (see {{lifecycle}}) provides versioning and
+deprecation metadata.
+
+Agents SHOULD advertise via DNS SVCB records (`_aepb._tcp`).
+
+# Protocol Negotiation {#negotiation}
+
+When Agent A wants to communicate with Agent B:
+
+1. Agent A fetches B's capability document over HTTPS.
+
+2. Agent A computes the intersection of protocol lists.  If
+   non-empty, the protocol with the lowest combined priority is
+   selected.  Communication proceeds directly.
+
+3. If no common protocol exists, Agent A checks translation
+   gateways listed by either agent:
+
+   ~~~
+   GET /.well-known/aepb/gateway?from=a2a-v1&to=slim-v1
+   ~~~
+
+   The gateway responds 200 if it supports the pair, 404 if not.
+
+4. If a suitable gateway is found, Agent A sends its message to
+   the gateway, which translates and forwards.
+
+5. If no gateway supports the pair, Agent A returns error
+   `no_translation_path`.
+
+Negotiation is stateless and cacheable (Cache-Control, default
+3600s).
+
+# Translation as ECT DAG Nodes {#translation-ect}
+
+Every translation hop produces an ECT:
+
+- `exec_act`: `"aepb:translate"`
+- `par`: array containing the `jti` of the source agent's ECT
+- `inp_hash`: SHA-256 of source protocol message
+- `out_hash`: SHA-256 of translated message
+
+~~~json
+{
+  "exec_act": "aepb:translate",
+  "par": ["source-agent-ect-uuid"],
+  "inp_hash": "sha256-of-source-message",
+  "out_hash": "sha256-of-translated-message",
+  "ext": {
+    "aepb.source_protocol": "a2a-v1",
+    "aepb.dest_protocol": "slim-v1",
+    "aepb.gateway_id": "spiffe://gw.example.com/aepb",
+    "aepb.translation_warnings": []
+  }
+}
+~~~
+{: #fig-translate-ect title="Translation ECT"}
+
+This creates a three-node subgraph:
+
+~~~
+Source ECT → Gateway ECT (aepb:translate) → Dest ECT
+~~~
+
+The Execution-Context HTTP header survives protocol translation:
+the gateway includes the translation ECT in the header of the
+forwarded request.
+
+## Translation Policy
+
+Protocol constraints are ACP-DAG-HITL node constraints:
+
+~~~json
+{
+  "constraints": {
+    "aepb.allowed_source_protocols": ["a2a-v1", "mcp-v1"],
+    "aepb.allowed_dest_protocols": ["slim-v1"],
+    "aepb.max_translation_hops": 2
+  }
+}
+~~~
+{: #fig-policy title="Translation Policy"}
+
+Agents receiving a message SHOULD reject it if the ECT DAG
+contains more translation hops than `aepb.max_translation_hops`.
+
+# Translation Gateway Requirements {#gateway}
+
+A gateway MUST:
+
+1. Serve a capability document at `/.well-known/aepb/gateway`.
+2. Accept messages via HTTP POST at its translate endpoint.
+3. Produce an ECT per {{translation-ect}} for every translation.
+4. Preserve message semantics.  Fields without a destination
+   equivalent are carried in an extension field or dropped with
+   a warning in `aepb.translation_warnings`.
+5. Require TLS 1.3 {{RFC8446}} for all connections.
+6. Implement rate limiting per source agent.
+
+A gateway MUST NOT modify payload semantics.
+
+# Agent Lifecycle Management {#lifecycle}
+
+## Lifecycle States
+
+An agent's `lifecycle.status` MUST be one of:
+
+- `active`: Normal operation.  Default state.
+- `deprecated`: Agent is functional but will be retired.
+  `deprecated_at` MUST be set.  Clients SHOULD migrate to
+  `successor` if provided.
+- `draining`: Agent is rejecting new workflows but completing
+  in-progress ones.  New delegation requests return HTTP 503,
+  optionally with a `Retry-After` header; the replacement agent
+  is identified by `successor` in the capability document.
+- `retired`: Agent is offline.  Requests for the capability
+  document return HTTP 410 Gone and identify `successor` so that
+  clients can move to the replacement.
+
+## Versioning
+
+The `lifecycle.version` field uses semantic versioning.  Agents
+MUST increment the major version when breaking changes occur
+(incompatible protocol or behavior changes).
+
+Capability documents MUST include the version.  Agents SHOULD
+include version in ECT `ext` claims (`aepb.agent_version`) so
+the audit trail records which version performed each action.
+
+## Graceful Shutdown
+
+When an agent transitions to `draining`:
+
+1. Update capability document: `status: "draining"`,
+   set `sunset_at` timestamp.
+2. Reject new workflow delegations with HTTP 503.
+3. Complete all in-progress workflows.
+4. Emit a final ECT: `exec_act: "aepb:shutdown"`.
+5. Transition to `retired`.
+
+Agents SHOULD provide at least 24 hours between `deprecated`
+and `draining` so that clients holding cached capability
+documents have time to refetch them and discover the change.
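
The lifecycle rules can be sketched as a small state machine. The draft lists the four states but does not spell out which transitions are legal, so the strictly linear order used below is an assumption, as are the function names. The 24-hour notice is a SHOULD, so a real implementation might log a warning rather than refuse.

```python
from datetime import datetime, timedelta, timezone

# Assumed linear progression; the draft does not forbid other transitions.
ORDER = ["active", "deprecated", "draining", "retired"]
MIN_NOTICE = timedelta(hours=24)


def may_transition(current: str, target: str) -> bool:
    """Allow only a move to the next state in ORDER."""
    return ORDER.index(target) == ORDER.index(current) + 1


def notice_period_elapsed(deprecated_at: datetime, now: datetime) -> bool:
    """True once at least 24 hours have passed since deprecation."""
    return now - deprecated_at >= MIN_NOTICE


deprecated_at = datetime(2026, 3, 1, 12, 0, tzinfo=timezone.utc)
too_early = notice_period_elapsed(deprecated_at, deprecated_at + timedelta(hours=1))
late_enough = notice_period_elapsed(deprecated_at, deprecated_at + timedelta(hours=25))
```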
+
+## Successor Discovery
+
+When `successor` is set, it MUST be the URI of the replacement
+agent's capability document.  Clients SHOULD transparently
+redirect to the successor after verifying its capability
+document.
+
+# Security Considerations
+
+Capability documents are served over HTTPS.  Agents SHOULD verify
+TLS certificates before trusting capability documents.
+
+Gateways are trusted intermediaries with access to message content.
+For end-to-end confidentiality, agents MAY encrypt message payloads
+with a shared key established out of band.
+
+The ECT audit trail enables detection of unauthorized gateways,
+content tampering (mismatched `inp_hash`/`out_hash`), and routing
+loops (repeated gateway IDs in DAG ancestry).
+
+Lifecycle transitions (especially `draining` and `retired`) can be
+exploited for denial of service.  Only the agent operator (verified
+via identity binding) SHOULD be able to update lifecycle status.
+
+# IANA Considerations
+
+This document requests:
+
+1. An "AEPB Protocol Identifier" registry under Expert Review.
+   Initial entries: `a2a-v1`, `mcp-v1`, `slim-v1`, `uacp-v1`,
+   `ainp-v1`.
+
+2. Well-known URI registrations for `aepb` and `aepb/gateway`
+   per {{RFC8615}}.
+
+3. Registration of `exec_act` values: `aepb:translate`,
+   `aepb:shutdown` in a future ECT action type registry.
+
+--- back
+
+# Acknowledgments
+{:numbered="false"}
+
+AEPB builds on ECT {{I-D.nennemann-wimse-ect}} for translation
+audit trails and ACP-DAG-HITL
+{{I-D.nennemann-agent-dag-hitl-safety}} for protocol policy.
--- a/workspace/drafts/new-drafts/draft-d-aepb-protocol-binding-01.md
+++ b/workspace/drafts/new-drafts/draft-d-aepb-protocol-binding-01.md
@@ -0,0 +1,577 @@
+---
+title: "Agent Ecosystem Protocol Binding (AEPB): Interop and Lifecycle"
+abbrev: "AEPB"
+category: std
+docname: draft-aepb-agent-ecosystem-protocol-binding-01
+submissiontype: IETF
+number:
+date:
+v: 3
+area: "ART"
+workgroup: "DISPATCH"
+keyword:
+  - agent interoperability
+  - protocol translation
+  - lifecycle
+  - agentic workflows
+
+author:
+  -
+    fullname: TBD
+    organization: Independent
+    email: placeholder@example.com
+
+normative:
+  RFC2119:
+  RFC8174:
+  RFC8446:
+  RFC8615:
+  RFC9110:
+  RFC8594:
+  I-D.nennemann-wimse-ect:
+    title: "Execution Context Tokens for Distributed Agentic Workflows"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
+  I-D.nennemann-agent-dag-hitl-safety:
+    title: "Agent Context Policy Token: DAG Delegation with Human Override"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
+
+informative:
+
+--- abstract
+
+This document defines the Agent Ecosystem Protocol Binding (AEPB),
+the interoperability and lifecycle layer of the agent ecosystem.
+With over 90 competing A2A protocol drafts and no interoperability
+standard, AEPB defines capability advertisement, protocol
+negotiation, formal binding requirements, translation gateway
+architecture, and agent lifecycle management (versioning, graceful
+shutdown, retirement).  Translation hops produce ECT nodes,
+preserving DAG continuity across protocol boundaries.  Protocol
+constraints are expressed as ACP-DAG-HITL node constraints.
+
+--- middle
+
+# Introduction
+
+The IETF AI/agent landscape includes over 90 drafts proposing
+agent-to-agent communication protocols.  No standard exists for
+agents using different protocols to exchange messages, and no
+standard exists for how agents evolve, get replaced, or retire
+without disrupting dependent services.
+
+AEPB addresses both gaps with a pragmatic approach: rather than
+mandating a single protocol, it defines the minimum machinery for
+agents to discover each other's protocol support, agree on a
+common format, fall back to translation gateways, and manage their
+lifecycle.
+
+AEPB builds on ECT {{I-D.nennemann-wimse-ect}} for audit (every
+translation hop is a DAG node) and ACP-DAG-HITL
+{{I-D.nennemann-agent-dag-hitl-safety}} for policy (protocol
+constraints as node constraints).
+
+# Conventions and Definitions
+
+{::boilerplate bcp14-tagged}
+
+Agent Protocol:
+: A communication protocol used by an AI agent for peer-to-peer
+  message exchange (e.g., A2A, MCP, SLIM, uACP).
+
+Capability Document:
+: A JSON object describing the protocols an agent supports,
+  lifecycle status, and ECT assurance level.
+
+Translation Gateway:
+: A service that converts messages between two agent protocols,
+  recording each translation as an ECT DAG node.
+
+Protocol Binding:
+: The mapping between the AEPB ecosystem semantics and a specific
+  agent protocol.  Each binding has a stable identifier string.
+
+Binding Identifier:
+: A short string identifying a specific protocol binding
+  version (e.g., `a2a-v1`, `mcp-v1`).
+
+# Capability Advertisement {#capability}
+
+## Capability Document Format
+
+Each AEPB-compliant agent MUST serve a capability document at
+`/.well-known/aepb` per {{RFC8615}}:
+
+~~~json
+{
+  "aepb_version": "1.0",
+  "agent_id": "spiffe://example.com/agent/pricing",
+  "protocols": [
+    {
+      "id": "a2a-v1",
+      "version": "1.0",
+      "endpoint": "https://agent.example.com/a2a",
+      "priority": 10
+    },
+    {
+      "id": "mcp-v1",
+      "version": "2025-03-26",
+      "endpoint": "https://agent.example.com/mcp",
+      "priority": 20
+    }
+  ],
+  "translation_gateways": [
+    "https://gateway.example.com/aepb/translate"
+  ],
+  "ect_assurance_level": "L2",
+  "ect_namespaces": ["atd", "hitl", "apae"],
+  "lifecycle": {
+    "status": "active",
+    "version": "2.1.0",
+    "deprecated_at": null,
+    "sunset_at": null,
+    "successor": null
+  }
+}
+~~~
+{: #fig-capability title="Capability Document"}
+
+The `protocols` array MUST contain at least one entry.  `priority`
+is OPTIONAL; lower values indicate higher preference.
+
+The `ect_namespaces` field MUST list all ECT extension namespaces
+(ATD, HITL, APAE) that this agent emits and can process.  Peers
+use this to determine whether ecosystem semantics are compatible.
+
+The `lifecycle` object (see {{lifecycle}}) provides versioning and
+deprecation metadata.
+
+## DNS-SD Advertisement
+
+Agents SHOULD advertise via DNS SVCB records (`_aepb._tcp`) as
+an alternative to well-known URI discovery.  The SVCB record
+MUST include a `hint` parameter pointing to the well-known URI.
+
+## Capability Document Caching
+
+Capability documents MAY be cached according to the HTTP
+`Cache-Control` semantics of RFC 9111 (HTTP Caching, the companion
+to {{RFC9110}}).  The default max-age is 3600 seconds.
+Agents MUST set `Expires` or `Cache-Control: max-age` on
+capability document responses.
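
A client-side reading of these caching rules could be sketched as follows. The parser handles only `max-age` and `no-store` and ignores `Expires`; it is a simplification for illustration, not a complete implementation of HTTP caching. The function name is hypothetical.

```python
from typing import Optional

DEFAULT_MAX_AGE = 3600  # seconds, the default stated above


def cache_lifetime(cache_control: Optional[str]) -> int:
    """Return how long (in seconds) a capability document may be cached."""
    directives = {}
    for part in (cache_control or "").split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')
    if "no-store" in directives:
        return 0  # the server asked for the document not to be stored
    max_age = directives.get("max-age", "")
    if max_age.isascii() and max_age.isdigit():
        return int(max_age)
    return DEFAULT_MAX_AGE  # header absent or without a usable max-age


explicit = cache_lifetime("public, max-age=600")
fallback = cache_lifetime(None)
```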
+
+# Protocol Negotiation {#negotiation}
+
+When Agent A wants to communicate with Agent B:
+
+1. Agent A fetches B's capability document over HTTPS.
+
+2. Agent A computes the intersection of protocol lists.  If
+   non-empty, the protocol with the lowest combined priority is
+   selected.  Communication proceeds directly.
+
+3. If no common protocol exists, Agent A checks translation
+   gateways listed by either agent:
+
+   ~~~
+   GET /.well-known/aepb/gateway?from=a2a-v1&to=slim-v1 HTTP/1.1
+   ~~~
+
+   The gateway responds 200 if it supports the pair, 404 if not.
+
+4. If a suitable gateway is found, Agent A sends its message to
+   the gateway, which translates and forwards.
+
+5. If no gateway supports the pair, Agent A MUST return error
+   `no_translation_path` and MUST NOT proceed.
+
+Negotiation is stateless and cacheable (Cache-Control, default
+3600s).
+
+## Protocol Downgrade Prevention
+
+Protocol negotiation MUST NOT result in selection of a binding
+whose security level is below the minimum configured in
+ACP-DAG-HITL node constraints:
+
+~~~json
+{
+  "constraints": {
+    "aepb.min_protocol_security": "tls-1.3"
+  }
+}
+~~~
+
+Agents MUST reject capability documents that advertise only
+protocols below their configured minimum security requirement.
+Specifically, all protocols MUST use TLS 1.3 {{RFC8446}}; no
+plaintext bindings are permitted in production deployments.
+
+# Conforming Protocol Binding Requirements {#binding-requirements}
+
+A protocol binding MUST satisfy the following requirements to be
+registered in the AEPB Protocol Binding Registry.
+
+## ECT Carriage
+
+A conforming binding MUST provide a mechanism to carry ECTs
+alongside protocol messages.  For HTTP-based protocols, this
+MUST be the `Execution-Context` header as defined in
+{{I-D.nennemann-wimse-ect}}.  For non-HTTP protocols, the
+binding specification MUST define an equivalent envelope field.
+
+## Task Invocation with Parent Reference
+
+A conforming binding MUST support task invocation messages that
+include a reference to the parent ECT `jti`.  This allows the
+receiving agent to link the new task into the ECT DAG.
+
+## Checkpoint and Rollback Signal Carriage
+
+A conforming binding MUST support conveying ATD rollback requests
+and results.  For HTTP-based bindings, the `/.well-known/atd/rollback`
+endpoint MUST be accessible independent of the main protocol
+endpoint.
+
+## HITL Callback Registration
+
+A conforming binding MUST support HITL approval callback
+registration.  When a task involves a planned approval gate, the
+initiating agent MUST be able to register a callback URI that
+receives the `hitl:approval_granted` or `hitl:approval_denied`
+ECT when the human responds.  For HTTP bindings, this is a
+standard webhook registration.
+
+## Summary Table
+
+| Requirement | Minimum | Rationale |
+|-------------|---------|-----------|
+| ECT carriage | `Execution-Context` header or equivalent | DAG continuity |
+| Parent ECT reference | In task invocation | DAG linkage |
+| Rollback signal | `/.well-known/atd/rollback` accessible | Error recovery |
+| HITL callback | Webhook or equivalent | Async approval |
+| Transport security | TLS 1.3 | Integrity and confidentiality |
+{: #fig-requirements title="Protocol Binding Conformance Requirements"}
+
+# Translation Gateway Architecture {#translation}
+
+## Gateway as DAG Node
+
+Every translation hop produces an ECT:
+
+- `exec_act`: `"aepb:translate"`
+- `par`: the source agent's ECT
+
+~~~json
+{
+  "exec_act": "aepb:translate",
+  "par": ["source-agent-ect-uuid"],
+  "inp_hash": "sha256-of-source-message",
+  "out_hash": "sha256-of-translated-message",
+  "ext": {
+    "aepb.source_protocol": "a2a-v1",
+    "aepb.dest_protocol": "slim-v1",
+    "aepb.gateway_id": "spiffe://gw.example.com/aepb",
+    "aepb.translation_warnings": []
+  }
+}
+~~~
+{: #fig-translate-ect title="Translation ECT"}
+
+This creates a three-node subgraph in the ECT DAG:
+
+~~~
+Source ECT → Gateway ECT (aepb:translate) → Dest ECT
+~~~
+
+The `Execution-Context` HTTP header survives protocol translation:
+the gateway includes the translation ECT in the header of the
+forwarded request, maintaining DAG continuity.
+
+## Multi-Hop Translation
+
+When a single gateway cannot handle a translation pair, messages
+may traverse multiple gateways.  Each hop produces an
+`aepb:translate` ECT, all linked in the same DAG:
+
+~~~
+Agent-A ECT
+    │
+    ▼
+Gateway-1 ECT (a2a-v1 → mcp-v1)
+    │
+    ▼
+Gateway-2 ECT (mcp-v1 → slim-v1)
+    │
+    ▼
+Agent-B ECT
+~~~
+{: #fig-multihop title="Multi-Hop Translation DAG"}
+
+The maximum number of translation hops is configured as a
+node constraint:
+
+~~~json
+{
+  "constraints": {
+    "aepb.max_translation_hops": 2
+  }
+}
+~~~
+
+Agents receiving a message MUST count `aepb:translate` ECTs in
+the `par` ancestry and MUST reject messages exceeding
+`aepb.max_translation_hops`.  The default maximum is 3.
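
The hop check can be sketched as a walk over `par` links. This is an illustrative sketch only: the in-memory `ects` dictionary keyed by `jti`, the function names, and the `invoke` action value are hypothetical, and a real implementation would also have to resolve and verify each ECT. The count starts at the ECT carried with the incoming message and includes it.

```python
def count_translation_hops(jti: str, ects: dict[str, dict]) -> int:
    """Count distinct `aepb:translate` ECTs reachable from `jti` via `par`."""
    seen: set[str] = set()
    stack = [jti]
    hops = 0
    while stack:
        current = stack.pop()
        if current in seen or current not in ects:
            continue  # skip repeats (shared ancestors) and unknown parents
        seen.add(current)
        ect = ects[current]
        if ect.get("exec_act") == "aepb:translate":
            hops += 1
        stack.extend(ect.get("par", []))
    return hops


def accept_message(jti: str, ects: dict[str, dict], max_hops: int = 3) -> bool:
    """Reject when the ancestry holds more translate ECTs than allowed."""
    return count_translation_hops(jti, ects) <= max_hops


# Hypothetical ECT store mirroring a two-gateway chain.
ects = {
    "ect-a": {"exec_act": "invoke", "par": []},
    "ect-gw1": {"exec_act": "aepb:translate", "par": ["ect-a"]},
    "ect-gw2": {"exec_act": "aepb:translate", "par": ["ect-gw1"]},
}
```

With this store, a message carrying `ect-gw2` has two translation hops, so it passes a limit of 2 and fails a limit of 1.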
+
+## Gateway Requirements
+
+A gateway MUST:
+
+1. Serve a capability document at `/.well-known/aepb/gateway`
+   listing supported translation pairs.
+2. Accept messages via HTTP POST at its translate endpoint.
+3. Produce an `aepb:translate` ECT per {{translation}} for
+   every translation.
+4. Preserve message semantics.  Fields without a destination
+   equivalent are carried in an extension field or dropped with
+   a warning in `aepb.translation_warnings`.
+5. Require TLS 1.3 {{RFC8446}} for all connections.
+6. Implement per-source-agent rate limiting.
+7. Verify gateway ECTs at L2 or higher (signed JWT minimum).
+
+A gateway MUST NOT modify payload semantics beyond what is
+required for protocol translation.
+
+## Translation Failure Handling
+
+When a gateway fails to translate a message, it MUST emit an
+error ECT:
+
+~~~json
+{
+  "exec_act": "aepb:translate_error",
+  "par": ["source-agent-ect-uuid"],
+  "ext": {
+    "aepb.source_protocol": "a2a-v1",
+    "aepb.dest_protocol": "slim-v1",
+    "aepb.error": "semantic_loss",
+    "aepb.description": "Source message contains field 'action.stream' with no slim-v1 equivalent"
+  }
+}
+~~~
+{: #fig-translate-error title="Translation Error ECT"}
+
+Error values: `semantic_loss` (untranslatable field), `timeout`,
+`policy_violation` (exceeds hop limit), `internal_error`.
+
+On translation failure:
+
+- The ATD circuit breaker for the gateway agent SHOULD be
+  updated.
+- If `atd.cascade: false`, the calling agent returns
+  `no_translation_path` to its upstream caller.
+- If `atd.cascade: true`, the ATD rollback protocol applies
+  to the entire workflow subgraph.
+
+## Translation Policy
+
+Protocol constraints are ACP-DAG-HITL node constraints:
+
+~~~json
+{
+  "constraints": {
+    "aepb.allowed_source_protocols": ["a2a-v1", "mcp-v1"],
+    "aepb.allowed_dest_protocols": ["slim-v1"],
+    "aepb.max_translation_hops": 2
+  }
+}
+~~~
+{: #fig-policy title="Translation Policy"}
+
+# Agent Lifecycle Management {#lifecycle}
+
+## Lifecycle States
+
+An agent's `lifecycle.status` MUST be one of:
+
+active:
+: Normal operation.  Default state.
+
+deprecated:
+: Agent is functional but will be retired.
+  `deprecated_at` MUST be set.  The agent MUST include a
+  `Deprecation` header per {{RFC9745}} in all responses.
+  Clients SHOULD migrate to `successor` if provided.
+
+draining:
+: Agent is rejecting new workflows but completing in-progress
+  ones.  New delegation requests MUST be rejected with HTTP 503,
+  a `Retry-After` header, and, if `successor` is set, a `Location`
+  header pointing to it.
+
+retired:
+: Agent is offline.  Capability document MUST return HTTP 410
+  Gone with `Link: <successor>; rel="successor-version"`.
+
+## Lifecycle State Transitions
+
+~~~
+         deprecate              drain
+active ──────────► deprecated ────────► draining ──► retired
+  ▲                    │                    │
+  │                    │ immediate drain    │
+  └────────────────────┴────────────────────┘
+                   (operator discretion)
+~~~
+{: #fig-lifecycle-fsm title="Lifecycle State Machine"}
+
+All transitions MUST be recorded as ECTs:
+
+~~~json
+{
+  "exec_act": "aepb:lifecycle_change",
+  "ext": {
+    "aepb.agent_id": "spiffe://example.com/agent/pricing",
+    "aepb.from_state": "active",
+    "aepb.to_state": "deprecated",
+    "aepb.reason": "Replaced by pricing-v3"
+  }
+}
+~~~
+{: #fig-lifecycle-ect title="Lifecycle Change ECT"}
+
+## Versioning
+
+The `lifecycle.version` field uses semantic versioning.  Agents
+MUST increment the major version when breaking changes occur
+(incompatible protocol or behavior changes).
+
+Capability documents MUST include the version.  Agents SHOULD
+include version in ECT `ext` claims (`aepb.agent_version`) so
+the audit trail records which version performed each action.
+
+## Graceful Shutdown
+
+When an agent transitions to `draining`:
+
+1. Update capability document: `status: "draining"`,
+   set `sunset_at` timestamp.
+2. Reject new workflow delegations with HTTP 503.
+3. Complete all in-progress workflows.
+4. Emit a final ECT: `exec_act: "aepb:shutdown"`.
+5. Transition to `retired`.
+
+Agents SHOULD provide at least 24 hours between `deprecated`
+and `draining` to allow clients to discover the change via
+cached capability documents.
+
+## Successor Discovery
+
+When `successor` is set, it MUST be the URI of the replacement
+agent's capability document.  Clients SHOULD transparently
+redirect to the successor after verifying its capability
+document.  Clients MUST verify that the successor's assurance
+level is equal to or greater than the predecessor's.
+
+# Security Considerations
+
+## Capability Document Integrity
+
+Capability documents are served over HTTPS with TLS 1.3.
+Agents SHOULD verify TLS certificates before trusting capability
+documents.  For high-assurance deployments, capability documents
+SHOULD be signed as JWTs ({{RFC7519}}) so their integrity can
+be verified independently of transport security.
+
+## Gateway Trust
+
+Gateways are trusted intermediaries with access to message
+content.  For end-to-end confidentiality, agents MAY encrypt
+message payloads with a shared key established out of band.
+
+The ECT audit trail enables detection of:
+
+- Unauthorized gateways (unknown `aepb.gateway_id`).
+- Content tampering (`inp_hash`/`out_hash` mismatch).
+- Routing loops (repeated gateway IDs in DAG ancestry).
+
+Gateways MUST authenticate using WIMSE/SPIFFE identities at
+ECT assurance L2+.
+
+## Protocol Downgrade Attacks
+
+An attacker may attempt to force negotiation to a weaker
+protocol.  Mitigation:
+
+- Agents MUST enforce `aepb.min_protocol_security` constraint.
+- TLS 1.3 is the minimum transport; lower versions MUST be
+  rejected.
+- Protocol negotiation results MUST be logged as part of the
+  workflow ECT DAG.
+
+## Translation Amplification
+
+A single cross-protocol request could trigger a chain of N
+translations, each consuming resources.  Mitigation:
+
+- `aepb.max_translation_hops` (default 3) prevents unbounded
+  chains.
+- Per-source rate limiting at each gateway prevents a single
+  agent from flooding the translation infrastructure.
+
+## Lifecycle Denial of Service
+
+Transitioning an agent to `draining` or `retired` disrupts
+its callers.  Only the agent operator (verified via ACP-DAG-HITL
+identity binding) SHOULD be able to trigger lifecycle
+transitions.  Lifecycle-change ECTs MUST be signed at L2+.
+
+# IANA Considerations
+
+## AEPB Protocol Binding Registry
+
+This document requests the creation of the "AEPB Protocol Binding
+Registry" under IANA.  Registration policy: Specification Required.
+
+Required fields: Binding Identifier, Protocol Name, Specification
+Reference, Minimum ECT Assurance Level, HITL Callback Support.
+
+Initial entries:
+
+| Identifier | Protocol | Spec Reference | Min Assurance | HITL Callback |
+|------------|----------|---------------|--------------|---------------|
+| `a2a-v1` | A2A | (TBD) | L1 | Webhook |
+| `mcp-v1` | Model Context Protocol | (TBD) | L1 | Webhook |
+| `slim-v1` | SLIM | (TBD) | L1 | Webhook |
+| `uacp-v1` | uACP | (TBD) | L1 | Webhook |
+| `ainp-v1` | AINP | (TBD) | L1 | Webhook |
+{: #fig-registry title="Initial Protocol Binding Registry Entries"}
+
+## Well-Known URIs
+
+This document requests registration per {{RFC8615}}:
+
+| URI Suffix | Purpose |
+|------------|---------|
+| `aepb` | Agent capability document |
+| `aepb/gateway` | Translation gateway capability |
+{: #fig-wellknown title="Well-Known URI Registrations"}
+
+## `exec_act` Values
+
+This document requests registration in the AEM Ecosystem
+Extension Registry:
+
+| Value | Description | Reference |
+|-------|-------------|-----------|
+| `aepb:translate` | Protocol translation hop | This document |
+| `aepb:translate_error` | Translation failure | This document |
+| `aepb:shutdown` | Agent graceful shutdown complete | This document |
+| `aepb:lifecycle_change` | Lifecycle state transition | This document |
+{: #fig-iana-actions title="AEPB exec_act Registrations"}
+
+--- back
+
+# Acknowledgments
+{:numbered="false"}
+
+AEPB builds on ECT {{I-D.nennemann-wimse-ect}} for translation
+audit trails and ACP-DAG-HITL
+{{I-D.nennemann-agent-dag-hitl-safety}} for protocol policy.
+The lifecycle model is inspired by Kubernetes graceful shutdown
+semantics and the `Deprecation` header {{RFC9745}}.
--- a/workspace/drafts/new-drafts/draft-dats-dynamic-agent-trust-scoring-00.md
+++ b/workspace/drafts/new-drafts/draft-dats-dynamic-agent-trust-scoring-00.md
@@ -0,0 +1,360 @@
+---
+title: "Dynamic Agent Trust Scoring (DATS)"
+abbrev: "DATS"
+category: std
+docname: draft-dats-dynamic-agent-trust-scoring-00
+submissiontype: IETF
+number:
+date:
+v: 3
+area: "SEC"
+workgroup: "Security Dispatch"
+keyword:
+  - dynamic trust
+  - reputation
+  - agentic workflows
+  - execution context
+
+author:
+  -
+    fullname: Generated by IETF Draft Analyzer
+    organization: Independent
+    email: placeholder@example.com
+
+normative:
+  RFC7519:
+  RFC7515:
+  RFC7518:
+  RFC9110:
+  I-D.nennemann-wimse-ect:
+    title: "Execution Context Tokens for Distributed Agentic Workflows"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
+
+informative:
+  I-D.nennemann-agent-dag-hitl-safety:
+    title: "Agent Context Policy Token: DAG Delegation with Human Override"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
+
+--- abstract
+
+This document defines the Dynamic Agent Trust Scoring (DATS)
+protocol, a mechanism for AI agents to build, assess, and revoke
+trust relationships based on observed behavior over time.  Static
+authentication verifies identity but says nothing about reliability.
+DATS augments identity-based auth with a numeric trust score that
+adjusts dynamically based on interaction outcomes recorded in the
+ECT DAG.  Trust events are derived from ECT action outcomes rather
+than agent-local tracking, making trust computation auditable and
+tamper-evident.  Trust assertions are ECTs themselves, and trust
+thresholds integrate with ACP-DAG-HITL node constraints as
+enforceable policy.
+
+--- middle
+
+# Introduction
+
+The IETF has 98 drafts addressing agent identity and
+authentication, providing strong mechanisms for verifying who an
+agent is.  But identity alone is insufficient for long-running
+autonomous systems.  A properly authenticated agent may still
+produce bad results, violate expectations, or degrade over time.
+
+DATS adds a behavioral dimension to trust.  It answers: "I know
+who you are, but should I rely on you?"  The model is deliberately
+simple -- a single floating-point score between 0.0 and 1.0 per
+agent relationship -- because complex reputation systems tend to
+be gamed or ignored.
+
+By building on ECT {{I-D.nennemann-wimse-ect}}, DATS derives trust
+from the cryptographically signed record of actual interactions
+rather than agent-local counters that can be manipulated.  At L3,
+the audit ledger provides an immutable interaction history.
+
+The protocol is inspired by:
+
+- TCP congestion control: trust increases slowly (additive) and
+  decreases quickly (multiplicative) on failure.
+- TLS certificate transparency: trust assertions are logged.
+- Web of trust (PGP): trust propagates through intermediaries
+  with attenuation.
+
+# Conventions and Definitions
+
+{::boilerplate bcp14-tagged}
+
+Trust Score:
+: A floating-point value in \[0.0, 1.0\] representing one agent's
+  assessed reliability of another, based on observed ECT outcomes.
+
+Trust Event:
+: An observable interaction outcome that causes a trust score
+  adjustment.  Derived from ECTs in the workflow DAG.
+
+Trust Decay:
+: Automatic reduction of trust scores over inactivity, reflecting
+  the principle that trust requires ongoing evidence.
+
+Trust Assertion:
+: An ECT recording one agent's trust score for another,
+  transportable as a signed token.
+
+# Problem Statement
+
+Agent A delegates a task to Agent B.  After 100 successful
+interactions, Agent B starts returning incorrect results (model
+drift, adversarial manipulation, or degradation).  Agent A has no
+standard way to:
+
+1. Track B's reliability over time.
+2. Reduce B's privileges based on degraded performance.
+3. Share its experience with Agent C.
+4. Automatically revoke B's access when trust drops below
+   acceptable levels.
+
+Existing attestation drafts (STAMP, DAAP) provide cryptographic
+proof of specific actions but not ongoing behavioral assessment.
+The ECT DAG records what happened; DATS adds evaluation of
+whether what happened was good.
+
+# Trust Score Model {#trust-model}
+
+Each agent maintains a trust table: a mapping from peer agent IDs
+to trust scores.
+
+~~~json
+{
+  "spiffe://example.com/agent/b": {
+    "score": 0.82,
+    "interactions": 147,
+    "last_updated": "2026-03-01T11:30:00Z",
+    "last_event_ect": "550e8400-e29b-41d4-a716-446655440099"
+  }
+}
+~~~
+{: #fig-trust-table title="Trust Table Entry"}
+
+Initial trust for an unknown agent is deployment-configured.  A
+value of 0.5 is RECOMMENDED as a neutral starting point.
+Zero-trust deployments MAY use 0.1.
+
+Trust scores are updated using additive-increase,
+multiplicative-decrease (AIMD):
+
+On positive event:
+: `score = min(1.0, score + alpha)`
+
+On negative event:
+: `score = max(0.0, score * beta)`
+
+Default parameters: `alpha = 0.01`, `beta = 0.8`.
+
+This means trust builds slowly (50 successes from 0.5 to 1.0)
+but drops quickly (a single failure takes 0.82 to 0.66).  This
+asymmetry is intentional: in autonomous systems, the cost of
+trusting a bad agent exceeds the cost of slow trust building.
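
The two update rules can be written directly in Python. This is a minimal sketch using the default parameters given above; the function names are hypothetical, and the recovery loop is added only to illustrate the asymmetry.

```python
ALPHA = 0.01  # additive increase (default from this section)
BETA = 0.8    # multiplicative decrease (default from this section)


def on_positive(score: float, alpha: float = ALPHA) -> float:
    """Additive increase, capped at 1.0."""
    return min(1.0, score + alpha)


def on_negative(score: float, beta: float = BETA) -> float:
    """Multiplicative decrease, floored at 0.0."""
    return max(0.0, score * beta)


after_failure = on_negative(0.82)  # 0.82 * 0.8, roughly 0.656
after_success = on_positive(0.82)  # 0.82 + 0.01, roughly 0.83

# Count the successes needed to climb back to 0.82 after one failure.
score = after_failure
successes_to_recover = 0
while score < 0.82:
    score = on_positive(score)
    successes_to_recover += 1
```

With the defaults, one failure at 0.82 costs about 0.164, and recovering that loss takes 17 consecutive successes.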
+
+# Trust Events from ECT {#trust-events}
+
+Trust events are derived from ECTs in the workflow DAG rather than
+agent-local tracking.  This makes trust computation auditable.
+
+## Standard Trust Events
+
+| ECT condition | Event | Adjustment |
+|--------------|-------|------------|
+| `exec_act` completed, no error ECT follows | `task_success` | +1x alpha |
+| `exec_act` completed, partial result | `task_partial` | +0.5x alpha |
+| `aerr:error` ECT with `par` referencing agent | `task_failure` | 1x beta |
+| Timeout (no response ECT within threshold) | `task_timeout` | 1x beta |
+| `aerr:error` with `constraint_violation` | `policy_violation` | beta^2 |
+| ECT signature verification fails | `attestation_invalid` | beta^2 |
+| `aerr:rollback_request` targeting agent | `rollback_triggered` | 1x beta |
+{: #fig-events title="Trust Events Derived from ECT"}
+
+`beta^2` means the multiplicative decrease is applied twice
+(`score * beta * beta`), reflecting the severity of policy
+violations versus simple failures.
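
The event table can be sketched as a lookup that drives the AIMD update. This is an illustration only: the `ADJUSTMENTS` structure and the `apply_event` name are hypothetical, and deriving the event name from the ECT DAG is assumed to happen elsewhere.

```python
ALPHA, BETA = 0.01, 0.8  # defaults from the trust score model

# Event name -> (kind, factor). For "add", factor scales alpha;
# for "mul", factor is the number of times beta is applied.
ADJUSTMENTS = {
    "task_success": ("add", 1.0),
    "task_partial": ("add", 0.5),
    "task_failure": ("mul", 1),
    "task_timeout": ("mul", 1),
    "policy_violation": ("mul", 2),
    "attestation_invalid": ("mul", 2),
    "rollback_triggered": ("mul", 1),
}


def apply_event(score: float, event: str) -> float:
    """Return the new trust score after one trust event."""
    kind, factor = ADJUSTMENTS[event]  # unknown events raise KeyError
    if kind == "add":
        return min(1.0, score + factor * ALPHA)
    return max(0.0, score * BETA ** factor)


after_violation = apply_event(0.82, "policy_violation")  # 0.82 * 0.8 * 0.8
after_partial = apply_event(0.5, "task_partial")         # 0.5 + 0.005
```

A policy violation at 0.82 leaves roughly 0.52, compared with roughly 0.66 for an ordinary failure.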
+
+## Trust Decay
+
+If no interaction (no ECT involving the peer) occurs for a
+configurable period (default: 7 days), the trust score decays:
+
+`score = max(initial_default, score - decay_rate)`
+
+Default `decay_rate`: 0.01 per day.
+
+Agents MUST record all trust events in a local audit log.  At L3,
+the trust events are derivable from the audit ledger, providing
+independent verifiability.
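
A possible reading of the decay rule is sketched below. Several points are assumptions rather than statements of the specification: that decay accrues per full day of inactivity beyond the 7-day period, and that scores already at or below `initial_default` are left unchanged. The second assumption matters because, applied literally, `max(initial_default, ...)` would raise a below-default score up to the default. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def decayed_score(
    score: float,
    last_interaction: datetime,
    now: datetime,
    initial_default: float = 0.5,
    decay_rate: float = 0.01,              # per day (document default)
    grace: timedelta = timedelta(days=7),  # inactivity period (document default)
) -> float:
    """Decay a score toward initial_default after prolonged inactivity."""
    idle = now - last_interaction
    if idle <= grace or score <= initial_default:
        return score  # assumption: decay never raises a low score
    days_over = (idle - grace).days  # assumption: whole days past the period
    return max(initial_default, score - decay_rate * days_over)


last_seen = datetime(2026, 3, 1, 11, 30, tzinfo=timezone.utc)
after_17_days = decayed_score(0.82, last_seen, last_seen + timedelta(days=17))
```

In this reading, 17 idle days means 10 days of decay, taking 0.82 to roughly 0.72, while a long-idle peer settles at the default of 0.5.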
+
+# Trust Assertions as ECT {#trust-assertions}
+
+Agent A shares its trust assessment of Agent B with Agent C via a
+trust assertion ECT:
+
+- `exec_act`: `"dats:assertion"`
+- `par`: empty (trust assertions are standalone) or referencing
+  the most recent interaction ECT
+
+~~~json
+{
+  "exec_act": "dats:assertion",
+  "iss": "spiffe://example.com/agent/a",
+  "ext": {
+    "dats.subject": "spiffe://example.com/agent/b",
+    "dats.score": 0.82,
+    "dats.interactions": 147,
+    "dats.confidence": "high",
+    "dats.hops": 0
+  }
+}
+~~~
+{: #fig-assertion title="Trust Assertion ECT"}
+
+`dats.confidence` is based on interaction count: `low` (<10),
+`medium` (10-99), `high` (100+).
+
+## Trust Propagation with Attenuation
+
+When Agent C receives a trust assertion from Agent A about Agent B,
+it MAY incorporate it:
+
+~~~
+c_score_for_b = max(c_score_for_b,
+                    a_score_for_b * trust_of_a * attenuation)
+~~~
+
+Where:
+
+- `a_score_for_b` = A's reported score for B (0.82)
+- `trust_of_a` = C's own trust score for A
+- `attenuation` = constant (default: 0.5)
+
+Trust assertions are advisory.  An agent's own direct observations
+always take precedence over propagated trust.
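
The attenuation formula is small enough to check by hand. The sketch below restates it in Python; the function name is hypothetical, and the value 0.9 for `trust_of_a` is an example input, not a value from this document.

```python
def incorporate_assertion(
    c_score_for_b: float,
    a_score_for_b: float,
    trust_of_a: float,
    attenuation: float = 0.5,  # document default
) -> float:
    """Apply the propagation formula; the result never lowers C's own score."""
    return max(c_score_for_b, a_score_for_b * trust_of_a * attenuation)


raised = incorporate_assertion(0.3, 0.82, 0.9)     # 0.82 * 0.9 * 0.5 = 0.369
unchanged = incorporate_assertion(0.5, 0.82, 0.9)  # own score is already higher
ceiling = incorporate_assertion(0.0, 1.0, 1.0)     # best case for propagation
```

With the default attenuation of 0.5, a propagated value can never exceed 0.5, so an assertion alone cannot lift a peer above the recommended neutral starting score.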
+
+## Anti-Gaming Measures
+
+To prevent trust laundering (colluding agents inflating each
+other's scores):
+
+- Agents SHOULD limit propagation depth to 1 hop by default.
+- The `dats.hops` field tracks depth; agents MUST NOT propagate
+  assertions where `dats.hops` exceeds their configured maximum.
+- At L3, trust assertions are recorded in the audit ledger,
+  making collusion patterns detectable through graph analysis.
+
+# Trust Thresholds as Policy {#trust-policy}
+
+## Threshold-Based Access
+
+Agents SHOULD define trust thresholds per action type:
+
+~~~json
+{
+  "thresholds": {
+    "read_data": 0.3,
+    "execute_task": 0.5,
+    "modify_config": 0.7,
+    "delegate_auth": 0.9
+  }
+}
+~~~
+{: #fig-thresholds title="Trust Thresholds"}
+
+When a request arrives, the agent checks the requester's trust
+score against the threshold.  If below threshold, the request is
+denied with HTTP 403 and error `trust_insufficient`.
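
The threshold check can be sketched as follows. This is an illustration, not a defined API: the `authorize` function, its return shape, and the use of status 200 for an allowed request are assumptions. A score equal to the threshold is allowed, since only scores below it are denied.

```python
THRESHOLDS = {
    "read_data": 0.3,
    "execute_task": 0.5,
    "modify_config": 0.7,
    "delegate_auth": 0.9,
}


def authorize(
    action: str, requester_score: float, thresholds: dict[str, float] = THRESHOLDS
) -> tuple[int, dict]:
    """Return (HTTP status, body) for a request of the given action type."""
    required = thresholds[action]  # unlisted actions raise KeyError in this sketch
    if requester_score < required:
        return 403, {"error": "trust_insufficient"}
    return 200, {}


denied = authorize("modify_config", 0.54)
allowed = authorize("modify_config", 0.7)
```

A requester at 0.54 is refused `modify_config` but would still pass `execute_task` and `read_data`.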
+
+## Integration with ACP-DAG-HITL
+
+Trust thresholds can be expressed as DAG node constraints
+{{I-D.nennemann-agent-dag-hitl-safety}}:
+
+~~~json
+{
+  "dag": {
+    "nodes": [{
+      "id": "n-critical-action",
+      "type": "modify_config",
+      "agent": "spiffe://example.com/agent/b",
+      "constraints": {
+        "dats.min_trust": 0.7,
+        "dats.min_confidence": "medium"
+      }
+    }]
+  },
+  "hitl": {
+    "rules": [{
+      "id": "r-low-trust",
+      "trigger": {
+        "kind": "confidence_below",
+        "op": "lt",
+        "value": 0.5,
+        "input_ref": "dats.peer_trust_score"
+      },
+      "required_role": "operator:security",
+      "action": "escalate",
+      "allow_override": true,
+      "override_action": "continue"
+    }]
+  }
+}
+~~~
+{: #fig-policy title="Trust Policy as DAG Constraints + HITL"}
+
+This means: if the delegated agent's trust score drops below 0.5,
+escalate to a human security operator before proceeding.
+
+## Automatic Revocation
+
+When an agent's trust score drops below a configured floor
+(default: 0.2), the trusting agent SHOULD:
+
+1. Revoke all outstanding delegations to that agent.
+2. Produce a revocation ECT (`exec_act`: `"dats:revoke"`).
+3. Emit an error ECT per AERR if the agent was part of an
+   active workflow.
+
+# Security Considerations
+
+Trust scores are sensitive metadata.  Agents MUST NOT expose
+their full trust tables to peers.  Only pairwise trust assertions
+({{trust-assertions}}) should be shared, and only intentionally.
+
+Trust assertion ECTs MUST be signed at L2 or L3.  Agents MUST
+verify signatures before processing.
+
+Score manipulation: a malicious agent could behave well to build
+trust, then exploit it.  Mitigation: `policy_violation` events
+apply double penalties, and deployments SHOULD set high thresholds
+for critical actions.
+
+Sybil attacks: an attacker creates many agents for fake positive
+assertions.  Mitigation: attenuation ({{trust-assertions}}),
+hop limits, and requiring agents to be registered in a trusted
+directory before accepting assertions.
+
+All trust-related communications MUST use TLS 1.3.
+
+# IANA Considerations
+
+This document requests the following IANA registrations:
+
+1. Registration of `exec_act` values `dats:assertion` and
+   `dats:revoke` in a future ECT action type registry.
+
+2. A "DATS Trust Event Type" registry under Specification Required
+   policy.  Initial entries: `task_success`, `task_partial`,
+   `task_failure`, `task_timeout`, `policy_violation`,
+   `attestation_invalid`, `rollback_triggered`.
+
+--- back
+
+# Acknowledgments
+{:numbered="false"}
+
+This document builds on the Execution Context Token specification
+{{I-D.nennemann-wimse-ect}} for interaction evidence and the
+Agent Context Policy Token {{I-D.nennemann-agent-dag-hitl-safety}}
+for trust threshold policy enforcement.
--- a/workspace/drafts/new-drafts/draft-dats-dynamic-agent-trust-scoring-00.txt
+++ b/workspace/drafts/new-drafts/draft-dats-dynamic-agent-trust-scoring-00.txt
@@ -0,0 +1,298 @@
+Internet-Draft                                           AI/Agent WG
+Intended status: Standards Track                          March 2026
+Expires: September 15, 2026
+
+
+         Dynamic Agent Trust Scoring (DATS)
+         draft-dats-dynamic-agent-trust-scoring-00
+
+Abstract
+
+   This document defines the Dynamic Agent Trust Scoring (DATS)
+   protocol, a mechanism for AI agents to build, assess, and
+   revoke trust relationships based on observed behavior over
+   time.  Static authentication (certificates, API keys) verifies
+   identity but says nothing about whether an agent is reliable,
+   accurate, or well-behaved.  DATS augments identity-based auth
+   with a numeric trust score that adjusts dynamically based on
+   interaction outcomes.  The protocol defines trust score
+   computation, propagation between agents, decay over inactivity,
+   and threshold-based access policies.  DATS is intentionally
+   simple: a single score per agent-pair, standard adjustment
+   events, and a JWT-based transport for trust assertions.
+
+Status of This Memo
+
+   This Internet-Draft is submitted in full conformance with the
+   provisions of BCP 78 and BCP 79.
+
+   This document is intended to have Standards Track status.
+   Distribution of this memo is unlimited.
+
+Table of Contents
+
+   1.  Introduction
+   2.  Terminology
+   3.  Problem Statement
+   4.  Trust Score Model
+   5.  Trust Events and Adjustments
+   6.  Trust Propagation
+   7.  Threshold-Based Access Policies
+   8.  Security Considerations
+   9.  IANA Considerations
+
+1.  Introduction
+
+   The IETF has 98 drafts addressing agent identity and
+   authentication, providing strong mechanisms for verifying who
+   an agent is.  But identity alone is insufficient for long-
+   running autonomous systems.  A properly authenticated agent
+   may still produce bad results, violate expectations, or
+   degrade over time.  Static certificates cannot capture this.
+
+   DATS adds a behavioral dimension to agent trust.  It answers
+   the question: "I know who you are, but should I rely on you?"
+   The model is deliberately simple — a single floating-point
+   score between 0.0 and 1.0 per agent relationship — because
+   complex reputation systems tend to be gamed or ignored.
+
+   The protocol is inspired by:
+   - TCP congestion control: trust increases slowly (additive)
+     and decreases quickly (multiplicative) on failure.
+   - TLS certificate transparency: trust assertions are logged
+     for auditability.
+   - Web of trust (PGP): trust can propagate through
+     intermediaries, with attenuation.
+
+2.  Terminology
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
+   "OPTIONAL" in this document are to be interpreted as described
+   in RFC 2119 [RFC2119].
+
+   Trust Score: A floating-point value in [0.0, 1.0] representing
+   one agent's assessed reliability of another, based on observed
+   interaction outcomes.
+
+   Trust Event: An observable interaction outcome that causes a
+   trust score adjustment.  Events are either positive (task
+   completed successfully) or negative (task failed, timeout,
+   policy violation).
+
+   Trust Decay: The automatic reduction of trust scores over
+   periods of inactivity, reflecting the principle that trust
+   requires ongoing evidence.
+
+   Trust Assertion: A signed statement by one agent about another
+   agent's trust score, transportable as a JWT claim.
+
+3.  Problem Statement
+
+   Agent A delegates a task to Agent B.  Agent B completes it
+   correctly.  Agent A delegates again.  After 100 successful
+   interactions, Agent B starts returning subtly incorrect results
+   (model drift, adversarial manipulation, or simple degradation).
+   Agent A has no standard way to:
+
+   1. Track B's reliability over time.
+   2. Reduce B's privileges based on degraded performance.
+   3. Share its experience with Agent C, who is considering
+      delegating to Agent B.
+   4. Automatically revoke B's access when trust drops below
+      acceptable levels.
+
+   Existing attestation drafts (STAMP, DAAP) provide
+   cryptographic proof of specific actions but not ongoing
+   behavioral assessment.  DATS fills this gap.
+
+4.  Trust Score Model
+
+   Each agent maintains a trust table: a mapping from peer agent
+   IDs to trust scores.
+
+      {
+        "urn:uuid:agent-b": {
+          "score": 0.82,
+          "interactions": 147,
+          "last_updated": "2026-03-01T11:30:00Z",
+          "last_event": "task_success"
+        }
+      }
+
+   Initial trust for an unknown agent is a deployment-configured
+   default.  A value of 0.5 is RECOMMENDED as a neutral starting
+   point, but deployments MAY use lower values (e.g., 0.1) for
+   zero-trust environments.
+
+   Trust scores are updated using an additive-increase,
+   multiplicative-decrease (AIMD) algorithm:
+
+   On positive event:
+      score = min(1.0, score + alpha)
+
+   On negative event:
+      score = max(0.0, score * beta)
+
+   Default parameters: alpha = 0.01, beta = 0.8.
+
+   This means trust builds slowly (50 successes to go from 0.5
+   to 1.0) but drops quickly (a single failure reduces a 0.82
+   score to 0.66).  This asymmetry is intentional: in autonomous
+   systems, the cost of trusting a bad agent exceeds the cost of
+   being slow to trust a good one.
+
+   Agents MAY tune alpha and beta per relationship or per action
+   type, but MUST use the AIMD structure.
+
+5.  Trust Events and Adjustments
+
+   The following standard trust events are defined:
+
+   | Event                | Direction | Default Weight |
+   |----------------------|-----------|----------------|
+   | task_success         | positive  | 1x alpha       |
+   | task_partial_success | positive  | 0.5x alpha     |
+   | task_failure         | negative  | 1x beta        |
+   | task_timeout         | negative  | 1x beta        |
+   | policy_violation     | negative  | applied twice  |
+   | attestation_invalid  | negative  | applied twice  |
+   | rollback_triggered   | negative  | 1x beta        |
+
+   "applied twice" means the multiplicative decrease is applied
+   two times in succession (score * beta * beta), reflecting the
+   severity of policy violations versus simple failures.
+
+   Trust decay: if no interaction occurs for a configurable
+   period (default: 7 days), the trust score decays:
+
+      score = max(initial_default, score - decay_rate)
+
+   Default decay_rate: 0.01 per day.  This ensures that stale
+   trust relationships gradually return to the default level
+   rather than persisting indefinitely.
+
+   Agents MUST record all trust events in a local audit log.
+
+6.  Trust Propagation
+
+   Agent A may share its trust assessment of Agent B with Agent C
+   through a signed trust assertion.  The assertion is a JWT
+   (RFC 7519) with the following claims:
+
+      {
+        "iss": "urn:uuid:agent-a",
+        "sub": "urn:uuid:agent-b",
+        "iat": 1709294400,
+        "exp": 1709380800,
+        "dats_score": 0.82,
+        "dats_interactions": 147,
+        "dats_confidence": "high"
+      }
+
+   "dats_confidence" is based on interaction count: "low" (<10),
+   "medium" (10-99), "high" (100+).
+
+   When Agent C receives this assertion, it MAY incorporate it
+   into its own trust score for Agent B using attenuation:
+
+      c_score_for_b = max(c_score_for_b,
+                          a_score_for_b * trust_of_a * attenuation)
+
+   Where:
+   - a_score_for_b is Agent A's reported score for B (0.82)
+   - trust_of_a is Agent C's trust score for Agent A
+   - attenuation is a constant (default: 0.5) preventing
+     unbounded trust propagation
+
+   Trust assertions are advisory.  Agents MUST NOT blindly adopt
+   propagated scores.  An agent's own direct observations always
+   take precedence over propagated trust.
+
+   To prevent trust laundering (colluding agents inflating each
+   other's scores), agents SHOULD limit propagation depth to 1
+   hop by default.  The "dats_hops" claim, which the example
+   above omits, tracks propagation depth; agents MUST NOT
+   propagate assertions where dats_hops exceeds their configured
+   maximum.
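+
+   The attenuation formula and hop limit could be combined as in
+   the following Python sketch.  The function and parameter names
+   are hypothetical, and ignoring an assertion that is over the
+   hop limit on receipt is an assumption; the text above only
+   forbids propagating it further.
+
```python
def propagated_score(own_score: float, asserted_score: float,
                     trust_of_asserter: float, hops: int, *,
                     attenuation: float = 0.5, max_hops: int = 1) -> float:
    """Fold a received trust assertion into the local score for the subject."""
    if hops > max_hops:
        # Assumption: an assertion over the hop limit is simply ignored.
        return own_score
    # max() means a propagated score can raise, but never lower, the local one.
    return max(own_score, asserted_score * trust_of_asserter * attenuation)
```
+
+   With the example assertion (0.82) and a trust of 0.9 in Agent
+   A, Agent C would arrive at 0.82 * 0.9 * 0.5 = 0.369 for Agent
+   B, provided its own score for B was lower than that.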
+
+7.  Threshold-Based Access Policies
+
+   Agents SHOULD define trust thresholds for different action
+   categories:
+
+      {
+        "thresholds": {
+          "read_data":      0.3,
+          "execute_task":   0.5,
+          "modify_config":  0.7,
+          "delegate_auth":  0.9
+        }
+      }
+
+   When an agent requests an action, the serving agent checks the
+   requester's trust score against the threshold for that action
+   type.  If the score is below the threshold, the request is
+   denied with an HTTP 403 (Forbidden) response including a
+   DATS-specific error:
+
+      {
+        "error": "trust_insufficient",
+        "required_score": 0.7,
+        "current_score": 0.54,
+        "action": "modify_config"
+      }
+
+   The response SHOULD NOT reveal the exact current score in
+   production deployments to prevent score probing.  Instead, it
+   MAY return only the "trust_insufficient" error.
+
+   Automatic revocation: when an agent's trust score drops below
+   a configured floor (default: 0.2), the trusting agent SHOULD
+   revoke all outstanding delegations and emit a trust revocation
+   event.  This provides automatic containment of agents that
+   have become unreliable.
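+
+   A minimal Python sketch of the threshold check follows.  The
+   authorize() helper, its reveal_scores flag, and the 200 status
+   for permitted requests are assumptions made for illustration;
+   only the threshold values and the error body come from this
+   section.
+
```python
THRESHOLDS = {
    "read_data": 0.3,
    "execute_task": 0.5,
    "modify_config": 0.7,
    "delegate_auth": 0.9,
}

def authorize(action: str, score: float, *, reveal_scores: bool = False):
    """Return (http_status, body) for a request from a peer with this score."""
    required = THRESHOLDS[action]
    if score >= required:
        return 200, {}
    body = {"error": "trust_insufficient"}
    if reveal_scores:
        # Debug aid only: the text advises against exposing scores in production.
        body.update(required_score=required, current_score=score, action=action)
    return 403, body
```
+
+   By default the sketch returns only the bare error, which
+   matches the advice above on avoiding score probing.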
+
+8.  Security Considerations
+
+   Trust scores are sensitive metadata.  Agents MUST NOT expose
+   their full trust tables to peers.  Only pairwise trust
+   assertions (Section 6) should be shared, and only
+   intentionally.
+
+   Trust assertion JWTs MUST be signed using a registered JOSE
+   algorithm, e.g., ES256 (RFC 7518) or EdDSA (RFC 8037).  Agents
+   MUST verify signatures before processing trust assertions.
+
+   Score manipulation attacks: a malicious agent could
+   intentionally behave well for many interactions to build trust,
+   then exploit high trust for a damaging action.  Mitigation:
+   policy_violation events apply double penalties, and
+   deployments SHOULD set trust thresholds high for critical
+   actions regardless of accumulated trust.
+
+   Sybil attacks: an attacker could create many agents to
+   generate fake positive trust assertions.  Mitigation: agents
+   SHOULD weight propagated trust by their own direct trust in
+   the asserting agent (Section 6 attenuation) and SHOULD
+   require agents to be registered in a trusted directory (e.g.,
+   ANS) before accepting trust assertions.
+
+   All trust-related communications MUST use TLS 1.3 [RFC8446].
+
+9.  IANA Considerations
+
+   This document requests IANA establish the following:
+
+   1. Registration of JWT claims "dats_score",
+      "dats_interactions", "dats_confidence", and "dats_hops"
+      in the JSON Web Token Claims registry per RFC 7519.
+
+   2. A "DATS Trust Event Type" registry under Specification
+      Required policy.  Initial entries: "task_success",
+      "task_partial_success", "task_failure", "task_timeout",
+      "policy_violation", "attestation_invalid",
+      "rollback_triggered".
+
+Author's Address
+
+   Generated by IETF Draft Analyzer
+   2026-03-01
--- a/workspace/drafts/new-drafts/draft-e-apae-assurance-profiles-00.md
+++ b/workspace/drafts/new-drafts/draft-e-apae-assurance-profiles-00.md
@@ -0,0 +1,384 @@
+---
+title: "Assurance Profiles for Agent Ecosystems (APAE)"
+abbrev: "APAE"
+category: info
+docname: draft-apae-assurance-profiles-00
+submissiontype: IETF
+number:
+date:
+v: 3
+area: "SEC"
+workgroup: "Security Dispatch"
+keyword:
+  - dynamic trust
+  - assurance
+  - behavior verification
+  - data provenance
+
+author:
+  -
+    fullname: TBD
+    organization: Independent
+    email: placeholder@example.com
+
+normative:
+  RFC2119:
+  RFC8174:
+  RFC7519:
+  RFC7518:
+  I-D.nennemann-wimse-ect:
+    title: "Execution Context Tokens for Distributed Agentic Workflows"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
+  I-D.nennemann-agent-dag-hitl-safety:
+    title: "Agent Context Policy Token: DAG Delegation with Human Override"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
+
+informative:
+
+--- abstract
+
+This document defines Assurance Profiles for Agent Ecosystems
+(APAE): dynamic trust scoring, behavior verification, data
+provenance, and graduated assurance profiles that allow the same
+agent ecosystem to operate in relaxed (dev/K8s) and regulated
+(healthcare, finance) environments.  Trust events are derived from
+ECT outcomes.  Trust assertions are ECTs.  Behavior verification
+references ECT claims.  Provenance chains are implicit in the ECT
+DAG.  Assurance profiles select which combination of these
+mechanisms is required for a given deployment, mapping to ECT
+assurance levels L1/L2/L3.
+
+--- middle
+
+# Introduction
+
+Identity verifies who an agent is.  ECT records what an agent did.
+But neither answers: should I rely on this agent?  Is it doing what
+it promised?  Can I trace where this data came from?
+
+APAE adds three capabilities to the ecosystem:
+
+1. **Dynamic trust scoring** — behavioral reputation that adjusts
+   based on interaction outcomes (AIMD model).
+2. **Behavior verification** — checking agent actions against
+   declared specifications.
+3. **Data provenance** — tracing data lineage through the DAG.
+
+These three capabilities are bundled into assurance profiles
+(relaxed, standard, regulated) that map to ECT assurance levels,
+so the same ecosystem works from a dev cluster to a hospital.
+
+# Conventions and Definitions
+
+{::boilerplate bcp14-tagged}
+
+Trust Score:
+: A floating-point value in \[0.0, 1.0\] representing one agent's
+  assessed reliability of another.
+
+Trust Event:
+: An interaction outcome that causes a trust score adjustment.
+  Derived from ECTs.
+
+Behavior Specification:
+: A machine-readable declaration of permitted agent actions and
+  constraints.
+
+Provenance Chain:
+: The sequence of ECT nodes recording how a piece of data was
+  produced, transformed, and consumed.
+
+Assurance Profile:
+: A named configuration selecting which trust, verification, and
+  provenance mechanisms are required.
+
+# Dynamic Trust Scoring {#trust}
+
+## Trust Score Model
+
+Each agent maintains a trust table: peer agent IDs mapped to
+trust scores.  Initial trust for unknown agents is deployment-
+configured (RECOMMENDED: 0.5; zero-trust: 0.1).
+
+Scores update using additive-increase, multiplicative-decrease
+(AIMD):
+
+- Positive event: `score = min(1.0, score + alpha)`
+- Negative event: `score = max(0.0, score * beta)`
+
+Defaults: `alpha = 0.01`, `beta = 0.8`.
+
+Trust builds slowly (50 successes: 0.5 → 1.0) and drops fast
+(one failure: 0.82 → 0.66).
+
+## Trust Events from ECT {#trust-events}
+
+Trust events are derived from ECTs rather than agent-local
+counters, making trust computation auditable:
+
+| ECT condition | Event | Adjustment |
+|--------------|-------|------------|
+| Completed, no error follows | `task_success` | `score + alpha` |
+| Completed, partial result | `task_partial` | `score + 0.5 * alpha` |
+| `atd:error` referencing agent | `task_failure` | `score * beta` |
+| No response within threshold | `task_timeout` | `score * beta` |
+| `atd:error` with `constraint_violation` | `policy_violation` | `score * beta^2` |
+| ECT signature verification fails | `attestation_invalid` | `score * beta^2` |
+| `atd:rollback_request` targeting agent | `rollback_triggered` | `score * beta` |
+{: #fig-events title="Trust Events from ECT"}
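+
+The table and the AIMD rules can be combined into one update
+function.  The following Python sketch is illustrative only: the
+`EVENTS` mapping and `apply_event` name are hypothetical, and
+deriving the event name from an ECT is left out.
+
```python
ALPHA, BETA = 0.01, 0.8  # defaults from the Trust Score Model section

# Event name -> (kind, factor), transcribed from the table above.
EVENTS = {
    "task_success":        ("add", 1.0),
    "task_partial":        ("add", 0.5),
    "task_failure":        ("mul", 1),
    "task_timeout":        ("mul", 1),
    "policy_violation":    ("mul", 2),
    "attestation_invalid": ("mul", 2),
    "rollback_triggered":  ("mul", 1),
}

def apply_event(score: float, event: str,
                alpha: float = ALPHA, beta: float = BETA) -> float:
    """Additive increase for positive events, multiplicative decrease otherwise."""
    kind, n = EVENTS[event]
    if kind == "add":
        return min(1.0, score + n * alpha)
    return max(0.0, score * beta ** n)
```
+
+With the defaults, a `task_failure` takes 0.82 to 0.656, while a
+`policy_violation` takes it to 0.82 * 0.8 * 0.8 = 0.5248.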
+
+## Trust Decay
+
+If no interaction occurs for a configurable period (default:
+7 days): `score = max(initial_default, score - 0.01/day)`.
+
+## Trust Assertions as ECT {#trust-assertions}
+
+Agent A shares its trust assessment via a trust assertion ECT:
+
+- `exec_act`: `"apae:trust_assertion"`
+
+~~~json
+{
+  "exec_act": "apae:trust_assertion",
+  "ext": {
+    "apae.subject": "spiffe://example.com/agent/b",
+    "apae.trust_score": 0.82,
+    "apae.interactions": 147,
+    "apae.confidence": "high",
+    "apae.hops": 0
+  }
+}
+~~~
+{: #fig-assertion title="Trust Assertion ECT"}
+
+Confidence: `low` (<10 interactions), `medium` (10-99),
+`high` (100+).
+
+## Trust Propagation
+
+When Agent C receives A's assertion about B:
+
+~~~
+c_score_for_b = max(c_score_for_b,
+                    a_score * trust_of_a * attenuation)
+~~~
+
+Default `attenuation`: 0.5.  Direct observations always take
+precedence.  `apae.hops` tracks propagation depth; agents MUST NOT
+propagate beyond their configured maximum (default: 1).
+
+## Trust Thresholds as Policy
+
+Trust thresholds are ACP-DAG-HITL node constraints:
+
+~~~json
+{
+  "constraints": {
+    "apae.min_trust": 0.7,
+    "apae.min_confidence": "medium"
+  }
+}
+~~~
+{: #fig-threshold title="Trust Threshold as Node Constraint"}
+
+Requests from agents below threshold are denied with HTTP 403.
+
+Low trust can trigger HITL escalation:
+
+~~~json
+{
+  "id": "r-low-trust",
+  "trigger": {
+    "kind": "confidence_below",
+    "op": "lt",
+    "value": 0.5,
+    "input_ref": "apae.peer_trust_score"
+  },
+  "required_role": "operator:security",
+  "action": "escalate",
+  "allow_override": true,
+  "override_action": "continue"
+}
+~~~
+{: #fig-trust-hitl title="HITL Rule for Low Trust"}
+
+## Automatic Revocation
+
+When trust drops below a floor (default: 0.2), the trusting agent
+SHOULD revoke delegations and emit:
+`exec_act: "apae:trust_revoke"`.
+
+# Behavior Verification {#behavior}
+
+## Behavior Specifications
+
+A behavior specification declares what an agent is permitted to do.
+Specifications are JSON documents referencing ECT claims:
+
+~~~json
+{
+  "spec_version": "1.0",
+  "agent_id": "spiffe://example.com/agent/firewall",
+  "allowed_actions": ["update_rules", "read_config", "report"],
+  "constraints": {
+    "max_actions_per_minute": 60,
+    "forbidden_targets": ["core-router-*"],
+    "require_checkpoint_before": ["update_rules"]
+  },
+  "verification_frequency": "continuous"
+}
+~~~
+{: #fig-spec title="Behavior Specification"}
+
+## Verification Against ECT Stream
+
+A verifier monitors the agent's ECT stream and checks:
+
+1. `exec_act` values are in `allowed_actions`.
+2. Action rate does not exceed `max_actions_per_minute` (computed
+   from `iat` timestamps).
+3. `atd:checkpoint` ECTs precede `update_rules` ECTs (from
+   `require_checkpoint_before`).
+4. Targets in `ext` claims do not match `forbidden_targets`.
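+
+The four checks might be sketched in Python as below.  Several
+details are assumptions rather than anything this document
+defines: ECTs are plain dicts sorted by `iat` (in seconds), the
+target lives under a hypothetical `ext["target"]` key, patterns
+are shell-style globs, `atd:checkpoint` ECTs are exempt from the
+allow-list, and each guarded action needs its own checkpoint.
+
```python
from fnmatch import fnmatchcase

def verify(spec: dict, ects: list) -> list:
    """Return (rule, ect_index) violations for a stream of ECT dicts."""
    c = spec["constraints"]
    violations = []
    checkpointed = False
    for i, ect in enumerate(ects):
        act = ect["exec_act"]
        if act == "atd:checkpoint":
            # Assumption: checkpoints are not subject to allowed_actions.
            checkpointed = True
            continue
        if act not in spec["allowed_actions"]:
            violations.append(("allowed_actions", i))
        # ECTs (checkpoints included) in the 60 seconds ending at this one.
        recent = [e for e in ects[: i + 1] if ect["iat"] - e["iat"] < 60]
        if len(recent) > c["max_actions_per_minute"]:
            violations.append(("max_actions_per_minute", i))
        if act in c["require_checkpoint_before"]:
            if not checkpointed:
                violations.append(("require_checkpoint_before", i))
            checkpointed = False  # the next guarded action needs a new one
        target = ect.get("ext", {}).get("target", "")  # hypothetical claim
        if any(fnmatchcase(target, p) for p in c["forbidden_targets"]):
            violations.append(("forbidden_targets", i))
    return violations
```
+
+An empty result corresponds to a `passing` compliance status.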
+
+Verification results are ECTs:
+
+- `exec_act`: `"apae:compliance_check"`
+
+~~~json
+{
+  "exec_act": "apae:compliance_check",
+  "par": ["latest-agent-ect-uuid"],
+  "ext": {
+    "apae.compliance_status": "passing",
+    "apae.violations": [],
+    "apae.spec_version": "1.0",
+    "apae.window": "2026-03-01T12:00:00Z/PT1H"
+  }
+}
+~~~
+{: #fig-compliance title="Compliance Check ECT"}
+
+Violations trigger trust score decreases (`policy_violation` event)
+and MAY trigger HITL escalation.
+
+# Data Provenance {#provenance}
+
+## DAG as Provenance Chain
+
+The ECT DAG already encodes data provenance: each ECT's `par`
+references show which prior tasks produced its inputs.  The
+`inp_hash` and `out_hash` claims prove what was processed without
+revealing the data.
+
+For deployments requiring explicit provenance metadata, agents
+MAY include:
+
+~~~json
+{
+  "ext": {
+    "apae.data_source": "database:patients",
+    "apae.data_classification": "pii",
+    "apae.retention_days": 365,
+    "apae.transformations": ["anonymize", "aggregate"]
+  }
+}
+~~~
+{: #fig-provenance title="Provenance Extension Claims"}
+
+## Provenance Queries
+
+At L3, the audit ledger enables provenance queries:
+
+- "Which agents touched this data?" → walk `par` chain from
+  final ECT to roots.
+- "Was this data transformed?" → check `apae.transformations`
+  along the chain.
+- "Is provenance complete?" → verify all `par` references
+  resolve to ledger entries.
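+
+All three queries reduce to one walk over `par` references.  The
+Python sketch below assumes the ledger is available as a dict
+from ECT identifier to claims and that `iss` identifies the
+emitting agent; both are assumptions, as is the function name.
+
```python
def provenance(ledger: dict, final_id: str):
    """Walk `par` links from final_id towards the roots.

    Returns (agents, transformations, missing_ids); provenance is
    complete only when missing_ids is empty.
    """
    agents, transforms, missing = set(), [], set()
    seen, stack = set(), [final_id]
    while stack:
        ect_id = stack.pop()
        if ect_id in seen:
            continue  # a DAG node can be reached through several children
        seen.add(ect_id)
        ect = ledger.get(ect_id)
        if ect is None:
            missing.add(ect_id)  # `par` reference with no ledger entry
            continue
        agents.add(ect["iss"])
        transforms.extend(ect.get("ext", {}).get("apae.transformations", []))
        stack.extend(ect.get("par", []))
    return agents, transforms, missing
```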
+
+# Assurance Profiles {#profiles}
+
+An assurance profile is a named configuration that selects which
+mechanisms are required:
+
+| | Relaxed | Standard | Regulated |
+|---|---------|----------|-----------|
+| **ECT level** | L1 | L2 | L3 |
+| **Trust scoring** | Optional | RECOMMENDED | REQUIRED |
+| **Trust threshold enforcement** | Optional | RECOMMENDED | REQUIRED |
+| **Behavior verification** | Off | Periodic | Continuous |
+| **HITL approval gates** | Optional | Critical paths | Mandatory |
+| **Data provenance** | Off | Optional | REQUIRED |
+| **Checkpoint before consequential** | RECOMMENDED | REQUIRED | REQUIRED |
+| **Audit ledger** | Optional | Optional | REQUIRED |
+{: #fig-profiles title="Assurance Profiles"}
+
+Relaxed:
+: Internal dev/staging.  L1 ECTs.  Trust and verification
+  optional.  Useful for debugging and observability without
+  cryptographic overhead.
+
+Standard:
+: Production cross-org.  L2 ECTs.  Trust scoring and thresholds
+  recommended.  Periodic behavior verification.  HITL on critical
+  paths.
+
+Regulated:
+: Healthcare, finance, EU AI Act.  L3 ECTs with audit ledger.
+  Continuous behavior verification.  All trust mechanisms
+  required.  Full provenance chain.  Mandatory HITL gates.
+
+Profiles are declared in ACP-DAG-HITL node constraints:
+
+~~~json
+{
+  "constraints": {
+    "apae.assurance_profile": "regulated"
+  }
+}
+~~~
+{: #fig-profile-policy title="Profile as Node Constraint"}
+
+A single deployment MAY use different profiles for different
+workflows.
+
+# Security Considerations
+
+Trust scores are sensitive metadata.  Agents MUST NOT expose full
+trust tables.  Only pairwise assertions should be shared.
+
+Trust assertion ECTs MUST be signed at L2/L3.
+
+Score manipulation (building trust then exploiting it): mitigated
+by double penalties for `policy_violation` and high thresholds for
+critical actions.
+
+Sybil attacks (fake agents inflating trust): mitigated by
+attenuation ({{trust-assertions}}), hop limits, and requiring
+agents to be registered in a trusted directory.
+
+Behavior specifications could be tampered with.  Specifications
+SHOULD be signed and versioned.  Changes MUST be recorded as ECTs.
+
+All trust and verification communications MUST use TLS 1.3.
+
+# IANA Considerations
+
+This document requests registration of `exec_act` values:
+
+- `apae:trust_assertion`
+- `apae:trust_revoke`
+- `apae:compliance_check`
+
+--- back
+
+# Acknowledgments
+{:numbered="false"}
+
+APAE builds on ECT {{I-D.nennemann-wimse-ect}} for interaction
+evidence and audit, and ACP-DAG-HITL
+{{I-D.nennemann-agent-dag-hitl-safety}} for trust threshold and
+assurance profile policy enforcement.  The AIMD trust model is
+adapted from TCP congestion control.
--- a/workspace/drafts/new-drafts/draft-e-apae-assurance-profiles-01.md
+++ b/workspace/drafts/new-drafts/draft-e-apae-assurance-profiles-01.md
@@ -0,0 +1,695 @@
+---
+title: "Assurance Profiles for Agent Ecosystems (APAE)"
+abbrev: "APAE"
+category: info
+docname: draft-apae-assurance-profiles-01
+submissiontype: IETF
+number:
+date:
+v: 3
+area: "SEC"
+workgroup: "Security Dispatch"
+keyword:
+  - dynamic trust
+  - assurance
+  - behavior verification
+  - data provenance
+  - quarantine
+
+author:
+  -
+    fullname: TBD
+    organization: Independent
+    email: placeholder@example.com
+
+normative:
+  RFC2119:
+  RFC8174:
+  RFC7519:
+  RFC7518:
+  RFC9110:
+  I-D.nennemann-wimse-ect:
+    title: "Execution Context Tokens for Distributed Agentic Workflows"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
+  I-D.nennemann-agent-dag-hitl-safety:
+    title: "Agent Context Policy Token: DAG Delegation with Human Override"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
+
+informative:
+  RFC9334:
+
+--- abstract
+
+This document defines Assurance Profiles for Agent Ecosystems
+(APAE): dynamic trust scoring, behavior verification, data
+provenance, cross-domain trust, and graduated assurance profiles
+that allow the same agent ecosystem to operate in relaxed
+(dev/K8s) and regulated (healthcare, finance) environments.
+Trust events are derived from ECT outcomes.  Trust assertions are
+ECTs.  Behavior verification references ECT claims.  Provenance
+chains are implicit in the ECT DAG.  Assurance profiles select
+which combination of these mechanisms is required for a given
+deployment, mapping to ECT assurance levels L1/L2/L3.  Agents
+whose trust falls below a floor are quarantined via a protocol
+defined here.
+
+--- middle
+
+# Introduction
+
+Identity verifies who an agent is.  ECT records what an agent did.
+But neither answers: should I rely on this agent?  Is it doing what
+it promised?  Can I trace where this data came from?
+
+APAE adds four capabilities to the ecosystem:
+
+1. **Dynamic trust scoring** — behavioral reputation that adjusts
+   based on interaction outcomes (AIMD model).
+2. **Behavior verification** — checking agent actions against
+   declared specifications.
+3. **Data provenance** — tracing data lineage through the DAG.
+4. **Cross-domain trust** — federating trust across administrative
+   domains.
+
+These capabilities are bundled into assurance profiles
+(Relaxed, Standard, Regulated) that map to ECT assurance levels,
+so the same ecosystem works from a dev cluster to a hospital.
+
+# Conventions and Definitions
+
+{::boilerplate bcp14-tagged}
+
+Trust Score:
+: A floating-point value in \[0.0, 1.0\] representing one agent's
+  assessed reliability of another.
+
+Trust Event:
+: An interaction outcome that causes a trust score adjustment.
+  Derived from ECTs.
+
+Trust Domain:
+: An administrative boundary within which a single trust anchor
+  (CA or JWK set) governs agent identity.
+
+Behavior Specification:
+: A machine-readable declaration of permitted agent actions and
+  constraints.
+
+Provenance Chain:
+: The sequence of ECT nodes recording how a piece of data was
+  produced, transformed, and consumed.
+
+Assurance Profile:
+: A named configuration selecting which trust, verification, and
+  provenance mechanisms are required.
+
+Quarantine:
+: A state in which an agent's trust score has dropped below a
+  configured floor; the agent is prohibited from accepting new
+  delegations.
+
+# Dynamic Trust Scoring {#trust}
+
+## Trust Score Model
+
+Each agent maintains a trust table: peer agent IDs mapped to
+trust scores.  Initial trust for unknown agents is deployment-
+configured (RECOMMENDED: 0.5; zero-trust deployments: 0.1).
+
+Scores update using additive-increase, multiplicative-decrease
+(AIMD):
+
+- Positive event: `score = min(1.0, score + alpha)`
+- Negative event: `score = max(0.0, score * beta)`
+
+Defaults: `alpha = 0.01`, `beta = 0.8`.
+
+Trust builds slowly (50 successes: 0.5 → 1.0) and drops fast
+(one failure: 0.82 → 0.66).
+
+## Trust Events from ECT {#trust-events}
+
+Trust events are derived from ECTs rather than agent-local
+counters, making trust computation auditable:
+
+| ECT condition | Event | Adjustment |
+|--------------|-------|------------|
+| Completed, no error follows | `task_success` | `score + alpha` |
+| Completed, partial result | `task_partial` | `score + 0.5 * alpha` |
+| `atd:error` referencing agent | `task_failure` | `score * beta` |
+| No response within threshold | `task_timeout` | `score * beta` |
+| `atd:error` with `constraint_violation` | `policy_violation` | `score * beta^2` |
+| ECT signature verification fails | `attestation_invalid` | `score * beta^2` |
+| `atd:rollback_request` targeting agent | `rollback_triggered` | `score * beta` |
+{: #fig-events title="Trust Events from ECT"}
+
+## Trust Decay
+
+If no interaction occurs for a configurable period (default:
+7 days): `score = max(initial_default, score - 0.01/day)`.
+
+## Trust Assertions as ECT {#trust-assertions}
+
+Agent A shares its trust assessment via a trust assertion ECT:
+
+- `exec_act`: `"apae:trust_assertion"`
+
+~~~json
+{
+  "exec_act": "apae:trust_assertion",
+  "ext": {
+    "apae.subject": "spiffe://example.com/agent/b",
+    "apae.trust_score": 0.82,
+    "apae.interactions": 147,
+    "apae.confidence": "high",
+    "apae.hops": 0
+  }
+}
+~~~
+{: #fig-assertion title="Trust Assertion ECT"}
+
+Confidence: `low` (<10 interactions), `medium` (10-99),
+`high` (100+).
+
+Trust assertion ECTs MUST be signed at L2/L3.
+
+## Trust Propagation
+
+When Agent C receives A's assertion about B:
+
+~~~
+c_score_for_b = max(c_score_for_b,
+                    a_score * trust_of_a * attenuation)
+~~~
+
+Default `attenuation`: 0.5.  Direct observations always take
+precedence.  `apae.hops` tracks propagation depth; agents MUST NOT
+propagate beyond their configured maximum (default: 1).
+
+## Trust Thresholds as Policy
+
+Trust thresholds are ACP-DAG-HITL node constraints:
+
+~~~json
+{
+  "constraints": {
+    "apae.min_trust": 0.7,
+    "apae.min_confidence": "medium"
+  }
+}
+~~~
+{: #fig-threshold title="Trust Threshold as Node Constraint"}
+
+Requests from agents below threshold MUST be denied with HTTP 403.
+The `apae.peer_trust_score` is a runtime context value derived
+from the trusting agent's trust table for the requesting peer;
+it is not an ECT claim itself.
+
+Low trust can trigger HITL escalation:
+
+~~~json
+{
+  "id": "r-low-trust",
+  "trigger": {
+    "kind": "confidence_below",
+    "op": "lt",
+    "value": 0.5,
+    "input_ref": "apae.peer_trust_score"
+  },
+  "required_role": "operator:security",
+  "action": "escalate",
+  "allow_override": true,
+  "override_action": "continue"
+}
+~~~
+{: #fig-trust-hitl title="HITL Rule for Low Trust"}
+
+## Automatic Revocation
+
+When trust drops below a floor (default: 0.2), the trusting agent
+SHOULD revoke delegations and emit:
+`exec_act: "apae:trust_revoke"`.
+
+# Quarantine Protocol {#quarantine}
+
+When a trust score drops below the configured quarantine floor
+(default: 0.15), the agent enters quarantine.
+
+## Quarantine Entry
+
+The detecting agent MUST emit a quarantine ECT:
+
+~~~json
+{
+  "exec_act": "apae:quarantine",
+  "ext": {
+    "apae.subject": "spiffe://example.com/agent/b",
+    "apae.score": 0.12,
+    "apae.threshold": 0.15,
+    "apae.quarantine_until": "2026-03-02T12:00:00Z",
+    "apae.reason": "Repeated policy_violation events (3 in 1 hour)"
+  }
+}
+~~~
+{: #fig-quarantine title="Quarantine ECT"}
+
+The quarantine ECT MUST be broadcast to all agents that have
+received trust assertions about the quarantined agent
+(via `apae:trust_assertion` with matching `apae.subject`).
+
+## Quarantined Agent Behavior
+
+While quarantined, a subject agent:
+
+- MUST NOT accept new delegations.  New delegation requests MUST
+  return HTTP 503 with `Retry-After` carrying the
+  `apae.quarantine_until` time as an HTTP-date per {{RFC9110}}.
+- MUST complete in-progress workflows (drain behavior per AEPB).
+- MAY accept direct operator commands (HITL Level 4 is unaffected).
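+
+`Retry-After` in {{RFC9110}} takes an HTTP-date or a number of
+seconds, whereas the quarantine ECT shows `apae.quarantine_until`
+as an RFC 3339 timestamp, so a conversion is implied.  A minimal
+Python sketch, in which the function name is hypothetical:
+
```python
from datetime import datetime, timezone
from email.utils import format_datetime

def retry_after_header(quarantine_until: str) -> str:
    """Convert an RFC 3339 timestamp into an HTTP-date for Retry-After."""
    # fromisoformat() only accepts a trailing "Z" from Python 3.11 on.
    dt = datetime.fromisoformat(quarantine_until.replace("Z", "+00:00"))
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)
```
+
+For the example value `2026-03-02T12:00:00Z` this yields
+`Mon, 02 Mar 2026 12:00:00 GMT`.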
+
+Agents receiving the quarantine notification MUST update their
+trust table and MUST NOT delegate new tasks to the quarantined
+agent until the quarantine expires or is lifted.
+
+## Quarantine Duration
+
+The default quarantine duration is 1 hour, doubling on each
+successive quarantine entry:
+
+| Quarantine count | Duration |
+|-----------------|---------|
+| 1 | 1 hour |
+| 2 | 2 hours |
+| 3 | 4 hours |
+| n | 2^(n-1) hours (max 168 hours / 7 days) |
+{: #fig-quarantine-duration title="Quarantine Duration Escalation"}
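+
+The escalation in the table is a capped exponential backoff.  As
+a sketch in Python (the function and parameter names are
+illustrative, not defined by this document):
+
```python
def quarantine_hours(count: int, *, base_hours: int = 1,
                     cap_hours: int = 168) -> int:
    """Duration of the count-th quarantine: 2^(count-1) hours, capped."""
    if count < 1:
        raise ValueError("quarantine count starts at 1")
    return min(cap_hours, base_hours * 2 ** (count - 1))
```
+
+With these defaults the 7-day cap first applies at the ninth
+quarantine, since 2^8 = 256 hours exceeds 168.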
+
+## Quarantine Expiry and Recovery
+
+When the quarantine period expires:
+
+1. The agent's trust score is reset to the initial default
+   (deployment-configured; RECOMMENDED: 0.5 for recovery).
+2. The agent transitions back to active status per AEPB lifecycle.
+3. A recovery ECT MAY be emitted: `exec_act: "apae:quarantine"` with
+   `apae.to_state: "active"`.
+
+An operator MAY lift a quarantine early by issuing a HITL override
+(Level 1 or higher) with scope `apae:quarantine_lift` for the
+subject agent.
+
+# Behavior Verification {#behavior}
+
+## Behavior Specifications
+
+A behavior specification declares what an agent is permitted to do.
+Specifications are JSON documents referencing ECT claims:
+
+~~~json
+{
+  "spec_version": "1.0",
+  "agent_id": "spiffe://example.com/agent/firewall",
+  "allowed_actions": ["update_rules", "read_config", "report"],
+  "constraints": {
+    "max_actions_per_minute": 60,
+    "forbidden_targets": ["core-router-*"],
+    "require_checkpoint_before": ["update_rules"]
+  },
+  "verification_frequency": "continuous"
+}
+~~~
+{: #fig-spec title="Behavior Specification"}
+
+Behavior specifications SHOULD be signed and versioned.  Changes
+MUST be recorded as ECTs.
+
+## Verification Against ECT Stream
+
+A verifier monitors the agent's ECT stream and checks:
+
+1. `exec_act` values are in `allowed_actions`.
+2. Action rate does not exceed `max_actions_per_minute` (computed
+   from `iat` timestamps).
+3. `atd:checkpoint` ECTs precede `update_rules` ECTs (from
+   `require_checkpoint_before`).
+4. Targets in `ext` claims do not match `forbidden_targets`.
+
+Verification results are ECTs:
+
+- `exec_act`: `"apae:compliance_check"`
+
+~~~json
+{
+  "exec_act": "apae:compliance_check",
+  "par": ["latest-agent-ect-uuid"],
+  "ext": {
+    "apae.compliance_status": "passing",
+    "apae.violations": [],
+    "apae.spec_version": "1.0",
+    "apae.window": "2026-03-01T12:00:00Z/PT1H"
+  }
+}
+~~~
+{: #fig-compliance title="Compliance Check ECT"}
+
+Violations trigger trust score decreases (`policy_violation` event)
+and MAY trigger HITL escalation.
+
+A violation compliance check ECT looks like:
+
+~~~json
+{
+  "exec_act": "apae:compliance_check",
+  "par": ["offending-ect-uuid"],
+  "ext": {
+    "apae.compliance_status": "failing",
+    "apae.violations": [
+      {
+        "rule": "require_checkpoint_before",
+        "action": "update_rules",
+        "ect": "offending-ect-uuid",
+        "description": "update_rules at 12:03:15 has no preceding atd:checkpoint within 10s"
+      }
+    ],
+    "apae.spec_version": "1.0",
+    "apae.window": "2026-03-01T12:00:00Z/PT1H"
+  }
+}
+~~~
+{: #fig-violation title="Compliance Violation ECT"}
+
+# Data Provenance {#provenance}
+
+## DAG as Provenance Chain
+
+The ECT DAG already encodes data provenance: each ECT's `par`
+references show which prior tasks produced its inputs.  The
+`inp_hash` and `out_hash` claims prove what was processed without
+revealing the data.
+
+For deployments requiring explicit provenance metadata, agents
+MAY include:
+
+~~~json
+{
+  "ext": {
+    "apae.data_source": "database:patients",
+    "apae.data_classification": "pii",
+    "apae.retention_days": 365,
+    "apae.transformations": ["anonymize", "aggregate"]
+  }
+}
+~~~
+{: #fig-provenance title="Provenance Extension Claims"}
+
+Under the Regulated assurance profile, all data-transforming ECT
+nodes MUST include provenance claims.
+
+## Provenance Queries
+
+At L3, the audit ledger enables provenance queries:
+
+- "Which agents touched this data?" → walk `par` chain from
+  final ECT to roots.
+- "Was this data transformed?" → check `apae.transformations`
+  along the chain.
+- "Is provenance complete?" → verify all `par` references
+  resolve to ledger entries.
+
+# Cross-Domain Trust {#cross-domain}
+
+## Trust Domain Basics
+
+A trust domain is an administrative boundary within which a
+single trust anchor (CA certificate or JWK set) governs agent
+identity.  Trust scores are local to a trust domain by default.
+
+## Trust Domain Registration
+
+Each trust domain MUST publish a trust anchor at a well-known URI:
+
+~~~
+GET /.well-known/apae/trust-anchor HTTP/1.1
+~~~
+
+The response MUST be a JSON object containing:
+
+~~~json
+{
+  "domain": "example.com",
+  "trust_anchor_type": "jwks",
+  "trust_anchor_uri": "https://example.com/.well-known/jwks.json",
+  "contact": "trust-admin@example.com"
+}
+~~~
+{: #fig-trust-anchor title="Trust Anchor Document"}
+
+## Cross-Domain Delegation
+
+When Agent A (domain X) delegates to Agent B (domain Y):
+
+1. A MUST verify that its ACP-DAG-HITL policy permits cross-domain
+   delegation to domain Y (bilateral trust agreement).
+2. A fetches B's trust anchor document to verify B's identity.
+3. A creates an `apae:cross_domain_assertion` ECT linking the
+   two domains.
+4. Both A and B include their domain in ECT `iss` claims.
+
+~~~json
+{
+  "exec_act": "apae:cross_domain_assertion",
+  "ext": {
+    "apae.source_domain": "example.com",
+    "apae.dest_domain": "hospital.example",
+    "apae.bilateral_agreement_ref": "agreement-id-2026-001",
+    "apae.min_assurance": "L2"
+  }
+}
+~~~
+{: #fig-cross-domain title="Cross-Domain Assertion ECT"}
+
+The diagram below illustrates a cross-domain delegation:
+
+~~~
+Domain: example.com         Domain: hospital.example
+┌──────────────────┐        ┌──────────────────────┐
+│  Agent A         │  AEPB  │  Agent B             │
+│  (orchestrator)  ├───────►│  (treatment planner) │
+│  ECT: L2         │        │  ECT: L3             │
+└──────────────────┘        └──────────────────────┘
+         │                                │
+         └── cross_domain_assertion ECT ──┘
+              (bilateral agreement verified)
+~~~
+{: #fig-cross-domain-diag title="Cross-Domain Delegation"}
+
+## Cross-Domain Trust Scores
+
+Trust scores do not transfer across domain boundaries by default.
+When Agent A in domain X has no prior interactions with Agent B
+in domain Y:
+
+- If a bilateral trust agreement exists: initial trust is set to
+  the agreement's `default_trust` value (negotiated out of band).
+- If no agreement exists: delegation MUST be rejected (zero-trust
+  default).
+
+Cross-domain trust scores are isolated from intra-domain scores
+and are stored separately in the trust table.
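+
+The first-contact rule might look like the following Python
+sketch.  The `agreements` mapping keyed by peer domain and the
+`DelegationRejected` exception are hypothetical; only the
+`default_trust` field name comes from the text above.
+
```python
class DelegationRejected(Exception):
    """No bilateral trust agreement covers the peer's domain."""

def initial_cross_domain_trust(peer_domain: str, agreements: dict) -> float:
    """Initial trust for a peer in another domain with no prior interactions."""
    agreement = agreements.get(peer_domain)
    if agreement is None:
        # Zero-trust default: without an agreement, delegation is rejected.
        raise DelegationRejected(peer_domain)
    return agreement["default_trust"]
```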
+
+# Assurance Profiles {#profiles}
+
+## Profile Definitions
+
+An assurance profile is a named configuration that selects which
+mechanisms are required.  Profiles MUST be declared in ACP-DAG-HITL
+workflow policy and announced in the AEPB capability document.
+
+| Mechanism | Relaxed | Standard | Regulated |
+|-----------|---------|----------|-----------|
+| **ECT level** | L1 | L2 | L3 |
+| **Trust scoring** | Optional | RECOMMENDED | REQUIRED |
+| **Trust threshold enforcement** | Optional | RECOMMENDED | REQUIRED |
+| **Behavior verification** | Off | Periodic | Continuous |
+| **HITL approval gates** | Optional | Critical paths | Mandatory |
+| **Data provenance claims** | Off | Optional | REQUIRED |
+| **Checkpoint before consequential** | RECOMMENDED | REQUIRED | REQUIRED |
+| **Audit ledger** | Optional | Optional | REQUIRED |
+| **Quarantine protocol** | Optional | RECOMMENDED | REQUIRED |
+| **Cross-domain trust agreements** | Optional | Required if cross-domain | Required if cross-domain |
+{: #fig-profiles title="Assurance Profile Requirements"}
+
+Relaxed:
+: Internal dev/staging.  L1 ECTs.  Trust and verification
+  optional.  Useful for debugging and observability without
+  cryptographic overhead.
+
+Standard:
+: Production cross-org.  L2 ECTs.  Trust scoring and thresholds
+  recommended.  Periodic behavior verification.  HITL on critical
+  paths.
+
+Regulated:
+: Healthcare, finance, EU AI Act.  L3 ECTs with audit ledger.
+  Continuous behavior verification.  All trust mechanisms
+  required.  Full provenance chain.  Mandatory HITL gates.
+
+Profiles are declared in ACP-DAG-HITL node constraints:
+
+~~~json
+{
+  "constraints": {
+    "apae.assurance_profile": "regulated"
+  }
+}
+~~~
+{: #fig-profile-policy title="Profile as Node Constraint"}
+
+A single deployment MAY use different profiles for different
+workflows.
+
+## Profile Selection Guidance
+
+Operators SHOULD select profiles using the following decision
+table:
+
+| Deployment context | Recommended profile |
+|-------------------|--------------------|
+| Unit tests, local development | Relaxed |
+| Internal production (single org) | Standard |
+| Cross-organization production | Standard (with trust agreements) |
+| Financial services, EU AI Act critical | Regulated |
+| Healthcare (HIPAA, clinical trials) | Regulated |
+| Critical infrastructure (NIS2) | Regulated |
+{: #fig-profile-selection title="Profile Selection Guidance"}
+
+## Upgrade Path Between Profiles
+
+Operators MUST NOT downgrade the assurance profile of a workflow
+while that workflow is active.
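
The no-downgrade rule can be enforced with a simple ordering check. This is a sketch under the assumption that the profiles are totally ordered Relaxed < Standard < Regulated, as the upgrade path suggests; `PROFILE_RANK` and `may_change_profile` are hypothetical names.

```python
# Assumed ordering, inferred from the Relaxed -> Standard -> Regulated path.
PROFILE_RANK = {"relaxed": 0, "standard": 1, "regulated": 2}


def may_change_profile(current, requested, workflow_active):
    """Return False only for a downgrade attempted on an active workflow."""
    is_downgrade = PROFILE_RANK[requested] < PROFILE_RANK[current]
    return not (workflow_active and is_downgrade)
```

Upgrades and same-level changes are always allowed by this check; downgrades are allowed only once the workflow is no longer active.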
+
+Relaxed → Standard:
+: (1) Add ECT signing keys (WIMSE WIT or X.509). (2) Update ECT
+  emission to sign tokens. (3) Configure trust scoring
+  (alpha/beta, initial trust, thresholds). (4) Define behavior
+  specifications for critical agents. (5) Add HITL approval gates
+  on critical DAG paths.
+
+Standard → Regulated:
+: (1) Configure audit ledger endpoint. (2) Update ECT emission
+  to commit each ECT to ledger. (3) Enable continuous behavior
+  verification (change `verification_frequency` from `periodic`
+  to `continuous`). (4) Enable provenance claims on all
+  data-transforming ECTs. (5) Add mandatory HITL gates on all
+  consequential actions. (6) Enable quarantine protocol.
+
+# Security Considerations
+
+## Trust Score Sensitivity
+
+Trust scores are sensitive metadata.  Agents MUST NOT expose
+full trust tables.  Only pairwise assertions SHOULD be shared,
+and only in response to explicit authenticated requests.
+
+## Score Inflation (Adversarial Trust Building)
+
+An adversary performs many small successful interactions to
+inflate trust, then executes a malicious action.  Mitigation:
+
+- Apply the multiplicative penalty twice (`beta^2`) for
+  `policy_violation` events.
+- Enforce high trust thresholds for high-risk actions.
+- Rate-limit trust score increases: an agent MUST NOT increase
+  trust by more than 0.1 per day toward any single peer.
+- Run behavior verification at Standard and above (periodic at
+  Standard, continuous at Regulated).
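
The rate limit on trust increases can be sketched as a clamp on the daily gain. The function name, its signature, and the idea that the caller resets `gained_today` at each day boundary are assumptions; the draft fixes only the 0.1-per-day cap.

```python
MAX_DAILY_INCREASE = 0.1  # from the text: at most 0.1 per day per peer


def apply_trust_update(current, proposed, gained_today):
    """Return (new_score, new_gained_today) with the daily cap applied.

    Assumption: the caller tracks `gained_today` per peer and resets it
    to 0.0 when a new day starts.
    """
    if proposed <= current:
        # Decreases are never rate-limited.
        return proposed, gained_today
    headroom = max(0.0, MAX_DAILY_INCREASE - gained_today)
    step = min(proposed - current, headroom)
    return current + step, gained_today + step
```

Because only increases are clamped, an adversary needs many days of good behavior to build trust, while a single violation takes effect immediately.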
+
+## Attestation Freshness
+
+Stale compliance check ECTs MUST be rejected.  The verifier MUST
+check that `apae:compliance_check` ECTs have `iat` within the
+configured verification window (default: 1 hour for Standard,
+5 minutes for Regulated).
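
A freshness check along these lines might look as follows. The window defaults come from the text; the function name and the decision to also reject `iat` values in the future are assumptions.

```python
import time

# Default verification windows from the text, in seconds.
VERIFICATION_WINDOW_S = {"standard": 3600, "regulated": 300}


def is_fresh(iat, profile, now=None):
    """True if a compliance-check ECT's `iat` lies within the window."""
    now = time.time() if now is None else now
    age = now - iat
    # Assumption: an `iat` in the future (negative age) is also rejected.
    return 0 <= age <= VERIFICATION_WINDOW_S[profile]
```

A verifier would call `is_fresh(ect["iat"], profile)` before trusting an `apae:compliance_check` result and reject the ECT when it returns `False`.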
+
+## Provenance Chain Forgery
+
+Each provenance hop MUST be signed (L2+) to prevent injection
+of false provenance records.  Agents MUST verify the signature
+on all `par`-linked ECTs before accepting provenance claims.
+
+## Sybil Attack on Trust
+
+Fake agents inflate trust for each other to gain influence.
+Mitigation:
+
+- Trust propagation attenuation (default 0.5) limits the impact
+  of second-hand assertions.
+- Maximum hop count of 1 for trust propagation.
+- Require agents to be registered in a trusted directory before
+  initial trust is assigned above the floor value.
+
+## Cross-Domain Trust Downgrade
+
+An attacker forces delegation through an untrusted domain by
+presenting a forged bilateral agreement.  Mitigation:
+
+- Bilateral trust agreements MUST be signed by operators of
+  both domains.
+- Agents MUST verify the agreement signature before accepting
+  cross-domain delegations.
+- Cross-domain ECTs MUST use L2+ assurance.
+
+## Quarantine Evasion
+
+An agent subject to quarantine re-registers under a different
+identity to escape the quarantine.  Mitigation:
+
+- Quarantine ECTs are broadcast; receiving agents record the
+  quarantine by both agent ID and behavioral fingerprint.
+- Agents SHOULD require re-onboarding with operator approval
+  before accepting new identities from known-quarantined domains.
+
+# IANA Considerations
+
+## Assurance Profile Registry
+
+This document requests the creation of the "APAE Assurance Profile
+Registry" under IANA.  Registration policy: Specification Required.
+
+Initial entries:
+
+| Profile Name | Profile URI | Description | Reference |
+|-------------|------------|-------------|-----------|
+| Relaxed | `urn:ietf:params:apae:profile:relaxed` | Dev/test, L1 ECTs | This document |
+| Standard | `urn:ietf:params:apae:profile:standard` | Production, L2 ECTs | This document |
+| Regulated | `urn:ietf:params:apae:profile:regulated` | Regulated, L3 ECTs | This document |
+{: #fig-profile-registry title="Assurance Profile Registry"}
+
+## `exec_act` Values
+
+This document requests registration in the AEM Ecosystem
+Extension Registry:
+
+| Value | Description | Reference |
+|-------|-------------|-----------|
+| `apae:trust_assertion` | Sharing trust score for a peer | This document |
+| `apae:trust_revoke` | Revoking delegations due to low trust | This document |
+| `apae:compliance_check` | Behavior verification result | This document |
+| `apae:quarantine` | Agent quarantine entry or exit | This document |
+| `apae:cross_domain_assertion` | Cross-domain delegation evidence | This document |
+{: #fig-iana-actions title="APAE exec_act Registrations"}
+
+## Well-Known URI
+
+This document requests registration of `apae/trust-anchor` as a
+well-known URI suffix per RFC 8615 for trust domain anchor
+publication.
+
+--- back
+
+# Acknowledgments
+{:numbered="false"}
+
+APAE builds on ECT {{I-D.nennemann-wimse-ect}} for interaction
+evidence and audit, and ACP-DAG-HITL
+{{I-D.nennemann-agent-dag-hitl-safety}} for trust threshold and
+assurance profile policy enforcement.  The AIMD trust model is
+adapted from TCP congestion control (RFC 5681).  Behavior
+verification is informed by RATS architecture {{RFC9334}}.
--- a/workspace/drafts/new-drafts/draft-heop-human-emergency-override-00.md
+++ b/workspace/drafts/new-drafts/draft-heop-human-emergency-override-00.md
@@ -0,0 +1,372 @@
+---
+title: "Human Emergency Override Protocol (HEOP)"
+abbrev: "HEOP"
+category: std
+docname: draft-heop-human-emergency-override-00
+submissiontype: IETF
+number:
+date:
+v: 3
+area: "SEC"
+workgroup: "Security Dispatch"
+keyword:
+  - human override
+  - emergency stop
+  - agentic workflows
+  - HITL
+  - execution context
+
+author:
+  -
+    fullname: Generated by IETF Draft Analyzer
+    organization: Independent
+    email: placeholder@example.com
+
+normative:
+  RFC7519:
+  RFC7515:
+  RFC9110:
+  RFC8615:
+  I-D.nennemann-wimse-ect:
+    title: "Execution Context Tokens for Distributed Agentic Workflows"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
+  I-D.nennemann-agent-dag-hitl-safety:
+    title: "Agent Context Policy Token: DAG Delegation with Human Override"
+    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
+
+informative:
+
+--- abstract
+
+This document defines the Human Emergency Override Protocol (HEOP),
+the runtime enforcement mechanism for human intervention in
+autonomous AI agent operations.  HEOP is the "how" to ACP-DAG-HITL's
+"when": where the Agent Context Policy Token defines conditions
+that require human decision, HEOP defines the wire protocol for
+override commands, agent compliance, and acknowledgment.  HEOP
+specifies four override levels (pause, constrain, stop, takeover)
+and a mandatory agent compliance endpoint, and records every
+override as an ECT DAG node for tamper-evident audit.  Three of the
+four levels map directly to ACP-DAG-HITL actions; CONSTRAIN has no
+ACP-DAG-HITL equivalent.
+
+--- middle
+
+# Introduction
+
+As AI agents gain autonomy in critical infrastructure, the ability
+for humans to intervene quickly and reliably becomes essential.
+The current ratio of autonomous capability drafts to human
+oversight drafts in the IETF is roughly 7:1.
+
+The Agent Context Policy Token
+{{I-D.nennemann-agent-dag-hitl-safety}} defines a policy language
+for human-in-the-loop safety: trigger conditions, required roles,
+and permitted actions (`pause`, `escalate`, `abort`).  But it does
+not define the runtime protocol for how overrides are transmitted to
+agents, how agents acknowledge them, or how the intervention is
+recorded.  HEOP fills this gap.
+
+HEOP draws from industrial safety: the emergency stop button on
+factory equipment, the circuit breaker in electrical systems, the
+kill switch in robotics.  The override mechanism must be simpler
+and more reliable than the system it controls.
+
+Every override command and acknowledgment is recorded as an ECT
+{{I-D.nennemann-wimse-ect}}, linking into the workflow DAG.  At
+L3, this provides the tamper-evident audit trail that regulated
+environments (FDA, MiFID II, EU AI Act) require for human
+intervention records.
+
+# Conventions and Definitions
+
+{::boilerplate bcp14-tagged}
+
+Override:
+: A human-initiated command that alters an agent's autonomous
+  operation, taking precedence over the agent's own decision-making.
+
+Operator:
+: A human user authorized to issue override commands, corresponding
+  to a `required_role` in ACP-DAG-HITL policy.
+
+Override Level:
+: One of four escalating intervention types, each with
+  deterministic agent behavior requirements.
+
+# Mapping to ACP-DAG-HITL Actions {#mapping}
+
+HEOP override levels are the runtime realization of ACP-DAG-HITL
+actions:
+
+| ACP-DAG-HITL action | HEOP Level | Behavior |
+|---------------------|------------|----------|
+| `pause`             | 1 (PAUSE)  | Suspend autonomous actions, hold state |
+| (no equivalent)     | 2 (CONSTRAIN) | Restrict to allowed action subset |
+| `abort`             | 3 (STOP)   | Cease all actions, enter inert state |
+| `escalate`          | 4 (TAKEOVER) | Transfer control to human operator |
+{: #fig-mapping title="ACP-DAG-HITL to HEOP Mapping"}
+
+Level 2 (CONSTRAIN) extends beyond ACP-DAG-HITL's current action
+vocabulary.  When a HITL rule triggers with `action: "pause"` and
+`override_action: "continue"`, the operator MAY continue with
+HEOP Level 2 constraints rather than full resumption.
+
+# Override Levels {#levels}
+
+## Level 1 -- PAUSE
+
+The agent MUST suspend all autonomous actions and hold its current
+state.  It MUST NOT initiate new actions but MAY complete
+in-progress actions if stopping mid-execution would cause harm.
+The agent resumes when a RESUME command is received.
+
+## Level 2 -- CONSTRAIN
+
+The agent MUST restrict its actions to a specified subset defined
+in the override command.  The agent MUST reject any action not on
+the allowlist.
+
+## Level 3 -- STOP
+
+The agent MUST immediately cease all autonomous actions, abandon
+in-progress actions where safe, and enter an inert state.  It
+MUST NOT act until explicitly restarted.  This is the e-stop.
+
+## Level 4 -- TAKEOVER
+
+The agent MUST transfer operational control to the human operator,
+entering pass-through mode where it executes only explicit operator
+commands.  The agent's sensors and outputs remain available to the
+operator as tools.
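
The four levels can be read as a gate on whether the agent may start a new action. The sketch below is one possible reading, not normative behavior: `Level`, `may_start_action`, and its parameters are hypothetical, and the handling of in-progress actions under PAUSE and STOP is deliberately left out.

```python
from enum import IntEnum


class Level(IntEnum):
    PAUSE = 1
    CONSTRAIN = 2
    STOP = 3
    TAKEOVER = 4


def may_start_action(level, action, allowlist=(), operator_commanded=False):
    """Decide whether a new action may start under the active override.

    `level` is None when no override is active.
    """
    if level is None:
        return True
    if level == Level.CONSTRAIN:
        # Only actions on the override's allowlist are permitted.
        return action in allowlist
    if level == Level.TAKEOVER:
        # Pass-through mode: only explicit operator commands run.
        return operator_commanded
    # PAUSE and STOP: no new autonomous actions.
    return False
```

For example, under CONSTRAIN with an allowlist of `["read", "monitor", "report"]`, a `"write"` action is refused while `"read"` proceeds.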
+
+# Override Command Format {#command-format}
+
+Override commands are HTTP POST requests to the agent's well-known
+endpoint, carrying an ECT in the Execution-Context header:
+
+~~~
+POST /.well-known/heop/override HTTP/1.1
+Content-Type: application/json
+Authorization: Bearer <operator-jwt>
+Execution-Context: <override-ECT>
+
+{
+  "override_id": "urn:uuid:...",
+  "level": 3,
+  "reason": "Agent blocking legitimate traffic",
+  "operator_id": "spiffe://example.com/human/alice",
+  "scope": "*",
+  "constraints": null,
+  "ttl": null
+}
+~~~
+{: #fig-override title="Override Command"}
+
+Field definitions:
+
+`level`:
+: Integer 1-4.  MUST be present.
+
+`reason`:
+: Human-readable text.  MUST be present and logged.
+
+`scope`:
+: Which agent functions to override.  `"*"` means all.  MAY be a
+  list of function identifiers for partial overrides.
+
+`constraints`:
+: For Level 2 only.  JSON array of permitted action types, e.g.,
+  `["read", "monitor", "report"]`.
+
+`ttl`:
+: Optional duration in seconds.  If set, the override expires
+  automatically and the agent resumes its prior mode.
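
The field rules above can be checked before an agent acts on a command. This is a sketch only: `validate_override` is a hypothetical helper, and two of its rules are assumptions that go beyond the text (a Level 2 command must carry a non-empty `constraints` array, and `ttl`, when set, must be a positive number). How an agent should answer a malformed command is not specified here.

```python
def validate_override(cmd):
    """Return a list of problems found in a decoded override command."""
    errors = []

    level = cmd.get("level")
    if not isinstance(level, int) or isinstance(level, bool) or not 1 <= level <= 4:
        errors.append("level must be an integer 1-4")

    reason = cmd.get("reason")
    if not isinstance(reason, str) or not reason.strip():
        errors.append("reason must be present")

    scope = cmd.get("scope")
    scope_is_list = isinstance(scope, list) and all(isinstance(s, str) for s in scope)
    if scope is not None and scope != "*" and not scope_is_list:
        errors.append("scope must be '*' or a list of function identifiers")

    constraints = cmd.get("constraints")
    if level == 2:
        # Assumption: CONSTRAIN is meaningless without an allowlist.
        if not (isinstance(constraints, list) and constraints):
            errors.append("level 2 requires a non-empty constraints array")
    elif constraints is not None:
        errors.append("constraints is for level 2 only")

    ttl = cmd.get("ttl")
    if ttl is not None:
        # Assumption: a ttl of zero or less is treated as invalid.
        if isinstance(ttl, bool) or not isinstance(ttl, (int, float)) or ttl <= 0:
            errors.append("ttl must be a positive number of seconds")

    return errors
```

The example command in the figure above passes this check with an empty error list.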
+
+## Resume and Lift
+
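+RESUME ends a Level 1 (PAUSE) override; LIFT ends an override at
+any level:
+
+~~~
+POST /.well-known/heop/resume HTTP/1.1
+Authorization: Bearer <operator-jwt>
+
+{"override_id": "urn:uuid:...", "operator_id": "..."}
+
+POST /.well-known/heop/lift HTTP/1.1
+Authorization: Bearer <operator-jwt>
+
+{"override_id": "urn:uuid:...", "operator_id": "..."}
+~~~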
+{: #fig-resume title="Resume and Lift Commands"}
+
+# ECT Integration {#ect-integration}
+
+## Override ECT
+
+The operator (or operator's tooling) MUST produce an ECT for
+every override command:
+
+- `exec_act`: `"heop:override"`
+- `par`: the `jti` of the HITL trigger ECT (if the override was
+  triggered by ACP-DAG-HITL policy) or empty (if manually
+  initiated)
+
+~~~json
+{
+  "ext": {
+    "heop.level": 3,
+    "heop.reason": "Agent blocking legitimate traffic",
+    "heop.operator_id": "spiffe://example.com/human/alice",
+    "heop.scope": "*"
+  }
+}
+~~~
+{: #fig-override-ect title="Override ECT Extension Claims"}
+
+## Acknowledgment ECT
+
+The agent MUST produce an acknowledgment ECT:
+
+- `exec_act`: `"heop:ack"`
+- `par`: the `jti` of the override ECT
+
+~~~json
+{
+  "ext": {
+    "heop.status": "accepted",
+    "heop.prior_state": "autonomous",
+    "heop.current_state": "stopped",
+    "heop.effective_at": "2026-03-01T12:00:00.123Z"
+  }
+}
+~~~
+{: #fig-ack-ect title="Acknowledgment ECT Extension Claims"}
+
+## Decision Record Alignment
+
+The override/ack ECT pair serves as the ACP-DAG-HITL Decision
+Record {{I-D.nennemann-agent-dag-hitl-safety}}.  The required
+Decision Record fields map as follows:
+
+| Decision Record field | ECT source |
+|----------------------|------------|
+| `decision_id` | Override ECT `jti` |
+| `token_jti` | HITL trigger ECT `jti` (from `par`) |
+| `rule_ids` | From HITL trigger context |
+| `human_id` | `heop.operator_id` |
+| `human_role` | From operator JWT claims |
+| `decision` | Derived from `heop.level` |
+| `time` | Override ECT `iat` |
+{: #fig-decision-record title="Decision Record Mapping"}
+
+At L3, both ECTs are recorded in the audit ledger, providing a
+tamper-evident record of every human intervention.
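
The mapping table can be read as a small projection from the override ECT onto a Decision Record. The sketch below rests on several assumptions: the derivation of `decision` from `heop.level` is taken to be the reverse of the level mapping table, `"constrain"` for Level 2 is a hypothetical value (ACP-DAG-HITL has no equivalent action), the `role` claim name is hypothetical, and `par` is accepted either as a single `jti` or as a list.

```python
# Reverse of the level mapping table; Level 2 has no ACP-DAG-HITL action.
LEVEL_TO_DECISION = {1: "pause", 3: "abort", 4: "escalate"}


def decision_record(override_ect, operator_claims, rule_ids):
    """Project an override ECT onto the Decision Record fields."""
    ext = override_ect["ext"]
    par = override_ect.get("par") or None  # empty when manually initiated
    if isinstance(par, list):
        par = par[0] if par else None
    return {
        "decision_id": override_ect["jti"],
        "token_jti": par,
        "rule_ids": rule_ids,
        "human_id": ext["heop.operator_id"],
        # Hypothetical claim name for the operator's role.
        "human_role": operator_claims.get("role"),
        # "constrain" is a placeholder, not an ACP-DAG-HITL action.
        "decision": LEVEL_TO_DECISION.get(ext["heop.level"], "constrain"),
        "time": override_ect["iat"],
    }
```

A manually initiated override yields `token_jti` of `None`, since there is no HITL trigger ECT to reference.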
+
+# Agent Compliance Requirements {#compliance}
+
+Every HEOP-compliant agent MUST:
+
+1. Implement the `/.well-known/heop/override` endpoint.
+
+2. Process override commands within 1 second of receipt.  The
+   override path MUST be independent of the agent's main
+   processing loop.
+
+3. Produce an acknowledgment ECT for every override.
+
+4. If the agent cannot fully comply (e.g., hardware limitation),
+   it MUST respond with `heop.status`: `"partial"` and a
+   description.  An agent MUST NOT respond with `"rejected"`.
+
+5. Expose current override status at:
+
+~~~
+GET /.well-known/heop/status
+~~~
+
+Response:
+
+~~~json
+{
+  "agent_id": "spiffe://example.com/agent/firewall-mgr",
+  "override_active": true,
+  "current_level": 3,
+  "override_ect_jti": "550e8400-e29b-41d4-a716-446655440055",
+  "since": "2026-03-01T12:00:00Z",
+  "operator_id": "spiffe://example.com/human/alice"
+}
+~~~
+{: #fig-status title="Override Status"}
+
+# Broadcast Overrides {#broadcast}
+
+For environments with many agents, HEOP supports broadcast.  An
+operator sends a single command to a management endpoint:
+
+~~~
+POST /heop/broadcast HTTP/1.1
+
+{
+  "override_id": "urn:uuid:...",
+  "level": 3,
+  "reason": "Coordinated emergency stop",
+  "targets": ["spiffe://example.com/agent/a1", "spiffe://example.com/agent/a2"]
+}
+~~~
+{: #fig-broadcast title="Broadcast Override"}
+
+The broadcast endpoint produces a parent ECT with
+`exec_act`: `"heop:broadcast"`, and each per-agent override ECT
+references it via `par`.
+
+# Dead Man's Switch {#dead-mans-switch}
+
+Agents SHOULD support a heartbeat-based safety net: the agent
+periodically pings an operator heartbeat endpoint.  If the
+heartbeat is missed for a configurable duration, the agent
+automatically enters Level 1 (PAUSE) and produces a
+self-override ECT with `exec_act`: `"heop:dead_mans_switch"`.
+
+This provides safety when network connectivity to the operator
+is lost.
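
The heartbeat logic can be sketched as a small timer object. The class and method names are hypothetical, the clock is injectable for testing, and carrying `heop.level` in the self-override ECT's extension claims is an assumption modeled on the override ECT.

```python
import time


class DeadMansSwitch:
    """Trips once when no successful heartbeat is seen for `timeout_s`."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_ok = clock()
        self.paused = False

    def heartbeat_ok(self):
        """Record a successful ping of the operator heartbeat endpoint."""
        self._last_ok = self._clock()

    def check(self):
        """Return self-override ECT claims on the first trip, else None."""
        if self.paused or self._clock() - self._last_ok < self.timeout_s:
            return None
        self.paused = True  # the agent enters Level 1 (PAUSE)
        return {
            "exec_act": "heop:dead_mans_switch",
            "ext": {"heop.level": 1},  # assumed claim layout
        }
```

The switch does not clear itself when heartbeats return; in this sketch, leaving PAUSE still takes an explicit RESUME command.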
+
+# Security Considerations
+
+Override commands are high-privilege operations.  All override
+endpoints MUST require authentication via signed JWTs with the
+`heop_override` scope.  The JWT MUST include the operator's
+identity and a timestamp, and MUST be signed using an asymmetric
+algorithm.
+
+Override commands MUST be transmitted over TLS 1.3.
+
+To prevent replay, agents MUST reject overrides with timestamps
+more than 30 seconds in the past.  The `override_id` MUST be
+unique; agents MUST reject duplicates.
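
The two replay rules combine into a single acceptance check. In this sketch `ReplayGuard` is a hypothetical name, and the source of `issued_at` is an assumption: the command body shown earlier has no timestamp field, so the value might come from the operator JWT or the override ECT `iat`.

```python
import time

MAX_AGE_S = 30  # from the text: reject timestamps older than 30 seconds


class ReplayGuard:
    """Rejects stale override commands and duplicate override_ids."""

    def __init__(self):
        self._seen = set()

    def accept(self, override_id, issued_at, now=None):
        now = time.time() if now is None else now
        if now - issued_at > MAX_AGE_S:
            return False  # too old
        if override_id in self._seen:
            return False  # duplicate override_id
        self._seen.add(override_id)
        return True
```

The set of seen identifiers grows without bound here; a deployment would need some eviction policy, which this sketch leaves open.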
+
+Deployments SHOULD implement multi-operator approval for Level 4
+(TAKEOVER), requiring two independent operator JWTs.
+
+The override endpoint SHOULD be served on a separate port or
+network interface from the agent's main API to ensure availability
+during overload.
+
+The ECT DAG provides tamper-evident audit of all overrides.  At
+L3, the audit ledger prevents override records from being deleted
+or modified after the fact.
+
+# IANA Considerations
+
+This document requests the following IANA registrations:
+
+1. Well-known URI registrations for `heop/override`,
+   `heop/resume`, `heop/lift`, and `heop/status` per {{RFC8615}}.
+
+2. Registration of `exec_act` values `heop:override`, `heop:ack`,
+   `heop:broadcast`, `heop:dead_mans_switch` in a future ECT
+   action type registry.
+
+3. Registration of the `heop_override` OAuth scope.
+
+--- back
+
+# Acknowledgments
+{:numbered="false"}
+
+This document is the runtime enforcement companion to the Agent
+Context Policy Token {{I-D.nennemann-agent-dag-hitl-safety}},
+which defines the HITL policy language, and builds on the
+Execution Context Token {{I-D.nennemann-wimse-ect}} for
+audit and tracing.
--- a/workspace/drafts/new-drafts/draft-heop-human-emergency-override-00.txt
+++ b/workspace/drafts/new-drafts/draft-heop-human-emergency-override-00.txt
@@ -0,0 +1,307 @@
+Internet-Draft                                           AI/Agent WG
+Intended status: Standards Track                          March 2026
+Expires: September 15, 2026
+
+
+         Human Emergency Override Protocol (HEOP)
+         draft-heop-human-emergency-override-00
+
+Abstract
+
+   This document defines the Human Emergency Override Protocol
+   (HEOP), a standard mechanism for human operators to intervene
+   in autonomous AI agent operations during critical situations.
+   Current IETF drafts include 60 autonomous operations proposals
+   but only 22 addressing human-agent interaction, with none
+   defining emergency override procedures.  HEOP specifies four
+   escalating override levels (pause, constrain, stop, takeover),
+   a mandatory agent compliance interface, and acknowledgment
+   semantics that ensure overrides are received and acted upon.
+   The protocol is intentionally minimal: a single HTTP endpoint
+   per agent, four command types, and deterministic agent
+   behavior for each.
+
+Status of This Memo
+
+   This Internet-Draft is submitted in full conformance with the
+   provisions of BCP 78 and BCP 79.
+
+   This document is intended to have Standards Track status.
+   Distribution of this memo is unlimited.
+
+Table of Contents
+
+   1.  Introduction
+   2.  Terminology
+   3.  Problem Statement
+   4.  Override Levels
+   5.  Override Command Format
+   6.  Agent Compliance Requirements
+   7.  Override Management Interface
+   8.  Security Considerations
+   9.  IANA Considerations
+
+1.  Introduction
+
+   As AI agents gain autonomy in critical infrastructure, the
+   ability for humans to intervene quickly and reliably becomes
+   essential.  The current ratio of autonomous capability drafts
+   to human oversight drafts in the IETF is roughly 7:1, creating
+   an asymmetry where agents can act but humans cannot reliably
+   stop them.
+
+   HEOP draws inspiration from industrial safety systems: the
+   emergency stop (e-stop) button on factory equipment, the
+   circuit breaker in electrical systems, and the kill switch in
+   robotics.  These systems share a design philosophy: the
+   override mechanism must be simpler and more reliable than the
+   system it controls.
+
+   HEOP is deliberately not a governance framework, policy
+   language, or accountability protocol.  It is a panic button
+   with a well-defined interface.
+
+2.  Terminology
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
+   "OPTIONAL" in this document are to be interpreted as described
+   in RFC 2119 [RFC2119].
+
+   Override: A human-initiated command that alters an agent's
+   autonomous operation, taking precedence over the agent's own
+   decision-making.
+
+   Operator: A human user authorized to issue override commands
+   to one or more agents.
+
+   Override Level: One of four escalating intervention types,
+   each with deterministic agent behavior requirements.
+
+3.  Problem Statement
+
+   An autonomous network management agent detects what it believes
+   is a DDoS attack and begins blocking traffic.  It is wrong —
+   the traffic spike is legitimate (a product launch).  The
+   operator sees revenue dropping and needs to stop the agent
+   immediately.  Today, the operator must:
+
+   1. Figure out which agent is responsible.
+   2. Find that agent's proprietary management interface.
+   3. Understand its specific stop mechanism (if one exists).
+   4. Hope the agent actually stops.
+
+   There is no standard for any of these steps.  HEOP addresses
+   steps 2-4 by defining a universal override interface that all
+   agents MUST implement.
+
+4.  Override Levels
+
+   HEOP defines four override levels, each more restrictive than
+   the last:
+
+   Level 1 — PAUSE
+   The agent MUST suspend all autonomous actions and hold its
+   current state.  It MUST NOT initiate new actions but MAY
+   complete actions already in progress if stopping them mid-
+   execution would cause more harm (e.g., an in-flight database
+   transaction).  The agent MUST resume normal operation when a
+   RESUME command is received.
+
+   Level 2 — CONSTRAIN
+   The agent MUST restrict its actions to a specified subset.
+   The override command includes an allowlist of permitted action
+   types.  The agent MUST reject any action not on the allowlist.
+   This enables operators to let the agent continue operating in
+   a limited, safe capacity.
+
+   Level 3 — STOP
+   The agent MUST immediately cease all autonomous actions,
+   abandon in-progress actions where safe to do so, and enter an
+   inert state.  It MUST NOT take any autonomous actions until
+   explicitly restarted by an operator.  This is the equivalent
+   of an e-stop.
+
+   Level 4 — TAKEOVER
+   The agent MUST transfer operational control to the human
+   operator.  It enters a pass-through mode where it executes
+   only explicit operator commands and takes no autonomous
+   actions.  The agent's sensors and outputs remain available to
+   the operator as tools.
+
+5.  Override Command Format
+
+   Override commands are sent as HTTP POST requests to the agent's
+   well-known override endpoint:
+
+      POST /.well-known/heop/override HTTP/1.1
+      Content-Type: application/json
+      Authorization: Bearer <operator-jwt>
+
+      {
+        "override_id": "urn:uuid:...",
+        "level": 3,
+        "reason": "Agent blocking legitimate traffic",
+        "operator_id": "urn:uuid:...",
+        "timestamp": "2026-03-01T12:00:00Z",
+        "scope": "*",
+        "constraints": null,
+        "ttl": null
+      }
+
+   Field definitions:
+
+   "level": Integer 1-4, corresponding to the override levels in
+   Section 4.  MUST be present.
+
+   "reason": Human-readable text.  MUST be present and MUST be
+   logged by the agent.
+
+   "scope": Which of the agent's functions to override.  "*" means
+   all functions.  MAY be a list of function identifiers for
+   partial overrides.
+
+   "constraints": For Level 2 only.  A JSON array of permitted
+   action types, e.g., ["read", "monitor", "report"].
+
+   "ttl": Optional duration in seconds.  If set, the override
+   automatically expires after this duration and the agent
+   resumes its prior operating mode.  If null, the override
+   persists until explicitly lifted.
+
+   To resume from Level 1 (PAUSE):
+
+      POST /.well-known/heop/resume HTTP/1.1
+      Authorization: Bearer <operator-jwt>
+
+      {"override_id": "urn:uuid:...", "operator_id": "urn:uuid:..."}
+
+   To lift any override:
+
+      POST /.well-known/heop/lift HTTP/1.1
+      Authorization: Bearer <operator-jwt>
+
+      {"override_id": "urn:uuid:...", "operator_id": "urn:uuid:..."}
+
+6.  Agent Compliance Requirements
+
+   Every HEOP-compliant agent MUST:
+
+   1. Implement the /.well-known/heop/override endpoint.
+
+   2. Process override commands within 1 second of receipt.
+      The override path MUST be independent of the agent's main
+      processing loop to ensure responsiveness even when the
+      agent is under heavy load or in a failure state.
+
+   3. Acknowledge every override with an HTTP response:
+
+      200 OK:
+      {
+        "override_id": "urn:uuid:...",
+        "status": "accepted",
+        "effective_at": "2026-03-01T12:00:00.123Z",
+        "prior_state": "autonomous",
+        "current_state": "stopped"
+      }
+
+   4. Log all overrides, including the full command, timestamp,
+      operator identity, and agent state before and after.
+
+   5. If the agent cannot fully comply (e.g., hardware limitation),
+      it MUST respond with status "partial" and a description of
+      what it could and could not do.  An agent MUST NOT respond
+      with "rejected"; overrides are mandatory.
+
+   6. Expose current override status at:
+
+      GET /.well-known/heop/status
+
+      {
+        "agent_id": "urn:uuid:...",
+        "override_active": true,
+        "current_level": 3,
+        "override_id": "urn:uuid:...",
+        "since": "2026-03-01T12:00:00Z",
+        "operator_id": "urn:uuid:..."
+      }
+
+7.  Override Management Interface
+
+   For environments with many agents, HEOP supports broadcast
+   overrides.  An operator MAY send a single override command to
+   a management endpoint that fans out to multiple agents:
+
+      POST /heop/broadcast HTTP/1.1
+
+      {
+        "override_id": "urn:uuid:...",
+        "level": 3,
+        "reason": "Coordinated emergency stop",
+        "targets": ["urn:uuid:agent-1", "urn:uuid:agent-2"],
+        "operator_id": "urn:uuid:..."
+      }
+
+   The broadcast endpoint MUST return per-agent results:
+
+      {
+        "results": [
+          {"agent_id": "urn:uuid:agent-1", "status": "accepted"},
+          {"agent_id": "urn:uuid:agent-2", "status": "accepted"}
+        ],
+        "failed": []
+      }
+
+   For maximum reliability, operators SHOULD also implement a
+   dead man's switch: agents periodically ping an operator
+   heartbeat endpoint, and if the heartbeat is missed for a
+   configurable duration, the agent automatically enters Level 1
+   (PAUSE).  This provides a safety net when network connectivity
+   to the operator is lost.
+
+8.  Security Considerations
+
+   Override commands are high-privilege operations.  All override
+   endpoints MUST require authentication via mutual TLS or signed
+   JWTs issued by a trusted operator identity provider.
+
+   The JWT MUST include the operator's identity, a timestamp, and
+   the "heop_override" scope.  Agents MUST verify JWT signatures
+   and reject expired tokens.
+
+   Override commands MUST be transmitted over TLS 1.3 [RFC8446].
+
+   To prevent override replay attacks, agents MUST reject
+   override commands with timestamps more than 30 seconds in the
+   past.  The override_id MUST be unique; agents MUST reject
+   duplicate override_ids.
+
+   Rogue operators are mitigated through the operator identity
+   framework.  Deployments SHOULD implement multi-operator
+   approval for Level 4 (TAKEOVER) overrides, requiring two
+   independent operator JWTs.
+
+   The override mechanism itself MUST be resistant to denial of
+   service.  The override endpoint SHOULD be served on a
+   separate port or network interface from the agent's main
+   API to ensure availability during agent overload conditions.
+
+9.  IANA Considerations
+
+   This document requests IANA establish the following:
+
+   1. A well-known URI registration for "heop/override",
+      "heop/resume", "heop/lift", and "heop/status" per
+      RFC 8615.
+
+   2. A "HEOP Override Level" registry under Standards Action
+      policy.  Initial entries: 1 (PAUSE), 2 (CONSTRAIN),
+      3 (STOP), 4 (TAKEOVER).
+
+   3. Registration of the "heop_override" OAuth scope in the
+      OAuth Parameters registry.
+
+Author's Address
+
+   Generated by IETF Draft Analyzer
+   2026-03-01
--- a/workspace/drafts/new-drafts/generated-draft.txt
+++ b/workspace/drafts/new-drafts/generated-draft.txt
@@ -0,0 +1,598 @@
+Internet-Draft                                           AI/Agent WG
+Intended status: Standards Track                          March 2026
+Expires: September 2, 2026
+
+
+         Agent Behavior Verification Protocol (ABVP)
+         for Runtime Compliance Validation
+         draft-ai-agent-behavior-verification-protocol-00
+
+Abstract
+
+   This document defines the Agent Behavior Verification Protocol
+   (ABVP), a standardized framework for continuously validating that
+   deployed AI agents operate according to their declared policies
+   and specifications. As autonomous agents become increasingly
+   prevalent in critical systems, there is a growing gap between
+   what agents are declared to do and what is verified about their
+   runtime behavior. ABVP provides mechanisms for real-time behavior
+   monitoring, policy compliance validation, and cryptographic
+   attestation of agent actions against predefined behavioral
+   specifications. The protocol defines a verification architecture
+   that includes behavior witnesses, compliance checkers, and
+   attestation chains to ensure agents maintain fidelity to their
+   declared operational parameters. ABVP integrates with existing
+   agent accountability frameworks while providing specific
+   mechanisms for runtime verification, behavioral drift detection,
+   and compliance reporting. This specification addresses the
+   critical need for trustworthy agent deployment by enabling
+   operators to continuously verify that agent behavior matches
+   stated policies throughout the agent lifecycle.
+
+Status of This Memo
+
+   This Internet-Draft is submitted in full conformance with the
+   provisions of BCP 78 and BCP 79.
+
+   This document is intended to have Standards Track status.
+   Distribution of this memo is unlimited.
+
+Table of Contents
+
+   1.  Introduction
+   2.  Terminology
+   3.  Problem Statement
+   4.  Agent Behavior Verification Architecture
+   5.  Behavior Specification Format
+   6.  Runtime Verification Protocol
+   7.  Compliance Reporting and Attestation
+   8.  Security Considerations
+   9.  IANA Considerations
+
+1.  Introduction
+
+   The proliferation of autonomous AI agents in critical
+   infrastructure, financial systems, and decision-making processes
+   has created an urgent need for continuous verification that these
+   agents operate according to their declared policies and behavioral
+   specifications. Traditional approaches to agent deployment rely on
+   pre-deployment testing and static policy validation, which fail to
+   address the dynamic nature of agent behavior in production
+   environments. As agents adapt, learn, and respond to changing
+   conditions, their actual runtime behavior may diverge
+   significantly from their original specifications, creating
+   security vulnerabilities, compliance violations, and operational
+   risks that remain undetected until system failures occur.
+
+   Existing agent accountability frameworks primarily focus on post-
+   hoc analysis and audit trails, providing limited capability for
+   real-time behavior verification and immediate detection of policy
+   violations. This reactive approach is insufficient for autonomous
+   systems that make critical decisions with limited human oversight,
+   where behavioral drift or policy violations can have immediate and
+   severe consequences. Current verification methodologies also lack
+   standardized protocols for expressing behavioral constraints in
+   machine-verifiable formats, making it difficult to establish
+   consistent compliance validation across diverse agent
+   implementations and deployment environments.
+
+   The gap between declared agent capabilities and actual runtime
+   behavior verification represents a fundamental trust problem in
+   autonomous systems deployment. Organizations deploying AI agents
+   face significant challenges in ensuring that agents continue to
+   operate within specified parameters throughout their operational
+   lifecycle, particularly as agents encounter novel situations not
+   covered in initial testing scenarios. This verification gap
+   undermines confidence in agent reliability and limits the adoption
+   of autonomous systems in high-stakes environments where adherence
+   to declared behavior is critical for safety, security, and
+   regulatory compliance.
+
+   The Agent Behavior Verification Protocol (ABVP) addresses these
+   challenges by providing a standardized framework for continuous
+   runtime verification of agent behavior against declared
+   specifications. ABVP enables real-time monitoring of agent
+   actions, automated compliance checking against behavioral
+   policies, and cryptographic attestation of verification results to
+   establish trust chains for agent operation validation. The
+   protocol is designed to integrate with existing agent
+   architectures while providing mechanisms for detecting behavioral
+   drift, validating policy adherence, and generating verifiable
+   evidence of agent compliance throughout the operational lifecycle.
+   This specification defines the core protocol mechanisms, message
+   formats, and verification procedures necessary to implement
+   trustworthy agent behavior validation in production deployments.
+
+2.  Terminology
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
+   in this document are to be interpreted as described in RFC 2119
+   [RFC2119].
+
+   Agent: An autonomous software entity that performs actions or
+   makes decisions according to defined policies and specifications.
+   In the context of ABVP, an agent is a system whose runtime
+   behavior requires continuous verification against its declared
+   operational parameters and behavioral constraints.
+
+   Behavior Specification: A formally defined set of policies,
+   constraints, and operational parameters that describe the expected
+   and permitted actions of an agent. A behavior specification MUST
+   be machine-readable and verifiable, containing sufficient detail
+   to enable automated compliance checking during agent runtime
+   operation.
+
+   Behavior Witness: A system component or external entity that
+   observes and records agent actions for verification purposes. A
+   behavior witness MUST provide cryptographically signed
+   attestations of observed agent behavior and MAY operate
+   independently of the agent being monitored to ensure verification
+   integrity.
+
+   Compliance Validation: The process of evaluating agent runtime
+   behavior against its declared behavior specification to determine
+   conformance. Compliance validation encompasses real-time
+   monitoring, policy checking, and the generation of verification
+   results that attest to agent adherence to specified behavioral
+   constraints.
+
+   Verification Attestation: A cryptographically signed statement
+   that asserts the compliance status of an agent's behavior relative
+   to its specification during a defined time period. Verification
+   attestations MUST include sufficient detail to enable third-party
+   validation and SHOULD reference the specific behavior
+   specification version and verification criteria used in the
+   assessment.
+
+   Behavioral Drift: The phenomenon where an agent's actual runtime
+   behavior gradually diverges from its declared specification over
+   time, either due to learning adaptations, environmental changes,
+   or system degradation. ABVP mechanisms MUST be capable of
+   detecting behavioral drift and reporting deviations from
+   established behavioral baselines.
+
+3.  Problem Statement
+
+   The proliferation of autonomous AI agents in critical
+   infrastructure, financial systems, and safety-critical
+   applications has created an urgent need for continuous
+   verification that deployed agents operate within their declared
+   behavioral boundaries. Current agent deployment practices rely
+   primarily on pre-deployment testing and static policy
+   declarations, creating a significant verification gap between an
+   agent's stated capabilities and constraints and its actual runtime
+   behavior. This gap becomes particularly problematic as agents
+   adapt their behavior through learning mechanisms, interact with
+   dynamic environments, or experience gradual behavioral drift due
+   to model degradation or adversarial influences.
+
+   Traditional software verification approaches are insufficient for
+   autonomous agents because agent behavior is often non-
+   deterministic, context-dependent, and may evolve over time through
+   machine learning processes. Unlike conventional software systems
+   where behavior can be predicted from code analysis, agent systems
+   exhibit emergent behaviors that arise from complex interactions
+   between training data, environmental inputs, and decision-making
+   algorithms. The absence of standardized mechanisms for expressing
+   machine-verifiable behavioral specifications further complicates
+   runtime verification, as operators lack a common framework for
+   defining what constitutes compliant agent behavior and how
+   compliance can be automatically validated.
+
+   The security and trust implications of unverified agent behavior
+   are substantial, particularly in scenarios where agents operate
+   with elevated privileges or make decisions affecting human safety
+   or economic systems. Behavioral drift, where an agent's actions
+   gradually deviate from intended policies, may go undetected for
+   extended periods without continuous verification mechanisms.
+   Similarly, adversarial attacks that subtly modify agent behavior
+   to achieve malicious objectives could remain unnoticed in systems
+   that lack real-time compliance monitoring. The inability to
+   provide cryptographic attestations of agent behavior compliance
+   also prevents the establishment of trust chains necessary for
+   multi-agent systems or cross-organizational agent interactions.
+
+   Current accountability frameworks for AI systems focus primarily
+   on explainability and audit trails but do not provide mechanisms
+   for real-time verification of behavioral compliance against
+   formally specified policies. This creates operational risks where
+   agents may violate their declared constraints without immediate
+   detection, potentially causing system failures, security breaches,
+   or regulatory violations. The lack of standardized verification
+   protocols also prevents interoperability between different agent
+   verification systems and limits the ability to establish industry-
+   wide trust frameworks for autonomous agent deployment.
+
+4.  Agent Behavior Verification Architecture
+
+   The ABVP architecture consists of four primary components that
+   work together to provide continuous runtime verification of agent
+   behavior: Agent Runtime Environments (AREs), Behavior Verification
+   Nodes (BVNs), Attestation Authorities (AAs), and Verification
+   Clients (VCs). Agent Runtime Environments host the deployed agents
+   and MUST implement behavior monitoring capabilities that capture
+   relevant behavioral data and forward it to designated Behavior
+   Verification Nodes. These environments MUST provide secure
+   isolation between the agent execution context and the monitoring
+   subsystem to prevent agents from interfering with their own
+   verification processes. The ARE MUST also implement a trusted
+   communication channel to BVNs using protocols such as TLS 1.3
+   [RFC8446] or QUIC [RFC9000] to ensure behavior data integrity
+   during transmission.
+
+   Behavior Verification Nodes serve as the core verification engines
+   within the ABVP architecture and MUST implement the runtime
+   verification protocol defined in Section 6. Each BVN maintains a
+   repository of behavior specifications for agents under its
+   verification authority and continuously processes behavioral
+   evidence received from AREs. BVNs MUST validate incoming behavior
+   data against the appropriate specifications and generate
+   compliance assessments in real time. Multiple BVNs MAY collaborate
+   in a distributed verification network to provide redundancy and
+   prevent single points of failure. When operating in a distributed
+   configuration, BVNs MUST implement consensus mechanisms to ensure
+   consistent verification results across the network. BVNs MUST also
+   implement rate limiting and resource management to handle high-
+   volume verification requests without compromising verification
+   quality.
+
+   Attestation Authorities provide cryptographic attestation services
+   for verified behavior compliance and MUST maintain secure key
+   management infrastructure capable of generating unforgeable
+   attestations. AAs receive compliance reports from BVNs and MUST
+   verify the authenticity and integrity of these reports before
+   issuing attestations. The AA MUST implement a hierarchical trust
+   model where attestations can be validated through a chain of trust
+   extending to a root certificate authority. AAs SHOULD implement
+   hardware security modules (HSMs) or equivalent trusted execution
+   environments to protect attestation signing keys from compromise.
+   Multiple AAs MAY participate in cross-attestation relationships to
+   provide attestation redundancy and prevent single points of trust
+   failure.
+
+   Verification Clients represent entities that consume ABVP
+   attestations to make trust decisions about agent behavior and MAY
+   include system operators, regulatory bodies, or other automated
+   systems. VCs MUST implement attestation verification capabilities
+   including certificate chain validation and revocation checking as
+   specified in Section 7. The architecture MUST support both real-
+   time verification queries and batch verification processes to
+   accommodate different operational requirements. VCs SHOULD
+   implement local attestation caching with appropriate cache
+   invalidation mechanisms to reduce verification latency while
+   maintaining attestation freshness. The ABVP architecture MUST
+   provide clear separation of duties between verification components
+   to prevent conflicts of interest and ensure independent
+   verification processes.
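+
+   The caching recommendation for Verification Clients can be
+   sketched as follows. This is a minimal illustration, not part of
+   the protocol: the class name, the age-based freshness rule, and
+   the explicit invalidate() hook are assumptions chosen to show one
+   way of trading verification latency against attestation freshness.
+
```python
import time


class AttestationCache:
    """Hold the latest attestation per agent; drop entries older than max_age_s.

    Hypothetical helper: ABVP does not define this class or its method names.
    """

    def __init__(self, max_age_s, clock=time.monotonic):
        self.max_age_s = max_age_s
        self.clock = clock  # injectable so tests can control time
        self._entries = {}

    def put(self, agent_id, attestation):
        """Store an attestation together with the time it was cached."""
        self._entries[agent_id] = (self.clock(), attestation)

    def get(self, agent_id):
        """Return a fresh attestation, or None if absent or stale."""
        entry = self._entries.get(agent_id)
        if entry is None:
            return None
        stored_at, attestation = entry
        if self.clock() - stored_at > self.max_age_s:
            del self._entries[agent_id]  # stale: force re-verification
            return None
        return attestation

    def invalidate(self, agent_id):
        """Explicit invalidation, for example after a revocation notice."""
        self._entries.pop(agent_id, None)
```
+
+   A cache miss (None) signals that the client should fetch and
+   validate a new attestation rather than rely on an old trust
+   decision.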
+
+   The communication between architectural components MUST follow the
+   protocol specifications defined in Section 6, with all inter-
+   component communications authenticated and encrypted. The
+   architecture MUST support both synchronous and asynchronous
+   verification modes to accommodate different agent deployment
+   scenarios and performance requirements. Components MUST implement
+   appropriate logging and audit trail capabilities to support
+   forensic analysis and compliance reporting. The overall
+   architecture SHOULD be designed for horizontal scalability to
+   support large-scale agent deployments while maintaining
+   verification performance and reliability.
+
+5.  Behavior Specification Format
+
+   This section defines the standardized format for expressing agent
+   behavioral policies and constraints within the ABVP framework. The
+   behavior specification format enables machine-readable policy
+   declarations that can be automatically verified during agent
+   runtime. All behavior specifications MUST be expressed in a
+   structured format that supports both human readability and
+   automated processing by verification systems.
+
+   The core behavior specification is structured as a JSON document
+   conforming to the ABVP Behavior Schema. Each specification MUST
+   contain a policy declaration section, verification parameters, and
+   compliance thresholds. The policy declaration section includes
+   behavioral constraints expressed as logical predicates, allowed
+   action sets, and resource utilization bounds. Verification
+   parameters specify the monitoring frequency, sampling rates, and
+   attestation requirements for each declared behavior. Compliance
+   thresholds define the acceptable deviation ranges and tolerance
+   levels for measured behaviors compared to declared specifications.
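+
+   To make the three required sections concrete, the sketch below
+   builds one possible specification as a Python dictionary and
+   checks that it serializes to JSON. The ABVP Behavior Schema is not
+   reproduced in this document, so every key name and value here
+   (policy_declaration, sampling_rate, tolerance_pct, and so on) is a
+   placeholder assumption rather than a normative field.
+
```python
import json

# Hypothetical top-level section names; the draft does not fix these keys.
REQUIRED_SECTIONS = (
    "policy_declaration",
    "verification_parameters",
    "compliance_thresholds",
)

# Illustrative specification; all values are invented for the example.
example_spec = {
    "policy_declaration": {
        "constraints": [
            {"property": "response_time_bound",
             "operator": "less_than",
             "value": 200}
        ],
        "allowed_actions": ["read_record", "send_report"],
        "resource_bounds": {"max_memory_mb": 512},
    },
    "verification_parameters": {
        "monitoring_interval_s": 10,
        "sampling_rate": 0.25,
        "attestation_required": True,
    },
    "compliance_thresholds": {
        "response_time_bound": {"tolerance_pct": 5}
    },
}


def missing_sections(spec: dict) -> list:
    """Return the required top-level sections absent from a specification."""
    return [name for name in REQUIRED_SECTIONS if name not in spec]


# Round-trip through JSON to confirm the sketch is a valid JSON document.
decoded = json.loads(json.dumps(example_spec))
```
+
+   A verification node could reject a specification when
+   missing_sections() returns a non-empty list, before attempting any
+   constraint evaluation.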
+
+   Behavioral constraints within the specification are expressed
+   using a formal constraint language based on temporal logic
+   predicates. Each constraint MUST specify a behavioral property
+   (such as "response_time_bound" or "resource_utilization_limit"),
+   an operator (such as "less_than", "equals", or "within_range"),
+   and target values or ranges. Complex behavioral policies MAY be
+   constructed using logical operators (AND, OR, NOT) to combine
+   multiple constraints. The specification format supports
+   hierarchical constraint groupings to represent different
+   operational modes or contextual behavior variations.
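+
+   The property/operator/value structure and the logical combinators
+   described above can be illustrated with a small evaluator. This is
+   a sketch under stated assumptions: the key names all_of, any_of,
+   not, value, and range are invented for the example, and the
+   temporal-logic aspects of the constraint language are not modeled;
+   only a single snapshot of observed values is checked.
+
```python
from typing import Any, Mapping


def evaluate(constraint: Mapping[str, Any],
             observed: Mapping[str, float]) -> bool:
    """Evaluate one constraint tree against observed property values."""
    if "all_of" in constraint:   # logical AND
        return all(evaluate(c, observed) for c in constraint["all_of"])
    if "any_of" in constraint:   # logical OR
        return any(evaluate(c, observed) for c in constraint["any_of"])
    if "not" in constraint:      # logical NOT
        return not evaluate(constraint["not"], observed)

    # Leaf constraint: behavioral property, operator, target value or range.
    value = observed[constraint["property"]]
    op = constraint["operator"]
    if op == "less_than":
        return value < constraint["value"]
    if op == "equals":
        return value == constraint["value"]
    if op == "within_range":
        low, high = constraint["range"]
        return low <= value <= high
    raise ValueError(f"unknown operator: {op}")


# Response time must be under 200 AND utilization must NOT be in [90, 100].
policy = {
    "all_of": [
        {"property": "response_time_bound",
         "operator": "less_than", "value": 200},
        {"not": {"property": "resource_utilization_limit",
                 "operator": "within_range", "range": [90, 100]}},
    ]
}
```
+
+   Nesting all_of, any_of, and not nodes gives the hierarchical
+   grouping mentioned above without any additional machinery.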
+
+   The behavior specification includes a verification requirements
+   section that defines how each behavioral constraint should be
+   monitored and validated. This section MUST specify the required
+   verification frequency, acceptable measurement methods, and
+   cryptographic attestation parameters for each constraint.
+   Verification requirements MAY include sampling strategies for
+   performance-sensitive constraints and continuous monitoring
+   directives for safety-critical behaviors. The specification format
+   also supports conditional verification rules that adjust
+   monitoring parameters based on agent operational context or
+   detected behavioral patterns.
+
+   Each behavior specification MUST include metadata sections
+   containing versioning information, validity periods, and
+   specification dependencies. The metadata enables proper
+   specification lifecycle management and ensures compatibility
+   between agent deployments and verification infrastructure.
+   Specifications SHOULD include digital signatures from authorized
+   policy authors to ensure specification integrity and authenticity.
+   The format supports specification inheritance and composition,
+   allowing complex agent policies to be built from validated
+   behavioral specification components while maintaining verification
+   traceability throughout the composition hierarchy.
+
+6.  Runtime Verification Protocol
+
+   The Runtime Verification Protocol defines the message exchange
+   patterns and procedures that enable continuous monitoring and
+   validation of agent behavior against declared specifications. The
+   protocol operates on a request-response model where Verification
+   Requesters initiate compliance checks, Behavior Monitors observe
+   agent actions, and Compliance Checkers evaluate adherence to
+   behavioral specifications. All protocol participants MUST
+   implement the core verification message set defined in this
+   section, and MAY implement optional extensions for specialized
+   verification scenarios. The protocol is designed to operate over
+   existing transport mechanisms including HTTP/2 [RFC9113],
+   WebSocket [RFC6455], or dedicated secure channels established
+   through TLS 1.3 [RFC8446].
+
+   Verification sessions are initiated through a VERIFICATION_REQUEST
+   message that specifies the agent identifier, behavioral
+   specification reference, verification scope, and temporal
+   parameters for the compliance check. The requesting entity MUST
+   include a cryptographically secure session identifier, timestamp
+   bounds for the verification window, and references to the specific
+   behavioral constraints to be validated. Behavior Monitors respond
+   with MONITORING_DATA messages containing timestamped observations
+   of agent actions, decision traces, and relevant contextual
+   information captured during the specified verification window.
+   These messages MUST include integrity protection through digital
+   signatures and SHOULD include privacy-preserving mechanisms when
+   agent actions contain sensitive information.
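+
+   As an illustration of the fields a VERIFICATION_REQUEST carries,
+   the sketch below assembles one as a plain dictionary. The wire
+   encoding is not defined in this section, so the field names, the
+   ISO 8601 timestamps, and the 128-bit session identifier length are
+   assumptions made for the example; the agent and specification
+   identifiers are invented.
+
```python
import secrets
from datetime import datetime, timedelta, timezone


def build_verification_request(agent_id, spec_ref, constraint_ids, window):
    """Assemble a VERIFICATION_REQUEST as a dict (field names illustrative)."""
    now = datetime.now(timezone.utc)
    return {
        "type": "VERIFICATION_REQUEST",
        # 16 random bytes from the OS CSPRNG, hex-encoded (32 characters).
        "session_id": secrets.token_hex(16),
        "agent_id": agent_id,
        "spec_ref": spec_ref,
        "constraints": list(constraint_ids),
        # Timestamp bounds of the verification window, in UTC.
        "not_before": (now - window).isoformat(),
        "not_after": now.isoformat(),
    }


request = build_verification_request(
    "agent-42",                 # hypothetical agent identifier
    "example-spec-v3",          # hypothetical specification reference
    ["response_time_bound"],
    timedelta(minutes=5),
)
```
+
+   Signing the request and protecting it in transit are left to the
+   transport and attestation layers and are not shown here.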
+
+   Compliance evaluation proceeds through COMPLIANCE_CHECK messages
+   exchanged between Verification Requesters and designated
+   Compliance Checkers. Each compliance check message MUST reference
+   the behavioral specification being evaluated, include the
+   monitoring data to be assessed, and specify the verification
+   algorithms or rules to be applied. Compliance Checkers process the
+   monitoring data against the behavioral constraints and generate
+   COMPLIANCE_RESULT messages indicating whether the observed
+   behavior satisfies the specified requirements. Results MUST
+   include binary compliance indicators, detailed violation reports
+   when non-compliance is detected, and confidence metrics indicating
+   the reliability of the compliance assessment.
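+
+   The three required elements of a COMPLIANCE_RESULT (a binary
+   indicator, violation details, and a confidence metric) might be
+   produced as in the following sketch. The field names and the
+   sample-count-based confidence formula are assumptions for
+   illustration only; the draft does not define how confidence is
+   computed.
+
```python
def check_compliance(observations, limit_ms):
    """Compare observed response times to a bound; build a COMPLIANCE_RESULT."""
    violations = [
        {"timestamp": o["timestamp"],
         "observed_ms": o["response_ms"],
         "limit_ms": limit_ms}
        for o in observations
        if o["response_ms"] >= limit_ms
    ]
    return {
        "type": "COMPLIANCE_RESULT",
        "compliant": not violations,          # binary compliance indicator
        "violations": violations,             # detailed violation report
        # Naive stand-in for a confidence metric: more samples, more
        # confidence, capped at 1.0 once 100 observations are available.
        "confidence": min(1.0, len(observations) / 100),
    }


result = check_compliance(
    [
        {"timestamp": "2026-03-01T12:00:00Z", "response_ms": 120},
        {"timestamp": "2026-03-01T12:00:05Z", "response_ms": 250},
    ],
    limit_ms=200,
)
```
+
+   Here the second observation exceeds the bound, so the result is
+   non-compliant and carries a single violation entry.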
+
+   The protocol includes mechanisms for handling streaming
+   verification scenarios where agent behavior must be validated
+   continuously rather than in discrete sessions. Streaming
+   verification employs persistent connections where MONITORING_DATA
+   messages are transmitted in near real time as agent actions occur,
+   enabling immediate detection of behavioral deviations. Compliance
+   Checkers maintain running assessments of behavioral compliance and
+   generate COMPLIANCE_ALERT messages when violations are detected or
+   when behavioral patterns indicate potential drift from specified
+   policies. All streaming verification sessions MUST implement flow
+   control mechanisms to prevent resource exhaustion and SHOULD
+   include adaptive sampling techniques to manage verification
+   overhead in high-throughput scenarios.
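+
+   A running assessment of the kind described above can be sketched
+   with a bounded sliding window. The window size, the drift
+   threshold, the alert field names, and the use of a simple
+   violation rate as the drift signal are all assumptions; flow
+   control and adaptive sampling are not modeled.
+
```python
from collections import deque


class StreamingChecker:
    """Keep a running violation rate over the most recent observations."""

    def __init__(self, limit_ms, window=50, drift_threshold=0.2):
        self.limit_ms = limit_ms
        self.recent = deque(maxlen=window)  # bounded memory per session
        self.drift_threshold = drift_threshold

    def observe(self, response_ms):
        """Record one observation; return a COMPLIANCE_ALERT dict or None."""
        violated = response_ms >= self.limit_ms
        self.recent.append(violated)
        rate = sum(self.recent) / len(self.recent)
        if violated:
            # Immediate alert for the violating action itself.
            return {"type": "COMPLIANCE_ALERT",
                    "reason": "violation", "rate": rate}
        if rate >= self.drift_threshold:
            # No violation now, but recent history suggests drift.
            return {"type": "COMPLIANCE_ALERT",
                    "reason": "possible_drift", "rate": rate}
        return None
```
+
+   Because the deque has a fixed maximum length, old observations age
+   out and the alert clears once recent behavior is compliant again.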
+
+   Attestation generation occurs through ATTESTATION_REQUEST messages
+   that trigger the creation of cryptographic proofs of compliance
+   assessment results. These requests MUST specify the compliance
+   results to be attested, the cryptographic algorithms to be used
+   for attestation generation, and any additional claims or
+   assertions to be included in the attestation. The resulting
+   ATTESTATION_RESPONSE messages contain digitally signed
+   attestations that bind compliance results to specific agents, time
+   periods, and behavioral specifications through tamper-evident
+   cryptographic structures. Attestations MUST include sufficient
+   information to enable independent verification of compliance
+   claims and SHOULD reference the complete verification audit trail
+   to support forensic analysis when behavioral violations occur.
+
+7.  Compliance Reporting and Attestation
+
+   Compliance reporting in ABVP provides a standardized mechanism for
+   documenting and cryptographically attesting to agent behavior
+   verification results. A compliance report MUST contain the agent
+   identifier, verification period, evaluated behavior
+   specifications, compliance status for each specification, and
+   supporting evidence including behavioral observations and
+   verification computations. Reports MUST be generated at
+   configurable intervals or upon detection of compliance violations,
+   with emergency reports triggered immediately when critical policy
+   violations occur. The reporting format MUST support both human-
+   readable summaries and machine-processable structured data to
+   enable automated compliance monitoring and audit trail generation.
+
+   Cryptographic attestation ensures the integrity and non-
+   repudiation of compliance reports through digital signatures and
+   hash chain mechanisms. Each compliance report MUST be digitally
+   signed by the generating Compliance Checker using keys certified
+   within the ABVP trust framework. Attestations MUST include a
+   timestamp from a trusted time source, the hash of the previous
+   attestation to form a verification chain, and sufficient
+   cryptographic binding to prevent tampering or replay attacks. The
+   attestation format SHOULD follow established standards such as RFC
+   8392 (CWT) or RFC 7519 (JWT) to ensure interoperability with
+   existing security infrastructures.
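+
+   The hash chain that links each attestation to its predecessor can
+   be illustrated as follows. This sketch covers only the chaining
+   step: the digital signature and trusted timestamp required above
+   are omitted, and the choice of SHA-256 over a sorted-key JSON
+   encoding, along with the field names, is an assumption made for
+   the example rather than a format defined by ABVP.
+
```python
import hashlib
import json


def attestation_digest(attestation: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact)."""
    encoded = json.dumps(
        attestation, sort_keys=True, separators=(",", ":")
    ).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_attestation(chain: list, report: dict, timestamp: str) -> dict:
    """Link a new attestation to the previous one by embedding its digest."""
    prev = attestation_digest(chain[-1]) if chain else None
    entry = {"report": report, "timestamp": timestamp, "prev_hash": prev}
    chain.append(entry)
    return entry


def chain_is_intact(chain: list) -> bool:
    """Recompute every link; editing an earlier entry breaks a later one."""
    if chain and chain[0]["prev_hash"] is not None:
        return False
    return all(
        chain[i]["prev_hash"] == attestation_digest(chain[i - 1])
        for i in range(1, len(chain))
    )
```
+
+   Note that the chain alone does not protect the most recent entry,
+   since nothing yet refers to it; that is one reason each
+   attestation is also signed.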
+
+   Trust chain establishment requires a hierarchical certification
+   authority structure where Compliance Checkers obtain certificates
+   from trusted ABVP Certificate Authorities. Root certificates for
+   ABVP trust anchors MUST be distributed through secure channels and
+   updated using standard certificate management practices as defined
+   in RFC 5280. Verification entities MUST validate the complete
+   certificate chain from the signing Compliance Checker to a trusted
+   root before accepting attestations. Certificate revocation MUST be
+   supported through standard mechanisms such as Certificate
+   Revocation Lists (CRLs) or Online Certificate Status Protocol
+   (OCSP), as specified in RFC 5280 and RFC 6960, respectively.
+
+   The compliance reporting protocol defines specific message formats
+   for distributing attestations to interested parties including
+   agent operators, regulatory authorities, and other verification
+   systems. Compliance reports MAY be distributed through push
+   mechanisms to subscribed entities or pulled on demand through
+   standardized query interfaces. Report distribution MUST preserve
+   attestation integrity while allowing for appropriate access
+   control based on the sensitivity of the reported agent behaviors.
+   Long-term storage and archival of compliance reports SHOULD
+   implement tamper-evident logging mechanisms to support forensic
+   analysis and regulatory compliance requirements.
+
+8.  Security Considerations
+
+   The ABVP verification infrastructure introduces several security
+   considerations that must be addressed to ensure the integrity and
+   trustworthiness of agent behavior verification. The protocol's
+   reliance on continuous monitoring and attestation creates
+   potential attack vectors that could compromise the verification
+   process itself. Attackers may attempt to subvert verification
+   mechanisms to mask non-compliant agent behavior or to falsely
+   indicate compliance violations where none exist. The verification
+   system MUST be designed with the assumption that both the
+   monitored agents and the verification infrastructure may be
+   targets of sophisticated adversaries seeking to undermine
+   behavioral compliance validation.
+
+   Attestation integrity represents a critical security requirement
+   for ABVP implementations. Verification attestations MUST be
+   cryptographically signed using mechanisms that provide non-
+   repudiation and tamper detection capabilities. The attestation
+   chain MUST be anchored in a trusted root of trust, such as
+   hardware security modules or trusted platform modules, to prevent
+   forgery of compliance attestations. Implementations SHOULD employ
+   time-stamping mechanisms to prevent replay attacks where old
+   attestations are reused to mask current non-compliance. The
+   cryptographic algorithms used for attestation signing MUST conform
+   to current best practices for digital signatures and SHOULD
+   support algorithm agility to enable updates as cryptographic
+   standards evolve. Key management for attestation signing MUST
+   follow established security practices, including regular key
+   rotation and secure key storage.
+
+   The distributed nature of ABVP verification creates additional
+   security challenges related to verification node compromise and
+   Byzantine behavior among verification participants. Verification
+   nodes may be compromised by attackers seeking to manipulate
+   compliance reporting or inject false verification results.
+   Implementations MUST employ consensus mechanisms or threshold-
+   based verification approaches to detect and mitigate the impact of
+   compromised verification nodes. The protocol SHOULD include
+   mechanisms for verification node authentication and authorization
+   to prevent unauthorized participants from joining verification
+   networks. Network communications between verification components
+   MUST be encrypted and authenticated to prevent eavesdropping and
+   man-in-the-middle attacks. Implementations SHOULD implement rate
+   limiting and anomaly detection to identify potential denial-of-
+   service attacks against verification infrastructure.
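+
+   A threshold-based approach of the kind mentioned above can be
+   sketched as a simple vote over per-node verdicts. This is not a
+   Byzantine fault tolerant consensus protocol; it only shows how a
+   result could be accepted when at least a configured number of
+   nodes agree, and how dissenting nodes could be flagged. The
+   function names and node identifiers are hypothetical.
+
```python
from collections import Counter


def threshold_verdict(verdicts: dict, threshold: int):
    """Return the verdict reported by at least `threshold` nodes, else None."""
    if not verdicts:
        return None
    value, count = Counter(verdicts.values()).most_common(1)[0]
    return value if count >= threshold else None


def dissenting_nodes(verdicts: dict, agreed) -> list:
    """Nodes whose report differs from the agreed verdict."""
    return sorted(node for node, v in verdicts.items() if v != agreed)


# Three verification nodes report on the same agent; one disagrees.
reports = {"bvn-a": True, "bvn-b": True, "bvn-c": False}
agreed = threshold_verdict(reports, threshold=2)
```
+
+   Callers should compare the return value with "is None" rather
+   than by truthiness, because an agreed verdict of False (non-
+   compliant) is a valid outcome.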
+
+   Behavioral specification tampering and specification substitution
+   attacks pose significant threats to the ABVP framework's
+   effectiveness. Attackers may attempt to modify behavioral
+   specifications to make non-compliant behavior appear compliant or
+   to introduce specifications that are impossible to verify
+   accurately. Behavioral specifications MUST be cryptographically
+   protected through digital signatures and integrity checking
+   mechanisms. The protocol MUST include versioning and change
+   tracking for behavioral specifications to detect unauthorized
+   modifications. Verification systems SHOULD implement specification
+   validation to detect specifications that contain logical
+   inconsistencies or verification bypasses. Access controls for
+   specification modification MUST follow the principle of least
+   privilege and include audit logging of all specification changes.
+
+   The ABVP verification process may inadvertently expose sensitive
+   information about agent operations, internal state, or the systems
+   being monitored. Verification data collection MUST be designed to
+   minimize information disclosure while maintaining verification
+   effectiveness. Implementations SHOULD employ privacy-preserving
+   techniques such as zero-knowledge proofs or selective disclosure
+   mechanisms where appropriate to limit exposure of sensitive
+   operational details. Verification logs and attestations MUST be
+   protected against unauthorized access and SHOULD include data
+   retention policies that balance verification auditability with
+   privacy requirements. The protocol MUST consider the implications
+   of cross-border data flows when verification infrastructure spans
+   multiple jurisdictions with different privacy regulations.
+
+   Side-channel attacks and covert channels represent additional
+   security considerations for ABVP implementations. The verification
+   process itself may create observable patterns that could be
+   exploited by attackers to infer information about agent behavior
+   or verification outcomes. Timing-based side channels in
+   verification operations may reveal information about the
+   complexity or results of compliance checking. Implementations
+   SHOULD consider countermeasures such as constant-time operations
+   and traffic analysis resistance where appropriate. The protocol
+   design MUST consider how verification metadata and communication
+   patterns might be used to build profiles of agent behavior that
+   could compromise operational security or reveal sensitive system
+   characteristics.
+
+9.  IANA Considerations
+
+   This document requires the registration of several new namespaces
+   and protocol parameters with the Internet Assigned Numbers
+   Authority (IANA). These registrations are necessary to ensure
+   global uniqueness and interoperability of ABVP implementations
+   across different vendors and deployment environments.
+
+   IANA SHALL establish a new registry titled "Agent Behavior
+   Verification Protocol (ABVP) Parameters" under the "Structured
+   Syntax Suffixes" registry group. This registry SHALL contain three
+   sub-registries: "Behavior Specification Schema Types",
+   "Verification Message Types", and "Attestation Format
+   Identifiers". The registration policy for all ABVP parameter sub-
+   registries SHALL follow the "Specification Required" policy as
+   defined in RFC 8126, with the additional requirement that all
+   registrations include a reference to a publicly available
+   specification document and demonstrate interoperability with at
+   least one existing ABVP implementation.
+
+   The "Behavior Specification Schema Types" sub-registry SHALL
+   maintain identifiers for standardized behavior specification
+   formats as defined in Section 5. Each registration MUST include a
+   unique identifier string, a human-readable description, a
+   reference specification, and version information. Initial
+   registrations SHALL include "abvp-policy-v1" for the base policy
+   specification format and "abvp-constraints-v1" for behavioral
+   constraint specifications. The "Verification Message Types" sub-
+   registry SHALL contain identifiers for protocol messages defined
+   in Section 6, including verification requests, compliance reports,
+   and attestation messages. Registration entries MUST specify the
+   message identifier, purpose, required parameters, and applicable
+   verification contexts.
+
+   The "Attestation Format Identifiers" sub-registry SHALL maintain
+   identifiers for cryptographic attestation formats used in
+   compliance reporting as specified in Section 7. Each registration
+   MUST include the attestation format identifier, cryptographic
+   algorithm requirements, trust model specifications, and
+   interoperability considerations. IANA SHALL reserve the identifier
+   prefix "abvp-" for protocol-specific attestation formats and MAY
+   delegate sub-namespace management to recognized standards bodies
+   for domain-specific attestation requirements. All registry entries
+   MUST include contact information for the registrant and SHALL be
+   subject to periodic review to ensure continued relevance and
+   security adequacy.
+
+Author's Address
+
+   Generated by IETF Draft Analyzer
+   2026-03-01