feat: add draft data, gap analysis report, and workspace config
2026-04-06 18:47:15 +02:00
parent 4f310407b0
commit 2506b6325a
189 changed files with 62649 additions and 0 deletions


@@ -0,0 +1,289 @@
---
title: "Agent Ecosystem Model (AEM): Architecture and Terminology"
abbrev: "AEM"
category: info
docname: draft-aem-agent-ecosystem-model-00
submissiontype: IETF
number:
date:
v: 3
area: "OPS"
workgroup: "NMOP"
keyword:
- agent ecosystem
- DAG
- HITL
- agentic workflows
author:
-
fullname: TBD
organization: Independent
email: placeholder@example.com
normative:
RFC2119:
RFC8174:
informative:
I-D.nennemann-wimse-ect:
title: "Execution Context Tokens for Distributed Agentic Workflows"
target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
I-D.nennemann-agent-dag-hitl-safety:
title: "Agent Context Policy Token: DAG Delegation with Human Override"
target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
--- abstract
This document defines the Agent Ecosystem Model (AEM), a shared
architecture and terminology for building interoperable agent
systems that incorporate DAG-based execution, human-in-the-loop
safety, and graduated assurance levels. AEM is not a protocol.
It is a reference model that establishes common vocabulary and
architectural concepts so that companion specifications (ATD,
HITL, AEPB, APAE) and implementors share a consistent frame of
reference. The model builds on Execution Context Tokens (ECT)
for execution evidence and ACP-DAG-HITL for delegation policy.
--- middle
# Introduction
The IETF AI/agent landscape includes over 260 drafts proposing
protocols for agent communication, identity, safety, and
operations. These drafts share many implicit concepts — tasks,
delegation, workflows, safety checks — but use inconsistent
terminology and incompatible models.
AEM provides a single reference architecture so that:
- Companion drafts (ATD, HITL, AEPB, APAE) share vocabulary.
- Implementors understand how the pieces compose.
- New proposals can position themselves within an existing model
rather than inventing another one.
AEM is deliberately not a protocol. It defines no wire formats,
no endpoints, and no new token types. It is the map; the
companion drafts are the territory.
## Design Principles
1. **ECT is the execution backbone.** All significant agent
actions produce Execution Context Tokens
{{I-D.nennemann-wimse-ect}}. The ecosystem does not define a
second DAG or audit format.
2. **ACP-DAG-HITL is the policy backbone.**
{{I-D.nennemann-agent-dag-hitl-safety}} defines delegation
DAGs and HITL rules. The ecosystem extends these with
operational semantics, not replacement structures.
3. **Same model, different assurance.** The architecture works
identically from a relaxed K8s dev cluster (ECT L1) to a
regulated healthcare environment (ECT L3 with audit ledger).
4. **Protocol-agnostic.** The ecosystem sits above any A2A
protocol. Agents may speak different protocols and still
participate through translation.
# Conventions and Definitions
{::boilerplate bcp14-tagged}
# Terminology {#terminology}
Agent:
: An autonomous software entity that performs tasks, makes
decisions, and communicates with other agents or humans.
Task:
: A discrete unit of work performed by an agent, recorded as a
single ECT node.
Workflow:
: A set of tasks linked by dependencies, forming a DAG.
Identified by the ECT `wid` claim.
DAG (Directed Acyclic Graph):
: The execution graph formed by ECT parent references (`par`
claims). Also used in ACP-DAG-HITL for delegation structure.
Checkpoint:
: An ECT node recording agent state before a consequential
action, enabling rollback.
HITL Point:
: A position in the workflow where human intervention is
required or available, governed by ACP-DAG-HITL rules.
Override:
: A human-initiated command that alters an agent's autonomous
operation, taking precedence over the agent's own decisions.
Trust Score:
: A floating-point value in \[0.0, 1.0\] representing one
agent's assessed reliability of another.
Protocol Binding:
: The mapping between ecosystem semantics and a specific A2A
communication protocol.
Assurance Level:
: The degree of cryptographic and audit protection applied to
ECTs: L1 (unsigned JSON), L2 (signed JWT), L3 (signed +
audit ledger). Defined by {{I-D.nennemann-wimse-ect}}.
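The Workflow and DAG definitions above can be illustrated with a minimal sketch. It assumes each ECT payload is a plain dictionary in which `par` is a list of parent `jti` values; that shape, and the helper names `group_by_workflow` and `is_acyclic`, are illustrative assumptions rather than anything defined by ECT.

```python
from collections import defaultdict


def group_by_workflow(ects):
    """Group ECT payloads by their `wid` claim (one group per workflow)."""
    workflows = defaultdict(list)
    for ect in ects:
        workflows[ect["wid"]].append(ect)
    return dict(workflows)


def is_acyclic(ects):
    """Return True if the `par` references among these ECTs contain no cycle.

    Assumption: `par` is a list of parent `jti` strings. Parents that are
    not part of the supplied set are ignored rather than treated as errors.
    """
    parents = {e["jti"]: list(e.get("par", [])) for e in ects}
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {jti: UNSEEN for jti in parents}

    def visit(jti):
        if state[jti] == IN_PROGRESS:  # back edge: a cycle exists
            return False
        if state[jti] == DONE:
            return True
        state[jti] = IN_PROGRESS
        for parent in parents[jti]:
            if parent in parents and not visit(parent):
                return False
        state[jti] = DONE
        return True

    return all(visit(jti) for jti in parents)
```

A depth-first search is used here only because it is short; any standard cycle check would serve equally well.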
# Architectural Model {#architecture}
The ecosystem is organized in four layers:
~~~
┌─────────────────────────────────────────────────────┐
│ Policy Layer │
│ ACP-DAG-HITL: delegation DAG, HITL rules, │
│ node constraints, trust thresholds │
├─────────────────────────────────────────────────────┤
│ Semantics Layer │
│ ATD: execution order, checkpoints, rollback, │
│ circuit breakers, resource hints │
│ HITL: override levels, approval gates, escalation │
│ AEPB: capability ads, negotiation, translation │
│ APAE: trust scoring, behavior verification, │
│ provenance, assurance profiles │
├─────────────────────────────────────────────────────┤
│ Evidence Layer │
│ ECT: signed DAG of execution records (L1/L2/L3) │
│ inp_hash/out_hash, ext claims, audit ledger │
├─────────────────────────────────────────────────────┤
│ Identity Layer │
│ WIMSE / X.509 / OAuth / JWK: agent identity │
└─────────────────────────────────────────────────────┘
~~~
{: #fig-stack title="Ecosystem Layer Stack"}
Identity Layer:
: Answers "who is this agent?" AEM does not define identity
mechanisms; it assumes WIMSE, X.509, OAuth, or equivalent.
Evidence Layer:
: Answers "what did this agent do?" ECT provides per-task
execution records linked into a DAG, signed at assurance levels
L2 and L3 and unsigned at L1.
Semantics Layer:
: Answers "what does it mean, and what should be done about it?" The
four companion drafts define operational semantics on top of
ECT:
- **ATD** (Agent Task DAG): execution order, checkpoints,
rollback, circuit breakers, resource hints.
- **HITL** (Human-in-the-Loop): override levels, approval
gates, escalation paths, explainability.
- **AEPB** (Agent Ecosystem Protocol Binding): capability
advertisement, protocol negotiation, translation gateways,
agent lifecycle.
- **APAE** (Assurance Profiles): dynamic trust scoring,
behavior verification, data provenance, assurance profiles.
Policy Layer:
: Answers "what's allowed?" ACP-DAG-HITL defines delegation
constraints and HITL trigger rules. Companion drafts extend
`constraints` with draft-specific fields (trust thresholds,
checkpoint policies, protocol restrictions).
## How ECT Extensions Work
Each companion draft defines `ext` claim namespaces on ECT:
| Draft | `ext` prefix | Example claims |
|-------|-------------|----------------|
| ATD | `atd.*` | `atd.reversible`, `atd.severity`, `atd.circuit_state` |
| HITL | `hitl.*` | `hitl.level`, `hitl.operator_id`, `hitl.prior_state` |
| AEPB | `aepb.*` | `aepb.source_protocol`, `aepb.dest_protocol` |
| APAE | `apae.*` | `apae.trust_score`, `apae.confidence`, `apae.hops` |
{: #fig-ext title="ECT Extension Namespaces"}
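As a rough illustration of the namespacing in the table above, the following sketch splits a flat `ext` object into per-draft groups. It assumes that `ext` keys are dotted strings such as `atd.reversible`; the function names are hypothetical.

```python
KNOWN_PREFIXES = ("atd", "hitl", "aepb", "apae")


def split_ext_by_namespace(ext):
    """Group flat, dotted `ext` keys by their namespace prefix.

    Keys without a dot are collected under the empty prefix "".
    """
    grouped = {}
    for key, value in ext.items():
        prefix, dot, name = key.partition(".")
        if not dot:
            prefix, name = "", key
        grouped.setdefault(prefix, {})[name] = value
    return grouped


def unknown_namespaces(ext):
    """Return the prefixes in `ext` that no companion draft claims."""
    return sorted(set(split_ext_by_namespace(ext)) - set(KNOWN_PREFIXES))
```

A receiver could use such a split to hand each group to the component that implements the corresponding companion draft.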
## How Policy Extensions Work
Each companion draft defines `constraints` fields on
ACP-DAG-HITL DAG nodes:
| Draft | Constraint fields |
|-------|------------------|
| ATD | `atd.checkpoint_policy`, `atd.circuit_threshold` |
| HITL | (uses HITL rules directly) |
| AEPB | `aepb.allowed_protocols`, `aepb.max_translation_hops` |
| APAE | `apae.min_trust`, `apae.min_confidence`, `apae.assurance_profile` |
{: #fig-constraints title="ACP-DAG-HITL Node Constraint Extensions"}
# Assurance as an Orthogonal Axis {#assurance}
The entire semantics layer operates identically at all ECT
assurance levels. The DAG structure, HITL processing, trust
scoring, and protocol translation are the same whether the ECT
is unsigned JSON (L1) or a ledger-committed signed JWT (L3).
What changes across levels is the security envelope:
| Property | L1 | L2 | L3 |
|----------|----|----|-----|
| Structured execution records | Yes | Yes | Yes |
| DAG validation | Yes | Yes | Yes |
| Non-repudiation | No | Yes | Yes |
| Tamper detection | Transport only | Signature | Signature + ledger |
| Regulatory audit trail | No | No | Yes |
{: #fig-assurance title="Assurance Level Properties"}
A deployment MAY use different levels for different workflows.
Internal dev pipelines might use L1; cross-org integrations L2;
regulated clinical workflows L3.
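One way to read the table above is as a lookup from required properties to the lowest level that provides them. The sketch below encodes only the yes/no rows; the property keys and the `minimum_level` helper are illustrative names, not part of any specification.

```python
# Yes/No rows of the assurance table, keyed by hypothetical property names.
ASSURANCE = {
    "L1": {"structured_records": True, "dag_validation": True,
           "non_repudiation": False, "regulatory_audit_trail": False},
    "L2": {"structured_records": True, "dag_validation": True,
           "non_repudiation": True, "regulatory_audit_trail": False},
    "L3": {"structured_records": True, "dag_validation": True,
           "non_repudiation": True, "regulatory_audit_trail": True},
}


def minimum_level(required):
    """Return the lowest level providing every required property, else None."""
    for level in ("L1", "L2", "L3"):
        if all(ASSURANCE[level][prop] for prop in required):
            return level
    return None
```

For example, a workflow that needs only DAG validation maps to L1, while one that needs non-repudiation maps to L2.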
# Protocol Agnosticism {#agnosticism}
The ecosystem layer sits above any A2A communication protocol.
Agents communicate via their native protocol (A2A, MCP, SLIM,
uACP, etc.) while the Execution-Context HTTP header
{{I-D.nennemann-wimse-ect}} carries ECTs alongside protocol
messages.
When two agents speak different protocols, a translation gateway
(defined by AEPB) converts between protocols while preserving
ECT DAG continuity. The translation hop is itself an ECT node,
so the cross-protocol path is one auditable DAG.
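The idea that a translation hop is itself an ECT node might look roughly like the sketch below. The `exec_act` value `aepb:translate`, the hex encoding of the hashes, and the function name are assumptions made for illustration; AEPB and ECT define the actual values.

```python
import hashlib
import uuid


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest (the encoding is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def translation_hop(inbound_ect, inbound_msg: bytes, outbound_msg: bytes,
                    source_protocol: str, dest_protocol: str):
    """Build an ECT payload for one gateway hop.

    The hop stays in the same workflow (`wid`) and names the inbound ECT
    as its parent, so the cross-protocol path remains a single DAG.
    """
    return {
        "jti": str(uuid.uuid4()),
        "wid": inbound_ect["wid"],
        "par": [inbound_ect["jti"]],
        "exec_act": "aepb:translate",  # hypothetical value
        "inp_hash": sha256_hex(inbound_msg),
        "out_hash": sha256_hex(outbound_msg),
        "ext": {
            "aepb.source_protocol": source_protocol,
            "aepb.dest_protocol": dest_protocol,
        },
    }
```

Signing and transport of the resulting payload are out of scope for this sketch.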
# Companion Draft Summary {#companions}
| Draft | Abbrev | Concern | Gaps Addressed |
|-------|--------|---------|----------------|
| Agent Task DAG | ATD | Execution, checkpoints, rollback | #1 Resource Mgmt, #3 Error Recovery |
| Human-in-the-Loop | HITL | Override, approval, escalation | #7 Human Override, #11 Explainability |
| Protocol Binding | AEPB | Interop, translation, lifecycle | #4 Cross-Protocol, #5 Lifecycle |
| Assurance Profiles | APAE | Trust, verification, provenance | #2 Behavior Verification, #8 Cross-Domain, #9 Dynamic Trust, #12 Provenance |
{: #fig-companions title="Companion Draft Family"}
Together with ECT (evidence) and ACP-DAG-HITL (policy), these
six documents cover all 3 critical and 6 high-severity gaps
identified in the IETF AI/agent draft landscape.
# Security Considerations
AEM defines no protocol mechanisms and therefore introduces no
direct security considerations. Security properties are
inherited from the evidence layer (ECT assurance levels) and
the policy layer (ACP-DAG-HITL validation).
Implementors MUST ensure that all layers are consistently
configured: an L3 ECT deployment provides no additional
assurance if the policy layer accepts unsigned tokens.
# IANA Considerations
This document has no IANA actions.
--- back
# Acknowledgments
{:numbered="false"}
This architecture builds on the Execution Context Token
specification {{I-D.nennemann-wimse-ect}} and the Agent Context
Policy Token {{I-D.nennemann-agent-dag-hitl-safety}}.


@@ -0,0 +1,461 @@
---
title: "Agent Ecosystem Model (AEM): Architecture and Terminology"
abbrev: "AEM"
category: info
docname: draft-aem-agent-ecosystem-model-01
submissiontype: IETF
number:
date:
v: 3
area: "OPS"
workgroup: "NMOP"
keyword:
- agent ecosystem
- DAG
- HITL
- agentic workflows
author:
-
fullname: TBD
organization: Independent
email: placeholder@example.com
normative:
RFC2119:
RFC8174:
I-D.nennemann-wimse-ect:
title: "Execution Context Tokens for Distributed Agentic Workflows"
target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
I-D.nennemann-agent-dag-hitl-safety:
title: "Agent Context Policy Token: DAG Delegation with Human Override"
target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
informative:
RFC9334:
RFC7519:
RFC8615:
--- abstract
This document defines the Agent Ecosystem Model (AEM), a shared
architecture and terminology for building interoperable agent
systems that incorporate DAG-based execution, human-in-the-loop
safety, and graduated assurance levels. AEM is not a protocol.
It is a reference model that establishes common vocabulary and
architectural concepts so that companion specifications (ATD,
HITL, AEPB, APAE) and implementors share a consistent frame of
reference. The model builds on Execution Context Tokens (ECT)
for execution evidence and ACP-DAG-HITL for delegation policy.
--- middle
# Introduction
The IETF AI/agent landscape includes over 260 drafts proposing
protocols for agent communication, identity, safety, and
operations. These drafts share many implicit concepts — tasks,
delegation, workflows, safety checks — but use inconsistent
terminology and incompatible models.
AEM provides a single reference architecture so that:
- Companion drafts (ATD, HITL, AEPB, APAE) share vocabulary.
- Implementors understand how the pieces compose.
- New proposals can position themselves within an existing model
rather than inventing another one.
AEM is deliberately not a protocol. It defines no wire formats,
no endpoints, and no new token types. It is the map; the
companion drafts are the territory.
## Design Principles
1. **ECT is the execution backbone.** All significant agent
actions produce Execution Context Tokens
{{I-D.nennemann-wimse-ect}}. The ecosystem does not define a
second DAG or audit format.
2. **ACP-DAG-HITL is the policy backbone.**
{{I-D.nennemann-agent-dag-hitl-safety}} defines delegation
DAGs and HITL rules. The ecosystem extends these with
operational semantics, not replacement structures.
3. **Same model, different assurance.** The architecture works
identically from a relaxed K8s dev cluster (ECT L1) to a
regulated healthcare environment (ECT L3 with audit ledger).
4. **Protocol-agnostic.** The ecosystem sits above any A2A
protocol. Agents may speak different protocols and still
participate through translation.
# Conventions and Definitions
{::boilerplate bcp14-tagged}
# Terminology {#terminology}
Agent:
: An autonomous software entity that performs tasks, makes
decisions, and communicates with other agents or humans.
Task:
: A discrete unit of work performed by an agent, recorded as a
single ECT node.
Workflow:
: A set of tasks linked by dependencies, forming a DAG.
Identified by the ECT `wid` claim {{I-D.nennemann-wimse-ect}}.
DAG (Directed Acyclic Graph):
: The execution graph formed by ECT parent references (`par`
claims). Also used in ACP-DAG-HITL for delegation structure.
Checkpoint:
: An ECT node recording agent state before a consequential
action, enabling rollback. Fully specified in ATD.
HITL Point:
: A position in the workflow where human intervention is
required or available, governed by ACP-DAG-HITL rules.
Override:
: A human-initiated command that alters an agent's autonomous
operation, taking precedence over the agent's own decisions.
Fully specified in HITL.
Trust Score:
: A floating-point value in \[0.0, 1.0\] representing one
agent's assessed reliability of another. Updated using an
AIMD model; fully specified in APAE.
Protocol Binding:
: The mapping between ecosystem semantics and a specific A2A
communication protocol. Fully specified in AEPB.
Assurance Level:
: The degree of cryptographic and audit protection applied to
ECTs, defined in {{I-D.nennemann-wimse-ect}}:
| Level | ECT Format | Non-repudiation | Tamper detection | Audit ledger |
|-------|-----------|----------------|-----------------|-------------|
| L1 | Unsigned JSON | No | Transport only | No |
| L2 | Signed JWT | Yes | Signature | No |
| L3 | Signed JWT | Yes | Signature | Yes (ledger-committed) |
{: #fig-levels title="ECT Assurance Levels"}
Assurance Profile:
: A named configuration (Relaxed, Standard, Regulated) selecting
which mechanisms are required at a given deployment. Fully
specified in APAE.
Blast Radius:
: The set of agents and systems affected by a single failure.
Translation Gateway:
: A service converting messages between two agent protocols,
recording each hop as an ECT DAG node. Fully specified in AEPB.
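The AIMD (additive increase, multiplicative decrease) update mentioned under Trust Score can be sketched as follows. The step size and decrease factor are placeholder values chosen for illustration; APAE is the authority on the actual parameters and event types.

```python
def update_trust(score, success, increase=0.05, decrease_factor=0.5):
    """Apply one AIMD step to a trust score and clamp it to [0.0, 1.0].

    `increase` and `decrease_factor` are illustrative defaults only.
    """
    if success:
        score = score + increase          # additive increase
    else:
        score = score * decrease_factor   # multiplicative decrease
    return max(0.0, min(1.0, score))
```

The asymmetry is the point of AIMD: trust is gained slowly through repeated good outcomes and lost quickly after a bad one.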
# Architectural Model {#architecture}
The ecosystem is organized in four layers:
~~~
┌─────────────────────────────────────────────────────┐
│ Policy Layer │
│ ACP-DAG-HITL: delegation DAG, HITL rules, │
│ node constraints, trust thresholds │
├─────────────────────────────────────────────────────┤
│ Semantics Layer │
│ ATD: execution order, checkpoints, rollback, │
│ circuit breakers, resource hints │
│ HITL: override levels, approval gates, escalation │
│ AEPB: capability ads, negotiation, translation │
│ APAE: trust scoring, behavior verification, │
│ provenance, assurance profiles │
├─────────────────────────────────────────────────────┤
│ Evidence Layer │
│ ECT: signed DAG of execution records (L1/L2/L3) │
│ inp_hash/out_hash, ext claims, audit ledger │
├─────────────────────────────────────────────────────┤
│ Identity Layer │
│ WIMSE / X.509 / OAuth / JWK: agent identity │
└─────────────────────────────────────────────────────┘
~~~
{: #fig-stack title="Ecosystem Layer Stack"}
Identity Layer:
: Answers "who is this agent?" AEM does not define identity
mechanisms; it assumes WIMSE, X.509, OAuth, or equivalent.
Evidence Layer:
: Answers "what did this agent do?" ECT provides per-task
execution records linked into a DAG, signed at assurance levels
L2 and L3 and unsigned at L1.
Semantics Layer:
: Answers "what does it mean, and what should be done about it?" The
four companion drafts define operational semantics on top of
ECT:
- **ATD** (Agent Task DAG): execution order, checkpoints,
rollback, circuit breakers, resource hints.
- **HITL** (Human-in-the-Loop): override levels, approval
gates, escalation paths, explainability.
- **AEPB** (Agent Ecosystem Protocol Binding): capability
advertisement, protocol negotiation, translation gateways,
agent lifecycle.
- **APAE** (Assurance Profiles): dynamic trust scoring,
behavior verification, data provenance, assurance profiles.
Policy Layer:
: Answers "what's allowed?" ACP-DAG-HITL defines delegation
constraints and HITL trigger rules. Companion drafts extend
`constraints` with draft-specific fields (trust thresholds,
checkpoint policies, protocol restrictions).
## How ECT Extensions Work {#ect-ext}
Each companion draft defines `ext` claim namespaces on ECT:
| Draft | `ext` prefix | Example claims |
|-------|-------------|----------------|
| ATD | `atd.*` | `atd.reversible`, `atd.severity`, `atd.circuit_state` |
| HITL | `hitl.*` | `hitl.level`, `hitl.operator_id`, `hitl.prior_state` |
| AEPB | `aepb.*` | `aepb.source_protocol`, `aepb.dest_protocol` |
| APAE | `apae.*` | `apae.trust_score`, `apae.confidence`, `apae.hops` |
{: #fig-ext title="ECT Extension Namespaces"}
A draft MUST NOT use another draft's `ext` namespace without a
normative reference to that draft.
## How Policy Extensions Work {#policy-ext}
Each companion draft defines `constraints` fields on
ACP-DAG-HITL DAG nodes:
| Draft | Constraint fields |
|-------|------------------|
| ATD | `atd.checkpoint_policy`, `atd.circuit_threshold` |
| HITL | (uses ACP-DAG-HITL HITL rule fields directly) |
| AEPB | `aepb.allowed_protocols`, `aepb.max_translation_hops` |
| APAE | `apae.min_trust`, `apae.min_confidence`, `apae.assurance_profile` |
{: #fig-constraints title="ACP-DAG-HITL Node Constraint Extensions"}
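A policy evaluator might check the constraint fields listed above before delegating a task, along the lines of this sketch. It assumes `constraints` is a flat dictionary keyed by the dotted field names; the `candidate` dictionary and its keys (`trust_score`, `confidence`, `protocol`, `translation_hops`) are hypothetical inputs invented for the example.

```python
def check_node_constraints(constraints, candidate):
    """Return the names of the constraint fields that the candidate violates.

    Constraints that are absent from the node are treated as unconstrained.
    """
    violations = []
    min_trust = constraints.get("apae.min_trust")
    if min_trust is not None and candidate["trust_score"] < min_trust:
        violations.append("apae.min_trust")
    min_conf = constraints.get("apae.min_confidence")
    if min_conf is not None and candidate["confidence"] < min_conf:
        violations.append("apae.min_confidence")
    allowed = constraints.get("aepb.allowed_protocols")
    if allowed is not None and candidate["protocol"] not in allowed:
        violations.append("aepb.allowed_protocols")
    max_hops = constraints.get("aepb.max_translation_hops")
    if max_hops is not None and candidate["translation_hops"] > max_hops:
        violations.append("aepb.max_translation_hops")
    return violations
```

An empty result means delegation may proceed as far as these four fields are concerned.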
# Assurance as an Orthogonal Axis {#assurance}
The entire semantics layer operates identically at all ECT
assurance levels. The DAG structure, HITL processing, trust
scoring, and protocol translation are the same whether the ECT
is unsigned JSON (L1) or a ledger-committed signed JWT (L3).
What changes across levels is the security envelope (see
{{fig-levels}}). A deployment MAY use different levels for
different workflows. Internal dev pipelines might use L1;
cross-org integrations L2; regulated clinical workflows L3.
Implementations MUST ensure consistency across layers: an L3
evidence configuration provides no additional assurance if the
policy layer accepts unsigned tokens.
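The consistency requirement can be read as "the effective assurance is that of the weakest layer". A minimal sketch of that reading follows; the function name and the notion of a policy layer's minimum accepted level are assumptions made for illustration.

```python
LEVEL_ORDER = {"L1": 1, "L2": 2, "L3": 3}


def effective_level(evidence_level, policy_min_accepted):
    """Return the weaker of the evidence level and the policy's floor.

    If the policy layer accepts L1 (unsigned) tokens, an L3 evidence
    configuration yields no more than L1 assurance overall.
    """
    return min(evidence_level, policy_min_accepted, key=LEVEL_ORDER.__getitem__)
```

A deployment check could compare this value against the level the operator intended and refuse to start on a mismatch.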
# Protocol Agnosticism {#agnosticism}
The ecosystem layer sits above any A2A communication protocol.
Agents communicate via their native protocol (A2A, MCP, SLIM,
uACP, etc.) while the `Execution-Context` HTTP header
{{I-D.nennemann-wimse-ect}} carries ECTs alongside protocol
messages.
When two agents speak different protocols, a translation gateway
(defined by AEPB) converts between protocols while preserving
ECT DAG continuity. The translation hop is itself an ECT node,
so the cross-protocol path is one auditable DAG.
# Relationship to Existing Standards {#standards}
The ecosystem builds on existing IETF and industry standards.
It does not replace any of them.
| Standard | Scope | Relationship to AEM |
|----------|-------|---------------------|
| WIMSE (draft-ietf-wimse-arch) | Workload identity and security context propagation | Identity Layer; AEM assumes WIMSE for agent credentials and context propagation. |
| ECT (I-D.nennemann-wimse-ect) | JWT-based execution evidence; DAG linkage via `par` | Evidence Layer; every significant action in the ecosystem produces an ECT. |
| ACP-DAG-HITL (I-D.nennemann-agent-dag-hitl-safety) | Delegation DAG policy; HITL trigger rules | Policy Layer; ATD/HITL/AEPB/APAE extend the `constraints` fields rather than replacing the policy language. |
| OAuth 2.0 / RAR (RFC9396) | Authorization for API access | Identity Layer; operators and agents authenticate to HITL endpoints and capability documents via OAuth. |
| RATS (RFC9334) | Remote attestation for verifying evidence freshness | Informative to APAE Regulated profile; behavior verification attestations are RATS-compatible. |
| SPIFFE/SPIRE | Workload identity URI scheme (`spiffe://`) | Identity Layer; agent identities in ECT `sub` and ACP-DAG-HITL node `agent` fields use SPIFFE URIs by convention. |
{: #fig-standards title="Relationship to Existing Standards"}
## Working Group Targets
| Companion Draft | Suggested WG | Rationale |
|----------------|-------------|-----------|
| AEM (this document) | NMOP | Informational reference model for network operations automation. |
| ATD | NMOP | Execution semantics and error recovery for network agent workflows. |
| HITL | NMOP or OPS | Human override for autonomous network management. |
| AEPB | DISPATCH or ART | Protocol binding and interoperability layer; dispatch to appropriate WG. |
| APAE | RATS or Security Dispatch | Attestation-based trust and assurance profiles for agents. |
{: #fig-wgs title="Suggested Working Group Targets"}
# Companion Draft Summary {#companions}
| Draft | Abbrev | Concern | Gaps Addressed | Normative/Informative |
|-------|--------|---------|----------------|----------------------|
| Agent Task DAG | ATD | Execution, checkpoints, rollback, circuit breakers | #1 Resource Mgmt, #3 Error Recovery | Normative |
| Human-in-the-Loop | HITL | Override, approval, escalation, explainability | #7 Human Override, #11 Explainability | Normative |
| Protocol Binding | AEPB | Interop, translation, lifecycle | #4 Cross-Protocol, #5 Lifecycle | Normative |
| Assurance Profiles | APAE | Trust, verification, provenance, dual-regime | #2 Behavior Verification, #8 Cross-Domain, #9 Dynamic Trust, #12 Provenance | Informative/Normative |
{: #fig-companions title="Companion Draft Family"}
Together with ECT (evidence) and ACP-DAG-HITL (policy), these
six documents cover all 3 critical and 6 high-severity gaps
identified in the IETF AI/agent draft landscape analysis.
# Implementation Guidance {#implementation}
## Choosing an Assurance Level
Operators select the assurance level based on deployment context:
Relaxed (L1):
: Appropriate for internal development, testing, and
observability pipelines. No cryptographic overhead.
Operators SHOULD NOT use L1 where ECT records could be
relied upon as evidence in disputes.
Standard (L2):
: Appropriate for production cross-organization deployments.
Signed ECTs provide non-repudiation. RECOMMENDED as the
default for any deployment where agents cross trust domains.
Regulated (L3):
: Required for deployments subject to regulatory audit
requirements (healthcare, finance, critical infrastructure).
ECTs are committed to an append-only audit ledger.
Operators MUST use L3 when a regulatory framework mandates
tamper-evident audit trails.
## Minimum Viable Implementation
An implementation is AEM-compliant if it satisfies all of the following:
1. **Evidence**: Emits ECTs for all consequential actions.
MAY use L1 initially.
2. **Policy**: Evaluates ACP-DAG-HITL node constraints before
delegating tasks.
3. **Checkpoints**: Implements ATD §4 (checkpoints before
consequential actions). MUST declare `atd.reversible`.
4. **HITL endpoint**: Implements HITL `/.well-known/hitl/override`
and responds within 1 second.
5. **Capability document**: Serves AEPB `/.well-known/aepb` so
peers can discover protocol support.
The following are OPTIONAL at L1 but REQUIRED at L2+:
- Cryptographic signing of ECTs.
- APAE trust scoring.
- Behavior verification.
The following are REQUIRED only at L3 (Regulated profile):
- Audit ledger commitment.
- Continuous behavior verification.
- Provenance claims on data-transforming ECT nodes.
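The requirements above can be summarized as a per-level feature set. In the sketch below the feature identifiers are hypothetical labels for the listed items, not registered names.

```python
# Items 1-5 of the minimum viable implementation (hypothetical labels).
BASELINE = frozenset({
    "ect_emission", "policy_evaluation", "checkpoints",
    "hitl_endpoint", "capability_document",
})
L2_PLUS = frozenset({"ect_signing", "trust_scoring", "behavior_verification"})
L3_ONLY = frozenset({
    "audit_ledger", "continuous_behavior_verification", "provenance_claims",
})


def required_features(level):
    """Return the set of features required at the given assurance level."""
    required = set(BASELINE)
    if level in ("L2", "L3"):
        required |= L2_PLUS
    if level == "L3":
        required |= L3_ONLY
    return required


def missing_features(level, implemented):
    """Return the required features that an implementation lacks."""
    return required_features(level) - set(implemented)
```

An operator could run such a check before raising a deployment from one level to the next.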
## Upgrade Path
Upgrading from L1 to L2:
: Add a signing key (WIMSE WIT or X.509). Update ECT emission
to sign tokens. Update all agents to verify signatures.
No protocol changes needed; ECT format is compatible.
Upgrading from L2 to L3:
: Configure an audit ledger endpoint. Update ECT emission to
commit each ECT. Enable APAE continuous behavior
verification. Enable provenance claims.
Operators MUST NOT downgrade the assurance level of a workflow
while that workflow is active.
# Security Considerations
## Threat Model
The AEM threat model covers the following adversary classes:
**Compromised Agent**: An agent that emits false ECTs, fabricates
errors, or attempts unauthorized rollbacks. Mitigated by ECT
signature verification (L2+) and ACP-DAG-HITL policy validation.
**Rogue Operator**: A human who issues unauthorized overrides.
Mitigated by HITL authentication requirements (signed JWTs,
mutual TLS) and multi-operator approval for Level 4 TAKEOVER.
**Translation Gateway Attack**: A malicious or compromised
gateway that alters message content in transit. Mitigated by
ECT `inp_hash`/`out_hash` integrity checks; receivers MUST
detect hash mismatches.
**Trust Score Manipulation**: An agent accumulates high trust
through benign behavior, then executes a malicious action.
Mitigated by APAE double-penalty for `policy_violation` events
and anomaly detection.
**Downgrade Attack**: An attacker forces use of L1 ECTs where
L2+ is required. Mitigated by explicit assurance level checks
in ACP-DAG-HITL constraints (`apae.assurance_profile` field).
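The hash-mismatch detection mentioned for translation gateways might be implemented roughly as follows. The sketch assumes hex-encoded SHA-256 values in `inp_hash` and `out_hash` and that the receiver holds both the sender's ECT and the gateway's ECT; both assumptions, and the function names, are illustrative.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest (the encoding is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def gateway_hop_is_intact(sender_ect, gateway_ect, received_msg: bytes) -> bool:
    """Check one gateway hop against the recorded hashes.

    1. The gateway's input must be what the sender recorded as output.
    2. The bytes actually received must match the gateway's recorded output.
    """
    input_ok = hmac.compare_digest(sender_ect["out_hash"],
                                   gateway_ect["inp_hash"])
    output_ok = hmac.compare_digest(gateway_ect["out_hash"],
                                    sha256_hex(received_msg))
    return input_ok and output_ok
```

This detects only disagreement between recorded hashes and observed bytes; whether the translation itself was faithful is a separate question that hashes alone cannot answer.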
## Layer Consistency Requirement
Implementations MUST configure the semantics, evidence, and
policy layers consistently. Specifically:
- An L3 evidence deployment MUST NOT accept L1 ECTs as proof
of action in audit or policy decisions.
- A Regulated assurance profile MUST be paired with L3 ECTs.
- HITL interactions at Level 2 and above (approval required) MUST
be authenticated.
## Translation Gateway Supply Chain
Translation gateways are privileged intermediaries: they have
access to plaintext message content and can inject ECT nodes.
Operators MUST:
- Authenticate gateways using the same identity mechanisms as
agents (WIMSE/SPIFFE).
- Audit gateway ECT nodes at L2+ for tamper detection.
- Limit `aepb.max_translation_hops` to prevent unbounded
delegation chains through untrusted gateways.
# IANA Considerations
## AEM Ecosystem Extension Registry
This document requests the creation of the "AEM Ecosystem
Extension Registry" under IANA. This registry collects:
1. **ECT Extension Namespaces**: Companion draft `ext` claim
prefixes (see {{fig-ext}}).
2. **ACP-DAG-HITL Constraint Field Namespaces**: Companion draft
`constraints` field prefixes (see {{fig-constraints}}).
3. **ECT `exec_act` Values**: All `exec_act` strings registered
by companion drafts (see each companion's IANA section).
Registration policy: Specification Required.
Initial entries: as defined in {{fig-ext}}, {{fig-constraints}},
and the companion draft `exec_act` registrations.
--- back
# Acknowledgments
{:numbered="false"}
This architecture builds on the Execution Context Token
specification {{I-D.nennemann-wimse-ect}} and the Agent Context
Policy Token {{I-D.nennemann-agent-dag-hitl-safety}}. The
working group targets in {{fig-wgs}} reflect the current IETF
AI/agent draft landscape analysis.


@@ -0,0 +1,397 @@
---
title: "Agent Error Recovery and Rollback (AERR)"
abbrev: "AERR"
category: std
docname: draft-aerr-agent-error-recovery-rollback-00
submissiontype: IETF
number:
date:
v: 3
area: "OPS"
workgroup: "NMOP"
keyword:
- error recovery
- rollback
- circuit breaker
- agentic workflows
- execution context
author:
-
fullname: Generated by IETF Draft Analyzer
organization: Independent
email: placeholder@example.com
normative:
RFC7519:
RFC7515:
RFC9110:
I-D.nennemann-wimse-ect:
title: "Execution Context Tokens for Distributed Agentic Workflows"
target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
informative:
I-D.nennemann-agent-dag-hitl-safety:
title: "Agent Context Policy Token: DAG Delegation with Human Override"
target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
--- abstract
This document defines the Agent Error Recovery and Rollback (AERR)
protocol, a standard for handling errors, cascading failures, and
rollback in multi-agent systems. AERR defines three mechanisms:
state checkpoints recorded as Execution Context Token (ECT) DAG
nodes, a circuit breaker pattern to contain cascading failures,
and a rollback protocol that walks the ECT DAG backwards to revert
agent actions to a known-good state. By building on ECT, AERR
inherits cryptographic audit trails, assurance levels, and DAG
validation without inventing parallel infrastructure.
--- middle
# Introduction
The IETF AI/agent landscape includes 60 drafts on autonomous
network operations but none that standardize error recovery. When
an autonomous agent misconfigures a router, allocates resources
incorrectly, or triggers a cascade of failures across a multi-agent
system, there is no standard mechanism for detecting the failure,
containing its blast radius, or reverting to a safe state.
AERR borrows proven patterns from distributed systems -- checkpoints
from database transactions, circuit breakers from microservice
architectures, rollback from version control -- and adapts them for
AI agent workflows. Rather than inventing its own audit and
tracing layer, AERR records all checkpoints, errors, and rollbacks
as ECT DAG nodes {{I-D.nennemann-wimse-ect}}, giving every
recovery action a cryptographic proof chain.
Design principles:
1. Agents that take consequential actions MUST be able to undo
them, or MUST declare them irreversible upfront.
2. Failure containment takes priority over failure diagnosis.
3. The protocol adds minimal overhead to the happy path.
# Conventions and Definitions
{::boilerplate bcp14-tagged}
Checkpoint:
: An ECT recording an agent's state hash before a consequential
action, providing a restore point for rollback.
Circuit Breaker:
: A mechanism that stops an agent from propagating requests to a
failing downstream agent, preventing cascading failures.
Rollback:
: The process of reverting an agent's actions and state to a
previously recorded checkpoint, walking the ECT DAG backwards.
Blast Radius:
: The set of agents and systems affected by a single agent's
failure, determinable by traversing the ECT DAG forward from the
failing node.
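The forward traversal in the Blast Radius definition can be sketched as a breadth-first search over `par` references. The sketch assumes ECT payloads are dictionaries, that `par` is a list of parent `jti` values, and that `sub` identifies the acting agent; the function name is hypothetical.

```python
from collections import defaultdict, deque


def blast_radius(ects, failing_jti):
    """Return (affected_jtis, affected_agents) downstream of a failing node."""
    # Invert the `par` references to obtain parent -> children edges.
    children = defaultdict(list)
    for ect in ects:
        for parent in ect.get("par", []):
            children[parent].append(ect["jti"])

    affected = set()
    queue = deque([failing_jti])
    while queue:
        current = queue.popleft()
        for child in children[current]:
            if child not in affected:
                affected.add(child)
                queue.append(child)

    agents = {e.get("sub") for e in ects if e["jti"] in affected}
    return affected, agents
```

The failing node itself is not included in the result; callers that want it can add it explicitly.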
# Problem Statement
Consider a network operations scenario: Agent A instructs Agent B
to update firewall rules, which causes Agent C's traffic monitoring
to fail, which causes Agent D to misclassify traffic. Today each
agent handles errors independently. There is no standard way for
Agent D to signal that the root cause is upstream, for the cascade
to be halted, or for the chain of actions to be rolled back.
The ECT DAG {{I-D.nennemann-wimse-ect}} already records causal
ordering of agent actions via `par` references. AERR adds
checkpoint semantics, error propagation, and rollback operations
on top of this existing structure.
# Checkpoint Mechanism {#checkpoints}
An AERR-compliant agent MUST create a checkpoint ECT before any
action it classifies as consequential. An action is consequential
if it modifies external state (e.g., network config, database
records, API calls with side effects).
## Checkpoint as ECT
A checkpoint is an ECT with:
- `exec_act`: `"aerr:checkpoint"`
- `par`: the `jti` of the preceding task ECT in the workflow
- `out_hash`: SHA-256 hash of the agent's state snapshot at
checkpoint time (for rollback integrity verification)
The `ext` claim carries AERR-specific metadata:
~~~json
{
"ext": {
"aerr.action_type": "config_update",
"aerr.target": "router-07.example.com",
"aerr.reversible": true,
"aerr.rollback_uri": "https://agent-b.example.com/aerr/rollback",
"aerr.ttl": 86400
}
}
~~~
{: #fig-checkpoint title="Checkpoint ECT Extension Claims"}
The `aerr.reversible` field MUST be present. If `false`, the
agent declares that this action cannot be automatically undone
and rollback requests MUST be escalated to a human operator via
the HITL mechanism {{I-D.nennemann-agent-dag-hitl-safety}}.
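The checkpoint claims above could be assembled roughly as in the
following Python sketch. It is illustrative only: the helper name
`make_checkpoint_claims`, the canonical-JSON serialisation of the
state snapshot, and the hex encoding of `out_hash` are assumptions,
and signing the ECT is omitted entirely.

```python
import hashlib
import json
import uuid


def make_checkpoint_claims(state, parent_jti, target, action_type,
                           reversible, rollback_uri, ttl=86400):
    """Build unsigned checkpoint claims for a state snapshot (a dict)."""
    # Serialise deterministically so equal states hash identically.
    snapshot = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return {
        "jti": str(uuid.uuid4()),
        "exec_act": "aerr:checkpoint",
        "par": [parent_jti],
        # Hex encoding is an assumption; ECT may mandate another form.
        "out_hash": hashlib.sha256(snapshot.encode("utf-8")).hexdigest(),
        "ext": {
            "aerr.action_type": action_type,
            "aerr.target": target,
            "aerr.reversible": reversible,  # MUST always be present
            "aerr.rollback_uri": rollback_uri,
            "aerr.ttl": ttl,
        },
    }


claims = make_checkpoint_claims(
    state={"bgp_peers": ["192.0.2.1"], "asn": 64500},
    parent_jti="550e8400-e29b-41d4-a716-446655440000",
    target="router-07.example.com",
    action_type="config_update",
    reversible=True,
    rollback_uri="https://agent-b.example.com/aerr/rollback",
)
print(json.dumps(claims, indent=2))
```

Sorting keys before hashing means two snapshots with the same
content produce the same `out_hash` regardless of key order, which
is what a later rollback integrity check would rely on.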
Agents MAY create hierarchical checkpoints using the ECT DAG: a
parent checkpoint ECT with `par` references to multiple child
checkpoint ECTs. Rolling back the parent rolls back all children.
## Checkpoint Storage
Checkpoint ECTs MUST be stored for at least the duration specified
by `aerr.ttl`. At L3 {{I-D.nennemann-wimse-ect}}, checkpoints
are automatically preserved in the audit ledger. At L1 and L2,
agents MUST store checkpoints in durable local storage that
survives agent restarts.
# Error Signaling {#error-signals}
When an agent detects an error, it MUST produce an error ECT and
propagate it to affected agents in the DAG.
## Error ECT
An error signal is an ECT with:
- `exec_act`: `"aerr:error"`
- `par`: the `jti` of the checkpoint ECT associated with the
failing action
The `ext` claim carries error details:
~~~json
{
"ext": {
"aerr.severity": "critical",
"aerr.error_type": "action_failed",
"aerr.description": "BGP session did not establish",
"aerr.checkpoint_id": "550e8400-e29b-41d4-a716-446655440001",
"aerr.upstream_errors": []
}
}
~~~
{: #fig-error title="Error ECT Extension Claims"}
Severity levels: `info`, `warning`, `error`, `critical`.
Error types: `action_failed`, `timeout`, `constraint_violation`,
`resource_exhausted`, `upstream_cascade`, `circuit_open`, `unknown`.
## Error Propagation via DAG
When an agent receives an error ECT caused by an action it
initiated, it MUST either:
(a) Attempt automatic rollback of its checkpoint ({{rollback}}), or
(b) Escalate to its operator if the action was irreversible.
The `aerr.upstream_errors` array allows agents to chain error
context by referencing `jti` values of predecessor error ECTs,
building a causal trace from symptom to root cause through the
DAG.
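A consumer of error ECTs might follow `aerr.upstream_errors`
references as in the sketch below. This is a non-normative
example: the function name `trace_root_causes` and the in-memory
list of error ECTs are assumptions, and references to error ECTs
that the agent has not seen are silently skipped.

```python
def trace_root_causes(error_ects, symptom_jti):
    """Follow aerr.upstream_errors from a symptom to its root errors.

    A root error is one whose upstream_errors array is empty.
    """
    by_jti = {ect["jti"]: ect for ect in error_ects}
    roots, seen, stack = [], set(), [symptom_jti]
    while stack:
        jti = stack.pop()
        if jti in seen or jti not in by_jti:
            continue  # already visited, or an error ECT we do not hold
        seen.add(jti)
        upstream = by_jti[jti]["ext"].get("aerr.upstream_errors", [])
        if upstream:
            stack.extend(upstream)
        else:
            roots.append(jti)
    return roots


errors = [
    {"jti": "err-b", "ext": {"aerr.upstream_errors": []}},
    {"jti": "err-c", "ext": {"aerr.upstream_errors": ["err-b"]}},
    {"jti": "err-d", "ext": {"aerr.upstream_errors": ["err-c"]}},
]
print(trace_root_causes(errors, "err-d"))
```

In the Problem Statement scenario, Agent D's error would chain
through Agent C's error to the one raised for Agent B's firewall
change.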
## HITL Escalation
When an error requires human intervention, the error ECT SHOULD
trigger a HITL rule per {{I-D.nennemann-agent-dag-hitl-safety}}.
Example policy:
~~~json
{
"hitl": {
"rules": [{
"id": "r-critical-error",
"trigger": {
"kind": "keyword_match",
"op": "eq",
"value": "critical",
"input_ref": "ext.aerr.severity"
},
"required_role": "operator:oncall",
"action": "escalate",
"allow_override": true,
"override_action": "continue"
}]
}
}
~~~
{: #fig-hitl-error title="HITL Policy for Critical Errors"}
# Circuit Breaker Pattern {#circuit-breaker}
Each agent MUST implement a circuit breaker for every downstream
agent it communicates with.
## States
CLOSED (normal):
: Requests flow through. The agent tracks the error rate over a
sliding window (default: 60 seconds).
OPEN (failure detected):
: When the error rate exceeds a threshold (default: 50% over the
window), the breaker opens. All requests to the downstream
agent are immediately rejected with `aerr.error_type`:
`circuit_open`. The agent MUST produce an error ECT and emit
it to upstream peers.
HALF-OPEN (recovery probe):
: After a cooldown period (default: 30 seconds), the breaker
allows a single probe request. If it succeeds, the breaker
returns to CLOSED. If it fails, it returns to OPEN with doubled
cooldown (exponential backoff, max 300 seconds).
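The three states can be sketched as a small state machine. The
Python below is one possible reading, not a reference
implementation: the class and method names are hypothetical,
resetting the cooldown to its default after a successful probe is
an assumption (the text does not say), and emitting the required
error and state-change ECTs is omitted.

```python
import time
from collections import deque


class CircuitBreaker:
    def __init__(self, window_s=60.0, threshold=0.5, cooldown_s=30.0,
                 max_cooldown_s=300.0, clock=time.monotonic):
        self.window_s = window_s
        self.threshold = threshold
        self.base_cooldown_s = cooldown_s
        self.cooldown_s = cooldown_s
        self.max_cooldown_s = max_cooldown_s
        self.clock = clock
        self.state = "closed"
        self.opened_at = None
        self.samples = deque()  # (timestamp, succeeded) pairs

    def allow_request(self):
        """Return True if a request may be sent downstream."""
        if (self.state == "open"
                and self.clock() - self.opened_at >= self.cooldown_s):
            self.state = "half_open"  # cooldown elapsed: one probe only
            return True
        return self.state == "closed"

    def record(self, succeeded):
        """Record the outcome of a request that was allowed through."""
        now = self.clock()
        if self.state == "half_open":
            if succeeded:
                self.state = "closed"
                self.cooldown_s = self.base_cooldown_s  # assumption
                self.samples.clear()
            else:
                # Failed probe: reopen with doubled, capped cooldown.
                self.cooldown_s = min(self.cooldown_s * 2,
                                      self.max_cooldown_s)
                self._open(now)
            return
        self.samples.append((now, succeeded))
        # Drop samples that have slid out of the window.
        while self.samples and now - self.samples[0][0] > self.window_s:
            self.samples.popleft()
        failures = sum(1 for _, ok in self.samples if not ok)
        if (self.state == "closed"
                and failures / len(self.samples) > self.threshold):
            self._open(now)

    def _open(self, now):
        self.state = "open"
        self.opened_at = now
```

With these defaults a single failed request in an otherwise empty
window already exceeds the 50% threshold. Deployments may want a
minimum sample count before opening; this document does not
define one.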
## State Change ECTs
Each circuit breaker state change MUST produce an ECT:
- `exec_act`: `"aerr:circuit_open"`, `"aerr:circuit_half_open"`,
or `"aerr:circuit_closed"`
- `par`: the `jti` of the error ECT that triggered the transition
This records the health topology of the agent network in the ECT
DAG, queryable from the audit ledger at L3.
## Observability
Agents MUST expose circuit breaker state at:
~~~
GET /aerr/circuits
~~~
Response:
~~~json
{
"circuits": [{
"downstream_agent": "spiffe://example.com/agent/router-mgr",
"state": "open",
"error_rate": 0.75,
"last_failure_ect": "550e8400-e29b-41d4-a716-446655440099",
"cooldown_remaining_s": 22
}]
}
~~~
{: #fig-circuits title="Circuit Breaker Status"}
# Rollback Protocol {#rollback}
## Rollback Request
A rollback is initiated by sending an HTTP POST to the target
agent's rollback endpoint:
~~~
POST /aerr/rollback HTTP/1.1
Content-Type: application/json
Execution-Context: <rollback-request-ECT>

{
"rollback_id": "urn:uuid:...",
"checkpoint_id": "550e8400-e29b-41d4-a716-446655440001",
"reason": "Upstream action caused cascading failure",
"cascade": true
}
~~~
{: #fig-rollback-req title="Rollback Request"}
The request MUST include an ECT in the Execution-Context header
with `exec_act`: `"aerr:rollback_request"` and `par` referencing
the error ECT that motivated the rollback.
When `cascade` is `true`, the receiving agent MUST also initiate
rollback of any downstream checkpoints created as a consequence
of the checkpointed action. The ECT DAG's `par` chain identifies
these downstream actions.
## Rollback Response
The agent produces a rollback result ECT with:
- `exec_act`: `"aerr:rollback_complete"` (or `"aerr:rollback_escalated"`)
- `par`: the `jti` of the rollback request ECT
- `out_hash`: SHA-256 hash of the agent's state after rollback
~~~json
{
"ext": {
"aerr.rollback_id": "urn:uuid:...",
"aerr.status": "completed",
"aerr.state_hash_before": "sha256:...",
"aerr.state_hash_after": "sha256:...",
"aerr.cascaded": [
{"agent": "spiffe://example.com/agent/monitor", "status": "completed"},
{"agent": "spiffe://example.com/agent/classify", "status": "escalated"}
]
}
}
~~~
{: #fig-rollback-resp title="Rollback Result ECT"}
Status values: `completed`, `partial`, `escalated`, `failed`.
`escalated` means the action was irreversible and a human operator
has been notified via HITL. `partial` means some but not all
downstream rollbacks succeeded.
## Idempotency
Agents MUST implement idempotent rollback: receiving the same
`rollback_id` twice MUST return the same result without
re-executing the rollback.
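One way to meet the idempotency requirement is to remember the
result for each `rollback_id`, as in this Python sketch. The
class name, the `restore` callback, and the in-memory dictionary
are assumptions for illustration; a real agent would need to keep
the results in durable storage.

```python
class RollbackHandler:
    def __init__(self, restore):
        # `restore` is a hypothetical callback: it takes a
        # checkpoint_id, performs the rollback, and returns a status.
        self._restore = restore
        self._results = {}  # rollback_id -> result already returned

    def handle(self, request):
        rollback_id = request["rollback_id"]
        if rollback_id in self._results:
            # Replay the stored result without re-executing.
            return self._results[rollback_id]
        result = {
            "aerr.rollback_id": rollback_id,
            "aerr.status": self._restore(request["checkpoint_id"]),
        }
        self._results[rollback_id] = result
        return result


calls = []


def fake_restore(checkpoint_id):
    calls.append(checkpoint_id)
    return "completed"


handler = RollbackHandler(fake_restore)
request = {"rollback_id": "rb-1", "checkpoint_id": "ckpt-1"}
first = handler.handle(request)
second = handler.handle(request)
print(first == second, len(calls))
```

The second call returns the stored result and the restore
callback runs only once.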
# Security Considerations
Rollback requests are sensitive operations. Agents MUST
authenticate rollback requests via the ECT signature chain -- only
agents whose ECTs appear in the same workflow DAG (identified by
`wid`) SHOULD be authorized to request rollback.
Checkpoint ECTs contain `out_hash` of agent state but not the
state itself. Agents MUST encrypt stored state snapshots at rest.
Circuit breaker status exposes system health topology. The
`/aerr/circuits` endpoint SHOULD be access-controlled.
Malicious agents could emit false error ECTs to trigger rollbacks.
Agents SHOULD verify that error ECTs reference valid checkpoint
`jti` values from their own workflow DAG before initiating
rollback. At L2 and L3, ECT signatures prevent forgery.
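The check on incoming error ECTs could look like the following
Python sketch. It is an assumption-laden example: the function
name, the `own_checkpoints` mapping, and reading `wid` as a
top-level claim are illustrative choices, and signature
verification is assumed to have happened before this step.

```python
def should_honor_error(error_ect, own_checkpoints):
    """Decide whether an error ECT may trigger a local rollback.

    `own_checkpoints` maps the jti of each checkpoint this agent
    created to the workflow id (`wid`) it belongs to.
    """
    if error_ect.get("exec_act") != "aerr:error":
        return False
    wid = error_ect.get("wid")
    if wid is None:
        return False
    # Accept only if `par` names one of our checkpoints in that workflow.
    return any(own_checkpoints.get(parent) == wid
               for parent in error_ect.get("par", []))


own = {"ckpt-1": "wf-42"}
genuine = {"exec_act": "aerr:error", "wid": "wf-42", "par": ["ckpt-1"]}
forged = {"exec_act": "aerr:error", "wid": "wf-99", "par": ["ckpt-1"]}
print(should_honor_error(genuine, own), should_honor_error(forged, own))
```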
# IANA Considerations
This document requests the following IANA registrations:
1. An "AERR Error Type" registry under Specification Required
policy. Initial entries: `action_failed`, `timeout`,
`constraint_violation`, `resource_exhausted`,
`upstream_cascade`, `circuit_open`, `unknown`.
2. Registration of `exec_act` values `aerr:checkpoint`,
`aerr:error`, `aerr:rollback_request`, `aerr:rollback_complete`,
`aerr:rollback_escalated`, `aerr:circuit_open`,
`aerr:circuit_half_open`, `aerr:circuit_closed` in a future ECT
action type registry.
--- back
# Acknowledgments
{:numbered="false"}
This document builds on the Execution Context Token specification
{{I-D.nennemann-wimse-ect}} for DAG-based audit trails and the
Agent Context Policy Token {{I-D.nennemann-agent-dag-hitl-safety}}
for HITL escalation of irreversible actions.


@@ -0,0 +1,309 @@
Internet-Draft AI/Agent WG
Intended status: Standards Track March 2026
Expires: September 15, 2026
Agent Error Recovery and Rollback (AERR)
draft-aerr-agent-error-recovery-rollback-00
Abstract
This document defines the Agent Error Recovery and Rollback
(AERR) protocol, a lightweight standard for handling errors,
cascading failures, and rollback in multi-agent systems.
Autonomous AI agents increasingly make unsupervised decisions,
yet no standard exists for how agents checkpoint state, signal
errors to peers, contain cascading failures, or roll back
autonomous decisions gone wrong. AERR defines three mechanisms:
state checkpoints that agents create before consequential
actions, a circuit breaker pattern to contain cascading failures
across agent networks, and a rollback protocol for reverting
agent actions to a known-good state. The protocol is transport-
agnostic and builds on JSON and standard HTTP semantics.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
This document is intended to have Standards Track status.
Distribution of this memo is unlimited.
Table of Contents
1. Introduction
2. Terminology
3. Problem Statement
4. Checkpoint Mechanism
5. Error Signaling
6. Circuit Breaker Pattern
7. Rollback Protocol
8. Security Considerations
9. IANA Considerations
1. Introduction
The IETF AI/agent landscape includes 60 drafts on autonomous
network operations but none that standardize error recovery.
When an autonomous agent misconfigures a router, allocates
resources incorrectly, or triggers an unintended cascade of
actions across a multi-agent system, there is currently no
standard mechanism for detecting the failure, containing its
blast radius, or reverting to a safe state.
AERR borrows proven patterns from distributed systems:
checkpoints from database transactions, circuit breakers from
microservice architectures, and rollback from version control.
It adapts these patterns to the specific needs of AI agents,
where actions may be partially reversible and where the agent
that caused the error may not be the best one to fix it.
Design principles:
1. Agents that take consequential actions MUST be able to undo
them, or MUST declare them irreversible upfront.
2. Failure containment takes priority over failure diagnosis.
3. The protocol adds minimal overhead to the happy path.
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14 [RFC2119] [RFC8174] when, and only when, they
appear in all capitals, as shown here.
Checkpoint: A snapshot of an agent's state and the external
effects of its actions at a point in time, sufficient to
restore the system to that state.
Circuit Breaker: A mechanism that stops an agent from
propagating requests to a failing downstream agent, preventing
cascading failures.
Rollback: The process of reverting an agent's actions and state
to a previously recorded checkpoint.
Blast Radius: The set of agents and systems affected by a
single agent's failure.
3. Problem Statement
Consider a network operations scenario: Agent A instructs
Agent B to update firewall rules, which causes Agent C's
traffic monitoring to fail, which causes Agent D to
misclassify traffic patterns. Today each agent handles errors
independently with no coordination. There is no standard way
for Agent D to signal that the root cause is upstream, for the
cascade to be halted, or for the chain of actions to be rolled
back.
The only existing draft that partially addresses this space
(draft-yue-anima-agent-recovery-networks) focuses on mobile
network fault recovery and does not provide general-purpose
error recovery primitives usable across agent types.
4. Checkpoint Mechanism
An AERR-compliant agent MUST create a checkpoint before any
action it classifies as "consequential." An action is
consequential if it modifies external state (e.g., network
config, database records, API calls with side effects).
A checkpoint is a JSON object:
{
"checkpoint_id": "urn:uuid:...",
"agent_id": "urn:uuid:...",
"timestamp": "2026-03-01T12:00:00Z",
"action": {
"type": "config_update",
"target": "router-07.example.com",
"description": "Update BGP peer config"
},
"reversible": true,
"rollback_procedure": {
"method": "POST",
"uri": "https://agent-b.example.com/aerr/rollback",
"payload_ref": "urn:uuid:...prior-config-snapshot"
},
"state_hash": "sha256:abcdef...",
"ttl": 86400
}
The "reversible" field MUST be present. If false, the agent
declares that this action cannot be automatically undone and
rollback requests for this checkpoint MUST be escalated to a
human operator.
The "state_hash" provides integrity verification: the agent
hashes its relevant state at checkpoint time so that rollback
can verify it is restoring to an authentic prior state.
Checkpoints MUST be stored for at least the duration specified
by "ttl" (seconds). Agents SHOULD store checkpoints in durable
storage that survives agent restarts.
Agents MAY create hierarchical checkpoints where a parent
checkpoint groups multiple child checkpoints from a multi-step
operation. Rolling back the parent rolls back all children.
5. Error Signaling
When an agent detects an error, it MUST emit an AERR error
signal to all agents in the current action chain. The error
signal is an HTTP POST to each peer's AERR endpoint:
POST /aerr/error HTTP/1.1
Content-Type: application/json

{
"error_id": "urn:uuid:...",
"source_agent": "urn:uuid:...",
"severity": "critical",
"checkpoint_id": "urn:uuid:...",
"error_type": "action_failed",
"description": "BGP session did not establish after config update",
"timestamp": "2026-03-01T12:05:00Z",
"upstream_errors": []
}
Severity levels: "info", "warning", "error", "critical".
Error types: "action_failed", "timeout", "constraint_violation",
"resource_exhausted", "upstream_cascade", "circuit_open",
"unknown".
When an agent receives an error signal caused by an action it
initiated, it MUST either:
(a) Attempt automatic rollback of its checkpoint, or
(b) Escalate to its operator if the action was irreversible.
The "upstream_errors" array allows agents to chain error
context, building a causal trace from the symptom back to the
root cause.
6. Circuit Breaker Pattern
Each agent MUST implement a circuit breaker for every downstream
agent it communicates with. The circuit breaker has three
states:
CLOSED (normal operation): Requests flow through. The agent
tracks the error rate over a sliding window (default: 60s).
OPEN (failure detected): When the error rate exceeds a
threshold (default: 50% over the window), the circuit breaker
opens. All requests to the downstream agent are immediately
rejected with error_type "circuit_open". The agent MUST emit
an error signal to upstream peers.
HALF-OPEN (recovery probe): After a cooldown period (default:
30s), the circuit breaker allows a single probe request. If it
succeeds, the breaker returns to CLOSED. If it fails, it
returns to OPEN with a doubled cooldown (exponential backoff,
max 300s).
Agents MUST expose circuit breaker state at:
GET /aerr/circuits
Response:
{
"circuits": [
{
"downstream_agent": "urn:uuid:...",
"state": "open",
"error_rate": 0.75,
"last_failure": "2026-03-01T12:05:00Z",
"cooldown_remaining_s": 22
}
]
}
This enables monitoring systems and upstream agents to
understand the health topology of the agent network.
7. Rollback Protocol
A rollback is initiated by sending an HTTP POST to the target
agent's rollback endpoint:
POST /aerr/rollback HTTP/1.1
Content-Type: application/json

{
"rollback_id": "urn:uuid:...",
"checkpoint_id": "urn:uuid:...",
"reason": "Upstream action caused cascading failure",
"initiator": "urn:uuid:...",
"cascade": true
}
When "cascade" is true, the receiving agent MUST also initiate
rollback of any downstream checkpoints that were created as a
consequence of the checkpointed action. This enables a single
rollback request to unwind an entire chain of agent actions.
The agent MUST respond with a rollback result:
{
"rollback_id": "urn:uuid:...",
"status": "completed",
"checkpoint_id": "urn:uuid:...",
"state_hash_before": "sha256:...",
"state_hash_after": "sha256:...",
"cascaded_rollbacks": [
{"agent_id": "urn:uuid:...", "status": "completed"},
{"agent_id": "urn:uuid:...", "status": "escalated"}
]
}
Rollback status values: "completed", "partial", "escalated",
"failed".
"escalated" means the action was irreversible and a human
operator has been notified. "partial" means some but not all
downstream rollbacks succeeded.
Agents MUST implement idempotent rollback: receiving the same
rollback_id twice MUST return the same result without re-
executing the rollback.
8. Security Considerations
Rollback requests are sensitive operations. Agents MUST
authenticate rollback requests using mutual TLS or signed JWTs.
Only agents in the same action chain (identified by checkpoint
lineage) SHOULD be authorized to request rollback.
Checkpoint data may contain sensitive system state. Agents
MUST encrypt stored checkpoints at rest and MUST NOT include
checkpoint contents in error signals.
Circuit breaker state is observable information about system
health. The /aerr/circuits endpoint SHOULD be access-
controlled to prevent adversaries from mapping system topology.
Malicious agents could send false error signals to trigger
unnecessary rollbacks. Agents SHOULD verify that error signals
reference valid checkpoint IDs from their own action chains
before initiating rollback.
9. IANA Considerations
This document requests IANA establish the following:
1. An "AERR Error Type" registry under Specification Required
policy. Initial entries: "action_failed", "timeout",
"constraint_violation", "resource_exhausted",
"upstream_cascade", "circuit_open", "unknown".
2. An "AERR Severity Level" registry under Specification
Required policy. Initial entries: "info", "warning",
"error", "critical".
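3. A well-known URI registration for the suffix "aerr" per
RFC 8615. Registered names are a single path segment, so the
"error", "rollback", and "circuits" resources would be served
beneath "/.well-known/aerr/".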
Author's Address
Generated by IETF Draft Analyzer
2026-03-01


@@ -0,0 +1,386 @@
---
title: "Agent Task DAG (ATD): Execution Model, Checkpoints, and Recovery"
abbrev: "ATD"
category: std
docname: draft-atd-agent-task-dag-00
submissiontype: IETF
number:
date:
v: 3
area: "OPS"
workgroup: "NMOP"
keyword:
- agent DAG
- checkpoint
- rollback
- error recovery
- circuit breaker
author:
-
fullname: TBD
organization: Independent
email: placeholder@example.com
normative:
RFC2119:
RFC8174:
RFC8446:
I-D.nennemann-wimse-ect:
title: "Execution Context Tokens for Distributed Agentic Workflows"
target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
I-D.nennemann-agent-dag-hitl-safety:
title: "Agent Context Policy Token: DAG Delegation with Human Override"
target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
informative:
--- abstract
This document defines the Agent Task DAG (ATD) specification:
execution semantics, checkpoints, error signaling, circuit
breakers, and rollback for agent workflows. ATD does not define a
new DAG or token format. It defines when agents MUST emit ECT
nodes, what those nodes mean, and how to recover when things go
wrong. Checkpoints, errors, and rollback results are ECT nodes
with specific `exec_act` values and `ext` claims. Rollback walks
the ECT DAG backwards. Circuit breakers contain cascading
failures. Resource hints enable scheduling. The protocol is
transport-agnostic and builds on ECT for evidence and ACP-DAG-HITL
for policy.
--- middle
# Introduction
Autonomous agents increasingly make unsupervised decisions, yet no
standard exists for how agents checkpoint state, signal errors to
peers, contain cascading failures, or roll back decisions gone
wrong.
ATD borrows proven patterns from distributed systems: checkpoints
from database transactions, circuit breakers from microservice
architectures, and rollback from version control. It adapts these
to agent workflows where actions may be partially reversible and
where the agent that caused the error may not be the best one to
fix it.
ATD does not define a new DAG format. The ECT DAG
{{I-D.nennemann-wimse-ect}} IS the execution graph. ATD defines
the semantics of specific node types within that graph.
Design principles:
1. Agents that take consequential actions MUST be able to undo
them, or MUST declare them irreversible upfront.
2. Failure containment takes priority over failure diagnosis.
3. The protocol adds minimal overhead to the happy path.
# Conventions and Definitions
{::boilerplate bcp14-tagged}
Checkpoint:
: An ECT node recording agent state before a consequential action,
sufficient to restore the system to that state.
Circuit Breaker:
: A mechanism that stops an agent from propagating requests to a
failing downstream agent, preventing cascading failures.
Rollback:
: The process of reverting an agent's actions and state to a
previously recorded checkpoint.
Blast Radius:
: The set of agents and systems affected by a single failure.
# Node States {#node-states}
Each task node in the ECT DAG has an implicit state derived from
subsequent ECT nodes:
- **pending**: A delegation node exists in ACP-DAG-HITL but no
corresponding ECT has been emitted.
- **running**: An ECT with `exec_act` matching the task type has
been emitted but no completion or error ECT follows.
- **done**: A completion ECT (or the next `par`-linked ECT) exists.
- **failed**: An `atd:error` ECT references this node.
- **rolled_back**: An `atd:rollback_result` ECT references this
node's checkpoint.
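The derivation of implicit node state might be sketched as
follows. This Python example is one interpretation, not the
normative rule: the precedence order (rolled back, then failed,
then done, then running), the use of `atd.checkpoint_id` to link
a rollback result to a checkpoint, and the choice not to count a
checkpoint ECT as evidence of completion are all assumptions.

```python
def node_state(task_jti, ects):
    """Derive the implicit state of a task node from later ECTs."""
    if not any(e["jti"] == task_jti for e in ects):
        return "pending"  # delegated, but no ECT emitted yet

    # ECTs that reference the task directly through `par`.
    children = [e for e in ects if task_jti in e.get("par", [])]
    checkpoints = {e["jti"] for e in children
                   if e.get("exec_act") == "atd:checkpoint"}

    for e in ects:
        if (e.get("exec_act") == "atd:rollback_result"
                and e.get("ext", {}).get("atd.checkpoint_id") in checkpoints):
            return "rolled_back"
    if any(e.get("exec_act") == "atd:error" for e in children):
        return "failed"
    if any(e.get("exec_act") != "atd:checkpoint" for e in children):
        return "done"
    return "running"


ects = [
    {"jti": "t1", "exec_act": "update_firewall", "par": []},
    {"jti": "c1", "exec_act": "atd:checkpoint", "par": ["t1"]},
    {"jti": "e1", "exec_act": "atd:error", "par": ["t1"],
     "ext": {"atd.checkpoint_id": "c1"}},
    {"jti": "r1", "exec_act": "atd:rollback_request", "par": ["c1"]},
    {"jti": "rr1", "exec_act": "atd:rollback_result", "par": ["r1"],
     "ext": {"atd.checkpoint_id": "c1"}},
]
print(node_state("t1", ects[:2]), node_state("t1", ects[:3]),
      node_state("t1", ects))
```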
# Checkpoint Mechanism {#checkpoints}
An ATD-compliant agent MUST create a checkpoint before any action
it classifies as consequential. An action is consequential if it
modifies external state (network config, database records, API
calls with side effects).
A checkpoint is an ECT with:
- `exec_act`: `"atd:checkpoint"`
- `par`: the ECT of the action being checkpointed
~~~json
{
"jti": "ckpt-uuid",
"exec_act": "atd:checkpoint",
"par": ["action-ect-uuid"],
"out_hash": "sha256-of-agent-state-snapshot",
"ext": {
"atd.reversible": true,
"atd.rollback_uri": "https://agent-b.example.com/atd/rollback",
"atd.target": "router-07.example.com",
"atd.description": "Update BGP peer config",
"atd.ttl": 86400
}
}
~~~
{: #fig-checkpoint title="Checkpoint ECT"}
The `atd.reversible` field MUST be present. If `false`, the agent
declares that this action cannot be automatically undone and
rollback requests MUST be escalated per the ACP-DAG-HITL
`unreachable_human` policy.
The `out_hash` provides integrity verification: the agent hashes
its state at checkpoint time so that rollback can verify it is
restoring to an authentic prior state.
Checkpoints MUST be stored for at least `atd.ttl` seconds. Agents
SHOULD store checkpoints in durable storage that survives restarts.
## Hierarchical Checkpoints
Agents MAY create hierarchical checkpoints where a parent groups
multiple child checkpoints from a multi-step operation. Rolling
back the parent rolls back all children. The parent checkpoint's
`par` array references all child checkpoint `jti` values.
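Expanding a parent checkpoint into the set of checkpoints to roll
back could be done as in this Python sketch. The function name is
hypothetical, and rolling back children before their parent is an
assumption; this document does not prescribe an order.

```python
def checkpoints_to_roll_back(checkpoint_jti, ects):
    """List a checkpoint and its child checkpoints, children first."""
    by_jti = {e["jti"]: e for e in ects}
    order, seen = [], set()

    def visit(jti):
        ect = by_jti.get(jti)
        # Follow `par` only through checkpoint ECTs; ordinary
        # action ECTs end the walk.
        if (ect is None or jti in seen
                or ect.get("exec_act") != "atd:checkpoint"):
            return
        seen.add(jti)
        for child in ect.get("par", []):
            visit(child)
        order.append(jti)

    visit(checkpoint_jti)
    return order


ects = [
    {"jti": "a1", "exec_act": "update_acl", "par": []},
    {"jti": "a2", "exec_act": "update_bgp", "par": ["a1"]},
    {"jti": "c1", "exec_act": "atd:checkpoint", "par": ["a1"]},
    {"jti": "c2", "exec_act": "atd:checkpoint", "par": ["a2"]},
    {"jti": "p", "exec_act": "atd:checkpoint", "par": ["c1", "c2"]},
]
print(checkpoints_to_roll_back("p", ects))
```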
# Error Signaling {#errors}
When an agent detects an error, it MUST emit an error ECT:
- `exec_act`: `"atd:error"`
- `par`: the ECT of the failed action
~~~json
{
"jti": "error-uuid",
"exec_act": "atd:error",
"par": ["failed-action-ect-uuid"],
"ext": {
"atd.severity": "critical",
"atd.error_type": "action_failed",
"atd.description": "BGP session did not establish",
"atd.checkpoint_id": "ckpt-uuid",
"atd.upstream_errors": []
}
}
~~~
{: #fig-error title="Error ECT"}
Severity levels: `info`, `warning`, `error`, `critical`.
Error types: `action_failed`, `timeout`, `constraint_violation`,
`resource_exhausted`, `upstream_cascade`, `unknown`.
When an agent receives an error signal caused by an action it
initiated, it MUST either:
(a) Attempt automatic rollback of its checkpoint, or
(b) Escalate per ACP-DAG-HITL HITL rules if the action was
irreversible.
The `atd.upstream_errors` array allows agents to chain error
context, building a causal trace from symptom to root cause.
## HITL Escalation on Error
Error ECTs MAY trigger ACP-DAG-HITL rules. A deployment can
define HITL rules such as:
~~~json
{
"id": "r-critical-error",
"trigger": {
"kind": "keyword_match",
"op": "eq",
"value": "critical",
"input_ref": "atd.severity"
},
"required_role": "operator:oncall",
"action": "escalate",
"allow_override": true,
"override_action": "continue"
}
~~~
{: #fig-error-hitl title="HITL Rule for Critical Errors"}
# Circuit Breaker Pattern {#circuit-breaker}
Each agent MUST implement a circuit breaker for every downstream
agent it communicates with. The circuit breaker has three states:
CLOSED (normal):
: Requests flow through. The agent tracks the error rate over a
sliding window (default: 60 seconds).
OPEN (failure detected):
: When the error rate exceeds a threshold (default: 50%), the
breaker opens. All requests are immediately rejected. The
agent MUST emit a circuit breaker ECT:
~~~json
{
"exec_act": "atd:circuit_open",
"ext": {
"atd.downstream_agent": "spiffe://example.com/agent/b",
"atd.error_rate": 0.75,
"atd.window_s": 60
}
}
~~~
{: #fig-circuit title="Circuit Breaker ECT"}
HALF-OPEN (recovery probe):
: After a cooldown period (default: 30s), the breaker allows one
probe request. If it succeeds, the breaker returns to CLOSED.
If it fails, it returns to OPEN with doubled cooldown
(exponential backoff, max 300s).
Circuit breaker thresholds can be configured as ACP-DAG-HITL
node constraints:
~~~json
{
"constraints": {
"atd.circuit_threshold": 0.5,
"atd.circuit_window_s": 60
}
}
~~~
{: #fig-circuit-policy title="Circuit Breaker Policy"}
# Rollback Protocol {#rollback}
A rollback is initiated by emitting a rollback request ECT and
sending an HTTP POST to the target agent's rollback endpoint:
~~~
POST /atd/rollback HTTP/1.1
Content-Type: application/json
Execution-Context: <rollback-request-ect>
~~~
- `exec_act`: `"atd:rollback_request"`
- `par`: the checkpoint ECT to roll back to
~~~json
{
"exec_act": "atd:rollback_request",
"par": ["ckpt-uuid"],
"ext": {
"atd.reason": "Upstream action caused cascading failure",
"atd.cascade": true
}
}
~~~
{: #fig-rollback-req title="Rollback Request ECT"}
When `atd.cascade` is `true`, the receiving agent MUST also
initiate rollback of any downstream checkpoints created as a
consequence of the checkpointed action.
The agent MUST respond with a rollback result ECT:
- `exec_act`: `"atd:rollback_result"`
- `par`: the rollback request ECT
~~~json
{
"exec_act": "atd:rollback_result",
"par": ["rollback-request-uuid"],
"out_hash": "sha256-of-restored-state",
"ext": {
"atd.status": "completed",
"atd.checkpoint_id": "ckpt-uuid",
"atd.cascaded": [
{"agent": "spiffe://example.com/agent/c", "status": "completed"},
{"agent": "spiffe://example.com/agent/d", "status": "escalated"}
]
}
}
~~~
{: #fig-rollback-result title="Rollback Result ECT"}
Status values: `completed`, `partial`, `escalated`, `failed`.
`escalated` means the action was irreversible and a human operator
has been notified per ACP-DAG-HITL `unreachable_human` policy.
Agents MUST implement idempotent rollback: receiving the same
rollback request ECT `jti` twice MUST return the same result.
# Resource Hints {#resources}
Agents MAY declare resource requirements as ECT extension claims
or ACP-DAG-HITL node constraints:
~~~json
{
"constraints": {
"atd.resource_cpu": "2",
"atd.resource_memory_mb": 4096,
"atd.resource_timeout_s": 300,
"atd.resource_priority": "high"
}
}
~~~
{: #fig-resources title="Resource Hints as Node Constraints"}
Orchestrators (e.g., Kubernetes schedulers, agent gateways) MAY
use these hints for scheduling and quota enforcement. Resource
hints are advisory; agents MUST NOT depend on them for
correctness.
# Security Considerations
Rollback requests are sensitive operations. Agents MUST
authenticate rollback requests using the ECT identity binding
(L2/L3). Only agents in the same workflow (`wid`) with
checkpoint lineage in the DAG SHOULD be authorized to request
rollback.
Checkpoint data may contain sensitive system state. Agents MUST
encrypt stored checkpoints at rest and MUST NOT include checkpoint
contents in error ECTs.
Circuit breaker state reveals system health topology. The
`atd:circuit_open` ECT is part of the audit trail; access to the
audit ledger SHOULD be controlled.
Malicious agents could send false error ECTs to trigger
unnecessary rollbacks. Agents SHOULD verify that error ECTs
reference valid `par` values within their own workflow DAG.
# IANA Considerations
This document requests registration of the following `exec_act`
values in a future ECT action type registry:
- `atd:checkpoint`
- `atd:error`
- `atd:circuit_open`
- `atd:rollback_request`
- `atd:rollback_result`
--- back
# Acknowledgments
{:numbered="false"}
ATD builds on ECT {{I-D.nennemann-wimse-ect}} for execution
evidence and ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}}
for delegation policy. The circuit breaker pattern is adapted
from microservice architecture best practices.


@@ -0,0 +1,725 @@
---
title: "Agent Task DAG (ATD): Execution Model, Checkpoints, and Recovery"
abbrev: "ATD"
category: std
docname: draft-atd-agent-task-dag-01
submissiontype: IETF
number:
date:
v: 3
area: "OPS"
workgroup: "NMOP"
keyword:
- agent DAG
- checkpoint
- rollback
- error recovery
- circuit breaker
author:
-
fullname: TBD
organization: Independent
email: placeholder@example.com
normative:
RFC2119:
RFC8174:
RFC8446:
RFC9110:
RFC8615:
I-D.nennemann-wimse-ect:
title: "Execution Context Tokens for Distributed Agentic Workflows"
target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
I-D.nennemann-agent-dag-hitl-safety:
title: "Agent Context Policy Token: DAG Delegation with Human Override"
target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
informative:
--- abstract
This document defines the Agent Task DAG (ATD) specification:
execution semantics, checkpoints, error signaling, circuit
breakers, and rollback for agent workflows. ATD does not define a
new DAG or token format. It defines when agents MUST emit ECT
nodes, what those nodes mean, and how to recover when things go
wrong. Checkpoints, errors, and rollback results are ECT nodes
with specific `exec_act` values and `ext` claims. Rollback walks
the ECT DAG backwards. Circuit breakers contain cascading
failures. Resource hints enable scheduling. The protocol is
transport-agnostic and builds on ECT for evidence and ACP-DAG-HITL
for policy.
--- middle
# Introduction
Autonomous agents increasingly make unsupervised decisions, yet no
standard exists for how agents checkpoint state, signal errors to
peers, contain cascading failures, or roll back decisions gone
wrong.
ATD borrows proven patterns from distributed systems: checkpoints
from database transactions, circuit breakers from microservice
architectures, and rollback from version control. It adapts these
to agent workflows where actions may be partially reversible and
where the agent that caused the error may not be the best one to
fix it.
ATD does not define a new DAG format. The ECT DAG
{{I-D.nennemann-wimse-ect}} IS the execution graph. ATD defines
the semantics of specific node types within that graph.
Design principles:
1. Agents that take consequential actions MUST be able to undo
them, or MUST declare them irreversible upfront.
2. Failure containment takes priority over failure diagnosis.
3. The protocol adds minimal overhead to the happy path.
# Conventions and Definitions
{::boilerplate bcp14-tagged}
Checkpoint:
: An ECT node recording agent state before a consequential action,
sufficient to restore the system to that state.
Circuit Breaker:
: A mechanism that stops an agent from propagating requests to a
failing downstream agent, preventing cascading failures.
Rollback:
: The process of reverting an agent's actions and state to a
previously recorded checkpoint.
Blast Radius:
: The set of agents and systems affected by a single failure.
Consequential Action:
: An action that modifies external state (network configuration,
database records, API calls with side effects) such that
reversal requires explicit effort.
# Execution Semantics {#execution}
## Topological Order
Tasks in the ECT DAG MUST execute in topological order: a task
MUST NOT begin execution until all tasks referenced by its ECT
`par` claims are in state `done`.
Two tasks MAY execute concurrently when neither is an ancestor of
the other in the DAG (no `par` path connects them). Orchestrators
SHOULD exploit this parallelism for performance.
Circular dependencies are prohibited. Agents MUST reject
ACP-DAG-HITL delegation DAGs containing cycles.
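The ordering and cycle rules above can be sketched with Python's standard `graphlib` module. This is an illustration only, not part of the protocol: the `par_map` structure (task ID mapped to the IDs in its `par` claim) and the function name `execution_batches` are assumptions made for this example.

```python
from graphlib import CycleError, TopologicalSorter


def execution_batches(par_map):
    """Return lists of task IDs that may run concurrently, in order.

    par_map maps each task ID to the IDs listed in its `par` claim
    (hypothetical in-memory form; real ECTs carry more fields).
    Raises ValueError if the graph contains a cycle.
    """
    sorter = TopologicalSorter(par_map)
    try:
        sorter.prepare()  # raises CycleError on circular dependencies
    except CycleError as exc:
        raise ValueError(f"delegation DAG contains a cycle: {exc.args[1]}") from exc

    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every parent of these is done
        batches.append(ready)
        sorter.done(*ready)  # mark the whole batch as `done`
    return batches


# n2 and n3 depend only on n1, so they become ready together.
demo = {"n1": [], "n2": ["n1"], "n3": ["n1"], "n4": ["n2", "n3"]}
print(execution_batches(demo))
```

Each inner list is a set of tasks an orchestrator could dispatch in parallel; a real orchestrator would mark tasks `done` as their completion ECTs arrive rather than batch by batch.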
## Workflow Boundary ECTs
When a workflow begins, the initiating agent MUST emit:
~~~json
{
"exec_act": "atd:workflow_start",
"ext": {
"atd.wf_id": "wf-uuid",
"atd.description": "BGP failover workflow",
"atd.node_count": 5
}
}
~~~
{: #fig-wf-start title="Workflow Start ECT"}
When the workflow reaches a terminal state (all leaf nodes
complete or any node failed with no rollback path), the
orchestrator MUST emit:
~~~json
{
"exec_act": "atd:workflow_complete",
"par": ["wf-start-ect-uuid"],
"ext": {
"atd.wf_id": "wf-uuid",
"atd.terminal_status": "success",
"atd.elapsed_s": 42
}
}
~~~
{: #fig-wf-complete title="Workflow Complete ECT"}
Terminal status values: `success`, `partial`, `failed`,
`rolled_back`, `escalated`.
# Node States {#node-states}
Each task node in the ECT DAG has an implicit state derived from
subsequent ECT nodes:
- **pending**: A delegation node exists in ACP-DAG-HITL but no
corresponding ECT has been emitted.
- **running**: An ECT matching the task type has been emitted
but no completion or error ECT follows.
- **done**: A completion ECT (or the next `par`-linked ECT) exists.
- **failed**: An `atd:error` ECT references this node.
- **rolled_back**: An `atd:rollback_result` ECT references this
node's checkpoint.
- **escalated**: The task failed and a human has been notified
per HITL escalation rules.
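As a rough sketch of how these implicit states might be derived, the function below scans a list of ECT claim sets held in memory. Several simplifications are assumptions of this example rather than rules from this document: any child ECT whose `exec_act` does not start with `atd:` counts as the "next `par`-linked ECT", a node with no ECT at all is reported as `pending`, and `escalated` is not derived because it depends on HITL notification state.

```python
def node_state(node_jti, ects):
    """Derive the implicit state of one task node from a list of ECT dicts."""
    if not any(e.get("jti") == node_jti for e in ects):
        return "pending"  # delegated, but no ECT emitted yet (assumed)

    children = [e for e in ects if node_jti in e.get("par", [])]
    checkpoints = {e["jti"] for e in children
                   if e.get("exec_act") == "atd:checkpoint"}

    # A rollback result naming one of this node's checkpoints wins.
    for e in ects:
        if (e.get("exec_act") == "atd:rollback_result"
                and e.get("ext", {}).get("atd.checkpoint_id") in checkpoints):
            return "rolled_back"
    if any(e.get("exec_act") == "atd:error" for e in children):
        return "failed"
    # Any non-ATD child is treated as the next `par`-linked ECT.
    if any(not e.get("exec_act", "").startswith("atd:") for e in children):
        return "done"
    return "running"
```

The precedence used here (`rolled_back`, then `failed`, then `done`) is one plausible reading; the text does not fix an order when several conditions hold at once.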
# Checkpoint Mechanism {#checkpoints}
## Checkpoint Placement Policy
An ATD-compliant agent MUST create a checkpoint before any action
it classifies as consequential. The following actions are always
consequential and MUST be checkpointed:
1. Any modification to network device configuration.
2. Any write to a shared database or external data store.
3. Any API call with side effects (non-idempotent HTTP methods).
4. Any delegation to another agent that will itself take
consequential actions.
The following SHOULD be checkpointed:
1. Long-running computations (> `atd.resource_timeout_s`).
2. Actions that cannot be verified without external state.
The following are exempt from checkpoint requirements:
1. Read-only queries.
2. Sending notifications with no side effects.
3. Internal state computations with no external observable effect.
## Checkpoint ECT Format
A checkpoint is an ECT with:
- `exec_act`: `"atd:checkpoint"`
- `par`: the ECT of the action being checkpointed
~~~json
{
"jti": "ckpt-uuid",
"exec_act": "atd:checkpoint",
"par": ["action-ect-uuid"],
"out_hash": "sha256-of-agent-state-snapshot",
"ext": {
"atd.reversible": true,
"atd.rollback_uri": "https://agent-b.example.com/.well-known/atd/rollback",
"atd.target": "router-07.example.com",
"atd.description": "Update BGP peer config",
"atd.ttl": 86400
}
}
~~~
{: #fig-checkpoint title="Checkpoint ECT"}
The `atd.reversible` field MUST be present. If `false`, the agent
declares that this action cannot be automatically undone and
rollback requests MUST be escalated per the ACP-DAG-HITL
`unreachable_human` policy.
The `out_hash` provides integrity verification: the agent hashes
its state at checkpoint time so that rollback can verify it is
restoring to an authentic prior state.
Checkpoints MUST be stored for at least `atd.ttl` seconds. Agents
SHOULD store checkpoints in durable storage that survives restarts.
The rollback URI MUST be a well-known URI per {{RFC8615}} at the
path `/.well-known/atd/rollback`.
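A minimal sketch of building a checkpoint's claims and checking `out_hash` after a restore follows. It assumes the agent state is JSON-serializable and that `out_hash` is a hex-encoded SHA-256 digest of a canonical JSON form; neither the canonicalization nor the encoding is specified by this document, and the helper names are hypothetical. Signing and transport are out of scope.

```python
import hashlib
import json
import uuid


def _state_digest(state):
    # Sorted keys and fixed separators so equal states hash identically
    # (assumed canonical form, not mandated by the text).
    snapshot = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(snapshot.encode("utf-8")).hexdigest()


def make_checkpoint_ect(action_jti, state, rollback_host, ttl_s, reversible=True):
    """Build the claims of an `atd:checkpoint` ECT as a plain dict."""
    return {
        "jti": str(uuid.uuid4()),
        "exec_act": "atd:checkpoint",
        "par": [action_jti],
        "out_hash": _state_digest(state),
        "ext": {
            "atd.reversible": reversible,
            "atd.rollback_uri": f"https://{rollback_host}/.well-known/atd/rollback",
            "atd.ttl": ttl_s,
        },
    }


def verify_restored_state(checkpoint_ect, restored_state):
    """Return True if the restored state matches the checkpoint's out_hash."""
    return _state_digest(restored_state) == checkpoint_ect["out_hash"]
```

Only the digest appears in the ECT; the snapshot itself stays in the agent's own checkpoint store.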
## Hierarchical Checkpoints
Agents MAY create hierarchical checkpoints where a parent groups
multiple child checkpoints from a multi-step operation. Rolling
back the parent rolls back all children. The parent checkpoint's
`par` array references all child checkpoint `jti` values.
## ATD `exec_act` Table
| `exec_act` value | When emitted | Required `ext` fields |
|-----------------|-------------|----------------------|
| `atd:checkpoint` | Before consequential action | `atd.reversible`, `atd.rollback_uri`, `atd.ttl` |
| `atd:error` | On failure detection | `atd.severity`, `atd.error_type`, `atd.checkpoint_id` |
| `atd:circuit_open` | When error rate exceeds threshold | `atd.downstream_agent`, `atd.error_rate`, `atd.window_s` |
| `atd:circuit_close` | When probe succeeds in HALF-OPEN | `atd.downstream_agent`, `atd.cooldown_s` |
| `atd:rollback_request` | To initiate rollback | `atd.reason`, `atd.cascade` |
| `atd:rollback_result` | Rollback complete or failed | `atd.status`, `atd.checkpoint_id`, `atd.cascaded` |
| `atd:workflow_start` | Workflow begins | `atd.wf_id`, `atd.description` |
| `atd:workflow_complete` | Workflow terminal | `atd.wf_id`, `atd.terminal_status` |
{: #fig-actions title="ATD exec_act Values"}
# Error Signaling {#errors}
When an agent detects an error, it MUST emit an error ECT:
- `exec_act`: `"atd:error"`
- `par`: the ECT of the failed action
~~~json
{
"jti": "error-uuid",
"exec_act": "atd:error",
"par": ["failed-action-ect-uuid"],
"ext": {
"atd.severity": "critical",
"atd.error_type": "action_failed",
"atd.description": "BGP session did not establish",
"atd.checkpoint_id": "ckpt-uuid",
"atd.upstream_errors": []
}
}
~~~
{: #fig-error title="Error ECT"}
Severity levels (in increasing order): `info`, `warning`,
`error`, `critical`.
Error types: `action_failed`, `timeout`, `constraint_violation`,
`resource_exhausted`, `upstream_cascade`, `unknown`.
When an agent receives an error signal caused by an action it
initiated, it MUST either:
(a) Attempt automatic rollback of its checkpoint, or
(b) Escalate per ACP-DAG-HITL rules if the action was
irreversible.
The `atd.upstream_errors` array allows agents to chain error
context, building a causal trace from symptom to root cause.
## HITL Escalation on Error
Error ECTs with severity `critical` SHOULD trigger HITL
escalation. Deployments SHOULD define ACP-DAG-HITL rules such
as:
~~~json
{
"id": "r-critical-error",
"trigger": {
"kind": "keyword_match",
"op": "eq",
"value": "critical",
"input_ref": "atd.severity"
},
"required_role": "operator:oncall",
"action": "escalate",
"allow_override": true,
"override_action": "continue"
}
~~~
{: #fig-error-hitl title="HITL Rule for Critical Errors"}
# Circuit Breaker Pattern {#circuit-breaker}
Each agent MUST implement a circuit breaker for every downstream
agent it communicates with. The circuit breaker has three states:
CLOSED (normal):
: Requests flow through. The agent tracks the error rate over a
sliding window (default: 60 seconds).
OPEN (failure detected):
: When the error rate exceeds a threshold (default: 50%), the
breaker opens. All requests are immediately rejected. The
agent MUST emit a circuit breaker open ECT:
~~~json
{
"exec_act": "atd:circuit_open",
"ext": {
"atd.downstream_agent": "spiffe://example.com/agent/b",
"atd.error_rate": 0.75,
"atd.window_s": 60
}
}
~~~
{: #fig-circuit-open title="Circuit Breaker Open ECT"}
HALF-OPEN (recovery probe):
: After a cooldown period (default: 30s), the breaker allows one
probe request. If it succeeds, the breaker returns to CLOSED
and the agent MUST emit:
~~~json
{
"exec_act": "atd:circuit_close",
"ext": {
"atd.downstream_agent": "spiffe://example.com/agent/b",
"atd.cooldown_s": 30
}
}
~~~
{: #fig-circuit-close title="Circuit Breaker Close ECT"}
If the probe fails, the breaker returns to OPEN with doubled
cooldown (exponential backoff, max 300s).
## Circuit Breaker State Machine
~~~
             error_rate > threshold
  CLOSED ─────────────────────────► OPEN
    ▲                                │
    │ probe success                  │ cooldown expires
    │                                ▼
    └────────────────────────── HALF-OPEN
             probe failure ──► OPEN (cooldown * 2)
~~~
{: #fig-fsm title="Circuit Breaker State Machine"}
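The state machine can be sketched as a small Python class using the defaults given in the text (50% threshold, 60 s window, 30 s cooldown, 300 s cap). The class and method names are assumptions of this example, the clock is injectable for testing, and no minimum sample count is enforced, which a production breaker would probably want. The text does not name an ECT for the HALF-OPEN to OPEN transition, so the sketch emits nothing there.

```python
import time
from collections import deque


class CircuitBreaker:
    """Three-state breaker; record() returns the exec_act to emit, if any."""

    def __init__(self, threshold=0.5, window_s=60, cooldown_s=30,
                 max_cooldown_s=300, clock=time.monotonic):
        self.threshold = threshold
        self.window_s = window_s
        self.base_cooldown_s = cooldown_s
        self.cooldown_s = cooldown_s
        self.max_cooldown_s = max_cooldown_s
        self.clock = clock
        self.state = "CLOSED"
        self.opened_at = None
        self.samples = deque()  # (timestamp, succeeded) pairs

    def allow_request(self):
        """Return True if a request (or the single probe) may be sent."""
        if self.state == "OPEN" and self.clock() - self.opened_at >= self.cooldown_s:
            self.state = "HALF-OPEN"  # cooldown expired: allow one probe
            return True
        return self.state == "CLOSED"

    def record(self, succeeded):
        """Record a request outcome and apply any state transition."""
        now = self.clock()
        if self.state == "HALF-OPEN":
            if succeeded:
                self.state = "CLOSED"
                self.cooldown_s = self.base_cooldown_s
                self.samples.clear()
                return "atd:circuit_close"
            # Failed probe: back to OPEN with doubled, capped cooldown.
            self.cooldown_s = min(self.cooldown_s * 2, self.max_cooldown_s)
            self.state, self.opened_at = "OPEN", now
            return None
        self.samples.append((now, succeeded))
        while self.samples and now - self.samples[0][0] > self.window_s:
            self.samples.popleft()  # drop outcomes outside the window
        failures = sum(1 for _, ok in self.samples if not ok)
        if self.state == "CLOSED" and failures / len(self.samples) > self.threshold:
            self.state, self.opened_at = "OPEN", now
            return "atd:circuit_open"
        return None
```

A caller would check `allow_request()` before each downstream call, pass the outcome to `record()`, and emit an ECT whenever `record()` returns a value.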
## Coordinated Circuit Breaking
When multiple agents share a downstream dependency, each maintains
its own circuit breaker independently. However, agents SHOULD
publish circuit breaker state via their ECT stream so peers can
observe the signal.
If an orchestrator observes N circuit breakers (N being a
deployment-defined threshold) opening for the same downstream
agent within a short window, it SHOULD initiate a HITL escalation
rather than allowing N parallel recovery probes.
## Circuit Breaker Policy Configuration
Circuit breaker thresholds can be configured as ACP-DAG-HITL
node constraints:
~~~json
{
"constraints": {
"atd.circuit_threshold": 0.5,
"atd.circuit_window_s": 60
}
}
~~~
{: #fig-circuit-policy title="Circuit Breaker Policy"}
# Rollback Protocol {#rollback}
## Basic Rollback
A rollback is initiated by emitting a rollback request ECT and
sending an HTTP POST to the target agent's rollback endpoint:
~~~
POST /.well-known/atd/rollback HTTP/1.1
Content-Type: application/json
Execution-Context: <rollback-request-ect>
~~~
The rollback request ECT has:
- `exec_act`: `"atd:rollback_request"`
- `par`: the checkpoint ECT to roll back to
~~~json
{
"exec_act": "atd:rollback_request",
"par": ["ckpt-uuid"],
"ext": {
"atd.reason": "Upstream action caused cascading failure",
"atd.cascade": true
}
}
~~~
{: #fig-rollback-req title="Rollback Request ECT"}
When `atd.cascade` is `true`, the receiving agent MUST also
initiate rollback of any downstream checkpoints created as a
consequence of the checkpointed action.
The agent MUST respond with a rollback result ECT:
~~~json
{
"exec_act": "atd:rollback_result",
"par": ["rollback-request-uuid"],
"out_hash": "sha256-of-restored-state",
"ext": {
"atd.status": "completed",
"atd.checkpoint_id": "ckpt-uuid",
"atd.cascaded": [
{"agent": "spiffe://example.com/agent/c", "status": "completed"},
{"agent": "spiffe://example.com/agent/d", "status": "escalated"}
]
}
}
~~~
{: #fig-rollback-result title="Rollback Result ECT"}
Status values: `completed`, `partial`, `escalated`, `failed`.
`escalated` means the action was irreversible and a human operator
has been notified per ACP-DAG-HITL `unreachable_human` policy.
## Partial Rollback and Blast Radius Containment
When a failure occurs in the middle of a DAG, it is often
undesirable to roll back the entire workflow. ATD defines
partial rollback as rolling back the failed subgraph while
preserving completed sibling branches.
Partial rollback MUST only proceed if:
1. The checkpoints to be rolled back are in the same workflow
(`atd.wf_id`).
2. No completed sibling task depends on the output of the
failed task (verified by walking the DAG forward from the
checkpoint).
The blast radius is the set of agents holding checkpoints that
are descendants of the failed node. Orchestrators SHOULD
compute blast radius before initiating cascade rollback to
avoid unnecessary disruption.
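One way to compute the blast radius is a breadth-first walk forward from the failed node, collecting the agents that hold descendant checkpoints. The sketch below assumes ECTs are available as dicts and that each carries an `agent` key; that key is a stand-in for the ECT issuer identity and is not a claim defined here.

```python
from collections import deque


def blast_radius(failed_jti, ects):
    """Return the set of agents holding checkpoints that descend
    from the node identified by failed_jti."""
    # Index children by parent so the DAG can be walked forward.
    children = {}
    for ect in ects:
        for parent in ect.get("par", []):
            children.setdefault(parent, []).append(ect)

    agents, seen, queue = set(), {failed_jti}, deque([failed_jti])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child["jti"] in seen:
                continue  # already visited via another parent
            seen.add(child["jti"])
            queue.append(child["jti"])
            if child.get("exec_act") == "atd:checkpoint":
                agents.add(child["agent"])
    return agents
```

Sibling branches that do not descend from the failed node never enter the walk, which is what lets a partial rollback leave them untouched.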
## Rollback Timeout and Escalation
The rollback timeout is derived implicitly from the original
checkpoint's `atd.ttl`; the request carries no separate timeout
field. If rollback is not completed within `atd.ttl / 2` seconds,
the agent MUST:
1. Emit an `atd:error` with `atd.error_type: "timeout"` and an
`atd.description` noting the rollback timeout.
2. Escalate to HITL per {{hitl-escalation}}.
Agents MUST implement idempotent rollback: receiving the same
rollback request ECT `jti` twice MUST return the same result.
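Idempotency can be achieved by caching the result under the request's `jti`, as in the sketch below. The `RollbackHandler` class, the `restore` callback, and the in-memory cache are assumptions of this example; a real agent would persist results and sign the ECT.

```python
class RollbackHandler:
    """Caches results by request `jti` so retries return the same result."""

    def __init__(self, restore):
        self._restore = restore  # hypothetical callable(checkpoint_id) -> status
        self._results = {}       # request jti -> rollback result claims

    def handle(self, request_ect):
        jti = request_ect["jti"]
        if jti in self._results:
            # Duplicate delivery: return the stored result, do not restore again.
            return self._results[jti]
        checkpoint_id = request_ect["par"][0]
        result = {
            "exec_act": "atd:rollback_result",
            "par": [jti],
            "ext": {
                "atd.status": self._restore(checkpoint_id),
                "atd.checkpoint_id": checkpoint_id,
                "atd.cascaded": [],
            },
        }
        self._results[jti] = result
        return result


def rollback_deadline_s(checkpoint_ttl_s):
    """Seconds allowed for rollback: half of the checkpoint's atd.ttl."""
    return checkpoint_ttl_s / 2
```

For the example checkpoint with `atd.ttl` of 86400, `rollback_deadline_s` gives 43200 seconds.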
## Rollback Authorization {#rollback-authz}
Agents SHOULD authorize rollback requests only from agents within
the same workflow (`wid`) that have checkpoint lineage in the DAG.
Rollback requests from outside the originating workflow MUST be
rejected with HTTP 403.
# Interaction with HITL {#hitl-escalation}
ATD escalates to HITL in the following scenarios:
1. **Irreversible action failure**: An error ECT whose referenced
checkpoint (`atd.checkpoint_id`) has `atd.reversible: false` MUST
trigger HITL Level 2 (approval required) per the companion HITL
specification.
2. **Rollback failure**: A rollback result with `atd.status:
"failed"` MUST trigger HITL Level 3 (STOP) on the workflow.
3. **Cascaded rollback of critical nodes**: When `atd.cascade:
true` rollback propagates to a node with `atd.severity:
critical`, HITL SHOULD be triggered at Level 1 (PAUSE)
to allow human review before proceeding.
4. **Circuit breaker permanent open**: If a circuit breaker
re-opens after 3 successive failed HALF-OPEN probes, HITL
Level 2 escalation SHOULD be triggered.
ATD-to-HITL escalation is recorded as an ECT linked to both
the triggering error ECT and the HITL override ECT, preserving
the causal chain in the audit DAG.
# Resource Hints {#resources}
## Resource Claim Format
Agents MAY declare resource requirements as ACP-DAG-HITL node
constraints:
~~~json
{
"constraints": {
"atd.resource_cpu": "2",
"atd.resource_memory_mb": 4096,
"atd.resource_timeout_s": 300,
"atd.resource_priority": "high",
"atd.resource_gpu": "0",
"atd.resource_network_mbps": 100
}
}
~~~
{: #fig-resources title="Resource Hints as Node Constraints"}
## Priority Levels
The `atd.resource_priority` field MUST be one of: `critical`,
`high`, `normal`, `low`. Orchestrators SHOULD map these to
scheduling priority classes (e.g., Kubernetes QoS classes:
`critical` → Guaranteed, `high`/`normal` → Burstable, `low`
→ BestEffort).
## Fair-Share Scheduling
When multiple agents compete for a shared resource pool,
orchestrators SHOULD implement fair-share scheduling:
1. Each active workflow receives an equal base allocation.
2. Unused allocation from `low` priority agents is redistributed
to `high`/`critical` agents within the same scheduling cycle.
3. Starvation prevention: `low` priority agents MUST eventually
be scheduled within a configurable maximum wait (default: 300s).
## Unsatisfiable Resource Hints
Resource hints are advisory; agents MUST NOT depend on them for
correctness. When resource hints cannot be satisfied:
- If `atd.resource_priority` is `critical`: orchestrator SHOULD
pre-empt lower-priority tasks.
- If `critical` tasks still cannot be scheduled within 60s:
emit `atd:error` with `atd.error_type: "resource_exhausted"` and
escalate to HITL.
- All other priorities: proceed with degraded resources; log
a warning via `atd:error` with `atd.severity: "warning"`.
# Optional Declarative Workflow Format {#workflow-format}
To support pre-run planning and tooling, ATD defines an optional
declarative workflow descriptor. This is a planning artifact
only; at runtime it is realized as ECTs per this specification.
~~~json
{
"wf_id": "bgp-failover-v2",
"description": "BGP peer failover with validation",
"nodes": [
{
"id": "n1",
"label": "validate-config",
"reversible": true,
"hitl_required": false,
"resource_hints": {
"priority": "normal",
"timeout_s": 30
}
},
{
"id": "n2",
"label": "update-bgp-peer",
"reversible": true,
"hitl_required": true,
"resource_hints": {
"priority": "critical",
"timeout_s": 120
}
},
{
"id": "n3",
"label": "verify-session",
"reversible": false,
"hitl_required": false,
"resource_hints": {
"priority": "high",
"timeout_s": 60
}
}
],
"edges": [
{"from": "n1", "to": "n2"},
{"from": "n2", "to": "n3"}
]
}
~~~
{: #fig-workflow title="Declarative Workflow Descriptor"}
The workflow descriptor media type is
`application/atd-workflow+json`. Orchestrators MAY store and
version workflow descriptors independently of their ECT runtime
realization.
The `hitl_required` field signals to the HITL system that this
node requires an approval gate as defined in the companion HITL
specification.
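Tooling that consumes the descriptor might start with a structural check like the sketch below. The function names and the specific checks are assumptions of this example; the document does not define a validation procedure.

```python
def check_descriptor(descriptor):
    """Return a list of structural problems found in a descriptor dict."""
    ids = [node["id"] for node in descriptor.get("nodes", [])]
    problems = []
    if len(ids) != len(set(ids)):
        problems.append("duplicate node id")
    for edge in descriptor.get("edges", []):
        for end in ("from", "to"):
            if edge[end] not in ids:
                problems.append(f"edge references unknown node {edge[end]!r}")
    return problems


def approval_gated_nodes(descriptor):
    """Return the IDs of nodes flagged with hitl_required."""
    return [node["id"] for node in descriptor.get("nodes", [])
            if node.get("hitl_required")]
```

Applied to the descriptor in the figure above, `check_descriptor` finds no problems and `approval_gated_nodes` returns only `n2`.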
# Security Considerations
## Rollback Authorization
Rollback requests are high-privilege operations. Agents MUST
authenticate rollback requests using the ECT identity binding
(L2/L3). The rollback endpoint MUST require mutual TLS or a
signed JWT from an agent within the same workflow DAG.
Agents SHOULD authorize a rollback request only if the requester
is an ancestor, in the ECT DAG, of the checkpoint being rolled back.
## Checkpoint Confidentiality
Checkpoint data may contain sensitive system state (API keys,
session tokens, configuration). Agents:
- MUST encrypt stored checkpoints at rest.
- MUST reference checkpoint state in ECTs only via `out_hash`.
- MUST NOT include checkpoint contents in error ECTs.
## False Error Injection
A malicious agent could send false `atd:error` ECTs to trigger
unnecessary rollbacks and disrupt workflows. Mitigation:
- Agents SHOULD verify that error ECTs reference valid `par`
values within their own workflow DAG (`wid` claim).
- Rollback MUST require authentication (see {{rollback-authz}}).
- L2/L3 ECT signing prevents unauthenticated error injection.
## Checkpoint Flooding
An adversary could exhaust checkpoint storage by triggering
many checkpoints. Mitigation:
- Agents SHOULD enforce a maximum checkpoint count per workflow.
- Expired checkpoints (past `atd.ttl`) MUST be purged.
- Checkpoint creation rate SHOULD be rate-limited per calling
workflow.
## Circuit Breaker State Leakage
The `atd:circuit_open` ECT reveals system health topology. The
audit ledger SHOULD enforce access controls: only agents within
the same workflow or authorized operators SHOULD be able to query
circuit breaker history.
# IANA Considerations
This document requests registration of the following values in
the AEM Ecosystem Extension Registry established by
draft-aem-agent-ecosystem-model:
## `exec_act` Values
| Value | Description | Reference |
|-------|-------------|-----------|
| `atd:checkpoint` | State snapshot before consequential action | This document |
| `atd:error` | Error signal with severity and type | This document |
| `atd:circuit_open` | Circuit breaker opened to downstream agent | This document |
| `atd:circuit_close` | Circuit breaker returned to CLOSED state | This document |
| `atd:rollback_request` | Initiate rollback to named checkpoint | This document |
| `atd:rollback_result` | Result of rollback attempt | This document |
| `atd:workflow_start` | Workflow began execution | This document |
| `atd:workflow_complete` | Workflow reached terminal state | This document |
{: #fig-iana-actions title="ATD exec_act Registrations"}
## Well-Known URI
This document requests registration of `atd/rollback` as a
well-known URI suffix per {{RFC8615}}.
## Media Type
This document requests registration of
`application/atd-workflow+json` for the declarative workflow
descriptor format defined in {{workflow-format}}.
--- back
# Acknowledgments
{:numbered="false"}
ATD builds on ECT {{I-D.nennemann-wimse-ect}} for execution
evidence and ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}}
for delegation policy. The circuit breaker pattern is adapted
from microservice architecture best practices. The declarative
workflow format is inspired by workflow description languages
(BPEL, BPMN) adapted for lightweight agent coordination.

@@ -0,0 +1,368 @@
---
title: "Human-in-the-Loop (HITL) Primitives for Agent Ecosystems"
abbrev: "HITL"
category: std
docname: draft-hitl-human-in-the-loop-00
submissiontype: IETF
number:
date:
v: 3
area: "OPS"
workgroup: "NMOP"
keyword:
- human override
- HITL
- emergency stop
- agentic safety
author:
-
fullname: TBD
organization: Independent
email: placeholder@example.com
normative:
RFC2119:
RFC8174:
RFC7519:
RFC8446:
RFC8615:
I-D.nennemann-wimse-ect:
title: "Execution Context Tokens for Distributed Agentic Workflows"
target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
I-D.nennemann-agent-dag-hitl-safety:
title: "Agent Context Policy Token: DAG Delegation with Human Override"
target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
informative:
--- abstract
This document defines runtime HITL (Human-in-the-Loop) primitives
for agent ecosystems: four escalating override levels, approval
gates, escalation paths, and explainability hooks. ACP-DAG-HITL
defines WHEN humans must intervene (policy rules and triggers).
This specification defines HOW the intervention actually happens at
the protocol level: the HTTP endpoints, override semantics, agent
compliance requirements, and acknowledgment flows. All overrides
and decisions produce ECT nodes, making human interventions part of
the same auditable DAG as agent actions.
--- middle
# Introduction
The current ratio of autonomous capability drafts to human
oversight drafts in the IETF is roughly 7:1. Agents can act but
humans cannot reliably stop them.
ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}} defines the
policy: trigger conditions, required roles, and actions (`pause`,
`escalate`, `abort`). But it deliberately defers the runtime
protocol — how does an operator actually send a stop command? How
does the agent acknowledge it? What happens if the operator is
unreachable?
This specification fills that gap. It is the runtime enforcement
companion to ACP-DAG-HITL, inspired by industrial safety systems:
the e-stop button on factory equipment, the circuit breaker in
electrical systems, and the kill switch in robotics.
HITL is deliberately not a governance framework, policy language,
or accountability protocol. It is a panic button with a
well-defined interface.
# Conventions and Definitions
{::boilerplate bcp14-tagged}
Override:
: A human-initiated command that alters an agent's autonomous
operation, taking precedence over the agent's own decisions.
Operator:
: A human user authorized to issue override commands.
Approval Gate:
: A DAG node that blocks workflow progression until a human
approves or rejects continuation.
# Relationship to ACP-DAG-HITL {#mapping}
ACP-DAG-HITL defines three HITL actions. This specification
maps them to three of its four runtime override levels and adds
a fourth, CONSTRAIN (partial restriction), which has no
ACP-DAG-HITL equivalent:
| ACP-DAG-HITL action | HITL Override Level | Behavior |
|---------------------|---------------------|----------|
| `pause` | Level 1: PAUSE | Suspend autonomous actions, hold state |
| (no equivalent) | Level 2: CONSTRAIN | Restrict to an allowlist of actions |
| `abort` | Level 3: STOP | Cease all actions, enter inert state |
| `escalate` | Level 4: TAKEOVER | Transfer control to human operator |
{: #fig-mapping title="ACP-DAG-HITL to HITL Level Mapping"}
When ACP-DAG-HITL rules trigger, the runtime system uses the
corresponding HITL level to enforce the action.
# Override Levels {#levels}
## Level 1: PAUSE
The agent MUST suspend all autonomous actions and hold current
state. It MUST NOT initiate new actions but MAY complete
in-progress actions if stopping mid-execution would cause harm
(e.g., an in-flight database transaction). The agent resumes
when a RESUME command is received.
## Level 2: CONSTRAIN
The agent MUST restrict its actions to a specified subset. The
override command includes an allowlist of permitted action types.
The agent MUST reject any action not on the allowlist.
## Level 3: STOP
The agent MUST immediately cease all autonomous actions and enter
an inert state. It MUST NOT take any autonomous actions until
explicitly restarted. This is the e-stop.
## Level 4: TAKEOVER
The agent MUST transfer operational control to the human operator.
It enters a pass-through mode where it executes only explicit
operator commands. The agent's sensors and outputs remain
available to the operator as tools.
# Override Protocol {#protocol}
## Override Command
Override commands are sent as HTTP POST to the agent's well-known
endpoint:
~~~
POST /.well-known/hitl/override HTTP/1.1
Content-Type: application/json
Authorization: Bearer <operator-jwt>
Execution-Context: <override-ect>
~~~
The override ECT MUST contain:
- `exec_act`: `"hitl:override"`
- `par`: the most recent ECT from the agent being overridden
(linking the override into the workflow DAG)
~~~json
{
"exec_act": "hitl:override",
"par": ["agent-last-action-ect"],
"ext": {
"hitl.level": 3,
"hitl.reason": "Agent blocking legitimate traffic",
"hitl.operator_id": "user:alice",
"hitl.scope": "*",
"hitl.constraints": null,
"hitl.ttl": null
}
}
~~~
{: #fig-override title="Override ECT"}
Field definitions:
- `hitl.level`: Integer 1-4. MUST be present.
- `hitl.reason`: Human-readable text. MUST be logged.
- `hitl.scope`: `"*"` for all functions, or an array of function
IDs for partial override.
- `hitl.constraints`: For Level 2 only. Array of permitted action
types.
- `hitl.ttl`: Duration in seconds. If set, override auto-expires.
If null, persists until explicitly lifted.
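The field definitions above could be checked with a validator along these lines. It is a sketch under assumptions: the function name is hypothetical, a missing `hitl.scope` is treated as an error, and Level 2 is taken to require a `hitl.constraints` array, which the text implies but does not state with a requirement keyword.

```python
def validate_override_ext(ext):
    """Return a list of violations found in an override ECT's `ext` claims."""
    errors = []
    level = ext.get("hitl.level")
    # type() check rather than isinstance() so that True/False are rejected.
    if type(level) is not int or not 1 <= level <= 4:
        errors.append("hitl.level must be an integer 1-4")

    scope = ext.get("hitl.scope")
    is_id_list = isinstance(scope, list) and all(isinstance(s, str) for s in scope)
    if scope != "*" and not is_id_list:
        errors.append('hitl.scope must be "*" or an array of function IDs')

    constraints = ext.get("hitl.constraints")
    if level == 2 and not isinstance(constraints, list):
        errors.append("Level 2 requires hitl.constraints (array of action types)")
    if level != 2 and constraints is not None:
        errors.append("hitl.constraints applies to Level 2 only")

    ttl = ext.get("hitl.ttl")
    if ttl is not None:
        numeric = isinstance(ttl, (int, float)) and not isinstance(ttl, bool)
        if not numeric or ttl <= 0:
            errors.append("hitl.ttl must be null or a positive number of seconds")
    return errors
```

An empty list means the claims are structurally acceptable; authentication and signature checks are separate steps.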
## Acknowledgment
The agent MUST respond with an acknowledgment ECT:
- `exec_act`: `"hitl:ack"`
- `par`: the override ECT
~~~json
{
"exec_act": "hitl:ack",
"par": ["override-ect-uuid"],
"ext": {
"hitl.status": "accepted",
"hitl.prior_state": "autonomous",
"hitl.current_state": "stopped",
"hitl.effective_at": "2026-03-01T12:00:00.123Z"
}
}
~~~
{: #fig-ack title="Acknowledgment ECT"}
The override/ack ECT pair serves as the Decision Record defined
in ACP-DAG-HITL Section 6.5. No separate audit mechanism is
needed.
## Resume and Lift
To resume from PAUSE:
~~~
POST /.well-known/hitl/resume HTTP/1.1
Execution-Context: <resume-ect with exec_act="hitl:resume">
~~~
To lift any override:
~~~
POST /.well-known/hitl/lift HTTP/1.1
Execution-Context: <lift-ect with exec_act="hitl:lift">
~~~
Both produce ECTs linked to the original override ECT via `par`.
# Agent Compliance Requirements {#compliance}
Every HITL-compliant agent is subject to the following
requirements:
1. It MUST implement the `/.well-known/hitl/override` endpoint.
2. It MUST process override commands within 1 second of receipt.
The override path MUST be independent of the agent's main
processing loop.
3. It MUST acknowledge every override with an ECT response.
4. It MUST NOT respond with status `rejected`; overrides are
mandatory. If the agent cannot fully comply, it MUST respond
with status `partial` and describe what it could not do.
5. It MUST expose current override status at:
~~~
GET /.well-known/hitl/status
~~~
~~~json
{
"agent_id": "spiffe://example.com/agent/firewall",
"override_active": true,
"current_level": 3,
"override_ect": "override-ect-uuid",
"since": "2026-03-01T12:00:00Z",
"operator_id": "user:alice"
}
~~~
{: #fig-status title="Override Status Response"}
# Approval Gates {#approval-gates}
An approval gate is a DAG node that blocks workflow progression
until a human approves. Unlike overrides (which interrupt running
agents), approval gates are planned checkpoints in the workflow.
Approval gates are defined as ACP-DAG-HITL nodes with HITL rules:
~~~json
{
"dag": {
"nodes": [
{
"id": "n-approve",
"type": "hitl:approval_gate",
"agent": "system:hitl-gateway",
"constraints": {
"hitl.required_role": "clinician:oncall",
"hitl.timeout_s": 300,
"hitl.timeout_action": "safe_pause"
}
}
]
}
}
~~~
{: #fig-gate title="Approval Gate as DAG Node"}
When the workflow reaches an approval gate, the system:
1. Emits an ECT with `exec_act: "hitl:approval_request"`
2. Notifies the required human role
3. Waits for approval (ECT: `"hitl:approval_granted"`) or
rejection (ECT: `"hitl:approval_denied"`)
4. On timeout, applies `hitl.timeout_action`
# Broadcast Override {#broadcast}
For environments with many agents, an operator MAY send a
broadcast override to a management endpoint:
~~~
POST /hitl/broadcast HTTP/1.1
Content-Type: application/json
Execution-Context: <broadcast-override-ect>

{
  "targets": ["spiffe://example.com/agent/a",
              "spiffe://example.com/agent/b"],
  "level": 3,
  "reason": "Coordinated emergency stop"
}
~~~
The broadcast endpoint fans out individual override ECTs to each
target and returns per-agent results.
# Dead Man's Switch {#dead-man}
For maximum reliability, agents SHOULD implement a heartbeat
mechanism: the agent periodically pings an operator heartbeat
endpoint. If the heartbeat is missed for a configurable duration,
the agent automatically enters Level 1 (PAUSE).
This provides a safety net when network connectivity to the
operator is lost. The `unreachable_human` policy from
ACP-DAG-HITL governs behavior when the dead man's switch
activates: either `abort` or `safe_pause`.
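The heartbeat behavior can be sketched as follows. The class name, the `max_silence_s` parameter, and the use of level `0` to mean "no override active" are assumptions of this example; the sketch covers only the automatic entry into Level 1 (PAUSE) and does not model the `unreachable_human` policy choice.

```python
import time


class DeadMansSwitch:
    """Tracks operator heartbeats and pauses the agent when they stop."""

    def __init__(self, max_silence_s, clock=time.monotonic):
        self.max_silence_s = max_silence_s
        self.clock = clock
        self.last_ok = clock()
        self.level = 0  # 0 = no override active (assumed), 1 = PAUSE

    def heartbeat_ok(self):
        """Call when the operator heartbeat endpoint answered."""
        self.last_ok = self.clock()

    def check(self):
        """Enter Level 1 (PAUSE) if heartbeats have been missed too long."""
        if self.level == 0 and self.clock() - self.last_ok > self.max_silence_s:
            self.level = 1
        return self.level
```

Once paused, the sketch stays paused even if heartbeats return; leaving PAUSE is left to an explicit resume, which matches the override semantics described earlier.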
# Security Considerations
Override commands are high-privilege operations. All override
endpoints MUST require authentication via mutual TLS or signed
JWTs.
Override ECTs MUST be signed at L2 or L3. Agents MUST verify
signatures before processing.
To prevent replay attacks, agents MUST reject override ECTs with
`iat` more than 30 seconds in the past. The `jti` MUST be unique;
agents MUST reject duplicate `jti` values.
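A minimal sketch of the replay checks above: reject tokens older than 30 seconds and reject any `jti` already seen. The `ReplayGuard` class is hypothetical, `iat` is assumed to be seconds since the epoch as in JWT, and clock skew and future-dated `iat` values are not given special handling here.

```python
import time


class ReplayGuard:
    """Rejects stale or duplicate override ECTs."""

    def __init__(self, max_age_s=30, clock=time.time):
        self.max_age_s = max_age_s
        self.clock = clock
        self.seen = {}  # jti -> iat, kept only while the token is still fresh

    def accept(self, ect):
        now = self.clock()
        # Forget jti values whose tokens would now be rejected as too old.
        self.seen = {j: t for j, t in self.seen.items()
                     if now - t <= self.max_age_s}
        iat, jti = ect.get("iat"), ect.get("jti")
        if iat is None or jti is None:
            return False
        if now - iat > self.max_age_s:
            return False  # issued more than max_age_s seconds ago
        if jti in self.seen:
            return False  # duplicate jti
        self.seen[jti] = iat
        return True
```

Because stale tokens are rejected on age alone, the set of remembered `jti` values only needs to cover the freshness window, which keeps memory bounded.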
Deployments SHOULD implement multi-operator approval for Level 4
(TAKEOVER), requiring two independent operator identities.
The override endpoint SHOULD be served on a separate port or
network interface from the agent's main API to ensure availability
during overload.
# IANA Considerations
This document requests the following registrations:
1. Well-known URI registrations for `hitl/override`,
`hitl/resume`, `hitl/lift`, and `hitl/status` per {{RFC8615}}.
2. Registration of `exec_act` values: `hitl:override`,
`hitl:ack`, `hitl:resume`, `hitl:lift`,
`hitl:approval_request`, `hitl:approval_granted`,
`hitl:approval_denied` in a future ECT action type registry.
--- back
# Acknowledgments
{:numbered="false"}
This specification is the runtime enforcement companion to
ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}}. Override
design is inspired by industrial safety systems (IEC 62061,
ISO 13849).

@@ -0,0 +1,612 @@
---
title: "Human-in-the-Loop (HITL) Primitives for Agent Ecosystems"
abbrev: "HITL"
category: std
docname: draft-hitl-human-in-the-loop-01
submissiontype: IETF
number:
date:
v: 3
area: "OPS"
workgroup: "NMOP"
keyword:
- human override
- HITL
- emergency stop
- agentic safety
- explainability
author:
-
fullname: TBD
organization: Independent
email: placeholder@example.com
normative:
RFC2119:
RFC8174:
RFC7519:
RFC8446:
RFC8615:
RFC9110:
I-D.nennemann-wimse-ect:
title: "Execution Context Tokens for Distributed Agentic Workflows"
target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
I-D.nennemann-agent-dag-hitl-safety:
title: "Agent Context Policy Token: DAG Delegation with Human Override"
target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
informative:
--- abstract
This document defines runtime HITL (Human-in-the-Loop) primitives
for agent ecosystems: four escalating override levels, approval
gates, timeout and fallback policies, and explainability hooks.
ACP-DAG-HITL defines WHEN humans must intervene (policy rules and
triggers). This specification defines HOW the intervention
actually happens at the protocol level: the HTTP endpoints,
override semantics, agent compliance requirements,
acknowledgment flows, and explainability tokens that allow
operators to make informed decisions. All overrides and decisions
produce ECT nodes, making human interventions part of the same
auditable DAG as agent actions.
--- middle
# Introduction
The current ratio of autonomous capability drafts to human
oversight drafts in the IETF is roughly 7:1. Agents can act but
humans cannot reliably stop them.
ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}} defines the
policy: trigger conditions, required roles, and actions (`pause`,
`escalate`, `abort`). But it deliberately defers the runtime
protocol — how does an operator actually send a stop command? How
does the agent acknowledge it? What happens if the operator is
unreachable?
This specification fills that gap. It is the runtime enforcement
companion to ACP-DAG-HITL, inspired by industrial safety systems:
the e-stop button on factory equipment, the circuit breaker in
electrical systems, and the kill switch in robotics.
HITL is deliberately not a governance framework, policy language,
or accountability protocol. It is a panic button with a
well-defined interface.
# Conventions and Definitions
{::boilerplate bcp14-tagged}
Override:
: A human-initiated command that alters an agent's autonomous
operation, taking precedence over the agent's own decisions.
Operator:
: A human user authorized to issue override commands.
Approval Gate:
: A DAG node that blocks workflow progression until a human
approves or rejects continuation.
HITL Intensity Level:
: A deployment-wide configuration of how actively human oversight
is required. Distinct from override levels (which are runtime
commands).
# HITL Intensity Levels {#intensity}
A deployment configures a HITL intensity level that determines
the baseline human oversight requirement. This is orthogonal to
the four runtime override levels ({{levels}}): intensity levels
govern planning; override levels govern runtime intervention.
| Intensity | Label | Human requirement | When to use |
|-----------|-------|-------------------|-------------|
| I0 | Autonomous | No HITL required by default | Dev/test; fully trusted agents |
| I1 | Advisory | Notifications; no blocking | Monitoring-only production deployments |
| I2 | Selective | Approval required on critical paths only | Standard production cross-org deployments |
| I3 | Mandatory | Approval required on every consequential action | Regulated environments; EU AI Act high-risk systems |
{: #fig-intensity title="HITL Intensity Levels"}
Intensity levels are declared in ACP-DAG-HITL workflow policy and
map to AEM assurance levels (see {{assurance-binding}}):
| HITL Intensity | Minimum AEM Assurance Level |
|---------------|----------------------------|
| I0 | L1 |
| I1 | L1 |
| I2 | L2 |
| I3 | L3 |
{: #fig-intensity-assurance title="Intensity to Assurance Level Mapping"}
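As a non-normative illustration, the mapping above can be checked at deployment time. The sketch below assumes assurance levels are written as the strings `L1` to `L3`; the function name `assurance_satisfies` is hypothetical and not defined by this specification.

~~~python
# Minimum AEM assurance level per HITL intensity, copied from the
# intensity-to-assurance mapping table.
MIN_ASSURANCE = {"I0": 1, "I1": 1, "I2": 2, "I3": 3}

# Assumption: assurance levels are the strings "L1".."L3".
ASSURANCE_RANK = {"L1": 1, "L2": 2, "L3": 3}


def assurance_satisfies(intensity: str, assurance_level: str) -> bool:
    """Return True if the assurance level meets the minimum for the intensity."""
    if intensity not in MIN_ASSURANCE:
        raise ValueError(f"unknown HITL intensity: {intensity!r}")
    if assurance_level not in ASSURANCE_RANK:
        raise ValueError(f"unknown assurance level: {assurance_level!r}")
    return ASSURANCE_RANK[assurance_level] >= MIN_ASSURANCE[intensity]
~~~

A deployment loader could call this once per workflow policy and refuse to start when it returns `False`.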
# Relationship to ACP-DAG-HITL {#mapping}
ACP-DAG-HITL defines three HITL actions. This specification maps
each of them to a runtime override level and adds a fourth level,
CONSTRAIN (partial restriction), which has no ACP-DAG-HITL equivalent:
| ACP-DAG-HITL action | HITL Override Level | Behavior |
|---------------------|---------------------|----------|
| `pause` | Level 1: PAUSE | Suspend autonomous actions, hold state |
| (no equivalent) | Level 2: CONSTRAIN | Restrict to an allowlist of actions |
| `abort` | Level 3: STOP | Cease all actions, enter inert state |
| `escalate` | Level 4: TAKEOVER | Transfer control to human operator |
{: #fig-mapping title="ACP-DAG-HITL to HITL Level Mapping"}
When ACP-DAG-HITL rules trigger, the runtime system uses the
corresponding HITL level to enforce the action.
# Override Levels {#levels}
## Level 1: PAUSE
The agent MUST suspend all autonomous actions and hold current
state. It MUST NOT initiate new actions but MAY complete
in-progress actions if stopping mid-execution would cause harm
(e.g., an in-flight database transaction). The agent resumes
when a RESUME command is received.
## Level 2: CONSTRAIN
The agent MUST restrict its actions to a specified subset. The
override command includes an allowlist of permitted action types.
The agent MUST reject any action not on the allowlist, responding
with HTTP 403 and an ECT noting the constraint violation.
## Level 3: STOP
The agent MUST immediately cease all autonomous actions and enter
an inert state. It MUST NOT take any autonomous actions until
explicitly restarted. This is the e-stop. Any in-progress
consequential actions MUST be abandoned; if abandonment would
leave external state inconsistent, the agent MUST emit an
`atd:error` ECT and the ATD rollback protocol applies.
## Level 4: TAKEOVER
The agent MUST transfer operational control to the human operator.
It enters a pass-through mode where it executes only explicit
operator commands. The agent's sensors and outputs remain
available to the operator as tools. Deployments SHOULD require
two-operator authorization for TAKEOVER (see {{security}}).
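The following non-normative sketch shows one way an agent might gate new actions on the active override level. The names `Level`, `OverrideState`, and `may_start` are illustrative assumptions. Completion of in-progress actions under PAUSE and rollback under STOP are deliberately left out.

~~~python
from enum import IntEnum


class Level(IntEnum):
    PAUSE = 1
    CONSTRAIN = 2
    STOP = 3
    TAKEOVER = 4


# ACP-DAG-HITL action -> override level, per the mapping table.
ACTION_TO_LEVEL = {
    "pause": Level.PAUSE,
    "abort": Level.STOP,
    "escalate": Level.TAKEOVER,
}


class OverrideState:
    """Tracks the active override and decides whether a new action may start."""

    def __init__(self):
        self.level = None          # None means no override is active
        self.allowlist = frozenset()

    def apply(self, level, allowlist=()):
        self.level = Level(level)
        # The allowlist is only meaningful at Level 2 (CONSTRAIN).
        if self.level is Level.CONSTRAIN:
            self.allowlist = frozenset(allowlist)
        else:
            self.allowlist = frozenset()

    def lift(self):
        self.level = None
        self.allowlist = frozenset()

    def may_start(self, action_type, operator_commanded=False):
        if self.level is None:
            return True
        if self.level is Level.CONSTRAIN:
            return action_type in self.allowlist
        if self.level is Level.TAKEOVER:
            # Pass-through mode: only explicit operator commands run.
            return operator_commanded
        return False  # PAUSE and STOP: no new autonomous actions
~~~

Keeping the decision in a single predicate makes it straightforward to call from a path that is independent of the agent's main processing loop.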
# Override Protocol {#protocol}
## Override Command
Override commands are sent as HTTP POST to the agent's well-known
endpoint:
~~~
POST /.well-known/hitl/override HTTP/1.1
Content-Type: application/json
Authorization: Bearer <operator-jwt>
Execution-Context: <override-ect>
~~~
The override ECT MUST contain:
- `exec_act`: `"hitl:override"`
- `par`: the most recent ECT from the agent being overridden
(linking the override into the workflow DAG)
~~~json
{
"exec_act": "hitl:override",
"par": ["agent-last-action-ect"],
"ext": {
"hitl.level": 3,
"hitl.reason": "Agent blocking legitimate traffic",
"hitl.operator_id": "user:alice",
"hitl.scope": "*",
"hitl.constraints": null,
"hitl.ttl": null,
"hitl.nonce": "a3f8b2c1"
}
}
~~~
{: #fig-override title="Override ECT"}
Field definitions:
- `hitl.level`: Integer 1-4. MUST be present.
- `hitl.reason`: Human-readable text. MUST be logged.
- `hitl.operator_id`: Identifier of the operator issuing the override.
- `hitl.scope`: `"*"` for all functions, or an array of function
IDs for partial override.
- `hitl.constraints`: For Level 2 only. Array of permitted action
types.
- `hitl.ttl`: Duration in seconds. If set, override auto-expires.
If null, persists until explicitly lifted.
- `hitl.nonce`: REQUIRED. A random value to prevent replay attacks.
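As a non-normative illustration, the field rules above could be checked as follows. The function name `validate_override_ext` is hypothetical. Treating a non-positive `hitl.ttl` as invalid and requiring `hitl.constraints` to be a list at Level 2 are assumptions of this sketch, not requirements stated in this document.

~~~python
def validate_override_ext(ext: dict) -> list[str]:
    """Return a list of problems found in an override ECT's `ext` claim."""
    errors = []

    level = ext.get("hitl.level")
    # type(...) is int excludes booleans, which are ints in Python.
    if type(level) is not int or not 1 <= level <= 4:
        errors.append("hitl.level must be an integer 1-4")

    if not ext.get("hitl.nonce"):
        errors.append("hitl.nonce is required")

    scope = ext.get("hitl.scope")
    if scope is not None and scope != "*":
        if not (isinstance(scope, list)
                and all(isinstance(s, str) for s in scope)):
            errors.append('hitl.scope must be "*" or an array of function IDs')

    constraints = ext.get("hitl.constraints")
    if level == 2:
        if not isinstance(constraints, list):
            errors.append("Level 2 requires hitl.constraints (allowlist)")
    elif constraints is not None:
        errors.append("hitl.constraints is only valid at Level 2")

    ttl = ext.get("hitl.ttl")
    if ttl is not None:
        if isinstance(ttl, bool) or not isinstance(ttl, (int, float)) or ttl <= 0:
            errors.append("hitl.ttl must be a positive number of seconds or null")

    return errors
~~~

Because an agent cannot reject an override outright, an implementation would more plausibly log these problems and report `partial` status than refuse the command.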
## Acknowledgment
The agent MUST respond within 1 second with an acknowledgment ECT:
- `exec_act`: `"hitl:ack"`
- `par`: the override ECT
~~~json
{
"exec_act": "hitl:ack",
"par": ["override-ect-uuid"],
"ext": {
"hitl.status": "accepted",
"hitl.prior_state": "autonomous",
"hitl.current_state": "stopped",
"hitl.effective_at": "2026-03-01T12:00:00.123Z"
}
}
~~~
{: #fig-ack title="Acknowledgment ECT"}
The override/ack ECT pair serves as the Decision Record defined
in ACP-DAG-HITL Section 6.5. No separate audit mechanism is
needed.
## Resume and Lift
To resume from PAUSE:
~~~
POST /.well-known/hitl/resume HTTP/1.1
Execution-Context: <resume-ect with exec_act="hitl:resume">
~~~
To lift any override:
~~~
POST /.well-known/hitl/lift HTTP/1.1
Execution-Context: <lift-ect with exec_act="hitl:lift">
~~~
Both produce ECTs linked to the original override ECT via `par`.
# Agent Compliance Requirements {#compliance}
Every HITL-compliant agent MUST:
1. Implement the `/.well-known/hitl/override` endpoint per
{{RFC8615}}.
2. Process override commands within 1 second of receipt. The
override path MUST be independent of the agent's main
processing loop and MUST NOT be blocked by ongoing tasks.
3. Acknowledge every override with an ECT response.
4. Accept every override. An agent MUST NOT respond with status
`rejected`; overrides are mandatory. If the agent cannot fully
comply, it MUST respond with status `partial` and describe what
it could not do.
5. Expose current override status at:
~~~
GET /.well-known/hitl/status
~~~
~~~json
{
"agent_id": "spiffe://example.com/agent/firewall",
"override_active": true,
"current_level": 3,
"override_ect": "override-ect-uuid",
"since": "2026-03-01T12:00:00Z",
"operator_id": "user:alice"
}
~~~
{: #fig-status title="Override Status Response"}
In addition, the override endpoint SHOULD be served on a separate
port or network interface from the agent's main API to ensure
availability under load.
# Approval Gates {#approval-gates}
An approval gate is a DAG node that blocks workflow progression
until a human approves. Unlike overrides (which interrupt running
agents), approval gates are planned checkpoints in the workflow.
Approval gates are defined as ACP-DAG-HITL nodes with HITL rules:
~~~json
{
"dag": {
"nodes": [
{
"id": "n-approve",
"type": "hitl:approval_gate",
"agent": "system:hitl-gateway",
"constraints": {
"hitl.required_role": "clinician:oncall",
"hitl.timeout_s": 300,
"hitl.timeout_action": "fail-closed"
}
}
]
}
}
~~~
{: #fig-gate title="Approval Gate as DAG Node"}
When the workflow reaches an approval gate, the system:
1. Emits an ECT with `exec_act: "hitl:approval_request"`.
2. Notifies the required human role with an explainability
token (see {{explainability}}).
3. Waits for approval (ECT: `"hitl:approval_granted"`) or
rejection (ECT: `"hitl:approval_denied"`).
4. On timeout, applies `hitl.timeout_action` per {{timeout}}.
## Approval Request and Response ECTs
~~~json
{
"exec_act": "hitl:approval_request",
"par": ["pre-gate-ect-uuid"],
"ext": {
"hitl.required_role": "clinician:oncall",
"hitl.context": "Medication dosage adjustment for patient P-1042",
"hitl.timeout_s": 300,
"hitl.explainability_ref": "expl-ect-uuid"
}
}
~~~
{: #fig-approval-req title="Approval Request ECT"}
~~~json
{
"exec_act": "hitl:approval_granted",
"par": ["approval-request-ect-uuid"],
"ext": {
"hitl.operator_id": "user:dr-jones",
"hitl.scope": "medication:adjust",
"hitl.expires": "2026-03-01T13:00:00Z"
}
}
~~~
{: #fig-approval-grant title="Approval Granted ECT"}
~~~json
{
"exec_act": "hitl:approval_denied",
"par": ["approval-request-ect-uuid"],
"ext": {
"hitl.operator_id": "user:dr-jones",
"hitl.reason": "Dosage exceeds safe maximum for patient weight",
"hitl.alternative": "Use standard protocol dosage"
}
}
~~~
{: #fig-approval-deny title="Approval Denied ECT"}
# Timeout and Fallback Policy {#timeout}
When a human does not respond within `hitl.timeout_s`, the
agent applies `hitl.timeout_action`. Three policies are
supported:
fail-closed:
: Abort the workflow. The agent emits `atd:error` with
`error_type: "timeout"` and the ATD rollback protocol
applies. Use when safety requires no action over wrong action.
fail-open:
: Continue as if approved, recording an audit ECT that no human
approved. Use only when workflow continuity is more important
than human review (I0/I1 intensity deployments).
escalate:
: Move the approval request to the next operator in the
escalation chain (defined in ACP-DAG-HITL policy). If the
escalation chain is exhausted, fall back to `fail-closed`.
The timeout policy is set in ACP-DAG-HITL node constraints:
~~~json
{
"constraints": {
"hitl.timeout_s": 300,
"hitl.timeout_action": "escalate"
}
}
~~~
{: #fig-timeout title="Timeout Policy as Node Constraint"}
Timeout policy MUST be `fail-closed` at HITL intensity I3.
Timeout policy MUST NOT be `fail-open` when assurance level is L3.
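The timeout rules above can be sketched as a small decision function. This is a non-normative illustration: the name `resolve_timeout`, its arguments, and the representation of the escalation chain as a list of remaining operator roles are assumptions.

~~~python
POLICIES = ("fail-closed", "fail-open", "escalate")


def resolve_timeout(action, intensity, assurance, remaining_chain):
    """Decide what happens when an approval request times out.

    Returns (outcome, next_operator). Raises ValueError when the
    configured policy violates the constraints in this section.
    """
    if action not in POLICIES:
        raise ValueError(f"unsupported timeout policy: {action!r}")
    if intensity == "I3" and action != "fail-closed":
        raise ValueError("intensity I3 requires fail-closed")
    if assurance == "L3" and action == "fail-open":
        raise ValueError("fail-open is not permitted at L3")

    if action == "escalate":
        if remaining_chain:
            # Hand the request to the next operator in the chain.
            return ("escalate", remaining_chain[0])
        # Chain exhausted: fall back to fail-closed.
        return ("fail-closed", None)
    return (action, None)
~~~

Validating the policy against intensity and assurance level when the workflow is loaded, rather than at timeout, would surface misconfiguration earlier; the check is shown inline here only for brevity.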
# Explainability {#explainability}
When a HITL point is triggered, the agent SHOULD provide an
explainability token that allows the operator to make an informed
decision. At AEM assurance L2+, explainability is REQUIRED for
approval gate requests.
An explainability token is an ECT:
- `exec_act`: `"hitl:explanation"`
~~~json
{
"exec_act": "hitl:explanation",
"par": ["last-agent-action-ect"],
"ext": {
"hitl.summary": "Agent proposes to reroute BGP traffic from AS64496 to AS64497 due to packet loss exceeding 15% threshold over 5-minute window.",
"hitl.proposed_action": "update-bgp-peer router-07 neighbor 198.51.100.1 remove-private-as",
"hitl.evidence_ects": [
"snmp-poll-1-ect-uuid",
"snmp-poll-2-ect-uuid",
"loss-calc-ect-uuid"
],
"hitl.confidence": 0.91,
"hitl.risk_level": "medium",
"hitl.reversible": true
}
}
~~~
{: #fig-explanation title="Explainability Token ECT"}
Field definitions:
- `hitl.summary`: Human-readable description of what the agent
was doing and why HITL was reached. REQUIRED.
- `hitl.proposed_action`: What the agent proposes to do.
REQUIRED.
- `hitl.evidence_ects`: Array of `jti` values from prior ECTs
that support the proposal. SHOULD be present.
- `hitl.confidence`: Float 0.0-1.0; agent's self-assessed
confidence in the proposed action. SHOULD be present.
- `hitl.risk_level`: One of `low`, `medium`, `high`, `critical`.
SHOULD be present.
- `hitl.reversible`: Whether the proposed action can be rolled
back. REQUIRED.
The `hitl.explainability_ref` field in the approval request ECT
({{fig-approval-req}}) references the `jti` of this ECT.
# Binding to AEM Assurance Levels {#assurance-binding}
HITL requirements vary by AEM assurance level. The following
table is normative:
| AEM Level | Required HITL Intensity | Override signing | Explainability |
|-----------|------------------------|-----------------|----------------|
| L1 | I0 (optional) | Optional | Optional |
| L2 | I2 or higher | REQUIRED (signed JWT) | REQUIRED for I2+ |
| L3 | I3 | REQUIRED (signed JWT, L3 ECT) | REQUIRED |
{: #fig-assurance-hitl title="HITL Requirements by Assurance Level"}
At L3, approval gate responses (`hitl:approval_granted`) MUST be
committed to the audit ledger.
# Broadcast Override {#broadcast}
For environments with many agents, an operator MAY send a
broadcast override to a management endpoint:
~~~
POST /hitl/broadcast HTTP/1.1
Content-Type: application/json
Execution-Context: <broadcast-override-ect>

{
"targets": ["spiffe://example.com/agent/a",
"spiffe://example.com/agent/b"],
"level": 3,
"reason": "Coordinated emergency stop"
}
~~~
The broadcast endpoint fans out individual override ECTs to each
target and returns per-agent results. Each fan-out is itself an
ECT linked to the broadcast override ECT.
Broadcast overrides MUST be authenticated at L2 or higher.
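As a non-normative illustration, the fan-out step might build one override claim set per target. The function name `fan_out`, the use of the JWT `aud` claim to carry the target agent, and the use of UUIDs for `jti` are assumptions of this sketch. Signing and delivery are omitted.

~~~python
import secrets
import uuid


def fan_out(broadcast_jti, targets, level, reason):
    """Build one override ECT claim set per target agent.

    Each claim set links back to the broadcast override ECT via `par`.
    """
    ects = []
    for target in targets:
        ects.append({
            "jti": str(uuid.uuid4()),
            "aud": target,                  # assumption: target carried in `aud`
            "exec_act": "hitl:override",
            "par": [broadcast_jti],
            "ext": {
                "hitl.level": level,
                "hitl.reason": reason,
                "hitl.scope": "*",
                "hitl.nonce": secrets.token_hex(8),  # fresh nonce per override
            },
        })
    return ects
~~~

Giving each fan-out ECT its own `jti` and nonce keeps per-agent replay checks independent of one another.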
# Dead Man's Switch {#dead-man}
As a safety net for lost connectivity to the operator, agents
SHOULD implement a heartbeat mechanism: the agent periodically
pings an operator heartbeat endpoint. If the heartbeat is missed
for a configurable duration, the dead man's switch activates and
the agent places itself under an override without operator input.
The heartbeat interval SHOULD be 30 seconds. The trigger
threshold SHOULD be 3 missed heartbeats.
The `unreachable_human` policy from ACP-DAG-HITL governs which
level the agent enters when the dead man's switch activates:
`safe_pause` (→ Level 1, PAUSE) or `abort` (→ Level 3, STOP).
Absent such a policy, the agent enters Level 1 (PAUSE).
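A minimal, non-normative sketch of the heartbeat logic follows. The class name `DeadMansSwitch`, the injectable `clock` parameter, and the helper `fallback_level` are illustrative assumptions. The defaults mirror the recommended 30-second interval and 3-heartbeat threshold.

~~~python
import time


class DeadMansSwitch:
    """Trips after `max_missed` consecutive heartbeat intervals pass unanswered."""

    def __init__(self, interval_s=30.0, max_missed=3, clock=time.monotonic):
        self.interval_s = interval_s
        self.max_missed = max_missed
        self._clock = clock
        self._last_ok = clock()

    def heartbeat_ok(self):
        """Record a successful ping of the operator heartbeat endpoint."""
        self._last_ok = self._clock()

    def missed(self):
        """Number of whole heartbeat intervals since the last success."""
        return int((self._clock() - self._last_ok) // self.interval_s)

    def should_trigger(self):
        return self.missed() >= self.max_missed


def fallback_level(unreachable_human="safe_pause"):
    """Map the ACP-DAG-HITL policy value to an override level."""
    return {"safe_pause": 1, "abort": 3}[unreachable_human]
~~~

Using a monotonic clock avoids false triggers when the wall clock is adjusted.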
# Security Considerations {#security}
## Authentication of Override Commands
All override endpoints MUST require authentication via mutual
TLS ({{RFC8446}}) or signed JWTs ({{RFC7519}}). The JWT MUST
contain the operator's identity and be signed by a trusted key
(per ACP-DAG-HITL operator role configuration).
## Replay Prevention
To prevent replay attacks, agents MUST:
1. Reject override ECTs with `iat` more than 30 seconds in the
past.
2. Reject any override ECT whose `jti` has already been processed.
3. Require the `hitl.nonce` field in override ECTs.
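The three checks above can be sketched as follows. This is a non-normative illustration that operates on already-verified ECT claims; the class name `ReplayGuard` and the in-memory `jti` cache are assumptions. Signature verification and handling of `iat` values in the future are out of scope.

~~~python
import time


class ReplayGuard:
    """Applies the iat, jti, and nonce checks to override ECT claims."""

    def __init__(self, max_age_s=30.0, clock=time.time):
        self.max_age_s = max_age_s
        self._clock = clock
        self._seen = {}  # jti -> iat of accepted override ECTs

    def accept(self, claims: dict) -> bool:
        now = self._clock()
        iat = claims.get("iat")
        jti = claims.get("jti")
        nonce = claims.get("ext", {}).get("hitl.nonce")

        # Check 3: the nonce (and the fields used below) must be present.
        if not isinstance(iat, (int, float)) or not jti or not nonce:
            return False
        # Check 1: reject ECTs issued more than max_age_s ago.
        if now - iat > self.max_age_s:
            return False
        # Check 2: reject a jti that was already processed.
        if jti in self._seen:
            return False

        # Entries too old to pass check 1 can be forgotten safely.
        self._seen = {j: t for j, t in self._seen.items()
                      if now - t <= self.max_age_s}
        self._seen[jti] = iat
        return True
~~~

Because stale ECTs are rejected by the `iat` check anyway, the `jti` cache only needs to cover the 30-second window and stays bounded.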
## Impersonation
Override commands carry high privilege. Agents MUST verify:
- The operator JWT is signed by a trusted key in the ACP-DAG-HITL
operator registry.
- The operator role matches the `required_role` in the triggering
HITL rule.
## Two-Operator Approval for TAKEOVER
Deployments SHOULD implement multi-operator approval for Level 4
(TAKEOVER), requiring two independent operator identities. The
two approval ECTs MUST both appear as `par` in the TAKEOVER
override ECT.
## HITL Bypass Prevention
Agents that claim a HITL gate was satisfied MUST provide the
`jti` of the corresponding `hitl:approval_granted` ECT in the
ECT that follows the gate. Agents MUST NOT proceed past an
approval gate without a valid signed approval ECT.
## Escalation Chain Integrity
The escalation chain in ACP-DAG-HITL policy defines which roles
receive escalations. This chain MUST be signed as part of the
policy token to prevent tampering. Agents MUST NOT follow
escalation chains from unsigned or unverified policy tokens.
# IANA Considerations
## Well-Known URI Registrations
This document requests the following registrations per {{RFC8615}}:
| URI Suffix | Purpose |
|------------|---------|
| `hitl/override` | Override command endpoint |
| `hitl/resume` | Resume from PAUSE |
| `hitl/lift` | Lift any active override |
| `hitl/status` | Override status query |
{: #fig-wellknown title="Well-Known URI Registrations"}
## `exec_act` Values
This document requests registration in the AEM Ecosystem
Extension Registry:
| Value | Description | Reference |
|-------|-------------|-----------|
| `hitl:override` | Human override command | This document |
| `hitl:ack` | Agent acknowledgment of override | This document |
| `hitl:resume` | Resume from PAUSE state | This document |
| `hitl:lift` | Lift any active override | This document |
| `hitl:approval_request` | Workflow blocked at approval gate | This document |
| `hitl:approval_granted` | Human approved continuation | This document |
| `hitl:approval_denied` | Human denied continuation | This document |
| `hitl:explanation` | Explainability token for HITL decision | This document |
{: #fig-iana-actions title="HITL exec_act Registrations"}
--- back
# Acknowledgments
{:numbered="false"}
This specification is the runtime enforcement companion to
ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}}. Override
design is inspired by industrial safety systems (IEC 62061,
ISO 13849). The explainability token design is informed by
EU AI Act Article 13 transparency requirements.

@@ -0,0 +1,354 @@
---
title: "Cross-Protocol Agent Translation (CPAT)"
abbrev: "CPAT"
category: std
docname: draft-cpat-cross-protocol-agent-translation-00
submissiontype: IETF
number:
date:
v: 3
area: "ART"
workgroup: "DISPATCH"
keyword:
- agent interoperability
- protocol translation
- agentic workflows
- execution context
author:
-
fullname: Generated by IETF Draft Analyzer
organization: Independent
email: placeholder@example.com
normative:
RFC7519:
RFC7515:
RFC9110:
RFC8615:
I-D.nennemann-wimse-ect:
title: "Execution Context Tokens for Distributed Agentic Workflows"
target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
informative:
I-D.nennemann-agent-dag-hitl-safety:
title: "Agent Context Policy Token: DAG Delegation with Human Override"
target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
--- abstract
This document defines the Cross-Protocol Agent Translation (CPAT)
framework, a mechanism enabling AI agents using different
communication protocols to interoperate. With over 90 competing
agent-to-agent protocol drafts and no interoperability standard,
protocol fragmentation is the primary barrier to multi-vendor agent
ecosystems. CPAT defines capability advertisement, protocol
negotiation, and translation gateways. Translation hops are
recorded as Execution Context Token (ECT) DAG nodes, giving every
cross-protocol interaction a cryptographic audit trail without
inventing a parallel tracing mechanism.
--- middle
# Introduction
The IETF AI/agent landscape includes over 90 drafts proposing
agent-to-agent communication protocols, yet no standard exists
for agents using different protocols to exchange messages.
CPAT takes a pragmatic approach: rather than mandating a single
protocol, it defines the minimum machinery for agents to discover
each other's protocol support, agree on a common format, and fall
back to translation gateways when no common protocol exists.
CPAT builds on Execution Context Tokens
{{I-D.nennemann-wimse-ect}} as its audit and tracing backbone.
Every translation hop produces an ECT, linking into the workflow
DAG alongside the source and destination agents. This eliminates
the need for a separate tracing or provenance mechanism -- the ECT
DAG already provides it.
Design principles:
1. Reuse existing standards (HTTP, JSON, TLS, ECT) wherever
possible.
2. Keep the core mechanism small enough to implement in a day.
3. Do not require agents to support any protocol beyond their own
plus CPAT negotiation.
# Conventions and Definitions
{::boilerplate bcp14-tagged}
The following terms are used in this document:
Agent Protocol:
: A communication protocol used by an AI agent for peer-to-peer
message exchange (e.g., A2A, MCP, SLIM, uACP).
Capability Document:
: A JSON object describing the protocols an agent supports, served
at a well-known URI.
Translation Gateway:
: A service that converts messages between two agent protocols,
recording each translation as an ECT DAG node.
# Problem Statement
Consider three agents: Agent A speaks Protocol X, Agent B speaks
Protocol Y, and Agent C speaks both X and Z. Today there is no
standard way for A to discover that B uses a different protocol,
negotiate a common format, or route through a translator.
Existing work on Agent Name Service (ANS) and agent discovery
addresses finding agents but not protocol compatibility. CPAT
fills the gap between discovery and communication.
# Protocol Capability Advertisement {#capability-ad}
Each CPAT-compliant agent MUST serve a capability document at the
well-known URI `/.well-known/cpat` {{RFC8615}}. The document is a
JSON object:
~~~json
{
"cpat_version": "1.0",
"agent_id": "spiffe://example.com/agent/pricing",
"protocols": [
{
"id": "a2a-v1",
"version": "1.0",
"endpoint": "https://agent.example.com/a2a",
"priority": 10
},
{
"id": "mcp-v1",
"version": "2025-03-26",
"endpoint": "https://agent.example.com/mcp",
"priority": 20
}
],
"translation_gateways": [
"https://gateway.example.com/cpat/translate"
],
"ect_assurance_level": "L2"
}
~~~
{: #fig-capability title="Capability Document Example"}
The `protocols` array MUST contain at least one entry. Each entry
MUST include `id` (a registered protocol identifier), `version`,
and `endpoint`. The `priority` field is OPTIONAL; lower values
indicate higher preference.
The `ect_assurance_level` field declares the minimum ECT assurance
level the agent requires for interactions. This enables gateways
to produce ECTs at the correct level.
Agents SHOULD also advertise their capability document URI in DNS
SVCB records, published under the service label `_cpat._tcp`.
# Negotiation Handshake {#negotiation}
When Agent A wants to communicate with Agent B:
Step 1:
: Agent A fetches Agent B's capability document from B's
well-known CPAT URI over HTTPS.
Step 2:
: Agent A computes the intersection of its own protocol list with
Agent B's. If the intersection is non-empty, the protocol with
the lowest combined priority score is selected. Communication
proceeds directly using that protocol.
Step 3:
: If no common protocol exists, Agent A checks whether any
translation gateway listed by either agent supports both
protocols. Agent A queries the gateway:
~~~
GET /.well-known/cpat/gateway?from=a2a-v1&to=slim-v1
~~~
The gateway responds with 200 OK if it supports the pair, or
404 if not.
Step 4:
: If a suitable gateway is found, Agent A sends its message to the
gateway, which translates and forwards it to Agent B. The
gateway records the translation as an ECT (see {{ect-integration}}).
Step 5:
: If no gateway supports the required pair, Agent A returns an
error to its caller with error code `no_translation_path`.
The entire negotiation is stateless and cacheable. Agents SHOULD
cache capability documents for the duration indicated by HTTP
Cache-Control headers, defaulting to 3600 seconds.
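The negotiation steps can be sketched as follows. This is a non-normative illustration under stated assumptions: "combined priority score" is taken to mean the sum of both agents' `priority` values, a missing `priority` defaults to 100, and protocol versions are not compared. The `gateway_supports` callable is a hypothetical stand-in for the gateway query in Step 3.

~~~python
DEFAULT_PRIORITY = 100  # assumption: used when `priority` is omitted


def select_protocol(mine, theirs):
    """Step 2: pick the common protocol with the lowest summed priority.

    `mine` and `theirs` are the `protocols` arrays of two capability
    documents. Returns (protocol_id, peer_endpoint) or None.
    """
    my_prio = {p["id"]: p.get("priority", DEFAULT_PRIORITY) for p in mine}
    best = None
    for p in theirs:
        if p["id"] in my_prio:
            score = my_prio[p["id"]] + p.get("priority", DEFAULT_PRIORITY)
            if best is None or score < best[0]:
                best = (score, p["id"], p["endpoint"])
    return None if best is None else (best[1], best[2])


def negotiate(mine, theirs, gateways, gateway_supports):
    """Steps 2-5: direct protocol, else a gateway, else an error."""
    direct = select_protocol(mine, theirs)
    if direct is not None:
        return {"mode": "direct", "protocol": direct[0], "endpoint": direct[1]}
    for gw in gateways:
        for src in mine:
            for dst in theirs:
                if gateway_supports(gw, src["id"], dst["id"]):
                    return {"mode": "gateway", "gateway": gw,
                            "from": src["id"], "to": dst["id"]}
    return {"mode": "error", "error": "no_translation_path"}
~~~

Passing the gateway check in as a callable keeps the selection logic free of network access, so it can be exercised against cached capability documents.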
# ECT Integration {#ect-integration}
Every translation hop produces an ECT {{I-D.nennemann-wimse-ect}}
that links into the workflow DAG. This provides cryptographic
proof of protocol translation without a separate tracing mechanism.
## Translation ECT Claims
A gateway producing a translation ECT MUST set:
- `exec_act`: `"cpat:translate"`
- `par`: array containing the `jti` of the source agent's ECT
- `wid`: the workflow identifier from the source ECT (preserving
workflow continuity across protocol boundaries)
The `ext` claim carries CPAT-specific metadata:
~~~json
{
"ext": {
"cpat.source_protocol": "a2a-v1",
"cpat.dest_protocol": "slim-v1",
"cpat.gateway_id": "spiffe://gw.example.com/cpat",
"cpat.translation_warnings": []
}
}
~~~
{: #fig-translation-ect title="Translation ECT Extension Claims"}
The `inp_hash` claim MUST contain the SHA-256 hash of the source
protocol message. The `out_hash` claim MUST contain the SHA-256
hash of the translated message. This allows verifiers to confirm
that a specific input produced a specific output without accessing
the message content.
## Assurance Level Inheritance
The gateway MUST produce ECTs at the higher of:
- The source agent's declared `ect_assurance_level`
- The destination agent's declared `ect_assurance_level`
At L3, the translation ECT MUST be recorded in the audit ledger
before the translated message is forwarded to the destination agent.
## DAG Continuity
The translation creates a three-node subgraph in the workflow DAG:
~~~
Source Agent ECT (exec_act: "send_task")
|
v [par reference]
Gateway ECT (exec_act: "cpat:translate")
|
v [par reference]
Dest Agent ECT (exec_act: "receive_task")
~~~
{: #fig-dag-continuity title="Translation DAG Subgraph"}
The Execution-Context HTTP header {{I-D.nennemann-wimse-ect}}
survives protocol translation: the gateway includes the
translation ECT in the Execution-Context header of the forwarded
request to the destination agent.
# Translation Gateway Requirements {#gateway-reqs}
A CPAT translation gateway MUST:
1. Serve a capability document listing all supported protocol
pairs at `/.well-known/cpat/gateway`.
2. Accept messages via HTTP POST at its translate endpoint.
3. Produce an ECT for every translation per {{ect-integration}}.
4. Preserve message semantics: the intent, core payload content,
and metadata MUST survive translation. Fields with no
equivalent in the destination protocol SHOULD be carried in a
protocol-specific extension field or dropped with a warning
recorded in `cpat.translation_warnings`.
5. Return the translated message in the response body, or forward
it directly to the destination agent.
A gateway MUST NOT modify payload semantics during translation.
Gateways MUST require TLS 1.3 for all connections and SHOULD
implement rate limiting per source agent.
# Policy Integration {#policy-integration}
When used with the Agent Context Policy Token
{{I-D.nennemann-agent-dag-hitl-safety}}, CPAT-related policies
can be expressed as DAG node constraints:
~~~json
{
"dag": {
"nodes": [
{
"id": "n-translate",
"type": "cpat:translate",
"agent": "spiffe://gw.example.com/cpat",
"constraints": {
"allowed_source_protocols": ["a2a-v1", "mcp-v1"],
"allowed_dest_protocols": ["slim-v1"],
"max_translation_hops": 2
}
}
]
}
}
~~~
{: #fig-policy title="CPAT Policy as DAG Node Constraints"}
The `max_translation_hops` constraint prevents messages from being
translated through an excessive number of gateways. Agents
receiving a message SHOULD reject it if the ECT DAG contains more
translation hops than allowed by policy.
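A receiving agent could enforce the hop limit along these lines. This is a non-normative sketch: it assumes the agent has the relevant ECT claim sets available in a dictionary keyed by `jti`, and it counts every distinct `cpat:translate` node in the ancestry, which is a conservative reading when the DAG branches. The function names are hypothetical.

~~~python
def count_translation_hops(ects: dict, leaf_jti: str) -> int:
    """Count distinct `cpat:translate` ECTs among the ancestors of a node.

    `ects` maps jti -> claims; ancestry is followed through `par`.
    """
    seen = set()
    stack = [leaf_jti]
    hops = 0
    while stack:
        jti = stack.pop()
        if jti in seen or jti not in ects:
            continue  # already visited, or parent not available locally
        seen.add(jti)
        node = ects[jti]
        if node.get("exec_act") == "cpat:translate":
            hops += 1
        stack.extend(node.get("par", []))
    return hops


def within_hop_limit(ects: dict, leaf_jti: str, max_translation_hops: int) -> bool:
    return count_translation_hops(ects, leaf_jti) <= max_translation_hops
~~~

The `seen` set makes the walk terminate even if malformed input contains a cycle.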
# Security Considerations
Capability documents are served over HTTPS, ensuring transport
security. Agents SHOULD verify TLS certificates before trusting
capability documents.
Gateways are trusted intermediaries with access to message content
during translation. For end-to-end confidentiality, agents MAY
encrypt the message payload using a shared key established out of
band; the gateway translates only the protocol framing, not the
encrypted content.
The ECT audit trail ({{ect-integration}}) enables detection of:
- Unauthorized gateways (unexpected `cpat.gateway_id` in the DAG)
- Content tampering (mismatched `inp_hash`/`out_hash` relative to
message content)
- Routing loops (repeated gateway IDs in the DAG ancestry)
At L3, the audit ledger provides tamper-evident proof of all
translations for regulatory compliance.
# IANA Considerations
This document requests the following IANA registrations:
1. A "CPAT Protocol Identifier" registry under Expert Review
policy. Initial entries: "a2a-v1", "mcp-v1", "slim-v1",
"uacp-v1", "ainp-v1".
2. A well-known URI registration for "cpat" per {{RFC8615}}.
3. Registration of the `exec_act` value "cpat:translate" in a
future ECT action type registry.
--- back
# Acknowledgments
{:numbered="false"}
This document builds on the Execution Context Token specification
{{I-D.nennemann-wimse-ect}} and the Agent Context Policy Token
{{I-D.nennemann-agent-dag-hitl-safety}}.

@@ -0,0 +1,281 @@
Internet-Draft AI/Agent WG
Intended status: Standards Track March 2026
Expires: September 15, 2026
Cross-Protocol Agent Translation (CPAT)
draft-cpat-cross-protocol-agent-translation-00
Abstract
This document defines the Cross-Protocol Agent Translation (CPAT)
framework, a lightweight mechanism enabling AI agents using
different communication protocols to interoperate through
capability advertisement and message translation. With over 90
competing agent-to-agent (A2A) protocol drafts and no
interoperability standard, protocol fragmentation is the primary
barrier to multi-vendor agent ecosystems. CPAT defines three
components: a capability advertisement format for agents to
declare supported protocols, a negotiation handshake to select a
common protocol or translation path, and a canonical envelope
format that enables translation gateways to convert messages
between incompatible protocols. CPAT reuses existing HTTP
content negotiation patterns and builds on JSON for simplicity.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
This document is intended to have Standards Track status.
Distribution of this memo is unlimited.
Table of Contents
1. Introduction
2. Terminology
3. Problem Statement
4. Protocol Capability Advertisement
5. Negotiation Handshake
6. Canonical Envelope Format
7. Translation Gateway Requirements
8. Security Considerations
9. IANA Considerations
1. Introduction
The IETF AI/agent landscape includes over 90 drafts proposing
agent-to-agent communication protocols, yet no standard exists
for agents using different protocols to exchange messages. This
fragmentation mirrors the early days of instant messaging, where
users on different networks could not communicate until gateway
and federation standards emerged.
CPAT takes a pragmatic approach: rather than mandating a single
protocol, it defines the minimum machinery for agents to
discover each other's protocol support, agree on a common
format, and fall back to translation gateways when no common
protocol exists. The design follows three principles:
1. Reuse existing standards (HTTP, JSON, TLS) wherever possible.
2. Keep the core mechanism small enough to implement in a day.
3. Do not require agents to support any protocol beyond their own
plus CPAT negotiation.
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described
in RFC 2119 [RFC2119].
Agent Protocol: A communication protocol used by an AI agent for
peer-to-peer message exchange (e.g., A2A, MCP, SLIM, uACP).
Capability Document: A JSON object describing the protocols an
agent supports, served at a well-known URI.
Translation Gateway: A service that converts messages between
two agent protocols using the CPAT canonical envelope as an
intermediate representation.
3. Problem Statement
Consider three agents: Agent A speaks Protocol X, Agent B speaks
Protocol Y, and Agent C speaks both X and Z. Today there is no
standard way for A to discover that B uses a different protocol,
negotiate a common format, or route through a translator. Each
protocol defines its own discovery and messaging layer, creating
isolated silos.
Existing work on Agent Name Service (ANS) and agent discovery
addresses finding agents but not protocol compatibility. The
ADOL draft addresses token efficiency within a single protocol
but not cross-protocol translation. CPAT fills the gap between
discovery and communication.
4. Protocol Capability Advertisement
Each CPAT-compliant agent MUST serve a capability document at
the well-known URI /.well-known/cpat. The document is a JSON
object with the following structure:
{
"cpat_version": "1.0",
"agent_id": "urn:uuid:550e8400-e29b-41d4-a716-446655440000",
"protocols": [
{
"id": "a2a-v1",
"version": "1.0",
"endpoint": "https://agent.example.com/a2a",
"priority": 10
},
{
"id": "mcp-v1",
"version": "2025-03-26",
"endpoint": "https://agent.example.com/mcp",
"priority": 20
}
],
"translation_gateways": [
"https://gateway.example.com/cpat/translate"
],
"envelope_formats": ["cpat-envelope-v1"]
}
The "protocols" array MUST contain at least one entry. Each
entry MUST include "id" (a registered protocol identifier),
"version", and "endpoint". The "priority" field is OPTIONAL;
lower values indicate higher preference.
Agents SHOULD also advertise their capability document URI in
DNS SRV or SVCB records for automated discovery, published under
the service label "_cpat._tcp".
5. Negotiation Handshake
When Agent A wants to communicate with Agent B, the following
negotiation procedure applies:
Step 1: Agent A fetches Agent B's capability document from
B's well-known CPAT URI over HTTPS.
Step 2: Agent A computes the intersection of its own protocol
list with Agent B's. If the intersection is non-empty, the
protocol with the lowest combined priority score is selected.
Communication proceeds directly using that protocol.
Step 3: If no common protocol exists, Agent A checks whether
any translation gateway listed by either agent supports both
protocols. Agent A queries the gateway's capability endpoint
at /.well-known/cpat/gateway:
GET /.well-known/cpat/gateway?from=a2a-v1&to=slim-v1
The gateway responds with 200 OK and a translation descriptor
if it supports the pair, or 404 if not.
Step 4: If a suitable gateway is found, Agent A sends its
message wrapped in a CPAT envelope (Section 6) to the gateway,
which translates and forwards it to Agent B.
Step 5: If no gateway supports the required pair, Agent A
SHOULD return an error to its caller indicating protocol
incompatibility, using the CPAT error code "no_translation_path".
The entire negotiation is stateless and cacheable. Agents
SHOULD cache capability documents for the duration indicated by
HTTP Cache-Control headers, defaulting to 3600 seconds.
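Step 2 can be sketched as follows. The text does not define how the two priorities are combined or what applies when "priority" is absent, so this sketch assumes the combined score is the sum of both agents' values and that a missing priority counts as 100; both choices, and the name `select_protocol`, are illustrative assumptions.

```python
from typing import Optional

DEFAULT_PRIORITY = 100  # assumption: the text gives no default for a missing priority


def select_protocol(local: list[dict], remote: list[dict]) -> Optional[str]:
    """Return the shared protocol id with the lowest summed priority, or None."""
    local_prio = {p["id"]: p.get("priority", DEFAULT_PRIORITY) for p in local}
    remote_prio = {p["id"]: p.get("priority", DEFAULT_PRIORITY) for p in remote}

    shared = local_prio.keys() & remote_prio.keys()
    if not shared:
        return None  # no common protocol: fall back to the gateway lookup in Step 3

    # The id is part of the sort key so that ties resolve deterministically.
    return min(shared, key=lambda pid: (local_prio[pid] + remote_prio[pid], pid))
```

Both agents reach the same answer only if they apply the same combination rule, which is why the rule would need to be fixed by the specification rather than left to implementations.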
6. Canonical Envelope Format
The CPAT envelope wraps a protocol-specific message in a
standard container for gateway translation. The envelope is a
JSON object:
{
  "cpat_version": "1.0",
  "message_id": "urn:uuid:6ba7b810-9dad-11d1-80b4-00c04fd430c8",
  "timestamp": "2026-03-01T12:00:00Z",
  "source": {
    "agent_id": "urn:uuid:...",
    "protocol": "a2a-v1"
  },
  "destination": {
    "agent_id": "urn:uuid:...",
    "protocol": "slim-v1"
  },
  "intent": "task_request",
  "payload": {
    "content_type": "application/json",
    "body": "...base64-encoded protocol-specific message..."
  },
  "trace": ["urn:uuid:...source", "urn:uuid:...gateway"]
}
The "intent" field MUST be one of: "task_request",
"task_response", "notification", "error", "capability_query".
This allows gateways to perform semantic translation even when
protocol message structures differ significantly.
The "trace" array provides a simple provenance chain of all
agents and gateways that have handled the message. Each
intermediary MUST append its own identifier.
The "payload.body" field contains the original protocol message,
base64-encoded. Gateways translate by decoding the source
protocol message, mapping it to the CPAT semantic model (intent
+ standard fields), and re-encoding in the destination protocol.
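As an illustration of the envelope rules, the sketch below wraps a protocol message and appends an intermediary to the trace. The helper names `build_envelope` and `record_hop` are hypothetical, and starting the trace with the source agent's identifier is an assumption drawn from the example above.

```python
import base64
import uuid
from datetime import datetime, timezone

INTENTS = {"task_request", "task_response", "notification", "error", "capability_query"}


def build_envelope(source_id, source_protocol, dest_id, dest_protocol,
                   intent, message: bytes) -> dict:
    """Wrap a protocol-specific message in a CPAT-style envelope."""
    if intent not in INTENTS:
        raise ValueError(f"unknown intent: {intent}")
    return {
        "cpat_version": "1.0",
        "message_id": uuid.uuid4().urn,  # yields "urn:uuid:<random uuid>"
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "source": {"agent_id": source_id, "protocol": source_protocol},
        "destination": {"agent_id": dest_id, "protocol": dest_protocol},
        "intent": intent,
        "payload": {
            "content_type": "application/json",  # assumption: JSON source message
            "body": base64.b64encode(message).decode("ascii"),
        },
        "trace": [source_id],
    }


def record_hop(envelope: dict, intermediary_id: str) -> dict:
    """Return a copy of the envelope with this intermediary appended to the trace."""
    updated = dict(envelope)
    updated["trace"] = [*envelope["trace"], intermediary_id]
    return updated
```

Returning a copy from `record_hop` keeps the received envelope intact, which is convenient when the gateway logs both the inbound and outbound forms.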
7. Translation Gateway Requirements
A CPAT translation gateway MUST:
1. Serve a capability document listing all supported protocol
pairs at /.well-known/cpat/gateway.
2. Accept CPAT envelopes via HTTP POST at its translate endpoint.
3. Validate envelope integrity before translation.
4. Preserve message semantics: the intent, core payload content,
and metadata MUST survive translation. Fields with no
equivalent in the destination protocol SHOULD be carried in
a protocol-specific extension field or dropped with a warning.
5. Return the translated envelope in the response body, or
forward it directly to the destination agent.
6. Log all translations with source, destination, and timestamp
for audit purposes.
A gateway MUST NOT modify the payload semantics during
translation. If exact translation is not possible, the gateway
MUST include a "translation_warnings" array in the envelope
listing fields that were approximated or dropped.
Gateways SHOULD implement rate limiting per source agent and
MUST require TLS 1.3 [RFC8446] for all connections.
8. Security Considerations
Capability documents are served over HTTPS, ensuring transport
security. Agents SHOULD verify the TLS certificate of peers
before trusting their capability documents.
CPAT envelopes in transit through gateways are visible to the
gateway operator. For end-to-end confidentiality, agents MAY
encrypt the payload.body field using a shared key established
out of band. The envelope metadata (intent, agent IDs,
timestamps) remains visible to enable routing.
Gateways are trusted intermediaries. Deployments SHOULD use
gateways operated by mutually trusted parties or verified
through attestation mechanisms such as those in
draft-aylward-daap-v2.
The trace array enables detection of routing loops and
unauthorized intermediaries. Agents SHOULD reject messages
with unexpected entries in the trace.
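A receiving agent might apply the trace check along these lines. This is a sketch under stated assumptions: the first trace entry is taken to be the originating agent, "unexpected" is modelled as "not in a locally configured allow-list", and the name `check_trace` is hypothetical.

```python
def check_trace(trace: list[str], allowed_intermediaries: set[str]) -> list[str]:
    """Return reasons to reject the message; an empty list means no objection."""
    reasons = []

    # A repeated identifier means the message passed the same node twice.
    seen = set()
    for hop in trace:
        if hop in seen:
            reasons.append(f"routing loop: {hop} appears more than once")
        seen.add(hop)

    # Everything after the first entry is an intermediary (assumption).
    for hop in trace[1:]:
        if hop not in allowed_intermediaries:
            reasons.append(f"unexpected intermediary: {hop}")
    return reasons
```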
Denial-of-service attacks against gateways are mitigated by
rate limiting (Section 7) and standard HTTP-layer protections.
9. IANA Considerations
This document requests IANA establish the following:
1. A "CPAT Protocol Identifier" registry under Expert Review
policy. Initial entries: "a2a-v1", "mcp-v1", "slim-v1",
"uacp-v1", "ainp-v1".
2. A "CPAT Intent Type" registry under Specification Required
policy. Initial entries: "task_request", "task_response",
"notification", "error", "capability_query".
3. A well-known URI registration for "cpat" per RFC 8615.
Author's Address
Generated by IETF Draft Analyzer
2026-03-01

View File

@@ -0,0 +1,315 @@
---
title: "Agent Ecosystem Protocol Binding (AEPB): Interop and Lifecycle"
abbrev: "AEPB"
category: std
docname: draft-aepb-agent-ecosystem-protocol-binding-00
submissiontype: IETF
number:
date:
v: 3
area: "ART"
workgroup: "DISPATCH"
keyword:
- agent interoperability
- protocol translation
- lifecycle
- agentic workflows
author:
-
fullname: TBD
organization: Independent
email: placeholder@example.com
normative:
RFC2119:
RFC8174:
RFC8446:
RFC8615:
RFC9110:
I-D.nennemann-wimse-ect:
title: "Execution Context Tokens for Distributed Agentic Workflows"
target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
I-D.nennemann-agent-dag-hitl-safety:
title: "Agent Context Policy Token: DAG Delegation with Human Override"
target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
informative:
--- abstract
This document defines the Agent Ecosystem Protocol Binding (AEPB),
the interoperability and lifecycle layer of the agent ecosystem.
With over 90 competing A2A protocol drafts and no interoperability
standard, AEPB defines capability advertisement, protocol
negotiation, translation gateways, and agent lifecycle management
(versioning, graceful shutdown, retirement). Translation hops
produce ECT nodes, preserving DAG continuity across protocol
boundaries. Protocol constraints are expressed as ACP-DAG-HITL
node constraints.
--- middle
# Introduction
The IETF AI/agent landscape includes over 90 drafts proposing
agent-to-agent communication protocols. No standard exists for
agents using different protocols to exchange messages, and no
standard exists for how agents evolve, get replaced, or retire
without disrupting dependent services.
AEPB addresses both gaps with a pragmatic approach: rather than
mandating a single protocol, it defines the minimum machinery for
agents to discover each other's protocol support, agree on a
common format, fall back to translation gateways, and manage their
lifecycle.
AEPB builds on ECT {{I-D.nennemann-wimse-ect}} for audit (every
translation hop is a DAG node) and ACP-DAG-HITL
{{I-D.nennemann-agent-dag-hitl-safety}} for policy (protocol
constraints as node constraints).
# Conventions and Definitions
{::boilerplate bcp14-tagged}
Agent Protocol:
: A communication protocol used by an AI agent for peer-to-peer
message exchange (e.g., A2A, MCP, SLIM, uACP).
Capability Document:
: A JSON object describing the protocols an agent supports.
Translation Gateway:
: A service that converts messages between two agent protocols,
recording each translation as an ECT DAG node.
# Capability Advertisement {#capability}
Each AEPB-compliant agent MUST serve a capability document at
`/.well-known/aepb` {{RFC8615}}:
~~~json
{
  "aepb_version": "1.0",
  "agent_id": "spiffe://example.com/agent/pricing",
  "protocols": [
    {
      "id": "a2a-v1",
      "version": "1.0",
      "endpoint": "https://agent.example.com/a2a",
      "priority": 10
    },
    {
      "id": "mcp-v1",
      "version": "2025-03-26",
      "endpoint": "https://agent.example.com/mcp",
      "priority": 20
    }
  ],
  "translation_gateways": [
    "https://gateway.example.com/aepb/translate"
  ],
  "ect_assurance_level": "L2",
  "lifecycle": {
    "status": "active",
    "version": "2.1.0",
    "deprecated_at": null,
    "sunset_at": null,
    "successor": null
  }
}
~~~
{: #fig-capability title="Capability Document"}
The `protocols` array MUST contain at least one entry. `priority`
is OPTIONAL; lower values indicate higher preference.
The `lifecycle` object (see {{lifecycle}}) provides versioning and
deprecation metadata.
Agents SHOULD advertise via DNS SVCB records (`_aepb._tcp`).
# Protocol Negotiation {#negotiation}
When Agent A wants to communicate with Agent B:
1. Agent A fetches B's capability document over HTTPS.
2. Agent A computes the intersection of protocol lists. If
non-empty, the protocol with the lowest combined priority is
selected. Communication proceeds directly.
3. If no common protocol exists, Agent A checks translation
gateways listed by either agent:
~~~
GET /.well-known/aepb/gateway?from=a2a-v1&to=slim-v1
~~~
The gateway responds 200 if it supports the pair, 404 if not.
4. If a suitable gateway is found, Agent A sends its message to
the gateway, which translates and forwards.
5. If no gateway supports the pair, Agent A returns error
`no_translation_path`.
Negotiation is stateless and cacheable (Cache-Control, default
3600s).
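The gateway lookup in step 3 can be sketched as follows. The text does not say how the gateway's well-known path relates to the URL listed in `translation_gateways`; this sketch assumes it lives at the same origin. The function names are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def gateway_probe_url(gateway_url: str, source: str, dest: str) -> str:
    """Build the capability query for a gateway listed in a capability document."""
    parts = urlsplit(gateway_url)
    query = urlencode({"from": source, "to": dest})
    # Assumption: the well-known path is served from the gateway's own origin.
    return urlunsplit((parts.scheme, parts.netloc,
                       "/.well-known/aepb/gateway", query, ""))


def gateway_supports_pair(status_code: int) -> bool:
    """Interpret the gateway's answer: 200 means supported, 404 means not."""
    if status_code == 200:
        return True
    if status_code == 404:
        return False
    raise RuntimeError(f"unexpected gateway status {status_code}")
```

Treating any other status as an error rather than "unsupported" avoids caching a transient failure as a negative answer.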
# Translation as ECT DAG Nodes {#translation-ect}
Every translation hop produces an ECT:
- `exec_act`: `"aepb:translate"`
- `par`: the source agent's ECT
- `inp_hash`: SHA-256 of source protocol message
- `out_hash`: SHA-256 of translated message
~~~json
{
  "exec_act": "aepb:translate",
  "par": ["source-agent-ect-uuid"],
  "inp_hash": "sha256-of-source-message",
  "out_hash": "sha256-of-translated-message",
  "ext": {
    "aepb.source_protocol": "a2a-v1",
    "aepb.dest_protocol": "slim-v1",
    "aepb.gateway_id": "spiffe://gw.example.com/aepb",
    "aepb.translation_warnings": []
  }
}
~~~
{: #fig-translate-ect title="Translation ECT"}
This creates a three-node subgraph:
~~~
Source ECT → Gateway ECT (aepb:translate) → Dest ECT
~~~
The Execution-Context HTTP header survives protocol translation:
the gateway includes the translation ECT in the header of the
forwarded request.
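A gateway could assemble the claims for a translation ECT roughly as shown below. This is a sketch only: it builds the claim set and omits signing, the hex encoding of the digests is an assumption (the ECT specification governs the actual encoding), and `translation_ect_claims` is a hypothetical name.

```python
import hashlib
from typing import Optional


def translation_ect_claims(parent_ect_id: str, source_msg: bytes,
                           translated_msg: bytes, source_protocol: str,
                           dest_protocol: str, gateway_id: str,
                           warnings: Optional[list] = None) -> dict:
    """Build the claim set for one translation hop (unsigned)."""
    return {
        "exec_act": "aepb:translate",
        "par": [parent_ect_id],  # links this hop to the source agent's ECT
        # Assumption: digests are hex-encoded; the ECT draft may require otherwise.
        "inp_hash": hashlib.sha256(source_msg).hexdigest(),
        "out_hash": hashlib.sha256(translated_msg).hexdigest(),
        "ext": {
            "aepb.source_protocol": source_protocol,
            "aepb.dest_protocol": dest_protocol,
            "aepb.gateway_id": gateway_id,
            "aepb.translation_warnings": list(warnings or []),
        },
    }
```

A verifier holding the original message can recompute `inp_hash` and compare it with the value in the ECT to confirm which input the gateway actually translated.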
## Translation Policy
Protocol constraints are ACP-DAG-HITL node constraints:
~~~json
{
  "constraints": {
    "aepb.allowed_source_protocols": ["a2a-v1", "mcp-v1"],
    "aepb.allowed_dest_protocols": ["slim-v1"],
    "aepb.max_translation_hops": 2
  }
}
~~~
{: #fig-policy title="Translation Policy"}
Agents receiving a message SHOULD reject it if the ECT DAG
contains more translation hops than `aepb.max_translation_hops`.
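The policy in the figure above could be evaluated as in this sketch. It assumes that an absent constraint means "no restriction" and that the caller has already counted the translation hops; `policy_violations` is a hypothetical name.

```python
def policy_violations(constraints: dict, source_protocol: str,
                      dest_protocol: str, hops: int) -> list[str]:
    """Return the constraints a translation would violate (empty if none)."""
    violations = []

    allowed_src = constraints.get("aepb.allowed_source_protocols")
    if allowed_src is not None and source_protocol not in allowed_src:
        violations.append(f"source protocol {source_protocol} not allowed")

    allowed_dst = constraints.get("aepb.allowed_dest_protocols")
    if allowed_dst is not None and dest_protocol not in allowed_dst:
        violations.append(f"destination protocol {dest_protocol} not allowed")

    max_hops = constraints.get("aepb.max_translation_hops")
    if max_hops is not None and hops > max_hops:
        violations.append(f"{hops} translation hops exceeds limit of {max_hops}")

    return violations
```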
# Translation Gateway Requirements {#gateway}
A gateway MUST:
1. Serve a capability document at `/.well-known/aepb/gateway`.
2. Accept messages via HTTP POST at its translate endpoint.
3. Produce an ECT per {{translation-ect}} for every translation.
4. Preserve message semantics. Fields without a destination
equivalent are carried in an extension field or dropped with
a warning in `aepb.translation_warnings`.
5. Require TLS 1.3 {{RFC8446}} for all connections.
6. Implement rate limiting per source agent.
A gateway MUST NOT modify payload semantics.
# Agent Lifecycle Management {#lifecycle}
## Lifecycle States
An agent's `lifecycle.status` MUST be one of:
- `active`: Normal operation. Default state.
- `deprecated`: Agent is functional but will be retired.
`deprecated_at` MUST be set. Clients SHOULD migrate to
`successor` if provided.
- `draining`: Agent is rejecting new workflows but completing
in-progress ones. New delegation requests return HTTP 503
with a `Retry-After` header (a delay or date) and, if
`successor` is set, a `Location` header pointing to it.
- `retired`: Agent is offline. Capability document returns
HTTP 410 Gone with `successor` for redirect.
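The two refusal cases in the list above can be sketched as a small decision function. The request labels `"delegation"` and `"capability"`, the one-hour retry delay, and the shape of the returned dictionary are all illustrative assumptions; the sketch treats `Retry-After` as a delay in seconds, as HTTP defines it, and reports the successor separately.

```python
from typing import Optional

RETRY_AFTER_SECONDS = 3600  # assumption: the text does not give a retry delay


def lifecycle_refusal(lifecycle: dict, request_kind: str) -> Optional[dict]:
    """Return a refusal description, or None if the request proceeds normally.

    request_kind is "delegation" or "capability" (hypothetical labels).
    """
    status = lifecycle.get("status", "active")  # "active" is the default state
    successor = lifecycle.get("successor")

    if status == "draining" and request_kind == "delegation":
        return {"status": 503,
                "headers": {"Retry-After": str(RETRY_AFTER_SECONDS)},
                "successor": successor}
    if status == "retired" and request_kind == "capability":
        return {"status": 410, "headers": {}, "successor": successor}
    return None
```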
## Versioning
The `lifecycle.version` field uses semantic versioning. Agents
MUST increment the major version when breaking changes occur
(incompatible protocol or behavior changes).
Capability documents MUST include the version. Agents SHOULD
include version in ECT `ext` claims (`aepb.agent_version`) so
the audit trail records which version performed each action.
## Graceful Shutdown
When an agent transitions to `draining`:
1. Update capability document: `status: "draining"`,
set `sunset_at` timestamp.
2. Reject new workflow delegations with HTTP 503.
3. Complete all in-progress workflows.
4. Emit a final ECT: `exec_act: "aepb:shutdown"`.
5. Transition to `retired`.
Agents SHOULD provide at least 24 hours between `deprecated`
and `draining` so that cached capability documents expire and
clients discover the change when they refresh.
## Successor Discovery
When `successor` is set, it MUST be the URI of the replacement
agent's capability document. Clients SHOULD transparently
redirect to the successor after verifying its capability
document.
# Security Considerations
Capability documents are served over HTTPS. Agents SHOULD verify
TLS certificates before trusting capability documents.
Gateways are trusted intermediaries with access to message content.
For end-to-end confidentiality, agents MAY encrypt message payloads
with a shared key established out of band.
The ECT audit trail enables detection of unauthorized gateways,
content tampering (mismatched `inp_hash`/`out_hash`), and routing
loops (repeated gateway IDs in DAG ancestry).
Lifecycle transitions (especially `draining` and `retired`) can be
exploited for denial of service. Only the agent operator (verified
via identity binding) SHOULD be able to update lifecycle status.
# IANA Considerations
This document requests:
1. An "AEPB Protocol Identifier" registry under Expert Review.
Initial entries: `a2a-v1`, `mcp-v1`, `slim-v1`, `uacp-v1`,
`ainp-v1`.
2. Well-known URI registrations for `aepb` and `aepb/gateway`
per {{RFC8615}}.
3. Registration of `exec_act` values: `aepb:translate`,
`aepb:shutdown` in a future ECT action type registry.
--- back
# Acknowledgments
{:numbered="false"}
AEPB builds on ECT {{I-D.nennemann-wimse-ect}} for translation
audit trails and ACP-DAG-HITL
{{I-D.nennemann-agent-dag-hitl-safety}} for protocol policy.

View File

@@ -0,0 +1,577 @@
---
title: "Agent Ecosystem Protocol Binding (AEPB): Interop and Lifecycle"
abbrev: "AEPB"
category: std
docname: draft-aepb-agent-ecosystem-protocol-binding-01
submissiontype: IETF
number:
date:
v: 3
area: "ART"
workgroup: "DISPATCH"
keyword:
- agent interoperability
- protocol translation
- lifecycle
- agentic workflows
author:
-
fullname: TBD
organization: Independent
email: placeholder@example.com
normative:
RFC2119:
RFC8174:
RFC8446:
RFC8615:
RFC9110:
RFC8594:
I-D.nennemann-wimse-ect:
title: "Execution Context Tokens for Distributed Agentic Workflows"
target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
I-D.nennemann-agent-dag-hitl-safety:
title: "Agent Context Policy Token: DAG Delegation with Human Override"
target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
informative:
--- abstract
This document defines the Agent Ecosystem Protocol Binding (AEPB),
the interoperability and lifecycle layer of the agent ecosystem.
With over 90 competing A2A protocol drafts and no interoperability
standard, AEPB defines capability advertisement, protocol
negotiation, formal binding requirements, translation gateway
architecture, and agent lifecycle management (versioning, graceful
shutdown, retirement). Translation hops produce ECT nodes,
preserving DAG continuity across protocol boundaries. Protocol
constraints are expressed as ACP-DAG-HITL node constraints.
--- middle
# Introduction
The IETF AI/agent landscape includes over 90 drafts proposing
agent-to-agent communication protocols. No standard exists for
agents using different protocols to exchange messages, and no
standard exists for how agents evolve, get replaced, or retire
without disrupting dependent services.
AEPB addresses both gaps with a pragmatic approach: rather than
mandating a single protocol, it defines the minimum machinery for
agents to discover each other's protocol support, agree on a
common format, fall back to translation gateways, and manage their
lifecycle.
AEPB builds on ECT {{I-D.nennemann-wimse-ect}} for audit (every
translation hop is a DAG node) and ACP-DAG-HITL
{{I-D.nennemann-agent-dag-hitl-safety}} for policy (protocol
constraints as node constraints).
# Conventions and Definitions
{::boilerplate bcp14-tagged}
Agent Protocol:
: A communication protocol used by an AI agent for peer-to-peer
message exchange (e.g., A2A, MCP, SLIM, uACP).
Capability Document:
: A JSON object describing the protocols an agent supports,
lifecycle status, and ECT assurance level.
Translation Gateway:
: A service that converts messages between two agent protocols,
recording each translation as an ECT DAG node.
Protocol Binding:
: The mapping between the AEPB ecosystem semantics and a specific
agent protocol. Each binding has a stable identifier string.
Binding Identifier:
: A short string identifying a specific protocol binding
version (e.g., `a2a-v1`, `mcp-v1`).
# Capability Advertisement {#capability}
## Capability Document Format
Each AEPB-compliant agent MUST serve a capability document at
`/.well-known/aepb` per {{RFC8615}}:
~~~json
{
  "aepb_version": "1.0",
  "agent_id": "spiffe://example.com/agent/pricing",
  "protocols": [
    {
      "id": "a2a-v1",
      "version": "1.0",
      "endpoint": "https://agent.example.com/a2a",
      "priority": 10
    },
    {
      "id": "mcp-v1",
      "version": "2025-03-26",
      "endpoint": "https://agent.example.com/mcp",
      "priority": 20
    }
  ],
  "translation_gateways": [
    "https://gateway.example.com/aepb/translate"
  ],
  "ect_assurance_level": "L2",
  "ect_namespaces": ["atd", "hitl", "apae"],
  "lifecycle": {
    "status": "active",
    "version": "2.1.0",
    "deprecated_at": null,
    "sunset_at": null,
    "successor": null
  }
}
~~~
{: #fig-capability title="Capability Document"}
The `protocols` array MUST contain at least one entry. `priority`
is OPTIONAL; lower values indicate higher preference.
The `ect_namespaces` field MUST list all ECT extension namespaces
(ATD, HITL, APAE) that this agent emits and can process. Peers
use this to determine whether ecosystem semantics are compatible.
The `lifecycle` object (see {{lifecycle}}) provides versioning and
deprecation metadata.
## DNS-SD Advertisement
Agents SHOULD advertise via DNS SVCB records (`_aepb._tcp`) as
an alternative to well-known URI discovery. The SVCB record
MUST include a `hint` parameter pointing to the well-known URI.
## Capability Document Caching
Capability documents MAY be cached according to HTTP caching
semantics {{RFC9111}}. The default max-age is 3600 seconds.
Agents MUST set `Expires` or `Cache-Control: max-age` on
capability document responses.
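A client might derive the cache lifetime of a capability document as in the sketch below. It handles only `max-age`, `no-store`, and `no-cache`, and treats `no-cache` as "do not reuse without refetching", which is a simplification of HTTP caching; `capability_ttl` is a hypothetical name.

```python
import re
from typing import Optional

DEFAULT_TTL = 3600  # default max-age stated in the text, in seconds


def capability_ttl(cache_control: Optional[str]) -> int:
    """Return how long, in seconds, a capability document may be reused."""
    if not cache_control:
        return DEFAULT_TTL

    directives = [d.strip().lower() for d in cache_control.split(",")]
    # Simplification: both directives are treated as "do not reuse".
    if "no-store" in directives or "no-cache" in directives:
        return 0

    for directive in directives:
        match = re.fullmatch(r'max-age="?(\d+)"?', directive)
        if match:
            return int(match.group(1))
    return DEFAULT_TTL
```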
# Protocol Negotiation {#negotiation}
When Agent A wants to communicate with Agent B:
1. Agent A fetches B's capability document over HTTPS.
2. Agent A computes the intersection of protocol lists. If
non-empty, the protocol with the lowest combined priority is
selected. Communication proceeds directly.
3. If no common protocol exists, Agent A checks translation
gateways listed by either agent:
~~~
GET /.well-known/aepb/gateway?from=a2a-v1&to=slim-v1 HTTP/1.1
~~~
The gateway responds 200 if it supports the pair, 404 if not.
4. If a suitable gateway is found, Agent A sends its message to
the gateway, which translates and forwards.
5. If no gateway supports the pair, Agent A MUST return error
`no_translation_path` and MUST NOT proceed.
Negotiation is stateless and cacheable (Cache-Control, default
3600s).
## Protocol Downgrade Prevention
Protocol negotiation MUST NOT result in selection of a binding
below the minimum configured in ACP-DAG-HITL node constraints:
~~~json
{
  "constraints": {
    "aepb.min_protocol_security": "tls-1.3"
  }
}
~~~
Agents MUST reject capability documents that advertise only
protocols below their configured minimum security requirement.
Specifically, all protocols MUST use TLS 1.3 {{RFC8446}}; no
plaintext bindings are permitted in production deployments.
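On the client side, the TLS 1.3 floor can be enforced with Python's standard `ssl` module, as sketched below. The helper names are hypothetical, and filtering endpoints by the `https` scheme is only a coarse stand-in for the `aepb.min_protocol_security` check, whose exact evaluation the text does not define.

```python
import ssl


def tls13_client_context() -> ssl.SSLContext:
    """Client context that verifies certificates and refuses TLS below 1.3."""
    ctx = ssl.create_default_context()  # certificate and hostname checks enabled
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


def secure_protocols(protocols: list[dict]) -> list[dict]:
    """Drop advertised protocols whose endpoint is not an https URL."""
    return [p for p in protocols
            if p.get("endpoint", "").lower().startswith("https://")]
```

If `secure_protocols` returns an empty list, the capability document advertises nothing acceptable and would be rejected under the rule above.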
# Conforming Protocol Binding Requirements {#binding-requirements}
A protocol binding MUST satisfy the following requirements to be
registered in the AEPB Protocol Binding Registry.
## ECT Carriage
A conforming binding MUST provide a mechanism to carry ECTs
alongside protocol messages. For HTTP-based protocols, this
MUST be the `Execution-Context` header as defined in
{{I-D.nennemann-wimse-ect}}. For non-HTTP protocols, the
binding specification MUST define an equivalent envelope field.
## Task Invocation with Parent Reference
A conforming binding MUST support task invocation messages that
include a reference to the parent ECT `jti`. This allows the
receiving agent to link the new task into the ECT DAG.
## Checkpoint and Rollback Signal Carriage
A conforming binding MUST support conveying ATD rollback requests
and results. For HTTP-based bindings, the `/.well-known/atd/rollback`
endpoint MUST be accessible independent of the main protocol
endpoint.
## HITL Callback Registration
A conforming binding MUST support HITL approval callback
registration. When a task involves a planned approval gate, the
initiating agent MUST be able to register a callback URI that
receives the `hitl:approval_granted` or `hitl:approval_denied`
ECT when the human responds. For HTTP bindings, this is a
standard webhook registration.
## Summary Table
| Requirement | Minimum | Rationale |
|-------------|---------|-----------|
| ECT carriage | `Execution-Context` header or equivalent | DAG continuity |
| Parent ECT reference | In task invocation | DAG linkage |
| Rollback signal | `/.well-known/atd/rollback` accessible | Error recovery |
| HITL callback | Webhook or equivalent | Async approval |
| Transport security | TLS 1.3 | Integrity and confidentiality |
{: #fig-requirements title="Protocol Binding Conformance Requirements"}
# Translation Gateway Architecture {#translation}
## Gateway as DAG Node
Every translation hop produces an ECT:
- `exec_act`: `"aepb:translate"`
- `par`: the source agent's ECT
~~~json
{
  "exec_act": "aepb:translate",
  "par": ["source-agent-ect-uuid"],
  "inp_hash": "sha256-of-source-message",
  "out_hash": "sha256-of-translated-message",
  "ext": {
    "aepb.source_protocol": "a2a-v1",
    "aepb.dest_protocol": "slim-v1",
    "aepb.gateway_id": "spiffe://gw.example.com/aepb",
    "aepb.translation_warnings": []
  }
}
~~~
{: #fig-translate-ect title="Translation ECT"}
This creates a three-node subgraph in the ECT DAG:
~~~
Source ECT → Gateway ECT (aepb:translate) → Dest ECT
~~~
The `Execution-Context` HTTP header survives protocol translation:
the gateway includes the translation ECT in the header of the
forwarded request, maintaining DAG continuity.
## Multi-Hop Translation
When a single gateway cannot handle a translation pair, messages
may traverse multiple gateways. Each hop produces an
`aepb:translate` ECT, all linked in the same DAG:
~~~
Agent-A ECT
Gateway-1 ECT (a2a-v1 → mcp-v1)
Gateway-2 ECT (mcp-v1 → slim-v1)
Agent-B ECT
~~~
{: #fig-multihop title="Multi-Hop Translation DAG"}
The maximum number of translation hops is configured as a
node constraint:
~~~json
{
  "constraints": {
    "aepb.max_translation_hops": 2
  }
}
~~~
Agents receiving a message MUST count `aepb:translate` ECTs in
the `par` ancestry and MUST reject messages exceeding
`aepb.max_translation_hops`. The default maximum is 3.
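Counting hops amounts to walking the `par` links and tallying `aepb:translate` nodes. The sketch below assumes the receiver holds the relevant ECTs in a dictionary keyed by ECT identifier and that the walk starts at the ECT delivered with the message; the function names and the in-memory store are illustrative.

```python
DEFAULT_MAX_HOPS = 3  # default stated in the text


def count_translation_hops(ect_id: str, ects: dict[str, dict]) -> int:
    """Count aepb:translate ECTs among ect_id and all of its ancestors."""
    seen, stack, hops = set(), [ect_id], 0
    while stack:
        current = stack.pop()
        # Skip nodes already visited (shared ancestors) or not available locally.
        if current in seen or current not in ects:
            continue
        seen.add(current)
        claims = ects[current]
        if claims.get("exec_act") == "aepb:translate":
            hops += 1
        stack.extend(claims.get("par", []))
    return hops


def within_hop_limit(ect_id: str, ects: dict[str, dict], constraints: dict) -> bool:
    """True if the message may be accepted under the hop constraint."""
    limit = constraints.get("aepb.max_translation_hops", DEFAULT_MAX_HOPS)
    return count_translation_hops(ect_id, ects) <= limit
```

The visited set matters because a DAG node can be reachable through more than one parent path and would otherwise be counted twice.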
## Gateway Requirements
A gateway MUST:
1. Serve a capability document at `/.well-known/aepb/gateway`
listing supported translation pairs.
2. Accept messages via HTTP POST at its translate endpoint.
3. Produce an `aepb:translate` ECT per {{translation}} for
every translation.
4. Preserve message semantics. Fields without a destination
equivalent are carried in an extension field or dropped with
a warning in `aepb.translation_warnings`.
5. Require TLS 1.3 {{RFC8446}} for all connections.
6. Implement per-source-agent rate limiting.
7. Sign the ECTs it produces at assurance level L2 or higher
(signed JWT at minimum).
A gateway MUST NOT modify payload semantics beyond what is
required for protocol translation.
## Translation Failure Handling
When a gateway fails to translate a message, it MUST emit an
error ECT:
~~~json
{
  "exec_act": "aepb:translate_error",
  "par": ["source-agent-ect-uuid"],
  "ext": {
    "aepb.source_protocol": "a2a-v1",
    "aepb.dest_protocol": "slim-v1",
    "aepb.error": "semantic_loss",
    "aepb.description": "Source message contains field 'action.stream' with no slim-v1 equivalent"
  }
}
~~~
{: #fig-translate-error title="Translation Error ECT"}
Error values: `semantic_loss` (untranslatable field), `timeout`,
`policy_violation` (exceeds hop limit), `internal_error`.
On translation failure:
- The ATD circuit breaker for the gateway agent SHOULD be
updated.
- If `atd.cascade: false`, the calling agent returns
`no_translation_path` to its upstream caller.
- If `atd.cascade: true`, the ATD rollback protocol applies
to the entire workflow subgraph.
## Translation Policy
Protocol constraints are ACP-DAG-HITL node constraints:
~~~json
{
  "constraints": {
    "aepb.allowed_source_protocols": ["a2a-v1", "mcp-v1"],
    "aepb.allowed_dest_protocols": ["slim-v1"],
    "aepb.max_translation_hops": 2
  }
}
~~~
{: #fig-policy title="Translation Policy"}
# Agent Lifecycle Management {#lifecycle}
## Lifecycle States
An agent's `lifecycle.status` MUST be one of:
active:
: Normal operation. Default state.
deprecated:
: Agent is functional but will be retired.
`deprecated_at` MUST be set. The agent MUST include a
`Deprecation` header per {{RFC9745}} in all responses.
Clients SHOULD migrate to `successor` if provided.
draining:
: Agent is rejecting new workflows but completing in-progress
ones. New delegation requests MUST return HTTP 503 with
`Retry-After` header and, if set, `Location` pointing to
`successor`.
retired:
: Agent is offline. Capability document MUST return HTTP 410
Gone with `Link: <successor>; rel="successor-version"`.
## Lifecycle State Transitions
~~~
        deprecate               drain
active ──────────► deprecated ────────► draining ──► retired
  ▲                    │                    │
  │                    │  immediate drain   │
  └────────────────────┴────────────────────┘
             (operator discretion)
~~~
{: #fig-lifecycle-fsm title="Lifecycle State Machine"}
All transitions MUST be recorded as ECTs:
~~~json
{
  "exec_act": "aepb:lifecycle_change",
  "ext": {
    "aepb.agent_id": "spiffe://example.com/agent/pricing",
    "aepb.from_state": "active",
    "aepb.to_state": "deprecated",
    "aepb.reason": "Replaced by pricing-v3"
  }
}
~~~
{: #fig-lifecycle-ect title="Lifecycle Change ECT"}
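An implementation might validate a transition before emitting the lifecycle-change ECT, as sketched here. The allowed set is one reading of the state machine figure (the forward chain plus a direct `active` to `draining` step for an immediate drain); the figure may permit other transitions, so treat the table, and the name `lifecycle_change_claims`, as assumptions.

```python
# Assumed reading of the state machine: forward chain plus immediate drain.
ALLOWED_TRANSITIONS = {
    ("active", "deprecated"),
    ("deprecated", "draining"),
    ("draining", "retired"),
    ("active", "draining"),
}


def lifecycle_change_claims(agent_id: str, from_state: str,
                            to_state: str, reason: str) -> dict:
    """Build the (unsigned) claims for a lifecycle-change ECT."""
    if (from_state, to_state) not in ALLOWED_TRANSITIONS:
        raise ValueError(f"transition {from_state} -> {to_state} not permitted")
    return {
        "exec_act": "aepb:lifecycle_change",
        "ext": {
            "aepb.agent_id": agent_id,
            "aepb.from_state": from_state,
            "aepb.to_state": to_state,
            "aepb.reason": reason,
        },
    }
```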
## Versioning
The `lifecycle.version` field uses semantic versioning. Agents
MUST increment the major version when breaking changes occur
(incompatible protocol or behavior changes).
Capability documents MUST include the version. Agents SHOULD
include version in ECT `ext` claims (`aepb.agent_version`) so
the audit trail records which version performed each action.
## Graceful Shutdown
When an agent transitions to `draining`:
1. Update capability document: `status: "draining"`,
set `sunset_at` timestamp.
2. Reject new workflow delegations with HTTP 503.
3. Complete all in-progress workflows.
4. Emit a final ECT: `exec_act: "aepb:shutdown"`.
5. Transition to `retired`.
Agents SHOULD provide at least 24 hours between `deprecated`
and `draining` so that cached capability documents expire and
clients discover the change when they refresh.
## Successor Discovery
When `successor` is set, it MUST be the URI of the replacement
agent's capability document. Clients SHOULD transparently
redirect to the successor after verifying its capability
document. Clients MUST verify that the successor's assurance
level is equal to or greater than the predecessor's.
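The assurance comparison can be sketched as below. It assumes the levels are ordered L1 < L2 < L3 and are read from the `ect_assurance_level` field of each capability document; a successor with a missing or unknown level is rejected. The name `successor_acceptable` is hypothetical.

```python
# Assumed ordering of ECT assurance levels, lowest to highest.
ASSURANCE_ORDER = {"L1": 1, "L2": 2, "L3": 3}


def successor_acceptable(predecessor_doc: dict, successor_doc: dict) -> bool:
    """True if the successor's assurance level is at least the predecessor's."""
    old = ASSURANCE_ORDER.get(predecessor_doc.get("ect_assurance_level"), 0)
    new = ASSURANCE_ORDER.get(successor_doc.get("ect_assurance_level"))
    if new is None:
        return False  # missing or unrecognised level: do not redirect
    return new >= old
```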
# Security Considerations
## Capability Document Integrity
Capability documents are served over HTTPS with TLS 1.3.
Agents SHOULD verify TLS certificates before trusting capability
documents. For high-assurance deployments, capability documents
SHOULD be signed as JWTs ({{RFC7519}}) so their integrity can
be verified independently of transport security.
## Gateway Trust
Gateways are trusted intermediaries with access to message
content. For end-to-end confidentiality, agents MAY encrypt
message payloads with a shared key established out of band.
The ECT audit trail enables detection of:
- Unauthorized gateways (unknown `aepb.gateway_id`).
- Content tampering (`inp_hash`/`out_hash` mismatch).
- Routing loops (repeated gateway IDs in DAG ancestry).
Gateways MUST authenticate using WIMSE/SPIFFE identities at
ECT assurance L2+.
## Protocol Downgrade Attacks
An attacker may attempt to force negotiation to a weaker
protocol. Mitigation:
- Agents MUST enforce `aepb.min_protocol_security` constraint.
- TLS 1.3 is the minimum transport; lower versions MUST be
rejected.
- Protocol negotiation results MUST be logged as part of the
workflow ECT DAG.
## Translation Amplification
A single cross-protocol request could trigger a chain of N
translations, each consuming resources. Mitigation:
- `aepb.max_translation_hops` (default 3) prevents unbounded
chains.
- Per-source rate limiting at each gateway prevents a single
agent from flooding the translation infrastructure.
## Lifecycle Denial of Service
Transitioning an agent to `draining` or `retired` disrupts
its callers. Only the agent operator (verified via ACP-DAG-HITL
identity binding) SHOULD be able to trigger lifecycle
transitions. Lifecycle-change ECTs MUST be signed at L2+.
# IANA Considerations
## AEPB Protocol Binding Registry
This document requests the creation of the "AEPB Protocol Binding
Registry" under IANA. Registration policy: Specification Required.
Required fields: Binding Identifier, Protocol Name, Specification
Reference, Minimum ECT Assurance Level, HITL Callback Support.
Initial entries:
| Identifier | Protocol | Spec Reference | Min Assurance | HITL Callback |
|------------|----------|---------------|--------------|---------------|
| `a2a-v1` | A2A | (TBD) | L1 | Webhook |
| `mcp-v1` | Model Context Protocol | (TBD) | L1 | Webhook |
| `slim-v1` | SLIM | (TBD) | L1 | Webhook |
| `uacp-v1` | uACP | (TBD) | L1 | Webhook |
| `ainp-v1` | AINP | (TBD) | L1 | Webhook |
{: #fig-registry title="Initial Protocol Binding Registry Entries"}
## Well-Known URIs
This document requests registration per {{RFC8615}}:
| URI Suffix | Purpose |
|------------|---------|
| `aepb` | Agent capability document |
| `aepb/gateway` | Translation gateway capability |
{: #fig-wellknown title="Well-Known URI Registrations"}
## `exec_act` Values
This document requests registration in the AEM Ecosystem
Extension Registry:
| Value | Description | Reference |
|-------|-------------|-----------|
| `aepb:translate` | Protocol translation hop | This document |
| `aepb:translate_error` | Translation failure | This document |
| `aepb:shutdown` | Agent graceful shutdown complete | This document |
| `aepb:lifecycle_change` | Lifecycle state transition | This document |
{: #fig-iana-actions title="AEPB exec_act Registrations"}
--- back
# Acknowledgments
{:numbered="false"}
AEPB builds on ECT {{I-D.nennemann-wimse-ect}} for translation
audit trails and ACP-DAG-HITL
{{I-D.nennemann-agent-dag-hitl-safety}} for protocol policy.
The lifecycle model is inspired by Kubernetes graceful shutdown
semantics and the `Deprecation` header {{RFC9745}}.

View File

@@ -0,0 +1,360 @@
---
title: "Dynamic Agent Trust Scoring (DATS)"
abbrev: "DATS"
category: std
docname: draft-dats-dynamic-agent-trust-scoring-00
submissiontype: IETF
number:
date:
v: 3
area: "SEC"
workgroup: "Security Dispatch"
keyword:
- dynamic trust
- reputation
- agentic workflows
- execution context
author:
-
fullname: Generated by IETF Draft Analyzer
organization: Independent
email: placeholder@example.com
normative:
RFC7519:
RFC7515:
RFC7518:
RFC9110:
I-D.nennemann-wimse-ect:
title: "Execution Context Tokens for Distributed Agentic Workflows"
target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
informative:
I-D.nennemann-agent-dag-hitl-safety:
title: "Agent Context Policy Token: DAG Delegation with Human Override"
target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
--- abstract
This document defines the Dynamic Agent Trust Scoring (DATS)
protocol, a mechanism for AI agents to build, assess, and revoke
trust relationships based on observed behavior over time. Static
authentication verifies identity but says nothing about reliability.
DATS augments identity-based auth with a numeric trust score that
adjusts dynamically based on interaction outcomes recorded in the
ECT DAG. Trust events are derived from ECT action outcomes rather
than agent-local tracking, making trust computation auditable and
tamper-evident. Trust assertions are ECTs themselves, and trust
thresholds integrate with ACP-DAG-HITL node constraints as
enforceable policy.
--- middle
# Introduction
The IETF has 98 drafts addressing agent identity and
authentication, providing strong mechanisms for verifying who an
agent is. But identity alone is insufficient for long-running
autonomous systems. A properly authenticated agent may still
produce bad results, violate expectations, or degrade over time.
DATS adds a behavioral dimension to trust. It answers: "I know
who you are, but should I rely on you?" The model is deliberately
simple -- a single floating-point score between 0.0 and 1.0 per
agent relationship -- because complex reputation systems tend to
be gamed or ignored.
By building on ECT {{I-D.nennemann-wimse-ect}}, DATS derives trust
from the cryptographically signed record of actual interactions
rather than agent-local counters that can be manipulated. At ECT
assurance level L3, the audit ledger provides an immutable
interaction history.
The protocol is inspired by:
- TCP congestion control: trust increases slowly (additive) and
decreases quickly (multiplicative) on failure.
- TLS certificate transparency: trust assertions are logged.
- Web of trust (PGP): trust propagates through intermediaries
with attenuation.
# Conventions and Definitions
{::boilerplate bcp14-tagged}
Trust Score:
: A floating-point value in \[0.0, 1.0\] representing one agent's
assessed reliability of another, based on observed ECT outcomes.
Trust Event:
: An observable interaction outcome that causes a trust score
adjustment. Derived from ECTs in the workflow DAG.
Trust Decay:
: Automatic reduction of trust scores over inactivity, reflecting
the principle that trust requires ongoing evidence.
Trust Assertion:
: An ECT recording one agent's trust score for another,
transportable as a signed token.
# Problem Statement
Agent A delegates a task to Agent B. After 100 successful
interactions, Agent B starts returning incorrect results (model
drift, adversarial manipulation, or degradation). Agent A has no
standard way to:
1. Track B's reliability over time.
2. Reduce B's privileges based on degraded performance.
3. Share its experience with Agent C.
4. Automatically revoke B's access when trust drops below
acceptable levels.
Existing attestation drafts (STAMP, DAAP) provide cryptographic
proof of specific actions but not ongoing behavioral assessment.
The ECT DAG records what happened; DATS adds evaluation of
whether what happened was good.
# Trust Score Model {#trust-model}
Each agent maintains a trust table: a mapping from peer agent IDs
to trust scores.
~~~json
{
"spiffe://example.com/agent/b": {
"score": 0.82,
"interactions": 147,
"last_updated": "2026-03-01T11:30:00Z",
"last_event_ect": "550e8400-e29b-41d4-a716-446655440099"
}
}
~~~
{: #fig-trust-table title="Trust Table Entry"}
Initial trust for an unknown agent is deployment-configured. A
value of 0.5 is RECOMMENDED as a neutral starting point.
Zero-trust deployments MAY use 0.1.
Trust scores are updated using additive-increase,
multiplicative-decrease (AIMD):
On positive event:
: `score = min(1.0, score + alpha)`
On negative event:
: `score = max(0.0, score * beta)`
Default parameters: `alpha = 0.01`, `beta = 0.8`.
This means trust builds slowly (50 successes from 0.5 to 1.0)
but drops quickly (a single failure takes 0.82 to 0.66). This
asymmetry is intentional: in autonomous systems, the cost of
trusting a bad agent exceeds the cost of slow trust building.
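The AIMD update rules above can be illustrated with a short, non-normative Python sketch. The function names and the `successes_to_reach` helper are illustrative assumptions, not part of the protocol; only the two formulas and the default `alpha` and `beta` come from this document.

```python
ALPHA = 0.01  # additive increase (default from this document)
BETA = 0.8    # multiplicative decrease (default from this document)


def on_positive(score: float, alpha: float = ALPHA) -> float:
    """Additive increase, capped at 1.0."""
    return min(1.0, score + alpha)


def on_negative(score: float, beta: float = BETA) -> float:
    """Multiplicative decrease, floored at 0.0."""
    return max(0.0, score * beta)


def successes_to_reach(start: float, target: float,
                       alpha: float = ALPHA) -> int:
    """Hypothetical helper: count the consecutive positive events
    needed to move a score from start to target."""
    if alpha <= 0 or target > 1.0:
        raise ValueError("alpha must be > 0 and target must be <= 1.0")
    score, count = start, 0
    # Small tolerance absorbs floating-point accumulation error.
    while score < target - 1e-9:
        score = on_positive(score, alpha)
        count += 1
    return count
```

`on_negative(0.82)` yields approximately 0.656, matching the single-failure example, and `successes_to_reach` makes the asymmetry between building and losing trust easy to explore for other parameter choices.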
# Trust Events from ECT {#trust-events}
Trust events are derived from ECTs in the workflow DAG rather than
agent-local tracking. This makes trust computation auditable.
## Standard Trust Events
| ECT condition | Event | Adjustment |
|--------------|-------|------------|
| `exec_act` completed, no error ECT follows | `task_success` | +1x alpha |
| `exec_act` completed, partial result | `task_partial` | +0.5x alpha |
| `aerr:error` ECT with `par` referencing agent | `task_failure` | 1x beta |
| Timeout (no response ECT within threshold) | `task_timeout` | 1x beta |
| `aerr:error` with `constraint_violation` | `policy_violation` | beta^2 |
| ECT signature verification fails | `attestation_invalid` | beta^2 |
| `aerr:rollback_request` targeting agent | `rollback_triggered` | 1x beta |
{: #fig-events title="Trust Events Derived from ECT"}
`beta^2` means the multiplicative decrease is applied twice
(`score * beta * beta`), reflecting the severity of policy
violations versus simple failures.
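As a non-normative illustration, the table above can be encoded as a lookup from event name to adjustment. The `ADJUSTMENTS` structure and the `apply_event` name are assumptions made for this sketch; the event names, weights, and defaults are taken from the table.

```python
ALPHA, BETA = 0.01, 0.8  # defaults from the Trust Score Model

# (direction, weight): for "up" the weight scales alpha;
# for "down" the weight is how many times beta is applied.
ADJUSTMENTS = {
    "task_success": ("up", 1.0),
    "task_partial": ("up", 0.5),
    "task_failure": ("down", 1),
    "task_timeout": ("down", 1),
    "policy_violation": ("down", 2),
    "attestation_invalid": ("down", 2),
    "rollback_triggered": ("down", 1),
}


def apply_event(score: float, event: str,
                alpha: float = ALPHA, beta: float = BETA) -> float:
    """Return the new trust score after one trust event.

    Unknown event names raise KeyError; how a deployment treats
    unregistered events is outside this sketch.
    """
    direction, weight = ADJUSTMENTS[event]
    if direction == "up":
        return min(1.0, score + weight * alpha)
    return max(0.0, score * beta ** weight)
```

For example, a `policy_violation` against a score of 0.82 gives 0.82 * 0.8 * 0.8, approximately 0.525, whereas a plain `task_failure` gives approximately 0.656.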
## Trust Decay
If no interaction (no ECT involving the peer) occurs for a
configurable period (default: 7 days), the trust score decays:
`score = max(initial_default, score - decay_rate)`
Default `decay_rate`: 0.01 per day.
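A minimal, non-normative sketch of the decay rule follows. It assumes that decay is applied once for each whole day of inactivity beyond the configured period; this document does not spell out that schedule, so `grace_days` and the per-day accounting are assumptions of the sketch.

```python
def decayed_score(score: float, idle_days: int,
                  initial_default: float = 0.5,
                  decay_rate: float = 0.01,
                  grace_days: int = 7) -> float:
    """Apply score = max(initial_default, score - decay_rate) once
    per idle day beyond the grace period (assumed schedule)."""
    decay_days = idle_days - grace_days
    if decay_days <= 0:
        return score  # still within the inactivity period: no decay
    return max(initial_default, score - decay_rate * decay_days)
```

With the defaults, a score of 0.82 is unchanged after 7 idle days and reaches the floor of 0.5 after a long enough silence. Read literally, the formula also lifts a score that is below `initial_default` up to the default as soon as decay applies (for example, `decayed_score(0.3, 8)` returns 0.5); deployments may want to decide explicitly whether that is intended.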
Agents MUST record all trust events in a local audit log. At L3,
the trust events are derivable from the audit ledger, providing
independent verifiability.
# Trust Assertions as ECT {#trust-assertions}
Agent A shares its trust assessment of Agent B with Agent C via a
trust assertion ECT:
- `exec_act`: `"dats:assertion"`
- `par`: empty (trust assertions are standalone) or referencing
the most recent interaction ECT
~~~json
{
"iss": "spiffe://example.com/agent/a",
"ext": {
"dats.subject": "spiffe://example.com/agent/b",
"dats.score": 0.82,
"dats.interactions": 147,
"dats.confidence": "high",
"dats.hops": 0
}
}
~~~
{: #fig-assertion title="Trust Assertion ECT"}
`dats.confidence` is based on interaction count: `low` (<10),
`medium` (10-99), `high` (100+).
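The confidence bands above translate directly into code. The following non-normative sketch uses a hypothetical function name; the band boundaries are the ones stated in the text.

```python
def confidence_label(interactions: int) -> str:
    """Map an interaction count to a dats.confidence value."""
    if interactions < 0:
        raise ValueError("interaction count cannot be negative")
    if interactions < 10:
        return "low"
    if interactions < 100:
        return "medium"
    return "high"
```

The example assertion above reports 147 interactions, which falls in the `high` band.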
## Trust Propagation with Attenuation
When Agent C receives a trust assertion from Agent A about Agent B,
it MAY incorporate it:
~~~
c_score_for_b = max(c_score_for_b,
                    a_score_for_b * trust_of_a * attenuation)
~~~
Where:
- `a_score_for_b` = A's reported score for B (0.82)
- `trust_of_a` = C's own trust score for A
- `attenuation` = constant (default: 0.5)
Trust assertions are advisory. An agent's own direct observations
always take precedence over propagated trust.
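The attenuation formula can be sketched as follows (non-normative; the function name is an assumption, the parameter names follow the formula above).

```python
def incorporate_assertion(c_score_for_b: float,
                          a_score_for_b: float,
                          trust_of_a: float,
                          attenuation: float = 0.5) -> float:
    """Return C's score for B after considering A's assertion."""
    propagated = a_score_for_b * trust_of_a * attenuation
    return max(c_score_for_b, propagated)
```

If A reports 0.82 for B and C trusts A at 0.9, the propagated value is 0.82 * 0.9 * 0.5, approximately 0.369. Because the formula takes the maximum, a propagated value can only raise C's score for B, never lower it: a C-side score of 0.2 becomes roughly 0.369, while a C-side score of 0.5 stays at 0.5.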
## Anti-Gaming Measures
To prevent trust laundering (colluding agents inflating each
other's scores):
- Agents SHOULD limit propagation depth to 1 hop by default
- The `dats.hops` field tracks depth; agents MUST NOT propagate
assertions where `dats.hops` exceeds their configured maximum
- At L3, trust assertions are recorded in the audit ledger,
making collusion patterns detectable through graph analysis
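A non-normative sketch of the hop check follows. It assumes that a forwarding agent increments `dats.hops` by one and that a missing field is treated as 0; neither detail is specified above, and the function name is hypothetical.

```python
from typing import Optional


def forward_assertion(ext: dict, max_hops: int = 1) -> Optional[dict]:
    """Return a copy of the assertion's ext claims prepared for
    forwarding, or None if the hop limit forbids propagation."""
    hops = ext.get("dats.hops", 0)  # assumption: absent means 0
    if hops > max_hops:
        return None  # MUST NOT propagate beyond the configured maximum
    forwarded = dict(ext)  # leave the received claims untouched
    forwarded["dats.hops"] = hops + 1  # assumption: increment per hop
    return forwarded
```

Returning a copy rather than mutating the received claims keeps the original assertion intact for signature verification and audit.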
# Trust Thresholds as Policy {#trust-policy}
## Threshold-Based Access
Agents SHOULD define trust thresholds per action type:
~~~json
{
"thresholds": {
"read_data": 0.3,
"execute_task": 0.5,
"modify_config": 0.7,
"delegate_auth": 0.9
}
}
~~~
{: #fig-thresholds title="Trust Thresholds"}
When a request arrives, the agent checks the requester's trust
score against the threshold. If below threshold, the request is
denied with HTTP 403 and error `trust_insufficient`.
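The threshold check can be sketched as below (non-normative). The threshold values come from the example above; the function name, the return shape, and the use of status 200 for an accepted request are assumptions of this sketch.

```python
THRESHOLDS = {
    "read_data": 0.3,
    "execute_task": 0.5,
    "modify_config": 0.7,
    "delegate_auth": 0.9,
}


def authorize(action: str, requester_score: float,
              thresholds: dict = THRESHOLDS) -> tuple:
    """Return (http_status, body) for a requested action.

    An action type with no configured threshold raises KeyError;
    a real deployment would need an explicit policy for that case.
    """
    required = thresholds[action]
    if requester_score < required:
        return 403, {"error": "trust_insufficient"}
    return 200, {}  # assumption: request proceeds normally
```

A requester with a score of 0.54 can `read_data` and `execute_task` but is refused `modify_config`. A score exactly equal to the threshold is accepted, since only scores below the threshold are denied.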
## Integration with ACP-DAG-HITL
Trust thresholds can be expressed as DAG node constraints
{{I-D.nennemann-agent-dag-hitl-safety}}:
~~~json
{
"dag": {
"nodes": [{
"id": "n-critical-action",
"type": "modify_config",
"agent": "spiffe://example.com/agent/b",
"constraints": {
"dats.min_trust": 0.7,
"dats.min_confidence": "medium"
}
}]
},
"hitl": {
"rules": [{
"id": "r-low-trust",
"trigger": {
"kind": "confidence_below",
"op": "lt",
"value": 0.5,
"input_ref": "dats.peer_trust_score"
},
"required_role": "operator:security",
"action": "escalate",
"allow_override": true,
"override_action": "continue"
}]
}
}
~~~
{: #fig-policy title="Trust Policy as DAG Constraints + HITL"}
This means: if the delegated agent's trust score drops below 0.5,
escalate to a human security operator before proceeding.
## Automatic Revocation
When an agent's trust score drops below a configured floor
(default: 0.2), the trusting agent SHOULD:
1. Revoke all outstanding delegations to that agent
2. Produce a revocation ECT (`exec_act`: `"dats:revoke"`)
3. Emit an error ECT per AERR if the agent was part of an
active workflow
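To get a feel for how quickly the revocation floor is reached, the following non-normative helper counts the consecutive single-`beta` failures needed to push a score below the floor. The function name is hypothetical; the defaults are those given in this document.

```python
def failures_until_floor(score: float, floor: float = 0.2,
                         beta: float = 0.8) -> int:
    """Count consecutive negative events until score < floor."""
    if not (0.0 < beta < 1.0) or floor <= 0.0:
        raise ValueError("requires 0 < beta < 1 and floor > 0")
    count = 0
    while score >= floor:
        score *= beta  # one multiplicative decrease per failure
        count += 1
    return count
```

With the defaults, an agent at 0.82 falls below 0.2 after 7 consecutive failures (0.82 * 0.8^7 is roughly 0.172), and an agent at the neutral 0.5 after 5.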
# Security Considerations
Trust scores are sensitive metadata. Agents MUST NOT expose
their full trust tables to peers. Only pairwise trust assertions
({{trust-assertions}}) should be shared, and only intentionally.
Trust assertion ECTs MUST be signed at L2 or L3. Agents MUST
verify signatures before processing.
Score manipulation: a malicious agent could behave well to build
trust, then exploit it. Mitigation: `policy_violation` events
apply double penalties, and deployments SHOULD set high thresholds
for critical actions.
Sybil attacks: an attacker creates many agents for fake positive
assertions. Mitigation: attenuation ({{trust-assertions}}),
hop limits, and requiring agents to be registered in a trusted
directory before accepting assertions.
All trust-related communications MUST use TLS 1.3.
# IANA Considerations
This document requests the following IANA registrations:
1. Registration of `exec_act` values `dats:assertion` and
`dats:revoke` in a future ECT action type registry.
2. A "DATS Trust Event Type" registry under Specification Required
policy. Initial entries: `task_success`, `task_partial`,
`task_failure`, `task_timeout`, `policy_violation`,
`attestation_invalid`, `rollback_triggered`.
--- back
# Acknowledgments
{:numbered="false"}
This document builds on the Execution Context Token specification
{{I-D.nennemann-wimse-ect}} for interaction evidence and the
Agent Context Policy Token {{I-D.nennemann-agent-dag-hitl-safety}}
for trust threshold policy enforcement.


@@ -0,0 +1,298 @@
Internet-Draft AI/Agent WG
Intended status: Standards Track March 2026
Expires: September 15, 2026
Dynamic Agent Trust Scoring (DATS)
draft-dats-dynamic-agent-trust-scoring-00
Abstract
This document defines the Dynamic Agent Trust Scoring (DATS)
protocol, a mechanism for AI agents to build, assess, and
revoke trust relationships based on observed behavior over
time. Static authentication (certificates, API keys) verifies
identity but says nothing about whether an agent is reliable,
accurate, or well-behaved. DATS augments identity-based auth
with a numeric trust score that adjusts dynamically based on
interaction outcomes. The protocol defines trust score
computation, propagation between agents, decay over inactivity,
and threshold-based access policies. DATS is intentionally
simple: a single score per agent-pair, standard adjustment
events, and a JWT-based transport for trust assertions.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
This document is intended to have Standards Track status.
Distribution of this memo is unlimited.
Table of Contents
1. Introduction
2. Terminology
3. Problem Statement
4. Trust Score Model
5. Trust Events and Adjustments
6. Trust Propagation
7. Threshold-Based Access Policies
8. Security Considerations
9. IANA Considerations
1. Introduction
The IETF has 98 drafts addressing agent identity and
authentication, providing strong mechanisms for verifying who
an agent is. But identity alone is insufficient for long-
running autonomous systems. A properly authenticated agent
may still produce bad results, violate expectations, or
degrade over time. Static certificates cannot capture this.
DATS adds a behavioral dimension to agent trust. It answers
the question: "I know who you are, but should I rely on you?"
The model is deliberately simple — a single floating-point
score between 0.0 and 1.0 per agent relationship — because
complex reputation systems tend to be gamed or ignored.
The protocol is inspired by:
- TCP congestion control: trust increases slowly (additive)
and decreases quickly (multiplicative) on failure.
- TLS certificate transparency: trust assertions are logged
for auditability.
- Web of trust (PGP): trust can propagate through
intermediaries, with attenuation.
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described
in RFC 2119 [RFC2119].
Trust Score: A floating-point value in [0.0, 1.0] representing
one agent's assessed reliability of another, based on observed
interaction outcomes.
Trust Event: An observable interaction outcome that causes a
trust score adjustment. Events are either positive (task
completed successfully) or negative (task failed, timeout,
policy violation).
Trust Decay: The automatic reduction of trust scores over
periods of inactivity, reflecting the principle that trust
requires ongoing evidence.
Trust Assertion: A signed statement by one agent about another
agent's trust score, transportable as a JWT claim.
3. Problem Statement
Agent A delegates a task to Agent B. Agent B completes it
correctly. Agent A delegates again. After 100 successful
interactions, Agent B starts returning subtly incorrect results
(model drift, adversarial manipulation, or simple degradation).
Agent A has no standard way to:
1. Track B's reliability over time.
2. Reduce B's privileges based on degraded performance.
3. Share its experience with Agent C, who is considering
delegating to Agent B.
4. Automatically revoke B's access when trust drops below
acceptable levels.
Existing attestation drafts (STAMP, DAAP) provide
cryptographic proof of specific actions but not ongoing
behavioral assessment. DATS fills this gap.
4. Trust Score Model
Each agent maintains a trust table: a mapping from peer agent
IDs to trust scores.
{
"urn:uuid:agent-b": {
"score": 0.82,
"interactions": 147,
"last_updated": "2026-03-01T11:30:00Z",
"last_event": "task_success"
}
}
Initial trust for an unknown agent is a deployment-configured
default. A value of 0.5 is RECOMMENDED as a neutral starting
point, but deployments MAY use lower values (e.g., 0.1) for
zero-trust environments.
Trust scores are updated using an additive-increase,
multiplicative-decrease (AIMD) algorithm:
On positive event:
score = min(1.0, score + alpha)
On negative event:
score = max(0.0, score * beta)
Default parameters: alpha = 0.01, beta = 0.8.
This means trust builds slowly (50 successes to go from 0.5
to 1.0) but drops quickly (a single failure reduces a 0.82
score to 0.66). This asymmetry is intentional: in autonomous
systems, the cost of trusting a bad agent exceeds the cost of
being slow to trust a good one.
Agents MAY tune alpha and beta per relationship or per action
type, but MUST use the AIMD structure.
5. Trust Events and Adjustments
The following standard trust events are defined:
| Event | Direction | Default Weight |
|----------------------|-----------|----------------|
| task_success | positive | 1x alpha |
| task_partial_success | positive | 0.5x alpha |
| task_failure | negative | 1x beta |
| task_timeout | negative | 1x beta |
| policy_violation | negative | applied twice |
| attestation_invalid | negative | applied twice |
| rollback_triggered | negative | 1x beta |
"applied twice" means the multiplicative decrease is applied
two times in succession (score * beta * beta), reflecting the
severity of policy violations versus simple failures.
Trust decay: if no interaction occurs for a configurable
period (default: 7 days), the trust score decays:
score = max(initial_default, score - decay_rate)
Default decay_rate: 0.01 per day. This ensures that stale
trust relationships gradually return to the default level
rather than persisting indefinitely.
Agents MUST record all trust events in a local audit log.
6. Trust Propagation
Agent A may share its trust assessment of Agent B with Agent C
through a signed trust assertion. The assertion is a JWT
(RFC 7519) with the following claims:
{
"iss": "urn:uuid:agent-a",
"sub": "urn:uuid:agent-b",
"iat": 1709294400,
"exp": 1709380800,
"dats_score": 0.82,
"dats_interactions": 147,
"dats_confidence": "high"
}
"dats_confidence" is based on interaction count: "low" (<10),
"medium" (10-99), "high" (100+).
When Agent C receives this assertion, it MAY incorporate it
into its own trust score for Agent B using attenuation:
c_score_for_b = max(c_score_for_b,
                    a_score_for_b * trust_of_a * attenuation)
Where:
- a_score_for_b is Agent A's reported score for B (0.82)
- trust_of_a is Agent C's trust score for Agent A
- attenuation is a constant (default: 0.5) preventing
unbounded trust propagation
Trust assertions are advisory. Agents MUST NOT blindly adopt
propagated scores. An agent's own direct observations always
take precedence over propagated trust.
To prevent trust laundering (colluding agents inflating each
other's scores), agents SHOULD limit propagation depth to 1
hop by default. The "dats_hops" claim tracks propagation
depth; agents MUST NOT propagate assertions where dats_hops
exceeds their configured maximum.
7. Threshold-Based Access Policies
Agents SHOULD define trust thresholds for different action
categories:
{
"thresholds": {
"read_data": 0.3,
"execute_task": 0.5,
"modify_config": 0.7,
"delegate_auth": 0.9
}
}
When an agent requests an action, the serving agent checks the
requester's trust score against the threshold for that action
type. If the score is below the threshold, the request is
denied with a 403 response including a DATS-specific error:
{
"error": "trust_insufficient",
"required_score": 0.7,
"current_score": 0.54,
"action": "modify_config"
}
The response SHOULD NOT reveal the exact current score in
production deployments to prevent score probing. Instead, it
MAY return only the "trust_insufficient" error.
Automatic revocation: when an agent's trust score drops below
a configured floor (default: 0.2), the trusting agent SHOULD
revoke all outstanding delegations and emit a trust revocation
event. This provides automatic containment of agents that
have become unreliable.
8. Security Considerations
Trust scores are sensitive metadata. Agents MUST NOT expose
their full trust tables to peers. Only pairwise trust
assertions (Section 6) should be shared, and only
intentionally.
Trust assertion JWTs MUST be signed using algorithms from
RFC 7518 (e.g., ES256, EdDSA). Agents MUST verify signatures
before processing trust assertions.
Score manipulation attacks: a malicious agent could
intentionally behave well for many interactions to build trust,
then exploit high trust for a damaging action. Mitigation:
policy_violation events apply double penalties, and
deployments SHOULD set trust thresholds high for critical
actions regardless of accumulated trust.
Sybil attacks: an attacker could create many agents to
generate fake positive trust assertions. Mitigation: agents
SHOULD weight propagated trust by their own direct trust in
the asserting agent (Section 6 attenuation) and SHOULD
require agents to be registered in a trusted directory (e.g.,
ANS) before accepting trust assertions.
All trust-related communications MUST use TLS 1.3 [RFC8446].
9. IANA Considerations
This document requests IANA establish the following:
1. Registration of JWT claims "dats_score",
"dats_interactions", "dats_confidence", and "dats_hops"
in the JSON Web Token Claims registry per RFC 7519.
2. A "DATS Trust Event Type" registry under Specification
Required policy. Initial entries: "task_success",
"task_partial_success", "task_failure", "task_timeout",
"policy_violation", "attestation_invalid",
"rollback_triggered".
Author's Address
Generated by IETF Draft Analyzer
2026-03-01


@@ -0,0 +1,384 @@
---
title: "Assurance Profiles for Agent Ecosystems (APAE)"
abbrev: "APAE"
category: info
docname: draft-apae-assurance-profiles-00
submissiontype: IETF
number:
date:
v: 3
area: "SEC"
workgroup: "Security Dispatch"
keyword:
  - dynamic trust
  - assurance
  - behavior verification
  - data provenance
author:
  -
    fullname: TBD
    organization: Independent
    email: placeholder@example.com
normative:
  RFC2119:
  RFC8174:
  RFC7519:
  RFC7518:
  I-D.nennemann-wimse-ect:
    title: "Execution Context Tokens for Distributed Agentic Workflows"
    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
  I-D.nennemann-agent-dag-hitl-safety:
    title: "Agent Context Policy Token: DAG Delegation with Human Override"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
informative:
--- abstract
This document defines Assurance Profiles for Agent Ecosystems
(APAE): dynamic trust scoring, behavior verification, data
provenance, and graduated assurance profiles that allow the same
agent ecosystem to operate in relaxed (dev/K8s) and regulated
(healthcare, finance) environments. Trust events are derived from
ECT outcomes. Trust assertions are ECTs. Behavior verification
references ECT claims. Provenance chains are implicit in the ECT
DAG. Assurance profiles select which combination of these
mechanisms is required for a given deployment, mapping to ECT
assurance levels L1/L2/L3.
--- middle
# Introduction
Identity verifies who an agent is. ECT records what an agent did.
But neither answers: should I rely on this agent? Is it doing what
it promised? Can I trace where this data came from?
APAE adds three capabilities to the ecosystem:
1. **Dynamic trust scoring** — behavioral reputation that adjusts
based on interaction outcomes (AIMD model).
2. **Behavior verification** — checking agent actions against
declared specifications.
3. **Data provenance** — tracing data lineage through the DAG.
These three capabilities are bundled into assurance profiles
(relaxed, standard, regulated) that map to ECT assurance levels,
so the same ecosystem works from a dev cluster to a hospital.
# Conventions and Definitions
{::boilerplate bcp14-tagged}
Trust Score:
: A floating-point value in \[0.0, 1.0\] representing one agent's
assessed reliability of another.
Trust Event:
: An interaction outcome that causes a trust score adjustment.
Derived from ECTs.
Behavior Specification:
: A machine-readable declaration of permitted agent actions and
constraints.
Provenance Chain:
: The sequence of ECT nodes recording how a piece of data was
produced, transformed, and consumed.
Assurance Profile:
: A named configuration selecting which trust, verification, and
provenance mechanisms are required.
# Dynamic Trust Scoring {#trust}
## Trust Score Model
Each agent maintains a trust table: peer agent IDs mapped to
trust scores. Initial trust for unknown agents is deployment-
configured (RECOMMENDED: 0.5; zero-trust: 0.1).
Scores update using additive-increase, multiplicative-decrease
(AIMD):
- Positive event: `score = min(1.0, score + alpha)`
- Negative event: `score = max(0.0, score * beta)`
Defaults: `alpha = 0.01`, `beta = 0.8`.
Trust builds slowly (50 successes: 0.5 → 1.0) and drops fast
(one failure: 0.82 → 0.66).
## Trust Events from ECT {#trust-events}
Trust events are derived from ECTs rather than agent-local
counters, making trust computation auditable:
| ECT condition | Event | Adjustment |
|--------------|-------|------------|
| Completed, no error follows | `task_success` | +1x alpha |
| Completed, partial result | `task_partial` | +0.5x alpha |
| `atd:error` referencing agent | `task_failure` | 1x beta |
| No response within threshold | `task_timeout` | 1x beta |
| `atd:error` with `constraint_violation` | `policy_violation` | beta^2 |
| ECT signature verification fails | `attestation_invalid` | beta^2 |
| `atd:rollback_request` targeting agent | `rollback_triggered` | 1x beta |
{: #fig-events title="Trust Events from ECT"}
## Trust Decay
If no interaction occurs for a configurable period (default:
7 days): `score = max(initial_default, score - 0.01/day)`.
## Trust Assertions as ECT {#trust-assertions}
Agent A shares its trust assessment via a trust assertion ECT:
- `exec_act`: `"apae:trust_assertion"`
~~~json
{
"exec_act": "apae:trust_assertion",
"ext": {
"apae.subject": "spiffe://example.com/agent/b",
"apae.trust_score": 0.82,
"apae.interactions": 147,
"apae.confidence": "high",
"apae.hops": 0
}
}
~~~
{: #fig-assertion title="Trust Assertion ECT"}
Confidence: `low` (<10 interactions), `medium` (10-99),
`high` (100+).
## Trust Propagation
When Agent C receives A's assertion about B:
~~~
c_score_for_b = max(c_score_for_b,
                    a_score * trust_of_a * attenuation)
~~~
Default `attenuation`: 0.5. Direct observations always take
precedence. `apae.hops` tracks propagation depth; agents MUST NOT
propagate beyond their configured maximum (default: 1).
## Trust Thresholds as Policy
Trust thresholds are ACP-DAG-HITL node constraints:
~~~json
{
"constraints": {
"apae.min_trust": 0.7,
"apae.min_confidence": "medium"
}
}
~~~
{: #fig-threshold title="Trust Threshold as Node Constraint"}
Requests from agents below threshold are denied with HTTP 403.
Low trust can trigger HITL escalation:
~~~json
{
"id": "r-low-trust",
"trigger": {
"kind": "confidence_below",
"op": "lt",
"value": 0.5,
"input_ref": "apae.peer_trust_score"
},
"required_role": "operator:security",
"action": "escalate",
"allow_override": true,
"override_action": "continue"
}
~~~
{: #fig-trust-hitl title="HITL Rule for Low Trust"}
## Automatic Revocation
When trust drops below a floor (default: 0.2), the trusting agent
SHOULD revoke delegations and emit:
`exec_act: "apae:trust_revoke"`.
# Behavior Verification {#behavior}
## Behavior Specifications
A behavior specification declares what an agent is permitted to do.
Specifications are JSON documents referencing ECT claims:
~~~json
{
"spec_version": "1.0",
"agent_id": "spiffe://example.com/agent/firewall",
"allowed_actions": ["update_rules", "read_config", "report"],
"constraints": {
"max_actions_per_minute": 60,
"forbidden_targets": ["core-router-*"],
"require_checkpoint_before": ["update_rules"]
},
"verification_frequency": "continuous"
}
~~~
{: #fig-spec title="Behavior Specification"}
## Verification Against ECT Stream
A verifier monitors the agent's ECT stream and checks:
1. `exec_act` values are in `allowed_actions`.
2. Action rate does not exceed `max_actions_per_minute` (computed
from `iat` timestamps).
3. `atd:checkpoint` ECTs precede `update_rules` ECTs (from
`require_checkpoint_before`).
4. Targets in `ext` claims do not match `forbidden_targets`.
Verification results are ECTs:
- `exec_act`: `"apae:compliance_check"`
~~~json
{
"exec_act": "apae:compliance_check",
"par": ["latest-agent-ect-uuid"],
"ext": {
"apae.compliance_status": "passing",
"apae.violations": [],
"apae.spec_version": "1.0",
"apae.window": "2026-03-01T12:00:00Z/PT1H"
}
}
~~~
{: #fig-compliance title="Compliance Check ECT"}
Violations trigger trust score decreases (`policy_violation` event)
and MAY trigger HITL escalation.
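The four verifier checks can be sketched as a single pass over an agent's ECT stream. This is a non-normative illustration under several assumptions that are not defined in this document: ECTs are plain dictionaries ordered by `iat` (seconds), the target of an action is carried in a hypothetical `ext["target"]` claim, `forbidden_targets` entries are shell-style glob patterns, each guarded action consumes one preceding `atd:checkpoint`, and checkpoint ECTs themselves are exempt from the `allowed_actions` check.

```python
from fnmatch import fnmatchcase

CHECKPOINT = "atd:checkpoint"


def verify_stream(spec: dict, ects: list) -> list:
    """Return violation strings for an ECT stream ordered by iat."""
    violations = []
    constraints = spec.get("constraints", {})
    guarded = set(constraints.get("require_checkpoint_before", []))
    forbidden = constraints.get("forbidden_targets", [])
    max_rate = constraints.get("max_actions_per_minute")
    checkpoint_ready = False
    window = []  # iat values of actions in the last 60 seconds

    for ect in ects:
        act = ect["exec_act"]
        if act == CHECKPOINT:
            checkpoint_ready = True
            continue
        # Check 1: action must be declared in allowed_actions.
        if act not in spec["allowed_actions"]:
            violations.append(f"action_not_allowed:{act}")
        # Check 2: sliding one-minute rate limit based on iat.
        window = [t for t in window if ect["iat"] - t < 60]
        window.append(ect["iat"])
        if max_rate is not None and len(window) > max_rate:
            violations.append(f"rate_exceeded:{act}")
        # Check 3: guarded actions need a fresh checkpoint.
        if act in guarded:
            if not checkpoint_ready:
                violations.append(f"missing_checkpoint:{act}")
            checkpoint_ready = False
        # Check 4: target must not match a forbidden pattern.
        target = ect.get("ext", {}).get("target")
        if target is not None and any(
                fnmatchcase(target, pattern) for pattern in forbidden):
            violations.append(f"forbidden_target:{target}")
    return violations
```

An empty result corresponds to `apae.compliance_status` of `passing`; a non-empty list would populate `apae.violations` in the compliance check ECT. The violation string format shown here is purely illustrative.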
# Data Provenance {#provenance}
## DAG as Provenance Chain
The ECT DAG already encodes data provenance: each ECT's `par`
references show which prior tasks produced its inputs. The
`inp_hash` and `out_hash` claims prove what was processed without
revealing the data.
For deployments requiring explicit provenance metadata, agents
MAY include:
~~~json
{
"ext": {
"apae.data_source": "database:patients",
"apae.data_classification": "pii",
"apae.retention_days": 365,
"apae.transformations": ["anonymize", "aggregate"]
}
}
~~~
{: #fig-provenance title="Provenance Extension Claims"}
## Provenance Queries
At L3, the audit ledger enables provenance queries:
- "Which agents touched this data?" → walk `par` chain from
final ECT to roots.
- "Was this data transformed?" → check `apae.transformations`
along the chain.
- "Is provenance complete?" → verify all `par` references
resolve to ledger entries.
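The three queries above amount to one graph walk over `par` references. The sketch below is non-normative and assumes the audit ledger is available as an in-memory dictionary keyed by ECT identifier, with the producing agent in the `iss` claim; both the ledger interface and the function name are hypothetical.

```python
def walk_provenance(ledger: dict, final_id: str) -> tuple:
    """Walk par references from final_id back to the roots.

    Returns (agents, transformations, missing):
      agents          - set of iss values along the chain
      transformations - apae.transformations seen along the chain
      missing         - par references absent from the ledger
    """
    agents, transformations, missing = set(), [], []
    seen, stack = set(), [final_id]
    while stack:
        ect_id = stack.pop()
        if ect_id in seen:
            continue  # DAG nodes can be reachable via several paths
        seen.add(ect_id)
        ect = ledger.get(ect_id)
        if ect is None:
            missing.append(ect_id)  # provenance is incomplete here
            continue
        agents.add(ect["iss"])
        ext = ect.get("ext", {})
        transformations.extend(ext.get("apae.transformations", []))
        stack.extend(ect.get("par", []))
    return agents, transformations, missing
```

Provenance is complete when `missing` is empty. The `seen` set keeps the walk linear in the number of ECTs even when the DAG has many converging paths.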
# Assurance Profiles {#profiles}
An assurance profile is a named configuration that selects which
mechanisms are required:
| | Relaxed | Standard | Regulated |
|---|---------|----------|-----------|
| **ECT level** | L1 | L2 | L3 |
| **Trust scoring** | Optional | RECOMMENDED | REQUIRED |
| **Trust threshold enforcement** | Optional | RECOMMENDED | REQUIRED |
| **Behavior verification** | Off | Periodic | Continuous |
| **HITL approval gates** | Optional | Critical paths | Mandatory |
| **Data provenance** | Off | Optional | REQUIRED |
| **Checkpoint before consequential** | RECOMMENDED | REQUIRED | REQUIRED |
| **Audit ledger** | Optional | Optional | REQUIRED |
{: #fig-profiles title="Assurance Profiles"}
Relaxed:
: Internal dev/staging. L1 ECTs. Trust and verification
optional. Useful for debugging and observability without
cryptographic overhead.
Standard:
: Production cross-org. L2 ECTs. Trust scoring and thresholds
recommended. Periodic behavior verification. HITL on critical
paths.
Regulated:
: Healthcare, finance, and deployments subject to the EU AI Act.
L3 ECTs with audit ledger. Continuous behavior verification.
All trust mechanisms required. Full provenance chain. Mandatory
HITL gates.
Profiles are declared in ACP-DAG-HITL node constraints:
~~~json
{
"constraints": {
"apae.assurance_profile": "regulated"
}
}
~~~
{: #fig-profile-policy title="Profile as Node Constraint"}
A single deployment MAY use different profiles for different
workflows.
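For implementers, the profile table can be carried as plain configuration data. The following non-normative sketch encodes the table verbatim and adds two hypothetical helpers; the key names are assumptions, and only rows whose cell is literally REQUIRED are treated as required (rows such as behavior verification and HITL gates use other vocabulary and are left for the deployment to interpret).

```python
PROFILES = {
    "relaxed": {
        "ect_level": "L1",
        "trust_scoring": "Optional",
        "trust_threshold_enforcement": "Optional",
        "behavior_verification": "Off",
        "hitl_approval_gates": "Optional",
        "data_provenance": "Off",
        "checkpoint_before_consequential": "RECOMMENDED",
        "audit_ledger": "Optional",
    },
    "standard": {
        "ect_level": "L2",
        "trust_scoring": "RECOMMENDED",
        "trust_threshold_enforcement": "RECOMMENDED",
        "behavior_verification": "Periodic",
        "hitl_approval_gates": "Critical paths",
        "data_provenance": "Optional",
        "checkpoint_before_consequential": "REQUIRED",
        "audit_ledger": "Optional",
    },
    "regulated": {
        "ect_level": "L3",
        "trust_scoring": "REQUIRED",
        "trust_threshold_enforcement": "REQUIRED",
        "behavior_verification": "Continuous",
        "hitl_approval_gates": "Mandatory",
        "data_provenance": "REQUIRED",
        "checkpoint_before_consequential": "REQUIRED",
        "audit_ledger": "REQUIRED",
    },
}


def required_mechanisms(profile: str) -> list:
    """Mechanisms whose table cell is literally REQUIRED."""
    return sorted(name for name, level in PROFILES[profile].items()
                  if level == "REQUIRED")


def missing_mechanisms(profile: str, enabled: set) -> list:
    """REQUIRED mechanisms that a deployment has not enabled."""
    return sorted(set(required_mechanisms(profile)) - set(enabled))
```

A workflow tagged `"apae.assurance_profile": "regulated"` could then be rejected at admission time if `missing_mechanisms` returns a non-empty list.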
# Security Considerations
Trust scores are sensitive metadata. Agents MUST NOT expose full
trust tables. Only pairwise assertions should be shared.
Trust assertion ECTs MUST be signed at L2/L3.
Score manipulation (building trust then exploiting it): mitigated
by double penalties for `policy_violation` and high thresholds for
critical actions.
Sybil attacks (fake agents inflating trust): mitigated by
attenuation ({{trust-assertions}}), hop limits, and requiring
agents to be registered in a trusted directory.
Behavior specifications could be tampered with. Specifications
SHOULD be signed and versioned. Changes MUST be recorded as ECTs.
All trust and verification communications MUST use TLS 1.3.
# IANA Considerations
This document requests registration of `exec_act` values:
- `apae:trust_assertion`
- `apae:trust_revoke`
- `apae:compliance_check`
--- back
# Acknowledgments
{:numbered="false"}
APAE builds on ECT {{I-D.nennemann-wimse-ect}} for interaction
evidence and audit, and ACP-DAG-HITL
{{I-D.nennemann-agent-dag-hitl-safety}} for trust threshold and
assurance profile policy enforcement. The AIMD trust model is
adapted from TCP congestion control.


@@ -0,0 +1,695 @@
---
title: "Assurance Profiles for Agent Ecosystems (APAE)"
abbrev: "APAE"
category: info
docname: draft-apae-assurance-profiles-01
submissiontype: IETF
number:
date:
v: 3
area: "SEC"
workgroup: "Security Dispatch"
keyword:
  - dynamic trust
  - assurance
  - behavior verification
  - data provenance
  - quarantine
author:
  -
    fullname: TBD
    organization: Independent
    email: placeholder@example.com
normative:
  RFC2119:
  RFC8174:
  RFC7519:
  RFC7518:
  RFC9110:
  I-D.nennemann-wimse-ect:
    title: "Execution Context Tokens for Distributed Agentic Workflows"
    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
  I-D.nennemann-agent-dag-hitl-safety:
    title: "Agent Context Policy Token: DAG Delegation with Human Override"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
informative:
  RFC9334:
--- abstract
This document defines Assurance Profiles for Agent Ecosystems
(APAE): dynamic trust scoring, behavior verification, data
provenance, cross-domain trust, and graduated assurance profiles
that allow the same agent ecosystem to operate in relaxed
(dev/K8s) and regulated (healthcare, finance) environments.
Trust events are derived from ECT outcomes. Trust assertions are
ECTs. Behavior verification references ECT claims. Provenance
chains are implicit in the ECT DAG. Assurance profiles select
which combination of these mechanisms is required for a given
deployment, mapping to ECT assurance levels L1/L2/L3. Agents
whose trust falls below a floor are quarantined via a protocol
defined here.
--- middle
# Introduction
Identity verifies who an agent is. ECT records what an agent did.
But neither answers: should I rely on this agent? Is it doing what
it promised? Can I trace where this data came from?
APAE adds four capabilities to the ecosystem:
1. **Dynamic trust scoring** — behavioral reputation that adjusts
based on interaction outcomes (AIMD model).
2. **Behavior verification** — checking agent actions against
declared specifications.
3. **Data provenance** — tracing data lineage through the DAG.
4. **Cross-domain trust** — federating trust across administrative
domains.
These capabilities are bundled into assurance profiles
(Relaxed, Standard, Regulated) that map to ECT assurance levels,
so the same ecosystem works from a dev cluster to a hospital.
# Conventions and Definitions
{::boilerplate bcp14-tagged}
Trust Score:
: A floating-point value in \[0.0, 1.0\] representing one agent's
assessed reliability of another.
Trust Event:
: An interaction outcome that causes a trust score adjustment.
Derived from ECTs.
Trust Domain:
: An administrative boundary within which a single trust anchor
(CA or JWK set) governs agent identity.
Behavior Specification:
: A machine-readable declaration of permitted agent actions and
constraints.
Provenance Chain:
: The sequence of ECT nodes recording how a piece of data was
produced, transformed, and consumed.
Assurance Profile:
: A named configuration selecting which trust, verification, and
provenance mechanisms are required.
Quarantine:
: A state in which an agent's trust score has dropped below a
configured floor; the agent is prohibited from accepting new
delegations.
# Dynamic Trust Scoring {#trust}
## Trust Score Model
Each agent maintains a trust table: peer agent IDs mapped to
trust scores. Initial trust for unknown agents is
deployment-configured (RECOMMENDED: 0.5; zero-trust deployments: 0.1).
Scores update using additive-increase, multiplicative-decrease
(AIMD):
- Positive event: `score = min(1.0, score + alpha)`
- Negative event: `score = max(0.0, score * beta)`
Defaults: `alpha = 0.01`, `beta = 0.8`.
Trust builds slowly (50 successes: 0.5 → 1.0) and drops fast
(one failure: 0.82 → ~0.66).
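The update rule can be sketched in Python as follows. This is a minimal illustration, not a normative algorithm: the function names `on_positive` and `on_negative` are hypothetical, and the constants are the defaults stated above.

```python
ALPHA = 0.01  # additive increase (default from this section)
BETA = 0.8    # multiplicative decrease (default from this section)


def on_positive(score: float, alpha: float = ALPHA) -> float:
    """Additive increase, capped at 1.0."""
    return min(1.0, score + alpha)


def on_negative(score: float, beta: float = BETA) -> float:
    """Multiplicative decrease, floored at 0.0."""
    return max(0.0, score * beta)


# Start at the recommended initial trust and apply 50 positive events.
score = 0.5
for _ in range(50):
    score = on_positive(score)

print(round(score, 6))              # score after 50 successes
print(round(on_negative(0.82), 3))  # score after one failure from 0.82
```

The asymmetry is the point of AIMD: each success adds a fixed step, while each failure removes a fixed fraction of the accumulated score.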
## Trust Events from ECT {#trust-events}
Trust events are derived from ECTs rather than agent-local
counters, making trust computation auditable:
| ECT condition | Event | Adjustment |
|--------------|-------|------------|
| Completed, no error follows | `task_success` | `score + alpha` |
| Completed, partial result | `task_partial` | `score + 0.5 * alpha` |
| `atd:error` referencing agent | `task_failure` | `score * beta` |
| No response within threshold | `task_timeout` | `score * beta` |
| `atd:error` with `constraint_violation` | `policy_violation` | `score * beta^2` |
| ECT signature verification fails | `attestation_invalid` | `score * beta^2` |
| `atd:rollback_request` targeting agent | `rollback_triggered` | `score * beta` |
{: #fig-events title="Trust Events from ECT"}
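One way to read the table is as a lookup from event name to an AIMD step. The sketch below assumes that positive events add a multiple of `alpha` and that negative events multiply the score by `beta` once or twice. The `ADJUSTMENTS` table layout and the `apply_event` name are hypothetical, chosen only for illustration.

```python
# Event name -> (kind, factor). "inc" adds factor * alpha to the score;
# "dec" multiplies the score by beta ** factor, so factor 2 is beta^2.
ADJUSTMENTS = {
    "task_success": ("inc", 1.0),
    "task_partial": ("inc", 0.5),
    "task_failure": ("dec", 1),
    "task_timeout": ("dec", 1),
    "policy_violation": ("dec", 2),
    "attestation_invalid": ("dec", 2),
    "rollback_triggered": ("dec", 1),
}


def apply_event(score: float, event: str,
                alpha: float = 0.01, beta: float = 0.8) -> float:
    """Return the new trust score after one trust event, clamped to [0, 1]."""
    kind, factor = ADJUSTMENTS[event]
    if kind == "inc":
        return min(1.0, score + factor * alpha)
    return max(0.0, score * beta ** factor)
```

Keeping the mapping in data rather than in branching code makes it easier to audit against the table.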
## Trust Decay
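If no interaction occurs for a configurable period (default:
7 days), a score above the initial default decays toward it:
`score = max(initial_default, score - 0.01/day)`. Scores at or
below the initial default are not raised by decay.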
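A possible reading of the decay rule is sketched below. Two points are assumptions, not statements from this section: decay is counted only for idle days beyond the grace period, and scores at or below the initial default are left unchanged so that decay never raises a penalized score. The function and parameter names are hypothetical.

```python
def decay(score: float, idle_days: float, initial_default: float = 0.5,
          grace_days: float = 7.0, rate_per_day: float = 0.01) -> float:
    """Apply inactivity decay once the grace period has elapsed."""
    # Assumption: no decay inside the grace period, and never raise a
    # score that is already at or below the initial default.
    if idle_days <= grace_days or score <= initial_default:
        return score
    decayed = score - rate_per_day * (idle_days - grace_days)
    return max(initial_default, decayed)
```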
## Trust Assertions as ECT {#trust-assertions}
Agent A shares its trust assessment via a trust assertion ECT:
- `exec_act`: `"apae:trust_assertion"`
~~~json
{
"exec_act": "apae:trust_assertion",
"ext": {
"apae.subject": "spiffe://example.com/agent/b",
"apae.trust_score": 0.82,
"apae.interactions": 147,
"apae.confidence": "high",
"apae.hops": 0
}
}
~~~
{: #fig-assertion title="Trust Assertion ECT"}
Confidence: `low` (<10 interactions), `medium` (10-99),
`high` (100+).
Trust assertion ECTs MUST be signed at L2/L3.
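The confidence buckets and the extension claims can be assembled as in the following sketch. The helper names `confidence_label` and `trust_assertion_ext` are hypothetical. Signing and the remaining ECT claims are out of scope here.

```python
def confidence_label(interactions: int) -> str:
    """Map an interaction count to the confidence buckets listed above."""
    if interactions < 10:
        return "low"
    if interactions < 100:
        return "medium"
    return "high"


def trust_assertion_ext(subject: str, score: float, interactions: int,
                        hops: int = 0) -> dict:
    """Build the `ext` claims of a trust assertion ECT (unsigned)."""
    return {
        "apae.subject": subject,
        "apae.trust_score": score,
        "apae.interactions": interactions,
        "apae.confidence": confidence_label(interactions),
        "apae.hops": hops,
    }
```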
## Trust Propagation
When Agent C receives A's assertion about B:
~~~
c_score_for_b = max(c_score_for_b,
a_score * trust_of_a * attenuation)
~~~
Default `attenuation`: 0.5. Direct observations always take
precedence. `apae.hops` tracks propagation depth; agents MUST NOT
propagate beyond their configured maximum (default: 1).
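The merge step might look like the sketch below. It makes two assumptions: "direct observations take precedence" is read as "ignore assertions once C has its own observations of B", and an incoming assertion is used only if its `apae.hops` value is below the configured maximum. The function name `merge_assertion` is hypothetical.

```python
def merge_assertion(c_score_for_b: float, a_score: float, trust_of_a: float,
                    hops: int, has_direct_observation: bool,
                    attenuation: float = 0.5, max_hops: int = 1) -> float:
    """Fold A's assertion about B into C's trust table entry for B."""
    # Assumption: direct observations win, and assertions that have
    # already travelled max_hops are ignored.
    if has_direct_observation or hops >= max_hops:
        return c_score_for_b
    return max(c_score_for_b, a_score * trust_of_a * attenuation)
```

Because of the `max`, a second-hand assertion can raise C's score for B but never lower it.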
## Trust Thresholds as Policy
Trust thresholds are ACP-DAG-HITL node constraints:
~~~json
{
"constraints": {
"apae.min_trust": 0.7,
"apae.min_confidence": "medium"
}
}
~~~
{: #fig-threshold title="Trust Threshold as Node Constraint"}
Requests from agents below threshold MUST be denied with HTTP 403.
The `apae.peer_trust_score` input referenced by HITL rules is a
runtime context value derived from the trusting agent's trust
table for the requesting peer; it is not an ECT claim itself.
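Threshold enforcement can be sketched as a single admission check. The sketch assumes that a peer whose confidence is below `apae.min_confidence` is treated the same as one below `apae.min_trust`. The function name and the `CONFIDENCE_RANK` ordering table are hypothetical.

```python
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def admission_status(peer_score: float, peer_confidence: str,
                     constraints: dict) -> int:
    """Return the HTTP status for a request from a peer: 200 or 403."""
    min_trust = constraints.get("apae.min_trust", 0.0)
    min_conf = constraints.get("apae.min_confidence", "low")
    if peer_score < min_trust:
        return 403
    if CONFIDENCE_RANK[peer_confidence] < CONFIDENCE_RANK[min_conf]:
        return 403
    return 200
```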
Low trust can trigger HITL escalation:
~~~json
{
"id": "r-low-trust",
"trigger": {
"kind": "confidence_below",
"op": "lt",
"value": 0.5,
"input_ref": "apae.peer_trust_score"
},
"required_role": "operator:security",
"action": "escalate",
"allow_override": true,
"override_action": "continue"
}
~~~
{: #fig-trust-hitl title="HITL Rule for Low Trust"}
## Automatic Revocation
When trust drops below a floor (default: 0.2), the trusting agent
SHOULD revoke delegations and emit:
`exec_act: "apae:trust_revoke"`.
# Quarantine Protocol {#quarantine}
When a trust score drops below the configured quarantine floor
(default: 0.15), the agent enters quarantine.
## Quarantine Entry
The detecting agent MUST emit a quarantine ECT:
~~~json
{
"exec_act": "apae:quarantine",
"ext": {
"apae.subject": "spiffe://example.com/agent/b",
"apae.score": 0.12,
"apae.threshold": 0.15,
"apae.quarantine_until": "2026-03-02T12:00:00Z",
"apae.reason": "Repeated policy_violation events (3 in 1 hour)"
}
}
~~~
{: #fig-quarantine title="Quarantine ECT"}
The quarantine ECT MUST be broadcast to all agents that have
received trust assertions about the quarantined agent
(via `apae:trust_assertion` with matching `apae.subject`).
## Quarantined Agent Behavior
While quarantined, a subject agent:
- MUST NOT accept new delegations. New delegation requests MUST
be answered with HTTP 503 and a `Retry-After` header carrying the
`apae.quarantine_until` time, expressed as an HTTP-date or as a
delay in seconds.
- MUST complete in-progress workflows (drain behavior per AEPB).
- MAY accept direct operator commands (HITL Level 4 is unaffected).
Agents receiving the quarantine notification MUST update their
trust table and MUST NOT delegate new tasks to the quarantined
agent until the quarantine expires or is lifted.
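`Retry-After` carries either an HTTP-date or a number of seconds, so the RFC 3339 value in `apae.quarantine_until` needs converting before it goes into the header. A minimal sketch, using only the Python standard library (the function name is hypothetical):

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def retry_after_http_date(quarantine_until: str) -> str:
    """Convert an RFC 3339 UTC timestamp into an HTTP-date string."""
    # Replace the trailing "Z" so older Python versions can parse the value.
    moment = datetime.fromisoformat(quarantine_until.replace("Z", "+00:00"))
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


print(retry_after_http_date("2026-03-02T12:00:00Z"))
# prints: Mon, 02 Mar 2026 12:00:00 GMT
```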
## Quarantine Duration
The default quarantine duration is 1 hour, doubling on each
successive quarantine entry:
| Quarantine count | Duration |
|-----------------|---------|
| 1 | 1 hour |
| 2 | 2 hours |
| 3 | 4 hours |
| n | 2^(n-1) hours (max 168 hours / 7 days) |
{: #fig-quarantine-duration title="Quarantine Duration Escalation"}
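The escalation in the table reduces to a capped power of two. A small sketch, with a hypothetical function name:

```python
def quarantine_hours(count: int, base_hours: int = 1,
                     cap_hours: int = 168) -> int:
    """Duration of the n-th quarantine: 2^(n-1) hours, capped at 7 days."""
    if count < 1:
        raise ValueError("quarantine count starts at 1")
    return min(cap_hours, base_hours * 2 ** (count - 1))
```

With the defaults, the cap is reached at the ninth quarantine, since 2^8 = 256 exceeds 168.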
## Quarantine Expiry and Recovery
When the quarantine period expires:
1. The agent's trust score is reset to the initial default
(deployment-configured; RECOMMENDED: 0.5 for recovery).
2. The agent transitions back to active status per AEPB lifecycle.
3. A recovery ECT MAY be emitted: `exec_act: "apae:quarantine"` with
`apae.to_state: "active"`.
An operator MAY lift a quarantine early by issuing a HITL override
(Level 1 or higher) with scope `apae:quarantine_lift` for the
subject agent.
# Behavior Verification {#behavior}
## Behavior Specifications
A behavior specification declares what an agent is permitted to do.
Specifications are JSON documents referencing ECT claims:
~~~json
{
"spec_version": "1.0",
"agent_id": "spiffe://example.com/agent/firewall",
"allowed_actions": ["update_rules", "read_config", "report"],
"constraints": {
"max_actions_per_minute": 60,
"forbidden_targets": ["core-router-*"],
"require_checkpoint_before": ["update_rules"]
},
"verification_frequency": "continuous"
}
~~~
{: #fig-spec title="Behavior Specification"}
Behavior specifications SHOULD be signed and versioned. Changes
MUST be recorded as ECTs.
## Verification Against ECT Stream
A verifier monitors the agent's ECT stream and checks:
1. `exec_act` values are in `allowed_actions`.
2. Action rate does not exceed `max_actions_per_minute` (computed
from `iat` timestamps).
3. `atd:checkpoint` ECTs precede `update_rules` ECTs (from
`require_checkpoint_before`).
4. Targets in `ext` claims do not match `forbidden_targets`.
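The four checks can be sketched as one pass over the ECT stream. Several details are assumptions made for illustration: ECTs are plain dicts with `jti`, `exec_act`, `iat` (seconds), and `ext`; the target is carried in a hypothetical `ext["target"]` field; namespaced ecosystem ECTs such as `atd:checkpoint` are not checked against `allowed_actions`; and a checkpoint must fall within `checkpoint_window` seconds before the action.

```python
from fnmatch import fnmatchcase


def check_stream(ects: list, spec: dict, checkpoint_window: int = 10) -> list:
    """Return violation records for an agent's ECT stream."""
    limits = spec["constraints"]
    actions = [e for e in ects if ":" not in e["exec_act"]]
    violations = []
    last_checkpoint = None
    for ect in sorted(ects, key=lambda e: e["iat"]):
        act, iat, jti = ect["exec_act"], ect["iat"], ect["jti"]
        if act == "atd:checkpoint":
            last_checkpoint = iat
            continue
        if ":" in act:
            continue  # assumption: other namespaced ECTs are out of scope
        if act not in spec["allowed_actions"]:
            violations.append({"rule": "allowed_actions", "ect": jti})
        # Rate check: actions in the 60 seconds ending at this ECT.
        recent = sum(1 for e in actions if iat - 60 < e["iat"] <= iat)
        if recent > limits["max_actions_per_minute"]:
            violations.append({"rule": "max_actions_per_minute", "ect": jti})
        if act in limits["require_checkpoint_before"] and (
                last_checkpoint is None
                or iat - last_checkpoint > checkpoint_window):
            violations.append({"rule": "require_checkpoint_before", "ect": jti})
        target = ect.get("ext", {}).get("target", "")
        if any(fnmatchcase(target, p) for p in limits["forbidden_targets"]):
            violations.append({"rule": "forbidden_targets", "ect": jti})
    return violations
```

The returned records have the same general shape as the `apae.violations` entries, so they could be copied into a compliance check ECT.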
Verification results are ECTs:
- `exec_act`: `"apae:compliance_check"`
~~~json
{
"exec_act": "apae:compliance_check",
"par": ["latest-agent-ect-uuid"],
"ext": {
"apae.compliance_status": "passing",
"apae.violations": [],
"apae.spec_version": "1.0",
"apae.window": "2026-03-01T12:00:00Z/PT1H"
}
}
~~~
{: #fig-compliance title="Compliance Check ECT"}
Violations trigger trust score decreases (`policy_violation` event)
and MAY trigger HITL escalation.
A violation compliance check ECT looks like:
~~~json
{
"exec_act": "apae:compliance_check",
"par": ["offending-ect-uuid"],
"ext": {
"apae.compliance_status": "failing",
"apae.violations": [
{
"rule": "require_checkpoint_before",
"action": "update_rules",
"ect": "offending-ect-uuid",
"description": "update_rules at 12:03:15 has no preceding atd:checkpoint within 10s"
}
],
"apae.spec_version": "1.0",
"apae.window": "2026-03-01T12:00:00Z/PT1H"
}
}
~~~
{: #fig-violation title="Compliance Violation ECT"}
# Data Provenance {#provenance}
## DAG as Provenance Chain
The ECT DAG already encodes data provenance: each ECT's `par`
references show which prior tasks produced its inputs. The
`inp_hash` and `out_hash` claims prove what was processed without
revealing the data.
For deployments requiring explicit provenance metadata, agents
MAY include:
~~~json
{
"ext": {
"apae.data_source": "database:patients",
"apae.data_classification": "pii",
"apae.retention_days": 365,
"apae.transformations": ["anonymize", "aggregate"]
}
}
~~~
{: #fig-provenance title="Provenance Extension Claims"}
At Regulated assurance level, all data-transforming ECT nodes
MUST include provenance claims.
## Provenance Queries
At L3, the audit ledger enables provenance queries:
- "Which agents touched this data?" → walk `par` chain from
final ECT to roots.
- "Was this data transformed?" → check `apae.transformations`
along the chain.
- "Is provenance complete?" → verify all `par` references
resolve to ledger entries.
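All three queries can be answered by one traversal of the `par` references. The sketch below assumes the ledger is available as a dict keyed by `jti` and that the `iss` claim identifies the agent. Both are illustrative assumptions, as is the function name.

```python
def walk_provenance(ledger: dict, final_jti: str):
    """Walk `par` links from the final ECT back to the roots.

    Returns (agents, transformations, complete).
    """
    agents, transformations = set(), []
    complete = True
    seen, stack = set(), [final_jti]
    while stack:
        jti = stack.pop()
        if jti in seen:
            continue
        seen.add(jti)
        ect = ledger.get(jti)
        if ect is None:
            complete = False  # a `par` reference that does not resolve
            continue
        agents.add(ect["iss"])
        ext = ect.get("ext", {})
        transformations.extend(ext.get("apae.transformations", []))
        stack.extend(ect.get("par", []))
    return agents, transformations, complete
```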
# Cross-Domain Trust {#cross-domain}
## Trust Domain Basics
A trust domain is an administrative boundary within which a
single trust anchor (CA certificate or JWK set) governs agent
identity. Trust scores are local to a trust domain by default.
## Trust Domain Registration
Each trust domain MUST publish a trust anchor at a well-known URI:
~~~
GET /.well-known/apae/trust-anchor HTTP/1.1
~~~
The response MUST be a JSON object containing:
~~~json
{
"domain": "example.com",
"trust_anchor_type": "jwks",
"trust_anchor_uri": "https://example.com/.well-known/jwks.json",
"contact": "trust-admin@example.com"
}
~~~
{: #fig-trust-anchor title="Trust Anchor Document"}
## Cross-Domain Delegation
When Agent A (domain X) delegates to Agent B (domain Y):
1. A MUST verify that its ACP-DAG-HITL policy permits cross-domain
delegation to domain Y (bilateral trust agreement).
2. A fetches B's trust anchor document to verify B's identity.
3. A creates an `apae:cross_domain_assertion` ECT linking the
two domains.
4. Both A and B include their domain in ECT `iss` claims.
~~~json
{
"exec_act": "apae:cross_domain_assertion",
"ext": {
"apae.source_domain": "example.com",
"apae.dest_domain": "hospital.example",
"apae.bilateral_agreement_ref": "agreement-id-2026-001",
"apae.min_assurance": "L2"
}
}
~~~
{: #fig-cross-domain title="Cross-Domain Assertion ECT"}
The diagram below illustrates a cross-domain delegation:
~~~
Domain: example.com         Domain: hospital.example
┌──────────────────┐        ┌──────────────────────┐
│ Agent A          │  AEPB  │ Agent B              │
│ (orchestrator)   ├───────►│ (treatment planner)  │
│ ECT: L2          │        │ ECT: L3              │
└──────────────────┘        └──────────────────────┘
        │                                 │
        └─── cross_domain_assertion ECT ──┘
          (bilateral agreement verified)
~~~
{: #fig-cross-domain-diag title="Cross-Domain Delegation"}
## Cross-Domain Trust Scores
Trust scores do not transfer across domain boundaries by default.
When Agent A in domain X has no prior interactions with Agent B
in domain Y:
- If a bilateral trust agreement exists: initial trust is set to
the agreement's `default_trust` value (negotiated out of band).
- If no agreement exists: delegation MUST be rejected (zero-trust
default).
Cross-domain trust scores are isolated from intra-domain scores
and are stored separately in the trust table.
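The first-contact rule can be sketched as follows. The agreement is modeled as a dict carrying `default_trust`, and the exception type and function name are hypothetical. How agreements are represented and verified is not defined in this section.

```python
from typing import Optional


class DelegationRejected(Exception):
    """Raised when no bilateral trust agreement covers the peer domain."""


def initial_cross_domain_trust(agreement: Optional[dict]) -> float:
    """Initial trust for a peer in another domain with no prior interactions."""
    if agreement is None:
        # Zero-trust default: no agreement means no delegation.
        raise DelegationRejected("no bilateral trust agreement")
    return agreement["default_trust"]
```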
# Assurance Profiles {#profiles}
## Profile Definitions
An assurance profile is a named configuration that selects which
mechanisms are required. Profiles MUST be declared in ACP-DAG-HITL
workflow policy and announced in the AEPB capability document.
| Mechanism | Relaxed | Standard | Regulated |
|-----------|---------|----------|-----------|
| **ECT level** | L1 | L2 | L3 |
| **Trust scoring** | Optional | RECOMMENDED | REQUIRED |
| **Trust threshold enforcement** | Optional | RECOMMENDED | REQUIRED |
| **Behavior verification** | Off | Periodic | Continuous |
| **HITL approval gates** | Optional | Critical paths | Mandatory |
| **Data provenance claims** | Off | Optional | REQUIRED |
| **Checkpoint before consequential** | RECOMMENDED | REQUIRED | REQUIRED |
| **Audit ledger** | Optional | Optional | REQUIRED |
| **Quarantine protocol** | Optional | RECOMMENDED | REQUIRED |
| **Cross-domain trust agreements** | Optional | Required if cross-domain | Required if cross-domain |
{: #fig-profiles title="Assurance Profile Requirements"}
Relaxed:
: Internal dev/staging. L1 ECTs. Trust and verification
optional. Useful for debugging and observability without
cryptographic overhead.
Standard:
: Production cross-org. L2 ECTs. Trust scoring and thresholds
recommended. Periodic behavior verification. HITL on critical
paths.
Regulated:
: Healthcare, finance, EU AI Act. L3 ECTs with audit ledger.
Continuous behavior verification. All trust mechanisms
required. Full provenance chain. Mandatory HITL gates.
Profiles are declared in ACP-DAG-HITL node constraints:
~~~json
{
"constraints": {
"apae.assurance_profile": "regulated"
}
}
~~~
{: #fig-profile-policy title="Profile as Node Constraint"}
A single deployment MAY use different profiles for different
workflows.
## Profile Selection Guidance
Operators SHOULD select profiles using the following decision
table:
| Deployment context | Recommended profile |
|-------------------|--------------------|
| Unit tests, local development | Relaxed |
| Internal production (single org) | Standard |
| Cross-organization production | Standard (with trust agreements) |
| Financial services, EU AI Act critical | Regulated |
| Healthcare (HIPAA, clinical trials) | Regulated |
| Critical infrastructure (NIS2) | Regulated |
{: #fig-profile-selection title="Profile Selection Guidance"}
## Upgrade Path Between Profiles
Operators MUST NOT downgrade the assurance profile during an
active workflow.
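The no-downgrade rule can be enforced with an ordering over the three profiles. A minimal sketch, where `PROFILE_RANK` and `may_switch_profile` are hypothetical names:

```python
PROFILE_RANK = {"relaxed": 1, "standard": 2, "regulated": 3}


def may_switch_profile(current: str, requested: str,
                       workflow_active: bool) -> bool:
    """Allow upgrades at any time; block downgrades while a workflow runs."""
    is_downgrade = PROFILE_RANK[requested] < PROFILE_RANK[current]
    return not (workflow_active and is_downgrade)
```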
Relaxed → Standard:
: (1) Add ECT signing keys (WIMSE WIT or X.509). (2) Update ECT
emission to sign tokens. (3) Configure trust scoring
(alpha/beta, initial trust, thresholds). (4) Define behavior
specifications for critical agents. (5) Add HITL approval gates
on critical DAG paths.
Standard → Regulated:
: (1) Configure audit ledger endpoint. (2) Update ECT emission
to commit each ECT to ledger. (3) Enable continuous behavior
verification (change `verification_frequency` from `periodic`
to `continuous`). (4) Enable provenance claims on all
data-transforming ECTs. (5) Add mandatory HITL gates on all
consequential actions. (6) Enable quarantine protocol.
# Security Considerations
## Trust Score Sensitivity
Trust scores are sensitive metadata. Agents MUST NOT expose
full trust tables. Only pairwise assertions SHOULD be shared,
and only in response to explicit authenticated requests.
## Score Inflation (Adversarial Trust Building)
An adversary performs many small successful interactions to
inflate trust, then executes a malicious action. Mitigation:
- Apply double penalty (`beta^2`) for `policy_violation` events.
- Enforce high trust thresholds for high-risk actions.
- Rate-limit trust score increases: an agent MUST NOT increase
trust by more than 0.1 per day toward any single peer.
- Run behavior verification at Standard and above (periodic at
Standard, continuous at Regulated).
## Attestation Freshness
Stale compliance check ECTs MUST be rejected. The verifier MUST
check that `apae:compliance_check` ECTs have `iat` within the
configured verification window (default: 1 hour for Standard,
5 minutes for Regulated).
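A freshness check along these lines might look like the sketch below. Rejecting `iat` values that lie in the future is an added assumption, and the names are hypothetical.

```python
FRESHNESS_WINDOW_SECONDS = {"standard": 3600, "regulated": 300}


def is_fresh(iat: int, now: int, profile: str) -> bool:
    """True if a compliance check ECT is inside the verification window."""
    age = now - iat
    # Assumption: an ECT issued "in the future" is also rejected.
    return 0 <= age <= FRESHNESS_WINDOW_SECONDS[profile]
```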
## Provenance Chain Forgery
Each provenance hop must be signed (L2+) to prevent injection
of false provenance records. Agents MUST verify the signature
on all `par`-linked ECTs before accepting provenance claims.
## Sybil Attack on Trust
Fake agents inflate trust for each other to gain influence.
Mitigation:
- Trust propagation attenuation (default 0.5) limits the impact
of second-hand assertions.
- Maximum hop count of 1 for trust propagation.
- Require agents to be registered in a trusted directory before
initial trust is assigned above the floor value.
## Cross-Domain Trust Downgrade
An attacker forces delegation through an untrusted domain by
presenting a forged bilateral agreement. Mitigation:
- Bilateral trust agreements MUST be signed by operators of
both domains.
- Agents MUST verify the agreement signature before accepting
cross-domain delegations.
- Cross-domain ECTs MUST use L2+ assurance.
## Quarantine Evasion
An agent subject to quarantine re-registers under a different
identity to escape the quarantine. Mitigation:
- Quarantine ECTs are broadcast; receiving agents record the
quarantine by both agent ID and behavioral fingerprint.
- Agents SHOULD require re-onboarding with operator approval
before accepting new identities from known-quarantined domains.
# IANA Considerations
## Assurance Profile Registry
This document requests the creation of the "APAE Assurance Profile
Registry" under IANA. Registration policy: Specification Required.
Initial entries:
| Profile Name | Profile URI | Description | Reference |
|-------------|------------|-------------|-----------|
| Relaxed | `urn:ietf:params:apae:profile:relaxed` | Dev/test, L1 ECTs | This document |
| Standard | `urn:ietf:params:apae:profile:standard` | Production, L2 ECTs | This document |
| Regulated | `urn:ietf:params:apae:profile:regulated` | Regulated, L3 ECTs | This document |
{: #fig-profile-registry title="Assurance Profile Registry"}
## `exec_act` Values
This document requests registration in the AEM Ecosystem
Extension Registry:
| Value | Description | Reference |
|-------|-------------|-----------|
| `apae:trust_assertion` | Sharing trust score for a peer | This document |
| `apae:trust_revoke` | Revoking delegations due to low trust | This document |
| `apae:compliance_check` | Behavior verification result | This document |
| `apae:quarantine` | Agent quarantine entry or exit | This document |
| `apae:cross_domain_assertion` | Cross-domain delegation evidence | This document |
{: #fig-iana-actions title="APAE exec_act Registrations"}
## Well-Known URI
This document requests registration of `apae/trust-anchor` as a
well-known URI suffix per RFC 8615 for trust domain anchor
publication.
--- back
# Acknowledgments
{:numbered="false"}
APAE builds on ECT {{I-D.nennemann-wimse-ect}} for interaction
evidence and audit, and ACP-DAG-HITL
{{I-D.nennemann-agent-dag-hitl-safety}} for trust threshold and
assurance profile policy enforcement. The AIMD trust model is
adapted from TCP congestion control (RFC 5681). Behavior
verification is informed by RATS architecture {{RFC9334}}.


@@ -0,0 +1,372 @@
---
title: "Human Emergency Override Protocol (HEOP)"
abbrev: "HEOP"
category: std
docname: draft-heop-human-emergency-override-00
submissiontype: IETF
number:
date:
v: 3
area: "SEC"
workgroup: "Security Dispatch"
keyword:
- human override
- emergency stop
- agentic workflows
- HITL
- execution context
author:
-
fullname: Generated by IETF Draft Analyzer
organization: Independent
email: placeholder@example.com
normative:
RFC7519:
RFC7515:
RFC9110:
RFC8615:
I-D.nennemann-wimse-ect:
title: "Execution Context Tokens for Distributed Agentic Workflows"
target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
I-D.nennemann-agent-dag-hitl-safety:
title: "Agent Context Policy Token: DAG Delegation with Human Override"
target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
informative:
--- abstract
This document defines the Human Emergency Override Protocol (HEOP),
the runtime enforcement mechanism for human intervention in
autonomous AI agent operations. HEOP is the "how" to ACP-DAG-HITL's
"when": where the Agent Context Policy Token defines conditions
that require human decision, HEOP defines the wire protocol for
override commands, agent compliance, and acknowledgment. HEOP
specifies four override levels (pause, constrain, stop, takeover),
a mandatory agent compliance endpoint, and records every override
as an ECT DAG node for tamper-evident audit. Override levels map
directly to ACP-DAG-HITL actions.
--- middle
# Introduction
As AI agents gain autonomy in critical infrastructure, the ability
for humans to intervene quickly and reliably becomes essential.
The current ratio of autonomous capability drafts to human
oversight drafts in the IETF is roughly 7:1.
The Agent Context Policy Token
{{I-D.nennemann-agent-dag-hitl-safety}} defines a policy language
for human-in-the-loop safety: trigger conditions, required roles,
and permitted actions (`pause`, `escalate`, `abort`). But it does
not define the runtime protocol for how overrides are transmitted to
agents, how agents acknowledge them, or how the intervention is
recorded. HEOP fills this gap.
HEOP draws from industrial safety: the emergency stop button on
factory equipment, the circuit breaker in electrical systems, the
kill switch in robotics. The override mechanism must be simpler
and more reliable than the system it controls.
Every override command and acknowledgment is recorded as an ECT
{{I-D.nennemann-wimse-ect}}, linking into the workflow DAG. At
L3, this provides the tamper-evident audit trail that regulated
environments (FDA, MiFID II, EU AI Act) require for human
intervention records.
# Conventions and Definitions
{::boilerplate bcp14-tagged}
Override:
: A human-initiated command that alters an agent's autonomous
operation, taking precedence over the agent's own decision-making.
Operator:
: A human user authorized to issue override commands, corresponding
to a `required_role` in ACP-DAG-HITL policy.
Override Level:
: One of four escalating intervention types, each with
deterministic agent behavior requirements.
# Mapping to ACP-DAG-HITL Actions {#mapping}
HEOP override levels are the runtime realization of ACP-DAG-HITL
actions:
| ACP-DAG-HITL action | HEOP Level | Behavior |
|---------------------|------------|----------|
| `pause` | 1 (PAUSE) | Suspend autonomous actions, hold state |
| (no equivalent) | 2 (CONSTRAIN) | Restrict to allowed action subset |
| `abort` | 3 (STOP) | Cease all actions, enter inert state |
| `escalate` | 4 (TAKEOVER) | Transfer control to human operator |
{: #fig-mapping title="ACP-DAG-HITL to HEOP Mapping"}
Level 2 (CONSTRAIN) extends beyond ACP-DAG-HITL's current action
vocabulary. When a HITL rule triggers with `action: "pause"` and
`override_action: "continue"`, the operator MAY continue with
HEOP Level 2 constraints rather than full resumption.
# Override Levels {#levels}
## Level 1 -- PAUSE
The agent MUST suspend all autonomous actions and hold its current
state. It MUST NOT initiate new actions but MAY complete
in-progress actions if stopping mid-execution would cause harm.
The agent resumes when a RESUME command is received.
## Level 2 -- CONSTRAIN
The agent MUST restrict its actions to a specified subset defined
in the override command. The agent MUST reject any action not on
the allowlist.
## Level 3 -- STOP
The agent MUST immediately cease all autonomous actions, abandon
in-progress actions where safe, and enter an inert state. It
MUST NOT act until explicitly restarted. This is the e-stop.
## Level 4 -- TAKEOVER
The agent MUST transfer operational control to the human operator,
entering pass-through mode where it executes only explicit operator
commands. The agent's sensors and outputs remain available to the
operator as tools.
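The four levels can be summarized as a gate that an agent consults before initiating any new action. The sketch below covers only new actions, since in-progress work is handled per level as described above. The function and parameter names are hypothetical.

```python
from typing import Optional, Sequence


def action_permitted(level: Optional[int], action: str,
                     allowlist: Sequence[str] = (),
                     operator_command: bool = False) -> bool:
    """Decide whether a new action may start under the active override."""
    if level is None:          # no override active
        return True
    if level in (1, 3):        # PAUSE and STOP: no new actions
        return False
    if level == 2:             # CONSTRAIN: allowlisted actions only
        return action in allowlist
    if level == 4:             # TAKEOVER: explicit operator commands only
        return operator_command
    raise ValueError("unknown override level")
```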
# Override Command Format {#command-format}
Override commands are HTTP POST requests to the agent's well-known
endpoint, carrying an ECT in the Execution-Context header:
~~~
POST /.well-known/heop/override HTTP/1.1
Content-Type: application/json
Authorization: Bearer <operator-jwt>
Execution-Context: <override-ECT>

{
  "override_id": "urn:uuid:...",
  "level": 3,
  "reason": "Agent blocking legitimate traffic",
  "operator_id": "spiffe://example.com/human/alice",
  "scope": "*",
  "constraints": null,
  "ttl": null
}
~~~
{: #fig-override title="Override Command"}
Field definitions:
`level`:
: Integer 1-4. MUST be present.
`reason`:
: Human-readable text. MUST be present and logged.
`scope`:
: Which agent functions to override. `"*"` means all. MAY be a
list of function identifiers for partial overrides.
`constraints`:
: For Level 2 only. JSON array of permitted action types, e.g.,
`["read", "monitor", "report"]`.
`ttl`:
: Optional duration in seconds. If set, the override expires
automatically and the agent resumes its prior mode.
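An agent might validate an incoming command body along the following lines. This is a sketch of the field rules listed above, not a complete validator: the error strings and function name are hypothetical, and requiring a non-empty `constraints` array at Level 2 is an assumption.

```python
def validate_override(cmd: dict) -> list:
    """Return a list of problems found in an override command body."""
    errors = []
    level = cmd.get("level")
    if (not isinstance(level, int) or isinstance(level, bool)
            or not 1 <= level <= 4):
        errors.append("level must be an integer 1-4")
    if not cmd.get("reason"):
        errors.append("reason must be present")
    constraints = cmd.get("constraints")
    if level == 2 and not (isinstance(constraints, list) and constraints):
        errors.append("level 2 requires a constraints list")
    if level != 2 and constraints is not None:
        errors.append("constraints apply to level 2 only")
    ttl = cmd.get("ttl")
    if ttl is not None and (not isinstance(ttl, (int, float)) or ttl <= 0):
        errors.append("ttl must be a positive number of seconds")
    return errors
```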
## Resume and Lift
~~~
POST /.well-known/heop/resume HTTP/1.1

{"override_id": "urn:uuid:...", "operator_id": "..."}

POST /.well-known/heop/lift HTTP/1.1

{"override_id": "urn:uuid:...", "operator_id": "..."}
~~~
{: #fig-resume title="Resume and Lift Commands"}
# ECT Integration {#ect-integration}
## Override ECT
The operator (or operator's tooling) MUST produce an ECT for
every override command:
- `exec_act`: `"heop:override"`
- `par`: the `jti` of the HITL trigger ECT (if the override was
triggered by ACP-DAG-HITL policy) or empty (if manually
initiated)
~~~json
{
"ext": {
"heop.level": 3,
"heop.reason": "Agent blocking legitimate traffic",
"heop.operator_id": "spiffe://example.com/human/alice",
"heop.scope": "*"
}
}
~~~
{: #fig-override-ect title="Override ECT Extension Claims"}
## Acknowledgment ECT
The agent MUST produce an acknowledgment ECT:
- `exec_act`: `"heop:ack"`
- `par`: the `jti` of the override ECT
~~~json
{
"ext": {
"heop.status": "accepted",
"heop.prior_state": "autonomous",
"heop.current_state": "stopped",
"heop.effective_at": "2026-03-01T12:00:00.123Z"
}
}
~~~
{: #fig-ack-ect title="Acknowledgment ECT Extension Claims"}
## Decision Record Alignment
The override/ack ECT pair serves as the ACP-DAG-HITL Decision
Record {{I-D.nennemann-agent-dag-hitl-safety}}. The required
Decision Record fields map as follows:
| Decision Record field | ECT source |
|----------------------|------------|
| `decision_id` | Override ECT `jti` |
| `token_jti` | HITL trigger ECT `jti` (from `par`) |
| `rule_ids` | From HITL trigger context |
| `human_id` | `heop.operator_id` |
| `human_role` | From operator JWT claims |
| `decision` | Derived from `heop.level` |
| `time` | Override ECT `iat` |
{: #fig-decision-record title="Decision Record Mapping"}
At L3, both ECTs are recorded in the audit ledger, providing a
tamper-evident record of every human intervention.
# Agent Compliance Requirements {#compliance}
Every HEOP-compliant agent MUST:
1. Implement the `/.well-known/heop/override` endpoint.
2. Process override commands within 1 second of receipt. The
override path MUST be independent of the agent's main
processing loop.
3. Produce an acknowledgment ECT for every override.
4. If the agent cannot fully comply (e.g., hardware limitation),
it MUST respond with `heop.status`: `"partial"` and a
description. An agent MUST NOT respond with `"rejected"`.
5. Expose current override status at:
~~~
GET /.well-known/heop/status
~~~
Response:
~~~json
{
"agent_id": "spiffe://example.com/agent/firewall-mgr",
"override_active": true,
"current_level": 3,
"override_ect_jti": "550e8400-e29b-41d4-a716-446655440055",
"since": "2026-03-01T12:00:00Z",
"operator_id": "spiffe://example.com/human/alice"
}
~~~
{: #fig-status title="Override Status"}
# Broadcast Overrides {#broadcast}
For environments with many agents, HEOP supports broadcast. An
operator sends a single command to a management endpoint:
~~~
POST /heop/broadcast HTTP/1.1

{
  "override_id": "urn:uuid:...",
  "level": 3,
  "reason": "Coordinated emergency stop",
  "targets": ["spiffe://example.com/agent/a1", "spiffe://example.com/agent/a2"]
}
~~~
{: #fig-broadcast title="Broadcast Override"}
The broadcast endpoint produces a parent ECT with
`exec_act`: `"heop:broadcast"`, and each per-agent override ECT
references it via `par`.
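The fan-out can be sketched as building one parent claim set and one child per target. The sketch produces unsigned claim sets only. The `heop.target` extension claim and the function name are hypothetical and are not defined by this document.

```python
import uuid


def fan_out(command: dict):
    """Build the broadcast parent ECT claims and one override ECT per target."""
    parent_jti = str(uuid.uuid4())
    parent = {"jti": parent_jti, "exec_act": "heop:broadcast", "par": []}
    children = [
        {
            "jti": str(uuid.uuid4()),
            "exec_act": "heop:override",
            "par": [parent_jti],  # link each override to the broadcast
            "ext": {
                "heop.level": command["level"],
                "heop.reason": command["reason"],
                "heop.target": target,  # hypothetical claim name
            },
        }
        for target in command["targets"]
    ]
    return parent, children
```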
# Dead Man's Switch {#dead-mans-switch}
Agents SHOULD support a heartbeat-based safety net: the agent
periodically pings an operator heartbeat endpoint. If the
heartbeat is missed for a configurable duration, the agent
automatically enters Level 1 (PAUSE) and produces a
self-override ECT with `exec_act`: `"heop:dead_mans_switch"`.
This provides safety when network connectivity to the operator
is lost.
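A heartbeat watchdog of this kind can be sketched as a small class with an injectable clock, which keeps it testable. The class name, method names, and the `heop.level` value in the returned claims are illustrative assumptions.

```python
import time


class DeadMansSwitch:
    """Trips into Level 1 (PAUSE) when heartbeats stop succeeding."""

    def __init__(self, timeout_seconds: float, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        self.last_ok = clock()
        self.paused = False

    def heartbeat_ok(self) -> None:
        """Record a successful ping of the operator heartbeat endpoint."""
        self.last_ok = self.clock()

    def check(self):
        """Return self-override ECT claims when the switch trips, else None."""
        if not self.paused and self.clock() - self.last_ok > self.timeout:
            self.paused = True
            return {"exec_act": "heop:dead_mans_switch",
                    "ext": {"heop.level": 1}}
        return None
```

The agent would call `check()` on a timer that does not depend on its main processing loop, and emit the returned claims as a signed ECT.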
# Security Considerations
Override commands are high-privilege operations. All override
endpoints MUST require authentication via signed JWTs with the
`heop_override` scope. The JWT MUST include the operator's identity
and a timestamp, and MUST be signed using an asymmetric algorithm.
Override commands MUST be transmitted over TLS 1.3.
To prevent replay, agents MUST reject overrides with timestamps
more than 30 seconds in the past. The `override_id` MUST be
unique; agents MUST reject duplicates.
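The two replay checks can be combined in a small guard object. This is a sketch: the class and method names are hypothetical, timestamps are taken as seconds, and a production version would also need to bound the set of remembered identifiers.

```python
class ReplayGuard:
    """Rejects stale overrides and duplicate override_id values."""

    def __init__(self, max_age_seconds: float = 30):
        self.max_age = max_age_seconds
        self.seen = set()

    def accept(self, override_id: str, issued_at: float, now: float) -> bool:
        if now - issued_at > self.max_age:
            return False  # timestamp too far in the past
        if override_id in self.seen:
            return False  # duplicate override_id
        self.seen.add(override_id)
        return True
```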
Deployments SHOULD implement multi-operator approval for Level 4
(TAKEOVER), requiring two independent operator JWTs.
The override endpoint SHOULD be served on a separate port or
network interface from the agent's main API to ensure availability
during overload.
The ECT DAG provides tamper-evident audit of all overrides. At
L3, the audit ledger prevents override records from being deleted
or modified after the fact.
# IANA Considerations
This document requests the following IANA registrations:
1. Well-known URI registrations for `heop/override`,
`heop/resume`, `heop/lift`, and `heop/status` per {{RFC8615}}.
2. Registration of `exec_act` values `heop:override`, `heop:ack`,
`heop:broadcast`, `heop:dead_mans_switch` in a future ECT
action type registry.
3. Registration of the `heop_override` OAuth scope.
--- back
# Acknowledgments
{:numbered="false"}
This document is the runtime enforcement companion to the Agent
Context Policy Token {{I-D.nennemann-agent-dag-hitl-safety}},
which defines the HITL policy language, and builds on the
Execution Context Token {{I-D.nennemann-wimse-ect}} for
audit and tracing.


@@ -0,0 +1,307 @@
Internet-Draft AI/Agent WG
Intended status: Standards Track March 2026
Expires: September 15, 2026
Human Emergency Override Protocol (HEOP)
draft-heop-human-emergency-override-00
Abstract
This document defines the Human Emergency Override Protocol
(HEOP), a standard mechanism for human operators to intervene
in autonomous AI agent operations during critical situations.
Current IETF drafts include 60 autonomous operations proposals
but only 22 addressing human-agent interaction, with none
defining emergency override procedures. HEOP specifies four
escalating override levels (pause, constrain, stop, takeover),
a mandatory agent compliance interface, and acknowledgment
semantics that ensure overrides are received and acted upon.
The protocol is intentionally minimal: a single HTTP endpoint
per agent, four command types, and deterministic agent
behavior for each.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
This document is intended to have Standards Track status.
Distribution of this memo is unlimited.
Table of Contents
1. Introduction
2. Terminology
3. Problem Statement
4. Override Levels
5. Override Command Format
6. Agent Compliance Requirements
7. Override Management Interface
8. Security Considerations
9. IANA Considerations
1. Introduction
As AI agents gain autonomy in critical infrastructure, the
ability for humans to intervene quickly and reliably becomes
essential. The current ratio of autonomous capability drafts
to human oversight drafts in the IETF is roughly 7:1, creating
an asymmetry where agents can act but humans cannot reliably
stop them.
HEOP draws inspiration from industrial safety systems: the
emergency stop (e-stop) button on factory equipment, the
circuit breaker in electrical systems, and the kill switch in
robotics. These systems share a design philosophy: the
override mechanism must be simpler and more reliable than the
system it controls.
HEOP is deliberately not a governance framework, policy
language, or accountability protocol. It is a panic button
with a well-defined interface.
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described
in RFC 2119 [RFC2119].
Override: A human-initiated command that alters an agent's
autonomous operation, taking precedence over the agent's own
decision-making.
Operator: A human user authorized to issue override commands
to one or more agents.
Override Level: One of four escalating intervention types,
each with deterministic agent behavior requirements.
3. Problem Statement
An autonomous network management agent detects what it believes
is a DDoS attack and begins blocking traffic. It is wrong —
the traffic spike is legitimate (a product launch). The
operator sees revenue dropping and needs to stop the agent
immediately. Today, the operator must:
1. Figure out which agent is responsible.
2. Find that agent's proprietary management interface.
3. Understand its specific stop mechanism (if one exists).
4. Hope the agent actually stops.
There is no standard for any of these steps. HEOP addresses
steps 2-4 by defining a universal override interface that all
agents MUST implement.
4. Override Levels
HEOP defines four override levels, each more restrictive than
the last:
Level 1 — PAUSE
The agent MUST suspend all autonomous actions and hold its
current state. It MUST NOT initiate new actions but MAY
complete actions already in progress if stopping them mid-
execution would cause more harm (e.g., an in-flight database
transaction). The agent MUST resume normal operation when a
RESUME command is received.
Level 2 — CONSTRAIN
The agent MUST restrict its actions to a specified subset.
The override command includes an allowlist of permitted action
types. The agent MUST reject any action not on the allowlist.
This enables operators to let the agent continue operating in
a limited, safe capacity.
Level 3 — STOP
The agent MUST immediately cease all autonomous actions,
abandon in-progress actions where safe to do so, and enter an
inert state. It MUST NOT take any autonomous actions until
explicitly restarted by an operator. This is the equivalent
of an e-stop.
Level 4 — TAKEOVER
The agent MUST transfer operational control to the human
operator. It enters a pass-through mode where it executes
only explicit operator commands and takes no autonomous
actions. The agent's sensors and outputs remain available to
the operator as tools.
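The four levels can be modelled as a small enumeration. The
following Python sketch is illustrative only: the names
OverrideLevel, STATE_FOR_LEVEL, and may_act_autonomously, and the
state labels other than "autonomous" and "stopped", are assumptions
made for this example and are not defined by HEOP.

```python
from enum import IntEnum


class OverrideLevel(IntEnum):
    """Override levels from Section 4; higher values are more restrictive."""
    PAUSE = 1
    CONSTRAIN = 2
    STOP = 3
    TAKEOVER = 4


# Hypothetical state labels for illustration. Only "autonomous" and
# "stopped" appear in the examples in this document.
STATE_FOR_LEVEL = {
    OverrideLevel.PAUSE: "paused",
    OverrideLevel.CONSTRAIN: "constrained",
    OverrideLevel.STOP: "stopped",
    OverrideLevel.TAKEOVER: "takeover",
}


def may_act_autonomously(level, action_type, allowlist=()):
    """Return True if a NEW autonomous action is permitted under `level`.

    `level` is None when no override is active. Only CONSTRAIN permits
    new autonomous actions, and only those on the operator's allowlist.
    Completing an in-flight action under PAUSE is a separate decision
    that this sketch does not model.
    """
    if level is None:
        return True
    if level == OverrideLevel.CONSTRAIN:
        return action_type in allowlist
    return False
```

Under these assumptions, an agent would call may_act_autonomously
before initiating each action; operator-issued commands under
TAKEOVER would bypass this check because they are not autonomous.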
5. Override Command Format
Override commands are sent as HTTP POST requests to the agent's
well-known override endpoint:
POST /.well-known/heop/override HTTP/1.1
Content-Type: application/json
Authorization: Bearer <operator-jwt>
{
"override_id": "urn:uuid:...",
"level": 3,
"reason": "Agent blocking legitimate traffic",
"operator_id": "urn:uuid:...",
"timestamp": "2026-03-01T12:00:00Z",
"scope": "*",
"constraints": null,
"ttl": null
}
Field definitions:
"level": Integer 1-4, corresponding to the override levels in
Section 4. MUST be present.
"reason": Human-readable text. MUST be present and MUST be
logged by the agent.
"scope": Which of the agent's functions to override. "*" means
all functions. MAY be a list of function identifiers for
partial overrides.
"constraints": For Level 2 only. A JSON array of permitted
action types, e.g., ["read", "monitor", "report"].
"ttl": Optional duration in seconds. If set, the override
automatically expires after this duration and the agent
resumes its prior operating mode. If null, the override
persists until explicitly lifted.
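The field rules above can be checked mechanically before a command
is acted on. The sketch below is one possible reading, not a
normative validator: the function name validate_override and the
error strings are hypothetical, and treating "constraints" on a
non-Level-2 command as a problem is an interpretation of "For
Level 2 only".

```python
def validate_override(cmd):
    """Return a list of problems found in an override command dict.

    An empty list means the command passes these illustrative checks.
    """
    problems = []

    level = cmd.get("level")
    # bool is a subclass of int in Python, so exclude it explicitly.
    level_ok = (
        isinstance(level, int)
        and not isinstance(level, bool)
        and 1 <= level <= 4
    )
    if not level_ok:
        problems.append("level must be an integer from 1 to 4")

    reason = cmd.get("reason")
    if not isinstance(reason, str) or not reason.strip():
        problems.append("reason must be present")

    scope = cmd.get("scope", "*")
    if scope != "*" and not isinstance(scope, list):
        problems.append('scope must be "*" or a list of function identifiers')

    constraints = cmd.get("constraints")
    if level == 2:
        if not isinstance(constraints, list):
            problems.append("level 2 requires a constraints allowlist")
    elif constraints is not None:
        # Interpretation: constraints are defined for Level 2 only.
        problems.append("constraints are only meaningful for level 2")

    ttl = cmd.get("ttl")
    if ttl is not None:
        is_number = isinstance(ttl, (int, float)) and not isinstance(ttl, bool)
        if not is_number or ttl <= 0:
            problems.append("ttl must be a positive number of seconds or null")

    return problems
```

Returning a list of problems rather than raising on the first one
lets an agent log every defect in a malformed command, which fits
the logging requirement for "reason".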
To resume from Level 1 (PAUSE):
POST /.well-known/heop/resume HTTP/1.1
Content-Type: application/json
Authorization: Bearer <operator-jwt>
{"override_id": "urn:uuid:...", "operator_id": "urn:uuid:..."}
To lift any override:
POST /.well-known/heop/lift HTTP/1.1
Content-Type: application/json
Authorization: Bearer <operator-jwt>
{"override_id": "urn:uuid:...", "operator_id": "urn:uuid:..."}
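On the operator side, the three POST requests differ only in path
and body. The following sketch builds (but does not send) such
requests with Python's standard library. The helper names
build_heop_request and new_override_id and the host
agent.example are hypothetical; a real client would also need to
obtain the operator JWT and configure TLS.

```python
import json
import urllib.request
import uuid


def new_override_id():
    """Generate a fresh override identifier in urn:uuid form."""
    return f"urn:uuid:{uuid.uuid4()}"


def build_heop_request(base_url, action, operator_jwt, body):
    """Build a POST request for the override, resume, or lift endpoint.

    The request is only constructed here; passing it to
    urllib.request.urlopen() would actually send it.
    """
    if action not in ("override", "resume", "lift"):
        raise ValueError(f"unknown HEOP action: {action}")
    url = f"{base_url.rstrip('/')}/.well-known/heop/{action}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {operator_jwt}",
        },
        method="POST",
    )
```

For example, build_heop_request("https://agent.example", "lift",
token, {"override_id": oid, "operator_id": op}) yields a request
for /.well-known/heop/lift carrying the two identifiers as JSON.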
6. Agent Compliance Requirements
Every HEOP-compliant agent MUST:
1. Implement the /.well-known/heop/override endpoint.
2. Process override commands within 1 second of receipt.
The override path MUST be independent of the agent's main
processing loop to ensure responsiveness even when the
agent is under heavy load or in a failure state.
3. Acknowledge every override with an HTTP response:
200 OK:
{
"override_id": "urn:uuid:...",
"status": "accepted",
"effective_at": "2026-03-01T12:00:00.123Z",
"prior_state": "autonomous",
"current_state": "stopped"
}
4. Log all overrides, including the full command, timestamp,
operator identity, and agent state before and after.
5. If the agent cannot fully comply (e.g., because of a hardware
limitation), it MUST respond with status "partial" and a
description of what it could and could not do. An agent MUST
NOT respond with "rejected"; overrides are mandatory.
6. Expose current override status at:
GET /.well-known/heop/status
{
"agent_id": "urn:uuid:...",
"override_active": true,
"current_level": 3,
"override_id": "urn:uuid:...",
"since": "2026-03-01T12:00:00Z",
"operator_id": "urn:uuid:..."
}
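Requirements 3 through 6 can be pictured as a small state holder
on the agent side. The sketch below is a simplified model under
stated assumptions: the class OverrideController, the state labels
other than "autonomous" and "stopped", the `unsupported` parameter,
and the "detail" field are hypothetical, and authentication,
persistence, and the 1-second deadline are out of scope here.

```python
from datetime import datetime, timezone

# Hypothetical state labels; the document only shows "autonomous"
# and "stopped" in its examples.
STATES = {1: "paused", 2: "constrained", 3: "stopped", 4: "takeover"}


def _now_iso():
    """Current UTC time with millisecond precision and a Z suffix."""
    now = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return now.replace("+00:00", "Z")


class OverrideController:
    def __init__(self, agent_id):
        self.agent_id = agent_id
        self.state = "autonomous"
        self.active = None  # the currently active override command
        self.since = None
        self.log = []

    def handle_override(self, cmd, unsupported=None):
        """Apply an override and return the acknowledgement body.

        `unsupported` describes anything the agent could not do; when
        given, the status is "partial" rather than "accepted". There
        is deliberately no "rejected" outcome.
        """
        prior, now = self.state, _now_iso()
        self.state = STATES[cmd["level"]]
        self.active, self.since = cmd, now
        ack = {
            "override_id": cmd["override_id"],
            "status": "partial" if unsupported else "accepted",
            "effective_at": now,
            "prior_state": prior,
            "current_state": self.state,
        }
        if unsupported:
            ack["detail"] = unsupported  # hypothetical field name
        # Log the full command with the state before and after.
        self.log.append({
            "command": cmd,
            "timestamp": now,
            "operator_id": cmd.get("operator_id"),
            "prior_state": prior,
            "current_state": self.state,
        })
        return ack

    def status(self):
        """Return the body for GET /.well-known/heop/status."""
        cmd = self.active or {}
        return {
            "agent_id": self.agent_id,
            "override_active": self.active is not None,
            "current_level": cmd.get("level"),
            "override_id": cmd.get("override_id"),
            "since": self.since,
            "operator_id": cmd.get("operator_id"),
        }
```

Keeping this controller free of any dependency on the agent's main
loop is one way to approach the independence requirement in item 2,
although the sketch itself does not demonstrate that isolation.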
7. Override Management Interface
For environments with many agents, HEOP supports broadcast
overrides. An operator MAY send a single override command to
a management endpoint that fans out to multiple agents:
POST /heop/broadcast HTTP/1.1
Content-Type: application/json
Authorization: Bearer <operator-jwt>
{
"override_id": "urn:uuid:...",
"level": 3,
"reason": "Coordinated emergency stop",
"targets": ["urn:uuid:agent-1", "urn:uuid:agent-2"],
"operator_id": "urn:uuid:..."
}
The broadcast endpoint MUST return per-agent results:
{
"results": [
{"agent_id": "urn:uuid:agent-1", "status": "accepted"},
{"agent_id": "urn:uuid:agent-2", "status": "accepted"}
],
"failed": []
}
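The fan-out behind the broadcast endpoint can be sketched as a
loop that delivers the command to each target and records the
outcome per agent. In this sketch the function broadcast_override,
the caller-supplied `send` callback, and the shape of the entries
in "failed" are assumptions; the document does not specify them.

```python
def broadcast_override(command, send):
    """Fan an override out to every target and collect per-agent results.

    `send(agent_id, per_agent_command)` is a caller-supplied function
    that delivers the command to one agent and returns that agent's
    acknowledgement dict. Any exception it raises is recorded under
    "failed" so one unreachable agent does not block the others.
    """
    # Each agent receives the command without the target list.
    per_agent = {k: v for k, v in command.items() if k != "targets"}
    results, failed = [], []
    for agent_id in command["targets"]:
        try:
            ack = send(agent_id, per_agent)
            results.append({"agent_id": agent_id, "status": ack["status"]})
        except Exception as exc:  # deliberately broad for this sketch
            failed.append({"agent_id": agent_id, "error": str(exc)})
    return {"results": results, "failed": failed}
```

A production fan-out would likely deliver in parallel so that a
slow agent does not delay an emergency stop for the rest; the
sequential loop here is chosen only for clarity.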
For maximum reliability, operators SHOULD also implement a
dead man's switch: agents periodically ping an operator
heartbeat endpoint, and if the heartbeat is missed for a
configurable duration, the agent automatically enters Level 1
(PAUSE). This provides a safety net when network connectivity
to the operator is lost.
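The dead man's switch reduces to tracking the time of the last
successful heartbeat. The sketch below shows only that timing
logic; the class name DeadMansSwitch, its methods, and the
injectable clock are hypothetical, and the actual heartbeat
request and the transition into PAUSE are left to the caller.

```python
import time


class DeadMansSwitch:
    """Tracks heartbeat freshness for an agent.

    `timeout_s` is the configurable duration after which a missed
    heartbeat should put the agent into Level 1 (PAUSE). `clock` is
    injectable so the logic can be tested without sleeping.
    """

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_ok = clock()

    def heartbeat_ok(self):
        """Record that the operator heartbeat endpoint answered."""
        self._last_ok = self._clock()

    def should_pause(self):
        """Return True once the heartbeat has been missed for too long."""
        return self._clock() - self._last_ok > self.timeout_s
```

A monotonic clock is used by default so that wall-clock
adjustments cannot trigger or suppress the pause.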
8. Security Considerations
Override commands are high-privilege operations. All override
endpoints MUST require authentication via mutual TLS or signed
JWTs issued by a trusted operator identity provider.
The JWT MUST include the operator's identity, a timestamp, and
the "heop_override" scope. Agents MUST verify JWT signatures
and reject expired tokens.
Override commands MUST be transmitted over TLS 1.3 [RFC8446].
To prevent override replay attacks, agents MUST reject
override commands with timestamps more than 30 seconds in the
past. The override_id MUST be unique; agents MUST reject
duplicate override_ids.
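The two replay rules above combine into a single freshness check.
The following sketch is a minimal illustration: the class
ReplayGuard and its return strings are hypothetical, the set of
seen identifiers is kept in memory without eviction, and the
handling of timestamps in the future is not addressed because the
text above does not specify it.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(seconds=30)


class ReplayGuard:
    def __init__(self):
        self._seen = set()  # override_ids already accepted

    def check(self, cmd, now=None):
        """Return None if the command is fresh, else a reason string."""
        now = now or datetime.now(timezone.utc)
        # Accept the "Z" suffix used in the examples in this document.
        ts = datetime.fromisoformat(cmd["timestamp"].replace("Z", "+00:00"))
        if now - ts > MAX_AGE:
            return "timestamp too old"
        if cmd["override_id"] in self._seen:
            return "duplicate override_id"
        self._seen.add(cmd["override_id"])
        return None
```

The identifier is recorded only after the timestamp check passes,
so a stale command does not consume its override_id.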
Rogue operators are mitigated through the operator identity
framework. Deployments SHOULD implement multi-operator
approval for Level 4 (TAKEOVER) overrides, requiring two
independent operator JWTs.
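The token checks and the two-operator rule can be expressed as one
authorization decision. This sketch assumes that JWT signatures
have already been verified elsewhere and works only on the decoded
claims. The function name authorize_override is hypothetical, and
the use of the "sub", "exp", and "scope" claims (with "scope" as a
space-separated string) is an assumption, since this document does
not name the claims.

```python
def authorize_override(level, verified_claims, now):
    """Decide whether an override at `level` is sufficiently authorized.

    `verified_claims` is a list of claim dicts taken from JWTs whose
    signatures were already verified. `now` is the current time in
    seconds since the epoch, matching the "exp" claim.
    """
    operators = set()
    for claims in verified_claims:
        if claims.get("exp", 0) <= now:
            continue  # expired token
        if "heop_override" not in claims.get("scope", "").split():
            continue  # token lacks the override scope
        if claims.get("sub"):
            operators.add(claims["sub"])
    # Level 4 (TAKEOVER) needs two independent operators in this sketch.
    required = 2 if level == 4 else 1
    return len(operators) >= required
```

Collecting operators in a set means that two tokens from the same
operator count once, which is what "independent" requires.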
The override mechanism itself MUST be resistant to denial of
service. The override endpoint SHOULD be served on a
separate port or network interface from the agent's main
API to ensure availability during agent overload conditions.
9. IANA Considerations
This document requests IANA establish the following:
1. A well-known URI registration for "heop/override",
"heop/resume", "heop/lift", and "heop/status" per
RFC 8615.
2. A "HEOP Override Level" registry under Standards Action
policy. Initial entries: 1 (PAUSE), 2 (CONSTRAIN),
3 (STOP), 4 (TAKEOVER).
3. Registration of the "heop_override" OAuth scope in the
OAuth Parameters registry.
Author's Address
Generated by IETF Draft Analyzer
2026-03-01


@@ -0,0 +1,598 @@
Internet-Draft AI/Agent WG
Intended status: standards-track March 2026
Expires: September 02, 2026
Agent Behavior Verification Protocol (ABVP) for Runtime Compliance Validation
draft-ai-agent-behavior-verification-protocol-00
Abstract
This document defines the Agent Behavior Verification Protocol
(ABVP), a standardized framework for continuously validating that
deployed AI agents operate according to their declared policies
and specifications. As autonomous agents become increasingly
prevalent in critical systems, there is a growing gap between
what agents declare about their behavior and what operators can
verify at runtime. ABVP provides mechanisms for real-time behavior
monitoring, policy compliance validation, and cryptographic
attestation of agent actions against predefined behavioral
specifications. The protocol defines a verification architecture
that includes behavior witnesses, compliance checkers, and
attestation chains to ensure agents maintain fidelity to their
declared operational parameters. ABVP integrates with existing
agent accountability frameworks while providing specific
mechanisms for runtime verification, behavioral drift detection,
and compliance reporting. This specification addresses the
critical need for trustworthy agent deployment by enabling
operators to continuously verify agent behavior matches stated
policies throughout the agent lifecycle.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
This document is intended to have Standards Track status.
Distribution of this memo is unlimited.
Table of Contents
1. Introduction
2. Terminology
3. Problem Statement
4. Agent Behavior Verification Architecture
5. Behavior Specification Format
6. Runtime Verification Protocol
7. Compliance Reporting and Attestation
8. Security Considerations
9. IANA Considerations
1. Introduction
The proliferation of autonomous AI agents in critical
infrastructure, financial systems, and decision-making processes
has created an urgent need for continuous verification that these
agents operate according to their declared policies and behavioral
specifications. Traditional approaches to agent deployment rely on
pre-deployment testing and static policy validation, which fail to
address the dynamic nature of agent behavior in production
environments. As agents adapt, learn, and respond to changing
conditions, their actual runtime behavior may diverge
significantly from their original specifications, creating
security vulnerabilities, compliance violations, and operational
risks that remain undetected until system failures occur.
Existing agent accountability frameworks primarily focus on post-
hoc analysis and audit trails, providing limited capability for
real-time behavior verification and immediate detection of policy
violations. This reactive approach is insufficient for autonomous
systems that make critical decisions with limited human oversight,
where behavioral drift or policy violations can have immediate and
severe consequences. Current verification methodologies also lack
standardized protocols for expressing behavioral constraints in
machine-verifiable formats, making it difficult to establish
consistent compliance validation across diverse agent
implementations and deployment environments.
The gap between declared agent capabilities and verifiable runtime
behavior represents a fundamental trust problem in autonomous
systems deployment. Organizations deploying AI agents face
significant challenges in ensuring that agents continue to operate
within specified parameters throughout their operational
lifecycle, particularly as agents encounter novel situations not
covered in initial testing scenarios. This verification gap
undermines confidence in agent reliability and limits the adoption
of autonomous systems in high-stakes environments where safety,
security, and regulatory requirements depend on behavioral
compliance.
The Agent Behavior Verification Protocol (ABVP) addresses these
challenges by providing a standardized framework for continuous
runtime verification of agent behavior against declared
specifications. ABVP enables real-time monitoring of agent
actions, automated compliance checking against behavioral
policies, and cryptographic attestation of verification results to
establish trust chains for agent operation validation. The
protocol is designed to integrate with existing agent
architectures while providing mechanisms for detecting behavioral
drift, validating policy adherence, and generating verifiable
evidence of agent compliance throughout the operational lifecycle.
This specification defines the core protocol mechanisms, message
formats, and verification procedures necessary to implement
trustworthy agent behavior validation in production deployments.
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
in this document are to be interpreted as described in RFC 2119
[RFC2119].
Agent: An autonomous software entity that performs actions or
makes decisions according to defined policies and specifications.
In the context of ABVP, an agent is a system whose runtime
behavior requires continuous verification against its declared
operational parameters and behavioral constraints.
Behavior Specification: A formally defined set of policies,
constraints, and operational parameters that describe the expected
and permitted actions of an agent. A behavior specification MUST
be machine-readable and verifiable, containing sufficient detail
to enable automated compliance checking during agent runtime
operation.
Behavior Witness: A system component or external entity that
observes and records agent actions for verification purposes. A
behavior witness MUST provide cryptographically signed
attestations of observed agent behavior and MAY operate
independently of the agent being monitored to ensure verification
integrity.
Compliance Validation: The process of evaluating agent runtime
behavior against its declared behavior specification to determine
conformance. Compliance validation encompasses real-time
monitoring, policy checking, and the generation of verification
results that attest to agent adherence to specified behavioral
constraints.
Verification Attestation: A cryptographically signed statement
that asserts the compliance status of an agent's behavior relative
to its specification during a defined time period. Verification
attestations MUST include sufficient detail to enable third-party
validation and SHOULD reference the specific behavior
specification version and verification criteria used in the
assessment.
Behavioral Drift: The phenomenon where an agent's actual runtime
behavior gradually diverges from its declared specification over
time, either due to learning adaptations, environmental changes,
or system degradation. ABVP mechanisms MUST be capable of
detecting behavioral drift and reporting deviations from
established behavioral baselines.
3. Problem Statement
Autonomous AI agents now run in critical infrastructure,
financial systems, and safety-critical applications, yet operators
lack a standard way to verify continuously that a deployed agent
stays within its declared behavioral boundaries. Current
deployment practices rely primarily on pre-deployment testing and
static policy declarations, which leaves a verification gap
between an agent's stated capabilities and constraints and its
actual runtime behavior. The gap becomes particularly problematic
as agents adapt their behavior through learning mechanisms,
interact with dynamic environments, or drift gradually because of
model degradation or adversarial influences.
Traditional software verification approaches are insufficient for
autonomous agents because agent behavior is often non-
deterministic, context-dependent, and may evolve over time through
machine learning processes. Unlike conventional software systems
where behavior can be predicted from code analysis, agent systems
exhibit emergent behaviors that arise from complex interactions
between training data, environmental inputs, and decision-making
algorithms. The absence of standardized mechanisms for expressing
machine-verifiable behavioral specifications further complicates
runtime verification, as operators lack a common framework for
defining what constitutes compliant agent behavior and how
compliance can be automatically validated.
The security and trust implications of unverified agent behavior
are substantial, particularly in scenarios where agents operate
with elevated privileges or make decisions affecting human safety
or economic systems. Behavioral drift, where an agent's actions
gradually deviate from intended policies, may go undetected for
extended periods without continuous verification mechanisms.
Similarly, adversarial attacks that subtly modify agent behavior
to achieve malicious objectives could remain unnoticed in systems
that lack real-time compliance monitoring. The inability to
provide cryptographic attestations of agent behavior compliance
also prevents the establishment of trust chains necessary for
multi-agent systems or cross-organizational agent interactions.
Current accountability frameworks for AI systems focus primarily
on explainability and audit trails but do not provide mechanisms
for real-time verification of behavioral compliance against
formally specified policies. This creates operational risks where
agents may violate their declared constraints without immediate
detection, potentially causing system failures, security breaches,
or regulatory violations. The lack of standardized verification
protocols also prevents interoperability between different agent
verification systems and limits the ability to establish industry-
wide trust frameworks for autonomous agent deployment.
4. Agent Behavior Verification Architecture
The ABVP architecture consists of four primary components that
work together to provide continuous runtime verification of agent
behavior: Agent Runtime Environments (AREs), Behavior Verification
Nodes (BVNs), Attestation Authorities (AAs), and Verification
Clients (VCs). Agent Runtime Environments host the deployed agents
and MUST implement behavior monitoring capabilities that capture
relevant behavioral data and forward it to designated Behavior
Verification Nodes. These environments MUST provide secure
isolation between the agent execution context and the monitoring
subsystem to prevent agents from interfering with their own
verification processes. The ARE MUST also implement a trusted
communication channel to BVNs using protocols such as TLS 1.3
[RFC8446] or QUIC [RFC9000] to ensure behavior data integrity
during transmission.
Behavior Verification Nodes serve as the core verification engines
within the ABVP architecture and MUST implement the runtime
verification protocol defined in Section 6. Each BVN maintains a
repository of behavior specifications for agents under its
verification authority and continuously processes behavioral
evidence received from AREs. BVNs MUST validate incoming behavior
data against the appropriate specifications and generate
compliance assessments in real-time. Multiple BVNs MAY collaborate
in a distributed verification network to provide redundancy and
prevent single points of failure. When operating in a distributed
configuration, BVNs MUST implement consensus mechanisms to ensure
consistent verification results across the network. BVNs MUST also
implement rate limiting and resource management to handle high-
volume verification requests without compromising verification
quality.
Attestation Authorities provide cryptographic attestation services
for verified behavior compliance and MUST maintain secure key
management infrastructure capable of generating unforgeable
attestations. AAs receive compliance reports from BVNs and MUST
verify the authenticity and integrity of these reports before
issuing attestations. The AA MUST implement a hierarchical trust
model where attestations can be validated through a chain of trust
extending to a root certificate authority. AAs SHOULD implement
hardware security modules (HSMs) or equivalent trusted execution
environments to protect attestation signing keys from compromise.
Multiple AAs MAY participate in cross-attestation relationships to
provide attestation redundancy and prevent single points of trust
failure.
Verification Clients represent entities that consume ABVP
attestations to make trust decisions about agent behavior and MAY
include system operators, regulatory bodies, or other automated
systems. VCs MUST implement attestation verification capabilities
including certificate chain validation and revocation checking as
specified in Section 7. The architecture MUST support both real-
time verification queries and batch verification processes to
accommodate different operational requirements. VCs SHOULD
implement local attestation caching with appropriate cache
invalidation mechanisms to reduce verification latency while
maintaining attestation freshness. The ABVP architecture MUST
provide clear separation of duties between verification components
to prevent conflicts of interest and ensure independent
verification processes.
The communication between architectural components MUST follow the
protocol specifications defined in Section 6, with all inter-
component communications authenticated and encrypted. The
architecture MUST support both synchronous and asynchronous
verification modes to accommodate different agent deployment
scenarios and performance requirements. Components MUST implement
appropriate logging and audit trail capabilities to support
forensic analysis and compliance reporting. The overall
architecture SHOULD be designed for horizontal scalability to
support large-scale agent deployments while maintaining
verification performance and reliability.
5. Behavior Specification Format
This section defines the standardized format for expressing agent
behavioral policies and constraints within the ABVP framework. The
behavior specification format enables machine-readable policy
declarations that can be automatically verified during agent
runtime. All behavior specifications MUST be expressed in a
structured format that supports both human readability and
automated processing by verification systems.
The core behavior specification is structured as a JSON document
conforming to the ABVP Behavior Schema. Each specification MUST
contain a policy declaration section, verification parameters, and
compliance thresholds. The policy declaration section includes
behavioral constraints expressed as logical predicates, allowed
action sets, and resource utilization bounds. Verification
parameters specify the monitoring frequency, sampling rates, and
attestation requirements for each declared behavior. Compliance
thresholds define the acceptable deviation ranges and tolerance
levels for measured behaviors compared to declared specifications.
Behavioral constraints within the specification are expressed
using a formal constraint language based on temporal logic
predicates. Each constraint MUST specify a behavioral property
(such as "response_time_bound" or "resource_utilization_limit"),
an operator (such as "less_than", "equals", or "within_range"),
and target values or ranges. Complex behavioral policies MAY be
constructed using logical operators (AND, OR, NOT) to combine
multiple constraints. The specification format supports
hierarchical constraint groupings to represent different
operational modes or contextual behavior variations.
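The constraint structure described above (a property, an operator,
a target, and logical combinators) can be evaluated with a short
recursive function. The sketch below is illustrative only: the key
names "property", "operator", "target", "all_of", "any_of", and
"not" are hypothetical, since the ABVP Behavior Schema is not
reproduced in this document, and the sketch evaluates a single
snapshot of observed values rather than temporal logic predicates.

```python
# Operator names are taken from the examples in the text above.
OPERATORS = {
    "less_than": lambda value, target: value < target,
    "equals": lambda value, target: value == target,
    "within_range": lambda value, target: target[0] <= value <= target[1],
}


def evaluate(node, observed):
    """Evaluate a constraint tree against a dict of observed values.

    A node is either a combinator ("all_of" = AND, "any_of" = OR,
    "not" = NOT) or a leaf with "property", "operator", and "target".
    """
    if "all_of" in node:
        return all(evaluate(child, observed) for child in node["all_of"])
    if "any_of" in node:
        return any(evaluate(child, observed) for child in node["any_of"])
    if "not" in node:
        return not evaluate(node["not"], observed)
    value = observed[node["property"]]
    return OPERATORS[node["operator"]](value, node["target"])


# Example policy: responses under 200 (units assumed to be ms) AND
# resource utilization NOT in the 0.9 to 1.0 band.
EXAMPLE_POLICY = {
    "all_of": [
        {"property": "response_time_bound",
         "operator": "less_than", "target": 200},
        {"not": {"property": "resource_utilization_limit",
                 "operator": "within_range", "target": [0.9, 1.0]}},
    ]
}
```

Nesting combinators inside one another gives the hierarchical
groupings mentioned above without any additional machinery.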
The behavior specification includes a verification requirements
section that defines how each behavioral constraint should be
monitored and validated. This section MUST specify the required
verification frequency, acceptable measurement methods, and
cryptographic attestation parameters for each constraint.
Verification requirements MAY include sampling strategies for
performance-sensitive constraints and continuous monitoring
directives for safety-critical behaviors. The specification format
also supports conditional verification rules that adjust
monitoring parameters based on agent operational context or
detected behavioral patterns.
Each behavior specification MUST include metadata sections
containing versioning information, validity periods, and
specification dependencies. The metadata enables proper
specification lifecycle management and ensures compatibility
between agent deployments and verification infrastructure.
Specifications SHOULD include digital signatures from authorized
policy authors to ensure specification integrity and authenticity.
The format supports specification inheritance and composition,
allowing complex agent policies to be built from validated
behavioral specification components while maintaining verification
traceability throughout the composition hierarchy.
6. Runtime Verification Protocol
The Runtime Verification Protocol defines the message exchange
patterns and procedures that enable continuous monitoring and
validation of agent behavior against declared specifications. The
protocol operates on a request-response model where Verification
Requesters initiate compliance checks, Behavior Monitors observe
agent actions, and Compliance Checkers evaluate adherence to
behavioral specifications. All protocol participants MUST
implement the core verification message set defined in this
section, and MAY implement optional extensions for specialized
verification scenarios. The protocol is designed to operate over
existing transport mechanisms including HTTP/2 [RFC9113],
WebSocket [RFC6455], or dedicated secure channels established
through TLS 1.3 [RFC8446].
Verification sessions are initiated through a VERIFICATION_REQUEST
message that specifies the agent identifier, behavioral
specification reference, verification scope, and temporal
parameters for the compliance check. The requesting entity MUST
include a cryptographically secure session identifier, timestamp
bounds for the verification window, and references to the specific
behavioral constraints to be validated. Behavior Monitors respond
with MONITORING_DATA messages containing timestamped observations
of agent actions, decision traces, and relevant contextual
information captured during the specified verification window.
These messages MUST include integrity protection through digital
signatures and SHOULD include privacy-preserving mechanisms when
agent actions contain sensitive information.
Compliance evaluation proceeds through COMPLIANCE_CHECK messages
exchanged between Verification Requesters and designated
Compliance Checkers. Each compliance check message MUST reference
the behavioral specification being evaluated, include the
monitoring data to be assessed, and specify the verification
algorithms or rules to be applied. Compliance Checkers process the
monitoring data against the behavioral constraints and generate
COMPLIANCE_RESULT messages indicating whether the observed
behavior satisfies the specified requirements. Results MUST
include binary compliance indicators, detailed violation reports
when non-compliance is detected, and confidence metrics indicating
the reliability of the compliance assessment.
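The request and result messages described above can be sketched
as plain data classes with a simple checker between them. All
field names, the check_compliance function, the "maximum allowed
value" form of the limits, and the confidence calculation are
assumptions made for illustration; this document does not define
a wire format or a confidence metric. Timestamps are assumed to be
ISO 8601 UTC strings in one fixed format so that string comparison
matches time order.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class VerificationRequest:
    agent_id: str
    spec_ref: str
    window_start: str
    window_end: str
    constraint_refs: list
    # Cryptographically secure session identifier.
    session_id: str = field(default_factory=lambda: secrets.token_hex(16))


@dataclass
class ComplianceResult:
    session_id: str
    compliant: bool    # binary compliance indicator
    violations: list   # detailed violation report entries
    confidence: float  # placeholder metric, see below


def check_compliance(request, observations, limits):
    """Evaluate monitoring data against per-property upper limits.

    `observations` is a list of dicts with "timestamp", "property",
    and "value". `limits` maps a property to its maximum allowed value.
    """
    relevant = [
        o for o in observations
        if request.window_start <= o["timestamp"] <= request.window_end
        and o["property"] in request.constraint_refs
    ]
    violations = [o for o in relevant if o["value"] > limits[o["property"]]]
    # Placeholder confidence: the fraction of requested constraints
    # for which at least one observation was available.
    covered = {o["property"] for o in relevant}
    total = len(request.constraint_refs)
    confidence = len(covered) / total if total else 0.0
    return ComplianceResult(
        request.session_id, not violations, violations, confidence
    )
```

Reporting low confidence when constraints have no observations
avoids presenting missing data as evidence of compliance.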
The protocol includes mechanisms for handling streaming
verification scenarios where agent behavior must be validated
continuously rather than in discrete sessions. Streaming
verification employs persistent connections where MONITORING_DATA
messages are transmitted in near real-time as agent actions occur,
enabling immediate detection of behavioral deviations. Compliance
Checkers maintain running assessments of behavioral compliance and
generate COMPLIANCE_ALERT messages when violations are detected or
when behavioral patterns indicate potential drift from specified
policies. All streaming verification sessions MUST implement flow
control mechanisms to prevent resource exhaustion and SHOULD
include adaptive sampling techniques to manage verification
overhead in high-throughput scenarios.
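A running assessment for streaming verification can be kept over a
bounded window of recent observations. The sketch below is one
simple way to do this and is not the protocol's algorithm: the
class StreamingChecker, the alert fields, the single numeric limit,
and the drift rule (window mean above a fraction of the limit) are
all assumptions. Flow control and adaptive sampling are not shown.

```python
from collections import deque


class StreamingChecker:
    """Keeps a running assessment over the most recent observations."""

    def __init__(self, limit, window=5, drift_ratio=0.8):
        self.limit = limit
        self.drift_ratio = drift_ratio
        # Bounded buffer, so memory use stays constant.
        self._recent = deque(maxlen=window)

    def observe(self, value):
        """Process one observation; return an alert dict or None."""
        self._recent.append(value)
        if value > self.limit:
            return {"type": "COMPLIANCE_ALERT", "kind": "violation",
                    "value": value}
        window_full = len(self._recent) == self._recent.maxlen
        mean = sum(self._recent) / len(self._recent)
        if window_full and mean > self.drift_ratio * self.limit:
            # No single value broke the limit, but the recent trend
            # is approaching it.
            return {"type": "COMPLIANCE_ALERT", "kind": "drift",
                    "mean": mean}
        return None
```

Separating "violation" from "drift" alerts lets a consumer treat a
hard breach as urgent while handling a trend as an early warning.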
Attestation generation occurs through ATTESTATION_REQUEST messages
that trigger the creation of cryptographic proofs of compliance
assessment results. These requests MUST specify the compliance
results to be attested, the cryptographic algorithms to be used
for attestation generation, and any additional claims or
assertions to be included in the attestation. The resulting
ATTESTATION_RESPONSE messages contain digitally signed
attestations that bind compliance results to specific agents, time
periods, and behavioral specifications through tamper-evident
cryptographic structures. Attestations MUST include sufficient
information to enable independent verification of compliance
claims and SHOULD reference the complete verification audit trail
to support forensic analysis when behavioral violations occur.
7. Compliance Reporting and Attestation
Compliance reporting in ABVP provides a standardized mechanism for
documenting and cryptographically attesting to agent behavior
verification results. A compliance report MUST contain the agent
identifier, verification period, evaluated behavior
specifications, compliance status for each specification, and
supporting evidence including behavioral observations and
verification computations. Reports MUST be generated at
configurable intervals or upon detection of compliance violations,
with emergency reports triggered immediately when critical policy
violations occur. The reporting format MUST support both human-
readable summaries and machine-processable structured data to
enable automated compliance monitoring and audit trail generation.
Cryptographic attestation ensures the integrity and non-
repudiation of compliance reports through digital signatures and
hash chain mechanisms. Each compliance report MUST be digitally
signed by the generating Compliance Checker using keys certified
within the ABVP trust framework. Attestations MUST include a
timestamp from a trusted time source, the hash of the previous
attestation to form a verification chain, and sufficient
cryptographic binding to prevent tampering or replay attacks. The
attestation format SHOULD follow established standards such as RFC
8392 (CWT) or RFC 7519 (JWT) to ensure interoperability with
existing security infrastructures.
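The chaining of attestations by previous hash can be illustrated
with the standard library alone. In this sketch an HMAC over a
shared key stands in for the digital signature, purely to keep the
example self-contained; a real deployment would use asymmetric
signatures with certified keys, and the HMAC gives neither
non-repudiation nor third-party verifiability. The function names
and field names are hypothetical.

```python
import hashlib
import hmac
import json


def _canonical(obj):
    """Serialize deterministically so hashes are reproducible."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def _digest(attestation):
    return hashlib.sha256(_canonical(attestation)).hexdigest()


def make_attestation(report, prev, key, timestamp):
    """Create an attestation linked to the previous one (or None)."""
    body = {
        "report": report,
        "timestamp": timestamp,  # assumed to come from a trusted source
        "prev_hash": _digest(prev) if prev else None,
    }
    # HMAC is a stand-in for a digital signature in this sketch.
    tag = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return {**body, "signature": tag}


def verify_chain(chain, key):
    """Check every signature and every link to the previous attestation."""
    prev = None
    for att in chain:
        body = {k: v for k, v in att.items() if k != "signature"}
        expected = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, att["signature"]):
            return False
        if att["prev_hash"] != (_digest(prev) if prev else None):
            return False
        prev = att
    return True
```

Because each attestation commits to the hash of its predecessor,
removing or altering an earlier entry invalidates every later one.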
Trust chain establishment requires a hierarchical certification
authority structure where Compliance Checkers obtain certificates
from trusted ABVP Certificate Authorities. Root certificates for
ABVP trust anchors MUST be distributed through secure channels and
updated using standard certificate management practices as defined
in RFC 5280. Verification entities MUST validate the complete
certificate chain from the signing Compliance Checker to a trusted
root before accepting attestations. Certificate revocation MUST be
supported through standard mechanisms such as Certificate
Revocation Lists (CRLs) or Online Certificate Status Protocol
(OCSP) as specified in RFC 5280 and RFC 6960 respectively.
The compliance reporting protocol defines specific message formats
for distributing attestations to interested parties including
agent operators, regulatory authorities, and other verification
systems. Compliance reports MAY be distributed through push
mechanisms to subscribed entities or pulled on-demand through
standardized query interfaces. Report distribution MUST preserve
attestation integrity while allowing for appropriate access
control based on the sensitivity of the reported agent behaviors.
Long-term storage and archival of compliance reports SHOULD
implement tamper-evident logging mechanisms to support forensic
analysis and regulatory compliance requirements.
8. Security Considerations
The ABVP verification infrastructure introduces several security
considerations that must be addressed to ensure the integrity and
trustworthiness of agent behavior verification. The protocol's
reliance on continuous monitoring and attestation creates
potential attack vectors that could compromise the verification
process itself. Attackers may attempt to subvert verification
mechanisms to mask non-compliant agent behavior or to falsely
indicate compliance violations where none exist. The verification
system MUST be designed with the assumption that both the
monitored agents and the verification infrastructure may be
targets of sophisticated adversaries seeking to undermine
behavioral compliance validation.
Attestation integrity represents a critical security requirement
for ABVP implementations. Verification attestations MUST be
cryptographically signed using mechanisms that provide
non-repudiation and tamper detection. The attestation chain MUST
be anchored in a hardware root of trust, such as a hardware
security module (HSM) or Trusted Platform Module (TPM), to prevent
forgery of compliance attestations. Implementations SHOULD employ
time-stamping mechanisms to prevent replay attacks where old
attestations are reused to mask current non-compliance. The
cryptographic algorithms used for attestation signing MUST conform
to current best practices for digital signatures and SHOULD
support algorithm agility to enable updates as cryptographic
standards evolve. Key management for attestation signing MUST
follow established security practices, including regular key
rotation and secure key storage.
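The replay protection described above can be sketched as a
freshness window combined with a cache of already-seen nonces. This
is a non-normative illustration: the envelope fields, the 300-second
window, and the function names are assumptions. HMAC is used here
only as a standard-library stand-in for a signature; it does not
provide non-repudiation, so a real deployment would use an
asymmetric signature scheme.

```python
import hashlib
import hmac
import json

MAX_AGE_SECONDS = 300  # hypothetical freshness window


def _canonical(envelope: dict) -> bytes:
    return json.dumps(envelope, sort_keys=True,
                      separators=(",", ":")).encode("utf-8")


def sign_attestation(key: bytes, body: dict, nonce: str, now: float) -> dict:
    # HMAC stands in for a real digital signature (assumption).
    envelope = {"body": body, "nonce": nonce, "issued_at": now}
    mac = hmac.new(key, _canonical(envelope), hashlib.sha256).hexdigest()
    return {**envelope, "mac": mac}


def accept_attestation(key: bytes, att: dict, seen_nonces: set,
                       now: float) -> bool:
    envelope = {k: att[k] for k in ("body", "nonce", "issued_at")}
    expected = hmac.new(key, _canonical(envelope), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, att["mac"]):
        return False  # altered content or wrong key
    if not 0 <= now - att["issued_at"] <= MAX_AGE_SECONDS:
        return False  # stale, or timestamped in the future
    if att["nonce"] in seen_nonces:
        return False  # replay inside the freshness window
    seen_nonces.add(att["nonce"])
    return True
```

The nonce cache only needs to retain entries for the length of the
freshness window, since older attestations are rejected on age
alone.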
The distributed nature of ABVP verification creates additional
security challenges related to verification node compromise and
Byzantine behavior among verification participants. Verification
nodes may be compromised by attackers seeking to manipulate
compliance reporting or inject false verification results.
Implementations MUST employ consensus mechanisms or threshold-
based verification approaches to detect and mitigate the impact of
compromised verification nodes. The protocol SHOULD include
mechanisms for verification node authentication and authorization
to prevent unauthorized participants from joining verification
networks. Network communications between verification components
MUST be encrypted and authenticated to prevent eavesdropping and
man-in-the-middle attacks. Implementations SHOULD apply rate
limiting and anomaly detection to detect and mitigate
denial-of-service attacks against verification infrastructure.
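A threshold-based approach can be sketched as follows. This is a
non-normative illustration in which each verification node reports
a verdict string and a result is accepted only when at least
`threshold` nodes agree. The function name, the verdict values, and
the strict-majority check are assumptions, not part of ABVP.

```python
from collections import Counter
from typing import Optional


def quorum_verdict(votes: dict, threshold: int) -> Optional[str]:
    """Return the verdict reported by at least `threshold` nodes, else None.

    `votes` maps a node identifier to the verdict that node reported.
    """
    if not votes:
        return None
    # A strict majority guarantees that at most one verdict can qualify.
    if threshold <= len(votes) // 2:
        raise ValueError("threshold must exceed half of the voting nodes")
    verdict, count = Counter(votes.values()).most_common(1)[0]
    return verdict if count >= threshold else None
```

With four nodes and a threshold of three, a single compromised node
can neither force a verdict nor block agreement among the remaining
three. The sketch assumes node identities have already been
authenticated, so that one attacker cannot cast several votes.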
Behavioral specification tampering and specification substitution
attacks pose significant threats to the ABVP framework's
effectiveness. Attackers may attempt to modify behavioral
specifications to make non-compliant behavior appear compliant or
to introduce specifications that are impossible to verify
accurately. Behavioral specifications MUST be cryptographically
protected through digital signatures and integrity checking
mechanisms. The protocol MUST include versioning and change
tracking for behavioral specifications to detect unauthorized
modifications. Verification systems SHOULD implement specification
validation to detect specifications that contain logical
inconsistencies or verification bypasses. Access controls for
specification modification MUST follow the principle of least
privilege and include audit logging of all specification changes.
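As a non-normative illustration of versioning and change detection,
the sketch below records a SHA-256 digest for each published
specification version and rejects any copy whose content no longer
matches. The SpecHistory class and the "version" field are
assumptions. A bare digest is not a signature; in practice the
recorded digests would themselves need to be signed.

```python
import hashlib
import json


def spec_digest(spec: dict) -> str:
    # Canonical JSON so that key order does not change the digest.
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SpecHistory:
    """Append-only record of version -> digest for one specification."""

    def __init__(self) -> None:
        self._digests = {}

    def publish(self, spec: dict) -> None:
        version = spec["version"]
        if version in self._digests:
            # Published versions are immutable; changes need a new version.
            raise ValueError(f"version {version} already published")
        self._digests[version] = spec_digest(spec)

    def is_authentic(self, spec: dict) -> bool:
        expected = self._digests.get(spec.get("version"))
        return expected is not None and expected == spec_digest(spec)
```

Refusing to republish an existing version number means that any
change to a specification is visible as a new version rather than
as a silent edit.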
The ABVP verification process may inadvertently expose sensitive
information about agent operations, internal state, or the systems
being monitored. Verification data collection MUST be designed to
minimize information disclosure while maintaining verification
effectiveness. Implementations SHOULD employ privacy-preserving
techniques such as zero-knowledge proofs or selective disclosure
mechanisms where appropriate to limit exposure of sensitive
operational details. Verification logs and attestations MUST be
protected against unauthorized access and SHOULD include data
retention policies that balance verification auditability with
privacy requirements. Deployments MUST account for the implications
of cross-border data flows when verification infrastructure spans
multiple jurisdictions with different privacy regulations.
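Selective disclosure can be approximated with salted hash
commitments, as in the non-normative sketch below. The verifier
publishes one commitment per field and later reveals only the
fields a given recipient is entitled to see. This is not a
zero-knowledge proof, and the field names and functions are
assumptions made for illustration.

```python
import hashlib
import hmac
import secrets


def _commit(salt: str, name: str, value: str) -> str:
    return hashlib.sha256(f"{salt}|{name}|{value}".encode("utf-8")).hexdigest()


def commit_fields(record: dict) -> tuple:
    """Return (commitments, openings) with one salted digest per field."""
    commitments, openings = {}, {}
    for name, value in record.items():
        salt = secrets.token_hex(16)  # fresh random salt per field
        commitments[name] = _commit(salt, name, value)
        openings[name] = (salt, value)
    return commitments, openings


def check_disclosure(commitments: dict, name: str, salt: str,
                     value: str) -> bool:
    """Check a revealed (salt, value) pair against the published commitment."""
    expected = commitments.get(name)
    if expected is None:
        return False
    return hmac.compare_digest(expected, _commit(salt, name, value))
```

The per-field salt prevents a recipient from guessing undisclosed
low-entropy values, such as a verdict, by hashing candidate values
and comparing them with the commitment.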
Side-channel attacks and covert channels represent additional
security considerations for ABVP implementations. The verification
process itself may create observable patterns that could be
exploited by attackers to infer information about agent behavior
or verification outcomes. Timing-based side channels in
verification operations can reveal information about the
complexity or results of compliance checking. Implementations
SHOULD consider countermeasures such as constant-time operations
and traffic analysis resistance where appropriate. Implementations
MUST also assess how verification metadata and communication
patterns might be used to build profiles of agent behavior that
could compromise operational security or reveal sensitive system
characteristics.
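As one small, non-normative example of a constant-time operation,
digest or MAC comparison can use a comparison routine that does not
exit at the first mismatching byte. The helper name digests_match
is an assumption; hmac.compare_digest is part of the Python
standard library.

```python
import hmac


def digests_match(expected_hex: str, presented_hex: str) -> bool:
    """Compare two hex digests without an early exit on the first mismatch."""
    return hmac.compare_digest(expected_hex.encode("utf-8"),
                               presented_hex.encode("utf-8"))
```

An ordinary equality test on strings may return sooner when the
first differing character appears early, which can leak how much of
a guessed value was correct. The routine above avoids that, although
the lengths of the inputs may still be observable.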
9. IANA Considerations
This document requests the registration of several new namespaces
and protocol parameters with the Internet Assigned Numbers
Authority (IANA). These registrations are necessary to ensure
global uniqueness and interoperability of ABVP implementations
across different vendors and deployment environments.
IANA SHALL establish a new registry titled "Agent Behavior
Verification Protocol (ABVP) Parameters" under the "Structured
Syntax Suffixes" registry group. This registry SHALL contain three
sub-registries: "Behavior Specification Schema Types",
"Verification Message Types", and "Attestation Format
Identifiers". The registration policy for all ABVP parameter sub-
registries SHALL follow the "Specification Required" policy as
defined in RFC 8126, with the additional requirement that all
registrations include a reference to a publicly available
specification document and demonstrate interoperability with at
least one existing ABVP implementation.
The "Behavior Specification Schema Types" sub-registry SHALL
maintain identifiers for standardized behavior specification
formats as defined in Section 5. Each registration MUST include a
unique identifier string, a human-readable description, a
reference specification, and version information. Initial
registrations SHALL include "abvp-policy-v1" for the base policy
specification format and "abvp-constraints-v1" for behavioral
constraint specifications. The "Verification Message Types" sub-
registry SHALL contain identifiers for protocol messages defined
in Section 6, including verification requests, compliance reports,
and attestation messages. Registration entries MUST specify the
message identifier, purpose, required parameters, and applicable
verification contexts.
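The required fields of a "Behavior Specification Schema Types"
registration can be pictured with the non-normative sketch below,
which models a registry entry and enforces identifier uniqueness.
The class names, the example descriptions, and the "this document"
reference value are assumptions; only the two identifiers come from
the text above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class SchemaTypeEntry:
    identifier: str   # unique identifier string
    description: str  # human-readable description
    reference: str    # reference specification
    version: str      # version information


class SchemaTypeRegistry:
    def __init__(self) -> None:
        self._entries = {}

    def register(self, entry: SchemaTypeEntry) -> None:
        # Every required field must be present and non-empty.
        for f in fields(entry):
            if not getattr(entry, f.name).strip():
                raise ValueError(f"missing required field: {f.name}")
        if entry.identifier in self._entries:
            raise ValueError(f"duplicate identifier: {entry.identifier}")
        self._entries[entry.identifier] = entry

    def lookup(self, identifier: str) -> Optional[SchemaTypeEntry]:
        return self._entries.get(identifier)


registry = SchemaTypeRegistry()
registry.register(SchemaTypeEntry(
    "abvp-policy-v1", "Base policy specification format",
    "this document", "1"))
registry.register(SchemaTypeEntry(
    "abvp-constraints-v1", "Behavioral constraint specifications",
    "this document", "1"))
```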
The "Attestation Format Identifiers" sub-registry SHALL maintain
identifiers for cryptographic attestation formats used in
compliance reporting as specified in Section 7. Each registration
MUST include the attestation format identifier, cryptographic
algorithm requirements, trust model specifications, and
interoperability considerations. IANA SHALL reserve the identifier
prefix "abvp-" for protocol-specific attestation formats and MAY
delegate sub-namespace management to recognized standards bodies
for domain-specific attestation requirements. All registry entries
MUST include contact information for the registrant and SHALL be
subject to periodic review to ensure continued relevance and
security adequacy.
Author's Address
Generated by IETF Draft Analyzer
2026-03-01