---
title: "Gap Analysis for Autonomous Agent Protocols"
abbrev: "Agent Gap Analysis"
category: info
docname: draft-nennemann-agent-gap-analysis-00
area: "OPS"
workgroup: "NMOP"
submissiontype: IETF
v: 3
author:
  - fullname: Christian Nennemann
    organization: Independent Researcher
    email: ietf@nennemann.de

normative:
  RFC2119:
  RFC8174:

informative:
  RFC9334:
  RFC7519:
  RFC8615:
  RFC9110:
  I-D.nennemann-wimse-ect:
    title: "Execution Context Tokens for Distributed Agentic Workflows"
    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
  I-D.nennemann-agent-dag-hitl-safety:
    title: "Agent Context Policy Token: DAG Delegation with Human Override"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
  I-D.nennemann-exec-audit:
    title: "Cross-Domain Execution Audit Tokens"
    target: https://datatracker.ietf.org/doc/draft-nennemann-exec-audit/
  I-D.nennemann-agent-behavioral-verification:
    title: "Agent Behavioral Verification and Performance Benchmarking"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-behavioral-verification/
  I-D.nennemann-agent-cascade-prevention:
    title: "Agent Failure Cascade Prevention and Rollback"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-cascade-prevention/
  I-D.nennemann-agent-consensus:
    title: "Multi-Agent Consensus and Capability Negotiation Protocols"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-consensus/
  I-D.nennemann-agent-cross-domain-audit:
    title: "Cross-Domain Agent Audit Trails and Resource Accounting"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-cross-domain-audit/
  I-D.nennemann-agent-override-protocol:
    title: "Standardized Human Override Protocol for Autonomous Agents"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-override-protocol/
  I-D.nennemann-agent-federation-privacy:
    title: "Federated Agent Learning Privacy and Cross-Protocol Migration"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-federation-privacy/
--- abstract
This document maps the IETF autonomous agent landscape,
identifies eleven gap areas where standardization is absent
or insufficient, and introduces six companion drafts that
address the most critical gaps. Over 260 IETF drafts touch
on agent communication, identity, safety, and operations,
yet no single reference architecture ties them together.
This gap analysis provides a structured roadmap for the
standards work needed to enable safe, interoperable, and
auditable autonomous agent ecosystems.
--- middle
# Introduction
Autonomous software agents are increasingly deployed in
network management, cloud orchestration, supply-chain
logistics, and AI-driven workflows. Over 260 IETF drafts
touch on aspects of agent communication, identity, safety,
and operations. However, these efforts remain fragmented:
no single reference architecture ties them together, and
several critical capabilities lack any standardization at
all.

This document provides three contributions:

1. A reference architecture that organizes agent
   capabilities into layers (Section 3).
2. A survey of existing IETF work relevant to autonomous
   agents (Section 4).
3. A gap analysis identifying eleven areas where new or
   extended standards are needed, together with a roadmap
   of six companion drafts that address the most critical
   gaps (Sections 5 and 6).
The intended audience includes working group chairs,
area directors, and protocol designers evaluating where
autonomous-agent standardization efforts should focus.
# Terminology
{::boilerplate bcp14-tagged}
The following terms are used throughout this document:
Agent:
: A software component that acts on behalf of a principal
(human or organizational) to perform tasks, communicate
with other agents, or interact with external systems.
Autonomous Agent:
: An agent capable of executing multi-step tasks without
continuous human supervision, including making decisions
based on policy, context, and environmental state.
Agent Ecosystem:
: The set of agents, their principals, the policies that
govern them, and the infrastructure services (identity,
discovery, audit) they rely on.
DAG (Directed Acyclic Graph):
: A graph structure used to represent multi-step agent
execution plans where tasks have dependency ordering
but no circular dependencies.
HITL (Human-in-the-Loop):
: A control pattern in which a human operator must
approve, modify, or reject an agent action before
it takes effect.
ECT (Execution Context Token):
: A cryptographically signed token that carries the
execution context (task identity, delegated authority,
constraints) for an agent action. See
{{I-D.nennemann-wimse-ect}}.
ACP (Agent Context Policy):
: A policy document that specifies permitted behaviors,
resource limits, and escalation rules for an agent
within a given execution context. See
{{I-D.nennemann-agent-dag-hitl-safety}}.
Behavioral Attestation:
: A verifiable claim that an agent's runtime behavior
conforms to a declared policy or behavioral profile.
Cascade Failure:
: A failure mode in which an error in one agent
propagates through a multi-agent workflow, causing
successive agents to fail or produce incorrect
results.
Consensus Protocol:
: A protocol by which multiple agents reach agreement
on a shared decision, state, or action plan.
Override Signal:
: A message from a human operator or supervisory system
that instructs an agent to halt, modify, or roll back
its current action.
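
As a non-normative illustration of the DAG and HITL terms above,
the following Python sketch derives a valid execution order for a
hypothetical multi-step plan. The task names and the HITL gate are
invented for illustration only.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each task maps to the set of tasks it depends on.
plan = {
    "collect-metrics": set(),
    "analyze": {"collect-metrics"},
    "propose-change": {"analyze"},
    "human-approval": {"propose-change"},                 # HITL gate
    "apply-change": {"propose-change", "human-approval"},
}

# static_order() raises CycleError for circular dependencies and
# otherwise yields tasks in an order that respects all edges.
order = list(TopologicalSorter(plan).static_order())
```

Any schedule produced this way places `human-approval` before
`apply-change`, which is exactly the ordering property a HITL gate
relies on.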
# Reference Architecture
The following diagram presents a layered reference
architecture for autonomous agent ecosystems, annotating
each layer with the gap areas identified by this
analysis.
~~~ ascii-art
+-------------------------------------------------------------+
|                       HUMAN OPERATORS                       |
|               [Override & HITL Layer -- GAP 7]              |
+-------------------------------------------------------------+
|                   AGENT INTERACTION LAYER                   |
|    +---------+  +---------+  +---------+  +---------+       |
|    | Agent A |<>| Agent B |<>| Agent C |<>| Agent D |       |
|    +----+----+  +----+----+  +----+----+  +----+----+       |
|         |  GAP 3:    |  GAP 10:   |  GAP 1:    |            |
|         |  Consensus |  Cap.Neg.  |  Behav.    |            |
|         |            |            |  Verif.    |            |
+---------+------------+------------+------------+------------+
|                    EXECUTION LAYER (ECT)                    |
|  DAG Execution | Checkpoints | Rollback | Circuit Breakers  |
|       [GAP 2: Cascade Prevention]   [GAP 4: Rollback]       |
+-------------------------------------------------------------+
|                  POLICY & GOVERNANCE LAYER                  |
|      ACP-DAG-HITL | Trust Scoring | Assurance Profiles      |
|   [GAP 5: Federated Privacy]  [GAP 6: Cross-Domain Audit]   |
+-------------------------------------------------------------+
|                    INFRASTRUCTURE LAYER                     |
|   Identity | Discovery | Registration | Protocol Bridges    |
|    [GAP 8: Cross-Protocol]  [GAP 9: Resource Accounting]    |
|             [GAP 11: Performance Benchmarking]              |
+-------------------------------------------------------------+
~~~
{: #fig-arch title="Agent Ecosystem Reference Architecture"}
The Human Operators layer provides override and
human-in-the-loop controls (Gap 7). The Agent Interaction
layer is where agents communicate, negotiate capabilities
(Gap 10), reach consensus (Gap 3), and undergo behavioral
verification (Gap 1). The Execution layer manages DAG-based
workflows with cascade prevention (Gap 2) and rollback
(Gap 4). The Policy and Governance layer enforces privacy
in federated learning (Gap 5) and cross-domain audit trails
(Gap 6). The Infrastructure layer handles identity,
discovery, cross-protocol migration (Gap 8), resource
accounting (Gap 9), and performance benchmarking (Gap 11).
# Existing IETF Work
This section briefly surveys existing IETF efforts relevant
to autonomous agent protocols.
## WIMSE (Workload Identity in Multi-System Environments)
The WIMSE working group addresses workload identity in
multi-system environments. Execution Context Tokens
(ECTs) {{I-D.nennemann-wimse-ect}} build on this work and
provide the foundation for carrying delegated authority
and task context across agent boundaries.
## RATS (Remote ATtestation procedureS)
RATS defines architectures and protocols for remote
attestation {{RFC9334}}. Attestation evidence and
appraisal are directly applicable to verifying agent
behavioral claims.
## OAuth and GNAP
OAuth 2.0 and the Grant Negotiation and Authorization
Protocol (GNAP) provide authorization frameworks.
Transaction tokens and token exchange mechanisms are
relevant to agent-to-agent delegation chains.
## SCITT (Supply Chain Integrity, Transparency, and Trust)
SCITT defines transparency services for supply chain
artifacts. Its append-only log model is relevant to
agent audit trails and provenance tracking.
## NMOP (Network Management Operations)
The NMOP working group focuses on network management
operations including intent-based management and
autonomous network functions. Agent-driven network
management is a primary use case for the gaps identified
in this document.
## Industry Protocols: A2A and MCP
The Agent-to-Agent (A2A) protocol and Model Context
Protocol (MCP) are emerging industry standards for agent
communication. While not IETF specifications, they
inform the gap analysis by highlighting capabilities
that lack standardized, interoperable definitions.
# Gap Analysis
This section identifies eleven gaps in the current
standards landscape for autonomous agent protocols.
Gaps are classified by severity:

- CRITICAL: No existing standard addresses the problem;
  failure to standardize poses immediate safety or
  interoperability risks.
- HIGH: Partial coverage exists but is insufficient for
  production autonomous agent deployments.
- MEDIUM: The gap affects efficiency or completeness but
  does not pose immediate safety risks.
## Gap 1: Agent Behavioral Verification {#gap-1}
Severity:
: CRITICAL
Category:
: AI Safety
Problem Statement:
: Autonomous agents operating in production environments
currently lack any standardized mechanism for runtime
verification of policy compliance. While RATS
{{RFC9334}} provides attestation for platform integrity,
no equivalent exists for verifying that an agent's
observed behavior conforms to its declared behavioral
profile or policy constraints.
: Without behavioral verification, operators cannot
distinguish between an agent that faithfully executes
its policy and one that has drifted, been compromised,
or is operating outside its intended parameters. This
is especially dangerous in multi-agent workflows where
one misbehaving agent can corrupt downstream results.
: The gap extends to the absence of standardized
behavioral profiles, verification evidence formats,
and appraisal procedures specific to agent conduct.
Impact if Unaddressed:
: Undetected policy violations in autonomous agents
could cause safety incidents, data breaches, or
cascading failures in critical infrastructure.
Existing Partial Coverage:
: RATS {{RFC9334}} covers platform attestation but not
behavioral compliance. ACP-DAG-HITL
{{I-D.nennemann-agent-dag-hitl-safety}} defines
policies but not verification mechanisms.
Companion Draft:
: {{I-D.nennemann-agent-behavioral-verification}}
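
The appraisal step this gap calls for can be sketched minimally as
follows. This is a non-normative illustration: the profile fields
(`allowed_actions`, `max_targets_per_action`) and the event shape are
hypothetical, not drawn from the companion draft.

```python
def appraise(profile: dict, trace: list) -> list:
    """Return the policy violations found in an observed action trace."""
    allowed = set(profile["allowed_actions"])
    violations = []
    for event in trace:
        if event["action"] not in allowed:
            violations.append(f"disallowed action: {event['action']}")
        elif event.get("targets", 0) > profile["max_targets_per_action"]:
            violations.append(f"target limit exceeded: {event['action']}")
    return violations

# Example: one permitted action, one outside the declared profile.
profile = {"allowed_actions": ["read-config", "set-config"],
           "max_targets_per_action": 10}
trace = [{"action": "read-config", "targets": 3},
         {"action": "delete-config", "targets": 1}]
result = appraise(profile, trace)
```

A real verification mechanism would additionally sign the trace and
the appraisal result so they can serve as attestation evidence.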
## Gap 2: Agent Failure Cascade Prevention {#gap-2}
Severity:
: CRITICAL
Category:
: AI Safety, Resilience
Problem Statement:
: Multi-agent workflows create dependency chains where a
failure in one agent can propagate to downstream agents,
causing cascade failures. No standardized mechanism
exists for circuit breakers, failure isolation, or
cascade containment in agent-to-agent interactions.
: Current practice relies on ad-hoc timeout and retry
logic that is neither interoperable nor sufficient for
complex DAG-structured workflows. Agents from
different vendors or administrative domains have no
common way to signal failure states or trigger
containment procedures.
: The absence of cascade prevention is especially
critical in network management scenarios where agent
failures could propagate to affect live network
operations.
Impact if Unaddressed:
: A single agent failure could cascade through an entire
multi-agent deployment, causing widespread service
disruption with no automated containment.
Existing Partial Coverage:
: ECT {{I-D.nennemann-wimse-ect}} provides execution
context but no failure containment semantics.
Companion Draft:
: {{I-D.nennemann-agent-cascade-prevention}}
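
A circuit breaker of the kind this gap describes can be sketched as
follows (non-normative; the threshold, cooldown, and state handling
are illustrative assumptions, not semantics from the companion draft).

```python
import time

class CircuitBreaker:
    """Opens after `threshold` consecutive failures; rejects calls to
    the downstream agent until `cooldown` seconds pass, then permits a
    half-open trial."""

    def __init__(self, threshold: int = 3, cooldown: float = 30.0,
                 clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None => circuit closed

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # After the cooldown, permit one trial call (half-open).
        return self.clock() - self.opened_at >= self.cooldown

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()
```

Placing such a breaker on every agent-to-agent edge bounds how far a
failure can propagate through a DAG-structured workflow.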
## Gap 3: Multi-Agent Consensus Protocols {#gap-3}
Severity:
: HIGH
Category:
: A2A Protocols
Problem Statement:
: When multiple agents must agree on a shared decision
(e.g., a network configuration change, a resource
allocation plan, or a coordinated response to an
incident), no standardized consensus protocol exists
for agent-to-agent agreement.
: Distributed systems consensus protocols (Raft, Paxos)
are designed for replicated state machines, not for
heterogeneous agents with different capabilities,
trust levels, and policy constraints. Agent consensus
requires additional semantics such as weighted voting,
capability-based participation, and policy-constrained
proposals.
: Without a standard protocol, multi-agent coordination
relies on proprietary mechanisms that are not
interoperable across vendors or administrative domains.
Impact if Unaddressed:
: Multi-vendor agent deployments cannot coordinate
decisions, limiting autonomous agents to single-vendor
silos.
Existing Partial Coverage:
: No existing IETF work directly addresses multi-agent
consensus.
Companion Draft:
: {{I-D.nennemann-agent-consensus}}
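
The weighted-voting semantics described above can be sketched as
follows (non-normative; the strict-majority quorum rule is an
assumption for illustration, not a rule from the companion draft).

```python
def weighted_decision(votes: dict, weights: dict,
                      quorum: float = 0.5) -> bool:
    """Accept a proposal when the trust-weighted 'yes' share strictly
    exceeds `quorum` of the total weight of participating agents."""
    total = sum(weights[agent] for agent in votes)
    yes = sum(weights[agent] for agent, vote in votes.items() if vote)
    return total > 0 and yes / total > quorum
```

Weights here stand in for trust levels; capability-based participation
and policy-constrained proposals would further restrict who may vote
and what may be proposed.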
## Gap 4: Real-Time Agent Rollback Mechanisms {#gap-4}
Severity:
: HIGH
Category:
: Resilience, Operations
Problem Statement:
: When an autonomous agent takes an action that produces
unintended consequences, no standardized mechanism
exists for rolling back the action and restoring
the previous state. This is particularly important
for network management agents that modify device
configurations.
: Rollback requires standardized checkpointing, state
snapshots, and undo semantics that work across agent
boundaries and administrative domains. Current
rollback mechanisms (e.g., NETCONF confirmed-commit)
are protocol-specific and do not generalize to
arbitrary agent actions.
: The lack of rollback is compounded in multi-agent
workflows where multiple agents may have taken
coordinated actions that must be reversed as a unit.
Impact if Unaddressed:
: Operators cannot safely deploy autonomous agents for
critical operations without manual intervention
capability for every action.
Existing Partial Coverage:
: NETCONF confirmed-commit provides rollback for
configuration changes only.
Companion Draft:
: {{I-D.nennemann-agent-cascade-prevention}}
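
Unit rollback via checkpoints can be sketched as follows
(non-normative; the in-memory snapshot store is an illustrative
stand-in for standardized checkpoint and undo semantics).

```python
import copy

class CheckpointStore:
    """Keeps deep-copied state snapshots so that a sequence of agent
    actions can be reverted as a unit."""

    def __init__(self):
        self._snapshots = []

    def checkpoint(self, state: dict) -> int:
        """Snapshot the current state; return its index."""
        self._snapshots.append(copy.deepcopy(state))
        return len(self._snapshots) - 1

    def rollback_to(self, index: int) -> dict:
        """Restore the snapshot at `index`, discarding later ones."""
        state = self._snapshots[index]
        del self._snapshots[index + 1:]
        return copy.deepcopy(state)
```

In a multi-agent workflow, each participating agent would hold such a
checkpoint at the same DAG step so the coordinated actions can be
reversed together.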
## Gap 5: Federated Agent Learning Privacy {#gap-5}
Severity:
: HIGH
Category:
: Privacy, Federated Systems
Problem Statement:
: Agents that participate in federated learning or
share operational data across administrative domains
require privacy guarantees that go beyond transport
encryption. No IETF specification addresses the
privacy requirements of federated agent learning,
including differential privacy parameters, data
minimization for shared agent telemetry, and
consent management for cross-domain data sharing.
: As agents are deployed across organizational
boundaries, the data they generate and share can
reveal sensitive information about network topology,
operational patterns, and business logic.
Privacy-preserving mechanisms specific to agent
interactions are needed.
Impact if Unaddressed:
: Organizations will be unable to participate in
federated agent ecosystems without unacceptable
privacy risks, limiting the value of multi-domain
agent deployments.
Existing Partial Coverage:
: General privacy frameworks exist but none address
agent-specific federated learning scenarios.
Companion Draft:
: {{I-D.nennemann-agent-federation-privacy}}
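
One ingredient named above, differential privacy, can be sketched for
a simple counting query (non-normative; the sensitivity-1 counting
query and the choice of epsilon are illustrative assumptions).

```python
import math
import random

def dp_noisy_count(true_count: int, epsilon: float,
                   rng: random.Random) -> float:
    """Perturb a counting query (sensitivity 1) with Laplace noise of
    scale 1/epsilon before the value leaves the domain."""
    scale = 1.0 / epsilon
    u = rng.random() - 0.5                 # uniform on [-0.5, 0.5)
    # Inverse-CDF sampling of the Laplace distribution.
    noise = -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)
    return true_count + noise
```

The noise is zero-mean, so aggregate statistics remain useful across
the federation while any single shared value is perturbed.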
## Gap 6: Cross-Domain Agent Audit Trails {#gap-6}
Severity:
: HIGH
Category:
: Audit, Compliance
Problem Statement:
: When agents operate across multiple administrative
domains, their actions must be auditable end-to-end.
No standardized format exists for cross-domain agent
audit trails that preserves causal ordering, links
related actions across domain boundaries, and provides
tamper-evident logging.
: Execution Audit Tokens {{I-D.nennemann-exec-audit}}
provide per-action audit records, but no standard
defines how these records are aggregated, correlated,
and queried across domains. SCITT provides
transparency log primitives but does not define
agent-specific audit semantics.
: Regulatory and compliance requirements increasingly
demand end-to-end audit trails for automated
decision-making, making this gap urgent for
enterprise deployments.
Impact if Unaddressed:
: Organizations cannot demonstrate compliance for
cross-domain agent operations, blocking adoption
in regulated industries.
Existing Partial Coverage:
: SCITT provides transparency log primitives.
{{I-D.nennemann-exec-audit}} defines per-action
audit tokens.
Companion Draft:
: {{I-D.nennemann-agent-cross-domain-audit}}
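
The tamper-evident, causally linked logging described above can be
sketched with a hash chain (non-normative; the record shape is
hypothetical, and a production design would build on SCITT-style
transparency logs rather than a flat list).

```python
import hashlib
import json

def append_record(log: list, record: dict) -> dict:
    """Append an audit record linked to the previous entry's hash."""
    prev = log[-1]["hash"] if log else "0" * 64
    body = {"prev": prev, "record": record}
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    entry = {**body, "hash": digest}
    log.append(entry)
    return entry

def verify_chain(log: list) -> bool:
    """Recompute every digest; any edit breaks the chain."""
    prev = "0" * 64
    for entry in log:
        body = {"prev": entry["prev"], "record": entry["record"]}
        digest = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != digest:
            return False
        prev = entry["hash"]
    return True
```

Cross-domain correlation would additionally carry the previous
domain's chain head across the boundary, preserving causal ordering
end-to-end.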
## Gap 7: Human Override Standardization {#gap-7}
Severity:
: HIGH
Category:
: AI Safety, Human Control
Problem Statement:
: Autonomous agents must always be subject to human
override, but no cross-vendor protocol exists for
sending override signals to agents. Override
requirements include emergency stop, graceful pause,
parameter modification, and forced rollback.
: Current override mechanisms are vendor-specific and
cannot be used in multi-vendor agent deployments.
ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}}
defines when human approval is required but does not
specify the protocol for delivering override signals
to running agents.
: The absence of a standard override protocol creates
a significant safety risk: if an agent misbehaves,
the operator may not have a reliable way to stop it
if the agent comes from a different vendor than the
management platform.
Impact if Unaddressed:
: Operators lose the ability to control autonomous
agents in emergency situations, creating unacceptable
safety risks.
Existing Partial Coverage:
: ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}}
defines HITL policies but not override delivery.
Companion Draft:
: {{I-D.nennemann-agent-override-protocol}}
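
The override kinds named above (emergency stop, graceful pause,
parameter modification, forced rollback) can be sketched as follows
(non-normative; the signal fields and agent-state shape are
hypothetical, not the companion draft's wire format).

```python
from enum import Enum

class OverrideKind(Enum):
    EMERGENCY_STOP = "emergency-stop"
    PAUSE = "pause"
    MODIFY = "modify"
    ROLLBACK = "rollback"

def apply_override(state: dict, signal: dict) -> dict:
    """Apply a human override signal to an agent's run state."""
    kind = OverrideKind(signal["kind"])
    if kind is OverrideKind.EMERGENCY_STOP:
        return {**state, "status": "halted"}
    if kind is OverrideKind.PAUSE:
        return {**state, "status": "paused"}
    if kind is OverrideKind.MODIFY:
        return {**state, "params": {**state["params"], **signal["params"]}}
    return {**state, "status": "rolling-back"}
```

A cross-vendor protocol would also have to authenticate the signal's
origin and acknowledge delivery, neither of which this sketch covers.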
## Gap 8: Cross-Protocol Agent Migration {#gap-8}
Severity:
: MEDIUM
Category:
: Interoperability
Problem Statement:
: Agents may need to migrate between protocol
environments (e.g., from an A2A-based system to an
MCP-based system) while preserving execution context,
identity, and accumulated state. No standard defines
how agent context is translated or preserved across
protocol boundaries.
: As the agent ecosystem matures, agents will
increasingly operate in heterogeneous protocol
environments. Without migration standards, agents
are locked into specific protocol ecosystems.
Impact if Unaddressed:
: Agent deployments become fragmented across protocol
silos, reducing interoperability and increasing
operational complexity.
Existing Partial Coverage:
: ECT {{I-D.nennemann-wimse-ect}} provides a
protocol-neutral context token but does not define
migration procedures.
Companion Draft:
: {{I-D.nennemann-agent-federation-privacy}}
## Gap 9: Agent Resource Accounting and Billing {#gap-9}
Severity:
: MEDIUM
Category:
: Operations, Economics
Problem Statement:
: Autonomous agents consume computational, network, and
API resources across administrative domains. No
standardized mechanism exists for tracking, reporting,
and reconciling resource consumption by agents,
especially in multi-domain scenarios where an agent's
actions incur costs in domains other than its own.
: Resource accounting is a prerequisite for sustainable
multi-domain agent ecosystems where organizations
need to track and charge for agent resource usage.
Impact if Unaddressed:
: Organizations cannot accurately track or bill for
agent resource consumption, hindering commercial
multi-domain agent deployments.
Existing Partial Coverage:
: No existing IETF work addresses agent-specific
resource accounting.
Companion Draft:
: {{I-D.nennemann-agent-cross-domain-audit}}
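
The aggregation step of cross-domain resource accounting can be
sketched as follows (non-normative; the usage-record fields are
hypothetical placeholders for a standardized record format).

```python
from collections import defaultdict

def reconcile(records: list) -> dict:
    """Sum per-action usage records into (domain, resource) totals
    suitable for cross-domain billing reconciliation."""
    totals = defaultdict(float)
    for rec in records:
        totals[(rec["domain"], rec["resource"])] += rec["amount"]
    return dict(totals)
```

Linking each usage record to an audit-trail entry would let the
charged domain verify that the billed consumption actually occurred.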
## Gap 10: Agent Capability Negotiation {#gap-10}
Severity:
: MEDIUM
Category:
: A2A Protocols
Problem Statement:
: When agents interact, they need to discover and
negotiate each other's capabilities dynamically.
No standardized capability negotiation protocol
exists for agents to advertise their functions,
agree on interaction protocols, and establish
compatible communication parameters.
Well-Known URIs {{RFC8615}} and HTTP {{RFC9110}}
provide discovery primitives, but agent capability
negotiation requires richer semantics, including
versioning, conditional capabilities, and
policy-constrained capability advertisement.
Impact if Unaddressed:
: Agent interactions require pre-configured knowledge
of peer capabilities, limiting dynamic composition
and ad-hoc agent collaboration.
Existing Partial Coverage:
: HTTP content negotiation and well-known URIs provide
basic discovery but not agent-specific capability
negotiation.
Companion Draft:
: {{I-D.nennemann-agent-consensus}}
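
A minimal version-intersection negotiation can be sketched as follows
(non-normative; the capability names and integer versioning are
illustrative assumptions).

```python
def negotiate(offered: dict, required: dict):
    """Pick, per required capability, the highest version both peers
    support; return None if any capability has no overlap."""
    agreed = {}
    for cap, versions in required.items():
        common = set(offered.get(cap, [])) & set(versions)
        if not common:
            return None
        agreed[cap] = max(common)
    return agreed
```

The richer semantics the gap calls for, such as conditional and
policy-constrained capabilities, would filter `offered` before this
intersection step.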
## Gap 11: Agent Performance Benchmarking {#gap-11}
Severity:
: MEDIUM
Category:
: Operations, Metrics
Problem Statement:
: No standardized metrics or benchmarking methodology
exists for evaluating autonomous agent performance.
Without common metrics, operators cannot compare
agent implementations, set performance baselines,
or detect performance degradation.
: Agent performance encompasses multiple dimensions:
task completion accuracy, latency, resource
efficiency, safety compliance rate, and behavioral
consistency. Standardized metrics and measurement
procedures are needed for each dimension.
Impact if Unaddressed:
: Operators cannot objectively evaluate or compare
autonomous agent implementations, hindering
procurement and deployment decisions.
Existing Partial Coverage:
: No existing IETF work addresses agent performance
benchmarking.
Companion Draft:
: {{I-D.nennemann-agent-behavioral-verification}}
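
The metric dimensions listed above can be reduced to a comparable
summary as follows (non-normative; the run-record fields are
hypothetical, not a standardized measurement format).

```python
def summarize(runs: list) -> dict:
    """Compute completion accuracy, median latency, and the
    safety-compliance rate over a set of benchmark runs."""
    n = len(runs)
    lat = sorted(r["latency_ms"] for r in runs)
    median = lat[n // 2] if n % 2 else (lat[n // 2 - 1] + lat[n // 2]) / 2
    return {
        "accuracy": sum(1 for r in runs if r["correct"]) / n,
        "median_latency_ms": median,
        "safety_compliance": sum(1 for r in runs if not r["violation"]) / n,
    }
```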
# Companion Draft Roadmap
The following table maps each companion draft to the
gaps it addresses and its priority level:
| Companion Draft | Gaps | Priority |
|:---|:---:|:---:|
| draft-nennemann-agent-behavioral-verification | 1, 11 | CRITICAL |
| draft-nennemann-agent-cascade-prevention | 2, 4 | CRITICAL |
| draft-nennemann-agent-consensus | 3, 10 | HIGH |
| draft-nennemann-agent-cross-domain-audit | 6, 9 | HIGH |
| draft-nennemann-agent-override-protocol | 7 | HIGH |
| draft-nennemann-agent-federation-privacy | 5, 8 | HIGH |
{: #tab-roadmap title="Companion Draft Roadmap"}
The dependency relationships between companion drafts
are shown below:
~~~ ascii-art
behavioral-verification -------+
        |                      |
        v                      |
cascade-prevention             |
        |                      |
        v                      v
override-protocol     cross-domain-audit
        |                      |
        v                      v
    consensus         federation-privacy
~~~
{: #fig-deps title="Companion Draft Dependencies"}
The behavioral-verification draft (Companion A) is
foundational because its behavioral attestation format
is used by the cascade-prevention and cross-domain-audit
drafts. The cascade-prevention draft (Companion B)
defines failure containment that the override-protocol
(Companion E) builds upon. The consensus draft
(Companion C) extends behavioral verification with
multi-agent agreement. The cross-domain-audit draft
(Companion D) provides the audit infrastructure to which
the federation-privacy draft (Companion F) adds privacy
controls.
# Security Considerations
The gaps identified in this document have direct security
implications:
Behavioral Verification (Gap 1):
: Without runtime behavioral verification, compromised
or malfunctioning agents cannot be detected, creating
opportunities for attacks that exploit trusted agent
identities to perform unauthorized actions.
Cascade Prevention (Gap 2):
: The absence of cascade containment creates a
denial-of-service vector where an attacker can compromise
a single agent to disrupt an entire multi-agent workflow.
Human Override (Gap 7):
: Without standardized override protocols, safety-critical
agent actions may not be stoppable, creating an
unacceptable risk profile for autonomous deployments.
Cross-Domain Audit (Gap 6):
: Gaps in audit trails across domain boundaries create
opportunities for agents to take actions that evade
detection and accountability.
Federated Privacy (Gap 5):
: Sharing agent operational data across domains without
adequate privacy controls can expose sensitive
organizational information, network topology, and
business logic.
Implementers of autonomous agent systems SHOULD treat the
CRITICAL and HIGH severity gaps as security requirements
and prioritize their resolution.
# IANA Considerations
This document has no IANA actions.
--- back
# Acknowledgments
The author thanks the participants of the WIMSE, RATS,
and NMOP working groups for discussions that informed
this analysis.