Gap Analysis for Autonomous Agent Protocols

Independent Researcher

ietf@nennemann.de

OPS NMOP

This document maps the IETF autonomous agent landscape, identifies eleven gap areas where standardization is absent or insufficient, and introduces six companion drafts that address the most critical gaps. Over 260 IETF drafts touch on agent communication, identity, safety, and operations, yet no single reference architecture ties them together. This gap analysis provides a structured roadmap for the standards work needed to enable safe, interoperable, and auditable autonomous agent ecosystems.

Introduction Autonomous software agents are increasingly deployed in network management, cloud orchestration, supply-chain logistics, and AI-driven workflows. Over 260 IETF drafts touch on aspects of agent communication, identity, safety, and operations. However, these efforts remain fragmented: no single reference architecture ties them together, and several critical capabilities lack any standardization at all.

This document provides three contributions:

1. A reference architecture that organizes agent capabilities into layers (Section 3).
2. A survey of existing IETF work relevant to autonomous agents (Section 4).
3. A gap analysis identifying eleven areas where new or extended standards are needed, together with a roadmap of six companion drafts that address the most critical gaps (Sections 5 and 6).

The intended audience includes working group chairs, area directors, and protocol designers evaluating where autonomous-agent standardization efforts should focus.

Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 when, and only when, they appear in all capitals, as shown here. The following terms are used throughout this document:

Agent:: A software component that acts on behalf of a principal (human or organizational) to perform tasks, communicate with other agents, or interact with external systems.
Autonomous Agent:: An agent capable of executing multi-step tasks without continuous human supervision, including making decisions based on policy, context, and environmental state.
Agent Ecosystem:: The set of agents, their principals, the policies that govern them, and the infrastructure services (identity, discovery, audit) they rely on.
DAG (Directed Acyclic Graph):: A graph structure used to represent multi-step agent execution plans where tasks have dependency ordering but no circular dependencies.
HITL (Human-in-the-Loop):: A control pattern in which a human operator must approve, modify, or reject an agent action before it takes effect.
ECT (Execution Context Token):: A cryptographically signed token that carries the execution context (task identity, delegated authority, constraints) for an agent action. See "Execution Context Tokens for Distributed Agentic Workflows".
ACP (Agent Context Policy):: A policy document that specifies permitted behaviors, resource limits, and escalation rules for an agent within a given execution context. See "Agent Context Policy Token: DAG Delegation with Human Override".
Behavioral Attestation:: A verifiable claim that an agent's runtime behavior conforms to a declared policy or behavioral profile.
Cascade Failure:: A failure mode in which an error in one agent propagates through a multi-agent workflow, causing successive agents to fail or produce incorrect results.
Consensus Protocol:: A protocol by which multiple agents reach agreement on a shared decision, state, or action plan.
Override Signal:: A message from a human operator or supervisory system that instructs an agent to halt, modify, or roll back its current action.

Reference Architecture The following diagram presents a layered reference architecture for autonomous agent ecosystems. Each layer identifies the relevant gap areas addressed by this analysis.

Existing IETF Work This section briefly surveys existing IETF efforts relevant to autonomous agent protocols.

WIMSE (Workload Identity in Multi-System Environments) The WIMSE working group addresses workload identity and Execution Context Tokens (ECTs). ECTs provide the foundation for carrying delegated authority and task context across agent boundaries.

RATS (Remote ATtestation procedureS) RATS defines architectures and protocols for remote attestation. Attestation evidence and appraisal are directly applicable to verifying agent behavioral claims.

OAuth and GNAP OAuth 2.0 and the Grant Negotiation and Authorization Protocol (GNAP) provide authorization frameworks. Transaction tokens and token exchange mechanisms are relevant to agent-to-agent delegation chains.

SCITT (Supply Chain Integrity, Transparency, and Trust) SCITT defines transparency services for supply chain artifacts. Its append-only log model is relevant to agent audit trails and provenance tracking.

NMOP (Network Management Operations) The NMOP working group focuses on network management operations including intent-based management and autonomous network functions. Agent-driven network management is a primary use case for the gaps identified in this document.

Industry Protocols: A2A and MCP The Agent-to-Agent (A2A) protocol and Model Context Protocol (MCP) are emerging industry standards for agent communication. While not IETF specifications, they inform the gap analysis by highlighting capabilities that lack standardized, interoperable definitions.

Gap Analysis This section identifies eleven gaps in the current standards landscape for autonomous agent protocols. Gaps are classified by severity:

CRITICAL:: No existing standard addresses the problem; failure to standardize poses immediate safety or interoperability risks.
HIGH:: Partial coverage exists but is insufficient for production autonomous agent deployments.
MEDIUM:: The gap affects efficiency or completeness but does not pose immediate safety risks.

Gap 1: Agent Behavioral Verification

Severity:: CRITICAL
Category:: AI Safety
Problem Statement:: Autonomous agents operating in production environments currently lack any standardized mechanism for runtime verification of policy compliance. While RATS provides attestation for platform integrity, no equivalent exists for verifying that an agent's observed behavior conforms to its declared behavioral profile or policy constraints.
: Without behavioral verification, operators cannot distinguish between an agent that faithfully executes its policy and one that has drifted, been compromised, or is operating outside its intended parameters. This is especially dangerous in multi-agent workflows where one misbehaving agent can corrupt downstream results.
: The gap extends to the absence of standardized behavioral profiles, verification evidence formats, and appraisal procedures specific to agent conduct.
Impact if Unaddressed:: Undetected policy violations in autonomous agents could cause safety incidents, data breaches, or cascading failures in critical infrastructure.
Existing Partial Coverage:: RATS covers platform attestation but not behavioral compliance. ACP-DAG-HITL defines policies but not verification mechanisms.
Companion Draft:: draft-nennemann-agent-behavioral-verification
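
The following non-normative sketch illustrates what appraising observed agent behavior against a declared behavioral profile might look like. The `BehavioralProfile` fields, the action names, and the `appraise` function are hypothetical and are not taken from any companion draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BehavioralProfile:
    """Hypothetical declared profile: permitted actions and an action-count limit."""
    allowed_actions: frozenset
    max_actions_per_task: int


def appraise(profile, observed_actions):
    """Return a list of violations; an empty list means the behavior conforms."""
    violations = []
    for action in observed_actions:
        if action not in profile.allowed_actions:
            violations.append(f"action not in profile: {action}")
    if len(observed_actions) > profile.max_actions_per_task:
        violations.append("action count exceeds declared limit")
    return violations


profile = BehavioralProfile(frozenset({"read_config", "propose_change"}), 3)
compliant = appraise(profile, ["read_config", "propose_change"])
drifted = appraise(profile, ["read_config", "push_config"])
```

In this sketch a verifier only compares an action trace with the profile; a real evidence format would also need to establish that the trace itself is trustworthy.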

Gap 2: Agent Failure Cascade Prevention

Severity:: CRITICAL
Category:: AI Safety, Resilience
Problem Statement:: Multi-agent workflows create dependency chains where a failure in one agent can propagate to downstream agents, causing cascade failures. No standardized mechanism exists for circuit breakers, failure isolation, or cascade containment in agent-to-agent interactions.
: Current practice relies on ad-hoc timeout and retry logic that is neither interoperable nor sufficient for complex DAG-structured workflows. Agents from different vendors or administrative domains have no common way to signal failure states or trigger containment procedures.
: The absence of cascade prevention is especially critical in network management scenarios where agent failures could propagate to affect live network operations.
Impact if Unaddressed:: A single agent failure could cascade through an entire multi-agent deployment, causing widespread service disruption with no automated containment.
Existing Partial Coverage:: ECT provides execution context but no failure containment semantics.
Companion Draft:: draft-nennemann-agent-cascade-prevention
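
As a non-normative illustration of the circuit-breaker idea mentioned above, the sketch below isolates a downstream agent after a number of consecutive failures. The `CircuitBreaker` class and its threshold are assumptions made for this example; no existing specification defines them.

```python
class CircuitBreaker:
    """Hypothetical per-downstream-agent breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0
        self.open = False

    def call(self, task):
        """Run task() unless the breaker is open; track consecutive failures."""
        if self.open:
            raise RuntimeError("circuit open: downstream agent isolated")
        try:
            result = task()
        except Exception:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                self.open = True
            raise
        self.consecutive_failures = 0
        return result


def failing_agent():
    raise ValueError("downstream agent error")


breaker = CircuitBreaker(failure_threshold=2)
outcomes = []
for _ in range(3):
    try:
        breaker.call(failing_agent)
    except ValueError:
        outcomes.append("failed")
    except RuntimeError:
        outcomes.append("isolated")
```

After the second failure the breaker opens, so the third call is rejected locally instead of reaching the failing agent. An interoperable design would additionally need a shared way to signal the open state to peers.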

Gap 3: Multi-Agent Consensus Protocols

Severity:: HIGH
Category:: A2A Protocols
Problem Statement:: When multiple agents must agree on a shared decision (e.g., a network configuration change, a resource allocation plan, or a coordinated response to an incident), no standardized consensus protocol exists for agent-to-agent agreement.
: Distributed systems consensus protocols (Raft, Paxos) are designed for replicated state machines, not for heterogeneous agents with different capabilities, trust levels, and policy constraints. Agent consensus requires additional semantics such as weighted voting, capability-based participation, and policy-constrained proposals.
: Without a standard protocol, multi-agent coordination relies on proprietary mechanisms that are not interoperable across vendors or administrative domains.
Impact if Unaddressed:: Multi-vendor agent deployments cannot coordinate decisions, limiting autonomous agents to single-vendor silos.
Existing Partial Coverage:: No existing IETF work directly addresses multi-agent consensus.
Companion Draft:: draft-nennemann-agent-consensus
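
The weighted voting and capability-based participation semantics named above can be sketched as follows. This is an illustrative assumption about how such a rule could work, not a proposed protocol; the function name, the quorum rule, and the capability names are hypothetical.

```python
def weighted_vote(votes, weights, capabilities, required_capability, quorum=0.5):
    """Accept a proposal if the eligible 'yes' weight exceeds the quorum fraction.

    votes: agent -> bool, weights: agent -> number,
    capabilities: agent -> set of capability names (all hypothetical).
    Only agents holding required_capability take part in the vote.
    """
    eligible = [a for a in votes if required_capability in capabilities.get(a, set())]
    total = sum(weights[a] for a in eligible)
    if total == 0:
        return False  # nobody is eligible, so nothing can be agreed
    yes = sum(weights[a] for a in eligible if votes[a])
    return yes / total > quorum


votes = {"agent-a": True, "agent-b": False, "agent-c": True}
weights = {"agent-a": 3, "agent-b": 2, "agent-c": 5}
capabilities = {"agent-a": {"bgp"}, "agent-b": {"bgp"}, "agent-c": set()}

simple_majority = weighted_vote(votes, weights, capabilities, "bgp")
two_thirds = weighted_vote(votes, weights, capabilities, "bgp", quorum=2 / 3)
```

Here agent-c is excluded because it lacks the required capability, so the proposal carries 3 of 5 eligible weight units: enough for a simple majority but not for a two-thirds quorum.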

Gap 4: Real-Time Agent Rollback Mechanisms

Severity:: HIGH
Category:: Resilience, Operations
Problem Statement:: When an autonomous agent takes an action that produces unintended consequences, no standardized mechanism exists for rolling back the action and restoring the previous state. This is particularly important for network management agents that modify device configurations.
: Rollback requires standardized checkpointing, state snapshots, and undo semantics that work across agent boundaries and administrative domains. Current rollback mechanisms (e.g., NETCONF confirmed-commit) are protocol-specific and do not generalize to arbitrary agent actions.
: The lack of rollback is compounded in multi-agent workflows where multiple agents may have taken coordinated actions that must be reversed as a unit.
Impact if Unaddressed:: Operators cannot safely deploy autonomous agents for critical operations without manual intervention capability for every action.
Existing Partial Coverage:: NETCONF confirmed-commit provides rollback for configuration changes only.
Companion Draft:: draft-nennemann-agent-cascade-prevention
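
A minimal, non-normative sketch of reversing coordinated actions as a unit is shown below. It assumes each action can supply its own undo step; the `RollbackUnit` class and the configuration keys are hypothetical.

```python
class RollbackUnit:
    """Hypothetical coordinator: records undo callbacks and reverses them as a unit."""

    def __init__(self):
        self._undo_stack = []

    def apply(self, do, undo):
        """Perform an action and remember how to reverse it."""
        do()
        self._undo_stack.append(undo)

    def rollback(self):
        """Undo recorded actions in reverse order of application."""
        while self._undo_stack:
            self._undo_stack.pop()()


config = {"mtu": 1500, "vlan": 10}
unit = RollbackUnit()
unit.apply(lambda: config.update(mtu=9000), lambda: config.update(mtu=1500))
unit.apply(lambda: config.update(vlan=20), lambda: config.update(vlan=10))
changed = dict(config)  # snapshot after both actions were applied
unit.rollback()
```

Undoing in reverse order matters when later actions depend on earlier ones. The sketch keeps everything in one process; doing the same across agent boundaries and administrative domains is the unsolved part described above.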

Gap 5: Federated Agent Learning Privacy

Severity:: HIGH
Category:: Privacy, Federated Systems
Problem Statement:: Agents that participate in federated learning or share operational data across administrative domains require privacy guarantees that go beyond transport encryption. No IETF specification addresses the privacy requirements of federated agent learning, including differential privacy parameters, data minimization for shared agent telemetry, and consent management for cross-domain data sharing.
: As agents are deployed across organizational boundaries, the data they generate and share can reveal sensitive information about network topology, operational patterns, and business logic. Privacy-preserving mechanisms specific to agent interactions are needed.
Impact if Unaddressed:: Organizations will be unable to participate in federated agent ecosystems without unacceptable privacy risks, limiting the value of multi-domain agent deployments.
Existing Partial Coverage:: General privacy frameworks exist but none address agent-specific federated learning scenarios.
Companion Draft:: draft-nennemann-agent-federation-privacy
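
To make the notion of a differential privacy parameter concrete, the following sketch releases a telemetry count using the standard Laplace mechanism. The function names, the sensitivity of 1, and the choice of epsilon are assumptions for illustration only; the document does not prescribe any particular mechanism.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with epsilon-differential privacy (Laplace mechanism)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # fixed seed only to make the example repeatable
samples = [private_count(100, epsilon=1.0, rng=rng) for _ in range(20000)]
mean_estimate = sum(samples) / len(samples)
```

A smaller epsilon adds more noise and gives stronger privacy. Agreeing on how such parameters are expressed and exchanged between domains is the kind of question a specification in this area would need to answer.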

Gap 6: Cross-Domain Agent Audit Trails

Severity:: HIGH
Category:: Audit, Compliance
Problem Statement:: When agents operate across multiple administrative domains, their actions must be auditable end-to-end. No standardized format exists for cross-domain agent audit trails that preserves causal ordering, links related actions across domain boundaries, and provides tamper-evident logging.
: Execution Audit Tokens provide per-action audit records, but no standard defines how these records are aggregated, correlated, and queried across domains. SCITT provides transparency log primitives but does not define agent-specific audit semantics.
: Regulatory and compliance requirements increasingly demand end-to-end audit trails for automated decision-making, making this gap urgent for enterprise deployments.
Impact if Unaddressed:: Organizations cannot demonstrate compliance for cross-domain agent operations, blocking adoption in regulated industries.
Existing Partial Coverage:: SCITT provides transparency log primitives. Execution Audit Tokens provide per-action audit records.
Companion Draft:: draft-nennemann-agent-cross-domain-audit
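
Tamper-evident logging with preserved ordering can be illustrated with a simple hash chain, sketched below. The record fields and domain names are hypothetical, and the sketch does not represent the SCITT or Execution Audit Token formats.

```python
import hashlib
import json


def _digest(body):
    """Hash a record body using a canonical (sorted-key) JSON encoding."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_record(log, domain, action):
    """Append a record whose hash covers the previous record's hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = {"domain": domain, "action": action, "prev": prev_hash}
    log.append({**body, "hash": _digest(body)})


def verify(log):
    """Recompute every hash and check each link to its predecessor."""
    prev_hash = "0" * 64
    for record in log:
        body = {k: record[k] for k in ("domain", "action", "prev")}
        if record["prev"] != prev_hash or record["hash"] != _digest(body):
            return False
        prev_hash = record["hash"]
    return True


log = []
append_record(log, "domain-a.example", "propose_change")
append_record(log, "domain-b.example", "apply_change")
intact = verify(log)
log[0]["action"] = "noop"  # simulate tampering with an earlier record
tampered_ok = verify(log)
```

Because each record commits to its predecessor, changing an earlier entry invalidates the chain. The chain alone does not say how records from different domains are correlated or queried, which is the gap described above.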

Gap 7: Human Override Standardization

Severity:: HIGH
Category:: AI Safety, Human Control
Problem Statement:: Autonomous agents must always be subject to human override, but no cross-vendor protocol exists for sending override signals to agents. Override requirements include emergency stop, graceful pause, parameter modification, and forced rollback.
: Current override mechanisms are vendor-specific and cannot be used in multi-vendor agent deployments. ACP-DAG-HITL defines when human approval is required but does not specify the protocol for delivering override signals to running agents.
: The absence of a standard override protocol creates a significant safety risk: if an agent misbehaves, the operator may not have a reliable way to stop it if the agent comes from a different vendor than the management platform.
Impact if Unaddressed:: Operators lose the ability to control autonomous agents in emergency situations, creating unacceptable safety risks.
Existing Partial Coverage:: ACP-DAG-HITL defines HITL policies but not override delivery.
Companion Draft:: draft-nennemann-agent-override-protocol
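
The four override requirements listed above (emergency stop, graceful pause, parameter modification, forced rollback) can be sketched as a small signal vocabulary. The enum values, the agent states, and the `Agent` class are hypothetical and serve only to illustrate the idea of a vendor-neutral signal set.

```python
from enum import Enum


class Override(Enum):
    EMERGENCY_STOP = "emergency_stop"
    GRACEFUL_PAUSE = "graceful_pause"
    MODIFY_PARAMETERS = "modify_parameters"
    FORCED_ROLLBACK = "forced_rollback"


class Agent:
    """Hypothetical agent that reacts to override signals."""

    def __init__(self, parameters):
        self.state = "running"
        self.parameters = dict(parameters)
        self._initial = dict(parameters)  # kept so a rollback can restore it

    def handle(self, signal, new_parameters=None):
        """Apply an override signal and return the resulting state."""
        if signal is Override.EMERGENCY_STOP:
            self.state = "stopped"
        elif signal is Override.GRACEFUL_PAUSE:
            self.state = "paused"
        elif signal is Override.MODIFY_PARAMETERS:
            self.parameters.update(new_parameters or {})
        elif signal is Override.FORCED_ROLLBACK:
            self.parameters = dict(self._initial)
            self.state = "rolled_back"
        return self.state


agent = Agent({"rate_limit": 10})
agent.handle(Override.MODIFY_PARAMETERS, {"rate_limit": 5})
modified = dict(agent.parameters)
agent.handle(Override.FORCED_ROLLBACK)
restored = dict(agent.parameters)
final_state = agent.handle(Override.EMERGENCY_STOP)
```

The sketch omits authentication of the operator and delivery guarantees, both of which a real override protocol would have to address.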

Gap 8: Cross-Protocol Agent Migration

Severity:: MEDIUM
Category:: Interoperability
Problem Statement:: Agents may need to migrate between protocol environments (e.g., from an A2A-based system to an MCP-based system) while preserving execution context, identity, and accumulated state. No standard defines how agent context is translated or preserved across protocol boundaries.
: As the agent ecosystem matures, agents will increasingly operate in heterogeneous protocol environments. Without migration standards, agents are locked into specific protocol ecosystems.
Impact if Unaddressed:: Agent deployments become fragmented across protocol silos, reducing interoperability and increasing operational complexity.
Existing Partial Coverage:: ECT provides a protocol-neutral context token but does not define migration procedures.
Companion Draft:: draft-nennemann-agent-federation-privacy

Gap 9: Agent Resource Accounting and Billing

Severity:: MEDIUM
Category:: Operations, Economics
Problem Statement:: Autonomous agents consume computational, network, and API resources across administrative domains. No standardized mechanism exists for tracking, reporting, and reconciling resource consumption by agents, especially in multi-domain scenarios where an agent's actions incur costs in domains other than its own.
: Resource accounting is a prerequisite for sustainable multi-domain agent ecosystems where organizations need to track and charge for agent resource usage.
Impact if Unaddressed:: Organizations cannot accurately track or bill for agent resource consumption, hindering commercial multi-domain agent deployments.
Existing Partial Coverage:: No existing IETF work addresses agent-specific resource accounting.
Companion Draft:: draft-nennemann-agent-cross-domain-audit

Gap 10: Agent Capability Negotiation

Severity:: MEDIUM
Category:: A2A Protocols
Problem Statement:: When agents interact, they need to discover and negotiate each other's capabilities dynamically. No standardized capability negotiation protocol exists for agents to advertise their functions, agree on interaction protocols, and establish compatible communication parameters.
: Well-known URIs and HTTP provide discovery primitives, but agent capability negotiation requires richer semantics including versioning, conditional capabilities, and policy-constrained capability advertisement.
Impact if Unaddressed:: Agent interactions require pre-configured knowledge of peer capabilities, limiting dynamic composition and ad-hoc agent collaboration.
Existing Partial Coverage:: HTTP content negotiation and well-known URIs provide basic discovery but not agent-specific capability negotiation.
Companion Draft:: draft-nennemann-agent-consensus
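
As a non-normative example of versioned capability negotiation, the sketch below picks, for every capability both agents advertise, the highest version they have in common. The capability names and the integer version scheme are assumptions made for this example.

```python
def negotiate(local, peer):
    """Return capability -> highest version supported by both sides.

    local, peer: dict mapping a capability name to a set of integer versions.
    Capabilities without a shared version are left out of the result.
    """
    agreed = {}
    for name in local.keys() & peer.keys():
        common = local[name] & peer[name]
        if common:
            agreed[name] = max(common)
    return agreed


local_caps = {"task-delegation": {1, 2}, "audit-export": {1}}
peer_caps = {"task-delegation": {2, 3}, "telemetry": {1}}
agreed = negotiate(local_caps, peer_caps)
```

Only "task-delegation" is advertised by both agents, and version 2 is the highest shared version. Conditional and policy-constrained capabilities would require a richer data model than plain version sets.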

Gap 11: Agent Performance Benchmarking

Severity:: MEDIUM
Category:: Operations, Metrics
Problem Statement:: No standardized metrics or benchmarking methodology exists for evaluating autonomous agent performance. Without common metrics, operators cannot compare agent implementations, set performance baselines, or detect performance degradation.
: Agent performance encompasses multiple dimensions: task completion accuracy, latency, resource efficiency, safety compliance rate, and behavioral consistency. Standardized metrics and measurement procedures are needed for each dimension.
Impact if Unaddressed:: Operators cannot objectively evaluate or compare autonomous agent implementations, hindering procurement and deployment decisions.
Existing Partial Coverage:: No existing IETF work addresses agent performance benchmarking.
Companion Draft:: draft-nennemann-agent-behavioral-verification
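
Three of the performance dimensions named above (task completion, latency, and safety compliance rate) could be computed from run records as in the following sketch. The record fields and metric names are hypothetical; no standardized definitions exist yet.

```python
from statistics import median


def summarize(runs):
    """Summarize runs given as dicts with 'ok', 'latency_s', and 'violations'."""
    n = len(runs)
    if n == 0:
        raise ValueError("no runs to summarize")
    return {
        "completion_rate": sum(r["ok"] for r in runs) / n,
        "median_latency_s": median(r["latency_s"] for r in runs),
        "safety_compliance_rate": sum(r["violations"] == 0 for r in runs) / n,
    }


runs = [
    {"ok": True, "latency_s": 1.0, "violations": 0},
    {"ok": True, "latency_s": 2.0, "violations": 0},
    {"ok": False, "latency_s": 4.0, "violations": 1},
    {"ok": True, "latency_s": 3.0, "violations": 0},
]
summary = summarize(runs)
```

Comparable results across implementations would also depend on a shared measurement procedure, for example which tasks are run and how a violation is counted.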

Companion Draft Roadmap The following table maps each companion draft to the gaps it addresses and its priority level:

Companion Draft                                Gaps   Priority
---------------------------------------------  -----  --------
draft-nennemann-agent-behavioral-verification  1, 11  CRITICAL
draft-nennemann-agent-cascade-prevention       2, 4   CRITICAL
draft-nennemann-agent-consensus                3, 10  HIGH
draft-nennemann-agent-cross-domain-audit       6, 9   HIGH
draft-nennemann-agent-override-protocol        7      HIGH
draft-nennemann-agent-federation-privacy       5, 8   HIGH

The dependency relationships between companion drafts are shown below:

The behavioral-verification draft (Companion A) is foundational because its behavioral attestation format is used by the cascade-prevention and cross-domain-audit drafts. The cascade-prevention draft (Companion B) defines failure containment that the override-protocol (Companion E) builds upon. The consensus draft (Companion C) extends behavioral verification with multi-agent agreement. The cross-domain-audit draft (Companion D) provides the audit infrastructure that federation-privacy (Companion F) adds privacy controls to.
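
The dependency relationships described in the preceding paragraph can be written down as a graph and ordered so that each draft follows the drafts it builds on. The sketch below uses Python's standard `graphlib` module; the short labels are shorthand introduced here for the six companion drafts.

```python
from graphlib import TopologicalSorter

# Each draft maps to the set of drafts it builds on, per the paragraph above.
depends_on = {
    "A behavioral-verification": set(),
    "B cascade-prevention": {"A behavioral-verification"},
    "C consensus": {"A behavioral-verification"},
    "D cross-domain-audit": {"A behavioral-verification"},
    "E override-protocol": {"B cascade-prevention"},
    "F federation-privacy": {"D cross-domain-audit"},
}

# static_order() yields every draft after all of the drafts it depends on.
order = list(TopologicalSorter(depends_on).static_order())
```

Any valid ordering starts with Companion A, which reflects its foundational role. The relative order of independent drafts, such as B, C, and D, is not fixed by the dependencies.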

Security Considerations The gaps identified in this document have direct security implications:

Behavioral Verification (Gap 1):: Without runtime behavioral verification, compromised or malfunctioning agents cannot be detected, creating opportunities for attacks that exploit trusted agent identities to perform unauthorized actions.
Cascade Prevention (Gap 2):: The absence of cascade containment creates a denial-of-service vector where an attacker can compromise a single agent to disrupt an entire multi-agent workflow.
Human Override (Gap 7):: Without standardized override protocols, safety-critical agent actions may not be stoppable, creating an unacceptable risk profile for autonomous deployments.
Cross-Domain Audit (Gap 6):: Gaps in audit trails across domain boundaries create opportunities for agents to take actions that evade detection and accountability.
Federated Privacy (Gap 5):: Sharing agent operational data across domains without adequate privacy controls can expose sensitive organizational information, network topology, and business logic.

Implementers of autonomous agent systems SHOULD treat the CRITICAL and HIGH severity gaps as security requirements and prioritize their resolution.

IANA Considerations This document has no IANA actions.

References

Key words for use in RFCs to Indicate Requirement Levels (RFC 2119)
Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words (RFC 8174)
Remote ATtestation procedureS (RATS) Architecture (RFC 9334)
JSON Web Token (JWT) (RFC 7519)
Well-Known Uniform Resource Identifiers (URIs) (RFC 8615)
HTTP Semantics (RFC 9110)
Execution Context Tokens for Distributed Agentic Workflows
Agent Context Policy Token: DAG Delegation with Human Override
Cross-Domain Execution Audit Tokens
Agent Behavioral Verification and Performance Benchmarking
Agent Failure Cascade Prevention and Rollback
Multi-Agent Consensus and Capability Negotiation Protocols
Cross-Domain Agent Audit Trails and Resource Accounting
Standardized Human Override Protocol for Autonomous Agents
Federated Agent Learning Privacy and Cross-Protocol Migration

Acknowledgments The author thanks the participants of the WIMSE, RATS, and NMOP working groups for discussions that informed this analysis.