Problem Statement for Autonomous Agent Protocol Gaps

Independent Researcher

ietf@nennemann.de

OPS NMOP

The IETF autonomous agent landscape spans over 260 drafts touching agent communication, identity, safety, and operations, yet critical gaps remain where standardization is absent or insufficient. This document provides a condensed problem statement identifying eleven protocol gaps, classifies them by severity, and maps them to a suite of companion drafts that form a coherent solution framework. It is intended as an actionable reference for working group chairs, area directors, and protocol designers evaluating where autonomous-agent standardization efforts should focus.

Introduction

Autonomous software agents are moving from research prototypes to production deployments in network management, cloud orchestration, supply-chain logistics, and AI-driven workflows. A survey of IETF work reveals over 260 drafts relevant to agent capabilities, yet no single reference architecture ties them together. Several critical capabilities -- runtime behavioral verification, failure cascade prevention, cross-vendor human override -- lack any standardization at all.

This document distills the findings of a comprehensive gap analysis into an actionable problem statement. It identifies eleven gaps, groups them by severity, and presents a solution roadmap of nine companion drafts. The full analysis, including a survey of existing IETF work across WIMSE, RATS, OAuth/GNAP, SCITT, and NMOP, is available in the companion draft "Gap Analysis for Autonomous Agent Protocols in the IETF Landscape" and in the companion arXiv paper.

The intended audience is working group chairs, area directors, and protocol designers who need a concise summary of what is missing and what to build next.

Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 when, and only when, they appear in all capitals, as shown here.

The following terms are used throughout this document:

Agent:: A software component that acts on behalf of a principal (human or organizational) to perform tasks autonomously.
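ECT (Execution Context Token):: A cryptographically signed token carrying execution context for an agent action. See "Execution Context Tokens for Distributed Agentic Workflows".
ACP (Agent Context Policy):: A policy specifying permitted behaviors, resource limits, and escalation rules for an agent. See "Agent Context Policy Token: DAG Delegation with Human Override".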
HITL (Human-in-the-Loop):: A control pattern requiring human approval before an agent action takes effect.
Cascade Failure:: A failure mode where an error in one agent propagates through a multi-agent workflow, causing successive agents to fail.
Override Signal:: A message from a human operator instructing an agent to halt, modify, or roll back its current action.

Problem Landscape

The autonomous agent ecosystem can be organized into five layers, each with distinct standardization gaps. The layers of this reference architecture are:

Human Operators Layer:: Provides override and human-in-the-loop controls. Gap 7 addresses the absence of a cross-vendor override protocol.
Agent Interaction Layer:: Where agents communicate, negotiate capabilities (Gap 10), reach consensus (Gap 3), and undergo behavioral verification (Gap 1).
Execution Layer:: Manages DAG-based workflows with cascade prevention (Gap 2) and rollback (Gap 4), built on Execution Context Tokens.
Policy and Governance Layer:: Enforces privacy in federated learning (Gap 5) and cross-domain audit trails (Gap 6).
Infrastructure Layer:: Handles identity, discovery, cross-protocol migration (Gap 8), resource accounting (Gap 9), and performance benchmarking (Gap 11).

Critical Gaps

CRITICAL Severity

Gap 1: Agent Behavioral Verification

No standardized mechanism exists for runtime verification of agent policy compliance. RATS covers platform attestation but not behavioral conformance. Without this, operators cannot detect drifted, compromised, or out-of-bounds agents -- especially dangerous in multi-agent workflows where one misbehaving agent corrupts downstream results. Addressed by the Behavioral Verification companion draft.
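To make "runtime verification of agent policy compliance" concrete, the following is a minimal sketch of checking observed actions against a policy. It is not the format defined by any companion draft: the `Policy` and `ObservedAction` types, their fields, and the single CPU-time limit are hypothetical, chosen only to mirror the "permitted behaviors" and "resource limits" named in the ACP definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical stand-in for an agent policy (not the ACP wire format)."""
    permitted_actions: frozenset
    max_cpu_seconds: float


@dataclass(frozen=True)
class ObservedAction:
    """Hypothetical record of one action an agent performed at runtime."""
    name: str
    cpu_seconds: float


def check_compliance(policy, observed):
    """Return a list of violation strings; an empty list means compliant."""
    violations = []
    for action in observed:
        if action.name not in policy.permitted_actions:
            violations.append(f"{action.name}: action not permitted")
        if action.cpu_seconds > policy.max_cpu_seconds:
            violations.append(f"{action.name}: resource limit exceeded")
    return violations
```

A verifier built this way reports what went out of bounds rather than a bare pass/fail, which is the kind of evidence an operator would need to decide between a pause and a rollback.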

Gap 2: Agent Failure Cascade Prevention

Multi-agent dependency chains lack standardized circuit breakers, failure isolation, or cascade containment. Current ad-hoc timeout and retry logic is neither interoperable nor sufficient for DAG-structured workflows. A single agent failure can cascade through an entire deployment with no automated containment. Addressed by the Cascade Prevention companion draft.
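The circuit-breaker idea referenced above can be sketched in a few lines. This is a generic, single-process illustration of the pattern, not the mechanism specified by the Cascade Prevention draft; the class name, the failure threshold, and the cooldown value are all assumptions made for the example.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Hypothetical per-dependency breaker; names and defaults are illustrative."""

    def __init__(self, failure_threshold=3, cooldown_seconds=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the breaker is closed

    def call(self, action, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.cooldown_seconds:
                # Still cooling down: fail fast instead of calling downstream.
                raise CircuitOpenError("downstream agent isolated")
            # Cooldown elapsed: allow one trial call (half-open state).
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = action(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success resets the failure count
        return result
```

Wrapping each downstream agent invocation in its own breaker means a failing agent is isolated after a bounded number of errors, so upstream agents fail fast instead of propagating retries through the workflow.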

HIGH Severity

Gap 3: Multi-Agent Consensus Protocols

No standardized consensus protocol exists for heterogeneous agents with different capabilities, trust levels, and policy constraints. Distributed systems consensus (Raft, Paxos) does not address agent-specific semantics like weighted voting and capability-based participation. Multi-vendor coordination remains impossible without proprietary mechanisms. Addressed by the Consensus companion draft.
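As an illustration of what "weighted voting and capability-based participation" might mean in practice, here is a small sketch. It is an assumption-laden example rather than the protocol from the Consensus draft: the `Vote` fields, the idea that weights derive from trust levels, and the simple-majority quorum rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    agent_id: str
    weight: float            # hypothetical trust-derived weight
    capabilities: frozenset  # capabilities the agent advertises
    approve: bool


def weighted_decision(votes, required_capability, quorum=0.5):
    """Approve only if eligible approving weight strictly exceeds the quorum share."""
    # Capability-based participation: agents lacking the capability do not vote.
    eligible = [v for v in votes if required_capability in v.capabilities]
    total = sum(v.weight for v in eligible)
    if total <= 0:
        return False  # no eligible participants, so no decision is reached
    approving = sum(v.weight for v in eligible if v.approve)
    return approving / total > quorum
```

Note that this only covers tallying. Agreement on the vote set itself under failures, which is what Raft and Paxos address, is a separate problem.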

Gap 4: Real-Time Agent Rollback

No generalized rollback mechanism exists for autonomous agent actions. Protocol-specific approaches (e.g., NETCONF confirmed-commit) do not extend to arbitrary agent actions or coordinated multi-agent rollbacks. Operators cannot safely deploy agents for critical operations without manual intervention for every action. Addressed by the Cascade Prevention companion draft.

Gap 5: Federated Agent Learning Privacy

Agents sharing operational data across domains need privacy guarantees beyond transport encryption: differential privacy parameters, data minimization for shared telemetry, and consent management. Without these, organizations face unacceptable privacy risks in federated agent ecosystems. Addressed by the Federation Privacy companion draft.
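To show what a "differential privacy parameter" controls, the sketch below applies the standard Laplace mechanism to a shared telemetry count. This is a textbook technique used here for illustration; nothing in this document states that the Federation Privacy draft adopts it. The function names and the default sensitivity of 1 (one agent changes a count by at most one) are assumptions.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Return a noisy count; smaller epsilon means more noise and more privacy."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

The point of standardizing such parameters is that two domains exchanging telemetry would need to agree on epsilon and sensitivity before either can reason about the privacy loss of the exchange.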

Gap 6: Cross-Domain Agent Audit Trails

No standardized format exists for cross-domain audit trails that preserve causal ordering and provide tamper-evident logging. Execution Audit Tokens provide per-action records, but aggregation and correlation across domains remain undefined. Compliance requirements for automated decision-making make this urgent. Addressed by the Cross-Domain Audit companion draft.
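Tamper-evident logging is commonly built on a hash chain, where each entry commits to its predecessor. The sketch below shows that generic technique for a single log; it is not the Execution Audit Token format, and the entry layout, the JSON canonicalization via sorted keys, and the all-zero genesis value are assumptions for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical placeholder hash for the first entry


def _digest(prev_hash, record):
    # sort_keys gives a stable serialization so the hash is reproducible.
    payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(chain, record):
    """Append a record, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev_hash,
                  "hash": _digest(prev_hash, record)})


def verify_chain(chain):
    """Return True if no entry has been altered, removed, or reordered."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this fixes the order of entries within one domain. Correlating several such chains across domains while preserving causal order is the part that remains undefined.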

Gap 7: Human Override Standardization

No cross-vendor protocol exists for sending override signals (emergency stop, graceful pause, forced rollback) to running agents. ACP-DAG-HITL defines when human approval is required but not how to deliver override signals. This is a fundamental safety gap. Addressed by the Override Protocol companion draft.

MEDIUM Severity

Gap 8: Cross-Protocol Agent Migration

Agents migrating between protocol environments (e.g., A2A to MCP) have no standard for preserving execution context, identity, and state across protocol boundaries. ECT provides a protocol-neutral token but not migration procedures. Addressed by the Federation Privacy companion draft.

Gap 9: Agent Resource Accounting and Billing

No mechanism exists for tracking and reconciling agent resource consumption across administrative domains. This is a prerequisite for sustainable multi-domain agent ecosystems with cost attribution. Addressed by the Cross-Domain Audit companion draft.

Gap 10: Agent Capability Negotiation

Agents lack a standardized protocol to dynamically advertise functions, agree on interaction protocols, and establish compatible parameters. HTTP content negotiation provides basic discovery but not agent-specific capability semantics. Addressed by the Consensus companion draft.

Gap 11: Agent Performance Benchmarking

No standardized metrics or methodology exist for evaluating agent performance across the dimensions of accuracy, latency, resource efficiency, safety compliance, and behavioral consistency. Addressed by the Behavioral Verification companion draft.

Solution Roadmap

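Companion Draft Mapping

The following table maps each companion draft to the gaps it addresses:

Companion Draft            Gaps Addressed    Priority
-----------------------    --------------    --------
ECT                        Foundation        CRITICAL
ACP-DAG-HITL               Foundation        CRITICAL
Execution Audit            Foundation        HIGH
Behavioral Verification    1, 11             CRITICAL
Cascade Prevention         2, 4              CRITICAL
Consensus                  3, 10             HIGH
Cross-Domain Audit         6, 9              HIGH
Override Protocol          7                 HIGH
Federation Privacy         5, 8              HIGH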
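As a sanity check on this mapping, the short script below confirms that the six gap-addressing drafts together cover all eleven gaps, with no gap claimed twice. The draft labels are the short names used in the Companion Draft Summaries; the dictionary and helper functions are illustrative and not part of any draft.

```python
# Gap numbers each non-foundation draft addresses, per the mapping table.
DRAFT_GAPS = {
    "Behavioral Verification": {1, 11},
    "Cascade Prevention": {2, 4},
    "Consensus": {3, 10},
    "Cross-Domain Audit": {6, 9},
    "Override Protocol": {7},
    "Federation Privacy": {5, 8},
}

ALL_GAPS = set(range(1, 12))  # Gaps 1 through 11


def uncovered_gaps(draft_gaps, all_gaps=ALL_GAPS):
    """Return the sorted gap numbers that no draft addresses."""
    covered = set().union(*draft_gaps.values())
    return sorted(all_gaps - covered)


def overlapping_gaps(draft_gaps):
    """Return the sorted gap numbers claimed by more than one draft."""
    seen, dupes = set(), set()
    for gaps in draft_gaps.values():
        dupes |= seen & gaps
        seen |= gaps
    return sorted(dupes)
```

Both helpers return an empty list for the mapping as given, which matches the document's claim that the companion drafts form a complete cover of the identified gaps.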

Companion Draft Summaries

ECT:: Defines Execution Context Tokens that carry task identity, delegated authority, and constraints across agent boundaries. Foundational for all other drafts.
ACP-DAG-HITL:: Specifies Agent Context Policy tokens for DAG-based delegation with human-in-the-loop safety gates. Foundational for policy enforcement across all gaps.
Execution Audit:: Defines per-action audit tokens for tamper-evident recording of agent actions. Foundation for cross-domain audit trails.
Behavioral Verification:: Defines behavioral profiles, verification evidence formats, and appraisal procedures for runtime agent compliance. Addresses Gaps 1 and 11.
Cascade Prevention:: Specifies circuit breakers, failure isolation, checkpointing, and rollback mechanisms for multi-agent workflows. Addresses Gaps 2 and 4.
Consensus:: Defines protocols for multi-agent agreement with weighted voting, capability negotiation, and policy-constrained proposals. Addresses Gaps 3 and 10.
Cross-Domain Audit:: Specifies audit trail aggregation, correlation, and query across administrative domains, plus resource accounting. Addresses Gaps 6 and 9.
Override Protocol:: Defines a cross-vendor protocol for emergency stop, graceful pause, parameter modification, and forced rollback signals. Addresses Gap 7.
Federation Privacy:: Specifies privacy-preserving mechanisms for federated agent learning and cross-protocol migration procedures. Addresses Gaps 5 and 8.

Dependencies

The companion drafts have the following dependency structure:

Behavioral verification is foundational: its attestation format is consumed by cascade prevention and cross-domain audit. Cascade prevention defines the failure containment that the override protocol builds upon. Consensus extends behavioral verification with multi-agent agreement. Cross-domain audit provides the infrastructure to which federation privacy adds privacy controls.
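The dependency relationships described above can be written down as a graph and ordered mechanically. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the lowercase draft identifiers are hypothetical labels for this example, not actual draft names.

```python
from graphlib import TopologicalSorter

# Each key depends on the drafts in its value set, per the prose above.
DEPENDS_ON = {
    "behavioral-verification": set(),
    "cascade-prevention": {"behavioral-verification"},
    "cross-domain-audit": {"behavioral-verification"},
    "consensus": {"behavioral-verification"},
    "override-protocol": {"cascade-prevention"},
    "federation-privacy": {"cross-domain-audit"},
}


def build_order(depends_on):
    """Return one valid order in which every draft follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())
```

Several orders are valid, but every one starts with behavioral verification, since it is the only draft in this graph without a dependency. The three-phase prioritization in the next section is one such order.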

Recommended Prioritization

Work should proceed in three phases:

Phase 1 -- Safety Foundation (Immediate):: Behavioral Verification (Gaps 1, 11) and Cascade Prevention (Gaps 2, 4). These are CRITICAL severity gaps with direct safety implications. Without runtime verification and failure containment, no autonomous agent deployment can be considered safe.
Phase 2 -- Control and Accountability (Near-term):: Human Override (Gap 7) and Cross-Domain Audit (Gaps 6, 9). Override capability is a prerequisite for any production deployment. Audit trails are required for regulatory compliance in enterprise environments.
Phase 3 -- Interoperability and Scale (Medium-term):: Consensus (Gaps 3, 10) and Federation Privacy (Gaps 5, 8). These enable multi-vendor and multi-domain agent ecosystems but depend on the safety and accountability foundations from Phases 1 and 2.

Security Considerations

The gaps identified in this document have cross-cutting security implications:

Behavioral Verification (Gap 1):: Without runtime verification, compromised agents can exploit trusted identities to perform unauthorized actions undetected.
Cascade Prevention (Gap 2):: Absence of containment creates a denial-of-service vector where compromising a single agent disrupts entire multi-agent workflows.
Human Override (Gap 7):: Without a standard override protocol, safety-critical agent actions may not be stoppable in emergency situations.
Cross-Domain Audit (Gap 6):: Audit trail gaps across domain boundaries enable agents to evade detection and accountability.
Federated Privacy (Gap 5):: Sharing agent data across domains without privacy controls exposes network topology, operational patterns, and business logic.

Implementers of autonomous agent systems SHOULD treat the CRITICAL and HIGH severity gaps as security requirements and prioritize their resolution. The companion drafts each contain detailed security considerations specific to their scope.

IANA Considerations

This document has no IANA actions.

References

Key words for use in RFCs to Indicate Requirement Levels (BCP 14, RFC 2119)
Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words (BCP 14, RFC 8174)
Remote ATtestation procedureS (RATS) Architecture (RFC 9334)
HTTP Semantics (RFC 9110)
Execution Context Tokens for Distributed Agentic Workflows
Agent Context Policy Token: DAG Delegation with Human Override
Cross-Domain Execution Audit Tokens
Agent Behavioral Verification and Performance Benchmarking
Agent Failure Cascade Prevention and Rollback
Multi-Agent Consensus and Capability Negotiation Protocols
Cross-Domain Agent Audit Trails and Resource Accounting
Standardized Human Override Protocol for Autonomous Agents
Federated Agent Learning Privacy and Cross-Protocol Migration
Gap Analysis for Autonomous Agent Protocols in the IETF Landscape

Acknowledgments

The author thanks the participants of the WIMSE, RATS, and NMOP working groups for discussions that informed this analysis. The full gap analysis is available as the companion draft "Gap Analysis for Autonomous Agent Protocols in the IETF Landscape".