---
title: "Gap Analysis for Autonomous Agent Protocols"
abbrev: "Agent Gap Analysis"
category: info

docname: draft-nennemann-agent-gap-analysis-00
area: "OPS"
workgroup: "NMOP"
submissiontype: IETF
v: 3

author:
 - fullname: Christian Nennemann
   organization: Independent Researcher
   email: ietf@nennemann.de

normative:
  RFC2119:
  RFC8174:

informative:
  RFC9334:
  RFC7519:
  RFC8615:
  RFC9110:
  I-D.nennemann-wimse-ect:
    title: "Execution Context Tokens for Distributed Agentic Workflows"
    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
  I-D.nennemann-agent-dag-hitl-safety:
    title: "Agent Context Policy Token: DAG Delegation with Human Override"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/
  I-D.nennemann-exec-audit:
    title: "Cross-Domain Execution Audit Tokens"
    target: https://datatracker.ietf.org/doc/draft-nennemann-exec-audit/
  I-D.nennemann-agent-behavioral-verification:
    title: "Agent Behavioral Verification and Performance Benchmarking"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-behavioral-verification/
  I-D.nennemann-agent-cascade-prevention:
    title: "Agent Failure Cascade Prevention and Rollback"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-cascade-prevention/
  I-D.nennemann-agent-consensus:
    title: "Multi-Agent Consensus and Capability Negotiation Protocols"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-consensus/
  I-D.nennemann-agent-cross-domain-audit:
    title: "Cross-Domain Agent Audit Trails and Resource Accounting"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-cross-domain-audit/
  I-D.nennemann-agent-override-protocol:
    title: "Standardized Human Override Protocol for Autonomous Agents"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-override-protocol/
  I-D.nennemann-agent-federation-privacy:
    title: "Federated Agent Learning Privacy and Cross-Protocol Migration"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-federation-privacy/

--- abstract

This document maps the IETF autonomous agent landscape,
identifies eleven gap areas where standardization is absent
or insufficient, and introduces six companion drafts that
address the most critical gaps. Over 260 IETF drafts touch
on agent communication, identity, safety, and operations,
yet no single reference architecture ties them together.
This gap analysis provides a structured roadmap for the
standards work needed to enable safe, interoperable, and
auditable autonomous agent ecosystems.

--- middle

# Introduction

Autonomous software agents are increasingly deployed in
network management, cloud orchestration, supply-chain
logistics, and AI-driven workflows. Over 260 IETF drafts
touch on aspects of agent communication, identity, safety,
and operations. However, these efforts remain fragmented:
no single reference architecture ties them together, and
several critical capabilities lack any standardization at
all.

This document provides three contributions:

1. A reference architecture that organizes agent
   capabilities into layers (Section 3).

2. A survey of existing IETF work relevant to autonomous
   agents (Section 4).

3. A gap analysis identifying eleven areas where new or
   extended standards are needed, together with a roadmap
   of six companion drafts that address the most critical
   gaps (Sections 5 and 6).

The intended audience includes working group chairs,
area directors, and protocol designers evaluating where
autonomous-agent standardization efforts should focus.

# Terminology

{::boilerplate bcp14-tagged}

The following terms are used throughout this document:

Agent:
: A software component that acts on behalf of a principal
  (human or organizational) to perform tasks, communicate
  with other agents, or interact with external systems.

Autonomous Agent:
: An agent capable of executing multi-step tasks without
  continuous human supervision, including making decisions
  based on policy, context, and environmental state.

Agent Ecosystem:
: The set of agents, their principals, the policies that
  govern them, and the infrastructure services (identity,
  discovery, audit) they rely on.

DAG (Directed Acyclic Graph):
: A graph structure used to represent multi-step agent
  execution plans where tasks have dependency ordering
  but no circular dependencies.

HITL (Human-in-the-Loop):
: A control pattern in which a human operator must
  approve, modify, or reject an agent action before
  it takes effect.

ECT (Execution Context Token):
: A cryptographically signed token that carries the
  execution context (task identity, delegated authority,
  constraints) for an agent action. See
  {{I-D.nennemann-wimse-ect}}.

ACP (Agent Context Policy):
: A policy document that specifies permitted behaviors,
  resource limits, and escalation rules for an agent
  within a given execution context. See
  {{I-D.nennemann-agent-dag-hitl-safety}}.

Behavioral Attestation:
: A verifiable claim that an agent's runtime behavior
  conforms to a declared policy or behavioral profile.

Cascade Failure:
: A failure mode in which an error in one agent
  propagates through a multi-agent workflow, causing
  successive agents to fail or produce incorrect
  results.

Consensus Protocol:
: A protocol by which multiple agents reach agreement
  on a shared decision, state, or action plan.

Override Signal:
: A message from a human operator or supervisory system
  that instructs an agent to halt, modify, or roll back
  its current action.

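
As an illustration only, the claims an ECT might carry can be
sketched as follows. The actual claim names and encoding are
defined in {{I-D.nennemann-wimse-ect}}; every field name below
is hypothetical.

```python
# Hypothetical sketch of ECT-style claims; all field names are
# illustrative, not the format defined by the companion draft.
ect_claims = {
    "iss": "https://orchestrator.example",  # issuing orchestrator
    "sub": "agent:planner-01",              # agent the context applies to
    "task_id": "dag-7f3a/node-4",           # task identity within the DAG
    "delegated_by": "user:alice",           # principal granting authority
    "constraints": {
        "max_actions": 50,                  # per-task action budget
    },
}

def within_constraints(claims, actions_taken):
    """Check a simple numeric constraint before executing an action."""
    return actions_taken < claims["constraints"]["max_actions"]
```

A verifier would additionally check the token's signature and
expiry; the sketch shows only constraint evaluation.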
# Reference Architecture

The following diagram presents a layered reference
architecture for autonomous agent ecosystems. Each layer
identifies the relevant gap areas addressed by this
analysis.

~~~ ascii-art
+-------------------------------------------------------------+
|                       HUMAN OPERATORS                       |
|              [Override & HITL Layer -- GAP 7]               |
+-------------------------------------------------------------+
|                   AGENT INTERACTION LAYER                   |
|   +---------+  +---------+  +---------+  +---------+        |
|   | Agent A |<>| Agent B |<>| Agent C |<>| Agent D |        |
|   +----+----+  +----+----+  +----+----+  +----+----+        |
|        |  GAP 3:    |  GAP 10:   |  GAP 1:    |             |
|        |  Consensus |  Cap.Neg.  |  Behav.    |             |
|        |            |            |  Verif.    |             |
+--------+------------+------------+------------+-------------+
|                    EXECUTION LAYER (ECT)                    |
|  DAG Execution | Checkpoints | Rollback | Circuit Breakers  |
|       [GAP 2: Cascade Prevention]   [GAP 4: Rollback]       |
+-------------------------------------------------------------+
|                  POLICY & GOVERNANCE LAYER                  |
|      ACP-DAG-HITL | Trust Scoring | Assurance Profiles      |
|   [GAP 5: Federated Privacy]  [GAP 6: Cross-Domain Audit]   |
+-------------------------------------------------------------+
|                    INFRASTRUCTURE LAYER                     |
|    Identity | Discovery | Registration | Protocol Bridges   |
|    [GAP 8: Cross-Protocol]  [GAP 9: Resource Accounting]    |
|             [GAP 11: Performance Benchmarking]              |
+-------------------------------------------------------------+
~~~
{: #fig-arch title="Agent Ecosystem Reference Architecture"}

The Human Operators layer provides override and
human-in-the-loop controls (Gap 7). The Agent Interaction
layer is where agents communicate, negotiate capabilities
(Gap 10), reach consensus (Gap 3), and undergo behavioral
verification (Gap 1). The Execution layer manages DAG-based
workflows with cascade prevention (Gap 2) and rollback
(Gap 4). The Policy and Governance layer enforces privacy
in federated learning (Gap 5) and cross-domain audit trails
(Gap 6). The Infrastructure layer handles identity,
discovery, cross-protocol migration (Gap 8), resource
accounting (Gap 9), and performance benchmarking (Gap 11).

# Existing IETF Work

This section briefly surveys existing IETF efforts relevant
to autonomous agent protocols.

## WIMSE (Workload Identity in Multi-System Environments)

The WIMSE working group addresses workload identity and
Execution Context Tokens (ECTs) {{I-D.nennemann-wimse-ect}}.
ECTs provide the foundation for carrying delegated authority
and task context across agent boundaries.

## RATS (Remote ATtestation procedureS)

RATS defines architectures and protocols for remote
attestation {{RFC9334}}. Attestation evidence and
appraisal are directly applicable to verifying agent
behavioral claims.

## OAuth and GNAP

OAuth 2.0 and the Grant Negotiation and Authorization
Protocol (GNAP) provide authorization frameworks.
Transaction tokens and token exchange mechanisms are
relevant to agent-to-agent delegation chains.

## SCITT (Supply Chain Integrity, Transparency, and Trust)

SCITT defines transparency services for supply chain
artifacts. Its append-only log model is relevant to
agent audit trails and provenance tracking.

## NMOP (Network Management Operations)

The NMOP working group focuses on network management
operations including intent-based management and
autonomous network functions. Agent-driven network
management is a primary use case for the gaps identified
in this document.

## Industry Protocols: A2A and MCP

The Agent-to-Agent (A2A) protocol and Model Context
Protocol (MCP) are emerging industry standards for agent
communication. While not IETF specifications, they
inform the gap analysis by highlighting capabilities
that lack standardized, interoperable definitions.

# Gap Analysis

This section identifies eleven gaps in the current
standards landscape for autonomous agent protocols.
Gaps are classified by severity:

- CRITICAL: No existing standard addresses the problem;
  failure to standardize poses immediate safety or
  interoperability risks.

- HIGH: Partial coverage exists but is insufficient for
  production autonomous agent deployments.

- MEDIUM: The gap affects efficiency or completeness but
  does not pose immediate safety risks.

## Gap 1: Agent Behavioral Verification {#gap-1}

Severity:
: CRITICAL

Category:
: AI Safety

Problem Statement:
: Autonomous agents operating in production environments
  currently lack any standardized mechanism for runtime
  verification of policy compliance. While RATS
  {{RFC9334}} provides attestation for platform integrity,
  no equivalent exists for verifying that an agent's
  observed behavior conforms to its declared behavioral
  profile or policy constraints.

: Without behavioral verification, operators cannot
  distinguish between an agent that faithfully executes
  its policy and one that has drifted, been compromised,
  or is operating outside its intended parameters. This
  is especially dangerous in multi-agent workflows where
  one misbehaving agent can corrupt downstream results.

: The gap extends to the absence of standardized
  behavioral profiles, verification evidence formats,
  and appraisal procedures specific to agent conduct.

Impact if Unaddressed:
: Undetected policy violations in autonomous agents
  could cause safety incidents, data breaches, or
  cascading failures in critical infrastructure.

Existing Partial Coverage:
: RATS {{RFC9334}} covers platform attestation but not
  behavioral compliance. ACP-DAG-HITL
  {{I-D.nennemann-agent-dag-hitl-safety}} defines
  policies but not verification mechanisms.

Companion Draft:
: {{I-D.nennemann-agent-behavioral-verification}}

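
Behavioral appraisal of the kind described above can be
sketched as a comparison of an observed action trace against a
declared profile. The profile fields and violation labels below
are illustrative assumptions, not the evidence format the
companion draft proposes.

```python
# Illustrative behavioral profile: permitted action types and a
# per-run rate limit. Field names are assumptions for this sketch.
profile = {"allowed_actions": {"read_config", "propose_change"},
           "max_actions_per_run": 10}

def appraise(profile, trace):
    """Return a list of violations found in the observed trace."""
    violations = []
    if len(trace) > profile["max_actions_per_run"]:
        violations.append("rate: too many actions")
    for action in trace:
        if action not in profile["allowed_actions"]:
            violations.append(f"scope: {action} not permitted")
    return violations
```

An empty result means the trace conforms to the profile; any
entries would become evidence for an appraisal decision.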
## Gap 2: Agent Failure Cascade Prevention {#gap-2}

Severity:
: CRITICAL

Category:
: AI Safety, Resilience

Problem Statement:
: Multi-agent workflows create dependency chains where a
  failure in one agent can propagate to downstream agents,
  causing cascade failures. No standardized mechanism
  exists for circuit breakers, failure isolation, or
  cascade containment in agent-to-agent interactions.

: Current practice relies on ad-hoc timeout and retry
  logic that is neither interoperable nor sufficient for
  complex DAG-structured workflows. Agents from
  different vendors or administrative domains have no
  common way to signal failure states or trigger
  containment procedures.

: The absence of cascade prevention is especially
  critical in network management scenarios where agent
  failures could propagate to affect live network
  operations.

Impact if Unaddressed:
: A single agent failure could cascade through an entire
  multi-agent deployment, causing widespread service
  disruption with no automated containment.

Existing Partial Coverage:
: ECT {{I-D.nennemann-wimse-ect}} provides execution
  context but no failure containment semantics.

Companion Draft:
: {{I-D.nennemann-agent-cascade-prevention}}

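
The circuit-breaker pattern referred to above can be sketched
as follows. The threshold and isolation policy are illustrative
assumptions, not semantics the companion draft would
standardize.

```python
# Sketch of a circuit breaker for agent-to-agent calls: after a
# threshold of consecutive failures the breaker opens, and further
# calls are refused instead of propagating the failure downstream.
class CircuitBreaker:
    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.failures = 0
        self.open = False

    def call(self, downstream):
        if self.open:
            raise RuntimeError("breaker open: downstream agent isolated")
        try:
            result = downstream()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.open = True  # isolate the failing dependency
            raise
        self.failures = 0  # any success resets the count
        return result
```

A standardized version would additionally need interoperable
failure-state signaling so breakers in different domains can
coordinate.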
## Gap 3: Multi-Agent Consensus Protocols {#gap-3}

Severity:
: HIGH

Category:
: A2A Protocols

Problem Statement:
: When multiple agents must agree on a shared decision
  (e.g., a network configuration change, a resource
  allocation plan, or a coordinated response to an
  incident), no standardized consensus protocol exists
  for agent-to-agent agreement.

: Distributed systems consensus protocols (Raft, Paxos)
  are designed for replicated state machines, not for
  heterogeneous agents with different capabilities,
  trust levels, and policy constraints. Agent consensus
  requires additional semantics such as weighted voting,
  capability-based participation, and policy-constrained
  proposals.

: Without a standard protocol, multi-agent coordination
  relies on proprietary mechanisms that are not
  interoperable across vendors or administrative domains.

Impact if Unaddressed:
: Multi-vendor agent deployments cannot coordinate
  decisions, limiting autonomous agents to single-vendor
  silos.

Existing Partial Coverage:
: No existing IETF work directly addresses multi-agent
  consensus.

Companion Draft:
: {{I-D.nennemann-agent-consensus}}

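
The weighted, capability-gated voting semantics mentioned above
can be sketched as follows; the agent record fields are
illustrative assumptions.

```python
# Sketch of weighted voting with capability-based participation:
# only agents holding the required capability vote, and votes are
# weighted by a trust score. Field names are illustrative.
def decide(proposal, agents, required_capability, quorum=0.5):
    eligible = [a for a in agents
                if required_capability in a["capabilities"]]
    total = sum(a["trust"] for a in eligible)
    in_favor = sum(a["trust"] for a in eligible if a["vote"](proposal))
    # Accept only if the trust-weighted share in favor exceeds quorum.
    return total > 0 and in_favor / total > quorum
```

A real protocol would also need proposal validation against each
participant's policy before votes are cast.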
## Gap 4: Real-Time Agent Rollback Mechanisms {#gap-4}

Severity:
: HIGH

Category:
: Resilience, Operations

Problem Statement:
: When an autonomous agent takes an action that produces
  unintended consequences, no standardized mechanism
  exists for rolling back the action and restoring
  the previous state. This is particularly important
  for network management agents that modify device
  configurations.

: Rollback requires standardized checkpointing, state
  snapshots, and undo semantics that work across agent
  boundaries and administrative domains. Current
  rollback mechanisms (e.g., NETCONF confirmed-commit)
  are protocol-specific and do not generalize to
  arbitrary agent actions.

: The lack of rollback is compounded in multi-agent
  workflows where multiple agents may have taken
  coordinated actions that must be reversed as a unit.

Impact if Unaddressed:
: Operators cannot safely deploy autonomous agents for
  critical operations without manual intervention
  capability for every action.

Existing Partial Coverage:
: NETCONF confirmed-commit provides rollback for
  configuration changes only.

Companion Draft:
: {{I-D.nennemann-agent-cascade-prevention}}

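
Reversing coordinated actions as a unit can be sketched with a
stack of compensating actions recorded at checkpoint time and
replayed in reverse order. The interface below is hypothetical.

```python
# Sketch of unit rollback: each action registers a compensating
# (undo) callable; rollback replays them most recent first, so
# coordinated actions are reversed as a unit.
class RollbackUnit:
    def __init__(self):
        self._undo_stack = []

    def record(self, description, undo_fn):
        """Register a compensating action for a completed step."""
        self._undo_stack.append((description, undo_fn))

    def rollback(self):
        """Undo all recorded actions, most recent first."""
        reversed_actions = []
        while self._undo_stack:
            description, undo_fn = self._undo_stack.pop()
            undo_fn()
            reversed_actions.append(description)
        return reversed_actions
```

Unlike NETCONF confirmed-commit, this pattern is not tied to a
particular management protocol, which is what a generalized
standard would need.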
## Gap 5: Federated Agent Learning Privacy {#gap-5}

Severity:
: HIGH

Category:
: Privacy, Federated Systems

Problem Statement:
: Agents that participate in federated learning or
  share operational data across administrative domains
  require privacy guarantees that go beyond transport
  encryption. No IETF specification addresses the
  privacy requirements of federated agent learning,
  including differential privacy parameters, data
  minimization for shared agent telemetry, and
  consent management for cross-domain data sharing.

: As agents are deployed across organizational
  boundaries, the data they generate and share can
  reveal sensitive information about network topology,
  operational patterns, and business logic.
  Privacy-preserving mechanisms specific to agent
  interactions are needed.

Impact if Unaddressed:
: Organizations will be unable to participate in
  federated agent ecosystems without unacceptable
  privacy risks, limiting the value of multi-domain
  agent deployments.

Existing Partial Coverage:
: General privacy frameworks exist but none address
  agent-specific federated learning scenarios.

Companion Draft:
: {{I-D.nennemann-agent-federation-privacy}}

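
Data minimization for shared telemetry might, for example,
release only an aggregate count perturbed with Laplace noise in
the style of differential privacy. The sketch below is
illustrative, not a vetted privacy mechanism, and the epsilon
value is an assumption.

```python
import math
import random

# Sketch of a differentially private count release: a count query
# has sensitivity 1, so Laplace noise with scale 1/epsilon is added
# before the aggregate leaves the domain. Illustrative only.
def noisy_count(events, epsilon=1.0):
    u = random.random() - 0.5              # uniform in [-0.5, 0.5)
    b = 1.0 / epsilon                      # Laplace scale
    noise = -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return len(events) + noise
```

Only the noisy aggregate would cross the domain boundary; the
raw per-event telemetry stays local.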
## Gap 6: Cross-Domain Agent Audit Trails {#gap-6}

Severity:
: HIGH

Category:
: Audit, Compliance

Problem Statement:
: When agents operate across multiple administrative
  domains, their actions must be auditable end-to-end.
  No standardized format exists for cross-domain agent
  audit trails that preserves causal ordering, links
  related actions across domain boundaries, and provides
  tamper-evident logging.

: Execution Audit Tokens {{I-D.nennemann-exec-audit}}
  provide per-action audit records, but no standard
  defines how these records are aggregated, correlated,
  and queried across domains. SCITT provides
  transparency log primitives but does not define
  agent-specific audit semantics.

: Regulatory and compliance requirements increasingly
  demand end-to-end audit trails for automated
  decision-making, making this gap urgent for
  enterprise deployments.

Impact if Unaddressed:
: Organizations cannot demonstrate compliance for
  cross-domain agent operations, blocking adoption
  in regulated industries.

Existing Partial Coverage:
: SCITT provides transparency log primitives.
  {{I-D.nennemann-exec-audit}} defines per-action
  audit tokens.

Companion Draft:
: {{I-D.nennemann-agent-cross-domain-audit}}

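
Tamper-evident, causally ordered logging can be sketched as a
hash chain in which each record binds the digest of its
predecessor, so any later modification breaks verification. The
record fields are illustrative.

```python
import hashlib
import json

# Sketch of a hash-chained audit trail: each record embeds the hash
# of the previous record, giving tamper evidence and causal order.
def append_record(log, domain, action):
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = {"seq": len(log), "domain": domain,
            "action": action, "prev": prev_hash}
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    log.append({**body, "hash": digest})
    return log

def verify_chain(log):
    """Recompute every digest and check each back-link."""
    for i, rec in enumerate(log):
        body = {k: rec[k] for k in ("seq", "domain", "action", "prev")}
        digest = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if digest != rec["hash"]:
            return False
        if i > 0 and rec["prev"] != log[i - 1]["hash"]:
            return False
    return True
```

Cross-domain correlation and querying over such chains is
exactly what remains unstandardized.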
## Gap 7: Human Override Standardization {#gap-7}

Severity:
: HIGH

Category:
: AI Safety, Human Control

Problem Statement:
: Autonomous agents must always be subject to human
  override, but no cross-vendor protocol exists for
  sending override signals to agents. Override
  requirements include emergency stop, graceful pause,
  parameter modification, and forced rollback.

: Current override mechanisms are vendor-specific and
  cannot be used in multi-vendor agent deployments.
  ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}}
  defines when human approval is required but does not
  specify the protocol for delivering override signals
  to running agents.

: The absence of a standard override protocol creates
  a significant safety risk: if an agent misbehaves,
  the operator may not have a reliable way to stop it
  if the agent comes from a different vendor than the
  management platform.

Impact if Unaddressed:
: Operators lose the ability to control autonomous
  agents in emergency situations, creating unacceptable
  safety risks.

Existing Partial Coverage:
: ACP-DAG-HITL {{I-D.nennemann-agent-dag-hitl-safety}}
  defines HITL policies but not override delivery.

Companion Draft:
: {{I-D.nennemann-agent-override-protocol}}

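
The four override classes listed above (emergency stop,
graceful pause, parameter modification, forced rollback) could
be dispatched by an agent runtime as in the following sketch.
The message shape is hypothetical, not the companion draft's
wire format.

```python
# Hypothetical override dispatcher covering the four override
# classes named in the problem statement. Signal fields are
# illustrative assumptions.
OVERRIDE_ACTIONS = {"emergency_stop", "pause", "modify", "rollback"}

class AgentRuntime:
    def __init__(self):
        self.state = "running"
        self.params = {}

    def handle_override(self, signal):
        action = signal["action"]
        if action not in OVERRIDE_ACTIONS:
            raise ValueError(f"unknown override action: {action}")
        if action == "emergency_stop":
            self.state = "stopped"
        elif action == "pause":
            self.state = "paused"
        elif action == "modify":
            self.params.update(signal["params"])
        elif action == "rollback":
            self.state = "rolling_back"
        return self.state
```

A standard would additionally have to authenticate the operator
issuing the signal before the agent honors it.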
## Gap 8: Cross-Protocol Agent Migration {#gap-8}

Severity:
: MEDIUM

Category:
: Interoperability

Problem Statement:
: Agents may need to migrate between protocol
  environments (e.g., from an A2A-based system to an
  MCP-based system) while preserving execution context,
  identity, and accumulated state. No standard defines
  how agent context is translated or preserved across
  protocol boundaries.

: As the agent ecosystem matures, agents will
  increasingly operate in heterogeneous protocol
  environments. Without migration standards, agents
  are locked into specific protocol ecosystems.

Impact if Unaddressed:
: Agent deployments become fragmented across protocol
  silos, reducing interoperability and increasing
  operational complexity.

Existing Partial Coverage:
: ECT {{I-D.nennemann-wimse-ect}} provides a
  protocol-neutral context token but does not define
  migration procedures.

Companion Draft:
: {{I-D.nennemann-agent-federation-privacy}}

## Gap 9: Agent Resource Accounting and Billing {#gap-9}

Severity:
: MEDIUM

Category:
: Operations, Economics

Problem Statement:
: Autonomous agents consume computational, network, and
  API resources across administrative domains. No
  standardized mechanism exists for tracking, reporting,
  and reconciling resource consumption by agents,
  especially in multi-domain scenarios where an agent's
  actions incur costs in domains other than its own.

: Resource accounting is a prerequisite for sustainable
  multi-domain agent ecosystems where organizations
  need to track and charge for agent resource usage.

Impact if Unaddressed:
: Organizations cannot accurately track or bill for
  agent resource consumption, hindering commercial
  multi-domain agent deployments.

Existing Partial Coverage:
: No existing IETF work addresses agent-specific
  resource accounting.

Companion Draft:
: {{I-D.nennemann-agent-cross-domain-audit}}

## Gap 10: Agent Capability Negotiation {#gap-10}

Severity:
: MEDIUM

Category:
: A2A Protocols

Problem Statement:
: When agents interact, they need to discover and
  negotiate each other's capabilities dynamically.
  No standardized capability negotiation protocol
  exists for agents to advertise their functions,
  agree on interaction protocols, and establish
  compatible communication parameters.

: Well-known URIs {{RFC8615}} and HTTP {{RFC9110}}
  provide discovery primitives, but agent capability
  negotiation requires richer semantics including
  versioning, conditional capabilities, and
  policy-constrained capability advertisement.

Impact if Unaddressed:
: Agent interactions require pre-configured knowledge
  of peer capabilities, limiting dynamic composition
  and ad-hoc agent collaboration.

Existing Partial Coverage:
: HTTP content negotiation and well-known URIs provide
  basic discovery but not agent-specific capability
  negotiation.

Companion Draft:
: {{I-D.nennemann-agent-consensus}}

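
Version-aware capability negotiation can be sketched as an
intersection over advertised capability sets, selecting the
highest common version per capability. The advertisement shape
below is an assumption for illustration.

```python
# Sketch of capability negotiation: each side advertises a map of
# capability name -> supported versions; the peers agree on the
# intersection at the highest common version.
def negotiate(local, remote):
    agreed = {}
    for name, local_versions in local.items():
        common = set(local_versions) & set(remote.get(name, []))
        if common:
            agreed[name] = max(common)
    return agreed
```

A standardized protocol would add the richer semantics noted
above, such as conditional and policy-constrained
advertisements.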
## Gap 11: Agent Performance Benchmarking {#gap-11}

Severity:
: MEDIUM

Category:
: Operations, Metrics

Problem Statement:
: No standardized metrics or benchmarking methodology
  exists for evaluating autonomous agent performance.
  Without common metrics, operators cannot compare
  agent implementations, set performance baselines,
  or detect performance degradation.

: Agent performance encompasses multiple dimensions:
  task completion accuracy, latency, resource
  efficiency, safety compliance rate, and behavioral
  consistency. Standardized metrics and measurement
  procedures are needed for each dimension.

Impact if Unaddressed:
: Operators cannot objectively evaluate or compare
  autonomous agent implementations, hindering
  procurement and deployment decisions.

Existing Partial Coverage:
: No existing IETF work addresses agent performance
  benchmarking.

Companion Draft:
: {{I-D.nennemann-agent-behavioral-verification}}

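
Two of the dimensions listed above can be computed from a run
log as in the following sketch; the log field names are
illustrative assumptions.

```python
# Sketch of metric computation over a run log: safety compliance
# rate (runs with no violations) and mean task latency.
def summarize(runs):
    compliant = sum(1 for r in runs if not r["violations"])
    return {
        "safety_compliance_rate": compliant / len(runs),
        "mean_latency_s": sum(r["latency_s"] for r in runs) / len(runs),
    }
```

Standardization would have to pin down the measurement
procedure behind each input, not just the arithmetic.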
# Companion Draft Roadmap

The following table maps each companion draft to the
gaps it addresses and its priority level:

| Companion Draft | Gaps | Priority |
|:---|:---:|:---:|
| draft-nennemann-agent-behavioral-verification | 1, 11 | CRITICAL |
| draft-nennemann-agent-cascade-prevention | 2, 4 | CRITICAL |
| draft-nennemann-agent-consensus | 3, 10 | HIGH |
| draft-nennemann-agent-cross-domain-audit | 6, 9 | HIGH |
| draft-nennemann-agent-override-protocol | 7 | HIGH |
| draft-nennemann-agent-federation-privacy | 5, 8 | HIGH |
{: #tab-roadmap title="Companion Draft Roadmap"}

The dependency relationships between companion drafts
are shown below:

~~~ ascii-art
behavioral-verification ---+
      |                    |
      v                    |
cascade-prevention         |
      |                    |
      v                    v
override-protocol   cross-domain-audit
      |                    |
      v                    v
  consensus       federation-privacy
~~~
{: #fig-deps title="Companion Draft Dependencies"}

The behavioral-verification draft (Companion A) is
foundational because its behavioral attestation format
is used by the cascade-prevention and cross-domain-audit
drafts. The cascade-prevention draft (Companion B)
defines failure containment that the override-protocol
(Companion E) builds upon. The consensus draft
(Companion C) extends behavioral verification with
multi-agent agreement. The cross-domain-audit draft
(Companion D) provides the audit infrastructure that
federation-privacy (Companion F) adds privacy controls
to.

# Security Considerations

The gaps identified in this document have direct security
implications:

Behavioral Verification (Gap 1):
: Without runtime behavioral verification, compromised
  or malfunctioning agents cannot be detected, creating
  opportunities for attacks that exploit trusted agent
  identities to perform unauthorized actions.

Cascade Prevention (Gap 2):
: The absence of cascade containment creates a
  denial-of-service vector where an attacker can
  compromise a single agent to disrupt an entire
  multi-agent workflow.

Human Override (Gap 7):
: Without standardized override protocols,
  safety-critical agent actions may not be stoppable,
  creating an unacceptable risk profile for autonomous
  deployments.

Cross-Domain Audit (Gap 6):
: Gaps in audit trails across domain boundaries create
  opportunities for agents to take actions that evade
  detection and accountability.

Federated Privacy (Gap 5):
: Sharing agent operational data across domains without
  adequate privacy controls can expose sensitive
  organizational information, network topology, and
  business logic.

Implementers of autonomous agent systems SHOULD treat the
CRITICAL and HIGH severity gaps as security requirements
and prioritize their resolution.

# IANA Considerations

This document has no IANA actions.

--- back

# Acknowledgments

The author thanks the participants of the WIMSE, RATS,
and NMOP working groups for discussions that informed
this analysis.