Generate 5-draft ecosystem family, fix formatter markdown stripping

Pipeline output:
- ABVP: Agent Behavior Verification Protocol (quality 3.0/5)
- AEM: Privacy-Preserving Agent Learning Protocol (quality 2.1/5)
- ATD: Agent Task DAG Framework (quality 2.5/5)
- HITL: Human-in-the-Loop Primitives (quality 2.4/5)
- AEPB: Real-Time Agent Rollback Protocol (quality 2.5/5)
- APAE: Agent Provenance Assurance Ecosystem (quality 2.5/5)

Quality gates: all pass novelty + references, format gate improved
with markdown stripping (_strip_markdown) and dynamic header padding.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 01:42:30 +01:00
parent 7a1aa346b9
commit 404092b938
9 changed files with 5122 additions and 4 deletions

@@ -0,0 +1,793 @@
Internet-Draft anima
Intended status: Standards Track March 2026
Expires: September 05, 2026
Agent Task DAG: A Framework for Directed Acyclic Graph Execution in Multi-Agent Systems
draft-agent-ecosystem-agent-task-a-00
Abstract
As AI agent systems become increasingly complex, there is a
growing need for structured approaches to orchestrate multi-step
tasks across multiple autonomous agents. This document defines the
Agent Task DAG (Directed Acyclic Graph) framework, which provides
a standardized approach for representing, executing, and managing
complex workflows in multi-agent environments. The framework
addresses key challenges including task decomposition, dependency
management, parallel execution, failure recovery, and human
oversight integration. By building upon existing agent
authorization profiles and task negotiation protocols, this
specification enables agents to coordinate complex workflows while
maintaining security, auditability, and the ability to incorporate
human-in-the-loop decision points. The framework supports both
fast execution in trusted environments and rigorous verification
in regulated contexts through configurable assurance profiles.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
This document is intended to have standards-track status.
Distribution of this memo is unlimited.
Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14 [RFC2119] [RFC8174] when, and only when, they
appear in all capitals, as shown here.
Agent Task DAG
A directed acyclic graph representing a complex workflow where
nodes represent individual tasks and edges represent
dependencies between tasks
Task Node
An individual unit of work within a DAG that can be executed by
one or more agents
Execution Context
The runtime environment and state information associated with
DAG execution, including agent assignments, intermediate
results, and checkpoint data
Checkpoint
A persistent snapshot of DAG execution state that enables
rollback and recovery operations
Task Binding
The association of a task node with specific agent capabilities
or agent instances
DAG Coordinator
An agent or system component responsible for orchestrating the
execution of a complete DAG workflow
Table of Contents
1. Introduction
2. Terminology
3. Problem Statement
4. Agent Task DAG Framework
5. Task Execution Protocol
6. Checkpoint and Recovery Mechanisms
7. Integration with Existing Agent Protocols
8. Security Considerations
9. IANA Considerations
10. References
1. Introduction
The increasing sophistication of AI agent systems has created a
demand for structured approaches to orchestrate complex,
multi-step tasks across autonomous agents. While individual agents have
become capable of handling sophisticated reasoning and execution
tasks, real-world applications often require coordinating multiple
agents to complete workflows that involve parallel processing,
sequential dependencies, and dynamic task allocation. Current
approaches to multi-agent coordination typically rely on ad-hoc
communication patterns or simple request-response chains, which
lack the expressiveness and reliability needed for complex
enterprise and research applications.
This document defines the Agent Task DAG (Directed Acyclic Graph)
framework, which provides a standardized approach for
representing, executing, and managing complex workflows in
multi-agent environments. The framework builds upon existing agent
protocols, particularly the Agent Authorization Profile
[draft-aap-oauth-profile] for security and authorization, and agent
task coordination mechanisms [draft-cui-ai-agent-task] for basic task
execution. By representing workflows as directed acyclic graphs,
the framework enables explicit modeling of task dependencies,
parallel execution opportunities, and conditional branching while
maintaining guarantees about workflow termination and consistency.
The Agent Task DAG framework addresses several critical challenges
in multi-agent systems: task decomposition and dependency
management, efficient parallel execution across heterogeneous
agents, robust failure recovery and rollback mechanisms, and
integration of human oversight at critical decision points. The
framework leverages structured claims for agent context
[draft-aap-oauth-profile] to enable context-aware task assignment and
supports agent context distribution mechanisms
[draft-chang-agent-context-interaction] to maintain coherent state
across complex multi-round workflows. This approach ensures that
agents can coordinate effectively while maintaining security
boundaries and audit trails required in enterprise environments.
The specification is designed to be protocol-agnostic and can
operate over various transport mechanisms including HTTP
[RFC9110], message queuing systems, and specialized agent
communication protocols. The framework integrates with existing
OAuth 2.0 [RFC6749] and JWT [RFC7519] infrastructure through the
Agent Authorization Profile, enabling seamless deployment in
environments that already support agent authentication and
authorization. The DAG representation follows JSON [RFC8259]
encoding standards to ensure broad compatibility and easy
integration with existing agent development frameworks.
This document focuses specifically on the DAG execution framework
and does not address broader questions of agent discovery,
capability matching, or task marketplace mechanisms, which are
covered by complementary specifications. The framework assumes the
existence of agent authorization infrastructure and builds upon
established patterns for agent-to-agent communication while
providing the additional structure needed for complex workflow
coordination.
2. Terminology
This specification builds upon terminology established in the
Agent Authorization Profile [draft-aap-oauth-profile], AI Agent
Task specifications [draft-cui-ai-agent-task], and Agent Context
Interaction mechanisms [draft-chang-agent-context-interaction].
The following terms are defined for use throughout this document:
Agent Task DAG: A directed acyclic graph data structure
representing a complex multi-step workflow where nodes correspond
to individual tasks and directed edges represent dependency
relationships between tasks. The DAG enforces execution ordering
constraints while enabling parallel execution of independent task
branches. Each DAG maintains metadata including creation time,
ownership, and execution policies that govern how the workflow may
be executed across multiple agents.
Task Node: An individual unit of work within an Agent Task DAG
that encapsulates a specific operation to be performed by one or
more AI agents. Each task node contains task specifications,
input/output schemas, execution constraints, and binding
requirements that determine which agents are capable of executing
the task. Task nodes maintain state information including
execution status, assigned agents, and result data as defined in
[draft-cui-ai-agent-task].
Execution Context: The runtime environment and associated state
information that governs the execution of an Agent Task DAG. The
execution context includes agent assignments, intermediate task
results, security credentials, operational constraints from Agent
Authorization Profiles [draft-aap-oauth-profile], and distributed
context information as specified in
[draft-chang-agent-context-interaction]. The execution context
ensures consistency and provides necessary information for task
coordination across multiple agents.
Checkpoint: A persistent, immutable snapshot of Agent Task DAG
execution state captured at a specific point in time. Checkpoints
contain the complete execution context, task completion status,
intermediate results, and sufficient metadata to enable rollback
and recovery operations. Checkpoints serve as recovery points for
failure scenarios and decision points for human-in-the-loop
interventions.
Task Binding: The process and resulting association between a task
node and specific agent capabilities or agent instances that will
execute the task. Task binding considers agent authorization
profiles, capability matching, resource availability, and security
constraints. The binding process may be performed statically
during DAG planning or dynamically during execution based on
runtime conditions.
DAG Coordinator: An agent or system component responsible for
orchestrating the complete lifecycle of Agent Task DAG execution.
The DAG Coordinator manages task scheduling, monitors execution
progress, handles inter-agent communication, enforces security
policies, and coordinates checkpoint and recovery operations. The
coordinator maintains the authoritative view of DAG execution
state and serves as the primary interface for human oversight and
intervention.
3. Problem Statement
Current approaches to multi-agent task coordination suffer from
several fundamental limitations that impede the development of
robust, scalable autonomous systems. Existing coordination
mechanisms typically rely on ad-hoc communication patterns, simple
request-response protocols, or basic workflow engines that were
not designed for the dynamic, autonomous nature of AI agents.
While protocols like those defined in [draft-cui-ai-agent-task]
provide foundations for individual task execution, they lack
standardized approaches for managing complex workflows involving
multiple interdependent tasks across heterogeneous agent
populations. The Agent Authorization Profile
[draft-aap-oauth-profile] establishes important primitives for agent
identity and authorization, but does not address the orchestration
challenges that arise when multiple authorized agents must coordinate
to complete complex, multi-step objectives.
The complexity of real-world AI agent applications demands
structured approaches to task decomposition and dependency
management that current protocols do not adequately address.
Agents operating in domains such as scientific research, business
process automation, or infrastructure management often require
workflows where tasks have intricate dependencies, may execute in
parallel when possible, and must handle partial failures
gracefully. Without standardized mechanisms for representing these
relationships, agent systems resort to brittle, custom
coordination logic that is difficult to audit, debug, or modify.
The lack of formal workflow representation also prevents effective
human oversight integration, as stakeholders cannot easily
understand or intervene in complex multi-agent processes.
Agent Context Distribution mechanisms
[draft-chang-agent-context-interaction] have demonstrated that
context sharing among agents significantly impacts execution success
rates, but current approaches do not provide systematic ways to manage
context propagation through complex workflows. In multi-step processes,
intermediate results from one task often serve as inputs to
downstream tasks, creating context dependencies that must be
carefully managed to ensure workflow integrity. Existing protocols
lack standardized approaches for maintaining execution context
across task boundaries, leading to information loss, redundant
computation, and coordination failures that compromise overall
system reliability.
Fault tolerance and recovery represent critical gaps in current
multi-agent coordination approaches. Real-world agent systems must
handle various failure modes including agent unavailability, task
timeouts, resource constraints, and partial execution failures.
Without systematic checkpoint and recovery mechanisms, workflows
often must restart completely when any component fails, leading to
inefficient resource utilization and poor user experience. The
absence of standardized rollback capabilities also complicates
human intervention scenarios, where domain experts may need to
modify workflow parameters or task assignments based on
intermediate results or changing requirements.
Scalability challenges emerge when current coordination approaches
encounter workflows with dozens or hundreds of interdependent
tasks distributed across multiple agent instances. Simple
centralized coordination quickly becomes a bottleneck, while fully
decentralized approaches struggle with consistency and deadlock
prevention. The lack of standardized protocols for parallel task
execution, resource allocation, and progress monitoring prevents
agent systems from efficiently utilizing available computational
resources. Additionally, without formal workflow representation,
it becomes difficult to optimize task scheduling, predict resource
requirements, or provide meaningful progress indicators to human
stakeholders.
These limitations necessitate a framework that provides:
structured representation of complex workflows with explicit
dependency management; standardized protocols for parallel
execution and agent coordination; systematic checkpoint and
recovery mechanisms that enable fault tolerance and human
intervention; integration with existing agent authorization and
context distribution mechanisms; and scalable execution patterns
that can accommodate workflows ranging from simple sequential
processes to complex parallel computations involving multiple
agent populations.
4. Agent Task DAG Framework
This section defines the core data model and execution semantics
for the Agent Task DAG framework. The framework provides a
structured approach for representing complex multi-agent workflows
as directed acyclic graphs, where individual tasks are modeled as
nodes and dependencies between tasks are represented as edges. The
data model builds upon existing agent protocol foundations while
introducing specific constructs needed for distributed workflow
orchestration.
4.1. DAG Data Model
An Agent Task DAG MUST be represented as a JSON object [RFC8259]
that contains the complete specification of a workflow. The DAG
structure consists of three primary components: metadata
describing the overall workflow, a collection of task nodes
representing individual units of work, and dependency
relationships that define execution ordering constraints. Each DAG
MUST include a unique identifier, version information, and
execution parameters that govern how the workflow should be
processed.
Task nodes within the DAG represent atomic units of work that can
be executed by autonomous agents. Each task node MUST specify its
execution requirements, including required agent capabilities,
input and output data schemas, and execution constraints such as
timeouts or resource limits. Task nodes SHOULD reference
standardized task types as defined in [draft-cui-ai-agent-task]
where applicable, enabling interoperability across different agent
implementations. The task specification MUST include sufficient
information for agents to determine their capability to execute
the task and negotiate execution parameters.
Dependency relationships between task nodes are expressed through
edge definitions that establish partial ordering constraints over
the DAG. Each edge MUST specify source and target task nodes, with
the semantic meaning that the target task cannot begin execution
until the source task has completed successfully. Edges MAY
include conditional execution logic, allowing for branching
workflows based on the results of predecessor tasks. The framework
supports both data dependencies, where output from one task serves
as input to another, and control dependencies, where task ordering
is required for correctness without direct data flow.
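The data model above can be sketched as follows. This is a minimal
illustration, not a normative schema: every field name (dag_id,
version, execution, nodes, edges, capabilities, timeout_s, condition)
and the validate_dag helper are assumptions introduced here, since
this document does not fix a concrete schema. The check covers the
three structural rules stated above: required top-level members,
edges that reference existing nodes, and the absence of cycles.

```python
import json

# Hypothetical DAG document; every field name here is an assumption.
EXAMPLE_DAG = {
    "dag_id": "dag-example-001",
    "version": "1",
    "execution": {"max_parallel": 2},
    "nodes": {
        "fetch": {"capabilities": ["http"], "timeout_s": 30},
        "parse": {"capabilities": ["nlp"], "timeout_s": 60},
        "report": {"capabilities": ["writer"], "timeout_s": 60},
    },
    "edges": [
        {"source": "fetch", "target": "parse"},
        {"source": "parse", "target": "report", "condition": "parse.ok"},
    ],
}


def validate_dag(dag: dict) -> list:
    """Return a list of problems; an empty list means the DAG looks well-formed."""
    problems = []
    for key in ("dag_id", "version", "execution", "nodes", "edges"):
        if key not in dag:
            problems.append(f"missing required field: {key}")
    nodes = dag.get("nodes", {})
    successors = {name: [] for name in nodes}
    for edge in dag.get("edges", []):
        src, dst = edge.get("source"), edge.get("target")
        if src not in nodes or dst not in nodes:
            problems.append(f"edge references unknown node: {src} -> {dst}")
            continue
        successors[src].append(dst)

    # Depth-first search with three colours to detect cycles.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {name: WHITE for name in nodes}

    def visit(name: str) -> bool:
        colour[name] = GREY
        for nxt in successors[name]:
            if colour[nxt] == GREY:
                return True  # back edge: cycle found
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[name] = BLACK
        return False

    for name in nodes:
        if colour[name] == WHITE and visit(name):
            problems.append("dependency cycle detected")
            break
    return problems


# The DAG survives a JSON round trip, consistent with an RFC 8259 encoding.
round_tripped = json.loads(json.dumps(EXAMPLE_DAG))
```

Calling validate_dag(EXAMPLE_DAG) returns an empty list; adding an
edge from "report" back to "fetch" makes it report a dependency cycle.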
4.2. Execution Context Management
The Execution Context provides the runtime environment for DAG
processing and maintains state information throughout workflow
execution. The execution context MUST track the current state of
each task node, intermediate results produced during execution,
and metadata about agent assignments for each task. Context
information SHOULD be distributed among participating agents using
the mechanisms defined in [draft-chang-agent-context-interaction]
to ensure consistent state visibility across the multi-agent
system.
Agent binding within the execution context associates task nodes
with specific agent instances or agent capability requirements.
The framework supports both static binding, where task assignments
are predetermined before execution begins, and dynamic binding,
where task assignments are resolved at runtime based on agent
availability and capability matching. When integrated with Agent
Authorization Profiles [draft-aap-oauth-profile], the execution
context MUST validate that assigned agents possess the necessary
authorization claims to execute their bound tasks.
Checkpoint creation within the execution context enables
persistent state management and recovery capabilities. The
framework MUST support checkpoint creation at configurable
intervals, capturing the complete state of DAG execution including
task completion status, intermediate results, and current agent
assignments. Checkpoints SHOULD be created automatically before
task nodes that are marked as requiring human oversight, enabling
rollback to known-good states when human intervention modifies the
workflow execution path.
4.3. Task Execution Semantics
Task execution within the DAG framework follows a coordination
model where a DAG Coordinator orchestrates workflow progress while
individual agents execute assigned tasks autonomously. The
coordinator MUST maintain the global view of DAG state and
determine when task dependencies have been satisfied, enabling
parallel execution of independent task branches. Task scheduling
MUST respect dependency constraints while maximizing parallel
execution opportunities to optimize overall workflow completion
time.
The framework defines specific execution states for task nodes
including pending, ready, executing, completed, failed, and
skipped. State transitions MUST be coordinated through the DAG
Coordinator to ensure consistency across the distributed system.
When a task transitions to the ready state, the coordinator SHOULD
initiate agent assignment and task negotiation protocols to begin
execution. Failed tasks MAY trigger rollback procedures or
alternate execution paths depending on the configured failure
handling policies.
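The six task states can be modeled as a small state machine owned by
the coordinator. The sketch below is illustrative only: this document
names the states but does not enumerate the legal transitions, so the
ALLOWED table, the Coordinator class, and its method names are
assumptions. Rollback, which would move a completed task back to an
earlier state, is deliberately left out of the table.

```python
from enum import Enum


class TaskState(Enum):
    PENDING = "pending"
    READY = "ready"
    EXECUTING = "executing"
    COMPLETED = "completed"
    FAILED = "failed"
    SKIPPED = "skipped"


# Assumed transition table; the specification names the states but does
# not enumerate the legal transitions between them.
ALLOWED = {
    TaskState.PENDING: {TaskState.READY, TaskState.SKIPPED},
    TaskState.READY: {TaskState.EXECUTING, TaskState.SKIPPED},
    TaskState.EXECUTING: {TaskState.COMPLETED, TaskState.FAILED},
    TaskState.FAILED: {TaskState.READY},  # retry or reassignment
    TaskState.COMPLETED: set(),
    TaskState.SKIPPED: set(),
}


class Coordinator:
    """Holds the authoritative task states and applies transitions."""

    def __init__(self, task_ids):
        self.states = {task_id: TaskState.PENDING for task_id in task_ids}
        self.log = []  # (task_id, old_state, new_state) tuples

    def transition(self, task_id, new_state):
        old_state = self.states[task_id]
        if new_state not in ALLOWED[old_state]:
            raise ValueError(
                f"{task_id}: {old_state.value} -> {new_state.value} not allowed"
            )
        self.states[task_id] = new_state
        self.log.append((task_id, old_state, new_state))
```

Routing every change through a single transition method keeps the
state view consistent and gives the audit log one place to hook into.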
Integration with existing agent protocols occurs through
standardized interfaces that abstract the underlying communication
mechanisms. The framework MUST support protocol-agnostic bindings
that allow integration with different agent discovery,
authorization, and communication protocols. Task execution
requests SHOULD include structured claims as defined in
[draft-aap-oauth-profile] when agent authorization is required,
ensuring that security and audit requirements are maintained
throughout the distributed workflow execution.
5. Task Execution Protocol
The Agent Task DAG execution protocol defines a standardized
approach for coordinating the execution of complex workflows
across multiple autonomous agents. The protocol builds upon
existing agent communication mechanisms and authorization
frameworks, particularly the Agent Authorization Profile
[draft-aap-oauth-profile], to enable secure and auditable workflow
execution. The execution model supports both centralized
coordination through a designated DAG Coordinator and distributed
execution patterns where agents negotiate task assignments
dynamically.
The execution protocol operates through a series of well-defined
phases: initialization, task scheduling, parallel execution, and
completion verification. During initialization, the DAG
Coordinator validates the workflow structure, resolves task
bindings to available agents, and establishes the execution
context. Task scheduling follows topological ordering of the DAG,
with the coordinator identifying executable tasks (those with
satisfied dependencies) and dispatching them to appropriate
agents. The protocol supports parallel execution of independent
tasks while maintaining strict dependency ordering through state
synchronization mechanisms.
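Scheduling by topological order can be illustrated with Kahn's
algorithm, grouping tasks into "waves" whose members have no mutual
dependencies and could therefore run in parallel. This is a sketch
under simplifying assumptions: the function name execution_waves and
the (source, target) edge tuples are hypothetical, and a real
coordinator would likely dispatch each task as soon as its own
dependencies complete rather than waiting for a whole wave.

```python
from collections import defaultdict


def execution_waves(nodes, edges):
    """Group tasks into waves; tasks in one wave have no mutual dependencies.

    nodes: iterable of task ids; edges: iterable of (source, target) pairs.
    Raises ValueError if the graph contains a cycle.
    """
    indegree = {n: 0 for n in nodes}
    successors = defaultdict(list)
    for source, target in edges:
        successors[source].append(target)
        indegree[target] += 1

    # Tasks with no unmet dependencies form the first wave.
    ready = sorted(n for n, d in indegree.items() if d == 0)
    waves, scheduled = [], 0
    while ready:
        waves.append(ready)
        scheduled += len(ready)
        next_ready = []
        for task in ready:
            for succ in successors[task]:
                indegree[succ] -= 1
                if indegree[succ] == 0:
                    next_ready.append(succ)
        ready = sorted(next_ready)
    if scheduled != len(indegree):
        raise ValueError("cycle detected: not every task could be scheduled")
    return waves
```

For tasks a, b, c, d with edges a->c, b->c, and c->d, the function
returns [["a", "b"], ["c"], ["d"]]: a and b may run in parallel, then
c, then d.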
Agent coordination during DAG execution relies on structured
message exchanges that convey task assignments, status updates,
and result propagation. Task assignment messages MUST include the
complete task specification, execution context parameters, and any
required authorization tokens following the Agent Authorization
Profile format [draft-aap-oauth-profile]. Agents respond with
acceptance confirmations that include estimated execution time and
resource requirements. Status update messages provide real-time
execution progress and MUST be sent at configurable intervals to
enable failure detection and recovery operations.
State synchronization across the multi-agent system is achieved
through a combination of checkpoint mechanisms and distributed
context sharing. The DAG Coordinator maintains the authoritative
execution state, including task completion status, intermediate
results, and dependency satisfaction tracking. Agent Context
Distribution mechanisms [draft-chang-agent-context-interaction]
are employed to efficiently share relevant context information
among participating agents, reducing redundant data transfer while
ensuring each agent has access to necessary execution context.
Intermediate results from completed tasks are propagated to
dependent tasks through structured result messages that preserve
data lineage and enable audit trail construction.
The protocol defines specific message formats for each phase of
execution, using JSON [RFC8259] structures that can be embedded
within existing agent communication protocols. Task execution
requests include fields for task identification, input parameters,
execution constraints, and callback endpoints for status
reporting. Result messages contain structured output data,
execution metadata, and quality indicators that enable downstream
tasks to validate input requirements. Error and exception messages
provide detailed failure information including error codes,
diagnostic data, and suggested recovery actions.
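The message categories described above might be encoded as shown
below. The class names, field names, and the "type" discriminator are
assumptions made for illustration; this document lists the kinds of
fields each message carries but does not define a wire format.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class TaskRequest:
    dag_id: str
    task_id: str
    inputs: dict
    constraints: dict           # e.g. {"timeout_s": 60}
    callback_url: str           # where status updates are reported
    authorization: str = ""     # token placeholder, if authorization is required


@dataclass
class TaskResult:
    dag_id: str
    task_id: str
    output: dict
    metadata: dict = field(default_factory=dict)   # e.g. duration, agent id
    quality: dict = field(default_factory=dict)    # quality indicators


@dataclass
class TaskError:
    dag_id: str
    task_id: str
    code: str
    diagnostics: str = ""
    suggested_actions: list = field(default_factory=list)


def encode(message) -> str:
    """Serialize a message as JSON with a 'type' discriminator."""
    body = {"type": type(message).__name__, **asdict(message)}
    return json.dumps(body, sort_keys=True)
```

A receiver can dispatch on the "type" member after json.loads, which
keeps the three message kinds distinguishable when they share a
transport channel.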
Parallel execution coordination addresses the challenges of
resource contention and optimal scheduling across heterogeneous
agent capabilities. The protocol supports both push-based task
assignment, where the coordinator actively distributes work, and
pull-based execution, where agents request tasks based on their
availability and capabilities. Load balancing mechanisms consider
agent capacity, current workload, and task affinity when making
scheduling decisions. The protocol also defines procedures for
dynamic rescheduling when agents become unavailable or when
execution time estimates prove inaccurate, ensuring workflow
completion despite individual agent failures.
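Push-based assignment with load balancing can be sketched as a
selection over capable agents. The selection rule here (lowest
utilization among agents with spare capacity, ties broken by agent
identifier) and the dictionary keys are assumptions; this document
leaves the scheduling policy to implementations, and task affinity is
omitted for brevity.

```python
def select_agent(task, agents):
    """Pick the least-loaded agent whose capabilities cover the task.

    task: {"capabilities": [...]}; agents: list of dicts with
    "id", "capabilities", "capacity", and "active" (running task count).
    Returns the chosen agent id, or None if no agent qualifies.
    """
    required = set(task["capabilities"])
    candidates = [
        a for a in agents
        if required <= set(a["capabilities"]) and a["active"] < a["capacity"]
    ]
    if not candidates:
        return None
    # Lowest utilisation wins; ties are broken by agent id for determinism.
    best = min(candidates, key=lambda a: (a["active"] / a["capacity"], a["id"]))
    return best["id"]
```

Returning None when no agent qualifies leaves the task in the ready
state so that it can be rescheduled once an agent becomes available.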
6. Checkpoint and Recovery Mechanisms
The Agent Task DAG framework MUST provide robust checkpoint and
recovery mechanisms to ensure workflow resilience and enable
graceful handling of failures, interruptions, and human
intervention points. Checkpoints represent persistent snapshots of
the DAG execution state at specific points in the workflow,
capturing sufficient information to resume execution from that
point or rollback to a previous stable state. The framework
defines three types of checkpoints: automatic checkpoints created
at predefined intervals or task completion boundaries, explicit
checkpoints requested by agents or human operators, and recovery
checkpoints generated immediately before high-risk operations that
may require rollback.
Checkpoint creation MUST capture the complete execution context as
defined in Section 4.2, including the current state of all task
nodes, intermediate results, agent assignments, and security
context derived from Agent Authorization Profiles
[draft-aap-oauth-profile]. Each checkpoint MUST include a unique
identifier, timestamp, DAG version, execution state hash, and
references to any external resources or agent context information as
specified in [draft-chang-agent-context-interaction]. The checkpoint
data
structure SHOULD be serialized using JSON [RFC8259] with optional
compression for large state objects, and MUST be digitally signed
to ensure integrity and authenticity. Checkpoints MAY be stored in
distributed storage systems to ensure availability across multiple
DAG Coordinators.
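A checkpoint carrying the required members could be built as in the
sketch below. Two assumptions are worth stating loudly. First, the
field names are hypothetical. Second, the integrity tag uses
HMAC-SHA256 with a shared key purely to keep the example within the
standard library; an HMAC is a message authentication code, not a
digital signature, so a conforming implementation would substitute an
asymmetric signature scheme.

```python
import hashlib
import hmac
import json
import uuid
from datetime import datetime, timezone


def canonical(obj) -> bytes:
    """Deterministic JSON encoding so equal states hash identically."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def create_checkpoint(dag_version: str, state: dict, key: bytes) -> dict:
    body = {
        "checkpoint_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "dag_version": dag_version,
        "state": state,
        "state_hash": hashlib.sha256(canonical(state)).hexdigest(),
    }
    # HMAC-SHA256 over the body; a stand-in for a real digital signature.
    tag = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    return {**body, "signature": tag}


def verify_checkpoint(checkpoint: dict, key: bytes) -> bool:
    body = {k: v for k, v in checkpoint.items() if k != "signature"}
    expected = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    state_digest = hashlib.sha256(canonical(body["state"])).hexdigest()
    hash_ok = body["state_hash"] == state_digest
    return hash_ok and hmac.compare_digest(expected, checkpoint["signature"])
```

Hashing a canonical encoding (sorted keys, fixed separators) matters
because two JSON serializations of the same state can otherwise differ
byte for byte and produce different hashes.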
The rollback procedure enables the DAG execution to revert to a
previous checkpoint when failures occur or human intervention
requires undoing completed work. When a rollback is initiated, the
DAG Coordinator MUST notify all participating agents of the
rollback operation, invalidate any results produced after the
target checkpoint, and restore the execution context to the
checkpoint state. Agents MUST acknowledge the rollback operation
and may need to perform agent-specific cleanup operations such as
releasing resources or notifying external systems. The rollback
operation MUST preserve audit trails by maintaining records of
both the original execution and the rollback event, ensuring
compliance with security and regulatory requirements.
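The rollback steps above (notify agents, invalidate later results,
restore state, preserve the audit trail) can be sketched in memory as
follows. The ExecutionLog class and its method names are hypothetical,
agents are modeled as plain callables that return True to acknowledge,
and invalidation only detects results added after the checkpoint, not
results that were overwritten.

```python
import copy


class ExecutionLog:
    """Tracks execution state, checkpoints, and an append-only audit trail."""

    def __init__(self):
        self.state = {"results": {}}
        self.checkpoints = {}   # checkpoint id -> deep copy of state
        self.audit = []         # never truncated, even on rollback

    def record_result(self, task_id, result):
        self.state["results"][task_id] = result
        self.audit.append(("result", task_id))

    def checkpoint(self, checkpoint_id):
        self.checkpoints[checkpoint_id] = copy.deepcopy(self.state)
        self.audit.append(("checkpoint", checkpoint_id))

    def rollback(self, checkpoint_id, agents):
        """Restore a checkpoint; returns the ids of invalidated tasks."""
        target = self.checkpoints[checkpoint_id]
        invalidated = sorted(set(self.state["results"]) - set(target["results"]))
        # Every agent must acknowledge before state is restored.
        acks = [agent(checkpoint_id) for agent in agents]
        if not all(acks):
            raise RuntimeError("rollback not acknowledged by all agents")
        self.state = copy.deepcopy(target)
        self.audit.append(("rollback", checkpoint_id, tuple(invalidated)))
        return invalidated
```

The audit list is only ever appended to, so the record of the original
execution survives alongside the rollback event.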
Failure recovery strategies operate at multiple levels within the
DAG execution framework, from individual task failures to complete
coordinator failures. For task-level failures, the framework
supports automatic retry with exponential backoff, task
reassignment to alternative agents with compatible capabilities,
and conditional continuation where dependent tasks may proceed
with degraded inputs. When coordinator failures occur, recovery
mechanisms leverage distributed checkpoints and coordinator
election protocols to restore execution state on alternative
infrastructure. The framework MUST support human-in-the-loop
recovery scenarios where automated recovery is insufficient,
providing interfaces for human operators to inspect checkpoint
states, approve recovery actions, and inject corrective context
information.
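Task-level retry with exponential backoff, followed by reassignment to
an alternative agent, might look like the sketch below. The default
values (base delay, factor, cap, attempt count) and the function names
are assumptions; this document does not prescribe them. The sleep
function is injectable so that the example runs without waiting.

```python
import random


def backoff_delays(max_attempts, base=1.0, factor=2.0, cap=60.0, jitter=0.0, rng=random):
    """Delays (seconds) to wait before each retry: base * factor**n, capped."""
    delays = []
    for attempt in range(max_attempts - 1):
        delay = min(cap, base * factor ** attempt)
        delay += rng.uniform(0, jitter * delay)  # optional jitter
        delays.append(delay)
    return delays


def run_with_retry(task, agents, max_attempts=3, sleep=lambda seconds: None):
    """Try each agent in turn, retrying with backoff before reassigning.

    task: callable taking an agent id and returning a result or raising.
    sleep is injectable so the sketch runs instantly; pass time.sleep in practice.
    """
    errors = []
    for agent in agents:
        delays = backoff_delays(max_attempts)
        for attempt in range(max_attempts):
            try:
                return agent, task(agent)
            except Exception as exc:  # broad on purpose in this sketch
                errors.append((agent, attempt, str(exc)))
                if attempt < len(delays):
                    sleep(delays[attempt])
    raise RuntimeError(f"all agents failed: {errors}")
```

With the defaults, backoff_delays(4) yields waits of 1, 2, and 4
seconds. Setting jitter above zero spreads retries out so that many
failed tasks do not retry in lockstep.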
The checkpoint and recovery mechanisms MUST integrate with the
agent authorization framework to ensure that recovery operations
maintain appropriate security boundaries and access controls.
Recovery operations SHOULD verify that participating agents still
possess valid authorization profiles and may require
re-authentication if significant time has elapsed since checkpoint
creation. The framework MUST provide configurable retention
policies for checkpoints, balancing storage efficiency with
recovery requirements, and MUST support secure deletion of
checkpoint data containing sensitive information when retention
periods expire or workflows complete successfully.
7. Integration with Existing Agent Protocols
This section describes how the Agent Task DAG framework integrates
with existing agent authorization, discovery, and communication
protocols to provide a comprehensive multi-agent workflow
execution environment. The framework is designed to be
protocol-agnostic while providing specific bindings for commonly used
agent protocols, enabling organizations to adopt DAG-based workflows
within their existing agent infrastructure.
The DAG framework builds upon the Agent Authorization Profile
(AAP) [draft-aap-oauth-profile] to establish secure task execution
contexts. When a DAG Coordinator initiates workflow execution, it
MUST obtain appropriate authorization tokens for each
participating agent using the structured claims defined in AAP.
The task context claim within the agent's JWT includes the
DAG identifier, task node assignments, and operational constraints
specific to the workflow. This approach ensures that agents can
verify their authorization to execute specific tasks within the
broader workflow context while maintaining the delegation chains
and human oversight requirements established in their
authorization profiles.
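An agent-side or coordinator-side check of the task context claim
could be sketched as follows. Only "exp" is a registered JWT claim
[RFC7519]; the "task_context" claim name and its "dag_id" and
"task_nodes" members are hypothetical placeholders, since the actual
claim structure is defined by the Agent Authorization Profile and is
not reproduced here. Signature verification is assumed to have
happened before this function is called.

```python
import time


def authorized_for_task(claims: dict, dag_id: str, task_id: str, now=None) -> bool:
    """Check decoded token claims against a task assignment.

    'exp' is the registered JWT expiry claim (RFC 7519). The 'task_context'
    claim and its members are hypothetical names used for illustration.
    Signature verification is assumed to have happened already.
    """
    now = time.time() if now is None else now
    if claims.get("exp", 0) <= now:
        return False  # expired or missing expiry
    context = claims.get("task_context", {})
    return (
        context.get("dag_id") == dag_id
        and task_id in context.get("task_nodes", [])
    )
```

Treating a missing "exp" as expired is a conservative choice for this
sketch; long-running workflows would pair the check with token
refresh.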
Agent discovery and capability matching for DAG execution
leverages existing agent discovery protocols while extending them
with DAG-specific metadata. Agents participating in DAG workflows
SHOULD advertise their capabilities using structured capability
descriptors that include supported task types, execution
constraints, and checkpoint compatibility. The DAG Coordinator
uses this information during the task binding process to assign
task nodes to appropriate agents. When multiple agents are capable
of executing a particular task type, the coordinator MAY use load
balancing, geographic distribution, or other selection criteria to
optimize workflow execution.
Context distribution among agents executing DAG workflows follows
the mechanisms defined in [draft-chang-agent-context-interaction],
with specific extensions for DAG execution state management. The
execution context for a DAG workflow includes the complete graph
structure, current execution state, intermediate task results, and
checkpoint metadata. Agents MUST receive sufficient context to
execute their assigned tasks while minimizing the distribution of
sensitive information to unauthorized agents. The framework
supports both push-based context distribution, where the DAG
Coordinator sends relevant context to agents before task
execution, and pull-based approaches where agents request specific
context elements as needed.
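One simple way to send "sufficient context" without over-sharing is to
give each task only the results of its direct predecessors. The
function below is a sketch of that policy; the name context_for_task
and the (source, target) edge tuples are assumptions, and real
deployments may need additional context such as checkpoint metadata.

```python
def context_for_task(task_id, edges, results):
    """Return only the results of the task's direct predecessors.

    edges: iterable of (source, target) pairs; results: task id -> result.
    Predecessors that have not produced a result yet are omitted.
    """
    predecessors = {source for source, target in edges if target == task_id}
    return {tid: results[tid] for tid in sorted(predecessors) if tid in results}
```

Results of unrelated branches never reach the agent, which limits
exposure of sensitive intermediate data.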
The framework provides protocol bindings for common agent
communication patterns including HTTP-based REST APIs [RFC9110],
message queuing systems, and real-time communication protocols.
Each binding specifies how DAG execution messages are encoded, how
task results are reported, and how checkpoint operations are
coordinated across the distributed agent environment.
Protocol-specific considerations such as connection management, retry
mechanisms, and error handling are addressed within each binding
specification. For HTTP-based bindings, the framework defines
standardized endpoints for task execution, status reporting, and
checkpoint operations that can be implemented by any agent
supporting the DAG execution protocol.
Integration with existing agent task protocols
[draft-cui-ai-agent-task] is achieved through task node adapters that
translate between DAG task specifications and protocol-specific task
representations. These adapters handle differences in task
parameterization, result formatting, and execution semantics while
preserving the dependency relationships and execution guarantees
required by the DAG framework. The framework also supports
integration with audit and compliance systems through standardized
logging interfaces that capture task execution events,
authorization decisions, and checkpoint operations in formats
compatible with existing security and compliance tools.
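One possible shape for a task node adapter is sketched below. The wire format shown (fields "op", "args", "ref", "outcome", "data") belongs to an imaginary task protocol invented for illustration; it is not the format of [draft-cui-ai-agent-task]. The DagTask and ExampleProtocolAdapter names are likewise assumptions.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class DagTask:
    node_id: str
    action: str
    params: dict
    depends_on: tuple = ()


class TaskNodeAdapter(Protocol):
    """Interface every protocol-specific adapter is assumed to implement."""
    def to_wire(self, task: DagTask) -> dict: ...
    def from_wire(self, node_id: str, reply: dict) -> dict: ...


class ExampleProtocolAdapter:
    """Adapter for an imaginary protocol using 'op', 'args', and 'outcome'."""

    def to_wire(self, task: DagTask) -> dict:
        # Dependencies stay with the coordinator; the wire message carries
        # only what the target protocol needs to run this one task.
        return {"op": task.action, "args": dict(task.params), "ref": task.node_id}

    def from_wire(self, node_id: str, reply: dict) -> dict:
        # Normalize the protocol's reply into a uniform DAG result record.
        ok = reply.get("outcome") == "done"
        return {"node_id": node_id,
                "status": "succeeded" if ok else "failed",
                "output": reply.get("data")}


adapter = ExampleProtocolAdapter()
task = DagTask("n2", "summarize", {"max_words": 50}, depends_on=("n1",))
wire = adapter.to_wire(task)
result = adapter.from_wire(task.node_id, {"outcome": "done", "data": "short text"})
```

Keeping dependency edges out of the wire message is a design choice in this sketch: the coordinator alone decides when a node is ready, so the adapted protocol never needs to understand the graph.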
8. Security Considerations
The Agent Task DAG framework introduces unique security challenges
that extend beyond those of traditional single-agent systems.
Multi-agent workflows create expanded attack surfaces through
inter-agent communication channels, shared execution contexts, and
distributed state management. Malicious actors may attempt to inject
unauthorized tasks into DAG structures, manipulate task
dependencies to create privilege escalation paths, or exploit
checkpoint mechanisms to gain persistent access to workflow state.
The distributed nature of DAG execution also amplifies risks
related to agent impersonation, context poisoning, and
unauthorized workflow modification during execution.
Task authorization within DAG workflows MUST leverage the Agent
Authorization Profile [draft-aap-oauth-profile] to establish fine-
grained permissions for each task node. Each task node SHOULD
include authorization requirements that specify which agent
capabilities, delegation chains, and operational constraints are
required for execution. The DAG Coordinator MUST verify that
assigned agents possess valid JWTs with appropriate structured
claims before initiating task execution. When tasks
involve sensitive operations or access to protected resources,
implementations SHOULD require fresh token validation rather than
relying on cached authorization state. Multi-step workflows that
span extended time periods MUST implement token refresh mechanisms
to maintain security throughout DAG execution.
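The coordinator-side check described above might look like the following sketch. It assumes the token's signature has already been verified by a JWT library and that only the decoded claims are passed in. The "exp" claim is the standard expiry claim from [RFC7519]; the "capabilities" claim and the underscore spelling "dag_id" are assumptions made for this example.

```python
import time
from typing import Optional


def authorize_task(claims: dict, required: dict, now: Optional[float] = None) -> bool:
    """Return True only if already-verified token claims satisfy a task node."""
    now = time.time() if now is None else now
    if claims.get("exp", 0) <= now:                  # expired, or no expiry at all
        return False
    if claims.get("dag_id") != required["dag_id"]:   # token bound to another DAG
        return False
    # Every capability the node requires must be present in the token.
    return set(required["capabilities"]) <= set(claims.get("capabilities", []))


required = {"dag_id": "dag-42", "capabilities": ["read:reports"]}
claims = {"exp": 2000, "dag_id": "dag-42",
          "capabilities": ["read:reports", "write:notes"]}
valid_now = authorize_task(claims, required, now=1000)    # before expiry
valid_later = authorize_task(claims, required, now=3000)  # after expiry
```

A long-running workflow would call this check again before each task and trigger a token refresh when it fails only because of expiry.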
Context isolation represents a critical security boundary in
multi-agent DAG systems. Execution contexts MUST be isolated
between different DAG instances to prevent information leakage and
unauthorized access to intermediate results. Implementations
SHOULD use cryptographic techniques to protect context data in
transit and at rest, particularly when context distribution
mechanisms [draft-chang-agent-context-interaction] are employed
across network boundaries. Task nodes that handle sensitive data
MUST implement appropriate data classification and handling
controls, ensuring that context information is only accessible to
authorized agents within the workflow. The framework SHOULD
support configurable context sharing policies that allow
administrators to define which context elements can be shared
between tasks and which must remain isolated.
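A configurable sharing policy combined with per-instance isolation could be sketched as below. The IsolatedContexts class and the element names are hypothetical; encryption in transit and at rest is out of scope for this illustration.

```python
from collections import defaultdict


class IsolatedContexts:
    """Keeps each DAG instance's context in its own namespace (hypothetical)."""

    def __init__(self, shareable: set):
        self._shareable = set(shareable)   # element names tasks may pass on
        self._by_dag = defaultdict(dict)   # DAG instance id -> {element: value}

    def put(self, dag_id: str, name: str, value) -> None:
        self._by_dag[dag_id][name] = value

    def share(self, dag_id: str, names: list) -> dict:
        """Return only this instance's elements that policy allows sharing."""
        ctx = self._by_dag.get(dag_id, {})
        return {n: ctx[n] for n in names if n in ctx and n in self._shareable}


contexts = IsolatedContexts(shareable={"summary"})
contexts.put("dag-1", "summary", "ok")
contexts.put("dag-1", "raw_pii", "name and address")
contexts.put("dag-2", "summary", "other workflow")
shared = contexts.share("dag-1", ["summary", "raw_pii"])
```

Because every lookup is keyed by the DAG instance identifier, a task in one workflow cannot name an element that belongs to another.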
Audit trail requirements for DAG execution are more complex than in
single-agent scenarios due to the distributed and potentially
parallel nature of task execution. Implementations MUST maintain
comprehensive logs that capture DAG initiation, task assignments,
agent authorizations, execution outcomes, and any human
intervention points. Audit records SHOULD include cryptographic
signatures or integrity mechanisms to prevent tampering and
support forensic analysis. The checkpoint and recovery mechanisms
introduce additional logging requirements, as rollback operations
and failure recovery attempts MUST be fully auditable.
Organizations operating in regulated environments may require
enhanced audit capabilities that provide real-time monitoring of
DAG execution state and automated alerts for security policy
violations.
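One common integrity mechanism for audit records is a hash chain, sketched below with the Python standard library. This is an illustrative technique chosen for the example, not one mandated by this document; the event fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash: str, event: dict) -> str:
    body = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(log: list, event: dict) -> dict:
    """Append an audit event whose hash covers the previous record's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    record = {"prev": prev_hash, "event": event, "hash": _digest(prev_hash, event)}
    log.append(record)
    return record


def verify_chain(log: list) -> bool:
    """Recompute every hash; an edited, reordered, or interior-deleted record fails."""
    prev_hash = GENESIS
    for record in log:
        if record["prev"] != prev_hash or record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True


log = []
append_record(log, {"type": "dag_started", "dag": "dag-42"})
append_record(log, {"type": "task_assigned", "node": "n1", "agent": "agent-b"})
append_record(log, {"type": "rollback", "checkpoint": "cp-3"})
intact = verify_chain(log)
log[1]["event"]["agent"] = "agent-x"   # simulate tampering with a stored record
still_intact = verify_chain(log)
```

A plain hash chain does not detect truncation of the newest records, and an attacker who can rewrite the whole log can recompute it. A deployment would therefore also sign or externally anchor the latest hash.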
The integration of human oversight points within DAG workflows
creates additional security considerations around authentication,
authorization, and workflow integrity. Human operators MUST be
properly authenticated before approving task continuations or
modifying workflow parameters. The framework SHOULD support multi-
factor authentication and role-based access controls for human
intervention points. Implementations MUST ensure that human
approval requirements cannot be bypassed through agent
coordination or DAG manipulation. When human operators modify
workflow parameters or approve exceptional conditions, these
actions MUST be cryptographically signed and integrated into the
workflow's audit trail to maintain end-to-end accountability.
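The approval gate can be sketched as follows. Note the substitution: the example uses an HMAC with a shared key as a stand-in for a real digital signature, because the Python standard library offers no asymmetric signing. The record fields and function names are hypothetical.

```python
import hashlib
import hmac
import json


def sign_approval(key: bytes, approval: dict) -> str:
    """MAC over a canonical JSON encoding of the approval record."""
    payload = json.dumps(approval, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def may_proceed(key: bytes, node_id: str, approval: dict, signature: str) -> bool:
    """Release a gated task only for a valid approval of this exact node."""
    expected = sign_approval(key, approval)
    return (hmac.compare_digest(expected, signature)
            and approval.get("node_id") == node_id
            and approval.get("decision") == "approve")


key = b"demo-key-not-for-production"
approval = {"node_id": "n7", "operator": "alice", "decision": "approve"}
signature = sign_approval(key, approval)

released = may_proceed(key, "n7", approval, signature)
replayed = may_proceed(key, "n8", approval, signature)           # wrong node
forged = may_proceed(key, "n7", {**approval, "operator": "mallory"}, signature)
```

Binding the node identifier into the signed record is what prevents an approval for one task from being replayed to release another.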
9. IANA Considerations
This document introduces several new protocol elements and
identifiers that require IANA registration to ensure global
uniqueness and interoperability across implementations. The Agent
Task DAG framework extends existing agent communication protocols
with new message types, node classifications, and execution state
identifiers that must be standardized for consistent
implementation.
The specification requires the establishment of a new "Agent Task
DAG Parameters" registry to manage the various identifiers used
within the framework. This registry MUST include sub-registries
for DAG node types, edge relationship types, execution states,
checkpoint types, and recovery action identifiers. Each sub-
registry MUST follow the "Specification Required" registration
policy as defined in [RFC8126], with designated experts reviewing
submissions for technical correctness and consistency with the
overall framework architecture. The registry MUST also accommodate
extensions that integrate with existing agent authorization
profiles as defined in [draft-aap-oauth-profile].
A new "application/vnd.ietf.agent-task-dag+json" media type
registration is REQUIRED for DAG workflow documents. This media
type MUST reference this specification and follow the JSON format
requirements specified in [RFC8259]. The media type enables proper
content negotiation when agents exchange DAG definitions and
execution state information. Additionally, new URI schemes
"agent-dag:" and "agent-task:" are proposed for identifying DAG
instances and individual task nodes, respectively, requiring
registration in the "Uniform Resource Identifier (URI) Schemes"
registry maintained by IANA.
The framework introduces new JWT claim names for representing DAG
execution context and task bindings within agent authorization
tokens, extending the structured claims mechanism defined in
[draft-aap-oauth-profile]. These claim names MUST be registered in
the "JSON Web Token Claims" registry established by [RFC7519]. The
new claims include "dag_id", "task_node", "execution_context",
"checkpoint_ref", and "recovery_state", each with specific semantic
meanings within the DAG execution protocol. Registration of these
claims ensures consistent interpretation across different agent
implementations and authorization servers.
Finally, new HTTP header fields "DAG-Execution-ID" and
"DAG-Checkpoint" are introduced for coordination between agents
during DAG execution. These headers MUST be registered in the
"Hypertext Transfer Protocol (HTTP) Field Name Registry" as defined
in [RFC9110]. The headers enable stateless coordination mechanisms
and support the checkpoint and recovery procedures specified in
this framework, while maintaining compatibility with existing
HTTP-based agent communication protocols.
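As an illustration of how the proposed media type and header fields might appear together, the sketch below assembles one task-dispatch request as plain data. The endpoint path "/tasks", the header value formats, and the body fields are assumptions; this document does not define them in this section.

```python
import json


def build_task_request(dag_id: str, checkpoint: str, task: dict) -> dict:
    """Assemble method, path, headers, and body for one task dispatch."""
    return {
        "method": "POST",
        "path": "/tasks",  # hypothetical endpoint path
        "headers": {
            "Content-Type": "application/vnd.ietf.agent-task-dag+json",
            "DAG-Execution-ID": dag_id,     # value format is an assumption
            "DAG-Checkpoint": checkpoint,   # value format is an assumption
        },
        "body": json.dumps(task, sort_keys=True),
    }


request = build_task_request("dag-42", "cp-3", {"node": "n2", "action": "summarize"})
```

Carrying the execution and checkpoint identifiers in header fields lets a receiving agent correlate the request with workflow state without parsing the body first.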
10. References
10.1. Normative References
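[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement
Levels", BCP 14, RFC 2119, March 1997.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key
Words", BCP 14, RFC 8174, May 2017.
[RFC8259]
Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
Interchange Format", STD 90, RFC 8259, December 2017.
[RFC7519]
Jones, M., Bradley, J., and N. Sakimura, "JSON Web Token (JWT)",
RFC 7519, May 2015.
[RFC8126]
Cotton, M., Leiba, B., and T. Narten, "Guidelines for Writing an
IANA Considerations Section in RFCs", BCP 26, RFC 8126, June 2017.
[draft-aap-oauth-profile]
draft-aap-oauth-profile
[draft-cui-ai-agent-task]
draft-cui-ai-agent-task
[draft-guy-bary-stamp-protocol]
draft-guy-bary-stamp-protocol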
10.2. Informative References
[RFC6749]
Hardt, D., Ed., "The OAuth 2.0 Authorization Framework", RFC 6749,
October 2012.
[RFC9110]
Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP
Semantics", STD 97, RFC 9110, June 2022.
[draft-chang-agent-context-interaction]
draft-chang-agent-context-interaction
[draft-liu-dmsc-acps-arc]
draft-liu-dmsc-acps-arc
[draft-rosenberg-aiproto-framework]
draft-rosenberg-aiproto-framework
[draft-song-oauth-ai-agent-collaborate-authz]
draft-song-oauth-ai-agent-collaborate-authz
[draft-mao-rtgwg-apn-framework-for-ioa]
draft-mao-rtgwg-apn-framework-for-ioa
[draft-nandakumar-ai-agent-moq-transport]
draft-nandakumar-ai-agent-moq-transport
Author's Address
Generated by IETF Draft Analyzer
Family: agent-ecosystem
2026-03-04