ietf-draft-analyzer/workspace/drafts/gap-analysis/draft-nennemann-agent-federation-privacy-00.md
Christian Nennemann 2506b6325a
feat: add draft data, gap analysis report, and workspace config
2026-04-06 18:47:15 +02:00


fullname: Christian Nennemann
organization: Independent Researcher
email: ietf@nennemann.de

normative:
  RFC2119:
  RFC8174:
  RFC7519:
  RFC7515:
  RFC9110:
  I-D.nennemann-wimse-ect:
    title: "Execution Context Tokens for Distributed Agentic Workflows"
    target: https://datatracker.ietf.org/doc/draft-nennemann-wimse-ect/
  I-D.nennemann-agent-dag-hitl-safety:
    title: "Agent Context Policy Token: DAG Delegation with Human Override"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-dag-hitl-safety/

informative:
  I-D.nennemann-agent-gap-analysis:
    title: "Gap Analysis of IETF Agent-Related Drafts"
    target: https://datatracker.ietf.org/doc/draft-nennemann-agent-gap-analysis/

--- abstract

This document defines privacy-preserving protocols for federated agent learning across organizational boundaries and standardized mechanisms for agent migration between protocols, domains, and infrastructure providers while maintaining state and identity continuity. Federated learning enables multiple agent deployments to collaboratively improve without sharing raw data, but requires formal privacy guarantees to prevent data leakage between participants. Cross-protocol migration enables agents to move between environments while preserving operational state and cryptographic identity through Execution Context Tokens (ECTs).

--- middle

Introduction

As AI agents become integral to enterprise workflows, two capabilities emerge as critical yet underspecified: collaborative learning across organizational boundaries and seamless migration between protocol environments.

This document addresses Gap 5 (Federated Learning Privacy) and Gap 8 (Cross-Protocol Migration) as identified in {{I-D.nennemann-agent-gap-analysis}}.

Gap 5 concerns the absence of privacy-preserving protocols for federated agent learning. As agents learn and improve through federation, data leakage between participants must be prevented. Current federated learning research provides theoretical foundations, but no IETF-standard protocol exists that integrates differential privacy, secure aggregation, and privacy budget enforcement into agent communication frameworks.

Gap 8 concerns the lack of standardized mechanisms for agent migration between protocols, domains, and infrastructure providers. As agents need to move between environments -- whether for load balancing, disaster recovery, or organizational restructuring -- state and identity must transfer safely. Without a migration protocol, agents lose context, learned parameters, and cryptographic identity when changing environments.

This document builds on the Execution Context Token (ECT) framework {{I-D.nennemann-wimse-ect}} to provide cryptographic audit trails for both federated learning rounds and migration events, and on the Agent Context Policy Token {{I-D.nennemann-agent-dag-hitl-safety}} to enforce privacy and migration policies within delegation DAGs.

Terminology

{::boilerplate bcp14-tagged}

The following terms are used in this document:

Federated Learning:
A machine learning approach where multiple participants collaboratively train a model without sharing raw data, instead exchanging model updates (gradients or parameters).
Differential Privacy:
A mathematical framework that formally bounds how much the output of a computation can reveal about whether any individual data point was included in the input. The bound is parameterized by epsilon (the privacy loss) and delta (a small additive slack in that bound).
Secure Aggregation:
A cryptographic protocol enabling a server to compute the sum of participant updates without learning any individual update.
Privacy Budget:
A cumulative bound (epsilon) on the total privacy loss incurred across multiple rounds of federated learning, enforced to prevent gradual information leakage.
Data Leakage:
The unintended exposure of private training data through model updates, inference attacks, or side channels during federated learning.
Agent Migration:
The process of transferring an agent's operational state, identity, and capabilities from one protocol environment, domain, or infrastructure provider to another.
State Transfer:
The serialization, transmission, and deserialization of an agent's internal state during migration, including context, memory, learned parameters, and active tasks.
Identity Continuity:
The property that an agent's cryptographic identity (e.g., SPIFFE ID and associated ECT chain) remains verifiable across migration boundaries.
Protocol Bridge:
A component that translates agent communication between different protocols (e.g., A2A to MCP), maintaining semantic equivalence of messages and state.
Migration Handoff:
The coordinated process by which the source environment transfers responsibility for an agent to the destination environment, including state transfer and identity re-attestation.

Federated Agent Learning Privacy

Federated Learning Architecture for Agents

Federated learning for agents follows a topology where participant agents contribute model updates to an aggregation function without exposing their local training data.

+---------------------------------------------------+
|              Federation Topology                   |
|                                                    |
|   Star:        Ring:        Hierarchical:          |
|                                                    |
|      [Agg]      A1--A2       [Root Agg]            |
|     / | \       |    |       /        \            |
|   A1  A2  A3   A3--A4    [Sub-Agg]  [Sub-Agg]     |
|                           /   \       /   \        |
|                         A1    A2    A3    A4       |
|                                                    |
+---------------------------------------------------+

  [Agg] = Aggregation Server
  A1..A4 = Participant Agents

  Data flow (Star topology):

  A1 ---local_update---> [Agg]
  A2 ---local_update---> [Agg]
  A3 ---local_update---> [Agg]
        [Agg] computes aggregate
  A1 <--global_model--- [Agg]
  A2 <--global_model--- [Agg]
  A3 <--global_model--- [Agg]

{: #fig-federation-arch title="Federated Learning Topologies for Agents"}

Three topologies are defined:

Star Topology:
A central aggregation server receives updates from all participant agents and distributes the aggregated model. This is the simplest topology but creates a single point of trust.
Ring Topology:
Participant agents pass updates around a ring, each adding its own contribution before forwarding. This eliminates the central server but increases latency.
Hierarchical Topology:
Sub-aggregation servers collect updates from subsets of agents before forwarding to a root aggregator. This scales to large federations while limiting exposure at each level.

The aggregation server (or function, in ring topology) MUST NOT have access to individual agent updates in plaintext when secure aggregation is enabled.

Privacy Mechanisms

Differential Privacy for Model Updates

Participant agents MUST apply differential privacy to model updates before transmission. Each update is clipped to a maximum L2 norm S and perturbed with calibrated Gaussian noise:

  • Clipping bound S: limits the influence of any single data point
  • Noise scale sigma: calibrated to achieve (epsilon, delta)-differential privacy for each round
  • Composition: total privacy loss across T rounds is tracked using the moments accountant or Rényi differential privacy

The privacy parameters MUST be declared in the federation configuration and enforced by each participant agent.
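
The clip-then-noise step described above can be sketched as follows. This is a minimal illustration, not a normative algorithm: the function names are hypothetical, updates are assumed to be flat lists of floats, and `gaussian_sigma` uses the classical Gaussian-mechanism calibration, which is only valid for epsilon below 1. `random.Random` is used for repeatability; a deployment would need a cryptographically secure noise source.

```python
import math
import random


def l2_norm(vec):
    """Euclidean (L2) norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def clip_update(update, clip_bound):
    """Scale the update down so that its L2 norm is at most clip_bound (S)."""
    norm = l2_norm(update)
    scale = min(1.0, clip_bound / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def gaussian_sigma(clip_bound, epsilon, delta):
    """Classical Gaussian-mechanism calibration (assumes 0 < epsilon < 1)."""
    return clip_bound * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def privatize_update(update, clip_bound, sigma, rng):
    """Clip to L2 norm S, then add independent N(0, sigma^2) noise per coordinate."""
    clipped = clip_update(update, clip_bound)
    return [x + rng.gauss(0.0, sigma) for x in clipped]


rng = random.Random(7)  # seeded only to make the example repeatable
raw = [3.0, 4.0]        # L2 norm is 5.0, so it will be scaled down to norm 1.0
sigma = gaussian_sigma(clip_bound=1.0, epsilon=0.5, delta=1e-5)
noisy = privatize_update(raw, clip_bound=1.0, sigma=sigma, rng=rng)
```

Clipping before adding noise is what makes the calibration meaningful: the noise scale is proportional to S, the largest norm any single update can have after clipping.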

Secure Aggregation Protocol

The aggregation server MUST implement a secure aggregation protocol such that:

  1. Each participant agent masks its update with pairwise masks derived from keys established with the other participants, and secret-shares the mask seeds among them.
  2. The aggregation server collects masked updates from all participants.
  3. After a configurable threshold of participants have submitted updates, the server reconstructs only the aggregate sum.
  4. Individual updates are never available to the server in plaintext.

Dropped participants are handled by reconstructing their masking contributions from the shares held by surviving participants.
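
The cancellation property behind the masked updates can be illustrated with a small sketch. It assumes updates are fixed-point integers modulo 2^32 and that each pair of agents already shares a seed; the seeds and function names are hypothetical, and `random.Random` merely stands in for a cryptographic pseudorandom generator. Threshold handling and recovery of dropped participants are not covered.

```python
import random

MODULUS = 2 ** 32  # updates are assumed to be fixed-point integers mod 2^32


def prg_mask(seed, length):
    """Expand a shared seed into a mask vector.

    random.Random is NOT a cryptographic PRG; it only stands in for one here.
    """
    rng = random.Random(seed)
    return [rng.randrange(MODULUS) for _ in range(length)]


def mask_update(agent_id, update, pair_seeds):
    """Add the pair mask if this agent has the smaller ID, subtract it otherwise."""
    masked = list(update)
    for (a, b), seed in pair_seeds.items():
        if agent_id not in (a, b):
            continue
        sign = 1 if agent_id == a else -1  # keys are (a, b) with a < b
        mask = prg_mask(seed, len(update))
        masked = [(m + sign * k) % MODULUS for m, k in zip(masked, mask)]
    return masked


def aggregate(masked_updates):
    """Sum masked updates; every pairwise mask cancels in the total."""
    length = len(masked_updates[0])
    return [sum(u[i] for u in masked_updates) % MODULUS for i in range(length)]


updates = {1: [10, 20], 2: [1, 2], 3: [100, 200]}
# Hypothetical pairwise seeds; a deployment would derive them via key agreement.
pair_seeds = {(1, 2): 1001, (1, 3): 1002, (2, 3): 1003}
masked = [mask_update(i, u, pair_seeds) for i, u in updates.items()]
total = aggregate(masked)
print(total)  # prints [111, 222]
```

The server only ever sees the entries of `masked`; the true sum appears once all contributions are added, because each mask is added by one agent and subtracted by its peer.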

Privacy Budget Tracking and Enforcement

Each federation MUST maintain a privacy budget tracker that records cumulative epsilon expenditure per participant. The tracker MUST:

  • Record the epsilon cost of each federated learning round
  • Refuse to include a participant whose cumulative epsilon would exceed the configured maximum budget
  • Support budget refresh after a configurable cooldown period
  • Report remaining budget to participants upon request

Privacy budget state MUST be recorded in ECTs (see {{ect-integration}}) to provide a cryptographic audit trail of privacy expenditure.
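
A budget tracker meeting the four requirements above might look like the sketch below. Several choices are assumptions rather than part of this document: epsilon is summed across rounds (basic composition, a conservative stand-in for the moments accountant), the cooldown starts at the first charge of a window, and the class and method names are hypothetical.

```python
import time


class PrivacyBudgetTracker:
    """Tracks cumulative epsilon per participant under simple additive composition."""

    def __init__(self, max_total_epsilon, refresh_seconds, clock=time.time):
        self.max_total_epsilon = max_total_epsilon
        self.refresh_seconds = refresh_seconds
        self.clock = clock
        self.spent = {}         # participant -> cumulative epsilon
        self.window_start = {}  # participant -> time of first charge in window

    def _maybe_refresh(self, participant):
        """Reset the participant's budget once the cooldown has elapsed."""
        start = self.window_start.get(participant)
        if start is not None and self.clock() - start >= self.refresh_seconds:
            self.spent[participant] = 0.0
            del self.window_start[participant]

    def remaining(self, participant):
        """Budget still available to the participant in the current window."""
        self._maybe_refresh(participant)
        return self.max_total_epsilon - self.spent.get(participant, 0.0)

    def charge(self, participant, epsilon):
        """Record a round's cost; return False (and record nothing) if over budget."""
        if epsilon > self.remaining(participant) + 1e-12:
            return False
        self.spent[participant] = self.spent.get(participant, 0.0) + epsilon
        self.window_start.setdefault(participant, self.clock())
        return True


now = [0.0]  # fake clock so the example does not depend on wall time
tracker = PrivacyBudgetTracker(10.0, 86400, clock=lambda: now[0])
results = [tracker.charge("agent-1", 2.0) for _ in range(6)]
print(results)  # prints [True, True, True, True, True, False]
```

The sixth round is refused because it would push the participant past the configured maximum of 10.0; after the cooldown elapses the full budget is available again.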

Gradient Compression with Privacy Preservation

To reduce communication overhead, participants MAY compress model updates using techniques such as top-k sparsification or random sparsification. Compression MUST NOT weaken the effective privacy guarantee beyond the declared epsilon: noise MUST be added after compression, calibrated to the compressed update's sensitivity.

Data Leakage Prevention

Membership Inference Attack Mitigation

Federation participants MUST apply differential privacy with an epsilon small enough to bound the success rate of membership inference attacks. The aggregation server SHOULD monitor update distributions for anomalous patterns indicative of membership inference attempts.

Model Inversion Attack Prevention

To prevent reconstruction of training data from model updates:

  • Updates MUST be clipped and noised per the differential privacy mechanism defined above.
  • The aggregation server MUST NOT release per-participant update statistics.
  • Participants SHOULD limit the number of rounds in which they participate with unchanged local data.

Update Poisoning Detection

The aggregation server MUST implement poisoning detection to identify malicious updates that attempt to corrupt the global model:

  • Statistical outlier detection on update norms and directions
  • Byzantine-robust aggregation (e.g., coordinate-wise median or trimmed mean) as an alternative to simple averaging
  • Participants submitting suspected poisoned updates SHOULD be flagged and excluded from subsequent rounds pending review
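
The detection and aggregation techniques listed above can be sketched as follows. The sketch assumes the aggregator can inspect individual (already noised) updates as lists of floats, which is not the case when secure aggregation hides them; the function names and the outlier threshold of three times the median norm are illustrative assumptions.

```python
import statistics


def coordinate_median(updates):
    """Coordinate-wise median across participant updates."""
    return [statistics.median(column) for column in zip(*updates)]


def trimmed_mean(updates, trim):
    """Coordinate-wise mean after dropping the `trim` smallest and largest values."""
    if 2 * trim >= len(updates):
        raise ValueError("trim removes every value")
    result = []
    for column in zip(*updates):
        kept = sorted(column)[trim:len(column) - trim]
        result.append(sum(kept) / len(kept))
    return result


def flag_norm_outliers(updates, factor=3.0):
    """Indices of updates whose L2 norm exceeds `factor` times the median norm."""
    norms = [sum(x * x for x in u) ** 0.5 for u in updates]
    cutoff = factor * statistics.median(norms)
    return [i for i, n in enumerate(norms) if n > cutoff]


# Three honest updates near [1, 1] and one poisoned update with a huge norm.
updates = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [50.0, -40.0]]
suspects = flag_norm_outliers(updates)
robust = coordinate_median(updates)
trimmed = trimmed_mean(updates, trim=1)
```

With these inputs `suspects` contains only index 3, and both `robust` and `trimmed` stay close to [1.1, 0.9], whereas a plain average would be dragged far away by the poisoned update.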

Privacy Attestation via ECT

Each federated learning round MUST produce an ECT {{I-D.nennemann-wimse-ect}} attesting to the privacy mechanisms applied. The ECT ext claim MUST include:

{
  "ext": {
    "fed.round_id": "round-42",
    "fed.epsilon": 1.5,
    "fed.delta": 1e-5,
    "fed.participants": 12,
    "fed.aggregation": "secure_aggregation",
    "fed.poisoning_detected": false
  }
}

{: #fig-privacy-attestation title="Privacy Attestation in ECT Extension Claims"}

Privacy Policy Format

Federation participants MUST publish a machine-readable privacy policy document describing their federation parameters. The policy is a JSON object:

{
  "federation_policy_version": "1.0",
  "max_epsilon_per_round": 2.0,
  "max_total_epsilon": 10.0,
  "delta": 1e-5,
  "secure_aggregation_required": true,
  "min_participants": 3,
  "budget_refresh_seconds": 86400,
  "allowed_topologies": ["star", "hierarchical"],
  "data_categories_excluded": ["PII", "PHI"]
}

{: #fig-privacy-policy title="Machine-Readable Privacy Policy"}

Each round's ECT SHOULD carry a fed.policy_hash claim in its ext field containing the SHA-256 hash of the federation privacy policy document, enabling verifiers to confirm that a specific policy was in effect during a learning round.
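
A verifier-side check of the attestation claims against the policy could be sketched as below. This document does not define a canonical encoding for hashing the policy, so the sorted-key compact JSON form used here is an assumption, as are the function names and the reading of the policy's delta as an upper bound.

```python
import hashlib
import json


def policy_hash(policy):
    """SHA-256 over a canonical JSON encoding (sorted keys, no extra whitespace).

    The canonical form is an assumption; the draft does not define one.
    """
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_attestation(ext, policy):
    """Return a list of policy violations found in an ECT ext claim set."""
    problems = []
    if ext["fed.epsilon"] > policy["max_epsilon_per_round"]:
        problems.append("epsilon exceeds max_epsilon_per_round")
    if ext["fed.delta"] > policy["delta"]:
        problems.append("delta exceeds policy delta")
    if ext["fed.participants"] < policy["min_participants"]:
        problems.append("too few participants")
    if (policy["secure_aggregation_required"]
            and ext["fed.aggregation"] != "secure_aggregation"):
        problems.append("secure aggregation required but not attested")
    if ext.get("fed.policy_hash") != policy_hash(policy):
        problems.append("policy hash mismatch")
    return problems


policy = {
    "federation_policy_version": "1.0",
    "max_epsilon_per_round": 2.0,
    "max_total_epsilon": 10.0,
    "delta": 1e-5,
    "secure_aggregation_required": True,
    "min_participants": 3,
}
ext = {
    "fed.round_id": "round-42",
    "fed.epsilon": 1.5,
    "fed.delta": 1e-5,
    "fed.participants": 12,
    "fed.aggregation": "secure_aggregation",
    "fed.policy_hash": policy_hash(policy),
}
print(check_attestation(ext, policy))  # prints []
```

Hashing a canonical form matters because two JSON documents with the same content but different key order or whitespace would otherwise produce different hashes.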

Cross-Protocol Agent Migration

Migration Model

+-----------------------------------------------------------+
|                   Migration Flow                          |
|                                                           |
|  Source Domain (Protocol A)    Dest Domain (Protocol B)   |
|  +---------------------+      +---------------------+    |
|  |                     |      |                     |    |
|  |  [Agent]            |      |            [Agent]  |    |
|  |    |                |      |              ^      |    |
|  |    | 1.trigger      |      |              |      |    |
|  |    v                |      |         5.resume    |    |
|  |  [Serialize State]  |      |              |      |    |
|  |    |                |      |  [Deserialize State]|    |
|  |    | 2.package      |      |       ^      |      |    |
|  |    v                |      |       |4.recv|      |    |
|  |  [Migration Msg]----|--3.transfer--|------+      |    |
|  |                     |      |                     |    |
|  +---------------------+      +---------------------+    |
|                                                           |
|  ECT Chain: migration_start -> migration_transfer         |
|                             -> migration_complete         |
+-----------------------------------------------------------+

{: #fig-migration-model title="Agent Migration Between Domains"}

Migration Protocol

Migration Trigger Events and Conditions

A migration MAY be triggered by any of the following events:

  • Operator-initiated domain transfer
  • Load balancing across infrastructure providers
  • Disaster recovery failover
  • Protocol deprecation requiring protocol change
  • Policy-driven relocation (e.g., data sovereignty requirements)

The migration trigger MUST be recorded in an ECT with exec_act set to "migration_start".

Pre-Migration Capability Check

Before initiating migration, the source environment MUST verify that the destination environment supports the agent's required capabilities:

  1. Query the destination's capability advertisement endpoint.
  2. Verify that all required agent capabilities can be mapped to the destination protocol.
  3. Verify that the destination accepts the agent's identity format (e.g., SPIFFE ID).
  4. Confirm sufficient resources at the destination for the agent's state size.

If any check fails, the migration MUST be aborted and an error reported to the triggering entity.
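
The checks above can be sketched as a single function that collects failures. The shape of the capability advertisement (`DestinationAdvert`), the capability map, and all names are hypothetical; fetching the advertisement from the destination's endpoint (step 1) is left out, so the advertisement is passed in directly.

```python
from dataclasses import dataclass, field


@dataclass
class DestinationAdvert:
    """Hypothetical shape of a destination's capability advertisement."""
    capabilities: set = field(default_factory=set)
    identity_formats: set = field(default_factory=set)
    free_bytes: int = 0


def pre_migration_check(required_caps, capability_map, identity_format,
                        state_size, advert):
    """Return a list of failed checks; an empty list means migration may proceed."""
    failures = []
    for cap in sorted(required_caps):
        mapped = capability_map.get(cap)  # name of the capability at the destination
        if mapped is None or mapped not in advert.capabilities:
            failures.append(f"capability not supported: {cap}")
    if identity_format not in advert.identity_formats:
        failures.append(f"identity format rejected: {identity_format}")
    if state_size > advert.free_bytes:
        failures.append("insufficient resources for agent state")
    return failures


advert = DestinationAdvert(
    capabilities={"resource.read", "tool.call"},
    identity_formats={"spiffe"},
    free_bytes=50_000_000,
)
failures = pre_migration_check(
    required_caps={"task.read", "task.stream"},
    capability_map={"task.read": "resource.read"},  # task.stream has no mapping
    identity_format="spiffe",
    state_size=12_000_000,
    advert=advert,
)
print(failures)  # prints ['capability not supported: task.stream']
```

Returning every failed check, rather than stopping at the first, gives the triggering entity a complete error report when the migration is aborted.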

State Serialization Format

Agent state MUST be serialized using CBOR (Concise Binary Object Representation) with the following top-level structure:

migration_state = {
  "version": uint,            ; serialization format version
  "agent_id": tstr,           ; agent SPIFFE ID
  "source_protocol": tstr,    ; source protocol identifier
  "dest_protocol": tstr,      ; destination protocol identifier
  "timestamp": uint,          ; Unix timestamp of serialization
  "state": {
    "context": bstr,          ; conversation/task context
    "memory": bstr,           ; long-term memory store
    "learned_params": bstr,   ; model parameters or embeddings
    "active_tasks": [* task]  ; in-progress task descriptors
  },
  "ect_chain": [* tstr],      ; ECT JWS chain for identity
  "integrity": tstr           ; HMAC-SHA256 of state fields
}

{: #fig-state-format title="CBOR Migration State Structure"}

Identity Transfer and Re-Attestation

During migration, the agent's identity MUST be preserved through the ECT chain:

  1. The source environment issues a migration ECT with the full ECT chain as the par claim.
  2. The destination environment verifies the ECT chain back to a trusted root.
  3. The destination environment issues a new ECT for the agent with exec_act set to "migration_complete" and par referencing the migration transfer ECT.
  4. The agent's SPIFFE ID remains unchanged; only the issuing authority for new ECTs changes.

Post-Migration Verification

After migration completes, the destination environment MUST:

  1. Verify state integrity using the HMAC in the migration payload.
  2. Deserialize and load the agent state.
  3. Execute a capability self-test to confirm operational readiness.
  4. Issue the "migration_complete" ECT.
  5. Notify the source environment of successful migration so it can release resources.

If verification fails, the destination MUST notify the source environment, which SHOULD retain the agent in its original state for retry or rollback.

State Transfer

Agent State Components

An agent's transferable state consists of four components:

Context:
The current conversation or task execution context, including recent message history and active reasoning chains.
Memory:
Long-term memory stores such as retrieval-augmented generation (RAG) indices, episodic memory, or key-value caches.
Learned Parameters:
Fine-tuned model weights, adapter layers, embeddings, or reinforcement learning policies specific to the agent's role.
Active Tasks:
In-progress task descriptors including task ID, current step, dependencies, and expected outputs.

Incremental State Transfer for Large State

For agents with state exceeding 10 MB, incremental transfer MUST be supported:

  1. The source environment transmits a state manifest listing all state chunks with their SHA-256 hashes.
  2. The destination environment requests only chunks it does not already possess (delta transfer).
  3. Each chunk transfer is individually acknowledged.
  4. After all chunks are received, the destination assembles the complete state and verifies the root hash.
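
The manifest-and-delta flow above can be sketched with in-memory stores. The manifest layout, the tiny chunk size, and the definition of the root hash as SHA-256 over the concatenated chunk hashes are assumptions made for illustration; this document does not fix them. Per-chunk acknowledgment is reduced to a loop iteration.

```python
import hashlib

CHUNK_SIZE = 4  # tiny for illustration; real chunks would be far larger


def sha256_hex(data):
    """Hex-encoded SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(state, chunk_size=CHUNK_SIZE):
    """Split state into chunks and list their hashes plus a root hash.

    Root hash = SHA-256 over the concatenated chunk hashes (an assumption).
    """
    chunks = [state[i:i + chunk_size] for i in range(0, len(state), chunk_size)]
    hashes = [sha256_hex(c) for c in chunks]
    root = sha256_hex("".join(hashes).encode("ascii"))
    return {"chunks": hashes, "root": root}, dict(zip(hashes, chunks))


def missing_chunks(manifest, local_store):
    """Hashes the destination must still request (delta transfer)."""
    return [h for h in manifest["chunks"] if h not in local_store]


def assemble(manifest, local_store):
    """Verify every chunk and the root hash, then rebuild the state."""
    for h in manifest["chunks"]:
        if sha256_hex(local_store[h]) != h:
            raise ValueError("chunk hash mismatch")
    if sha256_hex("".join(manifest["chunks"]).encode("ascii")) != manifest["root"]:
        raise ValueError("root hash mismatch")
    return b"".join(local_store[h] for h in manifest["chunks"])


state = b"context-memory-params"
manifest, source_store = build_manifest(state)
# The destination already holds the first two chunks from an earlier transfer.
dest_store = {h: source_store[h] for h in manifest["chunks"][:2]}
for h in missing_chunks(manifest, dest_store):
    dest_store[h] = source_store[h]  # one acknowledged chunk transfer each
restored = assemble(manifest, dest_store)
```

Addressing chunks by hash is what makes the delta step cheap: the destination can tell which chunks it already holds without the source resending any data.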

State Integrity Verification

State integrity MUST be verified at each stage:

  • Before transmission: source computes HMAC-SHA256 over the serialized state using a key derived from the migration ECT.
  • During transmission: TLS provides transport integrity.
  • After reception: destination recomputes and verifies the HMAC.
  • After deserialization: destination runs a state consistency check (e.g., verifying that active task references resolve).
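
The HMAC steps in this list can be sketched with the Python standard library. How the key is derived from the migration ECT is not specified here, so the key is simply passed in as bytes; the function names are hypothetical.

```python
import hashlib
import hmac


def state_tag(key, serialized_state):
    """HMAC-SHA256 tag over the serialized state, hex-encoded."""
    return hmac.new(key, serialized_state, hashlib.sha256).hexdigest()


def verify_state(key, serialized_state, expected_tag):
    """Recompute the tag and compare it in constant time with the received one."""
    return hmac.compare_digest(state_tag(key, serialized_state), expected_tag)


key = b"hypothetical-key-derived-from-migration-ect"
payload = b"serialized-agent-state"
tag = state_tag(key, payload)                       # source, before transmission
accepted = verify_state(key, payload, tag)          # destination, after reception
rejected = verify_state(key, payload + b"x", tag)   # any modification fails
```

Using `hmac.compare_digest` rather than `==` avoids leaking, through timing, how many leading characters of a forged tag were correct.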

Protocol Bridges

Bridge Architecture for Common Protocols

Protocol bridges translate agent communication between protocols while preserving semantic equivalence. A bridge MUST support bidirectional translation for each protocol pair it advertises.

[Agent (A2A)] <--A2A--> [Bridge] <--MCP--> [Agent (MCP)]
                           |
                      [ECT Logger]

{: #fig-bridge-arch title="Protocol Bridge Architecture"}

Each bridge instance MUST:

  • Maintain a mapping table for message types between protocols.
  • Preserve task identifiers across protocol boundaries.
  • Record each translation as an ECT with exec_act set to "bridge_translate".

Context Translation Rules

When translating context between protocols, bridges MUST:

  • Map equivalent fields (e.g., A2A "task" to MCP "resource").
  • Preserve all metadata as extension fields where direct mapping is not available.
  • Flag semantic mismatches in the translation ECT's ext claim under bridge.warnings.
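
These translation rules can be sketched for flat message dictionaries. The field map, the `extensions` container, and the warning text are hypothetical; only the bridge.warnings claim name comes from this document.

```python
FIELD_MAP = {"task": "resource", "task_id": "task_id"}  # hypothetical A2A -> MCP map


def translate(message, field_map=FIELD_MAP):
    """Translate a flat message dict; unmapped fields are kept as extensions."""
    translated, extensions, warnings = {}, {}, []
    for key, value in message.items():
        if key in field_map:
            translated[field_map[key]] = value
        else:
            extensions[key] = value  # preserve metadata with no direct mapping
            warnings.append(f"no direct mapping for field: {key}")
    if extensions:
        translated["extensions"] = extensions
    ect_ext = {"bridge.warnings": warnings}  # goes into the translation ECT
    return translated, ect_ext


message = {"task": "summarize", "task_id": "t-1", "priority": "high"}
translated, ect_ext = translate(message)
print(translated)
# prints {'resource': 'summarize', 'task_id': 't-1', 'extensions': {'priority': 'high'}}
```

The task identifier passes through unchanged, and the unmapped `priority` field survives as an extension while also being reported in `ect_ext["bridge.warnings"]`.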

Capability Re-Mapping

Agent capabilities expressed in the source protocol MUST be re-mapped to the closest equivalent in the destination protocol. Capabilities with no equivalent MUST be listed in the migration state as unmapped_capabilities so the destination environment can handle them appropriately (e.g., by loading additional tooling or reporting reduced functionality).

Privacy During Migration

Context Sanitization Before Transfer

Before state serialization, the source environment MUST sanitize the agent's context by:

  • Removing credentials, API keys, and session tokens.
  • Redacting PII unless the destination is authorized to receive it per the agent's privacy policy.
  • Stripping environment-specific configuration (e.g., internal hostnames, file paths).
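
A sanitization pass over a flat context dictionary might look like the sketch below. The key names in `SECRET_KEYS` and `ENV_KEYS` are hypothetical, and email addresses are used as a simple stand-in for PII; real PII detection is considerably harder than one regular expression.

```python
import re

SECRET_KEYS = {"api_key", "password", "session_token", "credentials"}  # hypothetical
ENV_KEYS = {"hostname", "file_path"}                                    # hypothetical
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def sanitize_context(context, pii_allowed=False):
    """Return a copy of a flat context dict that is safe to serialize."""
    clean = {}
    for key, value in context.items():
        if key in SECRET_KEYS or key in ENV_KEYS:
            continue  # drop secrets and environment-specific settings
        if not pii_allowed and isinstance(value, str):
            value = EMAIL_RE.sub("[REDACTED]", value)  # email as a stand-in for PII
        clean[key] = value
    return clean


context = {
    "api_key": "sk-example",
    "hostname": "node-7.internal",
    "note": "mail bob@example.com now",
    "step": 3,
}
print(sanitize_context(context))
# prints {'note': 'mail [REDACTED] now', 'step': 3}
```

Passing `pii_allowed=True` models a destination that the agent's privacy policy authorizes to receive PII: secrets and environment settings are still dropped, but the note is left intact.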

Selective State Disclosure

The migration protocol supports selective state disclosure: the source environment MAY omit state components that the destination is not authorized to receive. The migration state manifest indicates which components are included and which are withheld, allowing the destination to request missing components through an authorized channel if needed.

No-Context-Leakage Guarantees to New Host

The destination environment MUST NOT have access to state components that were excluded during selective disclosure. The migration protocol provides the following guarantees:

  • State components are individually encrypted with component-specific keys.
  • Only authorized components have their keys transmitted to the destination.
  • The destination cannot derive keys for withheld components from the keys it receives.
  • The migration ECT records which components were transferred, enabling audit of information flow.

ECT Integration

Privacy Attestation Claims

ECTs produced during federated learning rounds MUST include privacy attestation claims in the ext field as defined in {{privacy-attestation-via-ect}}. These claims enable any verifier in the ECT chain to confirm that appropriate privacy mechanisms were applied without accessing the underlying data.

Migration Evidence Chain

Migration events produce a chain of three ECTs that together provide a complete cryptographic record of the migration:

ECT 1: exec_act = "migration_start"
  - Records: trigger reason, source domain, agent ID
  - par: references the agent's most recent operational ECT

ECT 2: exec_act = "migration_transfer"
  - Records: state hash, components transferred, dest domain
  - par: references ECT 1
  - inp_hash: SHA-256 of serialized migration state

ECT 3: exec_act = "migration_complete"
  - Records: verification result, new domain, resumed capabilities
  - par: references ECT 2
  - Issued by: destination environment

{: #fig-migration-ect-chain title="Migration ECT Evidence Chain"}

This three-ECT chain ensures that migration events are non-repudiable and auditable. Any party with access to the ECT chain can verify that a migration occurred, what state was transferred, and whether it completed successfully.
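
An auditor's structural check of the three-ECT chain can be sketched as follows. The payloads are unsigned dictionaries standing in for real ECTs, which are signed JWS objects whose signatures must also be verified. The use of `jti` as the token identifier and of `par` as a list of parent identifiers is an assumption about the ECT claim layout.

```python
import hashlib

EXPECTED = ["migration_start", "migration_transfer", "migration_complete"]


def make_ect(jti, exec_act, par, **extra):
    """Unsigned stand-in for an ECT payload; real ECTs are signed JWS objects."""
    return {"jti": jti, "exec_act": exec_act, "par": par, **extra}


def verify_migration_chain(ects, serialized_state):
    """Check action order, parent links, and the state hash in the transfer ECT."""
    if [e["exec_act"] for e in ects] != EXPECTED:
        return False
    for parent, child in zip(ects, ects[1:]):
        if child["par"] != [parent["jti"]]:
            return False
    digest = hashlib.sha256(serialized_state).hexdigest()
    return ects[1].get("inp_hash") == digest


state = b"serialized-migration-state"
chain = [
    make_ect("ect-1", "migration_start", ["ect-0"]),
    make_ect("ect-2", "migration_transfer", ["ect-1"],
             inp_hash=hashlib.sha256(state).hexdigest()),
    make_ect("ect-3", "migration_complete", ["ect-2"]),
]
print(verify_migration_chain(chain, state))        # prints True
print(verify_migration_chain(chain, b"tampered"))  # prints False
```

The second call fails because the hash recorded in the migration_transfer ECT no longer matches the state being presented, which is the property that lets a third party audit what was transferred.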

Federation Participation Records

Each agent's participation in federated learning MUST be recorded in the ECT DAG. The aggregation server issues a per-round ECT with exec_act set to "fed_aggregate" and par referencing the ECTs of all participating agents for that round. This creates a verifiable record of federation participation without revealing the content of individual updates.

Security Considerations

Privacy Budget Exhaustion Attacks

An attacker controlling the aggregation server or a quorum of participants could attempt to exhaust a victim participant's privacy budget by triggering excessive learning rounds. Mitigations include:

  • Participant-side rate limiting on round participation.
  • Privacy budget enforcement at the participant, not solely at the aggregation server.
  • ECT-based audit trails enabling detection of abnormal round frequency.

Migration Hijacking

An attacker could attempt to redirect a migration to a malicious destination. Mitigations include:

  • Mutual TLS authentication between source and destination.
  • Destination identity verification via SPIFFE ID in the migration ECT.
  • Operator confirmation for migrations to previously unknown destinations.

State Tampering During Transfer

An attacker with access to the network path could attempt to modify the migration state in transit. Mitigations include:

  • HMAC-SHA256 integrity protection of the serialized state.
  • TLS 1.3 for transport security.
  • Post-migration state verification at the destination.
  • ECT inp_hash recording the expected state hash.

Protocol Bridge Vulnerabilities

Protocol bridges are trusted intermediaries that could be compromised to:

  • Modify messages during translation.
  • Exfiltrate sensitive data from translated messages.
  • Inject malicious content into translated messages.

Mitigations include:

  • ECT audit trails for all bridge translations.
  • Input/output hash verification via inp_hash/out_hash.
  • Bridge attestation using hardware security modules where available.

Federation Participant Compromise

A compromised participant could attempt to:

  • Submit poisoned updates to corrupt the global model.
  • Conduct inference attacks on other participants' updates observed during ring topology forwarding.
  • Collude with the aggregation server to bypass secure aggregation.

Mitigations include:

  • Byzantine-robust aggregation algorithms.
  • Secure aggregation preventing server access to individual updates.
  • Anomaly detection on update distributions.
  • ECT-based participation records enabling forensic analysis.

IANA Considerations

This document requests the following IANA registrations:

ECT Action Type Registry

Registration of the following exec_act values in a future ECT action type registry:

| Value | Description |
|-------|-------------|
| migration_start | Agent migration initiated |
| migration_transfer | Agent state transferred |
| migration_complete | Agent migration completed |
| fed_aggregate | Federated learning round aggregated |
| bridge_translate | Protocol bridge translation |

ECT Extension Claims Registry

Registration of the following ext claim prefixes:

| Prefix | Description |
|--------|-------------|
| fed. | Federated learning privacy claims |
| mig. | Migration-related claims |
| bridge. | Protocol bridge claims |

Media Type Registration

Registration of the following media type:

  • Type name: application
  • Subtype name: agent-migration-state+cbor
  • Required parameters: none
  • Optional parameters: version
  • Encoding: binary (CBOR)
  • Purpose: Serialized agent migration state for cross-protocol agent migration as defined in this document.

--- back

Acknowledgments

{:numbered="false"}

This document builds on the Execution Context Token specification {{I-D.nennemann-wimse-ect}} and the Agent Context Policy Token {{I-D.nennemann-agent-dag-hitl-safety}}. The gap analysis {{I-D.nennemann-agent-gap-analysis}} identified the requirements addressed by this document.