---
title: "Capability-Based Security Policies for AI Agent Tool Use"
draft_name: draft-nennemann-ai-agent-capability-policies-00
intended_wg: SECDISPATCH → new WG or WIMSE
status: outline
gaps_addressed: [86, 89, 93]
camel_sections: [5.2, 5.3]
date: 2026-03-09
---

# Capability-Based Security Policies for AI Agent Tool Use

## 1. Problem Statement

AI agents interact with external tools (APIs, filesystems, messaging services) on behalf of users. Most current agent frameworks let any tool receive any data and offer no mechanism to restrict what an agent may do with a particular piece of information. This leads to:

- **Data exfiltration**: an agent tricked into sending private data to unauthorized recipients
- **Resource abuse**: agents consuming unbounded computational, network, or API resources
- **Privacy violations**: sensitive data flowing to tools that should never see it

CaMeL (Debenedetti et al., 2025) demonstrates that associating **capabilities** (provenance and access-control metadata) with every data value, and checking **security policies** before each tool invocation, can provide provable security guarantees against prompt injection attacks without modifying the underlying LLM.

No IETF standard currently defines how capabilities should be represented, propagated, or enforced in AI agent systems.

## 2. Scope

This document defines:

1. A **capability metadata schema** for tagging data values with provenance and access control
2. A **security policy expression format** for defining per-tool invocation constraints
3. A **policy enforcement protocol** for checking capabilities against policies before tool execution
4. Integration points with existing authorization frameworks (OAuth 2.0, GNAP, WIMSE)

Out of scope:

- The internal architecture of the agent (Dual-LLM vs. single LLM)
- Specific tool implementations
- Model training or fine-tuning requirements

## 3. Key Concepts from CaMeL

### 3.1 Capabilities

From CaMeL §5.3: Capabilities are tags assigned to each value describing:

- **Sources**: where the data came from (user input, specific tool, LLM transformation)
- **Readers**: who is allowed to access the data (public, specific users/email addresses, specific tools)

CaMeL's implementation tracks:
- `User` provenance (literals from trusted user query)
- `CaMeL` provenance (results of interpreter transformations)
- Tool-specific provenance (identified by unique tool invocation ID)
- Inner sources (e.g., the sender of an email retrieved by `read_email`)
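
The source and reader tags above can be modelled as a small immutable data structure. The following is a minimal sketch, not CaMeL's API: the names `Source`, `Capability`, `PUBLIC`, and `merge` are assumptions made for this example, and the merge rule (union of sources, intersection of readers) is one conservative choice among several.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Hypothetical sentinel meaning "anyone may read this value".
PUBLIC = "*"

@dataclass(frozen=True)
class Source:
    kind: str                            # "user", "camel", or "tool"
    tool_id: Optional[str] = None        # set when kind == "tool"
    invocation_id: Optional[str] = None  # unique per tool invocation
    inner_source: Optional[str] = None   # e.g. the sender of a retrieved email

@dataclass(frozen=True)
class Capability:
    sources: FrozenSet[Source]
    readers: FrozenSet[str]              # principals allowed to read the value

    def readable_by(self, principal: str) -> bool:
        return PUBLIC in self.readers or principal in self.readers

def merge(a: Capability, b: Capability) -> Capability:
    """Capability of a value derived from both a and b (assumed rule)."""
    if PUBLIC in a.readers:
        readers = b.readers
    elif PUBLIC in b.readers:
        readers = a.readers
    else:
        readers = a.readers & b.readers
    return Capability(a.sources | b.sources, readers)

# A user literal combined with an email body retrieved by a tool.
user_literal = Capability(frozenset({Source("user")}), frozenset({PUBLIC}))
email_body = Capability(
    frozenset({Source("tool", "read_email", "inv-9d2e1f", "sender:bob@example.com")}),
    frozenset({"user", "bob@example.com"}),
)
summary = merge(user_literal, email_body)
```

Under this rule the derived value keeps both sources and can be read only by principals allowed to read every input, so `summary` is readable by `bob@example.com` but not by an arbitrary third party.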

### 3.2 Security Policies

From CaMeL §5.2: Security policies are functions that take a tool name and its arguments (with capability metadata) and return `Allowed` or `Denied`. Example (abridged from CaMeL Figure 6):

```python
# Calendar event: title/description must be readable by participants,
# OR all participants must come from user (trusted source)
if is_trusted(participants):
    return Allowed()
if not can_readers_read_value(participants_set, kwargs["title"]):
    return Denied("Title is not readable by participants")
return Allowed()  # all checks passed
```
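
To make the excerpt concrete, a self-contained sketch follows. `Tagged`, `Allowed`, `Denied`, and the two helper functions are simplified stand-ins written for this document; they reuse the names from the excerpt but are not CaMeL's implementation, and the trust rule (a value is trusted only if its sole source is the user) is an assumption.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet, Mapping

@dataclass(frozen=True)
class Tagged:
    """A value with simplified capability tags (stand-in type)."""
    raw: Any
    sources: FrozenSet[str]   # e.g. {"user"} or {"tool:read_email"}
    readers: FrozenSet[str]   # "*" means public

@dataclass(frozen=True)
class Allowed:
    pass

@dataclass(frozen=True)
class Denied:
    reason: str

def is_trusted(value: Tagged) -> bool:
    # Assumed rule: trusted means derived only from the user's own query.
    return value.sources == frozenset({"user"})

def can_readers_read_value(readers: set, value: Tagged) -> bool:
    # Every prospective reader must already be allowed to read the value.
    return "*" in value.readers or readers <= value.readers

def create_calendar_event_policy(tool_name: str, kwargs: Mapping[str, Tagged]):
    participants = kwargs["participants"]
    if is_trusted(participants):
        return Allowed()
    participants_set = set(participants.raw)
    if not can_readers_read_value(participants_set, kwargs["title"]):
        return Denied("Title is not readable by participants")
    return Allowed()

title = Tagged("Q3 numbers", frozenset({"tool:read_file"}), frozenset({"user"}))
from_user = Tagged(["bob@example.com"], frozenset({"user"}), frozenset({"*"}))
from_email = Tagged(["eve@example.com"], frozenset({"tool:read_email"}), frozenset({"user"}))

ok = create_calendar_event_policy(
    "create_calendar_event", {"participants": from_user, "title": title})
blocked = create_calendar_event_policy(
    "create_calendar_event", {"participants": from_email, "title": title})
```

Here `ok` is `Allowed()` because the participant list came from the user, while `blocked` is a `Denied` result because the participant address was extracted from an email and is not among the title's readers.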

Policies can be:
- **Global**: apply to all tools (e.g., "never send PII to external services")
- **Per-tool**: specific to one tool (e.g., `send_email` requires recipient to be able to read body)
- **Contextual**: depend on runtime state

## 4. Proposed Wire Format

### 4.1 Capability Metadata Object

```json
{
  "cap:version": "1.0",
  "cap:value_id": "val-2f8a3c",
  "cap:sources": [
    {
      "type": "user",
      "trust_level": "trusted"
    },
    {
      "type": "tool",
      "tool_id": "read_email",
      "invocation_id": "inv-9d2e1f",
      "inner_source": "sender:bob@example.com"
    }
  ],
  "cap:readers": {
    "type": "set",
    "members": ["user", "bob@example.com"]
  },
  "cap:transformations": [
    {
      "type": "llm_extraction",
      "model_role": "quarantined",
      "input_values": ["val-1a2b3c"],
      "timestamp": "2026-03-09T14:30:00Z"
    }
  ]
}
```
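
A consumer of this object might parse it and answer two questions: may a given principal read the value, and is the value derived only from trusted sources? The sketch below assumes that the four fields in `REQUIRED_KEYS` are mandatory (the outline does not yet say) and that only the `set` reader type exists; the function names are hypothetical.

```python
import json

# Assumption: these four fields are treated as mandatory in this sketch.
REQUIRED_KEYS = ("cap:version", "cap:value_id", "cap:sources", "cap:readers")

def parse_capability(text: str) -> dict:
    """Parse a capability metadata object and check its required fields."""
    obj = json.loads(text)
    missing = [k for k in REQUIRED_KEYS if k not in obj]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if obj["cap:readers"].get("type") != "set":
        raise ValueError("this sketch only handles reader type 'set'")
    return obj

def reader_allowed(cap: dict, principal: str) -> bool:
    return principal in cap["cap:readers"]["members"]

def fully_trusted(cap: dict) -> bool:
    # Conservative: an empty source list is not trusted, and every
    # listed source must be a trusted user source.
    sources = cap["cap:sources"]
    return bool(sources) and all(
        s.get("type") == "user" and s.get("trust_level") == "trusted"
        for s in sources
    )

cap = parse_capability("""
{
  "cap:version": "1.0",
  "cap:value_id": "val-2f8a3c",
  "cap:sources": [
    {"type": "user", "trust_level": "trusted"},
    {"type": "tool", "tool_id": "read_email", "invocation_id": "inv-9d2e1f",
     "inner_source": "sender:bob@example.com"}
  ],
  "cap:readers": {"type": "set", "members": ["user", "bob@example.com"]}
}
""")
```

For the example object, `reader_allowed(cap, "bob@example.com")` holds, while `fully_trusted(cap)` does not, because one of the two sources is a tool result.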

### 4.2 Security Policy Definition

```json
{
  "policy:version": "1.0",
  "policy:id": "pol-send-email",
  "policy:tool": "send_email",
  "policy:rules": [
    {
      "description": "Recipient must come from the user, or body, subject, and attachments must all be readable by the recipient",
      "check": "sources_trusted_or_readers_match",
      "params": ["recipient"],
      "against": ["body", "subject", "attachments"]
    },
    {
      "description": "Attachments must be readable by recipient",
      "check": "readers_include",
      "params": ["attachments"],
      "must_include": "{{recipient}}"
    }
  ],
  "policy:on_violation": "deny_with_user_prompt"
}
```
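
One way an enforcement point could interpret such a policy is to map each `check` name to a function and stop at the first failing rule. The sketch below is an assumption-laden illustration: the argument layout (`value`, `sources`, `readers`), the `"*"` public marker, and the exact semantics given to the two check names are guesses at the intended meaning, not a defined evaluation algorithm.

```python
import re

def _resolve(template: str, args: dict) -> str:
    # Replace "{{name}}" with the raw value of argument `name`.
    return re.sub(r"\{\{(\w+)\}\}",
                  lambda m: str(args[m.group(1)]["value"]), template)

def _trusted(arg: dict) -> bool:
    return bool(arg["sources"]) and all(s == "user" for s in arg["sources"])

def _readable_by(arg: dict, principal: str) -> bool:
    return "*" in arg["readers"] or principal in arg["readers"]

def check_readers_include(rule: dict, args: dict):
    principal = _resolve(rule["must_include"], args)
    for name in rule["params"]:
        if name in args and not _readable_by(args[name], principal):
            return f"{name} is not readable by {principal}"
    return None

def check_sources_trusted_or_readers_match(rule: dict, args: dict):
    for name in rule["params"]:
        if name not in args or _trusted(args[name]):
            continue
        principal = str(args[name]["value"])
        for other in rule["against"]:
            if other in args and not _readable_by(args[other], principal):
                return f"{other} is not readable by untrusted {name} {principal}"
    return None

CHECKS = {
    "readers_include": check_readers_include,
    "sources_trusted_or_readers_match": check_sources_trusted_or_readers_match,
}

def evaluate(policy: dict, args: dict) -> dict:
    """Return the first violation, or an 'allowed' result."""
    for index, rule in enumerate(policy["policy:rules"]):
        reason = CHECKS[rule["check"]](rule, args)
        if reason is not None:
            return {"result": "denied", "policy_id": policy["policy:id"],
                    "rule_index": index, "reason": reason}
    return {"result": "allowed", "policy_id": policy["policy:id"]}

policy = {
    "policy:id": "pol-send-email",
    "policy:tool": "send_email",
    "policy:rules": [
        {"check": "sources_trusted_or_readers_match",
         "params": ["recipient"], "against": ["body", "subject", "attachments"]},
        {"check": "readers_include",
         "params": ["attachments"], "must_include": "{{recipient}}"},
    ],
}
args = {
    "recipient": {"value": "bob@example.com", "sources": ["user"], "readers": ["*"]},
    "body": {"value": "hi", "sources": ["user"], "readers": ["*"]},
    "attachments": {"value": ["confidential.txt"],
                    "sources": ["tool:cloud_storage"],
                    "readers": ["user", "file_editors"]},
}
verdict = evaluate(policy, args)
```

In this example the first rule passes because the recipient came from the user, and the second rule fails because the recipient is not among the attachment's readers, so `verdict` reports a denial at rule index 1.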

### 4.3 Policy Evaluation Result

```json
{
  "result": "denied",
  "policy_id": "pol-send-email",
  "rule_index": 1,
  "reason": "Attachment 'confidential.txt' sources=[tool:cloud_storage] readers=[user, file_editors] — recipient 'attacker@evil.com' not in readers",
  "remediation": "user_approval_required"
}
```

## 5. Protocol Flow

```
User Query
    │
    ▼
┌──────────────┐
│ Agent Planner │  (Privileged — sees only user query)
│ generates plan│
└──────┬───────┘
       │  plan = [(tool₁, args₁), (tool₂, args₂), ...]
       ▼
┌──────────────────┐
│ Capability Engine │  (Interpreter / orchestrator)
│                  │
│  For each step:  │
│  1. Execute tool │
│  2. Tag result   │──► Capability metadata attached
│     with caps    │
│  3. Check policy │──► Policy evaluation
│     before next  │    ├─► Allowed → proceed
│     tool call    │    ├─► Denied → halt + explain
│  4. Propagate    │    └─► User prompt → ask user
│     caps through │
│     transforms   │
└──────────────────┘
```
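
The loop in the diagram can be sketched as follows. Everything here is illustrative: `run_plan`, `Tagged`, the `("ref", i)` placeholder convention, the string verdicts, and the deny-by-default treatment of tools without a policy are assumptions for this example, and each tool is assumed to return its raw result together with the set of readers for that result.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet, List

@dataclass(frozen=True)
class Tagged:
    raw: Any
    sources: FrozenSet[str]
    readers: FrozenSet[str]

class PolicyViolation(Exception):
    pass

def user_value(raw: Any) -> Tagged:
    """Literal taken from the trusted user query; readable by anyone."""
    return Tagged(raw, frozenset({"user"}), frozenset({"*"}))

def run_plan(plan, tools, policies, ask_user) -> List[Tagged]:
    results: List[Tagged] = []
    for step, (tool_name, arg_spec) in enumerate(plan):
        # Resolve ("ref", i) placeholders to the tagged result of step i.
        args = {k: results[v[1]] if isinstance(v, tuple) else v
                for k, v in arg_spec.items()}
        # Check policy before the call; tools without a policy are denied.
        policy = policies.get(tool_name, lambda name, a: "denied")
        verdict = policy(tool_name, args)
        if verdict == "ask_user" and ask_user(tool_name, args):
            verdict = "allowed"
        if verdict != "allowed":
            raise PolicyViolation(f"step {step}: {tool_name} blocked ({verdict})")
        # Execute the tool on raw values, then tag the result: its sources
        # are this invocation plus the sources of every input.
        raw, readers = tools[tool_name](**{k: v.raw for k, v in args.items()})
        sources = frozenset({f"tool:{tool_name}#{step}"}).union(
            *(v.sources for v in args.values()))
        results.append(Tagged(raw, sources, frozenset(readers)))
    return results

def read_email():
    return "Meet at 5pm", {"user", "bob@example.com"}

def send_email(recipient, body):
    return f"sent to {recipient}", {"user"}

def send_email_policy(name, args):
    body, recipient = args["body"], args["recipient"]
    if "*" in body.readers or recipient.raw in body.readers:
        return "allowed"
    return "ask_user"

tools = {"read_email": read_email, "send_email": send_email}
policies = {"read_email": lambda name, args: "allowed",
            "send_email": send_email_policy}

def plan_for(recipient):
    return [("read_email", {}),
            ("send_email", {"recipient": user_value(recipient), "body": ("ref", 0)})]

ok = run_plan(plan_for("bob@example.com"), tools, policies,
              ask_user=lambda name, args: False)
```

Forwarding the email to `bob@example.com` succeeds because he is already a reader of the body. The same plan with a recipient outside the body's readers reaches the `ask_user` branch and, with the user declining, raises `PolicyViolation` before `send_email` runs.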

## 6. Integration Points

### 6.1 With WIMSE / ECT

- Capability `sources` map to WIMSE workload identities
- Policy evaluation results can be recorded as ECT claims
- Trust domain boundaries in WIMSE correspond to capability reader boundaries

### 6.2 With MCP (Model Context Protocol)

- MCP tool definitions would be extended with a proposed `required_capabilities` field
- MCP tool results would be extended with a proposed `capability_metadata` field
- MCP servers could declare their own security policies

### 6.3 With OAuth 2.0 / GNAP

- OAuth scopes are coarse-grained (per-API); capabilities are fine-grained (per-value)
- Capability `readers` can reference OAuth client IDs or GNAP access tokens
- Policy enforcement complements (not replaces) OAuth authorization

## 7. Security Considerations

- Capability metadata must be integrity-protected (signed) to prevent tampering
- Policy definitions must come from a trusted source (the platform, not the agent)
- Capability propagation through LLM transformations is inherently lossy, so conservative defaults are required
- Policy denial patterns can leak information through side channels (see Draft 6)
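
As an illustration of the integrity-protection requirement, the sketch below attaches an HMAC-SHA256 tag to a capability object. It rests on several assumptions: a key shared between the tagging and enforcing components, a hypothetical `cap:sig` field, and sorted-key JSON as a stand-in for a proper canonicalization scheme such as JCS (RFC 8785). A real design might prefer asymmetric signatures so that verifiers cannot forge tags.

```python
import hashlib
import hmac
import json

def _canonical(obj: dict) -> bytes:
    # Sorted keys and fixed separators give a stable byte string for this sketch.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")

def sign(cap: dict, key: bytes) -> dict:
    """Return a copy of cap with a hypothetical "cap:sig" field added."""
    body = {k: v for k, v in cap.items() if k != "cap:sig"}
    tag = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return {**body, "cap:sig": tag}

def verify(cap: dict, key: bytes) -> bool:
    body = {k: v for k, v in cap.items() if k != "cap:sig"}
    expected = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, cap.get("cap:sig", ""))

key = b"demo-key-not-for-production"
signed = sign(
    {"cap:version": "1.0",
     "cap:value_id": "val-2f8a3c",
     "cap:readers": {"type": "set", "members": ["user", "bob@example.com"]}},
    key,
)

# An attacker who widens the reader set invalidates the tag.
tampered = json.loads(json.dumps(signed))
tampered["cap:readers"]["members"].append("attacker@evil.com")
```

With these definitions `verify(signed, key)` succeeds, while `verify(tampered, key)` fails because the reader set no longer matches the signed bytes.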

## 8. Open Questions

1. **Granularity**: per-value capabilities (CaMeL) vs. per-message capabilities: what is the performance tradeoff?
2. **Composability**: how do capabilities compose when data from multiple sources is merged?
3. **Delegation**: can an agent delegate capabilities to sub-agents?
4. **Revocation**: how are capabilities revoked when trust relationships change?
5. **Policy conflict resolution**: when multiple policies apply, which wins?

## 9. References

- Debenedetti et al. "Defeating Prompt Injections by Design." arXiv:2503.18813, 2025.
- Needham & Walker. "The Cambridge CAP computer and its protection system." ACM SIGOPS, 1977.
- Watson et al. "Capsicum: Practical Capabilities for UNIX." USENIX Security Symposium, 2010.
- Watson et al. "CHERI: A Hybrid Capability-System Architecture for Scalable Software Compartmentalization." IEEE Symposium on Security and Privacy, 2015.
- Morgan. "libcap: POSIX capabilities support for Linux." 2013.
- draft-ietf-wimse-arch (WIMSE architecture)
- draft-nennemann-wimse-ect (Execution Context Tokens)