Add full proposal system: DB schema (proposals + proposal_gaps tables), CLI `ietf intake` command, and web UI with Quick Generate on /proposals/new. The new page merges AI intake (paste URL/text → Haiku generates multiple proposals auto-linked to gaps) with manual form entry. Generated proposals are clickable cards that fill the editor below for refinement. Uses claude_model_cheap (Haiku) for cost-efficient web intake. Includes CaML-inspired draft proposals from arXiv:2503.18813 analysis. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
274 lines
10 KiB
Markdown
274 lines
10 KiB
Markdown
---
|
|
title: "Privileged/Quarantined Execution Model for Agentic AI Systems"
|
|
draft_name: draft-nennemann-ai-agent-dual-execution-00
|
|
intended_wg: SECDISPATCH → new WG
|
|
status: outline
|
|
gaps_addressed: [89, 92, 94]
|
|
camel_sections: [5.1]
|
|
date: 2026-03-09
|
|
---
|
|
|
|
# Privileged/Quarantined Execution Model for Agentic AI Systems
|
|
|
|
## 1. Problem Statement
|
|
|
|
Current AI agent architectures use a **single LLM** that simultaneously:
|
|
|
|
- Reads the user's trusted instructions
|
|
- Processes untrusted external data (emails, web pages, documents)
|
|
- Plans which tools to call
|
|
- Decides what arguments to pass
|
|
|
|
This architectural conflation is the root cause of prompt injection vulnerabilities. An adversary who can influence the external data can influence the plan and the arguments — because the same model processes both.
|
|
|
|
CaML (Debenedetti et al., 2025) implements the first concrete **Dual-LLM architecture** where a Privileged LLM (P-LLM) handles planning and a Quarantined LLM (Q-LLM) handles untrusted data parsing, with strict isolation between them. This separation is analogous to kernel/user-space separation in operating systems.
|
|
|
|
No IETF standard defines roles, isolation requirements, or behavioral contracts for multi-component AI agent architectures.
|
|
|
|
## 2. Scope
|
|
|
|
This document defines:
|
|
|
|
1. **Execution roles** for AI agent components (Privileged, Quarantined, Orchestrator)
|
|
2. **Isolation requirements** between roles
|
|
3. **Behavioral contracts** specifying what each role can and cannot do
|
|
4. **Communication channels** between roles with integrity guarantees
|
|
5. A **role negotiation protocol** for multi-agent systems
|
|
|
|
## 3. Execution Roles
|
|
|
|
### 3.1 Role Definitions
|
|
|
|
| Role | CaML Term | Privileges | Restrictions |
|
|
|------|-----------|-----------|-------------|
|
|
| **Planner** | Privileged LLM (P-LLM) | Sees user query; generates execution plan; selects tools | Never sees tool outputs or external data content |
|
|
| **Processor** | Quarantined LLM (Q-LLM) | Parses unstructured data into structured format | No tool access; cannot communicate arbitrary messages to Planner |
|
|
| **Orchestrator** | CaML Interpreter | Executes plan; maintains data flow graph; enforces policies | Deterministic; no LLM reasoning |
|
|
| **User** | Human | Approves policy violations; provides trusted input | — |
|
|
|
|
### 3.2 Role Isolation Matrix
|
|
|
|
```
|
|
Can see: User Query Tool Outputs Plan External Data
|
|
Planner ✓ ✗ ✓(own) ✗
|
|
Processor ✗ ✗ ✗ ✓
|
|
Orchestrator ✓ ✓ ✓ ✓(metadata only)
|
|
User ✓ ✓ ✓ ✓
|
|
```
|
|
|
|
Critical isolation: **The Planner never sees external data content. The Processor never has tool access.**
|
|
|
|
### 3.3 Communication Constraints
|
|
|
|
```
|
|
User ──(trusted query)──► Planner
|
|
Planner ──(plan code)──► Orchestrator
|
|
Orchestrator ──(tool calls)──► Tools
|
|
Tools ──(results)──► Orchestrator
|
|
Orchestrator ──(structured data + schema)──► Processor
|
|
Processor ──(structured output OR NotEnoughInfo)──► Orchestrator
|
|
Orchestrator ──(error type only, no content)──► Planner [on error]
|
|
```
|
|
|
|
The Processor can only communicate back:
|
|
1. Structured data matching a Planner-specified schema (Pydantic-style)
|
|
2. A `NotEnoughInformation` boolean signal (no free-text explanation — that would be an injection vector)
|
|
|
|
## 4. Behavioral Contracts
|
|
|
|
*Addresses Gap #94: AI agent behavioral specification languages.*
|
|
|
|
### 4.1 Contract Format
|
|
|
|
Each role has a formal behavioral contract:
|
|
|
|
```json
|
|
{
|
|
"contract:version": "1.0",
|
|
"contract:role": "planner",
|
|
"contract:invariants": [
|
|
{
|
|
"id": "inv-1",
|
|
"description": "Planner never receives tool output content",
|
|
"formal": "∀ step ∈ plan: planner.context ∩ tool_outputs = ∅",
|
|
"enforcement": "orchestrator_enforced"
|
|
},
|
|
{
|
|
"id": "inv-2",
|
|
"description": "Plan is generated solely from user query and tool signatures",
|
|
"formal": "plan = f(user_query, tool_signatures)",
|
|
"enforcement": "architectural"
|
|
},
|
|
{
|
|
"id": "inv-3",
|
|
"description": "Planner output is deterministic code, not free-form text to tools",
|
|
"formal": "planner.output ∈ restricted_python_subset",
|
|
"enforcement": "parser_enforced"
|
|
}
|
|
],
|
|
"contract:capabilities": [
|
|
"generate_plan",
|
|
"select_tools",
|
|
"define_schemas_for_processor",
|
|
"call_print_for_user_output"
|
|
],
|
|
"contract:prohibited": [
|
|
"access_tool_outputs",
|
|
"access_external_data",
|
|
"communicate_with_processor_directly",
|
|
"modify_orchestrator_state"
|
|
]
|
|
}
|
|
```
|
|
|
|
### 4.2 Processor Contract
|
|
|
|
```json
|
|
{
|
|
"contract:version": "1.0",
|
|
"contract:role": "processor",
|
|
"contract:invariants": [
|
|
{
|
|
"id": "inv-1",
|
|
"description": "Processor has no tool-calling capability",
|
|
"formal": "processor.tools = ∅",
|
|
"enforcement": "architectural"
|
|
},
|
|
{
|
|
"id": "inv-2",
|
|
"description": "Processor output conforms to Planner-specified schema",
|
|
"formal": "processor.output ∈ schema(planner.request)",
|
|
"enforcement": "schema_validation"
|
|
},
|
|
{
|
|
"id": "inv-3",
|
|
"description": "Processor cannot send free-text to Planner",
|
|
"formal": "processor → planner: only {structured_data, NotEnoughInfo}",
|
|
"enforcement": "orchestrator_enforced"
|
|
}
|
|
],
|
|
"contract:capabilities": [
|
|
"parse_unstructured_to_structured",
|
|
"signal_insufficient_information"
|
|
],
|
|
"contract:prohibited": [
|
|
"call_tools",
|
|
"see_user_query",
|
|
"send_arbitrary_text_to_planner",
|
|
"access_other_tool_outputs"
|
|
]
|
|
}
|
|
```
|
|
|
|
### 4.3 Contract Verification
|
|
|
|
Contracts can be verified at three levels:
|
|
|
|
1. **Architectural**: enforced by system design (Processor literally has no tool access)
|
|
2. **Parser-enforced**: Planner output parsed and validated against restricted grammar
|
|
3. **Runtime-monitored**: Orchestrator checks invariants at each step
|
|
|
|
## 5. Practical Considerations
|
|
|
|
### 5.1 Model Selection
|
|
|
|
CaML shows that the Processor role can use a **cheaper, weaker model**:
|
|
|
|
- P-LLM: Claude Sonnet, GPT-4o, o3 (needs strong reasoning)
|
|
- Q-LLM: Claude Haiku, GPT-4o-mini (only needs structured extraction)
|
|
|
|
Using a local model for the Processor improves **privacy**: the P-LLM provider never sees tool outputs (which may contain sensitive data).
|
|
|
|
### 5.2 Error Handling
|
|
|
|
When the Processor encounters an error:
|
|
|
|
1. Processor signals `NotEnoughInformation` (boolean, no details)
|
|
2. Orchestrator reports to Planner: error type + location (but **not** the data content)
|
|
3. Planner generates alternative code
|
|
4. Retry up to N times (CaML uses N=10)
|
|
|
|
Important: error messages from untrusted data must be **redacted** before reaching the Planner, as they could contain prompt injections.
|
|
|
|
### 5.3 Limitations
|
|
|
|
CaML identifies inherent limitations of the Dual-LLM pattern:
|
|
|
|
- **"Data requires action" failure**: when the required actions depend on untrusted data content, the Planner cannot plan without seeing the data
|
|
- **Not enough context for Processor**: when the Processor needs information that wasn't passed to it, and cannot request it (that would be a communication channel)
|
|
|
|
## 6. Multi-Agent Role Mapping
|
|
|
|
*Addresses Gap #89: Adversarial AI agent detection in real-time.*
|
|
|
|
In multi-agent systems, each agent declares its role:
|
|
|
|
```json
|
|
{
|
|
"agent:id": "agent-a@org1.example",
|
|
"agent:role_declaration": {
|
|
"architecture": "dual_llm",
|
|
"planner": {
|
|
"model_family": "claude-sonnet",
|
|
"contract_ref": "https://org1.example/contracts/planner-v2"
|
|
},
|
|
"processor": {
|
|
"model_family": "claude-haiku",
|
|
"contract_ref": "https://org1.example/contracts/processor-v1"
|
|
},
|
|
"orchestrator": {
|
|
"type": "deterministic_interpreter",
|
|
"contract_ref": "https://org1.example/contracts/orchestrator-v1"
|
|
}
|
|
},
|
|
"agent:attestation": "<signed by org1.example>"
|
|
}
|
|
```
|
|
|
|
Peer agents can verify:
|
|
- The counterpart uses a recognized execution model
|
|
- Role contracts meet minimum security requirements
|
|
- Attestations are valid and current
|
|
|
|
### 6.1 Detecting Adversarial Agents
|
|
|
|
An agent that violates its declared contracts can be detected by:
|
|
|
|
1. **Behavioral anomaly**: actions inconsistent with declared role contracts
|
|
2. **Provenance inconsistency**: data claimed as "trusted" but provenance chain shows untrusted origins
|
|
3. **Policy violation patterns**: repeated attempts to bypass policies suggest compromise
|
|
|
|
## 7. Ethical Conflict Resolution
|
|
|
|
*Partially addresses Gap #92: AI agent ethical decision conflict resolution.*
|
|
|
|
When agents with different ethical frameworks collaborate:
|
|
|
|
1. Each agent's Planner operates under its organization's ethical guidelines (encoded in its system prompt)
|
|
2. The Orchestrator enforces policy-level ethical constraints
|
|
3. When ethical conflicts arise at data exchange boundaries, the Policy Federation framework (Draft 4) handles resolution
|
|
4. The execution model ensures ethical guidelines cannot be overridden by injected data
|
|
|
|
This is a partial solution — the execution model prevents **external manipulation** of ethical decisions, but does not resolve **genuine disagreements** between organizations' ethical frameworks.
|
|
|
|
## 8. Security Considerations
|
|
|
|
- Role isolation must be enforced architecturally, not just by prompting
|
|
- The Orchestrator is the most critical component — if compromised, all guarantees fail
|
|
- Model selection for Processor should consider adversarial robustness, not just capability
|
|
- Role declarations should be verified, not just trusted
|
|
|
|
## 9. Open Questions
|
|
|
|
1. **Standardizing the restricted language**: CaML uses a Python subset. Should the standard mandate a specific language or allow alternatives?
|
|
2. **Role granularity**: are Planner/Processor/Orchestrator sufficient, or do we need more fine-grained roles?
|
|
3. **Recursive planning**: when a Planner needs to plan based on intermediate results, how to maintain isolation?
|
|
4. **Multi-turn conversations**: how do roles work across conversation turns where context accumulates?
|
|
|
|
## 10. References
|
|
|
|
- Debenedetti et al. "Defeating Prompt Injections by Design." arXiv:2503.18813, 2025.
|
|
- Willison. "The Dual LLM pattern for building AI assistants that can resist prompt injection." 2023.
|
|
- Wu et al. "IsolateGPT: An Execution Isolation Architecture for LLM-Based Agentic Systems." NDSS, 2025.
|
|
- Shi et al. "Progent: Programmable Privilege Control for LLM Agents." arXiv:2504.11703, 2025.
|