Internet-Draft                                               AI/Agent WG
Intended status: Standards Track                              March 2026
Expires: September 15, 2026

                   Dynamic Agent Trust Scoring (DATS)
                draft-dats-dynamic-agent-trust-scoring-00

Abstract

   This document defines the Dynamic Agent Trust Scoring (DATS)
   protocol, a mechanism for AI agents to build, assess, and revoke
   trust relationships based on observed behavior over time.  Static
   authentication (certificates, API keys) verifies identity but says
   nothing about whether an agent is reliable, accurate, or
   well-behaved.  DATS augments identity-based authentication with a
   numeric trust score that adjusts dynamically based on interaction
   outcomes.  The protocol defines trust score computation, propagation
   between agents, decay over inactivity, and threshold-based access
   policies.  DATS is intentionally simple: a single score per
   agent-pair, standard adjustment events, and a JWT-based transport
   for trust assertions.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.  This document is intended to have
   Standards Track status.  Distribution of this memo is unlimited.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Problem Statement
   4.  Trust Score Model
   5.  Trust Events and Adjustments
   6.  Trust Propagation
   7.  Threshold-Based Access Policies
   8.  Security Considerations
   9.  IANA Considerations

1.  Introduction

   The IETF has 98 drafts addressing agent identity and authentication,
   providing strong mechanisms for verifying who an agent is.  But
   identity alone is insufficient for long-running autonomous systems.
   A properly authenticated agent may still produce bad results,
   violate expectations, or degrade over time.  Static certificates
   cannot capture this.

   DATS adds a behavioral dimension to agent trust.  It answers the
   question: "I know who you are, but should I rely on you?"
   The model is deliberately simple (a single floating-point score
   between 0.0 and 1.0 per agent relationship) because complex
   reputation systems tend to be gamed or ignored.

   The protocol is inspired by:

   -  TCP congestion control: trust increases slowly (additive) and
      decreases quickly (multiplicative) on failure.

   -  TLS Certificate Transparency: trust assertions are logged for
      auditability.

   -  Web of trust (PGP): trust can propagate through intermediaries,
      with attenuation.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   Trust Score:  A floating-point value in [0.0, 1.0] representing one
      agent's assessed reliability of another, based on observed
      interaction outcomes.

   Trust Event:  An observable interaction outcome that causes a trust
      score adjustment.  Events are either positive (task completed
      successfully) or negative (task failed, timeout, policy
      violation).

   Trust Decay:  The automatic reduction of trust scores over periods
      of inactivity, reflecting the principle that trust requires
      ongoing evidence.

   Trust Assertion:  A signed statement by one agent about another
      agent's trust score, transportable as a JWT claim.

3.  Problem Statement

   Agent A delegates a task to Agent B.  Agent B completes it
   correctly.  Agent A delegates again.  After 100 successful
   interactions, Agent B starts returning subtly incorrect results
   (model drift, adversarial manipulation, or simple degradation).
   Agent A has no standard way to:

   1.  Track B's reliability over time.

   2.  Reduce B's privileges based on degraded performance.

   3.  Share its experience with Agent C, who is considering
       delegating to Agent B.

   4.  Automatically revoke B's access when trust drops below
       acceptable levels.

   Existing attestation drafts (STAMP, DAAP) provide cryptographic
   proof of specific actions but not ongoing behavioral assessment.
   DATS fills this gap.

4.
Trust Score Model

   Each agent maintains a trust table: a mapping from peer agent IDs to
   trust scores.

   {
     "urn:uuid:agent-b": {
       "score": 0.82,
       "interactions": 147,
       "last_updated": "2026-03-01T11:30:00Z",
       "last_event": "task_success"
     }
   }

   Initial trust for an unknown agent is a deployment-configured
   default.  A value of 0.5 is RECOMMENDED as a neutral starting point,
   but deployments MAY use lower values (e.g., 0.1) for zero-trust
   environments.

   Trust scores are updated using an additive-increase,
   multiplicative-decrease (AIMD) algorithm:

      On positive event:  score = min(1.0, score + alpha)
      On negative event:  score = max(0.0, score * beta)

   Default parameters: alpha = 0.01, beta = 0.8.  This means trust
   builds slowly (50 successes to go from 0.5 to 1.0) but drops
   quickly (a single failure reduces a 0.82 score to approximately
   0.66).  This asymmetry is intentional: in autonomous systems, the
   cost of trusting a bad agent exceeds the cost of being slow to
   trust a good one.

   Agents MAY tune alpha and beta per relationship or per action type,
   but MUST use the AIMD structure.

5.  Trust Events and Adjustments

   The following standard trust events are defined:

   | Event                | Direction | Default Weight |
   |----------------------|-----------|----------------|
   | task_success         | positive  | 1x alpha       |
   | task_partial_success | positive  | 0.5x alpha     |
   | task_failure         | negative  | 1x beta        |
   | task_timeout         | negative  | 1x beta        |
   | policy_violation     | negative  | applied twice  |
   | attestation_invalid  | negative  | applied twice  |
   | rollback_triggered   | negative  | 1x beta        |

   "Applied twice" means the multiplicative decrease is applied two
   times in succession (score * beta * beta), reflecting the severity
   of policy violations versus simple failures.

   Trust decay: if no interaction occurs for a configurable period
   (default: 7 days), the trust score decays:

      score = max(initial_default, score - decay_rate)

   Default decay_rate: 0.01 per day.  Decay applies only while the
   score is above the initial default; scores at or below the default
   are left unchanged.
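   As a non-normative illustration, the AIMD update, the event weights,
   and the decay rule above can be sketched as follows.  The function
   names and the TRUST_EVENTS table layout are not defined by this
   specification; the parameter values are the defaults from Sections 4
   and 5 (alpha = 0.01, beta = 0.8, decay_rate = 0.01 per day, initial
   default 0.5), and decay is assumed to accrue per day beyond the
   7-day grace period.

```python
# Non-normative sketch of the DATS score update rules (Sections 4-5).
# Names and structure are illustrative, not mandated by the draft.

ALPHA = 0.01           # additive increase per positive event
BETA = 0.8             # multiplicative decrease per negative event
INITIAL_DEFAULT = 0.5  # neutral starting score
DECAY_RATE = 0.01      # decay per day of inactivity
GRACE_DAYS = 7         # inactivity period before decay begins

# event -> (direction, alpha multiplier or number of beta applications)
TRUST_EVENTS = {
    "task_success":         ("positive", 1.0),
    "task_partial_success": ("positive", 0.5),
    "task_failure":         ("negative", 1),
    "task_timeout":         ("negative", 1),
    "policy_violation":     ("negative", 2),  # "applied twice"
    "attestation_invalid":  ("negative", 2),  # "applied twice"
    "rollback_triggered":   ("negative", 1),
}

def apply_event(score: float, event: str) -> float:
    """AIMD update: additive increase, multiplicative decrease."""
    direction, weight = TRUST_EVENTS[event]
    if direction == "positive":
        return min(1.0, score + weight * ALPHA)
    for _ in range(int(weight)):  # beta applied once or twice
        score = max(0.0, score * BETA)
    return score

def apply_decay(score: float, idle_days: float) -> float:
    """Linear decay toward the initial default after the grace period.

    Scores at or below the default are left unchanged."""
    if idle_days <= GRACE_DAYS or score <= INITIAL_DEFAULT:
        return score
    decayed = score - DECAY_RATE * (idle_days - GRACE_DAYS)
    return max(INITIAL_DEFAULT, decayed)
```

   For example, apply_event(0.82, "task_failure") yields 0.656, and 50
   consecutive task_success events raise a score from 0.5 to 1.0,
   matching the worked numbers in Section 4.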
   This ensures that stale trust relationships gradually return to the
   default level rather than persisting indefinitely.

   Agents MUST record all trust events in a local audit log.

6.  Trust Propagation

   Agent A may share its trust assessment of Agent B with Agent C
   through a signed trust assertion.  The assertion is a JWT (RFC
   7519) with the following claims:

   {
     "iss": "urn:uuid:agent-a",
     "sub": "urn:uuid:agent-b",
     "iat": 1772366400,
     "exp": 1772452800,
     "dats_score": 0.82,
     "dats_interactions": 147,
     "dats_confidence": "high"
   }

   "dats_confidence" is based on interaction count: "low" (fewer than
   10 interactions), "medium" (10-99), "high" (100 or more).

   When Agent C receives this assertion, it MAY incorporate it into
   its own trust score for Agent B using attenuation:

      c_score_for_b = max(c_score_for_b,
                          a_score_for_b * trust_of_a * attenuation)

   Where:

   -  a_score_for_b is Agent A's reported score for B (0.82 in the
      example above)

   -  trust_of_a is Agent C's trust score for Agent A

   -  attenuation is a constant (default: 0.5) preventing unbounded
      trust propagation

   Trust assertions are advisory.  Agents MUST NOT blindly adopt
   propagated scores.  An agent's own direct observations always take
   precedence over propagated trust.

   To prevent trust laundering (colluding agents inflating each
   other's scores), agents SHOULD limit propagation depth to 1 hop by
   default.  The "dats_hops" claim tracks propagation depth; agents
   MUST NOT propagate assertions whose dats_hops exceeds their
   configured maximum.

7.  Threshold-Based Access Policies

   Agents SHOULD define trust thresholds for different action
   categories:

   {
     "thresholds": {
       "read_data": 0.3,
       "execute_task": 0.5,
       "modify_config": 0.7,
       "delegate_auth": 0.9
     }
   }

   When an agent requests an action, the serving agent checks the
   requester's trust score against the threshold for that action type.
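   The attenuation rule (Section 6) and the threshold check above can
   likewise be sketched non-normatively.  The function names
   incorporate_assertion and check_access are illustrative; the
   threshold values follow the example policy above, and the
   reveal_score flag models the option of withholding the exact score
   from denial responses to limit score probing.

```python
# Non-normative sketch of trust propagation (Section 6) and
# threshold enforcement (Section 7).  Names are illustrative.

ATTENUATION = 0.5  # default propagation attenuation constant

def incorporate_assertion(own_score: float, asserted_score: float,
                          trust_in_asserter: float) -> float:
    """Blend a propagated assertion into a local score.

    Direct observation wins: the result never drops below the locally
    observed score (the max(...) rule in Section 6)."""
    propagated = asserted_score * trust_in_asserter * ATTENUATION
    return max(own_score, propagated)

THRESHOLDS = {
    "read_data": 0.3,
    "execute_task": 0.5,
    "modify_config": 0.7,
    "delegate_auth": 0.9,
}

def check_access(score: float, action: str, reveal_score: bool = False):
    """Return None if the action is allowed, else an error object
    suitable for a 403 response body.

    reveal_score=False omits the exact scores, limiting score
    probing in production deployments."""
    required = THRESHOLDS[action]
    if score >= required:
        return None
    error = {"error": "trust_insufficient", "action": action}
    if reveal_score:
        error.update({"required_score": required,
                      "current_score": score})
    return error
```

   For example, if Agent C trusts Agent A at 0.9 and receives A's
   assertion of 0.82 for B, the propagated component is
   0.82 * 0.9 * 0.5 = 0.369, adopted only if it exceeds C's own
   direct score for B.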
   If the score is below the threshold, the request is denied with a
   403 response including a DATS-specific error:

   {
     "error": "trust_insufficient",
     "required_score": 0.7,
     "current_score": 0.54,
     "action": "modify_config"
   }

   The response SHOULD NOT reveal the exact current score in
   production deployments, to prevent score probing.  Instead, it MAY
   return only the "trust_insufficient" error.

   Automatic revocation: when an agent's trust score drops below a
   configured floor (default: 0.2), the trusting agent SHOULD revoke
   all outstanding delegations and emit a trust revocation event.
   This provides automatic containment of agents that have become
   unreliable.

8.  Security Considerations

   Trust scores are sensitive metadata.  Agents MUST NOT expose their
   full trust tables to peers.  Only pairwise trust assertions
   (Section 6) should be shared, and only intentionally.

   Trust assertion JWTs MUST be signed using algorithms from RFC 7518
   or RFC 8037 (e.g., ES256, EdDSA).  Agents MUST verify signatures
   before processing trust assertions.

   Score manipulation attacks: a malicious agent could intentionally
   behave well for many interactions to build trust, then exploit its
   high trust for a damaging action.  Mitigation: policy_violation
   events apply double penalties, and deployments SHOULD set trust
   thresholds high for critical actions regardless of accumulated
   trust.

   Sybil attacks: an attacker could create many agents to generate
   fake positive trust assertions.  Mitigation: agents SHOULD weight
   propagated trust by their own direct trust in the asserting agent
   (Section 6 attenuation) and SHOULD require agents to be registered
   in a trusted directory (e.g., ANS) before accepting trust
   assertions.

   All trust-related communications MUST use TLS 1.3 [RFC8446].

9.  IANA Considerations

   This document requests that IANA establish the following:

   1.  Registration of the JWT claims "dats_score",
       "dats_interactions", "dats_confidence", and "dats_hops" in the
       JSON Web Token Claims registry per RFC 7519.

   2.
A "DATS Trust Event Type" registry under Specification Required policy. Initial entries: "task_success", "task_partial_success", "task_failure", "task_timeout", "policy_violation", "attestation_invalid", "rollback_triggered". Author's Address Generated by IETF Draft Analyzer 2026-03-01