Risk Verdict

Every enforcement decision now carries a Risk Verdict, a signed, multi-dimensional decision record that replaces the single blended trust score with four independent, explainable dimensions, each with its own evidence trail.

Everyone gives you a trust score. The Xybern Authorisation Layer gives you a signed verdict with a statistical guarantee.

The legacy trust_score field is still returned everywhere and is now derived from the verdict, so existing integrations, threshold policies and dashboards keep working unchanged.

The four dimensions

All dimensions are scored 0–100 where higher = safer.

Dimension	What it answers	How it's computed
`intent_alignment`	Does this action serve the agent's declared job?	The verification LLM judges the action against the agent's active Access Profile description. No extra LLM call is needed; it piggybacks on the existing verification pass.
`behavioral_conformance`	Is this action normal for this specific agent?	Behavioral signals (novel action, time-of-day, rate spike, magnitude outlier, trajectory, action-sequence likelihood) are calibrated with a conformal p-value against the agent's own rolling nonconformity history.
`blast_radius`	How much damage if this action is wrong?	Fully deterministic: reversibility class of the action type, external boundary, monetary value, bulk counts, and data sensitivity (PII / document class). Higher score = smaller blast radius.
`provenance_confidence`	How trustworthy is the chain behind the request?	Deterministic: registry entry, cryptographic identity proof, credential freshness, delegation depth, session containment, federation trust caps.

Conformal calibration

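behavioral_conformance is not a heuristic score; it is a conformal p-value, which is statistically valid as long as the agent's current behavior is exchangeable with its calibration history. The nonconformity of each action is compared against the agent's rolling calibration window (up to 300 past actions):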

p = (1 + #{past nonconformity ≥ current}) / (1 + window size)

The evidence string reads, for example:

Anomalous at 97% confidence given 240-action history (nonconformity 68, p=0.031)

The guarantee tightens as the agent's history grows, because the smallest attainable p-value is 1 / (1 + window size). Until an agent has 20 decisions and 30 calibration points, the dimension reports insufficient_history and is excluded from the aggregate, so cold starts are never penalised.
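The p-value formula above can be sketched in a few lines of Python. This is an illustration only, not the service's implementation: the function name and parameters are hypothetical, and the cold-start gate models only the 30-calibration-point requirement, not the separate 20-decision requirement.

```python
def conformal_p_value(history, current, window=300, min_points=30):
    """Conformal p-value of `current` against a rolling calibration window.

    `history` holds past nonconformity scores, oldest first. Returns None
    when there are too few calibration points (hypothetical stand-in for
    the insufficient_history state).
    """
    recent = list(history)[-window:]  # keep at most `window` most recent scores
    if len(recent) < min_points:
        return None
    # Count past scores at least as extreme as the current one.
    at_least = sum(1 for score in recent if score >= current)
    return (1 + at_least) / (1 + len(recent))


# 100 past scores 0..99; only 97, 98 and 99 are >= 97.
p = conformal_p_value(list(range(100)), 97)
print(p)  # 4 / 101, about 0.0396
```

A small p-value means few past actions looked this unusual. The two evidence strings in this page are consistent with reporting confidence as (1 - p) rounded to a percentage, though that mapping is inferred from the examples rather than stated.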

Availability and the aggregate

A dimension that cannot be computed (no access profile, cold-start agent, LLM-free preflight) is returned with "available": false and is excluded from the aggregate by renormalizing the weights (intent 0.35, conformance 0.25, blast 0.25, provenance 0.15). Federation trust caps apply after blending, exactly as before.
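Weight renormalization can be sketched as follows. The exact blending function is not spelled out on this page, so the plain weighted mean below is an assumption, as is the function name; federation trust caps are not modelled.

```python
# Default weights as listed above.
WEIGHTS = {
    "intent_alignment": 0.35,
    "behavioral_conformance": 0.25,
    "blast_radius": 0.25,
    "provenance_confidence": 0.15,
}


def blend(dimensions):
    """Weighted mean over available dimensions (assumed blending rule).

    Returns (blended_score, weights_used); (None, {}) if nothing is available.
    """
    available = {
        name: dim["score"]
        for name, dim in dimensions.items()
        if dim.get("available")
    }
    total = sum(WEIGHTS[name] for name in available)
    if total == 0:
        return None, {}
    # Renormalize so the weights of the remaining dimensions sum to 1.
    weights_used = {name: WEIGHTS[name] / total for name in available}
    blended = sum(weights_used[name] * score for name, score in available.items())
    return blended, weights_used


# Hypothetical scores; intent_alignment is unavailable (e.g. LLM-free preflight).
dims = {
    "intent_alignment": {"available": False},
    "behavioral_conformance": {"score": 80, "available": True},
    "blast_radius": {"score": 40, "available": True},
    "provenance_confidence": {"score": 60, "available": True},
}
blended, weights_used = blend(dims)
print(round(blended, 2), weights_used)
```

With intent_alignment excluded, the remaining weights 0.25, 0.25 and 0.15 are each divided by 0.65, so the relative importance of the available dimensions is preserved.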

Response shape

{
  "decision": "escalate",
  "trust_score": 58,
  "risk_verdict": {
    "verdict_version": 1,
    "decision_id": "enf_ab12cd34ef56",
    "generated_at": "2026-07-04T12:00:00Z",
    "dimensions": {
      "intent_alignment": {
        "score": 82, "label": "aligned", "available": true,
        "evidence": ["Action serves declared job: outbound invoice follow-ups"],
        "source": "llm+access_profile", "profile_id": "apf_..."
      },
      "behavioral_conformance": {
        "score": 64, "label": "typical", "available": true,
        "evidence": ["Anomalous at 36% confidence given 240-action history (p=0.640)"],
        "p_value": 0.64, "nonconformity": 22.5, "history_size": 240
      },
      "blast_radius": {
        "score": 35, "label": "severe", "available": true,
        "evidence": [
          "Financial action class 'transfer_*' (-25)",
          "Monetary value $150,000 (-25)",
          "External boundary: recipient outside org (-15)"
        ],
        "polarity_note": "higher = smaller blast radius"
      },
      "provenance_confidence": {
        "score": 90, "label": "strong", "available": true,
        "evidence": ["Cryptographic identity verified (did:xyb:...)", "Direct action, no delegation chain"]
      }
    },
    "aggregate": {
      "trust_score": 58, "blended_score": 58,
      "weights_used": {"intent_alignment": 0.35, "behavioral_conformance": 0.25, "blast_radius": 0.25, "provenance_confidence": 0.15},
      "renormalized": false, "federation_cap_applied": null, "source": "verdict"
    },
    "recommendation": "escalate",
    "rationale": "Blast radius severe (35): ... Aggregate 58 → escalate.",
    "signature": {"algorithm": "hmac-sha256", "value": "9f2c…", "key_scope": "workspace"}
  }
}

Labels: intent aligned / partial / misaligned · conformance typical / unusual / anomalous · blast contained / moderate / severe · provenance strong / partial / weak.
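As a rough illustration of consuming this shape, the sketch below lists the available dimensions of a response from weakest to strongest. The helper name is hypothetical; it relies only on the fields shown in the example above and returns an empty list when risk_verdict is absent.

```python
def weakest_dimensions(response):
    """Return (name, score, label) for available dimensions, lowest score first."""
    dims = response.get("risk_verdict", {}).get("dimensions", {})
    rows = [
        (name, dim["score"], dim.get("label", ""))
        for name, dim in dims.items()
        if dim.get("available")  # skip dimensions that could not be computed
    ]
    return sorted(rows, key=lambda row: row[1])


# Abbreviated, hypothetical response for demonstration.
sample = {
    "decision": "escalate",
    "risk_verdict": {
        "dimensions": {
            "intent_alignment": {"available": False},
            "behavioral_conformance": {"score": 64, "label": "typical", "available": True},
            "blast_radius": {"score": 35, "label": "severe", "available": True},
        }
    },
}
for name, score, label in weakest_dimensions(sample):
    print(f"{name}: {score} ({label})")
```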

Signature

The verdict is HMAC-SHA256 signed over its canonical JSON (sorted keys, no whitespace, signature field excluded), keyed per workspace with the same scheme as the Provenance Vault. The verdict is also sealed inside the hash-chained vault entry of its decision, so it is doubly tamper-evident: modify one dimension and both the signature and the vault chain break.

To verify a verdict you received:

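import hashlib, hmac, json

def verify_verdict(verdict: dict, workspace_id: str, vault_secret: str) -> bool:
    """Return True if the verdict's HMAC-SHA256 signature matches its contents."""
    provided = verdict.get("signature", {}).get("value", "")
    # Canonical form: every field except the signature, sorted keys, no whitespace.
    unsigned = {k: v for k, v in verdict.items() if k != "signature"}
    canonical = json.dumps(unsigned, sort_keys=True, separators=(",", ":"), default=str)
    # The signing key is derived per workspace from the vault secret.
    key = f"{vault_secret}:{workspace_id}".encode()
    expected = hmac.new(key, canonical.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the digest matched.
    return hmac.compare_digest(expected, provided)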
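To see the tamper evidence locally, a round trip can be sketched as below. Here sign_verdict is a hypothetical stand-in for server-side signing, written only to mirror the scheme described above; the workspace ID, secret and verdict contents are made up.

```python
import copy
import hashlib
import hmac
import json


def _verdict_mac(verdict, workspace_id, vault_secret):
    # Canonical JSON without the signature field, keyed per workspace.
    unsigned = {k: v for k, v in verdict.items() if k != "signature"}
    canonical = json.dumps(unsigned, sort_keys=True, separators=(",", ":"), default=str)
    key = f"{vault_secret}:{workspace_id}".encode()
    return hmac.new(key, canonical.encode(), hashlib.sha256).hexdigest()


def sign_verdict(verdict, workspace_id, vault_secret):
    """Hypothetical signer used only for this local demonstration."""
    signed = dict(verdict)
    signed["signature"] = {
        "algorithm": "hmac-sha256",
        "value": _verdict_mac(verdict, workspace_id, vault_secret),
        "key_scope": "workspace",
    }
    return signed


def verify_verdict(verdict, workspace_id, vault_secret):
    provided = verdict.get("signature", {}).get("value", "")
    expected = _verdict_mac(verdict, workspace_id, vault_secret)
    return hmac.compare_digest(expected, provided)


verdict = {
    "verdict_version": 1,
    "dimensions": {"blast_radius": {"score": 35, "available": True}},
}
signed = sign_verdict(verdict, "ws_demo", "demo-secret")
print(verify_verdict(signed, "ws_demo", "demo-secret"))

# Changing a single dimension score invalidates the signature.
tampered = copy.deepcopy(signed)
tampered["dimensions"]["blast_radius"]["score"] = 95
print(verify_verdict(tampered, "ws_demo", "demo-secret"))
```

The first check passes and the second fails, because the canonical JSON of the tampered verdict no longer matches the signed bytes.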

Verdict policies

A new verdict policy type conditions directly on individual dimensions, which a single scalar cannot express:

curl -X POST https://xybern.com/api/v1/policies \
  -H "X-API-Key: $XYBERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Escalate severe blast radius",
    "policy_type": "verdict",
    "decision": "escalate",
    "conditions": {"dimension": "blast_radius", "op": "<", "value": 40}
  }'

AND-combinations:

{
  "conditions": {
    "all": [
      {"dimension": "blast_radius", "op": "<", "value": 40},
      {"dimension": "provenance_confidence", "op": "<", "value": 50}
    ]
  }
}

dimension may be any of the four dimension names or aggregate. Operators: <, <=, >, >=, ==. Verdict policies are evaluated in a second pass, after the verdict is built, and their results are merged using the usual strictness ordering (block > escalate > allow). An unavailable dimension never triggers a condition, so cold-start agents are not penalised.
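The evaluation rules above can be sketched in Python. This is an illustrative model, not the service's evaluator: the function names are hypothetical, and two details are assumptions, namely that aggregate compares against aggregate.trust_score and that the second pass is merged with the first-pass decision.

```python
import operator

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq}
STRICTNESS = {"allow": 0, "escalate": 1, "block": 2}


def condition_holds(cond, verdict):
    """Evaluate a single condition or an "all" (AND) combination."""
    if "all" in cond:
        return all(condition_holds(c, verdict) for c in cond["all"])
    name = cond["dimension"]
    if name == "aggregate":
        score = verdict["aggregate"]["trust_score"]  # assumed target field
    else:
        dim = verdict["dimensions"].get(name, {})
        if not dim.get("available"):
            return False  # unavailable dimensions never trigger a condition
        score = dim["score"]
    return OPS[cond["op"]](score, cond["value"])


def apply_verdict_policies(base_decision, policies, verdict):
    """Keep the strictest decision among the base and all matching policies."""
    decision = base_decision
    for policy in policies:
        if (condition_holds(policy["conditions"], verdict)
                and STRICTNESS[policy["decision"]] > STRICTNESS[decision]):
            decision = policy["decision"]
    return decision


verdict = {
    "dimensions": {
        "blast_radius": {"score": 35, "available": True},
        "provenance_confidence": {"score": 90, "available": True},
        "behavioral_conformance": {"available": False},
    },
    "aggregate": {"trust_score": 58},
}
policies = [
    {"decision": "escalate",
     "conditions": {"dimension": "blast_radius", "op": "<", "value": 40}},
    {"decision": "block",
     "conditions": {"dimension": "behavioral_conformance", "op": "<", "value": 50}},
]
print(apply_verdict_policies("allow", policies, verdict))  # escalate
```

The block policy does not fire because behavioral_conformance is unavailable, so only the escalate policy raises the decision.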

The dashboard policy builder supports this type directly (Policies → Create → "Risk Verdict, dimension conditions").

Where you see it

Escalation review modal: the verdict card (four dimension bars, evidence, signature badge, rationale) sits above the action content, so approvers see why an action needs review, not just a number.
Provenance Vault detail: every enforcement entry shows its sealed verdict.
Decision history rows: compact I/B/R/P dimension chips next to the legacy score.
POST /v1/enforce/intercept, /enforce/batch and /gateway/preflight responses (preflight is LLM-free, so intent_alignment reports unavailable there).

Backwards compatibility

trust_score remains in every API response, DB record and vault entry.
Existing threshold policies keep working against the same scalar.
Decisions recorded before the upgrade simply have no risk_verdict; all UI surfaces fall back to the scalar.