Self-Hosted Relay

Run Xybern's enforcement plane inside your own network. The relay authorises AI actions locally, so action content never leaves your infrastructure, while still pulling policies from, and writing a complete audit trail to, the Xybern control plane.

It's a drop-in: point the Python SDK at the relay and everything else (agent registration, policy CRUD, device login) is transparently proxied upstream. Only intercepts are served locally.

Why run a relay?

	Cloud (default)	Self-hosted relay
Data residency	Action content sent to Xybern (or hashed with `redact`)	Content never leaves your network; only metadata audit records are forwarded
Latency	One cloud round-trip per action	Evaluated locally inside the relay process, with no cloud round-trip
Availability	Needs the cloud reachable	Keeps enforcing from cached policies if the cloud is briefly unreachable
Audit	Provenance Vault	The same tamper-evident Provenance Vault, forwarded asynchronously

Architecture

   your agent ──▶ Xybern SDK ──▶  Xybern Relay  ──▶ (audit only) ──▶ Xybern Cloud
                                  │  ▲                                  │
                  local policy eval│  │  policy cache (refreshed)  ◀────┘
                                  ▼  │
                          decision (allow / escalate / block)

Evaluated locally: action_type, threshold, content_pattern, temporal, and sequence (velocity + ordered). The deterministic and stateful policy types run entirely on-prem.
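Forwarded to the cloud: policy types that cannot be evaluated locally, currently only semantic (an LLM intent judge). With XYBERN_CLOUD_POLICY_MODE=forward (the default), only an intercept that matches such a policy is forwarded; with XYBERN_CLOUD_POLICY_MODE=skip, those policies are skipped and the intercept stays local.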
Policy cache: pulled from GET /v1/enforce/policies every XYBERN_POLICY_REFRESH seconds, with last-known-good fallback.
Audit: every decision is forwarded asynchronously to POST /v1/enforce/relay/audit, which writes it into the Provenance Vault hash chain, so relay decisions appear in your dashboard alongside cloud ones.
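The policy-cache behaviour described above can be sketched as follows. This is a minimal illustration, not the relay's actual code: `PolicyCache` and the `fetch` callable are hypothetical names, and the sketch refreshes lazily on read, whereas the real relay may refresh in the background. In practice `fetch` would wrap a GET to /v1/enforce/policies.

```python
import time
from typing import Callable, List, Optional


class PolicyCache:
    """Serves the last successfully fetched policy list (hypothetical helper)."""

    def __init__(self, fetch: Callable[[], List[dict]], refresh_seconds: float = 60.0):
        self._fetch = fetch                      # assumed to wrap GET /v1/enforce/policies
        self._refresh_seconds = refresh_seconds  # mirrors XYBERN_POLICY_REFRESH
        self._policies: List[dict] = []
        self._fetched_at: Optional[float] = None
        self.last_error: Optional[Exception] = None

    def get(self, now: Optional[float] = None) -> List[dict]:
        now = time.monotonic() if now is None else now
        stale = self._fetched_at is None or now - self._fetched_at >= self._refresh_seconds
        if stale:
            try:
                self._policies = self._fetch()
                self._fetched_at = now
                self.last_error = None
            except Exception as exc:
                # Last-known-good fallback: keep the previous policies and
                # leave the timestamp alone so the next call retries.
                self.last_error = exc
        return self._policies
```

If the upstream fetch fails, callers keep receiving the previous policy list, and `last_error` records why the refresh did not succeed.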

Run it (Docker)

export XYBERN_API_KEY=xb_live_...        # your workspace key
docker compose up -d                     # relay on http://localhost:8787

Or directly:

pip install -r requirements.txt
export XYBERN_API_KEY=xb_live_...
gunicorn --bind 0.0.0.0:8787 --workers 1 --threads 8 xybern_relay.app:app

Point the SDK at the relay

export XYBERN_BASE_URL=http://localhost:8787/v1

import xybern
xybern.auto.connect(mode="enforce")   # intercepts now resolve at the relay

Configuration

Env var	Default	Purpose
`XYBERN_API_KEY`	(required)	Workspace key, used to pull policies and forward audit records
`XYBERN_UPSTREAM_URL`	`https://www.xybern.com/api/v1`	Control plane URL
`XYBERN_POLICY_REFRESH`	`60`	Policy cache refresh interval (seconds)
`XYBERN_FAIL_OPEN`	`true`	If upstream unreachable: allow (`true`) or block (`false`)
`XYBERN_FORWARD_AUDIT`	`true`	Forward an audit record of every decision
`XYBERN_CLOUD_POLICY_MODE`	`forward`	Cloud-only policies: `forward` the intercept, or `skip`
`XYBERN_RELAY_PORT`	`8787`	Listen port
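As a rough illustration of how these variables and defaults fit together, the sketch below reads them into a typed object. `RelayConfig`, `load_config`, and the set of strings accepted as "true" are assumptions made for this example; the relay's own parsing may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    # Assumption: these spellings count as true; anything else is false.
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class RelayConfig:
    api_key: str
    upstream_url: str
    policy_refresh: int
    fail_open: bool
    forward_audit: bool
    cloud_policy_mode: str
    port: int


def load_config(env: Mapping[str, str] = os.environ) -> RelayConfig:
    """Build a RelayConfig from environment variables, using the table's defaults."""
    api_key = env.get("XYBERN_API_KEY", "")
    if not api_key:
        raise ValueError("XYBERN_API_KEY is required")
    mode = env.get("XYBERN_CLOUD_POLICY_MODE", "forward")
    if mode not in ("forward", "skip"):
        raise ValueError("XYBERN_CLOUD_POLICY_MODE must be 'forward' or 'skip'")
    return RelayConfig(
        api_key=api_key,
        upstream_url=env.get("XYBERN_UPSTREAM_URL", "https://www.xybern.com/api/v1"),
        policy_refresh=int(env.get("XYBERN_POLICY_REFRESH", "60")),
        fail_open=_as_bool(env.get("XYBERN_FAIL_OPEN", "true")),
        forward_audit=_as_bool(env.get("XYBERN_FORWARD_AUDIT", "true")),
        cloud_policy_mode=mode,
        port=int(env.get("XYBERN_RELAY_PORT", "8787")),
    )
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check in isolation, for example `load_config({"XYBERN_API_KEY": "..."})`.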

Endpoints

Method	Path	Purpose
`POST`	`/v1/enforce/intercept`	Authorise one action (local)
`GET`	`/v1/status`	Relay + policy-cache + audit-queue status
`GET`	`/healthz`	Liveness / readiness
`*`	`/v1/enforce/`, `/v1/auth/`	Transparently proxied upstream
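A quick way to check that a relay is up is to probe the liveness endpoint from the table above. The sketch below uses only the Python standard library; `relay_is_healthy` is a hypothetical helper, and it assumes that an HTTP 200 from /healthz means the relay is healthy, since the response body is not documented here. Note that /healthz sits at the server root, so the base URL is given without the /v1 suffix.

```python
import urllib.error
import urllib.request


def relay_is_healthy(base_url: str = "http://localhost:8787", timeout: float = 2.0) -> bool:
    """Return True if GET /healthz answers with HTTP 200 (assumed to mean healthy)."""
    url = base_url.rstrip("/") + "/healthz"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a 4xx/5xx response.
        return False
```

A deployment script could call this in a retry loop after `docker compose up -d` before pointing the SDK at the relay.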

Scaling note

The per-agent action log used by sequence policies is held in process memory, so each worker process or relay instance sees only its own history. Run a single relay (with threads) for a coherent stateful view; the Docker image runs one gunicorn worker with 8 threads for exactly this reason. Scale horizontally only if you don't rely on sequence/velocity policies, or if you back the relays with a shared store.
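To see why in-process state limits scaling, consider a simplified velocity check. This is an illustrative sketch, not the relay's implementation: `VelocityLog` and its parameters are hypothetical, and it models only a "at most N actions per window" rule.

```python
import threading
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class VelocityLog:
    """Per-agent sliding-window counter held in process memory (illustrative)."""

    def __init__(self, max_actions: int, window_seconds: float):
        self._max_actions = max_actions
        self._window = window_seconds
        self._events: Dict[str, Deque[float]] = defaultdict(deque)
        self._lock = threading.Lock()  # threads in one worker share this state

    def allow(self, agent_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        with self._lock:
            events = self._events[agent_id]
            # Drop timestamps that have fallen out of the window.
            while events and now - events[0] >= self._window:
                events.popleft()
            if len(events) >= self._max_actions:
                return False
            events.append(now)
            return True
```

Threads inside one worker share a single `VelocityLog`, so the limit holds. A second worker process or relay instance would hold its own separate log, so an agent whose requests are spread across both could exceed the intended limit.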