PII Redaction

Scroll handles sensitive client data: passports, Emirates IDs, licence numbers, and owner names. For client-originated runs, that data is anonymised before it reaches any AI model and restored afterward, so the LLM works on tokenised text and never sees the real values. This is powered by Xybern Redact.


How it works

Client run:
  real inputs → tokenise → AI analysis step  ("[ENTITY_A], licence [ID_001], owner [PERSON_A]")
                         ← restore ← AI output
  Stored output & PDF show the real values; the model only ever saw tokens.
  • Business names → [ENTITY_A]
  • Licence / ID numbers → [ID_001]
  • Owner / person names → [PERSON_A]

A fresh token map is derived for each client run. The real values are restored in the step output, the stored form data, and the generated documents, so your team and the final package always show real data, even though the model never saw it.


When it applies

  • Client-originated runs (runs tied to a saved client, an intake-link submission, a CRM pull, or Renewal Autopilot) are tokenised before AI analysis.
  • The mapping is held only for the duration of processing and used to restore the output.
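The two rules above can be pictured as a source check plus a scoped mapping. The sketch below is hypothetical: `RunSource`, `needs_redaction`, and `scoped_token_map` are illustrative names, not part of Scroll or Xybern Redact. `AD_HOC` stands in for any run that is not client-originated, and this page does not say how such runs are handled.

```python
from contextlib import contextmanager
from enum import Enum


class RunSource(Enum):
    """Hypothetical labels for where a run came from."""
    SAVED_CLIENT = "saved_client"
    INTAKE_LINK = "intake_link"
    CRM_PULL = "crm_pull"
    RENEWAL_AUTOPILOT = "renewal_autopilot"
    AD_HOC = "ad_hoc"  # assumption: a run not tied to any client


# The four client-originated sources listed above.
CLIENT_ORIGINATED = {
    RunSource.SAVED_CLIENT,
    RunSource.INTAKE_LINK,
    RunSource.CRM_PULL,
    RunSource.RENEWAL_AUTOPILOT,
}


def needs_redaction(source: RunSource) -> bool:
    """True when the run is client-originated and must be tokenised."""
    return source in CLIENT_ORIGINATED


@contextmanager
def scoped_token_map():
    """Yield an empty token map and wipe it when processing ends,
    even if the analysis step raises."""
    token_map: dict[str, str] = {}
    try:
        yield token_map
    finally:
        token_map.clear()


if needs_redaction(RunSource.CRM_PULL):
    with scoped_token_map() as token_map:
        token_map["[ID_001]"] = "CN-1234567"  # invented sample value
        # ... tokenise inputs, call the model, restore the output ...
    # After the block the mapping is empty again.
```

Clearing the map in a `finally` clause is one way to keep the mapping's lifetime tied to processing; a real service would also need to consider persistence, logging, and crash recovery.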

Why it matters

Regulatory work involves exactly the data you don't want sitting in third-party model logs. Redaction gives you the benefit of frontier AI on Arabic legal documents without exposing client identities to the model provider, which matters for AML obligations and client trust. See Xybern Redact for the underlying privacy proxy.