The peyeeye.ai API
Redact PII on the way into your LLM prompts and rehydrate it on the way out. One round-trip, deterministic tokens, zero data retention by default.
How it fits into your stack #
peyeeye.ai is not an LLM provider. It's a thin, stateless shield that sits between your application and whatever model you're using — Claude, GPT, Gemini, a fine-tune, your own checkpoint. Two HTTP endpoints do the whole dance:
1. POST /v1/redact — your raw prompt in, tokenized text out.
2. You prompt the LLM with the tokenized text.
3. POST /v1/rehydrate — the model's reply in, original values back in.
Everything peyeeye.ai does is synchronous, idempotent, and observable. There is no queue, no background worker, no magic. If your LLM call times out, ours already finished.
Guarantees #
- Zero retention by default. Redacted text and source values are held in memory for the session's TTL (default 15m) and then discarded. Set session: "stateless" to skip server-side storage entirely.
- Deterministic tokens within a session. Ada Lovelace is always [PERSON_1] inside one session, and never leaks across sessions.
- At-rest encryption (AES-256-GCM) and TLS 1.3 in transit.
- Per-org isolation. Custom detectors, policies, and API keys are scoped to your organization — cross-tenant leakage is impossible by construction.
Quickstart #
One redact + one rehydrate, from zero to working code. Grab your API key from the dashboard first.
1 · Install
```
# Node / TypeScript
npm install peyeeye
```

2 · Set your key

```
export PEYEEYE_KEY="pk_live_51H…"
```
3 · Round-trip a prompt
The SDK wraps redact+rehydrate into a single shield() helper. This is the recommended pattern — you can still call the raw endpoints if you need to.
```ts
import { Peyeeye } from "peyeeye";
import Anthropic from "@anthropic-ai/sdk";

const peyeeye = new Peyeeye({ apiKey: process.env.PEYEEYE_KEY });
const claude = new Anthropic();

const shield = await peyeeye.shield();
const safe = await shield.redact("Hi, I'm Ada, ada@a-e.com");

const reply = await claude.messages.create({
  model: "claude-sonnet-*",
  max_tokens: 256,
  messages: [{ role: "user", content: safe }],
});

console.log(await shield.rehydrate(reply.content[0].text));
// "Hi Ada, thanks — we've emailed ada@a-e.com."
```

The session handle is opaque — it's how rehydrate matches tokens back to real values. Pass it verbatim. Don't persist it longer than the redacted text lives.

Authentication #
All requests use bearer-token auth. Keys are prefixed pk_live_, scoped to one organization, and don't expire — rotate them yourself in the dashboard.
```
Authorization: Bearer pk_live_51H…
Content-Type: application/json
Idempotency-Key: req_a1b2c3d4   # optional, recommended
```
Idempotency #
Pass an Idempotency-Key header to safely retry. We cache the full response keyed on the tuple (api_key, idempotency_key). Mismatched bodies raise idempotency_conflict.
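Sketched in Python (send stands in for whatever HTTP call you make; the one thing that matters is that every retry reuses the same key, so the server replays the cached response instead of redacting twice):

```python
import uuid

def post_with_retry(send, body, max_attempts=3):
    # Generate the key once, outside the loop: every attempt presents the
    # same Idempotency-Key, so a retry after a dropped connection is safe.
    headers = {"Idempotency-Key": f"req_{uuid.uuid4().hex[:8]}"}
    last_exc = None
    for _ in range(max_attempts):
        try:
            return send(body, headers)
        except ConnectionError as exc:  # transient failure: retry, same key
            last_exc = exc
    raise last_exc
```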
POST /v1/redact #
Detect PII in a block of text, replace each span with a deterministic token, and return a session handle you can later rehydrate.
Body parameters
- text — up to 128K characters. Arrays accepted — each element is redacted in the same session.
- locale — enables locale-specific detectors (fr-FR SIRET, en-GB NHS number). default: "auto"
- session — pass "stateless" to skip server-side storage — the response will include a rehydration_key blob you must present to /rehydrate.
- token format — "[{TYPE}_{N}]" (default), "<{TYPE}>", or a custom format with {TYPE}, {N}, {HASH} variables.

Example
```
POST /v1/redact
Authorization: Bearer pk_live_…
Content-Type: application/json

{
  "text": "Hi, I'm Ada Lovelace.\nEmail: ada@analytic-engines.com\nCard: 4242 4242 4242 4242",
  "locale": "en-US",
  "policy": "default"
}
```
```
{
  "redacted": "Hi, I'm [PERSON_1].\nEmail: [EMAIL_1]\nCard: [CARD_1]",
  "session": "ses_7fA2kLw9MxPq",
  "entities": [
    { "token": "[PERSON_1]", "type": "PERSON", "span": [8, 20],  "confidence": 0.98 },
    { "token": "[EMAIL_1]",  "type": "EMAIL",  "span": [29, 53], "confidence": 1.00 },
    { "token": "[CARD_1]",   "type": "CARD",   "span": [60, 79], "confidence": 0.99 }
  ],
  "latency_ms": 38,
  "expires_at": "2026-05-01T14:27:03Z"
}
```

POST /v1/rehydrate #
Substitute tokens in a string with the original values held in a session. Unknown tokens pass through verbatim — we don't fail the call if the LLM made one up.
Body parameters
- session — the session handle returned by /redact, or the rehydration_key blob if you used stateless mode.
- strict — when true, any unknown tokens raise unknown_token instead of passing through. Useful for catching model hallucinations. default: false

Response
```
{
  "text": "Hi Ada, thanks — we've emailed ada@analytic-engines.com.",
  "replaced": 2,
  "unknown": [],
  "latency_ms": 11
}
```

More endpoints #
Everything else the dashboard uses is available over the same bearer-token API.
- Session inspection — locale, policy, chars processed, entity count, expires_at, and whether it's already expired.
- Entity listing — built-ins carry id, category, sample, locales; customs add kind, pattern, enabled.
- Custom detector creation — id, kind: "regex" | "fewshot", pattern, examples, confidence_floor. Plan-gated: Free allows 1, Build 3, Pro 10, Scale unlimited. Over-cap returns 403 forbidden.
- Detector updates — change pattern, toggle enabled, or tune confidence_floor without a full replace.
- Pattern suggestions — feed a suggested pattern into a POST /v1/entities call to adopt one.

Errors & retries #
All errors return a JSON body with code, message, and request_id. Transient errors (429, 5xx) are safe to retry with exponential backoff — the SDKs do this for you.
- unknown_token — strict: true mode hit a token that wasn't in the session. Often means the LLM hallucinated a placeholder.
- Expired or unknown session — re-run /redact.
- Text over 128K characters — split the text and redact each chunk into the same session.
- rate_limited — honor the Retry-After header. SDKs back off automatically.

Rate limits #
Per-key limits, measured as requests-per-second with a burst bucket of 2× sustained RPS. Response headers report your remaining budget:
```
X-RateLimit-Limit: 500
X-RateLimit-Remaining: 487
Retry-After: 0.42   # seconds, only on 429
```

- Free — 2 rps sustained, 5 rps burst
- Build — 200 rps sustained, 400 rps burst
- Pro — 1000 rps sustained, 2000 rps burst
- Scale — 3000 rps sustained, 6000 rps burst
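The SDKs handle this for you, but if you're calling the API raw, a minimal backoff helper might look like this (a sketch; backoff_delay is not part of any SDK):

```python
def backoff_delay(headers, attempt, base=0.5, cap=30.0):
    # Prefer the server's Retry-After (seconds, sent on 429) when present;
    # otherwise fall back to capped exponential backoff.
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)
    return min(cap, base * (2 ** attempt))
```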
Sessions & tokens #
A session is the bridge that lets peyeeye.ai swap tokens back to real values later. Two modes:
Stateful (default)
We hold the mapping for 15m after the last touch, then discard it. Simple, low-latency, but requires server-side storage on our end — if that's a non-starter for you, use stateless mode instead. DELETE /v1/sessions/:id to drop the mapping early.
Stateless
Pass session: "stateless". The response includes an opaque rehydration_key (prefixed skey_) — an AES-256-GCM-sealed blob of the token→value mapping. Store it yourself. Send it back to /rehydrate as the session value. We never persist anything.
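Wire-level, a stateless round trip looks roughly like this (a sketch with abbreviated values and elided response fields; the load-bearing parts are session: "stateless" on the way in and the skey_ blob reused as the session value on the way out):

```
POST /v1/redact
{ "text": "Hi, I'm Ada", "session": "stateless" }

→ { "redacted": "Hi, I'm [PERSON_1]", "rehydration_key": "skey_…", ... }

POST /v1/rehydrate
{ "text": "Welcome back, [PERSON_1]!", "session": "skey_…" }

→ { "text": "Welcome back, Ada!", ... }
```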
Entity catalog #
62 built-in entity types (regex + checksum validated, supplemented by ML NER), grouped below. Every ID is usable in entities: [...] or as a policy rule.
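As a sketch, restricting a redact call to only the card and email detectors (IDs as used in the response examples above) looks like:

```
POST /v1/redact
{
  "text": "Card: 4242 4242 4242 4242",
  "entities": ["CARD", "EMAIL"]
}
```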
Custom detectors #
Define your own detector with a regex, or drop in a handful of example strings and let peyeeye induce the pattern (LLM-backed when enabled, heuristic fallback otherwise):
```
{
  "id": "ORDER_ID",
  "kind": "regex",
  "pattern": "#A-\\d{6,}",
  "examples": ["#A-884217", "#A-007431"],
  "confidence_floor": 0.9
}
```

If pattern is omitted, peyeeye induces one from examples at create time. Test-drive patterns against sample text before you save them with POST /v1/entities/test.
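You can also sanity-check a regex locally before saving it, here against the two example strings from the payload above (note the JSON "#A-\\d{6,}" is the regex #A-\d{6,} once unescaped):

```python
import re

# The ORDER_ID pattern from the detector payload.
pattern = re.compile(r"#A-\d{6,}")

for sample in ["#A-884217", "#A-007431"]:
    assert pattern.fullmatch(sample), sample
```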
Streaming #
When you're piping an LLM's token stream back to a user, naive rehydration breaks on mid-token boundaries. The streaming API buffers partial tokens until they complete, then emits cleanly. Build plan and higher.
Post a list of chunks; get back Server-Sent Events in three flavours — session fires once with the new session id, redacted fires per chunk, done closes the stream.
```
# POST /v1/redact/stream
{ "chunks": ["Hi, I'm Ada", " — card 4242 4242 4242 4242"] }

event: session
data: {"session":"ses_7fA2kLw9MxPq"}

event: redacted
data: {"text":"Hi, I'm [PERSON_1]","entities":1}

event: redacted
data: {"text":" — card [CARD_1]","entities":1}

event: done
data: {"chars":37}
```

Both SDKs wrap this with partial-token buffering so you can interleave upstream LLM chunks with rehydration safely. Open a shield once, redact the user prompt, then pipe each streamed LLM chunk through rehydrateChunk:
```ts
import { Peyeeye } from "peyeeye";
import Anthropic from "@anthropic-ai/sdk";

const peyeeye = new Peyeeye({ apiKey: process.env.PEYEEYE_KEY! });
const claude = new Anthropic();

const shield = await peyeeye.shield();
const safe = await shield.redact(userInput);

const upstream = await claude.messages.stream({
  model: "claude-sonnet-*",
  messages: [{ role: "user", content: safe }],
});

for await (const chunk of upstream) {
  if (chunk.type !== "content_block_delta") continue;
  const out = await shield.rehydrateChunk(chunk.delta.text); // partial-token safe
  process.stdout.write(out);
}
process.stdout.write(await shield.flush()); // emit any buffered remainder
```
```python
import os
import sys

from anthropic import Anthropic
from peyeeye import Peyeeye

peyeeye = Peyeeye(api_key=os.environ["PEYEEYE_KEY"])
claude = Anthropic()

with peyeeye.shield() as shield:
    safe = shield.redact(user_input)
    with claude.messages.stream(
        model="claude-sonnet-*",
        max_tokens=512,
        messages=[{"role": "user", "content": safe}],
    ) as upstream:
        for text in upstream.text_stream:
            sys.stdout.write(shield.rehydrate_chunk(text))  # partial-token safe
            sys.stdout.flush()
    sys.stdout.write(shield.flush())  # emit any buffered remainder
```
If you want the raw SSE — for example from a runtime without the SDK on it — post directly to /v1/redact/stream and consume the stream of session / redacted / done events:
```ts
import { Peyeeye } from "peyeeye";

const peyeeye = new Peyeeye({ apiKey: process.env.PEYEEYE_KEY! });

let sessionId: string | undefined;
for await (const ev of peyeeye.redactStream({
  chunks: ["Hi, I'm Ada", " — card 4242 4242 4242 4242"],
})) {
  if (ev.event === "session") sessionId = ev.data.session;
  if (ev.event === "redacted") process.stdout.write(ev.data.text);
}
```
```python
from peyeeye import Peyeeye

peyeeye = Peyeeye(api_key="pk_live_...")

for ev in peyeeye.redact_stream([
    "Hi, I'm Ada",
    " — card 4242 4242 4242 4242",
]):
    if ev.event == "session":
        session_id = ev.data["session"]
    elif ev.event == "redacted":
        print(ev.data["text"])
```
SDKs #
First-party libraries, open-source under MIT. Full parity with the HTTP API — redact, rehydrate, streaming with partial-token buffering, stateless sealed sessions, custom detectors, session management. Current stable release: v1.0.0.
TypeScript / Node
Node 18+, Bun, Deno, Cloudflare Workers, Vercel Edge. Zero runtime dependencies — uses the platform fetch. Dual ESM + CJS build with typed .d.ts / .d.cts.
Python
Python 3.9+. Single runtime dependency (httpx). Fully type-hinted with py.typed. Shield context manager handles session lifecycle automatically.
TypeScript / Node
Install:
```
# npm, pnpm, yarn, or bun — pick your poison
npm install peyeeye
pnpm add peyeeye
bun add peyeeye
```

Quickstart — end-to-end redact → LLM → rehydrate:
```ts
import { Peyeeye } from "peyeeye";
import Anthropic from "@anthropic-ai/sdk";

const peyeeye = new Peyeeye({ apiKey: process.env.PEYEEYE_KEY! });
const claude = new Anthropic();

const shield = await peyeeye.shield();
const safe = await shield.redact("Hi, I'm Ada, ada@a-e.com");

const reply = await claude.messages.create({
  model: "claude-sonnet-*",
  max_tokens: 256,
  messages: [{ role: "user", content: safe }],
});

console.log(await shield.rehydrate(reply.content[0].text));
// "Hi Ada, thanks — we've emailed ada@a-e.com."
```
shield() opens a session on the first redact() call, keeps reusing it across subsequent calls, and swaps tokens back on rehydrate(). The same real value always yields the same token within a shield; tokens never leak across shields.
Client configuration:
```ts
new Peyeeye({
  apiKey: "pk_live_…",
  baseUrl: "https://api.peyeeye.ai",      // optional
  maxRetries: 3,                          // 429 + 5xx back off exponentially
  timeoutMs: 30_000,                      // per-request timeout
  defaultHeaders: { "X-App": "my-app" },
  fetch: globalThis.fetch,                // override on Cloudflare Workers
});
```
Low-level calls (when you don't want the shield helper):
```ts
const r = await peyeeye.redact("Card: 4242 4242 4242 4242");
// r.redacted → "Card: [CARD_1]"
// r.session  → "ses_…"
// r.entities → [{ token: "[CARD_1]", type: "CARD", span: [6, 25], confidence: 0.99 }]

const back = await peyeeye.rehydrate("Confirmation for [CARD_1].", r.session);
// back.text → "Confirmation for 4242 4242 4242 4242."
```
Full surface: README — shield, stateless sealed mode, SSE streaming, custom detectors, session management, retry / rate-limit headers, typed errors.
Python
Install:
```
# pip, poetry, pdm, uv — works with any installer
pip install peyeeye
poetry add peyeeye
uv pip install peyeeye
```

Quickstart — end-to-end redact → LLM → rehydrate:
```python
import os

from anthropic import Anthropic
from peyeeye import Peyeeye

peyeeye = Peyeeye(api_key=os.environ["PEYEEYE_KEY"])
claude = Anthropic()

with peyeeye.shield() as shield:
    safe = shield.redact("Hi, I'm Ada, ada@a-e.com")
    reply = claude.messages.create(
        model="claude-sonnet-*",
        max_tokens=256,
        messages=[{"role": "user", "content": safe}],
    )
    print(shield.rehydrate(reply.content[0].text))
```
Inside the with block the shield pins a single session: the same real value always maps to the same token, and the session is cleaned up on exit (stateful mode).
Client configuration:
```python
from peyeeye import Peyeeye

peyeeye = Peyeeye(
    api_key="pk_live_...",
    base_url="https://api.peyeeye.ai",     # optional
    timeout=30.0,                          # per-request timeout (seconds)
    max_retries=3,                         # 429 + 5xx back off exponentially
    default_headers={"X-App": "my-app"},
)
```
Low-level calls (skip the shield helper):
```python
r = peyeeye.redact("Card: 4242 4242 4242 4242")
# r.redacted → "Card: [CARD_1]"
# r.session  → "ses_…"
# r.entities → [DetectedEntity(token="[CARD_1]", type="CARD", span=(6, 25), confidence=0.99)]

back = peyeeye.rehydrate("Confirmation for [CARD_1].", session=r.session)
# back.text → "Confirmation for 4242 4242 4242 4242."
```
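Span offsets are zero-based and end-exclusive indices into the original text, so you can slice a detected entity back out yourself; for the card example above:

```python
text = "Card: 4242 4242 4242 4242"
span = (6, 25)  # as reported for [CARD_1] above

# text[start:end] recovers exactly the detected value.
assert text[span[0]:span[1]] == "4242 4242 4242 4242"
```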
Stateless sealed mode — the server never persists the mapping; the sealed skey_… blob carries everything the rehydrate step needs:
```python
with peyeeye.shield(stateless=True) as shield:
    safe = shield.redact("Ada, 4242 4242 4242 4242")
    # shield.rehydration_key → "skey_AES-GCM-sealed..."
    # Shipped to a client, used later, no server-side state.
    print(shield.rehydrate("Hi [PERSON_1], your [CARD_1] is active."))
```
Typed errors from the API:
```python
from peyeeye import PeyeeyeError

try:
    peyeeye.redact(text)
except PeyeeyeError as e:
    # e.status, e.code, e.message, e.request_id
    if e.code == "rate_limited":
        retry(e.retry_after)
    elif e.code == "forbidden":
        upgrade_plan()
    else:
        raise
```
Full surface: README — shield, stateless sealed mode, SSE streaming via redact_stream(), custom detectors, session management, retry / rate-limit headers, typed errors.