Redacta sits between your app and LLM providers. It automatically detects and scrubs personal information before it reaches any AI model — then restores it in the response. Two lines of code. Zero data leakage.
No credit card required · 50 free redactions/mo
Paste any text and watch PII disappear in real time
No SDK wrappers. No code refactoring. Just environment variables.
// Before: calling OpenAI directly
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: "sk-your-key"
});

// PII goes straight to OpenAI
await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{
    role: "user",
    // Template literal, since the message spans several lines
    content: `Review John Smith's file.
SSN: 483-29-1847,
Email: john@acme.com`
  }]
});

// After: routing the same call through Redacta
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: "rdk_your-redacta-key",
  baseURL: "https://getredacta.com/api/v1"
});

// PII is scrubbed automatically
// OpenAI sees: "Review [PERSON_a7f3b1c9]'s file.
// SSN: [US_SSN_b2c14f80],
// Email: [EMAIL_ADDRESS_d4e5ab72]"
await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "..." }]
});

Built for developers who can't afford to leak customer data
Names, SSNs, credit cards, emails, phone numbers, addresses — detected and replaced before reaching any AI model.
Change your SDK's baseURL and API key. That's it. Every API call is protected automatically.
The LLM only ever sees sanitized tokens — real PII is held server-side and restored in the response before it returns to your app.
Upload .docx, .xlsx, .pdf, .py, and 20+ file types. See exactly what PII exists before sharing with anyone.
Works with OpenAI, Anthropic, and Google Gemini SDKs. Supports streaming, tool use, and all API features.
AES-256-GCM encrypted token mappings, per-user configurable retention (1 hour to 90 days), and audit logs for admin and billing events on the Business plan.
Upload files directly or let the API proxy handle text in API calls
Only pay for what you redact. Clean scans are always free.
25 redactions/mo
300 redactions/mo
1,500 redactions/mo
5,000 redactions/mo
Everything you need to know before getting started
Token mappings (the link between real PII and replacement tokens) are encrypted at rest with AES-256-GCM and automatically purged on a per-user retention schedule — 24 hours by default, configurable from 1 hour to 90 days on the Business plan. We only store anonymized metadata (counts, timestamps, PII types) for your dashboard. Your real data is never logged, sold, or used to train models.
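For readers curious what AES-256-GCM encryption of a token mapping can look like, here is a minimal sketch using Node's built-in crypto module. The names `SealedMapping`, `sealMapping`, and `openMapping` are hypothetical, and the storage layout is an assumption made for illustration; it is not a description of Redacta's internal schema or key management.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical record shape for one encrypted mapping (assumption, not Redacta's schema).
interface SealedMapping {
  iv: string;   // base64, 12-byte nonce, fresh for every encryption
  tag: string;  // base64, 16-byte GCM authentication tag
  data: string; // base64 ciphertext
}

// Encrypts a token -> real-value mapping with a 32-byte key.
function sealMapping(mapping: Record<string, string>, key: Buffer): SealedMapping {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([
    cipher.update(JSON.stringify(mapping), "utf8"),
    cipher.final(),
  ]);
  return {
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    data: data.toString("base64"),
  };
}

// Decrypts a sealed mapping; throws if the ciphertext or tag was altered.
function openMapping(sealed: SealedMapping, key: Buffer): Record<string, string> {
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(sealed.iv, "base64"));
  decipher.setAuthTag(Buffer.from(sealed.tag, "base64"));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(sealed.data, "base64")),
    decipher.final(),
  ]);
  return JSON.parse(plain.toString("utf8"));
}
```

GCM authenticates as well as encrypts, so a mapping that has been tampered with fails to decrypt instead of silently restoring the wrong value.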
We use a dual-layer approach: TypeScript regex for structured PII (SSNs, credit cards, emails) with Luhn validation, plus Microsoft Presidio with spaCy NER for unstructured PII (names, locations, organizations). Combined, we achieve 99%+ recall on common entity types.
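To illustrate the regex-plus-Luhn idea for credit cards, here is a small sketch. The pattern and the function names `passesLuhn` and `findCardNumbers` are assumptions made for this example, not Redacta's production rules. The regex finds card-like digit runs, and the Luhn checksum filters out runs that only look like card numbers.

```typescript
// Returns true if the digit string passes the Luhn checksum.
function passesLuhn(digits: string): boolean {
  let sum = 0;
  let doubleNext = false;
  // Walk right to left, doubling every second digit.
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48;
    if (d < 0 || d > 9) return false; // reject anything that is not a digit
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return digits.length > 0 && sum % 10 === 0;
}

// Assumed candidate pattern: 13 to 19 digits, optionally separated by single spaces or hyphens.
const CARD_CANDIDATE = /\b\d(?:[ -]?\d){12,18}\b/g;

// Finds card-like digit runs and keeps only those that pass Luhn.
function findCardNumbers(text: string): string[] {
  const matches = text.match(CARD_CANDIDATE) ?? [];
  return matches.filter((m) => passesLuhn(m.replace(/[ -]/g, "")));
}
```

With this sketch, the well-known test number 4111 1111 1111 1111 is reported, while the same number with its last digit changed is ignored because the checksum fails.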
Our regex-layer scrubbing runs at sub-millisecond p99 latency (measured at p50 ≈ 2 µs, p95 ≈ 3 µs, p99 ≈ 4 µs on 10,000 synthetic samples). The Presidio NER pass runs in parallel and adds variable overhead bounded by model inference time. End-to-end production latency is dominated by the LLM provider call itself (typically 1-10 seconds), so scrubbing overhead is imperceptible. Streaming responses are fully supported with transparent de-scrubbing.
Yes — Business plan customers get access to the self-hosted Docker deployment. All data stays within your network. We also provide Claude Code enterprise hooks for companies running Claude Code internally.
Tokens are deterministic within a request: the same name always maps to the same token. So when the LLM reads '[PERSON_a7f3b1c9] contacted [PERSON_a7f3b1c9] about their order,' it can still reason about the same person — it just doesn't know who. Responses are fully restored before being returned to your app.
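A rough sketch of request-scoped deterministic tokens follows. The class name `RequestTokenMap`, the helper `randomHex8`, and the use of `Math.random` are assumptions chosen for brevity; the token shape mirrors the `[PERSON_a7f3b1c9]` examples above, but this is not Redacta's actual implementation.

```typescript
type EntityType = "PERSON" | "US_SSN" | "EMAIL_ADDRESS";

// Hypothetical helper: 8 lowercase hex characters.
// Math.random keeps the sketch short; a real service would use a CSPRNG.
function randomHex8(): string {
  return Math.floor(Math.random() * 0x100000000).toString(16).padStart(8, "0");
}

// One instance per request: the same (type, value) pair always yields the same token.
class RequestTokenMap {
  private readonly forward = new Map<string, string>();
  private readonly reverse = new Map<string, string>();

  tokenFor(type: EntityType, value: string): string {
    const key = `${type}:${value}`;
    const existing = this.forward.get(key);
    if (existing !== undefined) return existing;

    // Draw a new suffix, retrying in the unlikely case of a collision.
    let token: string;
    do {
      token = `[${type}_${randomHex8()}]`;
    } while (this.reverse.has(token));

    this.forward.set(key, token);
    this.reverse.set(token, value);
    return token;
  }

  // Swaps known tokens back to their original values; unknown tokens are left untouched.
  restore(text: string): string {
    return text.replace(/\[[A-Z_]+_[0-9a-f]{8}\]/g, (t) => this.reverse.get(t) ?? t);
  }
}
```

Because the map lives only as long as the request, two mentions of the same person share a token inside one request, which is what lets the model keep track of who is who.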
Yes. We handle SSE (Server-Sent Events) with a buffer-and-flush transform that correctly reassembles tokens split across chunks. Streaming works identically to the non-streaming flow from your app's perspective.
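The buffer-and-flush idea can be sketched as follows, assuming the text deltas have already been extracted from the SSE events. `StreamingRestorer` and `MAX_TOKEN_LENGTH` are hypothetical names, and the hold-back rule shown here is one plausible approach, not necessarily the one Redacta uses.

```typescript
const TOKEN_PATTERN = /\[[A-Z_]+_[0-9a-f]{8}\]/g;
// Assumed upper bound on token length; a longer unclosed tail cannot be a token.
const MAX_TOKEN_LENGTH = 64;

class StreamingRestorer {
  private pending = "";
  private readonly lookup: Map<string, string>;

  constructor(lookup: Map<string, string>) {
    this.lookup = lookup;
  }

  // Accepts one text chunk and returns whatever is safe to emit right now.
  push(chunk: string): string {
    const text = this.pending + chunk;
    const open = text.lastIndexOf("[");
    const unclosed = open !== -1 && text.indexOf("]", open) === -1;
    // Hold back a trailing "[..." that may be the start of a split token.
    const holdBack = unclosed && text.length - open <= MAX_TOKEN_LENGTH;
    const cut = holdBack ? open : text.length;
    this.pending = text.slice(cut);
    return this.restore(text.slice(0, cut));
  }

  // Call once at end of stream to emit anything still buffered.
  flush(): string {
    const rest = this.restore(this.pending);
    this.pending = "";
    return rest;
  }

  private restore(text: string): string {
    return text.replace(TOKEN_PATTERN, (t) => this.lookup.get(t) ?? t);
  }
}
```

Text without an open bracket passes through immediately, so buffering only delays output while a possible token is incomplete.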
Fully supported. Function arguments, tool call payloads, and tool result content are all walked recursively and scrubbed the same way regular messages are — including nested objects and arrays. String leaves get tokenized; numbers, booleans, and nulls pass through. Works for OpenAI tool_calls, the legacy function_call format, OpenAI tool result messages, and Anthropic tool_use / tool_result blocks.
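The recursive walk described above can be sketched in a few lines. The `Json` type and `scrubDeep` function are hypothetical names, and `scrubString` stands in for whatever detector does the actual tokenizing.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Walks any JSON value, scrubbing string leaves and passing everything else through.
function scrubDeep(value: Json, scrubString: (s: string) => string): Json {
  if (typeof value === "string") return scrubString(value);
  if (Array.isArray(value)) return value.map((v) => scrubDeep(v, scrubString));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [k, v] of Object.entries(value)) {
      out[k] = scrubDeep(v, scrubString);
    }
    return out;
  }
  return value; // numbers, booleans, and null are returned unchanged
}
```

In OpenAI's format, `function.arguments` arrives as a JSON-encoded string, so a caller would parse it first, walk the result, and serialize it again.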
Names, emails, phone numbers, SSNs, credit cards (Luhn-validated), bank account numbers, IP addresses, physical addresses, locations, dates, driver's license numbers, passport numbers, and API keys. Custom entity types can be added on Business plans.
Free plan: hard cutoff at 25 redactions — your requests will return 429 until the next billing period. Paid plans: we allow overage at a per-redaction rate ($0.02-$0.05 depending on tier), so your app never breaks. We send email alerts at 80% and 95% usage.
Set up in under 5 minutes. Start with 50 free redactions per month.
Get Started Free