Precise PII Detection at Scale
Self-hosted, air-gapped PII detection with enterprise-grade security.
Core Operations
PII Eraser offers four modes of operation to detect and anonymize PII, PCI and other sensitive data.
import requests
# Detect sensitive data instances
response = requests.post(
"http://localhost:8000/text/detect",
json={
"text": ["Hallo Matthias"],
}
)import requests
# Replace sensitive data with entity types
response = requests.post(
"http://localhost:8000/text/transform",
json={
"text": ["I live at Berlinerstr 34, Magdeburg"],
"operator": "redact",
}
)import requests
# Mask characters with asterisks
response = requests.post(
"http://localhost:8000/text/transform",
json={
"text": ["My Nino is QQ689643C"],
"operator": "mask",
}
)import requests
# Cryptographic hashing for analytics
response = requests.post(
"http://localhost:8000/text/transform",
json={
"text": ["Medicare no. 3278851195"],
"operator": "hash",
}
){
"entities": [
[
{
"entity_type": "NAME",
"start": 6,
"end": 14,
"score": 0.995
}
]
],
"stats": { "total_tokens": 4, "tps": 5420 }
}{
"text": [
"I live at <ADDRESS>"
],
"entities": [
[
{
"entity_type": "ADDRESS",
"output_start": 10,
"output_end": 19
}
]
],
"stats": {"total_tokens": 13, "tps": 4550.87}
}{
"text": [
"My Nino is #########"
],
"entities": [
[
{
"entity_type": "GOV_SERVICES_ID",
"output_start": 11,
"output_end": 20
}
]
],
"stats": {"total_tokens": 11, "tps": 4282.22}
}{
"text": [
"Medicare no. 041827ef234aeddd34e45a4eb9ef23240fdd7072824a372bb0953377557c6d9c"
],
"entities": [
[
{
"entity_type": "HEALTHCARE_ID",
"output_start": 13,
"output_end": 77
}
]
],
"stats": {"total_tokens": 10, "tps": 4640.52}
}For the full range of options, including how to configure entity types via YAML, please visit the documentation
Designed for the Agentic AI Era
PII Eraser natively understands OpenAI-format chat conversations, delivering context-aware detection for LLM guardrail and agentic workflows.
Existing PII Guardrails
Scans messages individually. Fast, but misses PII requiring context.
Hi, I need to update the beneficiary details for the 'Project Alpha' contract payouts.
I can help with that. Which specific banking detail do you need to amend?
The bank account number has changed for our UK entity.
Understood. Please provide the new 8-digit account number.
Process All Messages
Scans entire history every turn. High accuracy, but scales poorly.
Hi, I need to update the beneficiary details for the 'Project Alpha' contract payouts.
I can help with that. Which specific banking detail do you need to amend?
The bank account number has changed for our UK entity.
Understood. Please provide the new 8-digit account number.
Smart Context
Automatically includes relevant context. The optimal balance.
Hi, I need to update the beneficiary details for the 'Project Alpha' contract payouts.
I can help with that. Which specific banking detail do you need to amend?
The bank account number has changed for our UK entity.
Understood. Please provide the new 8-digit account number.
Natively process OpenAI-format chats with context pooled between messages for improved accuracy.
Choose to anonymize only user messages while keeping system prompts and assistant responses intact.
Optimize latency-sensitive apps by skipping previously processed messages while maintaining full conversational context.
Great Accuracy, Globally
Accurate identification of 60+ entity types across Western Europe, North America and Australia.
60+ Localized Entity Types
Most tools only detect universal identifiers like emails & phone numbers. PII Eraser features deep, country-level localizations, with support for German Steuer-IDs, French NIR numbers, Australian TFNs and dozens more.
Regular Model Updates
The world changes fast. Older models fail on terms like "COVID" and aren't familiar with MCP tool calls. We continuously update our models to recognize contemporary entities and the shifting GenAI landscape.
No Regex Maintenance
PII Eraser relies on large encoder transformer models, freeing your team from maintaining fragile regex-based solutions. We also offer model updates free of charge in case we do miss something.
Protect Confidential Business Information
Detect and remove customer names, deal values, project codenames, and internal identifiers that create risk even when they aren't covered by privacy regulations.
System Architecture
Customer Infrastructure (VPC / On-Prem)
Self-hosted, air-gapped deployment. Your data never leaves your environment.
API Gateway
Systems
High Throughput
Sustain >5,000 tokens/sec on standard 8 vCPU instances via AVX-512 VNNI and AMX instruction sets.
Bank-Grade Deployment
Multiple reference implementations including AWS Fargate & ECS and Kubernetes. Deploy a hardened, distroless container built for zero-CVE compliance.
Seamless Migration
PII Eraser provides drop-in compatibility for Microsoft Presidio Analyzer.
Why PII Eraser?
See how a purpose-built, self-hosted solution compares to the alternatives.
| PII Eraser | Cloud API | Open Source | LLM (Generative) | |
|---|---|---|---|---|
| Data Sovereignty | 100% Local / Air-gapped | Cloud Only | Local | Cloud (Mostly) |
| Cost Model | Fixed Price Unlimited | Per Character (Expensive) | Free (Maintenance Heavy) | Per Token (Very Expensive) |
| EU Localization | Native | Limited | Limited | Native |
| Native Chat Support | OpenAI Completions Format | No | No | No Structured Chat Input |
| Long Input Support | 1M+ Tokens, No Accuracy Loss | Supported | Chunking Required; Accuracy Degrades | Accuracy Degrades with Length |
| Latency | Real-time | Medium | Medium | High (> 1000ms) |
| Security & Supply Chain | Hardened, Minimal Dependencies | Provider Managed | Self-Managed Security & Patching | Provider Managed |
| Hallucinations | Zero | Zero | Zero | Possible (Probabilistic) |
Ready to see it in action?
Contact us for a technical walkthrough or to deploy a trial instance on your own infrastructure.