Expunct
Privacy-first APIs for AI workflows. Expunct ships two product pillars that share one platform, one API key, and one audit trail:
- Redaction API — detect and replace PII, PCI, and PHI in text, documents, images, video, and audio. Generally available.
- Document Intelligence API — parse, extract, and safe-parse PDF and DOCX into LLM-ready structure. Beta, available on enabled tenants.
safe_parse is the flagship workflow: it parses a sensitive document and emits sanitized canonical JSON, markdown, and chunks in a single call, ready for RAG, search, or any downstream LLM.
What you can do today
| Pillar | Operations | Status |
|---|---|---|
| Redaction | POST /api/v1/redact, file & URI redaction, batch, policies | GA |
| Document Intelligence | POST /api/v1/parse, POST /api/v1/extract, POST /api/v1/workflows/safe-parse | Beta — gated by tenant feature flags. PDF and DOCX only. |
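As a rough illustration of calling the GA redaction endpoint from the table above, the sketch below only builds the HTTP request and does not send it. Only the path `POST /api/v1/redact` comes from this page. The host `api.expunct.example`, the Bearer authorization scheme, and the `text` body field are assumptions made for the example, so check the API reference for the real request shape.

```python
import json
import urllib.request

# Hypothetical host: this page lists paths only, not a base URL.
BASE_URL = "https://api.expunct.example"


def build_redact_request(api_key: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST /api/v1/redact request.

    The Bearer header and the {"text": ...} body are assumptions
    for illustration, not the documented contract.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/redact",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_redact_request("YOUR_API_KEY", "Call Jane at 555-0100.")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. The builder is kept separate so the request can be inspected or logged without touching the network.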
Supported formats
| Pillar | Formats |
|---|---|
| Redaction | Plain text, JSON, PDF, DOCX, PNG, JPG, MP4, WAV, MP3, plus s3://, gs://, https:// URIs |
| Document Intelligence (beta) | PDF, DOCX |
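The format table can be turned into a small client-side check that reports which pillars accept a given input. This is a sketch under stated assumptions: the function name `pillars_for` is hypothetical, `.txt` stands in for "plain text", and URIs are treated as redaction-only because the table lists URI schemes for redaction alone.

```python
from pathlib import PurePosixPath
from typing import List
from urllib.parse import urlparse

# Extensions derived from the table; ".txt" for "plain text" is an assumption.
REDACTION_EXTS = {".txt", ".json", ".pdf", ".docx", ".png", ".jpg", ".mp4", ".wav", ".mp3"}
DOC_INTEL_EXTS = {".pdf", ".docx"}
REDACTION_URI_SCHEMES = {"s3", "gs", "https"}


def pillars_for(source: str) -> List[str]:
    """Return the pillars whose format list covers a file name or URI."""
    is_uri = "://" in source
    parsed = urlparse(source)
    path = parsed.path if is_uri else source
    ext = PurePosixPath(path).suffix.lower()

    pillars = []
    # Redaction accepts local files, plus URIs with a listed scheme.
    if ext in REDACTION_EXTS and (not is_uri or parsed.scheme in REDACTION_URI_SCHEMES):
        pillars.append("redaction")
    # Assumption: Document Intelligence takes uploaded files only, not URIs.
    if ext in DOC_INTEL_EXTS and not is_uri:
        pillars.append("document_intelligence")
    return pillars


print(pillars_for("contract.docx"))            # ['redaction', 'document_intelligence']
print(pillars_for("s3://bucket/calls/a.mp3"))  # ['redaction']
print(pillars_for("slides.pptx"))              # []
```

A check like this only avoids obviously unsupported uploads. The server remains the authority on what it accepts.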
Entity coverage
The redaction engine detects 27+ entity types across three categories:
- PII — names, emails, SSNs, addresses, and more
- PCI — credit cards, bank accounts, IBAN codes
- PHI — medical licenses, national/religious/political groups
See Entity Types for the full list.
Multi-language support
Redaction runs in English (en), Spanish (es), and several other languages. More languages are planned.
Beta access
Document Intelligence is an opt-in beta while we stabilize quality and pace the rollout. Availability by plan:
- Free — redaction only.
- Starter — Document Intelligence beta available by request.
- Professional / Business — Document Intelligence beta available for approved tenants in the rollout.
Document Intelligence endpoints return HTTP 403 until the relevant tenant feature flag is enabled. Contact support to enable document_parse_api, document_extract_api, or document_safe_parse_workflow for your tenant.
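A client can turn a 403 from a beta endpoint into an actionable message by looking up which flag to request. The sketch below assumes a one-to-one pairing between the three endpoints and the three flags, inferred from their names. This page does not state that pairing explicitly, and the helper name `missing_flag` is hypothetical.

```python
from typing import Optional

# Assumed endpoint-to-flag pairing, inferred from the names on this page.
FLAG_FOR_PATH = {
    "/api/v1/parse": "document_parse_api",
    "/api/v1/extract": "document_extract_api",
    "/api/v1/workflows/safe-parse": "document_safe_parse_workflow",
}


def missing_flag(path: str, status: int) -> Optional[str]:
    """Return the feature flag to request when a beta endpoint answers 403.

    Returns None for any other status, or for paths that are not gated
    (for example the GA redaction endpoint).
    """
    if status != 403:
        return None
    return FLAG_FOR_PATH.get(path)


flag = missing_flag("/api/v1/workflows/safe-parse", 403)
if flag is not None:
    print(f"Ask support to enable '{flag}' for your tenant.")
```

A 403 can have other causes, such as an invalid or under-scoped API key, so treat the returned flag as a hint, not a diagnosis.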