On PII redaction for LLMs, the real cost of self-hosting detection, and the design choices behind the API.
GLiNER is a clever zero-shot NER model. We use it. But generic NER misses structural PII like card numbers, IBANs, and tax IDs, and it doesn't handle the rest of the LLM round-trip.
How to wrap a LangChain chain or agent with redact and rehydrate steps so the model never sees customer data, with code that works for both LCEL and the legacy chain classes.
A walkthrough of the four-step pattern (detect, redact, prompt, rehydrate) that keeps personal data out of OpenAI, Anthropic, and Gemini calls without breaking the user experience.
A practical look at where Microsoft Presidio fits, where it stops, and what you end up building around it when you wire it into an LLM pipeline.