OpenAI Releases Privacy Filter Model for Detecting and Masking Personal Data

April 23, 2026
OpenAI has released Privacy Filter, an open-weight model designed to identify and redact personally identifiable information in text. The model can run locally and supports up to 128,000 tokens of context for efficient privacy workflows.

In a press release, OpenAI announced Privacy Filter, an open-weight model designed to detect and mask personally identifiable information in text. The model aims to help developers implement stronger privacy safeguards in their AI workflows.

Privacy Filter is a compact model capable of context-aware detection of private data in unstructured text. It can run locally, allowing users to mask or redact sensitive information without sending data to external servers. The model supports up to 128,000 tokens of context and labels text spans across eight privacy categories: personal names, addresses, emails, phone numbers, URLs, dates, account numbers, and secrets.
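A local redaction workflow of the kind described typically has two stages: detect labeled spans, then replace them with category placeholders. The sketch below illustrates the shape of such a pipeline; the regex detector is a deliberately simple stand-in for the model (covering only two of the eight categories), and the span format and placeholder style are assumptions, not the actual Privacy Filter API.

```python
import re

# Stand-in detector for illustration only: the real model would emit
# labeled spans from a forward pass; these regexes roughly approximate
# two of the eight privacy categories (emails and phone numbers).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def detect_spans(text):
    """Return (start, end, category) tuples for matched spans."""
    spans = []
    for category, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            spans.append((m.start(), m.end(), category))
    return sorted(spans)

def mask(text, spans):
    """Replace each detected span with a [CATEGORY] placeholder,
    working right-to-left so earlier offsets stay valid."""
    for start, end, category in reversed(spans):
        text = text[:start] + f"[{category}]" + text[end:]
    return text

sample = "Contact jane.doe@example.com or +1 555 123 4567."
print(mask(sample, detect_spans(sample)))
# → Contact [EMAIL] or [PHONE].
```

Because everything runs in-process, no text leaves the machine, which is the privacy property the article highlights.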

The model uses a bidirectional token classification architecture with a constrained decoding process, labeling all tokens in a single pass for faster performance. It achieved an F1 score of 96 percent on the PII-Masking-300k benchmark and 97.43 percent on a corrected version of the dataset. Developers can fine-tune the model for domain-specific use cases and customize masking behavior.
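Single-pass token classification of this kind usually emits one label per token, which a decoder then merges into contiguous spans. A minimal sketch of that decoding step, assuming a conventional BIO labeling scheme (the label names and token format here are illustrative, not the model's actual output):

```python
def bio_to_spans(tokens, labels):
    """Merge per-token BIO labels (from one classification pass)
    into (category, text) spans: 'B-X' opens a span, 'I-X'
    extends a matching open span, anything else closes it."""
    spans, current = [], None
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            if current:
                spans.append(current)
            current = (label[2:], [token])
        elif label.startswith("I-") and current and current[0] == label[2:]:
            current[1].append(token)
        else:
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(cat, " ".join(toks)) for cat, toks in spans]

tokens = ["Email", "jane", "@", "example.com", "today"]
labels = ["O", "B-EMAIL", "I-EMAIL", "I-EMAIL", "O"]
print(bio_to_spans(tokens, labels))
# → [('EMAIL', 'jane @ example.com')]
```

Labeling every token in one forward pass, rather than generating a redacted copy token by token, is what makes this approach fast for long contexts.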

Privacy Filter is available under the Apache 2.0 license on Hugging Face and GitHub, with documentation covering architecture, taxonomy, evaluation, and limitations. OpenAI uses a fine-tuned version of the model internally for privacy-preserving workflows and encourages external feedback for further refinement.
