Recipes gallery #1687

@omri374

Is your feature request related to a problem? Please describe.

Users often struggle to get started with customizing Microsoft Presidio for their specific data types and use cases, which makes it difficult to evaluate and adapt the tool for production scenarios. For example, someone working with chat logs containing financial data, or with clinical notes, may not know how best to configure Presidio for their data. This lack of guidance creates friction and slows down adoption. Furthermore, worked examples would demonstrate the intended way to use Presidio, which is to customize it rather than run it in its vanilla form.

Describe the solution you'd like

We’d like to provide a curated list of “recipes” tailored to common data privacy and de-identification scenarios (e.g., chat conversations with financial data, clinical notes, REST API logs in JSON format). Each recipe would be an end-to-end, reproducible example and would include:

  1. A method for generating synthetic data that mimics the real-world scenario.

  2. Evaluation metrics (e.g., precision, recall, F₂-score, and latency) for different Presidio configurations—such as out-of-the-box, with custom recognizers, or with best-effort tuning.
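As a rough illustration of step 1, a recipe could generate labeled synthetic text by filling templates with fake values. The sketch below is only one possible approach, not an existing Presidio utility: the templates, the fake values, and the helper names `fill_template` and `generate_dataset` are all hypothetical, and the entity labels are simply chosen to resemble Presidio entity names.

```python
import random
from string import Formatter

# Hypothetical templates for a financial chatbot scenario. Each placeholder
# names the entity type whose fake value will be inserted.
TEMPLATES = [
    "Hi, my name is {PERSON} and my card number is {CREDIT_CARD}.",
    "Please email the statement to {EMAIL_ADDRESS}.",
    "{PERSON} asked to update the address on file.",
]

# Hypothetical fake values per entity type (well-known test card numbers,
# reserved example domains).
FAKE_VALUES = {
    "PERSON": ["Dana Levi", "Alex Morgan"],
    "CREDIT_CARD": ["4111 1111 1111 1111", "5555 5555 5555 4444"],
    "EMAIL_ADDRESS": ["dana@example.com", "alex@example.org"],
}


def fill_template(template, rng):
    """Return (text, spans), where spans are (start, end, entity_type)."""
    text = ""
    spans = []
    for literal, field, _spec, _conv in Formatter().parse(template):
        text += literal
        if field is not None:
            value = rng.choice(FAKE_VALUES[field])
            # Record where the fake value lands so it can serve as ground truth.
            spans.append((len(text), len(text) + len(value), field))
            text += value
    return text, spans


def generate_dataset(n, seed=0):
    """Generate n labeled samples; a fixed seed keeps the recipe reproducible."""
    rng = random.Random(seed)
    return [fill_template(rng.choice(TEMPLATES), rng) for _ in range(n)]


if __name__ == "__main__":
    for text, spans in generate_dataset(3, seed=42):
        print(text, spans)
```

Because the spans are recorded while the text is built, every sample comes with exact ground-truth offsets that an evaluation step can compare against detector output.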

Illustrative example:

Presidio's performance on common scenarios

This table benchmarks Microsoft Presidio’s de-identification accuracy and performance across representative data domains and implementation levels, from out-of-the-box use to custom pipelines with transformers or LLMs. Each cell lists precision, recall, F₂ score, and average latency per sample.
Each Notebook link opens a reproducible Jupyter notebook containing the exact configuration, dataset (if available), evaluation logic, and performance metrics used for that experiment.

| Domain / Scenario | 1. Out-of-the-box (spaCy) | 2. Augmented (+ custom recognizers) | 3. Custom Model (own ML/Transformer) | 4. Hybrid “Best-Effort” (ensemble/LLM+gliNER) |
| --- | --- | --- | --- | --- |
| Financial (Chatbot) | P 0.78; R 0.65; F₂ 0.71; Latency 12 ms/sample; Notebook | P 0.85; R 0.78; F₂ 0.81; Latency 18 ms/sample; Notebook | P 0.92; R 0.88; F₂ 0.90; Latency 45 ms/sample; Notebook | P 0.95; R 0.93; F₂ 0.94; Latency 150 ms/sample; Notebook |
| Medical (Clinical Notes) | P 0.70; R 0.60; F₂ 0.65; Latency 15 ms/sample; Notebook | P 0.80; R 0.75; F₂ 0.77; Latency 22 ms/sample; Notebook | P 0.88; R 0.82; F₂ 0.85; Latency 50 ms/sample; Notebook | P 0.93; R 0.90; F₂ 0.91; Latency 160 ms/sample; Notebook |
| Retail (JSON REST) | P 0.82; R 0.70; F₂ 0.75; Latency 10 ms/sample; Notebook | P 0.88; R 0.80; F₂ 0.84; Latency 16 ms/sample; Notebook | P 0.93; R 0.87; F₂ 0.90; Latency 40 ms/sample; Notebook | P 0.96; R 0.94; F₂ 0.95; Latency 140 ms/sample; Notebook |
| Multilingual (e.g., Spanish) | P 0.65; R 0.55; F₂ 0.60; Latency 20 ms/sample; Notebook | P 0.75; R 0.70; F₂ 0.72; Latency 28 ms/sample; Notebook | P 0.85; R 0.80; F₂ 0.82; Latency 55 ms/sample; Notebook | P 0.92; R 0.88; F₂ 0.90; Latency 170 ms/sample; Notebook |

Legend:
  • P = Precision
  • R = Recall
  • F₂ = F₂ score (F-beta score with β = 2, which weights recall more heavily than precision)
  • Latency = Average processing time per sample (milliseconds per record)

This would give users a concrete starting point for customization and performance benchmarking.
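To make the metrics in the legend concrete, here is a minimal sketch of how a recipe notebook might compute them. It rests on several assumptions: scoring uses exact span matches (a real recipe might prefer token-level or partial-overlap matching), `evaluate` and `regex_email_detector` are hypothetical helpers, and the regex detector is only a stand-in for whichever Presidio configuration is being measured.

```python
import re
import time


def f_beta(precision, recall, beta=2.0):
    """F-beta score; beta=2 weights recall more heavily than precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def evaluate(detect, dataset):
    """Score a detector on (text, gold_spans) pairs.

    detect(text) must return an iterable of (start, end, entity_type).
    Assumption: a prediction counts only if it matches a gold span exactly.
    """
    tp = fp = fn = 0
    elapsed = 0.0
    for text, gold in dataset:
        start = time.perf_counter()
        predicted = set(detect(text))
        elapsed += time.perf_counter() - start  # time only the detector call
        gold = set(gold)
        tp += len(predicted & gold)
        fp += len(predicted - gold)
        fn += len(gold - predicted)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f2": f_beta(precision, recall, beta=2.0),
        "latency_ms": 1000 * elapsed / len(dataset) if dataset else 0.0,
    }


# Stand-in detector (not Presidio): finds email-like strings with a simple regex.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def regex_email_detector(text):
    return [(m.start(), m.end(), "EMAIL_ADDRESS") for m in EMAIL_RE.finditer(text)]


dataset = [
    ("Mail dana@example.com today.", [(5, 21, "EMAIL_ADDRESS")]),
    ("Call Dana Levi.", [(5, 14, "PERSON")]),
]
report = evaluate(regex_email_detector, dataset)
print(report)
```

On this toy dataset the stand-in detector finds the email but misses the name, so precision is 1.0, recall is 0.5, and F₂ is about 0.56. As a worked check of the formula, P = 0.78 and R = 0.65 give `f_beta` ≈ 0.67 for β = 2 and ≈ 0.71 for β = 1.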

Describe alternatives you've considered

  1. Expanding the current documentation and samples with more general usage guidance, but this lacks the contextual depth and reproducibility of recipe-based examples.
  2. Providing pre-trained models or templates, but they may not align closely with users' specific domains without example-driven guidance.

Additional context

The goal is to help users bridge the gap between generic documentation and production-ready deployment. These recipes would serve as educational tools and performance baselines for different domains. Ideally, they would live in a dedicated section of the Presidio GitHub repo or documentation site, and we could encourage contributions from the community over time.
