diff --git a/docs/building_custom_docker_images.md b/docs/building_custom_docker_images.md
new file mode 100644
index 000000000..ca056e750
--- /dev/null
+++ b/docs/building_custom_docker_images.md
@@ -0,0 +1,104 @@
+supported_languages:
+  - en
+  - de
+
+# Building Custom Presidio Docker Images
+
+## Overview
+
+This guide explains how to build custom Presidio Docker images that support languages other than English.
+
+*Common Use Cases:*
+
+- Add German, Spanish, or French language support
+- Use a different NLP backend (spaCy vs. Stanza)
+- Optimize for production deployments
+
+---
+
+## Key Files to Modify
+
+When customizing Presidio, you'll work with the configuration files in:
+
+`presidio-analyzer/presidio_analyzer/conf`
+
+### Important Configuration Files
+
+1. *default_recognizers.yaml*
+   - Defines which PII recognizers are enabled or disabled
+   - Specifies the supported languages for each recognizer
+   - Location: `presidio-analyzer/presidio_analyzer/conf/default_recognizers.yaml`
+
+2. *spacy.yaml / stanza.yaml*
+   - Configure which NLP backend to use
+   - Location: `presidio-analyzer/presidio_analyzer/conf/`
+
+### Example: Add German Language Support
+
+1. Open `presidio-analyzer/presidio_analyzer/conf/default_recognizers.yaml`.
+
+2. Find the "Germany recognizers" section (around line 312).
+
+3. Change `enabled: false` to `enabled: true`:
+
+```yaml
+- name: DeTaxIdRecognizer
+  supported_languages:
+    - de
+  type: predefined
+  enabled: true
+
+- name: DePassportRecognizer
+  supported_languages:
+    - de
+  type: predefined
+  enabled: true
+