feat: add Turkish phone number (TR_PHONE_NUMBER) recognizer#2006
Open
mrcuren wants to merge 6 commits into
- Add Turkey (TR) support to generic PhoneRecognizer
- Extend TR_PHONE_NUMBER to support geographic numbers (2/3/4 prefix)
- Implement ITU-T E.164 compliant validation with MNP awareness
- Add Turkish context words for better detection accuracy
- Update tests and documentation for enhanced coverage
- Legal basis: KTK Madde 23, ITU-T E.164 compliance

Addresses SharonHart's feedback on country-specific checks
Generic PhoneRecognizer region changes are out of scope for TR_PHONE_NUMBER. Focus only on the country-specific recognizer.
@SharonHart Following up on your feedback in #1973, this PR adds the country-specific Turkish phone number recognizer with:
Ready for review when you have a chance.
…umber-recognizer-clean
…ig instead of subclass
- Fix PhoneRecognizer._get_recognizer_result to use self.supported_entities[0] instead of hardcoded 'PHONE_NUMBER', making the supported_entity parameter from PR microsoft#2014 fully functional
- Delete TrPhoneNumberRecognizer subclass; TR phone detection now uses PhoneRecognizer(supported_regions=['TR'], supported_entity='TR_PHONE_NUMBER', context=[...]) programmatically per maintainer guidance
- Remove TrPhoneNumberRecognizer from __init__.py, __all__, and default_recognizers.yaml
- Rewrite tests to use PhoneRecognizer with TR config (40 test cases)
- Update CHANGELOG.md and docs/supported_entities.md
Refactored per @SharonHart's guidance. Fixed a bug in PhoneRecognizer._get_recognizer_result: entity_type was hardcoded to PHONE_NUMBER and now uses self.supported_entities[0]. Deleted the TrPhoneNumberRecognizer subclass; TR phone detection now uses PhoneRecognizer(supported_regions=['TR'], supported_entity='TR_PHONE_NUMBER', context=...) programmatically. 8 files changed, +61/-332. All tests pass: 40/40 TR phone tests, 26/26 existing phone tests, 57/57 Turkey recognizer tests. Ruff lint: 0 errors.
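For readers following the thread, the entity-type fix can be illustrated with a minimal, self-contained sketch. This is not Presidio's code: `ToyPhoneRecognizer`, `Result`, and both `build_result*` methods are hypothetical names used only to contrast a hardcoded label with one read from `supported_entities[0]`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    """Hypothetical stand-in for a recognizer result."""
    entity_type: str
    start: int
    end: int


class ToyPhoneRecognizer:
    """Toy model of a recognizer with a configurable entity name (assumption)."""

    def __init__(self, supported_entity: str = "PHONE_NUMBER") -> None:
        self.supported_entities: List[str] = [supported_entity]

    def build_result_hardcoded(self, start: int, end: int) -> Result:
        # Pre-fix behavior: the label ignores the configured entity.
        return Result("PHONE_NUMBER", start, end)

    def build_result(self, start: int, end: int) -> Result:
        # Post-fix behavior: the label follows the configured entity.
        return Result(self.supported_entities[0], start, end)


tr = ToyPhoneRecognizer(supported_entity="TR_PHONE_NUMBER")
print(tr.build_result_hardcoded(0, 17).entity_type)
print(tr.build_result(0, 17).entity_type)
```

With the hardcoded variant, a recognizer configured for TR_PHONE_NUMBER would still emit PHONE_NUMBER results, which is why the supported_entity parameter had no effect before the fix.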
SharonHart
previously approved these changes
May 20, 2026
Adds Turkish phone number recognizer to Presidio Analyzer, following up on the discussion in #1973.
The generic PhoneRecognizer uses python-phonenumbers and can parse Turkish numbers when TR is added to supported_regions. However, a country-specific recognizer provides additional value:
(TR_PHONE_NUMBER vs generic PHONE_NUMBER) for targeted PII detection
Features:
validate_result() and _validate_turkish_number()
Legal basis: Karayolları Trafik Kanunu (KTK) Madde 23, ITU-T E.164.
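As a rough illustration of the kind of shape check a Turkish number validator might perform, here is a standard-library sketch. It is not the PR's `_validate_turkish_number()`; the helper name `normalize_tr_number` is hypothetical, and the rules are assumptions: country code 90, an optional national trunk prefix 0, and a 10-digit national number whose first digit is 2, 3, or 4 (geographic, as listed in the commit message) or 5 (mobile).

```python
import re
from typing import Optional

_NON_DIGITS = re.compile(r"\D")


def normalize_tr_number(raw: str) -> Optional[str]:
    """Return the E.164 form (+90 followed by 10 digits), or None.

    Hypothetical helper: a shape check only, with no carrier or
    number-portability lookup.
    """
    digits = _NON_DIGITS.sub("", raw)

    # Strip the international prefix (0090 or 90) or the trunk prefix (0).
    if digits.startswith("0090"):
        digits = digits[4:]
    elif digits.startswith("90") and len(digits) == 12:
        digits = digits[2:]
    elif digits.startswith("0") and len(digits) == 11:
        digits = digits[1:]

    # Assumed rule: 10 digits, leading 2/3/4 (geographic) or 5 (mobile).
    if len(digits) != 10 or digits[0] not in "2345":
        return None
    return "+90" + digits


for sample in ["+90 532 123 45 67", "0212 555 12 34", "0850 123 45 67"]:
    print(sample, "->", normalize_tr_number(sample))
```

A check like this only filters on format; the generic PhoneRecognizer relies on python-phonenumbers for actual numbering-plan validation.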
Issue reference
Part of #1973
Testing
test_tr_phone_number_recognizer.py with 51 test cases
Checklist
default_recognizers.yaml with enabled: false
__init__.py and __all__