Rule-based address recognizer
I came across this demo and learned that Presidio detects addresses using data from OpenStreetMap, which includes names of places, points of interest, and addresses for specific regions.
Later in the demo, Presidio puts this data in a trie data structure and uses term frequency to find the address.
I looked into the predefined recognizers and the list of supported entities in the documentation, but I wasn’t able to find anything on addresses.
As far as I could tell, address detection is part of the spaCy recognizer, but I wasn’t able to find anything else.
Could you please point out where the rule-based approach for finding addresses is located in the GitHub repo? Or is this something deprecated?
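For context, the kind of trie lookup with a term-frequency filter described above could look roughly like the sketch below. This is only an illustration of the general idea, not Presidio code: the class `GazetteerTrie`, the function `find_places`, the `max_freq` threshold, and the sample place names and frequencies are all hypothetical, and a real gazetteer would be loaded from OpenStreetMap data.

```python
# Sentinel key marking the end of a complete phrase inside the trie.
_END = object()


class GazetteerTrie:
    """Token-level trie holding known place names and address phrases."""

    def __init__(self):
        self.root = {}

    def add(self, phrase):
        node = self.root
        for token in phrase.lower().split():
            node = node.setdefault(token, {})
        node[_END] = True

    def longest_match(self, tokens, start):
        """Return the exclusive end index of the longest phrase at `start`, or None."""
        node, best = self.root, None
        for i in range(start, len(tokens)):
            node = node.get(tokens[i])
            if node is None:
                break
            if _END in node:
                best = i + 1
        return best


def find_places(text, trie, term_freq, max_freq=0.01):
    """Scan `text` left to right and collect gazetteer phrases.

    Single-token matches are dropped when the token is common in ordinary
    text (frequency above `max_freq`), which is the assumed role of term
    frequency here.
    """
    tokens = text.lower().replace(",", " ").split()
    matches, i = [], 0
    while i < len(tokens):
        end = trie.longest_match(tokens, i)
        if end is None:
            i += 1
            continue
        phrase = " ".join(tokens[i:end])
        if end - i > 1 or term_freq.get(phrase, 0.0) <= max_freq:
            matches.append(phrase)
        i = end
    return matches


# Hypothetical gazetteer entries and word frequencies, for illustration only.
trie = GazetteerTrie()
for name in ["Main Street", "Reading", "Baker Street"]:
    trie.add(name)
term_freq = {"reading": 0.05}

sentence = "I was reading a book on Baker Street, near Main Street"
print(find_places(sentence, trie, term_freq))
# → ['baker street', 'main street']
```

The word "reading" is in the gazetteer (it is also a town name) but is skipped because it is frequent in ordinary text, while the multi-token street names are kept.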
Issue Analytics
- State:
- Created 2 years ago
- Comments: 7 (4 by maintainers)
Hi @omri374,
Of course. I’d be happy to collaborate on this 😃
Hi @sdixit0309, there is no official address recognizer in Presidio. I would suggest looking into NER models that were trained on data containing addresses. For example: https://huggingface.co/obi/deid_roberta_i2b2
Here’s an example Presidio recognizer which leverages this specific transformers model: https://huggingface.co/spaces/omri374/presidio/blob/main/transformers_recognizer.py (an adaptation of the transformers sample in Presidio’s samples gallery).
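As a rough illustration of what such a recognizer has to do with the model output, here is a minimal sketch that keeps only location-like spans. It assumes the predictions are shaped like the output of a Hugging Face token-classification pipeline with an aggregation strategy (dicts with `entity_group`, `score`, `start`, `end`). The function `select_location_spans`, the label name `LOC`, the score threshold, and the sample predictions are assumptions made for this example; check the model's own label set before relying on them.

```python
def select_location_spans(text, ner_output, location_labels=("LOC",), min_score=0.5):
    """Return (start, end, surface_text) for predictions with a location label.

    `location_labels` is an assumption; the actual label names depend on
    the model being used.
    """
    spans = []
    for ent in ner_output:
        if ent["entity_group"] in location_labels and ent["score"] >= min_score:
            start, end = ent["start"], ent["end"]
            spans.append((start, end, text[start:end]))
    return spans


def fake_prediction(text, surface, label, score):
    """Build a hand-written prediction dict for the demo (not real model output)."""
    start = text.index(surface)
    return {"entity_group": label, "score": score,
            "start": start, "end": start + len(surface)}


text = "Dr. Smith moved to 221B Baker Street, London."
# Hypothetical predictions, written by hand to show the filtering step.
sample_output = [
    fake_prediction(text, "Smith", "STAFF", 0.97),
    fake_prediction(text, "221B Baker Street", "LOC", 0.98),
    fake_prediction(text, "London", "LOC", 0.31),
]

for start, end, surface in select_location_spans(text, sample_output):
    print(start, end, surface)
```

With the threshold of 0.5 used here, only the street address is kept; the low-scoring "London" prediction and the non-location label are dropped.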