Rule-based address recognizer

I came across this demo and learned that Presidio detects addresses using data from OpenStreetMap, which includes names of places, points of interest, and addresses for specific regions.

Later in the demo, Presidio puts this data in a trie data structure and uses term frequency to find the address.
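To make the idea concrete, here is a minimal sketch of a trie-based gazetteer lookup with a term-frequency filter. It is not Presidio's code: the class and function names, the sample gazetteer, the frequency table, and the `max_freq` threshold are all hypothetical, and the way term frequency is applied (dropping matches made up only of very common words) is an assumption about what the demo might do.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # token -> TrieNode
        self.is_end = False  # True if a gazetteer entry ends at this node


class GazetteerTrie:
    """Token-level trie over known place names and addresses (hypothetical helper)."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, phrase):
        node = self.root
        for token in phrase.lower().split():
            node = node.children.setdefault(token, TrieNode())
        node.is_end = True

    def longest_match(self, tokens, start):
        """Return the end index (exclusive) of the longest entry starting at `start`, or None."""
        node = self.root
        end = None
        for i in range(start, len(tokens)):
            node = node.children.get(tokens[i])
            if node is None:
                break
            if node.is_end:
                end = i + 1
        return end


def find_addresses(text, trie, term_freq, max_freq):
    """Return gazetteer matches whose rarest token is uncommon in general text."""
    tokens = text.lower().split()
    matches = []
    i = 0
    while i < len(tokens):
        end = trie.longest_match(tokens, i)
        if end is not None:
            span = tokens[i:end]
            # Assumed use of term frequency: skip matches built only from common words.
            if min(term_freq.get(t, 0) for t in span) <= max_freq:
                matches.append(" ".join(span))
                i = end
                continue
        i += 1
    return matches


# Hypothetical gazetteer entries (in the demo these would come from OpenStreetMap).
trie = GazetteerTrie()
for entry in ["10 Downing Street", "Baker Street", "Reading"]:
    trie.insert(entry)

# Hypothetical token counts from a background corpus.
term_freq = {"reading": 500, "street": 40, "baker": 5, "downing": 1, "10": 100}

text = "She was reading a letter sent to 10 Downing Street near Baker Street"
print(find_addresses(text, trie, term_freq, max_freq=50))
# → ['10 downing street', 'baker street']
```

The town name "Reading" is in the gazetteer, but its only token is very frequent in ordinary text, so the filter drops it; the two street addresses contain at least one rare token and are kept.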

I looked into the predefined recognizers and the list of supported entities in the documentation, but I wasn’t able to find anything on addresses.

As far as I was able to figure out, address detection is part of the spaCy recognizer, but I wasn’t able to find anything else.

Would you please point out where the rule-based approach for finding addresses is located in the GitHub repo? Or is this something that has been deprecated?

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
fatemerhmi commented, Aug 12, 2021

Hi @omri374,

Of course. I’d be happy to collaborate on this 😃

0 reactions
omri374 commented, Aug 22, 2022

Hi @sdixit0309, there is no official address recognizer in Presidio. I would suggest looking into NER models that were trained on data containing addresses. For example: https://huggingface.co/obi/deid_roberta_i2b2

Here’s an example Presidio recognizer that uses this specific transformers model: https://huggingface.co/spaces/omri374/presidio/blob/main/transformers_recognizer.py (an adaptation of the transformers sample in Presidio’s samples gallery).
