Anonymizing multiple emails with Faker in a text message

Hi,

I’m trying to anonymize emails using Faker. Everything works if my text contains only one email address. If it contains multiple email addresses, the same fake email is used for all of them. Here is the code I’m using:

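from faker import Faker
from faker.providers import internet
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine
# Import path assumed for the Presidio release in use when this issue was filed.
from presidio_anonymizer.entities import AnonymizerConfig

def anonymizeEmail(text_to_anonymize):
    analyzer_results = analyzer.analyze(text=text_to_anonymize, entities=["EMAIL_ADDRESS"], language='en')

    anonymized_results = anonymizer.anonymize(
        text=text_to_anonymize,
        analyzer_results=analyzer_results,
        anonymizers_config={"EMAIL_ADDRESS": AnonymizerConfig("replace", {"new_value": fake.safe_email()})}
    )

    return anonymized_results

fake = Faker('en_US')
fake.add_provider(internet)

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

text = 'The user has the following two emails: email1@gmail.com and email2@gmail.com'
anonymizeEmail(text)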

The output is:

The user has the following two emails: kenneth74@example.net and kenneth74@example.net

How can I change the AnonymizerConfig so that a different fake email address is generated for each one found in the string?

Thanks.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

efung commented, Mar 23, 2021 (4 reactions)

One way you could extend Presidio is by allowing the AnonymizerConfig objects to take lambda functions, not just static strings.

Here’s a diff that implements the idea:

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
modified: presidio-anonymizer/presidio_anonymizer/anonymizers/replace.py
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
@ presidio-anonymizer/presidio_anonymizer/anonymizers/replace.py:18 @ class Replace(Anonymizer):
         new_val = params.get(self.NEW_VALUE)
         if not new_val:
             return f"<{params.get('entity_type')}>"
-        return new_val
+        return new_val(text) if callable(new_val) else new_val

     def validate(self, params: Dict = None) -> None:
         """Validate the new value is string."""
-        validate_type(params.get(self.NEW_VALUE), self.NEW_VALUE, str)
+        new_val = params.get(self.NEW_VALUE)
+        if callable(new_val):
+            return
+        else:
+            validate_type(new_val, self.NEW_VALUE, str)
         pass

     def anonymizer_name(self) -> str:

and here’s how you would call it:

    anonymized_result = anonymizer.anonymize(
        text=text_to_anonymize,
        analyzer_results=analyzer_results,    
        anonymizers_config={
            "EMAIL_ADDRESS": AnonymizerConfig("replace", {"new_value": lambda x: fake.safe_email()})
            }
    )
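
The original call repeats one address because `fake.safe_email()` is evaluated once, when the config dict is built, so the anonymizer only ever receives a single string. Here is a minimal sketch of that difference that runs without Presidio or Faker. `replace_all`, `make_fake_email` and the simplified regex are hypothetical stand-ins for illustration, not part of either library.

```python
import itertools
import re

# Simplified email pattern for illustration only; not Presidio's recognizer.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

_counter = itertools.count(1)


def make_fake_email():
    """Hypothetical stand-in for fake.safe_email(): a new address per call."""
    return f"user{next(_counter)}@example.com"


def replace_all(text, new_value):
    """Replace every email; call new_value per match if it is callable."""
    def _sub(match):
        return new_value(match.group(0)) if callable(new_value) else new_value
    return EMAIL_RE.sub(_sub, text)


text = "The user has the following two emails: email1@gmail.com and email2@gmail.com"

# Eager: make_fake_email() runs once, so both matches share one string.
eager = replace_all(text, make_fake_email())

# Lazy: the lambda runs once per match, so each match gets its own value.
lazy = replace_all(text, lambda original: make_fake_email())
```

The `callable(new_value)` check mirrors the one in the diff above: a plain string still behaves as before, while a callable is invoked once per detected entity.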

navalev commented, Mar 23, 2021 (3 reactions)

@lucazav this was added to our backlog - thanks for suggesting this feature!
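
Until such a feature is available, one possible workaround is to bypass the anonymizer for this entity and substitute each detected span yourself. The sketch below assumes each analyzer result exposes `start` and `end` character offsets, as Presidio's `RecognizerResult` does. The `Span` class, `replace_spans` and `fake_email` are hypothetical stand-ins so the example runs without Presidio or Faker.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Span:
    """Hypothetical stand-in for an analyzer result with character offsets."""
    start: int
    end: int


def replace_spans(text, spans, make_value):
    """Replace each span with a fresh value, working right to left
    so that earlier offsets stay valid as the text changes length."""
    for span in sorted(spans, key=lambda s: s.start, reverse=True):
        text = text[:span.start] + make_value() + text[span.end:]
    return text


_ids = count(1)


def fake_email():
    # Stand-in for fake.safe_email(); returns a new address on every call.
    return f"user{next(_ids)}@example.com"


text = "Contact a@x.org or b@y.org"
spans = [Span(8, 15), Span(19, 26)]  # offsets of the two addresses in text
result = replace_spans(text, spans, fake_email)
```

With the real libraries, `spans` would presumably be the list returned by `analyzer.analyze(...)` and `make_value` would be `fake.safe_email`. Processing spans from the end of the string backwards avoids having to recompute offsets after each replacement.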
