Scala Common Enrich: support "pii" annotations in schemas for PII Enrichment
PII = Personally Identifiable Information
The basic idea:
- Any JSON Schema (unstructured event or context) can be annotated with "pii": true on a per-property basis
- If this PII Scrubber is turned on, then we encrypt any given PII property in any JSON, using AES, so you end up with a unique but non-PII value, e.g. “Fred Blundun” always -> “1de6e53cb23”
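To make the idea concrete, here is a minimal sketch of such a scrubber, written in Java against JDK APIs that are also callable from Scala. It is not the Scala Common Enrich implementation. The class name `PiiScrubber`, its methods, the sample key and the flat map representation of the schema's properties are all hypothetical. AES in ECB mode is an assumption, picked only because it maps the same input to the same output as described above.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: names and structure are illustrative only.
final class PiiScrubber {
    private final SecretKeySpec key;

    // Assumption: a 16-byte key (AES-128); key management is out of scope here.
    PiiScrubber(byte[] rawKey) {
        this.key = new SecretKeySpec(rawKey, "AES");
    }

    // Encrypts the value with AES and hex-encodes the ciphertext.
    String pseudonymize(String value) {
        try {
            // Assumption: ECB mode, so equal plaintexts give equal ciphertexts.
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key);
            byte[] encrypted = cipher.doFinal(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : encrypted) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES encryption failed", e);
        }
    }

    // Returns a copy of data where each property whose schema says "pii": true is replaced.
    Map<String, Object> scrub(Map<String, Map<String, Object>> schemaProperties,
                              Map<String, Object> data) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : data.entrySet()) {
            Map<String, Object> propertySchema = schemaProperties.get(field.getKey());
            boolean isPii = propertySchema != null
                && Boolean.TRUE.equals(propertySchema.get("pii"));
            Object value = field.getValue();
            out.put(field.getKey(),
                isPii && value != null ? pseudonymize(value.toString()) : value);
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for the "properties" section of an annotated JSON Schema.
        Map<String, Map<String, Object>> properties = new LinkedHashMap<>();
        properties.put("email_recipient", Map.of("type", "string", "pii", true));
        properties.put("subject", Map.of("type", "string"));

        Map<String, Object> event = new LinkedHashMap<>();
        event.put("email_recipient", "fred@example.com");
        event.put("subject", "Welcome");

        byte[] demoKey = "0123456789abcdef".getBytes(StandardCharsets.UTF_8);
        System.out.println(new PiiScrubber(demoKey).scrub(properties, event));
    }
}
```

Only `email_recipient` is rewritten; `subject` passes through unchanged. ECB is generally discouraged because it reveals when two inputs are equal, but that equality is exactly the property wanted here, so a real implementation would need to weigh that trade-off.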
This would be of potential interest to users in healthcare or finance, where the ability for analysts to drill down to individual users could be a privacy concern.
Issue Analytics
- Created 9 years ago
- Comments: 15 (15 by maintainers)
Top GitHub Comments
One of the nice things about this idea is that the "pii": true hint would be enough for Iglu, when generating tables for Redshift etc., to make sure these columns are wide enough to take the hashed value.
It also means that the work of identifying that, e.g., the email_recipient property of com.acme.email/send_email is PII is done in one place (at the time of schema authorship), rather than every user having to configure their own PII Enrichment.
Migrated to https://github.com/snowplow/enrich/issues/212