Scala Common Enrich: encrypt original values in PII Enrichment

The motivation for this ticket is to help users of piinguin and piinguin relay protect the original data stored in piinguin, without having to concentrate on locking down access to piinguin itself within an organisation.

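The way to achieve that is to configure one or more public keys with which the original values are encrypted. Each field names the key to use through "encryptionKeyName"; a field that names no key is not encrypted. The new configuration will look like this: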

{
  "schema": "iglu:com.snowplowanalytics.snowplow.enrichments/pii_enrichment_config/jsonschema/3-0-0",
  "data": {
    "vendor": "com.snowplowanalytics.snowplow.enrichments",
    "name": "pii_enrichment_config",
    "emitEvent": true,
    "enabled": true,
    "parameters": {
      "pii": [
        {
          "pojo": {
            "field": "user_id",
            "encryptionKeyName": "other-key"
          }
        },
        {
          "pojo": {
            "field": "user_fingerprint"
          }
        },
        {
          "json": {
            "field": "unstruct_event",
            "schemaCriterion": "iglu:com.mailchimp/subscribe/jsonschema/1-*-*",
            "jsonPath": "$.data.['email', 'ip_opt']",
            "encryptionKeyName": "email-key"
          }
        }
      ],
      "strategy": {
        "pseudonymize": {
          "hashFunction": "SHA-1",
          "salt": "pepper123"
        }
      },
      "encryption": [
        {
          "keyName": "email-key",
          "key": "some-rsa-publickey"
        },
        {
          "keyName": "other-key",
          "key": "some-rsa-publickey-2"
        }
      ]
    }
  }
}
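
Since every "encryptionKeyName" has to resolve to an entry in the "encryption" list, a loader would presumably want to reject a configuration that references an undefined key. A minimal sketch of that check follows, using only the JDK; the class and method names (PiiKeyCheck, missingKeyNames) are hypothetical and not part of Scala Common Enrich, and the key strings are the placeholders from the example above.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of Scala Common Enrich.
final class PiiKeyCheck {

    // Returns the key names that fields reference but the "encryption" list does not define.
    static SortedSet<String> missingKeyNames(Collection<String> referencedKeyNames,
                                             Map<String, String> encryptionKeys) {
        SortedSet<String> missing = new TreeSet<>(referencedKeyNames);
        missing.removeAll(encryptionKeys.keySet());
        return missing;
    }

    public static void main(String[] args) {
        // keyName -> key, mirroring the "encryption" array in the example configuration.
        Map<String, String> encryptionKeys = Map.of(
                "email-key", "some-rsa-publickey",
                "other-key", "some-rsa-publickey-2");

        // Key names taken from the fields that set "encryptionKeyName".
        // user_fingerprint sets none, so it contributes nothing here.
        List<String> referenced = List.of("other-key", "email-key");

        SortedSet<String> missing = missingKeyNames(referenced, encryptionKeys);
        if (missing.isEmpty()) {
            System.out.println("All referenced keys are defined");
        } else {
            System.out.println("Undefined keys: " + missing);
        }
    }
}
```

Whether an undefined key should fail the whole enrichment or only skip encryption for that field is a design decision the issue does not settle.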

The emitted event will change as well: each original value is encrypted and then Base64-encoded. The exact implementation still needs to be finalised:

{
  "schema": "iglu:com.snowplowanalytics.snowplow/unstruct_event/jsonschema/1-0-0",
  "data": {
    "schema": "iglu:com.snowplowanalytics.snowplow/pii_transformation/jsonschema/2-0-0",
    "data": {
      "pii": {
        "pojo": [
          {
            "fieldName": "user_fingerprint",
            "originalValue": "its_you_again!",
            "modifiedValue": "27abac60dff12792c6088b8d00ce7f25c86b396b8c3740480cd18e21068ecff4"
          },
          {
            "fieldName": "user_ipaddress",
            "originalValue": "eZDx1Y1SMIcP0vIzkNsx3xMZ4twdyqqU5bqNPkLNYElDNcUhD/8NH0Xb8vYPLvy5NZmm5XuMzInQ7xRHr4kB9q4kvRwtCwUGSS4OSR/QlPQWMz6NzMAep7oQ10crpdxQcXH5LxvMTMROndxOnV5Aglepd4zuSMRj+q3u9uH6zZmiMjS/1xcxC4dRdD3NtrR9IpNjaqkx9BrQ2S1ClsVntU/UGLZEAle5H+Uy+qvXYczbQsmVVwYLdgv4S4Om0QPW+T48pu2VGXVwNnJUwdAFqL+snAFrOfyGa1oDcwoTGcbhR3YJO2Gv7NzvMyDtPaNLaYgrzDJcDV1qLt1W12h2Bg==",
            "modifiedValue": "dd9720903c89ae891ed5c74bb7a9f2f90f6487927ac99afe73b096ad0287f3f5",
            "encryptionKeyName": "other-key"
          },
          {
            "fieldName": "user_id",
            "originalValue": "eZDx1Y1SMIcP0vIzkNsx3xMZ4twdyqqU5bqNPkLNYElDNcUhD/8NH0Xb8vYPLvy5NZmm5XuMzInQ7xRHr4kB9q4kvRwtCwUGSS4OSR/QlPQWMz6NzMAep7oQ10crpdxQcXH5LxvMTMROndxOnV5Aglepd4zuSMRj+q3u9uH6zZmiMjS/1xcxC4dRdD3NtrR9IpNjaqkx9BrQ2S1ClsVntU/UGLZEAle5H+Uy+qvXYczbQsmVVwYLdgv4S4Om0QPW+T48pu2VGXVwNnJUwdAFqL+snAFrOfyGa1oDcwoTGcbhR3YJO2Gv7NzvMyDtPaNLaYgrzDJcDV1qLt1W12h2Bg==",
            "modifiedValue": "7d8a4beae5bc9d314600667d2f410918f9af265017a6ade99f60a9c8f3aac6e9",
            "encryptionKeyName": "other-key"
          }
        ],
        "json": [
          {
            "fieldName": "unstruct_event",
            "originalValue": "eZDx1Y1SMIcP0vIzkNsx3xMZ4twdyqqU5bqNPkLNYElDNcUhD/8NH0Xb8vYPLvy5NZmm5XuMzInQ7xRHr4kB9q4kvRwtCwUGSS4OSR/QlPQWMz6NzMAep7oQ10crpdxQcXH5LxvMTMROndxOnV5Aglepd4zuSMRj+q3u9uH6zZmiMjS/1xcxC4dRdD3NtrR9IpNjaqkx9BrQ2S1ClsVntU/UGLZEAle5H+Uy+qvXYczbQsmVVwYLdgv4S4Om0QPW+T48pu2VGXVwNnJUwdAFqL+snAFrOfyGa1oDcwoTGcbhR3YJO2Gv7NzvMyDtPaNLaYgrzDJcDV1qLt1W12h2Bg==",
            "modifiedValue": "269c433d0cc00395e3bc5fe7f06c5ad822096a38bec2d8a005367b52c0dfb428",
            "jsonPath": "$.ip",
            "schema": "iglu:com.mailgun/message_clicked/jsonschema/1-0-0",
            "encryptionKeyName": "email-key"
          },
          {
            "fieldName": "contexts",
            "originalValue": "eZDx1Y1SMIcP0vIzkNsx3xMZ4twdyqqU5bqNPkLNYElDNcUhD/8NH0Xb8vYPLvy5NZmm5XuMzInQ7xRHr4kB9q4kvRwtCwUGSS4OSR/QlPQWMz6NzMAep7oQ10crpdxQcXH5LxvMTMROndxOnV5Aglepd4zuSMRj+q3u9uH6zZmiMjS/1xcxC4dRdD3NtrR9IpNjaqkx9BrQ2S1ClsVntU/UGLZEAle5H+Uy+qvXYczbQsmVVwYLdgv4S4Om0QPW+T48pu2VGXVwNnJUwdAFqL+snAFrOfyGa1oDcwoTGcbhR3YJO2Gv7NzvMyDtPaNLaYgrzDJcDV1qLt1W12h2Bg==",
            "modifiedValue": "1c6660411341411d5431669699149283d10e070224be4339d52bbc4b007e78c5",
            "jsonPath": "$.data.emailAddress2",
            "schema": "iglu:com.acme/email_sent/jsonschema/1-1-0",
            "encryptionKeyName": "email-key"
          },
          {
            "fieldName": "contexts",
            "originalValue": "eZDx1Y1SMIcP0vIzkNsx3xMZ4twdyqqU5bqNPkLNYElDNcUhD/8NH0Xb8vYPLvy5NZmm5XuMzInQ7xRHr4kB9q4kvRwtCwUGSS4OSR/QlPQWMz6NzMAep7oQ10crpdxQcXH5LxvMTMROndxOnV5Aglepd4zuSMRj+q3u9uH6zZmiMjS/1xcxC4dRdD3NtrR9IpNjaqkx9BrQ2S1ClsVntU/UGLZEAle5H+Uy+qvXYczbQsmVVwYLdgv4S4Om0QPW+T48pu2VGXVwNnJUwdAFqL+snAFrOfyGa1oDcwoTGcbhR3YJO2Gv7NzvMyDtPaNLaYgrzDJcDV1qLt1W12h2Bg==",
            "modifiedValue": "72f323d5359eabefc69836369e4cabc6257c43ab6419b05dfb2211d0e44284c6",
            "jsonPath": "$.emailAddress",
            "schema": "iglu:com.acme/email_sent/jsonschema/1-0-0",
            "encryptionKeyName": "email-key"
          }
        ]
      },
      "strategy": {
        "pseudonymize": {
          "hashFunction": "SHA-256"
        }
      }
    }
  }
}
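
The issue leaves the encryption scheme open, so the following is only a sketch of how an "originalValue" could be produced: RSA with OAEP padding from the JDK, followed by Base64 encoding. The transformation string, the 2048-bit key generated in-process, and the names PiiEncryptSketch, encryptOriginal and decryptOriginal are all assumptions for illustration; the real enrichment would load the configured public key instead.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.Base64;
import javax.crypto.Cipher;

// Hypothetical sketch, not the enrichment's actual implementation.
final class PiiEncryptSketch {

    // Assumption: RSA-OAEP. The issue does not specify a padding scheme.
    static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    // Encrypts the original value with the public key and Base64-encodes the ciphertext.
    static String encryptOriginal(String value, PublicKey key) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] ciphertext = cipher.doFinal(value.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(ciphertext);
    }

    // Reverses encryptOriginal; only the holder of the private key can do this.
    static String decryptOriginal(String encoded, PrivateKey key) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, key);
        byte[] plaintext = cipher.doFinal(Base64.getDecoder().decode(encoded));
        return new String(plaintext, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        // Stand-in for a configured key such as "other-key".
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair pair = generator.generateKeyPair();

        String originalValue = encryptOriginal("its_you_again!", pair.getPublic());
        System.out.println(originalValue);
        System.out.println(decryptOriginal(originalValue, pair.getPrivate()));
    }
}
```

Two properties of this particular choice are worth noting. OAEP is randomised, so encrypting the same value twice yields different ciphertexts, and the hashed "modifiedValue" remains the only stable identifier. With a 2048-bit key and SHA-256, RSA-OAEP can also encrypt at most 190 bytes directly, so longer values would need a hybrid scheme.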

An incidental benefit is that the original values in the Kinesis PII stream are encrypted as well.

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

knservis commented, Jun 4, 2018 (1 reaction)

That is a possible alternative. One possible motivation to do it in the enrichment is to also help secure the pii stream, however this is not a strong reason. It may be better in piinguin.

chuwy commented, Jun 19, 2020 (0 reactions)
