Add filtering support for metadata payload

In Kibana’s APM instrumentation, we are now capturing stats from developers’ local environments. One issue is that we cannot filter out some PII, such as the developer’s username, which appears in the paths contained in the process.argv array of the metadata that is sent along with APM traces.

Currently, we’re including a notice that some personal data may be captured, with an option for disabling this. Ideally, we’d be able to filter this out, similar to how we can filter out transaction data.
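
For comparison, the existing transaction-side filtering works with plain functions that receive a payload, redact what should not leave the process, and return it. The following is a minimal sketch in the style of the agent docs' filter example. The header name `x-secret` and the function name `redactSecretHeader` are illustrative assumptions, and the registration call appears only in a comment so the snippet runs without the agent installed.

```javascript
// Sketch of a payload filter in the shape the agent's existing filters use:
// take the payload, redact sensitive values, return the payload.
// The header name 'x-secret' is purely illustrative.
function redactSecretHeader (payload) {
  const headers = payload.context &&
    payload.context.request &&
    payload.context.request.headers
  if (headers && headers['x-secret']) {
    headers['x-secret'] = '[REDACTED]'
  }
  return payload
}

// With the agent started, registration would look roughly like this:
//   const apm = require('elastic-apm-node').start()
//   apm.addTransactionFilter(redactSecretHeader)

// Quick local check without the agent:
const sampleTransaction = {
  context: { request: { headers: { 'x-secret': 'hunter2', accept: '*/*' } } }
}
console.log(JSON.stringify(redactSecretHeader(sampleTransaction)))
```

Note that this filter only looks at `payload.context`, which the metadata object does not have. That is why the request here is for a filter hook that sees the metadata payload itself.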

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

1 reaction
trentm commented, Feb 25, 2021

@tylersmalley Thanks! You are right. process.title is the other one that can hold the full path. I’m sold on the use case, then. I’ll get a PR moving to add a metadata filter.
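
A filter covering both fields mentioned in this thread, `process.argv` and `process.title`, could look like the sketch below. It assumes the metadata filter ends up with the same "receive the payload, return the payload" shape as the agent's other filters. `makeUsernameRedactor` is a hypothetical helper, not part of the agent, and the registration call is left as a comment.

```javascript
// Hypothetical helper: build a filter that replaces a username wherever it
// appears in process.argv and process.title of a metadata payload.
function makeUsernameRedactor (username) {
  // Without a username there is nothing to redact. An empty pattern would
  // otherwise match between every character.
  if (!username) return (payload) => payload

  // Escape regex metacharacters so a name like "j.doe" matches literally.
  const escaped = username.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
  const pattern = new RegExp(escaped, 'g')
  const redact = (value) => value.replace(pattern, '[REDACTED]')

  return function redactUsername (payload) {
    const proc = payload.process
    if (proc) {
      if (Array.isArray(proc.argv)) {
        proc.argv = proc.argv.map((arg) =>
          typeof arg === 'string' ? redact(arg) : arg)
      }
      if (typeof proc.title === 'string') {
        proc.title = redact(proc.title)
      }
    }
    return payload
  }
}

// Assumed registration, once a metadata filter hook exists:
//   apm.addMetadataFilter(makeUsernameRedactor(process.env.USER))

// Local check with a made-up username:
const demoFilter = makeUsernameRedactor('alice')
console.log(JSON.stringify(demoFilter({
  process: {
    title: '/Users/alice/bin/node',
    argv: ['/Users/alice/bin/node', '/Users/alice/app.js']
  }
})))
```

Matching on the bare username can also hit unrelated substrings (a user named `node` would redact every `node` in the path), so matching on the home directory prefix instead may be the safer choice in practice.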

1 reaction
trentm commented, Dec 15, 2020

Some quick notes and a first hack:

  1. patch to apm-nodejs-http-client.git
diff --git a/index.js b/index.js
index c27fb71..2bb3a0d 100644
--- a/index.js
+++ b/index.js
@@ -495,7 +495,9 @@ function onStream (client, onerror) {

     // All requests to the APM Server must start with a metadata object
     if (!client._encodedMetadata) {
-      client._encodedMetadata = client._encode({ metadata: client._conf.metadata }, Client.encoding.METADATA)
+      // XXX HACK: This should be done *once*, but as late as possible.
+      var filteredMetadata = client._conf.metadataFilters.process(client._conf.metadata)
+      client._encodedMetadata = client._encode({ metadata: filteredMetadata }, Client.encoding.METADATA)
     }
     stream.write(client._encodedMetadata)
   }
  2. patch to apm-agent-nodejs.git
diff --git a/lib/agent.js b/lib/agent.js
index cd75056..56038c1 100644
--- a/lib/agent.js
+++ b/lib/agent.js
@@ -42,6 +42,7 @@ function Agent () {
   this._errorFilters = new Filters()
   this._transactionFilters = new Filters()
   this._spanFilters = new Filters()
+  this._metadataFilters = new Filters()
   this._transport = null

   this.lambda = lambda(this)
@@ -221,6 +224,12 @@ Agent.prototype.addFilter = function (fn) {
   this.addErrorFilter(fn)
   this.addTransactionFilter(fn)
   this.addSpanFilter(fn)
+  // XXX Decide if this would be a breaking change. For example the *default
+  // filter example in the docs* will break because it assumes
+  // `payload.context`. However that should *already* break for filtering
+  // error payloads.
+  // https://www.elastic.co/guide/en/apm/agent/nodejs/current/agent-api.html#apm-add-filter
+  this.addMetadataFilter(fn)
 }

 Agent.prototype.addErrorFilter = function (fn) {
@@ -250,6 +259,15 @@ Agent.prototype.addSpanFilter = function (fn) {
   this._spanFilters.push(fn)
 }

+Agent.prototype.addMetadataFilter = function (fn) {
+  if (typeof fn !== 'function') {
+    this.logger.error('Can\'t add filter of type %s', typeof fn)
+    return
+  }
+
+  this._metadataFilters.push(fn)
+}
+
 Agent.prototype.captureError = function (err, opts, cb) {
   if (typeof opts === 'function') return this.captureError(err, null, opts)

diff --git a/lib/config.js b/lib/config.js
index 2702181..e69e6f5 100644
--- a/lib/config.js
+++ b/lib/config.js
@@ -253,6 +253,7 @@ class Config {
           globalLabels: maybePairsToObject(conf.globalLabels),
           hostname: conf.hostname,
           environment: conf.environment,
+          metadataFilters: agent._metadataFilters,

           // Sanitize conf
           truncateKeywordsAt: config.INTAKE_STRING_MAX_SIZE,
  3. which allows me to write a (poor) filter for roughly your case, like this:
apm.addMetadataFilter(function myFilt(payload) {
  if (payload.process && payload.process.argv) {
    const user = new RegExp(process.env.USER, 'g')
    payload.process.argv = payload.process.argv.map((arg) => {
      return arg.replace(user, '[REDACTED]')
    })
  }
  return payload
})

and then observe that filtering in traffic to apm-server:

[2020-12-15T22:33:03.579Z]  INFO: mockapmserver/26715 on pink.local: request (req.remoteAddress=::ffff:127.0.0.1, req.remotePort=60675)
    POST /intake/v2/events HTTP/1.1
    accept: application/json
    user-agent: elasticapm-node/3.9.0 elastic-apm-http-client/9.4.2 node/10.23.0
    content-type: application/x-ndjson
    content-encoding: gzip
    host: localhost:8200
    connection: keep-alive
    transfer-encoding: chunked

    {"metadata":{"service":{"name":"esapp","environment":"development","runtime":{"name":"node","version":"10.23.0"},"language":{"name":"javascript"},"agent":{"name":"nodejs","version":"3.9.0"},"framework":{"name":"express","version":"4.17.1"},"version":"1.0.1"},"process":{"pid":28461,"ppid":27276,"title":"node","argv":["/Users/[REDACTED]/.nvm/versions/node/v10.23.0/bin/node","/Users/[REDACTED]/tm/play/esapp.js"]},"system":{"hostname":"pink.local","architecture":"x64","platform":"darwin"}}}
    {"span":{"name":...
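
The comment in the first patch notes that the filtering should happen once, but as late as possible. One way to get that behaviour is to wrap the "filter, then encode" step in a small memoizing function, sketched below. `createLazyMetadataEncoder` is a hypothetical name, `JSON.stringify` stands in for the client's own encoder, and throwing when a filter returns nothing is an assumption of this sketch, not what the agent necessarily does.

```javascript
// Hypothetical helper: run the filters over the metadata and encode the
// result the first time it is needed, then reuse the encoded line.
function createLazyMetadataEncoder (metadata, filters) {
  let encoded = null

  return function getEncodedMetadata () {
    if (encoded === null) {
      let filtered = metadata
      for (const fn of filters) {
        filtered = fn(filtered)
        // Assumption: every intake request needs a metadata object, so a
        // filter that drops it is treated as an error in this sketch.
        if (!filtered) {
          throw new Error('metadata filter returned no payload')
        }
      }
      // Stand-in for the client's encoder: one NDJSON line.
      encoded = JSON.stringify({ metadata: filtered }) + '\n'
    }
    return encoded
  }
}

// Local check: the filter runs on the first call only.
let demoCalls = 0
const getMetadataLine = createLazyMetadataEncoder(
  { process: { title: 'node', argv: ['/Users/alice/app.js'] } },
  [(payload) => { demoCalls++; return payload }]
)
getMetadataLine()
getMetadataLine()
console.log('filter invocations:', demoCalls)
```

Because the work is deferred until the first request is written, filters registered after the agent starts but before the first intake request would still take effect.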
