External requests are not always tracked by APM in Lambdas

See original GitHub issue

Describe the bug

I built a test application based on the recommendations in the APM documentation. This is the test code:

const apm = require("elastic-apm-node");
const fetch = require("node-fetch");

const elasticAPMURL = "<apm address>";
const elasticAPMToken = "<apm token>";

const logApm = (...args) => {
  console.log({ type: "apm", args, stack: new Error().stack });
};

const loggerApm = {
  fatal: logApm,
  error: logApm,
  warn: logApm,
  info: logApm,
  debug: logApm,
  trace: logApm,
};

const elasticApm = apm.start({
  serviceName: "testing-apm",
  environment: "test-lambda-just-request",
  serverUrl: elasticAPMURL,
  secretToken: elasticAPMToken,
  usePathAsTransactionName: true,
  logger: loggerApm,
});

exports.handler = elasticApm.lambda(async (event) => {
  const jokeResponse = await fetch(
    "https://corporatebs-generator.sameerkumar.website/"
  );
  const jokeJson = await jokeResponse.json();

  const response = {
    statusCode: 200,
    headers: {
      "Content-Type": "application/json",
    },
    body: JSON.stringify(jokeJson),
  };

  return response;
});

After multiple executions of this Lambda via its Function URL, Kibana shows the following screen, where I'd expect to see the external HTTP request: [image]

Moreover, the APM server logs show that the server receives an invalid request to the endpoint /intake/v2/events?flushed=true, with an empty body and no Content-Type:

{"log.level":"error","@timestamp":"2022-06-01T11:41:07.155Z","log.logger":"request","log.origin":{"file.name":"middleware/log_middleware.go","file.line":60},"message":"data validation error","service.name":"apm-server","url.original":"/intake/v2/events?flushed=true","http.request.method":"POST","user_agent.original":"apm-agent-nodejs/3.34.0 (testing-apm)","source.address":"3.144.139.235","http.request.body.bytes":0,"http.request.id":"f354756b-c1b8-4aae-b1fe-889b0061c60a","event.duration":104819,"http.response.status_code":400,"error.message":"invalid content type: ''","ecs.version":"1.6.0"}

Interestingly, the situation improves considerably if I add a custom span. The following code is added to the handler:

exports.handler = elasticApm.lambda(async (event) => {
  // Helper that resolves after `ms` milliseconds.
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

  // Custom span around an artificial one-second wait.
  const span = elasticApm.startSpan("Just waiting for happiness!");
  await sleep(1000);
  span.end();

  // ... the rest of the code from the previous example
});

After that, I can see the added span, but there is still no external request, and I still see 400 responses in the APM server logs: [image]
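Since the manually created span does show up, one possible stopgap is to wrap the external call in a manual span as well. The following is only a sketch under assumptions: `withSpan` is a hypothetical helper that is not part of elastic-apm-node, and it relies only on the agent's `startSpan(name, type)` and `span.end()` calls. It would not fix the missing automatic instrumentation or the 400 responses; it would only make the time spent on the call visible.

```javascript
// Hypothetical helper (not part of elastic-apm-node): runs `fn` inside a
// manually created span and always ends the span, even if `fn` throws.
async function withSpan(agent, name, type, fn) {
  // startSpan returns null when there is no active transaction,
  // so the span has to be checked before it is ended.
  const span = agent.startSpan(name, type);
  try {
    return await fn();
  } finally {
    if (span) span.end();
  }
}

// Possible use inside the handler from the report (the span name and
// type are illustrative):
//
//   const jokeResponse = await withSpan(
//     elasticApm,
//     "GET corporatebs-generator",
//     "external",
//     () => fetch("https://corporatebs-generator.sameerkumar.website/")
//   );
```

Passing the agent in as a parameter keeps the helper independent of how the agent was started, and makes it easy to exercise with a stub.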

The next step for me was to add some external calls to AWS:

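// Assumed setup for `s3`; it is not shown in the original report.
const AWS = require("aws-sdk");
const s3 = new AWS.S3();

exports.handler = elasticApm.lambda(async (event) => {
  const listBuckets = await s3.listBuckets().promise();
  // ... the rest of the code from the previous example
});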

And after that, some magic happens: [image]

However, I can still see the 400 error in the APM server logs.

To Reproduce

Steps to reproduce the behavior:

  1. Use the code provided above.

  2. Create the Lambda, enable its Function URL, and upload the code provided above with dependencies based on the package-lock.json provided below.

  3. Execute the Lambda by opening the Function URL from the Lambda interface: [image]

  4. Open Kibana, go to the APM section, and open the service “test-lambda-just-request”.

  5. Note the absence of the external HTTP request in the “Timeline” tab.

Expected behavior

The external HTTP request is displayed in the Timeline tab, so that we can see how much time is spent waiting for the external service.

Environment (please complete the following information)

  • OS: APM server in Amazon AMI, APM executed in Lambda environment
  • Node.js version: 14.x
  • APM Server version: 7.17.4
  • Agent version: 3.34.0

How are you starting the agent? (please tick one of the boxes)

  • Calling agent.start() directly (e.g. require('elastic-apm-node').start(...))
  • Requiring elastic-apm-node/start from within the source code
  • Starting node with -r elastic-apm-node/start

Additional context

  • package.json dependencies:

        {
          "name": "test-apm",
          "version": "1.0.0",
          "description": "",
          "main": "index.js",
          "dependencies": {
            "aws-sdk": "^2.1146.0",
            "elastic-apm-node": "^3.34.0",
            "node-fetch": "^2.6.7"
          },
          "devDependencies": {},
          "scripts": {
            "test": "echo \"Error: no test specified\" && exit 1"
          },
          "author": "",
          "license": "ISC"
        }
    

Issue Analytics

  • State:closed
  • Created a year ago
  • Comments:5 (3 by maintainers)

Top GitHub Comments

astorm commented, Jun 2, 2022 (1 reaction)

Thanks for the background info @ValeriyMaslenikov – it’s always interesting to hear how folks are using/deploying the tech. 😃

ValeriyMaslenikov commented, Jun 2, 2022 (1 reaction)

Could you talk a bit more about this?

@astorm, sure! Since AWS Lambdas are region-specific and run in one particular region (us-east-2 in our case), we host our Elastic Stack in the same region. That gives us low network latency between Lambda and the Elastic Stack.

Because of that, we expected that we could afford the insignificant time spent sending the metrics before the user receives the response. I’d say we could set a timeout, after which we stop waiting and just return the result to the end user. Moreover, one way this could be implemented on Lambda is to not wait for the Elastic Stack to process the data and return an HTTP status code: it should be enough to “fire and forget”. That would leave us without logs when errors occur, but it would perform better.

To me, this is pretty similar to how Lambda Extensions work: they give us a low-latency way (even lower in the case of Lambda Extensions, because it’s really the same process 😃) to transfer the data to an Elastic APM server in the same region.

And I’m not saying anything against Lambda Extensions; we’d love to use them if we find a proper solution for the logs.
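The "wait up to a timeout, then return" idea from the comment above could be sketched roughly as follows. This is an illustration under assumptions, not how the agent works: `sendWithDeadline` and the `send` callback are hypothetical names, and `send` stands in for whatever promise-returning call ships the telemetry.

```javascript
// Hypothetical helper: waits for `send()` for at most `ms` milliseconds.
// Resolves to "sent", "failed", or "timed-out" and never rejects, so a
// telemetry problem cannot fail the user-facing response.
async function sendWithDeadline(send, ms) {
  let timer;
  const deadline = new Promise((resolve) => {
    timer = setTimeout(() => resolve("timed-out"), ms);
  });
  // Promise.resolve().then(send) also catches synchronous throws from `send`.
  const attempt = Promise.resolve()
    .then(send)
    .then(() => "sent", () => "failed");
  try {
    return await Promise.race([attempt, deadline]);
  } finally {
    // Clear the timer so it does not keep the event loop busy.
    clearTimeout(timer);
  }
}
```

A handler would build its response first, `await sendWithDeadline(send, 200)` (the 200 ms budget is arbitrary), and then return. Note that Lambda may freeze the execution environment once the handler returns, so anything still in flight when the deadline hits can be delayed or lost, which matches the trade-off described in the comment.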

