External requests are not always tracked by APM in Lambdas
Describe the bug
I built a test application following the recommendations in the APM documentation. Here is the test code:
const apm = require("elastic-apm-node");
const fetch = require("node-fetch");

const elasticAPMURL = "<apm address>";
const elasticAPMToken = "<apm token>";

// Route every agent log level to console.log, including the call stack.
const logApm = (...args) => {
  console.log({ type: "apm", args, stack: new Error().stack });
};
const loggerApm = {
  fatal: logApm,
  error: logApm,
  warn: logApm,
  info: logApm,
  debug: logApm,
  trace: logApm,
};

const elasticApm = apm.start({
  serviceName: "testing-apm",
  environment: "test-lambda-just-request",
  serverUrl: elasticAPMURL,
  secretToken: elasticAPMToken,
  usePathAsTransactionName: true,
  logger: loggerApm,
});

exports.handler = elasticApm.lambda(async (event) => {
  // The external HTTP request that is expected to show up as a span.
  const jokeResponse = await fetch(
    "https://corporatebs-generator.sameerkumar.website/"
  );
  const jokeJson = await jokeResponse.json();
  const response = {
    statusCode: 200,
    headers: {
      "Content-Type": "application/json",
    },
    body: JSON.stringify(jokeJson),
  };
  return response;
});
After several executions of this Lambda via its Function URL, I see the following screen in Kibana, where I’d expect to see the external HTTP request:
Moreover, the APM server logs show that it receives an invalid request to the endpoint /intake/v2/events?flushed=true with an empty body (http.request.body.bytes is 0) and no Content-Type:
{"log.level":"error","@timestamp":"2022-06-01T11:41:07.155Z","log.logger":"request","log.origin":{"file.name":"middleware/log_middleware.go","file.line":60},"message":"data validation error","service.name":"apm-server","url.original":"/intake/v2/events?flushed=true","http.request.method":"POST","user_agent.original":"apm-agent-nodejs/3.34.0 (testing-apm)","source.address":"3.144.139.235","http.request.body.bytes":0,"http.request.id":"f354756b-c1b8-4aae-b1fe-889b0061c60a","event.duration":104819,"http.response.status_code":400,"error.message":"invalid content type: ''","ecs.version":"1.6.0"}
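To spot these rejected requests in a larger log file, a small filter along these lines could help. It is only a sketch: the function name isEmptyIntakeRejection is made up for illustration, and it assumes the server writes one JSON object per line with the same field names as the log line above.

```javascript
// Hypothetical helper (not part of any Elastic tooling): returns true when a
// log line describes an empty intake request rejected with HTTP 400.
function isEmptyIntakeRejection(line) {
  let entry;
  try {
    entry = JSON.parse(line);
  } catch (err) {
    return false; // not a JSON log line
  }
  if (entry === null || typeof entry !== "object") {
    return false;
  }
  const url = String(entry["url.original"] || "");
  const message = String(entry["error.message"] || "");
  return (
    url.startsWith("/intake/v2/events") &&
    entry["http.response.status_code"] === 400 &&
    entry["http.request.body.bytes"] === 0 &&
    message.startsWith("invalid content type")
  );
}

// Sample built from the log line above, trimmed to the relevant fields.
const sample = JSON.stringify({
  "url.original": "/intake/v2/events?flushed=true",
  "http.request.method": "POST",
  "http.request.body.bytes": 0,
  "http.response.status_code": 400,
  "error.message": "invalid content type: ''",
});

console.log(isEmptyIntakeRejection(sample)); // true
```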
Interestingly, adding a custom span improves the situation. I added the following code to the handler:
exports.handler = elasticApm.lambda(async (event) => {
  // Resolve after `ms` milliseconds.
  const sleep = function sleep(ms) {
    return new Promise((resolve) => setTimeout(resolve, ms));
  };
  const span = elasticApm.startSpan("Just waiting for happiness!");
  await sleep(1000);
  span.end();
  // ... the rest of the code from the previous example
});
After that I can see the added span, but still no external request, and I still see the 400 responses in the APM server logs.
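For anyone repeating this experiment, the custom span can be wrapped in a small helper so that it is always ended, even if the awaited call throws. This is just a sketch: withSpan is a hypothetical name, and agent is assumed to be the object returned by apm.start(). It does not fix the missing automatic HTTP span; it only tidies up the manual one.

```javascript
// Hypothetical helper: run an async function inside a custom span.
// `agent` is assumed to be the object returned by apm.start().
async function withSpan(agent, name, fn) {
  // startSpan() returns null when there is no active transaction,
  // so the span has to be checked before calling end().
  const span = agent.startSpan(name);
  try {
    return await fn();
  } finally {
    if (span) {
      span.end();
    }
  }
}

// Possible usage inside the handler (names taken from the test code above):
//   const jokeResponse = await withSpan(elasticApm, "fetch joke", () =>
//     fetch("https://corporatebs-generator.sameerkumar.website/")
//   );
```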
The next step for me was to add some external calls to AWS:
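// Assumption: the S3 client comes from aws-sdk v2, as listed in package.json.
const AWS = require("aws-sdk");
const s3 = new AWS.S3();

exports.handler = elasticApm.lambda(async (event) => {
  const listBuckets = await s3.listBuckets().promise();
  // ... the rest of the code from the previous example
});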
And after that some magic happens:
BUT I can still see the 400 error in the APM server logs.
To Reproduce
Steps to reproduce the behavior:
- Use the code provided above
- Create the Lambda, enable Function URL, and upload the code provided above with the dependencies from the package.json provided below
- Execute the Lambda by opening the Function URL from the Lambda interface
- Open Kibana, open the APM section and open the service “testing-apm” (environment “test-lambda-just-request”)
- See the absence of the external HTTP request in the “Timeline” tab
Expected behavior
The external HTTP request is displayed in the Timeline tab, so that we can trace the amount of time spent waiting for the external service.
Environment (please complete the following information)
- OS: APM server in Amazon AMI, APM executed in Lambda environment
- Node.js version: 14.x
- APM Server version: 7.17.4
- Agent version: 3.34.0
How are you starting the agent? (please tick one of the boxes)
- Calling agent.start() directly (e.g. require('elastic-apm-node').start(...))
- Requiring elastic-apm-node/start from within the source code
- Starting node with -r elastic-apm-node/start
Additional context
- package.json dependencies:
{
  "name": "test-apm",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "dependencies": {
    "aws-sdk": "^2.1146.0",
    "elastic-apm-node": "^3.34.0",
    "node-fetch": "^2.6.7"
  },
  "devDependencies": {},
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC"
}
Top GitHub Comments
Thanks for the background info @ValeriyMaslenikov – it’s always interesting to hear how folks are using/deploying the tech. 😃
@astorm, sure! Since AWS Lambdas are region-specific and run in one particular region (us-east-2 in our case), we host our Elastic Stack in the same region. That gives us low network latency between Lambda <> Elastic Stack.
Because of that, we expected that we could afford the insignificant time spent sending the metrics before the user receives the response. I’d say we could set a timeout, after which we stop waiting and just return the result to the end user. Moreover, one way this could be implemented on Lambda is to not wait for the Elastic Stack to process the request and return an HTTP status code at all; “fire and forget” should be enough. It means we would have no logs when errors occur, but it would perform better.
To me, that’s pretty similar to how Lambda Extensions work: they give us low-latency communication (even though with Lambda Extensions the latency is even lower, because it’s really the same process 😃) to transfer the data to an Elastic APM server in the same region.
And I’m not saying anything against Lambda Extensions; we’d love to use them if we find a proper solution for logs.
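To make the timeout idea above a bit more concrete, here is a rough sketch of "wait for the flush, but only up to N milliseconds". The name flushWithTimeout and the 200 ms value are made up for illustration; the only agent call assumed is the callback form agent.flush(callback). As far as I understand, the lambda wrapper already does its own flushing, so this shows the idea rather than something to drop in as-is.

```javascript
// Hypothetical helper: wait for a callback-style flush, but give up after
// `ms` milliseconds. Resolves with "flushed", "error" or "timeout" and
// never rejects, so the handler can always go on to return its response.
function flushWithTimeout(flush, ms) {
  return new Promise((resolve) => {
    const timer = setTimeout(() => resolve("timeout"), ms);
    flush((err) => {
      clearTimeout(timer);
      // A late callback after the timeout is harmless: the promise is
      // already settled and the second resolve() is ignored.
      resolve(err ? "error" : "flushed");
    });
  });
}

// Possible usage at the end of a handler (200 ms is an arbitrary choice):
//   const outcome = await flushWithTimeout((cb) => elasticApm.flush(cb), 200);
//   console.log("APM flush:", outcome);
```

The trade-off is the one already mentioned: when the timeout wins, whatever was still in flight may be lost, and nothing reports that loss.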