Incorrect currentTransaction context during parallel execution

Describe the bug

When code wrapped in custom transaction/span monitors runs in parallel, the currentTransaction reference is misleading: for every parallel flow it points to the first transaction that was created. I'm using a custom wrapper that surrounds async/sync code and starts/ends a transaction or span automatically. It decides which one to start based on whether a currentTransaction object exists, and that is where the problem begins.

To Reproduce

Simplified wrapper function:

const monitorAsyncWrapper = (fx: Function, name: string) => {
    const apmMonitor = apm.currentTransaction
        ? apm.startSpan(name)
        : apm.startTransaction(name)
    console.log(`creating new [${apmMonitor?.constructor.name}] - ${apmMonitor?.traceparent}`)
    return fx.apply(null)
        .then((result: any) => {
            apmMonitor!.end()
            console.log(`ending [${apmMonitor?.constructor.name}] - ${apmMonitor?.traceparent}`)
            return result
        })
}

Desired behavior: the APM output for parallel execution should be the same as for the following serial execution:

(async () => {
    for (let i = 0; i < 50; i++) {
        await monitorAsyncWrapper(async () => {
            monitorAsyncWrapper(async () => {
                await setTimeout(() => {
                    return Math.pow(2, 2)
                }, 100)
            }, 'power calculation')
        }, 'Calculations')
    }
})();

Console output showing that transactions and spans are created correctly, one after another:

creating new [Transaction] - 00-e90ba39596ee8cd9173880a026e9d4bb-c2e9044e0caa4252-01
creating new [Span] - 00-e90ba39596ee8cd9173880a026e9d4bb-4c0e9b8bc4a05261-01
ending [Transaction] - 00-e90ba39596ee8cd9173880a026e9d4bb-c2e9044e0caa4252-01
ending [Span] - 00-e90ba39596ee8cd9173880a026e9d4bb-4c0e9b8bc4a05261-01
creating new [Transaction] - 00-e6f43777ee59a0f33b20b9a0c0970b99-f7305732fb1ede7d-01
creating new [Span] - 00-e6f43777ee59a0f33b20b9a0c0970b99-1c016033375384eb-01
ending [Transaction] - 00-e6f43777ee59a0f33b20b9a0c0970b99-f7305732fb1ede7d-01
ending [Span] - 00-e6f43777ee59a0f33b20b9a0c0970b99-1c016033375384eb-01
creating new [Transaction] - 00-7a16cba0e47ccc83e15e29b9a40244a6-3aab7f71c0d79ad9-01
creating new [Span] - 00-7a16cba0e47ccc83e15e29b9a40244a6-a0f4000911136a7a-01
ending [Transaction] - 00-7a16cba0e47ccc83e15e29b9a40244a6-3aab7f71c0d79ad9-01

Actual behavior: APM parallel execution:

(async () => {
    let promises = []
    for (let i = 0; i < 50; i++) {
        promises.push(
            monitorAsyncWrapper(async () => {
                await monitorAsyncWrapper(async () => {
                    await setTimeout(() => {
                        return Math.pow(2, 2)
                    }, 100)
                }, 'power calculation')
            }, 'Calculations')
        )
    }
    await Promise.all(promises)
})();

Console output showing that all the promises are bound to the first transaction created:

creating new [Transaction] - 00-e48b38ebd2f1965c650261c49b7523ab-d3964644462375cd-01
creating new [Span] - 00-e48b38ebd2f1965c650261c49b7523ab-fe30d468e7ddb7c5-01
creating new [Span] - 00-e48b38ebd2f1965c650261c49b7523ab-3e199361bc1e777b-01
creating new [Span] - 00-e48b38ebd2f1965c650261c49b7523ab-ee64926800a91a5e-01
creating new [Span] - 00-e48b38ebd2f1965c650261c49b7523ab-d44e3b0cf8812f69-01
creating new [Span] - 00-e48b38ebd2f1965c650261c49b7523ab-d95df8233fae9c0f-01
creating new [Span] - 00-e48b38ebd2f1965c650261c49b7523ab-959358be41ac811d-01
creating new [Span] - 00-e48b38ebd2f1965c650261c49b7523ab-e2b9f28a92b22095-01
creating new [Span] - 00-e48b38ebd2f1965c650261c49b7523ab-337337dc5fdd847c-01
creating new [Span] - 00-e48b38ebd2f1965c650261c49b7523ab-8c71c46bbe8b5e9b-01
ending [Span] - 00-e48b38ebd2f1965c650261c49b7523ab-fe30d468e7ddb7c5-01
ending [Span] - 00-e48b38ebd2f1965c650261c49b7523ab-ee64926800a91a5e-01

Expected behavior

    const apmMonitor = apm.currentTransaction
        ? apm.startSpan(name)
        : apm.startTransaction(name)

The apm.currentTransaction object should be null at the start of each parallel promise, so that each one starts its own transaction.

Environment (please complete the following information)

  • OS: Linux
  • Node.js version: tested on 16.2.0, 14.17.4
  • APM Server version: 7.11.1
  • Agent version: 3.19

How are you starting the agent? (please tick one of the boxes)

  • Calling agent.start() directly (e.g. require('elastic-apm-node').start(...))
  • Requiring elastic-apm-node/start from within the source code
  • Starting node with -r elastic-apm-node/start

Additional context

  • Agent config options

    ELASTIC_APM_LOGGER=false
    
    module.exports = {
    serverUrl: 'http://localhost:8200'
    }
    
  • package.json dependencies:

      "dependencies": {
      "@elastic/ecs-pino-format": "^1.0.0",
      "elastic-apm-node": "^3.12.1"
      }
    

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

trentm commented, Aug 12, 2021 (1 reaction)

I’ll have to have a play with it.

plays

  1. Note: Having the loop var i in each transaction and span name is actually a bad practice because these values should have reasonably low cardinality. One side-effect of this is that a “metricset” is gathered for each transaction.name and is being sent as well. So in addition to 800 transactions and 800 spans, the agent is attempting to send 800 metricsets. Reducing transaction.name cardinality and/or using metricsInterval=0 to disable metrics can help mitigate somewhat.
  2. I understand now why ELASTIC_APM_MAX_QUEUE_SIZE is relevant here. This maxQueueSize is a defense mechanism intended to not overload the upstream APM server under very high tracing load from the user application. However, in this case the burst of tracing load is fast enough (1600 tracing objects are ended at roughly the same time) that even the APM agent’s serialization and sending of events upstream does not keep up. I think it is fair that the agent limits impact on the user app by dropping events in this case.

I’ve added https://github.com/elastic/apm-agent-nodejs/issues/2192 for the docs suggestion. Thanks!

I’m closing this issue now. Feel free to open new ones or ask on our discuss forum if you have other Qs or issues.

trentm commented, Aug 11, 2021 (1 reaction)

@david-sykora Thanks for the issue with the excellent detail and repro!

I made some changes to your script for discussion:

// issue-2186.js
const apm = require('elastic-apm-node').start({
    serviceName: 'issue-2186'
})
const { executionAsyncId } = require('async_hooks')

async function monitorAsyncWrapper(fx, name) {
    // await Promise.resolve() // Force move to a new async context.
    const apmMonitor = apm.currentTransaction
        ? apm.startSpan(name)
        : apm.startTransaction(name)
    console.log(`[xid=${executionAsyncId()}] creating new [${apmMonitor.constructor.name} ${apmMonitor.name}] - ${apmMonitor.traceparent}`)
    return fx.apply(null)
        .then((result) => {
            apmMonitor.end()
            console.log(`[xid=${executionAsyncId()}] ending [${apmMonitor.constructor.name} ${apmMonitor.name}] - ${apmMonitor.traceparent}`)
            return result
        })
}

(async () => {
    let promises = []
    for (let i = 0; i < 4; i++) {
        promises.push(
            monitorAsyncWrapper(async () => {
                await monitorAsyncWrapper(async () => {
                    await setTimeout(() => {
                        return Math.pow(2, 2)
                    }, 100)
                }, `power-calculation-${i}`)
            }, `Calculations-${i}`)
        )
    }
    await Promise.all(promises)
})();

Summary of the changes:

  • 50 -> 4 to make the output easier to grok
  • I added the loop var i to the span/transaction names and printed those in the console.logs (it is easier to read names than the generated ids)
  • I added async_hooks.executionAsyncId() to each console.log to help show when code is run in the same task (on the Node.js event loop)

Running this as is

% node issue-2186.js
[xid=1] creating new [Transaction Calculations-0] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-062edc2358a9a545-01
[xid=1] creating new [Span power-calculation-0] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-b01929b43232ec02-01
[xid=1] creating new [Span Calculations-1] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-2e109f0f8f4b9aaa-01
[xid=1] creating new [Span power-calculation-1] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-b4c7dc5370fe1d59-01
[xid=1] creating new [Span Calculations-2] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-278210e9608ce619-01
[xid=1] creating new [Span power-calculation-2] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-27d0c0c12f76b0e5-01
[xid=1] creating new [Span Calculations-3] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-71a06b524a667121-01
[xid=1] creating new [Span power-calculation-3] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-be8ebd48dff1e50a-01
[xid=27] ending [Span power-calculation-0] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-b01929b43232ec02-01
[xid=37] ending [Span power-calculation-1] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-b4c7dc5370fe1d59-01
[xid=47] ending [Span power-calculation-2] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-27d0c0c12f76b0e5-01
[xid=57] ending [Span power-calculation-3] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-be8ebd48dff1e50a-01
[xid=29] ending [Transaction Calculations-0] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-062edc2358a9a545-01
[xid=39] ending [Span Calculations-1] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-2e109f0f8f4b9aaa-01
[xid=49] ending [Span Calculations-2] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-278210e9608ce619-01
[xid=59] ending [Span Calculations-3] - 00-b7e098a449ddb1941d2a616dcf9ff0e6-71a06b524a667121-01

resulting in a trace something like this:

    transaction "Calculations-0"
    `- span "power-calculation-0"
    `- span "Calculations-1"
    `- span "power-calculation-1"
    `- span "Calculations-2"
    `- span "power-calculation-2"
    `- span "Calculations-3"
    `- span "power-calculation-3"

Notice that the creation of every apmMonitor in monitorAsyncWrapper runs in the same event-loop task ([xid=1]). (An “event-loop task” can be identified with Node.js’s async_hooks.executionAsyncId().) In fact, everything down to the setTimeout(...) call is executed in that same event-loop task, because an async function runs synchronously until its first await.
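
The same effect can be observed without the APM agent. The following is a minimal sketch, assuming only Node's built-in async_hooks module; probe and main are hypothetical names used for illustration. A no-op hook is enabled because Node does not assign async IDs to promise continuations unless some hook is active (in the runs above the agent is what enables tracking).

```javascript
const { createHook, executionAsyncId } = require('async_hooks')

// Promise continuations only get their own async IDs while a hook is enabled.
createHook({ init() {} }).enable()

// Records the async ID seen before and after the first await.
async function probe(name, seen) {
    const before = executionAsyncId() // still running in the caller's task
    await Promise.resolve()
    const after = executionAsyncId() // the continuation runs in a new async task
    seen.push({ name, before, after })
}

async function main() {
    const seen = []
    const callerId = executionAsyncId()
    // All three probes are started synchronously, like the loop in the repro.
    await Promise.all(['a', 'b', 'c'].map((name) => probe(name, seen)))
    return { callerId, seen }
}

main().then(({ callerId, seen }) => {
    for (const s of seen) {
        console.log(`${s.name}: caller=${callerId} before=${s.before} after=${s.after}`)
    }
})
```

Every "before" value matches the caller's ID, while each "after" value is different, which mirrors the xid values in the two runs shown in this comment.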

The APM agent tracks a current transaction and current span per event-loop task. So the first call to monitorAsyncWrapper (“Calculations-0”) results in apm.startTransaction and that becomes the apm.currentTransaction for async task 1. All the subsequent apmMonitor variables end up calling apm.startSpan().

Forcing each apmMonitor into a separate async task

If we uncomment the await Promise.resolve() line, the rest of monitorAsyncWrapper’s body is forced into a separate async task.

% node issue-2186.js
[xid=21] creating new [Transaction Calculations-0] - 00-8e26c17fab30f4835cbef5b0173715b2-60bdf3873b22935a-01
[xid=24] creating new [Transaction Calculations-1] - 00-ee7c0ac4f798a4efe8bf90616dbb49ba-b04939ab975b4faf-01
[xid=27] creating new [Transaction Calculations-2] - 00-52ebf8ecf2c9b3cdf3c92bbc17aa1a01-67369ee47443d73b-01
[xid=30] creating new [Transaction Calculations-3] - 00-0abce8e457fb8fbba5045a38ef94bd5a-d27004d06f1431ba-01
[xid=69] creating new [Span power-calculation-0] - 00-8e26c17fab30f4835cbef5b0173715b2-18e773ed8870c75c-01
[xid=75] creating new [Span power-calculation-1] - 00-ee7c0ac4f798a4efe8bf90616dbb49ba-9f408dec4856ac7e-01
[xid=81] creating new [Span power-calculation-2] - 00-52ebf8ecf2c9b3cdf3c92bbc17aa1a01-664821fcb7a2be93-01
[xid=87] creating new [Span power-calculation-3] - 00-0abce8e457fb8fbba5045a38ef94bd5a-b424051db68a0497-01
[xid=94] ending [Span power-calculation-0] - 00-8e26c17fab30f4835cbef5b0173715b2-18e773ed8870c75c-01
[xid=100] ending [Span power-calculation-1] - 00-ee7c0ac4f798a4efe8bf90616dbb49ba-9f408dec4856ac7e-01
[xid=106] ending [Span power-calculation-2] - 00-52ebf8ecf2c9b3cdf3c92bbc17aa1a01-664821fcb7a2be93-01
[xid=112] ending [Span power-calculation-3] - 00-0abce8e457fb8fbba5045a38ef94bd5a-b424051db68a0497-01
[xid=71] ending [Transaction Calculations-0] - 00-8e26c17fab30f4835cbef5b0173715b2-60bdf3873b22935a-01
[xid=77] ending [Transaction Calculations-1] - 00-ee7c0ac4f798a4efe8bf90616dbb49ba-b04939ab975b4faf-01
[xid=83] ending [Transaction Calculations-2] - 00-52ebf8ecf2c9b3cdf3c92bbc17aa1a01-67369ee47443d73b-01
[xid=89] ending [Transaction Calculations-3] - 00-0abce8e457fb8fbba5045a38ef94bd5a-d27004d06f1431ba-01

Notice how each “creating” is now in a separate async task id. The resulting trace is:

    transaction "Calculations-0"
    `- span "power-calculation-0"
    transaction "Calculations-1"
    `- span "power-calculation-1"
    transaction "Calculations-2"
    `- span "power-calculation-2"
    transaction "Calculations-3"
    `- span "power-calculation-3"

like you wanted.
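
For reuse, the workaround could be folded into the wrapper itself. This is only a sketch under stated assumptions: makeMonitor and monitorAsync are hypothetical names, and the agent is passed in as a parameter so the only calls relied on are the ones already used in this thread (currentTransaction, startSpan, startTransaction, end). It also ends the monitor in a finally block, since the original wrapper never calls end() when the wrapped function rejects.

```javascript
// Hypothetical helper: builds a wrapper around any agent object that exposes
// currentTransaction, startSpan(name) and startTransaction(name).
function makeMonitor(apm) {
    return async function monitorAsync(fx, name) {
        // Workaround from this thread: hop to a new async task first, so that
        // parallel callers do not all see the caller's current transaction.
        await Promise.resolve()
        const apmMonitor = apm.currentTransaction
            ? apm.startSpan(name)
            : apm.startTransaction(name)
        try {
            return await fx()
        } finally {
            // The start calls may return null, so guard before ending.
            if (apmMonitor) apmMonitor.end()
        }
    }
}

// Usage with the real agent (assumes elastic-apm-node is installed and configured):
// const monitorAsync = makeMonitor(require('elastic-apm-node').start({ serviceName: 'my-service' }))
// await monitorAsync(async () => { /* work */ }, 'Calculations')
```

Passing the agent in keeps the helper easy to exercise with a stub; whether the extra microtask hop is acceptable depends on the application.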

Notes

This await Promise.resolve() is a bit of a hack. It is a necessary workaround for the APM agent’s current apm.startTransaction() and apm.startSpan() APIs, which change the current context instead of taking a function to run in a new context, something like:

    apm.withSpan({name: 'my-span'}, async function () {
      // do work here in the context of "my-span"
    })

We are doing some work in this area, but there is no timeline yet for a new API along the lines of withSpan(..., fn).
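
The difference between an API that changes the current context and one that takes a function scope can be illustrated with Node's built-in AsyncLocalStorage. This is an analogy only and makes no claim about how the agent is implemented: enterWith() stands in for the startTransaction() style, run() stands in for the proposed withSpan(..., fn) style, and startStyle, scopedStyle and launch are hypothetical names.

```javascript
const { AsyncLocalStorage } = require('async_hooks')

const current = new AsyncLocalStorage()

// Style 1: changes the caller's current context (analogous to startTransaction()).
async function startStyle(name, log) {
    const parent = current.getStore()
    log.push(parent ? `${name} nested in ${parent}` : `${name} is a root`)
    if (!parent) current.enterWith(name) // leaks into the caller's synchronous flow
}

// Style 2: runs the work inside its own scope (analogous to withSpan(..., fn)).
function scopedStyle(name, log) {
    const parent = current.getStore()
    log.push(parent ? `${name} nested in ${parent}` : `${name} is a root`)
    return current.run(name, async () => {
        // work would go here, with `name` as the current context
    })
}

// Starts three calls in parallel (no await between them) from a clean context.
function launch(style) {
    const log = []
    return current.run(null, async () => {
        await Promise.all(['A', 'B', 'C'].map((name) => style(name, log)))
        return log
    })
}

launch(startStyle).then((log) => console.log('enterWith style:', log))
launch(scopedStyle).then((log) => console.log('run style:', log))
```

With the enterWith style, B and C see A as their parent, just as the later calls in the repro became spans of "Calculations-0". With the run style, all three are roots without needing an extra await.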
