Activities not mapping to the expected parents

See original GitHub issue

Question

Describe your environment.

I have a WPF application (.NET Framework 4.7.1) to which we are exploring adding telemetry. This application uses HttpClient to send various web requests to a WebAPI backend. Eventually, this will expand to include the backend as well, but for now, we are just looking at the UI client.

What are you trying to achieve?

All of our web calls ultimately funnel through a single async method, which is where we call StartActivity. That method then calls HttpClient.PostAsync. We are transmitting the data to a Jaeger instance. We expected the activities we create explicitly with StartActivity to be top-level items, with the activities generated by HttpClient nested inside them.

However, what we are finding seems Very Random™. Sometimes our activity is a top-level item, sometimes the HttpClient item is top-level. Sometimes our activities have other “our activities” nested under them (which conceptually is not the case). In fact, I have yet to find a case where it worked “correctly” (that is, one of our activities containing its one HTTP call and nothing else).

This async method I referred to is called from a number of different threads and these web calls will certainly overlap. It almost feels like the OpenTelemetry framework is not consistently picking up the correct parent activity because of this?
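For reference, the nesting expected here relies on `Activity.Current` flowing with the async context, so that a newly started activity picks up the ambient one as its parent. The following is a minimal sketch of that mechanism using only `System.Diagnostics`; the class name `ParentingSketch`, the source name, and the activity names are made up for illustration, and it does not reproduce the HttpClient behavior described in this issue.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

static class ParentingSketch
{
    // Hypothetical source name, used only for this illustration.
    static readonly ActivitySource Source = new ActivitySource("ParentingSketch");

    public static async Task<bool> ChildSeesParentAsync()
    {
        // Without a listener that samples the source, StartActivity returns null.
        using var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == Source.Name,
            Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
                ActivitySamplingResult.AllData,
        };
        ActivitySource.AddActivityListener(listener);

        using var parent = Source.StartActivity("parent");

        // Activity.Current is stored per async flow, so it survives this await.
        await Task.Yield();

        // No explicit parent is passed: the ambient Activity.Current is used.
        using var child = Source.StartActivity("child");

        return parent != null && ReferenceEquals(child?.Parent, parent);
    }
}
```

When the ambient activity is the wrong one at the moment a child is started, the child is attached to that wrong parent, which would match the symptoms described above.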

Additional Context

Example code is below that spins up 20 parallel calls to the HttpClient, each wrapped in a manually created activity, named with the numbers 1-20.

What I expected to see was 1-20 all as top-level items in Jaeger, each with one sub-item for the HTTP call. Instead, many top-level items have no sub-items, and some have multiple. For example, in the screenshot attached to the original issue, number 11 happened to get a bunch of the HTTP calls (but not all).

using System.Diagnostics;
using System.Linq;
using System.Net.Http;
using System.Reflection;
using System.Threading.Tasks;
using OpenTelemetry;
using OpenTelemetry.Trace;

namespace ConsoleApp3
{
    class Program
    {
        static ActivitySource activitySource = new ActivitySource(Assembly.GetExecutingAssembly().GetName().Name, Assembly.GetExecutingAssembly().GetName().Version.ToString());

        private static HttpClient client = new HttpClient();

        static void Main(string[] args)
        {
            using var tracerProvider = Sdk.CreateTracerProviderBuilder()
                .AddSource(Assembly.GetExecutingAssembly().GetName().Name)
                .AddHttpClientInstrumentation()
                .AddJaegerExporter()
                .Build();
            
            Task.WaitAll(Enumerable.Range(1, 20).Select(MakeWebCall).ToArray());
        }

        private static async Task MakeWebCall(int id)
        {
            using var activity = activitySource.StartActivity(id.ToString(), ActivityKind.Client);
            using HttpResponseMessage msg = await client.PostAsync($"http://localhost/{id}", new StringContent("dummy data")).ConfigureAwait(false);
        }
    }
}
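One way to inspect the parent/child mapping without relying on a Jaeger screenshot is to log each finished activity together with its in-process parent. This is only a sketch: the `ParentLogger` class and the output format are assumptions, and it only sees activities created through `ActivitySource`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;

static class ParentLogger
{
    // One "child <- parent" line per stopped activity.
    public static readonly ConcurrentQueue<string> Lines = new ConcurrentQueue<string>();

    public static ActivityListener Attach()
    {
        var listener = new ActivityListener
        {
            // Listen to every source so instrumentation activities are included.
            ShouldListenTo = source => true,
            Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
                ActivitySamplingResult.AllData,
            // Activity.Parent is only set when the parent lives in this process.
            ActivityStopped = activity => Lines.Enqueue(
                $"{activity.DisplayName} <- {activity.Parent?.DisplayName ?? "(root)"}"),
        };
        ActivitySource.AddActivityListener(listener);
        return listener;
    }
}
```

In the repro above, `ParentLogger.Attach()` could be called before `Task.WaitAll(...)`, with `ParentLogger.Lines` printed afterwards to see which numbered activity each HTTP activity was attached to.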

Issue Analytics

  • State:closed
  • Created 2 years ago
  • Comments:11 (3 by maintainers)

Top GitHub Comments

3 reactions
cabadam commented, Jun 22, 2021

Thanks. It confirmed what I suspected! There is an issue which was earlier investigated and confirmed in the .NET runtime repo. I’ll find the tracking issue and workarounds.

Were you ever able to track down a workaround for this? Thanks!

2 reactions
toddfoust commented, Jul 14, 2021

Hello @cabadam, Cijo asked me to respond because we recently worked with another customer facing the same issue, where telemetry was getting ‘mislabeled’ operation ids.

This is due to an issue somewhere in System.Diagnostics.DiagnosticSource namespace around code written to support Activities for the older desktop framework versions. We’ve seen scenarios where the wrong Activity object will get assigned to the wrong async thread, and so you’d see some telemetry items get set with the wrong distributed telemetry operation id.

This problem impacts Application Insights, OpenTelemetry, or any Diagnostic Listener implementation running on the .NET 4.x frameworks. The injected diagnostic listeners don’t account for the behavior within .NET 4.x where the framework might chain calls to the same URI on the same connection in a single async context.

If you move the code over to .NET Core 3.1 or .NET 5.0 or later then you will avoid this problem. This only impacts 4.x apps.

One of our software architects came up with a workaround: supply a custom HttpHandler to the HttpClient, which will

  1. Try to pick a free connection
  2. Create a new one if under the limit
  3. Fall back to more advanced logic (a least-busy algorithm) should we exhaust the available connections

using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using System.Runtime.CompilerServices;
using System.IO;

namespace FullFramework
{
    public class QueueHttpHandler : DelegatingHandler
    {
        // We store a mapping from service point to semaphore slim. This lets us properly limit concurrent requests for individual URLs
        // based on the mapping from URI (service point) to semaphore slim.
        private ConditionalWeakTable<ServicePoint, SemaphoreSlim> _servicePointMap = new ConditionalWeakTable<ServicePoint, SemaphoreSlim>();

        protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
        {
            var servicePoint = ServicePointManager.FindServicePoint(request.RequestUri);

            // This assumes HTTP/1.1, one request per connection (which is the case by default anyways)
            var queue = _servicePointMap.GetValue(servicePoint, sp => new SemaphoreSlim(sp.ConnectionLimit, sp.ConnectionLimit));

            // Honor the caller's cancellation token while waiting for a free slot
            await queue.WaitAsync(cancellationToken);
            try
            {
                var response = await base.SendAsync(request, cancellationToken);
                response.Content = new QueueContent(response.Content, queue);
                return response;
            }
            catch
            {
                // If there's an exception producing the response, release the semaphore slim
                queue.Release();
                throw;
            }
        }

        // This content is responsible for releasing the semaphore when the content is fully read
        private class QueueContent : HttpContent
        {
            private readonly HttpContent _httpContent;
            private readonly SemaphoreSlim _queue;

            public QueueContent(HttpContent httpContent, SemaphoreSlim queue)
            {
                _httpContent = httpContent;
                _queue = queue;
            }

            protected override async Task SerializeToStreamAsync(Stream stream, TransportContext context)
            {
                try
                {
                    await _httpContent.CopyToAsync(stream, context);
                }
                finally
                {
                    // Release the semaphore slim after the copy
                    _queue.Release();
                }
            }

            protected override bool TryComputeLength(out long length)
            {
                
                var contentLength = _httpContent.Headers.ContentLength;
                if (contentLength == null)
                {
                    length = 0;
                    return false;
                }

                length = contentLength.Value;
                return true;
            }
        }
    }
}

I think you’ll be able to pass this custom handler, QueueHttpHandler, directly into your HttpClient constructor.
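One detail to watch when passing a `DelegatingHandler` straight into the `HttpClient` constructor: its `InnerHandler` has to be assigned first, otherwise sending a request throws an `InvalidOperationException`. A minimal sketch of the wiring follows; `HandlerWiring` and `PassThroughHandler` are hypothetical names, and the latter is only a stand-in so the snippet compiles on its own.

```csharp
using System.Net.Http;

// Hypothetical stand-in for illustration; substitute QueueHttpHandler here.
sealed class PassThroughHandler : DelegatingHandler { }

static class HandlerWiring
{
    public static HttpClient CreateClient(DelegatingHandler handler)
    {
        // A DelegatingHandler needs an inner handler that performs the real send.
        if (handler.InnerHandler == null)
        {
            handler.InnerHandler = new HttpClientHandler();
        }

        return new HttpClient(handler);
    }
}
```

With the handler from this comment, that would look like `HandlerWiring.CreateClient(new QueueHttpHandler())`. When the handler is registered through `AddHttpMessageHandler`, the factory sets up the inner handler itself, so this step should not be needed there.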

If you are doing dependency injection, like our earlier customer, then something like this will help configure the handler too:

        public void ConfigureServices(IServiceCollection services)
        {
            services.AddTransient<QueueHttpHandler>();
            services.AddHttpClient<TestClient1>().AddHttpMessageHandler<QueueHttpHandler>();
            ....

Hope this helps. Let us know if the workaround works for you too. We tested it out earlier and saw the workaround hold up really well under load.
