dotnet process crashes when using wildcard topics

Description

We are developing a medium-sized application that uses Kafka as a publish-subscribe message broker. The application crashes when we subscribe several consumers using wildcard topics. If we do not use wildcard topics, everything works as expected.

Most of the time the crash happens a few seconds after subscribing the consumer. The reported cause varies among access violation, stack overflow and heap corruption. The faulting module is usually librdkafka.DLL, although other DLLs such as ntdll.dll sometimes appear.

We have verified that exactly the same code, running exactly the same versions of the Confluent.Kafka NuGet package and dotnet, crashes only on Windows. If the code is executed on a Linux host, it never crashes.

We have also verified that this crash happens at least with versions 0.11.4, 0.11.6 and 1.0.0-RC4 of the Confluent.Kafka NuGet package.

We have prepared a toy example below that reproduces the crash.

How to reproduce

With version 1.0.0-RC4 of the NuGet package and dotnet 2.2.105, the following code reproduces the crash on Windows 10 x64:


        // Requires: using System; using System.Threading; using Confluent.Kafka;
        static void Main(string[] args)
        {
            var conf1 = new ConsumerConfig
            {
                GroupId = "test-consumer-group1",
                BootstrapServers = "localhost:9092",
                AutoOffsetReset = AutoOffsetReset.Latest
            };

            var conf2 = new ConsumerConfig
            {
                GroupId = "test-consumer-group2",
                BootstrapServers = "localhost:9092",
                AutoOffsetReset = AutoOffsetReset.Latest
            };

            var conf3 = new ConsumerConfig
            {
                GroupId = "test-consumer-group3",
                BootstrapServers = "localhost:9092",
                AutoOffsetReset = AutoOffsetReset.Latest
            };

            var conf4 = new ConsumerConfig
            {
                GroupId = "test-consumer-group4",
                BootstrapServers = "localhost:9092",
                AutoOffsetReset = AutoOffsetReset.Latest
            };

            var c1 = new ConsumerBuilder<Ignore, string>(conf1).Build();
            var c2 = new ConsumerBuilder<Ignore, string>(conf2).Build();
            var c3 = new ConsumerBuilder<Ignore, string>(conf3).Build();
            var c4 = new ConsumerBuilder<Ignore, string>(conf4).Build();


            c1.Subscribe("^tenants\\.a77b7bec-c9d5-468c-89d8-cc3dc293354f\\.plants\\.a77b7bec-c9d5-468c-89d8-cc3dc293354f\\.notifications\\..*");
            c2.Subscribe("^tenants\\.a77b7bec-c9d5-468c-89d8-cc3dc293354f\\.plants\\.a77b7bec-c9d5-468c-89d8-cc3dc293354f\\.notifications\\..*");
            c3.Subscribe("^tenants\\.a77b7bec-c9d5-468c-89d8-cc3dc293354f\\.plants\\.a77b7bec-c9d5-468c-89d8-cc3dc293354f\\.notifications\\..*");
            c4.Subscribe("^tenants\\.a77b7bec-c9d5-468c-89d8-cc3dc293354f\\.plants\\.a77b7bec-c9d5-468c-89d8-cc3dc293354f\\.notifications\\..*");

            Console.WriteLine("All consumers subscribed.");

            CancellationTokenSource cts = new CancellationTokenSource();
            Console.CancelKeyPress += (_, e) => {
                e.Cancel = true; // prevent the process from terminating.
                cts.Cancel();
            };

            try
            {
                while (true)
                {
                    try
                    {
                        // Each Consume call blocks until that consumer receives a message.
                        var cr1 = c1.Consume(cts.Token);
                        Console.WriteLine($"1. Consumed message '{cr1.Value}' at: '{cr1.TopicPartitionOffset}'.");
                        var cr2 = c2.Consume(cts.Token);
                        Console.WriteLine($"2. Consumed message '{cr2.Value}' at: '{cr2.TopicPartitionOffset}'.");
                        var cr3 = c3.Consume(cts.Token);
                        Console.WriteLine($"3. Consumed message '{cr3.Value}' at: '{cr3.TopicPartitionOffset}'.");
                        var cr4 = c4.Consume(cts.Token);
                        Console.WriteLine($"4. Consumed message '{cr4.Value}' at: '{cr4.TopicPartitionOffset}'.");
                    }
                    catch (ConsumeException e)
                    {
                        Console.WriteLine($"Error occurred: {e.Error.Reason}");
                    }
                }
            }
            catch (OperationCanceledException)
            {
                // Ctrl+C was pressed: leave the loop so the consumers can be closed.
            }

            c1.Close();
            c2.Close();
            c3.Close();
            c4.Close();
        }

The program prints “All consumers subscribed.” and then dies after a few seconds. If the topics are replaced by “tenants\.a77b7bec-c9d5-468c-89d8-cc3dc293354f\.plants\.a77b7bec-c9d5-468c-89d8-cc3dc293354f\.notifications\.a” (the other three ending respectively in b, c and d), the application does not crash.
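
Since explicit topic names avoid the crash, one possible stopgap is to expand the wildcard on the client side and subscribe to the resulting list. The sketch below is only an illustration: `TopicExpansion`, `Expand` and the topic names are hypothetical, and it uses .NET's `Regex` rather than librdkafka's own regex engine, so it is an approximation that should hold for a simple pattern like this one. The resulting list could then be passed to the `Subscribe(IEnumerable<string>)` overload of the consumer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TopicExpansion
{
    // Returns the entries of knownTopics that the pattern matches.
    // The leading '^' is what marks a subscription as a regex for librdkafka;
    // here it simply anchors the .NET regex at the start of the name.
    public static List<string> Expand(string pattern, IEnumerable<string> knownTopics)
    {
        var regex = new Regex(pattern, RegexOptions.CultureInvariant);
        return knownTopics.Where(t => regex.IsMatch(t)).ToList();
    }

    public static void Main()
    {
        const string id = "a77b7bec-c9d5-468c-89d8-cc3dc293354f";
        string pattern = "^tenants\\." + id + "\\.plants\\." + id + "\\.notifications\\..*";

        // Hypothetical topic names, listed by hand for the example.
        var known = new[]
        {
            $"tenants.{id}.plants.{id}.notifications.a",
            $"tenants.{id}.plants.{id}.notifications.b",
            $"tenants.{id}.plants.{id}.commands.a",
        };

        // Prints the two notification topics; the commands topic is filtered out.
        foreach (var topic in Expand(pattern, known))
        {
            Console.WriteLine(topic);
        }
    }
}
```

Unlike a wildcard subscription, an explicit list does not pick up topics created later, so the list would have to be refreshed and the consumer resubscribed when new topics appear.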

In the actual project, the way we set up the consumers and handle the messages is much more involved (we use several async methods), but this minimal example reproduces the error in exactly the same way.

We can provide more details on our setup and even a zip file with the project if needed.

Checklist

Please provide the following information:

  • A complete (i.e. we can run it), minimal program demonstrating the problem. No need to supply a project file.

  • Confluent.Kafka nuget version. Crash reproduced using versions 0.11.4, 0.11.6 and 1.0.0-RC4.

  • Apache Kafka version. Kafka 2.1.0, Zookeeper 3.4.12 and java 1.8.0_181.

  • Client configuration.

  • Operating system. dotnet version 2.2.105 running under Windows 10 x64. The issue is not reproducible on Ubuntu 16.04.5 LTS, even using exactly the same versions of everything.

  • Provide logs (with “debug” : “…” as necessary in configuration). Application crashes without printing anything to console. The Windows application event log shows the following information:

  • If the crash happens in librdkafka (most of the time):

Name of the application with errors: dotnet.exe, version: 2.2.27207.3, timestamp: 0x5c0ab1b7
Name of the module with errors: librdkafka.DLL, version: 0.0.0.0, timestamp: 0x5c99628f
Exception code: 0xc00000fd
Error offset: 0x00000000000cb36e
Identifier of the process with errors: 0x6938
Application with errors start time: 0x01d4eeb93b8ac8e1
Path to the application with errors: C:\Program Files\dotnet\dotnet.exe
Path to the application module with errors: C:\Users\vmartin\.nuget\packages\librdkafka.redist\1.0.0\runtimes\win-x64\native\librdkafka.DLL
Report identifier: 879113b6-44ff-4e07-94ce-1094048db72a
  • If the crash happens in ntdll.dll (sometimes):

Name of the application with errors: dotnet.exe, version: 2.2.27207.3, timestamp: 0x5c0ab1b7
Name of the module with errors: ntdll.dll, version: 10.0.17763.404, timestamp: 0xbf6ea104
Exception code: 0xc0000374
Error offset: 0x00000000000faf89
Identifier of the process with errors: 0x6a54
Application with errors start time: 0x01d4eedb61eb4cea
Path to the application with errors: C:\Program Files\dotnet\dotnet.exe
Path to the application module with errors: C:\WINDOWS\SYSTEM32\ntdll.dll
Report identifier: 00e2b779-0858-46e6-b15b-590d80884540
  • Provide broker log excerpts. No log lines are usually written to the log of the broker when the crash happens.
  • Critical issue.
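
The exception codes in the event log entries above are standard Windows NTSTATUS values, and they line up with the failure types named in the description. A minimal lookup sketch follows; `CrashCodes` and `Describe` are hypothetical names, and the table covers only the two codes from the logs plus the access violation code, which is included on the assumption that it is what the access violation crashes report.

```csharp
using System;
using System.Collections.Generic;

public static class CrashCodes
{
    // Small, deliberately incomplete table of NTSTATUS values.
    private static readonly Dictionary<uint, string> Known = new Dictionary<uint, string>
    {
        { 0xC0000005, "STATUS_ACCESS_VIOLATION" }, // not in the logs above; assumed
        { 0xC00000FD, "STATUS_STACK_OVERFLOW" },   // librdkafka.DLL entry
        { 0xC0000374, "STATUS_HEAP_CORRUPTION" },  // ntdll.dll entry
    };

    public static string Describe(uint code) =>
        Known.TryGetValue(code, out var name) ? name : "unknown (not in this table)";

    public static void Main()
    {
        foreach (var code in new uint[] { 0xC00000FD, 0xC0000374 })
        {
            Console.WriteLine($"0x{code:x8}: {Describe(code)}");
        }
    }
}
```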

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 15 (6 by maintainers)

Top GitHub Comments

yburke94 commented, Jul 31, 2019 (1 reaction)

I am seeing the same problem on a Windows 10 machine running version 1.0.1. We are using lz4 compression on the producer side (as mentioned in #482). Our consumer application creates two consumer objects that use wildcard subscriptions (with different topic patterns) on startup. Shortly afterwards, the application crashes, sometimes due to a StackOverflow and other times due to a MemoryAccessViolation.

@edenhill Have you been able to reproduce this yet? I cannot see any issues in the librdkafka repo that seem related to this bug.

dorny commented, Jul 17, 2019 (1 reaction)

We hit this issue today. I can confirm the behavior: it crashes on Windows if there are more than two consumers using a wildcard subscription. As it crashes without any (managed) exception or stack trace, it took me a while to figure out what was going on.
