[EventHubs] Some unit tests are flaky or fail constantly

See original GitHub issue

Summary

Four of the Event Processor Client tests are not working as expected: one of them is already tracked by #9228, while the other three are tracked by this issue.

  • PartitionClosingAsyncIsCalledWithOwnershipLostReasonWhenStoppingTheFailedProcessor is flaky. Sometimes the test hangs indefinitely until it is canceled; sometimes it passes. The flakiness is believed to have been introduced with the creation of the PartitionLoadBalancer class.
  • PartitionClosingAsyncTokenIsCanceledWhenStopProcessingAsyncIsCalled always fails. The reason is unknown.
  • ProcessErrorAsyncIsTriggeredWithCorrectArgumentsWhenOwnershipClaimFails always fails, but the cause is known.

https://github.com/Azure/azure-sdk-for-net/blob/1408ad9db2579f2943043549c6af70b98f9eb7fe/sdk/eventhub/Azure.Messaging.EventHubs.Processor/src/EventProcessorClient.cs#L949-L957

The Event Processor Client cannot determine which partition failed during the RunLoadBalancingAsync call, so it passes null to ProcessErrorEventArgs instead of the partition id. An ownership claim failure is the only load balancing scenario in which a partition id is expected; null should still be passed for other types of failure.
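One way to picture the fix is to have the ownership claim failure carry the partition id up to the error handler, while every other load balancing failure keeps reporting null. The sketch below is a language-neutral illustration written in Python, not the SDK's C# code. All names in it (`OwnershipClaimError`, `ErrorArgs`, `run_load_balancing`, `run_cycle`, and the `list_ownership` and `claim` callables) are hypothetical and invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class OwnershipClaimError(Exception):
    """Hypothetical error that remembers which partition could not be claimed."""

    def __init__(self, partition_id: str, cause: Exception):
        super().__init__(f"failed to claim ownership of partition {partition_id}")
        self.partition_id = partition_id
        self.cause = cause


@dataclass
class ErrorArgs:
    """Hypothetical stand-in for the arguments handed to the error handler."""
    partition_id: Optional[str]  # None unless an ownership claim failed
    operation: str
    exception: Exception


def run_load_balancing(
    list_ownership: Callable[[], None],
    claim: Callable[[str], None],
    partition_id: str,
) -> None:
    """Run one load balancing step for a single candidate partition."""
    # A failure here is not tied to any one partition, so it propagates as-is.
    list_ownership()
    try:
        claim(partition_id)
    except Exception as ex:
        # Attach the partition id so the caller does not have to guess it.
        raise OwnershipClaimError(partition_id, ex) from ex


def run_cycle(
    list_ownership: Callable[[], None],
    claim: Callable[[str], None],
    partition_id: str,
    on_error: Callable[[ErrorArgs], None],
) -> None:
    """Run one cycle and report any failure to the error handler."""
    try:
        run_load_balancing(list_ownership, claim, partition_id)
    except OwnershipClaimError as ex:
        on_error(ErrorArgs(ex.partition_id, "claim ownership", ex.cause))
    except Exception as ex:
        on_error(ErrorArgs(None, "load balancing", ex))
```

In this sketch the handler sees a partition id only for claim failures, which matches the expectation described above; whether the real client should use a dedicated exception type or some other mechanism is left open.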

Goal

  • Make the necessary changes to the client to make the tests pass reliably without flakiness. The tests themselves could be the real problem, so changes to the tests might also be necessary.
  • Remove the Ignore attribute from the aforementioned tests.
  • Assert that no other tests fail because of the changes.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 11 (9 by maintainers)

Top GitHub Comments

1 reaction
jsquire commented, Mar 5, 2020

Looks like that disable was included in #10320

0 reactions
jsquire commented, Mar 6, 2020

In theory, no… the CI and nightly runs are the same for unit tests. The nightly runs include the live tests as well and, as a result, have a much higher potential for intermittent delays due to longer-running operations, availability in the thread pool, and general wonkiness around ARM calls.

Consequently, we seem to find tests with timing sensitivity during nightly runs more so than in CI or local runs, where the time variance isn’t as dramatic.
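
A common way to reduce this kind of timing sensitivity is to have a test wait on an explicit completion signal with a generous timeout instead of sleeping for a fixed interval. The following is a minimal Python `asyncio` sketch of that general technique, not code from the Event Hubs test suite and not part of the comment above; `FakeProcessor`, `wait_for_close`, and `run_test` are hypothetical names.

```python
import asyncio


class FakeProcessor:
    """Hypothetical processor that signals once its closing handler has run."""

    def __init__(self, shutdown_delay: float):
        self.closed = asyncio.Event()
        self._shutdown_delay = shutdown_delay

    async def stop(self) -> None:
        # Simulate a variable amount of shutdown work (scheduling delays, etc.).
        await asyncio.sleep(self._shutdown_delay)
        self.closed.set()


async def wait_for_close(processor: FakeProcessor, timeout: float) -> bool:
    """Return True once the handler has run, or False if the timeout expires."""
    try:
        await asyncio.wait_for(processor.closed.wait(), timeout)
        return True
    except asyncio.TimeoutError:
        return False


async def run_test(shutdown_delay: float, timeout: float) -> bool:
    """Stop the fake processor and report whether it closed within the timeout."""
    processor = FakeProcessor(shutdown_delay)
    stop_task = asyncio.create_task(processor.stop())
    completed = await wait_for_close(processor, timeout)
    await stop_task  # let the simulated shutdown finish before returning
    return completed
```

Because the wait ends as soon as the signal is set, a generous timeout only costs time when the test is already failing, so it can absorb the larger time variance of nightly runs without slowing down passing runs.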
