spark: fix flaky tests
Here fails io.openlineage.spark.agent.SparkContainerIntegrationTest [10] with parameters spark_v2_drop.py, pysparkV2DropTableStartEvent.json, pysparkV2DropTableCompleteEvent.json, true,
and here testFilteringDeltaEvents:
https://app.circleci.com/pipelines/github/OpenLineage/OpenLineage/4634/workflows/3acb2e0a-ee15-45b2-9d5e-b38c00228d6c/jobs/56458
It is still a very bad experience. We should strive to eliminate this problem, maybe by rewriting the tests or by making them work with larger datasets, if that is what will allow us to emit events in time (see the sketch below).
https://app.circleci.com/pipelines/github/OpenLineage/OpenLineage/4816/workflows/962d0d0f-e8ce-4aba-947e-e46dc9db59e3/jobs/60092
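One way to reduce this kind of flakiness on the test side is to poll for the expected events with a timeout instead of asserting once right after the job finishes. Below is a minimal sketch using Awaitility; fetchEmittedEvents() is a hypothetical helper standing in for however the integration test reads events back from its mock collector, and the expected-event check is illustrative only.

```java
import static org.awaitility.Awaitility.await;

import java.time.Duration;
import java.util.List;
import org.assertj.core.api.Assertions;

class FlakyEventAssertionSketch {

  void assertEventsEventuallyEmitted() {
    // Poll for up to 30 seconds instead of asserting immediately after the job
    // finishes, so a listener that emits events late does not fail the test.
    await()
        .atMost(Duration.ofSeconds(30))
        .pollInterval(Duration.ofSeconds(1))
        .untilAsserted(() -> {
          List<String> events = fetchEmittedEvents(); // hypothetical helper
          Assertions.assertThat(events)
              .anyMatch(e -> e.contains("pysparkV2DropTableCompleteEvent"));
        });
  }

  private List<String> fetchEmittedEvents() {
    // Placeholder: in the real test this would query the mock OpenLineage endpoint.
    throw new UnsupportedOperationException("test-specific");
  }
}
```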
@mobuchowski indeed Maciej, they didn’t merge the fix for Spark 3.3.1.
To rephrase the problem: before we are able to process SparkListenerSQLExecutionStart, the SparkListenerSQLExecutionEnd is published and the query execution is removed from executionIdToQueryExecution, which we require for processing the start event. Are we somehow able to check, while processing the start event, whether there is already an end event and get the QueryExecution from it?
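A minimal sketch of that fallback idea (not the OpenLineage implementation): cache the QueryExecution keyed by executionId whenever it can still be resolved from SQLExecution, and let the start-event handler fall back to that cache once executionIdToQueryExecution has been cleared. Everything except the Spark listener API and SQLExecution.getQueryExecution is made up for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.spark.scheduler.SparkListener;
import org.apache.spark.scheduler.SparkListenerEvent;
import org.apache.spark.sql.execution.QueryExecution;
import org.apache.spark.sql.execution.SQLExecution;
import org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionEnd;
import org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionStart;

public class QueryExecutionCachingListener extends SparkListener {

  // executionId -> QueryExecution captured while it was still registered in Spark
  private final Map<Long, QueryExecution> cachedExecutions = new ConcurrentHashMap<>();

  @Override
  public void onOtherEvent(SparkListenerEvent event) {
    if (event instanceof SparkListenerSQLExecutionEnd) {
      SparkListenerSQLExecutionEnd end = (SparkListenerSQLExecutionEnd) event;
      // Try to grab the QueryExecution in case Spark has not yet removed it
      // from executionIdToQueryExecution; it may already be null if we are late.
      QueryExecution qe = SQLExecution.getQueryExecution(end.executionId());
      if (qe != null) {
        cachedExecutions.put(end.executionId(), qe);
      }
      // ... build and emit the COMPLETE event here ...
    } else if (event instanceof SparkListenerSQLExecutionStart) {
      SparkListenerSQLExecutionStart start = (SparkListenerSQLExecutionStart) event;
      // Normal path: ask Spark first; fall back to the cached copy if the
      // execution has already been unregistered.
      QueryExecution qe = SQLExecution.getQueryExecution(start.executionId());
      if (qe == null) {
        qe = cachedExecutions.remove(start.executionId());
      }
      if (qe != null) {
        // ... build and emit the START event from qe ...
      }
    }
  }
}
```

The cache entries would also need to be evicted when they are consumed or when the execution is otherwise finished, so the map does not grow unbounded on long-running drivers.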