test_join_sort.py times out in CI with the new Dask release 2022.2.0

Error: https://github.com/modin-project/modin/runs/5195622251?check_suite_focus=true

Dask release: https://github.com/dask/dask/releases/tag/2022.02.0

The fastest workaround is to pin dask<2022.2.0, but the cause of the timeout still needs to be investigated.
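
A minimal sketch of what the pin admits, for illustration only. The release is tagged 2022.02.0 on GitHub, but version comparison treats it as 2022.2.0, so the pin excludes it either way. The helper names `parse_release` and `allowed_by_pin` are hypothetical, and the parser handles plain numeric releases only; the `packaging` library is the better choice for full version handling.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn a dotted release string such as '2022.02.0' into (2022, 2, 0)."""
    return tuple(int(part) for part in version.split("."))


# First release excluded by the pin dask<2022.2.0.
FIRST_EXCLUDED = parse_release("2022.2.0")


def allowed_by_pin(version: str) -> bool:
    """Return True if `version` satisfies the constraint dask<2022.2.0."""
    return parse_release(version) < FIRST_EXCLUDED


if __name__ == "__main__":
    for candidate in ("2022.1.1", "2022.02.0", "2022.2.0"):
        print(candidate, allowed_by_pin(candidate))
```

In practice the constraint itself would go wherever the project declares its dependencies.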

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

dchigarev commented, Feb 15, 2022 (2 reactions)

Besides fixing the perf issue, I think we can revise the test_join_sort suites to reduce the number of cases. This one test file alone generates 17,000+ cases (about 35% of all tests for dataframes).
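
To illustrate why one test file can produce so many cases: stacked pytest parametrizations generate the Cartesian product of their values, so the case count is the product of the list lengths. The sketch below counts cases for a made-up grid and for a trimmed version of it. The parameter names and values are hypothetical and are not the actual test_join_sort parameters.

```python
from itertools import product

# Hypothetical parameter grid, for illustration only.
full_grid = {
    "data": ["int_data", "float_data", "mixed_data"],
    "axis": [0, 1],
    "ascending": [True, False],
    "na_position": ["first", "last"],
    "kind": ["quicksort", "mergesort", "heapsort"],
}


def count_cases(grid):
    """Count the combinations a full cross product of the grid would generate."""
    return sum(1 for _ in product(*grid.values()))


# Keep one representative value for two of the parameters.
reduced_grid = dict(full_grid, na_position=["last"], kind=["quicksort"])

if __name__ == "__main__":
    print("full grid:   ", count_cases(full_grid))     # 3 * 2 * 2 * 2 * 3 = 72
    print("reduced grid:", count_cases(reduced_grid))  # 3 * 2 * 2 * 1 * 1 = 12
```

Dropping values from even two parameters shrinks this grid sixfold, which is the kind of reduction the comment above suggests.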

mvashishtha commented, Jun 8, 2022 (0 reactions)

I am running MODIN_ENGINE=dask pytest -n 2 --durations=0 modin/pandas/test/dataframe/test_join_sort.py on my MacBook to understand how test time is distributed across cases on dask 2022.1.1.
