[BUG] Prediction with empty partitions fails on sklearn dask-ml models

Prediction with empty partitions fails on sklearn dask-ml models. This is because sklearn currently errors on empty frames. I am opening this issue here to track the best approach (whether it's a fix that should go in dask-ml, sklearn, or dask-sql).

Trace:

Exception: "ValueError('Found array with 0 sample(s) (shape=(0, 2)) while a minimum of 1 is required.')"

What happened:

%%sql
SELECT * FROM PREDICT(
  MODEL model,
  SELECT * FROM test_set limit 100
)

What you expected to happen:

I would expect this to work, similar to cuML.
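
One way to picture the expected behaviour is a guard that skips the model call whenever a partition has no rows. The sketch below is only an illustration under stated assumptions: `StandInModel` and `predict_partition` are hypothetical names, the stand-in class merely mimics the `ValueError` from the trace and is not real sklearn, dask-ml, or dask-sql code, and partitions are modelled as plain Python lists.

```python
from typing import List, Sequence

Row = Sequence[float]


class StandInModel:
    """Hypothetical stand-in for an sklearn estimator (not real sklearn code).

    It only mimics the behaviour reported in the trace: predict() raises
    ValueError when it is given zero rows.
    """

    def predict(self, rows: List[Row]) -> List[float]:
        if len(rows) == 0:
            raise ValueError(
                "Found array with 0 sample(s) while a minimum of 1 is required."
            )
        # Toy prediction: the sum of each row's features.
        return [float(sum(row)) for row in rows]


def predict_partition(model, rows: List[Row]) -> List[float]:
    """Predict on one partition, skipping the model call when it is empty."""
    if len(rows) == 0:
        return []  # empty partition in, empty predictions out
    return model.predict(rows)


# Three partitions, the middle one empty (for example after a filter).
partitions = [[(1.0, 2.0), (3.0, 4.0)], [], [(5.0, 6.0)]]
model = StandInModel()
predictions = [predict_partition(model, part) for part in partitions]
print(predictions)
```

In a real setup the same check would presumably sit in whatever function is applied to each Dask partition, and the empty return value would need to match the output type and columns of a non-empty prediction.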

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 5

Top GitHub Comments

1 reaction
VibhuJawa commented, Mar 10, 2022

Is this an issue that can be narrowed down to a Dask-ML reproducer? If so, I would assume a fix would make sense there, as generally Dask APIs shouldn’t run into issues if a dataframe contains empty partitions.

Yup. The hope is that I can push a fix for this in Dask-ML. If not, then fall back to a fix here. I would like to keep this issue open for tracking purposes.

0 reactions
sarahyurick commented, Dec 21, 2022

Can we close this issue since we’ve eliminated all Dask-ML dependencies?

Top Results From Across the Web

xgboost.dask.predict fails in the presence of empty partitions
Using xgb.dask.predict on a dataset which has a few empty partitions fails with an error.

Dask DataFrame filter and repartition gives some empty ...
I found two existing posts from SO. remove empty partitions using cull_empty_partitions(); rebalance to get even partition sizes using ...

dask-sql - bytemeta
[ENH] Warn users when training non dask friendly ML models with wrap_predict=False. ... [BUG] Prediction with empty partitions fails on sklearn dask-ml models.
