Improve ML implementation

The ML implementation is still a bit experimental - we can improve on this:

SHOW MODELS and DESCRIBE MODEL
Hyperparameter optimizations, AutoML-like behaviour
Exporting models, an idea brought up by @romainr (#191; still missing: ONNX, see the discussion in the PR by @rajagurunath)
More showcases and examples

Issue Analytics

State:
Created 3 years ago
Comments: 25

Top GitHub Comments

2 reactions

ckmganesh commented, Feb 6, 2021

Thanks for the explanation. It could take me some time to go through the code and implement this, but I’ll try my best. Thanks

1 reaction

nils-braun commented, Jun 28, 2021

Your changes get better with each PR, @rajagurunath - congratulations! I am not an expert in these AutoML packages, but this already looks really good. You can open a PR with those changes; I do not think I will have many comments this time. Good work!

Any idea how to proceed with this?

Sorry, I missed sending an answer to this. You can create a dask client yourself (as you pointed out in your comment), or you can rely on dask’s “auto-client” feature: if a client has already been set up, it is picked up automatically. So all you need to do in your tests is use the “client” pytest fixture (which comes from the dask.distributed package and is already imported). It sets up a valid dask client for you, and the XGBoost implementation will pick it up automatically.

I think we should not create our own parameter for this and should instead use the “auto-client” feature. What do you think? (If you are wondering how this works when users run dask-sql in SQL-only mode: the SQL server already sets up a dask client automatically, so this is not a problem.)
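
To illustrate why no extra parameter is needed, here is a minimal stand-alone sketch of the “pick up whichever client already exists” pattern. It is a toy model in plain Python, not dask’s actual implementation: `ToyClient`, `current_client` and `train_model` are hypothetical names invented for this example.

```python
from typing import List, Optional

# Toy model only: every name here is hypothetical and is NOT dask's API.
_active_clients: List["ToyClient"] = []


class ToyClient:
    """Stand-in for a cluster client that registers itself on creation."""

    def __init__(self, name: str) -> None:
        self.name = name
        _active_clients.append(self)

    def close(self) -> None:
        # Unregister so later lookups no longer see this client.
        if self in _active_clients:
            _active_clients.remove(self)


def current_client() -> Optional[ToyClient]:
    """Return the most recently created client that is still open, if any."""
    return _active_clients[-1] if _active_clients else None


def train_model(data: List[float]) -> str:
    """Hypothetical training entry point that takes no client parameter."""
    client = current_client()
    if client is None:
        raise RuntimeError("no client has been set up")
    return f"trained on {len(data)} rows via {client.name}"


if __name__ == "__main__":
    client = ToyClient("test-fixture")  # plays the role of the pytest fixture
    print(train_model([1.0, 2.0, 3.0]))
    client.close()
```

In this sketch a test only has to make sure a client exists before calling `train_model`, which mirrors the role the “client” fixture plays in the comment above; the training code itself never needs to know who created the client.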