Improve ML implementation

The ML implementation is still a bit experimental - we can improve on this:

  • SHOW MODELS and DESCRIBE MODEL
  • Hyperparameter optimizations, AutoML-like behaviour
  • @romainr brought up the idea of exporting models (#191; ONNX export is still missing, see the discussion in the PR by @rajagurunath)
  • and some more showcases and examples

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 25

Top GitHub Comments

2 reactions
ckmganesh commented, Feb 6, 2021

Thanks for the explanation. It could take some time for me to go through the code and implement this. I’ll try my best to do this. Thanks.

1 reaction
nils-braun commented, Jun 28, 2021

Your changes get better with each PR, @rajagurunath - congratulations! I am not an expert in these AutoML packages, but this already looks really good. You can open a PR with those changes; I do not think I will have many comments this time. Good work!

Any idea how to proceed with this?

Sorry, I missed sending an answer to this. You can create a dask client yourself (as you pointed out in your comment), or you can rely on dask’s “auto-client” feature: if a client has been set up beforehand, dask picks it up automatically. So all you need to do in your tests is use the “client” pytest fixture (it comes from the dask.distributed package, which is already imported). The fixture sets up a valid dask client for you, and the XGBoost implementation will pick it up automatically.

I think we should not create our own parameter for this and should rather use this “auto-client” feature. What do you think? (If you are wondering how this works when users run dask-sql in SQL-only mode: the SQL server already sets up a dask client automatically, so this is not a problem.)


