
Joblib-spark as a possible alternative for Distributed Optimization

See original GitHub issue

Since we are able to use Optuna with joblib, it seems possible to generalize the method using joblib-spark to leverage a Spark backend, similar to Hyperopt's SparkTrials(). Of course, the trade-offs between parallelism and run time should be considered here.
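As a rough sketch of the joblib-spark side of this idea (assuming joblibspark is installed and a SparkSession is available; the toy function below is purely illustrative), joblib tasks can be shipped to Spark executors by registering and selecting the "spark" backend:

    # Register joblib-spark's "spark" backend and run joblib tasks on Spark
    # executors instead of local processes/threads.
    from joblib import Parallel, delayed, parallel_backend
    from joblibspark import register_spark

    register_spark()  # makes the backend name "spark" available to joblib

    def square(x):
        return x * x

    with parallel_backend("spark", n_jobs=4):
        results = Parallel()(delayed(square)(i) for i in range(16))

    print(results)  # [0, 1, 4, 9, ...]

Whether this pays off depends on how expensive a single task is relative to the overhead of shipping it to an executor, which is the parallelism/run-time trade-off mentioned above.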

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 14 (5 by maintainers)

Top GitHub Comments

1 reaction
WaterKnight1998 commented, Jun 21, 2022

FYI: Optuna does not use joblib internally as of #2269, but Optuna can still be used with joblib and joblib-spark as it is now.

Could you share an example, please? Thanks in advance @HideakiImamura
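For reference, a sketch of what such an example could look like, given that Optuna no longer parallelizes trials through joblib itself: each Spark task loads the same study from a shared storage and runs a slice of the trials. The storage URL, study name, and trial counts below are placeholders, not a recommended setup:

    import joblib
    import optuna
    from joblibspark import register_spark

    register_spark()  # expose the "spark" backend to joblib

    STORAGE = "postgresql://user:pass@db-host/optuna"  # hypothetical shared RDB
    STUDY_NAME = "joblib-spark-demo"                   # hypothetical study name

    def objective(trial):
        x = trial.suggest_float("x", -10, 10)
        return (x - 2) ** 2

    def run_trials(n_trials):
        # Runs on a Spark executor; every worker writes trials to the shared study.
        study = optuna.load_study(study_name=STUDY_NAME, storage=STORAGE)
        study.optimize(objective, n_trials=n_trials)

    optuna.create_study(study_name=STUDY_NAME, storage=STORAGE,
                        direction="minimize", load_if_exists=True)

    # 8 Spark tasks x 25 trials each = 200 trials in total.
    with joblib.parallel_backend("spark", n_jobs=8):
        joblib.Parallel()(joblib.delayed(run_trials)(25) for _ in range(8))

    print(optuna.load_study(study_name=STUDY_NAME, storage=STORAGE).best_params)

The key assumption is that the storage backend is an RDB reachable from every Spark executor, so that the workers coordinate through Optuna's storage rather than through joblib.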

1 reaction
felipeeeantunes commented, Oct 17, 2020

@toshihikoyanase started this PR https://github.com/optuna/optuna/pull/1942.

I have some doubts about it: the current version of joblib-spark has a bug, fixed in this merged PR (https://github.com/joblib/joblib-spark/pull/21). I should figure out how to add the fixed version to the Dockerfile instead of installing the release from pip.

Also, I should figure out how to provide a Dockerfile and Kubernetes YAML with Spark to reproduce the example in minikube. Can you help with that?

