How to set up elephas on spark workers with --archives
When running the basic example from the docs:
```python
from elephas.utils.rdd_utils import to_simple_rdd
from elephas.spark_model import SparkModel
from elephas import optimizers as elephas_optimizers

rdd = to_simple_rdd(sc, x_train, y_train)
sgd = elephas_optimizers.SGD()
spark_model = SparkModel(sc, model, optimizer=sgd, frequency='epoch',
                         mode='asynchronous', num_workers=2)
spark_model.train(rdd, nb_epoch=epochs, batch_size=batch_size,
                  verbose=1, validation_split=0.1)
```
I get the following error: `ImportError: No module named elephas.spark_model`. I am using PySpark 2.1 and Keras 2. Any suggestions?
```
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 5.0 failed 4 times, most recent failure: Lost task 1.3 in stage 5.0 (TID 58, xxxx, executor 8): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/xx/xx/hadoop/yarn/local/usercache/xx/appcache/application_1512662857247_19188/container_151xxx2857247_19188_01_000009/pyspark.zip/pyspark/worker.py", line 163, in main
    func, profiler, deserializer, serializer = read_command(pickleSer, infile)
  File "/xx/xx/hadoop/yarn/local/usercache/xx/appcache/application_1512662857247_19188/container_151xxx2857247_19188_01_000009/pyspark.zip/pyspark/worker.py", line 54, in read_command
    command = serializer._read_with_length(file)
  File "/yarn/local/usercache/xx/appcache/application_1512xx57247_19x8/container_1512xxx857247_19188_01_000009/pyspark.zip/pyspark/serializers.py", line 169, in _read_with_length
    return self.loads(obj)
  File "/yarn/local/usercache/xx/appcache/application_1512xx57247_19x8/container_1512xxx857247_19188_01_000009/pyspark.zip/pyspark/serializers.py", line 454, in loads
    return pickle.loads(obj)
ImportError: No module named elephas.spark_model

    at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRDD.scala:193)
    at org.apache.spark.api.python.PythonRunner$$anon$1.<init>(PythonRDD.scala:234)
    at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:152)
    at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```
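The `ImportError` in the traceback means the `elephas` package is installed on the driver but missing from the Python environment of the YARN executors, so the workers fail when unpickling tasks that reference it. As one common workaround, a sketch of shipping the package itself with `--py-files` (the paths and script name here are illustrative assumptions, not taken from the report above):

```shell
# Locate the site-packages directory that contains elephas on the driver.
SITE_PACKAGES=$(python -c "import elephas, os; print(os.path.dirname(os.path.dirname(elephas.__file__)))")

# Zip the package so Spark can distribute it to every executor.
cd "$SITE_PACKAGES" && zip -rq /tmp/elephas.zip elephas/

# Entries passed via --py-files are added to sys.path on each executor
# before tasks are deserialized. "your_training_script.py" is a placeholder.
spark-submit \
  --master yarn \
  --py-files /tmp/elephas.zip \
  your_training_script.py
```

Note that this only ships pure-Python code; elephas also depends on Keras/TensorFlow, which include native extensions and generally need to be installed on the workers (or shipped as a full environment, see the virtualenv approach below the search results).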
Issue Analytics
- Created 6 years ago
- Comments: 6 (3 by maintainers)
Top Results From Across the Web
- python - Elephas not loaded in PySpark: No module named ...
  "I found a solution on how to properly load a virtual environment to the master and all the slave workers: virtualenv venv --relocatable..."
- Distributed Deep Learning with Elephas
  "This post presents the Python code to run a Keras model in a distributed environment powered by Apache Spark."
- Distributed Deep Learning Pipelines with PySpark and Keras
  "The first thing we do with Elephas is create an estimator similar to some of the PySpark pipeline items above. We can set..."
- Spark ML model pipelines on Distributed Deep Neural Nets
  "If you don't have it already, install Spark locally by following the instructions provided ... --driver-memory 4G elephas/examples/Spark_ML_Pipeline.ipynb."
- Deep Learning With Apache Spark: Part 1 - KDnuggets
  "This part: What is Spark, basics on Spark+DL and a little more. ... Deep Learning Pipelines is an open source library created by..."
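Matching the issue title, the virtualenv approach from the first search result can be combined with `spark-submit --archives`, so every YARN container unpacks a complete Python environment instead of relying on whatever is installed on the node. A hedged sketch; the archive name, alias, and script name are illustrative assumptions:

```shell
# Build a virtualenv on the driver and install the dependencies the
# workers will need.
virtualenv venv
venv/bin/pip install elephas keras

# Pack the environment into an archive Spark can distribute.
cd venv && zip -rq ../venv.zip . && cd ..

# The '#environment' suffix is the alias under which YARN unpacks the
# archive in each container; PYSPARK_PYTHON points the application master
# and executors at the interpreter inside it.
spark-submit \
  --master yarn \
  --archives venv.zip#environment \
  --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./environment/bin/python \
  --conf spark.executorEnv.PYSPARK_PYTHON=./environment/bin/python \
  your_training_script.py
```

A caveat on the design: plain virtualenvs are not always relocatable (hence the `virtualenv venv --relocatable` flag in the quoted answer); tools built for packing environments, such as conda-pack or venv-pack, tend to be more robust for this pattern.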
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
yeah, production systems are always a little messy. in the end I can only guess what’s going on. keep me posted in case I can help somehow.
cool, thanks for your feedback. I’ve changed the name of the issue so people can find this. at some point I want to write up how to use elephas from scratch on AWS or GCE etc., this might be very helpful.