
How to set up elephas on spark workers with --archives

See original GitHub issue

When running the basic example from the doc:

```python
from elephas.utils.rdd_utils import to_simple_rdd
from elephas.spark_model import SparkModel
from elephas import optimizers as elephas_optimizers

rdd = to_simple_rdd(sc, x_train, y_train)

sgd = elephas_optimizers.SGD()
spark_model = SparkModel(sc, model, optimizer=sgd, frequency='epoch',
                         mode='asynchronous', num_workers=2)
spark_model.train(rdd, nb_epoch=epochs, batch_size=batch_size,
                  verbose=1, validation_split=0.1)
```

I get the following error: "ImportError: No module named elephas.spark_model". I use PySpark 2.1 and Keras 2. Any suggestions?

```
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 5.0 failed 4 times, most recent failure: Lost task 1.3 in stage 5.0 (TID 58, xxxx, executor 8): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/xx/xx/hadoop/yarn/local/usercache/xx/appcache/application_1512662857247_19188/container_151xxx2857247_19188_01_000009/pyspark.zip/pyspark/worker.py", line 163, in main
    func, profiler, deserializer, serializer = read_command(pickleSer, infile)
  File "/xx/xx/hadoop/yarn/local/usercache/xx/appcache/application_1512662857247_19188/container_151xxx2857247_19188_01_000009/pyspark.zip/pyspark/worker.py", line 54, in read_command
    command = serializer._read_with_length(file)
  File "/yarn/local/usercache/xx/appcache/application_1512xx57247_19x8/container_1512xxx857247_19188_01_000009/pyspark.zip/pyspark/serializers.py", line 169, in _read_with_length
    return self.loads(obj)
  File "/yarn/local/usercache/xx/appcache/application_1512xx57247_19x8/container_1512xxx857247_19188_01_000009/pyspark.zip/pyspark/serializers.py", line 454, in loads
    return pickle.loads(obj)
ImportError: No module named elephas.spark_model

	at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRDD.scala:193)
	at org.apache.spark.api.python.PythonRunner$$anon$1.<init>(PythonRDD.scala:234)
	at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:152)
	at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:99)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
```
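
The root cause behind this traceback is that elephas is installed on the driver but not importable by the executor Python processes, so `pickle.loads` fails as soon as a task references `elephas.spark_model`. One common fix, and the one the issue title points at, is to ship a virtualenv containing elephas to the workers with `--archives`. The sketch below is illustrative rather than taken from this thread: the names `venv`, `venv.zip`, `environment`, and `train.py` are placeholders, and `--relocatable` is the (since-removed) virtualenv flag mentioned in the first web result further down.

```bash
# Build a virtualenv with elephas and its Python dependencies.
virtualenv venv
venv/bin/pip install elephas keras
virtualenv --relocatable venv   # old virtualenv flag; makes internal paths relative
zip -r venv.zip venv

# YARN extracts venv.zip into each container's working directory under the
# alias "environment"; PYSPARK_PYTHON points the executors at that interpreter,
# while the driver keeps using its local Python.
export PYSPARK_DRIVER_PYTHON=python
export PYSPARK_PYTHON=./environment/venv/bin/python
spark-submit --master yarn --archives venv.zip#environment train.py
```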

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

2 reactions
maxpumperla commented, Mar 7, 2018

Yeah, production systems are always a little messy. In the end I can only guess what’s going on. Keep me posted in case I can help somehow.

0 reactions
maxpumperla commented, Mar 8, 2018

Cool, thanks for your feedback. I’ve changed the name of the issue so people can find this. At some point I want to write up how to use elephas from scratch on AWS or GCE etc.; this might be very helpful.

Read more comments on GitHub >

Top Results From Across the Web

python - Elephas not loaded in PySpark: No module named ...
I found a solution on how to properly load a virtual environment to the master and all the slave workers: virtualenv venv --relocatable...
Read more >
Distributed Deep Learning with Elephas
This post presents the python code to run Keras model in a distributed environment powered by Apache Spark.
Read more >
Distributed Deep Learning Pipelines with PySpark and Keras
The first thing we do with Elephas is create an estimator similar to some of the PySpark pipeline items above. We can set...
Read more >
Spark ML model pipelines on Distributed Deep Neural Nets
If you don't have it already, install Spark locally by following the instructions provided ... --driver-memory 4G elephas/examples/Spark_ML_Pipeline.ipynb.
Read more >
Deep Learning With Apache Spark: Part 1 - KDnuggets
This part: What is Spark, basics on Spark+DL and a little more. ... Deep Learning Pipelines is an open source library created by...
Read more >
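
The first result above describes the same virtualenv approach sketched earlier. When elephas itself is the only missing module, a lighter alternative is to ship just the package as a zip via `--py-files` (or programmatically with `sc.addPyFile`), which puts it on each executor's `sys.path`. This is again a hedged sketch with placeholder names (`deps`, `elephas.zip`, `train.py`), not a recipe from this thread; note that `--py-files` only covers pure-Python code, so Keras and TensorFlow, which include native extensions, still have to be installed on every worker node.

```bash
# Stage elephas (pure-Python) into a zip that executors can import from.
pip install elephas -t deps
(cd deps && zip -r ../elephas.zip .)

# Ship the zip to every executor; inside the job, sc.addPyFile("elephas.zip")
# would achieve the same thing.
spark-submit --master yarn --py-files elephas.zip train.py
```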
