Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
Could you please help me? I am including my model and the error I am seeing below.
```python
model.compile(loss=lossFunc, optimizer=gradDiscent, metrics=['accuracy'])

# ############################ START: DISTRIBUTED MODEL ############################
from pyspark import SparkContext, SparkConf

# Create Spark context
conf = SparkConf().setAppName('NSL-KDD-DISTRIBUTED').setMaster('local[8]')
sc = SparkContext(conf=conf)

from elephas.utils.rdd_utils import to_simple_rdd

# Build RDD (Resilient Distributed Dataset) from numpy features and labels
rdd = to_simple_rdd(sc, trainX, trainY)

from elephas.spark_model import SparkModel
from elephas import optimizers as elephas_optimizers

# Initialize SparkModel from Keras model and Spark context
# (note: the keyword is `mode`, not `model`, in elephas' SparkModel)
elphOptimizer = elephas_optimizers.Adagrad()
sparkModel = SparkModel(sc, model, optimizer=elphOptimizer, frequency='epoch',
                        mode='asynchronous', num_workers=1)

# Train Spark model
sparkModel.train(rdd, nb_epoch=epochs, batch_size=batchSize, verbose=2)

# Evaluate Spark model on the master network
score = sparkModel.master_network.evaluate(testX, testY, verbose=2)
print(score)
```
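For context, elephas' `to_simple_rdd` essentially pairs each feature row with its label and parallelizes the pairs via the Spark context. A rough pure-Python analogue of the pairing step (my sketch, with PySpark left out so the snippet runs anywhere; `to_pairs` is a hypothetical name):

```python
def to_pairs(features, labels):
    """Pair each feature row with its label, roughly what to_simple_rdd
    does before handing the pairs to sc.parallelize (sketch, not elephas code)."""
    return list(zip(features, labels))

pairs = to_pairs([[0, 1], [1, 0]], [0, 1])
print(pairs)  # [([0, 1], 0), ([1, 0], 1)]
```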
#################### ERROR ####################

```
Traceback (most recent call last):
  File "C:\PythonWorks\mine\dm-dist.py", line 230, in <module>
    sparkModel.train(rdd, nb_epoch=epochs, batch_size=batchSize, verbose=2);
  File "C:\Miniconda3\lib\site-packages\elephas\spark_model.py", line 194, in train
    self._train(rdd, nb_epoch, batch_size, verbose, validation_split, master_url)
  File "C:\Miniconda3\lib\site-packages\elephas\spark_model.py", line 205, in _train
    self.start_server()
  File "C:\Miniconda3\lib\site-packages\elephas\spark_model.py", line 125, in start_server
    self.server.start()
  File "C:\Miniconda3\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Miniconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Miniconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Miniconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Miniconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "C:\Miniconda3\lib\site-packages\pyspark\context.py", line 306, in __getnewargs__
    "It appears that you are attempting to reference SparkContext from a broadcast "
Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
```
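The traceback shows what is going wrong: on Windows, `multiprocessing` can only start child processes with the "spawn" method (`popen_spawn_win32`), which pickles the `Process` object and everything it references to ship it to the child. Elephas' parameter server process references the `SparkModel`, which holds `sc`, and PySpark deliberately makes `SparkContext` unpicklable by raising from `__getnewargs__` (that is the guard behind SPARK-5063). On Linux, "fork" is used instead and nothing is pickled, which matches the report below that the same code works there. A minimal sketch of that pickling guard, using hypothetical `FakeContext`/`Holder` classes rather than PySpark itself:

```python
import pickle


class FakeContext:
    """Mimics PySpark's SparkContext pickling guard (hypothetical class)."""

    def __getnewargs__(self):
        # PySpark raises here so a SparkContext can never be serialized
        # and shipped to a worker or a spawned child process.
        raise Exception(
            "It appears that you are attempting to reference SparkContext "
            "from a broadcast variable, action, or transformation."
        )


class Holder:
    """Any object keeping a reference to the context, e.g. a SparkModel."""

    def __init__(self, ctx):
        self.ctx = ctx


try:
    pickle.dumps(Holder(FakeContext()))  # pickling recurses into ctx
except Exception as e:
    print("pickling failed:", e)
```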
Issue Analytics
- Created 5 years ago
- Comments: 18 (4 by maintainers)
Top GitHub Comments
@mohaimenz thanks. yeah, I’m not 100% happy with dist-keras for the simple reason that it obviously started as an elephas fork without ever crediting it (until I forced the guy to do it).
This is how open source dies… instead of trying to grab GitHub fame (useless stars) and being selfish, just try to help out. If Joeri had just put his time into patching elephas and become maintainer, which I would have offered, instead of stealing it, we’d have a better product now, not 2 mediocre ones that compete. Alright, that’s my rant, haha. 😄
@maxpumperla Hi Max, I am closing this issue. I have found that elephas does not work on Windows. I ran it on Linux and it works, but only with frequency='epochs'. Frequency type 'batch' does not work because you use slice_X there, which no longer exists in Keras. I will open a new issue for that.