
Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.


Could you please help me? Below are my model code and the error I am seeing.

model.compile(loss=lossFunc, optimizer=gradDiscent, metrics=['accuracy'])

##############################START: DISTRIBUTED MODEL################
from pyspark import SparkContext, SparkConf
# Create the Spark context
conf = SparkConf().setAppName('NSL-KDD-DISTRIBUTED').setMaster('local[8]')
sc = SparkContext(conf=conf)

from elephas.utils.rdd_utils import to_simple_rdd
# Build an RDD (Resilient Distributed Dataset) from numpy features and labels
rdd = to_simple_rdd(sc, trainX, trainY)

from elephas.spark_model import SparkModel
from elephas import optimizers as elephas_optimizers
# Initialize the SparkModel from the Keras model and the Spark context
elphOptimizer = elephas_optimizers.Adagrad()
# Note: the keyword for the training mode is 'mode', not 'model'
sparkModel = SparkModel(sc, model, optimizer=elphOptimizer, frequency='epoch', mode='asynchronous', num_workers=1)

# Train the Spark model
sparkModel.train(rdd, nb_epoch=epochs, batch_size=batchSize, verbose=2)

# Evaluate the Spark model
score = sparkModel.master_network.evaluate(testX, testY, verbose=2)
print(score)

####################ERROR########################
Traceback (most recent call last):
  File "C:\PythonWorks\mine\dm-dist.py", line 230, in <module>
    sparkModel.train(rdd, nb_epoch=epochs, batch_size=batchSize, verbose=2);
  File "C:\Miniconda3\lib\site-packages\elephas\spark_model.py", line 194, in train
    self._train(rdd, nb_epoch, batch_size, verbose, validation_split, master_url)
  File "C:\Miniconda3\lib\site-packages\elephas\spark_model.py", line 205, in _train
    self.start_server()
  File "C:\Miniconda3\lib\site-packages\elephas\spark_model.py", line 125, in start_server
    self.server.start()
  File "C:\Miniconda3\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Miniconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Miniconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Miniconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Miniconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "C:\Miniconda3\lib\site-packages\pyspark\context.py", line 306, in __getnewargs__
    "It appears that you are attempting to reference SparkContext from a broadcast "
Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
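The traceback shows what actually goes wrong here: on Windows, multiprocessing spawns the parameter-server process, and spawning pickles the server object, which (via the SparkModel) holds a reference to the SparkContext; SparkContext refuses to be pickled, so reduction.dump raises. A minimal, library-free sketch of the same failure mode, using a thread lock as a stand-in for the non-picklable SparkContext (all names here are illustrative, not elephas API):

```python
import pickle
import threading

class ParameterServer:
    """Toy stand-in for an object like elephas' SparkModel that keeps a
    handle to a resource which cannot be pickled."""
    def __init__(self):
        # threading.Lock is not picklable, just like SparkContext
        self.resource = threading.Lock()

def can_pickle(obj):
    """Return True if obj survives pickling (what spawn-based
    multiprocessing must do to send the object to the child)."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False

server = ParameterServer()
print(can_pickle(server))  # False: this is where reduction.dump fails

# Dropping the handle before spawning would let the pickling step succeed
server.resource = None
print(can_pickle(server))  # True
```

The pickling step is only reached on spawn-based platforms such as Windows; on Linux the default start method is fork, so the child inherits the object without serialization, which matches the later comment that the same code runs there.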

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 18 (4 by maintainers)

Top GitHub Comments

1 reaction
maxpumperla commented, Jun 28, 2018

@mohaimenz thanks. yeah, I’m not 100% happy with dist-keras for the simple reason that it obviously started as an elephas fork without ever crediting it (until I forced the guy to do it).

This is how open source dies… instead of trying to grab GitHub fame (useless stars) and being selfish, just try to help out. If Joeri had just put his time into patching elephas and become a maintainer, which I would have offered, instead of stealing it, we’d have a better product now, not two mediocre ones that compete. Alright, that’s my rant, haha. 😄

0 reactions
mohaimenz commented, Jul 24, 2018

@maxpumperla Hi Max, I am closing this issue. I have found that elephas does not work on Windows. I have run it on Linux and it works, but only with frequency='epoch'. The 'batch' frequency does not work because you are using slice_X there, which no longer exists in Keras. I will open a new issue for that.
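For context on the slice_X remark: it was a Keras helper that sliced a single array, or every array in a list, along the first axis, and it was removed in later Keras versions, which is why elephas' 'batch' frequency breaks. A minimal stand-in, assuming the historical behavior (the name slice_arrays and its exact signature here are illustrative):

```python
import numpy as np

def slice_arrays(arrays, start=None, stop=None):
    """Slice an array, or each array in a list, along the first axis.

    Mirrors the behavior attributed to the removed Keras helper slice_X:
    a list of arrays is sliced element-wise with the same bounds.
    """
    if isinstance(arrays, list):
        return [a[start:stop] for a in arrays]
    return arrays[start:stop]

batch = slice_arrays(np.arange(10), 2, 5)
print(batch)  # [2 3 4]
```

A shim like this is how batch-frequency training could pull mini-batches out of the feature and label arrays without depending on the removed Keras symbol.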
