Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
Could you please help me? I am including my model and the error I am seeing below.
```python
model.compile(loss=lossFunc, optimizer=gradDiscent, metrics=['accuracy'])

# ############################ START: DISTRIBUTED MODEL ############################
from pyspark import SparkContext, SparkConf

# Create Spark context
conf = SparkConf().setAppName('NSL-KDD-DISTRIBUTED').setMaster('local[8]')
sc = SparkContext(conf=conf)

from elephas.utils.rdd_utils import to_simple_rdd

# Build RDD (Resilient Distributed Dataset) from numpy features and labels
rdd = to_simple_rdd(sc, trainX, trainY)

from elephas.spark_model import SparkModel
from elephas import optimizers as elephas_optimizers

# Initialize SparkModel from Keras model and Spark context
# (note: the keyword is `mode`, not `model`, in elephas' SparkModel)
elphOptimizer = elephas_optimizers.Adagrad()
sparkModel = SparkModel(sc, model, optimizer=elphOptimizer, frequency='epoch',
                        mode='asynchronous', num_workers=1)

# Train Spark model
sparkModel.train(rdd, nb_epoch=epochs, batch_size=batchSize, verbose=2)

# Evaluate Spark model on the master network
score = sparkModel.master_network.evaluate(testX, testY, verbose=2)
print(score)
```
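For context, elephas' `to_simple_rdd` essentially pairs each feature row with its label and parallelizes the pairs via the Spark context. A rough pure-Python analogue of the pairing step (my sketch, with PySpark left out so the snippet runs anywhere; `to_pairs` is a hypothetical name):

```python
def to_pairs(features, labels):
    """Pair each feature row with its label, roughly what to_simple_rdd
    does before handing the pairs to sc.parallelize (sketch, not elephas code)."""
    return list(zip(features, labels))

pairs = to_pairs([[0, 1], [1, 0]], [0, 1])
print(pairs)  # [([0, 1], 0), ([1, 0], 1)]
```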
#################### ERROR ####################

```
Traceback (most recent call last):
  File "C:\PythonWorks\mine\dm-dist.py", line 230, in <module>
    sparkModel.train(rdd, nb_epoch=epochs, batch_size=batchSize, verbose=2);
  File "C:\Miniconda3\lib\site-packages\elephas\spark_model.py", line 194, in train
    self._train(rdd, nb_epoch, batch_size, verbose, validation_split, master_url)
  File "C:\Miniconda3\lib\site-packages\elephas\spark_model.py", line 205, in _train
    self.start_server()
  File "C:\Miniconda3\lib\site-packages\elephas\spark_model.py", line 125, in start_server
    self.server.start()
  File "C:\Miniconda3\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Miniconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Miniconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Miniconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Miniconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "C:\Miniconda3\lib\site-packages\pyspark\context.py", line 306, in __getnewargs__
    "It appears that you are attempting to reference SparkContext from a broadcast "
Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
```
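The traceback shows what is going wrong: on Windows, `multiprocessing` can only start child processes with the "spawn" method (`popen_spawn_win32`), which pickles the `Process` object and everything it references to ship it to the child. Elephas' parameter server process references the `SparkModel`, which holds `sc`, and PySpark deliberately makes `SparkContext` unpicklable by raising from `__getnewargs__` (that is the guard behind SPARK-5063). On Linux, "fork" is used instead and nothing is pickled, which matches the report below that the same code works there. A minimal sketch of that pickling guard, using hypothetical `FakeContext`/`Holder` classes rather than PySpark itself:

```python
import pickle


class FakeContext:
    """Mimics PySpark's SparkContext pickling guard (hypothetical class)."""

    def __getnewargs__(self):
        # PySpark raises here so a SparkContext can never be serialized
        # and shipped to a worker or a spawned child process.
        raise Exception(
            "It appears that you are attempting to reference SparkContext "
            "from a broadcast variable, action, or transformation."
        )


class Holder:
    """Any object keeping a reference to the context, e.g. a SparkModel."""

    def __init__(self, ctx):
        self.ctx = ctx


try:
    pickle.dumps(Holder(FakeContext()))  # pickling recurses into ctx
except Exception as e:
    print("pickling failed:", e)
```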
Issue Analytics
- Created 5 years ago
- Comments: 18 (4 by maintainers)
Top GitHub Comments
@mohaimenz thanks. yeah, I’m not 100% happy with dist-keras for the simple reason that it obviously started as an elephas fork without ever crediting it (until I forced the guy to do it).
This is how open source dies… instead of trying to grab GitHub fame (useless stars) and being selfish, just try to help out. If Joeri had just put his time into patching elephas and become maintainer, which I would have offered, instead of stealing it, we’d have a better product now, not 2 mediocre ones that compete. Alright, that’s my rant, haha. 😄
@maxpumperla Hi Max, I am closing this issue. I have found that elephas does not work on Windows. I ran it on Linux and it works, but only with frequency='epochs'. Frequency type 'batch' does not work because you use slice_X there, which no longer exists in Keras. I will open a new issue for that.