NoClassDefFoundError: org/apache/spark/ml/util/MLWritable


I'm using Spark 2.4.2 with Anaconda Python 3.6.5 and getting the error below. What's the best way to resolve this?

Command: pyspark --master local[*] --packages databricks:spark-deep-learning:1.5.0-spark2.4-s_2.11

from pyspark.ml.classification import LogisticRegression
from pyspark.ml import Pipeline
from sparkdl import DeepImageFeaturizer
/mnt/conda/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.

featurizer = DeepImageFeaturizer(inputCol="image", outputCol="features", modelName="InceptionV3")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/mnt/tmp/spark-3a6fe30a-fc8a-4ece-accc-80033a821db0/userFiles-c4b15186-fd69-41a7-8e50-430045afaeb1/databricks_spark-deep-learning-1.5.0-spark2.4-s_2.11.jar/sparkdl/param/shared_params.py", line 50, in keyword_only
  File "/mnt/tmp/spark-3a6fe30a-fc8a-4ece-accc-80033a821db0/userFiles-c4b15186-fd69-41a7-8e50-430045afaeb1/databricks_spark-deep-learning-1.5.0-spark2.4-s_2.11.jar/sparkdl/transformers/named_image.py", line 196, in __init__
  File "/mnt/spark/python/pyspark/ml/wrapper.py", line 67, in _new_java_obj
    return java_obj(*java_args)
  File "/mnt/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1525, in __call__
  File "/mnt/spark/python/pyspark/sql/utils.py", line 63, in deco
    return f(*a, **kw)
  File "/mnt/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.com.databricks.sparkdl.DeepImageFeaturizer.
: java.lang.NoClassDefFoundError: org/apache/spark/ml/util/MLWritable$class
        at com.databricks.sparkdl.DeepImageFeaturizer.<init>(DeepImageFeaturizer.scala:35)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.ml.util.MLWritable$class
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 12 more

Issue Analytics

  • State: open
  • Created: 4 years ago
  • Comments: 9

Top GitHub Comments

4 reactions
khaarthikm commented, Nov 3, 2020

The issue is still open. Unfortunately, Spark 3.0 is not compatible with mmlspark.

4 reactions
Abhishek-P commented, Aug 13, 2020

I am seeing the same issue, although not in Databricks-based PySpark. I hit it with plain PySpark and spark-nlp on Windows. I have spark-nlp 2.5.5 and Spark 2.4.6.

`I am trying out the ContextAwareSpellChecker described in https://medium.com/spark-nlp/applying-context-aware-spell-checking-in-spark-nlp-3c29c46963bc

The first component in the pipeline is a DocumentAssembler:

from sparknlp.annotator import *
from sparknlp.base import *
import sparknlp


spark = sparknlp.start()
documentAssembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

When run, the above code fails as follows:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\pab\AppData\Local\Continuum\anaconda3.7\envs\MailChecker\lib\site-packages\pyspark\__init__.py", line 110, in wrapper
    return func(self, **kwargs)
  File "C:\Users\pab\AppData\Local\Continuum\anaconda3.7\envs\MailChecker\lib\site-packages\sparknlp\base.py", line 148, in __init__
    super(DocumentAssembler, self).__init__(classname="com.johnsnowlabs.nlp.DocumentAssembler")
  File "C:\Users\pab\AppData\Local\Continuum\anaconda3.7\envs\MailChecker\lib\site-packages\pyspark\__init__.py", line 110, in wrapper
    return func(self, **kwargs)
  File "C:\Users\pab\AppData\Local\Continuum\anaconda3.7\envs\MailChecker\lib\site-packages\sparknlp\internal.py", line 72, in __init__
    self._java_obj = self._new_java_obj(classname, self.uid)
  File "C:\Users\pab\AppData\Local\Continuum\anaconda3.7\envs\MailChecker\lib\site-packages\pyspark\ml\wrapper.py", line 69, in _new_java_obj
    return java_obj(*java_args)
  File "C:\Users\pab\AppData\Local\Continuum\anaconda3.7\envs\MailChecker\lib\site-packages\pyspark\python\lib\py4j-0.10.9-src.zip\py4j\java_gateway.py", line 1569, in __call__
  File "C:\Users\pab\AppData\Local\Continuum\anaconda3.7\envs\MailChecker\lib\site-packages\pyspark\sql\utils.py", line 131, in deco
    return f(*a, **kw)
  File "C:\Users\pab\AppData\Local\Continuum\anaconda3.7\envs\MailChecker\lib\site-packages\pyspark\python\lib\py4j-0.10.9-src.zip\py4j\protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.com.johnsnowlabs.nlp.DocumentAssembler.
: java.lang.NoClassDefFoundError: org/apache/spark/ml/util/MLWritable$class
        at com.johnsnowlabs.nlp.DocumentAssembler.<init>(DocumentAssembler.scala:16)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Thread.java:748)

`
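Both traces above fail on `MLWritable$class`. Names of the form `Trait$class` are helper classes that Scala 2.11 generates for traits; Scala 2.12 no longer emits them. One plausible cause, which is an assumption and not something confirmed in this thread, is therefore a package built for Scala 2.11 running on a Spark build that uses Scala 2.12. The sketch below compares the Scala suffix in a package coordinate against the Scala version of the running Spark. The helper names `scala_suffix`, `binary_version`, and `is_compatible` are hypothetical, and the runtime version string is passed in by hand so the check runs without a Spark session.

```python
import re
from typing import Optional


def scala_suffix(coordinate: str) -> Optional[str]:
    """Return the Scala binary version encoded in a package coordinate, or None.

    Handles both 'artifact_2.11:version' and the '...-s_2.11' form used in
    the --packages argument from the question.
    """
    match = re.search(r"_(\d+\.\d+)(?=$|[:\-])", coordinate)
    return match.group(1) if match else None


def binary_version(full_version: str) -> str:
    """Reduce a full Scala version such as '2.12.8' to its binary version '2.12'."""
    return ".".join(full_version.split(".")[:2])


def is_compatible(coordinate: str, runtime_scala_version: str) -> bool:
    """True if the coordinate's Scala suffix matches the runtime Scala version.

    Coordinates without a recognizable suffix are treated as compatible,
    because nothing can be concluded from them.
    """
    suffix = scala_suffix(coordinate)
    return suffix is None or suffix == binary_version(runtime_scala_version)


# Assumption: inside a live PySpark session the runtime value could be read with
#   spark.sparkContext._jvm.scala.util.Properties.versionNumberString()
# or taken from the "Using Scala version ..." line of `spark-submit --version`.
# The value below is a hypothetical example, not taken from the thread.
package = "databricks:spark-deep-learning:1.5.0-spark2.4-s_2.11"
runtime = "2.12.8"
print(package, "matches runtime Scala:", is_compatible(package, runtime))
```

If the check reports a mismatch, the usual options are a package build whose suffix matches the runtime, or a Spark distribution built against the Scala version the package expects.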


Top Results From Across the Web

  • spark-nlp : DocumentAssembler initializing failing with 'java ...
    spark-nlp : DocumentAssembler initializing failing with 'java.lang.NoClassDefFoundError: org/apache/spark/ml/util/MLWritable$class' · Ask ...
  • MLWritable — PySpark 3.3.1 documentation - Apache Spark
    class pyspark.ml.util.MLWritable. Mixin for ML instances that provide MLWriter. ... write(). Returns an MLWriter instance for this ML instance....
  • java.lang.NoClassDefFoundError: org/apache/spark/internal ...
    I have to write the extracted data from XML to DB , i am using Dataframe for transformation and trying to load that...
  • combust/mleap - Gitter
    NoClassDefFoundError : org/apache/spark/ml/feature/FeatureHasher at java.lang.Class. ... foldLeft(ArrayBuffer.scala:48) at ml.combust.bundle.
  • Resolve the "java.lang.ClassNotFoundException" in Spark on ...
    When I use custom JAR files in a spark-submit or PySpark job on Amazon EMR, I get a java.lang.ClassNotFoundException error.
