TensorflowWrapper.scala fails to load ClassifierDLApproach
Description
A java.util.NoSuchElementException is thrown by TensorflowWrapper$.readZippedSavedModel when it is invoked as part of ClassifierDLApproach.loadSavedModel.
Stack trace:
at scala.collection.Iterator$$anon$2.next(Iterator.scala:41)
at scala.collection.Iterator$$anon$2.next(Iterator.scala:39)
at scala.collection.IndexedSeqLike$Elements.next(IndexedSeqLike.scala:63)
at scala.collection.IterableLike.head(IterableLike.scala:109)
at scala.collection.IterableLike.head$(IterableLike.scala:108)
at scala.collection.mutable.ArrayBuffer.scala$collection$IndexedSeqOptimized$$super$head(ArrayBuffer.scala:49)
at scala.collection.IndexedSeqOptimized.head(IndexedSeqOptimized.scala:129)
at scala.collection.IndexedSeqOptimized.head$(IndexedSeqOptimized.scala:129)
at scala.collection.mutable.ArrayBuffer.head(ArrayBuffer.scala:49)
at com.johnsnowlabs.ml.tensorflow.TensorflowWrapper$.readZippedSavedModel(TensorflowWrapper.scala:506)
at com.johnsnowlabs.nlp.annotators.classifier.dl.ClassifierDLApproach.loadSavedModel(ClassifierDLApproach.scala:410)
at com.johnsnowlabs.nlp.annotators.classifier.dl.ClassifierDLApproach.train(ClassifierDLApproach.scala:346)
at com.johnsnowlabs.nlp.annotators.classifier.dl.ClassifierDLApproach.train(ClassifierDLApproach.scala:98)
at com.johnsnowlabs.nlp.AnnotatorApproach._fit(AnnotatorApproach.scala:69)
at com.johnsnowlabs.nlp.AnnotatorApproach.fit(AnnotatorApproach.scala:75)
at org.apache.spark.ml.Pipeline.$anonfun$fit$5(Pipeline.scala:151)
at org.apache.spark.ml.MLEvents.withFitEvent(events.scala:130)
at org.apache.spark.ml.MLEvents.withFitEvent$(events.scala:123)
at org.apache.spark.ml.util.Instrumentation.withFitEvent(Instrumentation.scala:42)
at org.apache.spark.ml.Pipeline.$anonfun$fit$4(Pipeline.scala:151)
at scala.collection.Iterator.foreach(Iterator.scala:941)
at scala.collection.Iterator.foreach$(Iterator.scala:941)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
at org.apache.spark.ml.Pipeline.$anonfun$fit$2(Pipeline.scala:147)
at org.apache.spark.ml.MLEvents.withFitEvent(events.scala:130)
at org.apache.spark.ml.MLEvents.withFitEvent$(events.scala:123)
at org.apache.spark.ml.util.Instrumentation.withFitEvent(Instrumentation.scala:42)
at org.apache.spark.ml.Pipeline.$anonfun$fit$1(Pipeline.scala:133)
at org.apache.spark.ml.util.Instrumentation$.$anonfun$instrumented$1(Instrumentation.scala:191)
at scala.util.Try$.apply(Try.scala:213)
at org.apache.spark.ml.util.Instrumentation$.instrumented(Instrumentation.scala:191)
at org.apache.spark.ml.Pipeline.fit(Pipeline.scala:133)
The exception seems to originate at https://github.com/JohnSnowLabs/spark-nlp/blob/340fe8068fae9a83130871f31633109f5fda8e70/src/main/scala/com/johnsnowlabs/ml/tensorflow/TensorflowWrapper.scala#L510, which is called from https://github.com/JohnSnowLabs/spark-nlp/blob/340fe8068fae9a83130871f31633109f5fda8e70/src/main/scala/com/johnsnowlabs/nlp/annotators/classifier/dl/ClassifierDLApproach.scala#L426, passing /classifier-dl as the root directory to readZippedSavedModel.
The directory /classifier-dl does not exist on the target machine, and the user running the JVM is not root.
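The failure mode can be illustrated in isolation. The snippet below is a hypothetical stand-in, not the actual spark-nlp code: listResources mimics the resource listing coming back empty when the root folder (here, /classifier-dl) is missing, and calling next() on the empty collection's iterator reproduces the same exception type that Scala's ArrayBuffer.head delegates to in the stack trace above, along with a defensive variant that reports a clear error instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

public class HeadOnEmptyDemo {
    // Hypothetical stand-in for the saved-model resource listing:
    // when the root folder is absent, the listing comes back empty.
    static List<String> listResources(boolean rootExists) {
        List<String> found = new ArrayList<>();
        if (rootExists) {
            found.add("saved_model.zip");
        }
        return found;
    }

    public static void main(String[] args) {
        try {
            // Equivalent of Scala's ArrayBuffer.head on an empty buffer:
            // iterator().next() throws NoSuchElementException.
            String first = listResources(false).iterator().next();
            System.out.println(first);
        } catch (NoSuchElementException e) {
            System.out.println("NoSuchElementException");
        }

        // A defensive variant: check for emptiness and surface a readable
        // error message rather than an unexplained exception.
        List<String> resources = listResources(false);
        String msg = resources.isEmpty()
                ? "no saved model found under the given root"
                : resources.get(0);
        System.out.println(msg);
    }
}
```

A guard like this (or Scala's headOption) would turn the opaque NoSuchElementException into an actionable message naming the missing path.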
Expected Behavior
The saved model should be loaded successfully.
Current Behavior
An exception is thrown instead.
Possible Solution
Steps to Reproduce
- Create a “blank” Google Compute cloud instance with Ubuntu 20.04 focal distro
- apt-get install -y --no-install-recommends git openjdk-8-jdk maven
- git clone …
- Build/deploy a jar that somewhere calls model = pipeline.fit(dataset), with pipeline = new Pipeline().setStages(new PipelineStage[] { getDocumentAssembler(), getTokenizer(), getEncoder(), getEmbedder(), getClassifier() }) and getClassifier() returning a new ClassifierDLApproach()
- java -jar /home/some_user/some_target/some.jar
Your Environment
VM settings: Max. Heap Size (Estimated): 2.88G; VM: OpenJDK 64-Bit Server VM
spark.jars.packages: com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:3.2.1
Linux test 5.11.0-1020-gcp #22~20.04.1-Ubuntu SMP Tue Sep 21 10:54:26 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
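Since the repro steps build with Maven, the spark.jars.packages coordinate above would correspond to roughly the following pom.xml entry (a config sketch inferred from the coordinate, not taken from the reporter's actual build file):

```xml
<!-- Equivalent Maven dependency for com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:3.2.1 -->
<dependency>
  <groupId>com.johnsnowlabs.nlp</groupId>
  <artifactId>spark-nlp-gpu_2.12</artifactId>
  <version>3.2.1</version>
</dependency>
```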
Issue Analytics
- State:
- Created 2 years ago
- Comments: 9 (5 by maintainers)
Hi @kgoderis,
I created a small Spring Boot app that trains a ClassifierDL model to replicate the error. I tested it on Ubuntu 20 and Debian 11, and it works. I also containerized the app with Docker and tested it under Ubuntu 20, Debian 11, and GCP General-Purpose and Compute-Optimised machines (Debian buster), and it works there too, as you can see in the screenshot below.
I’m not sure how to configure the underlying image to “stable”. In the GCP Control Panel, I only found options for buster, bullseye, and stretch. Could you please elaborate on how to configure it as stable?
@maziyarpanahi I have tossed away the test instance, but it was plain-vanilla Ubuntu 20.04, with nothing fancy or non-standard in its configuration. The Docker-based test was done the same way via the GCP Control Panel, except for adding the container image; there I changed the underlying image to the “stable” one from the list shown on the Control Panel, i.e. I avoided the “dev” image release.