Deep Learning Pretrained Pipelines Failing to Load
Description

I have an issue that seems specifically related to the deep learning ("_dl") pretrained pipelines.
Specifically, if I run the following:

`pipeline = PretrainedPipeline('explain_document_ml', lang='en')`

it runs fine. If, however, I run:

`pipeline = PretrainedPipeline('explain_document_dl', lang='en')`
I get the following error:
```
py4j.protocol.Py4JJavaError: An error occurred while calling z:com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.downloadPipeline.
: java.lang.UnsatisfiedLinkError: Cannot find TensorFlow native library for OS: linux, architecture: x86_64. See https://github.com/tensorflow/tensorflow/tree/master/tensorflow/java/README.md for possible solutions (such as building the library from source). Additional information on attempts to find the native library can be obtained by adding org.tensorflow.NativeLibrary.DEBUG=1 to the system properties of the JVM.
```
Any ideas why this occurs only for the deep learning pipelines?
Expected Behavior

The 'explain_document_dl' pipeline downloads and loads, just as 'explain_document_ml' does.

Current Behavior

Downloading the pipeline fails with a java.lang.UnsatisfiedLinkError: the TensorFlow native library for linux/x86_64 cannot be found.
Possible Solution
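Not a confirmed fix, but the error message itself suggests the first diagnostic step: add `org.tensorflow.NativeLibrary.DEBUG=1` to the JVM system properties. A minimal sketch of doing that from PySpark, assuming the session is built by hand instead of via `sparknlp.start()` (the memory setting is a placeholder; the Maven coordinate matches the reported Spark NLP 2.7.4 / PySpark 2.4.7 combination):

```python
from pyspark.sql import SparkSession

# Build the session manually so extra JVM options reach the driver.
spark = (
    SparkSession.builder
    .appName("Spark NLP")
    .master("local[*]")
    .config("spark.driver.memory", "8g")  # placeholder: size to your machine
    # Make TensorFlow log every attempt to locate/extract its native library:
    .config("spark.driver.extraJavaOptions", "-Dorg.tensorflow.NativeLibrary.DEBUG=1")
    # Scala 2.11 artifact, matching PySpark 2.4.7:
    .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.11:2.7.4")
    .getOrCreate()
)
```

With the property set, the driver log shows where TensorFlow tried to extract and load libtensorflow; on locked-down hosts a common culprit is a temp directory mounted noexec, so the extracted library can never be loaded.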
Steps to Reproduce
```python
from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp.pretrained import PretrainedPipeline
import sparknlp

spark = sparknlp.start()
pipeline = PretrainedPipeline('explain_document_dl', lang='en')
```
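As a sanity check (not part of the original report), the rule-based pipeline loads and annotates in the same session, which isolates the failure to the TensorFlow-backed "_dl" pipelines. The sample sentence is arbitrary:

```python
# The non-DL pipeline works, so Spark and the model download are healthy;
# only pipelines that need the TensorFlow native library fail.
pipeline_ml = PretrainedPipeline('explain_document_ml', lang='en')
result = pipeline_ml.annotate("The quick brown fox jumps over the lazy dog.")
print(result['pos'])  # output keys depend on the pipeline's stages
```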
Context

Your Environment
- Spark NLP version: 2.7.4
- Apache Spark version: 2.4.7 (pyspark==2.4.7)
- Java version (java -version): openjdk version "1.8.0_282", OpenJDK Runtime Environment (build 1.8.0_282-b08), OpenJDK 64-Bit Server VM (build 25.282-b08, mixed mode)
- Setup and installation (Pypi, Conda, Maven, etc.): followed these steps (a quick version check follows this list):

```
java -version
conda create -n sparknlp python=3.6 -y
conda activate sparknlp
pip install spark-nlp==2.7.4 pyspark==2.4.7
```
- Operating System and version: NAME="Red Hat Enterprise Linux Server" VERSION="7.3 (Maipo)" ID="rhel" ID_LIKE="fedora" VERSION_ID="7.3"
- Link to your project (if any):
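A quick way (not in the original thread) to confirm the environment picked up the pinned versions; `sparknlp.version()` and `pyspark.__version__` are the usual accessors:

```python
import pyspark
import sparknlp

print(sparknlp.version())   # expected: 2.7.4
print(pyspark.__version__)  # expected: 2.4.7
```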