TensorFlow/GCC error when loading pre-trained pipeline on GitHub Actions
Description
Hey again, folks… I’m attempting to load some pretrained SparkNLP pipelines in my GitHub Actions workflow as part of my unit tests. They were passing before, but started failing after I expanded my pretrained pipelines to perform POS tagging, lemmatization, NER detection, etc. with TensorFlow. I now receive the following error when running tests:
terminate called after throwing an instance of 'std::runtime_error'
what(): random_device could not be read
I managed to find the error call in the GCC source code, but can find few resources beyond that. My program exits without a stack trace, and my unit tests run fine locally on my Mac.
I’m curious if you might have any idea of what kind of architecture difference could exist between my machine and GitHub runners that could cause such an issue.
Expected Behavior
Code should run on any UNIX-based system
Current Behavior
Errors when using GitHub Actions CI
Possible Solution
At first I thought there was maybe a difference in file encodings or line endings or something by looking at the GCC source – now I’m not so sure that’s the case.
Steps to Reproduce
This is tricky to reproduce, since it only seems to occur on the CI, but I suppose a simple reproduction could be loading PretrainedPipeline("explain_document_md", "fr") in Scala in a GA workflow, which is one of the pipelines that is currently erroring.
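For reference, a minimal sketch of that reproduction might look like the following. It assumes Spark NLP 3.1.3 is on the classpath. The object name and the sample sentence are hypothetical, and SparkNLP.start() is used here only as a convenient way to get a local session, not necessarily how the failing tests set things up.

```scala
import com.johnsnowlabs.nlp.SparkNLP
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

// Hypothetical entry point for illustration; not taken from the failing test suite.
object ReproducePipelineLoad {
  def main(args: Array[String]): Unit = {
    // Start a local Spark session with Spark NLP's default settings.
    val spark = SparkNLP.start()

    // Downloading and loading the pipeline is the step reported to abort on the CI runner.
    val pipeline = PretrainedPipeline("explain_document_md", lang = "fr")

    // Assumed sample input; any short French sentence would do.
    val result = pipeline.annotate("Bonjour tout le monde.")
    println(result.keys.mkString(", "))

    spark.stop()
  }
}
```

Running this as a single step in a workflow on ubuntu-latest should show whether the abort happens during the pipeline load or only later, during annotation.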
Context
I haven’t been able to land any of my changes or test any of my code because my CI constantly fails with the same error.
Your Environment
- Spark NLP version sparknlp.version(): 3.1.3
- Apache Spark version spark.version: 3.1.2
- Java version java -version: 11
- Scala version: 2.12.13
- Setup and installation (Pypi, Conda, Maven, etc.): Scala w/ Bazel (scala-rules)
- Operating System and version: macOS Big Sur 11.4 (GitHub Actions run on ubuntu-latest)
This seems to be an issue with GHA Ubuntu images not having enough entropy: https://github.com/actions/virtual-environments/issues/672
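One way to check this on the runner is a small diagnostic that runs before the tests. The sketch below is only an illustration: it assumes a Linux runner with procfs mounted, the object and method names are made up, and a low reading would not by itself prove that entropy is the cause of the random_device error.

```scala
import java.nio.file.{Files, Paths}
import scala.util.Try

// Hypothetical diagnostic helper; names are illustrative only.
object EntropyCheck {
  // Parse the kernel's entropy estimate; None if the text is not an integer.
  def parseEntropy(raw: String): Option[Int] =
    Try(raw.trim.toInt).toOption

  // Read the estimate from procfs (Linux only); None if the file is missing or unreadable.
  def availableEntropy(
      path: String = "/proc/sys/kernel/random/entropy_avail"
  ): Option[Int] =
    Try(new String(Files.readAllBytes(Paths.get(path)), "UTF-8")).toOption
      .flatMap(parseEntropy)

  def main(args: Array[String]): Unit = {
    // Report whether the standard random devices are readable by this process.
    Seq("/dev/random", "/dev/urandom").foreach { device =>
      println(s"$device readable: ${Files.isReadable(Paths.get(device))}")
    }
    val entropy = availableEntropy().map(_.toString).getOrElse("unavailable")
    println(s"entropy_avail: $entropy")
  }
}
```

Comparing the output from a local run with the output on the GitHub runner would show whether the two environments differ in the way the linked issue describes.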
@maziyarpanahi I’m sorry, perhaps that was a bit of an over-generalization haha. I mostly just was referring to being able to run on my Mac machine + the Ubuntu GA system. Both of them are supported by TF though.
My config is:
my test script looks like (env passing omitted):