Tensorflow/GCC error when loading pre-trained pipeline on Github Actions

Description

Hey again, folks… I’m attempting to load some pretrained SparkNLP pipelines in my GitHub Actions workflow as part of my unit tests. They were passing before, but started failing after I expanded my pretrained pipelines to perform POS tagging, lemmatization, NER detection, etc. with TensorFlow. I now receive the following error when running tests:

terminate called after throwing an instance of 'std::runtime_error'
  what():  random_device could not be read

I managed to find the error call in the GCC source code, but can find few resources beyond that. My program exits without a stack trace, and my unit tests run fine locally on my Mac.

I’m curious if you might have any idea of what kind of architecture difference could exist between my machine and GitHub runners that could cause such an issue.

Expected Behavior

Code should run on any UNIX-based system

Current Behavior

Errors when using GitHub Actions CI

Possible Solution

At first I thought there was maybe a difference in file encodings or line endings or something by looking at the GCC source – now I’m not so sure that’s the case.

Steps to Reproduce

Tricky to reproduce, seeing as this only seems to occur on the CI, but I suppose a simple reproduction could be loading PretrainedPipeline("explain_document_md", "fr") in Scala in a GitHub Actions workflow. That is one of the pipelines currently erroring.

Context

I haven’t been able to land any of my changes or test any of my code because my CI constantly fails with the same error.

Your Environment

  • Spark NLP version sparknlp.version(): 3.1.3
  • Apache Spark version spark.version: 3.1.2
  • Java version java -version: 11
  • Scala version: 2.12.13
  • Setup and installation (Pypi, Conda, Maven, etc.): Scala w/ Bazel (scala-rules)
  • Operating System and version: macOS Big Sur 11.4 (GitHub Actions run on ubuntu-latest)

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
shuttie commented, Apr 27, 2022

Seems to be an issue with the GHA Ubuntu images not having enough entropy: https://github.com/actions/virtual-environments/issues/672
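
One way to check this suggestion on a Linux runner is to log the kernel's entropy estimate before the tests start. The snippet below is a minimal sketch, not something from the original thread: `check_entropy`, `ENTROPY_FILE` and `MIN_ENTROPY` are hypothetical names, and the threshold of 200 is an arbitrary value chosen for illustration. It only reports the number; it does not confirm that low entropy is what makes `random_device` fail.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel's entropy estimate and flag low values.
# ENTROPY_FILE and MIN_ENTROPY are illustrative names; 200 is an arbitrary threshold.
ENTROPY_FILE="${ENTROPY_FILE:-/proc/sys/kernel/random/entropy_avail}"
MIN_ENTROPY="${MIN_ENTROPY:-200}"

check_entropy() {
  # The file only exists on Linux; report "unknown" elsewhere (e.g. macOS).
  if [ ! -r "$ENTROPY_FILE" ]; then
    echo "entropy: unknown ($ENTROPY_FILE not readable)"
    return 2
  fi
  avail=$(cat "$ENTROPY_FILE")
  if [ "$avail" -lt "$MIN_ENTROPY" ]; then
    echo "entropy: low ($avail < $MIN_ENTROPY)"
    return 1
  fi
  echo "entropy: ok ($avail)"
  return 0
}
```

Calling `check_entropy || true` at the top of a test script would put the value in the CI log without failing the job, so a failing run can be compared against a passing one.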

1 reaction
Nickersoft commented, Aug 11, 2021

@maziyarpanahi I’m sorry, perhaps that was a bit of an over-generalization haha. I mostly just was referring to being able to run on my Mac machine + the Ubuntu GA system. Both of them are supported by TF though.

My config is:

name: CI

on:
  pull_request:
    branches: [master]
    commit-ignore:
      - [skip ci]

env:
  DOPPLER_TOKEN: ${{ secrets.DOPPLER_TOKEN_DEV }}

jobs:
  build:
    name: Build & Test
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Setup Go
        uses: actions/setup-go@v2
        with:
          go-version: "^1.16.2"
      - name: Setup Java
        uses: actions/setup-java@v1
        with:
          java-version: "11"
      - name: Setup Doppler
        uses: dopplerhq/cli-action@v1
      - name: Setup Bazelisk
        run: |
          go get github.com/bazelbuild/bazelisk
          export PATH=$PATH:$(go env GOPATH)/bin
      - name: Setup FFmpeg
        uses: FedericoCarboni/setup-ffmpeg@v1
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
        id: setup-ffmpeg
      - name: Run Tests
        run: doppler run -- sh ./.github/workflows/run_tests.sh
      - name: Upload Logs
        uses: actions/upload-artifact@v2
        if: ${{ failure() }}
        with:
          name: Test Logs
          path: |
            .build_cache/**/testlogs/**/*

My test script looks like this (env passing omitted):

bazelisk \
  --host_jvm_args=-Xms512m \
  --host_jvm_args=-Xmx1024m \
  --output_user_root=.build_cache \
  test all_tests \
  --spawn_strategy=local \
  --test_output=errors \
  --test_verbose_timeout_warnings 