One of the use cases we want to show for the CML site is producing a TensorBoard link in the report. For example:

View your TensorBoard live at: https://tensorboard.dev/experiment/E31QaKfWTQaQuEUR6H03UA/

I’m running into some trouble using the tensorboard dev command inside my GH runner. This is my workflow (project repo is here):

name: train-my-model

on: [push]

jobs:
  run:
    runs-on: [ubuntu-latest]
    container: docker://dvcorg/cml-py3:latest

    steps:
      - uses: actions/checkout@v2

      - name: dvc_cml_run
        env:
          repo_token: ${{ secrets.GITHUB_TOKEN }}
        run: |
          pip3 install -r requirements.txt
          python train.py
          
          tensorboard dev upload --logdir logs

I’ve confirmed the commands work on my local machine. On the runner, I’m getting this error message:

Traceback (most recent call last):
  File "/usr/local/bin/tensorboard", line 8, in <module>
    sys.exit(run_main())
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/main.py", line 75, in run_main
    app.run(tensorboard.main, flags_parser=tensorboard.configure)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/program.py", line 289, in main
    return runner(self.flags) or 0
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/uploader/uploader_main.py", line 633, in run
    return _run(flags)
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/uploader/uploader_main.py", line 120, in _run
    _prompt_for_user_ack(intent)
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/uploader/uploader_main.py", line 76, in _prompt_for_user_ack
    response = six.moves.input("Continue? (yes/NO) ")
EOFError: EOF when reading a line
Continue? (yes/NO) 
##[error]Process completed with exit code 1.
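
The last frames of the traceback show `input()` raising `EOFError`, which is what Python does when the prompt tries to read from a standard input that has no data. The following is a minimal sketch that reproduces that one behaviour outside TensorBoard. It assumes a non-interactive step behaves like a process whose stdin is attached to the null device; it does not attempt to reproduce the runner or the uploader itself.

```python
import subprocess
import sys

# Child process that prompts the same way the last traceback frame does.
child_code = 'input("Continue? (yes/NO) ")'

# Assumption: stdin=DEVNULL stands in for a step with no interactive input,
# so the very first read hits end-of-file.
result = subprocess.run(
    [sys.executable, "-c", child_code],
    stdin=subprocess.DEVNULL,
    capture_output=True,
    text=True,
)

# The child exits with a non-zero status and an EOFError on stderr.
print(result.returncode)
print(result.stderr.strip().splitlines()[-1])
```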

I tried passing in a “yes” with

echo "yes" | tensorboard dev upload --logdir logs

but got the same error. Does this make any sense given the setup of the runner?

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 10 (10 by maintainers)

Top GitHub Comments

2 reactions
elleobrien commented, May 18, 2020

Great work @DavidGOrtega! I had a feeling the authentication was going to take a little work. Looks like a similar approach to authenticating with GDrive as a DVC remote?

I’ll give your solution a go. Really cool

1 reaction
DavidGOrtega commented, May 18, 2020

@andronovhopf this is a very interesting case, and probably a candidate for its own CML wrapper, @dmpetrov.

First of all, tensorboard dev expects you to authenticate using OAuth. This process generates a file on your machine at ~/.config/tensorboard/credentials/uploader-creds.json, much like Google Drive or GS credentials.

So you must copy those JSON credentials and add them as a secret. Even then, with your approach you would still need a Ctrl+C to stop sending data to TB.

However, this might not be the best approach. One of the best features is watching your model logs in real time, so TB should be started first (as a fg process), followed by training. That procedure had a lot of caveats and failed in many ways because of problems with the pipes. It is also very important to know which URL to check during training. After some iterations, this is the solution:

name: train-my-model

on: [push]

jobs:
  run:
    runs-on: [ubuntu-latest]
    container: docker://dvcorg/cml-py3:latest

    steps:
      - uses: actions/checkout@v2

      - name: dvc_cml_run
        shell: bash
        env:
          repo_token: ${{ secrets.GITHUB_TOKEN }} 
        run: |
          pip3 install -r requirements.txt

          TENSORBOARD_CREDENTIALS='{"refresh_token": "1//03IJCTMThPsYECgYIARAAGAMSNwF-L9Iru6PoxuqEtGTcnvbXeGwi5j4cXBrFQpXdcmBhZyvZggR1WqKeIjhs1V57g1NvpCsUFnw", "token_uri": "https://oauth2.googleapis.com/token", "client_id": "373649185512-8v619h5kft38l4456nm2dj4ubeqsrvh6.apps.googleusercontent.com", "client_secret": "pOyAuU2yq2arsM98Bw5hwYtr", "scopes": ["openid", "https://www.googleapis.com/auth/userinfo.email"], "type": "authorized_user"}'
          mkdir -p ~/.config/tensorboard/credentials/
          echo "$TENSORBOARD_CREDENTIALS" > ~/.config/tensorboard/credentials/uploader-creds.json

          script -f -c "
              tensorboard dev upload --logdir logs & \
              sleep 5 && \
              cat tb.log | grep http > report.md && \
              cml-send-comment report.md && \
              python train.py" tb.log

          cml-send-github-check report.md

This solution works very well in a PR, since it generates a comment telling you where to go, in addition to printing the URL on the command line before training starts.

It then generates the final report as well.
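
The workflow waits a fixed five seconds and then filters tb.log for lines containing "http". A hedged alternative, not part of the original solution, is to poll the log until a URL shows up or a timeout expires. The sketch below illustrates that idea; the function name `wait_for_url`, the regular expression, and the default timeout are all assumptions made for illustration.

```python
import re
import time
from pathlib import Path

# Assumption: the uploader prints a plain http(s) URL somewhere in the log.
URL_PATTERN = re.compile(r"https?://\S+")


def wait_for_url(log_path, timeout=30.0, interval=0.5):
    """Poll log_path until a URL appears; return it, or None on timeout."""
    path = Path(log_path)
    deadline = time.monotonic() + timeout
    while True:
        if path.exists():
            # errors="replace" tolerates stray bytes a terminal log may contain.
            match = URL_PATTERN.search(path.read_text(errors="replace"))
            if match:
                return match.group(0)
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

A small helper script could call `wait_for_url("tb.log")` and write the result to report.md before the comment is sent, in place of the `sleep 5` and `grep` pair.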

To use secrets, just replace

TENSORBOARD_CREDENTIALS='{"refresh_token": "1//03IJCTMThPsYECgYIARAAGAMSNwF-L9Iru6PoxuqEtGTcnvbXeGwi5j4cXBrFQpXdcmBhZyvZggR1WqKeIjhs1V57g1NvpCsUFnw", "token_uri": "https://oauth2.googleapis.com/token", "client_id": "373649185512-8v619h5kft38l4456nm2dj4ubeqsrvh6.apps.googleusercontent.com", "client_secret": "pOyAuU2yq2arsM98Bw5hwYtr", "scopes": ["openid", "https://www.googleapis.com/auth/userinfo.email"], "type": "authorized_user"}'
mkdir -p ~/.config/tensorboard/credentials/
echo "$TENSORBOARD_CREDENTIALS" > ~/.config/tensorboard/credentials/uploader-creds.json

with

mkdir -p ~/.config/tensorboard/credentials/
echo "${{ secrets.TENSORBOARD_CREDENTIALS }}" > ~/.config/tensorboard/credentials/uploader-creds.json
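
Because the credentials are JSON full of double quotes, it may be worth checking that the text still parses after it has passed through the shell, since quoting can alter it depending on how the secret is substituted into the script. The sketch below shows one way to validate and write the file from Python. The target path is the one named in the comment above; the function name `write_uploader_creds` and the `home` parameter are hypothetical, and the restrictive file mode is an added precaution rather than something the thread requires.

```python
import json
from pathlib import Path

# Path from the comment above, relative to the user's home directory.
CREDS_RELATIVE = Path(".config/tensorboard/credentials/uploader-creds.json")


def write_uploader_creds(raw, home=None):
    """Validate raw as JSON, then write it to the uploader credentials path."""
    # json.loads raises json.JSONDecodeError (a ValueError) if the text was mangled.
    creds = json.loads(raw)
    base = Path(home) if home is not None else Path.home()
    target = base / CREDS_RELATIVE
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(creds))
    # Keep the token readable by the current user only.
    target.chmod(0o600)
    return target
```

In a workflow step, one option is to expose the secret to the step through an `env:` entry and call `write_uploader_creds(os.environ["TENSORBOARD_CREDENTIALS"])`, so the JSON never passes through shell quoting.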