docker container cannot launch tensorboard


Hi all,

I was running the Docker container with this command:

docker run --gpus all -it --rm -v /mnt/NeMo:/NeMo --shm-size=8g \
-p 8888:8888 -p 6006:6006 --ulimit memlock=-1 --ulimit \
stack=67108864 --device=/dev/snd nvcr.io/nvidia/nemo:1.0.0b3

and the NeMo repository was on branch 1.0.0b3.

In this container, I successfully trained my model, but when I tried to monitor training with TensorBoard using the following command,

tensorboard --bind_all --logdir .

this error appeared:

TensorFlow installation not found - running with reduced feature set.
Traceback (most recent call last):
  File "/opt/conda/bin/tensorboard", line 8, in <module>
    sys.exit(run_main())
  File "/opt/conda/lib/python3.6/site-packages/tensorboard/main.py", line 65, in run_main
    default.get_plugins(),
  File "/opt/conda/lib/python3.6/site-packages/tensorboard/default.py", line 108, in get_plugins
    return get_static_plugins() + get_dynamic_plugins()
  File "/opt/conda/lib/python3.6/site-packages/tensorboard/default.py", line 146, in get_dynamic_plugins
    "tensorboard_plugins"
  File "/opt/conda/lib/python3.6/site-packages/tensorboard/default.py", line 145, in <listcomp>
    for entry_point in pkg_resources.iter_entry_points(
  File "/opt/conda/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2472, in load
    return self.resolve()
  File "/opt/conda/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2478, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/opt/conda/lib/python3.6/site-packages/tensorboard_plugin_dlprof/plugin.py", line 31, in <module>
    from tensorboard.plugins.graph import dlprof_pb2
ImportError: cannot import name 'dlprof_pb2'

Any help is appreciated. Thanks.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6

Top GitHub Comments

2 reactions
lodm94 commented, Jan 15, 2021

Yeah, I think I ran into that too. Try this sequence:

  1. pull the container
  2. run it
  3. pip uninstall -y tensorboard
  4. pip uninstall tensorboard-plugin-dlprof (check the exact package name first, e.g. with pip list)
  5. pip install nvidia-pyindex
  6. pip install nvidia-tensorboard-plugin-dlprof
  7. pip install tensorboard

Then I can launch TensorBoard with: tensorboard --logdir yourpathtofiles --bind_all --port yourport

and open it in a browser at 192.168.xxx.xxx:yourport
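
The steps above can be sketched as a small shell script. This is only a sketch under assumptions, not a tested fix: the `run` helper, the `reinstall_tensorboard` function, and the `DRY_RUN` variable are hypothetical names added for illustration, and the uninstall target `tensorboard-plugin-dlprof` is inferred from the module name in the traceback, so confirm it with `pip list` first.

```shell
#!/usr/bin/env bash
# Sketch of the reinstall sequence, intended to run inside the container.
# DRY_RUN defaults to 1, so the commands are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

reinstall_tensorboard() {
  # Remove the preinstalled TensorBoard.
  run pip uninstall -y tensorboard
  # Assumed package name for the DLProf plugin; verify with `pip list`.
  run pip uninstall -y tensorboard-plugin-dlprof
  # Reinstall the plugin via NVIDIA's package index, then TensorBoard.
  run pip install nvidia-pyindex
  run pip install nvidia-tensorboard-plugin-dlprof
  run pip install tensorboard
}

reinstall_tensorboard
```

With the default `DRY_RUN=1` the script only prints the five pip commands. Setting `DRY_RUN=0` inside the container would execute them, after which TensorBoard can be launched as described above.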

I guess the path and port in those two commands are specific to your setup. Apologies up front if these instructions aren't exact; I'm fairly new to all this. I'm working on my thesis, so I don't have much experience. If this doesn't solve it by Monday, I can log into my server again and copy all the steps.

It would be great if someone more experienced could help you, and me too, of course.

Cheers

1 reaction
tann9949 commented, Jan 18, 2021

This method works well! Thank you!
