wandb fails on CUDA 8 and python3

  • Weights and Biases version: 0.8.9 (according to pip)
  • Python version: 3.7.1
  • Operating System: linux

Description

Any use of wandb fails at import time with a SyntaxError. This happens both in scripts and in any CLI command. We are running CUDA 8.0, and I suspect the pynvml being imported (the traceback shows it is loaded from the CUDA install path, not from site-packages) doesn't support Python 3.

Here's the CUDA version:

> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61

Traceback on wandb import:

> python3 main.py
Traceback (most recent call last):
  File "main.py", line 8, in <module>
    import wandb
  File "/home/username/.local/lib/python3.7/site-packages/wandb/__init__.py", line 42, in <module>
    from wandb.apis import InternalApi, PublicApi, CommError
  File "/home/username/.local/lib/python3.7/site-packages/wandb/apis/__init__.py", line 95, in <module>
    from .public import Api as PublicApi
  File "/home/username/.local/lib/python3.7/site-packages/wandb/apis/public.py", line 21, in <module>
    from wandb.summary import HTTPSummary, download_h5
  File "/home/username/.local/lib/python3.7/site-packages/wandb/summary.py", line 15, in <module>
    from wandb.meta import Meta
  File "/home/username/.local/lib/python3.7/site-packages/wandb/meta.py", line 6, in <module>
    import pynvml
  File "/cm/local/apps/cuda/libs/current/pynvml/pynvml.py", line 1671
    print c_count.value
                ^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print(c_count.value)?

Traceback on CLI usage:

>  wandb
Traceback (most recent call last):
  File "/home/username/.local/bin/wandb", line 7, in <module>
    from wandb.cli import cli
  File "/home/username/.local/lib/python3.7/site-packages/wandb/__init__.py", line 42, in <module>
    from wandb.apis import InternalApi, PublicApi, CommError
  File "/home/username/.local/lib/python3.7/site-packages/wandb/apis/__init__.py", line 95, in <module>
    from .public import Api as PublicApi
  File "/home/username/.local/lib/python3.7/site-packages/wandb/apis/public.py", line 21, in <module>
    from wandb.summary import HTTPSummary, download_h5
  File "/home/username/.local/lib/python3.7/site-packages/wandb/summary.py", line 15, in <module>
    from wandb.meta import Meta
  File "/home/username/.local/lib/python3.7/site-packages/wandb/meta.py", line 6, in <module>
    import pynvml
  File "/cm/local/apps/cuda/libs/current/pynvml/pynvml.py", line 1671
    print c_count.value
                ^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print(c_count.value)?
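
Both tracebacks end in a pynvml.py that lives under the CUDA install directory and uses a Python 2 print statement. A minimal sketch for confirming which copy of a module Python would pick up, and whether that file parses under the running interpreter, without actually importing it. The helper names `locate_module` and `parses_as_python3` are made up for this example and are not part of wandb or pynvml:

```python
import ast
import importlib.util


def locate_module(name):
    """Return the file a top-level module would be loaded from, without importing it."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def parses_as_python3(path):
    """Return True if the file at `path` is valid syntax for the running interpreter."""
    with open(path, "rb") as handle:
        source = handle.read()
    try:
        ast.parse(source, filename=path)
    except (SyntaxError, ValueError):
        # ValueError covers non-source files (e.g. null bytes) on some Python versions.
        return False
    return True


if __name__ == "__main__":
    origin = locate_module("pynvml")
    if origin is None:
        print("pynvml not found on sys.path")
    elif not origin.endswith(".py"):
        print("pynvml resolves to a non-source location:", origin)
    else:
        print(origin, "parses under this interpreter:", parses_as_python3(origin))
```

If the reported path is outside site-packages and the file does not parse, the copy on `sys.path` is most likely a Python 2-only version that shadows any Python 3 compatible one.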

Potential fix

I skimmed the code and the repo. It seems pynvml (Python bindings for the NVIDIA Management Library) is used only in meta.py and stats.py, to log GPU usage stats and the GPU count. A possible fix would be to wrap the imports in try/except and set a failure flag, or to add an environment flag that skips that part entirely.
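
To illustrate the idea, here is a rough sketch of a guarded import combined with an opt-out environment variable. This is not wandb's actual code: the variable name `EXAMPLE_DISABLE_GPU_STATS` and the helpers `load_optional` and `gpu_count` are hypothetical. The only real library calls assumed are `pynvml.nvmlInit()` and `pynvml.nvmlDeviceGetCount()`:

```python
import importlib
import os

# Hypothetical environment variable name, used only for this sketch.
DISABLE_ENV_VAR = "EXAMPLE_DISABLE_GPU_STATS"


def load_optional(module_name, disable_env_var=DISABLE_ENV_VAR):
    """Import `module_name`; return None if it is disabled or cannot be imported."""
    if os.environ.get(disable_env_var):
        return None
    try:
        return importlib.import_module(module_name)
    except (ImportError, SyntaxError):
        # SyntaxError covers a Python 2-only module like the pynvml in the traceback.
        return None


pynvml = load_optional("pynvml")
GPU_STATS_AVAILABLE = pynvml is not None  # the "fail flag"


def gpu_count():
    """Return the number of GPUs, or 0 when pynvml is unavailable or NVML fails."""
    if not GPU_STATS_AVAILABLE:
        return 0
    try:
        pynvml.nvmlInit()
        return pynvml.nvmlDeviceGetCount()
    except Exception:
        # Broad on purpose in this sketch: any NVML failure just disables GPU stats.
        return 0
```

With this shape, a broken or missing pynvml only turns off GPU stats instead of making the whole package fail to import.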

I can make a pull request over this weekend, but let me know if you have something else in mind.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
issue-label-bot[bot] commented, Sep 11, 2019

Issue-Label Bot is automatically applying the label bug to this issue, with a confidence of 0.97. Please mark this comment with 👍 or 👎 to give our bot feedback!

Links: app homepage, dashboard and code for this bot.

0 reactions
vanpelt commented, Sep 2, 2020

Hey @Qnlp this is really strange. Can you print what experiment.config is after you’ve initialized the WandbLogger? A quick thing to try is our prerelease library. You can install with pip install wandb -U --pre.

