NameError: name '_CAPI_DGLNCCLGetUniqueId' is not defined


🐛 Bug

Traceback (most recent call last):
  File "entity_sample.py", line 196, in <module>
    main(args)
  File "entity_sample.py", line 153, in main
    labels, emb_optimizer, optimizer)
  File "entity_sample.py", line 98, in train
    emb_optimizer.step()
  File "/home/ubuntu/anaconda3/envs/RGCN/lib/python3.7/site-packages/dgl-0.8-py3.7-linux-x86_64.egg/dgl/optim/pytorch/sparse_optim.py", line 80, in step
    self._comm_setup()
  File "/home/ubuntu/anaconda3/envs/RGCN/lib/python3.7/site-packages/dgl-0.8-py3.7-linux-x86_64.egg/dgl/optim/pytorch/sparse_optim.py", line 109, in _comm_setup
    self._comm = nccl.Communicator(1, 0, nccl.UniqueId())
  File "/home/ubuntu/anaconda3/envs/RGCN/lib/python3.7/site-packages/dgl-0.8-py3.7-linux-x86_64.egg/dgl/cuda/nccl.py", line 22, in __init__
    self._handle = _CAPI_DGLNCCLGetUniqueId()
NameError: name '_CAPI_DGLNCCLGetUniqueId' is not defined

To Reproduce

Steps to reproduce the behavior:

  1. git clone https://github.com/mufeili/dgl.git -b simplify
  2. Install DGL from source
  3. cd dgl/examples/pytorch/rgcn
  4. python entity_sample.py -d am --n-bases 40 --gpu 0 --fanout '35,35' --batch-size 64 --n-hidden 16 --use-self-loop --n-epochs=20 --dgl-sparse --sparse-lr 0.02 --dropout 0.7

Expected behavior

Running without error

Environment

  • DGL Version (e.g., 1.0):
  • Backend Library & Version (e.g., PyTorch 0.4.1, MXNet/Gluon 1.3): PyTorch 1.10.0
  • OS (e.g., Linux): Linux
  • How you installed DGL (conda, pip, source):
  • Build command you used (if compiling from source):
  • Python version: 3.8
  • CUDA/cuDNN version (if applicable): 10.2
  • GPU models and configuration (e.g. V100):
  • Any other relevant information:

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

nv-dlasalle commented, Jan 13, 2022

Thanks. I think there are really two issues here:

  1. When only a single GPU is used to store the embedding, we end up depending on the ability to create an NCCL unique ID, which isn’t necessary.
  2. The SparseOptimizer fails to check whether DGL was built with NCCL (-DUSE_NCCL=ON) before attempting to call NCCL-specific operations.

To fix 1, we could either a) add a third code path for the case where we want to store things on the GPU but do not need NCCL, or b) add support for things like _CAPI_DGLNCCLGetUniqueId() and communicator creation when NCCL is not enabled, but restrict them to communicators of size 1, where no communication actually needs to take place.
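Option b) could look roughly like the sketch below. It is only an illustration in plain Python, not DGL code: SingleProcessCommunicator, create_communicator, all_reduce_sum and the nccl_enabled flag are hypothetical names made up for this example. The idea is that a communicator of size 1 never has to talk to anyone, so it can be created without NCCL, while larger communicators still require an NCCL build.

```python
from dataclasses import dataclass


@dataclass
class SingleProcessCommunicator:
    """Hypothetical stand-in for a communicator of size 1 (not DGL's class)."""

    size: int = 1
    rank: int = 0

    def all_reduce_sum(self, values):
        # With a single participant, the sum over all ranks is the input itself,
        # so no communication library is needed.
        return list(values)


def create_communicator(size, rank, nccl_enabled):
    """Create a communicator, requiring NCCL only when size > 1.

    `nccl_enabled` is an assumed flag that says whether the library
    was built with NCCL support.
    """
    if size < 1 or not 0 <= rank < size:
        raise ValueError(f"invalid size/rank: {size}/{rank}")
    if size == 1:
        return SingleProcessCommunicator()
    if not nccl_enabled:
        raise RuntimeError("communicators of size > 1 require a build with NCCL")
    # Real NCCL setup is deliberately left out of this sketch.
    raise NotImplementedError("NCCL-backed communicators are outside this sketch")
```

With this shape, a call equivalent to the one in the traceback (size 1, rank 0) would succeed on a build without NCCL, and only a request for more than one rank would report the missing build option.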

mufeili commented, Jan 21, 2022

Generally, I think it’s better to fall back to a solution without NCCL if DGL is not built with NCCL.
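A fallback of this kind might be sketched as follows. This is an assumption-laden illustration, not the actual fix: it assumes, consistent with the NameError in the traceback, that _CAPI_DGLNCCLGetUniqueId is simply absent from the module namespace when DGL is built without NCCL. The helpers is_nccl_available and choose_update_path are hypothetical, and a plain dict stands in for the module namespace.

```python
NCCL_ENTRY_POINT = "_CAPI_DGLNCCLGetUniqueId"  # name taken from the traceback


def is_nccl_available(namespace):
    """Return True if the NCCL entry point is present and callable.

    `namespace` is a dict standing in for a module's globals; the assumption
    is that the name is missing when the library is built without NCCL.
    """
    return callable(namespace.get(NCCL_ENTRY_POINT))


def choose_update_path(namespace):
    """Pick the NCCL path when available, otherwise a non-NCCL fallback."""
    if is_nccl_available(namespace):
        return "nccl"
    # Checking up front avoids hitting a NameError deep inside step().
    return "local"


# Example: a namespace without the symbol selects the fallback path.
without_nccl = {}
with_nccl = {NCCL_ENTRY_POINT: lambda: b"unique-id"}
print(choose_update_path(without_nccl))
print(choose_update_path(with_nccl))
```

The check runs before any NCCL-specific call is attempted, so a build without NCCL takes the fallback path instead of failing at the first communicator setup.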
