[BUG] `pip install deepspeed==0.6.2` raises `ModuleNotFoundError: No module named 'op_builder'`


Describe the bug

I’d like to report a possible bug in the installation of 0.6.2 in particular.

To Reproduce

$ # === from the host ===
$ docker pull pytorchlightning/pytorch_lightning:base-cuda-py3.7-torch1.11
$ docker run --rm -it --gpus all --ipc host pytorchlightning/pytorch_lightning:base-cuda-py3.7-torch1.11
$ # === within the container ===
$ pip list | grep -e deepspeed -e torch
deepspeed                     0.6.1
torch                         1.11.0+cu113
torchmetrics                  0.8.0
torchtext                     0.12.0
torchvision                   0.12.0+cu113
$ pip uninstall deepspeed -y
$ pip install deepspeed==0.6.2
Collecting deepspeed==0.6.2
  Downloading deepspeed-0.6.2.tar.gz (542 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 542.4/542.4 KB 19.5 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [6 lines of output]
      Traceback (most recent call last):
        File "<string>", line 36, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-brk5ad19/deepspeed_e004efb3362a460e85f320ebf0a47444/setup.py", line 35, in <module>
          from op_builder import ALL_OPS, get_default_compute_capabilities, OpBuilder
      ModuleNotFoundError: No module named 'op_builder'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

FYI, installing other versions succeeds.

$ pip install deepspeed==0.6.1  # or other versions I've tried
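The traceback shows `setup.py` importing `op_builder` at build time, so one plausible cause (not confirmed anywhere in this issue) is that the 0.6.2 source archive did not ship that package. The sketch below checks a downloaded sdist for a given top-level package. `sdist_has_top_level_package` is a hypothetical helper written for illustration; it is not part of pip or DeepSpeed.

```python
import tarfile


def sdist_has_top_level_package(sdist_path: str, package: str) -> bool:
    """Return True if the archive holds <root>/<package>/ or <root>/<package>.py.

    Hypothetical helper. Assumes the usual sdist layout, where everything
    unpacks into a single root directory such as deepspeed-0.6.2/.
    """
    wanted = {package, package + ".py"}
    with tarfile.open(sdist_path, "r:*") as archive:
        for name in archive.getnames():
            parts = name.strip("/").split("/")
            # parts[0] is the root directory; parts[1] is the entry directly below it
            if len(parts) >= 2 and parts[1] in wanted:
                return True
    return False
```

With an archive on disk, a call such as `sdist_has_top_level_package("deepspeed-0.6.2.tar.gz", "op_builder")` would report whether the module that `setup.py` imports is actually present.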

Expected behavior

Successful installation with no error raised.

pip install deepspeed==0.6.2

ds_report output

Please run ds_report to give us details about your setup.

Screenshots

Please refer to the commands above or to our CI run: https://dev.azure.com/PytorchLightning/pytorch-lightning/_build/results?buildId=69269&view=logs&j=3afc50db-e620-5b81-6016-870a6976ad29&t=96e63b50-ac1f-5745-9558-05f9481cd089&l=17

System info (please complete the following information):

  • OS: Linux ad93fadeb85f 5.4.0-53-generic #59-Ubuntu SMP Wed Oct 21 09:38:44 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • GPU count and types [e.g. two machines with x8 A100s each]
  • Interconnects (if applicable) [e.g., two machines connected with 100 Gbps IB]
  • Python version: 3.7
  • Any other relevant info about your setup

Launcher context

n/a

Docker context

docker pull pytorchlightning/pytorch_lightning:base-cuda-py3.7-torch1.11

Additional context

Encountered this issue while working on https://github.com/PyTorchLightning/pytorch-lightning/issues/12881.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

akihironitta commented, Apr 27, 2022

Confirmed 0.6.3 works. Thank you for deploying the fix quickly.

jeffra commented, Apr 27, 2022

We’ve fixed the release and removed the 0.6.2 distribution since it does not install without error. Please install 0.6.3; the problem should be resolved now, but please re-open if you’re still seeing the error.
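Since 0.6.2 was removed, any requirements file that still pins it exactly will keep failing. A minimal sketch for spotting such pins follows. `find_broken_deepspeed_pins` is a hypothetical helper, and it only recognises plain `deepspeed==0.6.2` lines: extras, environment markers, and other specifier forms are deliberately out of scope.

```python
import re

# Exact pin to the release the maintainers removed (0.6.2, per the comment above).
# An optional trailing "# comment" is allowed; anything else on the line is not.
_BROKEN_PIN = re.compile(r"^\s*deepspeed\s*==\s*0\.6\.2\s*(?:#.*)?$", re.IGNORECASE)


def find_broken_deepspeed_pins(requirements_text: str) -> list:
    """Return the 1-based line numbers that pin deepspeed==0.6.2."""
    return [
        number
        for number, line in enumerate(requirements_text.splitlines(), start=1)
        if _BROKEN_PIN.match(line)
    ]
```

Any line it reports can be changed to `deepspeed==0.6.3`, the version confirmed working in this thread, or loosened to `deepspeed!=0.6.2` using pip's standard exclusion specifier.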
