Install Failure on GCP Deep Learning VM

I created a simple GCP Deep Learning VM: https://cloud.google.com/deep-learning-vm/

I followed the install directions, and the install failed with errors:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .

...
Command "/opt/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-req-build-j0qgf5ds/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" --cpp_ext --cuda_ext install --record /tmp/pip-record-1yr2fag5/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-req-build-j0qgf5ds/
Exception information:
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 143, in main
    status = self.run(options, args)
  File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 366, in run
    use_user_site=options.use_user_site,
  File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/req/__init__.py", line 49, in install_given_reqs
    **kwargs
  File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/req/req_install.py", line 791, in install
    spinner=spinner,
  File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/utils/misc.py", line 705, in call_subprocess
    % (command_desc, proc.returncode, cwd))

The Python-only option also failed:

pip install -v --no-cache-dir .

...
Command "/opt/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-req-build-eedemek6/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-ehl5a4y7/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-req-build-eedemek6/
Exception information:
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 143, in main
    status = self.run(options, args)
  File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 366, in run
    use_user_site=options.use_user_site,
  File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/req/__init__.py", line 49, in install_given_reqs
    **kwargs
  File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/req/req_install.py", line 791, in install
    spinner=spinner,
  File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/utils/misc.py", line 705, in call_subprocess
    % (command_desc, proc.returncode, cwd))

I would have expected installation on a GCP Deep Learning VM to be one of the tested use cases here, no? If it doesn’t work there of all places, where is it intended to work?

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 21 (17 by maintainers)

Top GitHub Comments

glenn-jocher commented, Jun 14, 2019 (7 reactions)

@see-- this works! I was able to successfully install on a GCP VM with the following commands:

source activate base
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir . --user

UPDATE 1: Running a mixed precision model with the above install produces the following warning:

Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback.

Installing with the following commands instead removed the warning:

source activate base
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" . --user
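
One way to confirm which extensions actually got built, without waiting for the fallback warning at training time, is to check whether the compiled modules can be located by the interpreter. The sketch below is only illustrative: the module names `apex_C` and `amp_C` are assumptions about what apex's setup.py produces for `--cpp_ext` and `--cuda_ext`, and the helper `find_built_extensions` is a hypothetical name, not part of apex. It only looks the modules up; it does not import them.

```python
import importlib.util

# ASSUMPTION: these are the compiled modules apex builds for each flag.
# Adjust the names if your apex version uses different ones.
APEX_EXTENSIONS = {"--cpp_ext": "apex_C", "--cuda_ext": "amp_C"}


def find_built_extensions(modules=APEX_EXTENSIONS):
    """Map each build flag to True if its module can be found, else False."""
    return {
        flag: importlib.util.find_spec(name) is not None
        for flag, name in modules.items()
    }


if __name__ == "__main__":
    for flag, present in find_built_extensions().items():
        status = "found" if present else "missing"
        print(f"{flag}: {status}")
```

Run it with the same interpreter that pip installed into (for example, after `source activate base`); a "missing" entry would suggest the corresponding flag did not take effect for that environment.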

see-- commented, Jun 7, 2019 (5 reactions)

Using --user worked for me with python3. I guess that is what you get from using 3 different python versions.

pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" . --user
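
The thread does not state why `--user` helps. One plausible explanation, offered here only as an assumption, is that the active interpreter's system site-packages directory is not writable by the current user, so pip can only succeed when it targets the per-user directory. A minimal sketch for checking this, using only the standard library (`describe_install_targets` is a hypothetical helper name):

```python
import os
import site
import sys


def describe_install_targets():
    """Report the active interpreter and where pip could install packages."""
    # Older virtualenv builds ship a site module without getsitepackages().
    get_system_dirs = getattr(site, "getsitepackages", lambda: [])
    return {
        "interpreter": sys.executable,
        # True means the current user can write to that directory.
        "system_site_packages": {
            path: os.access(path, os.W_OK) for path in get_system_dirs()
        },
        # Directory that `pip install --user` targets.
        "user_site_packages": site.getusersitepackages(),
    }


if __name__ == "__main__":
    info = describe_install_targets()
    print("interpreter:", info["interpreter"])
    for path, writable in info["system_site_packages"].items():
        print(f"system site-packages: {path} (writable: {writable})")
    print("user site-packages:", info["user_site_packages"])
```

Running this with each Python on the machine (for example `python`, `python3`, and `/opt/anaconda3/bin/python`) shows which interpreter a given `pip` belongs to and whether a non-`--user` install could write to it.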