Install Failure on GCP Deep Learning VM
I created a simple GCP Deep Learning VM: https://cloud.google.com/deep-learning-vm/
I followed the install directions, and the install failed with errors:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .
...
Command "/opt/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-req-build-j0qgf5ds/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code,
__file__, 'exec'))" --cpp_ext --cuda_ext install --record /tmp/pip-record-1yr2fag5/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-req-build-j0qgf5ds/
Exception information:
Traceback (most recent call last):
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 143, in main
status = self.run(options, args)
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 366, in run
use_user_site=options.use_user_site,
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/req/__init__.py", line 49, in install_given_reqs
**kwargs
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/req/req_install.py", line 791, in install
spinner=spinner,
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/utils/misc.py", line 705, in call_subprocess
% (command_desc, proc.returncode, cwd))
The Python-only option also failed:
pip install -v --no-cache-dir .
...
Command "/opt/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-req-build-eedemek6/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code,
__file__, 'exec'))" install --record /tmp/pip-record-ehl5a4y7/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-req-build-eedemek6/
Exception information:
Traceback (most recent call last):
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 143, in main
status = self.run(options, args)
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 366, in run
use_user_site=options.use_user_site,
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/req/__init__.py", line 49, in install_given_reqs
**kwargs
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/req/req_install.py", line 791, in install
spinner=spinner,
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/utils/misc.py", line 705, in call_subprocess
% (command_desc, proc.returncode, cwd))
It would seem that installation on a GCP Deep Learning VM would be one of the tested use cases here, no? If it doesn’t work there of all places, where is it intended to work?
@see-- this works! I was able to successfully install on a GCP VM with the following commands:
UPDATE 1: On running a mixed precision model with the above install I get the following warning:
Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback.
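A quick way to see whether the compiled extensions are visible to the interpreter you are running is to look the modules up without importing them. This is only a sketch: the module name amp_C is an assumption about what the --cuda_ext build produces, and extension_status is a hypothetical helper, not part of apex.

```python
import importlib.util

# Assumed module names: "apex" is the Python package, and "amp_C" is assumed
# to be the compiled extension produced by a --cuda_ext build. Adjust as needed.
CANDIDATE_MODULES = ("apex", "amp_C")


def extension_status(names=CANDIDATE_MODULES):
    """Return {module_name: bool} saying whether each module can be located.

    find_spec only locates a top-level module; it does not import it, so this
    check does not need torch or a GPU to be available.
    """
    status = {}
    for name in names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat a broken or half-initialised module as not available.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, found in extension_status().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If apex is found but the extension module is missing, that would be consistent with the Python-only fallback the warning describes.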
Installing instead with the following line removed the warning:
Using --user worked for me with python3. I guess that is what you get from using 3 different python versions.
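With several Python versions on one VM, it can help to confirm which interpreter and which pip are actually in use before installing. The sketch below is an illustration only (interpreter_report is a hypothetical helper). It reports the running interpreter, its user site directory (where a --user install would land), and the pip bound to that same interpreter.

```python
import site
import subprocess
import sys


def interpreter_report():
    """Collect basic facts about the interpreter running this script."""
    # Ask this exact interpreter for its pip, rather than whatever "pip"
    # happens to be first on PATH.
    pip = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
        "user_site": site.getusersitepackages(),
        "pip": pip.stdout.strip() or pip.stderr.strip(),
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

Running the script with each candidate interpreter (for example /opt/anaconda3/bin/python from the traceback above, and python3) shows whether they resolve to different installations.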