can't train with cpu only

For certain reasons, I want to train the model on a small dataset on a PC with no GPU, but Python raises an AssertionError, as shown here:

Traceback (most recent call last):
  File "/home/snowyjune/local/faster-rcnn.pytorch/faster-rcnn.pytorch/trainval_net.py", line 336, in <module>
    loss.backward()

  File "/home/snowyjune/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)

  File "/home/snowyjune/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 89, in backward
    allow_unreachable=True)  # allow_unreachable flag

  File "/home/snowyjune/local/faster-rcnn.pytorch/faster-rcnn.pytorch/lib/model/roi_align/functions/roi_align.py", line 38, in backward
    assert((self.feature_size is not None) and (grad_output.is_cuda))

AssertionError

It seems that the program is designed not to support training with CPU. Is that true?
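
To make the failing condition concrete, here is a minimal stand-alone sketch of the check in the last traceback frame. It does not use PyTorch: `FakeGrad`, `backward_guard`, and `guard_passes` are hypothetical names invented for illustration, and the feature-size tuple is an arbitrary placeholder. The only thing taken from the traceback is the asserted condition itself, which requires `grad_output.is_cuda` to be true.

```python
class FakeGrad:
    """Hypothetical stand-in for a gradient tensor; models only the is_cuda flag."""

    def __init__(self, is_cuda):
        self.is_cuda = is_cuda


def backward_guard(feature_size, grad_output):
    """Mirror the condition asserted in the roi_align.py backward() frame."""
    assert (feature_size is not None) and grad_output.is_cuda


def guard_passes(feature_size, grad_output):
    """Return True if the guard lets the call through, False if it asserts."""
    try:
        backward_guard(feature_size, grad_output)
        return True
    except AssertionError:
        return False


# Arbitrary placeholder shape; only "is not None" matters to the guard.
feature_size = (1, 256, 38, 50)

cpu_ok = guard_passes(feature_size, FakeGrad(is_cuda=False))  # gradient on CPU
gpu_ok = guard_passes(feature_size, FakeGrad(is_cuda=True))   # gradient on GPU
print(cpu_ok, gpu_ok)
```

Under these assumptions the guard rejects any gradient that is not on a CUDA device, which matches the AssertionError seen when training without a GPU. Whether the rest of the repository has a CPU code path is not something this sketch can show.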

Issue Analytics

  • State: open
  • Created: 5 years ago
  • Comments: 18 (5 by maintainers)

Top GitHub Comments

1 reaction
jihwan1008 commented, Oct 12, 2018

@SnowyJune973 Sorry to bother you, but did you succeed in running ‘sh make.sh’? Did you not get any cffi errors?

0 reactions
EMCP commented, May 10, 2020

If you’re using COCO-based datasets and ResNet-101, you can give my implementation a try: https://github.com/EMCP/faster-rcnn.pytorch

It uses PyTorch 1.5.x and removes the requirement to compile pycocotools yourself, plus some other things.
