Reimplemented with Python 3.6 and PyTorch 1.0.1: is this problem caused by batch_size?
Hello tekin,
I have just reimplemented this project with Python 3.6 and PyTorch 1.0.1. Because of the limited GPU on my laptop (GTX 970M, 3 GB memory), I changed the batch_size from 32 to only 2 and trained for a whole night (epoch: 700 remains).
After that I tested my model and got the results below:
-----------------------------------
tensor to cuda : 0.000402
predict : 0.004263
get_region_boxes : 0.063616
eval : 0.009278
total : 0.077559
-----------------------------------
2019-04-11 13:23:42 Results of ape
2019-04-11 13:23:42 Acc using 5 px 2D Projection = 0.00%
2019-04-11 13:23:42 Acc using 10% threshold - 0.0103 vx 3D Transformation = 0.00%
2019-04-11 13:23:42 Acc using 5 cm 5 degree metric = 0.00%
2019-04-11 13:23:42 Mean 2D pixel error is 2978.765381, Mean vertex error is 2.461958, mean corner error is 409.030334
2019-04-11 13:23:42 Translation error: 2.461311 m, angle error: 143.237230 degree, pixel error: 2978.765628 pix
All the accuracies are zero!
I have checked the code and everything looks fine. Do you think this happens because the batch_size is too small, or is it caused by the Python and PyTorch versions?
Looking forward to your reply!
Issue Analytics
- State:
- Created 4 years ago
- Comments:10
Top GitHub Comments
@btekin Hi, after switching to another GPU with 4 GB of memory and changing the batch_size from 2 to 8, the accuracy is no longer zero. So it really is related to batch_size. I'm still a bit confused by this, though. Do you have any idea why the batch_size cannot be as small as 2?
With very small batch sizes, the learning rate is also set to a higher value in the current implementation. See https://github.com/microsoft/singleshotpose/blob/master/train.py#L386. Therefore, with small batch sizes you might want to adapt your learning rate as well for better convergence.
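To make the point above concrete, here is a rough sketch of how the batch size could change the step size. It assumes the step size the optimizer uses is the configured learning rate divided by the batch size, which is what the reply suggests; please check the linked line in train.py to confirm. The function names and the value 0.001 are made up for illustration and are not taken from the repository.

```python
def effective_lr(configured_lr: float, batch_size: int) -> float:
    """Step size the optimizer would use.

    ASSUMPTION: the training script divides the configured learning rate
    by the batch size. Verify this against the linked line in train.py.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return configured_lr / batch_size


def rescale_configured_lr(configured_lr: float,
                          ref_batch_size: int,
                          new_batch_size: int) -> float:
    """Configured learning rate that keeps the effective step size
    the same as it was with ref_batch_size (under the assumption above)."""
    return configured_lr * new_batch_size / ref_batch_size


if __name__ == "__main__":
    configured_lr = 0.001  # hypothetical value, not read from the repo config
    for bs in (32, 8, 2):
        print(f"batch_size={bs:2d}  effective lr={effective_lr(configured_lr, bs):.3e}")

    # Compensate for going from batch_size 32 down to 2.
    adjusted = rescale_configured_lr(configured_lr, ref_batch_size=32, new_batch_size=2)
    print(f"adjusted configured lr for batch_size=2: {adjusted:.3e}")
```

Under that assumption, going from batch_size 32 to 2 makes each update 16 times larger unless the configured learning rate is reduced by the same factor. This only compensates for the step size. Gradients computed from 2 samples are still noisier than those from 32, so an even lower learning rate or more epochs may be needed.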