bug in gradient accumulation example
See original GitHub issue.

Shouldn't this condition be `step % gradient_accumulation_steps != 0`, because we want to avoid the gradient-averaging update at every step other than each `gradient_accumulation_steps`-th step?
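For context, here is a minimal sketch of the gradient accumulation pattern under discussion (illustrative names only, not the exact code from the example in question): the loss is scaled by the accumulation factor, gradients accumulate across several micro-batches, and `optimizer.step()` runs once per effective batch.

```python
import torch
from torch import nn

gradient_accumulation_steps = 5

# Toy model/data so the sketch is self-contained and runnable.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()
dataloader = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(20)]

model.train()
optimizer.zero_grad()
for step, (inputs, targets) in enumerate(dataloader):
    # Scale the loss so the accumulated gradient matches the average
    # over the micro-batches making up one effective batch.
    loss = loss_fn(model(inputs), targets) / gradient_accumulation_steps
    loss.backward()  # gradients keep accumulating in .grad

    # Update once every gradient_accumulation_steps micro-batches.
    # With 0-based `step`, this fires at steps 4, 9, 14, ...
    if (step + 1) % gradient_accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```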
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Let’s say `gradient_accumulation_steps = 5`. In the current code, `optimizer.step()` is getting called at steps 1, 2, 3, 4, 6, 7, 8, 9, 11, ...; instead it should be called at steps 0, 5, 10, 15, 20, ..., right? Basically the update is in the opposite logical block.

Thanks for the fix!
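As a quick sanity check of the comment above (my own sketch, not code from the example), this enumerates which 0-based steps each variant of the condition fires on when `gradient_accumulation_steps = 5`:

```python
gradient_accumulation_steps = 5
steps = range(12)

# Steps on which `step % gradient_accumulation_steps != 0` is true:
print([s for s in steps if s % gradient_accumulation_steps != 0])
# [1, 2, 3, 4, 6, 7, 8, 9, 11]  -> an update after almost every micro-batch

# Steps on which `step % gradient_accumulation_steps == 0` is true:
print([s for s in steps if s % gradient_accumulation_steps == 0])
# [0, 5, 10]  -> the schedule the comment says is intended

# A common alternative: update after every gradient_accumulation_steps-th
# backward pass, counting from step 0:
print([s for s in steps if (s + 1) % gradient_accumulation_steps == 0])
# [4, 9]
```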