How to set num_training_steps in lr_scheduler properly
Usually I call something like this to set the scheduler:
```python
from transformers import get_linear_schedule_with_warmup

scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=warmup_step, num_training_steps=num_training_steps
)
```
and num_training_steps usually equals

```python
t_total = int(len(train_dataloader) * num_epochs)  # len(train_dataloader) = num_batches per epoch
```
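For context, here is a minimal sketch of that single-process setup end to end. The dataset, model, batch size, learning rate, and the 10% warmup ratio below are placeholder assumptions, not values from the issue:

```python
import torch
from torch.utils.data import DataLoader
from transformers import get_linear_schedule_with_warmup

# Placeholders -- substitute your own dataset and model.
train_dataloader = DataLoader(train_dataset, batch_size=32, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

num_epochs = 3
num_training_steps = int(len(train_dataloader) * num_epochs)  # batches per epoch * epochs
warmup_step = int(0.1 * num_training_steps)                   # assumed 10% warmup

scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=warmup_step, num_training_steps=num_training_steps
)

for epoch in range(num_epochs):
    for batch in train_dataloader:
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        scheduler.step()      # one scheduler step per optimizer step
        optimizer.zero_grad()
```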
If I use Accelerate, should I change num_training_steps to something like the following? And how should I understand it: I believe len(train_dataloader) then corresponds to the number of batches on each device.

```python
t_total = int(len(train_dataloader) * num_epochs // accelerator.num_processes)
```

All of the above happens before accelerator.prepare.
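One common way to sidestep the manual division is sketched below: call accelerator.prepare on the dataloader first and only then measure its length, since the prepared dataloader is already sharded per process. This is a sketch, not an official answer from the issue; names such as model, num_epochs, and warmup_step are assumed to be defined as above.

```python
from accelerate import Accelerator
from transformers import get_linear_schedule_with_warmup

accelerator = Accelerator()

# prepare() shards the dataloader across processes, so after this call
# len(train_dataloader) is roughly ceil(original_num_batches / accelerator.num_processes).
model, optimizer, train_dataloader = accelerator.prepare(model, optimizer, train_dataloader)

num_training_steps = len(train_dataloader) * num_epochs   # per-process optimizer steps
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=warmup_step, num_training_steps=num_training_steps
)

for epoch in range(num_epochs):
    for batch in train_dataloader:
        loss = model(**batch).loss
        accelerator.backward(loss)
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()
```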
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@allanj everything and anything that has to do with gradient accumulation, Accelerate will now handle for you. Just pass in the gradient_accumulation_steps arg and make no changes to your code, as if you weren't using gradient accumulation at all 😄

Since: https://github.com/huggingface/accelerate/blob/b0f8189d34fa42821ca041e2cba161db864c76b5/src/accelerate/scheduler.py#L31-L32
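For reference, the pattern that comment describes looks roughly like the sketch below. The value 2 and the surrounding variable names are placeholders; check the Accelerate gradient accumulation docs for your version.

```python
from accelerate import Accelerator

# Pass the accumulation steps to Accelerator itself; the loop body stays the
# same as a run without gradient accumulation.
accelerator = Accelerator(gradient_accumulation_steps=2)

model, optimizer, train_dataloader, scheduler = accelerator.prepare(
    model, optimizer, train_dataloader, scheduler
)

for batch in train_dataloader:
    # accumulate() decides when a "real" optimizer/scheduler step should
    # happen and skips gradient synchronization on the in-between steps.
    with accelerator.accumulate(model):
        loss = model(**batch).loss
        accelerator.backward(loss)
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()
```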