Schedulers like get_linear_schedule_with_warmup need access to the length of the train dataset
See original GitHub issue

🐛 Bug
If you're using an LR scheduler that needs access to the number of batches in the train dataset, like @huggingface's get_linear_schedule_with_warmup, there's currently no way to access the dataset in configure_optimizers(), because it looks like it is called before train_dataloader().

It would be nice to have some way to load the datasets before the optimizers, and to make the dataset available to other methods with something like self.train_dataset = train_dataset.
Code sample:

```python
from transformers import get_linear_schedule_with_warmup

# total optimizer steps: batches per epoch, divided by
# gradient-accumulation steps, times the number of epochs
train_steps = int(len(train_dataset) / (batch_size * grad_steps) * epochs)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=int(0.1 * train_steps), num_training_steps=train_steps
)
```
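For context, here is a minimal sketch of how this ordering problem surfaces in a LightningModule; the class name, hyperparameter values, and the load_my_dataset() helper are illustrative placeholders, not code from the issue:

```python
import pytorch_lightning as pl
import torch
from torch.utils.data import DataLoader
from transformers import get_linear_schedule_with_warmup


class FinetuneModule(pl.LightningModule):
    def train_dataloader(self):
        # the dataset is only created here...
        self.train_dataset = load_my_dataset()  # hypothetical helper
        return DataLoader(self.train_dataset, batch_size=32)

    def configure_optimizers(self):
        optimizer = torch.optim.AdamW(self.parameters(), lr=2e-5)
        # ...but configure_optimizers() runs before train_dataloader(),
        # so self.train_dataset does not exist yet and this line raises
        # an AttributeError
        train_steps = int(len(self.train_dataset) / 32 * 3)  # batch_size=32, epochs=3
        scheduler = get_linear_schedule_with_warmup(
            optimizer,
            num_warmup_steps=int(0.1 * train_steps),
            num_training_steps=train_steps,
        )
        return [optimizer], [scheduler]
```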
Can this be re-opened? Why not do what @SokolovYaroslav is suggesting? If configure_optimizers() can be called after train_dataloader(), then users can simply save the length in a local variable.
@SkafteNicki thanks for the response. That's almost exactly what I did:
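The snippet itself isn't reproduced above, but the workaround under discussion — reading the dataloader length directly so the step count is available in configure_optimizers() — presumably looks something like this sketch (reusing the hypothetical FinetuneModule from earlier; the epoch count and warmup ratio are illustrative):

```python
    # inside the hypothetical FinetuneModule sketched above
    def configure_optimizers(self):
        optimizer = torch.optim.AdamW(self.parameters(), lr=2e-5)
        # build the dataloader ourselves just to read its length;
        # this sidesteps the call-order problem entirely
        train_steps = len(self.train_dataloader()) * 3  # 3 epochs, illustrative
        scheduler = get_linear_schedule_with_warmup(
            optimizer,
            num_warmup_steps=int(0.1 * train_steps),
            num_training_steps=train_steps,
        )
        # "interval": "step" advances the warmup schedule every batch
        return [optimizer], [{"scheduler": scheduler, "interval": "step"}]
```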
If that's the "recommended" way of doing it then I'm fine with that 🙂