
Schedulers like get_linear_schedule_with_warmup need access to the length of the train dataset

See original GitHub issue

šŸ› Bug

If you’re using an LR scheduler that needs access to the number of batches in the train dataset, like @huggingface’s get_linear_schedule_with_warmup, there’s currently no way to access the dataset in configure_optimizers(), because it looks like that hook is called before train_dataloader().

It would be nice to have some way to load the datasets before the optimizers are configured and to make the dataset available to other methods, with something like self.train_dataset = train_dataset.

Code sample:

    from transformers import get_linear_schedule_with_warmup

    train_steps = int(len(train_dataset) / (batch_size * grad_steps) * epochs)
    scheduler = get_linear_schedule_with_warmup(
        optimizer, num_warmup_steps=int(0.1 * train_steps), num_training_steps=train_steps
    )
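
For reference, here is a minimal sketch of the workaround the report is asking for: load the dataset up front, keep it on self, and let configure_optimizers() derive the step count from it. It assumes a plain LightningModule; the FineTuner class and its constructor arguments are illustrative, not part of the original issue.

    import pytorch_lightning as pl
    from torch.optim import AdamW
    from torch.utils.data import DataLoader
    from transformers import get_linear_schedule_with_warmup

    class FineTuner(pl.LightningModule):
        def __init__(self, model, train_dataset, batch_size, grad_steps, epochs, lr):
            super().__init__()
            self.model = model
            # Keep a handle to the dataset so any hook, including
            # configure_optimizers(), can read its length.
            self.train_dataset = train_dataset
            self.batch_size, self.grad_steps = batch_size, grad_steps
            self.epochs, self.lr = epochs, lr

        def train_dataloader(self):
            return DataLoader(self.train_dataset, batch_size=self.batch_size, shuffle=True)

        def configure_optimizers(self):
            optimizer = AdamW(self.model.parameters(), lr=self.lr)
            # Same arithmetic as the snippet above, now that the dataset is on self.
            train_steps = int(len(self.train_dataset) / (self.batch_size * self.grad_steps) * self.epochs)
            scheduler = get_linear_schedule_with_warmup(
                optimizer, num_warmup_steps=int(0.1 * train_steps), num_training_steps=train_steps
            )
            return [optimizer], [{"scheduler": scheduler, "interval": "step"}]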

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 6
  • Comments: 20 (8 by maintainers)

Top GitHub Comments

4 reactions
dd1923 commented on Jun 4, 2021

Can this be re-opened? Why not do what @SokolovYaroslav is suggesting? If configure_optimizers() can be called after train_dataloader(), then users can simply save the length in a local variable.
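
A rough sketch of that suggestion, assuming an execution order in which train_dataloader() runs before configure_optimizers() (which is exactly what the comment asks for); the methods below belong to the same kind of LightningModule as above, and the num_train_batches attribute is purely illustrative:

    def train_dataloader(self):
        loader = DataLoader(self.train_dataset, batch_size=self.hparams.batch_size, shuffle=True)
        # Remember the number of batches per epoch for configure_optimizers().
        self.num_train_batches = len(loader)
        return loader

    def configure_optimizers(self):
        optimizer = AdamW(self.parameters(), lr=self.hparams.lr)
        total_steps = (self.num_train_batches
                       // self.hparams.accumulate_grad_batches
                       * self.hparams.epochs)
        scheduler = get_linear_schedule_with_warmup(
            optimizer, num_warmup_steps=int(0.1 * total_steps), num_training_steps=total_steps
        )
        return [optimizer], [{"scheduler": scheduler, "interval": "step"}]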

4 reactions
marrrcin commented on Mar 26, 2020

@SkafteNicki thanks for the response. That’s almost exactly what I did:

    @lru_cache()
    def total_steps(self):
        # Batches per epoch, divided by gradient accumulation, times the number
        # of epochs; cached so the dataloader is only built once for this.
        return len(self.train_dataloader()) // self.hparams.accumulate_grad_batches * self.hparams.epochs

    def configure_optimizers(self):
        optimizer = AdamW(self.model.parameters(), lr=self.hparams.lr)
        lr_scheduler = get_linear_schedule_with_warmup(
            optimizer,
            num_warmup_steps=self.hparams.warmup_steps,
            num_training_steps=self.total_steps(),
        )
        # "interval": "step" makes Lightning step the scheduler every training step.
        return [optimizer], [{"scheduler": lr_scheduler, "interval": "step"}]

If that’s the “recommended” way of doing it then I’m fine with that 😃

Read more comments on GitHub >

Top Results From Across the Web

Schedulers like get_linear_schedule_with_warmup need ...
Schedulers like get_linear_schedule_with_warmup need access to the length of the train dataset · Issue #1038 · Lightning-AI/lightning · GitHub.

Optimization - Hugging Face
The .optimization module provides: an optimizer with weight decay fixed that can be used to fine-tune models, and several schedules in the form...

pytorch get_linear_schedule_with_warmup - You.com | The AI ...
If you're using a lr scheduler that needs access to the number of batches in the train dataset like @huggingface's get_linear_schedule_with_warmup, there's ...

【Train】Deberta-v3-large baseline - Kaggle
The following is necessary if you want to use the fast tokenizer for deberta v2 or ... AutoConfig from transformers import get_linear_schedule_with_warmup, ...

Transformers for Multi-Regression — [PART2] | by Zeineb Ghrib
All the code sources can be retrieved from my Kaggle notebook ... In the same way, we want to get the local evaluation...
