
LR Finder doesn't restore original model weights?

See original GitHub issue

Hey! I love this repo, thanks for making it 💯

Everything works well except for one thing. After some digging around and experimenting, here’s what I’ve found:

Below are some figures for the training loss and training accuracy (on MNIST, using a resnet18).

Problem:

  1. Using LRFinder on a model, and then training with it afterwards, appears to hurt the model’s learning (see the pink curve below).

Solutions:

  2. Using LRFinder on a model, and manually restoring the weights, appears to train the model optimally (see the green curve below).
  3. Using LRFinder on a clone of the model, and then using the original model for training, appears to train the model optimally (see the green curve below).

Regarding the figure/graphs below, both models used the same hyperparameters.

An in-code example of option 1) would be similar to what was given in the README.md:

from torch import nn, optim
from torch_lr_finder import LRFinder

model = ...
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-7, weight_decay=1e-2)
lr_finder = LRFinder(model, optimizer, criterion, device="cuda")
lr_finder.range_test(trainloader, end_lr=100, num_iter=100)
lr_finder.plot()

# Then use "model" for training
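
For completeness, here is a minimal sketch of option 2) (manually restoring the weights). The initial_state variable is illustrative; copy.deepcopy matters because state_dict() returns references to the live tensors that the range test mutates in place:

import copy

from torch import nn, optim
from torch_lr_finder import LRFinder

model = ...
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-7, weight_decay=1e-2)

# Snapshot the weights before the range test overwrites them
initial_state = copy.deepcopy(model.state_dict())

lr_finder = LRFinder(model, optimizer, criterion, device="cuda")
lr_finder.range_test(trainloader, end_lr=100, num_iter=100)
lr_finder.plot()

# Restore the pre-range-test weights before real training
model.load_state_dict(initial_state)

# Then use "model" for training

Note that the range test also advances the optimizer’s internal state (e.g. Adam’s moment estimates); the benchmark results in the comments below suggest that discarding that state by re-creating the optimizer is harmless.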

An in-code example of option 3) would be:

from torch import nn, optim
from torch_lr_finder import LRFinder

model = ...
temp_model = ...  # create a model with the same architecture

# Copy the weights over (note state_dict() is a method call)
temp_model.load_state_dict(model.state_dict())

criterion = nn.CrossEntropyLoss()
# The optimizer must be built over temp_model's parameters, so the
# range test actually steps the copy rather than leaving it untrained
optimizer = optim.Adam(temp_model.parameters(), lr=1e-7, weight_decay=1e-2)

# Use the temp model in lr_finder
lr_finder = LRFinder(temp_model, optimizer, criterion, device="cuda")
lr_finder.range_test(trainloader, end_lr=100, num_iter=100)
lr_finder.plot()
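
Worth noting: current versions of torch-lr-finder expose lr_finder.reset(), which restores both the model and the optimizer to their pre-range-test state, so neither manual workaround should be needed there (assuming your installed version includes it):

from torch import nn, optim
from torch_lr_finder import LRFinder

model = ...
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-7, weight_decay=1e-2)

lr_finder = LRFinder(model, optimizer, criterion, device="cuda")
lr_finder.range_test(trainloader, end_lr=100, num_iter=100)
lr_finder.plot()
lr_finder.reset()  # restore model and optimizer to their initial state

# Then use "model" for training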

[Figure: training loss and training accuracy on MNIST with a resnet18; the pink curve (training after LRFinder without restoring weights) underperforms the green curves (weights restored / clone used).]

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments:23 (13 by maintainers)

Top GitHub Comments

1 reaction
AmarSaini commented, Jan 3, 2020

So I ran some experiments too; check out my project page: Optimizer Benchmarks

The Jupyter Notebooks are in the GitHub Repo; you can view them with the built-in notebook viewer!

Main conclusions from the project page:

  • OneCycle LR > Constant LR
  • Making a new optimizer vs. preserving state and re-using the same optimizer achieve very similar performance, i.e. discarding an optimizer’s state didn’t really hurt the model’s performance, with or without an LR scheduler (a sketch of both options follows the note below).

Note: these conclusions are based on the Adam optimizer and the OneCycle LR scheduler; I haven’t experimented with other optimizers to see if dropping their state is more impactful.
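
A minimal sketch of the two options being compared (the snapshot/restore calls are standard PyTorch; the variable names and learning rate are illustrative):

import copy

from torch import optim

# Option A: discard state by building a fresh optimizer after the range test
optimizer = optim.Adam(model.parameters(), lr=3e-4)

# Option B: preserve state by snapshotting before the range test...
saved_state = copy.deepcopy(optimizer.state_dict())
# ... run lr_finder.range_test(...) here ...
# ...and restoring it afterwards, keeping Adam's moment estimates
optimizer.load_state_dict(saved_state)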

1 reaction
AmarSaini commented, Dec 21, 2019

Oh nice!

Let me know if you’re looking for any help, I’d be more than happy to contribute on any part of this lovely repo! 😃
