Add optimal model size and stopping time feature

🚀 Feature request

The calculator blog post presented an automated way to find scaling laws with model size and compute budget on language modeling tasks. Adding it to the library would help save on training costs by picking an optimal model size and training time.

Motivation

Estimating how big a model to use and how long to train it is more of an art than a science. An automated tool to perform that task would allow researchers and practitioners to concentrate on the high-level parts of their projects as opposed to parameter tweaking.

Your contribution

I can submit a PR with my existing work, probably integrating it within Trainer and/or knockknock.

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 42
  • Comments: 14 (7 by maintainers)

Top GitHub Comments

6 reactions
TevenLeScao commented, Jun 9, 2020

> Great stuff, thank you! The energy estimates look 1000x worse than reality though, V100 running for 12 h should not consume 5432 kWh I think, else we’d be all dead. 5.4 kWh looks more reasonable.

Ah yes - I remember having doubts about that. I checked the library we used to estimate those again, and there might have been a unit conversion error. I’ll fix that ASAP tomorrow!

Edit: it’s fixed, thank you @lopuhin !
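
For reference, the sanity check behind this exchange can be sketched in a few lines. This is only a rough illustration, not the code the calculator uses: the 300 W figure is an assumed power draw (roughly the rated TDP of a V100 SXM2), and `energy_kwh` and its `overhead` parameter are hypothetical names introduced here.

```python
def energy_kwh(power_watts: float, hours: float, overhead: float = 1.0) -> float:
    """Estimate energy use in kWh for a device drawing `power_watts` for `hours`.

    `overhead` is a hypothetical multiplier for things like cooling; 1.0 means none.
    """
    watt_hours = power_watts * hours * overhead
    # Convert Wh to kWh. Skipping this division reports a value 1000x too large.
    return watt_hours / 1000.0


# Assumption: a single V100 drawing about 300 W for the whole run.
V100_WATTS = 300.0

estimate = energy_kwh(V100_WATTS, hours=12)
print(f"{estimate:.1f} kWh")
```

Under these assumptions a 12 h run lands in the low single digits of kWh, which is the same order of magnitude as the 5.4 kWh figure and three orders of magnitude below 5432 kWh.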

1 reaction
lopuhin commented, Jun 8, 2020

Great stuff, thank you! The energy estimates look 1000x worse than reality though, V100 running for 12 h should not consume 5432 kWh I think, else we’d be all dead. 5.4 kWh looks more reasonable.
