How to implement model checkpointing for best models

Hi,

How can I checkpoint a model based on accuracy instead of on the number of iterations? Currently I am using cfg.SOLVER.CHECKPOINT_PERIOD, which saves a checkpoint every fixed number of iterations. What do I need to do to save only the best weights, rather than saving the model every 5000 iterations?

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 4
  • Comments: 6

Top GitHub Comments

1 reaction
ppwwyyxx commented, Oct 8, 2020

With plain_train_net.py, you can add this logic using the return value of do_test: https://github.com/facebookresearch/detectron2/blob/5e2a6f62ef752c8b8c700d2e58405e4bede3ddbe/tools/plain_train_net.py#L175-L182

Alternatively, it can be implemented as a hook for use with the trainers, but that is often more work.
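As a rough illustration of the first suggestion, the sketch below keeps track of the best metric seen so far and calls a save function only when the metric improves. The class name `BestCheckpointTracker`, the checkpoint name `model_best`, and the metric path `("bbox", "AP")` are all hypothetical choices for this example; it assumes `do_test` returns a nested dict of evaluation results and that training runs in a single process.

```python
import math
from typing import Callable, Mapping, Optional, Sequence


class BestCheckpointTracker:
    """Call `save_fn` only when the monitored metric improves (illustrative helper)."""

    def __init__(
        self,
        save_fn: Callable[[str], None],
        metric_path: Sequence[str] = ("bbox", "AP"),  # assumed layout of the results dict
        higher_is_better: bool = True,
        name: str = "model_best",  # hypothetical checkpoint name
    ) -> None:
        self.save_fn = save_fn
        self.metric_path = tuple(metric_path)
        self.higher_is_better = higher_is_better
        self.name = name
        self.best: Optional[float] = None
        self.best_iteration: Optional[int] = None

    def update(self, results: Mapping, iteration: int) -> bool:
        """Record the metric from `results`; save and return True if it is a new best."""
        value = results
        for key in self.metric_path:  # walk into the nested results dict
            value = value[key]
        value = float(value)
        if math.isnan(value):  # a NaN metric never counts as an improvement
            return False
        if self.best is None:
            improved = True
        elif self.higher_is_better:
            improved = value > self.best
        else:
            improved = value < self.best
        if improved:
            self.best = value
            self.best_iteration = iteration
            self.save_fn(self.name)  # overwrites the previous best checkpoint
        return improved


# Possible use inside the training loop of plain_train_net.py (not run here):
#   tracker = BestCheckpointTracker(checkpointer.save)
#   ...
#   results = do_test(cfg, model)
#   tracker.update(results, iteration)
```

Because the same name is reused on every save, only one "best" file is kept on disk. Periodic checkpoints via cfg.SOLVER.CHECKPOINT_PERIOD can stay enabled alongside it if resuming from the latest iteration still matters.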

0 reactions
ppwwyyxx commented, Feb 2, 2020

The model computes losses when in training mode. So model(validation_inputs) will give you losses.

By writing the model yourself you’ll be able to decide what it returns if you’d like anything not supported by the existing models.

See also https://detectron2.readthedocs.io/tutorials/models.html#model-output-format and https://detectron2.readthedocs.io/tutorials/write-models.html
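The comment above suggests that a validation loss can be obtained by calling the model on validation inputs while it is in training mode. A minimal sketch of that idea follows, assuming the model returns a dict of named scalar losses per batch; the function name `mean_validation_loss` is hypothetical. With PyTorch, the call would normally be wrapped in `torch.no_grad()` so that no gradients are tracked.

```python
from typing import Callable, Iterable, Mapping


def mean_validation_loss(
    model: Callable[[object], Mapping[str, object]],
    batches: Iterable[object],
) -> float:
    """Average the summed loss terms over all validation batches."""
    total = 0.0
    count = 0
    for batch in batches:
        loss_dict = model(batch)  # assumed: {"loss_name": scalar, ...}
        # float() also works on scalar tensors, so the helper stays framework-agnostic.
        total += sum(float(v) for v in loss_dict.values())
        count += 1
    if count == 0:
        raise ValueError("no validation batches were provided")
    return total / count
```

Note that in training mode layers such as dropout and batch normalization behave as they do during training, so a loss computed this way is only a rough proxy for evaluation quality.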
