Can I train the model based on the files in the "checkpoints" dir?

I copied the files from the checkpoints dir into ./output/ctpn_end2end/voc_2007_trainval/, but when I run python3 ./ctpn/train_net.py, I get the following error:

Traceback (most recent call last):
  File "./ctpn/train_net.py", line 37, in <module>
    restore=bool(int(cfg.TRAIN.restore)))
  File "/home/fant/projects/python/text-detection-ctpn-master/lib/fast_rcnn/train.py", line 233, in train_net
    sw.train_model(sess, max_iters, restore=restore)
  File "/home/fant/projects/python/text-detection-ctpn-master/lib/fast_rcnn/train.py", line 187, in train_model
    if last_snapshot_iter != iter:
UnboundLocalError: local variable 'iter' referenced before assignment

How can I resolve it?
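
A note on the traceback: an UnboundLocalError on a name like `iter` usually means the name is assigned only inside a loop whose body never ran. The sketch below reproduces that pattern in isolation. It is not the code from lib/fast_rcnn/train.py; the function names, the `start_iter` parameter, and the idea that the restored iteration is already at or beyond `max_iters` are all assumptions made for illustration.

```python
def train_model(start_iter, max_iters):
    """Hypothetical stand-in for a training loop that resumes at start_iter."""
    last_snapshot_iter = -1
    for it in range(start_iter, max_iters):
        pass  # one training step would run here
    # If the loop body never ran, `it` was never bound and this line raises
    # UnboundLocalError, matching the shape of the reported traceback.
    if last_snapshot_iter != it:
        last_snapshot_iter = it
    return last_snapshot_iter


def train_model_guarded(start_iter, max_iters):
    """Same loop, but it fails early with a readable message."""
    if start_iter >= max_iters:
        raise ValueError(
            f"restored iteration {start_iter} is not below max_iters "
            f"{max_iters}; increase max_iters to continue training"
        )
    return train_model(start_iter, max_iters)
```

If this is what is happening, comparing the iteration at which the copied checkpoint was saved with the configured maximum number of iterations would be a reasonable first check.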

Issue Analytics

  • State: open
  • Created: 6 years ago
  • Comments: 5

Top GitHub Comments

1 reaction
cipri-tom commented, Feb 9, 2018

@FantDing the loss tells you almost nothing, unless it is diverging. Remember it is an average of many things. Try to evaluate the model and see how it performs.

Next week I’ll try to integrate an eval step during training, so you can pass a validation set and check the accuracy in TensorBoard.
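
As a rough illustration of what "evaluate the model" could mean for a text detector, the sketch below computes box-level precision and recall by matching predicted boxes to ground-truth boxes on intersection over union (IoU). This is not the repository's evaluation code or an official benchmark protocol; the `(x1, y1, x2, y2)` box format, the greedy matching, and the 0.5 threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predicted, ground_truth, threshold=0.5):
    """Greedy one-to-one matching of predicted boxes to ground-truth boxes."""
    unmatched = list(ground_truth)
    true_pos = 0
    for p in predicted:
        # Pick the still-unmatched ground-truth box that overlaps p the most.
        best = max(unmatched, key=lambda g: iou(p, g), default=None)
        if best is not None and iou(p, best) >= threshold:
            unmatched.remove(best)
            true_pos += 1
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

Tracking these two numbers on a held-out set of images after each snapshot gives a more direct picture of detection quality than the averaged training loss.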

1 reaction
cipri-tom commented, Feb 8, 2018

It depends on what you train on. The model in checkpoints was trained as detailed in the readme. If you want to continue training with your own images, I imagine they are very different, so the model will be “confused” and it is normal to have a high loss.

If you want to train on a different set of images, I suggest you train from scratch. If it is too slow for you, I know of a couple of optimisations which I implemented in my fork.

Top Results From Across the Web

How to Checkpoint Deep Learning Models in Keras
A simpler checkpoint strategy is to save the model weights to the same file if and only if the validation accuracy improves. This...

Training checkpoints | TensorFlow Core
The persistent state of a TensorFlow model is stored in tf.Variable objects. These can be constructed directly, but are often created through high-level...

Checkpoints | Data Version Control · DVC
This guide covers how to implement checkpoints in an ML project using DVC. We're going to train a model to identify handwritten digits...

Checkpoints in Deep Learning - INDUSMIC
Checkpoints does not contain information about the model or the training nor do they contain any of the computations defined by the model,...

Use Checkpoints in Amazon SageMaker - AWS Documentation
Use checkpoints in Amazon SageMaker to save the state of models. Training can be resumed from the saved checkpoints.
