
learner.fit doesn't show results after every epoch

See original GitHub issue

Hi @kaushaltrivedi, I used:

learner.fit(epochs=6,
            lr=6e-5,
            validate=True,  # evaluate the model after each epoch
            schedule_type="warmup_cosine")

However, that code only evaluates after the whole training run, not after each epoch. What could I do? Thanks
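One workaround (a sketch, not fast-bert's documented behaviour) is to drive training one epoch at a time and evaluate in between. The helper below is hypothetical; it assumes `learner` is a fast-bert `BertLearner` and that `validate()` returns the metrics computed on the validation set:

```python
# Hypothetical helper: run one epoch at a time and evaluate in between.
# `learner` is assumed to be a fast-bert BertLearner; `validate()` is
# assumed to return the metrics computed on the validation set.
def fit_with_eval(learner, epochs, lr):
    history = []
    for _ in range(epochs):
        learner.fit(epochs=1, lr=lr, schedule_type="warmup_cosine")
        history.append(learner.validate())
    return history
```

This trades a little scheduler smoothness (the learning-rate schedule restarts each call) for guaranteed per-epoch metrics.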

Issue Analytics

  • State: open
  • Created: 4 years ago
  • Comments: 7

Top GitHub Comments

2 reactions
amin-nejad commented, Aug 14, 2019

I had the same issue of no metrics being printed at all. It turns out the logger's default level only emits warning messages, not info messages. If you are using the root logger (e.g. logger = logging.getLogger()), run this line before you create the learner object:

logging.basicConfig(level=logging.NOTSET)

If you are defining a custom logger yourself (e.g. logger = logging.getLogger("my-logger")), the same call still works, because a named logger with no level of its own inherits the root logger's level:

logging.basicConfig(level=logging.NOTSET)

Now the training process will print out the loss as well as any other metrics you passed to the learner object.

Alternatively, you can always view the training progress, either live or after the fact, using TensorBoard. The training run creates a folder called tensorboard containing all the event files.

1 reaction
vondersam commented, Sep 7, 2019

No, I didn’t get any validation metric results, not even after each epoch. This is the code I’m using:

databunch = BertDataBunch(data_dir=BERT_DATA_PATH/fold,
                          label_dir=LABEL_PATH,
                          tokenizer=args['bert_model'],
                          train_file=f'train{is_masked}.csv',
                          val_file=f'val{is_masked}.csv',
                          test_data=None,
                          text_col="text",
                          label_col=labels_index,
                          batch_size_per_gpu=args['train_batch_size'],
                          max_seq_length=args['max_seq_length'],
                          multi_gpu=multi_gpu,
                          multi_label=True,
                          model_type='bert')
learner = BertLearner.from_pretrained_model(databunch,
                                            pretrained_path=args['bert_model'],
                                            metrics=metrics,
                                            device=device,
                                            logger=logger,
                                            finetuned_wgts_path=None,
                                            warmup_steps=500,
                                            output_dir=fold_dir,
                                            is_fp16=args['fp16'],
                                            loss_scale=args['loss_scale'],
                                            multi_gpu=multi_gpu,
                                            multi_label=True,
                                            logging_steps=50)
learner.fit(args['num_train_epochs'], lr=args['learning_rate'], schedule_type="warmup_linear")
learner.save_model()

And this is what the logs look like (screenshot attached to the original issue).

