
Validation during training: time incoherence

See original GitHub issue

Hi everyone,

I'm trying to run training and validation simultaneously for a class-incremental learning (CIL) algorithm, with eval_every = 1, so that I get the accuracy and the loss on the test set for each epoch. This is the code I use. Note that I set num_workers = 4 in the train call.

from torch.nn import CrossEntropyLoss
from torch.optim import Adam

# Avalanche import paths for recent releases (older releases expose LwF under avalanche.training.strategies).
from avalanche.training.plugins import EarlyStoppingPlugin
from avalanche.training.supervised import LwF

# model, device, eval_plugin (an EvaluationPlugin) and generic_scenario are defined earlier in the script.
# Stop training when Top1_Acc_Exp on the test stream stops improving for 2 evaluations.
esp_plugin = EarlyStoppingPlugin(patience=2, val_stream_name='test_stream', metric_name="Top1_Acc_Exp")

cl_strategy = LwF(
    model, Adam(model.parameters(), lr=0.001), CrossEntropyLoss(),
    train_mb_size=256, train_epochs=10, eval_mb_size=256,
    plugins=[esp_plugin], evaluator=eval_plugin,
    alpha=[0, 1], temperature=1,
    eval_every=1,  # evaluate on the eval streams after every training epoch
    device=device)

for experience in generic_scenario.train_stream:
    n_exp = experience.current_experience
    print("Start of experience: ", n_exp)
    print("Current Classes: ", experience.classes_in_this_experience)
    # Train on the current experience; evaluate on the test experiences seen so far.
    cl_strategy.train(experience, eval_streams=[generic_scenario.test_stream[0:n_exp + 1]], num_workers=4)
    print('Computed accuracy on the whole test set')

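The eval_plugin passed to the strategy is not shown in the issue. A minimal sketch of what it might look like, assuming standard accuracy/loss metrics and an interactive logger (these choices are assumptions, not taken from the issue):

from avalanche.evaluation.metrics import accuracy_metrics, loss_metrics
from avalanche.logging import InteractiveLogger
from avalanche.training.plugins import EvaluationPlugin

# Track accuracy and loss per experience and per stream, so the periodic
# evaluation triggered by eval_every=1 reports both after every epoch.
eval_plugin = EvaluationPlugin(
    accuracy_metrics(experience=True, stream=True),
    loss_metrics(experience=True, stream=True),
    loggers=[InteractiveLogger()])
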
This is the problem I ran into: while a training iteration only takes about 21 seconds, the periodic evaluation takes almost 3 minutes, even though the evaluation stream is roughly 5x smaller. I tried both the beta version and the latest version, and the behaviour is the same in both.
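
One way to quantify the gap is to time the two phases and compare the in-training (periodic) evaluation against a standalone eval call. This is a minimal sketch, not code from the issue, and it assumes the same cl_strategy and generic_scenario as above:

import time

for experience in generic_scenario.train_stream:
    n_exp = experience.current_experience
    seen_test = generic_scenario.test_stream[0:n_exp + 1]

    t0 = time.perf_counter()
    cl_strategy.train(experience, eval_streams=[seen_test], num_workers=4)
    print(f"train + periodic eval: {time.perf_counter() - t0:.1f} s")

    t0 = time.perf_counter()
    cl_strategy.eval(seen_test, num_workers=4)
    print(f"standalone eval: {time.perf_counter() - t0:.1f} s")

If the standalone call is much faster than the evaluation happening inside train, that points at the periodic evaluation path rather than at the data itself.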


Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
ggraffieti commented, May 5, 2022

Perfect @PabloMese! I'll reopen just to keep track of the bug; we'll close the issue once the fix is included in the official code 👍

0 reactions
ggraffieti commented, May 5, 2022

@PabloMese solved by the linked PR. Once it is accepted, you can reinstall the "nightly" version of the library and the bug will be gone 😃
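
Until the fix lands in a released version, a possible stopgap (not suggested in the thread, and whether it avoids the slowdown is an assumption) is to disable the in-training evaluation and run it explicitly after each experience, so that num_workers is set on the evaluation dataloader as well:

# Hypothetical workaround: build the strategy with eval_every=-1 so no evaluation
# runs inside .train(), then evaluate explicitly with the desired num_workers.
for experience in generic_scenario.train_stream:
    n_exp = experience.current_experience
    cl_strategy.train(experience, num_workers=4)
    cl_strategy.eval(generic_scenario.test_stream[0:n_exp + 1], num_workers=4)

Note that this bypasses the per-epoch evaluation that EarlyStoppingPlugin relies on, so early stopping would no longer trigger inside an experience.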

Read more comments on GitHub >

Top Results From Across the Web

  • Inconsistency in the use of the term “validation” in studies ...
    In technical machine learning workflows, it is often implied that model validation is based on some form of a cross-validation procedure, where the...
  • machine learning - Inconsistency in cross-validation results
    I am trying to train a Linear Discriminant Analysis (LDA) classifier to classify this data. The classifier is later to be used in...
  • Inconsistency in the use of the term “validation” in ... - PLOS
    Studies are inconsistent in the use of the term “validation”, with some using it to refer to tuning and others testing, which hinders...
  • Inconsistency in loss on SAME data for train and validation ...
    1 Answer ... If you set momentum=0.0 in the BatchNorm layer, the averaged statistics should match perfectly with the statistics from the...
  • Inference and Validation - Ryan Wingate
    To test for overfitting while training, we measure the performance on data not in the training set called the validation set.
