
Validation during training: time incoherence

See original GitHub issue

Hi everyone,

I'm trying to run training and validation simultaneously for a class-incremental learning (CIL) algorithm, with eval_every = 1, so that I get the accuracy and the loss on the test set for each epoch. This is the code I use. Note that I set num_workers = 4 in the train call.

from torch.nn import CrossEntropyLoss
from torch.optim import Adam

# Avalanche import paths for recent releases (older releases expose LwF under avalanche.training.strategies).
from avalanche.training.plugins import EarlyStoppingPlugin
from avalanche.training.supervised import LwF

# model, device, eval_plugin (an EvaluationPlugin) and generic_scenario are defined earlier in the script.
# Stop training when Top1_Acc_Exp on the test stream stops improving for 2 evaluations.
esp_plugin = EarlyStoppingPlugin(patience=2, val_stream_name='test_stream', metric_name="Top1_Acc_Exp")

cl_strategy = LwF(
    model, Adam(model.parameters(), lr=0.001), CrossEntropyLoss(),
    train_mb_size=256, train_epochs=10, eval_mb_size=256,
    plugins=[esp_plugin], evaluator=eval_plugin,
    alpha=[0, 1], temperature=1,
    eval_every=1,  # evaluate on the eval streams after every training epoch
    device=device)

for experience in generic_scenario.train_stream:
    n_exp = experience.current_experience
    print("Start of experience: ", n_exp)
    print("Current Classes: ", experience.classes_in_this_experience)
    # Train on the current experience; evaluate on the test experiences seen so far.
    cl_strategy.train(experience, eval_streams=[generic_scenario.test_stream[0:n_exp + 1]], num_workers=4)
    print('Computed accuracy on the whole test set')

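The eval_plugin passed to the strategy is not shown in the issue. A minimal sketch of what it might look like, assuming standard accuracy/loss metrics and an interactive logger (these choices are assumptions, not taken from the issue):

from avalanche.evaluation.metrics import accuracy_metrics, loss_metrics
from avalanche.logging import InteractiveLogger
from avalanche.training.plugins import EvaluationPlugin

# Track accuracy and loss per experience and per stream, so the periodic
# evaluation triggered by eval_every=1 reports both after every epoch.
eval_plugin = EvaluationPlugin(
    accuracy_metrics(experience=True, stream=True),
    loss_metrics(experience=True, stream=True),
    loggers=[InteractiveLogger()])
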
This is the problem I ran into: while a training iteration only takes about 21 seconds, the periodic evaluation takes almost 3 minutes, even though the evaluation stream is roughly 5x smaller. I tried both the beta version and the latest version, and the behaviour is the same in both.
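
One way to quantify the gap is to time the two phases and compare the in-training (periodic) evaluation against a standalone eval call. This is a minimal sketch, not code from the issue, and it assumes the same cl_strategy and generic_scenario as above:

import time

for experience in generic_scenario.train_stream:
    n_exp = experience.current_experience
    seen_test = generic_scenario.test_stream[0:n_exp + 1]

    t0 = time.perf_counter()
    cl_strategy.train(experience, eval_streams=[seen_test], num_workers=4)
    print(f"train + periodic eval: {time.perf_counter() - t0:.1f} s")

    t0 = time.perf_counter()
    cl_strategy.eval(seen_test, num_workers=4)
    print(f"standalone eval: {time.perf_counter() - t0:.1f} s")

If the standalone call is much faster than the evaluation happening inside train, that points at the periodic evaluation path rather than at the data itself.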


Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
ggraffieti commented, May 5, 2022

Perfect @PabloMese! I'll reopen just to keep track of the bug; we'll close the issue once the fix is included in the official code 👍

0 reactions
ggraffieti commented, May 5, 2022

@PabloMese solved by the linked PR. Once it is accepted, you can reinstall the "nightly" version of the library and the bug will be gone 😃
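
Until the fix lands in a released version, a possible stopgap (not suggested in the thread, and whether it avoids the slowdown is an assumption) is to disable the in-training evaluation and run it explicitly after each experience, so that num_workers is set on the evaluation dataloader as well:

# Hypothetical workaround: build the strategy with eval_every=-1 so no evaluation
# runs inside .train(), then evaluate explicitly with the desired num_workers.
for experience in generic_scenario.train_stream:
    n_exp = experience.current_experience
    cl_strategy.train(experience, num_workers=4)
    cl_strategy.eval(generic_scenario.test_stream[0:n_exp + 1], num_workers=4)

Note that this bypasses the per-epoch evaluation that EarlyStoppingPlugin relies on, so early stopping would no longer trigger inside an experience.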

Read more comments on GitHub >

Top Results From Across the Web

  • Inconsistency in the use of the term “validation” in studies ...
    In technical machine learning workflows, it is often implied that model validation is based on some form of a cross-validation procedure, where the...
  • machine learning - Inconsistency in cross-validation results
    I am trying to train a Linear Discriminant Analysis (LDA) classifier to classify this data. The classifier is later to be used in...
  • Inconsistency in the use of the term “validation” in ... - PLOS
    Studies are inconsistent in the use of the term “validation”, with some using it to refer to tuning and others testing, which hinders...
  • Inconsistency in loss on SAME data for train and validation ...
    1 Answer ... If you set momentum=0.0 in the BatchNorm layer, the averaged statistics should match perfectly with the statistics from the...
  • Inference and Validation - Ryan Wingate
    To test for overfitting while training, we measure the performance on data not in the training set called the validation set.
