Unify Loggers/Tracers API and Reduce Architectural Overhead
Dear all,
I’ve taken a deeper look at the new evaluation module. While it looks like a major improvement over the previous offering, I’m mostly concerned about two issues: 1. a confusing API and 2. extensibility complexity.
About 1., I find it difficult to conceptually separate the loggers from the tracers. Even if we separate the concepts into “visual loggers” like TensorBoard and “stdout/file” tracers, for the user it would be better to have a single object that configures and handles all the logging features in Avalanche. I’d like something like this:
```python
my_logger = Logger(stdout=True, log_file='./logs/my_log.txt',
                   text_style=DotTrace, dashboard=Tensorboard)

evaluation_plugin = EvaluationPlugin(
    EpochAccuracy(), TaskForgetting(), EpochLoss(),
    EpochTime(), AverageEpochTime(),
    ConfusionMatrix(num_classes=scenario.n_classes),
    logger=my_logger)
```
This would be neater and easier to use, I believe.
For 2., I’ve seen that the evaluation module architecture is rather complex (6 files and even more classes). We cannot expect a general user to read and understand all this material just to integrate a single metric. As we did in the past when rewriting the `BaseStrategy`, we should think of a way to simplify the architecture of the eval module so that it’s easier to understand and extend. The best would be to have a single `BaseMetric` you can inherit from, with handlers!
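Something along these lines, just as a rough sketch (class and handler names are placeholders, not a concrete design):

```python
class BaseMetric:
    """Rough sketch: a single base class users inherit from, overriding
    only the handlers (callbacks) they care about."""

    def before_training_epoch(self, strategy):
        pass

    def after_training_iteration(self, strategy):
        pass

    def after_training_epoch(self, strategy):
        pass

    def after_test_step(self, strategy):
        pass

    def result(self):
        raise NotImplementedError


class MyEpochLoss(BaseMetric):
    """Example custom metric: average loss over each training epoch."""

    def __init__(self):
        self.total_loss = 0.0
        self.n_iterations = 0

    def before_training_epoch(self, strategy):
        # Reset the accumulators at the start of every epoch.
        self.total_loss = 0.0
        self.n_iterations = 0

    def after_training_iteration(self, strategy):
        # Assumes the strategy exposes the last minibatch loss as `strategy.loss`.
        self.total_loss += float(strategy.loss)
        self.n_iterations += 1

    def result(self):
        return self.total_loss / max(self.n_iterations, 1)
```

A user would then only override the handlers they need, and the rest of the plugin machinery would stay hidden.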
What do you think?
Top GitHub Comments
Hi @AntonioCarta, I gave the PR a look with @vlomonaco. It looks great!
We had a brainstorming session and reached the following conclusions, let me know if you agree with them:

- Metric values should be retrievable after a call to `train` and `test`. In the very first API the strategy returned the results of the legacy `EvalProtocol`. With the introduction of the `EvaluationPlugin`, with it being one of the many plugins handled by the strategy, the metric values couldn’t be returned anymore. In order to allow the user to gather the metric values after a call to `strategy.test/train(...)`, we were thinking about adding a method to the evaluation plugin to retrieve the last computed values (or a complete history of values, like Keras/TF does, still to be decided); see the sketch below. This would require not deleting the `EvaluationPlugin` from the codebase.
- When passing metrics to the `EvaluationPlugin` constructor, we need to mark which metrics are to be logged in a textual way. @vlomonaco proposed adding a `log_to_text: bool` constructor parameter to the metrics. Since your tqdm-based trace method handles metrics in an agnostic way, adding such a parameter to the metrics makes sense. I’m also proposing to add a proper implementation of the `__str__` method to each metric supporting textual logging, so that we can control how each metric gets printed. This would also allow users to quickly implement a way to print their own custom metrics!
- Loggers and tracers should be passed to the `EvaluationPlugin` constructor. The user-facing interface would be as follows:
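Something like this, as a sketch (logger names, parameters and the retrieval method are placeholders, still to be finalized):

```python
# Sketch only: metrics, loggers and the retrieval method below are
# placeholders, not the final API.
eval_plugin = EvaluationPlugin(
    EpochAccuracy(), EpochLoss(), TaskForgetting(),
    loggers=[TextLogger(log_file='./logs/train_log.txt'),
             TensorboardLogger(log_dir='./tb_data')])

strategy = Naive(model, optimizer, criterion, evaluator=eval_plugin)
strategy.train(train_step)

# Point 1 above: gather the values computed during the last train/test call,
# e.g. as a dict mapping each metric name to its most recent value.
last_values = eval_plugin.get_last_metrics()
```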
What do you think?
Concerning the PR, I also unified the base abstract classes for plugins and loggers. I’ll soon open a PR too to show you which direction I took!
The problem with a `log_to_text` parameter is that different trace implementations handle a different set of metrics, making it hard/impossible to parametrize the metrics with `log_to_text` directly.

About the second point, I already implemented (in the very first iteration of the metrics API) metric classes in which different granularities were handled together. Handling minibatch, epoch and step granularities in the same class at the same time makes the implementation of the metrics extremely complex and the code unreadable. Trust me, it’s a mess.
However, we can do something similar to what we did with the benchmark API: we can create a single helper method, like `loss_metrics(minibatch=True, epoch=True, step=True, ...)`, which returns the list of metrics requested by the user.

So, instead of doing something like this:
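Assuming one metric class per granularity (the class names below are only illustrative):

```python
# One metric object per granularity, all listed explicitly.
eval_plugin = EvaluationPlugin(
    MinibatchLoss(), EpochLoss(), StepLoss(),
    MinibatchAccuracy(), EpochAccuracy(), StepAccuracy())
```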
the user would do this (which achieves the very same result):
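Here `accuracy_metrics` is assumed as a parallel helper to `loss_metrics`, and each helper is assumed to return a list of metric objects:

```python
# The same set of metrics, requested through the helper functions and
# unpacked into the plugin constructor.
eval_plugin = EvaluationPlugin(
    *loss_metrics(minibatch=True, epoch=True, step=True),
    *accuracy_metrics(minibatch=True, epoch=True, step=True))
```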
What do you think?