# Discussion about `ProgressBar` and metrics display
Hello, I would like to discuss the feature in `contrib.handlers.ProgressBar` that displays metrics as a postfix of the bar on each `Events.ITERATION_COMPLETED`. For some context, please read https://github.com/pytorch/ignite/pull/295.

I think the `ProgressBar` and this feature are essential to the package, as they provide a concise way to get a nice training and evaluation experience (replacing the old way of spamming numbers in the console) and are appealing to beginners.
Here are some of the issues I found.
## 1. Non-`RunningAverage` metrics

### Problem
In the docs, the given example is:

```python
pbar = ProgressBar()
pbar.attach(trainer, ['loss'])
```
The `metrics` parameter (here `['loss']`) works with `Metric`s that have been attached to the engine, AND that rewired `self.completed` (which updates the `state.metrics` dict of the engine) to fire on `Events.ITERATION_COMPLETED` instead of `Events.EPOCH_COMPLETED`. The second part is unintuitive and not documented. The current documentation suggests that this might work with all metrics, but it raises a `KeyError` when the dict has not been updated before the end of the first iteration. Currently, this feature only works with `RunningAverage`; a working example is shown below.
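This is a minimal sketch of that working pattern, assuming ignite's `RunningAverage` with an `output_transform`, and a `trainer` whose output is the batch loss:

```python
from ignite.contrib.handlers import ProgressBar
from ignite.metrics import RunningAverage

# RunningAverage rewires its handlers to Events.ITERATION_COMPLETED,
# so 'loss' is present in engine.state.metrics from the first iteration on
RunningAverage(output_transform=lambda x: x).attach(trainer, 'loss')

pbar = ProgressBar()
pbar.attach(trainer, ['loss'])
```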
The problem was raised by @vfdev-5 in https://github.com/pytorch/ignite/pull/256, but I did not see the discussion go anywhere; I would like it to continue.
### Proposed solutions
- The documentation should be updated to reflect the fact that this feature does not work with any of the built-in metrics except `RunningAverage`.
- The error message for the `KeyError` is `metrics not found in engine.state.metrics`; it could be modified to hint at what the issue might be (a sketch follows this list).
- (a) Maybe the `Metric` class hierarchy should be updated to distinguish metrics that update the engine state every epoch from those that update it every iteration. The `ProgressBar` could then accept only the latter.
- (b) Or maybe all of the current built-in metrics could update the engine state at each iteration (on `self.update` instead of on `self.completed`). This would add computation at each iteration, but the progress bar would then work with all metrics.
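As a rough illustration of the second bullet, the check inside `ProgressBar` could look something like this (the helper name and exact wording are hypothetical):

```python
def _get_postfix_metrics(engine, metric_names):
    # hypothetical helper: fail with a hint instead of a bare KeyError
    missing = [name for name in metric_names if name not in engine.state.metrics]
    if missing:
        raise KeyError(
            "Metrics {} not found in engine.state.metrics. ProgressBar can only "
            "display metrics that update the engine state on every iteration "
            "(e.g. RunningAverage); other built-in metrics only write to "
            "engine.state.metrics on Events.EPOCH_COMPLETED.".format(missing)
        )
    return {name: engine.state.metrics[name] for name in metric_names}
```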
## 2. Print the loss of a training engine

### Problem
Have a look at the doc example again:

```python
pbar = ProgressBar()
pbar.attach(trainer, ['loss'])
```
This suggests that we can print the training loss, which updates at each iteration. However, we can't. When we use a trainer created with `engine.create_supervised_trainer`, the `process_function` returns the loss for the processed batch. I think this is the most intuitive way to do it, even for custom training engines. The `Metric`s that can be attached are computed from the output of this function (`(y_pred, y)` for an evaluator, for example), which may pass through a transform function. You see where I am going: the output of the trainer is already the loss itself, so we do not need any further computation, as the sketch below shows.
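To make this concrete, here is a minimal sketch, assuming a model, optimizer, and loss function are already defined; for a supervised trainer, `engine.state.output` is already the batch loss:

```python
from ignite.engine import Events, create_supervised_trainer

trainer = create_supervised_trainer(model, optimizer, loss_fn)

@trainer.on(Events.ITERATION_COMPLETED)
def log_batch_loss(engine):
    # state.output is the value returned by the trainer's process_function,
    # i.e. the loss of the last processed batch
    print(engine.state.output)
```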
If we want to print the training loss, we have to attach a `Metric` that does nothing but move the already computed loss into the `state.metrics` dict of the engine at every iteration. See https://github.com/pytorch/ignite/pull/295 for the `Identity` metric; a rough sketch of it is shown below. This is not intuitive; we should be able to display the training loss out of the box.
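A rough sketch of what such an `Identity`-style metric could look like (the actual code lives in the PR linked above; the `attach` rewiring below mirrors what `RunningAverage` does):

```python
from ignite.engine import Events
from ignite.metrics import Metric

class Identity(Metric):
    # forwards the engine output into state.metrics unchanged
    def reset(self):
        self._value = None

    def update(self, output):
        self._value = output

    def compute(self):
        return self._value

    def attach(self, engine, name):
        # unlike the default Metric.attach, write to state.metrics every iteration
        engine.add_event_handler(Events.EPOCH_STARTED, self.started)
        engine.add_event_handler(Events.ITERATION_COMPLETED, self.iteration_completed)
        engine.add_event_handler(Events.ITERATION_COMPLETED, self.completed, name)
```

With something like `Identity().attach(trainer, 'loss')`, the `pbar.attach(trainer, ['loss'])` call from the docs would then work as expected.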
### Proposed solutions
- Add a built-in `Identity` metric; see https://github.com/pytorch/ignite/pull/295 for the code.
- Modify `ProgressBar` to allow displaying the value of `engine.state.output` in the postfix. The `attach` method could then be `attach(self, engine, metric_names=None, display_output=False)`, with the output looking like `Epoch 1: [59/689] 9%|▊ , output=3.54e-02 [00:04<00:50]`.
- Modify `ProgressBar` to allow displaying in the postfix the value of any custom variable in the `engine.state` object, including the output (a sketch of this option follows this list).
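A rough sketch of what the third option could look like inside `ProgressBar` (the function and parameter names here are made up for illustration):

```python
def _build_postfix(engine, metric_names=None, state_attributes=None):
    # hypothetical helper: collect values to display from metrics and/or
    # arbitrary attributes of engine.state (including 'output')
    postfix = {}
    for name in (metric_names or []):
        postfix[name] = engine.state.metrics[name]
    for attr in (state_attributes or []):
        postfix[attr] = getattr(engine.state, attr)
    return postfix
```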
Thanks for reading, and sorry for the long issue and the English mistakes.
## Top GitHub Comments
> you would actually have to transform it into a dictionary `{name: value}` for it to make sense

This doesn't generalize; you are only thinking about the `create_supervised_trainer` case. If you look at the DCGAN example you'll see that the `step` function output is actually a dictionary, for example.

Yes, we can explicitly detail in the doc that the progress bar can display only printable variables.