Event description TERMINATE_SINGLE_EPOCH not matching its actual behavior

🐛 Bug description

From the documentation, it seems that the event TERMINATE_SINGLE_EPOCH is only fired if engine.terminate_epoch is called. I discovered that the event is also fired if I call engine.terminate inside an epoch, as can be seen in the code linked below.

Event description: https://github.com/pytorch/ignite/blob/51d3e3e54ee042b5ef1b73a489aabe4b75968411/ignite/engine/events.py#L161-L162

Current implementation: https://github.com/pytorch/ignite/blob/51d3e3e54ee042b5ef1b73a489aabe4b75968411/ignite/engine/engine.py#L817-L821

IMO, the implementation is fine and only the documentation should be updated.

I think the other option, not firing TERMINATE_SINGLE_EPOCH when engine.terminate is called, would make handling different types of termination more complicated. Assume you have to do some post-processing after each epoch, and it doesn’t matter whether the epoch ends via engine.terminate, engine.terminate_epoch, or StopIteration from the data loader. You would then have to attach the same processing function three times: to EPOCH_COMPLETED, TERMINATE_SINGLE_EPOCH, and TERMINATE.

At the moment, one has to attach the post-processing function to the two events EPOCH_COMPLETED and TERMINATE_SINGLE_EPOCH to catch termination via command. I didn’t expect that EPOCH_COMPLETED is not fired if I call any termination method. Is this documented somewhere? I haven’t found it.

Thanks in advance 😃

EDIT: Sorry, I missed that the signature of the function called on EPOCH_COMPLETED differs from the one for TERMINATE_SINGLE_EPOCH.

EDIT2: Fixed the event name TERMINATE_SINGLE_EPOCH.

EDIT3: I think, now I got it:

If engine.terminate is called, TERMINATE_SINGLE_EPOCH and TERMINATE are fired but not EPOCH_COMPLETED.
If engine.terminate_epoch is called, TERMINATE_SINGLE_EPOCH and EPOCH_COMPLETED are fired but not TERMINATE.
If the epoch completes without termination, EPOCH_COMPLETED is fired but neither TERMINATE_SINGLE_EPOCH nor TERMINATE.

There is no common event that fires in all three cases, so there is nothing to attach to when all cases should be treated equally.
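The three cases above can be checked mechanically. The following is a minimal sketch in plain Python that only transcribes the observations from this issue (reported for Ignite 0.4.2) into a dictionary; it does not call Ignite, and the names `FIRED` and `common_events` are made up for illustration.

```python
# Events reported to fire for each way an epoch can end, transcribed from the
# list above. This is plain data, not an Ignite API call.
FIRED = {
    "engine.terminate": {"TERMINATE_SINGLE_EPOCH", "TERMINATE"},
    "engine.terminate_epoch": {"TERMINATE_SINGLE_EPOCH", "EPOCH_COMPLETED"},
    "no termination": {"EPOCH_COMPLETED"},
}


def common_events(fired):
    """Return the events that fire in every case."""
    return set.intersection(*fired.values())


print(common_events(FIRED))  # set()
```

The empty intersection is the point of the observation: no single event covers all three ways an epoch can end.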

Environment

PyTorch Version (e.g., 1.4): 1.5.1
Ignite Version (e.g., 0.3.0): 0.4.2
OS (e.g., Linux): Windows
How you installed Ignite (conda, pip, source): pip
Python version: 3.7.7
Any other relevant information: -

Top GitHub Comments

vfdev-5 commented, Oct 2, 2020

@shngt thanks! I think having the table in the docs would be better.

alxlampe commented, Sep 28, 2020

@vfdev-5 I don’t clearly understand the purpose of engine.terminate and engine.terminate_epoch. Are they called if something goes wrong? Or are they called if some threshold/metric is reached? Can you link an example?

My intuition about engine.terminate and engine.terminate_epoch was that they just give control to break out of the iteration loop (engine.terminate_epoch) or out of the epoch loop (engine.terminate). I would have expected something like this:

| method \ event | `EPOCH_COMPLETED` | `TERMINATE_SINGLE_EPOCH` | `TERMINATE` |
| --- | --- | --- | --- |
| no termination | x | - | - |
| `engine.terminate_epoch` | x | x | - |
| `engine.terminate` | x | - | x |

Assuming that I want to calculate something at the end of each epoch, I only have to add an event handler for EPOCH_COMPLETED independent of the termination that occurred.

The current implementation is:

| method \ event | `EPOCH_COMPLETED` | `TERMINATE_SINGLE_EPOCH` | `TERMINATE` |
| --- | --- | --- | --- |
| no termination | x | - | - |
| `engine.terminate_epoch` | x | x | - |
| `engine.terminate` | - | x | x |

For my use case and with the current implementation, I would have to attach my function to two events, EPOCH_COMPLETED and TERMINATE, to have it executed exactly once.
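That conclusion can be reproduced by brute force. The sketch below assumes the current-implementation table above is accurate and simply enumerates every combination of the three events, keeping those for which a handler attached to all of them would run exactly once per row. It is plain Python over transcribed data, and `CURRENT` and `exactly_once_sets` are illustrative names, not Ignite APIs.

```python
from itertools import combinations

EVENTS = ("EPOCH_COMPLETED", "TERMINATE_SINGLE_EPOCH", "TERMINATE")

# Current behaviour, transcribed from the table above (not queried from Ignite).
CURRENT = {
    "no termination": {"EPOCH_COMPLETED"},
    "engine.terminate_epoch": {"EPOCH_COMPLETED", "TERMINATE_SINGLE_EPOCH"},
    "engine.terminate": {"TERMINATE_SINGLE_EPOCH", "TERMINATE"},
}


def exactly_once_sets(fired):
    """Return event combinations whose handler runs exactly once in every case."""
    result = []
    for size in range(1, len(EVENTS) + 1):
        for attached in combinations(EVENTS, size):
            # A handler attached to `attached` runs once per fired event it matches.
            if all(len(set(attached) & events) == 1 for events in fired.values()):
                result.append(attached)
    return result


print(exactly_once_sets(CURRENT))  # [('EPOCH_COMPLETED', 'TERMINATE')]
```

Swapping in a different dictionary makes it easy to compare the alternative tables discussed in this thread.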

@vfdev-5, your proposal would look like this I think:

| method \ event | `EPOCH_COMPLETED` | `TERMINATE_SINGLE_EPOCH` | `TERMINATE` |
| --- | --- | --- | --- |
| no termination | x | - | - |
| `engine.terminate_epoch` | x | x | - |
| `engine.terminate` | - | - | x |

For my use case and with this implementation, I couldn’t catch all cases and still execute my function exactly once by attaching it to all three events, because EPOCH_COMPLETED appears in two cases. As in the case above, I would have to attach my function to EPOCH_COMPLETED and TERMINATE.
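One possible way around the double execution, sketched here as an assumption rather than anything Ignite provides, is to wrap the handler so that it ignores repeated calls within the same epoch. `OncePerEpoch` is a hypothetical helper; it assumes the first handler argument exposes `state.epoch`, and it accepts and discards extra arguments because handler signatures differ between events (the `iter_counter` keyword below is only illustrative). A stand-in object replaces the real engine so the sketch runs without Ignite installed.

```python
from types import SimpleNamespace


class OncePerEpoch:
    """Wrap a handler so it runs at most once per epoch number.

    Hypothetical helper, not part of Ignite.
    """

    def __init__(self, func):
        self.func = func
        self.last_epoch = None

    def __call__(self, engine, *args, **kwargs):
        epoch = engine.state.epoch
        if epoch == self.last_epoch:
            return  # already handled this epoch via another event
        self.last_epoch = epoch
        self.func(engine)


# Stand-in for an engine, used only to exercise the wrapper.
fake_engine = SimpleNamespace(state=SimpleNamespace(epoch=1))
calls = []
handler = OncePerEpoch(lambda engine: calls.append(engine.state.epoch))

handler(fake_engine)                  # e.g. fired on EPOCH_COMPLETED
handler(fake_engine, iter_counter=5)  # e.g. fired on TERMINATE_SINGLE_EPOCH
fake_engine.state.epoch = 2
handler(fake_engine)
print(calls)  # [1, 2]
```

A wrapper like this trades explicit event selection for hidden state, so it would need resetting if the same engine is run again from epoch 1.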

When I look at the different logics in the tables, the logic I would have expected (the table at the top) seems the most reasonable to me. But as I said, I don’t know the use cases where the current implementation is advantageous. It would be nice if you could provide one 😃