
model is moved too early in create_supervised_trainer/evaluator

See original GitHub issue

Hi, first of all, thanks for the great library! I have a general “bug” report, or rather a question, about the implementation of create_supervised_trainer and create_supervised_evaluator. See, for example, here: https://github.com/pytorch/ignite/blob/master/ignite/engine/__init__.py#L64

The moment a supervised evaluator/trainer is created, the model is moved to the specified device, rather than during the inference step.

I agree that this behavior has no impact in general, but I would argue that the model should be moved in the _inference function rather than when the evaluator/trainer is created. A somewhat odd example, but similar to my case: I have multiple evaluators and would like to run one on the CPU and one on the GPU, because a different library doesn’t support GPUs yet and I would like to integrate it in an “Ignite” style to keep the code consistent. As it stands, the run steps will not work, because the model is not moved again at inference time.
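
To make the use case concrete, here is a minimal sketch (the model, data, and metric are placeholders I made up for illustration):

```python
import torch
from torch import nn
from ignite.engine import create_supervised_evaluator
from ignite.metrics import Accuracy

model = nn.Linear(10, 2)

# Each call moves `model` to its device at creation time, so after the second
# call the model lives on the GPU. The CPU evaluator then fails at run time,
# because the inference step does not move the model again.
cpu_evaluator = create_supervised_evaluator(model, metrics={"acc": Accuracy()}, device="cpu")
gpu_evaluator = create_supervised_evaluator(model, metrics={"acc": Accuracy()}, device="cuda")

cpu_data = [(torch.randn(8, 10), torch.randint(0, 2, (8,)))]
cpu_evaluator.run(cpu_data)  # fails: inputs are on the CPU, weights are on CUDA
```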

I can imagine that there is a reason why it is the way it is, but I am wondering whether a future version couldn’t do the model.to(device) step in the _inference function instead. 😃
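
To illustrate what I mean, here is a rough, hand-written sketch of an evaluator that moves the model inside the inference step. This is not Ignite’s actual internal code, just the shape of the proposal:

```python
import torch
from ignite.engine import Engine
from ignite.metrics import Accuracy

def make_evaluator(model, device):
    def _inference(engine, batch):
        model.to(device)  # moved here, at inference time, instead of at creation time
        model.eval()
        with torch.no_grad():
            x, y = batch
            x, y = x.to(device), y.to(device)
            y_pred = model(x)
        return y_pred, y

    evaluator = Engine(_inference)
    Accuracy().attach(evaluator, "accuracy")
    return evaluator

# With this shape, a CPU evaluator and a GPU evaluator built from the same
# model both work, because each run moves the model to its own device.
```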

Thanks!

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 11 (6 by maintainers)

Top GitHub Comments

1 reaction
vfdev-5 commented, Mar 10, 2020

yes, please

On Tue, Mar 10, 2020, 09:47 kai-tub (notifications@github.com) wrote:

> Since this is labeled as “help wanted”, should I issue a PR with the changes mentioned by @justusschock (https://github.com/justusschock)?

0 reactions
kai-tub commented, Apr 8, 2020

> Seems like it does not make sense at all to send the model to a device in create_supervised_trainer while the optimizer is already defined.

Yes, it seems like it would be better not to move the model in the trainer.
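
For reference, this matches the pattern the PyTorch docs recommend anyway: move the model to the device before constructing the optimizer. A sketch of the user-side code (model, data, and hyperparameters are placeholders):

```python
import torch
from torch import nn
from ignite.engine import create_supervised_trainer

device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Linear(10, 2).to(device)                      # move the model first
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # then build the optimizer
loss_fn = nn.CrossEntropyLoss()

# The trainer no longer needs to move the model for us.
trainer = create_supervised_trainer(model, optimizer, loss_fn, device=device)
```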

Concerning your use-case:

In my case, I only used these evaluators after the training procedure, so there was no need to move the model back between steps. But I initialized both at the same time, and that is how I noticed the “bug”.

I don’t see a good way to keep moving the model inside the function. Right now, I would favor removing the model-moving step and updating the docs.

But I can imagine that a lot of code relies on this behavior to save the one line of moving the model before initializing the optimizer. What is your opinion on trying to catch an error caused by mismatched devices and adding more detail to the exception? Or would the docs suffice? Or do you want to add a deprecation warning?
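
Something like the following is what I have in mind for the “more detailed exception” option; the helper name and wording are made up, not existing Ignite code:

```python
import torch

def _check_model_device(model, device):
    # Hypothetical helper: compare the device type of the model's parameters
    # against the device the engine was created with, and fail with a clear message.
    param = next(model.parameters(), None)
    if param is not None and param.device.type != torch.device(device).type:
        raise RuntimeError(
            f"Model parameters are on '{param.device}' but this engine was created with "
            f"device='{device}'. Since the engine no longer moves the model, please call "
            f"model.to('{device}') before running it."
        )
```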

Do you have an alternative approach?
