Compute metric per image and handler mutex in DistributedDataParallel
❓ Questions/Help/Support
Hi @vfdev-5 ,
I am writing an ignite handler that writes the segmentation metric of every image into a single CSV file as a summary, for example:
metrics.csv:
/data/spleen/image_1 0.85
/data/spleen/image_2 0.87
/data/spleen/image_3 0.91
... ...
The problems are:
- I tried adding logic to `metrics.update()` to cache every record and write it to CSV in `metrics.complete()`, but ignite.metrics only accepts `output_transform`, so I can't extract the filenames from `engine.state.batch`.
- I then switched to writing a separate handler for this feature, but ignite metrics only save the final averaged metrics into `engine.state.metrics`, so a handler cannot easily get the metric value corresponding to each image.
- Another problem is `DistributedDataParallel`: when I run the handler in multiple processes, how do you usually use a multi-processing lock to save content into one CSV file on both Unix and Windows?
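One possible way to approach the filename and locking concerns above, as a minimal sketch and not the approach ignite or MONAI settled on: a standalone handler that reads both `engine.state.batch` and `engine.state.output` on every iteration, then appends its cached rows to the shared CSV under a lock file. The names `PerImageCSVWriter`, `lock_file`, `batch_transform` and the `"filename"` / `"dice"` keys are hypothetical, and the sketch assumes the engine output already contains one metric value per image. The lock relies on `os.O_CREAT | os.O_EXCL`, which fails if the file already exists on both Unix and Windows; it does not recover from a stale lock left behind by a crashed process.

```python
import csv
import os
import time
from contextlib import contextmanager


@contextmanager
def lock_file(path, timeout=30.0, poll=0.05):
    """Hypothetical cross-process lock: create `path` exclusively, remove it on exit."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL raises FileExistsError if another process holds the lock.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not acquire lock {path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(path)


class PerImageCSVWriter:
    """Hypothetical handler: caches (filename, value) rows, appends them on flush."""

    def __init__(self, csv_path, batch_transform, output_transform):
        self.csv_path = csv_path
        self.batch_transform = batch_transform    # batch -> list of filenames
        self.output_transform = output_transform  # output -> list of per-image values
        self.rows = []

    def __call__(self, engine):
        # Intended to run once per iteration, while batch and output are both available.
        names = self.batch_transform(engine.state.batch)
        values = self.output_transform(engine.state.output)
        self.rows.extend(zip(names, values))

    def flush(self, engine=None):
        # Intended to run at the end of an epoch; each process appends its own rows.
        with lock_file(self.csv_path + ".lock"):
            with open(self.csv_path, "a", newline="") as f:
                csv.writer(f, delimiter="\t").writerows(self.rows)
        self.rows.clear()
```

With ignite, such a handler could be registered with `engine.add_event_handler(Events.ITERATION_COMPLETED, writer)` and `engine.add_event_handler(Events.EPOCH_COMPLETED, writer.flush)`. An alternative that avoids the lock altogether is to gather the rows from all processes and write the file from rank 0 only.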
Thanks.
Issue Analytics
- State:
- Created 3 years ago
- Comments:16 (9 by maintainers)
Hi @vfdev-5 and @sdesrozis ,
Thanks for your discussion. I added `self.engine` to the metrics in this draft MONAI PR: https://github.com/Project-MONAI/MONAI/pull/1497. I will delete it in the future once you have added `engine` to the `Metric` base class.
Thanks.
@sdesrozis the problem here is that we already have an existing API for the `detach` and `is_attached` methods, which always require `engine`, and adding `Metric.engine` would require updating that API too, I think.