Metrics Implementation Question
See original GitHub issue.

Thanks for the great library, especially the metrics. I have a few questions to better understand the implementation:
During the update stage, why are the values converted to Python floats instead of kept as torch tensors (e.g. here)? That conversion incurs a device->host transfer, so the operation is blocking, right? Wouldn't it be better to keep the metric values as torch tensors on the GPU so that the update is asynchronous, and convert them to Python floats only in the compute method?
In the distributed case, the values are put back into a tensor before the all-reduce anyway, so why not keep them as tensors to begin with?
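To make the trade-off concrete, here is a minimal sketch (not Ignite's actual code; the class names are illustrative) contrasting the two accumulation strategies. It uses CPU tensors so it runs anywhere; on a GPU, the `.item()` call inside `update` would force a blocking device->host copy on every batch, while the tensor accumulator defers that single copy to `compute`.

```python
import torch


class FloatAccumulator:
    """Converts each batch value to a Python float during update."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, batch_loss: torch.Tensor) -> None:
        # On a GPU tensor, .item() blocks until the value is copied to host.
        self.total += batch_loss.item()
        self.count += 1

    def compute(self) -> float:
        return self.total / self.count


class TensorAccumulator:
    """Keeps the running sum as a tensor; syncs only in compute()."""

    def __init__(self, device="cpu"):
        self.total = torch.zeros((), dtype=torch.float64, device=device)
        self.count = 0

    def update(self, batch_loss: torch.Tensor) -> None:
        # Stays on the device; no host transfer, so the update can be async.
        self.total += batch_loss.detach().to(self.total.dtype)
        self.count += 1

    def compute(self) -> float:
        # Single device->host transfer at the very end.
        return (self.total / self.count).item()


losses = [torch.tensor(v) for v in (0.5, 0.25, 0.75)]
a, b = FloatAccumulator(), TensorAccumulator()
for loss in losses:
    a.update(loss)
    b.update(loss)
```

Both accumulators produce the same mean; they differ only in when the device->host synchronization happens.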
Issue Analytics
- State:
- Created: 3 years ago
- Reactions: 1
- Comments: 16 (9 by maintainers)
Top GitHub Comments
@n2cholas In some sense, it could make sense to update the metrics code so that the internal accumulators that are not already tensors become tensors, and the user can specify the storage device. We already have this argument, but it is unused in most cases… We should also be careful about specific implementations where double precision is required…
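A rough sketch of what this suggestion could look like (the `TensorMetric` name and API are hypothetical, not Ignite's): the accumulator is a tensor with a user-selectable device and a float64 dtype where precision matters. Because the state is already a tensor, it can be passed straight to `torch.distributed.all_reduce` with no repacking.

```python
import torch
import torch.distributed as dist


class TensorMetric:
    """Hypothetical mean metric with tensor state on a chosen device."""

    def __init__(self, device="cpu", dtype=torch.float64):
        # float64 guards against precision loss over many small updates.
        self._sum = torch.zeros((), dtype=dtype, device=device)
        self._num = torch.zeros((), dtype=torch.long, device=device)

    def update(self, value: torch.Tensor) -> None:
        # .to(self._sum) matches dtype and device; no host transfer here.
        self._sum += value.detach().to(self._sum)
        self._num += 1

    def compute(self) -> float:
        if dist.is_available() and dist.is_initialized():
            # State is already a tensor, so it goes into the all-reduce as-is.
            dist.all_reduce(self._sum)
            dist.all_reduce(self._num)
        return (self._sum / self._num).item()
```

The distributed branch only runs when a process group has been initialized; in a single-process run, `compute` is just the local mean.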
@vfdev-5 thanks for the rerun and choosing a more realistic batch size.
@sdesrozis Here is a similar script for evaluating a validation loop instead of training. I used a batch size of 512 (like @vfdev-5 ) and 50 runs to get a tighter standard deviation. This and my previous runs were both on a GTX 1080. The results are pretty similar, but the traces are much cleaner. The completely async nature of the custom cuda implementation is much more apparent.
(Timing results for Ignite, the custom metric on CPU, and the custom metric on GPU, along with the corresponding profiler traces, were attached as images.)
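The benchmarking methodology above (many runs, mean ± standard deviation, with GPU synchronization so queued kernels don't skew the readings) can be sketched as follows. This is an illustrative harness, not the script referenced in the comment; the workload in the example is a stand-in.

```python
import statistics
import time

import torch


def time_runs(fn, n_runs=50, warmup=5):
    """Time fn over n_runs repetitions; returns (mean, stdev) in seconds."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(n_runs):
        if torch.cuda.is_available():
            torch.cuda.synchronize()  # flush queued kernels before timing
        t0 = time.perf_counter()
        fn()
        if torch.cuda.is_available():
            torch.cuda.synchronize()  # ensure the work finished before stopping
        samples.append(time.perf_counter() - t0)
    return statistics.mean(samples), statistics.stdev(samples)


# Stand-in workload; a real run would time the validation loop itself.
mean_s, std_s = time_runs(lambda: torch.randn(512, 10).sum(), n_runs=20)
print(f"{mean_s * 1e6:.1f} +/- {std_s * 1e6:.1f} us")
```

Without the synchronize calls, `time.perf_counter()` would measure only kernel launch time on a GPU, which is why a fully async implementation can look deceptively fast in naive timings.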