Feature: standardize inputs/outputs of metrics
See original GitHub issue

Currently there are several different input/output formats possible in Metrics. We should standardize them as much as possible while respecting the following principles:
- inputs/outputs are easy to understand and use
- outputs are compatible with other frameworks
For the output standardization: a dictionary structure, even if nested, would probably be ok. A dedicated output class could also be considered, as in transformers models, but this is probably not necessary here. To make it compatible with e.g. Keras we could add a postprocess function at initialization, similar to a transform in datasets.
There are three options we could implement:
```python
load_metric(..., postprocess="metric_key")  # equivalent to `metric.compute()["metric_key"]`
load_metric(..., postprocess="flatten")     # equivalent to flattening the output dict: `flatten(metric.compute())`
load_metric(..., postprocess=func)          # equivalent to `func(metric.compute())`
```
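The three options above could be dispatched from a single keyword argument. This is only a sketch of that dispatch logic, not the real `datasets` API: `load_metric` has no `postprocess` parameter today, and the `flatten` helper here is a hypothetical name.

```python
# Sketch of the proposed postprocess dispatch; none of this exists in the
# `datasets` library — it illustrates how the three options could behave.

def flatten(d, parent_key=""):
    """Flatten a nested result dict into dot-separated keys."""
    items = {}
    for k, v in d.items():
        key = f"{parent_key}.{k}" if parent_key else k
        if isinstance(v, dict):
            items.update(flatten(v, key))
        else:
            items[key] = v
    return items

def apply_postprocess(result, postprocess):
    """Apply one of the three proposed postprocess options to a compute() result."""
    if postprocess is None:
        return result
    if callable(postprocess):           # postprocess=func
        return postprocess(result)
    if postprocess == "flatten":        # postprocess="flatten"
        return flatten(result)
    return result[postprocess]          # postprocess="metric_key"
```

With this shape, `metric.compute()` could call `apply_postprocess` on its raw result dict before returning, keeping the metric implementations themselves unchanged.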
Issue Analytics
- State:
- Created a year ago
- Comments: 27 (26 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
As per our meeting today, we proposed to have a standardized structure for inputs, in dictionary form. An initial proposal of that structure can be:

With `references` and `predictions` being of any format (strings, images, numbers, vectors, etc.). I was looking at examples of computer vision metrics and this should work for those as well.

Edge cases:
- `sources`
- `average` parameter for multiclass labels, but we could define a default if needed

I think we could have additional, optional, `source` and `average` inputs, but I don’t really know what to do for perplexity 😅 (I mean, in any case, the metrics will not function without these arguments, but I guess waiting for them to crash isn’t the best solution)

CC @lvwerra @lhoestq @apsdehal
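The original comment's code snippet for the proposed structure did not survive in this page. As a hedged guess at what a dictionary-form input could look like (field names beyond `predictions`/`references` are assumptions drawn from the edge cases above, and `simple_accuracy` is a toy stand-in, not a real metric):

```python
# Hypothetical standardized input dict; only `predictions` and `references`
# are confirmed by the comment, the optional fields are illustrative.
batch = {
    "predictions": [0, 1, 1, 0],  # any format: strings, images, numbers, vectors, ...
    "references": [0, 1, 0, 0],
    # optional, metric-specific fields for the edge cases discussed:
    "sources": None,              # extra inputs some metrics need
    "average": "binary",          # averaging mode for multiclass labels
}

def simple_accuracy(inputs):
    """Toy metric consuming the standardized input dict, returning a dict."""
    preds, refs = inputs["predictions"], inputs["references"]
    return {"accuracy": sum(p == r for p, r in zip(preds, refs)) / len(refs)}
```

Metrics that do not need `sources` or `average` would simply ignore those keys, which sidesteps the "waiting for them to crash" problem for the common fields at least.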
I have been thinking about the scalar vs. dict question. Having a dict across all metrics, at least internally, is nice as it allows us to treat them all the same way, and we can also combine metrics by merging dicts. At the same time we could check if the return is just a dict with one value and, if that’s the case, just return its value.
What do you think @lhoestq?
Regarding Keras, I’ll think a bit more about how to do that smoothly.
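The scalar-vs-dict idea from the comment above can be sketched in a few lines. These helper names (`combine`, `maybe_unwrap`) are illustrative, not part of any library API:

```python
# Sketch of the dict-internally / scalar-on-output idea: keep every metric's
# result as a dict, merge results when combining metrics, and unwrap only
# when a single value remains.

def combine(*results):
    """Merge result dicts from several metrics into one dict."""
    merged = {}
    for r in results:
        merged.update(r)
    return merged

def maybe_unwrap(result):
    """Return the bare value if the dict holds exactly one entry."""
    if len(result) == 1:
        return next(iter(result.values()))
    return result
```

The unwrapping would only happen at the user-facing boundary, so internal code can always rely on the dict shape.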