
Feature: standardize inputs/outputs of metrics


Currently there are several different input/output formats possible in Metrics. We should standardize them as much as possible while respecting the following principles:

  • inputs/outputs are easy to understand and use
  • outputs are compatible with other frameworks

For the output standardization: a dictionary structure, even a nested one, would probably be OK. A dedicated output class, like in transformers models, could also be considered, but it is probably not necessary here. To make the output compatible with e.g. Keras, we could add a postprocess function at initialization, similar to a transform in datasets.

There are three options we could implement:

load_metric(..., postprocess="metric_key") # equivalent result to `metric.compute()["metric_key"]`
load_metric(..., postprocess="flatten") # equivalent to flattening the output dict: `flatten(metric.compute())`
load_metric(..., postprocess=func) # equivalent result to `func(metric.compute())`
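
A rough sketch of what such a postprocess step could do to the dict returned by metric.compute(); the flatten helper and the dispatch on the keyword are assumptions for illustration, not an existing API:

from typing import Any, Callable, Dict, Union

def flatten(d: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    # Flatten a nested result dict into {"outer_inner": value} pairs.
    flat = {}
    for key, value in d.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name))
        else:
            flat[name] = value
    return flat

def apply_postprocess(result: Dict[str, Any], postprocess: Union[str, Callable, None]) -> Any:
    # Hypothetical step run on the dict returned by metric.compute().
    if postprocess is None:
        return result
    if callable(postprocess):
        return postprocess(result)   # postprocess=func
    if postprocess == "flatten":
        return flatten(result)       # postprocess="flatten"
    return result[postprocess]       # postprocess="metric_key"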

Issue Analytics

  • State: open
  • Created: a year ago
  • Comments: 27 (26 by maintainers)

Top GitHub Comments

2 reactions
sashavor commented, Apr 8, 2022

As per our meeting today, we proposed having a standardized structure for inputs, in dictionary form.

An initial proposal of that structure can be:


     {
         "references":  ...,
         "predictions": ...,
     }

The references and predictions can be of any format (strings, images, numbers, vectors, etc.). I was looking at examples of computer vision metrics and this should work for those as well.
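
For example, consuming that structure with an existing metric could look roughly like this (unpacking the dict into compute() keyword arguments is one possible convention, not a settled API; accuracy is just used as a stand-in):

import evaluate

# Proposed standardized input: a single dict with "references" and "predictions".
inputs = {
    "references":  [0, 1, 1, 0],   # could also be strings, images, vectors, ...
    "predictions": [0, 1, 0, 0],
}

accuracy = evaluate.load("accuracy")
print(accuracy.compute(**inputs))  # {'accuracy': 0.75}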

Edge cases:

  • COMET, WikiSplit and SARI – take an additional input, sources
  • F1, Precision and Recall – require an average parameter for multiclass labels, but we could define a default if needed
  • Perplexity – needs an input string and a model

I think we could have additional, optional source and average inputs, but I don’t really know what to do for perplexity 😅 (in any case, the metrics will not function without these arguments, but I guess waiting for them to crash isn’t the best solution).
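
A hedged sketch of how those optional inputs could slot into the same dict structure; the extra keys (sources, average, model_id) are assumptions about how the edge cases above might be expressed, not current behaviour:

# COMET / WikiSplit / SARI: an optional "sources" key alongside the usual pair.
comet_inputs = {
    "sources":     ["Dem Feuer konnte Einhalt geboten werden."],
    "predictions": ["The fire could be stopped."],
    "references":  ["They were able to stop the fire."],
}

# F1 / Precision / Recall: an optional "average" setting with a library default.
f1_inputs = {
    "predictions": [0, 2, 1, 0, 0, 2],
    "references":  [0, 1, 2, 0, 1, 2],
    "average":     "macro",
}

# Perplexity: needs input texts plus a model, so it only half-fits the scheme.
perplexity_inputs = {
    "predictions": ["The quick brown fox jumps over the lazy dog."],
    "model_id":    "gpt2",
}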

CC @lvwerra @lhoestq @apsdehal

1 reaction
lvwerra commented, May 19, 2022

I have been thinking about the scalar vs. dict question. Having a dict across all metrics, at least internally, is nice as it allows us to treat them all the same way, and we can also combine metrics by merging dicts. At the same time, we could check if the result is just a dict with one value and, if so, return only that value.

metric = evaluate.load("accuracy")
metric.compute(predictions=preds, references=refs)
>>> 0.6

metric = evaluate.load("accuracy", force_dict=True)
metric.compute(predictions=preds, references=refs)
>>> {"accuracy": 0.6}
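
A minimal sketch of the unwrapping rule described above; force_dict is the flag proposed in the snippet, not an existing argument:

def postprocess_result(result: dict, force_dict: bool = False):
    # Return the bare value when the dict has a single entry,
    # unless force_dict=True keeps the dict form (e.g. for framework compatibility).
    if not force_dict and len(result) == 1:
        return next(iter(result.values()))
    return result

postprocess_result({"accuracy": 0.6})                   # -> 0.6
postprocess_result({"accuracy": 0.6}, force_dict=True)  # -> {'accuracy': 0.6}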

What do you think @lhoestq?

Regarding Keras, I’ll think a bit more about how to do that smoothly.
