Add runtime metrics to `Evaluator`
Currently, the `Evaluator` class for text-classification computes the model metrics on a given dataset. In addition to model metrics, it would be nice if the `Evaluator` could also report runtime metrics like `eval_runtime` (latency) and `eval_samples_per_second` (throughput).
cc @philschmid
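A minimal sketch, from the user side, of what this could look like today by timing the call manually; the `eval_runtime` and `eval_samples_per_second` keys are borrowed from the `Trainer` naming and are added by hand here, they are not part of the current `Evaluator` output:

```python
import time

from datasets import load_dataset
from evaluate import evaluator
from transformers import pipeline

# Example model/dataset only; any text-classification pipeline and dataset would do.
pipe = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
data = load_dataset("imdb", split="test[:100]")
task_evaluator = evaluator("text-classification")

start = time.perf_counter()
results = task_evaluator.compute(
    model_or_pipeline=pipe,
    data=data,
    metric="accuracy",
    label_mapping={"NEGATIVE": 0, "POSITIVE": 1},
)
elapsed = time.perf_counter() - start

# Assumed key names, mirroring the Trainer's eval_runtime / eval_samples_per_second.
results["eval_runtime"] = elapsed
results["eval_samples_per_second"] = len(data) / elapsed
print(results)
```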
Issue Analytics
- Created a year ago
- Reactions: 1
- Comments: 7 (7 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
I think we can just add the throughput information to the dict that is returned by the evaluator.
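A rough sketch of that idea: wrap the prediction step with a timer and merge latency/throughput into the same metrics dict. The helper name and the metric keys below are illustrative assumptions, not the actual `evaluate` internals:

```python
import time


def compute_with_runtime(compute_fn, num_samples, **kwargs):
    """Hypothetical wrapper: run `compute_fn` (which returns a dict of model
    metrics, e.g. {"accuracy": 0.91}) and add runtime metrics to that dict."""
    start = time.perf_counter()
    metrics = compute_fn(**kwargs)
    elapsed = time.perf_counter() - start

    # Assumed key names for the runtime metrics.
    metrics.update(
        {
            "total_time_in_seconds": elapsed,
            "samples_per_second": num_samples / elapsed,
            "latency_in_seconds": elapsed / num_samples,
        }
    )
    return metrics
```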
I like the idea of an evaluation report. However, I don't think we can assume to know e.g. the device the pipeline is running on: for now it is a `transformers` pipeline, but it could be any callable, so we would not know how to get that info. The `evaluate.save` function lets you store any information and by default also saves some system information. Maybe we could extend this and then let the user add whatever cannot be easily inferred (e.g. the device of the pipeline). What do you think?

As for dummy inputs: I think this is something we should let the user handle. Maybe we can extend the docs with a dedicated "Evaluator" section and add a "How to measure the performance of your pipeline" guide where we show best practices.
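For the reporting side, `evaluate.save` writes the passed key/value pairs together with some system information to a JSON file, so context the library cannot infer (like the device) can simply be passed in by the user. A minimal sketch; the `device` key and the result values are only illustrative:

```python
import evaluate
import torch

# Illustrative results; in practice this dict would come from Evaluator.compute.
results = {"accuracy": 0.91, "eval_runtime": 12.3, "eval_samples_per_second": 81.3}

# The device of an arbitrary callable pipeline cannot be inferred by the
# Evaluator, so the user supplies it explicitly alongside the metrics.
device = "cuda" if torch.cuda.is_available() else "cpu"
evaluate.save("./results/", device=device, **results)
```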
I think that's a great idea! True, it is backend-dependent, but it will be very useful for debugging.
I wonder if it would be useful to optionally output not only the metric values but some sort of evaluation report: basic setup information along with the runtime metrics, e.g. what device it was evaluated on.
I think it would be valuable to have these numbers for the full evaluation, not only for a dummy input, as it doesn't really cost anything and can provide additional insights (again, mainly in the debugging scenario).