Add NLP-specific metrics
@mattdangerw and the keras-nlp team:
For standard classification metrics (AUC, F1, Precision, Recall, Accuracy, etc.), keras.metrics can be used. But there are several NLP-specific metrics that could be implemented here, i.e., we could expose native APIs for them.
I would like to take this up. I can start with the popular ones first and open PRs. Let me know if this is something the team is looking to add!
I've listed a few metrics below (this list is by no means comprehensive):

- Perplexity (a rough implementation sketch follows this list).
- ROUGE (paper): a pretty standard metric for text generation. We can implement all the variations: ROUGE-N, ROUGE-L, ROUGE-W, etc.
- BLEU (paper): another standard text generation metric. Note: we can also implement SacreBleu.
- Character Error Rate, Word Error Rate, etc. (paper).
- Pearson Coefficient and Spearman Coefficient: it looks like keras.metrics does not have these two. They are not NLP-specific metrics, so it may be better to implement them in Keras itself rather than here.
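To make "exposing native APIs" concrete, here is a minimal sketch of how perplexity could be written as a streaming metric by subclassing tf.keras.metrics.Metric. This is purely illustrative, not the actual keras-nlp API; the class name, constructor arguments, and the use of sample_weight as a padding mask are all assumptions.

```python
import tensorflow as tf


class Perplexity(tf.keras.metrics.Metric):
    """Streaming perplexity: exp of the mean per-token cross-entropy.

    Illustrative sketch only; the real keras-nlp API may differ.
    """

    def __init__(self, from_logits=False, name="perplexity", **kwargs):
        super().__init__(name=name, **kwargs)
        self.from_logits = from_logits
        self.total_ce = self.add_weight(name="total_ce", initializer="zeros")
        self.token_count = self.add_weight(name="token_count", initializer="zeros")

    def update_state(self, y_true, y_pred, sample_weight=None):
        # Per-token cross-entropy, shape (batch, seq_len).
        ce = tf.keras.losses.sparse_categorical_crossentropy(
            y_true, y_pred, from_logits=self.from_logits
        )
        if sample_weight is not None:
            # Treat sample_weight as a padding mask: ignore padded tokens.
            mask = tf.cast(sample_weight, ce.dtype)
            ce = ce * mask
            self.token_count.assign_add(tf.reduce_sum(mask))
        else:
            self.token_count.assign_add(tf.cast(tf.size(ce), ce.dtype))
        self.total_ce.assign_add(tf.reduce_sum(ce))

    def result(self):
        # Perplexity = exp(average cross-entropy over non-padded tokens).
        return tf.exp(self.total_ce / self.token_count)

    def reset_state(self):
        self.total_ce.assign(0.0)
        self.token_count.assign(0.0)
```

Usage would mirror other Keras metrics: labels of shape (batch, seq_len), predictions of shape (batch, seq_len, vocab_size), and optionally a padding mask passed as sample_weight.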
Thank you!
Top GitHub Comments
@aflah02, good point. Will do!
@abheesht17 I'd suggest adding perplexity as well, as it's one of the trickier metrics to use. In my experience it often gives inconsistent results, varying by orders of magnitude, across implementations in different existing libraries.
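Part of why reported perplexities diverge so much is that perplexity is just the exponential of an average negative log-likelihood, and libraries differ in the log base, how padding is masked, and whether the average is taken over all tokens or per sentence. A toy illustration with made-up numbers, not tied to any particular library:

```python
import numpy as np

# Per-token negative log-likelihoods (natural log) for two toy
# sentences of different lengths. Values are hypothetical.
nll = [np.array([2.0, 2.0]), np.array([1.0, 1.0, 1.0, 1.0])]

# Corpus-level convention: average over *all* tokens, then exponentiate.
corpus_ppl = np.exp(np.concatenate(nll).mean())           # ~3.79

# Sentence-level convention: perplexity per sentence, then average.
sentence_ppl = np.mean([np.exp(s.mean()) for s in nll])   # ~5.05

print(corpus_ppl, sentence_ppl)  # same data, noticeably different numbers
```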