feature request: hits (or accuracy?)
It is often difficult to estimate the total number of relevant documents for a query. For example, in Question Answering, if you have a large enough Knowledge Base, the answer to a question can be found in a surprisingly large number of documents, too many to annotate in advance. Because of this, the relevance of a document is often estimated on the fly, by checking whether the answer string appears in the document retrieved by the system.
Because of this, recall is not an appropriate metric. However, one way to circumvent this is to compute recall “as if” there were only a single relevant document. After averaging over the whole dataset, it corresponds to the proportion of questions for which the system retrieved at least one relevant document in the top K. This is what @osf9018 and I call “hits@K” (I can’t remember where, but I’ve seen it in a paper), and others, such as Karpukhin et al., call “accuracy”. Accuracy is a confusing term IMO.
Would you be interested in implementing or integrating this feature in your library?
It might take some renaming, but it could be implemented very easily using the `_hits` function. It is simply:

```python
min(1, _hits(qrels, run, k))
```
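To make the proposal concrete, here is a minimal standalone sketch of the metric (per-query `min(1, hits)`, averaged over queries). The `hits_at_k` name and the dict-of-dicts layout for `qrels` and `run` are assumptions for illustration, not the library’s actual API:

```python
# Sketch of hits@K: the proportion of queries for which at least one
# relevant document appears in the top-K retrieved results.
# Assumed layout (not the library's real API):
#   qrels: {query_id: {doc_id: relevance}}  (relevance > 0 means relevant)
#   run:   {query_id: {doc_id: retrieval score}}

def hits_at_k(qrels, run, k):
    per_query = []
    for qid, rels in qrels.items():
        relevant = {doc for doc, rel in rels.items() if rel > 0}
        retrieved = run.get(qid, {})
        # Top-K retrieved doc ids, by descending retrieval score.
        ranked = sorted(retrieved, key=retrieved.get, reverse=True)[:k]
        # min(1, number of relevant docs in the top K): 1 on any hit, else 0.
        per_query.append(min(1, sum(doc in relevant for doc in ranked)))
    return sum(per_query) / len(per_query) if per_query else 0.0
```

For example, with two queries where only the first has a relevant document in its top 2, `hits_at_k(qrels, run, 2)` returns `0.5`.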