Possible error in ndcg_diffs calculation
I think it should be (with `log2` on the denominator only):
```python
ndcg_diffs = (1 / (1 + doc_ranks[:n_rel]).log2()) - (1 / (1 + doc_ranks[n_rel:]).log2()).view(n_irr)
```
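To see why the placement of `.log2()` matters, here is a quick sketch with a made-up rank value: the DCG discount is the reciprocal of the log, not the log of the reciprocal.

```python
import math

rank = 2  # hypothetical 1-indexed rank of a document

# Log of the reciprocal (the suspected bug): always non-positive,
# since 1 / (1 + rank) <= 1.
wrong = math.log2(1 / (1 + rank))

# Reciprocal of the log (the intended DCG discount).
right = 1 / math.log2(1 + rank)
```

The two expressions only coincide by accident, so where the `.log2()` call sits changes the result.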
Issue Analytics
- State:
- Created 5 years ago
- Comments: 5 (3 by maintainers)
The relevance is the human label for the document, while the rank is where the document ends up after being scored by the model. To be a little more concrete about the calculations, imagine we provide a scoring model `M` with a query `Q` and it returns a set of ten documents in the following order: `(D_6, D_0, D_7, D_4, D_8, D_5, D_9, D_1, D_3, D_2)`. A human could then go through each document and assign it a relevance of (in the binary case) one or zero for the given `Q` (an alternative labeling strategy would be to have several relevance levels, e.g., “very relevant”, “somewhat relevant”, and “irrelevant”). Once we have this information, we can calculate the DCG for this result set, and each `D_i` will contribute a term of

`dcg_i = (2 ^ rel_i - 1) / log2(rank_i + 1)`

to the summation. If we swap `D_0` and `D_3` and calculate the new DCG, all of the `dcg_i'` terms except for `dcg_0'` and `dcg_3'` will stay the same, so those terms will cancel when taking the difference (this is true whether we use binary relevance or not). As a result, what we have left is:

`(dcg_0 - dcg_0') + (dcg_3 - dcg_3')`

If we pretend like both documents are relevant, then (with `D_0` at rank 2 and `D_3` at rank 9):

`dcg_0 = 1 / log2(3)` and `dcg_3 = 1 / log2(10)`

After swapping them, we have:

`dcg_0' = 1 / log2(10)` and `dcg_3' = 1 / log2(3)`

so:

`(dcg_0 - dcg_0') + (dcg_3 - dcg_3') = 0`
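Plugging in the example ordering above (with `D_0` at rank 2 and `D_3` at rank 9, both taken as relevant), a quick check that swapping two relevant documents leaves the DCG unchanged:

```python
import math

def dcg_term(rel, rank):
    # One document's contribution to the DCG summation.
    return (2 ** rel - 1) / math.log2(rank + 1)

# Both documents relevant: D_0 at rank 2, D_3 at rank 9.
before = dcg_term(1, 2) + dcg_term(1, 9)
after = dcg_term(1, 9) + dcg_term(1, 2)  # ranks swapped

delta_dcg = before - after  # → 0.0
```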
The same is true when you have two irrelevant documents, but it’s even less interesting because their `dcg_i`s are equal to zero. In the case where `D_0` is relevant and `D_3` is irrelevant, we have:

`dcg_0 = 1 / log2(3)`, `dcg_3 = 0`, `dcg_0' = 1 / log2(10)`, `dcg_3' = 0`

so:

`(dcg_0 - dcg_0') + (dcg_3 - dcg_3') = 1 / log2(3) - 1 / log2(10)`

which is what I’m calculating.
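The mixed case can be checked the same way, again with the example ranks (2 and 9): the irrelevant document contributes zero before and after, so the difference reduces to `1/log2(3) - 1/log2(10)`.

```python
import math

def dcg_term(rel, rank):
    # One document's contribution to the DCG summation.
    return (2 ** rel - 1) / math.log2(rank + 1)

# D_0 relevant at rank 2, D_3 irrelevant at rank 9.
before = dcg_term(1, 2) + dcg_term(0, 9)
after = dcg_term(1, 9) + dcg_term(0, 2)  # ranks swapped

delta_dcg = before - after
```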
That’s what I’m calculating. As I mention in the code, I’m assuming binary relevance. Because the denominator for the normalized discounted cumulative gain (NDCG) won’t change following a ranking swap, we can ignore it until the end (I changed the name of the `ndcg_diffs` variable to `dcg_diffs` to make it a little more clear what’s being computed right there). The discounted cumulative gain (DCG) is a summation of terms corresponding to each document, where document `i`’s term is equal to `(2 ^ rel_i - 1) / log2(rank_i + 1)`. As a result, swapping the ranks of two documents only changes the values of the corresponding two terms in the summation. Therefore, swapping two relevant documents or two irrelevant documents will result in the same DCG, and so the difference between those two NDCGs will be zero (i.e., they don’t need to be computed). Because the value of a term associated with an irrelevant document is zero (i.e., `2 ^ 0 - 1 = 0`), the difference in the DCGs following a ranking swap between a relevant and an irrelevant document is simply the difference between the value of the term for the relevant document in its original ranking and the value of the term for the relevant document in its “irrelevant” ranking. `1 / (1 + doc_ranks[:n_rel]).log2()` gives the terms corresponding to the relevant documents in their original rankings, and `1 / (1 + doc_ranks[n_rel:]).log2()` gives the terms corresponding to the relevant documents in their swapped rankings. You’ll notice the `dcg_diffs` tensor has a shape of `(n_rel, n_irr)` (i.e., it holds the DCG differences for all non-zero swaps). The reason I did it the way I did is that it’s vectorized, which runs much faster than a for loop (which is what the code you linked to uses). In fact, if you look at this commit, I was originally using a for loop (and had the logarithm in the right place; that explains why I remember it working!).
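The broadcasting idea can be sketched as follows. This is a minimal illustration, not the repository's actual code: it assumes PyTorch, binary relevance, 1-indexed ranks, and a `doc_ranks` tensor sorted relevant-documents-first, with made-up data. The vectorized result matches an explicit double loop.

```python
import math
import torch

# Hypothetical data: 2 relevant documents followed by 3 irrelevant
# ones; doc_ranks[i] is document i's 1-indexed rank after scoring.
doc_ranks = torch.tensor([2.0, 9.0, 1.0, 5.0, 7.0])
n_rel, n_irr = 2, 3

# Per-document DCG discount terms (binary relevance, so a relevant
# document's term is just 1 / log2(1 + rank)).
rel_terms = 1 / (1 + doc_ranks[:n_rel]).log2()
irr_terms = 1 / (1 + doc_ranks[n_rel:]).log2()

# An (n_rel, 1) column minus an (n_irr,) row broadcasts to the
# (n_rel, n_irr) tensor of DCG differences for every
# relevant/irrelevant swap.
dcg_diffs = rel_terms.view(n_rel, 1) - irr_terms

# The same quantity with an explicit double loop.
looped = torch.empty(n_rel, n_irr)
for i in range(n_rel):
    for j in range(n_irr):
        looped[i, j] = (1 / math.log2(1 + doc_ranks[i].item())
                        - 1 / math.log2(1 + doc_ranks[n_rel + j].item()))

assert torch.allclose(dcg_diffs, looped)
```

Each entry `dcg_diffs[i, j]` is the change in the relevant document's term if it traded ranks with irrelevant document `j`, which is all that survives of the DCG difference under binary relevance.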