
Possible error in ndcg_diffs calculation

See original GitHub issue

I think it should be (with log2 on denominator only):

ndcg_diffs = (1 / (1 + doc_ranks[:n_rel]).log2()) - (1 / (1 + doc_ranks[n_rel:]).log2()).view(n_irr)

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
airalcorn2 commented, Jan 5, 2019

The relevance is the human label for the document while the rank is where the document ends up after being scored by the model. To be a little more concrete about the calculations, imagine we provide a scoring model M with a query Q and it returns a set of ten documents in the following order: (D_6, D_0, D_7, D_4, D_8, D_5, D_9, D_1, D_3, D_2). A human could then go through each document and assign it a relevance of (in the binary case) one or zero for the given Q (an alternative labeling strategy would be to have several relevance levels, e.g., “very relevant”, “somewhat relevant”, and “irrelevant”). Once we have this information, we can calculate the DCG for this result set and each D_i will contribute a term of:

dcg_i = (2 ^ rel_i - 1) / log2(rank_i + 1)

to the summation. If we swap D_0 and D_3 and calculate the new DCG, all of the dcg_i' terms except for dcg_0' and dcg_3' will stay the same, so those terms will cancel when taking the difference (this is true whether we use binary relevance or not). As a result, what we have left is:

delta_DCG = (dcg_0 + dcg_3) - (dcg_0' + dcg_3')

If we pretend like both documents are relevant, then:

dcg_0 = (2 ^ 1 - 1) / log2(2 + 1)
dcg_3 = (2 ^ 1 - 1) / log2(9 + 1)

After swapping them, we have:

dcg_0' = (2 ^ 1 - 1) / log2(9 + 1)
dcg_3' = (2 ^ 1 - 1) / log2(2 + 1)

so:

(dcg_0 + dcg_3) - (dcg_0' + dcg_3') = 0
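This zero difference is easy to check numerically. Here is a minimal sketch in plain Python (`dcg_term` is a hypothetical helper written for illustration, not a function from the repository):

```python
import math

def dcg_term(rel, rank):
    # DCG contribution of a document with relevance `rel` at rank `rank`
    return (2 ** rel - 1) / math.log2(rank + 1)

# Two relevant documents at ranks 2 and 9, before and after swapping their ranks
before = dcg_term(1, 2) + dcg_term(1, 9)
after = dcg_term(1, 9) + dcg_term(1, 2)
print(before - after)  # 0.0
```

Swapping only reorders the two terms in the sum, so the difference is exactly zero.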

The same is true when you have two irrelevant documents, but that case is even less interesting because their dcg_i terms are equal to zero. In the case where D_0 is relevant and D_3 is irrelevant, we have:

dcg_0 = (2 ^ 1 - 1) / log2(2 + 1)
dcg_3 = (2 ^ 0 - 1) / log2(9 + 1) = 0
dcg_0' = (2 ^ 1 - 1) / log2(9 + 1)
dcg_3' = (2 ^ 0 - 1) / log2(2 + 1) = 0

so:

delta_DCG = (dcg_0 + dcg_3) - (dcg_0' + dcg_3') = dcg_0 - dcg_0'

which is what I’m calculating.
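The same arithmetic can be sketched in plain Python (`dcg_term` is a hypothetical helper; the ranks 2 and 9 follow the example above):

```python
import math

def dcg_term(rel, rank):
    # DCG contribution of a document with relevance `rel` at rank `rank`
    return (2 ** rel - 1) / math.log2(rank + 1)

# D_0 relevant at rank 2, D_3 irrelevant at rank 9; swap their ranks
delta_dcg = (dcg_term(1, 2) + dcg_term(0, 9)) - (dcg_term(1, 9) + dcg_term(0, 2))

# The irrelevant document's terms are zero, so only the relevant document's
# two terms survive the difference
print(delta_dcg)                        # ≈ 0.3299
print(dcg_term(1, 2) - dcg_term(1, 9))  # same value
```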

1 reaction
airalcorn2 commented, Jan 5, 2019

I think it is supposed to be the difference between the predicted NDCG and the NDCGs where the ranks of each (i, j) document pair are swapped but all other ranks stay the same.

That’s what I’m calculating. As I mention in the code, I’m assuming binary relevance. Because the denominator for the normalized discounted cumulative gain (NDCG) won’t change following a ranking swap, we can ignore it until the end (I changed the name of the ndcg_diffs variable to dcg_diffs to make it a little more clear what’s being computed right there).

The discounted cumulative gain (DCG) is a summation of terms corresponding to each document, where a document i’s term is equal to (2 ^ rel_i - 1) / log2(rank_i + 1). As a result, swapping the ranks of two documents only changes the values of the corresponding two terms in the summation. Therefore, swapping two relevant documents or two irrelevant documents will result in the same DCG, and so the difference between those two NDCGs will be zero (i.e., they don’t need to be computed).

Because the value of a term associated with an irrelevant document is zero (i.e., (2 ^ 0 - 1) = 0), the difference in the DCGs following a ranking swap between a relevant and an irrelevant document is simply the difference between the value of the term for the relevant document in its original ranking and the value of the term for the relevant document in its “irrelevant” ranking.

1 / (1 + doc_ranks[:n_rel]).log2() are the terms corresponding to the relevant documents in their original rankings. 1 / (1 + doc_ranks[n_rel:]).log2() are the terms corresponding to the relevant documents in their swapped rankings.

You’ll notice the dcg_diffs tensor has a shape of (n_rel, n_irr) (i.e., it holds the DCG differences for all of the non-zero swaps). The reason I did it this way is that it’s vectorized, which runs much faster than a for loop (which is what the code you linked to uses). In fact, if you look at this commit, I was originally using a for loop (and had the logarithm in the right place; that explains why I remember it working!).
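Under the binary-relevance assumption, the broadcasted computation can be mirrored with plain Python lists (the ranks and sizes below are hypothetical; `doc_ranks` is assumed to store the relevant documents’ ranks first, as in the code being discussed):

```python
import math

# Hypothetical example: 2 relevant and 2 irrelevant documents with model-assigned ranks
doc_ranks = [2, 5, 9, 1]
n_rel, n_irr = 2, 2

# Per-document DCG terms, 1 / log2(1 + rank): relevant docs at their original
# ranks, and the ranks they would take after swapping with an irrelevant doc
rel_terms = [1 / math.log2(1 + r) for r in doc_ranks[:n_rel]]
irr_terms = [1 / math.log2(1 + r) for r in doc_ranks[n_rel:]]

# dcg_diffs[i][j] is the change in DCG when relevant doc i swaps ranks with
# irrelevant doc j; shape is (n_rel, n_irr), matching the vectorized tensor
dcg_diffs = [[ri - rj for rj in irr_terms] for ri in rel_terms]
```

The nested comprehension plays the role of the tensor broadcast: every (relevant, irrelevant) pair gets its difference in one pass, with no explicit pairwise loop over swap candidates in the training code.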


