
[🐛BUG] Over-estimated TopK metrics


In my evaluations, I found that precision or recall is sometimes larger than 1.

I went through the code, and I believe this is a bug in the library. Correct me if I am wrong.

Let’s take a look at recbole/evaluator/base_metric.py

def used_info(self, dataobject):
    """Get the bool matrix indicating whether the corresponding item is positive
    and number of positive items for each user.
    """
    rec_mat = dataobject.get('rec.topk')
    topk_idx, pos_len_list = torch.split(rec_mat, [max(self.topk), 1], dim=1)
    return rec_mat.to(torch.bool).numpy(), pos_len_list.squeeze(-1).numpy()

In the code above, I believe we should:

return topk_idx.to(torch.bool).numpy(), pos_len_list.squeeze(-1).numpy()

instead of

return rec_mat.to(torch.bool).numpy(), pos_len_list.squeeze(-1).numpy()

If I am recommending the top-K items, the first value returned always has shape num_users × (K + 1), not num_users × K.
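A minimal, self-contained sketch of the shape mismatch (toy numbers, not RecBole's real data; the hit@k columns and the trailing positive-count column are assumptions based on the description above):

import torch

# Toy rec_mat for 3 users with K = 2: each row is
# [hit@1, hit@2, number_of_positive_items].
rec_mat = torch.tensor([[1, 0, 4],
                        [0, 1, 2],
                        [1, 1, 5]])
topk = [2]

topk_idx, pos_len_list = torch.split(rec_mat, [max(topk), 1], dim=1)

print(topk_idx.shape)  # torch.Size([3, 2])  -> num_users x K
print(rec_mat.shape)   # torch.Size([3, 3])  -> num_users x (K + 1)

# Returning rec_mat.to(torch.bool) keeps the positive-count column,
# and every non-zero count turns into an extra True ("hit") per user.
print(rec_mat.to(torch.bool))

Under this reading, the always-True final column would inflate any metric computed across all K + 1 columns, which would explain precision or recall above 1.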

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 10 (4 by maintainers)

Top GitHub Comments

1 reaction
guijiql commented, Jan 6, 2022

The np.cumsum function does indeed count the last column. The shape of rec_mat is (user_num, max(self.topk) + 1), and the return value of metric_info has the same shape. However, in the code referenced here: https://github.com/RUCAIBox/RecBole/blob/1bd8a587867959e8c37b881b2321eb6be7579912/recbole/evaluator/base_metric.py#L65-L80 the shape of avg_result is (max(self.topk) + 1,), but the loop only saves the first max(self.topk) values into the dict. The extra value sits in the (max(self.topk) + 1)-th column:

for k in self.topk:
    key = '{}@{}'.format(metric, k)
    metric_dict[key] = round(avg_result[k - 1], self.decimal_place)
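A standalone illustration of this point (toy boolean matrix; the last column stands in for pos_len cast to bool, which is an assumption drawn from this thread, not RecBole's actual pipeline):

import numpy as np

topk = [1, 2]

# Shape (user_num, max(topk) + 1); the last column is the extra one.
bool_mat = np.array([[True, False, True],
                     [False, True,  True]])

cum_hits = np.cumsum(bool_mat, axis=1)  # running hit counts per row
avg_result = cum_hits.mean(axis=0)      # shape (max(topk) + 1,)

# Only the first max(topk) entries are ever written into the dict;
# the extra value sits untouched at index max(topk).
metric_dict = {'hit@{}'.format(k): avg_result[k - 1] for k in topk}
print(avg_result)   # [0.5 1.  2. ] -- the last value is never read below
print(metric_dict)  # {'hit@1': 0.5, 'hit@2': 1.0}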
1 reaction
chenyushuo commented, Jan 6, 2022

But the last column will not be used in topk_result. In this function, we only use the columns listed in self.topk; the last column is the (max(self.topk) + 1)-th and is never read.

def topk_result(self, metric, value):
    """Match the metric value to the `k` and put them in `dictionary` form.

    Args:
        metric(str): the name of calculated metric.
        value(numpy.ndarray): metrics for each user, including values from `metric@1` to `metric@max(self.topk)`.

    Returns:
        dict: metric values required in the configuration.
    """
    metric_dict = {}
    avg_result = value.mean(axis=0)
    for k in self.topk:
        key = '{}@{}'.format(metric, k)
        metric_dict[key] = round(avg_result[k - 1], self.decimal_place)  # Here we only use the columns in `self.topk`.
    return metric_dict
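Since np.cumsum builds prefix sums, the value at column k - 1 depends only on columns 0 through k - 1, so the unused last column cannot leak into any metric@k. A quick check of this under the same toy assumptions as above:

import numpy as np

topk = [1, 3]
rng = np.random.default_rng(0)
hits = rng.integers(0, 2, size=(4, max(topk))).astype(bool)

# Append a garbage last column, standing in for pos_len cast to bool.
with_extra = np.concatenate([hits, np.ones((4, 1), dtype=bool)], axis=1)

for k in topk:
    # Prefix sums: column k - 1 sees only columns 0 .. k - 1, so the
    # extra final column never changes metric@k for k <= max(topk).
    assert np.array_equal(np.cumsum(hits, axis=1)[:, k - 1],
                          np.cumsum(with_extra, axis=1)[:, k - 1])
print("metric@k is unaffected by the extra column")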