Basic Recommender Ranking: Big Optimization issue

Hi everyone,

I am a data scientist, and I started following your TensorFlow Recommenders examples a few weeks ago.

I noticed that the Basic Recommender (ranking) is deeply flawed.

Here is the reason:

Plot the ranking results as a histogram or KDE.
Do the same with the labels from the test dataset.
Also plot a random integer distribution (from 1 to 5).

If you add the following code to the basic ranking recommender notebook, you should get the same results I did.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style('darkgrid')

res_data = pd.DataFrame()

# get predictions from cached test 
res_data['predictions'] = model.predict(cached_test)[:, 0]

# add random integer distribution
res_data['random'] = np.random.randint(1,6, len(res_data))

# get user rating from test dataset
test_labels = []
for r in cached_test:
  test_labels.append((r['user_rating']).numpy())

res_data['test_labels'] = np.concatenate(test_labels)

# plot everything as KDE
plt.figure(figsize=(10,7), dpi=100)
sns.kdeplot(data=res_data, fill=True, bw_adjust=0.9, alpha=0.6, linewidth=0, legend=False)
plt.legend(["Predictions", "Random Monkey",  "Test Labels"][::-1], title="Legend", fontsize=12, title_fontsize=16)
plt.title('Predictions vs. test labels', fontsize=20);

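Results: (figure: KDE plot titled "Predictions vs. test labels", comparing the predictions, the random integers, and the test labels)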

Do you see the issue? Is it normal that the predictions cannot properly rank the input data, given that they are distributed in a Gaussian-like way around a mean value of 3.5? Did I miss something?

Thank you in advance! @albertvillanova @maciejkula @MarkDaoust @hojinYang

Top GitHub Comments

1 reaction

MarkDaoust commented, Dec 13, 2022

Yes. Broader is exactly what I’d expect, since this is being approached as a regression. MSE loss doesn’t encourage sharpness; it tends to give a blurry result. The model isn’t meant to predict exactly 1.0, 2.0, 3.0, … because it needs to hedge its bets: it’s approximating a discrete distribution with a continuous one. The plot would be fairer if the labels were plotted as a bar chart.
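To make the "hedging" point concrete, here is a minimal sketch in plain Python. The ratings are made-up illustrative values, not MovieLens data, and `mse` is a helper defined only for this example. It scans constant predictions between 1 and 5 and shows that the one with the lowest MSE lands at the mean of the labels rather than on one of the integer ratings.

```python
# Illustrative ratings for one hypothetical context (made-up data, not MovieLens).
ratings = [3.0, 4.0, 4.0, 5.0, 2.0, 4.0, 3.0, 5.0]


def mse(prediction, labels):
    """Mean squared error of one constant prediction against all labels."""
    return sum((label - prediction) ** 2 for label in labels) / len(labels)


# Scan constant predictions from 1.00 to 5.00 in steps of 0.01.
candidates = [1.0 + i / 100 for i in range(401)]
best = min(candidates, key=lambda c: mse(c, ratings))
mean_rating = sum(ratings) / len(ratings)

print(f"MSE-optimal constant prediction: {best:.2f}")
print(f"Mean of the ratings:             {mean_rating:.2f}")
```

A regression model trained with MSE behaves similarly for each input: it is pulled toward the average of the plausible ratings, which would explain why the predictions pile up in the middle of the scale instead of on the integer values.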

If you want integer predictions, you could round off the predictions or train this as a classifier.
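The rounding option could look roughly like the sketch below. The function name `to_star_rating` and the sample predictions are hypothetical, chosen only for illustration; the sketch assumes the valid ratings are the integers 1 through 5.

```python
def to_star_rating(prediction, low=1, high=5):
    """Round a continuous prediction to the nearest integer rating in [low, high]."""
    # int(x + 0.5) rounds halves up for positive x; the built-in round()
    # would round halves to the nearest even integer instead.
    rounded = int(prediction + 0.5)
    # Clip so out-of-range regression outputs still map to a valid rating.
    return max(low, min(high, rounded))


# Hypothetical continuous predictions, including two outside the 1-5 range.
predictions = [3.42, 3.71, 2.49, 4.50, 5.30, 0.80]
stars = [to_star_rating(p) for p in predictions]
print(stars)
```

Note that rounding creates ties between items whose continuous scores differed, so for ranking purposes the unrounded predictions carry more information.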

@rageSpin could you send a PR to add your plotting code to the notebook (without the random-monkey)?

0 reactions

rageSpin commented, Dec 15, 2022

Thanks for the PR! LGTM.

Don’t forget to follow the CLA instructions the bot posted: #593 (comment)

We can’t merge the PR unless you do.

Done!
