Basic Recommender Ranking: Big Optimization issue


Hi everyone,

I am a data scientist, and I started following your TensorFlow Recommenders examples a few weeks ago.

I noticed that the Basic Recommender (ranking) is deeply flawed.

Here is how to see it:

  1. Plot the ranking predictions as a histogram or KDE plot.
  2. Do the same with the labels from the test dataset.
  3. Also plot a random integer distribution (from 1 to 5).

If you add the following code to the basic ranking recommender notebook, you should get the same results I did.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style('darkgrid')

res_data = pd.DataFrame()

# get predictions from cached test 
res_data['predictions'] = model.predict(cached_test)[:, 0]

# add random integer distribution
res_data['random'] = np.random.randint(1,6, len(res_data))

# get user rating from test dataset
test_labels = []
for r in cached_test:
  test_labels.append((r['user_rating']).numpy())

res_data['test_labels'] = np.concatenate(test_labels)

# plot everything as KDE
plt.figure(figsize=(10,7), dpi=100)
sns.kdeplot(data=res_data, fill=True, bw_adjust=0.9, alpha=0.6, linewidth=0, legend=False)
plt.legend(["Predictions", "Random Monkey",  "Test Labels"][::-1], title="Legend", fontsize=12, title_fontsize=16)
plt.title('Predictions vs. test labels', fontsize=20);

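Results: [figure: KDE plot "Predictions vs. test labels" showing the predictions, the random integers, and the test labels]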

Do you see the issue? Is it normal that the predictions cannot properly rank the input data, since they are distributed in a roughly Gaussian way around a mean value of 3.5? Did I miss something?
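One way to test the ranking question directly, instead of comparing distribution shapes, is to check how often the predictions put a pair of items in the same order as their labels. The following is only a minimal sketch: `pairwise_ordering_accuracy` is a hypothetical helper, not part of TensorFlow Recommenders, and the numbers are made up for illustration.

```python
import numpy as np

def pairwise_ordering_accuracy(labels, predictions):
    """Fraction of item pairs with different labels that the predictions order correctly.

    Hypothetical helper for illustration. It builds n x n matrices, so use it
    on a small sample only. Tied predictions count as incorrect.
    """
    labels = np.asarray(labels, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    label_diff = labels[:, None] - labels[None, :]
    pred_diff = predictions[:, None] - predictions[None, :]
    comparable = label_diff > 0  # each pair with different labels appears once
    if not comparable.any():
        return float("nan")
    return float(np.mean(pred_diff[comparable] > 0))

# Made-up example: predictions squeezed into a narrow band around 3.5
# can still order every pair correctly.
labels = [1, 2, 3, 4, 5]
narrow_predictions = [3.1, 3.3, 3.5, 3.6, 3.9]
narrow_score = pairwise_ordering_accuracy(labels, narrow_predictions)
reversed_score = pairwise_ordering_accuracy(labels, narrow_predictions[::-1])
print(narrow_score, reversed_score)
```

In the notebook, the same check could be run on a sample of `res_data['test_labels']` and `res_data['predictions']`; a score near 0.5 would indicate ordering no better than chance.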

Thank you in advance! @albertvillanova @maciejkula @MarkDaoust @hojinYang

Issue Analytics

  • State: open
  • Created: 10 months ago
  • Reactions: 1
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

MarkDaoust commented, Dec 13, 2022 (1 reaction)

Yes. Broader is exactly what I’d expect, since this is being approached as a regression. MSE loss doesn’t encourage sharpness; it tends to give a blurry result. The model isn’t meant to predict exactly 1.0, 2.0, 3.0, … because it needs to hedge its bets. It’s approximating a discrete distribution with a continuous one. The plot would be fairer if the labels were plotted as a bar chart.
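The hedging described above can be illustrated with a small sketch. It assumes a made-up set of ratings for inputs the model cannot tell apart, and searches for the single prediction with the lowest mean squared error. The helper name and the data are hypothetical; this is not code from the notebook.

```python
import numpy as np

def best_constant_under_mse(labels, candidates):
    """Return the candidate constant prediction with the lowest mean squared error."""
    labels = np.asarray(labels, dtype=float)
    losses = [np.mean((labels - c) ** 2) for c in candidates]
    return float(candidates[int(np.argmin(losses))])

# Made-up integer ratings for inputs the model cannot distinguish.
ratings = np.array([3, 4, 4, 5, 2, 4, 3, 5])

# Candidate predictions from 1.0 to 5.0 in steps of 0.01.
grid = np.linspace(1.0, 5.0, 401)

best = best_constant_under_mse(ratings, grid)
print(best, ratings.mean())
```

The lowest-loss prediction coincides with the mean of the ratings, which is not itself an integer rating. That is why a regression trained with MSE tends to produce values between the integer labels rather than sharp peaks at 1, 2, 3, 4, and 5.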

If you want integer predictions, you could round off the predictions or train this as a classifier.
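The rounding option can be sketched as follows, assuming ratings run from 1 to 5 as in the notebook. `to_star_ratings` is a hypothetical helper name and the input values are made up.

```python
import numpy as np

def to_star_ratings(predictions, low=1, high=5):
    """Round continuous predictions to the nearest integer and clip to [low, high].

    Note: np.rint rounds exact halves to the nearest even integer.
    """
    rounded = np.rint(np.asarray(predictions, dtype=float))
    return np.clip(rounded, low, high).astype(int)

# Made-up predictions, including values outside the 1-5 range.
example = to_star_ratings([0.4, 2.6, 3.49, 4.51, 5.8])
print(example)
```

Rounding only changes how the predictions are reported. The model is still trained as a regression, so training it as a classifier over the five rating values remains a separate option.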

@rageSpin could you send a PR to add your plotting code to the notebook (without the random-monkey)?

rageSpin commented, Dec 15, 2022 (0 reactions)

Thanks for the PR! LGTM.

Don’t forget to follow the CLA instructions the bot posted: #593 (comment)

We can’t merge the PR unless you do.

Done!

