Basic Recommender Ranking: Big Optimization issue
Hi everyone,
I am a data scientist, and I started following your TensorFlow Recommenders examples a few weeks ago.
I noticed that the basic recommender (ranking) is deeply flawed.
Here is the reason:
- Try to plot the ranking results as histogram or kde
- Do the same with the labels from the test dataset
- Plot also a random integer distribution (from 1 to 5).
If you add the following code to the basic ranking recommender notebook, you should get the same results I did.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('darkgrid')
res_data = pd.DataFrame()
# get predictions from cached test
res_data['predictions'] = model.predict(cached_test)[:, 0]
# add random integer distribution (integers from 1 to 5 inclusive)
res_data['random'] = np.random.randint(1, 6, len(res_data))
# get user rating from test dataset, one batch at a time
test_labels = []
for r in cached_test:
    test_labels.append(r['user_rating'].numpy())
res_data['test_labels'] = np.concatenate(test_labels)
# plot everything as kde
plt.figure(figsize=(10, 7), dpi=100)
sns.kdeplot(data=res_data, fill=True, bw_adjust=0.9, alpha=0.6, linewidth=0, legend=False)
# legend labels are reversed to match the order in which seaborn draws the curves
plt.legend(["Predictions", "Random Monkey", "Test Labels"][::-1], title="Legend", fontsize=12, title_fontsize=16)
plt.title('Predictions vs. test labels', fontsize=20);
Results:

Do you see the issue? Is it normal that the predictions cannot properly rank the input data, since they follow a roughly Gaussian distribution around a mean of 3.5? Did I miss something?
Thank you in advance! @albertvillanova @maciejkula @MarkDaoust @hojinYang

Yes. Broader is exactly what I'd expect, since this is being approached as a regression. MSE loss doesn't encourage sharpness; it tends to give a blurry result. The model isn't meant to predict exactly 1.0, 2.0, 3.0, ... because it needs to hedge its bets. It's approximating a discrete distribution with a continuous one. The plot would be fairer if the labels were plotted as a bar chart.
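To make the "hedging" point concrete, here is a minimal NumPy sketch that is not part of the notebook. It assumes a made-up rating distribution for a single user/movie pair and searches for the constant prediction with the lowest MSE. The names `ratings`, `candidates`, and `best`, and the probabilities, are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical noisy labels for one user/movie pair (assumed probabilities).
ratings = rng.choice([1, 2, 3, 4, 5], size=10_000, p=[0.05, 0.10, 0.25, 0.35, 0.25])

# Try every constant prediction between 1 and 5 in steps of 0.01.
candidates = np.linspace(1.0, 5.0, 401)

# Mean squared error of each candidate against all sampled ratings.
mse = ((ratings[None, :] - candidates[:, None]) ** 2).mean(axis=1)

# The MSE-optimal constant sits at (the grid point nearest to) the sample mean,
# which is generally not one of the integer ratings.
best = candidates[np.argmin(mse)]
print(f"best constant: {best:.2f}, sample mean: {ratings.mean():.2f}")
```

Under these assumptions the best prediction lands between the integer ratings, which is why a regression model's outputs cluster around intermediate values instead of reproducing the discrete label histogram.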
If you want integer predictions, you could round off the predictions or train this as a classifier.
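Rounding could look like the sketch below. It is not from the notebook: `to_star_ratings` is a hypothetical helper and `preds` is made-up data standing in for the output of `model.predict`. The per-star counts it produces are what a bar chart of predictions against labels would show.

```python
import numpy as np

def to_star_ratings(predictions, low=1, high=5):
    """Round continuous predictions to the nearest integer and clip to [low, high]."""
    # np.rint rounds exact halves to the nearest even integer (2.5 -> 2, 3.5 -> 4).
    rounded = np.rint(predictions).astype(int)
    return np.clip(rounded, low, high)

# Made-up regression outputs; some fall outside the valid 1..5 range.
preds = np.array([0.7, 2.4, 3.4, 3.6, 4.49, 5.8])
stars = to_star_ratings(preds)

# Number of predictions per star value 1..5, suitable for a bar chart.
counts = np.bincount(stars, minlength=6)[1:]
print(stars.tolist(), counts.tolist())
```

Note that rounding only changes how the predictions are presented. The underlying model is still trained with MSE, so this is not equivalent to training a classifier.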
@rageSpin could you send a PR to add your plotting code to the notebook (without the random-monkey)?
Done!