Similar dataset but recommendations always the same for all users
Having successfully got my data into the correct format, I’ve been able to run the quickstart recommender on my own dataset. However, on any given run every user gets identical recommendations: the same item repeated over and over.
Looking at the MovieLens data in the same way, I see that it has the following properties:
- 944 users
- 1665 movies
- 100,000 mappings (each user rating ~100 movies)
My own dataset has the following properties:
- 4867 users
- 76 interests
- 24540 mappings (each user having ~5 interests)
I think I’ve gotten the data into the correct format. For example, the movie data mappings are like this:
b"One Flew Over the Cuckoo's Nest (1975)" b'138'
b'Strictly Ballroom (1992)' b'92'
b'Very Brady Sequel, A (1996)' b'301'
b'Pulp Fiction (1994)' b'60'
b'Scream 2 (1997)' b'197'
while I’ve now got the interest data similarly structured, like so:
b'Books' b'1242047'
b'Dance' b'91242048'
b'Sustainability' b'2870269'
b'Books' b'3970361'
b'Photography' b'3970362'
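(For context, here is a minimal sketch of how data in this shape maps onto the quickstart’s two datasets; interest_names and user_ids are hypothetical stand-ins for my real columns. Note that the same interest appears in many interactions, so the candidate corpus has to be deduplicated, otherwise the retrieval index can return the same item several times in the top k.)
import numpy as np
import tensorflow as tf

# Hypothetical parallel lists extracted from the raw data:
interest_names = ["Books", "Dance", "Sustainability", "Books", "Photography"]
user_ids = ["1242047", "91242048", "2870269", "3970361", "3970362"]

# Interactions: one (interest, user) pair per row; duplicates are expected.
interactions = tf.data.Dataset.from_tensor_slices({
    "interest_name": interest_names,
    "user_id": user_ids,
})

# Candidate corpus: each interest must appear exactly once.
interests = tf.data.Dataset.from_tensor_slices(np.unique(interest_names))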
However, the recommended interests with my dataset are locked to the same item every time.
Running with the MovieLens data there’s a nice spread of recommendations, and the training output looks like this:
Epoch 1/3
25/25 [==============================] - 8s 260ms/step - factorized_top_k/top_1_categorical_accuracy: 7.0000e-05 - factorized_top_k/top_5_categorical_accuracy: 0.0015 - factorized_top_k/top_10_categorical_accuracy: 0.0047 - factorized_top_k/top_50_categorical_accuracy: 0.0445 - factorized_top_k/top_100_categorical_accuracy: 0.1001 - loss: 33082.5255 - regularization_loss: 0.0000e+00 - total_loss: 33082.5255
Epoch 2/3
25/25 [==============================] - 6s 246ms/step - factorized_top_k/top_1_categorical_accuracy: 1.9000e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0047 - factorized_top_k/top_10_categorical_accuracy: 0.0136 - factorized_top_k/top_50_categorical_accuracy: 0.1065 - factorized_top_k/top_100_categorical_accuracy: 0.2112 - loss: 31007.2517 - regularization_loss: 0.0000e+00 - total_loss: 31007.2517
Epoch 3/3
25/25 [==============================] - 6s 242ms/step - factorized_top_k/top_1_categorical_accuracy: 2.3000e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0078 - factorized_top_k/top_10_categorical_accuracy: 0.0212 - factorized_top_k/top_50_categorical_accuracy: 0.1432 - factorized_top_k/top_100_categorical_accuracy: 0.2669 - loss: 30418.3815 - regularization_loss: 0.0000e+00 - total_loss: 30418.3815
Top 3 recommendations for user 42: [b'Rent-a-Kid (1995)' b'Only You (1994)' b'Just Cause (1995)']
and we can see a nice pattern of recommendations for users:
import numpy as np

# index(...) returns (scores, titles); "thing" below holds the scores.
thing, titles = index(np.array(["1"]))
print(f"Top 10 recommendations for user {user_ids_vocabulary.get_vocabulary()[1]}: {titles}")
print(f"thing: {thing}")
thing, titles = index(np.array(["2"]))
print(f"Top 10 recommendations for user {user_ids_vocabulary.get_vocabulary()[2]}: {titles}")
print(f"thing: {thing}")
Top 10 recommendations for user 405: [[b'Doom Generation, The (1995)'
b'Brother Minister: The Assassination of Malcolm X (1994)'
b'Theodore Rex (1995)' b'Nadja (1994)'
b'Turbo: A Power Rangers Movie (1997)'
b'All Dogs Go to Heaven 2 (1996)' b'Kansas City (1996)'
b'Maya Lin: A Strong Clear Vision (1994)' b'White Balloon, The (1995)'
b'Flipper (1996)']]
thing: [[4.8432746 4.619417 4.452347 4.371699 3.4825168 3.1067772 3.0736032
3.0411866 3.0028658 2.8818445]]
Top 10 recommendations for user 655: [[b'3 Ninjas: High Noon At Mega Mountain (1998)' b'Promesse, La (1996)'
b'For the Moment (1994)' b'City of Angels (1998)'
b"Antonia's Line (1995)" b"Marvin's Room (1996)"
b'Once Upon a Time... When We Were Colored (1995)'
b'Unhook the Stars (1996)' b'Kolya (1996)' b'Secrets & Lies (1996)']]
thing: [[9.494508 6.5379915 5.68907 5.3878336 5.3482184 5.290077 5.2577763
5.2102613 5.1238346 5.1227913]]
but for the interests data we get this:
Epoch 1/3
6/6 [==============================] - 10s 2s/step - factorized_top_k/top_1_categorical_accuracy: 0.1213 - factorized_top_k/top_5_categorical_accuracy: 0.1213 - factorized_top_k/top_10_categorical_accuracy: 0.1213 - factorized_top_k/top_50_categorical_accuracy: 0.1238 - factorized_top_k/top_100_categorical_accuracy: 0.1293 - loss: 31238.8574 - regularization_loss: 0.0000e+00 - total_loss: 31238.8574
Epoch 2/3
6/6 [==============================] - 10s 2s/step - factorized_top_k/top_1_categorical_accuracy: 0.2418 - factorized_top_k/top_5_categorical_accuracy: 0.2418 - factorized_top_k/top_10_categorical_accuracy: 0.2418 - factorized_top_k/top_50_categorical_accuracy: 0.2457 - factorized_top_k/top_100_categorical_accuracy: 0.2508 - loss: 37982.0569 - regularization_loss: 0.0000e+00 - total_loss: 37982.0569
Epoch 3/3
6/6 [==============================] - 9s 2s/step - factorized_top_k/top_1_categorical_accuracy: 0.1624 - factorized_top_k/top_5_categorical_accuracy: 0.1624 - factorized_top_k/top_10_categorical_accuracy: 0.1624 - factorized_top_k/top_50_categorical_accuracy: 0.1667 - factorized_top_k/top_100_categorical_accuracy: 0.1732 - loss: 31066.0243 - regularization_loss: 0.0000e+00 - total_loss: 31066.0243
Top 3 recommendations for user 42: [b'Spa' b'Spa' b'Spa']
and here are the details for the first two users:
Top 10 recommendations for user 990000155054: [[b'Spa' b'Spa' b'Spa' b'Spa' b'Spa' b'Spa' b'Spa' b'Spa' b'Spa' b'Spa']]
thing: [[0.194 0.194 0.194 0.194 0.194 0.194 0.194 0.194 0.194 0.194]]
Top 10 recommendations for user 990000154983: [[b'Spa' b'Spa' b'Spa' b'Spa' b'Spa' b'Spa' b'Spa' b'Spa' b'Spa' b'Spa']]
thing: [[0.194 0.194 0.194 0.194 0.194 0.194 0.194 0.194 0.194 0.194]]
On different runs the model might recommend a different interest, but it always gets stuck on the same one for all users. I’ve tried longer training runs, but that doesn’t seem to make any difference, so I’m starting to wonder whether there’s a minimum number of “ratings” needed per user? That’s the main difference between the datasets, i.e. ~100 ratings per user in MovieLens but only ~5 interests per user in mine. Or could it be that the number of possible interests is just too small?
Or am I making some stupid mistake in the code (in the interest vocabulary lookup table, perhaps)? What are the requirements on dataset size for this model to work?
Is there anything in the output above that indicates what’s going wrong? The net getting stuck in a local minimum, perhaps?
Perhaps the settings for the retrieval model need to be different given the fewer “ratings”, or some other difference in the dataset proportions?
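One quick check to rule out a lookup problem (a diagnostic sketch, assuming the quickstart’s user_ids_vocabulary): any queried id that isn’t in the vocabulary gets mapped to the same OOV token, so every unknown id produces the same query embedding and hence identical recommendations.
# Check whether the queried ids actually exist in the user vocabulary.
vocab = set(user_ids_vocabulary.get_vocabulary())
# "1" and "2" are the raw query strings used above; the long id is taken
# from the vocabulary printout.
for query_id in ["1", "2", "990000155054"]:
    print(query_id, "in vocabulary:", query_id in vocab)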

Okay, fixed it - gosh, so silly:
I had assumed that in the example a query like index(np.array(["1"])) represented the 1st user, when it is actually querying for the user whose id string is "1". I just had to look up the correct user ids (e.g. from user_ids_vocabulary.get_vocabulary())
and we get the expected behaviour … but this has been very educational in understanding a lot more about the system. I’m still not quite seeing which matrix multiplications are leading to particular outputs, but I’m well on the way.
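Concretely (a sketch using the quickstart’s own names):
import numpy as np

# index(np.array(["1"])) queries the user whose id string is "1",
# not the first user in the vocabulary.
scores, titles = index(np.array(["1"]))

# To query the n-th entry of the vocabulary, look the id up first:
nth_id = user_ids_vocabulary.get_vocabulary()[1]
scores, titles = index(np.array([nth_id]))
print(f"Top 10 recommendations for user {nth_id}: {titles}")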
@GaetanDu, your input about the uniqueness of the vocabulary was critical. @maciejkula, thanks for your input on this and for making the whole framework available.
I wonder if it would be worth adding a note to the quickstart docs about the user ids? Or perhaps having the first lookup be for a user id that couldn’t be confused with an index?
It seems strange that they should all be identical. Each user has a different combination of interests, so on that basis I would have thought there should be some variation between users.
My suspicion is that this approach (for this number of epochs, learning rate, and network size) relies on each user having “rated” a sufficient number of items. When we reduce to a much smaller MovieLens dataset in which each user has rated only a single movie, we get the same behaviour, i.e. the same set of recommendations for all users.
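A rough sketch of that reduction (using the quickstart’s MovieLens ratings; the variable names are just illustrative):
import tensorflow as tf
import tensorflow_datasets as tfds

ratings = tfds.load("movielens/100k-ratings", split="train")

# Keep only the first rating seen for each user, to mimic a dataset with a
# single interaction per user.
seen, users, titles = set(), [], []
for r in ratings.as_numpy_iterator():
    if r["user_id"] not in seen:
        seen.add(r["user_id"])
        users.append(r["user_id"])
        titles.append(r["movie_title"])

single_rating_ds = tf.data.Dataset.from_tensor_slices(
    {"user_id": users, "movie_title": titles})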