Use as a "pure" content-based recommender
I was testing lightfm as a content-based recommender. I was thinking of the starting point of a recommendation system, when users have not rated anything yet, so the data matrix is all zeroes. I would like to use several features but, for the sake of simplicity, let's say there is just one feature, "color", for both users (the user's favourite color) and items (the item's color).

I think I am doing the encoding right. I am passing matrices with one column per color. Each row has a value of 1 in the column for the appropriate color (right now, each item has only one color and each user has only one favourite color) and 0 everywhere else.

I was expecting that if a user likes the color purple, the prediction for that user would score purple items higher, so they would be at the top of the list when I sort the array by score in descending order. However, I get seemingly random recommendations every time I train the model and ask for a recommendation for the same user. For example, if I execute:
import numpy as np
from lightfm import LightFM

# data, user_features and image_features are the matrices described above
for i in range(5):
    model = LightFM(loss='warp')
    model.fit(data, user_features=user_features,
              item_features=image_features, num_threads=2)
    scores = model.predict(1, np.arange(image_features.shape[0]),
                           user_features=user_features,
                           item_features=image_features)
    print(np.argsort(-scores)[0:5] + 1)  # adding one to align with id numbering
I get this output:
[31 29 38 32 40]
[40 24 16 22 25]
[ 8 24 6 36 25]
[20 8 2 1 17]
[ 1 19 39 12 5]
But the only purple items are 26, 27 and 28. Is it possible to use the model this way?
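A rough sketch of why the rankings could change on every run. It assumes, in line with the follow-up comment in this thread, that feature matrices alone give the model nothing to learn from, so the embeddings stay at their random starting values. The embedding size, the initialisation range and the `untrained_top5` helper are hypothetical and are not taken from the library.

```python
import random

N_ITEMS = 40          # items are numbered 1..40, as in the output above
N_COMPONENTS = 10     # hypothetical embedding size


def untrained_top5(seed):
    """Rank items using embeddings that were initialised but never updated."""
    rng = random.Random(seed)
    # Small random starting values; the exact scheme is an assumption.
    user = [rng.uniform(-0.05, 0.05) for _ in range(N_COMPONENTS)]
    items = [[rng.uniform(-0.05, 0.05) for _ in range(N_COMPONENTS)]
             for _ in range(N_ITEMS)]
    # With an all-zero interaction matrix there are no positive pairs,
    # so no update step runs and the scores are just these dot products.
    scores = [sum(u * v for u, v in zip(user, item)) for item in items]
    order = sorted(range(N_ITEMS), key=lambda i: -scores[i])
    return [i + 1 for i in order[:5]]


# Each seed stands in for one fresh model; the top-5 list depends only on it.
for seed in range(5):
    print(untrained_top5(seed))
```

Under these assumptions the color columns never influence the ranking, because nothing ties a user's favourite color to an item's color until at least some positive interactions are observed.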
Issue Analytics
- Created 6 years ago
- Comments: 7
Top GitHub Comments
Great, I think I get it now. Thanks for the clarifications, the suggestion and, of course, for making the library available!
Thanks for the clarification. I think I had misunderstood how the model works. I thought the model could learn from both interaction data and feature data, even if one of them was not present. From your answer, I understand that feature information on its own does not count as training data, since it says nothing about which items users actually like or dislike. A short question, just to get a better idea of how the features are used: if I have positive interaction data for a user and an item, then the model will recommend items with similar features to that user, or predict that users with similar features will like that item. Is this (roughly) right?
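One way to picture the intuition in the question above is the sketch below. It assumes (this is not confirmed in the thread) that each user and item is represented as the sum of the embeddings of its active features, and that the score is the dot product of the two representations, with biases left out. The embedding values and the `representation` and `score` helpers are hypothetical and chosen by hand; in a trained model the values would come from positive interactions.

```python
# Hypothetical, hand-picked 2-d embeddings for each color feature.
# In a real model these would be learned from interactions, not set by hand.
user_feature_emb = {"purple": [1.0, 0.0], "green": [0.0, 1.0]}
item_feature_emb = {"purple": [0.9, 0.1], "green": [0.1, 0.8]}


def representation(features, embeddings):
    """Sum the embeddings of all active features (a one-hot row has one)."""
    dims = len(next(iter(embeddings.values())))
    rep = [0.0] * dims
    for name in features:
        for d, value in enumerate(embeddings[name]):
            rep[d] += value
    return rep


def score(user_features, item_features):
    """Dot product of the user and item representations (biases omitted)."""
    u = representation(user_features, user_feature_emb)
    v = representation(item_features, item_feature_emb)
    return sum(a * b for a, b in zip(u, v))


# Two purple items and one green item, scored for a user who likes purple.
items = {26: ["purple"], 27: ["purple"], 5: ["green"]}
purple_fan = ["purple"]
scores = {item_id: score(purple_fan, feats) for item_id, feats in items.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Under these assumptions, items that share the same features get the same score for a given user, so whatever is learned from one purple item carries over to every other purple item, and likewise across users who share a favourite color.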