Use as a "pure" content-based recommender

I was testing lightfm as a content-based recommender. I have in mind the starting point of a recommendation system, when users have not rated anything yet, so the data (interaction) matrix is all zeroes. I would like to use several features but, for the sake of simplicity, let’s say there is just one feature, “color”, for both users (the user’s favourite color) and items (the item’s color). I think I am doing the encoding right: I pass matrices with one column per color, and each row has a value of 1 in the appropriate column (right now, items have only one color and users have only one favourite color) and 0 in the rest of the positions.

I was expecting that if a user likes purple, the model would score purple items higher for that user, so they would be at the top of the list when I sort the scores in descending order. However, I get seemingly random recommendations every time I train the model and ask for a recommendation for the same user. For example, if I execute:

import numpy as np
from lightfm import LightFM

# data, user_features and image_features are the matrices described above
for i in range(5):
    model = LightFM(loss='warp')
    model.fit(data, user_features=user_features,
              item_features=image_features, num_threads=2)
    item_ids = np.arange(image_features.shape[0])
    scores = model.predict(1, item_ids, user_features=user_features,
                           item_features=image_features)
    print(np.argsort(-scores)[0:5] + 1)  # adding one to align with id numbering

I get this output

[31 29 38 32 40]
[40 24 16 22 25]
[ 8 24  6 36 25]
[20  8  2  1 17]
[ 1 19 39 12  5]

But the only purple items are 26, 27 and 28. Is it possible to use the model this way?
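To make the setup concrete, here is a minimal sketch of the encoding described above, using made-up toy data (the colour list, the four users and the six items are all hypothetical, and `one_hot` is just an illustrative helper, not part of lightfm). It also shows two things that may help when reasoning about the problem: an all-zero interaction matrix contains no positive user-item pairs to learn from, and a plain feature-match baseline (not LightFM, just a dot product of the one-hot rows) already gives the ranking I was expecting.

```python
import numpy as np

# Hypothetical toy data: 3 colours, 4 users, 6 items.
colours = ["red", "green", "purple"]
user_colours = ["red", "purple", "green", "purple"]  # favourite colour per user
item_colours = ["red", "red", "green", "purple", "purple", "green"]


def one_hot(values, vocabulary):
    """Return a (len(values), len(vocabulary)) 0/1 matrix with a single 1 per row."""
    matrix = np.zeros((len(values), len(vocabulary)), dtype=np.float32)
    for row, value in enumerate(values):
        matrix[row, vocabulary.index(value)] = 1.0
    return matrix


user_features = one_hot(user_colours, colours)
item_features = one_hot(item_colours, colours)

# Cold start: nobody has rated anything, so there are no positive pairs at all.
interactions = np.zeros((len(user_colours), len(item_colours)), dtype=np.float32)
has_training_signal = bool(interactions.sum() > 0)

# Feature-match baseline (not LightFM): the score is 1 where the item colour
# equals the user's favourite colour and 0 otherwise.
match_scores = user_features @ item_features.T

# Top 2 items for user index 1 (who likes purple); a stable sort keeps ties
# in their original order.
top_for_user_1 = np.argsort(-match_scores[1], kind="stable")[:2]
print(has_training_signal, top_for_user_1)
```

In this toy example the two purple items (indices 3 and 4) come out on top for the purple-loving user, but only because the matching rule is hard-coded rather than learned.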

Issue Analytics

Created 6 years ago
Comments: 7

Top GitHub Comments

jmmroldan commented, Nov 25, 2017

Great, I think I get it now. Thanks for the clarifications, the suggestion, and, of course, for making the library available!

jmmroldan commented, Nov 25, 2017

Thanks for the clarification. I think I had misunderstood how the model works: I thought the model could learn from both interaction data and feature data, even if one of them was not present. From your answer, I understand that feature information on its own does not count as data, since it says nothing about which items the users actually like or dislike. A short question, just to get a better idea of how the features are used: if I have positive interaction data for a user and an item, then the model will recommend items with similar features to that user, or predict that users with similar features will like that item. Is this (roughly) right?
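To illustrate the intuition behind that question, here is a rough numpy sketch of how, as far as I understand the LightFM model description, features enter the score: each feature has its own embedding, a user or item is represented by the sum of the embeddings of its features, and the score is the dot product of the two representations plus bias terms. The embeddings below are random placeholders (in a real model they would be learned from interactions), and all variable names are my own, not lightfm attributes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_components = 4  # latent dimensionality, chosen arbitrarily for the sketch
n_colours = 3     # feature columns: red, green, purple

# One embedding row and one bias per feature. Random/zero placeholders here;
# a fitted model would learn these from positive interactions.
user_feature_embeddings = rng.normal(size=(n_colours, n_components))
item_feature_embeddings = rng.normal(size=(n_colours, n_components))
user_feature_biases = np.zeros(n_colours)
item_feature_biases = np.zeros(n_colours)

# One user who likes purple; one red item followed by two purple items.
user_features = np.array([[0.0, 0.0, 1.0]])
item_features = np.array([[1.0, 0.0, 0.0],
                          [0.0, 0.0, 1.0],
                          [0.0, 0.0, 1.0]])

# A user or item representation is the sum of its features' embeddings.
user_repr = user_features @ user_feature_embeddings
item_repr = item_features @ item_feature_embeddings

# Score = dot product of the representations plus user and item biases.
scores = (user_repr @ item_repr.T
          + (user_features @ user_feature_biases)[:, None]
          + (item_features @ item_feature_biases)[None, :])

# Items with identical features get identical scores for any given user.
print(np.allclose(scores[0, 1], scores[0, 2]))
```

Under this assumption, the two purple items always receive the same score, so whatever the model learns from an interaction with one purple item carries over to the other. Without any interactions, though, nothing pushes the purple embeddings of users and items towards each other, which would be consistent with the random rankings in the original question.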