Retrieval using only one feature of User

I want to build a deep recommender system that predicts movies for a given user. My dataset contains user information such as id, gender, and city (the same user can appear in several rows with different cities) and movie information such as title and genre. I can train the model on this dataset with the user information in the query tower and the movie information in the candidate tower. During retrieval, however, I only have the user's id (this user id is also in the dataset). How can I give only the user id embedding to the BruteForce layer when predicting movies? In the deep recommender model we would write brute_force = tfrs.layers.factorized_top_k.BruteForce(model.query_model.embedding_model.user_embedding), but how can the user's past locations and gender be taken into account during retrieval, since we can't pass the whole query model?

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 5

Top GitHub Comments

patrickorlando commented, Jun 19, 2022

@dexter1729

  1. Yes, it’s the trained model. You are calculating the query vector based on a different dataset, containing only one row per user.

  2. To pass multiple locations to the model, you would need to modify it to accept an array of user locations and take the average vector. Since you have no such examples during training, the results could be garbage. Alternatively, you could take the most recent location for that user. In either case you will not be capturing the true user location at inference time and are therefore introducing training-serving skew. You can still give this a try depending on your use case and requirements, but your results may vary. Again, the correct way to handle this is to serve these features to the model at inference time.
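
The two options in point 2 can be illustrated with a small sketch. It uses plain Python lists in place of a real embedding table. The names `location_vectors`, `average_location_vector`, and `most_recent_location_vector` are hypothetical and introduced only for this illustration, and the vectors are made-up values, not outputs of any trained model.

```python
from typing import Dict, List

# Hypothetical location embeddings (toy values, not from a trained model).
location_vectors: Dict[str, List[float]] = {
    "paris": [1.0, 0.0],
    "berlin": [0.0, 1.0],
    "rome": [1.0, 1.0],
}


def average_location_vector(locations: List[str]) -> List[float]:
    """Element-wise mean of the embeddings of all past locations."""
    vectors = [location_vectors[name] for name in locations]
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


def most_recent_location_vector(locations: List[str]) -> List[float]:
    """Embedding of the last location; assumes the list is ordered oldest to newest."""
    return location_vectors[locations[-1]]


# A user who appears in the dataset with two different cities.
history = ["paris", "berlin"]
averaged = average_location_vector(history)
recent = most_recent_location_vector(history)
```

Either vector would stand in for the location feature at serving time. Neither matches what the model saw during training (one location per row), which is the training-serving skew described above.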

patrickorlando commented, Jun 17, 2022

Here’s the general idea, @dexter1729. I haven’t tested this, so you may encounter some errors. As I mentioned, this removes the ability to pass different values to the query model at inference time. A user_id will be the only input.

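import tensorflow as tf
import tensorflow_recommenders as tfrs

# users_ds is a batched tf.data.Dataset with each record containing a user_id
# and the other features. Each user only appears once.
user_id_batches = []
user_vec_batches = []
for batch in users_ds:
    user_id_batches.append(batch['user_id'])
    user_vec_batches.append(model.query_model(batch))

serving_user_ids = tf.concat(user_id_batches, axis=0)
serving_user_vecs = tf.concat(user_vec_batches, axis=0)

# Use the static shape so both dimensions are plain Python ints.
num_users, vector_dim = serving_user_vecs.shape

# Maps each user_id string to its row index in serving_user_vecs.
serving_user_lookup = tf.keras.layers.StringLookup(
    vocabulary=serving_user_ids.numpy(), mask_token=None, num_oov_indices=0
)

serving_user_embedding_layer = tf.keras.layers.Embedding(num_users, vector_dim, mask_zero=False)

serving_query_model = tf.keras.Sequential([
    serving_user_lookup,
    serving_user_embedding_layer
])

# Call the model once so the embedding layer is built before its weights are replaced.
serving_query_model(serving_user_ids[:1])
serving_user_embedding_layer.set_weights([serving_user_vecs.numpy()])

# The candidates still need to be indexed (e.g. with brute_force.index_from_dataset) before querying.
brute_force = tfrs.layers.factorized_top_k.BruteForce(serving_query_model)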
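
The idea behind the snippet above is to precompute one query vector per user and then retrieve by user_id alone. A minimal sketch of that idea follows, written in plain Python without TensorFlow. The dictionaries `user_vectors` and `movie_vectors` and the helper `top_k` are hypothetical names with invented values. They stand in for the outputs of the query and candidate towers and are not part of the TFRS API.

```python
from typing import Dict, List, Tuple

# Toy stand-ins for precomputed tower outputs (invented values).
user_vectors: Dict[str, List[float]] = {
    "u1": [1.0, 0.0],
    "u2": [0.0, 1.0],
}
movie_vectors: Dict[str, List[float]] = {
    "movie_a": [0.9, 0.1],
    "movie_b": [0.2, 0.8],
    "movie_c": [0.5, 0.5],
}


def dot(a: List[float], b: List[float]) -> float:
    """Dot product, used here as the affinity score between a user and a movie."""
    return sum(x * y for x, y in zip(a, b))


def top_k(user_id: str, k: int) -> List[Tuple[str, float]]:
    """Look up the stored user vector, score every movie, and return the k best."""
    query = user_vectors[user_id]
    scored = [(title, dot(query, vec)) for title, vec in movie_vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


top_for_u1 = [title for title, _ in top_k("u1", 2)]
top_for_u2 = [title for title, _ in top_k("u2", 2)]
```

Because the user vectors are frozen when the table is built, any change in a user's features (for example a new location) only takes effect after the table is recomputed.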