Generating Predictions
This may be a stupid question about basic functionality, but…
I have n users (O(millions)) and m items (O(100k)), with very sparse interactions. Building a CSR matrix and training (in a pure CF case) is pretty straightforward. I'm interested in generating the top k (O(100)) recommendations for each user. Is there an example of how to do this? Is there an efficient batch operation? I can generate recommendations for each user individually, but, even with multithreading, generating them for everyone will take days.
Issue Analytics

- State:
- Created: 5 years ago
- Comments: 6
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
The easiest way is probably to use the `get_item_representations` and `get_user_representations` methods. They return matrices of embeddings and arrays of biases, allowing you to perform fast batch multiplications via standard numpy `dot` operations. Combined with pre-allocating the receiver (using the `out` argument of `dot`), this should give you pretty decent speed for the predictions themselves.

For sorting, `np.argpartition` may help: it does an efficient partial sort that isolates the top 100, which you can then follow with a full argsort on just those 100. Together, and parallelized using joblib or similar, this should take you close enough to C speed. If that is still too slow, you may have to distribute the work across a cluster.
This is a good lead, thanks. I think I should be able to do some manipulation and use nmslib too.