[Question]: Using other metrics such as `AUC`.
In much of the literature and guides outside of this project, AUC seems to be a popular metric for recommender systems. TensorFlow/Keras specifically has an implementation (`tf.keras.metrics.AUC`).
However, I am not sure about:
- is this a valid metric to use for retrieval when using this project?
- what hyperparameters would be important to consider?
- would using this as a `batch_metric` make sense (see the sketch below)?
Thanks in advance!
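As a concrete sketch of the `batch_metrics` idea, here is roughly what this could look like. This is hedged: it assumes a recent TensorFlow / TensorFlow Recommenders version where `tfrs.tasks.Retrieval` exposes a `batch_metrics` argument and `tf.keras.metrics.AUC` accepts `from_logits`, and the dummy candidate corpus below is made up purely for illustration.

```python
import numpy as np
import tensorflow as tf
import tensorflow_recommenders as tfrs

# Tiny dummy candidate corpus, just to make the snippet self-contained.
n_candidates, n_dim = 1000, 32
dummy_candidate_embeddings = tf.data.Dataset.from_tensor_slices(
    np.random.normal(size=(n_candidates, n_dim)).astype("float32")
).batch(128)

task = tfrs.tasks.Retrieval(
    # Standard retrieval metric: top-k accuracy over the whole candidate corpus.
    metrics=tfrs.metrics.FactorizedTopK(candidates=dummy_candidate_embeddings),
    # Batch metrics are applied to the in-batch score matrix: the labels are an
    # identity matrix (each query's own candidate is its positive) and the
    # scores are raw dot products, hence from_logits=True.
    batch_metrics=[tf.keras.metrics.AUC(from_logits=True, name="batch_auc")],
)

# The task would then be used inside a tfrs.Model's compute_loss, e.g.
# loss = task(query_embeddings, candidate_embeddings_for_batch)
```

Note that batch metrics are computed against in-batch sampled negatives only, so they are cheaper but noisier than the corpus-wide `FactorizedTopK` metric.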

Just the positives for that batch, @rlcauvin. In practice, sampling a negative that is a positive in another batch doesn't affect performance and provides some mild regularisation.
You have `candidate_ids` of shape `(batch_size, 1)` and scores of shape `(batch_size, batch_size)`. Essentially we are just creating a mask with `tf.cast(candidate_ids == tf.transpose(candidate_ids), dtype=tf.float32) - tf.eye(batch_size)`, e.g.:
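A minimal sketch of that mask, assuming `candidate_ids` is an integer tensor of shape `(batch_size, 1)` (the toy values are made up for illustration):

```python
import tensorflow as tf

batch_size = 4
# Toy candidate ids for one batch; id 7 appears twice, so one in-batch
# "negative" is actually the same item as the positive in another row.
candidate_ids = tf.constant([[3], [7], [7], [9]])  # (batch_size, 1)

# 1.0 wherever two rows share the same candidate id, 0.0 elsewhere.
same_candidate = tf.cast(
    candidate_ids == tf.transpose(candidate_ids), dtype=tf.float32
)  # (batch_size, batch_size)

# Subtract the diagonal (every row trivially matches itself) so that only the
# accidental duplicates remain; this is the mask applied to the in-batch scores.
duplicate_mask = same_candidate - tf.eye(batch_size)

print(duplicate_mask.numpy())
# [[0. 0. 0. 0.]
#  [0. 0. 1. 0.]
#  [0. 1. 0. 0.]
#  [0. 0. 0. 0.]]
```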
Hey @rlcauvin, I would start with [...]. The model should learn based on just this.
When you experimented with the ranking model, is everything else in your code kept the same? Same lookup layers, embedding layers, tf.data pipelines?
I would:
- check whether most of your tokens are being mapped to the `[UNK]` token;
- check that the shapes are correct: the raw id tensors should be only 1-dimensional, `(batch_size,)`, the query and candidate embeddings `(batch_size, n_dim)`, and the score matrix `(batch_size, batch_size)` (see the sketch below).

The shape is important, because if the query and candidate tensors have an extra dimension, the matrix multiplication will produce an incorrect result. Your loss will decrease but your model will be junk.
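A quick sketch of both checks. The layer, vocabulary, and shapes below are made up for illustration, and the score computation mirrors a plain `matmul` between query and candidate embeddings, which is an assumption about your setup:

```python
import tensorflow as tf

# --- 1. How many ids fall into the OOV / [UNK] bucket? ---
# Hypothetical lookup layer and batch of raw ids, for illustration only.
lookup = tf.keras.layers.StringLookup(vocabulary=["item_1", "item_2", "item_3"])
raw_ids = tf.constant(["item_1", "unseen_a", "item_2", "unseen_b"])  # (batch_size,)

indices = lookup(raw_ids)
# With the default num_oov_indices=1, out-of-vocabulary ids map to index 0.
oov_fraction = tf.reduce_mean(tf.cast(indices == 0, tf.float32))
print("OOV fraction:", oov_fraction.numpy())  # 0.5 in this toy batch

# --- 2. Do the embedding shapes line up? ---
batch_size, n_dim = 4, 8

# Correct: 2-D embeddings give a (batch_size, batch_size) score matrix.
query_ok = tf.random.normal((batch_size, n_dim))
candidate_ok = tf.random.normal((batch_size, n_dim))
print(tf.linalg.matmul(query_ok, candidate_ok, transpose_b=True).shape)  # (4, 4)

# Wrong: an extra axis (e.g. ids fed in as (batch_size, 1)) silently changes
# the matmul result, so the loss still runs but on a garbage score tensor.
query_bad = tf.random.normal((batch_size, 1, n_dim))
candidate_bad = tf.random.normal((batch_size, 1, n_dim))
print(tf.linalg.matmul(query_bad, candidate_bad, transpose_b=True).shape)  # (4, 1, 1)
```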