[Question]: Using other metrics such as `AUC`.

See original GitHub issue

In much of the literature and guides outside of this project, AUC seems to be a popular metric for recommender systems. TensorFlow/Keras has an implementation in tf.keras.metrics.AUC.

However, I am not sure about:

  • Is this a valid metric to use for retrieval with this project?
  • Which hyperparameters would be important to consider?
  • Would using it as a batch_metric make sense? (One possible wiring is sketched below.)

Thanks in advance!
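One way this could be wired up, as a minimal sketch: it assumes the retrieval task accepts a list of Keras metrics via a batch_metrics argument, and user_model, item_model, and candidates stand in for your own two-tower pieces (all names here are illustrative, not taken from the issue).

import tensorflow as tf
import tensorflow_recommenders as tfrs

class TwoTowerModel(tfrs.Model):
    # Hypothetical two-tower model: user_model and item_model map raw IDs to
    # embeddings of the same dimension; candidates is a dataset of candidate
    # embeddings for the FactorizedTopK metric.
    def __init__(self, user_model, item_model, candidates):
        super().__init__()
        self.user_model = user_model
        self.item_model = item_model
        self.task = tfrs.tasks.Retrieval(
            metrics=tfrs.metrics.FactorizedTopK(candidates=candidates),
            # AUC computed over the in-batch score matrix of raw dot products.
            batch_metrics=[tf.keras.metrics.AUC(from_logits=True, name="batch_auc")],
        )

    def compute_loss(self, features, training=False):
        user_embeddings = self.user_model(features["user_id"])
        item_embeddings = self.item_model(features["item_id"])
        return self.task(user_embeddings, item_embeddings)

If the batch labels are the in-batch identity matrix (one positive candidate per query row), this AUC measures how often a query scores its own candidate above the other in-batch candidates, making it a sampled ranking measure rather than a full-corpus one like FactorizedTopK.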

Issue Analytics

  • State: open
  • Created a year ago
  • Reactions: 1
  • Comments: 15

Top GitHub Comments

3 reactions
patrickorlando commented, May 11, 2022

Just the positives for that batch, @rlcauvin. In practice, sampling a negative that is a positive in another batch doesn't affect performance and provides some mild regularisation.

You have candidate_ids of shape (batch_size, 1) and scores of shape (batch_size, batch_size). Essentially we are just creating a mask: tf.cast(candidate_ids == tf.transpose(candidate_ids), dtype=tf.float32) - tf.eye(batch_size). For example:

candidate_ids = [[0], [1], [2], [0], [3], [4], [3], [5]]
mask = [
  [0, 0, 0, 1, 0, 0, 0, 0], # 0
  [0, 0, 0, 0, 0, 0, 0, 0], # 1
  [0, 0, 0, 0, 0, 0, 0, 0], # 2
  [1, 0, 0, 0, 0, 0, 0, 0], # 0 
  [0, 0, 0, 0, 0, 0, 1, 0], # 3
  [0, 0, 0, 0, 0, 0, 0, 0], # 4
  [0, 0, 0, 0, 1, 0, 0, 0], # 3
  [0, 0, 0, 0, 0, 0, 0, 0]  # 5
]
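A self-contained version of that computation, plus one common way of applying the mask to the in-batch scores (the random scores tensor and the -1e9 trick are illustrative, not necessarily what the library does internally):

import tensorflow as tf

candidate_ids = tf.constant([[0], [1], [2], [0], [3], [4], [3], [5]])  # (batch_size, 1)
batch_size = tf.shape(candidate_ids)[0]

# 1.0 wherever two *different* rows of the batch share the same candidate id.
same_id = tf.cast(candidate_ids == tf.transpose(candidate_ids), tf.float32)
mask = same_id - tf.eye(batch_size)

# Example in-batch score matrix; pushing accidental duplicates to a very
# negative value effectively removes them as negatives.
scores = tf.random.normal((8, 8))
masked_scores = scores - mask * 1e9
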
2 reactions
patrickorlando commented, May 16, 2022

Hey @rlcauvin, I would start with

just using user IDs and item IDs

The model should learn based on just this.

When you experimented with the ranking model, was everything else in your code kept the same? Same lookup layers, embedding layers, tf.data pipelines?

I would:

  1. Ensure that the lookups are working as expected: take a few examples and manually pass them through. Are any items being mapped to the [UNK] token (index 1)? Is the shape correct? The outputs should be one-dimensional, (batch_size,).
  2. Pass them through the embedding layers. Is each row different? Is the shape correct, (batch_size, n_dim)?
  3. Do the matrix multiplication. Are the scores different? Do you get a shape of (batch_size, batch_size)?

The shape is important: if the query and candidate tensors have an extra dimension, the matrix multiplication will produce an incorrect result. Your loss will still decrease, but your model will be junk. (The three checks above are sketched in code below.)
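A rough sketch of those three checks, using hypothetical vocabularies, layer names, and dimensions (adjust the StringLookup settings to your own preprocessing; with mask_token=None as below, out-of-vocabulary IDs map to index 0 rather than 1):

import tensorflow as tf

user_ids = ["u1", "u2", "u3"]                  # hypothetical vocabularies
item_ids = ["i1", "i2", "i3", "i4"]
n_dim = 32

user_lookup = tf.keras.layers.StringLookup(vocabulary=user_ids, mask_token=None)
item_lookup = tf.keras.layers.StringLookup(vocabulary=item_ids, mask_token=None)
user_embedding = tf.keras.layers.Embedding(len(user_ids) + 1, n_dim)
item_embedding = tf.keras.layers.Embedding(len(item_ids) + 1, n_dim)

batch_users = tf.constant(["u1", "u3", "u2"])
batch_items = tf.constant(["i4", "i1", "i9"])  # "i9" is out of vocabulary

# 1. Lookups: shape should be (batch_size,); check how many IDs hit the OOV bucket.
user_idx = user_lookup(batch_users)
item_idx = item_lookup(batch_items)
print(user_idx.shape, item_idx.shape)   # (3,) (3,)
print(item_idx.numpy())                 # the unknown item maps to the OOV index

# 2. Embeddings: shape should be (batch_size, n_dim) and rows should differ.
user_emb = user_embedding(user_idx)
item_emb = item_embedding(item_idx)
print(user_emb.shape, item_emb.shape)   # (3, 32) (3, 32)

# 3. Scores: the matrix multiplication should give (batch_size, batch_size).
scores = tf.matmul(user_emb, item_emb, transpose_b=True)
print(scores.shape)                     # (3, 3)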

