
Cross-Encoder outputs values greater than 1.0


According to https://sbert.net/examples/applications/retrieve_rerank/README.html#re-ranker-cross-encoder, the cross-encoder "outputs a single score between 0 and 1". I do get such scores with some underlying models (transformers==4.6.1, sentence-transformers==1.2.0):

from sentence_transformers import CrossEncoder

cross_encoder_model = CrossEncoder('cross-encoder/stsb-TinyBERT-L-4')  # score: 0.95060706

But for:

cross_encoder_model = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-4-v2')
query_anchor_pairs = [
    ['i got the client VERY drunk', 'i got the client drunk'],
]
ce_scores = cross_encoder_model.predict(query_anchor_pairs)
print(ce_scores)  # 8.14593

I get scores much greater than 1. What am I doing wrong? By the way, I am getting very good results with both models, especially the latter (Retrieve & Re-Rank). Thank you, Nils.

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
nreimers commented, Jun 15, 2021

Hi @niebb, thanks for pointing this out. The documentation there is outdated.

Previously, a sigmoid was applied on top of the logits score, i.e. the output was sigmoid(logits). This gives scores between 0 and 1.

The new cross-encoders for MS MARCO output the logits directly, hence they can be below 0 or above 1. For re-ranking, this does not make any difference. If you like, you can call sigmoid() on top of these values to get back a score between 0 and 1.

But as mentioned, for re-ranking it does not make any difference.
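To illustrate the suggestion above, here is a minimal plain-Python sketch of applying a sigmoid to the raw logits (using the 8.14593 logit from the issue as sample input; no sentence-transformers call is made here):

```python
import math

def sigmoid(x: float) -> float:
    """Squash a raw logit into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-x))

# Raw logit reported in the issue for cross-encoder/ms-marco-MiniLM-L-4-v2
raw_score = 8.14593
print(sigmoid(raw_score))  # a value close to 1.0

# sigmoid is strictly increasing, so it never changes the ranking order:
logits = [8.14593, -2.3, 0.5]
assert sorted(logits) == sorted(logits, key=sigmoid)
```

Because the sigmoid is monotonic, passages keep the same relative order under it, which is why the raw logits work equally well for re-ranking.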

0 reactions
nreimers commented, Aug 13, 2021

You can check the MSMARCO examples for the cross-encoder.

In principle you need pairs of (query, passage) with labels: 0 for not relevant and 1 for relevant.
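A minimal sketch of what such labeled pairs could look like (the queries and passages below are made-up illustrations, not actual MS MARCO data; with sentence-transformers installed, they would typically be wrapped as InputExample objects before training):

```python
# Hypothetical (query, passage, label) triples: label 1 = relevant, 0 = not relevant.
train_samples = [
    ("what is the capital of france", "Paris is the capital of France.", 1),
    ("what is the capital of france", "The Nile flows through Egypt.", 0),
    ("how tall is mount everest", "Mount Everest is 8,849 m high.", 1),
]

# With sentence-transformers available, the usual wrapping would be (sketch):
#   from sentence_transformers import InputExample
#   examples = [InputExample(texts=[q, p], label=float(l)) for q, p, l in train_samples]

for query, passage, label in train_samples:
    print(f"{label}  {query!r} -> {passage!r}")
```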


