Cross-Encoder outputs values greater than 1.0
According to https://sbert.net/examples/applications/retrieve_rerank/README.html#re-ranker-cross-encoder, the cross-encoder “outputs a single score between 0 and 1”. I do get such scores with some underlying models (transformers==4.6.1, sentence-transformers==1.2.0):
cross_encoder_model = CrossEncoder('cross-encoder/stsb-TinyBERT-L-4') # 0.95060706
But for:
cross_encoder_model = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-4-v2')
query_anchor_pairs = [
['i got the client VERY drunk', 'i got the client drunk'],
]
ce_scores = cross_encoder_model.predict(query_anchor_pairs)
print(ce_scores) # 8.14593
I get scores much greater than 1. What am I doing wrong? Btw, I am getting very good results with either model, especially the latter (Retrieve & Re-Rank). Thank you, Nils.
Issue Analytics
- State:
- Created: 2 years ago
- Reactions: 1
- Comments: 5 (3 by maintainers)
Hi @niebb, thanks for pointing this out. The documentation there is outdated.
Previously, a sigmoid was applied on top of the logits score, i.e. the output was sigmoid(logits). This gives scores between 0 and 1.
The new cross-encoders for MS MARCO output the logits directly, hence they can be below 0 or above 1. For re-ranking, this does not make any difference: the sigmoid is strictly monotonic, so applying it does not change the order of the results. If you like, you can call sigmoid() on top of these values to get back a score between 0 and 1.
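A small sketch of this point, using plain Python (no model needed). The 8.14593 logit is the score reported in the question above; the other logits are invented for illustration:

```python
import math

def sigmoid(x: float) -> float:
    # Squash a raw logit into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

# Assumed raw logits from a cross-encoder; 8.14593 is the score
# reported above for cross-encoder/ms-marco-MiniLM-L-4-v2.
logits = [8.14593, 0.0, -2.7, 4.1]

scores = [sigmoid(x) for x in logits]
print(scores)  # every value now lies between 0 and 1

# sigmoid is strictly monotonic, so ranking by raw logits and ranking
# by sigmoid(logits) produce the same order.
assert sorted(logits, reverse=True) == sorted(logits, key=sigmoid, reverse=True)
```

Because the ordering is identical, normalizing with sigmoid is purely cosmetic for re-ranking; it only matters if you need scores interpretable as values between 0 and 1.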
You can check the MS MARCO examples for the cross-encoder. In principle, you need some pairs (query, passage) with label 0 (not relevant) or 1 (relevant).
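As a sketch of what such training pairs could look like — the queries and passages here are invented, not taken from MS MARCO, and the `InputExample` usage noted in the comment follows the sentence-transformers training API:

```python
# Hypothetical (query, passage, label) training triples in the style
# used for the MS MARCO cross-encoder examples. Texts are invented.
train_samples = [
    ("how does a cross-encoder work",
     "A cross-encoder feeds both sentences through the transformer jointly "
     "and outputs a single relevance score.",
     1.0),  # relevant
    ("how does a cross-encoder work",
     "The Eiffel Tower was completed in 1889.",
     0.0),  # not relevant
]

# For actual training, these would be wrapped for CrossEncoder.fit, e.g.:
#   from sentence_transformers import InputExample
#   examples = [InputExample(texts=[q, p], label=l) for q, p, l in train_samples]
# (kept as a comment so this sketch runs without the library installed)
for query, passage, label in train_samples:
    assert label in (0.0, 1.0), "binary relevance labels only"
    print(f"label={label}: {query!r} -> {passage[:40]!r}")
```

The key design point is that each query appears with both relevant and non-relevant passages, so the model learns to separate the two classes rather than memorizing queries.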