
Errors in cross-encoder inference

See original GitHub issue

Hi

I fine-tuned a cross-encoder model, starting from one of the Hugging Face models (link), on the STS dataset using your training script. When I load the model with the command below, it shows the following warning.

model = CrossEncoder('lordtt13/COVID-SciBERT', num_labels=1)

Some weights of the model checkpoint at lordtt13/COVID-SciBERT were not used when initializing BertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.predictions.decoder.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at lordtt13/COVID-SciBERT and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Now, when I use the model after training:

  1. Inference is noticeably slower than with the cross-encoder models provided by sentence-transformers.
  2. It raises the following error for some longer input pairs (a possible workaround is sketched after this list): RuntimeError: The size of tensor a (535) must match the size of tensor b (512) at non-singleton dimension 1
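
The mismatch between 535 and 512 suggests that pairs longer than the model's 512-token limit reach the encoder without truncation. A minimal sketch of capping the input length via the max_length parameter of sentence-transformers' CrossEncoder (the model name is taken from the report above; the input pair is invented):

from sentence_transformers import CrossEncoder

# Truncate tokenized pairs at BERT's positional-embedding limit (512 tokens)
# so that long inputs no longer trigger the tensor-size mismatch.
model = CrossEncoder('lordtt13/COVID-SciBERT', num_labels=1, max_length=512)

scores = model.predict([
    ('a very long query ...', 'a very long passage ...'),
])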

Could you please tell me why this is happening, or whether I am missing something?

Many thanks,
Iknoor

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 11 (5 by maintainers)

Top GitHub Comments

2 reactions
nreimers commented, Nov 24, 2020

Hi @iknoorjobs
The NBoost models have the issue that they use two labels (relevant and not relevant). If you want to use such a model as a base, you need binary labels, i.e. int(0) and int(1).
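
For illustration, a minimal sketch of cross-encoder training with binary labels using the sentence-transformers API (the base model, example pairs, and hyperparameters are placeholders):

from torch.utils.data import DataLoader
from sentence_transformers import CrossEncoder, InputExample

# Each pair carries a binary relevance label: 1 = relevant, 0 = not relevant.
train_samples = [
    InputExample(texts=['what causes fever', 'Fever is often caused by infection.'], label=1),
    InputExample(texts=['what causes fever', 'The stock market closed higher today.'], label=0),
]

train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=16)

# num_labels=1 gives a single relevance score per pair.
model = CrossEncoder('bert-base-uncased', num_labels=1)
model.fit(train_dataloader=train_dataloader, epochs=1, warmup_steps=100)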

I will release improved cross-encoder models for MS MARCO today that:

  1. Are quicker than the NBoost models
  2. Achieve better performance on the MS MARCO & TREC DL 2019 datasets
  3. Use only a single output to indicate whether query and passage are relevant (see the usage sketch below)
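
A single-output cross-encoder of this kind is typically used as sketched here; the model name is only an example of the MS MARCO cross-encoders later published on the Hugging Face hub:

from sentence_transformers import CrossEncoder

# Example model name; any single-output (num_labels=1) cross-encoder behaves the same.
model = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')

# One relevance score per (query, passage) pair; higher means more relevant.
scores = model.predict([
    ('how long do flu symptoms last', 'Flu symptoms usually last about a week.'),
    ('how long do flu symptoms last', 'The Eiffel Tower is in Paris.'),
])
print(scores)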

1 reaction
nreimers commented, Nov 24, 2020

@iknoorjobs The models are now online: https://github.com/UKPLab/sentence-transformers/tree/master/examples/applications/information-retrieval

“Can your cross-encoder training scripts be used to train the model if I have a dataset with binary labels?” Yes. The MS MARCO dataset has only binary labels (relevant or not relevant), which were encoded as 1 and 0.

It will also work if you have more fine-grained labels, like 0, 0.5, 0.8, and 1, as sketched below.
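
A short sketch of the same training setup with graded relevance labels (the pairs and scores are invented); training then proceeds exactly as in the binary example above:

from sentence_transformers import InputExample

# Graded relevance: with a single-output head, any float label in [0, 1] works.
train_samples = [
    InputExample(texts=['query', 'highly relevant passage'], label=1.0),
    InputExample(texts=['query', 'partially relevant passage'], label=0.8),
    InputExample(texts=['query', 'loosely related passage'], label=0.5),
    InputExample(texts=['query', 'irrelevant passage'], label=0.0),
]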

