
DPR usage of BertPooler


Environment info

  • transformers version: 4.8.2
  • Platform: Linux-5.8.0-50-generic-x86_64-with-debian-bullseye-sid
  • Python version: 3.7.4
  • PyTorch version (GPU?): 1.5.1 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed

Who can help

RAG, DPR: @patrickvonplaten, @lhoestq

Information

DPR initializes BertModel with a BertPooler module that is never used.

Although this seems consistent with the original implementation, it is confusing for the user: one would expect the pooled output to come from the BertPooler module, if it is present, rather than directly from the last layer of the model. Moreover, the unused pooler wastes memory and compute.
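
To make the behavior concrete, here is a minimal sketch (assuming transformers v4.8.2 and the published facebook/dpr-ctx_encoder-single-nq-base checkpoint) showing that the encoder still builds a pooler even though the embedding DPR returns comes from the last hidden state:

```python
import torch
from transformers import DPRContextEncoder, DPRContextEncoderTokenizer

tokenizer = DPRContextEncoderTokenizer.from_pretrained(
    "facebook/dpr-ctx_encoder-single-nq-base"
)
model = DPRContextEncoder.from_pretrained(
    "facebook/dpr-ctx_encoder-single-nq-base"
)

# In v4.8.2 the underlying BertModel still instantiates a BertPooler...
print(model.ctx_encoder.bert_model.pooler)  # BertPooler(...)

inputs = tokenizer("Some passage text.", return_tensors="pt")
with torch.no_grad():
    # ...but the embedding DPR returns is the [CLS] vector of the last
    # hidden state; the pooler's output is discarded.
    embedding = model(**inputs).pooler_output
print(embedding.shape)  # torch.Size([1, 768])
```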

How to fix

Simply pass the add_pooling_layer=False flag in https://github.com/huggingface/transformers/blob/master/src/transformers/models/dpr/modeling_dpr.py#L178. Some other parts of the code also need to be fixed; for example, https://github.com/huggingface/transformers/blob/master/src/transformers/models/dpr/modeling_dpr.py#L205 should read sequence_output = outputs[0].
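
A rough stand-alone sketch of what the fixed pooling would look like (a toy illustration, not the actual patch to modeling_dpr.py):

```python
import torch
from transformers import BertConfig, BertModel

config = BertConfig()
# Proposed: don't build the BertPooler at all.
bert_model = BertModel(config, add_pooling_layer=False)

input_ids = torch.tensor([[101, 2023, 2003, 1037, 3231, 102]])  # toy input
outputs = bert_model(input_ids=input_ids)

sequence_output = outputs[0]              # last hidden state, (1, seq_len, 768)
pooled_output = sequence_output[:, 0, :]  # [CLS] vector: what DPR actually uses
```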

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 7 (6 by maintainers)

Top GitHub Comments

1 reaction
PaulLerner commented, Jan 5, 2022

Ok, I’ll let you know. I’m quite busy atm.

1 reaction
PaulLerner commented, Nov 25, 2021

DPR has an optional projection layer in the original implementation but it is only applied on the sequence output, not on BertPooler’s output.
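
For illustration, a minimal stand-alone sketch of that pooling logic; encode_proj and projection_dim are hypothetical stand-ins mirroring the names in modeling_dpr.py:

```python
import torch
import torch.nn as nn

hidden_size, projection_dim = 768, 128
encode_proj = nn.Linear(hidden_size, projection_dim)  # optional projection

def dpr_pool(sequence_output: torch.Tensor) -> torch.Tensor:
    pooled = sequence_output[:, 0, :]  # [CLS] vector from the sequence output
    # The projection is applied here, to the sequence output's [CLS] vector,
    # never to BertPooler's output.
    return encode_proj(pooled)

sequence_output = torch.randn(1, 6, hidden_size)
print(dpr_pool(sequence_output).shape)  # torch.Size([1, 128])
```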
