DPR usage of BertPooler
See original GitHub issue

Environment info
- transformers version: 4.8.2
- Platform: Linux-5.8.0-50-generic-x86_64-with-debian-bullseye-sid
- Python version: 3.7.4
- PyTorch version (GPU?): 1.5.1 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
Who can help
RAG, DPR: @patrickvonplaten, @lhoestq
Information
DPR initializes BertModel with a BertPooler module that is never used. Although this seems consistent with the original implementation, it is confusing for the user: one would expect the pooled_output to come from the BertPooler module when it is present, rather than from the last layer of the model. It also wastes memory and compute.
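For context, here is a minimal sketch of the current behaviour, assuming the public facebook/dpr-question_encoder-single-nq-base checkpoint (whose projection_dim is 0): the embedding DPR returns under the name pooler_output is just the [CLS] slice of the last hidden layer, while the BertPooler weights sit unused.

```python
import torch
from transformers import DPRQuestionEncoder, DPRQuestionEncoderTokenizer

# Load a public DPR checkpoint (projection_dim is 0 here, so the returned
# embedding is not projected).
tokenizer = DPRQuestionEncoderTokenizer.from_pretrained(
    "facebook/dpr-question_encoder-single-nq-base"
)
model = DPRQuestionEncoder.from_pretrained(
    "facebook/dpr-question_encoder-single-nq-base"
)

inputs = tokenizer("Where is the Eiffel Tower?", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# The embedding DPR returns is the [CLS] slice of the last hidden layer;
# the BertPooler that was initialized alongside it never contributes.
cls_embedding = out.hidden_states[-1][:, 0, :]
print(torch.allclose(out.pooler_output, cls_embedding))  # True
```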
How to fix
Simply pass the add_pooling_layer=False flag when instantiating BertModel in https://github.com/huggingface/transformers/blob/master/src/transformers/models/dpr/modeling_dpr.py#L178
Some other parts of the code then need to be adjusted as well; for example, https://github.com/huggingface/transformers/blob/master/src/transformers/models/dpr/modeling_dpr.py#L205 should become sequence_output = outputs[0]. A sketch of the resulting behaviour is given below.
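The following is a minimal sketch of what the proposed change does, using a tiny randomly initialized BertModel rather than DPR itself (so it only illustrates the flag, not the actual patch to modeling_dpr.py): with add_pooling_layer=False the BertPooler is never created, and the sequence output is read from outputs[0].

```python
import torch
from transformers import BertConfig, BertModel

# Tiny config so the example runs quickly; weights are random.
config = BertConfig(
    num_hidden_layers=2, hidden_size=64, num_attention_heads=2, intermediate_size=128
)
# add_pooling_layer=False is the flag suggested for modeling_dpr.py#L178.
bert_without_pooler = BertModel(config, add_pooling_layer=False)

input_ids = torch.tensor([[101, 2054, 2003, 102]])  # toy input ids
outputs = bert_without_pooler(input_ids, return_dict=False)

sequence_output = outputs[0]              # last hidden states, shape (1, 4, 64)
pooled_output = sequence_output[:, 0, :]  # what DPR actually uses as its embedding
print(bert_without_pooler.pooler)         # None: no unused parameters allocated
print(sequence_output.shape, pooled_output.shape)
```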
Issue Analytics
- Created: 2 years ago
- Comments: 7 (6 by maintainers)
Ok, I’ll let you know. I’m quite busy atm.
DPR has an optional projection layer in the original implementation, but it is only applied to the sequence output, not to BertPooler's output.
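To make that concrete, here is a standalone sketch (not DPR's actual classes, just the shapes involved) of how such a projection behaves when it is configured: it acts on the [CLS] slice of the sequence output, never on the BertPooler output, which is consistent with dropping the pooler entirely.

```python
import torch
from torch import nn

# Hypothetical sizes for illustration only.
hidden_size, projection_dim = 768, 128
encode_proj = nn.Linear(hidden_size, projection_dim)

sequence_output = torch.randn(2, 16, hidden_size)  # (batch, seq_len, hidden)
pooled_output = sequence_output[:, 0, :]           # [CLS] token from the last layer
pooled_output = encode_proj(pooled_output)         # optional projection
print(pooled_output.shape)                         # torch.Size([2, 128])
```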