
Pooler weights not being updated for Multiple Choice models?

See original GitHub issue

I’m trying to use a pretrained BERT model to fine-tune on a multiple choice dataset.

The parameters from the pooler are excluded from the optimizer params here; however, the MultipleChoice model does indeed use pooled_output (which passes through the pooler) here.

I wasn’t able to find a similar exclusion of pooler params from the optimizer in the official repo. I think I’m missing something here. Thanks for your patience.
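For reference, the grouping logic at issue looked roughly like the sketch below (reconstructed from memory of the pytorch-pretrained-bert examples, not a verbatim copy of run_swag.py); the filter that drops any parameter whose name contains "pooler" is the line in question:

```python
# Rough sketch of the optimizer parameter grouping in run_swag.py
# (illustrative, not a verbatim copy). The "pooler" filter below removes
# the pooler weights from the optimizer, so they are never updated, even
# though BertForMultipleChoice feeds pooled_output into its classifier.
param_optimizer = list(model.named_parameters())
param_optimizer = [n for n in param_optimizer if "pooler" not in n[0]]  # <-- the exclusion

no_decay = ["bias", "LayerNorm.bias", "LayerNorm.weight"]
optimizer_grouped_parameters = [
    {"params": [p for n, p in param_optimizer if not any(nd in n for nd in no_decay)],
     "weight_decay": 0.01},
    {"params": [p for n, p in param_optimizer if any(nd in n for nd in no_decay)],
     "weight_decay": 0.0},
]
```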

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

1 reaction
thomwolf commented, Apr 11, 2019

Indeed this looks like a bug in the run_swag.py example. What do you think @rodgzilla? Isn’t the exclusion of the pooler parameters from optimization (line 392 of run_swag.py) a typo?

0 reactions
meetps commented, Jun 12, 2019

Fixed in #675.
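Once the pooler exclusion is removed, a quick way to confirm the pooler weights are actually being updated is to compare them across a training step. A minimal sketch, assuming a BertForMultipleChoice model; `train_step(model, batch)` is a hypothetical helper standing in for the forward/backward/optimizer.step() loop:

```python
import torch

# Sanity check: snapshot the pooler weight, run one training step, and
# verify that the weight changed.
before = model.bert.pooler.dense.weight.detach().clone()
train_step(model, batch)  # hypothetical: forward, backward, optimizer.step()
after = model.bert.pooler.dense.weight.detach()
print("pooler updated:", not torch.equal(before, after))
```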

Read more comments on GitHub >

