Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

`activation_dropout` in OPT is never used

See original GitHub issue

System Info

main

Who can help?

@patil-suraj, @patrickvonplaten, @LysandreJik

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, …)
  • My own task or dataset (give details below)

Reproduction

https://github.com/huggingface/transformers/blob/ee67e7ad4fd7a766891b68f708cf03e30f609976/src/transformers/models/opt/modeling_opt.py#L279

`activation_dropout` in modeling_opt.py is assigned but never used. A model would not behave as expected if one initializes it randomly while setting `activation_dropout` to a non-zero value.
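
A quick way to observe the symptom (a sketch, not from the original report): disable every dropout that does work in a small, randomly initialized OPT model, set `activation_dropout` high, and compare two training-mode forward passes. With the bug present they come out identical, because the value is never applied.

```python
# Sketch reproducing the symptom (assumes the state of main at the time
# of the report). With activation_dropout as the only non-zero dropout,
# two forward passes in training mode should differ -- but they match.
import torch
from transformers import OPTConfig, OPTModel

config = OPTConfig(
    hidden_size=32,
    num_hidden_layers=2,
    ffn_dim=64,
    num_attention_heads=4,
    dropout=0.0,             # disable the dropouts that are applied
    attention_dropout=0.0,
    layerdrop=0.0,
    activation_dropout=0.9,  # should randomize the FFN activations
)
model = OPTModel(config).train()  # dropout is only active in training mode

input_ids = torch.randint(0, config.vocab_size, (1, 8))
out1 = model(input_ids).last_hidden_state
out2 = model(input_ids).last_hidden_state

# True while the bug is present (activation_dropout silently ignored);
# False once the value is actually applied in the feed-forward block.
print(torch.allclose(out1, out2))
```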

Expected behavior

`activation_dropout` should either be used or removed.
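
For reference, here is a sketch of what the "used" option would look like (illustrative only, assuming the BART-style decoder layer structure that OPT follows; this is not the merged patch): activation dropout conventionally sits between the activation function and the second feed-forward projection.

```python
# Illustrative sketch (not the merged patch): a BART-style feed-forward
# block in which activation_dropout is applied right after the activation,
# which is what the config value suggests should happen.
import torch.nn as nn

class FeedForward(nn.Module):
    def __init__(self, hidden_size, ffn_dim, dropout, activation_dropout):
        super().__init__()
        self.fc1 = nn.Linear(hidden_size, ffn_dim)
        self.fc2 = nn.Linear(ffn_dim, hidden_size)
        self.activation_fn = nn.ReLU()  # OPT defaults to ReLU
        self.dropout = dropout
        self.activation_dropout = activation_dropout

    def forward(self, hidden_states):
        hidden_states = self.activation_fn(self.fc1(hidden_states))
        # The step missing from modeling_opt.py at the time of the report:
        hidden_states = nn.functional.dropout(
            hidden_states, p=self.activation_dropout, training=self.training
        )
        hidden_states = self.fc2(hidden_states)
        hidden_states = nn.functional.dropout(
            hidden_states, p=self.dropout, training=self.training
        )
        return hidden_states
```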

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 8 (7 by maintainers)

Top GitHub Comments

2 reactions
shijie-wu commented, Jul 28, 2022

I’m happy to contribute if removing is what we want 😊

1 reaction
ArthurZucker commented, Sep 12, 2022

Gonna merge it to main 🥳

Read more comments on GitHub >

Top Results From Across the Web

Dropout Regularization in Deep Learning Models with Keras
Dropout is only used during the training of a model and is not used when evaluating the skill of the model.
Read more >
Where should I place dropout layers in a neural network?
Dropout was used after the activation function of each convolutional layer: CONV->RELU->DROP. So should they be placed after all layers, or only the...
Read more >
Dropout behavior in Keras with rate=1 (dropping all input units ...
The Dropout layer simply doesn't do anything when rate is set to 1 (or 0, see here). I guess it's because the scaling...
Read more >
InvalidArgumentError: No OpKernel was registered to ... - GitHub
I am a macintosh user. Code: `import tensorflow as tf; from tensorflow.keras.models import Sequential; from tensorflow.keras.layers import Dense, Dropout, ...`
Read more >
Dropout and Batch Normalization - Kaggle
It seems that batch normalization can be used at almost any point in a network. You can put it after a layer... layers.Dense(16,...
Read more >
