Can't freeze pre-trained params
Thanks for releasing this repo, but I ran into a problem: I can't freeze the pre-trained params. I use the following code to freeze them, but it didn't work.

```python
import os

import bert
from tensorflow import keras

model_dir = "D:/ProgramData/Pre_Traines_Model_Of_Bert/chinese_L-12_H-768_A-12"
bert_params = bert.params_from_pretrained_ckpt(model_dir)
l_bert = bert.BertModelLayer.from_params(bert_params, name="bert")
l_bert.apply_adapter_freeze()

max_seq_len = 128
l_input_ids = keras.layers.Input(shape=(max_seq_len,), dtype='int32')
# l_token_type_ids = keras.layers.Input(shape=(max_seq_len,), dtype='int32')
# using the default token_type/segment id 0
output = l_bert(l_input_ids)  # output: [batch_size, max_seq_len, hidden_size]

model = keras.Model(inputs=l_input_ids, outputs=output)
model.build(input_shape=(None, max_seq_len))

bert.loader.load_stock_weights(l_bert, os.path.join(model_dir, "bert_model.ckpt"))
model.summary()
```
And I get the following result:

As you can see, all params are trainable. Could you help? Thanks! 😃
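One way to confirm what Keras actually treats as trainable is to count the weights on the built model. A minimal sketch, assuming the `model` object from the snippet above:

```python
import numpy as np
from tensorflow import keras

# Sum the sizes of the weight tensors Keras will (and will not) update during training.
trainable = int(np.sum([keras.backend.count_params(w) for w in model.trainable_weights]))
frozen = int(np.sum([keras.backend.count_params(w) for w in model.non_trainable_weights]))
print(f"trainable params: {trainable:,}  non-trainable params: {frozen:,}")
```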
Issue Analytics
- Created 4 years ago
- Comments: 5 (2 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
You may set it with `l_bert.trainable = False`, but it's just a wild guess; I don't know what `apply_adapter_freeze()` does.
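A minimal sketch of that suggestion, assuming the `l_bert` and `model` objects from the original snippet (the optimizer and loss below are placeholders):

```python
# Freeze the entire BERT layer so none of its weights are updated during training.
l_bert.trainable = False

# Changes to `trainable` take effect when the model is (re)compiled.
model.compile(optimizer=keras.optimizers.Adam(1e-5), loss="mse")
model.summary()  # the BERT weights should now be listed under "Non-trainable params"
```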
@ck37 - `apply_adapter_freeze()` is used together with the `adapter_size` parameter, i.e. it will freeze all but the adapter layers for training; in case you use plain BERT (i.e. without adapter layers), the method has no effect. To freeze your layers, you might use the standard Keras mechanisms, just like @ptamas88 has suggested!
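A sketch of the standard Keras mechanism the comment refers to: mark the pre-trained layer as non-trainable and recompile. It assumes the `model` from the original snippet, where the BERT layer was given the name `"bert"` (optimizer and loss are placeholders):

```python
# Freeze the pre-trained BERT layer by name; any layers added on top stay trainable.
for layer in model.layers:
    if layer.name == "bert":
        layer.trainable = False

# The new trainable/non-trainable split is picked up at compile time.
model.compile(optimizer=keras.optimizers.Adam(1e-5), loss="mse")

# Sanity check: the frozen weights should now appear in non_trainable_weights.
print(len(model.trainable_weights), "trainable tensors,",
      len(model.non_trainable_weights), "frozen tensors")
```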