
Convert our checkpoint colabs into runnable scripts

The colabs we currently have in tools/checkpoint_conversion are useful in that we don’t lose the code for converting checkpoints. But they are fairly unwieldy: they must be pointed at the specific branch used for model development, they run to hundreds of lines of code, and we need one for each model variant.

Instead, we could write one script per model that handles checkpoint conversion (perhaps with a flag to select the model variant?). A potential file structure:

tools
└── checkpoint_conversion
    ├── README.md
    ├── convert_bert_weights.py
    ├── convert_gpt2_weights.py
    └── requirements.txt

This will make it much easier to re-run and test checkpoint conversion code in the future.
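As a rough sketch, each script could expose the variant as a command-line flag and keep the download and conversion steps as plain functions. The function and flag names below are illustrative, not an existing API:

import argparse


def download_model(preset):
    """Fetch the original checkpoint files for the given variant."""
    ...


def convert_checkpoints(preset):
    """Port the original weights into the equivalent KerasNLP model."""
    ...


def main():
    parser = argparse.ArgumentParser(description="Convert BERT checkpoints.")
    parser.add_argument("--preset", required=True, help="Model variant to convert.")
    args = parser.parse_args()
    download_model(args.preset)
    convert_checkpoints(args.preset)


if __name__ == "__main__":
    main()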


Top GitHub Comments

abheesht17 commented on Dec 3, 2022 (2 reactions)

@vulkomilev, please go ahead with writing the conversion script for BERT! You can follow the same template as RoBERTa’s script: https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/convert_roberta_checkpoints.py.

abheesht17 commented on Dec 6, 2022 (0 reactions)

@vulkomilev, KerasNLP does not have a separate BertBase class; there is a single BertBackbone model class: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/bert/bert_backbone.py#L35. If you want the base variant of BERT, you can do this:

# w/o loading the weights
bert_base = keras_nlp.models.BertBackbone.from_preset("bert_base_uncased_en", load_weights=False)

# loading the model with the pretrained weights
bert_base = keras_nlp.models.BertBackbone.from_preset("bert_base_uncased_en", load_weights=True)

These “presets” are drawn from here: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/bert/bert_presets.py.

Regarding checkpoint conversion for BERT, follow the same format as RoBERTa. Use the conversion notebooks mentioned in this directory as reference: https://github.com/keras-team/keras-nlp/tree/master/tools/checkpoint_conversion.

So, for example, the contents of this cell:

# Model garden BERT paths.
zip_path = f"""https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/{TOKEN_TYPE}_L-12_H-768_A-12.tar.gz"""
zip_file = keras.utils.get_file(
    f"""/content/{MODEL_NAME}""",
    zip_path,
    extract=True,
    archive_format="tar",
)

can go in the download_model() function.
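For instance, a minimal sketch of that function, assuming it takes the token type and model name as parameters (the template's exact signature may differ, and the L-12_H-768_A-12 size suffix would need to vary per variant):

import keras


def download_model(token_type, model_name):
    # Model Garden BERT checkpoint archive for the base size.
    zip_path = (
        "https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/"
        f"{token_type}_L-12_H-768_A-12.tar.gz"
    )
    return keras.utils.get_file(
        model_name,
        zip_path,
        extract=True,
        archive_format="tar",
    )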

Contents of this cell:

model.get_layer("token_embedding").embeddings.assign(
    weights["encoder/layer_with_weights-0/embeddings/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("position_embedding").position_embeddings.assign(
    weights["encoder/layer_with_weights-1/embeddings/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("segment_embedding").embeddings.assign(
    weights["encoder/layer_with_weights-2/embeddings/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("embeddings_layer_norm").gamma.assign(
    weights["encoder/layer_with_weights-3/gamma/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("embeddings_layer_norm").beta.assign(
    weights["encoder/layer_with_weights-3/beta/.ATTRIBUTES/VARIABLE_VALUE"]
)

for i in range(model.num_layers):
    model.get_layer(f"transformer_layer_{i}")._self_attention_layer._key_dense.kernel.assign(
        weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_key_dense/kernel/.ATTRIBUTES/VARIABLE_VALUE"]
    )
    model.get_layer(f"transformer_layer_{i}")._self_attention_layer._key_dense.bias.assign(
        weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_key_dense/bias/.ATTRIBUTES/VARIABLE_VALUE"]
    )
    model.get_layer(f"transformer_layer_{i}")._self_attention_layer._query_dense.kernel.assign(
        weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_query_dense/kernel/.ATTRIBUTES/VARIABLE_VALUE"]
    )
    model.get_layer(f"transformer_layer_{i}")._self_attention_layer._query_dense.bias.assign(
        weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_query_dense/bias/.ATTRIBUTES/VARIABLE_VALUE"]
    )
...

can go in convert_checkpoints().
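That is, something along these lines (the signature here is an assumption; follow however the RoBERTa script structures it):

def convert_checkpoints(model, weights):
    # Copy each TF Model Garden variable into the matching KerasNLP layer.
    model.get_layer("token_embedding").embeddings.assign(
        weights["encoder/layer_with_weights-0/embeddings/.ATTRIBUTES/VARIABLE_VALUE"]
    )
    # ... remaining embedding and transformer-layer assignments, as in the cell above ...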

etc., etc.

The conversion script should work for all BERT presets (passed as an arg to the script).
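One way to do that is to validate the flag against the known presets up front. A sketch with absl; the flag plumbing is illustrative, and it assumes the model class exposes its presets as a dict (keras_nlp.models.BertBackbone.presets):

from absl import app, flags

import keras_nlp

FLAGS = flags.FLAGS
flags.DEFINE_string(
    "preset",
    None,
    f"Must be one of {', '.join(keras_nlp.models.BertBackbone.presets)}",
)


def main(_):
    assert FLAGS.preset in keras_nlp.models.BertBackbone.presets, (
        f"Unknown preset: {FLAGS.preset}"
    )
    model = keras_nlp.models.BertBackbone.from_preset(FLAGS.preset, load_weights=False)
    # ... download, convert, and validate the weights for this preset ...


if __name__ == "__main__":
    flags.mark_flag_as_required("preset")
    app.run(main)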
