
Convert our checkpoint colabs into runnable scripts

The colabs we currently have in tools/checkpoint_conversion are useful in that we don’t lose the code for converting checkpoints. But they are fairly unwieldy: they must be pointed at the specific branch used for model development, they run to hundreds of lines of code, and we need one for each model variant.

Instead, we could write one script per model that handles checkpoint conversion (perhaps with a flag to select the model variant?). A potential file structure:

tools
└── checkpoint_conversion
    ├── README.md
    ├── convert_bert_weights.py
    ├── convert_gpt2_weights.py
    └── requirements.txt

This will make it much easier to re-run and test checkpoint conversion code in the future.
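As a rough sketch, each script could expose the variant as a command-line flag and keep the download and conversion steps as plain functions. The function and flag names below are illustrative, not an existing API:

import argparse


def download_model(preset):
    """Fetch the original checkpoint files for the given variant."""
    ...


def convert_checkpoints(preset):
    """Port the original weights into the equivalent KerasNLP model."""
    ...


def main():
    parser = argparse.ArgumentParser(description="Convert BERT checkpoints.")
    parser.add_argument("--preset", required=True, help="Model variant to convert.")
    args = parser.parse_args()
    download_model(args.preset)
    convert_checkpoints(args.preset)


if __name__ == "__main__":
    main()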


Top GitHub Comments

abheesht17 commented on Dec 3, 2022 (2 reactions)

@vulkomilev, please go ahead with writing the conversion script for BERT! You can follow the same template as RoBERTa’s script: https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/convert_roberta_checkpoints.py.

abheesht17 commented on Dec 6, 2022 (0 reactions)

@vulkomilev, KerasNLP does not have a separate BertBase class; there is a single BertBackbone model class: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/bert/bert_backbone.py#L35. If you want the base variant of BERT, you can do this:

# w/o loading the weights
bert_base = keras_nlp.models.BertBackbone.from_preset("bert_base_uncased_en", load_weights=False)

# loading the model with the pretrained weights
bert_base = keras_nlp.models.BertBackbone.from_preset("bert_base_uncased_en", load_weights=True)

These “presets” are drawn from here: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/bert/bert_presets.py.

Regarding checkpoint conversion for BERT, follow the same format as RoBERTa. Use the conversion notebooks mentioned in this directory as reference: https://github.com/keras-team/keras-nlp/tree/master/tools/checkpoint_conversion.

So, for example, the contents of this cell:

# Model garden BERT paths.
zip_path = f"""https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/{TOKEN_TYPE}_L-12_H-768_A-12.tar.gz"""
zip_file = keras.utils.get_file(
    f"""/content/{MODEL_NAME}""",
    zip_path,
    extract=True,
    archive_format="tar",
)

can go in the download_model() function.
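For instance, a minimal sketch of that function, assuming it takes the token type and model name as parameters (the template's exact signature may differ, and the L-12_H-768_A-12 size suffix would need to vary per variant):

import keras


def download_model(token_type, model_name):
    # Model Garden BERT checkpoint archive for the base size.
    zip_path = (
        "https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/"
        f"{token_type}_L-12_H-768_A-12.tar.gz"
    )
    return keras.utils.get_file(
        model_name,
        zip_path,
        extract=True,
        archive_format="tar",
    )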

Contents of this cell:

model.get_layer("token_embedding").embeddings.assign(
    weights["encoder/layer_with_weights-0/embeddings/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("position_embedding").position_embeddings.assign(
    weights["encoder/layer_with_weights-1/embeddings/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("segment_embedding").embeddings.assign(
    weights["encoder/layer_with_weights-2/embeddings/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("embeddings_layer_norm").gamma.assign(
    weights["encoder/layer_with_weights-3/gamma/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("embeddings_layer_norm").beta.assign(
    weights["encoder/layer_with_weights-3/beta/.ATTRIBUTES/VARIABLE_VALUE"]
)

for i in range(model.num_layers):
    model.get_layer(f"transformer_layer_{i}")._self_attention_layer._key_dense.kernel.assign(
        weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_key_dense/kernel/.ATTRIBUTES/VARIABLE_VALUE"]
    )
    model.get_layer(f"transformer_layer_{i}")._self_attention_layer._key_dense.bias.assign(
        weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_key_dense/bias/.ATTRIBUTES/VARIABLE_VALUE"]
    )
    model.get_layer(f"transformer_layer_{i}")._self_attention_layer._query_dense.kernel.assign(
        weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_query_dense/kernel/.ATTRIBUTES/VARIABLE_VALUE"]
    )
    model.get_layer(f"transformer_layer_{i}")._self_attention_layer._query_dense.bias.assign(
        weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_query_dense/bias/.ATTRIBUTES/VARIABLE_VALUE"]
    )
...

can go in convert_checkpoints().
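That is, something along these lines (the signature here is an assumption; follow however the RoBERTa script structures it):

def convert_checkpoints(model, weights):
    # Copy each TF Model Garden variable into the matching KerasNLP layer.
    model.get_layer("token_embedding").embeddings.assign(
        weights["encoder/layer_with_weights-0/embeddings/.ATTRIBUTES/VARIABLE_VALUE"]
    )
    # ... remaining embedding and transformer-layer assignments, as in the cell above ...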

etc., etc.

The conversion script should work for all BERT presets (passed as an arg to the script).
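One way to do that is to validate the flag against the known presets up front. A sketch with absl; the flag plumbing is illustrative, and it assumes the model class exposes its presets as a dict (keras_nlp.models.BertBackbone.presets):

from absl import app, flags

import keras_nlp

FLAGS = flags.FLAGS
flags.DEFINE_string(
    "preset",
    None,
    f"Must be one of {', '.join(keras_nlp.models.BertBackbone.presets)}",
)


def main(_):
    assert FLAGS.preset in keras_nlp.models.BertBackbone.presets, (
        f"Unknown preset: {FLAGS.preset}"
    )
    model = keras_nlp.models.BertBackbone.from_preset(FLAGS.preset, load_weights=False)
    # ... download, convert, and validate the weights for this preset ...


if __name__ == "__main__":
    flags.mark_flag_as_required("preset")
    app.run(main)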
