Error while saving model: TypeError: ('Not JSON Serializable:', DistilBertConfig

🐛 Bug

Information

I am using the pretrained DistilBERT model's embeddings to build a custom model (see the code snippet below). Everything works fine except saving the model (see the error below). I am using the latest version of transformers, 3.0.0. I also could not save the same model with the previous version, 2.11 (see this issue: https://github.com/huggingface/transformers/issues/4444).

I was just wondering if you could help me solve the problem.

Code

import tensorflow as tf
from transformers import DistilBertConfig, TFDistilBertMainLayer

# Build the DistilBERT main layer from the pretrained model's configuration.
config = DistilBertConfig.from_pretrained('distilbert-base-uncased')
config.output_hidden_states = False
distillbert_main = TFDistilBertMainLayer(config=config)

# Fixed-length input of 8 token ids (a trailing comma here would create a tuple).
input_word_ids = tf.keras.layers.Input(shape=(8,), dtype=tf.int32, name="input_word_ids")
x = distillbert_main(input_word_ids)[0]
# Keep only the hidden state of the first token.
x = tf.keras.layers.Lambda(lambda seq: seq[:, 0, :])(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.Dropout(0.2)(x)
out = tf.keras.layers.Dense(2)(x)

model = tf.keras.Model(inputs=input_word_ids, outputs=out)
# Freeze the first three layers of the model.
for layer in model.layers[:3]:
    layer.trainable = False
model.summary()     # Works fine
model.get_config()  # Works fine

model.save('./model.h5')  # Fails with a TypeError

Error

TypeError                                 Traceback (most recent call last)
<ipython-input-32-1fbe6dabead0> in <module>
----> 1 model.save('./model.h5')

/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py in save(self, filepath, overwrite, include_optimizer, save_format, signatures, options)
   1050     """
   1051     save.save_model(self, filepath, overwrite, include_optimizer, save_format,
-> 1052                     signatures, options)
   1053 
   1054   def save_weights(self, filepath, overwrite=True, save_format=None):

/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/saving/save.py in save_model(model, filepath, overwrite, include_optimizer, save_format, signatures, options)
    133           'or using `save_weights`.')
    134     hdf5_format.save_model_to_hdf5(
--> 135         model, filepath, overwrite, include_optimizer)
    136   else:
    137     saved_model_save.save(model, filepath, overwrite, include_optimizer,

/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/saving/hdf5_format.py in save_model_to_hdf5(model, filepath, overwrite, include_optimizer)
    111       if isinstance(v, (dict, list, tuple)):
    112         f.attrs[k] = json.dumps(
--> 113             v, default=serialization.get_json_type).encode('utf8')
    114       else:
    115         f.attrs[k] = v

/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    236         check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    237         separators=separators, default=default, sort_keys=sort_keys,
--> 238         **kw).encode(obj)
    239 
    240 

/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py in encode(self, o)
    197         # exceptions aren't as detailed.  The list call should be roughly
    198         # equivalent to the PySequence_Fast that ''.join() would do.
--> 199         chunks = self.iterencode(o, _one_shot=True)
    200         if not isinstance(chunks, (list, tuple)):
    201             chunks = list(chunks)

/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py in iterencode(self, o, _one_shot)
    255                 self.key_separator, self.item_separator, self.sort_keys,
    256                 self.skipkeys, _one_shot)
--> 257         return _iterencode(o, 0)
    258 
    259 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,

/usr/local/lib/python3.7/site-packages/tensorflow/python/util/serialization.py in get_json_type(obj)
     74     return obj.__wrapped__
     75 
---> 76   raise TypeError('Not JSON Serializable:', obj)

TypeError: ('Not JSON Serializable:', DistilBertConfig {
  "activation": "gelu",
  "architectures": [
    "DistilBertForMaskedLM"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "vocab_size": 30522
}
)
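
The traceback shows the failure arising inside `json.dumps`, which the HDF5 writer calls with a `default` hook that raises `TypeError` for any object it cannot convert. The following is a minimal, stdlib-only sketch of that mechanism. `ToyConfig`, `toy_json_default`, and `try_dump` are hypothetical stand-ins made up for illustration; they are not the transformers or TensorFlow implementations.

```python
import json


class ToyConfig:
    """Hypothetical stand-in for a config object; not the real DistilBertConfig."""

    def __init__(self, dim=768, n_layers=6):
        self.dim = dim
        self.n_layers = n_layers

    def to_dict(self):
        # Expose the settings as plain, JSON-friendly Python types.
        return {"dim": self.dim, "n_layers": self.n_layers}


def toy_json_default(obj):
    """Simplified stand-in for a `default` hook that rejects unknown objects."""
    raise TypeError("Not JSON Serializable:", obj)


def try_dump(layer_config):
    """Return the JSON string, or None if serialization raises TypeError."""
    try:
        return json.dumps(layer_config, default=toy_json_default, sort_keys=True)
    except TypeError:
        return None


# A layer config that embeds the object itself cannot be serialized...
failed = try_dump({"name": "toy_layer", "config": ToyConfig()})
# ...while one that embeds a plain dict can.
succeeded = try_dump({"name": "toy_layer", "config": ToyConfig().to_dict()})
```

In this sketch the only difference between the two calls is whether the nested value is an arbitrary Python object or a plain dict, which matches the shape of the error reported here: the object named in the `TypeError` is the config instance itself.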

Environment info

  • transformers version: 3.0.0
  • Platform: Mac OSX
  • Python version: 3.7
  • PyTorch version (GPU?): No
  • Tensorflow version: 2.2.0
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

msahamed commented, Jul 14, 2020 (2 reactions)

I found that the model can be saved in the TensorFlow SavedModel format using: tf.saved_model.save(model, './models/model')

However, I was not able to save it in the Keras .h5 format. That’s fine for me for now, so I am closing this issue.

jplu commented, Jul 6, 2020 (0 reactions)

At first glance, I would say this is “normal”: the DistilBERT model takes a config parameter, which makes it incompatible with sequential models. Try creating a subclassed model instead to see if it works.

But this is just a quick guess; I will look into it more deeply when I have some time.
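
One general way to make a component that holds a config object serializable is to have it emit a plain dictionary and rebuild the object on load. The sketch below shows that round trip with stdlib-only stand-ins. `ToyConfig` and `ToyLayer` are hypothetical classes that only mimic the `get_config`/`from_config` convention. This is an assumption about how such a component could be structured, not the fix adopted in transformers, and it has not been verified against the model in this issue.

```python
import json


class ToyConfig:
    """Hypothetical config holder; not the real DistilBertConfig."""

    def __init__(self, dim=768, dropout=0.1):
        self.dim = dim
        self.dropout = dropout

    def to_dict(self):
        # Copy the instance attributes into a plain dict.
        return dict(vars(self))

    @classmethod
    def from_dict(cls, data):
        return cls(**data)


class ToyLayer:
    """Hypothetical layer-like class that keeps its config serializable."""

    def __init__(self, config):
        # Accept either a ToyConfig or the plain dict produced by get_config().
        if isinstance(config, dict):
            config = ToyConfig.from_dict(config)
        self.config = config

    def get_config(self):
        # Emit only JSON-friendly types, never the config object itself.
        return {"config": self.config.to_dict()}

    @classmethod
    def from_config(cls, data):
        return cls(**data)


layer = ToyLayer(ToyConfig(dim=32, dropout=0.2))
serialized = json.dumps(layer.get_config())              # succeeds: plain dict only
restored = ToyLayer.from_config(json.loads(serialized))  # rebuilds the ToyConfig
```

The design choice here is that the constructor accepts both forms, so the same class can be built directly in code or reconstructed from its serialized dictionary.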
