TypeError when saving a model with `numpy.bool_` types
`numpy.bool_` types are not being correctly serialized to JSON.
What is the current behavior?
The `ComplexEncoder` class does not handle `numpy.bool_`, which is not JSON serializable. This raises a TypeError when saving certain models.
If the current behavior is a bug, please provide the steps to reproduce.
model = TabNetClassifier(...)
model.fit(...) # training data and model parameters contain values of type numpy.bool_
model.save_model('path/to/model')
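The same error can likely be reproduced without TabNet at all. The sketch below assumes only that a `numpy.bool_` value ends up somewhere in the dictionary passed to `json.dumps`; the `params` dictionary and its key are hypothetical stand-ins, not the library's actual saved parameters.

```python
import json

import numpy as np

# Hypothetical parameter dict; stands in for whatever save_model serializes.
params = {"use_flag": np.bool_(True)}

try:
    json.dumps(params)  # the default encoder has no rule for NumPy booleans
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

If the default encoder rejects the value as described in this issue, `error_message` holds the "not JSON serializable" text instead of `None`.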
Expected behavior
`numpy.bool_` should be cast to Python's `bool` before being serialized to JSON. Here is my suggested fix. Please let me know if this is acceptable for a PR:
class ComplexEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.int64):
            return int(obj)
        if isinstance(obj, np.bool_):
            return bool(obj)
        # Let the base class default method raise the TypeError
        return json.JSONEncoder.default(self, obj)
Other relevant information:
- poetry version: "poetry-core>=1.0.0"
- python version: "^3.9"
- Operating System: "Linux Kernel 5.18.14-arch1-1"
- Additional tools: CUDA Version: 11.7, Driver Version: 515.57
Additional context
Here’s a stacktrace:
  File ".venv/lib/python3.10/site-packages/pytorch_tabnet/abstract_model.py", line 375, in save_model
    json.dump(saved_params, f, cls=ComplexEncoder)
  File "/usr/lib/python3.10/json/__init__.py", line 179, in dump
    for chunk in iterable:
  File "/usr/lib/python3.10/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/usr/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.10/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File ".venv/lib/python3.10/site-packages/pytorch_tabnet/utils.py", line 339, in default
    return json.JSONEncoder.default(self, obj)
  File "/usr/lib/python3.10/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type bool_ is not JSON serializable
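The trace shows the failure three dictionary levels deep, but not which key holds the offending value. A small helper along these lines might help locate it before saving. `find_unserializable` and the nested `saved_params` layout are assumptions made for illustration and are not part of pytorch_tabnet.

```python
import json

import numpy as np


def find_unserializable(value, path="root"):
    """Yield (path, type name) for each leaf that json.dumps rejects."""
    if isinstance(value, dict):
        for key, item in value.items():
            yield from find_unserializable(item, f"{path}[{key!r}]")
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            yield from find_unserializable(item, f"{path}[{index}]")
    else:
        try:
            json.dumps(value)
        except TypeError:
            yield path, type(value).__name__


# Hypothetical nested layout, loosely mirroring the depth seen in the trace.
saved_params = {"init_params": {"n_d": 8, "flags": {"verbose": np.bool_(False)}}}
problems = list(find_unserializable(saved_params))
print(problems)
```

Each reported path points at one value that would need casting, or an encoder rule, before `json.dump` can succeed.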
I ran into this when trying TabNet in a Kaggle competition. If you need to, you can look here in my code where the error happens.
@Optimox Hi. I don't know if that happens in the AMEX competition, but I guess so, since the JSON encoding is not working for dtypes other than np.int64.
Sorry for not being clear enough in my description of the problem. I've therefore attached a minimal working example to trigger the bug.
As I said, the problem is that y_train, i.e. the target variable, is of type bool (or np.int8 in my case), and you're only handling np.int64 in `ComplexEncoder`:
https://github.com/dreamquark-ai/tabnet/blob/5ac55834b32693abc4b22028a74475ee0440c2a5/pytorch_tabnet/utils.py#L338
https://github.com/dreamquark-ai/tabnet/blob/5ac55834b32693abc4b22028a74475ee0440c2a5/pytorch_tabnet/utils.py#L336-L341
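Since other NumPy scalar types such as np.int8 hit the same code path, one possible broader approach is to convert any NumPy scalar through `np.generic` and `.item()` rather than listing dtypes one by one. This is only a sketch: `NumpyEncoder` is a hypothetical name, not the library's `ComplexEncoder`, and the array handling is an extra assumption beyond what this issue asks for.

```python
import json

import numpy as np


class NumpyEncoder(json.JSONEncoder):
    """Hypothetical encoder; not pytorch_tabnet's actual ComplexEncoder."""

    def default(self, obj):
        # np.generic is the common base class of NumPy scalar types,
        # so this covers np.bool_, np.int8, np.int64 and friends.
        if isinstance(obj, np.generic):
            return obj.item()  # convert to the matching Python scalar
        if isinstance(obj, np.ndarray):
            return obj.tolist()  # assumption: arrays become nested lists
        # Let the base class raise the TypeError for anything else
        return super().default(obj)


payload = {"flag": np.bool_(True), "label": np.int8(3), "weights": np.array([0.5, 1.5])}
encoded = json.dumps(payload, cls=NumpyEncoder)
print(encoded)
```

The trade-off is that unexpected NumPy types are converted silently instead of failing loudly, which may or may not be what the maintainers want.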
Thanks, I'll fix this soon.