ConvNeXt not compatible with mixed precision
System information.
- Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04
- TensorFlow installed from (source or binary): nightly
- TensorFlow version (use command below): 2.11.0a20220816
- Python version: 3.8.10
- GPU model and memory: NVIDIA Quadro RTX 6000 (24 GB VRAM)
- Do you want to contribute a PR? (yes/no): No
Describe the problem. Starting from this issue, I observed that ConvNeXt was not compatible with TimeDistributed; this was then fixed in the nightly release (see here). Once that worked, I tried to use mixed precision and got a new error. Note that MobileNetV3 works seamlessly with mixed precision. Hence, I think only ConvNeXt might be affected, but I am not sure.
I believe the model itself works fine with mixed precision, but it contains the layer LayerScale, which may not (see the logs below for more details).
Describe the expected behavior. Mixed precision should work seamlessly with ConvNeXt.
Standalone code to reproduce the issue. The failure occurred when initializing the ConvNeXt model after mixed precision was enabled. Hence, I believe running this should reproduce the issue (note that the logs below are not directly from this script, but I believe you will get the same error):
import tensorflow as tf
from tensorflow.keras.applications import ConvNeXtSmall
tf.keras.mixed_precision.set_global_policy('mixed_float16')
model = ConvNeXtSmall(include_top=False, weights="imagenet", pooling="none")
Source code / logs.
Traceback (most recent call last):
  File "source/main.py", line 454, in <module>
    main()
  File "source/main.py", line 216, in main
    model = get_classifier_architecture(MODEL_ARCH=ret.arch, ret=ret, instance_size=instance_size,
  File "/home/andrep/workspace/bcgrade/source/models/classifiers.py", line 371, in get_classifier_architecture
    shared_base_model = ConvNeXtSmall(include_top=False, weights="imagenet", pooling="none", input_shape=instance_size[1:])
  File "/home/andrep/workspace/bcgrade/venv/lib/python3.8/site-packages/keras/applications/convnext.py", line 610, in ConvNeXtSmall
    return ConvNeXt(
  File "/home/andrep/workspace/bcgrade/venv/lib/python3.8/site-packages/keras/applications/convnext.py", line 516, in ConvNeXt
    x = ConvNeXtBlock(
  File "/home/andrep/workspace/bcgrade/venv/lib/python3.8/site-packages/keras/applications/convnext.py", line 283, in apply
    x = LayerScale(
  File "/home/andrep/workspace/bcgrade/venv/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/andrep/workspace/bcgrade/venv/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py", line 588, in _ExtractInputsAndAttrs
    raise TypeError(
TypeError: Exception encountered when calling layer "convnext_small_stage_0_block_0_layer_scale" (type LayerScale).
Input 'y' of 'Mul' Op has type float32 that does not match type float16 of argument 'x'.
Call arguments received by layer "convnext_small_stage_0_block_0_layer_scale" (type LayerScale):
  • x=tf.Tensor(shape=(None, None, None, 96), dtype=float16)
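The dtype mismatch reported in the last lines of the log can likely be isolated outside ConvNeXt. The following is a minimal sketch, assuming the failing layer multiplies its float16 input by a weight that is stored as a plain float32 tf.Variable (an assumption inferred from the error message, not copied from the Keras source). The helper name mul_dtype_error is hypothetical.

```python
import tensorflow as tf


def mul_dtype_error(x_dtype=tf.float16, w_dtype=tf.float32):
    """Return the exception raised by x * w, or None if the multiply succeeds."""
    # Stand-in for the float16 activations seen in the log (assumption).
    x = tf.ones((1, 4), dtype=x_dtype)
    # Stand-in for a layer weight created directly as a tf.Variable (assumption).
    w = tf.Variable(tf.ones((4,), dtype=w_dtype))
    try:
        _ = x * w
    except (TypeError, tf.errors.InvalidArgumentError) as err:
        # Graph construction reports the mismatch as a TypeError, while
        # eager execution reports it as an InvalidArgumentError.
        return err
    return None


# Mismatched dtypes (float16 input, float32 weight) versus matching dtypes.
print(type(mul_dtype_error()).__name__)
print(mul_dtype_error(tf.float16, tf.float16))
```

If the first call returns an exception and the second returns None, the problem is the operand dtypes of the multiply rather than anything specific to the ConvNeXt architecture.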
Top GitHub Comments
I was able to fix this by adding casts to the appropriate dtype in the build and __call__ functions (lines 219-225) of the custom layer LayerScale.
@gowthamkpr, I was able to reproduce the issue on TensorFlow v2.8, v2.9, and nightly. Kindly find the gist of it here.