tuner with max_model_size not skipping oversized models
When using max_model_size, I am finding that the tuner repeatedly tries the same oversized model and then errors out. This does not match the expected behavior: the warning message says the oversized model will be skipped.
Some dummy code to recreate the issue:
from tensorflow import keras
from kerastuner.tuners import RandomSearch

def modelbuilder(hp):
    model = keras.Sequential()
    # 2-20 units on a 100-feature input gives 205-2041 parameters in total,
    # so widths 18-20 exceed the 1800-parameter budget set below.
    model.add(keras.layers.Dense(hp.Int('width_1', 2, 20, step=1, sampling='linear'),
                                 input_shape=[100], activation='linear', name='Dense_1'))
    model.add(keras.layers.Dense(1, activation='linear', name='output'))
    optimizer = keras.optimizers.Adam(lr=0.01, beta_1=0.9, beta_2=0.999, epsilon=1e-8)
    model.compile(loss='mse', optimizer=optimizer, metrics=['mse'])
    return model

...

tuner = RandomSearch(modelbuilder, objective='val_loss', max_trials=100,
                     max_model_size=1800, overwrite=True)
tuner.search(x=train_data, y=train_target, epochs=3,
             validation_data=(valid_data, valid_target), verbose=0)
This gives the following output:
[Trial complete] [Trial summary] Hp values: |-width_1: 11 |-Score: 0.08742512226104736 |-Best step: 0
[Trial complete] [Trial summary] Hp values: |-width_1: 12 |-Score: 0.09093187510967254 |-Best step: 0
[Trial complete] [Trial summary] Hp values: |-width_1: 8 |-Score: 0.09126158863306046 |-Best step: 0
[Trial complete] [Trial summary] Hp values: |-width_1: 14 |-Score: 0.10759384512901306 |-Best step: 0
[Trial complete] [Trial summary] Hp values: |-width_1: 6 |-Score: 0.08583792209625245 |-Best step: 0
[Trial complete] [Trial summary] Hp values: |-width_1: 10 |-Score: 0.09473302498459817 |-Best step: 0
[Trial complete] [Trial summary] Hp values: |-width_1: 3 |-Score: 0.13883694425225257 |-Best step: 0
[Trial complete] [Trial summary] Hp values: |-width_1: 13 |-Score: 0.08331702768802643 |-Best step: 0
[Trial complete] [Trial summary] Hp values: |-width_1: 7 |-Score: 0.10881250619888305 |-Best step: 0
[Trial complete] [Trial summary] Hp values: |-width_1: 17 |-Score: 0.09528512597084045 |-Best step: 0
[Warning] Oversized model: 2041 parameters – skipping
[Warning] Oversized model: 2041 parameters – skipping
[Warning] Oversized model: 2041 parameters – skipping
[Warning] Oversized model: 2041 parameters – skipping
[Warning] Oversized model: 2041 parameters – skipping
[Warning] Oversized model: 2041 parameters – skipping
Top GitHub Comments
Hey everyone,
I have created a dirty fix that works with models built with either the Keras Sequential or Functional API (i.e. you should at least be able to use the Keras backend method count_params on your model). It can be used while we wait for a general solution to be designed and developed.
The basic idea is, once you have instantiated a high-level tuner class (e.g. BayesianOptimization), to override a few methods inherited from the parent classes: the Tuner class's _build_and_fit_model method and the BaseTuner class's on_trial_end method.
This dirty fix essentially just skips over trials. Therefore, I advise increasing the number of trials to account for the lost ones.
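The code blocks from the original comment were not preserved in this mirror. As a stand-in illustrating the same idea (ending oversized trials with a penalty score so the oracle samples new hyperparameters instead of retrying the same model), here is a minimal sketch that overrides run_trial, which in keras-tuner >= 1.1 may return the objective value directly, rather than patching the two methods named above; the class name, size budget, and 1e9 penalty are illustrative:

```python
# Minimal sketch, not the original fix: complete oversized trials with a
# penalty score so the search moves on. Assumes keras-tuner >= 1.1.
from tensorflow import keras
import keras_tuner as kt

MAX_MODEL_SIZE = 1800  # same budget passed to the tuner


def model_size(model):
    # Total parameter count (trainable + non-trainable).
    return sum(keras.backend.count_params(w) for w in model.weights)


class SizeAwareRandomSearch(kt.RandomSearch):
    def run_trial(self, trial, *args, **kwargs):
        model = self.hypermodel.build(trial.hyperparameters)
        if model_size(model) > MAX_MODEL_SIZE:
            # Report a very poor score: the trial ends normally and the
            # oracle samples new hyperparameters instead of retrying.
            return 1e9  # assumes a minimized objective such as val_loss
        return super().run_trial(trial, *args, **kwargs)
```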
Regards, Bram
Updates 04/05/2021 and 08/05/2021: To prevent Keras Tuner from crashing due to GPU Out Of Memory (OOM) exceptions, you can add exception handling around the model.fit method call (only tested with TensorFlow 2.3.0 so far):
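The snippet that originally followed was lost in this mirror; a minimal sketch of the idea, where model, x_train, y_train, x_valid, and y_valid are placeholders for the model and data of the current trial (e.g. inside _build_and_fit_model):

```python
# Sketch only; placeholder names, not the original snippet.
import tensorflow as tf

try:
    history = model.fit(
        x_train, y_train,
        epochs=3,
        validation_data=(x_valid, y_valid),
    )
except (tf.errors.ResourceExhaustedError, tf.errors.InternalError):
    # The GPU ran out of memory. Treat the trial as failed (here: no
    # history) instead of letting the exception abort the whole search.
    history = None
```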
These crashes can still happen (less frequently when the model size is checked) because manual model-size calculation methods have an error margin. In addition to ResourceExhaustedError, InternalError also has to be handled when a tf.distribute strategy is used during training, because TensorFlow may raise either of the two errors when the GPU is OOM.
After running the Hyperband tuner for a large number of trials, I discovered that the line model = self.hypermodel.build(trial.hyperparameters) was raising a RuntimeError as a result of consecutive GPU OOM errors. This was fixed by removing the global Keras callbacks from the search method and instead adding them locally to the fit_kwargs argument in the _build_and_fit_model method, e.g.:
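The example that the "e.g.:" referred to was not preserved in this mirror; a hedged reconstruction of the approach, assuming the keras-tuner 1.0.x signature _build_and_fit_model(self, trial, fit_args, fit_kwargs) (the subclass name and EarlyStopping settings are placeholders):

```python
# Hedged reconstruction; assumes keras-tuner 1.0.x method signatures.
from tensorflow import keras
from kerastuner.tuners import Hyperband


class LocalCallbackHyperband(Hyperband):  # illustrative subclass name
    def _build_and_fit_model(self, trial, fit_args, fit_kwargs):
        # Build fresh callback objects for every trial instead of passing
        # one shared list to tuner.search(...), appending to whatever
        # callbacks the tuner itself has already injected.
        callbacks = list(fit_kwargs.get("callbacks") or [])
        callbacks.append(
            keras.callbacks.EarlyStopping(monitor="val_loss", patience=2)
        )
        fit_kwargs["callbacks"] = callbacks
        return super()._build_and_fit_model(trial, fit_args, fit_kwargs)
```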
Looks like this may be the culprit behind the AutoKeras crashes and OOM errors I have been facing. It is a huge problem for me at the moment. Any further updates regarding this issue? https://github.com/keras-team/autokeras/issues/1078