Error in mobilenet backbone training
Hello,
I am having trouble training RetinaNet with any of the mobilenet backbones.
Whenever I start training on my custom dataset (which works with resnet50) using the command
keras_retinanet/bin/train.py --backbone mobilenet128_0.75 --batch-size 2 csv annotations_train.csv classes_to_int_map.csv
it fails with the following error:
None
Epoch 1/50
2018-12-14 10:05:26.555197: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at gather_nd_op.cc:50 : Invalid argument: indices[274827] = [0, 274827] does not index into param shape [2,272010,1]
Traceback (most recent call last):
  File "keras_retinanet/bin/train.py", line 492, in <module>
    main()
  File "keras_retinanet/bin/train.py", line 487, in main
    callbacks=callbacks,
  File "/home/marco/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/marco/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/home/marco/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 217, in fit_generator
    class_weight=class_weight)
  File "/home/marco/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1217, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/marco/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/home/marco/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/home/marco/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/home/marco/.local/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[274827] = [0, 274827] does not index into param shape [2,272010,1]
  [[{{node loss/classification_loss/GatherNd_1}} = GatherNd[Tindices=DT_INT64, Tparams=DT_FLOAT, _class=["loc:@training/Adam/gradients/loss/classification_loss/GatherNd_1_grad/ScatterNd"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](classification/concat, **loss/classification_loss/Where)]]
I am up to date with keras, tensorflow, keras-retinanet.
Thanks for your help,
M
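A note on reading the error message: it says that index [0, 274827] was requested from a tensor of shape [2,272010,1], so the second coordinate is larger than the second dimension allows. The small sketch below only restates that arithmetic with the numbers copied from the traceback; the helper name first_invalid_axis is hypothetical and is not part of keras-retinanet or TensorFlow.

```python
def first_invalid_axis(index, shape):
    """Return the first axis on which `index` falls outside `shape`, or None."""
    for axis, (i, dim) in enumerate(zip(index, shape)):
        if not 0 <= i < dim:
            return axis
    return None


# Values copied from the error message in the traceback.
bad_index = (0, 274827)
param_shape = (2, 272010, 1)

axis = first_invalid_axis(bad_index, param_shape)
# How far the requested position lies past the last valid one on that axis.
overshoot = bad_index[axis] - (param_shape[axis] - 1)
print(f"axis {axis} is out of range by {overshoot} positions")
```

In the traceback the gathered tensor is classification/concat and the indices come from loss/classification_loss/Where, so the indices refer to more positions along the second axis than that tensor contains.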
Issue Analytics
- Created 5 years ago
- Comments:6 (2 by maintainers)
Top GitHub Comments
I'm not sure if it's the same error, but I had this error using a mobilenet backbone:
I was using the current CPU-only release of tensorflow, but as described in this issue it works on GPU, so I tried the GPU version of tensorflow and mobilenet worked.
If you are using the CPU release, maybe you can try the GPU one.
I can confirm that mobilenet works when trained on GPU.
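One possible explanation for the CPU/GPU difference, offered as an assumption rather than a diagnosis: the TensorFlow documentation for tf.gather_nd notes that on CPU an out-of-bound index returns an error, while on GPU a 0 is stored in the corresponding output value. If that is what happens here, the GPU run may be hiding the index mismatch instead of removing it. The pure-Python sketch below only emulates those two behaviours on a toy nested list; it does not call TensorFlow, and the function names are hypothetical.

```python
def gather_nd_strict(params, indices):
    """Emulates the documented CPU behaviour: an out-of-range index raises."""
    out = []
    for n, (b, a) in enumerate(indices):
        if not (0 <= b < len(params) and 0 <= a < len(params[b])):
            raise IndexError(f"indices[{n}] = [{b}, {a}] does not index into params")
        out.append(params[b][a])
    return out


def gather_nd_lenient(params, indices):
    """Emulates the documented GPU behaviour: an out-of-range index yields 0."""
    out = []
    for b, a in indices:
        in_range = 0 <= b < len(params) and 0 <= a < len(params[b])
        out.append(params[b][a] if in_range else 0.0)
    return out


# Toy stand-in for a prediction tensor: batch of 2, 3 positions each.
params = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
# The last index points past the end of the second axis.
indices = [(0, 1), (1, 2), (0, 5)]

lenient = gather_nd_lenient(params, indices)  # silently substitutes 0.0

try:
    gather_nd_strict(params, indices)
    strict_failed = False
except IndexError as err:
    strict_failed = True
    print("strict gather failed:", err)
```

If this is the cause, it would be worth checking on GPU as well that the number of positions in the training targets matches the number the model predicts.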