Error in mobilenet backbone training

See original GitHub issue

Hello, I am having trouble training RetinaNet with any of the mobilenet backbones. Whenever I start training on my custom dataset (which works with resnet50) using the command keras_retinanet/bin/train.py --backbone mobilenet128_0.75 --batch-size 2 csv annotations_train.csv classes_to_int_map.csv, I get the following error:

None
Epoch 1/50
2018-12-14 10:05:26.555197: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at gather_nd_op.cc:50 : Invalid argument: indices[274827] = [0, 274827] does not index into param shape [2,272010,1]
Traceback (most recent call last):
  File "keras_retinanet/bin/train.py", line 492, in <module>
    main()
  File "keras_retinanet/bin/train.py", line 487, in main
    callbacks=callbacks,
  File "/home/marco/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/marco/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/home/marco/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 217, in fit_generator
    class_weight=class_weight)
  File "/home/marco/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1217, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/marco/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/home/marco/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/home/marco/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/home/marco/.local/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[274827] = [0, 274827] does not index into param shape [2,272010,1]
  [[{{node loss/classification_loss/GatherNd_1}} = GatherNd[Tindices=DT_INT64, Tparams=DT_FLOAT, _class=["loc:@training/Adam/gradients/loss/classification_loss/GatherNd_1_grad/ScatterNd"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](classification/concat, **loss/classification_loss/Where)]]

I am using the latest versions of Keras, TensorFlow, and keras-retinanet.

Thanks for your help,

M
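
The error message above reports an index (274827) that lies beyond the second dimension of the tensor being gathered from (272010). This suggests, though the thread does not confirm it, that the indices built from the labels cover more positions than the classification output actually holds. Below is a minimal sketch for pulling those numbers out of such a message. `explain_gather_nd_error` is a hypothetical helper written for illustration and is not part of keras-retinanet or TensorFlow.

```python
import re

# Hypothetical helper, not part of keras-retinanet or TensorFlow.
# Matches both message variants quoted on this page:
#   "... does not index into param shape [2,272010,1]"
#   "... does not index into param (shape: [1,179928,1])."
_PATTERN = re.compile(
    r"indices\[[^\]]*\] = \[(?P<index>[\d, ]+)\] "
    r"does not index into param \(?shape:? ?\[(?P<shape>[\d, ]+)\]"
)


def explain_gather_nd_error(message):
    """Return (index, shape, excess) parsed from a GatherNd bounds error.

    excess[k] is how far index[k] lies past the last valid position of
    dimension k (0 means that component is in range).
    """
    match = _PATTERN.search(message)
    if match is None:
        raise ValueError("not a recognised GatherNd bounds error")
    index = [int(v) for v in match.group("index").split(",")]
    shape = [int(v) for v in match.group("shape").split(",")]
    # Only the leading dimensions are indexed, so zip stops at len(index).
    excess = [max(0, i - (dim - 1)) for i, dim in zip(index, shape)]
    return index, shape, excess


MESSAGE = (
    "indices[274827] = [0, 274827] does not index into "
    "param shape [2,272010,1]"
)
print(explain_gather_nd_error(MESSAGE))
# prints ([0, 274827], [2, 272010, 1], [0, 2818])
```

A non-zero entry in the third list marks the dimension that is too small for the requested index, which is the number to compare against the model's output shape.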

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

dredonieto commented, Dec 17, 2018 (2 reactions)

I’m not sure if it’s the same error, but I got this one when using a mobilenet backbone:

InvalidArgumentError: flat indices[179577, :] = [0, 180077] does not index into param (shape: [1,179928,1]).

I was using the current CPU-only release of TensorFlow. As described in this issue, it works on GPU, so I tried the GPU version of TensorFlow, and mobilenet worked.

If you are using the CPU release, you could try the GPU one instead.
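
One possible explanation for the CPU/GPU difference, offered as an assumption rather than something confirmed in this thread: TensorFlow's documentation for tf.gather_nd notes that an out-of-bounds index returns an error on CPU, while on GPU a 0 is stored in the corresponding output value. If that applies here, the GPU run may be masking the same shape mismatch rather than removing it. The pure-Python sketch below only imitates those two behaviours on a flat list. `gather_strict` and `gather_lenient` are made-up names, and no TensorFlow code is involved.

```python
def gather_strict(params, indices):
    """Imitates the documented CPU behaviour: fail on any out-of-range index."""
    for i in indices:
        if not 0 <= i < len(params):
            raise IndexError(
                f"index {i} does not index into param of length {len(params)}"
            )
    return [params[i] for i in indices]


def gather_lenient(params, indices):
    """Imitates the documented GPU behaviour: out-of-range reads yield 0."""
    return [params[i] if 0 <= i < len(params) else 0 for i in indices]


scores = [0.1, 0.2, 0.3]   # stands in for the model output
wanted = [0, 2, 5]         # 5 is past the end, like the index in the error

print(gather_lenient(scores, wanted))

try:
    gather_strict(scores, wanted)
except IndexError as exc:
    print("strict gather failed:", exc)
```

Under this assumption, training on GPU would proceed with zeros in place of the missing positions, so checking that the number of generated anchors matches the model's output is still worthwhile.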

marcociara379 commented, Dec 20, 2018 (1 reaction)

I can confirm that mobilenet works when trained on GPU.
