
Transfer learning with the pretrained model.

See original GitHub issue

Hi there, I just tried to load the pretrained model (20170512-110547) as the initial state to perform transfer learning on my own database, which has 367 identities. I followed the instructions on the wiki page "Classifier training of Inception-ResNet-v1". Here was my command:

$ python src/train_softmax.py \
    --logs_base_dir logs/ \
    --models_base_dir models/ \
    --data_dir ~/Database/my_own_database/ \
    --image_size 160 \
    --model_def models.inception_resnet_v1 \
    --pretrained_model models/20170512-110547/model-20170512-110547.ckpt-250000 \
    --optimizer RMSPROP \
    --learning_rate -1 \
    --max_nrof_epochs 80 \
    --keep_probability 0.8 \
    --random_crop \
    --random_flip \
    --learning_rate_schedule_file data/learning_rate_schedule_classifier_casia.txt \
    --weight_decay 5e-5 \
    --center_loss_factor 1e-2 \
    --center_loss_alfa 0.9

However, my run failed with a shape mismatch error when restoring the checkpoint. The error message was:

Traceback (most recent call last):
  File "src/train_softmax.py", line 447, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "src/train_softmax.py", line 206, in main
    saver.restore(sess, pretrained_model)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1457, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 778, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 982, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1032, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1052, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [128,367] rhs shape= [128,44052]
    [[Node: save/Assign_491 = Assign[T=DT_FLOAT, _class=["loc:@Logits/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](Logits/weights, save/RestoreV2_491/_77)]]

Caused by op u'save/Assign_491', defined at:
  File "src/train_softmax.py", line 447, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "src/train_softmax.py", line 188, in main
    saver = tf.train.Saver(tf.trainable_variables(), max_to_keep=3)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1056, in __init__
    self.build()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1086, in build
    restore_sequentially=self._restore_sequentially)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 691, in build
    restore_sequentially, reshape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 419, in _AddRestoreOps
    assign_ops.append(saveable.restore(tensors, shapes))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 155, in restore
    self.op.get_shape().is_fully_defined())
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/state_ops.py", line 270, in assign
    validate_shape=validate_shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 47, in assign
    use_locking=use_locking, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [128,367] rhs shape= [128,44052]
    [[Node: save/Assign_491 = Assign[T=DT_FLOAT, _class=["loc:@Logits/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](Logits/weights, save/RestoreV2_491/_77)]]

I have checked the input image type (.png) and size (182x182), which I don't believe are the cause of the error. Could you share the training command you used to train the model (20170512-110547)? Or should I change any parameters of the fully connected layer in the model?
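For reference, here is a minimal sketch (assuming TensorFlow 1.x and the checkpoint path from the command above) that lists the variable shapes stored in the pretrained checkpoint; this is where the [128, 44052] shape for Logits/weights in the error comes from:

```python
# Sketch: inspect the shapes stored in the pretrained checkpoint (TF 1.x).
# The checkpoint path is the one passed via --pretrained_model above.
import tensorflow as tf

ckpt_path = 'models/20170512-110547/model-20170512-110547.ckpt-250000'
reader = tf.train.NewCheckpointReader(ckpt_path)

# Prints e.g. "Logits/weights [128, 44052]", i.e. a classifier trained on
# 44052 classes, which cannot be assigned to a [128, 367] Logits layer.
for name, shape in sorted(reader.get_variable_to_shape_map().items()):
    print('%s %s' % (name, shape))
```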

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Comments: 7 (1 by maintainers)

Top GitHub Comments

14 reactions
hector246288 commented, Jul 14, 2017

Actually, I have found an easy way to reach the goal. Assume that the layers in set A should inherit their variables from the pretrained model, while the remaining set B contains the layers that are newly defined or have a different input/output size. Here I just use FaceNet (Inception-ResNet-v1) as an example:

    # Set A: backbone variables that exist in the pretrained checkpoint.
    set_A_vars = [v for v in tf.trainable_variables() if v.name.startswith('InceptionResnetV1')]
    # Saver that restores only set A from the pretrained model.
    saver_set_A = tf.train.Saver(set_A_vars, max_to_keep=3)
    # Saver that saves everything (sets A and B) after training.
    saver_set_A_and_B = tf.train.Saver(tf.trainable_variables(), max_to_keep=3)

In my approach, I first do the global initialization as below:

    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())

Then I do the transfer learning by restoring the variables in set A:

    saver_set_A.restore(sess, pretrained_model)

And please don't forget to save your trained model completely (sets A and B):

    saver_set_A_and_B.save(sess, checkpoint_path)  # pass any further arguments as needed
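To make the split concrete, here is a small, self-contained sketch (TF 1.x) using a stand-in graph; only the 'InceptionResnetV1' and 'Logits' scope names are taken from the real FaceNet graph, the variable names and shapes below are placeholders:

```python
# Self-contained illustration of the set A / set B split (TF 1.x).
import tensorflow as tf

# Stand-in variables; the real FaceNet graph defines its backbone under the
# 'InceptionResnetV1' scope and its classifier under 'Logits'.
with tf.variable_scope('InceptionResnetV1'):
    tf.get_variable('stand_in_backbone_weights', shape=[3, 3, 3, 32])
with tf.variable_scope('Logits'):
    tf.get_variable('weights', shape=[128, 367])  # new 367-class classifier

# Set A: restored from the pretrained checkpoint.
set_A_vars = [v for v in tf.trainable_variables()
              if v.name.startswith('InceptionResnetV1')]
# Set B: newly defined layers, kept at their fresh initialization.
set_B_vars = [v for v in tf.trainable_variables()
              if not v.name.startswith('InceptionResnetV1')]

saver_set_A = tf.train.Saver(set_A_vars, max_to_keep=3)
saver_set_A_and_B = tf.train.Saver(tf.trainable_variables(), max_to_keep=3)

print('Set A (restore from checkpoint): %s' % [v.name for v in set_A_vars])
print('Set B (train from scratch): %s' % [v.name for v in set_B_vars])
```

With the real graph in place, saver_set_A.restore(...) succeeds because the mismatched Logits variables are simply left out of the restore list.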

If you have any more efficient way for transfer learning, please let me know. =)


