Transfer learning with the pretrained model.
Hi there, I just tried to load the pretrained model (20170512-110547) as the initial state for transfer learning on my own database, which has 367 identities. I followed the instructions on the page Classifier training of Inception-ResNet-v1. Here is my command:
$ python src/train_softmax.py --logs_base_dir logs/ --models_base_dir models/ --data_dir ~/Database/my_own_database/ --image_size 160 --model_def models.inception_resnet_v1 --pretrained_model models/20170512-110547/model-20170512-110547.ckpt-250000 --optimizer RMSPROP --learning_rate -1 --max_nrof_epochs 80 --keep_probability 0.8 --random_crop --random_flip --learning_rate_schedule_file data/learning_rate_schedule_classifier_casia.txt --weight_decay 5e-5 --center_loss_factor 1e-2 --center_loss_alfa 0.9
However, the run failed with a shape mismatch. The error message was:
`Traceback (most recent call last):
  File "src/train_softmax.py", line 447, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "src/train_softmax.py", line 206, in main
    saver.restore(sess, pretrained_model)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1457, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 778, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 982, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1032, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1052, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [128,367] rhs shape= [128,44052]
  [[Node: save/Assign_491 = Assign[T=DT_FLOAT, _class=["loc:@Logits/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](Logits/weights, save/RestoreV2_491/_77)]]

Caused by op u'save/Assign_491', defined at:
  File "src/train_softmax.py", line 447, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "src/train_softmax.py", line 188, in main
    saver = tf.train.Saver(tf.trainable_variables(), max_to_keep=3)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1056, in __init__
    self.build()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1086, in build
    restore_sequentially=self._restore_sequentially)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 691, in build
    restore_sequentially, reshape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 419, in _AddRestoreOps
    assign_ops.append(saveable.restore(tensors, shapes))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 155, in restore
    self.op.get_shape().is_fully_defined())
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/state_ops.py", line 270, in assign
    validate_shape=validate_shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 47, in assign
    use_locking=use_locking, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [128,367] rhs shape= [128,44052]
  [[Node: save/Assign_491 = Assign[T=DT_FLOAT, _class=["loc:@Logits/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](Logits/weights, save/RestoreV2_491/_77)]]`
I have checked the input image type (.png) and size (182x182), and I do not believe they caused the error. Could you share the training command you used to train the model (20170512-110547)? Or should I change any parameters of the fully connected layers in the model?
Top GitHub Comments
Actually, I have found an easy way to reach the goal. Assume that you want the layers in set A to inherit their variables from the pretrained model. The remaining set B contains the layers that are newly defined or have a different input/output size, such as the Logits layer in the error above. Here I use FaceNet (Inception-ResNet-v1) as an example:
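# Set A: trainable variables under the InceptionResnetV1 scope (Logits is excluded)
set_A_vars = [v for v in tf.trainable_variables() if v.name.startswith('InceptionResnetV1')]
# Saver used only to restore set A from the pretrained checkpoint
saver_set_A = tf.train.Saver(set_A_vars, max_to_keep=3)
# Saver used to save the complete model (sets A and B)
saver_set_A_and_B = tf.train.Saver(tf.trainable_variables(), max_to_keep=3)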
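To make the split concrete without needing TensorFlow, here is a minimal pure-Python sketch of the same selection, extended with a shape check. The two dictionaries are hypothetical stand-ins: `graph_shapes` for the variables of the new graph and `ckpt_shapes` for what the checkpoint holds (in TensorFlow 1.x, `tf.train.list_variables` can list the latter). Only the `Logits/weights` shapes come from the error message above; the other entry is illustrative, and the helper `split_restorable` is an assumption, not part of FaceNet.

```python
def split_restorable(graph_shapes, ckpt_shapes, prefix="InceptionResnetV1"):
    """Split graph variables into set A (restore) and set B (train from scratch).

    A variable goes into set A only if its name starts with `prefix` and the
    checkpoint stores a tensor with the same name and the same shape.
    """
    set_a, set_b = [], []
    for name, shape in graph_shapes.items():
        if name.startswith(prefix) and ckpt_shapes.get(name) == shape:
            set_a.append(name)
        else:
            set_b.append(name)
    return sorted(set_a), sorted(set_b)


# Hypothetical shapes; only Logits/weights is taken from the traceback.
graph_shapes = {
    "InceptionResnetV1/Conv2d_1a_3x3/weights": (3, 3, 3, 32),
    "Logits/weights": (128, 367),    # new classifier: 367 identities
}
ckpt_shapes = {
    "InceptionResnetV1/Conv2d_1a_3x3/weights": (3, 3, 3, 32),
    "Logits/weights": (128, 44052),  # classifier stored in the checkpoint
}

set_a, set_b = split_restorable(graph_shapes, ckpt_shapes)
print("restore:", set_a)
print("train from scratch:", set_b)
```

The shape check is an extra safeguard on top of the name prefix: a variable whose shape differs from the checkpoint would land in set B even if it lived under the same scope.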
In my approach, I first run the global and local initializers:
sess.run(tf.global_variables_initializer())
sess.run(tf.local_variables_initializer())
Then I do the transfer learning by restoring only the variables in set A, which overwrites their freshly initialized values:
saver_set_A.restore(sess, pretrained_model)
Finally, don’t forget to save the complete trained model (sets A and B):
saver_set_A_and_B.save(sess, checkpoint_path)  # optional arguments such as global_step may follow
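The order of these steps matters: the initializers must run before the restore, otherwise they would overwrite the restored weights. The sketch below mimics that with plain dictionaries instead of a TensorFlow session. The functions `initialize` and `restore`, the variable names, and the string values are hypothetical placeholders, not TensorFlow or FaceNet APIs.

```python
def initialize(names):
    """Stand-in for the global initializer: every variable gets a fresh value."""
    return {name: "fresh" for name in names}


def restore(state, checkpoint, names):
    """Stand-in for saver_set_A.restore: overwrite only the listed variables."""
    for name in names:
        state[name] = checkpoint[name]
    return state


# Hypothetical variable names covering sets A and B.
all_vars = ["InceptionResnetV1/weights", "Logits/weights"]
set_a = [n for n in all_vars if n.startswith("InceptionResnetV1")]
checkpoint = {
    "InceptionResnetV1/weights": "pretrained",
    "Logits/weights": "pretrained (wrong shape, never read)",
}

state = initialize(all_vars)               # step 1: initialize everything
state = restore(state, checkpoint, set_a)  # step 2: restore set A only
saved = dict(state)                        # step 3: save sets A and B together
print(saved)
```

After these steps the backbone carries the pretrained values while the classifier keeps its fresh initialization, which is the state the training run should start from.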
If you know a more efficient way to do transfer learning, please let me know. =)
Thanks, @hector246288.