Errors while trying to fine-tune the model


Hello,

I followed all the steps and used fine_tune.py to fine-tune the model, but I got some errors. The number of classes is 2. I used this command: python fine_tune.py --not-restore-last

I used these parameters:

IMG_MEAN = np.array((145.2201, 119.0066, 97.9356), dtype=np.float32)

BATCH_SIZE = 1
DATA_DIRECTORY = '/home/hesam/Desktop/2/train'
DATA_LIST_PATH = 'data/train.txt'
IGNORE_LABEL = 255
INPUT_SIZE = '321,321'
LEARNING_RATE = 1e-4
NUM_CLASSES = 2
NUM_STEPS = 20000
RANDOM_SEED = 1234
RESTORE_FROM = 'data/deeplab_resnet.ckpt'
SAVE_NUM_IMAGES = 2
SAVE_PRED_EVERY = 100
SNAPSHOT_DIR = 'data'

and

# colour map
label_colours = [(0,0,0),(131,0,5)]

The errors:

(tensorflow) hesam@hesam-MS-7994:~/Desktop/tensorflow-deeplab-resnet-master$ python fine_tune.py --not-restore-last
Couldn't import dot_parser, loading of dot files will not be possible.
2017-04-26 11:26:40.077467: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-04-26 11:26:40.077492: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-04-26 11:26:40.077496: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-04-26 11:26:40.077499: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-04-26 11:26:40.077502: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-04-26 11:26:40.195856: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-04-26 11:26:40.196757: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties: 
name: GeForce GTX 1060 6GB
major: 6 minor: 1 memoryClockRate (GHz) 1.759
pciBusID 0000:01:00.0
Total memory: 5.93GiB
Free memory: 5.54GiB
2017-04-26 11:26:40.196771: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0 
2017-04-26 11:26:40.196775: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0:   Y 
2017-04-26 11:26:40.196780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0)
Restored model parameters from data/deeplab_resnet.ckpt
2017-04-26 11:26:46.111049: W tensorflow/core/framework/op_kernel.cc:1152] Not found: /home/hesam/Desktop/2/train/labels/0039.png
2017-04-26 11:26:46.112367: W tensorflow/core/framework/op_kernel.cc:1152] Not found: /home/hesam/Desktop/2/train/labels/0039.png
	 [[Node: create_inputs/ReadFile_1 = ReadFile[_device="/job:localhost/replica:0/task:0/cpu:0"](create_inputs/input_producer/Gather_1)]]
2017-04-26 11:26:46.222843: W tensorflow/core/framework/op_kernel.cc:1152] Out of range: FIFOQueue '_1_create_inputs/batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
	 [[Node: create_inputs/batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](create_inputs/batch/fifo_queue, create_inputs/batch/n)]]
2017-04-26 11:26:46.223115: W tensorflow/core/framework/op_kernel.cc:1152] Out of range: FIFOQueue '_1_create_inputs/batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
	 [[Node: create_inputs/batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](create_inputs/batch/fifo_queue, create_inputs/batch/n)]]
Traceback (most recent call last):
  File "fine_tune.py", line 207, in <module>
    main()
  File "fine_tune.py", line 196, in main
    loss_value, images, labels, preds, summary, _ = sess.run([reduced_loss, image_batch, label_batch, pred, total_summary, optim])
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 778, in run
    run_metadata_ptr)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 982, in _run
    feed_dict_string, options, run_metadata)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1032, in _do_run
    target_list, options, run_metadata)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1052, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_1_create_inputs/batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
	 [[Node: create_inputs/batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](create_inputs/batch/fifo_queue, create_inputs/batch/n)]]

Caused by op u'create_inputs/batch', defined at:
  File "fine_tune.py", line 207, in <module>
    main()
  File "fine_tune.py", line 125, in main
    image_batch, label_batch = reader.dequeue(args.batch_size)
  File "/home/hesam/Desktop/tensorflow-deeplab-resnet-master/deeplab_resnet/image_reader.py", line 179, in dequeue
    num_elements)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/training/input.py", line 917, in batch
    name=name)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/training/input.py", line 712, in _batch
    dequeued = queue.dequeue_many(batch_size, name=name)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/data_flow_ops.py", line 458, in dequeue_many
    self._queue_ref, n=n, component_types=self._dtypes, name=name)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 1328, in _queue_dequeue_many_v2
    timeout_ms=timeout_ms, name=name)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
    op_def=op_def)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
    self._traceback = _extract_stack()

OutOfRangeError (see above for traceback): FIFOQueue '_1_create_inputs/batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
	 [[Node: create_inputs/batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](create_inputs/batch/fifo_queue, create_inputs/batch/n)]]
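
The first warning in the log above is "Not found: /home/hesam/Desktop/2/train/labels/0039.png"; the OutOfRangeError on the FIFOQueue comes only after that, once the input queue has closed without producing a batch. A quick way to rule out missing files before training is to walk the list file and check every path it names. The following is a minimal sketch, assuming each line of data/train.txt holds an image path and a label path separated by whitespace, both relative to DATA_DIRECTORY; the function name find_missing_files is hypothetical and not part of the repository.

```python
import os


def find_missing_files(data_dir, list_path):
    """Return every path named in the list file that does not exist under data_dir.

    Assumption: each non-empty line looks like '<image_path> <label_path>',
    with both paths relative to data_dir (a leading '/' is tolerated).
    """
    missing = []
    with open(list_path) as list_file:
        for line in list_file:
            for rel_path in line.split():
                # Strip a leading '/' so os.path.join does not discard data_dir.
                full_path = os.path.join(data_dir, rel_path.lstrip("/"))
                if not os.path.isfile(full_path):
                    missing.append(full_path)
    return missing
```

With the settings from the question, calling find_missing_files('/home/hesam/Desktop/2/train', 'data/train.txt') should return an empty list before fine_tune.py is started; any path it returns is a file the reader will fail to open.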


Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 28 (11 by maintainers)

Top GitHub Comments

DrSleep commented, Apr 26, 2017 (2 reactions)

https://github.com/DrSleep/tensorflow-deeplab-resnet/issues/75

TL;DR: there was a bug, which is now fixed. Please clone the repository again.

DrSleep commented, Apr 27, 2017 (1 reaction)

Please take a look here: https://github.com/martinkersner/train-DeepLab#data-conversions (the second paragraph). Also this function should be useful to perform the conversion: https://github.com/martinkersner/train-DeepLab/blob/master/utils.py#L91
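
For readers who cannot follow the links, here is a rough sketch of the kind of conversion being pointed to, assuming it means turning colour-coded label images into single-channel images of class indices. It is not the linked utils.py function. The colour map and IGNORE_LABEL are taken from the question; the name rgb_to_index and the choice to map unknown colours to the ignore label are assumptions made for illustration. Loading and saving the PNG files is left out.

```python
import numpy as np

# Colour map from the question: index 0 is (0,0,0), index 1 is (131,0,5).
LABEL_COLOURS = [(0, 0, 0), (131, 0, 5)]
IGNORE_LABEL = 255


def rgb_to_index(rgb_label, colours=LABEL_COLOURS, ignore_label=IGNORE_LABEL):
    """Map an (H, W, 3) colour label image to an (H, W) uint8 class-index map.

    Assumption: pixels whose colour is not in `colours` are set to
    `ignore_label` so they can be excluded from the loss.
    """
    rgb_label = np.asarray(rgb_label)
    index_map = np.full(rgb_label.shape[:2], ignore_label, dtype=np.uint8)
    for class_id, colour in enumerate(colours):
        # True where all three channels match this class colour.
        matches = np.all(rgb_label == np.array(colour), axis=-1)
        index_map[matches] = class_id
    return index_map
```

With NUM_CLASSES = 2, the converted label images would then contain only the values 0, 1 and 255.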
