Errors while trying to fine-tune the model

Hello,

I completed all the steps and used fine_tune.py to fine-tune the model, but I got some errors. The number of classes is 2. I used this command: python fine_tune.py --not-restore-last

I used these parameters:

IMG_MEAN = np.array((145.2201, 119.0066, 97.9356), dtype=np.float32)

BATCH_SIZE = 1
DATA_DIRECTORY = '/home/hesam/Desktop/2/train'
DATA_LIST_PATH = 'data/train.txt'
IGNORE_LABEL = 255
INPUT_SIZE = '321,321'
LEARNING_RATE = 1e-4
NUM_CLASSES = 2
NUM_STEPS = 20000
RANDOM_SEED = 1234
RESTORE_FROM = 'data/deeplab_resnet.ckpt'
SAVE_NUM_IMAGES = 2
SAVE_PRED_EVERY = 100
SNAPSHOT_DIR = 'data'

and

# colour map
label_colours = [(0,0,0),(131,0,5)]

The errors:

(tensorflow) hesam@hesam-MS-7994:~/Desktop/tensorflow-deeplab-resnet-master$ python fine_tune.py --not-restore-last
Couldn't import dot_parser, loading of dot files will not be possible.
2017-04-26 11:26:40.077467: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-04-26 11:26:40.077492: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-04-26 11:26:40.077496: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-04-26 11:26:40.077499: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-04-26 11:26:40.077502: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-04-26 11:26:40.195856: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-04-26 11:26:40.196757: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties: 
name: GeForce GTX 1060 6GB
major: 6 minor: 1 memoryClockRate (GHz) 1.759
pciBusID 0000:01:00.0
Total memory: 5.93GiB
Free memory: 5.54GiB
2017-04-26 11:26:40.196771: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0 
2017-04-26 11:26:40.196775: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0:   Y 
2017-04-26 11:26:40.196780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0)
Restored model parameters from data/deeplab_resnet.ckpt
2017-04-26 11:26:46.111049: W tensorflow/core/framework/op_kernel.cc:1152] Not found: /home/hesam/Desktop/2/train/labels/0039.png
2017-04-26 11:26:46.112367: W tensorflow/core/framework/op_kernel.cc:1152] Not found: /home/hesam/Desktop/2/train/labels/0039.png
	 [[Node: create_inputs/ReadFile_1 = ReadFile[_device="/job:localhost/replica:0/task:0/cpu:0"](create_inputs/input_producer/Gather_1)]]
2017-04-26 11:26:46.222843: W tensorflow/core/framework/op_kernel.cc:1152] Out of range: FIFOQueue '_1_create_inputs/batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
	 [[Node: create_inputs/batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](create_inputs/batch/fifo_queue, create_inputs/batch/n)]]
2017-04-26 11:26:46.223115: W tensorflow/core/framework/op_kernel.cc:1152] Out of range: FIFOQueue '_1_create_inputs/batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
	 [[Node: create_inputs/batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](create_inputs/batch/fifo_queue, create_inputs/batch/n)]]
Traceback (most recent call last):
  File "fine_tune.py", line 207, in <module>
    main()
  File "fine_tune.py", line 196, in main
    loss_value, images, labels, preds, summary, _ = sess.run([reduced_loss, image_batch, label_batch, pred, total_summary, optim])
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 778, in run
    run_metadata_ptr)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 982, in _run
    feed_dict_string, options, run_metadata)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1032, in _do_run
    target_list, options, run_metadata)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1052, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_1_create_inputs/batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
	 [[Node: create_inputs/batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](create_inputs/batch/fifo_queue, create_inputs/batch/n)]]

Caused by op u'create_inputs/batch', defined at:
  File "fine_tune.py", line 207, in <module>
    main()
  File "fine_tune.py", line 125, in main
    image_batch, label_batch = reader.dequeue(args.batch_size)
  File "/home/hesam/Desktop/tensorflow-deeplab-resnet-master/deeplab_resnet/image_reader.py", line 179, in dequeue
    num_elements)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/training/input.py", line 917, in batch
    name=name)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/training/input.py", line 712, in _batch
    dequeued = queue.dequeue_many(batch_size, name=name)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/data_flow_ops.py", line 458, in dequeue_many
    self._queue_ref, n=n, component_types=self._dtypes, name=name)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 1328, in _queue_dequeue_many_v2
    timeout_ms=timeout_ms, name=name)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
    op_def=op_def)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/hesam/Desktop/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
    self._traceback = _extract_stack()

OutOfRangeError (see above for traceback): FIFOQueue '_1_create_inputs/batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
	 [[Node: create_inputs/batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](create_inputs/batch/fifo_queue, create_inputs/batch/n)]]
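
The first failure in the log above is the "Not found" warning for /home/hesam/Desktop/2/train/labels/0039.png. The OutOfRangeError most likely follows from it: once the file read fails, the input queue is closed while still empty, so the dequeue has nothing to return. A minimal sketch for checking the list file against the disk is shown below. It assumes each line of data/train.txt holds whitespace-separated paths (image, then label) and that the reader builds the full path by appending each entry directly to DATA_DIRECTORY. The function name find_missing_files is hypothetical and is not part of the repository.

```python
import os


def find_missing_files(data_dir, list_path):
    """Return every path named in the list file that does not exist on disk.

    Assumption: each non-empty line holds whitespace-separated relative
    paths, and the full path is data_dir + entry (plain concatenation).
    """
    missing = []
    with open(list_path) as list_file:
        for line in list_file:
            entries = line.split()
            if not entries:
                continue  # skip blank lines
            for entry in entries:
                full_path = data_dir + entry
                if not os.path.isfile(full_path):
                    missing.append(full_path)
    return missing
```

Calling find_missing_files('/home/hesam/Desktop/2/train', 'data/train.txt') before training and printing the result would show whether labels/0039.png is the only entry that cannot be found or whether other entries are affected as well.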