Load Failure around Checkpoint Reading
Hi! Bumping into some problems trying to run this. If helpful, my setup is a GTX 1080 using CUDA 7.5 with TensorFlow v.8 on Ubuntu 14.04.
With a folder of my own images I'm getting:
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
{'batch_size': 64,
'beta1': 0.5,
'checkpoint_dir': 'checkpoint',
'dataset': 'doom2graphics',
'epoch': 25,
'image_size': 108,
'is_crop': False,
'is_train': True,
'learning_rate': 0.0002,
'sample_dir': 'samples',
'train_size': inf,
'visualize': False}
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 6.88GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:755] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
[*] Reading checkpoints...
[!] Load failed...
And then I get a little more info at the end when I try with the celebA set:
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
{'batch_size': 64,
'beta1': 0.5,
'checkpoint_dir': 'checkpoint',
'dataset': 'celebA',
'epoch': 25,
'image_size': 108,
'is_crop': True,
'is_train': True,
'learning_rate': 0.0002,
'sample_dir': 'samples',
'train_size': inf,
'visualize': False}
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 6.87GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:755] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
[*] Reading checkpoints...
[!] Load failed...
F tensorflow/stream_executor/cuda/cuda_dnn.cc:427] could not set cudnn filter descriptor: CUDNN_STATUS_BAD_PARAM
F tensorflow/stream_executor/cuda/cuda_dnn.cc:427] could not set cudnn filter descriptor: CUDNN_STATUS_BAD_PARAM
Aborted (core dumped)
Noob to this universe. Any thoughts? Thank you for your time!
Issue Analytics
- State:
- Created 7 years ago
- Comments:6 (2 by maintainers)
Top GitHub Comments
I fixed the same problem by checking this line in model.py:
data = glob(os.path.join("./data", config.dataset, "*.jpg"))
It turned out that my directory name did not match config.dataset. Try printing the result of this line to make sure that data is not an empty list. After that, celebA ran.

I have this line instead, and I still get "Load failed":
self.data = glob(os.path.join("./data", self.dataset_name, self.input_fname_pattern))
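For anyone landing here with the same symptom, the check suggested above can be sketched as a small standalone helper. This is only a sketch, assuming the ./data/<dataset> layout from the quoted glob lines; find_training_images and its parameters are hypothetical names for illustration, not part of the repository.

```python
import os
from glob import glob


def find_training_images(dataset_name, pattern="*.jpg", data_root="./data"):
    """Return the sorted file paths matched by the same kind of glob the
    quoted model.py lines use, and fail loudly if nothing matches.

    All names here are hypothetical; only the os.path.join + glob
    combination comes from the lines quoted in the comments above.
    """
    search_path = os.path.join(data_root, dataset_name, pattern)
    paths = sorted(glob(search_path))
    if not paths:
        # An empty list usually means the folder name or the file
        # extension does not match what the script was told to look for.
        raise FileNotFoundError(
            f"No files matched {search_path!r}; "
            "check the dataset folder name and the file extension"
        )
    return paths
```

Calling find_training_images("celebA") before training, or with pattern="*.png" for a folder of PNG images, shows right away whether the list would be empty instead of leaving that to surface later in the run.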