Can't convert ImageNet to TFDS
What I need help with / What I was wondering
I need to run python -m tensorflow_datasets.scripts.download_and_prepare --datasets=imagenet2012
to convert the ImageNet dataset to the “tfds” format.
I have:
~/tensorflow_datasets/downloads/manual/ILSVRC2012_img_train.tar
~/tensorflow_datasets/downloads/manual/ILSVRC2012_img_val.tar
I get a crash:
$ python -m tensorflow_datasets.scripts.download_and_prepare --datasets=imagenet2012
2020-07-09 12:42:44.810165: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
I0709 12:42:46.014064 139951095191360 download_and_prepare.py:201] Running download_and_prepare for dataset(s):
imagenet2012
2020-07-09 12:42:46.039449: W tensorflow/core/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "Not found: Could not locate the credentials file.". Retrieving token from GCE failed with "Failed precondition: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Couldn't resolve host 'metadata'".
I0709 12:42:46.465344 139951095191360 dataset_info.py:427] Load pre-computed DatasetInfo (eg: splits, num examples,...) from GCS: imagenet2012/5.0.0
I0709 12:42:46.734692 139951095191360 dataset_info.py:358] Load dataset info from /tmp/tmpq9l_v2v7tfds
I0709 12:42:46.738695 139951095191360 dataset_info.py:398] Field info.description from disk and from code do not match. Keeping the one from code.
I0709 12:42:46.738808 139951095191360 dataset_info.py:398] Field info.citation from disk and from code do not match. Keeping the one from code.
I0709 12:42:46.739076 139951095191360 download_and_prepare.py:139] download_and_prepare for dataset imagenet2012/5.0.0...
I0709 12:42:46.739398 139951095191360 dataset_builder.py:346] Generating dataset imagenet2012 (/home/bryanloz/tensorflow_datasets/imagenet2012/5.0.0)
Downloading and preparing dataset imagenet2012/5.0.0 (download: 144.02 GiB, generated: Unknown size, total: 144.02 GiB) to /home/bryanloz/tensorflow_datasets/imagenet2012/5.0.0...
I0709 12:42:49.851376 139951095191360 dataset_builder.py:947] Generating split train
76809 examples [01:11, 1230.11 examples/s]
2020-07-09 12:44:01.563442: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-07-09 12:44:01.758785: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:3b:00.0 name: Tesla V100-PCIE-16GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 15.75GiB deviceMemoryBandwidth: 836.37GiB/s
2020-07-09 12:44:01.760404: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 1 with properties:
pciBusID: 0000:af:00.0 name: Tesla V100-PCIE-16GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 15.75GiB deviceMemoryBandwidth: 836.37GiB/s
2020-07-09 12:44:01.761609: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 2 with properties:
pciBusID: 0000:d8:00.0 name: Tesla V100-PCIE-16GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 15.75GiB deviceMemoryBandwidth: 836.37GiB/s
2020-07-09 12:44:01.761638: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-07-09 12:44:01.763099: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-07-09 12:44:01.764466: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-07-09 12:44:01.768670: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-07-09 12:44:01.807622: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-07-09 12:44:01.809431: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-07-09 12:44:01.815753: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-07-09 12:44:01.827543: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0, 1, 2
2020-07-09 12:44:01.828474: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-07-09 12:44:01.868361: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2100000000 Hz
2020-07-09 12:44:01.878461: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x557610c82a00 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-09 12:44:01.878515: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-07-09 12:44:01.900848: W tensorflow/compiler/xla/service/platform_util.cc:210] unable to create StreamExecutor for CUDA:1: failed initializing StreamExecutor for CUDA device ordinal 1: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OPERATING_SYSTEM: OS call failed or operation not supported on this OS
2020-07-09 12:44:02.069513: W tensorflow/compiler/xla/service/platform_util.cc:210] unable to create StreamExecutor for CUDA:2: failed initializing StreamExecutor for CUDA device ordinal 2: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OPERATING_SYSTEM: OS call failed or operation not supported on this OS
2020-07-09 12:44:02.069596: W tensorflow/compiler/xla/service/platform_util.cc:210] unable to create StreamExecutor for CUDA:0: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OPERATING_SYSTEM: OS call failed or operation not supported on this OS
2020-07-09 12:44:02.069938: I tensorflow/compiler/jit/xla_gpu_device.cc:161] Ignoring visible XLA_GPU_JIT device. Device number is 0, reason: Internal: no supported devices found for platform CUDA
2020-07-09 12:44:02.074957: W tensorflow/compiler/xla/service/platform_util.cc:210] unable to create StreamExecutor for CUDA:0: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OPERATING_SYSTEM: OS call failed or operation not supported on this OS
2020-07-09 12:44:02.214344: W tensorflow/compiler/xla/service/platform_util.cc:210] unable to create StreamExecutor for CUDA:2: failed initializing StreamExecutor for CUDA device ordinal 2: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OPERATING_SYSTEM: OS call failed or operation not supported on this OS
2020-07-09 12:44:02.214487: W tensorflow/compiler/xla/service/platform_util.cc:210] unable to create StreamExecutor for CUDA:1: failed initializing StreamExecutor for CUDA device ordinal 1: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OPERATING_SYSTEM: OS call failed or operation not supported on this OS
2020-07-09 12:44:02.214891: I tensorflow/compiler/jit/xla_gpu_device.cc:161] Ignoring visible XLA_GPU_JIT device. Device number is 1, reason: Internal: no supported devices found for platform CUDA
2020-07-09 12:44:02.220597: W tensorflow/compiler/xla/service/platform_util.cc:210] unable to create StreamExecutor for CUDA:0: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OPERATING_SYSTEM: OS call failed or operation not supported on this OS
2020-07-09 12:44:02.376580: W tensorflow/compiler/xla/service/platform_util.cc:210] unable to create StreamExecutor for CUDA:1: failed initializing StreamExecutor for CUDA device ordinal 1: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OPERATING_SYSTEM: OS call failed or operation not supported on this OS
2020-07-09 12:44:02.376694: W tensorflow/compiler/xla/service/platform_util.cc:210] unable to create StreamExecutor for CUDA:2: failed initializing StreamExecutor for CUDA device ordinal 2: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OPERATING_SYSTEM: OS call failed or operation not supported on this OS
2020-07-09 12:44:02.377093: I tensorflow/compiler/jit/xla_gpu_device.cc:161] Ignoring visible XLA_GPU_JIT device. Device number is 2, reason: Internal: no supported devices found for platform CUDA
2020-07-09 12:44:02.518009: F tensorflow/stream_executor/lib/statusor.cc:34] Attempting to fetch value instead of handling error Internal: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OPERATING_SYSTEM: OS call failed or operation not supported on this OS
Fatal Python error: Aborted
Thread 0x00007f4786554700 (most recent call first):
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/threading.py", line 300 in wait
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/threading.py", line 552 in wait
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tqdm/_monitor.py", line 69 in run
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/threading.py", line 917 in _bootstrap_inner
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/threading.py", line 885 in _bootstrap
Current thread 0x00007f48e7509740 (most recent call first):
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 539 in ensure_initialized
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py", line 97 in convert_to_eager_tensor
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py", line 300 in _constant_eager_impl
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py", line 275 in _constant_impl
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py", line 264 in constant
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py", line 338 in _constant_tensor_conversion_function
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1525 in convert_to_tensor
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow/python/ops/gen_image_ops.py", line 1241 in decode_jpeg_eager_fallback
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow/python/ops/gen_image_ops.py", line 1177 in decode_jpeg
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow_datasets/core/utils/tf_utils.py", line 77 in run
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow_datasets/core/utils/image_utils.py", line 54 in jpeg_cmyk_to_rgb
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow_datasets/image_classification/imagenet.py", line 176 in _fix_image
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow_datasets/image_classification/imagenet.py", line 197 in _generate_examples
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tqdm/std.py", line 1129 in __iter__
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow_datasets/core/dataset_builder.py", line 1034 in _prepare_split
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow_datasets/core/dataset_builder.py", line 951 in _download_and_prepare
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow_datasets/core/dataset_builder.py", line 1019 in _download_and_prepare
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow_datasets/core/dataset_builder.py", line 376 in download_and_prepare
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow_datasets/core/api_utils.py", line 69 in disallow_positional_args_dec
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow_datasets/scripts/download_and_prepare.py", line 156 in download_and_prepare
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow_datasets/scripts/download_and_prepare.py", line 236 in main
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/absl/app.py", line 250 in _run_main
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/absl/app.py", line 299 in run
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/site-packages/tensorflow_datasets/scripts/download_and_prepare.py", line 241 in <module>
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/runpy.py", line 85 in _run_code
File "/scratch/bryanloz/anaconda3/envs/tf22/lib/python3.7/runpy.py", line 193 in _run_module_as_main
Aborted (core dumped)
What I’ve tried so far
I’ve tried moving the tarballs from network storage to local storage, with no improvement.
It would be nice if…
It might be helpful if the tfds documentation were a bit more verbose. What is tfds even doing to my tarballs?
Environment information (if applicable)
- Ubuntu 18.04
- Python version: 3.7.0
- tensorflow-datasets/tfds-nightly version: (tfds-nightly) 3.1.0
- tensorflow/tensorflow-gpu/tf-nightly/tf-nightly-gpu version: (tf-nightly) 2.4.0-dev20200709
Issue Analytics
- Created 3 years ago
- Comments:5 (2 by maintainers)
TFDS preprocesses public data into a standard, uniform TFRecord format that can be loaded as an efficient tf.data.Dataset pipeline. I would recommend our introduction: https://www.tensorflow.org/datasets/overview

Closing, because I am happy, but I think this issue could be helpful to others in the future. (If you see weird CUDA problems, try debugging by running on CPU only.)
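For anyone landing here with the same crash, the CPU-only suggestion could look roughly like the sketch below. It reruns the command from the report in a child process with the GPUs hidden. The helper names cpu_only_env and run_cpu_only are made up for illustration, and hiding devices through the CUDA_VISIBLE_DEVICES environment variable is an assumption about what "CPU only" meant here; the thread does not say how it was actually done.

```python
import os
import subprocess
import sys

# Command from the original report; sys.executable stands in for "python"
# so the child process uses the same interpreter (and conda env).
COMMAND = [
    sys.executable, "-m",
    "tensorflow_datasets.scripts.download_and_prepare",
    "--datasets=imagenet2012",
]


def cpu_only_env(base_env=None):
    """Return a copy of the environment with all CUDA devices hidden."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: "-1" is not a valid device index, so CUDA exposes no GPUs
    # to the child process and TensorFlow falls back to the CPU.
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


def run_cpu_only():
    """Rerun the conversion in a child process that cannot see any GPU."""
    return subprocess.run(COMMAND, env=cpu_only_env(), check=False).returncode
```

Calling run_cpu_only() starts the conversion and returns the exit code of the child process. The variable has to be set before TensorFlow initializes CUDA, which is why it is passed to a fresh process instead of being changed after import.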