Improving ImageNet-1k support
W.r.t. the current support for ImageNet-1k, we can improve a few things:
- First, let’s start leveraging TFDS. It significantly reduces the work expected of the user. Let’s walk through an example. The user first needs to place the `ILSVRC2012_img_train.tar` and `ILSVRC2012_img_val.tar` archives at `gs://[BUCKET-NAME]/tensorflow_datasets/downloads/manual` (e.g., with `gsutil cp`).
- Once this is done, the user does the following:

  ```python
  import tensorflow_datasets as tfds

  data_dir = "gs://[BUCKET-NAME]/tensorflow_datasets"
  builder = tfds.builder("imagenet2012", data_dir=data_dir)
  builder.download_and_prepare()
  ```

  `builder.download_and_prepare()` takes some time, but less time than the current process of generating the initial TFRecords.
- Then the user can load the ImageNet-1k dataset with `tfds.load("imagenet2012", data_dir=data_dir)`, and that is it (see the sketch after this list).
The above steps assume the user already has access to the GCS bucket and all the necessary privileges to write data to it.
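For reference, here is a minimal end-to-end loading sketch under the assumptions above. The bucket name is a placeholder, and the split selection and the quick inspection loop are illustrative choices, not anything prescribed by TFDS or this repository.

```python
import tensorflow_datasets as tfds

# Placeholder bucket; assumes builder.download_and_prepare() has already been
# run against this data_dir as shown above.
data_dir = "gs://[BUCKET-NAME]/tensorflow_datasets"

# Load the prepared splits straight from the GCS bucket.
(train_ds, val_ds), info = tfds.load(
    "imagenet2012",
    split=["train", "validation"],
    data_dir=data_dir,
    as_supervised=True,   # yields (image, label) pairs
    shuffle_files=True,
    with_info=True,
)

print(info.splits["train"].num_examples)  # 1,281,167 for ImageNet-1k

# Each element is a (variable-sized uint8 image, integer label) pair.
for image, label in train_ds.take(1):
    print(image.shape, label.numpy())
```

Batching, augmentation, and distribution-strategy wiring stay exactly as they are today; only the input source changes.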
General recommendations

- Enable interleaved reading of the TFRecord shards by setting `num_parallel_reads=tf.data.AUTOTUNE`.
- Enable prefetching of a few batches with `dataset.prefetch(tf.data.AUTOTUNE)` so that the accelerator doesn’t have to wait on the input pipeline.
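As a concrete illustration of both recommendations, here is a hedged sketch against a generic TFRecord pipeline; the file pattern, feature spec, image size, and batch size are assumptions made for illustration, not the repository’s actual schema.

```python
import tensorflow as tf

# Hypothetical shard pattern; point this at wherever the TFRecords actually live.
files = tf.io.gfile.glob("gs://[BUCKET-NAME]/imagenet-tfrecords/train-*")

# Interleaved reading: shards are read in parallel rather than one after another.
dataset = tf.data.TFRecordDataset(files, num_parallel_reads=tf.data.AUTOTUNE)

# Assumed feature spec; match it to how the records were actually serialized.
feature_spec = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
}

def parse(example):
    parsed = tf.io.parse_single_example(example, feature_spec)
    image = tf.io.decode_jpeg(parsed["image"], channels=3)
    image = tf.image.resize(image, (224, 224))
    return image, parsed["label"]

dataset = (
    dataset.map(parse, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(128)
    # Prefetch a few batches so the accelerator never waits on the host.
    .prefetch(tf.data.AUTOTUNE)
)
```

When loading through `tfds.load` instead, `.prefetch(tf.data.AUTOTUNE)` can be chained onto the returned dataset in the same way.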
Issue Analytics

- Created: a year ago
- Reactions: 6
- Comments: 5 (4 by maintainers)
Top GitHub Comments
Yes, it’s possible. However, keeping things inside a GCS bucket is necessary to leverage TPU-based training runs, so the two setups serve somewhat different purposes.
tfds still requires you to download the dataset manually. Are you referring to the process of converting from .tar.gz to TFRecords?