Train on datasets larger than memory
I wish to train a Mask R-CNN model, but I can't fit all training-set annotations as a `list[dict]` in memory (as I understand, this is required when using `DatasetCatalog`).
How can we train on really large datasets, where the annotations alone are vastly larger than available memory?
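As context for the question: `DatasetCatalog.register` in Detectron2 takes a function that returns the complete `list[dict]`, which is what forces everything into memory. One common workaround pattern (a minimal sketch, not a Detectron2 API) is to store one annotation dict per line in a JSONL file, keep only byte offsets in memory, and read each record on demand. The class name `LazyAnnotationDataset`, the JSONL layout, and the field names below are illustrative assumptions.

```python
import json
import os
import tempfile


class LazyAnnotationDataset:
    """Map-style dataset that holds only byte offsets in memory.

    Each line of the JSONL file is one image's annotation dict, so the
    full list[dict] is never materialized at once. This mirrors what a
    custom torch.utils.data.Dataset would do, without the torch dependency.
    """

    def __init__(self, jsonl_path):
        self.path = jsonl_path
        self.offsets = []
        # Scan once to record where each record starts.
        with open(jsonl_path, "rb") as f:
            while True:
                offset = f.tell()
                line = f.readline()
                if not line:
                    break
                if line.strip():
                    self.offsets.append(offset)

    def __len__(self):
        return len(self.offsets)

    def __getitem__(self, idx):
        # Seek to the record and parse only that one line.
        with open(self.path, "rb") as f:
            f.seek(self.offsets[idx])
            return json.loads(f.readline())


if __name__ == "__main__":
    # Hypothetical records with Detectron2-style keys, for illustration only.
    records = [
        {"file_name": f"img_{i}.jpg", "annotations": [{"bbox": [0, 0, 10, 10]}]}
        for i in range(3)
    ]
    with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
        for r in records:
            f.write(json.dumps(r) + "\n")
        path = f.name
    ds = LazyAnnotationDataset(path)
    print(len(ds), ds[1]["file_name"])
    os.unlink(path)
```

A dataset like this can then be wrapped by whatever per-item mapper the training loop uses; only the offset list (a few bytes per image) stays resident, regardless of how large the annotations are.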
Issue Analytics
- State:
- Created: 2 years ago
- Comments: 7 (2 by maintainers)
Top GitHub Comments
D2Go should support training for all Detectron2 model configs.
@zhanghang1989 Hey, does the D2Go cache option require training from the more limited D2Go Model Zoo? If so, is there any way to train on datasets whose annotations don't fit into memory while still being able to choose any Detectron2 model?
(I have a very large dataset to train on, but am not interested in a model optimized for mobile deployment.)