ResourceExhaustedError after several iterations in a grid search
First off, make sure to check your support options.
The preferred way to resolve usage-related matters is through the docs, which are kept up to date with the latest version of Talos.
If you do end up asking for support in a new issue, make sure to follow the steps below carefully.
1) Confirm the below
- I have looked for an answer in the Docs
- My Python version is 3.5 or higher
- I have searched through the Issues for a duplicate
- I’ve tested that my Keras model works as a stand-alone
2) Include the output of:
talos.__version__ == 0.6.7
3) Explain clearly what you are trying to achieve
I am running a grid search that gives 36 rounds.
After about 4 or 5 rounds, during a model.fit, I suddenly get hit by a ResourceExhaustedError. I think this is very odd given that I am able to complete at least 3 rounds of fitting on the GPU (with a model and batch size that take up pretty much all of the GPU memory), so it seems that there is a small but significant memory leak somewhere. Any ideas what it could be?
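For readers hitting the same symptom, here is a minimal sketch of one common mitigation: releasing Keras state after every round so that leftover graphs and tensors do not accumulate across the search. It is not taken from this thread and is not how Talos itself runs rounds. The names run_grid, release_keras_memory, fit_one_round and example_grid are hypothetical, and the grid values are made up just to produce 36 rounds. The only library call assumed is tf.keras.backend.clear_session(), and it is skipped if TensorFlow is not installed.

```python
import gc
import itertools


def release_keras_memory():
    """Best-effort cleanup between rounds (hypothetical helper).

    Clears the global Keras state if TensorFlow is importable, then
    forces a garbage-collection pass. This may not fix every leak.
    """
    try:
        import tensorflow as tf
    except ImportError:
        tf = None
    if tf is not None:
        tf.keras.backend.clear_session()
    gc.collect()


def run_grid(param_grid, fit_one_round, cleanup=release_keras_memory):
    """Call fit_one_round once per parameter combination.

    cleanup() runs after every round, even when the round raises,
    so a failed fit does not leave its model behind for the next one.
    """
    keys = list(param_grid)
    results = []
    for values in itertools.product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        try:
            results.append((params, fit_one_round(params)))
        finally:
            cleanup()
    return results


# Hypothetical grid: 3 * 3 * 4 = 36 rounds, matching the count in the issue.
example_grid = {
    "lr": [1e-2, 1e-3, 1e-4],
    "batch_size": [16, 32, 64],
    "units": [32, 64, 128, 256],
}
```

Here fit_one_round would build a fresh model from params, call model.fit, and return only plain Python values (for example the final validation loss), so no reference to the model survives the round. If memory still grows with this in place, the leak is probably somewhere that clearing the session does not reach.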
Issue Analytics
- State:
- Created 3 years ago
- Comments: 33 (12 by maintainers)
Sure! I use custom keras.utils.Sequence data generators, so I have two dummy variables for my scan command as shown below:
I will take a look at talos 1.0 right away!
I would love to, but that option crashes my Python kernel, so it’s not really possible. This is a long-standing Keras bug, I believe.