Memory chunks overflow when using tf.nn.seq2seq.embedding_attention_seq2seq
Hi,
I’m running the project from source (master) using Python 3.5, and when I change the model from:
tf.nn.seq2seq.embedding_rnn_seq2seq
to
tf.nn.seq2seq.embedding_attention_seq2seq
on line 160 of model.py, it blows up, and I get this message:

. . .
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 7681919232 totalling 7.15GiB
My GPU (a GTX 1080) only has roughly 5 GB of RAM. I tried decreasing the batch size, but even with a batchSize of 5, I still get the same error.
It appears that the chunks are simply too large. How do I decrease them? Also, what exactly does each chunk represent? Is it a vector embedding? Or something else?
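For reference, the switch in question looks roughly like this (a minimal sketch, not the exact code from model.py; the input lists, cell, and vocabulary size are illustrative placeholders, and on newer TensorFlow releases the same functions live under tf.contrib.legacy_seq2seq):

```python
import tensorflow as tf

# Illustrative placeholders for the tensors DeepQA builds elsewhere.
max_length = 10
encoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(max_length)]
decoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(max_length)]
cell = tf.nn.rnn_cell.BasicLSTMCell(256)
vocab_size = 40000   # tens of thousands of tokens with the full corpus
embedding_size = 64

# Original call on line 160 of model.py: plain embedding seq2seq.
outputs, state = tf.nn.seq2seq.embedding_rnn_seq2seq(
    encoder_inputs, decoder_inputs, cell,
    num_encoder_symbols=vocab_size,
    num_decoder_symbols=vocab_size,
    embedding_size=embedding_size)

# Attention variant: same call signature, but it also keeps the encoder
# output for every timestep as attention memory and, without an output
# projection, materialises full vocab_size-wide logits, which is a likely
# source of the multi-GiB allocations in the bfc_allocator log.
outputs, state = tf.nn.seq2seq.embedding_attention_seq2seq(
    encoder_inputs, decoder_inputs, cell,
    num_encoder_symbols=vocab_size,
    num_decoder_symbols=vocab_size,
    embedding_size=embedding_size)
```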
Top GitHub Comments
Try setting the `softmaxSamples` parameter: `--softmaxSamples 512`. That may help. Also, the vocabulary size is probably too big, as described here: https://github.com/Conchylicultor/DeepQA/issues/29#issuecomment-267771058.
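For context, the softmaxSamples option enables a sampled-softmax loss with an output projection, so training never has to materialise the full vocabulary-wide softmax. A rough sketch of the idea follows (variable names and sizes are illustrative, not DeepQA's actual code; note also that the argument order of tf.nn.sampled_softmax_loss changed between TensorFlow releases, so check the signature of your installed version):

```python
import tensorflow as tf

vocab_size = 40000
hidden_size = 256
num_samples = 512          # corresponds to --softmaxSamples 512

# Output projection: the decoder cell emits hidden_size-dim vectors and only
# the loss (or the final decoding step) projects them onto the vocabulary.
w = tf.get_variable('proj_w', [hidden_size, vocab_size])
b = tf.get_variable('proj_b', [vocab_size])
output_projection = (w, b)

def sampled_loss(inputs, labels):
    # Sampled softmax scores only num_samples negative classes per step
    # instead of all vocab_size classes, which keeps the allocations small.
    labels = tf.reshape(labels, [-1, 1])
    # Argument order shown here matches the older 0.x-style API.
    return tf.nn.sampled_softmax_loss(
        tf.transpose(w), b, inputs, labels, num_samples, vocab_size)

# The projection is then handed to the seq2seq builder, e.g.
#   tf.nn.seq2seq.embedding_attention_seq2seq(..., output_projection=output_projection)
# and sampled_loss is passed as the softmax_loss_function of the sequence loss.
```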
Hi, I am trying to train on the Cornell movie dialog corpus with attention as well, executing the code on a GPU. As can be seen in the image, this is the step where the execution gets stuck and does not move ahead. What could be the problem? I checked again by training without attention, and it runs smoothly; as soon as I switch attention on, the execution gets stuck at this point.