[BUG] getting started `03-Training-with-TF` nb gives OOM on a 16 GB GPU


Describe the bug

I get an OOM error when I run the `03-Training-with-TF` notebook on a 16 GB GPU. This can be a problem for users who run these notebooks in the cloud on GPUs with limited memory.

I can avoid the OOM by commenting out the `os.environ["TF_MEMORY_ALLOCATION"] = "0.7"` line. The notebook then runs fine.
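A minimal sketch of that workaround in a notebook cell. It assumes the variable is only read when the dataloader is first imported, and that the library falls back to its own default when the variable is unset. The `pop` call is an addition for kernels where an earlier run of the cell already set the variable:

```python
import os

# Line from the example notebook, commented out as described above:
# os.environ["TF_MEMORY_ALLOCATION"] = "0.7"

# In an already-running kernel the variable may still be set from an earlier
# execution of the cell, so drop it explicitly (restarting the kernel also works).
os.environ.pop("TF_MEMORY_ALLOCATION", None)

# Assumption: with the variable unset, the library's own default applies.
print(os.environ.get("TF_MEMORY_ALLOCATION"))  # prints None
```

Run this before importing the dataloader so that the setting is not picked up from a stale environment.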

Steps/Code to reproduce bug

Run 03-Training-with-TF to repro.

Expected behavior

The notebook should run without OOM. I recommend removing the `os.environ["TF_MEMORY_ALLOCATION"] = "0.7"` line from the example notebook.

Environment details (please complete the following information):

  • Environment location: [Bare-metal, Docker, Cloud(specify cloud provider)]
  • Method of NVTabular install: [conda, Docker, or from source]
    • If method of install is [Docker], provide docker pull & docker run commands used

I am using merlin-tensorflow-training:22.04.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 9 (9 by maintainers)

Top GitHub Comments

1 reaction
rnyak commented, May 4, 2022

@EvenOldridge and @karlhigley My understanding is that `os.environ["TF_MEMORY_ALLOCATION"] = "0.5"` is currently the default behavior under the hood (because of `configure_tensorflow()`) if we use the NVT Keras dataloader. So I don't think we need to define it again in the notebook.
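A rough sketch of that fallback, not the actual `configure_tensorflow()` source. It assumes the variable defaults to 0.5 and that values below 1 are read as a fraction of total GPU memory, with anything larger taken as megabytes. The helper name `resolve_tf_memory_mb` and its parameters are hypothetical:

```python
import os


def resolve_tf_memory_mb(total_gpu_mb, env=None, default="0.5"):
    """Hypothetical helper: turn TF_MEMORY_ALLOCATION into a size in MB."""
    env = os.environ if env is None else env
    value = float(env.get("TF_MEMORY_ALLOCATION", default))
    if value < 1:
        # Assumption: values below 1 are a fraction of total GPU memory.
        return int(total_gpu_mb * value)
    # Assumption: anything else is already a size in MB.
    return int(value)


total_mb = 16 * 1024  # a 16 GB GPU

# Variable unset: the assumed 0.5 default applies.
print(resolve_tf_memory_mb(total_mb, env={}))  # 8192

# Variable set as in the notebook.
print(resolve_tf_memory_mb(total_mb, env={"TF_MEMORY_ALLOCATION": "0.7"}))  # 11468
```

Under these assumptions, the notebook's 0.7 setting lets TensorFlow reserve roughly 3 GB more on a 16 GB card than the default would, which leaves that much less for everything else on the GPU.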

1 reaction
karlhigley commented, Apr 22, 2022

Which example is this for? Let me see how it goes on an 11GB GPU. 😺

