About per_process_gpu_memory_fraction and multi-GPU training.
Hi, I have some problems when I train the model on multiple GPUs. The error is:

ValueError: To call `multi_gpu_model` with `gpus=2`, we expect the following devices to be available: ['/cpu:0', '/gpu:0', '/gpu:1']. However this machine only has: [u'/cpu:0']. Try reducing `gpus`.
But I can train the model on a single GPU, and the GPU memory usage stays at 105 MB no matter how I change the batch size from 8 to 128. Could you give me some help? Thank you very much.
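One common cause of the "only has [u'/cpu:0']" error is that the GPUs are masked from TensorFlow, e.g. by an empty or restrictive `CUDA_VISIBLE_DEVICES` environment variable (or a CPU-only TensorFlow build). A minimal stdlib sketch for checking that variable; the helper name `visible_gpu_ids` is my own, not from the issue:

```python
import os

def visible_gpu_ids(env=None):
    """Parse CUDA_VISIBLE_DEVICES into a list of GPU id strings.

    Returns None when the variable is unset (all GPUs visible to CUDA)
    and [] when it is set to an empty string (all GPUs masked).
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    return [v.strip() for v in value.split(",") if v.strip() != ""]

# An empty CUDA_VISIBLE_DEVICES hides every GPU from TensorFlow,
# which reproduces the "only has /cpu:0" error above.
print(visible_gpu_ids({"CUDA_VISIBLE_DEVICES": ""}))     # []
print(visible_gpu_ids({"CUDA_VISIBLE_DEVICES": "0,1"}))  # ['0', '1']
print(visible_gpu_ids({}))                               # None
```

If this returns `[]` in your training environment, unset the variable (or set it to `0,1`) before launching the script.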
Issue Analytics
- Created 5 years ago
- Comments: 8 (3 by maintainers)
Top GitHub Comments
I’m running 1.7.
It sounds to me like your GPUs are not being found or used at all. The network takes far more than 105 MB. Does `nvidia-smi` list your GPUs?
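Beyond `nvidia-smi`, it helps to confirm what TensorFlow itself can see. A minimal sketch, assuming TensorFlow 1.x (the issue reports 1.7); it also shows `per_process_gpu_memory_fraction` from the title, which caps how much GPU memory each process may allocate. The fraction value 0.4 is an arbitrary example:

```python
import tensorflow as tf
from tensorflow.python.client import device_lib

# List every device TensorFlow can see. If no '/gpu:N' entries appear here,
# multi_gpu_model(gpus=2) will fail exactly as in the error above.
print([d.name for d in device_lib.list_local_devices()])

# Once the GPUs are visible, cap each process to a fraction of GPU memory
# instead of TensorFlow's default near-total allocation.
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4  # up to 40% per GPU
sess = tf.Session(config=config)
```

Note that `per_process_gpu_memory_fraction` limits memory once a GPU is in use; it does not help with the device-not-found error itself.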