Ludwig does not use tensorflow-gpu?

See original GitHub issue

I have tensorflow-gpu installed, and Keras can use the GPU effectively. I only have one GPU. With Ludwig, I tried a regression problem and found that training is very slow:

    train_stats = ludwig_model.train(data_df=df, logging_level=logging.ERROR, gpus=[0])

Monitoring with watch -n 1 nvidia-smi, I found that the training did not actually utilize the GPU, although it stored the data in GPU memory anyway:

    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 410.93       Driver Version: 410.93       CUDA Version: 10.0     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  Quadro P6000        Off  | 00000000:03:00.0  On |                  Off |
    | 26%   44C    P8    19W / 250W |  24289MiB / 24449MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |    0       822      G   /usr/bin/gnome-shell                         186MiB |
    |    0      4687      C   /home/yshi1/anaconda3/bin/python           22929MiB |
    |    0     10166      C   /home/yshi1/anaconda3/bin/python             977MiB |
    |    0     23459      G   /usr/bin/X                                   191MiB |
    +-----------------------------------------------------------------------------+
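The "memory full, utilization at 0%" pattern in the output above can also be checked programmatically. The following is a minimal sketch, assuming the readings come from nvidia-smi's CSV query mode (the command is shown in a comment). The function names and the thresholds in allocated_but_idle are hypothetical choices for illustration, and the sample row simply reuses the numbers reported above.

```python
import csv
import io

# nvidia-smi can print one CSV row per GPU with:
#   nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total \
#              --format=csv,noheader,nounits
# The sample row below reuses the values from the output above.
SAMPLE = "0, 0, 24289, 24449\n"


def parse_gpu_rows(text):
    """Parse rows of: index, utilization %, memory used (MiB), memory total (MiB)."""
    rows = []
    for rec in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not rec:
            continue  # skip blank lines
        index, util, used, total = (int(field) for field in rec)
        rows.append({
            "index": index,
            "util_pct": util,
            "mem_used_mib": used,
            "mem_total_mib": total,
        })
    return rows


def allocated_but_idle(row, mem_fraction=0.5, util_pct=5):
    """Hypothetical heuristic: much memory reserved, almost no compute."""
    mem_ratio = row["mem_used_mib"] / row["mem_total_mib"]
    return mem_ratio >= mem_fraction and row["util_pct"] <= util_pct


if __name__ == "__main__":
    for row in parse_gpu_rows(SAMPLE):
        print(row, "allocated but idle:", allocated_but_idle(row))
```

A single reading says little on its own; sampling this repeatedly during training gives a better picture of whether the GPU is ever busy.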

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 7 (1 by maintainers)

Top GitHub Comments

2 reactions
w4nderlust commented, Feb 13, 2019

The fact that the GPU memory is fully utilized by TensorFlow means that the model is running on the GPU. Try running the same model from the command line instead of through the API, and you should see the TensorFlow messages printed on stderr.

The low GPU utilization may come down to a couple of things: your model is really small, so there’s not much computation to do per batch, or your batch size is really small, so again there’s not much computation per batch. To test for this, try increasing the batch size considerably.

Finally, the process that reads data and provides it to TensorFlow is not heavily optimized at the moment. We are working on improving it, but you may be hitting an I/O bottleneck if your computation per batch is too small.
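The batch-size experiment suggested above could be set up along these lines. This is a sketch under assumptions, not the maintainer's code: it assumes the batch size lives under a training section of the model definition as batch_size, which is how Ludwig's model definitions were laid out around the time of this issue, so check the documentation for your version. The feature names x1 and y are placeholders, and the Ludwig calls appear only in comments so the snippet runs without Ludwig installed.

```python
import copy

# Hypothetical model definition for a regression problem.
# The feature names "x1" and "y" are placeholders, not from the issue.
BASE_DEFINITION = {
    "input_features": [{"name": "x1", "type": "numerical"}],
    "output_features": [{"name": "y", "type": "numerical"}],
}


def with_batch_size(model_definition, batch_size):
    """Return a copy of the definition with training.batch_size set.

    Assumes the batch size is configured under the "training" section.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    updated = copy.deepcopy(model_definition)
    updated.setdefault("training", {})["batch_size"] = batch_size
    return updated


if __name__ == "__main__":
    for size in (128, 1024, 8192):
        definition = with_batch_size(BASE_DEFINITION, size)
        print(definition["training"])
        # With Ludwig installed, each definition could then be trained and
        # timed, for example (API of that period, an assumption to verify):
        #   model = LudwigModel(definition)
        #   model.train(data_df=df, gpus=[0])
```

Comparing wall-clock time per epoch and the GPU-Util column across these sizes should show whether small batches are the bottleneck.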

0 reactions
w4nderlust commented, Feb 14, 2019

As for the YAML examples, you can find a bunch here. Be mindful of the - and the indentation. Glad you were able to make it work decently fast with a bigger batch size.

Regarding the initialization, you can specify which initializer to use, so playing around with that may give you better results.

Regarding the reproducible example, you can use the data_synthesyzer script in ludwig/data to create a dataset that looks like yours pretty easily; we use it for integration tests. That should resolve the data issue.

I’m closing the issue, but feel free to either open another one or reach out in private if you can provide me with the comparison script. You’re welcome.
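If Ludwig's own synthesizer script is not at hand, a shareable stand-in dataset for a regression comparison can be generated with the standard library alone. The sketch below is not Ludwig's script and makes no claim about its interface. The column names, noise level, and function names are all hypothetical; the only goal is a seeded, reproducible CSV with numerical features and a numerical target.

```python
import csv
import io
import random


def synthetic_regression_rows(n_rows, n_features=3, seed=0):
    """Yield a header row, then rows of random features and a noisy linear target."""
    rng = random.Random(seed)  # fixed seed keeps the dataset reproducible
    weights = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
    yield [f"x{i}" for i in range(n_features)] + ["y"]
    for _ in range(n_rows):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        y = sum(w * x for w, x in zip(weights, xs)) + rng.gauss(0.0, 0.1)
        yield [round(v, 6) for v in xs] + [round(y, 6)]


def to_csv_text(rows):
    """Serialize rows to CSV text."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv_text(synthetic_regression_rows(5)))
```

Because the generator is seeded, anyone running it gets the same rows, which is what makes a comparison script reproducible by others.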
