Low GPU utilization with tfjs-node-gpu
TensorFlow.js version
"dependencies": {
"@tensorflow/tfjs": "^0.11.4",
"@tensorflow/tfjs-node": "^0.1.5",
"@tensorflow/tfjs-node-gpu": "^0.1.7",
}
Browser version
N/A. Node v8.9.4. Ubuntu 16.04
Describe the problem or feature request
Using tfjs-node-gpu, I can't seem to get GPU utilization above ~0-3%. I have CUDA 9 and cuDNN 7.1 installed, am importing @tensorflow/tfjs-node-gpu, and am setting the "tensorflow" backend with tf.setBackend('tensorflow'). CPU usage is at 100% on one core, but GPU utilization is practically none. I've tried tfjs-examples/baseball-node (replacing import '@tensorflow/tfjs-node' with import '@tensorflow/tfjs-node-gpu', of course) as well as my own custom LSTM code. Does tfjs-node-gpu actually run operations on the GPU?
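For completeness, the backend setup looks roughly like this (a minimal sketch of what is described above, assuming the 0.x-era API where importing the binding registers a "tensorflow" backend; the tf.getBackend() call is only there as a sanity check):
import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-node-gpu';

// Select the native TensorFlow backend registered by tfjs-node-gpu.
tf.setBackend('tensorflow');
console.log(tf.getBackend()); // should print 'tensorflow'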
Code to reproduce the bug / link to feature request
# assumes CUDA 9, CuDNN 7.1, and latest nvidia drivers are already installed
git clone https://github.com/tensorflow/tfjs-examples
cd tfjs-examples/baseball-node
# replace tfjs-node import with tfjs-node-gpu
sed -i s/tfjs-node/tfjs-node-gpu/ src/server/server.ts
# install dependencies and download data
yarn add @tensorflow/tfjs-node-gpu
yarn && yarn download-data
# start the server
yarn start-server
Now open another terminal and watch GPU usage. Note that if you are running the process on the same GPU as an X window server, GPU usage will likely be greater than 3% because of that process. I've tested this on a dedicated GPU running no other processes, using the CUDA_VISIBLE_DEVICES env var.
# monitor GPU utilization
watch -n 0.1 nvidia-smi
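As an additional sanity check, a sustained synthetic workload helps distinguish "the binding never touches the GPU" from "the model is too small to register in nvidia-smi". A rough sketch, again assuming the 0.x-era API described above (the matrix size and iteration count are arbitrary):
import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-node-gpu';

tf.setBackend('tensorflow');

// Multiply two large matrices in a loop so the GPU stays busy long
// enough for nvidia-smi to show non-trivial utilization.
const a = tf.randomNormal([2048, 2048]);
const b = tf.randomNormal([2048, 2048]);
for (let i = 0; i < 1000; i++) {
  const c = tf.matMul(a, b);
  c.dataSync(); // force execution to complete before the next iteration
  c.dispose();
}
a.dispose();
b.dispose();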
Issue Analytics
- Created 5 years ago
- Comments: 10 (5 by maintainers)
Top GitHub Comments
Gotcha. Thanks for that clarification. I've revisited the char-rnn tfjs-node-gpu example I was telling you about, and it looks like it is indeed running on the GPU, since memory is allocated, but GPU utilization is ~1%. If I'm understanding you correctly, this is because tfjs-node-gpu is using TF Eager mode. So I should expect the same type of model to run at ~1% GPU utilization if it were written in Python using TF Eager mode as well, correct?
Does tfjs-node-gpu intend to add support for graph-based execution at some point in the near future? Unless I'm missing something, this "Eager mode only" behavior creates some significant performance hurdles, no? In general, how does tfjs-node-gpu compare in performance to similar implementations in Keras?
I ask because I'm writing some documentation for my team and am beginning to consider a JavaScript-first approach to common high-level ML tasks. A year ago that would have seemed like a crazy idea, but with tfjs, maybe not so much. Basically, I'm curious whether tfjs-node-gpu will ever be comparable in performance to Keras and Python TensorFlow.
We actually experience the same. Running our model on CPU takes ~400 ms; running it on GPU takes ~3000 ms. This happens on a server with two NVIDIA GeForce RTX 3090s and CUDA 11.6 with cuDNN 8.3. Relevant logs:
I can confirm that CUDA is installed correctly, as I am able to use it with several other tools.
This does not happen in the browser, though; running on WebGL is way faster than CPU inference.
UPDATE: I have to admit that I was only testing this with a single inference instead of hundreds or thousands. I created test suites for larger numbers of inferences, and it is indeed the case that copying the model to GPU memory is what takes most of the time. Once that is done, GPU inference is way faster than CPU inference:
GPU info:
CPU info:
The following were the results for averaging 100 inferences on a hot GPU (the model is loaded into GPU memory and not disposed between model.execute calls):
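For reference, the warm-benchmark loop looks roughly like the sketch below. The model path, input shape, and use of tf.loadGraphModel are placeholders rather than the commenter's actual setup; the point is only that the first call, which pays for the host-to-GPU copy and kernel setup, is excluded from the timed average.
import * as tf from '@tensorflow/tfjs-node-gpu';

async function benchmark() {
  // Hypothetical model path and input shape -- substitute your own.
  const model = await tf.loadGraphModel('file://./model/model.json');
  const input = tf.randomNormal([1, 224, 224, 3]);

  // Warm-up: the first execute() pays for copying weights to GPU memory,
  // so it is kept out of the timed loop.
  (model.execute(input) as tf.Tensor).dataSync();

  const runs = 100;
  const start = Date.now();
  for (let i = 0; i < runs; i++) {
    const out = model.execute(input) as tf.Tensor;
    out.dataSync(); // wait for the result before starting the next run
    out.dispose();
  }
  console.log(`average inference time: ${(Date.now() - start) / runs} ms`);
  input.dispose();
}

benchmark();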