kedro run --pipeline is not working properly with the GPU
Description
I’m trying to run a pipeline that trains a deep-learning classifier, but the child processes that run the training are not able to use the GPU.
Context
The network is implemented in PyTorch and there is an NVIDIA GPU available (both nvidia-smi and torch.cuda.is_available() indicate that). I’m also able to run a simple script that multiplies two matrices on the GPU many times, and I can watch the GPU usage in nvidia-smi while it runs. When I start a Jupyter Lab from Kedro, I can also use the GPU normally.
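For concreteness, the sanity check looked roughly like this (a minimal sketch rather than the exact script; matrix sizes and iteration count are arbitrary):

```python
import torch

# Quick check that CUDA is visible to this Python process.
print(torch.cuda.is_available())      # expected: True
print(torch.cuda.get_device_name(0))  # the GPU reported by nvidia-smi

# Multiply two matrices on the GPU repeatedly; GPU usage should
# show up in nvidia-smi while this loop runs.
a = torch.randn(4096, 4096, device="cuda")
b = torch.randn(4096, 4096, device="cuda")
for _ in range(1000):
    a = a @ b
    a = a / a.norm()  # keep values bounded so the loop doesn't overflow
torch.cuda.synchronize()
print("done")
```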
When the pipeline starts (kedro run --pipeline <pipeline name> -t <arg>), 255 MB of GPU memory is always allocated, regardless of whether I call any of the PyTorch code (I have checked, and even commented out the imports that might pull in PyTorch code). Then, from inside the training method, if I print torch.cuda.is_available(), I get False.
Steps to Reproduce
- Create a simple PyTorch module that assigns the device to ‘cuda:0’ if the GPU is available, else assigns it to ‘cpu’.
- Call that module from a node in a Kedro pipeline (a sketch follows after this list).
- Run nvidia-smi during the pipeline execution, both with and without the PyTorch module.
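A minimal sketch of such a module and node (the class, function, and dataset names are illustrative placeholders, not from a real project):

```python
import torch
import torch.nn as nn
from kedro.pipeline import Pipeline, node

# Pick the GPU if PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


class TinyClassifier(nn.Module):
    """A deliberately small stand-in for the real classifier."""

    def __init__(self, in_features: int = 10, n_classes: int = 2):
        super().__init__()
        self.linear = nn.Linear(in_features, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)


def train_model(features: torch.Tensor) -> str:
    # This is where the surprise shows up: inside `kedro run` this
    # printed False, even though it prints True outside of Kedro.
    print("cuda available inside node:", torch.cuda.is_available())
    model = TinyClassifier(in_features=features.shape[1]).to(device)
    model(features.to(device))
    return str(device)


def create_pipeline(**kwargs) -> Pipeline:
    return Pipeline([node(train_model, inputs="features", outputs="used_device")])
```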
Environment
- Kedro version used: 0.17.4
- Python version used: 3.8.5
- Operating system and version: Ubuntu 18.04.1 x86_64 GNU/Linux

Maintainer Comments
So it sounds like your nodes have side effects that are breaking this. I think that to get this working we need to do the CUDA/PyTorch context setup outside of the nodes, in some sort of singleton; the nodes shouldn’t have any awareness of I/O, they should just accept and return data (something like the sketch below).
Keen to work this one out and coach you through this, but I think we might need to approach things differently.
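One possible shape for that singleton, as a rough illustrative sketch only (the class and method names here are hypothetical, not an agreed design):

```python
import torch


class DeviceContext:
    """Process-wide holder for the torch device, resolved once on first use."""

    _device = None

    @classmethod
    def get(cls) -> torch.device:
        # Lazily resolve the device so it happens in the process that
        # actually runs the node, not at import time in the parent.
        if cls._device is None:
            cls._device = torch.device(
                "cuda:0" if torch.cuda.is_available() else "cpu"
            )
        return cls._device


# The node itself stays free of device bookkeeping: it accepts data,
# moves it via the singleton, and returns data.
def train_model(features: torch.Tensor) -> torch.Tensor:
    return features.to(DeviceContext.get())
```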
I’ll close this issue due to inactivity. Feel free to re-open it if more support is needed!