Causal model running on GPU
Hi, I am trying to run the LM model with causal = True on the GPU, but I am running into an error.
I am trying to run the following example:
import torch
from torch import nn
from performer_pytorch import PerformerLM

model = PerformerLM(
    num_tokens = 20000,
    max_seq_len = 2048, # max sequence length
    dim = 512, # dimension
    depth = 6, # layers
    heads = 8, # heads
    causal = True, # auto-regressive or not
    nb_features = 256, # number of random features, if not set, will default to (d * log(d)), where d is the dimension of each head
    generalized_attention = False, # defaults to softmax approximation, but can be set to True for generalized attention
    kernel_fn = nn.ReLU(), # the kernel function to be used, if generalized attention is turned on, defaults to Relu
    reversible = True, # reversible layers, from Reformer paper
    ff_chunks = 10, # chunk feedforward layer, from Reformer paper
).cuda()

x = torch.randint(0, 20000, (1, 2048)).cuda()
model(x) # (1, 2048, 20000)
And I am getting this error:
Traceback (most recent call last):
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3343, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-4-a530c03a976e>", line 20, in <module>
model(x) # (1, 2048, 20000)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 253, in forward
x = self.performer(x, **kwargs)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 238, in forward
return self.net(x, **kwargs)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/reversible.py", line 160, in forward
out = _ReversibleFunction.apply(x, blocks, args)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/reversible.py", line 113, in forward
x = block(x, **kwarg)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/reversible.py", line 65, in forward
y1 = x1 + self.f(x2, record_rng=self.training, **f_args)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/reversible.py", line 40, in forward
return self.net(*args, **kwargs)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 170, in forward
return self.fn(self.norm(x), **kwargs)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 216, in forward
out = self.fast_attention(q, k, v)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 159, in forward
out = attn_fn(q, k, v)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/performer_pytorch/performer_pytorch.py", line 110, in causal_linear_attention
return CausalDotProduct.apply(q, k, v)
File "/home/walter/Desktop/minGPT/venv/lib/python3.6/site-packages/fast_transformers/causal_product/__init__.py", line 48, in forward
product
TypeError: 'NoneType' object is not callable
My system: TITAN RTX, CUDA Version 10.2, Driver Version 440.100
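The last frame of the traceback is inside fast_transformers/causal_product/__init__.py, and the error says that something expected to be a function is None. One plausible reading, which is an assumption and not confirmed anywhere in this thread, is that the compiled CUDA extension of fast_transformers failed to import and was left unset. A minimal sketch for checking that directly is below. The submodule names causal_product_cpu and causal_product_cuda are inferred from the traceback path and may differ between versions of the package.

```python
import importlib


def check_import(module_name):
    """Try to import module_name; return (ok, detail) instead of raising."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # compiled extensions can fail in several ways
        return False, f"{type(exc).__name__}: {exc}"
    return True, "imported"


def report(module_names):
    """Return one readable status line per module name."""
    lines = []
    for name in module_names:
        ok, detail = check_import(name)
        status = "OK  " if ok else "FAIL"
        lines.append(f"{status} {name}: {detail}")
    return lines


if __name__ == "__main__":
    # Assumed module names, inferred from the traceback path above.
    candidates = [
        "torch",
        "fast_transformers.causal_product.causal_product_cpu",
        "fast_transformers.causal_product.causal_product_cuda",
    ]
    for line in report(candidates):
        print(line)
```

If the CUDA line reports FAIL while the CPU line reports OK, the error text printed next to it (for example a missing shared library) is usually a better starting point than the TypeError itself.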
Issue Analytics
- State:
- Created 3 years ago
- Comments:7 (3 by maintainers)
Top GitHub Comments
I had the same issue. I am not sure exactly which step fixed it, but after the following steps, training with causal=True is working.
My steps:
export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
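The two exports above put a specific CUDA toolkit (10.1 in that comment) first on PATH and LD_LIBRARY_PATH. A rough way to see which toolkit is actually picked up, and how it compares with the CUDA version PyTorch was built against, is sketched below. It assumes nvcc prints its usual "release X.Y" line. Whether a mismatch between the two versions is what caused the error in this issue is not established in the thread.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(version_text):
    """Extract the 'X.Y' release from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", version_text)
    return match.group(1) if match else None


def toolkit_on_path():
    """Return (nvcc_path, release) for the first nvcc found on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None, None
    result = subprocess.run(
        [nvcc, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return nvcc, parse_nvcc_release(result.stdout)


def torch_cuda_version():
    """Return the CUDA version PyTorch was built with, or None."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


if __name__ == "__main__":
    path, release = toolkit_on_path()
    print("nvcc on PATH:      ", path)
    print("nvcc release:      ", release)
    print("torch.version.cuda:", torch_cuda_version())
```

If the nvcc release and torch.version.cuda differ, reinstalling the package that ships the compiled extension after fixing the environment may be worth trying, since the extension is built against whatever toolkit is visible at install time.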
@Warvito I’m as much in the dark as you are 😦 I have been putting off custom CUDA code for as long as I could, but the results of this paper were irresistible.