Error with sparse attention
I get this error when I enable sparse attention:

RuntimeError: Unable to JIT load the sparse_attn op due to it not being compatible due to hardware/software issue.
My nvcc --version says:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0
And I am running the script on 4x RTX 2080 Ti GPUs.
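For context, "enabling sparse attention" here means using DeepSpeed's sparse-attention module, whose underlying kernels are JIT-compiled the first time they run. Below is a minimal sketch of the code path that raises the error above, loosely following the DeepSpeed Sparse Attention tutorial (the config values, tensor shapes, and fp16 dtype are assumptions, not details from the original report):

```python
import torch
from deepspeed.ops.sparse_attention import SparseSelfAttention, FixedSparsityConfig

# Block-sparse layout; the num_heads/block values are illustrative.
config = FixedSparsityConfig(num_heads=4, block=16)
attn = SparseSelfAttention(sparsity_config=config)

# Shape (batch, heads, seq_len, head_dim); seq_len must be a multiple of `block`.
q = torch.randn(1, 4, 256, 64, device="cuda", dtype=torch.half)
k = torch.randn_like(q)
v = torch.randn_like(q)

# The sparse_attn op is JIT-loaded on first use; this call is where the
# RuntimeError above is raised when the compatibility check fails.
out = attn(q, k, v)
```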
Issue Analytics
- Created: 3 years ago
- Comments: 11 (4 by maintainers)
Top Results From Across the Web

Understanding BigBird's Block Sparse Attention - Hugging Face
BigBird relies on block sparse attention instead of normal attention (i.e. BERT's attention) and can handle sequences up to a length of 4096...

DeepSpeed Sparse Attention
In this tutorial we describe how to use DeepSpeed Sparse Attention (SA) and its building-block kernels. The easiest way to use SA is...

Sparse Attention with Learning to Hash - OpenReview
To overcome these issues, this paper proposes a new strategy for sparse attention, namely LHA (Learning-to-Hash Attention), which directly learns separate...

sparse attention and its relation with attention mask
Can anyone please explain in a clear way what is the usage of mask in attention for sparse attention? I just can not...

Efficient Content-Based Sparse Attention with Routing Transformers
We show that our model outperforms comparable sparse attention models on language modeling on Wikitext-103 (15.8 vs 18.3 perplexity), as well as on...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Why is GeForce RTX being left out? Or are you planning to handle RTX later? Is RTX a very different compute platform compared to the V100?
Hi @ShivanshuPurohit
Thanks for sending the report. The sparse-attention code is compiled through JIT, so there is no need for it to be pre-installed. However, I am seeing that the CUDA version for torch is 10.2 while nvcc reports 11.0. I think there might be a compatibility issue there! Can you make the CUDA versions match and try again? Thanks.
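A quick way to check for the mismatch described above is to compare the CUDA version the installed torch wheel was built against with what nvcc reports; a small diagnostic sketch (the printed values depend on the install):

```python
import subprocess
import torch

# CUDA version the installed torch wheel was built against (e.g. '10.2').
print("torch CUDA:", torch.version.cuda)
# Compute capability of the visible GPU; an RTX 2080 Ti reports (7, 5).
print("compute capability:", torch.cuda.get_device_capability(0))
# CUDA toolkit version seen by nvcc (11.0 in the report above).
print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)
```

If the two disagree, one option is to install a torch wheel built against CUDA 11.0 (for example pip install torch==1.7.1+cu110 -f https://download.pytorch.org/whl/torch_stable.html; the exact version pin is illustrative), or alternatively point PATH and CUDA_HOME at a CUDA 10.2 toolkit so that nvcc matches torch.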