How exactly to use GPU with KeyBERT?
I'm trying to extract keywords and keyphrases from around 20k abstracts of journal articles. The FAQ mentions that a GPU is recommended when using KeyBERT. However, I'm unclear how exactly to run the `extract_keywords` function on the GPU. I tried

```python
model = KeyBERT()
model.to(device)
```

but it says `KeyBERT()` has no attribute `to`. I'd appreciate some help in running KeyBERT on the GPU. Thanks!

@Amaimersion Let me start off by saying thank you for this extensive search into what exactly is happening here! You are one of the few who go this in-depth, and it makes my work a whole lot easier 😄
There are a few small things that I have noticed, but I believe most of it is indeed due to the `KeyphraseCountVectorizer`, which I will come back to in a bit.

After performing the above, it might be worthwhile to check again whether CUDA is enabled. From your results, I am quite sure it is, but just to be certain.
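For reference, a quick sanity check might look like this:

```python
import torch

# Confirm that PyTorch can see a CUDA device
print(torch.cuda.is_available())        # expected: True
print(torch.cuda.get_device_name(0))    # name of the detected GPU
```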
Thank you for the example you posted; it indeed clearly indicates that the GPU is working as it should in PyTorch.
Based on these, I think you are correct in stating that it is likely the `KeyphraseCountVectorizer`. In my experiments, that model can be quite slow compared to, for example, a SentenceTransformer model. The processing it needs to do seems to require much more compute, so it is unsurprising that it slows things down quite a bit. Having said that, you should still see some improvement when using a CUDA-enabled GPU, which you clearly have.

I believe what is happening is a mixture of two things:

1. `KeyphraseCountVectorizer`, as a default, actually uses a model optimized for CPU, namely `en_core_web_sm`
2. The lengths of the documents make the results a bit misleading
This might sound a bit strange, seeing as you got the same results regardless of the length of the texts. The misleading part here is that SentenceTransformers simply truncates a text if it passes a certain length, but this same process does not happen with the `KeyphraseCountVectorizer`. Thus, the GPU will only be used for a short time on the truncated text, since embedding a single text is relatively quick. This leads me to the following:

**`KeyphraseCountVectorizer` uses a CPU-optimized model**

The default model in `KeyphraseCountVectorizer` is spaCy's `en_core_web_sm`, which is optimized for the CPU and not the GPU. What likely happens is that after embedding the documents using the SentenceTransformer, which typically happens quite fast, the `KeyphraseCountVectorizer` will take some time to generate the candidate keywords.
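As an aside, getting the embedding step itself onto the GPU is straightforward: you can pass a CUDA-loaded SentenceTransformer to KeyBERT. A minimal sketch (the model name here is just an example):

```python
from keybert import KeyBERT
from sentence_transformers import SentenceTransformer

# Load the embedding model directly on the GPU; "all-MiniLM-L6-v2" is just
# an example model name.
sentence_model = SentenceTransformer("all-MiniLM-L6-v2", device="cuda")
kw_model = KeyBERT(model=sentence_model)

keywords = kw_model.extract_keywords("An example abstract about keyword extraction.")
```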
I think the solution here is to either stop using the `KeyphraseCountVectorizer` or, which I would highly advise testing out, use the `en_core_web_trf` model instead. That model is, like a SentenceTransformer, a transformer model and thereby benefits from using a GPU. This does not mean it will automatically be faster than `en_core_web_sm`, since they differ in size and speed.
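A rough sketch of that swap, assuming the `spacy_pipeline` argument of the `keyphrase-vectorizers` package:

```python
from keybert import KeyBERT
from keyphrase_vectorizers import KeyphraseCountVectorizer

# Assumes KeyphraseCountVectorizer exposes a `spacy_pipeline` argument and
# that the transformer pipeline is installed beforehand:
#   python -m spacy download en_core_web_trf
vectorizer = KeyphraseCountVectorizer(spacy_pipeline="en_core_web_trf")

kw_model = KeyBERT()
docs = ["An example abstract about transformer-based keyword extraction."]
keywords = kw_model.extract_keywords(docs, vectorizer=vectorizer)
```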

@thtang This depends on the model that you are using; some support it and others do not. As a default, `sentence-transformers` is used, and it will only use a single GPU. However, you can create custom back-ends that support this:
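A minimal sketch of such a backend, assuming KeyBERT's `BaseEmbedder` interface and sentence-transformers' multi-process encoding:

```python
import numpy as np
from keybert import KeyBERT
from keybert.backend import BaseEmbedder
from sentence_transformers import SentenceTransformer


class MultiGPUEmbedder(BaseEmbedder):
    """Sketch of a backend that spreads encoding across all visible GPUs."""

    def __init__(self, model_name="all-MiniLM-L6-v2"):
        super().__init__()
        self.embedding_model = SentenceTransformer(model_name)
        # Starts one worker process per available CUDA device; on platforms
        # that spawn processes, run this under `if __name__ == "__main__":`
        self.pool = self.embedding_model.start_multi_process_pool()

    def embed(self, documents, verbose=False):
        # Batches are distributed across the worker pool
        embeddings = self.embedding_model.encode_multi_process(documents, self.pool)
        return np.asarray(embeddings)


kw_model = KeyBERT(model=MultiGPUEmbedder())
```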