pipeline fill_mask.py - needs to convert input_ids to cpu before calling numpy
Should be:
tokens = input_ids.cpu().numpy()
from transformers import AutoTokenizer, AutoModel, AutoModelForMaskedLM, pipeline
#tokenizer = AutoTokenizer.from_pretrained("google/fnet-base")
#model = AutoModelForMaskedLM.from_pretrained("google/fnet-base").cuda()
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").cuda().half()
unmasker = pipeline('fill-mask', model=model, tokenizer=tokenizer, device=0)
unmasker("Hello I'm a [MASK] model.")
[
{"sequence": "hello i'm a new model.", "score": 0.12073223292827606, "token": 351, "token_str": "new"},
{"sequence": "hello i'm a first model.", "score": 0.08501081168651581, "token": 478, "token_str": "first"},
{"sequence": "hello i'm a next model.", "score": 0.060546260327100754, "token": 1037, "token_str": "next"},
{"sequence": "hello i'm a last model.", "score": 0.038265593349933624, "token": 813, "token_str": "last"},
{"sequence": "hello i'm a sister model.", "score": 0.033868927508592606, "token": 6232, "token_str": "sister"},
]
With the change, this produces the right result. Without the change, it raises an error:
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
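For anyone hitting the same error elsewhere, here is a minimal sketch of a device-agnostic conversion. The helper name `tensor_to_numpy` is hypothetical and is not part of transformers. The `.detach()` call is an extra precaution beyond the one-line fix proposed above.

```python
def tensor_to_numpy(tensor):
    """Return a NumPy array for a PyTorch tensor on any device.

    Hypothetical helper, not part of transformers.
    - .detach() drops the autograd graph, so .numpy() does not fail
      on tensors that require gradients.
    - .cpu() copies a CUDA tensor to host memory; for a tensor that is
      already on the CPU it returns the tensor without copying.
    """
    return tensor.detach().cpu().numpy()


# Usage (requires torch; shown as comments so the block has no dependencies):
#   import torch
#   input_ids = torch.tensor([[101, 7592, 103, 102]])
#   tokens = tensor_to_numpy(input_ids)            # CPU tensor
#   tokens = tensor_to_numpy(input_ids.cuda())     # CUDA tensor, if available
```

Because `.cpu()` is cheap on tensors that already live in host memory, the same call works whether the pipeline was created with `device=0` or on the CPU.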
Issue Analytics
- State:
- Created 2 years ago
- Comments:6 (4 by maintainers)
Maybe after I finish finding some more bugs… I'm looking at fnet and found another bug. I'm trying to get fnet working in half-precision mode…
This was fixed on master, feel free to reopen.