Release the new BertQA class trained on SQUAD (GPU + CPU Version)
See original GitHub issue
@fmikaelian
As we modified the BertQA() class, the .joblib models we released are no longer based on the same class structure. We should retrain them using the new version of BertQA() and release them.
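To illustrate why a changed class invalidates the released files, here is a minimal sketch using the standard library's pickle module, on the assumption that a .joblib dump behaves like a pickle in this respect (it stores the instance's attributes and a reference to the class, not the class code). The `Reader` class and its attributes are hypothetical stand-ins, not the real BertQA() definition.

```python
import pickle


class Reader:
    """Hypothetical stand-in for the OLD version of the class."""

    def __init__(self):
        self.weights = [0.1, 0.2]


# Serialize an instance while the old class definition is in place.
old_blob = pickle.dumps(Reader())


class Reader:  # noqa: F811 - deliberate redefinition to simulate a code change
    """Hypothetical stand-in for the NEW version of the class."""

    def __init__(self):
        self.weights = [0.1, 0.2]
        self.max_seq_length = 384  # attribute added in the new version


# Loading resolves the class by name, so we get an instance of the NEW class,
# but __init__ is not called: only the OLD attributes are restored.
restored = pickle.loads(old_blob)
has_old_attribute = hasattr(restored, "weights")
has_new_attribute = hasattr(restored, "max_seq_length")
print(has_old_attribute, has_new_attribute)
```

An object restored this way carries the new class's methods but only the old attributes, which is why retraining and dumping again with the new class is the safer route.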
Issue Analytics
- State:
- Created 4 years ago
- Comments: 19
Top GitHub Comments
I have not done it with this particular model, but I have already worked with other PyTorch models, sending the model to the CPU and then back to the GPU.
PyTorch is built so that this operation should not affect the model's performance. When we send a model and/or tensors to a device (“cpu” or “cuda”), the parameter values are not changed; they are simply copied to that device's memory. Since all tensors and parameters have the same values after the operation, the model is an exact copy of the original, just allocated in a different memory space (the device memory).
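The claim above can be checked with a short sketch. It uses a small `nn.Linear` layer as a hypothetical stand-in for the actual reader model, and it falls back to the CPU when no GPU is available, so the round trip is only a real device transfer on a CUDA machine.

```python
import copy

import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for the trained model; any nn.Module behaves the same way.
model = nn.Linear(4, 2)

# Keep an independent CPU copy of the parameters for comparison.
reference = copy.deepcopy(model.state_dict())

# Use the GPU if one is present, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model.to(device)  # for modules, .to() moves the parameters in place
model.to("cpu")   # bring the model back to the CPU

# Every parameter should be bit-for-bit identical after the round trip.
unchanged = all(
    torch.equal(reference[name], tensor)
    for name, tensor in model.state_dict().items()
)
print(unchanged)
```

Note that inputs must live on the same device as the model before a forward pass, for example `model(batch.to(device))` while the model is on `device`.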
Yes, I just released the new CPU model.