TorchScripted BERT model not working on GPU
The BERT model example is not working on GPU machines. The model loads successfully, but predictions fail.
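A quick way to isolate the failure from TorchServe is to load the traced archive directly and run a dummy batch on the GPU. A minimal sketch follows; the input signature (input_ids, attention_mask) and the sequence length of 128 are assumptions based on the example, not taken from this report:

```python
# Standalone check: does the traced model itself run on GPU?
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# map_location moves the stored parameters onto the target device.
model = torch.jit.load("Transformer_model/traced_model.pt", map_location=device)
model.eval()

# Dummy BERT-style inputs; batch size 1 and length 128 are illustrative.
input_ids = torch.zeros(1, 128, dtype=torch.long, device=device)
attention_mask = torch.ones(1, 128, dtype=torch.long, device=device)

with torch.no_grad():
    out = model(input_ids, attention_mask)  # fails here if the trace baked in CPU-only ops
print(out)
```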
Context
- torchserve version: Tested with both v0.1.1 release and install from source
- torch version: 1.5.1
- torchvision version [if any]: 0.6.1
- torchtext version [if any]: 0.6.0
- torchaudio version [if any]: n/a
- java version: 11
- Operating System and version: ubuntu 18.04
- Transformers: v2.11.0
Your Environment
- Installed using source? [yes/no]: Tested both; same issue with both
- Are you planning to deploy it using docker container? [yes/no]: no
- Is it a CPU or GPU environment?: GPU
- Using a default/custom handler? [If possible upload/share custom handler/model]: BERT model example
- What kind of model is it e.g. vision, text, audio?: Transformer model
- Are you planning to use local models from model-store or public url being used e.g. from S3 bucket etc.? [If public url then provide link.]: S3
- Provide config.properties, logs [ts.log] and parameters used for model registration/update APIs:
- Link to your project [if any]:
Expected Behavior
Prediction using the steps in the README should work: curl http://127.0.0.1:8080/predictions/my_tc -T ./Seq_classification_artifacts/sample_text.txt
Current Behavior
Getting a 503 error instead; see the detailed logs attached under Failure Logs.
Possible Solution
None known for GPU at present.
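One speculative workaround (an assumption on my part, not something from the original report) is to re-trace the model with inputs already on the GPU, so that any device placement torch.jit.trace bakes into the graph matches the serving device. A sketch, assuming bert-base-uncased and a max length of 128:

```python
# Re-trace on the GPU so the baked-in device matches where TorchServe runs it.
import torch
from transformers import BertForSequenceClassification, BertTokenizer

device = torch.device("cuda")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", torchscript=True  # torchscript=True makes the model traceable
).to(device).eval()

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer.encode_plus(
    "a sample sentence",
    max_length=128,
    pad_to_max_length=True,
    return_tensors="pt",
)
input_ids = encoded["input_ids"].to(device)
attention_mask = encoded["attention_mask"].to(device)

traced = torch.jit.trace(model, (input_ids, attention_mask))
torch.jit.save(traced, "Transformer_model/traced_model.pt")
```

After re-tracing, rebuild the .mar with torch-model-archiver as in the steps below.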
Steps to Reproduce
Followed the steps in the README (a sketch of the handler-side device handling follows the list):
- python Download_Transformer_models.py
- torch-model-archiver --model-name BERTSeqClassification_Torchscript --version 1.0 --serialized-file Transformer_model/traced_model.pt --handler ./Transformer_handler_generalized.py --extra-files "./setup_config.json,./Seq_classification_artifacts/index_to_name.json"
- mkdir model_store
- mv BERTSeqClassification_Torchscript.mar model_store/
- torchserve --start --model-store model_store --models my_tc=BERTSeqClassification_Torchscript.mar
- curl http://127.0.0.1:8080/predictions/my_tc -T ./Seq_classification_artifacts/sample_text.txt
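As referenced above, here is a minimal sketch of the handler-side device handling that matters on GPU, loosely modeled on Transformer_handler_generalized.py; the class and method names are illustrative, not the actual handler:

```python
# Sketch: keep the traced model and the input tensors on the same device.
import torch

class SketchHandler:
    def initialize(self, ctx):
        props = ctx.system_properties
        # TorchServe exposes the assigned GPU id via system_properties.
        gpu_id = props.get("gpu_id")
        self.device = torch.device(
            "cuda:" + str(gpu_id)
            if torch.cuda.is_available() and gpu_id is not None
            else "cpu"
        )
        model_path = props.get("model_dir") + "/traced_model.pt"
        self.model = torch.jit.load(model_path, map_location=self.device)
        self.model.eval()

    def inference(self, input_ids, attention_mask):
        # Inputs must live on the same device as the traced model,
        # otherwise the prediction fails with a device-mismatch error.
        input_ids = input_ids.to(self.device)
        attention_mask = attention_mask.to(self.device)
        with torch.no_grad():
            return self.model(input_ids, attention_mask)
```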
Failure Logs [if any]
Error log attached: bert-error.txt
Top GitHub Comments
@maaquib The PR has been upstreamed to transformers 👍.
Let us know if you have any follow-up issues/questions.
Morgan from 🤗
@mfuntowicz I was wondering if there is any update, workaround, or PR in progress to solve this issue. Thanks.