CPU-based pre-trained model
I am guessing that the provided model is for machines with a CUDA-capable device. Do you happen to have a pre-trained CPU version of cnndm_model.bin?
@@ -165,7 +165,7 @@ def main():
print(args.model_recover_path)
for model_recover_path in glob.glob(args.model_recover_path.strip()):
logger.info("***** Recover model: %s *****", model_recover_path)
- model_recover = torch.load(model_recover_path)
+ model_recover = torch.load(model_recover_path, map_location="cpu")
DATA_DIR=../cnndm_data
MODEL_RECOVER_PATH=../cnndm_model.bin
EVAL_SPLIT=test
export PYTORCH_PRETRAINED_BERT_CACHE=/tmp/bert-cased-pretrained-cache
# run decoding
python biunilm/decode_seq2seq.py --fp16 --amp --bert_model bert-large-cased --new_segment_ids --mode s2s --need_score_traces \
--input_file ${DATA_DIR}/${EVAL_SPLIT}.src --split ${EVAL_SPLIT} --tokenized_input \
--model_recover_path ${MODEL_RECOVER_PATH} \
--max_seq_length 768 --max_tgt_length 128 \
--batch_size 64 --beam_size 5 --length_penalty 0 \
--forbid_duplicate_ngrams --forbid_ignore_word ".|[X_SEP]"
11/04/2019 15:55:06 - INFO - pytorch_pretrained_bert.tokenization - loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-vocab.txt from cache at /tmp/bert-cased-pretrained-cache/cee054f6aafe5e2cf816d2228704e326446785f940f5451a5b26033516a4ac3d.e13dbb970cb325137104fb2e5f36fe865f27746c6b526f6352861b1980eb80b1
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=51 error=38 : no CUDA-capable device is detected
Traceback (most recent call last):
File "biunilm/decode_seq2seq.py", line 254, in <module>
main()
File "biunilm/decode_seq2seq.py", line 147, in main
amp_handle = amp.init(enable_caching=True)
File "/home/john/.virtualenvs/unilm/lib/python3.6/site-packages/apex/amp/amp.py", line 65, in init
handle = AmpHandle(enable_caching, verbose)
File "/home/john/.virtualenvs/unilm/lib/python3.6/site-packages/apex/amp/handle.py", line 14, in __init__
self._default_scaler = LossScaler()
File "/home/john/.virtualenvs/unilm/lib/python3.6/site-packages/apex/amp/scaler.py", line 35, in __init__
self._overflow_buf = torch.cuda.IntTensor([0])
File "/home/john/.virtualenvs/unilm/lib/python3.6/site-packages/torch/cuda/__init__.py", line 163, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /pytorch/aten/src/THC/THCGeneral.cpp:51
[1] 72305 exit 1 python biunilm/decode_seq2seq.py --fp16 --amp --bert_model bert-large-cased
Without --amp:
Traceback (most recent call last):
File "biunilm/decode_seq2seq.py", line 254, in <module>
main()
File "biunilm/decode_seq2seq.py", line 216, in main
position_ids, input_mask, task_idx=task_idx, mask_qkv=mask_qkv)
File "/home/john/.virtualenvs/unilm/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/john/code/unilm/src/pytorch_pretrained_bert/modeling.py", line 1409, in forward
return self.beam_search(input_ids, token_type_ids, position_ids, attention_mask, task_idx=task_idx, mask_qkv=mask_qkv)
File "/home/john/code/unilm/src/pytorch_pretrained_bert/modeling.py", line 1528, in beam_search
output_all_encoded_layers=True, prev_embedding=prev_embedding, prev_encoded_layers=prev_encoded_layers, mask_qkv=mask_qkv)
File "/home/john/.virtualenvs/unilm/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/john/code/unilm/src/pytorch_pretrained_bert/modeling.py", line 1062, in forward
input_ids, token_type_ids, attention_mask)
File "/home/john/code/unilm/src/pytorch_pretrained_bert/modeling.py", line 1037, in get_extended_attention_mask
extended_attention_mask = (1.0 - extended_attention_mask) * -10000.0
File "/home/john/.virtualenvs/unilm/lib/python3.6/site-packages/torch/tensor.py", line 371, in __rsub__
return _C._VariableFunctions.rsub(self, other)
RuntimeError: "add_cpu" not implemented for 'Half'
Packages:
pytorch-pretrained-bert 0.4.0
torch 1.1.0
tensorboardX 1.9
apex 0.1
Issue Analytics
- State:
- Created 4 years ago
- Comments:11 (2 by maintainers)
Top GitHub Comments
Thanks, aretius.
The following changes made it work. Lowering batch_size and beam_size is also crucial.
Hey @johnyoonh. To run in CPU mode, you should run decode_seq2seq.py without --amp and --fp16. Also, don't install apex. Wherever the code imports apex, just comment that part out. Apex is only meant to make things faster; it is not necessary for inference. Also, stick to the library versions suggested in the README.md. I spent a whole day figuring out CPU inference for UniLM. If you have more questions, let me know.