Cannot use trained BERT model from a trained checkpoint

See original GitHub issue

I trained BERT and got model.ckpt.data, model.ckpt.meta, and model.ckpt.index in the output directory, along with predictions.json, etc.

python run_squad.py \
  --vocab_file=$BERT_LARGE_DIR/vocab.txt \
  --bert_config_file=$BERT_LARGE_DIR/bert_config.json \
  --init_checkpoint=$BERT_LARGE_DIR/bert_model.ckpt \
  --do_train=True \
  --train_file=$SQUAD_DIR/train-v2.0.json \
  --do_predict=True \
  --predict_file=$SQUAD_DIR/dev-v2.0.json \
  --train_batch_size=24 \
  --learning_rate=3e-5 \
  --num_train_epochs=2.0 \
  --max_seq_length=384 \
  --doc_stride=128 \
  --output_dir=gs://some_bucket/squad_large/ \
  --use_tpu=True \
  --tpu_name=$TPU_NAME \
  --version_2_with_negative=True

I tried copying model.ckpt.meta, model.ckpt.index, and model.ckpt.data to the BERT directory and changed the run_squad.py flags as follows, so that the script only predicts answers and does not train on a dataset:

python run_squad.py \
  --vocab_file=$BERT_LARGE_DIR/vocab.txt \
  --bert_config_file=$BERT_LARGE_DIR/bert_config.json \
  --init_checkpoint=$BERT_LARGE_DIR/model.ckpt \
  --do_train=False \
  --train_file=$SQUAD_DIR/train-v2.0.json \
  --do_predict=True \
  --predict_file=$SQUAD_DIR/dev-v2.0.json \
  --train_batch_size=24 \
  --learning_rate=3e-5 \
  --num_train_epochs=2.0 \
  --max_seq_length=384 \
  --doc_stride=128 \
  --output_dir=gs://some_bucket/squad_large/ \
  --use_tpu=True \
  --tpu_name=$TPU_NAME \
  --version_2_with_negative=True

It throws a "bucket directory/model.ckpt does not exist" error.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5

Top GitHub Comments

1 reaction
JeevaTM commented, Jul 10, 2019

"Is it supposed to create a new model.ckpt-# file each time I run training? I trained on two sample datasets and got two model.ckpt files, but it is not creating any more. Thanks for your help!"

Yes, satyapraffulRCG. It is supposed to create checkpoints for each training run. For SQuAD 2.0, there were 11 checkpoints.

0 reactions
satyapraffulRCG commented, Jul 9, 2019

Is it supposed to create a new model.ckpt-# file each time I run training? I trained on two sample datasets and got two model.ckpt files, but it is not creating any more. Thanks for your help!
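
The comments refer to checkpoints saved as numbered model.ckpt-# files. As a rough illustration, the sketch below picks the highest-numbered checkpoint prefix from a list of file names, which could then be passed to --init_checkpoint in place of a bare model.ckpt. It assumes the files follow a model.ckpt-<step>.index naming pattern. The function name latest_checkpoint_prefix and the step numbers in the example are hypothetical and do not come from the issue.

```python
import re
from typing import Iterable, Optional

# Matches names such as "model.ckpt-7299.index" and captures the
# checkpoint prefix ("model.ckpt-7299") and the step number ("7299").
_INDEX_RE = re.compile(r"^(?P<prefix>.+\.ckpt-(?P<step>\d+))\.index$")


def latest_checkpoint_prefix(filenames: Iterable[str]) -> Optional[str]:
    """Return the prefix of the highest-step checkpoint, or None if none match.

    Hypothetical helper: it inspects file names only and does not read
    or validate the checkpoint contents.
    """
    best_step = -1
    best_prefix = None
    for name in filenames:
        match = _INDEX_RE.match(name)
        if match is None:
            continue  # not a checkpoint index file
        step = int(match.group("step"))
        if step > best_step:
            best_step = step
            best_prefix = match.group("prefix")
    return best_prefix


if __name__ == "__main__":
    # Example listing with made-up step numbers.
    files = [
        "checkpoint",
        "model.ckpt-1000.index",
        "model.ckpt-1000.meta",
        "model.ckpt-7299.index",
        "model.ckpt-7299.meta",
        "predictions.json",
    ]
    print(latest_checkpoint_prefix(files))
```

When TensorFlow is available, tf.train.latest_checkpoint(checkpoint_dir) returns the same kind of prefix by reading the directory's checkpoint state file, and may be the simpler choice.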
