[ALBERT] Tokenization crashes while trying to finetune classifier with TF Hub model

I’m trying to get ALBERT running locally with the following command line:

python -m albert.run_classifier_with_tfhub --task_name=MNLI --data_dir=./multinli_1.0 --albert_hub_module_handle=https://tfhub.dev/google/albert_large/1 --output_dir=./output --do_train=True

When the tokenizer is initialized from the TF Hub model, it crashes:

Traceback (most recent call last):
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/vladimirbugay/Knoema/GitHub/google-research/albert/run_classifier_with_tfhub.py", line 320, in <module>
    tf.app.run()
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/Users/vladimirbugay/Knoema/GitHub/google-research/albert/run_classifier_with_tfhub.py", line 187, in main
    tokenizer = create_tokenizer_from_hub_module(FLAGS.albert_hub_module_handle)
  File "/Users/vladimirbugay/Knoema/GitHub/google-research/albert/run_classifier_with_tfhub.py", line 161, in create_tokenizer_from_hub_module
    spm_model_file=FLAGS.spm_model_file)
  File "/Users/vladimirbugay/Knoema/GitHub/google-research/albert/tokenization.py", line 247, in __init__
    self.vocab = load_vocab(vocab_file)
  File "/Users/vladimirbugay/Knoema/GitHub/google-research/albert/tokenization.py", line 201, in load_vocab
    token = token.strip().split()[0]
IndexError: list index out of range

The immediate cause is a line that consists of just a newline character ‘\n’. However, even if I modify the code to skip such lines, it still crashes later with:

Traceback (most recent call last):
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/vladimirbugay/Knoema/GitHub/google-research/albert/run_classifier_with_tfhub.py", line 320, in <module>
    tf.app.run()
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/Users/vladimirbugay/Knoema/GitHub/google-research/albert/run_classifier_with_tfhub.py", line 187, in main
    tokenizer = create_tokenizer_from_hub_module(FLAGS.albert_hub_module_handle)
  File "/Users/vladimirbugay/Knoema/GitHub/google-research/albert/run_classifier_with_tfhub.py", line 161, in create_tokenizer_from_hub_module
    spm_model_file=FLAGS.spm_model_file)
  File "/Users/vladimirbugay/Knoema/GitHub/google-research/albert/tokenization.py", line 249, in __init__
    self.vocab = load_vocab(vocab_file)
  File "/Users/vladimirbugay/Knoema/GitHub/google-research/albert/tokenization.py", line 198, in load_vocab
    token = convert_to_unicode(reader.readline())
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 179, in readline
    return self._prepare_value(self._read_buf.ReadLineAsString())
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 98, in _prepare_value
    return compat.as_str_any(val)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/compat.py", line 117, in as_str_any
    return as_str(value)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/compat.py", line 87, in as_text
    return bytes_or_text.decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 8: invalid start byte
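
Both tracebacks are what one would expect if the asset being read is not a plain-text vocabulary at all. The following minimal sketch, using only the Python standard library, classifies raw bytes the way `load_vocab` experiences them: content that cannot be decoded as UTF-8 versus text that contains blank lines. The helper name `probe_vocab_bytes` and the sample byte strings are made up for illustration; they are not part of the ALBERT code.

```python
def probe_vocab_bytes(raw):
    """Classify raw file content as UTF-8 text or undecodable binary data."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that could not be decoded.
        return "binary", f"byte 0x{raw[err.start]:02x} at offset {err.start}"
    # A blank line makes line.strip().split() return [], so indexing [0] fails.
    blank = sum(1 for line in text.splitlines() if not line.strip())
    return "text", f"{blank} blank line(s)"


print(probe_vocab_bytes(b"hello\n\nworld\n"))  # ('text', '1 blank line(s)')
print(probe_vocab_bytes(b"\n\xc0abc"))         # ('binary', 'byte 0xc0 at offset 1')
```

A file that reports "binary" here is probably not meant to be read line by line as a vocabulary, which would explain why skipping blank lines only moves the failure further along.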

I’m running the code on macOS Catalina with Anaconda and Python 3.6. Relevant packages:

sentencepiece             0.1.83                   pypi_0    pypi
tensorflow                1.14.0          mkl_py36h933f829_0  
tensorflow-base           1.14.0          mkl_py36h655c25b_0  
tensorflow-estimator      1.14.0                     py_0  
tensorflow-hub            0.6.0              pyhe1b5a44_0    conda-forge

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 6

Top GitHub Comments

3 reactions
vbougay commented, Oct 29, 2019

It turned out that the file the script tries to load as a vocabulary is in fact a saved SentencePiece model. Changing lines 159-161 of run_classifier_with_tfhub.py as follows did the trick:

  return tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case,
      spm_model_file=vocab_file)

vocab_file is ignored when spm_model_file is set, and there is no way to pass a null vocab_file, so passing the same path for both causes no issue.
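
The workaround above can be generalized a little. As a sketch only, the hypothetical helper below checks whether the asset decodes as UTF-8 text and, if it does not, passes the same path as `spm_model_file`. The names `is_utf8_text_file` and `full_tokenizer_kwargs` are illustrative, and it is an assumption that `FullTokenizer` accepts `spm_model_file=None` to select the plain vocabulary path. Treating every non-UTF-8 file as a SentencePiece model is likewise only a heuristic.

```python
import codecs


def is_utf8_text_file(path, chunk_size=65536):
    """Return True if the whole file decodes as UTF-8, reading it in chunks."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                decoder.decode(chunk)
        # final=True raises if the file ends in the middle of a character.
        decoder.decode(b"", final=True)
    except UnicodeDecodeError:
        return False
    return True


def full_tokenizer_kwargs(asset_path, do_lower_case):
    """Build keyword arguments for tokenization.FullTokenizer (hypothetical helper)."""
    kwargs = {"vocab_file": asset_path, "do_lower_case": do_lower_case}
    # Heuristic: a binary asset is assumed to be a saved SentencePiece model.
    kwargs["spm_model_file"] = None if is_utf8_text_file(asset_path) else asset_path
    return kwargs
```

It would be used as `tokenization.FullTokenizer(**full_tokenizer_kwargs(vocab_file, do_lower_case))`, which reduces to the change shown above whenever the asset is binary.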

I managed to get the training data loaded and preprocessed, but training then crashes with:

ERROR:tensorflow:Error recorded from training_loop: No gradient defined for operation 'module_apply_tokens/bert/encoder/transformer/group_0_23/layer_23/inner_group_0/LayerNorm_1/batchnorm/add_1' (op type: AddV2)
E1029 17:14:13.403403 4515265856 error_handling.py:70] Error recorded from training_loop: No gradient defined for operation 'module_apply_tokens/bert/encoder/transformer/group_0_23/layer_23/inner_group_0/LayerNorm_1/batchnorm/add_1' (op type: AddV2)
INFO:tensorflow:training_loop marked as finished
I1029 17:14:13.404175 4515265856 error_handling.py:96] training_loop marked as finished
WARNING:tensorflow:Reraising captured error
W1029 17:14:13.404798 4515265856 error_handling.py:130] Reraising captured error
Traceback (most recent call last):
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gradients_util.py", line 673, in _GradientsHelper
    grad_fn = ops.get_gradient_function(op)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2775, in get_gradient_function
    return _gradient_registry.lookup(op_type)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/registry.py", line 97, in lookup
    "%s registry has no entry for: %s" % (self._name, name))
LookupError: gradient registry has no entry for: AddV2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/vladimirbugay/.vscode-insiders/extensions/ms-python.python-2019.11.43735-dev/pythonFiles/ptvsd_launcher.py", line 43, in <module>
    main(ptvsdArgs)
  File "/Users/vladimirbugay/.vscode-insiders/extensions/ms-python.python-2019.11.43735-dev/pythonFiles/lib/python/old_ptvsd/ptvsd/__main__.py", line 432, in main
    run()
  File "/Users/vladimirbugay/.vscode-insiders/extensions/ms-python.python-2019.11.43735-dev/pythonFiles/lib/python/old_ptvsd/ptvsd/__main__.py", line 342, in run_module
    run_module_as_main(target, alter_argv=True)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/vladimirbugay/Knoema/GitHub/google-research/albert/run_classifier_with_tfhub.py", line 320, in <module>
    tf.app.run()
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/Users/vladimirbugay/Knoema/GitHub/google-research/albert/run_classifier_with_tfhub.py", line 244, in main
    estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2876, in train
    rendezvous.raise_errors()
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 131, in raise_errors
    six.reraise(typ, value, traceback)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2871, in train
    saving_listeners=saving_listeners)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 367, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1158, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1188, in _train_model_default
    features, labels, ModeKeys.TRAIN, self.config)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2709, in _call_model_fn
    config)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1146, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2967, in _model_fn
    features, labels, is_export_mode=is_export_mode)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 1549, in call_without_tpu
    return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 1867, in _call_model_fn
    estimator_spec = self._model_fn(features=features, **kwargs)
  File "/Users/vladimirbugay/Knoema/GitHub/google-research/albert/run_classifier_with_tfhub.py", line 116, in model_fn
    total_loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu)
  File "/Users/vladimirbugay/Knoema/GitHub/google-research/albert/optimization.py", line 101, in create_optimizer
    grads = tf.gradients(loss, tvars)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py", line 158, in gradients
    unconnected_gradients)
  File "/Users/vladimirbugay/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gradients_util.py", line 689, in _GradientsHelper
    (op.name, op.type))
LookupError: No gradient defined for operation 'module_apply_tokens/bert/encoder/transformer/group_0_23/layer_23/inner_group_0/LayerNorm_1/batchnorm/add_1' (op type: AddV2)
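
The final `LookupError` says that this TensorFlow build has no gradient registered for the `AddV2` op that appears in the hub module's graph. One possible explanation, which is an assumption and is not confirmed anywhere in this thread, is that the module was exported with a newer TensorFlow than the 1.14.0 installed here, in which case upgrading (for example to 1.15) may help. The sketch below only compares version strings; the helper names and the 1.15 threshold are illustrative assumptions.

```python
import re

# Assumed threshold, not confirmed by the thread.
ASSUMED_MIN_VERSION = (1, 15)


def version_tuple(version):
    """Extract the leading numeric 'major.minor' part of a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def older_than_assumed_minimum(version, minimum=ASSUMED_MIN_VERSION):
    """Return True if the version is below the assumed minimum."""
    return version_tuple(version) < minimum


print(older_than_assumed_minimum("1.14.0"))  # True
print(older_than_assumed_minimum("1.15.0"))  # False
```

In a real environment the string would come from `tf.__version__`.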

1 reaction
donggyukimc commented, Oct 28, 2019

Same problem here. Ignoring empty lines leads to the second error, the encoding one.
