generic text classification with TensorFlow error (AttributeError: 'TFTrainingArguments' object has no attribute 'args')

Environment info

  • transformers version: 3.2.0
  • Platform: Linux-4.15.0-1091-oem-x86_64-with-Ubuntu-18.04-bionic
  • Python version: 3.6.9
  • PyTorch version (GPU?): not installed (NA)
  • Tensorflow version (GPU?): 2.3.0 (True)
  • Using GPU in script?: <fill in>
  • Using distributed or parallel set-up in script?: <fill in>

Who can help

@jplu

Information

Model I am using (Bert, XLNet …): bert-base-multilingual-uncased

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below) Running run_tf_text_classification.py with flags from the example in the “Run generic text classification script in TensorFlow” section of examples/text-classification

The task I am working on is:

  • an official GLUE/SQUaD task: (give the name)
  • my own task or dataset: (give details below) Text classification dataset for classifying answers to questions. Using 3 CSVs (train, dev, and test) that each have headers (class, text) and columns containing class labels (int) and questions (strings). There are no commas present in the questions, for reference.
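
For reference, the CSV layout described above can be sketched as follows. This is a minimal illustration using only the standard library; the sample rows and the helper name `read_rows` are made up for the example and are not taken from the actual dataset.

```python
import csv
import io

# Hypothetical sample in the layout described above: a header row (class, text),
# an integer label in column 0, and comma-free text in column 1.
SAMPLE_CSV = (
    "class,text\n"
    "0,what is the capital of France\n"
    "1,how do I reset my password\n"
)


def read_rows(csv_text):
    """Parse CSV text into (label, text) pairs, converting the label to int."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(int(row["class"]), row["text"]) for row in reader]


rows = read_rows(SAMPLE_CSV)
```

With this layout the label sits in column 0, which is what `--label_column_id 0` in the command below refers to.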

To reproduce

Steps to reproduce the behavior:

  1. Call run_tf_text_classification.py with flags from the example in the “Run generic text classification script in TensorFlow” section of examples/text-classification:
python run_tf_text_classification.py \
  --train_file train.csv \
  --dev_file dev.csv \
  --test_file test.csv \
  --label_column_id 0 \
  --model_name_or_path bert-base-multilingual-uncased \
  --output_dir model \
  --num_train_epochs 4 \
  --per_device_train_batch_size 16 \
  --per_device_eval_batch_size 32 \
  --do_train \
  --do_eval \
  --do_predict \
  --logging_steps 10 \
  --evaluate_during_training \
  --save_steps 10 \
  --overwrite_output_dir \
  --max_seq_length 128
  2. Error is encountered:
Traceback (most recent call last):
  File "run_tf_text_classification.py", line 283, in <module>
    main()
  File "run_tf_text_classification.py", line 199, in main
    training_args.n_replicas,
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/transformers/file_utils.py", line 936, in wrapper
    return func(*args, **kwargs)
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/transformers/training_args_tf.py", line 180, in n_replicas
    return self._setup_strategy.num_replicas_in_sync
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/transformers/file_utils.py", line 914, in __get__
    cached = self.fget(obj)
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/transformers/file_utils.py", line 936, in wrapper
    return func(*args, **kwargs)
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/transformers/training_args_tf.py", line 122, in _setup_strategy
    if self.args.xla:
AttributeError: 'TFTrainingArguments' object has no attribute 'args'
  3. If the logger.info call is commented out (lines 197-202), the above error is avoided but another error is encountered:
Traceback (most recent call last):
  File "run_tf_text_classification.py", line 282, in <module>
    main()
  File "run_tf_text_classification.py", line 221, in main
    max_seq_length=data_args.max_seq_length,
  File "run_tf_text_classification.py", line 42, in get_tfds
    ds = datasets.load_dataset("csv", data_files=files)
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/datasets/load.py", line 604, in load_dataset
    **config_kwargs,
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/datasets/builder.py", line 158, in __init__
    **config_kwargs,
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/datasets/builder.py", line 269, in _create_builder_config
    for key in sorted(data_files.keys()):
TypeError: '<' not supported between instances of 'NamedSplit' and 'NamedSplit'
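
Both tracebacks reduce to small patterns that can be reproduced without transformers or datasets. The sketch below uses stand-in classes (`FakeTrainingArguments` and `FakeSplit`, both hypothetical) rather than the real library code. The first error looks like a method reading `self.args.xla` on an object that itself holds the `xla` field; the second is what Python raises when `sorted()` is given objects that define no ordering. It illustrates the failure modes only and is not the upstream fix.

```python
from dataclasses import dataclass


@dataclass
class FakeTrainingArguments:
    """Stand-in for an arguments object; not the real TFTrainingArguments."""

    xla: bool = False

    def broken_check(self):
        # Mirrors the failing line in the first traceback: the arguments
        # object has no `args` attribute of its own.
        return self.args.xla

    def working_check(self):
        # Reads the field directly from the object that owns it.
        return self.xla


class FakeSplit:
    """Stand-in for a split key that defines no ordering (no __lt__)."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


args = FakeTrainingArguments()
try:
    args.broken_check()
    attribute_error = None
except AttributeError as exc:
    attribute_error = str(exc)

files = {FakeSplit("train"): "train.csv", FakeSplit("dev"): "dev.csv"}
try:
    sorted(files.keys())
    sort_error = None
except TypeError as exc:
    sort_error = str(exc)

# Sorting by the string form sidesteps the missing ordering.
ordered_names = [str(key) for key in sorted(files.keys(), key=str)]
```

Here `attribute_error` and `sort_error` hold messages of the same shape as the two errors reported above, with the stand-in class names in place of the library ones.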

Here is a pip freeze:

absl-py==0.10.0
astunparse==1.6.3
cachetools==4.1.1
certifi==2020.6.20
chardet==3.0.4
click==7.1.2
dataclasses==0.7
datasets==1.0.2
dill==0.3.2
filelock==3.0.12
gast==0.3.3
google-auth==1.21.3
google-auth-oauthlib==0.4.1
google-pasta==0.2.0
grpcio==1.32.0
h5py==2.10.0
idna==2.10
importlib-metadata==2.0.0
joblib==0.16.0
Keras-Preprocessing==1.1.2
Markdown==3.2.2
numpy==1.18.5
oauthlib==3.1.0
opt-einsum==3.3.0
packaging==20.4
pandas==1.1.2
protobuf==3.13.0
pyarrow==1.0.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2020.1
regex==2020.7.14
requests==2.24.0
requests-oauthlib==1.3.0
rsa==4.6
sacremoses==0.0.43
scipy==1.4.1
sentencepiece==0.1.91
six==1.15.0
tensorboard==2.3.0
tensorboard-plugin-wit==1.7.0
tensorflow==2.3.0
tensorflow-estimator==2.3.0
termcolor==1.1.0
tokenizers==0.8.1rc2
tqdm==4.49.0
transformers==3.2.0
urllib3==1.25.10
Werkzeug==1.0.1
wrapt==1.12.1
xxhash==2.0.0
zipp==3.2.0

Expected behavior

Model begins to train on custom dataset.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 10 (7 by maintainers)

Top GitHub Comments

1 reaction
jplu commented, Sep 24, 2020

@sunnyville01 Just install the version on master with pip install git+https://github.com/huggingface/transformers.git
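
Before reinstalling, it may help to confirm which release is actually installed. The sketch below rests on stated assumptions: `parse_version` and `is_newer_than` are hypothetical helpers, and the only version known to be affected is 3.2.0, taken from the environment info in the report. The thread does not say which tagged release contains the fix, so a newer version number is a hint rather than a guarantee.

```python
import re

# Version from the report's environment info; the only one known to be affected.
AFFECTED_VERSION = "3.2.0"


def parse_version(version):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("no numeric version prefix in %r" % version)
    return tuple(int(part) for part in match.group(0).split("."))


def is_newer_than(installed, reference=AFFECTED_VERSION):
    """True if `installed` has a higher numeric prefix than `reference`."""
    return parse_version(installed) > parse_version(reference)
```

In an environment where transformers is installed, `transformers.__version__` could be passed as `installed`.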

0 reactions
jplu commented, Oct 2, 2020

@pvcastro Can you please open a new issue with all the details so that we can reproduce it? This thread is closed and is about a different problem.
