TensorFlow Question-Answering example fails to run (cardinality error)
See original GitHub issue

Environment info
- `transformers` version: 4.4.0.dev0
- Platform: Linux-4.15.0-111-generic-x86_64-with-Ubuntu-18.04-bionic
- Python version: 3.6.8
- PyTorch version (GPU?): 1.7.1 (False)
- Tensorflow version (GPU?): 2.2.0 (False)
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Information
Model I am using (Bert, XLNet …): bert-base-uncased or roberta-base
The problem arises when using:
- the official example scripts: question-answering (run_tf_squad.py)
Error message:
Instructions for updating:
back_prop=False is deprecated. Consider using tf.stop_gradient instead.
Instead of:
results = tf.map_fn(fn, elems, back_prop=False)
Use:
results = tf.nest.map_structure(tf.stop_gradient, tf.map_fn(fn, elems))
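The deprecation notice above is informational only (it does not cause the crash). A minimal sketch of the two forms it mentions, using a toy `fn` and `elems` for illustration:

```python
import tensorflow as tf

elems = tf.constant([1.0, 2.0, 3.0])
fn = lambda x: x * x

# Deprecated form (triggers the warning above):
#   results = tf.map_fn(fn, elems, back_prop=False)
# Recommended replacement, per the warning text:
results = tf.nest.map_structure(tf.stop_gradient, tf.map_fn(fn, elems))
```

Both forms compute the same values; the replacement simply wraps the result in `tf.stop_gradient` so no gradients flow through it.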
87599it [01:00, 1437.03it/s]
10570it [00:11, 958.83it/s]
convert squad examples to features: 2%|_ | 1697/87599 [00:13<10:43, 133.40it/s][WARNING|squad.py:118] 2021-02-17 22:20:03,736 >> Could not find answer: 'municipal building and' vs. 'a municipal building'
convert squad examples to features: 50%|_____ | 43393/87599 [05:04<05:04, 145.24it/s][WARNING|squad.py:118] 2021-02-17 22:24:55,103 >> Could not find answer: 'message stick,' vs. 'a message stick'
convert squad examples to features: 100%|| 87599/87599 [10:10<00:00, 143.59it/s]
add example index and unique id: 100%|| 87599/87599 [00:00<00:00, 784165.53it/s]
convert squad examples to features: 100%|| 10570/10570 [01:14<00:00, 140.99it/s]
add example index and unique id: 100%|| 10570/10570 [00:00<00:00, 510000.04it/s]
[WARNING|integrations.py:60] 2021-02-17 22:31:16,214 >> Using the WANDB_DISABLED
environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
[INFO|trainer_tf.py:125] 2021-02-17 22:31:16,214 >> To use comet_ml logging, run pip/conda install comet_ml
see https://www.comet.ml/docs/python-sdk/huggingface/
Traceback (most recent call last):
File "run_tf_squad.py", line 256, in <module>
main()
File "run_tf_squad.py", line 250, in main
trainer.train()
File "/home/transformers/src/transformers/trainer_tf.py", line 457, in train
train_ds = self.get_train_tfdataset()
File "/home/transformers/src/transformers/trainer_tf.py", line 141, in get_train_tfdataset
self.num_train_examples = self.train_dataset.cardinality().numpy()
AttributeError: '_AssertCardinalityDataset' object has no attribute 'cardinality'
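The likely cause is a TensorFlow version mismatch: the `Dataset.cardinality()` method was only added in TF 2.3, while this environment runs TF 2.2, where only the standalone `tf.data.experimental.cardinality` function exists. A hedged sketch of a version-tolerant workaround (the actual fix in `trainer_tf.py` may differ; upgrading TensorFlow is the simpler route):

```python
import tensorflow as tf

ds = tf.data.Dataset.range(10)

# Dataset.cardinality() exists as a method only in TF >= 2.3;
# fall back to the standalone function on older versions like 2.2.
if hasattr(ds, "cardinality"):
    n = ds.cardinality().numpy()
else:
    n = tf.data.experimental.cardinality(ds).numpy()
```

Note that either call can return `tf.data.UNKNOWN_CARDINALITY` (-2) or `tf.data.INFINITE_CARDINALITY` (-1) for datasets whose length cannot be determined statically.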
The task I am working on is:
- an official GLUE/SQuAD task: SQuAD v1
To reproduce
- Use the latest master from huggingface/transformers
- Go to examples/question-answering
- Run WANDB_DISABLED=true python run_tf_squad.py --model_name_or_path roberta-base --output_dir model --max_seq_length 384 --num_train_epochs 2 --per_device_train_batch_size 8 --per_device_eval_batch_size 16 --do_train --do_eval --logging_dir logs --logging_steps 10 --learning_rate 3e-5 --no_cuda=True --doc_stride 128
Could you take a look @sgugger?
Issue Analytics
- State: open
- Created: 3 years ago
- Comments: 6 (3 by maintainers)
Top GitHub Comments
Hello, I am getting the same error.
It says "AttributeError: 'Dataset' object has no attribute 'cardinality'" when I train it. Does anyone know how I should address this issue?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.