assertion failed: [predictions must be >= 0]
Trying to train a binary classifier over sentence pairs with a custom dataset throws a TensorFlow error.
Environment info
- transformers version: 4.2.2
- Platform: Ubuntu 18.04
- Python version: 3.7.5
- PyTorch version (GPU?):
- Tensorflow version (GPU): 2.3.1
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: Nope
Who can help
Information
Model I am using (TFRoberta, TFXLMRoberta…):
The problem arises when using:
- the official example scripts: (give details below)
- my own modified scripts: (give details below)
The task I am working on is:
- an official GLUE/SQuAD task: https://huggingface.co/transformers/training.html#fine-tuning-in-native-tensorflow-2
- my own task or dataset:
To reproduce
Steps to reproduce the behavior:
from transformers import TFAutoModelForSequenceClassification, AutoTokenizer
from tensorflow.keras.metrics import Precision, Recall
import tensorflow as tf


def build_dataset(tokenizer, filename):
    # Read tab-separated "sentence1<TAB>sentence2<TAB>label" lines
    data = [[], [], []]
    with open(filename, 'r') as file_:
        for line in file_:
            fields = line.split('\t')
            data[0].append(fields[0].strip())
            data[1].append(fields[1].strip())
            data[2].append(int(fields[2].strip()))
    sentences = tokenizer(data[0], data[1],
                          padding=True,
                          truncation=True)
    return tf.data.Dataset.from_tensor_slices((dict(sentences),
                                               data[2]))


settings = {
    "model": 'roberta-base',
    "batch_size": 8,
    "n_classes": 1,
    "epochs": 10,
    "steps_per_epoch": 128,
    "patience": 5,
    "loss": "binary_crossentropy",
    "lr": 5e-7,
    "clipnorm": 1.0,
}

tokenizer = AutoTokenizer.from_pretrained(settings["model"])
train_dataset = build_dataset(tokenizer, 'train.head')
train_dataset = train_dataset.shuffle(
    len(train_dataset)).batch(settings["batch_size"])
dev_dataset = build_dataset(tokenizer, 'dev.head').batch(
    settings["batch_size"])

model = TFAutoModelForSequenceClassification.from_pretrained(
    settings['model'],
    num_labels=1)
model.compile(optimizer='adam',
              #loss='binary_crossentropy',
              loss=model.compute_loss,
              metrics=[Precision(name='p'), Recall(name='r')])
model.summary()
model.fit(train_dataset,
          epochs=settings["epochs"],
          #steps_per_epoch=steps_per_epoch,
          validation_data=dev_dataset,
          batch_size=settings["batch_size"],
          verbose=1)
This gives the following output:
Some layers of TFRobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Model: "tf_roberta_for_sequence_classification"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
roberta (TFRobertaMainLayer) multiple 124055040
_________________________________________________________________
classifier (TFRobertaClassif multiple 591361
=================================================================
Total params: 124,646,401
Trainable params: 124,646,401
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
Traceback (most recent call last):
File "finetune.py", line 52, in <module>
verbose=1)
File "/work/user/bicleaner-neural/venv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 108, in _method_wrapper
return method(self, *args, **kwargs)
File "/work/user/bicleaner-neural/venv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1098, in fit
tmp_logs = train_function(iterator)
File "/work/user/bicleaner-neural/venv/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
result = self._call(*args, **kwds)
File "/work/user/bicleaner-neural/venv/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 840, in _call
return self._stateless_fn(*args, **kwds)
File "/work/user/bicleaner-neural/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2829, in __call__
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "/work/user/bicleaner-neural/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1848, in _filtered_call
cancellation_manager=cancellation_manager)
File "/work/user/bicleaner-neural/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1924, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "/work/user/bicleaner-neural/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 550, in call
ctx=ctx)
File "/work/user/bicleaner-neural/venv/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: assertion failed: [predictions must be >= 0] [Condition x >= y did not hold element-wise:] [x (tf_roberta_for_sequence_classification/classifier/out_proj/BiasAdd:0) = ] [[0.153356239][0.171548933][0.121127911]...] [y (Cast_3/x:0) = ] [0]
[[{{node assert_greater_equal/Assert/AssertGuard/else/_1/assert_greater_equal/Assert/AssertGuard/Assert}}]]
[[assert_greater_equal_1/Assert/AssertGuard/pivot_f/_31/_205]]
(1) Invalid argument: assertion failed: [predictions must be >= 0] [Condition x >= y did not hold element-wise:] [x (tf_roberta_for_sequence_classification/classifier/out_proj/BiasAdd:0) = ] [[0.153356239][0.171548933][0.121127911]...] [y (Cast_3/x:0) = ] [0]
[[{{node assert_greater_equal/Assert/AssertGuard/else/_1/assert_greater_equal/Assert/AssertGuard/Assert}}]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_20780]
Function call stack:
train_function -> train_function
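For context: Keras's Precision and Recall metrics assert that every prediction lies in [0, 1], while a num_labels=1 classification head emits unbounded raw logits (like the 0.153356239 values shown in the truncated traceback). A minimal stdlib sketch, not tied to the script above, of the squashing that keeps predictions in the asserted range:

```python
import math

def sigmoid(z: float) -> float:
    # Squash an unbounded logit into (0, 1), the range that
    # Keras Precision/Recall assert their inputs lie in.
    return 1.0 / (1.0 + math.exp(-z))

# Raw classifier-head outputs are unbounded; after the sigmoid
# every value satisfies 0 <= p <= 1, so the assertion cannot fire.
logits = [0.153356239, -2.4, 5.1]
probs = [sigmoid(z) for z in logits]
print(probs)
```

The repro script feeds the head's raw logits straight into the metrics, with no such squashing in between.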
The dataset examples look like this:
print(list(train_dataset.take(1).as_numpy_iterator()))
[({'input_ids': array([[ 0, 133, 864, ..., 1, 1, 1],
[ 0, 133, 382, ..., 1, 1, 1],
[ 0, 1121, 645, ..., 1, 1, 1],
...,
[ 0, 133, 864, ..., 1, 1, 1],
[ 0, 1121, 144, ..., 1, 1, 1],
[ 0, 495, 21046, ..., 1, 1, 1]], dtype=int32), 'attention_mask': array([[1, 1, 1, ..., 0, 0, 0],
[1, 1, 1, ..., 0, 0, 0],
[1, 1, 1, ..., 0, 0, 0],
...,
[1, 1, 1, ..., 0, 0, 0],
[1, 1, 1, ..., 0, 0, 0],
[1, 1, 1, ..., 0, 0, 0]], dtype=int32)}, array([0, 0, 0, 0, 1, 0, 0, 0], dtype=int32))]
Expected behavior
Issue Analytics
- State:
- Created 3 years ago
- Comments: 6 (3 by maintainers)
Top GitHub Comments
Figured out that the error was thrown by the Precision and Recall classes, because they require values between 0 and 1. In case someone wants to use them when training with native TensorFlow, I managed to add the argmax to the classes by overriding them. So the code that I posted works with num_classes=2 and the overridden classes as metrics.

For binary classification you have two labels and thus two neurons; it is more intuitive to proceed that way 😃 But yes, you can also do what you propose and round the float value to 0 or 1 depending on the output of your sigmoid activation. Nevertheless, our models don't provide such an approach.
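The exact override was not included in the comment; a hedged sketch of what it describes follows. With num_labels=2 the model emits a pair of class logits per example, and wrapping Keras Precision so it argmaxes them first keeps the values handed to update_state in {0, 1}, as the metric requires. The class name `ArgmaxPrecision` is illustrative, not from the original thread.

```python
import tensorflow as tf

class ArgmaxPrecision(tf.keras.metrics.Precision):
    """Precision over argmaxed class logits (hypothetical name)."""

    def update_state(self, y_true, y_pred, sample_weight=None):
        # (batch, 2) logits -> {0, 1} class ids, which satisfies the
        # metric's assertion that predictions lie in [0, 1].
        y_pred = tf.argmax(y_pred, axis=-1)
        return super().update_state(y_true, y_pred, sample_weight)

# Predicted classes here are [1, 0, 1], matching the labels exactly.
m = ArgmaxPrecision(name='p')
m.update_state([1, 0, 1], [[0.1, 2.0], [3.0, -1.0], [-0.5, 0.2]])
print(float(m.result()))
```

An analogous Recall subclass works the same way; passing `metrics=[ArgmaxPrecision(name='p')]` to `model.compile()` then replaces the stock metric. The alternative the maintainer mentions — a single sigmoid neuron whose output is rounded to 0 or 1 — avoids the subclass at the cost of a head the library does not provide out of the box.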