ValueError: max() arg is an empty sequence
When I run the code below, I get stuck at the error in the title. Why?
Using TensorFlow backend.
2018-05-22 11:47:25.286883: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.2 AVX AVX2 FMA
Epoch 1/15
Traceback (most recent call last):
  File "test.py", line 9, in <module>
    model.train(x_train, y_train, x_valid, y_valid)
  File "/Users/norio.kosaka/anaconda3/envs/py36/lib/python3.6/site-packages/anago/wrapper.py", line 50, in train
    trainer.train(x_train, y_train, x_valid, y_valid)
  File "/Users/norio.kosaka/anaconda3/envs/py36/lib/python3.6/site-packages/anago/trainer.py", line 51, in train
    callbacks=callbacks)
  File "/Users/norio.kosaka/anaconda3/envs/py36/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/Users/norio.kosaka/anaconda3/envs/py36/lib/python3.6/site-packages/keras/engine/training.py", line 2145, in fit_generator
    generator_output = next(output_generator)
  File "/Users/norio.kosaka/anaconda3/envs/py36/lib/python3.6/site-packages/keras/utils/data_utils.py", line 770, in get
    six.reraise(value.__class__, value, value.__traceback__)
  File "/Users/norio.kosaka/anaconda3/envs/py36/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
  File "/Users/norio.kosaka/anaconda3/envs/py36/lib/python3.6/site-packages/keras/utils/data_utils.py", line 635, in _data_generator_task
    generator_output = next(self._generator)
  File "/Users/norio.kosaka/anaconda3/envs/py36/lib/python3.6/site-packages/anago/reader.py", line 137, in data_generator
    yield preprocessor.transform(X, y)
  File "/Users/norio.kosaka/anaconda3/envs/py36/lib/python3.6/site-packages/anago/preprocess.py", line 115, in transform
    sents, y = self.pad_sequence(words, chars, y)
  File "/Users/norio.kosaka/anaconda3/envs/py36/lib/python3.6/site-packages/anago/preprocess.py", line 148, in pad_sequence
    word_ids, sequence_lengths = pad_sequences(word_ids, 0)
  File "/Users/norio.kosaka/anaconda3/envs/py36/lib/python3.6/site-packages/anago/preprocess.py", line 197, in pad_sequences
    max_length = len(max(sequences, key=len))
ValueError: max() arg is an empty sequence
import anago
from anago.reader import load_data_and_labels
x_train, y_train = load_data_and_labels('./data/train.txt')
x_valid, y_valid = load_data_and_labels('./data/valid.txt')
x_test, y_test = load_data_and_labels('./data/test.txt')
model = anago.Sequence()
model.train(x_train, y_train, x_valid, y_valid)
model.eval(x_test, y_test)
words = 'President Obama is speaking at the White House.'.split()
model.analyze(words)
Issue Analytics
- Created 5 years ago
- Comments:8 (1 by maintainers)
Maybe you cut off your data at the wrong point? Check the last rows of train.txt and valid.txt and make sure there is an empty line at the end and that the last sentences are complete (the end of a sentence is marked by an empty line after it).

This error happens when your validation set contains tags that do not exist in your training set.
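Both suggestions above can be checked before training. Here is a minimal sketch, assuming the data files hold one token per line with the tag in the last whitespace-separated column and a blank line between sentences. The helper names (`read_tags`, `ends_with_blank_line`, `unseen_tags`, `report`) are made up for illustration and are not part of anago.

```python
from pathlib import Path


def read_tags(path):
    """Collect the set of tags in a file (assumed: tag is the last column)."""
    tags = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if line.strip():  # blank lines only separate sentences
            tags.add(line.split()[-1])
    return tags


def ends_with_blank_line(path):
    """True if the last sentence in the file is closed by an empty line."""
    return Path(path).read_text(encoding="utf-8").endswith("\n\n")


def unseen_tags(train_path, valid_path):
    """Tags that occur in the validation file but never in the training file."""
    return read_tags(valid_path) - read_tags(train_path)


def report(train_path, valid_path):
    """Return a list of human-readable problems found in the two files."""
    problems = []
    for path in (train_path, valid_path):
        if not ends_with_blank_line(path):
            problems.append(f"{path}: no empty line after the last sentence")
    missing = unseen_tags(train_path, valid_path)
    if missing:
        problems.append(f"tags only in {valid_path}: {sorted(missing)}")
    return problems
```

Calling `report('./data/train.txt', './data/valid.txt')` and printing the result should show whether either of the two conditions applies to your data. An empty list means neither check found a problem.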
Since this can also happen in other kinds of machine learning problems, I built a workaround for it:
I defined a new Preprocessing class that also adds the tags from the validation set to the self.vocab_tag list.
You also need a new wrapper class that is almost equivalent to Sequence, but uses your new preprocessor:
Anyway, I am thinking about changing my preprocessor so that it takes a predefined list of tags for self.vocab_tag, since the current approach may still fail once you test your model and your test set contains tags that do not exist in the training or validation set.
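For illustration, the tag-vocabulary idea could look roughly like the sketch below. This is not anago's preprocessor and not the workaround code referred to above. `TagVocab`, `from_label_sets`, and the choice to map unknown tags to "O" are all assumptions made for the example.

```python
class TagVocab:
    """Maps tag strings to integer ids (hypothetical helper, not part of anago)."""

    def __init__(self, tags, pad="<PAD>", unk_tag="O"):
        self.pad = pad
        self.unk_tag = unk_tag
        # Id 0 is reserved for padding; the fallback tag is always present.
        ordered = [pad] + sorted((set(tags) | {unk_tag}) - {pad})
        self.id2tag = ordered
        self.tag2id = {tag: i for i, tag in enumerate(ordered)}

    @classmethod
    def from_label_sets(cls, *label_sets, **kwargs):
        """Build the vocabulary from the union of several label sets,
        e.g. the training and validation labels."""
        tags = {tag for labels in label_sets for sent in labels for tag in sent}
        return cls(tags, **kwargs)

    def encode(self, sent_tags):
        """Convert one sentence's tags to ids; unseen tags fall back to unk_tag."""
        fallback = self.tag2id[self.unk_tag]
        return [self.tag2id.get(tag, fallback) for tag in sent_tags]


# Option 1: a predefined tag list, independent of any dataset.
predefined = TagVocab(["B-PER", "I-PER", "O"])

# Option 2: the union of the tags seen in training and validation labels.
merged = TagVocab.from_label_sets([["B-PER", "O"]], [["B-LOC", "O"]])
```

With a predefined list, a tag that appears only in the test set is mapped to the fallback instead of raising an error. Whether silently mapping it to "O" is acceptable depends on your evaluation, so treat that default as a placeholder.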