Can I change the vocabulary size? Getting this error in AutoKeras (TextClassifier):
Bug Description
Getting the following error using multi-label classification. Is there a way to increase the vocabulary size?
```
Epoch 1/1000
2020-03-17 17:56:31.128359: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-03-17 17:56:31.665519: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-03-17 17:56:33.256128: W tensorflow/stream_executor/gpu/redzone_allocator.cc:312] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only
Relying on driver to perform ptx compilation. This message will be only logged once.
1/606 […] - ETA: 31:35 - loss: 0.6975 - accuracy: 0.4538
2/606 […] - ETA: 16:35 - loss: 0.6923 - accuracy: 0.5070
3/606 […] - ETA: 11:29 - loss: 0.6879 - accuracy: 0.5554
4/606 […] - ETA: 8:56 - loss: 0.6833 - accuracy: 0.5962
5/606 […] - ETA: 7:24 - loss: 0.6788 - accuracy: 0.6288
6/606 […] - ETA: 6:24 - loss: 0.6741 - accuracy: 0.6549
7/606 […] - ETA: 5:40 - loss: 0.6685 - accuracy: 0.6780
2020-03-17 17:56:34.376333: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: indices[28,106] = 20000 is not in [0, 20000)
	 [[{{node model/embedding/embedding_lookup}}]]
	 [[VariableShape/_50]]
2020-03-17 17:56:34.376719: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: indices[28,106] = 20000 is not in [0, 20000)
	 [[{{node model/embedding/embedding_lookup}}]]
8/606 […] - ETA: 5:11 - loss: 0.6685 - accuracy: 0.6780
WARNING:tensorflow:Early stopping conditioned on metric val_loss which is not available. Available metrics are: loss,accuracy
WARNING:tensorflow:Can save best model only with val_loss available, skipping.
Traceback (most recent call last):
  File "C:\Development\Python\Python376\lib\contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "C:\Development\Python\Python376\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 2803, in variable_creator_scope
    yield
  File "C:\Development\Python\Python376\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 342, in fit
    total_epochs=epochs)
  File "C:\Development\Python\Python376\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 128, in run_one_epoch
    batch_outs = execution_function(iterator)
  File "C:\Development\Python\Python376\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 98, in execution_function
    distributed_function(input_fn))
  File "C:\Development\Python\Python376\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 568, in __call__
    result = self._call(*args, **kwds)
  File "C:\Development\Python\Python376\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 599, in _call
    return self._stateless_fn(*args, **kwds)  # pylint: disable=not-callable
  File "C:\Development\Python\Python376\lib\site-packages\tensorflow_core\python\eager\function.py", line 2363, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "C:\Development\Python\Python376\lib\site-packages\tensorflow_core\python\eager\function.py", line 1611, in _filtered_call
    self.captured_inputs)
  File "C:\Development\Python\Python376\lib\site-packages\tensorflow_core\python\eager\function.py", line 1692, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "C:\Development\Python\Python376\lib\site-packages\tensorflow_core\python\eager\function.py", line 545, in call
    ctx=ctx)
  File "C:\Development\Python\Python376\lib\site-packages\tensorflow_core\python\eager\execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: indices[28,106] = 20000 is not in [0, 20000)
	 [[node model/embedding/embedding_lookup (defined at \Development\Python\Python376\lib\site-packages\autokeras\engine\tuner.py:71) ]]
	 [[VariableShape/_50]]
  (1) Invalid argument: indices[28,106] = 20000 is not in [0, 20000)
	 [[node model/embedding/embedding_lookup (defined at \Development\Python\Python376\lib\site-packages\autokeras\engine\tuner.py:71) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference_distributed_function_3354]

Errors may have originated from an input operation.
Input Source operations connected to node model/embedding/embedding_lookup:
 model/embedding/embedding_lookup/2981 (defined at \Development\Python\Python376\lib\contextlib.py:112)

Input Source operations connected to node model/embedding/embedding_lookup:
 model/embedding/embedding_lookup/2981 (defined at \Development\Python\Python376\lib\contextlib.py:112)

Function call stack:
distributed_function -> distributed_function
```
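The key line is `indices[28,106] = 20000 is not in [0, 20000)`: the embedding table was built with 20000 rows, so valid token ids are 0 through 19999, and some input sequence contains a token id equal to the vocabulary size itself. The bounds check can be illustrated in plain Python (the `lookup_ok` helper below is hypothetical, just mimicking the range test `embedding_lookup` performs):

```python
# An embedding table with `max_tokens` rows accepts ids in the
# half-open range [0, max_tokens), i.e. 0 .. max_tokens - 1.
max_tokens = 20000

def lookup_ok(token_id, vocab_size=max_tokens):
    """Mimic the bounds check that embedding_lookup performs."""
    return 0 <= token_id < vocab_size

print(lookup_ok(19999))  # True: last valid row
print(lookup_ok(20000))  # False: exactly the failing index in the log above
```

So the fix is either to raise the vocabulary size of the pipeline or to ensure the tokenizer never emits an id equal to `max_tokens`.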
Bug Reproduction
Code for reproducing the bug:
Data used by the code:
Expected Behavior
Setup Details
Include the details about the versions of:
- OS type and version:
- Python:
- autokeras:
- keras-tuner:
- scikit-learn:
- numpy:
- pandas:
- tensorflow:
Additional context
Issue Analytics
- State:
- Created 4 years ago
- Comments: 5 (1 by maintainers)
Top GitHub Comments
You have to use the `TextToIntSequence` preprocessor (https://autokeras.com/preprocessor/#texttointsequence-class); its max tokens setting is the vocabulary size.
@haifeng-jin This solved my problem, thank you. My problem was mainly that I was not using `AutoModel`. I had used `TextClassifier` before, but that didn't work. The following code worked: