unclear `prepare_seq2seq_batch` deprecation
When using `prepare_seq2seq_batch`, the user now gets:
transformers-master/src/transformers/tokenization_utils_base.py:3277: FutureWarning: `prepare_seq2seq_batch` is deprecated and will be removed in version 5 of 🤗 Transformers. Use the regular `__call__` method to prepare your inputs and the tokenizer under the `with_target_tokenizer` context manager to prepare your targets. See the documentation of your specific tokenizer for more details.
This is very hard to act on, as I’m not sure what the “regular `__call__` method” refers to, and I couldn’t find any tokenizer documentation that ever mentions `with_target_tokenizer`.
Perhaps this is an unintended typo? Was it meant to be `with target_tokenizer`? `with FooTokenizer`?
Please kindly suggest a more user-friendly deprecation message, and at least one example or a link to one.
Thank you.
Top GitHub Comments
I stumbled upon this issue when googling the warning. For the translation task, this:

```python
tokenized_text = tokenizer.prepare_seq2seq_batch([text], return_tensors='pt')
```

has to be replaced by this:
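A minimal sketch of the replacement, assuming a source string `text` and a target string `target_text` (the checkpoint name is only a placeholder):

```python
from transformers import AutoTokenizer

# Any seq2seq checkpoint works here; this MarianMT model is just a placeholder.
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-de")

text = "Hello world"          # placeholder source sentence
target_text = "Hallo Welt"    # placeholder target sentence

# The regular __call__: just call the tokenizer on the inputs.
inputs = tokenizer([text], return_tensors="pt")

# Targets are tokenized under the as_target_tokenizer context manager.
with tokenizer.as_target_tokenizer():
    labels = tokenizer([target_text], return_tensors="pt")

inputs["labels"] = labels["input_ids"]
```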
Which is much clearer than using `prepare_seq2seq_batch`, but for anyone coming from languages other than Python, the concept of `__call__` might not be transparent in the first place 😃

Why is `__call__` hard to understand? It’s the regular Python method for when the tokenizer is called directly on inputs. How would you formulate that better?

For `with_target_tokenizer`, it’s a typo indeed: it should be `as_target_tokenizer`. As for an example, this is what is used in every example script, see for instance the run_translation script.
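Roughly, the preprocessing used there looks like the sketch below; the checkpoint, column names, language keys, and `max_length` are placeholders, not the script’s exact values:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-de")  # placeholder checkpoint
max_length = 128  # placeholder; the real script reads this from its arguments

def preprocess_function(examples):
    # examples["translation"] is a batch of dicts like {"en": "...", "de": "..."}
    inputs = [ex["en"] for ex in examples["translation"]]
    targets = [ex["de"] for ex in examples["translation"]]

    # Inputs go through the tokenizer's regular __call__.
    model_inputs = tokenizer(inputs, max_length=max_length, truncation=True)

    # Targets are tokenized under the as_target_tokenizer context manager.
    with tokenizer.as_target_tokenizer():
        labels = tokenizer(targets, max_length=max_length, truncation=True)

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```

In the example scripts, a function like this is then mapped over the dataset with `Dataset.map(preprocess_function, batched=True)`.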
I’m curious, where did you still find a reference to this method? It should normally have been removed from all examples and documentation (and it was deprecated five months ago).