
Add support for truncation argument when calling a Pipeline


🚀 Feature request

Currently, only the padding argument is supported when calling a pipeline, and it is not possible to pass a truncation argument. For example, running the following code sample raises an error:

import transformers as trf

# Build a feature-extraction pipeline; at the time of this issue, forwarding
# truncation= through the pipeline call was not supported and raised an error.
model = trf.pipeline(task='feature-extraction', model='bert-base-cased')
output = model('a sample text', padding=False, truncation=True)

Motivation

If toggling padding is supported, why shouldn’t toggling truncation be supported as well?

Your contribution

I think that, as with padding, this only requires adding a truncation argument to the _parse_and_tokenize method and forwarding it to the tokenizer call. If that’s the case, I would be willing to work on a PR.
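
To illustrate the idea, here is a minimal sketch of the proposed change. The class and method shape here are assumptions for illustration only; the actual Pipeline._parse_and_tokenize signature varies across transformers versions.

# Illustrative sketch only; the real Pipeline._parse_and_tokenize differs.
class PipelineSketch:
    def __init__(self, tokenizer, framework='pt'):
        self.tokenizer = tokenizer
        self.framework = framework

    def _parse_and_tokenize(self, inputs, padding=True, truncation=False,
                            add_special_tokens=True, **kwargs):
        # Forward the new truncation flag alongside padding to the tokenizer.
        return self.tokenizer(
            inputs,
            add_special_tokens=add_special_tokens,
            return_tensors=self.framework,
            padding=padding,
            truncation=truncation,
        )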

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

3 reactions
buhrmann commented, Jul 30, 2021

Hi, even though this has been closed as stale, without comment or supposed fix, it seems that in recent versions you can in fact pass both truncation and padding arguments to the pipeline’s __call__ method, and it will correctly use them when tokenizing. I’ve tested it with long texts that fail without the truncation argument, and it seems to work as expected.
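
For reference, a minimal example of the usage this comment describes, assuming a transformers release recent enough to forward the truncation argument from the pipeline call to the tokenizer (the comment reports that padding can be passed the same way):

import transformers as trf

# Assumes a recent transformers version that forwards truncation to the tokenizer.
model = trf.pipeline(task='feature-extraction', model='bert-base-cased')
long_text = 'a sample sentence ' * 1000  # well past BERT's 512-token limit
output = model(long_text, truncation=True)  # truncates instead of raising an error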

0 reactions
alexbw commented, Feb 7, 2021

+1 on this


Top Results From Across the Web

Truncating sequence -- within a pipeline - Hugging Face Forums
So I have two questions: Is there a way to just add an argument somewhere that does the truncation automatically?
Read more >
How to truncate input in the Huggingface pipeline?
Is there any way of passing the max_length and truncate parameters from the tokenizer directly to the pipeline? My work around is to...
Read more >
Hyperparameter tuning a model (v2) - Azure Machine Learning
Automate efficient hyperparameter tuning using Azure Machine Learning SDK v2 and CLI v2 by way of the SweepJob type. Define the parameter ......
Read more >
Copy number calling pipeline — CNVkit 0.9.8 documentation
A listing of all sub-commands can be obtained with cnvkit --help or -h , and the usage ... The pipeline executed by the...
Read more >
libpipeline(3) - Linux manual page - man7.org
The calling program may then start the pipeline, read output from it, wait for it to ... Convenience function to add an argument...
Read more >
