
use `@unittest.skipIf` decorators inside tokenizer's tests instead of `if ...: return`

See original GitHub issue

Feature request

Currently, many of the tokenizer tests in test_tokenization_common.py are not relevant for every tokenizer. In most cases, when a test is not relevant, it still runs but performs no verification at all. See for example this snippet: https://github.com/huggingface/transformers/blob/114295c010dd9c94d48add7a0f091ba6ebdf482b/tests/test_tokenization_common.py#L384-L396
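
For context, the pattern in question looks roughly like this (a paraphrase of the linked snippet, not the exact code):

    def test_subword_regularization_tokenizer(self) -> None:
        # The early return makes the test "pass" without verifying anything
        # when the tokenizer under test does not use sentencepiece.
        if not self.test_sentencepiece:
            return
        ...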

I would like to propose replacing these `if` checks in the test methods with `@unittest.skipIf` decorators. Applied to the previous example, this would give:

    @unittest.skipIf(not test_sentencepiece, "Not testing sentencepiece")
    def test_subword_regularization_tokenizer(self) -> None:
        # Subword regularization is only available for the slow tokenizer.
        sp_model_kwargs = {"enable_sampling": True, "alpha": 0.1, "nbest_size": -1}
        tokenizer = self.get_tokenizer(sp_model_kwargs=sp_model_kwargs)

        self.assertTrue(hasattr(tokenizer, "sp_model_kwargs"))
        self.assertIsNotNone(tokenizer.sp_model_kwargs)
        self.assertTrue(isinstance(tokenizer.sp_model_kwargs, dict))
        self.assertEqual(tokenizer.sp_model_kwargs, sp_model_kwargs)
        self.check_subword_sampling(tokenizer)

Motivation

The problem with the current approach is that we have no visibility into how many tests are actually executed for each type of tokenizer. If the test classes are misconfigured, every test can show a green check while in reality nothing has been verified.
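
As a minimal, standalone illustration (plain `unittest`, not transformers code), an early return is reported as a pass while `@unittest.skipIf` is reported as a skip, which gives exactly this missing visibility:

    import unittest

    TEST_SENTENCEPIECE = False  # stand-in for the test-class configuration flag

    class EarlyReturnStyle(unittest.TestCase):
        def test_feature(self):
            if not TEST_SENTENCEPIECE:
                return  # counted as "ok" even though nothing was checked

    class SkipIfStyle(unittest.TestCase):
        @unittest.skipIf(not TEST_SENTENCEPIECE, "Not testing sentencepiece")
        def test_feature(self):
            self.assertTrue(TEST_SENTENCEPIECE)

    if __name__ == "__main__":
        # With verbosity=2 the run ends with "OK (skipped=1)":
        #   EarlyReturnStyle.test_feature ... ok (silent, unverified pass)
        #   SkipIfStyle.test_feature ... skipped 'Not testing sentencepiece'
        unittest.main(verbosity=2)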

Your contribution

If you ever find it relevant, I can make the changes or let someone else who would be available to do it before me.

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 16 (10 by maintainers)

Top GitHub Comments

1 reaction
sgugger commented, Sep 22, 2022

That works for me!

1 reaction
ydshieh commented, Sep 22, 2022

This is awesome, @SaulLu! Thank you 😃. I would love this new approach to skipping. I'll leave it to @sgugger and @LysandreJik for a final confirmation.

Read more comments on GitHub >

