
Refactor tests into small test suites


🚀 Refactor tests into small test suites.

As suggested/discussed in some PRs, I would like to clean up how the unit tests are organized, parameterized, and run.

Motivation

Functionalities in torchaudio share many common test patterns, such as comparison against librosa, jit-ability, and batch consistency. So if a function has a nicely organized test suite, adding the same type of function becomes easier. (E.g. “I added this new SoX filter which is very similar to A, and there is a suite of tests defined for A, so those are the tests I should be adding.”) That way, it becomes easier for new contributors to write tests.

Let’s take the ./test/test_functional.py module as an example. There are a lot of test methods under the TestFunctional class.

The difficulty I had with this structure when I worked on https://github.com/pytorch/audio/pull/448 is that it was not immediately clear what kinds of tests are typical to add.

The current organization of test_functional.py
$ grep '\bclass\b\|def test' -r ./test/test_functional.py
class TestFunctional(unittest.TestCase):
    def test_torchscript_spectrogram(self):
    def test_torchscript_griffinlim(self):
    def test_griffinlim(self):
    def test_batch_griffinlim(self):
    def test_compute_deltas_onechannel(self):
    def test_compute_deltas_twochannel(self):
    def test_compute_deltas_randn(self):
    def test_batch_pitch(self):
    def test_jit_pitch(self):
    def test_istft_is_inverse_of_stft1(self):
    def test_istft_is_inverse_of_stft2(self):
    def test_istft_is_inverse_of_stft3(self):
    def test_istft_is_inverse_of_stft4(self):
    def test_istft_is_inverse_of_stft5(self):
    def test_istft_of_ones(self):
    def test_istft_of_zeros(self):
    def test_istft_requires_overlap_windows(self):
    def test_istft_requires_nola(self):
    def test_istft_requires_non_empty(self):
    def test_istft_of_sine(self):
    def test_linearity_of_istft1(self):
    def test_linearity_of_istft2(self):
    def test_linearity_of_istft3(self):
    def test_linearity_of_istft4(self):
    def test_batch_istft(self):
    def test_create_fb(self):
    def test_gain(self):
    def test_dither(self):
    def test_vctk_transform_pipeline(self):
    def test_pitch(self):
    def test_torchscript_create_fb_matrix(self):
    def test_torchscript_amplitude_to_DB(self):
    def test_torchscript_create_dct(self):
    def test_torchscript_mu_law_encoding(self):
    def test_torchscript_mu_law_decoding(self):
    def test_torchscript_complex_norm(self):
    def test_mask_along_axis(self):
    def test_mask_along_axis_iid(self):
    def test_torchscript_gain(self):
    def test_torchscript_dither(self):
def test_phase_vocoder(complex_specgrams, rate, hop_length):
def test_complex_norm(complex_tensor, power):
def test_mask_along_axis(specgram, mask_param, mask_value, axis):
def test_mask_along_axis_iid(specgrams, mask_param, mask_value, axis):

Pitch

Continuing with the test_functional.py example above, we can start by breaking down test cases and composing test suites.

# This is just an illustration
class TestSpectrogram:
    """Test suite for `spectrogram`"""

    def test_accuracy(self):
        """Produces expected results"""

    def test_jit_consistency(self):
        """is jit-able and returns consistent result"""

    def test_batch_consistency(self):
        """returns consistent results for batched input"""

    def test_comparison_against_librosa(self):
        """should yield results very close to librosa's implementation"""

Alternatives

An alternative way to break down tests is by the type of test.

class TestJit:
    """Test suite for jit-ability and consistency"""
    def test_spectrogram(self):
        """`spectrogram` should be jit-able"""

    def test_griffinlim(self):
        """`griffinlim` should be jit-able"""

class TestSoxConsistency:
    """Test suite for SoX consistency of filters"""
    def test_allpass(self):
        """`allpass` produces result close to SoX"""

    def test_highpass(self):
        """`highpass` produces result close to SoX"""

Pro: with this approach, it is easy to answer questions like “Which functions are jit-able?”
Con: the suite of tests for the same function is scattered across classes.
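
Either layout also pairs naturally with parameterization, which the issue mentions wanting to clean up. As a hedged sketch (the filter list, parameters, and the shape-only assertion are placeholders; a real suite would compare outputs against SoX), pytest could collapse the per-filter methods into one parameterized test:

import pytest
import torch
import torchaudio.functional as F

# Hypothetical parameterization; the real suite would check closeness to SoX
# output rather than merely that the filter runs and preserves shape.
BIQUADS = [
    ("allpass", lambda x: F.allpass_biquad(x, 16000, central_freq=1000.0)),
    ("highpass", lambda x: F.highpass_biquad(x, 16000, cutoff_freq=1000.0)),
]


@pytest.mark.parametrize("name,filt", BIQUADS, ids=[n for n, _ in BIQUADS])
def test_sox_filter(name, filt):
    waveform = torch.randn(1, 8000)
    output = filt(waveform)
    assert output.shape == waveform.shape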

Related stuff

In addition to the above, when we discuss testing, we can also talk about:

  • Should we run flake8 / black checks?
  • Which test runner to use: pytest or unittest?
  • Should we use PyTorch’s test utilities? etc.
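
On the last point, as one hedged possibility rather than a decision from the thread: PyTorch ships tensor-comparison helpers in torch.testing that give descriptive failure messages instead of a bare allclose boolean, e.g.:

import torch
import torch.testing

expected = torch.tensor([1.0, 2.0, 3.0])
actual = expected + 1e-8  # simulate tiny numerical drift

# Raises with a descriptive message (max abs/rel difference) on mismatch.
# Recent PyTorch exposes assert_close; older releases had assert_allclose.
torch.testing.assert_close(actual, expected)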


Top GitHub Comments

mthrok commented, Mar 30, 2020

@vincentqb I created PR #480, which re-organizes tests by category in a small scope (only the test_functional module). Once we can agree on the pros and cons of the re-org, I will proceed with the rest of the test modules.

Regarding git blame: it is true that the information about the original author will be lost from the latest commit, but if the intention of a test is properly documented (either as a descriptive test function name or as a docstring), then I think it’s overall more beneficial to make the tests more readable and intuitively comprehensible while the whole code base is still somewhat small.

cpuhrsch commented, Mar 26, 2020

As a word of caution on code formatting: git blame can be very useful for quickly figuring out why something went wrong. Therefore, I would not just apply some formatter to all files and commit it for the sake of code formatting. Instead, we could set up a system that forces formatting on new lines, or at the very least on new files. In general, it makes sense to set a code formatting standard through an automated tool to, at the very least, avoid discussion and time spent on the topic. It’s easy to start bike-shedding whitespace.
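
As a hedged illustration of the “format only what changed” idea (nothing like this was actually set up in the thread, and the base branch name is an assumption), a small helper could restrict black to files touched relative to a base revision:

import subprocess


def format_changed_python_files(base: str = "origin/master") -> None:
    """Run black only on .py files changed relative to `base` (sketch only)."""
    # --diff-filter=d excludes deleted files, which black could not open.
    diff = subprocess.run(
        ["git", "diff", "--name-only", "--diff-filter=d", base, "--", "*.py"],
        capture_output=True, text=True, check=True,
    )
    changed = [path for path in diff.stdout.splitlines() if path]
    if changed:
        subprocess.run(["black", *changed], check=True)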

On code coverage: it’s a good way to find gaping blind spots in your tests, but a bad way to feel good about them. There is also some code (for example, codegen) that won’t necessarily be covered and will therefore have to be explicitly excluded. I use code coverage for my own personal development, but making it part of CI can cause confusion and extra overhead.
