Split test fixtures by language?

I’m working with cltk/cltk/tests/test_tag.py. The setUp fixture downloads the models for every language tested in the test case, so each test function re-downloads everything. This makes running the tests slow.

Would it make sense to break the one test case into one per language, so that the fixtures can be specific to the language?
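
As a rough illustration of the proposal, here is a minimal sketch of one test case per language, each with its own class-level fixture. It assumes a hypothetical `fetch_models` helper standing in for the real model download, and the class names are invented for the example; none of this is taken from the CLTK code base. It uses `setUpClass`, which unittest runs once per class, where `setUp` runs before every test method.

```python
import unittest

DOWNLOADS = []  # records each simulated download so the cost is visible


def fetch_models(language):
    """Hypothetical stand-in for the real (slow) model download."""
    DOWNLOADS.append(language)
    return {"language": language, "tagger": f"{language}-tagger"}


class LanguageTagTestCase(unittest.TestCase):
    """Base class: fetches models once per class, not once per test."""

    language = None  # each subclass sets its own language

    @classmethod
    def setUpClass(cls):
        if cls.language is None:
            raise unittest.SkipTest("base class has no language")
        cls.models = fetch_models(cls.language)


class TestLatinTag(LanguageTagTestCase):
    language = "latin"

    def test_models_loaded(self):
        self.assertEqual(self.models["language"], "latin")

    def test_tagger_name(self):
        self.assertEqual(self.models["tagger"], "latin-tagger")


class TestOldEnglishTag(LanguageTagTestCase):
    language = "old_english"

    def test_models_loaded(self):
        self.assertEqual(self.models["language"], "old_english")
```

With this layout, running only the Latin class triggers only the Latin download, and the two Latin tests share a single fetch.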

Issue Analytics

Created 5 years ago
Reactions: 1
Comments: 11 (6 by maintainers)

Top GitHub Comments

3 reactions

kylepjohnson commented, Aug 3, 2018

Clement has submitted #814 which adds ON-only tests.

@free-variation Within this Issue, would you like to submit an OE test module?

Also, a question for all: What do you think of the following?

Move the general-use NLP tests to a new dir, cltk/tests/nlp
Move the language-specific tests to another new dir, cltk/tests/languages

This would be more intuitive to newcomers, I think.

3 reactions

kylepjohnson commented, Jul 31, 2018

I started writing a long response explaining why I did not like this proposal; however, the more I wrote, the more intuitive it seemed.

Here is what I like about the current system: the tests mirror the project’s directory structure, e.g., anything in cltk.ner will be in test_ner.py. This makes a lot of sense for those developing cross-language functionality.

However, for those focusing on one particular language, you’re certainly right that it’s awkward to need only a subset of the tests within each module.

In the long run, I think we will need to cater more to the latter (language-specific devs) than the former (cross-language nlp specialists).

Would it be too confusing to have both? For example, retain at least one test for each language in each of the current test modules (say, Latin for test_tagger.py), but also have language-specific modules (e.g., test_old_english.py). This would inevitably lead to some test duplication; however, in the case of testing, I might view this as a benefit.

On top of this, of course, we are working toward doctests too, which offer good, if sometimes minimal, tests.
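
For readers unfamiliar with doctests, here is a minimal sketch of the idea: examples in a docstring double as tests. The `replace_jv` function is a hypothetical helper written for this illustration, not a CLTK API.

```python
import doctest


def replace_jv(text):
    """Normalize Latin spelling by mapping j to i and v to u.

    >>> replace_jv("juvenis")
    'iuuenis'
    >>> replace_jv("Vivat")
    'Uiuat'
    """
    return text.translate(str.maketrans("jvJV", "iuIU"))


if __name__ == "__main__":
    # Checks every docstring example in this module; silent when all pass.
    doctest.testmod()
```

Such tests are minimal, as noted above, but they sit next to the code they exercise and need no fixtures.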

BTW:

“What I do when I want to check if my tests pass is to comment the code that does not interest me for this case. It’s not a good practice, for sure, but it’s quickly done.”

This is what I do when testing locally. I agree that it’s kinda ugly; however, the build server, codecov, and PR review combined ought to keep us from making mistakes (namely, leaving a large block of test code commented out).
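
One possible alternative to commenting out tests, sketched here under assumptions: a small class decorator that skips a test case unless its language is selected through an environment variable. The variable name `CLTK_TEST_LANGS` and the helper names are hypothetical and not part of CLTK. Only `unittest.skipUnless` comes from the standard library.

```python
import os
import unittest


def selected_languages(environ=os.environ):
    """Parse a comma-separated list such as "latin,old_english"."""
    raw = environ.get("CLTK_TEST_LANGS", "")  # hypothetical variable name
    return {name.strip() for name in raw.split(",") if name.strip()}


def for_language(language, selected=None):
    """Class decorator: skip unless `language` is selected.

    An empty selection means "run everything".
    """
    if selected is None:
        selected = selected_languages()
    return unittest.skipUnless(
        not selected or language in selected,
        f"{language} not in CLTK_TEST_LANGS",
    )


@for_language("latin")
class TestLatin(unittest.TestCase):
    def test_placeholder(self):
        self.assertTrue(True)
```

With the variable unset, every test runs as before. Setting it to `latin` would report the other languages' classes as skipped instead of leaving them commented out. Separately, unittest's command line accepts a dotted path to a single module, class, or method, which also narrows a run without any code changes.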

Unless there are strong objections to this three-pronged testing methodology (dir-level; module-level; and doctests), I am open to moving ahead. @todd-cook You can be a sanity check – do you think this is crazy?

@free-variation Between you, @Sedictious, and @clemsciences there are enough Old English examples, I imagine, to give this a shot. Would you like to try?