Spacy behaves differently when testing one case vs testing all cases
It seems spaCy’s tokenizer behaves differently when I run pytest -s --t=emojify and when I run pytest -s --t=light --f=light.
For example, I added the following snippet to my generate() function:
print([str(t) for t in self.nlp(sentence)])
With the input sentence "Apple is looking at buying U.K. startup for $132 billion.", pytest -s --t=emojify gives:
['Apple', 'is', 'looking', 'at', 'buying', 'U.K.', 'startup', 'for', '$', '132', 'billion', '.']
However, pytest -s --t=light --f=light gives:
['Apple', 'is', 'looking', 'at', 'buying', 'U.K.', 'startup', 'for', '$1', '32', 'billion.']
I use the following code to load spaCy:
import spacy
from initialize import spacy_nlp
self.nlp = spacy_nlp if spacy_nlp else spacy.load("en_core_web_sm")
It looks very strange. Am I overlooking something?
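One check that might narrow this down: compare the tokens from the shared spacy_nlp object against those from a freshly loaded pipeline, inside the failing run. This is only a guess, not something confirmed in this thread. It assumes another test in the full run may have modified the shared object. The helper names below (token_texts, diff_tokenization) and the two stand-in tokenizers are hypothetical, and the stand-ins only exist so the sketch runs without spaCy installed. In the real test suite you would pass spacy_nlp and spacy.load("en_core_web_sm") instead.

```python
import re


def token_texts(nlp, sentence):
    """Return the token strings produced by a pipeline-like callable."""
    return [str(t) for t in nlp(sentence)]


def diff_tokenization(shared_nlp, fresh_nlp, sentence):
    """Return (shared_tokens, fresh_tokens) if they differ, else None."""
    shared = token_texts(shared_nlp, sentence)
    fresh = token_texts(fresh_nlp, sentence)
    return None if shared == fresh else (shared, fresh)


# Hypothetical stand-ins, NOT spaCy: they only mimic "unmodified" vs
# "modified" tokenization behaviour for the purpose of this sketch.
def fresh_tokenizer(text):
    # Splits words and individual punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)


def mutated_tokenizer(text):
    # Splits on whitespace only, as if the splitting rules had been altered.
    return text.split()


sentence = "startup for $132 billion."
# Same behaviour on both sides: no difference reported.
print(diff_tokenization(fresh_tokenizer, fresh_tokenizer, sentence))
# Different behaviour: both token lists are returned for inspection.
print(diff_tokenization(mutated_tokenizer, fresh_tokenizer, sentence))
```

If the helper returns a difference only in the run over all light test cases, that would suggest the shared object changed during the run, not that spaCy itself is inconsistent.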
Issue Analytics
- State:
- Created 2 years ago
- Comments:19 (10 by maintainers)
I have started the workflow. Let’s see if it goes through now.
I pulled your branch and ran the test for suspecting_paraphraser alone, and it passes. The run over all light test cases, however, fails on suspecting_paraphraser. But when I run both tests in my cloned repository, they succeed! Could you "fetch & merge" your main branch (Fetch upstream in the GitHub UI), then pull the main branch into your shuffle_within_segements branch and try pytest again?