Spacy behaves differently when testing one case vs testing all cases

spaCy's tokenizer seems to behave differently depending on whether I run pytest -s --t=emojify or pytest -s --t=light --f=light.

For example, I added the following snippet in my generate() function:

print([str(t) for t in self.nlp(sentence)])

With input sentence "Apple is looking at buying U.K. startup for $132 billion."

pytest -s --t=emojify gives:

['Apple', 'is', 'looking', 'at', 'buying', 'U.K.', 'startup', 'for', '$', '132', 'billion', '.']

However, pytest -s --t=light --f=light gives:

['Apple', 'is', 'looking', 'at', 'buying', 'U.K.', 'startup', 'for', '$1', '32', 'billion.']

I use the following code to load spaCy:

import spacy
from initialize import spacy_nlp
self.nlp = spacy_nlp if spacy_nlp else spacy.load("en_core_web_sm")

It looks very strange. Am I overlooking something?
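
One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that the shared spacy_nlp object is modified in place by another transformation that runs earlier in the full test session. The sketch below reproduces that pattern with the standard library only. ToyTokenizer, careless_transformation, and generate are hypothetical names made up for illustration; none of them is spaCy's API or code from the project.

```python
import re


class ToyTokenizer:
    """Hypothetical stand-in for a shared NLP pipeline (not spaCy's API)."""

    def __init__(self):
        # Mutable setting: when True, symbols such as "$" and "." become separate tokens.
        self.split_symbols = True

    def __call__(self, text):
        if self.split_symbols:
            # Runs of word characters, or any single non-space symbol.
            return re.findall(r"\w+|[^\w\s]", text)
        # Plain whitespace splitting once the setting has been changed.
        return text.split()


def careless_transformation(tokenizer):
    """Hypothetical transformation that changes the shared object in place."""
    tokenizer.split_symbols = False


def generate(tokenizer, sentence):
    """Mirrors the debugging print in the question: list the token strings."""
    return [str(t) for t in tokenizer(sentence)]


sentence = "for $132 billion."

# Case 1: the test runs alone, so nothing has touched the tokenizer yet.
alone = generate(ToyTokenizer(), sentence)

# Case 2: another transformation ran first in the same session
# and mutated the object that both of them share.
shared = ToyTokenizer()
careless_transformation(shared)
after_other_test = generate(shared, sentence)

print(alone)             # ['for', '$', '132', 'billion', '.']
print(after_other_test)  # ['for', '$132', 'billion.']
```

If something like this is happening, assigning a fresh spacy.load("en_core_web_sm") to self.nlp inside the failing test and comparing its tokens with those from the shared object should show whether the shared pipeline has changed.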

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments:19 (10 by maintainers)

Top GitHub Comments

1 reaction
AbinayaM02 commented, Sep 20, 2021

Thank you for the answer! I have updated my branch again (fetch & merge), and the tests work locally. I can’t run the workflow in my PR, though, as this is my first contribution.

I have started the workflow. Let’s see if it goes through now.

1 reaction
AbinayaM02 commented, Sep 20, 2021

Hello @AbinayaM02, I see that the first test fails even though it wasn’t changed:

E           AssertionError: Mis-match in expected and predicted output for SuspectingParaphraser transformation: 
E              Expected Output: Sally finally returned the french book to Chris, didn't she? 
E              Predicted Output: Sally finally returned the french book to Chris, didn't it?
E           assert "Sally finall...s, didn't it?" == "Sally finall..., didn't she?"
E             Skipping 46 identical leading characters in diff, use -v to show
E             - s, didn't she?
E             ?           ^^^
E             + s, didn't it?
E             ?           ^^

I pulled your branch and ran the test for suspecting_paraphraser on its own, and it passes. However, when I run all the light test cases, the suspecting_paraphraser test fails. When I run both in my own cloned repository, they succeed. Could you fetch and merge your main branch (“Fetch upstream” in the GitHub UI), then pull main into your shuffle_within_segements branch and run pytest again?
