English POS Tagger incorrectly tags word into PUNCT

How to reproduce the behaviour

import spacy

nlp = spacy.load("en_core_web_md")
# Expected to pass, but fails as reported: "scatter" is tagged PUNCT.
assert nlp("back scatter")[1].pos_ != "PUNCT"

What’s wrong: The POS tag for "scatter" should definitely not be PUNCT.
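
One way to spot this kind of mis-tag across a larger batch of phrases is a small sanity check over (text, POS) pairs. The sketch below is only an illustration: `alphabetic_punct` is a hypothetical helper, not part of spaCy, and the sample input is hand-written to mimic the report (only the PUNCT tag on "scatter" comes from the issue; the other tags are placeholders).

```python
from typing import Iterable, List, Tuple


def alphabetic_punct(tagged: Iterable[Tuple[str, str]]) -> List[Tuple[int, str]]:
    """Return (index, text) for tokens tagged PUNCT that contain a letter or digit.

    Hypothetical helper for illustration; it only inspects the pairs it is given.
    """
    return [
        (i, text)
        for i, (text, pos) in enumerate(tagged)
        if pos == "PUNCT" and any(ch.isalnum() for ch in text)
    ]


# Hand-written sample: only the PUNCT tag on "scatter" is taken from the report.
sample = [("back", "NOUN"), ("scatter", "PUNCT"), (".", "PUNCT")]
print(alphabetic_punct(sample))  # [(1, 'scatter')]
```

With a loaded pipeline, the pairs could be built as `[(t.text, t.pos_) for t in nlp("back scatter")]`. The check does not fix the tagger; it only flags tokens worth a second look.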

Your Environment

  • spaCy version: 2.2.4
  • Platform: Linux-5.4.0-37-generic-x86_64-with-glibc2.29
  • Python version: 3.8.2

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
svlandeg commented, Jun 17, 2020

Because of how POS taggers generally work, I personally think it’s always going to be a challenge to get such results with high accuracy. But I guess you could try to retrain the tagger on phrases only.

Top Results From Across the Web

Getting incorrect POS tagging - Stack Overflow
Spacy models are statistically trained models, that individually have a specific POS accuracy, in this case around 97%...

5. Categorizing and Tagging Words - NLTK
The process of classifying words into their parts of speech and labeling them accordingly is known as part-of-speech tagging, POS-tagging, or simply tagging...

Part of Speech Tagging - Stanford University
How difficult is POS tagging in English? Roughly 15% of word types are ambiguous. Hence 85% of word types are unambiguous.

BNC2 POS-Tagging Guide - UCREL
For examples in this guide, we will retain just the POS-tag of the word (or words) ... or a foreign expression naturalised into...

Universal POS tags - Adjectives
Some words that could be seen as adjectives (and are tagged as such in other annotation schemes) have a different tag in UD:...
