Named Entity labels in spaCy
I have always used GiNZA with spaCy. In previous versions, the set of labels was small and fairly general (e.g. 渋谷 (Shibuya) and 日本 (Japan) were both classified as GPE). With version 3.1.0, however, I noticed that this has changed and the labels no longer follow those mappings. Is there any documentation of all the NER labels used in GiNZA's spaCy implementation?
I also noticed that some entities that used to be classified more or less correctly are misclassified in version 3.x.x. For example, with this code:
import spacy
nlp = spacy.load('ja_ginza')
# "I went to Tokyo DisneySea"
doc = nlp('私は東京ディズニーシーへ行った')
for ent in doc.ents:
    print(ent.text, ent.label_)
# ver 2.x.x yields 東京ディズニーシー ORG
# ver 3.x.x yields 東京ディズニーシー Person
# "How high is Everest?"
doc = nlp('エベレストの高さはどのぐらい')
for ent in doc.ents:
    print(ent.text, ent.label_)
# ver 2.x.x yields エベレスト LOC
# ver 3.x.x yields エベレスト Company
Thank you very much for your work.
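In the absence of a documented label list, one way to see which labels a given installation produces is to ask the pipeline itself. The following is only a sketch: the helper names `count_entity_labels` and `ginza_label_report` are made up for illustration, and it assumes the loaded pipeline exposes its entity recognizer under the component name `ner`. If it does not, the declared list comes back empty and only the observed counts are informative.

```python
from collections import Counter


def count_entity_labels(docs):
    """Count ent.label_ values over an iterable of spaCy Doc-like objects."""
    counts = Counter()
    for doc in docs:
        for ent in doc.ents:
            counts[ent.label_] += 1
    return counts


def ginza_label_report(texts, model_name="ja_ginza"):
    """Return (declared_labels, observed_counts) for the given texts.

    Hypothetical helper: assumes the model named `model_name` is installed.
    """
    import spacy  # imported lazily so count_entity_labels works without spaCy

    nlp = spacy.load(model_name)
    declared = []
    # Assumption: the entity recognizer is registered as "ner".
    if "ner" in nlp.pipe_names:
        declared = sorted(nlp.get_pipe("ner").labels)
    observed = count_entity_labels(nlp.pipe(texts))
    return declared, observed
```

Calling `ginza_label_report(['私は東京ディズニーシーへ行った', 'エベレストの高さはどのぐらい'])` under each installed version would give a side-by-side view of the label sets being compared here.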
Issue Analytics
- State:
- Created 4 years ago
- Comments:6 (4 by maintainers)
Top GitHub Comments
@tomgun132 GSK2014-A contains some labels that are defined in Sekine's ENE (Extended Named Entity) version 7. https://sites.google.com/site/extendednamedentity711/top以下の階層の全リスト/0-top-top/1-名前-name/1-5-shi-she-ming-facility/1-5-3-goe-goe/1-5-3-0-goe-sono-ta-goe-other
It also contains some misspelled labels. All the ENE labels used in GSK2014-A are listed here: https://github.com/megagonlabs/ginza/blob/develop/ginza/ent_type_mapping.py#L13
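To illustrate the idea of mapping fine-grained ENE labels back to the coarser labels seen in older versions, here is a minimal sketch. The dictionary below is a hand-written, illustrative subset and is not copied from `ent_type_mapping.py`. The `GOE_Other` entry matches the FAC result reported in this thread, while the `Person` and `Company` entries are assumptions. The names `ENE_TO_COARSE_EXAMPLE`, `coarse_label` and `unmapped_labels` are hypothetical.

```python
# Illustrative subset only; the authoritative table is in
# ginza/ent_type_mapping.py and may differ from these entries.
ENE_TO_COARSE_EXAMPLE = {
    "Person": "PERSON",   # assumption
    "Company": "ORG",     # assumption
    "GOE_Other": "FAC",   # consistent with the FAC result reported in this thread
}


def coarse_label(ene_label, mapping=ENE_TO_COARSE_EXAMPLE, default=None):
    """Return the coarse label for an ENE label, or `default` if unmapped."""
    return mapping.get(ene_label, default)


def unmapped_labels(labels, mapping=ENE_TO_COARSE_EXAMPLE):
    """Return the sorted, de-duplicated labels that have no mapping entry."""
    return sorted({label for label in labels if label not in mapping})
```

Running `unmapped_labels` over the labels actually observed in your data is a quick way to spot labels (including misspelled ones) that a custom mapping would still need to cover.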
Thank you very much for your reply.
I tried 東京ディズニーランド (Tokyo Disneyland) and 東京ディズニーリゾート (Tokyo Disney Resort) with the previous code. Interestingly, Disney Resort is classified as ORG while Disneyland is classified as FAC. However, for Disneyland the ent.label_ value is GOE_Other, and when I searched Sekine's Extended Named Entity page, I could not find GOE.