Different sentence spans on Document and Token level
How to reproduce the behaviour
I would like to get the sentence index of a token in a doc. My current workaround takes token.sent and compares that span with the list of sentences from doc.sents.
Issue: in some cases token.sent returns a different sentence span than the corresponding sentence from doc.sents:
import spacy
nlp = spacy.load("en_core_web_md")
text = "Very satisfied!. This product definitely met my expectations. I ordered a refurbished iPhone 4s and it was exactly like it was described: minor scratches on the back (you can not see them unless it has the right kind of light and I have a case on it now anyway), brand new screen with screen protector, and works like new. I have had no problems with it at all. I ordered it and I was scheduled to receive it a week later, but it was in my mailbox four days early. I am extremely satisfied with this product as well as this company. I will probably buy another electronic device from Laptop Angels because they are very trustworthy and honest. If you are looking to buy just an iPhone for a cheaper price than what is in the store, I would tell you to buy it from Laptop Angels. Thank you so much for your honest business. I am a very satisfied customer! :)"
doc = nlp(text)
sentences = list(doc.sents)
token = doc[14]  # "refurbished"
print(token.sent == sentences[2])  # False
print(sentences[2])  # I ordered a refurbished iPhone 4s and it was exactly like it was described: minor scratches on the back (you can not see them unless it has the right kind of light
print(token.sent)  # I ordered a refurbished iPhone 4s and it was exactly like it was described:
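One way to get a sentence index without comparing spans is to look up the token's position in the sentence start offsets taken from doc.sents, so the result always follows the Doc-level segmentation. This is a minimal sketch, not the approach from the issue. It assumes only the spaCy attributes Token.i, Span.start and Doc.sents, and the helper names sentence_starts and sentence_index are hypothetical:

```python
from bisect import bisect_right


def sentence_starts(doc):
    """Collect the token index at which each sentence in doc.sents begins."""
    return [sent.start for sent in doc.sents]


def sentence_index(token, starts):
    """Return the position of the doc.sents sentence that contains the token.

    `starts` must be sorted in ascending order, which is the order in which
    doc.sents yields its sentences.
    """
    # The containing sentence is the last one whose start is <= token.i.
    return bisect_right(starts, token.i) - 1
```

With the doc from the example, sentence_index(doc[14], sentence_starts(doc)) gives the position of the doc.sents sentence containing "refurbished", independent of what token.sent returns. Computing the start offsets once and reusing them keeps each lookup cheap.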
Your Environment
- Python Version Used: 3.8.2
- spaCy Version Used: 2.2.4
- Environment Information: Docker image python:3 (Linux)
Issue Analytics
- State:
- Created 3 years ago
- Reactions: 1
- Comments: 7 (4 by maintainers)
Top GitHub Comments
Here is another strange behavior of token.sent, where the token is not part of the span returned by token.sent.
Part of the output:
…
I ordered a refurbished iPhone 4s and it was exactly like it was described: I
I ordered a refurbished iPhone 4s and it was exactly like it was described: ordered
I ordered a refurbished iPhone 4s and it was exactly like it was described: a
I ordered a refurbished iPhone 4s and it was exactly like it was described: refurbished
I ordered a refurbished iPhone 4s and it was exactly like it was described: iPhone
I ordered a refurbished iPhone 4s and it was exactly like it was described: 4s
I ordered a refurbished iPhone 4s and it was exactly like it was described: and
I ordered a refurbished iPhone 4s and it was exactly like it was described: it
I ordered a refurbished iPhone 4s and it was exactly like it was described: was
I ordered a refurbished iPhone 4s and it was exactly like it was described: exactly
I ordered a refurbished iPhone 4s and it was exactly like it was described: like
I ordered a refurbished iPhone 4s and it was exactly like it was described: it
I ordered a refurbished iPhone 4s and it was exactly like it was described: was
I ordered a refurbished iPhone 4s and it was exactly like it was described: described
I ordered a refurbished iPhone 4s and it was exactly like it was described: :
I ordered a refurbished iPhone 4s and it was exactly like it was described: minor
I ordered a refurbished iPhone 4s and it was exactly like it was described: scratches
I ordered a refurbished iPhone 4s and it was exactly like it was described: on
I ordered a refurbished iPhone 4s and it was exactly like it was described: the
I ordered a refurbished iPhone 4s and it was exactly like it was described: back
I ordered a refurbished iPhone 4s and it was exactly like it was described: (
I ordered a refurbished iPhone 4s and it was exactly like it was described: you
I ordered a refurbished iPhone 4s and it was exactly like it was described: can
I ordered a refurbished iPhone 4s and it was exactly like it was described: not
I ordered a refurbished iPhone 4s and it was exactly like it was described: see
I ordered a refurbished iPhone 4s and it was exactly like it was described: them
I ordered a refurbished iPhone 4s and it was exactly like it was described: unless
I ordered a refurbished iPhone 4s and it was exactly like it was described: it
I ordered a refurbished iPhone 4s and it was exactly like it was described: has
I ordered a refurbished iPhone 4s and it was exactly like it was described: the
I ordered a refurbished iPhone 4s and it was exactly like it was described: right
I ordered a refurbished iPhone 4s and it was exactly like it was described: kind
I ordered a refurbished iPhone 4s and it was exactly like it was described: of
I ordered a refurbished iPhone 4s and it was exactly like it was described: light
…
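The mismatch visible in this output can also be checked mechanically. The sketch below is not code from the thread. It assumes only the spaCy attributes Token.i, Token.sent, Span.start and Span.end, and the helper name tokens_outside_own_sentence is hypothetical:

```python
def tokens_outside_own_sentence(doc):
    """Return the tokens whose index lies outside the span given by token.sent."""
    outside = []
    for token in doc:
        sent = token.sent
        # Span.start is inclusive and Span.end is exclusive, both counted in tokens.
        if not (sent.start <= token.i < sent.end):
            outside.append(token)
    return outside
```

An empty result means every token lies inside its own token.sent. A non-empty result lists the tokens for which token.sent disagrees with the token's position, which is the situation described in this comment.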
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.