Question marks at the end swallowed
Looks like the example with just question marks is good now:
>>> segmenter.segment("??")
['??']
But the example with double question marks as a token at the end of a sentence still loses the question marks:
>>> segmenter.segment("T stands for the vector transposition. As shown in Fig. ??")
['T stands for the vector transposition.', 'As shown in Fig.']
Looks like this is the minimal repro:
>>> segmenter.segment("Fig. ??")
['Fig.']
Issue Analytics
- State:
- Created 4 years ago
- Comments:11 (11 by maintainers)
Top GitHub Comments
@danielkingai2: Fixed the above bug and released the char-span functionality today. I didn't release it yesterday since I wanted to add tests and update the docs.

Yes, I really need to come up with some assertion logic to map the respective sentences to the original text. This is the main reason I've been working on the https://github.com/nipunsadvilkar/pySBD/tree/sentence-char-span branch: even if pysbd fails to find a proper sentence, tok.is_sent_start would remain False and we would still get the original text at the end.
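
For anyone who wants to check for swallowed text without waiting on a release, here is a minimal sketch of the kind of assertion logic mentioned above. It is not pySBD's implementation: `map_sentences_to_spans` and `uncovered_text` are hypothetical helper names, and the sentence list is copied from the output reported in this issue rather than produced by calling the library. The idea is to locate each returned sentence in the original text, record its character span, and report any non-whitespace text that no span covers.

```python
def map_sentences_to_spans(text, sentences):
    """Locate each sentence in `text`, scanning left to right.

    Returns a list of (sentence, start, end) tuples. Hypothetical helper,
    assuming the segmenter returns sentences verbatim and in order.
    """
    spans = []
    cursor = 0
    for sent in sentences:
        start = text.find(sent, cursor)
        if start == -1:
            raise ValueError(f"sentence not found in original text: {sent!r}")
        end = start + len(sent)
        spans.append((sent, start, end))
        cursor = end
    return spans


def uncovered_text(text, spans):
    """Return the non-whitespace pieces of `text` that fall outside every span."""
    gaps = []
    cursor = 0
    for _, start, end in spans:
        gap = text[cursor:start].strip()
        if gap:
            gaps.append(gap)
        cursor = end
    tail = text[cursor:].strip()
    if tail:
        gaps.append(tail)
    return gaps


# Input and output copied from the report above (not generated here).
text = "T stands for the vector transposition. As shown in Fig. ??"
reported = ["T stands for the vector transposition.", "As shown in Fig."]

spans = map_sentences_to_spans(text, reported)
print(uncovered_text(text, spans))  # anything listed here was swallowed
```

A test could assert that `uncovered_text(...)` is empty for every input, which would have flagged both the "Fig. ??" repro and the longer example.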