Incorrect quote string matching
Overview
The quotation matching system merges adjacent quoted strings instead of returning each one separately.
Example
const nlp = require('compromise');
// Try splitting the string into quoted strings.
console.log(
  nlp('My "String" "with many" adjacent "nested" \'quotes\'')
    .quotations()
    .out('array')
);
Output
[
"string with many",
"nested quotes"
]
Expected
[
"string",
"with many",
"nested",
"quotes"
]
Workaround
So, I solved this for my specific circumstance like so:
// Note: `string` appears to be a compromise document (e.g. nlp(text)), so
// .match() and .concat() below are compromise's methods, and each /.../
// pattern in the match syntax is applied to a single term.
function quotes(string) {
  // Match a single word surrounded by (")
  return string.match(
    '/^"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"$/'
  )
  // Match a single word surrounded by (')
  .concat(string.match(
    '/^\'[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*\'$/'
  ))
  // Match two or more words surrounded by (")
  .concat(string.match(
    '/^"[^"\\\\]*(?:\\\\.[^"\\\\]*)*$/ ' +
    '/[^"\\\\]*(?:\\\\.[^"\\\\]*)*/+? ' +
    '/^[^"\\\\]*(?:\\\\.[^"\\\\]*)*"$/'
  ))
  // Match two or more words surrounded by (')
  .concat(string.match(
    '/^\'[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*$/ ' +
    '/[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*/+? ' +
    '/^[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*\'$/'
  ));
}
This uses the RegExp by Jeffrey Friedl, as provided by ridgerunner on Stack Overflow.
Edit 1: I will note that my solution is not perfect, as it outputs the matches grouped by pattern rather than in document order:
[
"string",
"nested",
"quotes",
"with many"
]
End: Edit 1
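For anyone who only needs the quoted substrings in the order they appear, here is a minimal sketch that sidesteps compromise entirely and runs one plain JavaScript regex over the raw text. The helper name `quotedInOrder` is hypothetical and not part of compromise. It reuses the same escaped-quote pattern as the workaround above, combined into a single alternation so that double- and single-quoted runs are found in one left-to-right pass.

```javascript
// Hypothetical helper (not part of compromise): pulls quoted substrings
// out of a plain string, preserving the order in which they appear.
function quotedInOrder(text) {
  // Either a "double-quoted" or a 'single-quoted' run; backslash escapes allowed.
  const pattern = /"([^"\\]*(?:\\.[^"\\]*)*)"|'([^'\\]*(?:\\.[^'\\]*)*)'/g;
  const results = [];
  for (const match of text.matchAll(pattern)) {
    // Group 1 is set for double quotes, group 2 for single quotes.
    results.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return results;
}

console.log(quotedInOrder('My "String" "with many" adjacent "nested" \'quotes\''));
// [ 'String', 'with many', 'nested', 'quotes' ]
```

Unlike compromise's output, this keeps the original casing and does no normalization. It also treats every apostrophe as a potential quote mark, so contractions such as "don't" can confuse the single-quote branch.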
A side note: The regex on the match syntax wiki page should be /rain(ing|ed)/, meaning match "rain" and ("ing" or "ed"). The current /rain[ing|ed]/ means match "rain" and ("i", "n", "g", "|", "e" or "d").
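The difference between the group and the character class can be checked with plain JavaScript regular expressions. This is only a sketch of standard regex semantics; the variable names are illustrative, and how compromise applies these patterns to individual terms is not shown here.

```javascript
const group = /rain(ing|ed)/;      // "rain" followed by "ing" or "ed"
const charClass = /rain[ing|ed]/;  // "rain" followed by ONE of: i, n, g, |, e, d

console.log(group.test('raining'));   // true
console.log(group.test('rained'));    // true
console.log(group.test('raindrop'));  // false

console.log(charClass.test('raindrop'));    // true: "d" is in the character class
console.log(charClass.test('rain|'));       // true: "|" is a literal inside [...]
console.log('raining'.match(charClass)[0]); // 'raini': only one extra character is consumed
```

The character-class version still matches "raining" and "rained", which is probably why the mistake is easy to miss, but it also accepts unintended words such as "raindrop".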
Issue Analytics
- State:
- Created 6 years ago
- Comments:10 (9 by maintainers)
Top GitHub Comments
Ah, thanks Sca. This is a great bug, and your workaround is very clever. I think I've fixed this in 11.5.2; at least your example works. Can you take a look? There are all sorts of issues with unicode quotations, and with the trailing possessive apostrophe inside a quote, which might still trip it. Cheers.

Hey, yeah, thanks @Errogant. I've made a new issue here