Parser: (re)allow duplicate group names (e.g. move the check out of the parser)

I am revisiting my uxregexp extended-regexp module after quite a long time. I use regexp-tree to parse the extended regexp. However, I noticed that you now reject duplicate group names, which breaks my algorithm.

Please see issue #142, where you forced group names to be unique.

As an old Perl user, I am not sure that the proposal at https://tc39.es/proposal-regexp-named-groups/#sec-patterns-static-semantics-early-errors really intends this in the end. If it does, I think it should change. Perl is the mother of such extended regexps and has always been far ahead of other implementations: it had solutions for all kinds of regexp issues before others even thought of them.

Example: alternative representations of a date:

   # 2020-09-21
   (?<year>  [0-9]{4} ) -
   (?<month> [0-9]{2} ) -
   (?<day>   [0-9]{2} )
   |
   # 09/21/2020
   (?<month> [0-9]{2} ) /
   (?<day>   [0-9]{2} ) /
   (?<year>  [0-9]{4} )

So a duplicate group name is legal and quite useful if only one of the alternatives can match.
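A minimal sketch of the "only one alternative matches" idea in plain JavaScript. Since an engine may reject duplicate names, this emulates them: each alternative gets a numeric suffix, and the groups that actually matched are folded back into the intended names. The suffix convention and the helper names `mergeGroups` and `parseDate` are assumptions for illustration, not part of regexp-tree or uxregexp.

```javascript
// Emulation sketch: the suffixes _1 and _2 are a made-up convention that
// stands in for the duplicate names year/month/day.
const dateRe = new RegExp(
  '(?<year_1>[0-9]{4})-(?<month_1>[0-9]{2})-(?<day_1>[0-9]{2})' +
  '|' +
  '(?<month_2>[0-9]{2})/(?<day_2>[0-9]{2})/(?<year_2>[0-9]{4})'
);

// Fold the suffixed groups back into the names the author intended.
// Groups from the alternative that did not match are undefined and skipped.
function mergeGroups(groups) {
  const merged = {};
  for (const [key, value] of Object.entries(groups)) {
    if (value === undefined) continue;
    merged[key.replace(/_\d+$/, '')] = value;
  }
  return merged;
}

// Returns { year, month, day } for either date format, or null.
function parseDate(text) {
  const match = dateRe.exec(text);
  return match ? mergeGroups(match.groups) : null;
}
```

Both `parseDate('2020-09-21')` and `parseDate('09/21/2020')` yield the same three keys, which is the behaviour duplicate names would give directly.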

Additionally, multiple matches can be collected in arrays. If you can collect with quantifiers, as in ((?<name> some regexp) some separator regexp)*, it is only a step further to allow multiple occurrences of the same group name in one expression, e.g.: ((?<name> some regexp) sep1 (?<name> some regexp) sep2)*
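One way to picture the array collection in JavaScript, where a capture group under a quantifier keeps only its last iteration: match one item per step with a global regexp and push each named capture into an array. This is a sketch of the idea, not how uxregexp does it, and `collectGroups` is a hypothetical helper name.

```javascript
// Collect every capture of each named group across all matches of a
// global regexp, so a repeated group name ends up as an array of values.
function collectGroups(re, text) {
  if (!re.global) {
    throw new TypeError('collectGroups needs a regexp with the g flag');
  }
  const collected = {};
  for (const match of text.matchAll(re)) {
    for (const [name, value] of Object.entries(match.groups || {})) {
      if (value === undefined) continue; // group did not participate
      if (!collected[name]) collected[name] = [];
      collected[name].push(value);
    }
  }
  return collected;
}

// One list item per match: a name followed by a comma or the end of input.
const items = collectGroups(/(?<name>[a-z]+)(?:,|$)/g, 'foo,bar,baz');
```

Here `items.name` holds all three captured values in input order.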

I also think the check shouldn't be in the parser, because a purely declarative syntax doesn't include this condition. A consumer of the AST can check the names as necessary. It is better to separate such implicit rules from the parsing process.
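As a rough illustration of checking names in a consumer rather than in the parser, the sketch below walks an AST after parsing and reports group names used more than once. The node shape (`type: 'Group'` with a `name` property) and the hand-written tree are assumptions for this example; a real parser's AST may differ, and `findDuplicateGroupNames` is a hypothetical helper.

```javascript
// Walk an AST made of plain objects and arrays (assumed to be a tree,
// without cycles) and return the group names that occur more than once.
function findDuplicateGroupNames(ast) {
  const counts = new Map();
  const visit = (node) => {
    if (Array.isArray(node)) {
      node.forEach(visit);
      return;
    }
    if (node === null || typeof node !== 'object') return;
    if (node.type === 'Group' && typeof node.name === 'string') {
      counts.set(node.name, (counts.get(node.name) || 0) + 1);
    }
    Object.values(node).forEach(visit);
  };
  visit(ast);
  return [...counts].filter(([, n]) => n > 1).map(([name]) => name);
}

// Hand-written stand-in for the AST of
// (?<year>..)-(?<month>..)-(?<day>..)|(?<month>..)/(?<year>..)
const dateAst = {
  type: 'Disjunction',
  left: {
    type: 'Alternative',
    expressions: [
      { type: 'Group', name: 'year', expression: null },
      { type: 'Group', name: 'month', expression: null },
      { type: 'Group', name: 'day', expression: null },
    ],
  },
  right: {
    type: 'Alternative',
    expressions: [
      { type: 'Group', name: 'month', expression: null },
      { type: 'Group', name: 'year', expression: null },
    ],
  },
};

const duplicates = findDuplicateGroupNames(dateAst);
```

A consumer that requires unique names can reject a non-empty `duplicates` list, while one that supports duplicates simply skips the check.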

From my POV, I see these possibilities:

  • you could remove the duplicate-group check from the parser (maybe moving it to another part)
  • you could make it optional (because checking in the parser is more efficient)
  • if you reject this, I need to rename each duplicated group name (add a count), which introduces one more layer [EDIT: not really possible, I would have to change the names before parsing]
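The renaming layer from the last bullet could look roughly like the sketch below, which rewrites the pattern source before it reaches the parser. It is a plain text scan, not a real tokenizer, so it is only an approximation; `renameDuplicateGroups` and the `_N` suffix are assumptions for illustration.

```javascript
// Append a running count to every repeated group name in a pattern source.
// Known limitations of this text scan: an escaped "\(?<x>" or a "(?<x>"
// inside a character class is renamed too, \k<name> backreferences are not
// updated, and a suffixed name may collide with an existing group name.
function renameDuplicateGroups(source) {
  const seen = new Map();
  // The name must start with an identifier character, so the lookbehind
  // forms (?<= and (?<! are left alone.
  return source.replace(/\(\?<([A-Za-z_$][\w$]*)>/g, (whole, name) => {
    const count = (seen.get(name) || 0) + 1;
    seen.set(name, count);
    return count === 1 ? whole : `(?<${name}_${count}>`;
  });
}

const renamed = renameDuplicateGroups('(?<year>[0-9]{4})|(?<year>[0-9]{2})');
```

The consumer then has to map `year_2` back to `year` after matching, which is the extra layer the bullet refers to.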

Thanks for listening.

PS: By the way, I am not a “native” JavaScript developer (C++, Perl, and many other languages). So I wonder: what would be the best way to comment on that proposal? Would they listen to my comment at all?

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

hg42 commented, Jan 22, 2021 (1 reaction)

Thanks for the work, and also for the pointer to the proposal. I commented there as well.

DmitrySoshnikov commented, Jan 20, 2021 (1 reaction)

@hg42, yeah, I think this would make sense for ECMAScript itself eventually, and for now we can probably add a --loose-mode parse option, which would allow some such features. Alternatively, these could be specific options:

parser.parse(re, {
  allowGroupNameDuplicates: boolean,
});