bug: `re_extract` behavior differs between backends
The behavior of `re_extract` differs between backends with regard to what the `index` parameter means.

This method has signature `re_extract(pattern, index)`, where `pattern` is a regex pattern with optional groups, and `index` is the index of the group to extract (returning `NULL` if there is no match or no group matches that index).
- `duckdb`, `postgres`, `clickhouse`, …: `re_extract(pattern, 0)` returns the part matching the first group if there’s a match, and `NULL` otherwise.
- `sqlite`, `pandas`, `dask`, `pyspark`: `re_extract(pattern, 0)` returns the whole string if there’s a match, and `NULL` otherwise. You need to pass in `1`, not `0`, to extract the first group.
To put it another way, given a column with a value `"row_one"`, `column.re_extract("row_([a-z]+)", 0)` returns `"one"` for backends in the first group, and `"row_one"` for backends in the second.
Given the docstring, I think the first group has the intended behavior.
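The two conventions can be sketched in plain Python to make the off-by-one difference concrete. This is only an illustration built on the standard `re` module, not the code of any backend: the function names are hypothetical, and the use of `re.search` (rather than anchored matching) and the pattern `row_([a-z]+)` are assumptions.

```python
import re
from typing import Optional


def re_extract_groups_from_zero(value: str, pattern: str, index: int) -> Optional[str]:
    """Hypothetical helper: index 0 means the FIRST capture group.

    Mirrors the behavior described above for the first group of backends.
    """
    if index < 0:
        return None
    match = re.search(pattern, value)
    if match is None:
        return None  # no match at all -> NULL
    try:
        return match.group(index + 1)  # shift by one: 0 -> group 1
    except IndexError:
        return None  # no group with that index -> NULL


def re_extract_match_at_zero(value: str, pattern: str, index: int) -> Optional[str]:
    """Hypothetical helper: index 0 means the whole match, 1 the first group.

    Mirrors the behavior described above for the second group of backends.
    """
    if index < 0:
        return None
    match = re.search(pattern, value)
    if match is None:
        return None
    try:
        return match.group(index)  # same numbering as Python's Match.group
    except IndexError:
        return None


first = re_extract_groups_from_zero("row_one", r"row_([a-z]+)", 0)
second = re_extract_match_at_zero("row_one", r"row_([a-z]+)", 0)
second_group_one = re_extract_match_at_zero("row_one", r"row_([a-z]+)", 1)
```

With these helpers, the same call with index `0` yields `"one"` under the first convention and `"row_one"` under the second; the second convention needs index `1` to get `"one"`.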
Issue Analytics
- State:
- Created a year ago
- Comments:7 (2 by maintainers)
Top GitHub Comments
A column of strings and a column of groups to extract might be a case where you’d want this.
I think we just handle it in the backend and raise an error during compilation. We do this elsewhere in the codebase.
I’m in favor of not restricting it unless it’s breaking something to keep it that way.
+1 on matching the behavior of `re.match`.
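For reference, a short sketch of how group numbering works on the object returned by Python's `re.match`, which is presumably what the comment above has in mind. The pattern and the optional `(_x)?` group are made up for illustration.

```python
import re

# Hypothetical pattern: one required group and one optional group.
m = re.match(r"row_([a-z]+)(_x)?", "row_one")

whole_match = m.group(0)     # group 0 is the entire match
first_group = m.group(1)     # group 1 is the first capture group
optional_group = m.group(2)  # a group that did not participate is None

# Asking for a group number the pattern does not define raises IndexError.
try:
    m.group(3)
    raised_index_error = False
except IndexError:
    raised_index_error = True
```

Under this numbering, index `0` is the whole match and capture groups start at `1`, which corresponds to the second group of backends listed in the issue.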