[BUG] unicode.split does not allow to pass None for sep
Describe the bug
I’m hitting a difference in behaviour between CPython and Cython for unicode.split: with Cython, passing sep=None explicitly raises TypeError. Please find details below:
To Reproduce
Code to reproduce the behaviour:
---- 8< ---- usplit.pyx
# cython: language_level=3

def mysplit(q):
    return unicode.split(q, None)

print(mysplit("hello world"))
Expected behavior
I expect it to behave the same as in Python, i.e. print ['hello', 'world']:
---- 8< ---- usplit_py.py
def mysplit(q):
    return str.split(q, None)

print(mysplit("hello world"))
$ python usplit_py.py
['hello', 'world']
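
For reference, here is a small pure-Python sketch of what sep=None means in CPython: it is treated the same as omitting sep, so the string is split on runs of whitespace. The sample string is made up for illustration and is not from the report above.

```python
# CPython semantics of str.split when sep is None (illustrative input).
text = "hello   world "

# Passing None explicitly is the same as omitting sep:
# split on runs of whitespace and drop empty leading/trailing pieces.
explicit_none = str.split(text, None)
omitted = text.split()

# An explicit single-space separator behaves differently:
# every space is a boundary, so empty strings show up in the result.
single_space = text.split(" ")

print(explicit_none)
print(omitted)
print(single_space)
```

So ['hello', 'world'] is the result to expect from the compiled module as well.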
However, what I get instead is the following exception, saying that None cannot be used for sep:
$ cythonize -i usplit.pyx
Compiling /home/kirr/usplit.pyx because it changed.
[1/1] Cythonizing /home/kirr/usplit.pyx
running build_ext
building 'usplit' extension
creating /home/kirr/tmp3kckc5wa/home
creating /home/kirr/tmp3kckc5wa/home/kirr
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -ffile-prefix-map=/build/python3.9-RNBry6/python3.9-3.9.2=. -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -ffile-prefix-map=/build/python3.9-RNBry6/python3.9-3.9.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/kirr/src/wendelin/venv/py3.venv/include -I/usr/include/python3.9 -c /home/kirr/usplit.c -o /home/kirr/tmp3kckc5wa/home/kirr/usplit.o
x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,-z,relro -g -fwrapv -O2 -g -ffile-prefix-map=/build/python3.9-RNBry6/python3.9-3.9.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 /home/kirr/tmp3kckc5wa/home/kirr/usplit.o -o /home/kirr/usplit.cpython-39-x86_64-linux-gnu.so
$ python -c 'import usplit'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "usplit.pyx", line 6, in init usplit
    print(mysplit("hello world"))
  File "usplit.pyx", line 4, in usplit.mysplit
    return unicode.split(q, None)
TypeError: must be str, not NoneType
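
Until the behaviour matches CPython, one possible workaround is to avoid handing None to the unbound unicode.split call at all. The following is a minimal sketch only: split_compat is a hypothetical helper name that does not appear in the issue, and the idea that the no-argument method call sidesteps the error under a given Cython version is an assumption that has not been verified here. As plain Python it behaves like str.split.

```python
def split_compat(q, sep=None):
    """Hypothetical helper: split q without ever passing None as sep."""
    if sep is None:
        # Omitting sep requests whitespace splitting, which is what
        # sep=None means in CPython.
        return q.split()
    # A real separator string is passed through unchanged.
    return q.split(sep)


print(split_compat("hello world"))
print(split_compat("a,b,c", ","))
```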
Environment (please complete the following information):
- OS: [Debian GNU/Linux 11]
- Python version [e.g. 3.9.2]
- Cython version [e.g. 0.29.27]
Thanks beforehand, Kirill
Top GitHub Comments
Sure, here is my current list:
Thanks, @scoder. So I’ve tried to do those tests for unicode methods the way we discussed. Please find the patches at https://github.com/cython/cython/pull/4743. More specifically the patches are:
Hope it is ok, Kirill