python3 str to std::string conversion is not automatic
Refs: https://groups.google.com/d/msg/cython-users/oqk3GQ2pJ8M/-oBEvfWXDgAJ
I have a python2 project where the pyx files contain the following directive:
# cython: c_string_type=unicode, c_string_encoding=utf8
In the process of converting to python3, I am finding that even with this directive, a python3 str is not automatically encoded to “utf8” bytes when it is converted to a C++ std::string.
Reproduction: https://gist.github.com/justinfx/8023d341becc8a1092e5beacd7a249eb
In python3, this results in the following exception:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "test.pyx", line 6, in test.test
  File "stringsource", line 15, in string.from_py.__pyx_convert_string_from_py_std__in_string
TypeError: expected bytes, str found
I have tested this behaviour in both Cython 0.28.5 and master, using all available language_level values.
I would expect that, given the directives, any implicit assignment or conversion to std::string automatically encodes the str to ‘utf8’ bytes.
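To make that expectation concrete, here is a minimal sketch in plain Python 3 of the conversion I would expect the directives to trigger. The helper name `to_utf8_bytes` is hypothetical and exists only for illustration; it is not a Cython API, and the real coercion happens in generated C++ code.

```python
def to_utf8_bytes(value, encoding="utf-8"):
    """Hypothetical helper: mimic the str -> bytes step expected before
    the value is handed to a std::string."""
    if isinstance(value, bytes):
        # Already bytes: pass through unchanged.
        return value
    if isinstance(value, str):
        # Python 3 str: encode using the configured c_string_encoding.
        return value.encode(encoding)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


# A non-ASCII character shows that the encoding matters.
encoded = to_utf8_bytes("na\u00efve")
```

With the current behaviour, the str is rejected instead of being encoded, which is where the TypeError above comes from.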
There are a ton of locations where a python string is assigned to a std::string, passed as an argument, or converted as part of an implicit map or list conversion. My current workaround is to explicitly wrap each of those sites in a conversion helper:
# cython: c_string_type=unicode, c_string_encoding=utf8
from libcpp.string cimport string
from cpython.version cimport PY_MAJOR_VERSION

cdef unicode _text(s):
    if type(s) is unicode:
        return <unicode>s
    elif PY_MAJOR_VERSION < 3 and isinstance(s, bytes):
        return (<bytes>s).decode('ascii')
    elif isinstance(s, unicode):
        return unicode(s)
    else:
        raise TypeError("Could not convert to unicode.")

cdef string _string(basestring s) except *:
    cdef string c_str = _text(s).encode("utf-8")
    return c_str

# ...
self.field = _string(s)
This has been error-prone, since I keep overlooking hard-to-spot type conversions. It would be amazing if Cython's behaviour were updated to support automatic conversions based on my directives.
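For the implicit map and list conversions, one stopgap is to encode containers on the Python side before they reach the Cython boundary. The sketch below is plain Python 3 and assumes the containers are ordinary dicts, lists, and tuples; `encode_nested` is a hypothetical name, not something Cython provides.

```python
def encode_nested(obj, encoding="utf-8"):
    """Hypothetical helper: recursively encode every str inside nested
    dicts, lists, and tuples to bytes; other values are left untouched."""
    if isinstance(obj, str):
        return obj.encode(encoding)
    if isinstance(obj, dict):
        # Encode both keys and values.
        return {
            encode_nested(key, encoding): encode_nested(value, encoding)
            for key, value in obj.items()
        }
    if isinstance(obj, (list, tuple)):
        # Rebuild the same container type with encoded items.
        return type(obj)(encode_nested(item, encoding) for item in obj)
    return obj


payload = encode_nested({"name": ["a", "b"], "count": 3})
```

This only moves the explicit conversion to one place per container; it does not remove the need to remember it at each call site.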
Issue Analytics
- Created 5 years ago
- Comments:8 (4 by maintainers)
Top GitHub Comments
I ran into this while wrapping C++ code that. Adding this line to the top of my .pyx wrapper file fixed my problem:
It is somewhat close to the edge, but I think it falls in the bugfix bucket, as this was the obvious intent, and it changes an exception being raised into the desired behavior. So I think it’s safe for a point release.