python3 str to std::string conversion is not automatic


Refs: https://groups.google.com/d/msg/cython-users/oqk3GQ2pJ8M/-oBEvfWXDgAJ

I have a python2 project where the pyx files contain the following directive:

# cython: c_string_type=unicode, c_string_encoding=utf8

In the process of converting to python3, I am finding that even with these directives, a python3 str is not automatically encoded to “utf8” bytes when it is converted to a C++ std::string:

Reproduction: https://gist.github.com/justinfx/8023d341becc8a1092e5beacd7a249eb

In python3, this results in the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "test.pyx", line 6, in test.test
  File "stringsource", line 15, in string.from_py.__pyx_convert_string_from_py_std__in_string
TypeError: expected bytes, str found

I have tested this behaviour both in cython 0.28.5 as well as master, using all available language level values.

I would expect that, given the directives, any implicit assignment or conversion to std::string automatically encodes the str to ‘utf8’ bytes.
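To make the expected behaviour concrete, here is a minimal pure-Python sketch of the conversion being asked for. The helper name `encode_for_std_string` is hypothetical and is not part of Cython; it only models what "encode a str with the configured c_string_encoding before filling a std::string" would mean, assuming the encoding is utf8 as in the directive above.

```python
def encode_for_std_string(value, encoding="utf8"):
    """Hypothetical model: return the bytes a std::string would end up holding."""
    if isinstance(value, bytes):
        # Already bytes: pass through unchanged.
        return value
    if isinstance(value, str):
        # Python 3 text: encode with the configured c_string_encoding.
        return value.encode(encoding)
    raise TypeError("expected bytes or str, got %s" % type(value).__name__)


# Reading the value back with c_string_type=unicode corresponds to decoding again.
raw = encode_for_std_string("naïve")
assert raw.decode("utf8") == "naïve"
```

The round trip at the end is the symmetry the directives suggest: text is encoded on the way into C++ and decoded on the way back out.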

My current workaround for the many places where a python string is assigned to a std::string, passed as an argument, or converted implicitly as part of a map or list, is to explicitly wrap each site in a conversion helper:

    # cython: c_string_type=unicode, c_string_encoding=utf8

    from libcpp.string cimport string
    from cpython.version cimport PY_MAJOR_VERSION

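    cdef unicode _text(s):
        # Normalize any accepted Python string to a unicode object.
        if type(s) is unicode:
            # Exact unicode: return as-is.
            return <unicode>s
        elif PY_MAJOR_VERSION < 3 and isinstance(s, bytes):
            # Python 2 byte string: only ASCII content is accepted.
            return (<bytes>s).decode('ascii')
        elif isinstance(s, unicode):
            # unicode subclass: convert to an exact unicode object.
            return unicode(s)
        else:
            raise TypeError("Could not convert to unicode.")

    cdef string _string(basestring s) except *:
        # Encode explicitly to UTF-8 bytes; bytes convert to std::string.
        cdef string c_str = _text(s).encode("utf-8")
        return c_str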

    # ...
    self.field = _string(s)

This has been error-prone, since I keep overlooking hard-to-spot type conversions. It would be amazing if Cython's behaviour were updated to support automatic conversions based on my directives.
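For the implicit map and list conversions mentioned above, one way to reduce the number of call sites to audit is to encode a whole container on the Python side before handing it to the wrapper. This is only a sketch and not something proposed in the issue: `encode_nested` is a hypothetical helper, and it assumes utf8 and that only dict, list, and tuple containers need handling.

```python
def encode_nested(obj, encoding="utf8"):
    """Hypothetical helper: recursively encode every str in a container to bytes."""
    if isinstance(obj, str):
        return obj.encode(encoding)
    if isinstance(obj, dict):
        # Encode both keys and values, e.g. for a map of string to vector of string.
        return {
            encode_nested(key, encoding): encode_nested(value, encoding)
            for key, value in obj.items()
        }
    if isinstance(obj, (list, tuple)):
        # Tuples are returned as lists; element order is preserved.
        return [encode_nested(item, encoding) for item in obj]
    # Numbers, bytes, None, and anything else pass through untouched.
    return obj


payload = encode_nested({"name": ["a", "b"], "count": 3})
```

The trade-off is an extra copy of the container, in exchange for a single conversion point instead of one wrapper call per string.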

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

kylemcdonald commented, Nov 14, 2019 (4 reactions)

I ran into this while wrapping C++ code. Adding this line to the top of my .pyx wrapper file fixed my problem:

# cython: c_string_type=unicode, c_string_encoding=utf8

robertwb commented, Feb 19, 2019 (1 reaction)

It is somewhat close to the edge, but I think it falls in the bugfix bucket, as this was the obvious intent, and it changes an exception being raised into the desired behavior. So I think it’s safe for a point release.


