Unicode conversion issue
Environment
- Pythonnet version: 3.0.0-preview2021-06-04
- Python version: 3.8.8
- Operating System: Ubuntu 21.04
- .NET Runtime: 5.0.300
Details
Unicode characters get lost / mangled during conversion:
scope.Exec("testStr = 'Nom 🍗';");
scope.Exec("""print("python:", testStr);""")
Console.WriteLine($"""dotnet: {scope.Get("testStr").ToString()}""");
writes
python: Nom 🍗
dotnet: Nom �
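One possible explanation, offered only as a hypothesis since the issue does not confirm the root cause: 🍗 lies outside the Basic Multilingual Plane, so it counts as one code point in Python but two UTF-16 code units in .NET. If a conversion copied as many UTF-16 code units as there are code points, the second half of the surrogate pair would be cut off. The Python sketch below reproduces that effect in isolation; the variable names are illustrative and nothing here is pythonnet code.

```python
# Sketch of a hypothetical length mismatch, not pythonnet's actual code path.
text = "Nom 🍗"

code_points = len(text)              # Python counts code points: 5
utf16 = text.encode("utf-16-le")     # what a .NET string stores internally
code_units = len(utf16) // 2         # UTF-16 code units: 6

# Assumed bug: only `code_points` code units are copied, so the low
# surrogate of the emoji is dropped and a lone high surrogate remains.
truncated = utf16[: code_points * 2]

# Decoding the truncated buffer replaces the dangling surrogate.
mangled = truncated.decode("utf-16-le", errors="replace")
print(code_points, code_units, repr(mangled))
```

If this hypothesis holds, any string containing characters above U+FFFF would lose its tail in the same way, while pure BMP text would round-trip unchanged.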
Top GitHub Comments
It looks like this problem goes in both directions: "foo🐼" is expected to be 4 characters long, but we somehow create a Python object with length 5.
I found some hints about how Python treats Unicode conversion internally: https://stackoverflow.com/questions/36098984/python-3-3-c-api-and-utf-8-strings
Using PyUnicode_DecodeUTF16 or PyUnicode_DecodeUTF16Stateful instead of PyUnicode_FromKindAndData might alleviate this problem. But getting these functions imported from the .dlls and getting delegates in place is probably beyond my capacity.
And thank you, @filmor, for fixing all the rest.
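The length discrepancy described above can be illustrated in plain Python. This is a sketch, assuming that building a string from 2-byte data treats every UTF-16 code unit as its own code point, whereas a UTF-16 decoder combines surrogate pairs. The names `naive` and `decoded` are illustrative, and the snippet does not call the C API functions mentioned in the comment.

```python
import struct

text = "foo🐼"
utf16 = text.encode("utf-16-le")  # the bytes a .NET string would hold

# Split the buffer into 16-bit code units.
units = struct.unpack("<%dH" % (len(utf16) // 2), utf16)

# Assumed UCS-2 style construction: one code point per code unit.
# The panda becomes two lone surrogates, so the length is 5.
naive = "".join(chr(u) for u in units)

# Proper UTF-16 decoding merges the surrogate pair, so the length is 4.
decoded = utf16.decode("utf-16-le")

# The naive string can be repaired by re-encoding with surrogatepass.
repaired = naive.encode("utf-16-le", "surrogatepass").decode("utf-16-le")

print(len(naive), len(decoded), repaired == text)
```

The comparison suggests why a UTF-16 aware decoding function would be the better fit when the source buffer comes from a .NET string.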