Support for certain unicode literals depends on sys.maxunicode


It seems that support for Unicode literals above code point 0xFFFF, introduced in cf1c86cb86db74206085e6f83e4586ddc7db9ac2, depends on sys.maxunicode. For example, it works with standard Python 3 builds, but not with standard Python 2 builds where sys.maxunicode is 2**16-1 (built with UCS-2).
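A quick way to tell which kind of build is in use is to inspect sys.maxunicode directly. The following is a minimal sketch; the helper name is_narrow_build is hypothetical and not part of PyYAML.

```python
import sys


def is_narrow_build(maxunicode=None):
    """Return True when the interpreter stores text as UCS-2 code units.

    `maxunicode` defaults to sys.maxunicode; it is a parameter only so the
    check can be exercised without a narrow interpreter at hand.
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    # Narrow (UCS-2) builds report 0xFFFF; wide builds and Python 3.3+
    # report 0x10FFFF.
    return maxunicode == 0xFFFF


if __name__ == "__main__":
    print("sys.maxunicode = %#x" % sys.maxunicode)
    print("narrow build: %s" % is_narrow_build())
```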

Here’s a simple test program that works on Python 3, but not on a standard Python 2 build:

# coding=utf8
import yaml

s = '''
"😢"
'''

assert yaml.safe_load(s) == '😢'

It produces the following error on Python 2:

Traceback (most recent call last):
  File "pyyaml_unicode.py", line 8, in <module>
    assert yaml.safe_load(s) == '😢'
  File "/usr/local/lib/python2.7/site-packages/yaml/__init__.py", line 162, in safe_load
    return load(stream, SafeLoader)
  File "/usr/local/lib/python2.7/site-packages/yaml/__init__.py", line 112, in load
    loader = Loader(stream)
  File "/usr/local/lib/python2.7/site-packages/yaml/loader.py", line 34, in __init__
    Reader.__init__(self, stream)
  File "/usr/local/lib/python2.7/site-packages/yaml/reader.py", line 81, in __init__
    self.determine_encoding()
  File "/usr/local/lib/python2.7/site-packages/yaml/reader.py", line 137, in determine_encoding
    self.update(1)
  File "/usr/local/lib/python2.7/site-packages/yaml/reader.py", line 174, in update
    self.check_printable(data)
  File "/usr/local/lib/python2.7/site-packages/yaml/reader.py", line 149, in check_printable
    'unicode', "special characters are not allowed")
yaml.reader.ReaderError: unacceptable character #xd83d: special characters are not allowed
  in "<string>", position 2

Is there any reasonable way to support this functionality on Python 2 as well?
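The character #xd83d in the error is not arbitrary. On a narrow build, a code point above 0xFFFF is stored as a UTF-16 surrogate pair, and 0xD83D is the high surrogate of U+1F622 (😢), so the reader's printable-character check is presumably seeing the two halves separately. The arithmetic can be sketched as follows; to_surrogate_pair is a hypothetical helper for illustration, not a PyYAML function.

```python
def to_surrogate_pair(code_point):
    """Split a code point above 0xFFFF into UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the supplementary planes")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


# U+1F622 is the emoji used in the test program.
high, low = to_surrogate_pair(0x1F622)
print("%#x %#x" % (high, low))             # prints "0xd83d 0xde22"
```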

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

perlpunk commented, Dec 2, 2019

We released 5.2: https://pypi.org/project/PyYAML/5.2/

edit: oops, wrong comment, as the fix was not in 5.2

perlpunk commented, Nov 20, 2019

Just a note: in Python 2, I believe the assert should look like this:

assert yaml.safe_load(s) == u'😢'

With that change it works for me on a standard Python 2.7.14 on Linux (openSUSE). But I know it doesn’t on other Python 2 builds (Windows, macOS).

I don’t have such a machine available, so I’m not able to check if this would be possible. Maybe you can try to fix it?

btw, https://pythonclock.org/ says Python 2 will retire in 1 month 11 days…
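The u'' prefix matters because on Python 2, with the coding declaration used in the test program, a plain '😢' literal is a UTF-8 byte string rather than a unicode object. One way to write the comparison so the same source runs on both major versions is to normalise the expected value to text first. This is only a sketch: as_text is a hypothetical helper, and it does not address the ReaderError on narrow builds.

```python
# coding=utf8


def as_text(value, encoding="utf-8"):
    """Return `value` as text (unicode on Python 2, str on Python 3)."""
    # On Python 2, `bytes` is an alias for `str`, so plain literals get decoded.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# The plain literal is bytes on Python 2 and text on Python 3.
expected = as_text('😢')
# b'\xf0\x9f\x98\xa2' is the UTF-8 encoding of U+1F622.
assert expected == as_text(b'\xf0\x9f\x98\xa2')
```

With such a helper, the assert in the test program could be written as: assert yaml.safe_load(s) == as_text('😢')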
