BytesWarning in execnet when deserializing float

Behold, this is probably one of the strangest issues I’ve ever had the joy (?) of debugging. I’d really appreciate some help, as I’m completely stumped here.

I run tests in qutebrowser with python -bb, which enables byte warnings and turns them into errors.

When I run the following in the qutebrowser repository:

  • tox -e py38-pyqt515 --notest
  • .tox/py38-pyqt515/bin/pip install pytest-xdist
  • tox -e py38-pyqt515 -- -n 2

I get crashing gateways with:

Traceback (most recent call last):
  File "/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/execnet/gateway_base.py", line 855, in _local_receive
    data = loads_internal(data, channel, strconfig)
  File "/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/execnet/gateway_base.py", line 1368, in loads_internal
    return Unserializer(io, channelfactory, strconfig).load()
  File "/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/execnet/gateway_base.py", line 1177, in load
    loader(self)
  File "/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/execnet/gateway_base.py", line 1218, in load_float
    self.stack.append(struct.unpack(FLOAT_FORMAT, binary)[0])
BytesWarning: Comparison between bytes and string

This happens when trying to deserialize the test duration (?). When I print self.stack before that line I see e.g.:

['testreport', {}, 'data', {'nodeid': 'tests/end2end/features/test_backforward_bdd.py::test_going_backforward', 'location': ('.tox/py38-pyqt515/lib/python3.8/site-packages/pytest_bdd/scenario.py', 197, 'test_going_backforward'), 'keywords': {'pytestmark': 1, 'usefixtures': 1, 'end2end': 1, 'tests/end2end/features/test_backforward_bdd.py': 1, 'git': 1, 'test_going_backforward': 1, 'gui': 1, '__pytest_bdd_counter__': 1, 'skip': 1, '__scenario__': 1}, 'outcome': 'skipped', 'longrepr': ('/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/pytest_bdd/scenario.py', 198, 'Skipped: unconditional skip'), 'when': 'setup', 'user_properties': [], 'sections': []}, 'duration']

As you might guess, there’s no comparison between bytes and strings involved; this is a completely normal struct.unpack call:

(Pdb) interact
*interactive*
>>> FLOAT_FORMAT
'!d'
>>> binary
b'?\xb7\xd8x\x89t\x00\x00'

Yet I can reproduce the error when I run the same thing by hand, inside the pdb environment:

>>> struct.unpack(FLOAT_FORMAT, binary)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
BytesWarning: Comparison between bytes and string
 
>>> struct.unpack('!d', b'?\xb7\xd8x\x89t\x00\x00')
Traceback (most recent call last):
  File "<console>", line 1, in <module>
BytesWarning: Comparison between bytes and string

No funny things are going on, as far as I can see:

>>> struct.unpack
<built-in function unpack>
>>> struct
<module 'struct' from '/usr/lib/python3.8/struct.py'>
>>> type(FLOAT_FORMAT)
<class 'str'>
>>> type(binary)
<class 'bytes'>

Here’s where things get weird: according to both the docs and the CPython code, > and ! mean the same thing, but using > instead works!

>>> struct.unpack('>d', b'?\xb7\xd8x\x89t\x00\x00')
(0.09314683299817261,)
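As a sanity check on the claim that `!` and `>` are equivalent, here is a minimal sketch meant for a plain interpreter (one where the struct cache has not been touched by anything else). It reuses the exact bytes from the session above; the variable names are only illustrative.

```python
import struct

# The eight bytes captured in the pdb session above.
binary = b'?\xb7\xd8x\x89t\x00\x00'

# '!' (network order) and '>' (big-endian) both select big-endian,
# standard-size, unaligned layout, so they should decode identically.
network_value = struct.unpack('!d', binary)[0]
big_endian_value = struct.unpack('>d', binary)[0]

print(network_value, big_endian_value)
print(struct.pack('!d', network_value) == struct.pack('>d', network_value))
```

If the two formats behave differently in one process but not in another, the difference has to come from process state rather than from the format characters themselves.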

And as if this wasn’t weird enough, I cannot reproduce the issue in a Python console with the same Python version, on the same machine:

$ PYTHONSTARTUP= python3 -bb
Python 3.8.5 (default, Sep  5 2020, 10:50:12)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import struct
 
>>> struct.unpack('!d', b'?\xb7\xd8x\x89t\x00\x00')
(0.09314683299817261,)

Any idea what kind of black magic xdist/execnet could be doing to cause this? I haven’t been able to find a reproducer outside of qutebrowser so far, but I guess that’s not too surprising with how strange this issue is. Did I find a bug in CPython or something?

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 15 (15 by maintainers)

Top GitHub Comments

3 reactions
tadeu commented, Sep 13, 2020

Hey folks, I was intrigued by this and took some time to debug it, here are my findings.

There’s an internal cache in the struct functions (such as struct.calcsize and struct.unpack). What’s happening is that the “string type” (bytes or str) used in the first call with a given format is stored, and subsequent calls with that format need to compare their argument with the stored one:

>>> import struct
>>> struct.calcsize(b'!d')  # cache for '!d' uses bytes
8
>>> struct.calcsize('!d')  # so there's a warning when trying to use str
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BytesWarning: Comparison between bytes and string
>>> struct.calcsize('>d')  # cache for '>d' uses str
8
>>> struct.calcsize(b'>d')  # so now the warning is inverted, it shows up when trying to use bytes
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BytesWarning: Comparison between bytes and string
>>> struct.calcsize('>d')  # no problem when using str
8

In this case, hypothesis is to blame, since it’s the first one to populate the cache (via struct.unpack).
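To check this outside of qutebrowser, a rough sketch like the following runs each call order in a fresh child interpreter started with -bb, so the parent process's own cache state doesn't matter. The helper name `run_with_bb` and the two snippet constants are made up for this example, and whether the mixed order still fails may depend on the Python version, so the script only reports what it observes.

```python
import subprocess
import sys


def run_with_bb(snippet: str) -> subprocess.CompletedProcess:
    """Run `snippet` in a fresh child interpreter started with -bb."""
    return subprocess.run(
        [sys.executable, "-bb", "-c", snippet],
        capture_output=True,
        text=True,
    )


# Only str formats: the cache never sees a bytes key for '!d'.
STR_ONLY = "import struct; struct.calcsize('!d'); struct.calcsize('!d')"
# bytes first, then str: the order described above as triggering the warning.
BYTES_THEN_STR = "import struct; struct.calcsize(b'!d'); struct.calcsize('!d')"

str_only = run_with_bb(STR_ONLY)
mixed = run_with_bb(BYTES_THEN_STR)

print("str only  ->", "ok" if str_only.returncode == 0 else "failed")
print("bytes+str ->", "BytesWarning" if "BytesWarning" in mixed.stderr else "no warning")
```

Swapping the `'!d'` literals for `'>d'` in both snippets should show that the behaviour is tied to whichever format was cached first, not to the format character.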

2 reactions
The-Compiler commented, Sep 13, 2020

The problem is that their hashes are equal:

>>> hash('a')
-8194812638379419632
>>> hash(b'a')
-8194812638379419632

And since hashes are never a guarantee of equality, a dict internally will use == on any objects where the hashes are equal. But yeah, agreed, this is probably a problem in Python with dicts in general then, not just with struct!
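A small sketch of that point, using the format string from this issue as the key. It assumes CPython, where an ASCII-only str and the bytes with the same content happen to hash identically (an implementation detail, not a language guarantee), and it is meant to be run without -b/-bb, since the comparisons below are exactly what would raise under -bb. The dict here is just a stand-in, not the real struct cache.

```python
text_key = "!d"
bytes_key = b"!d"

# On CPython the hashes match within one process, although the objects differ.
same_hash = hash(text_key) == hash(bytes_key)
are_equal = text_key == bytes_key  # under -bb this comparison raises BytesWarning

# Equal hashes send both keys to the same slot, so the dict has to fall back
# to == to tell them apart; they end up as two separate entries.
cache = {bytes_key: "compiled from bytes"}
cache[text_key] = "compiled from str"

print(same_hash, are_equal, len(cache))
```

Without -b the fallback comparison silently returns False; with -bb the same comparison is what surfaces as "Comparison between bytes and string".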
