BytesWarning in execnet when deserializing float
Behold, this is probably one of the strangest issues I’ve ever had the joy (?) of debugging. I’d really appreciate some help, as I’m completely stumped here.
I run tests in qutebrowser with python -bb, which enables bytes warnings and turns them into errors.
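For readers unfamiliar with the flag, here is a minimal sketch of what -bb does, run in a fresh interpreter so it does not depend on how the current one was started. The helper name run_with_bb is made up for this illustration and is not part of qutebrowser, pytest-xdist or execnet.

```python
import subprocess
import sys


def run_with_bb(code):
    """Run `code` in a fresh interpreter started with -bb.

    Hypothetical helper for illustration only.
    Returns (returncode, stderr).
    """
    proc = subprocess.run(
        [sys.executable, "-bb", "-c", code],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr


# Comparing bytes with str is normally just False; under -bb the
# comparison raises BytesWarning as an error instead.
returncode, stderr = run_with_bb("print(b'abc' == 'abc')")
print("exit code:", returncode)
print(stderr.strip().splitlines()[-1])
```

The last line of stderr should be the same kind of BytesWarning message as in the traceback below.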
When I run the following in the qutebrowser repository:
tox -e py38-pyqt515 --notest
.tox/py38-pyqt515/bin/pip install pytest-xdist
tox -e py38-pyqt515 -- -n 2
I get crashing gateways with:
Traceback (most recent call last):
File "/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/execnet/gateway_base.py", line 855, in _local_receive
data = loads_internal(data, channel, strconfig)
File "/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/execnet/gateway_base.py", line 1368, in loads_internal
return Unserializer(io, channelfactory, strconfig).load()
File "/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/execnet/gateway_base.py", line 1177, in load
loader(self)
File "/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/execnet/gateway_base.py", line 1218, in load_float
self.stack.append(struct.unpack(FLOAT_FORMAT, binary)[0])
BytesWarning: Comparison between bytes and string
This happens when trying to deserialize the test duration (?). When I print self.stack before that line I see e.g.:
['testreport', {}, 'data', {'nodeid': 'tests/end2end/features/test_backforward_bdd.py::test_going_backforward', 'location': ('.tox/py38-pyqt515/lib/python3.8/site-packages/pytest_bdd/scenario.py', 197, 'test_going_backforward'), 'keywords': {'pytestmark': 1, 'usefixtures': 1, 'end2end': 1, 'tests/end2end/features/test_backforward_bdd.py': 1, 'git': 1, 'test_going_backforward': 1, 'gui': 1, '__pytest_bdd_counter__': 1, 'skip': 1, '__scenario__': 1}, 'outcome': 'skipped', 'longrepr': ('/home/florian/proj/qutebrowser/git/.tox/py38-pyqt515/lib/python3.8/site-packages/pytest_bdd/scenario.py', 198, 'Skipped: unconditional skip'), 'when': 'setup', 'user_properties': [], 'sections': []}, 'duration']
As you might guess, there’s no comparison between bytes and strings involved; this is a completely normal struct.unpack call:
(Pdb) interact
*interactive*
>>> FLOAT_FORMAT
'!d'
>>> binary
b'?\xb7\xd8x\x89t\x00\x00'
Yet I can reproduce the error when I run the same thing by hand, inside the pdb environment:
>>> struct.unpack(FLOAT_FORMAT, binary)
Traceback (most recent call last):
File "<console>", line 1, in <module>
BytesWarning: Comparison between bytes and string
>>> struct.unpack('!d', b'?\xb7\xd8x\x89t\x00\x00')
Traceback (most recent call last):
File "<console>", line 1, in <module>
BytesWarning: Comparison between bytes and string
No funny things are going on, as far as I can see:
>>> struct.unpack
<built-in function unpack>
>>> struct
<module 'struct' from '/usr/lib/python3.8/struct.py'>
>>> type(FLOAT_FORMAT)
<class 'str'>
>>> type(binary)
<class 'bytes'>
Here’s where things get weird: according to both the docs and the CPython code, > and ! mean the same thing, but using > instead works!
>>> struct.unpack('>d', b'?\xb7\xd8x\x89t\x00\x00')
(0.09314683299817261,)
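As a sanity check, this small sketch decodes the same eight bytes with both format strings in an ordinary interpreter that has not hit the problem. The variable names are only for illustration; the byte string is the one from the pdb session.

```python
import struct

# The payload seen in the pdb session above.
binary = b"?\xb7\xd8x\x89t\x00\x00"

# "!" (network order) and ">" (big-endian) both select big-endian,
# standard-size layout, so the two calls should decode identically.
network = struct.unpack("!d", binary)
big_endian = struct.unpack(">d", binary)

print(network, big_endian)
print(network == big_endian)
print(struct.calcsize("!d"), struct.calcsize(">d"))
```

So the two spellings differ only as strings, which matters for anything keyed on the format string itself rather than on its meaning.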
And as if this weren’t weird enough, I cannot reproduce the issue in a Python console with the same Python version, on the same machine:
$ PYTHONSTARTUP= python3 -bb
Python 3.8.5 (default, Sep 5 2020, 10:50:12)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import struct
>>> struct.unpack('!d', b'?\xb7\xd8x\x89t\x00\x00')
(0.09314683299817261,)
Any idea what kind of black magic xdist/execnet could be doing to cause this? I haven’t been able to find a reproducer outside of qutebrowser so far, but I guess that’s not too surprising with how strange this issue is. Did I find a bug in cpython or something?
Top GitHub Comments
Hey folks, I was intrigued by this and took some time to debug it, here are my findings.
There’s an internal cache in struct implementations (such as struct.calcsize and struct.unpack), and what’s happening is that the “string type” (bytes or str) of the first call to that function is being stored, and subsequent calls need to compare the arguments with the stored type.

In this case, hypothesis is to blame, since it’s the first one to populate the cache (via struct.unpack).

The problem is that the hashes of the str and bytes forms of the same format string are equal.

And since hashes are never a guarantee of equality, a dict internally will use == on any objects where the hashes are equal. But yeah, agreed, this is probably a problem in Python with dicts in general then, not just with struct!
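The explanation above can be sketched without execnet or hypothesis. The snippet below is a model of the mechanism, not the actual struct internals: it uses a plain dict as a stand-in for the cache and runs it in a fresh interpreter with -bb. The helper run_with_bb and the names DICT_MODEL and STRUCT_REPRO are invented for this illustration. The equal hash for '!d' and b'!d' is a CPython implementation detail. Whether the struct variant still raises depends on the Python version, so its result is only printed.

```python
import subprocess
import sys


def run_with_bb(code):
    """Run `code` in a fresh interpreter with -bb (hypothetical helper)."""
    proc = subprocess.run(
        [sys.executable, "-bb", "-c", code],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr


# In CPython, an ASCII str and the bytes with the same content hash alike.
same_hash = hash("!d") == hash(b"!d")
print("hashes equal:", same_hash)

# Stand-in for the cache: the key was stored as bytes first, then looked
# up as str. Equal hashes force the dict to fall back to ==, which is
# the bytes/str comparison that -bb turns into an error.
DICT_MODEL = "cache = {b'!d': 'compiled'}\ncache.get('!d')\n"
dict_rc, dict_err = run_with_bb(DICT_MODEL)
print("dict model exit code:", dict_rc)
print(dict_err.strip().splitlines()[-1])

# The same ordering against struct itself: bytes format first, then str.
# Assumption: this only fails on Python versions whose struct cache is
# keyed directly on the format object, so no outcome is asserted here.
STRUCT_REPRO = (
    "import struct\n"
    "data = struct.pack('!d', 0.5)\n"
    "struct.unpack(b'!d', data)\n"
    "struct.unpack('!d', data)\n"
)
struct_rc, struct_err = run_with_bb(STRUCT_REPRO)
print("struct repro exit code:", struct_rc)
```

This would also explain why '>d' worked in the pdb session: nothing had populated the cache with b'>d', so there was no equal-hash key to compare against.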