Trivial crashes found via fuzzing

While experimenting with Python fuzzing, we’ve found the following crashes in pyyaml. Are you interested in these kinds of issues? Should we report all of them here?

$ cat test.py 
import yaml
yaml.load('!!float')

$ python3 test.py 
test.py:3: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  yaml.load('!!float')
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    yaml.load('!!float')
  File "/usr/lib/python3/dist-packages/yaml/__init__.py", line 114, in load
    return loader.get_single_data()
  File "/usr/lib/python3/dist-packages/yaml/constructor.py", line 43, in get_single_data
    return self.construct_document(node)
  File "/usr/lib/python3/dist-packages/yaml/constructor.py", line 47, in construct_document
    data = self.construct_object(node)
  File "/usr/lib/python3/dist-packages/yaml/constructor.py", line 92, in construct_object
    data = constructor(self, node)
  File "/usr/lib/python3/dist-packages/yaml/constructor.py", line 266, in construct_yaml_float
    if value[0] == '-':
IndexError: string index out of range
$
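The traceback shows the constructor indexing `value[0]` on an empty scalar, since `!!float` with no content yields an empty string. A minimal sketch of that failure pattern and one possible guard follows. Both functions are hypothetical stand-ins written for illustration, not PyYAML's actual constructor, which handles more float forms than this.

```python
def parse_float_unguarded(value: str) -> float:
    """Simplified stand-in for the sign handling seen in the traceback."""
    sign = 1
    if value[0] == '-':  # raises IndexError when value is the empty string
        sign = -1
    if value[0] in '+-':
        value = value[1:]
    return sign * float(value)


def parse_float_guarded(value: str) -> float:
    """Same logic, but rejects empty input with a descriptive error."""
    if not value:
        raise ValueError("expected a float, got an empty scalar")
    return parse_float_unguarded(value)


if __name__ == "__main__":
    for text in ["-1.5", ""]:
        try:
            print(repr(text), "->", parse_float_guarded(text))
        except ValueError as exc:
            print(repr(text), "-> rejected:", exc)
```

Raising a plain `ValueError` here is an assumption of the sketch. Which exception type a real fix should use is a decision for the maintainers.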

Top GitHub Comments

nitzmahone commented, Sep 16, 2020

Yes and no: the spec and the various schema/datatype definitions are pretty vague about how implementations should distinguish semantic validity from well-formedness. IIRC the common datatypes are defined by their respective specs only in terms of regexes, not semantic validity or native-type representability.

If we go down this road, IMO it’s very important that we preserve the distinction between implicit resolver failures and native representation failures of explicitly tagged data. I have no problem with allowing an untagged scalar to fall back to a string if it can’t be represented natively by a matching implicit resolver, but at least by default, an explicitly tagged scalar (i.e. this case) that can’t be represented by the implementation’s chosen native datatype should continue to be a failure.

@ingy any thoughts to add?
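The distinction drawn in the comment above can be sketched as a small policy function. This is a hypothetical illustration only: `load_scalar_sketch` is not a PyYAML API, and Python's built-in `float()` stands in for a real YAML float resolver, which accepts a different set of spellings.

```python
from typing import Optional, Union


def load_scalar_sketch(text: str, tag: Optional[str] = None) -> Union[float, str]:
    """Hypothetical policy: lenient for untagged scalars, strict for tagged ones."""
    if tag is None:
        # Implicit resolution: try the native type, fall back to a string.
        try:
            return float(text)
        except ValueError:
            return text
    if tag == "!!float":
        # Explicit tag: the author asked for a float, so failure stays an error.
        try:
            return float(text)
        except ValueError as exc:
            raise ValueError(f"cannot construct !!float from {text!r}") from exc
    if tag == "!!str":
        return text
    raise ValueError(f"unsupported tag in this sketch: {tag}")


if __name__ == "__main__":
    print(load_scalar_sketch("1.5"))   # untagged, converts to a float
    print(load_scalar_sketch("abc"))   # untagged, falls back to a string
    try:
        load_scalar_sketch("", tag="!!float")  # the case from this issue
    except ValueError as exc:
        print("rejected:", exc)
```

Under this policy the empty `!!float` scalar still fails, but with a descriptive error rather than an `IndexError`.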

drawks commented, Sep 15, 2020

Could we choose to convert them into strings instead of throwing an exception?

I think the standard says that is exactly what should be done https://yaml.org/spec/1.2/spec.html#id2767381
