sigtool behaviour with inverted ranges
If you look at PRONOM fmt/142, the raw signature is 52494646{4}57415645666D7420[!10]{3}[!FEFF]{16-*}64617461
This is decomposed within the PRONOM binary signature file as:
<ByteSequence Endianness="Big-endian" Reference="BOFoffset">
<SubSequence Position="1" SubSeqMaxOffset="0" SubSeqMinOffset="0">
<Sequence>57415645666D7420</Sequence>
<LeftFragment MaxOffset="4" MinOffset="4" Position="1">52494646</LeftFragment>
<RightFragment MaxOffset="0" MinOffset="0" Position="1">[!10]</RightFragment>
<RightFragment MaxOffset="3" MinOffset="3" Position="2">[!FEFF]</RightFragment>
</SubSequence>
<SubSequence Position="2" SubSeqMinOffset="16">
<Sequence>64617461</Sequence>
</SubSequence>
</ByteSequence>
If I use sigtool to generate the XML instead, I get this…
<ByteSequence Reference="BOFoffset">
<SubSequence Position="1" SubSeqMaxOffset="0" SubSeqMinOffset="0">
<Sequence>57415645666D7420</Sequence>
<LeftFragment MaxOffset="4" MinOffset="4" Position="1">52494646</LeftFragment>
<RightFragment MaxOffset="0" MinOffset="0" Position="1">10</RightFragment>
<RightFragment MaxOffset="3" MinOffset="3" Position="2">[00:FD]</RightFragment>
</SubSequence>
<SubSequence Position="2" SubSeqMaxOffset="16" SubSeqMinOffset="16">
<Sequence>64617461</Sequence>
</SubSequence>
</ByteSequence>
Note that for the first RightFragment, the value has become ‘10’ rather than [!10] thereby inverting the logic.
This can be reproduced more simply by running commands like:
sigtool [!10]
sigtool [!10:12]
which give the results:
10
[10:12]
…without the necessary exclamation mark, and thereby inverting the logic.
Although I mention sigtool here, I believe this is calling core DROID code.
Note that fmt/142 includes the troublesome and ambiguous [!FEFF] string, but this issue isn’t related to that.
It was suggested that @nishihatapalmer might be interested in this behaviour.
Top GitHub Comments
Bug is confirmed, and exists in the ByteSequenceSerializer in droid-core, in the method toPRONOMExpression().
This is failing to add inverted syntax when bytes or ranges are inverted. Fix should be fairly simple hopefully. The test suite could also be improved to ensure the standard syntax is serialized correctly in all cases.
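To make the expected behaviour concrete, here is a minimal sketch in Python of what serialising a single byte or byte range back to PRONOM syntax would need to do. This is not the droid-core code: `ByteMatcher` and `to_pronom_expression` below are hypothetical names used only for illustration, and the real method may be structured quite differently. The point is simply that the `!` has to be written whenever the matcher is inverted, and that an inverted single byte needs brackets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ByteMatcher:
    """Hypothetical stand-in for a single-byte or byte-range matcher."""
    low: int
    high: int
    inverted: bool = False


def to_pronom_expression(m: ByteMatcher) -> str:
    """Serialise a matcher to PRONOM syntax, keeping the inversion marker."""
    # A plain, non-inverted single byte is written as bare hex, e.g. 10.
    if m.low == m.high and not m.inverted:
        return f"{m.low:02X}"
    # Everything else goes in brackets: [10:12], [!10], [!10:12].
    body = f"{m.low:02X}" if m.low == m.high else f"{m.low:02X}:{m.high:02X}"
    prefix = "!" if m.inverted else ""
    return f"[{prefix}{body}]"


if __name__ == "__main__":
    print(to_pronom_expression(ByteMatcher(0x10, 0x10, inverted=True)))
    print(to_pronom_expression(ByteMatcher(0x10, 0x12, inverted=True)))
```

If the `prefix` step is skipped, the two calls above would come out as 10 and [10:12], which is the same symptom reported in the issue.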
Perhaps the skeleton suite generator can be updated at some point to provide inverse/negative testing. This seems like a good use-case to create a sample of a format that shouldn’t match. It may require reorganizing the output, but should be pretty easy to start writing. I think this is a weakness that was discussed in the original paper, but I think this thread is helpful in seeing how it can be tested.
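As a rough idea of what negative testing could look like for an inverted byte or range, the sketch below (Python, hypothetical helper names, not part of the skeleton suite generator) picks one byte value that should satisfy the expression and one that should not. A skeleton file built with the second value at that position is a sample that must fail to match.

```python
def matches(value: int, low: int, high: int, inverted: bool) -> bool:
    """True if `value` satisfies [low:high], or [!low:high] when inverted."""
    in_range = low <= value <= high
    return in_range != inverted


def sample_bytes(low: int, high: int, inverted: bool) -> tuple[int, int]:
    """Return (positive, negative): a byte that should match and one that should not.

    Raises StopIteration if no such byte exists, e.g. for the full range 00:FF.
    """
    positive = next(v for v in range(256) if matches(v, low, high, inverted))
    negative = next(v for v in range(256) if not matches(v, low, high, inverted))
    return positive, negative


if __name__ == "__main__":
    # For [!10], any byte other than 0x10 should match; 0x10 itself should not.
    pos, neg = sample_bytes(0x10, 0x10, inverted=True)
    print(f"positive sample byte: {pos:02X}, negative sample byte: {neg:02X}")
```

With the serialisation bug described above, the negative sample for [!10] would wrongly be identified, so a test of this kind would have caught it.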
@steve-daly The signature development utility http://ffdev.info outputs the correct sequence. Trying to follow this thread: is your original issue simply that the signature file is generated incorrectly, which then affects processing of the file? Is the behavior as expected if you use ffdev.info and process your files in DROID?