sigtool behaviour with inverted ranges

If you look at PRONOM fmt/142 the raw signature is 52494646{4}57415645666D7420[!10]{3}[!FEFF]{16-*}64617461

This is decomposed within the PRONOM binary signature file as:

<ByteSequence Endianness="Big-endian" Reference="BOFoffset">
  <SubSequence Position="1" SubSeqMaxOffset="0" SubSeqMinOffset="0">
    <Sequence>57415645666D7420</Sequence>
    <LeftFragment MaxOffset="4" MinOffset="4" Position="1">52494646</LeftFragment>
    <RightFragment MaxOffset="0" MinOffset="0" Position="1">[!10]</RightFragment>
    <RightFragment MaxOffset="3" MinOffset="3" Position="2">[!FEFF]</RightFragment>
  </SubSequence>
  <SubSequence Position="2" SubSeqMinOffset="16">
    <Sequence>64617461</Sequence>
  </SubSequence>
</ByteSequence>
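
As a rough illustration of how the raw signature splits into the pieces seen in the XML, the sketch below tokenizes it with a regular expression. The `tokenize` helper is hypothetical: it only separates hex runs, `{...}` gaps and `[...]` expressions, and does not reproduce how PRONOM or DROID assign fragments and offsets.

```python
import re

# Raw fmt/142 signature as quoted in this issue.
RAW_SIGNATURE = "52494646{4}57415645666D7420[!10]{3}[!FEFF]{16-*}64617461"


def tokenize(signature: str) -> list[str]:
    """Split a signature into bracket expressions, gap specifiers and hex runs."""
    return re.findall(r"\[[^\]]*\]|\{[^}]*\}|[0-9A-Fa-f]+", signature)


if __name__ == "__main__":
    for token in tokenize(RAW_SIGNATURE):
        print(token)
```

The two bracketed tokens, `[!10]` and `[!FEFF]`, are the ones that end up as RightFragment values in the XML above.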

If I use sigtool to generate the XML instead, I get this…

<ByteSequence Reference="BOFoffset">
  <SubSequence Position="1" SubSeqMaxOffset="0" SubSeqMinOffset="0">
    <Sequence>57415645666D7420</Sequence>
    <LeftFragment MaxOffset="4" MinOffset="4" Position="1">52494646</LeftFragment>
    <RightFragment MaxOffset="0" MinOffset="0" Position="1">10</RightFragment>
    <RightFragment MaxOffset="3" MinOffset="3" Position="2">[00:FD]</RightFragment>
  </SubSequence>
  <SubSequence Position="2" SubSeqMaxOffset="16" SubSeqMinOffset="16">
    <Sequence>64617461</Sequence>
  </SubSequence>
</ByteSequence>

Note that for the first RightFragment, the value has become ‘10’ rather than [!10] thereby inverting the logic.

This can be reproduced more simply by running commands like:

sigtool [!10]
sigtool [!10:12]

which give the results:

10
[10:12]

…without the necessary exclamation mark, and thereby inverting the logic.
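
To make the effect concrete, here is a minimal Python sketch of how the one-byte forms above could be evaluated. The `matches` helper is hypothetical and models only the syntax discussed in this issue (`HH`, `[!HH]`, `[HH:HH]`, `[!HH:HH]`); it is not DROID's matching code.

```python
def matches(expr: str, byte: int) -> bool:
    """Return True if a single byte satisfies a one-byte signature expression."""
    inverted = False
    body = expr
    if expr.startswith("[") and expr.endswith("]"):
        body = expr[1:-1]
        if body.startswith("!"):
            inverted = True
            body = body[1:]
    if ":" in body:
        lo, hi = (int(part, 16) for part in body.split(":"))
    else:
        lo = hi = int(body, 16)
    in_range = lo <= byte <= hi
    # Inversion flips the result, so losing the '!' flips every decision.
    return in_range != inverted


if __name__ == "__main__":
    for expr in ("[!10]", "10", "[!10:12]", "[10:12]"):
        print(expr, matches(expr, 0x10))
```

For the byte 0x10, `[!10]` rejects and `10` accepts, so a signature file written with the second form matches exactly the files the first form was meant to exclude.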

Although I mention sigtool here, I believe this is calling core DROID code.

Note that fmt/142 includes the troublesome and ambiguous [!FEFF] string, but this issue isn’t related to that.

It was suggested that @nishihatapalmer might be interested in this behaviour.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 36 (16 by maintainers)

Top GitHub Comments

2 reactions
nishihatapalmer commented, Aug 18, 2022

Bug is confirmed, and exists in the ByteSequenceSerializer in droid-core, in the method toPRONOMExpression().

This is failing to add inverted syntax when bytes or ranges are inverted. Fix should be fairly simple hopefully. The test suite could also be improved to ensure the standard syntax is serialized correctly in all cases.
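
A minimal sketch of the kind of round-trip check this suggests, written in Python rather than against droid-core. `ByteMatcher`, `from_pronom` and `to_pronom` are hypothetical names for illustration and handle only one-byte values and ranges; they are not the actual `ByteSequenceSerializer` code.

```python
import re
from dataclasses import dataclass

_PATTERN = re.compile(
    r"\[(!?)([0-9A-Fa-f]{2})(?::([0-9A-Fa-f]{2}))?\]|([0-9A-Fa-f]{2})"
)


@dataclass(frozen=True)
class ByteMatcher:
    """A one-byte matcher: a single value or a range, optionally inverted."""
    lo: int
    hi: int
    inverted: bool = False


def from_pronom(expr: str) -> ByteMatcher:
    """Parse 'HH', '[!HH]', '[HH:HH]' or '[!HH:HH]'."""
    m = _PATTERN.fullmatch(expr)
    if m is None:
        raise ValueError(f"unsupported expression: {expr!r}")
    if m.group(4) is not None:  # bare byte such as '10'
        value = int(m.group(4), 16)
        return ByteMatcher(value, value)
    lo = int(m.group(2), 16)
    hi = int(m.group(3), 16) if m.group(3) else lo
    return ByteMatcher(lo, hi, inverted=(m.group(1) == "!"))


def to_pronom(matcher: ByteMatcher) -> str:
    """Serialize a matcher, keeping the '!' when it is inverted."""
    body = f"{matcher.lo:02X}"
    if matcher.hi != matcher.lo:
        body += f":{matcher.hi:02X}"
    if matcher.inverted:
        return f"[!{body}]"  # the marker missing from the reported output
    return body if matcher.lo == matcher.hi else f"[{body}]"


if __name__ == "__main__":
    for expr in ("10", "[!10]", "[10:12]", "[!10:12]"):
        assert to_pronom(from_pronom(expr)) == expr, expr
    print("round trip ok")
```

Asserting that each supported expression survives parse-then-serialize unchanged is the sort of test that would have flagged `[!10]` coming back as `10`.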

1 reaction
ross-spencer commented, Aug 19, 2022

But it’s clear fmt/142 isn’t working as intended: there’ll be false negatives that are ID’ing as fmt/6, as well as false positives that shouldn’t be ID’ing as fmt/142, so this specifically needs a bit of thought (although from a digital preservation perspective you’re unlikely to treat fmt/6 and fmt/142 differently).

Perhaps the skeleton suite generator can be updated at some point to provide inverse/negative testing. This seems like a good use-case for creating a sample of a format that shouldn’t match. It may require reorganizing the output, but should be pretty easy to start writing. I think this is a weakness that was discussed in the original paper, but this thread is helpful in seeing how it can be tested.
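
One possible building block for such negative samples, sketched in Python under the assumption that a one-byte matcher can be described by a low value, a high value and an inversion flag. `non_matching_byte` is a hypothetical helper, not part of the skeleton suite generator.

```python
from typing import Optional


def non_matching_byte(lo: int, hi: int, inverted: bool) -> Optional[int]:
    """Pick one byte the matcher must reject, or None if it accepts every byte."""
    if inverted:
        # An inverted matcher rejects exactly the bytes inside [lo, hi].
        return lo
    # A plain matcher rejects everything outside [lo, hi].
    if lo > 0x00:
        return lo - 1
    if hi < 0xFF:
        return hi + 1
    return None


if __name__ == "__main__":
    # For [!10], a negative sample would place 0x10 at that position.
    print(hex(non_matching_byte(0x10, 0x10, inverted=True)))
```

Writing the returned byte at the fragment's position gives a file that a correct signature should refuse, which is the case the lost `!` would wrongly accept.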

@steve-daly The signature development utility http://ffdev.info outputs the correct sequence. Trying to follow this thread: is your original issue simply that the signature file is generated incorrectly, which then affects processing of the file? Is the behavior as expected if you use ffdev.info and process your files in DROID?
