sigtool behaviour with ?? wildcard


If I use sigtool to create a PRONOM-style signature from a byte sequence thus:

sigtool -b -p "04{1}[01:0C][01:1F]{28}([41:5A]|[61:7A]){10}(43|44|46|4C|4E)"

The result splits correctly into the XML form:

  <SubSequence Position="1" SubSeqMaxOffset="0" SubSeqMinOffset="0">
    <Sequence>04</Sequence>
    <RightFragment MaxOffset="1" MinOffset="1" Position="1">[01:0C][01:1F]</RightFragment>
    <RightFragment MaxOffset="28" MinOffset="28" Position="2">[41:5A]</RightFragment>
    <RightFragment MaxOffset="28" MinOffset="28" Position="2">[61:7A]</RightFragment>
    <RightFragment MaxOffset="10" MinOffset="10" Position="3">43</RightFragment>
    <RightFragment MaxOffset="10" MinOffset="10" Position="3">44</RightFragment>
    <RightFragment MaxOffset="10" MinOffset="10" Position="3">46</RightFragment>
    <RightFragment MaxOffset="10" MinOffset="10" Position="3">4C</RightFragment>
    <RightFragment MaxOffset="10" MinOffset="10" Position="3">4E</RightFragment>
  </SubSequence>
</ByteSequence>

But if I swap the {1} for ??, which should be equivalent, sigtool doesn't split a fragment at the ?? as it does with {1}. The question marks pass through into the XML output, and DROID then hangs when processing it.

sigtool -b -p "04??[01:0C][01:1F]{28}([41:5A]|[61:7A]){10}(43|44|46|4C|4E)"

  <SubSequence Position="1" SubSeqMaxOffset="0" SubSeqMinOffset="0">
    <Sequence>04</Sequence>
    <RightFragment MaxOffset="0" MinOffset="0" Position="1">??[01:0C][01:1F]</RightFragment>
    <RightFragment MaxOffset="28" MinOffset="28" Position="2">[41:5A]</RightFragment>
    <RightFragment MaxOffset="28" MinOffset="28" Position="2">[61:7A]</RightFragment>
    <RightFragment MaxOffset="10" MinOffset="10" Position="3">43</RightFragment>
    <RightFragment MaxOffset="10" MinOffset="10" Position="3">44</RightFragment>
    <RightFragment MaxOffset="10" MinOffset="10" Position="3">46</RightFragment>
    <RightFragment MaxOffset="10" MinOffset="10" Position="3">4C</RightFragment>
    <RightFragment MaxOffset="10" MinOffset="10" Position="3">4E</RightFragment>
  </SubSequence>
</ByteSequence>

Note the question marks in the RightFragment at Position 1, and the offsets being 0 rather than 1.
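One way to catch this kind of output before handing it to DROID is to scan the generated XML for fragments that still contain ??. The following is only a minimal sketch: the sample is a trimmed copy of the second output above, wrapped in a ByteSequence root so that it parses on its own, and the function name and the list of element names to check are assumptions made for illustration, not part of sigtool or DROID.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative copy of the problematic output shown above.
SAMPLE = """
<ByteSequence>
  <SubSequence Position="1" SubSeqMaxOffset="0" SubSeqMinOffset="0">
    <Sequence>04</Sequence>
    <RightFragment MaxOffset="0" MinOffset="0" Position="1">??[01:0C][01:1F]</RightFragment>
    <RightFragment MaxOffset="28" MinOffset="28" Position="2">[41:5A]</RightFragment>
  </SubSequence>
</ByteSequence>
"""

# Assumption: these are the elements whose text holds byte patterns.
PATTERN_TAGS = ("Sequence", "LeftFragment", "RightFragment")


def find_unsplit_wildcards(xml_text):
    """Return (tag, Position, text) for every pattern element still containing '??'."""
    root = ET.fromstring(xml_text.strip())
    hits = []
    for elem in root.iter():
        # Drop any "{namespace}" prefix so namespaced files are handled too.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag in PATTERN_TAGS and elem.text and "??" in elem.text:
            hits.append((tag, elem.get("Position"), elem.text))
    return hits


for tag, position, text in find_unsplit_wildcards(SAMPLE):
    print(f"{tag} at Position {position} still contains a wildcard: {text}")
```

Run against the first output (the {1} form), the same check would report nothing, since no fragment there contains ??.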

There are about 47 PRONOM signatures using ?? syntax currently.
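Until a fixed version is in use, a possible workaround is to rewrite ?? wildcards as {n} gaps before passing the signature to sigtool, since the {1} form is processed correctly. This is a sketch only: it assumes, as stated above, that a single ?? is equivalent to {1} (and so a run of n wildcards to {n}), and the helper name is hypothetical rather than anything sigtool provides.

```python
import re

# One or more consecutive "??" single-byte wildcards.
_WILDCARD_RUN = re.compile(r"(?:\?\?)+")


def wildcards_to_gaps(signature):
    """Replace each run of '??' wildcards with an equivalent '{n}' fixed gap."""
    def to_gap(match):
        # Each "??" stands for one byte, so the gap is half the matched length.
        return "{%d}" % (len(match.group(0)) // 2)

    return _WILDCARD_RUN.sub(to_gap, signature)


signature = "04??[01:0C][01:1F]{28}([41:5A]|[61:7A]){10}(43|44|46|4C|4E)"
print(wildcards_to_gaps(signature))
```

For the signature in this issue, the rewritten string is the same as the first command's argument, which sigtool splits as expected. The sketch does not merge a rewritten gap with an adjacent {n}, so a pattern such as ??{5} would become {1}{5} and would need separate handling.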

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 14 (9 by maintainers)

Top GitHub Comments

1 reaction
nishihatapalmer commented, Oct 4, 2022

DROID already supports putting an entire signature into a ByteSequence directly, via a Sequence attribute:

  <InternalSignature ID="3" Specificity="Specific">
   <ByteSequence Reference="BOFoffset" Sequence="04??[01:0C][01:1F]{28}([41:5A]|[61:7A]){10}(43|44|46|4C|4E)"/>
  </InternalSignature>

It will then compile it into the actual objects DROID uses. It can include * wildcards and the entire syntax. This was done to let container signatures use the full range of syntax available, without having to manually figure out the XML object structure that PRONOM currently produces.
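As a rough illustration of that approach, the element above could be generated programmatically rather than written by hand. This sketch uses only the element and attribute names that appear in the snippet above; the function name and its default values are assumptions, and it says nothing about where the element has to sit inside a complete signature file.

```python
import xml.etree.ElementTree as ET


def build_internal_signature(sig_id, sequence, reference="BOFoffset",
                             specificity="Specific"):
    """Build an InternalSignature element carrying the whole signature
    in the ByteSequence's Sequence attribute."""
    internal = ET.Element(
        "InternalSignature",
        {"ID": str(sig_id), "Specificity": specificity},
    )
    ET.SubElement(
        internal,
        "ByteSequence",
        {"Reference": reference, "Sequence": sequence},
    )
    return internal


signature = "04??[01:0C][01:1F]{28}([41:5A]|[61:7A]){10}(43|44|46|4C|4E)"
element = build_internal_signature(3, signature)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```

Because the full expression travels as a single attribute value, no fragment splitting happens on the producing side, which is the point made in the comment above.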

The current system is pretty awful, really. PRONOM is essentially pre-compiling the signatures into what it thinks DROID wants internally, fixing in stone how DROID works in the actual signature file. But DROID has moved on substantially since then, and the decisions PRONOM makes are no longer optimal for DROID. So DROID actually changes what PRONOM tells it to do, reverse-engineering the signatures so it can optimise them properly. PRONOM should really just give the signature in full and let DROID figure out the best way to search for it.

However, the big problem with changing the signature file format is backwards compatibility with earlier versions of DROID. Maybe there could be a “new format” URL, with the older format still published so that older versions of DROID continue to work?

0 reactions
steve-daly commented, Nov 25, 2022

Fixed and available in 6.6-rc1. Thanks @nishihatapalmer
