Overly broad scanners matching
_Issue originally created by user ygrek on 2016-07-14 02:32:01. Link to original issue: https://github.com/SpiderLabs/owasp-modsecurity-crs/issues/406._
Hello,
#258 added AhrefsBot (which is a general-purpose crawler, not a vulnerability scanner) to scanners-user-agents.data,
which contradicts the purpose of that list, afaics:
#
# -=[ Vulnerability Scanner Checks ]=-
#
# These rules inspect the default User-Agent and Header values sent by
# various commercial and open source vuln scanners.
#
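For context, this is roughly how the CRS consumes that file: a rule matches the request's User-Agent header against the phrase list using ModSecurity's @pmFromFile operator, so any entry in scanners-user-agents.data blocks every request carrying that substring. A minimal sketch, with an illustrative rule id and action list rather than the exact CRS rule text:

```
# Minimal sketch of the scanner user-agent check, assuming
# scanners-user-agents.data sits next to this rule file.
# Rule id and actions are illustrative, not the exact CRS rule.
SecRule REQUEST_HEADERS:User-Agent "@pmFromFile scanners-user-agents.data" \
    "id:913100,\
    phase:1,\
    t:lowercase,\
    deny,\
    log,\
    msg:'Found User-Agent associated with security scanner'"
```

Because @pmFromFile is a plain phrase match, a crawler like AhrefsBot is indistinguishable from a vuln scanner once its user agent lands in the file, which is exactly the complaint here.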
People are using Ahrefs to explore their own sites and find known problems (duplicate pages, missing title tags, etc.). ModSecurity is often installed and configured by the hosting provider, with the actual website owner having no control over its configuration, so because of this rule owners lose the ability to use Ahrefs services on their own websites. This is rather unfortunate.
Please remove AhrefsBot from the list of vulnerability scanners.
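For operators who do control their ModSecurity configuration, the usual workaround is a runtime exclusion that disables the scanner check for this one user agent. A hedged sketch, assuming the scanner rule id is 913100 as in CRS 3.x; id 1000001 is an arbitrary local id:

```
# Sketch of a local exclusion: let AhrefsBot past the scanner check.
# Assumes the scanner rule id is 913100 (CRS 3.x); this rule must be
# loaded before the CRS rules so the ctl action can take effect.
SecRule REQUEST_HEADERS:User-Agent "@contains AhrefsBot" \
    "id:1000001,\
    phase:1,\
    pass,\
    nolog,\
    ctl:ruleRemoveById=913100"
```

This only helps admins who can edit the configuration, which is precisely what shared-hosting customers cannot do.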
Top GitHub Comments
User ygrek commented on 2016-07-15 07:58:27:
@dune73 I cannot promise that I will do a comprehensive filtering of all scanner user-agents, and I don't know the CRS structure well enough, but I see that there was a list of marketing bots (which also included googlebot) that was removed in 2014; that would be a proper fit, if there is an analogue of such a list in 3.0.0.
@csanders-git Reputation is an interesting topic. I believe that for googlebot you are evaluating the reputation of Google as a whole, not of googlebot itself (which is actually known to be faked constantly). In that case, Ahrefs is one of the top tools for online marketing (I think it is present in every “top 10 SEO tools” collection one can find). Also consider that Google is a well-known public company, while many bots are run by smaller companies with niche businesses that one may not know about without doing some research.
The problem with reporting false positives is that users hit by this problem may (1) not notice that it happened, (2) not be able to understand why it happens, and (3) even with that knowledge, have no permission to fix things, as on shared hosting where Apache is configured by the hosting provider. In the best case, complaints will be sent to the bot authors, not here.
What bothers me is that sysadmins enable CRS thinking of it as protection from security breaches and hacks, but instead they end up imposing arbitrary censorship on their users based on crowdsourced criteria…
All in all, given the above considerations, I understand the best way to resolve this would be for me to create a pull request and continue the discussion, with more specifics, there.
Thank you.
User ygrek commented on 2016-07-14 05:28:24:
Said the internets. Seems legit (not really).
I have one more candidate for this list of vuln scanners: googlebot. Why is it not included? There are actually documented cases of Google results being used to search for vulnerabilities, unlike AhrefsBot. What are the criteria for inclusion in this scanners list?