regexp-assemble generates a malformed regex pattern for rule 932100

Describe the bug

When executing

./regexp-assemble.py generate 932100

a malformed regular expression pattern is generated. Specifically, one of the double quote marks is not correctly escaped. Note at the end here:

(?:[;\n\r`]|\$(?:\(?\(|{)|(?:\|)?\||\(\s*\)|[<>]\(|&?&|\{)\s*(?:(?:\w+=(?:[^\s]*|\$.*|\$.*|<.*|>.*|\'.*\'|\".*\")\s+|(?:\s*\(|!)\s*|\{|\$))*\s*(?:["'])...

The character class (?:["']) should be emitted as (?:[\"']).

The un-escaped " causes havoc: the server config becomes invalid and the server process will not start (at least with Apache and nginx).
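As a rough way to spot this kind of problem before a generated pattern is written into a rule file, the sketch below scans a pattern for double quotes that are not preceded by an odd number of backslashes. The function name `unescaped_quote_positions` and the shortened `fragment` are illustrative assumptions; this is not part of regexp-assemble.py.

```python
def unescaped_quote_positions(pattern: str, quote: str = '"') -> list[int]:
    """Return the indexes of `quote` characters that are not backslash-escaped.

    A quote counts as escaped when it is preceded by an odd number of
    consecutive backslashes.
    """
    positions = []
    backslashes = 0  # length of the backslash run right before the current character
    for index, char in enumerate(pattern):
        if char == "\\":
            backslashes += 1
            continue
        if char == quote and backslashes % 2 == 0:
            positions.append(index)
        backslashes = 0
    return positions


# Hypothetical shortened tail of the pattern quoted in the report.
fragment = r"""\'.*\'|\".*\")\s+|(?:\s*\(|!)\s*|\{|\$))*\s*(?:["'])"""

for position in unescaped_quote_positions(fragment):
    print(f"un-escaped double quote at index {position}: ...{fragment[position - 4:position + 3]}")
```

Only the quote inside the final character class is reported; the two `\"` occurrences earlier in the fragment are already escaped and are skipped.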

Steps to reproduce

In the v4.0/dev branch, execute:

./regexp-assemble.py generate 932100

Observe that the double quote mark is not escaped.

Execute:

./regexp-assemble.py update 932100

and try to start either of the Docker containers. Observe that they fail to start correctly:

Apache:

$ sudo docker-compose up modsec2-apache
[+] Running 2/2
 ⠿ Container tests-backend-1  Running                                                                                                                                                    0.0s
 ⠿ Container modsec2-apache   Created                                                                                                                                                    0.8s
Attaching to modsec2-apache
modsec2-apache  | AH00526: Syntax error on line 119 of /etc/modsecurity.d/owasp-crs/rules/REQUEST-932-APPLICATION-ATTACK-RCE.conf:
modsec2-apache  | SecRule takes two or three arguments, rule target, operator and optional action list
modsec2-apache exited with code 1

nginx:

$ sudo docker-compose up modsec3-nginx
[+] Running 2/0
 ⠿ Container tests-backend-1  Running                                                                                                                                                    0.0s
 ⠿ Container modsec3-nginx    Created                                                                                                                                                    0.0s
Attaching to modsec3-nginx
modsec3-nginx  | nginx: [emerg] "modsecurity_rules_file" directive Rules error. File: /etc/modsecurity.d/owasp-crs/rules/REQUEST-932-APPLICATION-ATTACK-RCE.conf. Line: 99. Column: 6504. Expecting an action, got:  \./\x5c]+/)?[\x5c'\"]*(?:l[\x5c'\"]*(?:w[\x5c'\"]*p[\x5c'\"]*-[\x5c'\"]*(?:d[\x5c'\"]*(?:o[\x5c'\"]*w[\x5c'\"]*n[\x5c'\"]*l[\x5c'\"]*o[\x5c'\"]*a[\x5c'\"]*d|u[\x5c'\"]*m[\x5c'\"]*p)|r[\x5c'\"]*e[\x5c'\"]*q[\x5c'\"]*u[\x5c'\"]*e[\x5c'\"]*s[\x5c'\"]*t|m[\x5c'\"]*i[\x5c'\"]*r[\x5c'\"]*r[\x5c'\"]*o[\x5c'\"]*r)|s(?:[\x5c'\"]*(?:b[\x5c'\"]*_[\x5c'\"]*r[\x5c'\"]*e[\x5c'\"]*l[\x5c'\"]*e[\x5c'\"]*a[\x5c'\"]*s[\x5c'\"]*e|c[\x5c'\"]*p[\x5c'\"]*u|m[\x5c'\"]*o[\x5c'\"]*d|p[\x5c'\"]*c[\x5c'\"]*i|u[\x5c'\"]*s[\x5c'\"]*b|-[\x5c'\"]*F|h[\x5c'\"]*w|o[\x5c'\"]*f))?|z[\x5c'\"]*(?:(?:[ef][\x5c'\"]*)?g[\x5c'\"]*r[\x5c'\"]*e[\x5c'\"]*p|c[\x5c'\"]*(?:a[\x5c'\"]*t|m[\x5c'\"]*p)|m[\x5c'\"]*(?:o[\x5c'\"]*r[\x5c'\"]*e|a)|d[\x5c'\"]*i[\x5c'\"]*f[\x5c'\"]*f|l[\x5c'\"]*e[\x5c'\"]*s[\x5c'\"]*s)|o[\x5c'\"]*(?:g[\x5c'\"]*(?:(?:n[\x5c'\"]*a[\x5c'\"]*m|s[\x5c'\"]*a[\ in /etc/nginx/conf.d/modsecurity.conf:2
modsec3-nginx exited with code 1

Expected behaviour

The double quote mark " should be escaped in the output pattern, as is expected.

Actual behaviour

The double quote mark " is not correctly escaped in the output pattern.

Your Environment

  • CRS version (e.g., v3.2.0): v4.0/dev

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 18 (18 by maintainers)

Top GitHub Comments

1 reaction
theseion commented, Aug 30, 2022

I’m really no Perl expert. AFAICT, the module wasn’t really designed for inheritance, and I had to patch a subroutine for one of the changes (the subroutines are huge and complicated, so overriding would mean copying large amounts of code). I don’t know how well monkey patching works in Perl; I think I read somewhere that it can lead to interesting results.

You’re welcome to try something different. From my point of view the current approach works reasonably well and is reasonably maintainable, given the circumstances.

0 reactions
fzipi commented, Aug 30, 2022

JFYI: these are the differences between upstream and our changes

❯ diff -b Assemble.pm ~/perl5/lib/perl5/Regexp/Assemble.pm
75d74
< 		cook_hex
80,86d78
< 	# Replace the string with an array ref
< 	my @fet = ();
< 	if (length $args{force_escape_tokens}) {
< 		@fet = split '', $args{force_escape_tokens};
< 	}
< 	@args{force_escape_tokens} = \@fet;
<
140c132
< 	$debug and print "# _fastlex <$record>\n";
---
>     $debug and print "# _lex <$record>\n";
340c332
< 				elsif( $self->{cook_hex} and $token =~ /^\\x([\da-fA-F]{2})$/ ) {
---
>                 elsif( $token =~ /^\\x([\da-fA-F]{2})$/ ) {
359,365d350
<
< 			foreach (@{$self->{force_escape_tokens}}) {
< 				$token =~ s/([^\\])($_)/$1\\$2/;
< 				$token =~ s/^($_)/\\$1/;
< 				last;
< 			}
<
810,823d794
< sub force_escape_tokens {
< 	my $self = shift;
< 	my $arrayref = [];
< 	if (defined($_[0])) {
< 		if (ref($_[0])) {
< 			$arrayref = $_[0];
< 		} else {
< 			$arrayref = \$_[0];
< 		}
< 	}
< 	$self->{force_escape_tokens} = $arrayref;
< 	return $self;
< }
<
842,847d812
< sub cook_hex {
< 	my $self = shift;
< 	$self->{cook_hex} = defined($_[0]) ? $_[0] : 1;
< 	return $self;
< }
<
2470,2474d2434
< B<force_escape_tokens>, specifies optional tokens that must always be
< escaped. This can be useful when you know that the resulting expression
< will be surrounded by quotes for instance.
< Note that this only works for _lex, not for _fastlex.
<
2500,2502d2459
< B<cook_hex>, controls whether hexadecimal escape secquences, such as
< C<\x00>, should be replaced by the bytes they represent.
<
2914c2871
<   $re->track( 1 )t
---
>   $re->track( 1 );
2932,2938d2888
< =head2 force_escape_tokens(ARRAY REF)
<
< Specifies optional tokens that must always be
< escaped. This can be useful when you know that the resulting expression
< will be surrounded by quotes for instance.
< Note that this only works for _lex, not for _fastlex.
<
2961,2965d2910
< =head2 cook_hex(0|1)
< Controls whether hexadecimal escape secquences, such as
< C<\x00>, should be replaced by the bytes they represent.
< On by default.
<
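To make the intent of force_escape_tokens in the diff above easier to follow, here is a minimal Python sketch of the same idea: put a backslash in front of every listed character that is not already escaped. It is not a port of the Perl patch. The patch applies its substitutions to individual lexer tokens inside Regexp::Assemble, whereas this sketch scans a whole string, and the name `force_escape` is a hypothetical helper.

```python
def force_escape(pattern: str, tokens: str = '"') -> str:
    """Backslash-escape each un-escaped occurrence of any character in `tokens`.

    Characters already preceded by an odd number of backslashes are left alone,
    so applying the function twice gives the same result as applying it once.
    """
    out = []
    backslashes = 0  # length of the backslash run right before the current character
    for char in pattern:
        if char in tokens and backslashes % 2 == 0:
            out.append("\\")
        out.append(char)
        backslashes = backslashes + 1 if char == "\\" else 0
    return "".join(out)


# The character class from the bug report, before and after escaping.
broken = "(?:[\"'])"
print(force_escape(broken))
```

Counting the preceding backslashes, rather than checking only the single previous character, keeps an escaped backslash followed by a bare quote from being mistaken for an escaped quote.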