Reuse the precompiled regex for collection variable selectors #3573

Open
chweidling wants to merge 2 commits into owasp-modsecurity:v3/master from chweidling:perf/precompiled-regex-collection-selector

@chweidling

what

  • Add a Collection::resolveRegularExpression(Utils::Regex *) overload that takes an already-compiled regex, and override it in InMemoryPerProcess.
  • Tx_DictElementRegexp (the TX:/regex/ variable selector) now passes its pre-compiled m_r instead of the pattern string.
  • The base class keeps the previous behaviour by default (it compiles from r->pattern and delegates), so backends that don't override it — and the existing compartment-prefixed string overloads — are unchanged.
  • Add a regression test for TX:/regex/ selection.
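
To illustrate the shape of the change listed above, here is a minimal, self-contained sketch. It is not the ModSecurity code: `Regex` is a hypothetical stand-in for Utils::Regex built on `std::regex` (the real class wraps PCRE2), `LegacyBackend` and `InMemoryBackend` are illustrative names, the output-vector parameter is an assumption, and `compile_count` exists only to make the number of compilations visible.

```cpp
#include <cassert>
#include <map>
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-in for Utils::Regex: keeps the pattern and its compiled form.
struct Regex {
    std::string pattern;
    std::regex compiled;
    static int compile_count;  // instrumentation for this sketch only
    explicit Regex(const std::string &p)
        : pattern(p), compiled(p, std::regex::icase) { ++compile_count; }
};
int Regex::compile_count = 0;

class Collection {
 public:
    virtual ~Collection() = default;
    // String overload: each backend compiles the pattern itself.
    virtual void resolveRegularExpression(const std::string &pattern,
                                          std::vector<std::string> *out) = 0;
    // Pre-compiled overload. The default keeps the old behaviour:
    // it delegates to the string overload, which compiles from r->pattern.
    virtual void resolveRegularExpression(Regex *r,
                                          std::vector<std::string> *out) {
        resolveRegularExpression(r->pattern, out);
    }
};

// A backend that does not override the new overload (unchanged behaviour).
class LegacyBackend : public Collection {
 public:
    using Collection::resolveRegularExpression;
    std::map<std::string, std::string> data;
    void resolveRegularExpression(const std::string &pattern,
                                  std::vector<std::string> *out) override {
        Regex r(pattern);  // fresh compile on every call
        for (const auto &kv : data)
            if (std::regex_search(kv.first, r.compiled)) out->push_back(kv.second);
    }
};

// A backend that overrides the new overload and scans with the supplied regex.
class InMemoryBackend : public Collection {
 public:
    std::map<std::string, std::string> data;
    void resolveRegularExpression(const std::string &pattern,
                                  std::vector<std::string> *out) override {
        Regex r(pattern);
        resolveRegularExpression(&r, out);
    }
    void resolveRegularExpression(Regex *r,
                                  std::vector<std::string> *out) override {
        for (const auto &kv : data)
            if (std::regex_search(kv.first, r->compiled)) out->push_back(kv.second);
    }
};

// Number of Regex compilations triggered by n lookups with one selector.
int compiles_for_lookups(Collection &c, Regex *selector, int n) {
    const int before = Regex::compile_count;
    for (int i = 0; i < n; ++i) {
        std::vector<std::string> out;
        c.resolveRegularExpression(selector, &out);
    }
    return Regex::compile_count - before;
}

// Both backends must select the same values for the same selector.
void check_same_results(Collection &a, Collection &b, Regex *selector) {
    std::vector<std::string> ra, rb;
    a.resolveRegularExpression(selector, &ra);
    b.resolveRegularExpression(selector, &rb);
    assert(ra == rb);
}
```

In this sketch the caller compiles the selector once and passes a pointer on every lookup; a backend that overrides the pointer overload performs no further compilations, while one that does not override it still works and simply recompiles as before.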

why

  • The collection backends' resolveRegularExpression(const std::string&, ...) built a fresh Utils::Regex — i.e. a pcre2_compile() and pcre2_jit_compile() — on every call.
  • For a regex variable selector such as TX:/regex/ that is evaluated per transaction (e.g. CRS 921180), this recompiled the same pattern on every request, even though the calling VariableRegex already holds it compiled once at configuration time in m_r. That compiled regex was simply being ignored.
  • Behaviour is unchanged: m_r is constructed with the same arguments — Utils::Regex(pattern, /*ignoreCase=*/true) — that the backend used, so the identical regex is applied; it is just compiled once instead of per transaction.
  • Measured (built against v3/master, gcc -O2, system PCRE2 with JIT):
    • Microbenchmark of the resolution call: reusing the compiled regex vs recompiling is ~1.6× faster for a short selector, and the saving grows with pattern length (it is the eliminated compile + JIT).
    • End-to-end, a standalone TX:/…/ rule over 100k transactions: ~64.9 → 62.2 µs/tx (about 4% higher throughput); the gain scales with the number of regex selectors evaluated per request.
  • No behaviour change: the full regression suite passes (634 passed, 80 skipped — unchanged), including the new test.
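
For readers who want to reproduce the kind of comparison described above, the following is a rough sketch of how such a microbenchmark could be structured. It uses `std::regex` instead of PCRE2 with JIT, so absolute timings and ratios will differ from the figures in this PR; the function names are hypothetical. `throughput_gain_percent` only restates the arithmetic behind the end-to-end figure (64.9 / 62.2 ≈ 1.043).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Old pattern: compile the selector on every lookup.
std::size_t lookups_recompiling(const std::vector<std::string> &keys,
                                const std::string &pattern, int iterations) {
    std::size_t hits = 0;
    for (int i = 0; i < iterations; ++i) {
        std::regex re(pattern, std::regex::icase);  // per-call compile
        for (const auto &k : keys)
            if (std::regex_search(k, re)) ++hits;
    }
    return hits;
}

// New pattern: compile once, reuse for every lookup.
std::size_t lookups_reusing(const std::vector<std::string> &keys,
                            const std::string &pattern, int iterations) {
    const std::regex re(pattern, std::regex::icase);  // compiled once
    std::size_t hits = 0;
    for (int i = 0; i < iterations; ++i)
        for (const auto &k : keys)
            if (std::regex_search(k, re)) ++hits;
    return hits;
}

// Wall-clock time of a callable, in microseconds.
template <typename F>
double elapsed_us(F &&f) {
    const auto t0 = std::chrono::steady_clock::now();
    f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(t1 - t0).count();
}

// Throughput gain implied by a drop in per-transaction time.
double throughput_gain_percent(double before_us, double after_us) {
    return (before_us / after_us - 1.0) * 100.0;
}

// Both variants must agree on the result before their timings are compared.
void check_equivalence(const std::vector<std::string> &keys,
                       const std::string &pattern, int iterations) {
    assert(lookups_recompiling(keys, pattern, iterations) ==
           lookups_reusing(keys, pattern, iterations));
}
```

A caller would wrap each variant in `elapsed_us` with the same keys, pattern, and iteration count, then compare the two durations after confirming with `check_equivalence` that the match counts agree.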

references

chweidling and others added 2 commits June 10, 2026 11:27
resolveRegularExpression() in the collection backends compiled (and
JIT-compiled) a fresh Utils::Regex from the pattern string on every
call. For a regex variable selector such as TX:/regex/ that is evaluated
per transaction, this recompiled the same pattern on every request -
even though the calling VariableRegex already holds it compiled once at
configuration time in its m_r member.

Add a Collection::resolveRegularExpression(Utils::Regex *) overload that
accepts the pre-compiled regex. The base class keeps the previous
behaviour by default (it compiles from r->pattern and delegates), so
backends that do not override it are unaffected; InMemoryPerProcess
overrides it to scan the collection with the supplied regex directly.
Tx_DictElementRegexp now passes its already-compiled &m_r instead of the
pattern string.

Behaviour is unchanged: m_r is constructed with the same arguments
(Utils::Regex(pattern, /*ignoreCase=*/true)) the backend used, so the
identical regex is applied - it is just compiled once instead of per
transaction. A regression test covering TX:/regex/ selection is added.

@chweidling

Author

The failing cppcheck check is unrelated to this PR. On a clean v3/master
checkout with no changes, make check-static already fails under cppcheck
2.21.0 (and passes under 2.20.0) — the job installs cppcheck unpinned, so the
analyzer version drifts. All 31 findings are in files this PR doesn't touch.

Filed as #3574.

All other CI jobs (Linux x32/x64 gcc+clang, all configs, macOS) pass.
