SONARJAVA-6443 Narrow S5852 to exponential backtracking and create S8786 #5674

Open
pierre-loup-tristant-sonarsource wants to merge 2 commits into master from plt/sonarjava-6443

pierre-loup-tristant-sonarsource commented Jun 15, 2026

Summary

  • Narrows S5852 (vulnerability) to exponential backtracking only
  • Creates S8786 (code-smell) for remaining super-linear cases (polynomial, quadratic)
  • Extracts shared AbstractRedosCheck base class
  • Updates ruling IT expected results

…smell S8786 (super-linear)

S5852 (vulnerability) is narrowed to only report exponential backtracking.
New rule S8786 (code-smell) covers the remaining super-linear cases
(polynomial, quadratic) that are performance concerns but not vulnerabilities.

hashicorp-vault-sonar-prod Bot commented Jun 15, 2026

SONARJAVA-6443

sonarqubecloud Bot commented Jun 15, 2026

Agentic Analysis: Early Results

Agentic Analysis and Context Augmentation are available on your project. Here are some issues that could have been prevented. Follow the links to learn how to put them into action.

2 issue(s) found across 2 file(s):

| Rule | File | Line | Message |
| --- | --- | --- | --- |
| java:S110 | java-checks/src/main/java/org/sonar/java/checks/regex/AbstractRedosCheck.java | 51 | This class has 6 parents which is greater than 5 authorized. |
| java:S110 | java-checks/src/main/java/org/sonar/java/checks/regex/SuperLinearRegexCheck.java | 23 | This class has 7 parents which is greater than 5 authorized. |

Analyzed by SonarQube Agentic Analysis in 8.1 s

Comment on lines +11 to +16
<li>Replace <code>.</code> with negated character classes to exclude separators where applicable (e.g., <code><strong></strong></code><strong>
instead of <code>.</code></strong> before <code>,</code>).</li>
<li>Use bounded quantifiers such as <code>{1,5}</code> to limit repetitions.</li>
<li>Restructure alternations and quantifiers to eliminate ambiguity — avoid patterns where multiple alternatives can match the same character.</li>
<li>Use possessive quantifiers (<code>+`, `*</code>, <code>?+</code>) or atomic grouping to prevent the regex engine from keeping backtracking
positions.</li>

💡 Quality: S8786.html has broken markup in fix recommendations

In S8786.html the 'How to fix it' list contains malformed HTML. The negated-character-class example renders as an empty <code></code> tag instead of showing something like [^,] (line ~11), and the possessive-quantifier bullet renders literal backticks: <code>+, *</code>, <code>?+</code> instead of ++, *+, ?+ (line ~15). These render as confusing/empty snippets in the rule description shown to users.

Fix:

<li>Replace <code>.</code> with negated character classes to exclude separators where applicable (e.g.,
<code>[^,]</code> instead of <code>.</code> before <code>,</code>).</li>
<li>Use bounded quantifiers such as <code>{1,5}</code> to limit repetitions.</li>
<li>Restructure alternations and quantifiers to eliminate ambiguity.</li>
<li>Use possessive quantifiers (<code>++</code>, <code>*+</code>, <code>?+</code>) or atomic grouping to prevent the
regex engine from keeping backtracking positions.</li>
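
To make the recommendations in this list concrete, here is a minimal Java sketch, assuming a comma-terminated list of fields as input. The class name `RegexFixSketch` and the constant names are hypothetical and not part of this PR; only `java.util.regex.Pattern` is used. The ambiguous pattern has the same shape as the `(.*,)*` example discussed in this review, and the inputs are kept short on purpose so that nothing here triggers heavy backtracking.

```java
import java.util.regex.Pattern;

public class RegexFixSketch {

    // Ambiguous: '.' can also match ',', so the engine has many ways to
    // split the input between iterations of the outer loop.
    static final Pattern AMBIGUOUS = Pattern.compile("(.*,)*");

    // Negated character class: a field can no longer swallow the separator.
    static final Pattern NEGATED = Pattern.compile("([^,]*,)*");

    // Possessive quantifiers: once a repetition has matched, the engine
    // does not keep backtracking positions for it.
    static final Pattern POSSESSIVE = Pattern.compile("([^,]*+,)*+");

    // Bounded quantifiers: at most 5 fields of 1 to 5 characters each.
    static final Pattern BOUNDED = Pattern.compile("([^,]{1,5},){1,5}");

    public static void main(String[] args) {
        String terminated = "a,b,c,";   // every field ends with ','
        String unterminated = "a,b,c";  // last field has no ','
        for (Pattern p : new Pattern[] {AMBIGUOUS, NEGATED, POSSESSIVE, BOUNDED}) {
            System.out.println(p.pattern()
                + " terminated=" + p.matcher(terminated).matches()
                + " unterminated=" + p.matcher(unterminated).matches());
        }
    }
}
```

All four patterns accept and reject the same two sample inputs; the rewritten variants differ only in how much work the engine may do on a failing match. The bounded variant additionally rejects empty fields and long inputs, so it is only a drop-in replacement when those limits fit the data.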

Backreferences disable the Java 9 loop optimization, so
QUADRATIC_WHEN_OPTIMIZED and LINEAR_WHEN_OPTIMIZED with backrefs
remain exponential and must stay in S5852. S8786 no longer reports
LINEAR_WHEN_OPTIMIZED at all (either linear/safe or exponential/S5852).
pierre-loup-tristant-sonarsource changed the title from "SONARJAVA-6443 Split S5852 into vulnerability and code-smell S8786" to "SONARJAVA-6443 Narrow S5852 to exponential backtracking and create S8786" on Jun 15, 2026
gitar-bot Bot commented Jun 15, 2026

Code Review 👍 Approved with suggestions 1 resolved / 2 findings

Splits regex backtracking logic by introducing S8786 for super-linear cases and restricting S5852 to exponential complexity. Correct the broken markup in the S8786 documentation to ensure proper rendering of fix recommendations.

💡 Quality: S8786.html has broken markup in fix recommendations

📄 sonar-java-plugin/src/main/resources/org/sonar/l10n/java/rules/java/S8786.html:11-16

In S8786.html the 'How to fix it' list contains malformed HTML. The negated-character-class example renders as an empty <code></code> tag instead of showing something like [^,] (line ~11), and the possessive-quantifier bullet renders literal backticks: <code>+, *</code>, <code>?+</code> instead of ++, *+, ?+ (line ~15). These render as confusing/empty snippets in the rule description shown to users.

Fix
<li>Replace <code>.</code> with negated character classes to exclude separators where applicable (e.g.,
<code>[^,]</code> instead of <code>.</code> before <code>,</code>).</li>
<li>Use bounded quantifiers such as <code>{1,5}</code> to limit repetitions.</li>
<li>Restructure alternations and quantifiers to eliminate ambiguity.</li>
<li>Use possessive quantifiers (<code>++</code>, <code>*+</code>, <code>?+</code>) or atomic grouping to prevent the
regex engine from keeping backtracking positions.</li>
✅ 1 resolved
Security: Exponential backref regexes downgraded from S5852 to S8786 on Java 9+

📄 java-checks/src/main/java/org/sonar/java/checks/regex/RedosCheck.java:29-35 📄 java-checks/src/main/java/org/sonar/java/checks/regex/SuperLinearRegexCheck.java:28-38 📄 java-checks-test-sources/default/src/main/java/checks/regex/RedosCheckSample.java 📄 java-checks-test-sources/default/src/main/java/checks/regex/RedosCheckJava8.java
The original RedosCheck.message() classified QUADRATIC_WHEN_OPTIMIZED and LINEAR_WHEN_OPTIMIZED as exponential (the security-relevant case) whenever optimized was false, where optimized = isJava9OrHigher() && !regexContainsBackReference. A capturing-group backreference disables the Java 9 loop optimization, so e.g. (.*,)*\1 and (?:.*,)*(X)\1 were exponential ReDoS even on Java 9+ and were reported by S5852 with the "exponential runtime" message.

After the split, RedosCheck.buildMessage() (S5852) no longer consults regexContainsBackReference at all:

case QUADRATIC_WHEN_OPTIMIZED -> isJava9OrHigher() ? Optional.empty() : Optional.of(MESSAGE);
default -> Optional.empty();  // LINEAR_WHEN_OPTIMIZED never reported

So on Java 9+ these backreference cases return empty from S5852 and are instead reported only by SuperLinearRegexCheck (S8786) with the message "...has super-linear performance due to backtracking." This is a genuine regression: an exponential ReDoS vulnerability is reclassified as a polynomial/super-linear code smell, and the user-facing message now understates the severity (calls exponential behaviour "super-linear"). The relabelled test expectations in RedosCheckSample.java (lines 34-35) and RedosCheckJava8.java (line 51) bake in this downgrade.

If the intent is truly to keep S5852 = exponential only, then the genuinely-exponential not-optimized cases of QUADRATIC_WHEN_OPTIMIZED/LINEAR_WHEN_OPTIMIZED should remain in S5852 (using the optimized flag, not just isJava9OrHigher()), with S8786 covering only the actually-polynomial cases to avoid both miscategorization and double reporting.
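
The split suggested in this finding can be sketched as follows. This is a hedged illustration, not the code in `RedosCheck` or `SuperLinearRegexCheck`: the class `RedosSplitSketch`, the method `ruleFor`, and the enum are hypothetical stand-ins, and the categories `ALWAYS_EXPONENTIAL`, `ALWAYS_QUADRATIC` and `NO_ISSUE` are assumed for completeness (only `QUADRATIC_WHEN_OPTIMIZED` and `LINEAR_WHEN_OPTIMIZED` are named in this review). The point is that the decision uses the combined `optimized` flag rather than the Java version alone.

```java
import java.util.Optional;

public class RedosSplitSketch {

    // Hypothetical stand-in for the analyzer's categories. Only the two
    // *_WHEN_OPTIMIZED names appear in the review; the others are assumed.
    enum BacktrackingType {
        ALWAYS_EXPONENTIAL, QUADRATIC_WHEN_OPTIMIZED, LINEAR_WHEN_OPTIMIZED, ALWAYS_QUADRATIC, NO_ISSUE
    }

    /** Returns the rule key that should report the regex, or empty if neither rule applies. */
    static Optional<String> ruleFor(BacktrackingType type, boolean java9OrHigher, boolean containsBackReference) {
        // A backreference disables the Java 9 loop optimization, so the
        // regex counts as optimized only when both conditions hold.
        boolean optimized = java9OrHigher && !containsBackReference;
        return switch (type) {
            case ALWAYS_EXPONENTIAL -> Optional.of("S5852");
            // Exponential unless the optimization applies, then quadratic.
            case QUADRATIC_WHEN_OPTIMIZED -> Optional.of(optimized ? "S8786" : "S5852");
            // Exponential unless the optimization applies, then safe.
            case LINEAR_WHEN_OPTIMIZED -> optimized ? Optional.<String>empty() : Optional.of("S5852");
            case ALWAYS_QUADRATIC -> Optional.of("S8786");
            case NO_ISSUE -> Optional.<String>empty();
        };
    }

    public static void main(String[] args) {
        // Java 9+ with a backreference: stays in the vulnerability rule.
        System.out.println(ruleFor(BacktrackingType.QUADRATIC_WHEN_OPTIMIZED, true, true));
        // Java 9+ without a backreference: reported as a code smell only.
        System.out.println(ruleFor(BacktrackingType.QUADRATIC_WHEN_OPTIMIZED, true, false));
    }
}
```

Because each category maps to at most one rule key, a sketch of this shape also avoids the double reporting mentioned above.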

