[bugfix] fix(sql): strip trailing -- comments before OPTIONS regex match #18425

Open
tarun11Mavani wants to merge 1 commit into apache:master from tarun11Mavani:fix/sql-option-comment-parsing

Conversation

Contributor

@tarun11Mavani tarun11Mavani commented May 5, 2026

The legacy OPTIONS regex (e.g. option(skipUpsert=true)) is applied to the raw SQL string before it is passed to the Calcite parser. Because sanitizeSql only stripped trailing whitespace, a query ending with a single-line comment such as

SELECT col1 FROM foo -- option(skipUpsert=true)

would be mistakenly treated as if skipUpsert=true were a real query option, since the regex anchors at end-of-string (\Z).
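
The failure mode can be reproduced in isolation with a small sketch. The pattern below is an illustrative stand-in for the legacy OPTIONS regex (an assumption, not copied from the Pinot source), and `findTrailingOptions` is a hypothetical helper; the point is only that an end-anchored match cannot tell a real option clause from one inside a trailing `--` comment.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrailingCommentOptionDemo {
  // Illustrative stand-in for the legacy OPTIONS pattern (assumption, not Pinot's actual regex).
  // It matches option(...) only when it sits at the end of the input (\Z).
  static final Pattern OPTIONS_PATTERN =
      Pattern.compile("option\\s*\\(([^)]+)\\)\\s*\\Z", Pattern.CASE_INSENSITIVE);

  /** Returns the text inside option(...) if the pattern matches at end of input, else null. */
  static String findTrailingOptions(String sql) {
    Matcher matcher = OPTIONS_PATTERN.matcher(sql);
    return matcher.find() ? matcher.group(1) : null;
  }

  public static void main(String[] args) {
    String genuine = "SELECT col1 FROM foo option(skipUpsert=true)";
    String commented = "SELECT col1 FROM foo -- option(skipUpsert=true)";

    // Both inputs end with option(...), so the raw-string match cannot distinguish them.
    System.out.println(findTrailingOptions(genuine));   // prints "skipUpsert=true"
    System.out.println(findTrailingOptions(commented)); // prints "skipUpsert=true"
  }
}
```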

Fix: scan only the last line of the sanitized SQL for an unquoted -- sequence and remove it (plus any resulting trailing whitespace) before the OPTIONS regex is applied. Block comments (/* ... */) are not affected because they shift the option(...) text away from the end-of-string anchor and therefore never triggered the bug.
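
A minimal sketch of the stripping step described above, not the code in this PR: `stripTrailingLineComment` is a hypothetical name, and the sketch assumes the input has already had trailing whitespace removed. It tracks quote state only within the last line, so a quoted literal that spans lines is outside what it handles.

```java
public class TrailingLineCommentStripper {
  /**
   * Removes a trailing "--" comment from the last line of sql, ignoring any "--" that
   * appears inside single or double quotes. Only the last line is scanned.
   */
  static String stripTrailingLineComment(String sql) {
    int lastLineStart = sql.lastIndexOf('\n') + 1; // 0 when the query is a single line
    char openQuote = 0; // 0 means "not inside a quoted region"
    for (int i = lastLineStart; i < sql.length(); i++) {
      char c = sql.charAt(i);
      if (openQuote != 0) {
        if (c == openQuote) {
          openQuote = 0; // closing quote; a doubled quote just reopens on the next char
        }
      } else if (c == '\'' || c == '"') {
        openQuote = c;
      } else if (c == '-' && i + 1 < sql.length() && sql.charAt(i + 1) == '-') {
        // Unquoted "--": drop it and any whitespace left dangling before it.
        int end = i;
        while (end > 0 && Character.isWhitespace(sql.charAt(end - 1))) {
          end--;
        }
        return sql.substring(0, end);
      }
    }
    return sql; // no unquoted "--" on the last line
  }

  public static void main(String[] args) {
    System.out.println(stripTrailingLineComment("SELECT col1 FROM foo -- option(skipUpsert=true)"));
    System.out.println(stripTrailingLineComment("SELECT '--x' FROM foo"));
    System.out.println(stripTrailingLineComment("SELECT col1 FROM foo /* option(skipUpsert=true) */"));
  }
}
```

In this sketch the block-comment input is returned unchanged; it needs no stripping because it ends in `*/`, so an end-anchored option(...) match finds nothing there.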

Added two regression test cases to CalciteSqlCompilerTest#testQueryOptions:

  • -- option(skipUpsert=true) trailing comment must not set query options
  • /* option(skipUpsert=true) */ block comment must not set query options

Bug reproduced in cluster:

Without skipUpsert=true:
[image]

With skipUpsert=true:
[image]

With skipUpsert=true inside a trailing comment (incorrect: the commented-out option is still applied):
[image]


codecov-commenter commented May 5, 2026

Codecov Report

❌ Patch coverage is 94.23077% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.60%. Comparing base (4e40672) to head (1873fbb).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...java/org/apache/pinot/sql/parsers/ParserUtils.java | 94.23% | 0 Missing and 3 partials ⚠️ |

Additional details and impacted files
@@              Coverage Diff              @@
##             master   #18425       +/-   ##
=============================================
+ Coverage     34.91%   63.60%   +28.69%     
- Complexity      857     1717      +860     
=============================================
  Files          3252     3252               
  Lines        199132   199182       +50     
  Branches      30875    30889       +14     
=============================================
+ Hits          69528   126694    +57166     
+ Misses       123518    62408    -61110     
- Partials       6086    10080     +3994     

| Flag | Coverage Δ |
|---|---|
| custom-integration1 | 100.00% <ø> (ø) |
| integration | 100.00% <ø> (ø) |
| integration1 | 100.00% <ø> (ø) |
| integration2 | 0.00% <ø> (ø) |
| java-21 | 63.60% <94.23%> (+28.69%) ⬆️ |
| temurin | 63.60% <94.23%> (+28.69%) ⬆️ |
| unittests | 63.60% <94.23%> (+28.69%) ⬆️ |
| unittests1 | 55.69% <94.23%> (?) |
| unittests2 | 34.91% <67.30%> (+<0.01%) ⬆️ |


@tarun11Mavani tarun11Mavani force-pushed the fix/sql-option-comment-parsing branch 2 times, most recently from 689b698 to 118f78d Compare May 5, 2026 19:03
@tarun11Mavani tarun11Mavani changed the title fix(sql): strip trailing -- comments before OPTIONS regex match [bugfix] fix(sql): strip trailing -- comments before OPTIONS regex match May 6, 2026
@tarun11Mavani tarun11Mavani force-pushed the fix/sql-option-comment-parsing branch from 118f78d to 63d1600 Compare May 6, 2026 10:07
@tarun11Mavani tarun11Mavani force-pushed the fix/sql-option-comment-parsing branch from 63d1600 to 1873fbb Compare May 6, 2026 11:50
@tarun11Mavani
Contributor Author

@Jackie-Jiang @xiangfu0 Can you take a look?
