add broker config options for sql log redaction by jadami10 · Pull Request #18430 · apache/pinot

jadami10 · 2026-05-05T21:11:37Z

This is both a bugfix and a new feature to support query redaction.

By default, query logs are not redacted.

With literal_values, we use the query fingerprint to log only the redacted query, with no literal values. This is useful if folks still want the structure of the query without potentially leaking PII.

This also fixes a bug where query fingerprinting was modifying the AST in place and breaking queries. This closes #18426.

The final option is full redaction. This is good if you want no SQL ending up in your logging system.

I tested all options internally on a QA cluster. We plan to stick with full redaction going forward.
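As a rough illustration of what the three modes are meant to log, here is a minimal sketch. It is not the code in this PR: the class name, the `[REDACTED]` placeholder, and the regex-based literal stripping are all assumptions made for the example. The PR itself uses the query fingerprint rather than regexes.

```java
import java.util.regex.Pattern;

// Illustrative only: a regex stand-in for fingerprint-based redaction.
final class RedactionModesDemo {
  // Mode names mirror the options described above; the enum itself is hypothetical.
  enum Mode { NONE, LITERAL_VALUES, FULL }

  // Hypothetical placeholder; the PR does not state what full redaction logs.
  static final String PLACEHOLDER = "[REDACTED]";

  // Single-quoted SQL string literals, allowing '' as an escaped quote.
  private static final Pattern STRING_LITERAL = Pattern.compile("'(?:[^']|'')*'");
  // Standalone integer or decimal literals; digits inside identifiers like col1 are skipped.
  private static final Pattern NUMERIC_LITERAL = Pattern.compile("\\b\\d+(?:\\.\\d+)?\\b");

  static String redact(String sql, Mode mode) {
    switch (mode) {
      case NONE:
        // Default behavior: log the query unchanged.
        return sql;
      case LITERAL_VALUES: {
        // Strip string literals first so digits inside them are not matched separately.
        String noStrings = STRING_LITERAL.matcher(sql).replaceAll("?");
        return NUMERIC_LITERAL.matcher(noStrings).replaceAll("?");
      }
      default:
        // FULL: no SQL reaches the log at all.
        return PLACEHOLDER;
    }
  }
}
```

For `SELECT name FROM users WHERE email = 'a@b.com' AND age > 30`, the `LITERAL_VALUES` branch of this sketch yields `SELECT name FROM users WHERE email = ? AND age > ?`, which keeps the query shape while dropping the values.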

codecov-commenter · 2026-05-05T22:07:14Z

Codecov Report

❌ Patch coverage is 31.37255% with 70 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.65%. Comparing base (b870804) to head (871a7e2).
⚠️ Report is 14 commits behind head on master.

Files with missing lines	Patch %	Lines
...sthandler/BaseSingleStageBrokerRequestHandler.java	14.81%	43 Missing and 3 partials ⚠️
...requesthandler/MultiStageBrokerRequestHandler.java	9.09%	17 Missing and 3 partials ⚠️
.../org/apache/pinot/broker/querylog/QueryLogger.java	80.95%	4 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #18430      +/-   ##
============================================
+ Coverage     63.61%   63.65%   +0.04%     
- Complexity     1717     1735      +18     
============================================
  Files          3252     3254       +2     
  Lines        199051   199501     +450     
  Branches      30838    30984     +146     
============================================
+ Hits         126618   126993     +375     
- Misses        62352    62370      +18     
- Partials      10081    10138      +57

Flag	Coverage Δ
custom-integration1	`100.00% <ø> (ø)`
integration	`100.00% <ø> (ø)`
integration1	`100.00% <ø> (ø)`
integration2	`0.00% <ø> (ø)`
java-21	`63.65% <31.37%> (+0.04%)`	⬆️
temurin	`63.65% <31.37%> (+0.04%)`	⬆️
unittests	`63.65% <31.37%> (+0.04%)`	⬆️
unittests1	`55.72% <100.00%> (+0.07%)`	⬆️
unittests2	`34.97% <31.37%> (-0.01%)`	⬇️


xiangfu0

Found two high-signal SQL redaction gaps; see inline comments.

xiangfu0 · 2026-05-06T12:14:10Z

+        return valueOf(value.toUpperCase());
+      } catch (IllegalArgumentException e) {
+        LOGGER.warn("Invalid SQL redaction mode '{}', defaulting to NONE", value);
+        return NONE;


This fails open on misconfiguration. If an operator sets an invalid pinot.broker.query.log.sqlRedaction value, we silently fall back to NONE and start emitting raw SQL, which is the exact unsafe behavior this knob is supposed to prevent. For a privacy feature, the safer behavior is to reject startup or fail closed to a redacted mode instead of disabling redaction.

ya, really good point. I've updated this for now while I think about your other comment.
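For reference, the fail-closed behavior discussed above could look roughly like the sketch below. This is an assumption-laden example, not the updated PR code: the class name is hypothetical, and treating a missing or blank value as `NONE` is inferred from the description's statement that logs are not redacted by default.

```java
import java.util.Locale;

// Illustrative only: parse the redaction mode and fail closed on bad input.
final class FailClosedParsing {
  enum Mode { NONE, LITERAL_VALUES, FULL }

  static Mode parse(String value) {
    if (value == null || value.trim().isEmpty()) {
      // Assumption: an unset option keeps the documented default of no redaction.
      return Mode.NONE;
    }
    try {
      // Locale.ROOT avoids locale-specific case mapping (e.g. the Turkish dotless i).
      return Mode.valueOf(value.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException e) {
      // A typo in a privacy setting must not disable redaction, so pick the strictest mode.
      System.err.println("Invalid SQL redaction mode '" + value + "', defaulting to FULL");
      return Mode.FULL;
    }
  }
}
```

With this shape, `parse("literal_values")` gives `LITERAL_VALUES`, while a misspelled value such as `parse("literal-values")` gives `FULL` rather than `NONE`.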

xiangfu0 · 2026-05-06T12:14:10Z

@@ -332,6 +338,8 @@ protected BrokerResponse handleRequest(long requestId, String query, SqlNodeAndO
      }
    }


If fingerprint generation failed above, this still hands the raw SQL to the query logger and the request-handler warning path already logged it once. The same pattern also exists on other broker error paths that still log query directly, so literal_values and especially full do not actually guarantee that SQL stays out of broker logs. We need a shared redaction helper for every broker-side query log before advertising this as broker SQL redaction.

Another great catch. From what I initially found, the queries are all logged from BaseSingleStageBrokerRequestHandler and MultiStageBrokerRequestHandler. My thinking is to start by exposing redactQuery as a method on QueryLogger and have both classes use it. This minimizes the number of changes and doesn't require a global redaction config that all classes need access to right now. It does leave the door open to that pattern if needed in the future.

What do you think?
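A minimal sketch of the shared-helper idea, to make the proposal concrete. Everything here is hypothetical: the class stands in for `QueryLogger`, the fingerprinter is passed in as a plain function, and the `[REDACTED]` placeholder is invented for the example. The one point it tries to capture is that a fingerprinting failure falls back to the placeholder, never to the raw SQL.

```java
import java.util.function.UnaryOperator;

// Illustrative only: one redaction entry point that every broker log call could share.
final class SharedRedactionHelper {
  enum Mode { NONE, LITERAL_VALUES, FULL }

  // Hypothetical placeholder text for fully redacted queries.
  static final String PLACEHOLDER = "[REDACTED]";

  private final Mode mode;
  // Stand-in for query fingerprinting; it may throw on queries it cannot handle.
  private final UnaryOperator<String> fingerprinter;

  SharedRedactionHelper(Mode mode, UnaryOperator<String> fingerprinter) {
    this.mode = mode;
    this.fingerprinter = fingerprinter;
  }

  // Callers log the return value instead of the raw query string.
  String redactQuery(String sql) {
    switch (mode) {
      case NONE:
        return sql;
      case LITERAL_VALUES:
        try {
          return fingerprinter.apply(sql);
        } catch (RuntimeException e) {
          // Fail closed: if fingerprinting breaks, do not hand raw SQL to the log.
          return PLACEHOLDER;
        }
      default:
        return PLACEHOLDER;
    }
  }
}
```

Both request handlers would then route warning and error messages through `redactQuery(query)` rather than interpolating `query` directly, which is the gap the review comment points at.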

add broker config options for sql log redaction

99ace8d

jadami-stripe added 2 commits May 5, 2026 18:24

do not parse twice

7c899e9

fix checkstyle

cf497a0

xiangfu0 reviewed May 6, 2026


jadami-stripe added 2 commits May 6, 2026 17:36

default to FULL redaction when failing to parse

77c9c84

redact other logs too

871a7e2

jadami10 wants to merge 5 commits into apache:master from jadami10:jadami/oss-redact-query-sql

4 participants
