Skip to content

fix(tls): log transient SSL factory renewal failures at WARN instead of INFO#18403

Open
dkranchii wants to merge 1 commit intoapache:masterfrom
dkranchii:fix/renewable-tls-utils-warn-log-level
Open

fix(tls): log transient SSL factory renewal failures at WARN instead of INFO#18403
dkranchii wants to merge 1 commit intoapache:masterfrom
dkranchii:fix/renewable-tls-utils-warn-log-level

Conversation

@dkranchii
Copy link
Copy Markdown
Contributor

Bumps the log level from INFO to WARN in the retry catch block of
RenewableTlsUtils.reloadSslFactory, so transient SSL renewal failures
are visible to anyone monitoring at WARN+ (which is most prod log
pipelines).

Also tightens the message: the old text ended with "... truststore {}) on "
which looked truncated. The new wording stays accurate even on the
final retry, so it doesn't contradict the LOGGER.error that fires
right after when all 3 attempts fail.

Diff

} catch (Exception e) {
- LOGGER.info(
+ LOGGER.warn(
      "reloadSslFactory :: Encountered issues when renewing SSLFactory "
      + "{} (built from key store {} and "
-     + "truststore {}) on ", baseSslFactory, keyStorePath, trustStorePath, e);
+     + "truststore {}), retrying if attempts remain",
+     baseSslFactory, keyStorePath, trustStorePath, e);
  return false;
}

Scope

Single file: pinot-common/src/main/java/org/apache/pinot/common/utils/tls/RenewableTlsUtils.java
No behavior change, no config / SPI / wire-format change

Testing

Existing RenewableTlsUtilsTest still passes (catch block isn't currently covered by a test, so no test changes needed).
Manually verified the new message renders correctly when an exception is thrown — the SLF4J trailing Throwable argument still produces the full stack trace.

Release note

none (observability-only change)

Fixes #18402

…ry renewal failures in RenewableTlsUtils

Transient failures in reloadSslFactory (e.g. cert files mid-rotation)
were silently swallowed at INFO level, making them invisible to
operators who rely on WARN/ERROR dashboards to detect TLS issues.

Changed LOGGER.info to LOGGER.warn in the retry catch block so that
each transient failure is surfaced at the appropriate level. Also
improved the message from the truncated "...truststore {}) on " to
"...truststore {}), retrying if attempts remain" — neutral wording that
remains accurate on the final exhausted attempt, before the existing
LOGGER.error fires.
@dkranchii dkranchii force-pushed the fix/renewable-tls-utils-warn-log-level branch from dbcc287 to a63fb42 Compare May 2, 2026 08:20
@dkranchii
Copy link
Copy Markdown
Contributor Author

@Jackie-Jiang can you review this PR.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 4, 2026

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 63.46%. Comparing base (80fb0e8) to head (a63fb42).
⚠️ Report is 9 commits behind head on master.

Files with missing lines Patch % Lines
...ache/pinot/common/utils/tls/RenewableTlsUtils.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##             master   #18403   +/-   ##
=========================================
  Coverage     63.46%   63.46%           
  Complexity     1701     1701           
=========================================
  Files          3254     3254           
  Lines        199104   199104           
  Branches      30830    30830           
=========================================
+ Hits         126354   126366   +12     
+ Misses        62665    62650   -15     
- Partials      10085    10088    +3     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-21 63.46% <0.00%> (+<0.01%) ⬆️
temurin 63.46% <0.00%> (+<0.01%) ⬆️
unittests 63.46% <0.00%> (+<0.01%) ⬆️
unittests1 55.41% <0.00%> (-0.01%) ⬇️
unittests2 34.99% <0.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Transient SSL factory renewal failures are logged at INFO instead of WARN in RenewableTlsUtils

2 participants