Add possibility to configure mTLS validityDays and renewalDays for each KafkaUser #12658

Open
im-konge wants to merge 3 commits into strimzi:main from im-konge:kafka-user-validity-and-renewal-days

Member

@im-konge im-konge commented Apr 20, 2026

Type of change

  • Enhancement / new feature

Description

This PR implements the proposal for configurable validityDays and renewalDays per KafkaUser. As described in the proposal, it adds validityDays and renewalDays to the KafkaUser CRD. Both fields are allowed only when the authentication type is tls, and both must be higher than 0; CEL validation enforces these constraints.

Because the code now checks the validity period (whether the current certificate would already be expired, or whether its validity period exceeds the new one), I had to generate the client certificate and key inside the MockCertManager class. The previous certificate was valid for something like 100 years, which would trigger an immediate certificate renewal every time. The generated certificate passes the validity period check, so the remaining checks run as before. I also removed some unused methods.

Also, the values in KafkaUserModelCertificateHandlingTest are lowered from 1000 and 500 to 365 and 182 to match the validity of the current certificate stored in USER_CRT_FOR_EXPIRATION_TEST.

Fixes #12336

Checklist

  • Write tests
  • Make sure all tests pass
  • Update documentation
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
  • Reference relevant issue(s) and close them after merging
  • Update CHANGELOG.md

@im-konge im-konge added this to the 1.1.0 milestone Apr 20, 2026
@im-konge im-konge self-assigned this Apr 20, 2026

codecov Bot commented Apr 20, 2026

Codecov Report

❌ Patch coverage is 54.54545% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.01%. Comparing base (42522ab) to head (2cf2705).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...main/java/io/strimzi/operator/common/model/Ca.java | 0.00% | 15 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #12658      +/-   ##
============================================
- Coverage     75.07%   75.01%   -0.06%     
- Complexity     6513     6517       +4     
============================================
  Files           377      377              
  Lines         25092    25124      +32     
  Branches       3268     3276       +8     
============================================
+ Hits          18838    18848      +10     
- Misses         4914     4934      +20     
- Partials       1340     1342       +2     
| Files with missing lines | Coverage Δ |
| --- | --- |
| ...io/strimzi/operator/user/model/KafkaUserModel.java | 83.25% <100.00%> (-0.55%) ⬇️ |
| ...main/java/io/strimzi/operator/common/model/Ca.java | 41.59% <0.00%> (-1.47%) ⬇️ |

... and 3 files with indirect coverage changes

@im-konge
Member Author

/gha run pipeline=regression,upgrade


github-actions Bot commented Apr 20, 2026

⏳ System test verification started: link

The following 10 job(s) will be executed:

  • regression-brokers-and-security-amd64 (oracle-vm-8cpu-32gb-x86-64)
  • regression-operators-amd64 (oracle-vm-8cpu-32gb-x86-64)
  • regression-operands-amd64 (oracle-vm-8cpu-32gb-x86-64)
  • regression-brokers-and-security-arm64 (oracle-vm-8cpu-32gb-arm64)
  • regression-operators-arm64 (oracle-vm-8cpu-32gb-arm64)
  • regression-operands-arm64 (oracle-vm-8cpu-32gb-arm64)
  • upgrade-azp_kraft_upgrade-amd64 (oracle-vm-4cpu-16gb-x86-64)
  • upgrade-azp_kafka_upgrade-amd64 (oracle-vm-4cpu-16gb-x86-64)
  • upgrade-azp_kraft_upgrade-arm64 (oracle-vm-4cpu-16gb-arm64)
  • upgrade-azp_kafka_upgrade-arm64 (oracle-vm-4cpu-16gb-arm64)

Tests will start after successful build completion.

@github-actions

🎉 System test verification passed: link

@im-konge im-konge force-pushed the kafka-user-validity-and-renewal-days branch from 16e3c12 to ae7ec92 Compare April 21, 2026 11:11
Signed-off-by: Lukas Kral <lukywill16@gmail.com>

finish implementation

Signed-off-by: Lukas Kral <lukywill16@gmail.com>

fix tests

Signed-off-by: Lukas Kral <lukywill16@gmail.com>

add changelog

Signed-off-by: Lukas Kral <lukywill16@gmail.com>

same value of validityDays and renewalDays in KafkaUserModelCertificateHandlingTest

Signed-off-by: Lukas Kral <lukywill16@gmail.com>

crds 🤦

Signed-off-by: Lukas Kral <lukywill16@gmail.com>

update API docs and add ST for this change

Signed-off-by: Lukas Kral <lukywill16@gmail.com>
@im-konge im-konge force-pushed the kafka-user-validity-and-renewal-days branch from ae7ec92 to 8d6cfed Compare April 21, 2026 11:13
Signed-off-by: Lukas Kral <lukywill16@gmail.com>
@im-konge im-konge force-pushed the kafka-user-validity-and-renewal-days branch from 4b32af4 to 7d3f1cd Compare April 21, 2026 11:15
@im-konge im-konge requested review from ppatierno and scholzj April 21, 2026 11:15
@im-konge im-konge marked this pull request as ready for review April 21, 2026 11:15
Signed-off-by: Lukas Kral <lukywill16@gmail.com>
@im-konge
Member Author

/gha run pipeline=regression,upgrade


github-actions Bot commented Apr 21, 2026

⏳ System test verification started: link

The following 10 job(s) will be executed:

  • regression-brokers-and-security-amd64 (oracle-vm-8cpu-32gb-x86-64)
  • regression-operators-amd64 (oracle-vm-8cpu-32gb-x86-64)
  • regression-operands-amd64 (oracle-vm-8cpu-32gb-x86-64)
  • regression-brokers-and-security-arm64 (oracle-vm-8cpu-32gb-arm64)
  • regression-operators-arm64 (oracle-vm-8cpu-32gb-arm64)
  • regression-operands-arm64 (oracle-vm-8cpu-32gb-arm64)
  • upgrade-azp_kraft_upgrade-amd64 (oracle-vm-4cpu-16gb-x86-64)
  • upgrade-azp_kafka_upgrade-amd64 (oracle-vm-4cpu-16gb-x86-64)
  • upgrade-azp_kraft_upgrade-arm64 (oracle-vm-4cpu-16gb-arm64)
  • upgrade-azp_kafka_upgrade-arm64 (oracle-vm-4cpu-16gb-arm64)

Tests will start after successful build completion.

@github-actions

🎉 System test verification passed: link

Comment on lines +38 to +41
@CelValidation.CelValidationRule(
    rule = "self > 0",
    message = "'validityDays' has to be higher than 0."
)
Member

Does this work when it is not set?

Member Author

There is no validation if it is not set. I tried that. Or, to be more precise, I didn't hit any issue when I was testing the feature.
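
The behaviour discussed here can be illustrated with a minimal Java sketch. It assumes, in line with what the author observed, that a CEL rule attached to an optional field is evaluated only when the field is present. The class and method names are hypothetical and only mimic that behaviour; this is not the Strimzi or Kubernetes implementation.

```java
import java.util.Optional;

/** Hypothetical stand-in for a field-scoped "self > 0" rule; not Strimzi code. */
final class PositiveDaysRuleSketch {
    private PositiveDaysRuleSketch() { }

    /**
     * Returns an error message when the value is present and not positive.
     * A null value (field not set) is skipped, mirroring a rule that is
     * only evaluated when the field exists in the resource.
     */
    static Optional<String> validate(String fieldName, Integer days) {
        if (days == null) {
            return Optional.empty(); // field not set: nothing to validate
        }
        if (days > 0) {
            return Optional.empty(); // rule satisfied
        }
        return Optional.of("'" + fieldName + "' has to be higher than 0.");
    }
}
```

Under these assumptions, `validate("validityDays", null)` and `validate("validityDays", 30)` both return an empty result, while `validate("validityDays", 0)` returns the error message.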

})
@Description(
"Number of days for which the user certificate should be valid. " +
"If not configured, default User Operator value is used. " +
Member

Given the majority of uses are from the Kafka CR, should we say that by default it uses the Clients CA configuration? (here as well as below)

Member Author

Yeah I can change that, good point, thanks.

*
* @return True if the certificate should be renewed due to new validity period. False otherwise.
*/
public boolean requiresImmediateRenewalDueToValidityChange(Secret secret, String certKey) {
Member

Why is this needed? Why not have it behave as with other certificates?

Member

Also, why does it belong here?

Member Author

> Why is this needed? Why not have it behave as with other certificates?

I added this because of the approved proposal:

> If validityDays is set or reduced on an existing KafkaUser whose current certificate would already be expired or would exceed the new validity period, the User Operator must renew the certificate immediately on the next reconciliation rather than waiting for the natural renewal window.

Or am I misunderstanding this sentence in some way?

Member Author

> Also, why does it belong here?

I added it there because there are also other methods that check whether the certificate is expiring, etc.
But maybe a better place would be ClientsCa.

Member

I do not follow this sentence. In all our other cases, the validity is used for new certificates. The renewal period (or manual annotation) is used to decide when to renew it.

Member

> > Also, why does it belong here?
>
> I added it there because there are also other methods that check whether the certificate is expiring, etc. But maybe a better place would be ClientsCa.

I do not think any other certificate is using it, because it is a completely unique thing in this PR. So it probably should not be mixed into Ca at all.

Member Author

> I do not follow this sentence. In all our other cases, the validity is used for new certificates. The renewal period (or manual annotation) is used to decide when to renew it.

Well, I just implemented what was approved and written in the proposal. Without that, there would be far fewer changes in the tests and everywhere else.

Maybe I completely misunderstood the sentence. But that's how I understand it: if the current certificate's validity would exceed the new one, or the current certificate would already be expired under the new validity, renew the certificate immediately.

But if it doesn't make sense, I think I should update the proposal and remove this block of code.

@sebastiangaiser did I understand the sentence incorrectly?
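
One possible reading of the quoted proposal sentence can be sketched in Java as follows. This is a minimal sketch under stated assumptions, not the code from this PR: the class name, the method signature, and the choice to measure the new validity period from the certificate's notBefore timestamp are all hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch of the proposal's immediate-renewal rule; not the PR's code. */
final class ValidityChangeSketch {
    private ValidityChangeSketch() { }

    /**
     * Returns true when the existing certificate should be renewed right away
     * because it does not fit into the newly configured validity period.
     * Assumption: the new period is counted from the certificate's notBefore.
     */
    static boolean requiresImmediateRenewal(Instant notBefore, Instant notAfter,
                                            int newValidityDays, Instant now) {
        // Latest expiry the certificate could have under the new validity period.
        Instant allowedNotAfter = notBefore.plus(Duration.ofDays(newValidityDays));
        // Under the new validity, the certificate would already have expired.
        boolean expiredUnderNewValidity = !allowedNotAfter.isAfter(now);
        // The certificate lives longer than the new validity period allows.
        boolean exceedsNewValidity = notAfter.isAfter(allowedNotAfter);
        return expiredUnderNewValidity || exceedsNewValidity;
    }
}
```

With a certificate issued for 365 days, this sketch returns true when the new validity is reduced to 30 or 180 days and false when it stays at 365 or is raised to 400. In the reviewers' alternative, the new validity would apply only to the next certificate, and this check would not exist at all.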

Comment on lines +356 to +358
if (renewalDays >= validityDays) {
    throw new InvalidResourceException("spec.authentication.renewalDays is higher or equal to spec.authentication.validityDays. It should be configured to be lower than spec.authentication.validityDays.");
}
Member

Should you validate this also in CEL? And while a validation in code might be useful, it should probably happen somewhere in some constructor?

Member Author

I didn't use CEL for this one because of a possible awkward combination: a default validity/renewal value from the operator config could be used together with just one of the two fields configured in the KafkaUser.
Say validityDays is set to 30 in the KafkaUser and renewalDays is not set in the KafkaUser's spec. The default renewalDays would then be used, and that can be 30 as well.
That is not something you can check using CEL.

And I can look for a better place to put this one; maybe it will then be less confusing where those two variables are configured.
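
The fallback case described in this comment can be sketched in Java as follows. The class, method, and parameter names are hypothetical, and IllegalArgumentException stands in for the InvalidResourceException used in the PR; the sketch only shows why the check needs the operator defaults, which are not visible to a CEL rule on the KafkaUser.

```java
/** Hypothetical sketch: per-user values fall back to User Operator defaults. */
final class EffectiveCertDaysSketch {
    final int validityDays;
    final int renewalDays;

    private EffectiveCertDaysSketch(int validityDays, int renewalDays) {
        this.validityDays = validityDays;
        this.renewalDays = renewalDays;
    }

    /**
     * Resolves the effective values (a null user value means "not set") and
     * rejects combinations where renewalDays is not lower than validityDays.
     */
    static EffectiveCertDaysSketch resolve(Integer userValidityDays, Integer userRenewalDays,
                                           int defaultValidityDays, int defaultRenewalDays) {
        int validity = userValidityDays != null ? userValidityDays : defaultValidityDays;
        int renewal = userRenewalDays != null ? userRenewalDays : defaultRenewalDays;
        if (renewal >= validity) {
            throw new IllegalArgumentException(
                "renewalDays (" + renewal + ") has to be lower than validityDays (" + validity + ")");
        }
        return new EffectiveCertDaysSketch(validity, renewal);
    }
}
```

For example, `resolve(30, null, 365, 30)` throws, because the user's validityDays of 30 meets the default renewalDays of 30, whereas `resolve(90, 10, 365, 30)` yields 90 and 10.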

Member

That is kind of weird. Should that be allowed? 🤔

In any case, I think you definitely need to decide whether this is a class thing, and then do it in the constructor, or a method thing, and then do it in the method. Given certificates are used only by some users, a method might be fine. But don't mix it.

Member Author

> That is kind of weird. Should that be allowed? 🤔

Maybe it would be better to validate that if one field is set, the second field has to be set as well. In that case we can move all of this to the CEL validation.

> Given certificates are used only by some users, a method might be fine. But don't mix it.

I will update it, yes. Maybe it will be even simpler if we forbid setting just one field and leaving the other at its default. That makes sense to me, thanks.
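
The "set both fields or neither" idea can be sketched in Java as follows. The names are hypothetical and the sketch is not part of this PR; it only shows that, once a single field can no longer be combined with an operator default, the whole check depends on the two KafkaUser fields alone, which is what would make it expressible as a validation rule on the resource.

```java
/** Hypothetical sketch of a "set both fields or neither" rule; not the PR's code. */
final class PairedCertDaysSketch {
    private PairedCertDaysSketch() { }

    /**
     * Validates the two optional fields together (null means "not set").
     * Throws IllegalArgumentException when only one field is set or when
     * renewalDays is not lower than validityDays.
     */
    static void validate(Integer validityDays, Integer renewalDays) {
        // Exactly one of the two fields is set.
        if ((validityDays == null) != (renewalDays == null)) {
            throw new IllegalArgumentException(
                "validityDays and renewalDays have to be set together");
        }
        // Both are set here (or both are null, in which case defaults apply).
        if (validityDays != null && renewalDays >= validityDays) {
            throw new IllegalArgumentException(
                "renewalDays has to be lower than validityDays");
        }
    }
}
```

In this sketch, `validate(null, null)` and `validate(90, 10)` pass, while `validate(30, null)` and `validate(30, 30)` are rejected.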

Comment on lines +88 to +89
private int validityDays;
private int renewalDays;
Member

This is pretty confusing. How and where is it set? What is it?

Member Author

It is configured in validateValidityAndRenewalDays. I'm not sure why it is confusing; they were used there even before, but they were passed as method arguments. I renamed those arguments to be the "defaults" coming from the User Operator config, and if the KafkaUser has a different configuration, that configuration is used instead.

validityDays,
renewalDays,
this.validityDays,
this.renewalDays,
Member

This is super confusing.

Member Author

In what way is it confusing?

Member

The whole way you handle it is confusing. Why do you need the class-level fields? Why do you set them from some method called from another method much later? Why don't you handle all of it right here in the code or in the place where it is called from?

Member Author

From the other comments I get why it is confusing 😀 I just didn't get it from this comment. I will rewrite the whole thing anyway based on the other comments, so this will be removed as well. Thanks.

Development

Successfully merging this pull request may close these issues.

[Enhancement]: Configure mTLS validityDays and renewalDays per KafkaUser

2 participants