docs(autoRestart): document when the restart counter resets for connectors #11917

Merged
scholzj merged 1 commit into strimzi:main from rodrigo-molina:feat/explain-restart-counter-feature-in-kafka-connect-connector-docs
Sep 26, 2025

Conversation

@rodrigo-molina
Contributor

Type of change

  • Documentation

Description

While walking through the Kafka Connect auto-restart implementation, I discovered that the restart counter reset is not explained in the documentation.

The feature is implemented in:

/**
 * Checks whether the connector is stable for long enough after the previous restart to reset the auto-restart
 * counters. Normally, this follows the same backoff intervals as the restarts. For example, after 4 restarts, the
 * connector needs to be running for 20 minutes to be considered stable.
 *
 * @param autoRestartStatus Status field with auto-restart status
 *
 * @return True if the previous auto-restart status of the connector should be reset to 0. False otherwise.
 */
/* test */ static boolean shouldResetAutoRestartStatus(AutoRestartStatus autoRestartStatus) {
    if (autoRestartStatus != null
            && autoRestartStatus.getLastRestartTimestamp() != null
            && autoRestartStatus.getCount() > 0) {
        // There are previous auto-restarts => we check if it is time to reset the status
        long minutesSinceLastRestart = StatusUtils.minutesDifferenceUntilNow(StatusUtils.isoUtcDatetime(autoRestartStatus.getLastRestartTimestamp()));
        return minutesSinceLastRestart > nextAutoRestartBackOffIntervalInMinutes(autoRestartStatus.getCount());
    } else {
        // There are no previous restarts => nothing to reset
        return false;
    }
}
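
To make the reset rule concrete, here is a minimal, self-contained sketch of the same comparison. The back-off formula used here (n² + n minutes, capped at 60) is an assumption: `nextAutoRestartBackOffIntervalInMinutes` is not shown in this PR, and the formula was picked only because it reproduces the Javadoc example of 20 minutes after 4 restarts. The class and method names are hypothetical, and the null check on the last restart timestamp is left out for brevity.

```java
// Hypothetical sketch, not the Strimzi implementation.
final class RestartCounterResetSketch {

    // ASSUMPTION: back-off grows as n^2 + n minutes and is capped at 60.
    // This matches the Javadoc example (4 restarts -> 20 minutes) but is
    // not taken from the code quoted in this PR.
    static long assumedBackOffMinutes(int restartCount) {
        return Math.min((long) restartCount * restartCount + restartCount, 60L);
    }

    // Mirrors the comparison in shouldResetAutoRestartStatus: the counter is
    // reset only when the connector has been running strictly longer than
    // the next back-off interval, and only if there was at least one restart.
    static boolean shouldReset(int restartCount, long minutesSinceLastRestart) {
        return restartCount > 0
                && minutesSinceLastRestart > assumedBackOffMinutes(restartCount);
    }

    public static void main(String[] args) {
        // Print the stability window required after each restart count.
        for (int count = 1; count <= 8; count++) {
            System.out.printf("restarts=%d -> reset after more than %d minutes without failure%n",
                    count, assumedBackOffMinutes(count));
        }
    }
}
```

Under this assumption, a connector that has been restarted 4 times and has then run for exactly 20 minutes is not yet reset, because the comparison is strict; it is reset once 21 minutes have passed.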

It is used by the handler:

// Connector and tasks are not failed
if (previousAutoRestartStatus != null) {
    if (shouldResetAutoRestartStatus(previousAutoRestartStatus)) {
        // The connector is not failing now for some time => time to reset the auto-restart status
        LOGGER.infoCr(reconciliation, "Resetting the auto-restart status of connector {} ", connectorName);
        status.autoRestart = null;
        return Future.succeededFuture(status);
    } else {

See the full method code:

@SuppressWarnings({ "rawtypes" })
/* test */ Future<ConnectorStatusAndConditions> autoRestartFailedConnectorAndTasks(Reconciliation reconciliation, String host, KafkaConnectApi apiClient, String connectorName, KafkaConnectorSpec connectorSpec, ConnectorStatusAndConditions status, CustomResource resource) {
    JsonObject statusResultJson = new JsonObject(status.statusResult);
    if (connectorSpec.getAutoRestart() != null && connectorSpec.getAutoRestart().isEnabled()) {
        return previousAutoRestartStatus(reconciliation, connectorName, resource)
                .compose(previousAutoRestartStatus -> {
                    boolean needsRestart = connectorHasFailed(statusResultJson) || !failedTaskIds(statusResultJson).isEmpty();
                    if (needsRestart) {
                        // Connector or task failed, and we should check it for auto-restart
                        if (shouldAutoRestart(previousAutoRestartStatus, connectorSpec.getAutoRestart().getMaxRestarts())) {
                            // There are failures, and it is a time to restart the connector now
                            metrics().connectorsAutoRestartsCounter(reconciliation.namespace()).increment();
                            return autoRestartConnector(reconciliation, host, apiClient, connectorName, status, previousAutoRestartStatus);
                        } else {
                            // There are failures, but the next restart should happen only later => keep the original status
                            status.autoRestart = new AutoRestartStatusBuilder(previousAutoRestartStatus).build();
                            return Future.succeededFuture(status);
                        }
                    } else {
                        // Connector and tasks are not failed
                        if (previousAutoRestartStatus != null) {
                            if (shouldResetAutoRestartStatus(previousAutoRestartStatus)) {
                                // The connector is not failing now for some time => time to reset the auto-restart status
                                LOGGER.infoCr(reconciliation, "Resetting the auto-restart status of connector {} ", connectorName);
                                status.autoRestart = null;
                                return Future.succeededFuture(status);
                            } else {
                                // The connector is not failing, but it is not sure yet if it is stable => keep the original status
                                status.autoRestart = new AutoRestartStatusBuilder(previousAutoRestartStatus).build();
                                return Future.succeededFuture(status);
                            }
                        } else {
                            // No failures and no need to reset the previous auto.restart state => nothing to do
                            return Future.succeededFuture(status);
                        }
                    }
                });
    } else {
        return Future.succeededFuture(status);
    }
}
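
The branching in the method above can be summarised as a small pure function. This is only a sketch for reading the control flow: the class, enum, and parameter names are hypothetical, and the results of `shouldAutoRestart` and `shouldResetAutoRestartStatus` are passed in as plain booleans rather than computed.

```java
// Hypothetical sketch of the decision flow, not the Strimzi implementation.
final class AutoRestartDecisionSketch {

    enum Decision { RESTART_NOW, KEEP_STATUS, RESET_STATUS, NOTHING_TO_DO }

    static Decision decide(boolean autoRestartEnabled,
                           boolean hasFailures,
                           boolean hasPreviousStatus,
                           boolean restartDue,        // stands in for shouldAutoRestart(...)
                           boolean stableLongEnough)  // stands in for shouldResetAutoRestartStatus(...)
    {
        if (!autoRestartEnabled) {
            // Auto-restart is not configured or is disabled: status is returned unchanged
            return Decision.NOTHING_TO_DO;
        }
        if (hasFailures) {
            // Connector or a task failed: restart now, or wait for the back-off to elapse
            return restartDue ? Decision.RESTART_NOW : Decision.KEEP_STATUS;
        }
        if (!hasPreviousStatus) {
            // Healthy and never auto-restarted: nothing to reset
            return Decision.NOTHING_TO_DO;
        }
        // Healthy with earlier restarts: reset only after a long enough stable period
        return stableLongEnough ? Decision.RESET_STATUS : Decision.KEEP_STATUS;
    }

    public static void main(String[] args) {
        // Healthy connector with earlier restarts that has been stable long enough
        System.out.println(decide(true, false, true, false, true));
        // Healthy connector with earlier restarts that is not yet considered stable
        System.out.println(decide(true, false, true, false, false));
    }
}
```

The `RESET_STATUS` outcome corresponds to the branch that sets `status.autoRestart = null`, which is the behaviour this PR documents.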

Checklist

Please go through this checklist and make sure all applicable tasks have been done

  • [ ] Write tests
  • [ ] Make sure all tests pass
  • [x] Update documentation
  • [ ] Check RBAC rights for Kubernetes / OpenShift roles
  • [ ] Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
  • [ ] Reference relevant issue(s) and close them after merging
  • [ ] Update CHANGELOG.md
  • [ ] Supply screenshots for visual changes, such as Grafana dashboards

Member

@scholzj scholzj left a comment

This seems reasonable. Thanks for the PR!

@scholzj scholzj requested a review from PaulRMellor September 25, 2025 15:45
@scholzj scholzj added this to the 0.49.0 milestone Sep 25, 2025
@rodrigo-molina rodrigo-molina force-pushed the feat/explain-restart-counter-feature-in-kafka-connect-connector-docs branch from d5389fa to 3fe0ffe Compare September 25, 2025 15:47
Clarify that the restart counter resets once a connector runs longer
than the next backoff interval.

Signed-off-by: rodrigo-molina <43792562+rodrigo-molina@users.noreply.github.com>
@rodrigo-molina rodrigo-molina force-pushed the feat/explain-restart-counter-feature-in-kafka-connect-connector-docs branch from 3fe0ffe to 5c64aac Compare September 25, 2025 15:51
@rodrigo-molina
Contributor Author

rodrigo-molina commented Sep 25, 2025

> This seems reasonable. Thanks for the PR!

Anytime 😄 !

Member

@ppatierno ppatierno left a comment

LGTM. Thanks!

Contributor

@PaulRMellor PaulRMellor left a comment

Excellent. Thanks

@scholzj scholzj merged commit 514606b into strimzi:main Sep 26, 2025
10 checks passed
@rodrigo-molina rodrigo-molina deleted the feat/explain-restart-counter-feature-in-kafka-connect-connector-docs branch September 29, 2025 08:28