HDDS-15165. Recon: Add admin REST APIs to trigger, monitor, and cancel SCM DB snapshot sync.#10186

Open
devmadhuu wants to merge 3 commits into apache:master from devmadhuu:HDDS-15165

@devmadhuu (Contributor) commented May 5, 2026

What changes were proposed in this pull request?

This change adds Recon admin REST APIs to explicitly manage full SCM DB snapshot syncs as a follow-up to #10074.
The new APIs are exposed under TriggerDBSyncEndpoint:

  - POST /api/v1/triggerdbsync/scm/snapshot triggers an async full SCM DB snapshot download and refresh.
  - GET /api/v1/triggerdbsync/scm/snapshot/status returns current status, phase, start time, finish time, duration, cancellation eligibility, and last error.
  - POST /api/v1/triggerdbsync/scm/snapshot/cancel cancels the operation while checkpoint download is still in progress.
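
As a rough illustration of how a client might address the three endpoints listed above, the sketch below only builds the requests with Java's `java.net.http` API and does not send them. The Recon address `http://localhost:9888`, the class name `ReconSnapshotSyncRequests`, and the helper method names are assumptions made for illustration. The PR description does not specify request bodies or authentication, so none are shown.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;

// Hypothetical helper: builds (but does not send) requests for the endpoints
// named in the PR description.
class ReconSnapshotSyncRequests {
  // Path taken from the PR description.
  static final String BASE = "/api/v1/triggerdbsync/scm/snapshot";

  // POST: trigger an async full SCM DB snapshot download and refresh.
  static HttpRequest trigger(String reconAddress) {
    return HttpRequest.newBuilder(URI.create(reconAddress + BASE))
        .POST(HttpRequest.BodyPublishers.noBody())
        .build();
  }

  // GET: read current status, phase, timings, and last error.
  static HttpRequest status(String reconAddress) {
    return HttpRequest.newBuilder(URI.create(reconAddress + BASE + "/status"))
        .GET()
        .build();
  }

  // POST: cancel while the checkpoint download is still in progress.
  static HttpRequest cancel(String reconAddress) {
    return HttpRequest.newBuilder(URI.create(reconAddress + BASE + "/cancel"))
        .POST(HttpRequest.BodyPublishers.noBody())
        .build();
  }

  public static void main(String[] args) {
    // Assumed address; pass the real Recon HTTP address as the first argument.
    String recon = args.length > 0 ? args[0] : "http://localhost:9888";
    for (HttpRequest r : List.of(trigger(recon), status(recon), cancel(recon))) {
      System.out.println(r.method() + " " + r.uri());
    }
  }
}
```

Sending the requests would additionally need an `HttpClient` and whatever admin credentials the cluster requires, which depends on the deployment.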

The implementation keeps full SCM snapshot recovery as an explicit admin action instead of coupling it to periodic SCM container sync. It tracks snapshot sync lifecycle state in ReconStorageContainerManagerFacade, prevents concurrent SCM DB sync operations, allows cancellation only before DB initialization starts, and cleans up failed or cancelled checkpoint directories.
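
The lifecycle rules described above (one sync at a time, cancellation only before DB initialization) can be sketched as a small state holder. This is not the code in ReconStorageContainerManagerFacade; the class name, phase names, and method names below are hypothetical, and checkpoint directory cleanup is left out.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of snapshot sync lifecycle tracking.
class SnapshotSyncLifecycle {
  // Assumed phase names; the PR does not list the actual ones.
  enum Phase { IDLE, DOWNLOADING_CHECKPOINT, INITIALIZING_DB, COMPLETED, FAILED, CANCELLED }

  private final AtomicReference<Phase> phase = new AtomicReference<>(Phase.IDLE);

  // Starts a sync unless one is already in progress (prevents concurrent syncs).
  boolean tryStart() {
    while (true) {
      Phase current = phase.get();
      if (current == Phase.DOWNLOADING_CHECKPOINT || current == Phase.INITIALIZING_DB) {
        return false;
      }
      if (phase.compareAndSet(current, Phase.DOWNLOADING_CHECKPOINT)) {
        return true;
      }
    }
  }

  // Cancellation is only possible while the checkpoint is being downloaded.
  boolean isCancellable() {
    return phase.get() == Phase.DOWNLOADING_CHECKPOINT;
  }

  // Succeeds only if DB initialization has not started yet.
  boolean tryCancel() {
    return phase.compareAndSet(Phase.DOWNLOADING_CHECKPOINT, Phase.CANCELLED);
  }

  // Moves to DB initialization; fails if the sync was cancelled in the meantime.
  boolean beginDbInit() {
    return phase.compareAndSet(Phase.DOWNLOADING_CHECKPOINT, Phase.INITIALIZING_DB);
  }

  // Marks a successful DB initialization.
  void complete() {
    phase.compareAndSet(Phase.INITIALIZING_DB, Phase.COMPLETED);
  }

  // Marks a failure from either in-progress phase.
  void fail() {
    if (!phase.compareAndSet(Phase.DOWNLOADING_CHECKPOINT, Phase.FAILED)) {
      phase.compareAndSet(Phase.INITIALIZING_DB, Phase.FAILED);
    }
  }

  Phase currentPhase() {
    return phase.get();
  }
}
```

Using a single atomic phase value means a cancel request and the transition into DB initialization cannot both succeed: whichever compare-and-set runs first wins, and the other call returns false.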

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-15165

How was this patch tested?

Tests were added for the new trigger/status/cancel endpoint behavior in TestTriggerDBSyncEndpoint.

@devmadhuu devmadhuu marked this pull request as ready for review May 5, 2026 15:46
@sumitagrawl (Contributor) left a comment

@devmadhuu, I have left a few comments.

synchronized (scmSnapshotLock) {
  scmSnapshotTaskStarted = true;
}
DBCheckpoint dbSnapshot = scmServiceProvider.getSCMDBSnapshot();

Contributor:

A cancellation check is required here. There is a chance that cancel has already marked the task as cancelled and set isSyncDataFromSCMRunning to false while this thread is still running, waiting to enter the synchronized block. The check should be added inside the synchronized block.

This needs to be done before starting the DB sync from SCM.
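
The requested check could look roughly like the following sketch. It is an illustration of the pattern only, not the fix that landed in the PR: the names `scmSnapshotLock` and `scmSnapshotTaskStarted` come from the snippet under review, while `GuardedSnapshotTask`, `cancelRequested`, `requestCancel`, and `runIfNotCancelled` are hypothetical, and the download is stood in for by a `Supplier`.

```java
import java.util.function.Supplier;

// Hypothetical sketch: re-check cancellation under the same lock that marks
// the task as started, before the download begins.
class GuardedSnapshotTask {
  private final Object scmSnapshotLock = new Object();
  private boolean cancelRequested;        // assumed flag set by the cancel path
  private boolean scmSnapshotTaskStarted;

  // Cancel path: records the request under the lock.
  void requestCancel() {
    synchronized (scmSnapshotLock) {
      cancelRequested = true;
    }
  }

  boolean isStarted() {
    synchronized (scmSnapshotLock) {
      return scmSnapshotTaskStarted;
    }
  }

  // Runs the download only if no cancel was recorded before the task
  // acquired the lock; returns null when the task was skipped.
  <T> T runIfNotCancelled(Supplier<T> download) {
    synchronized (scmSnapshotLock) {
      if (cancelRequested) {
        return null; // cancelled while waiting for the lock: do not start
      }
      scmSnapshotTaskStarted = true;
    }
    // Download happens outside the lock so cancel requests are not blocked.
    return download.get();
  }
}
```

Because the flag is read and `scmSnapshotTaskStarted` is written inside one synchronized block, a cancel that arrives before the task acquires the lock is always observed.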

Contributor, Author:

Ah, thanks for catching this race condition. Yes, I have fixed it now.

} else {
  LOG.error("Null snapshot location got from SCM.");
  try {
    updateReconSCMDBWithNewSnapshotWithoutGuard();

Contributor:

Can the startup sync and the manual trigger conflict with each other?

Contributor, Author:

Yes, fixed it now.

@devmadhuu devmadhuu requested a review from sumitagrawl May 7, 2026 10:08
@devmadhuu devmadhuu marked this pull request as draft May 7, 2026 13:56
@devmadhuu devmadhuu marked this pull request as ready for review May 8, 2026 09:19