Refactor BaseTableDataManager #18381
Draft
krishan1390 wants to merge 6 commits into apache:master from
…ts building blocks

Three small refactors that keep single-segment behavior identical and let a future multi-segment SegmentDataManager (e.g. a wrapper around N constituent segments) reuse the same load and reload primitives without forking them:

1. Extract `protected ImmutableSegment loadSegment(zkMetadata, ilc)` from `downloadAndLoadSegment`. The new helper performs only the download + `ImmutableSegmentLoader.load`, returning the segment without registering it in `_segmentDataManagerMap` or invoking upsert hooks. Single-segment callers continue to use `downloadAndLoadSegment`, which now composes the helper + `addSegment(...)`. This lets a multi-segment manager load all of its members first and register a single wrapper entry under one name.

2. Push `_segmentReloadSemaphore` acquire/release down into `reloadSegment(SegmentDataManager, IndexLoadingConfig, boolean)`. The public `reloadSegment(String)` and the private parallel `reloadSegments(List<SDM>)` both used to wrap the inner call with the semaphore; that acquire is now inside the per-physical-segment body and the outer wrappers are removed (keeping them would double-acquire on a non-reentrant semaphore). For non-group tables this is structurally identical (one segment -> one acquire -> one release; same concurrency bound). For multi-segment managers that fan out N reloads, each member contends for a slot independently.

3. Drop `@VisibleForTesting` on `isSegmentStale(IndexLoadingConfig, SegmentDataManager)` and widen to plain `protected` so subclasses can call it from group-aware overrides of `getStaleSegments` / `needReloadSegments`.

The semaphore stays at the orchestration boundary in `doReplaceSegment` (not relocated into `replaceSegmentIfCrcMismatch`), because subclasses commonly override `replaceSegmentIfCrcMismatch` and a relocation there would leak the acquire across paths that bypass the override; subclasses needing per-member acquire on a multi-segment replace can wrap the call themselves.
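As an illustration of item 2, here is a minimal sketch of the semaphore push-down. It is not the code in this PR: `ReloadSketch`, its fields, and its methods are hypothetical stand-ins for `BaseTableDataManager`, and the simulated work replaces the real reload. The point it tries to show is that the per-segment method is the only place that acquires a permit, so the fan-out wrapper stays semaphore-free and cannot double-acquire.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a table data manager; names are illustrative only.
class ReloadSketch {
  // Plays the role of _segmentReloadSemaphore (java.util.concurrent.Semaphore is not reentrant).
  private final Semaphore reloadSemaphore;
  private final int maxParallelReloads;
  private final AtomicInteger inFlight = new AtomicInteger();
  private final AtomicInteger maxInFlight = new AtomicInteger();
  private final AtomicInteger reloaded = new AtomicInteger();

  ReloadSketch(int maxParallelReloads) {
    this.maxParallelReloads = maxParallelReloads;
    this.reloadSemaphore = new Semaphore(maxParallelReloads);
  }

  // Per-physical-segment reload: the single place that acquires and releases a permit.
  private void reloadSegment(String segmentName) throws InterruptedException {
    reloadSemaphore.acquire();
    try {
      int now = inFlight.incrementAndGet();
      maxInFlight.accumulateAndGet(now, Math::max);
      Thread.sleep(5); // simulated reload work for segmentName
      reloaded.incrementAndGet();
    } finally {
      inFlight.decrementAndGet();
      reloadSemaphore.release();
    }
  }

  // Fan-out wrapper: deliberately does NOT touch the semaphore.
  void reloadSegments(List<String> segmentNames, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (String name : segmentNames) {
      pool.submit(() -> {
        try {
          reloadSegment(name);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("reloads did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  int reloadedCount() { return reloaded.get(); }
  int maxObservedParallelism() { return maxInFlight.get(); }
  int availablePermits() { return reloadSemaphore.availablePermits(); }
  int permitLimit() { return maxParallelReloads; }

  public static void main(String[] args) {
    ReloadSketch manager = new ReloadSketch(2);
    manager.reloadSegments(Arrays.asList("seg_0", "seg_1", "seg_2", "seg_3"), 4);
    System.out.println("reloaded=" + manager.reloadedCount()
        + " maxParallel=" + manager.maxObservedParallelism());
  }
}
```

With this shape, a multi-segment manager that submits one task per member gets the same concurrency bound as the single-segment path, because every member passes through the same acquire.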
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #18381 +/- ##
=============================================
+ Coverage 34.91% 63.58% +28.66%
- Complexity 857 1717 +860
=============================================
Files 3252 3252
Lines 199132 199153 +21
Branches 30875 30877 +2
=============================================
+ Hits 69528 126624 +57096
+ Misses 123518 62454 -61064
- Partials 6086 10075 +3989
…ions and ownership for future work
Summary

Small refactors to `BaseTableDataManager` to enable cleaner extensions, abstractions and ownership for future work.

- Push `_segmentReloadSemaphore` acquire/release into `reloadSegment(SegmentDataManager, ILC, boolean)`. Removes the duplicated outer acquire in `reloadSegment(String)` and the parallel `reloadSegments(List<SDM>)` paths.
- Centralize segment delete in `BaseTableDataManager`. Add `TableDataManager#deleteSegment(String)` to the SPI and a static `deleteSegmentFilesFromDisk` helper. `HelixInstanceDataManager.deleteSegment` now delegates to the TDM (addressing the long-standing TODO) and only falls back to path-only cleanup when the TDM is null. The lookup + delete is held under the per-segment lock so it is atomic vs. concurrent table teardown.

SPI changes

- `deleteSegment(String)`.

Test plan

- `BaseTableDataManager` and subclass tests pass.
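For reviewers, a rough sketch of the delete delegation described in the summary. Everything here is a simplified assumption rather than the PR's code: `TableManagerSketch`, `DiskTableManagerSketch`, and `InstanceManagerSketch` are hypothetical stand-ins for `TableDataManager`, `BaseTableDataManager`, and `HelixInstanceDataManager`, and the lock map only approximates the per-segment lock. Table teardown would need to synchronize on the same lock for the atomicity claim to hold, which this sketch does not model.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.stream.Stream;

// Hypothetical SPI: the table manager owns deletion of its own segments.
interface TableManagerSketch {
  void deleteSegment(String segmentName);
}

// Hypothetical table manager that removes the segment directory under its table dir.
class DiskTableManagerSketch implements TableManagerSketch {
  private final Path tableDir;
  private int deleteCalls;

  DiskTableManagerSketch(Path tableDir) { this.tableDir = tableDir; }

  @Override
  public synchronized void deleteSegment(String segmentName) {
    deleteCalls++;
    InstanceManagerSketch.deleteSegmentFilesFromDisk(tableDir.resolve(segmentName));
  }

  synchronized int deleteCalls() { return deleteCalls; }
}

// Hypothetical instance-level manager: delegates when a table manager exists.
class InstanceManagerSketch {
  private final Map<String, TableManagerSketch> tables = new ConcurrentHashMap<>();
  private final Map<String, Lock> segmentLocks = new ConcurrentHashMap<>();
  private final Path dataDir;

  InstanceManagerSketch(Path dataDir) { this.dataDir = dataDir; }

  void addTable(String table, TableManagerSketch manager) { tables.put(table, manager); }

  void deleteSegment(String table, String segment) {
    Lock lock = segmentLocks.computeIfAbsent(table + "/" + segment, k -> new ReentrantLock());
    lock.lock();
    try {
      // Lookup and delete happen under one lock acquisition.
      TableManagerSketch manager = tables.get(table);
      if (manager != null) {
        manager.deleteSegment(segment);
      } else {
        // Fallback: path-only cleanup when no table manager is registered.
        deleteSegmentFilesFromDisk(dataDir.resolve(table).resolve(segment));
      }
    } finally {
      lock.unlock();
    }
  }

  // Static helper: recursively deletes a segment directory; a no-op if it is absent.
  static void deleteSegmentFilesFromDisk(Path segmentDir) {
    if (!Files.exists(segmentDir)) {
      return;
    }
    try (Stream<Path> walk = Files.walk(segmentDir)) {
      // Reverse order visits children before their parent directories.
      walk.sorted(Comparator.reverseOrder()).forEach(p -> {
        try {
          Files.delete(p);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Demo helper: creates <dataDir>/<table>/<segment>/index.bin.
  static Path createSegmentDir(Path dataDir, String table, String segment) {
    try {
      Path dir = Files.createDirectories(dataDir.resolve(table).resolve(segment));
      Files.write(dir.resolve("index.bin"), new byte[] {1, 2, 3});
      return dir;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path dataDir = Paths.get(System.getProperty("java.io.tmpdir"), "delete-sketch-" + System.nanoTime());
    InstanceManagerSketch instance = new InstanceManagerSketch(dataDir);
    instance.addTable("t1", new DiskTableManagerSketch(dataDir.resolve("t1")));
    Path seg = createSegmentDir(dataDir, "t1", "seg1");
    instance.deleteSegment("t1", "seg1");
    System.out.println("segment dir still exists: " + Files.exists(seg));
  }
}
```

Keeping the disk cleanup in one static helper means the delegated path and the fallback path share the same deletion logic, which is the design choice the summary points at.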