Skip to content

Add afterMerge() lifecycle hook to KnnVectorsWriter and PerFieldKnnVe…#15952

Open
MrFlap wants to merge 1 commit into
apache:mainfrom
MrFlap:after-merge
Open

Add afterMerge() lifecycle hook to KnnVectorsWriter and PerFieldKnnVe…#15952
MrFlap wants to merge 1 commit into
apache:mainfrom
MrFlap:after-merge

Conversation

@MrFlap
Copy link
Copy Markdown

@MrFlap MrFlap commented Apr 10, 2026

Resolves #15935

Problem

PerFieldKnnVectorsFormat subclasses that allocate merge-time resources (such as thread pools for concurrent HNSW graph construction) have no way to know when a merge completes. getKnnVectorsFormatForField is called per-field during merge, but there is no corresponding callback when the merge finishes. This makes it impossible to cleanly release per-merge resources, leading to resource leaks.

Solution

Add a protected void afterMerge() throws IOException method with a no-op default to both KnnVectorsWriter and PerFieldKnnVectorsFormat.

A consumer can then override afterMerge() to shrink a shared thread pool after each merge completes, ensuring the pool scales elastically with the number of concurrent merges rather than leaking threads indefinitely.

Changes

KnnVectorsWriter.java: Added afterMerge() no-op method. Wrapped the field merge loop in merge() with try/finally { afterMerge() }.
PerFieldKnnVectorsFormat.java: Added afterMerge() no-op method on the format. FieldsWriter overrides afterMerge() to delegate to PerFieldKnnVectorsFormat.this.afterMerge().

cc @navneet1v @shatejas

@navneet1v
Copy link
Copy Markdown
Contributor

@MrFlap can you add entry in change log and also unit test for the new interface

Comment on lines -107 to 127
if (mergeState.infoStream.isEnabled("VV")) {
mergeState.infoStream.message("VV", "merge done " + mergeState.segmentInfo);
if (mergeState.infoStream.isEnabled("VV")) {
mergeState.infoStream.message("VV", "merge done " + mergeState.segmentInfo);
}
}
}
} finally {
afterMerge();
}
finish();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why doesn't finish work? It seems like the writer should know its in a merge context already and finish should just handle that? Adding a new interface seems unnecessary?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would like to manage the threads in PerFieldKnnVectorsFormat, which doesn't get a callback once KnnVectorsWriter.finish() happens. Also, the suggested afterMerge() runs in a finally block, so we can be sure to close out the threads even on exception.

Copy link
Copy Markdown
Member

@benwtrent benwtrent Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. On an exception with merge, all files, etc. should be closed, do we want to move finish into the finally? There may be edge cases, for sure...
  2. Are we sure extending PerFieldKnnVectorsFormat is the way forward here? My point is that this type of thing is doable via experts in Lucene in different ways than adding another top-level API to the KnnVectorFormat which then impacts all formats.

I want to be very careful about adding an expert level API that Apache Lucene itself doesn't actually use.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On an exception with merge, all files, etc. should be closed, do we want to move finish into the finally? There may be edge cases, for sure...

I'm not sure I can give a completely educated call on this, but it seems like the right thing to do. If mergeOneField throws mid-way, the finish() call will be skipped.

Are we sure extending PerFieldKnnVectorsFormat is the way forward here? My point is that this type of thing is doable via experts in Lucene in different ways than adding another top-level API to the KnnVectorFormat which then impacts all formats.

Understandable. I'm not a Lucene expert, but I don't think there is an easy way for PerFieldKnnVectorsFormat to know when a merge completes. I'll continue looking into this. cc @navneet1v

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am saying, that any expert extending PerFieldKnnVectorsFormat can likely do this in another way and requiring a Lucene change that Lucene doesn't benefit from is a tricky situation.

I am just wondering if this is the best way for Lucene.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, the overrider for PerFieldKnnVectorsFormat could wrap a try/final directly in mergeOneField. I am not sure a new interface is required at all.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our desired functionality is to have a set number N threads per merge. This way we won't have resource contention between merges. So if one field needs 50 singular merge operations, we should see 50 * N threads spawning. Wrapping functionality around mergeOneField is difficult because we want to

  1. Allocate N threads when each singular merge operation starts.
  2. Release N threads when each singular merge operation completes.
  3. Be able to release N threads on a specific merge operation when an exception happens

We would really like to keep it to a "N threads per singular merge operation" as making it a set number of threads per field merge could lead to resource contention per thread. We don't really see an easy way to be this granular about how to detect and allocate resources per merge, since there are no hooks to see when a singular merge operation happens.

@MrFlap
Copy link
Copy Markdown
Author

MrFlap commented Apr 16, 2026

@MrFlap can you add entry in change log and also unit test for the new interface

Adding these.

@github-actions github-actions Bot added this to the 11.0.0 milestone Apr 16, 2026
…ctorsFormat

Signed-off-by: Andrew Klepchick <aklepchi@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

@github-actions github-actions Bot added the Stale label May 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Hook for merge thread pool

3 participants