7 changes: 7 additions & 0 deletions lucene/CHANGES.txt
@@ -148,6 +148,13 @@ Optimizations

Bug Fixes
---------------------
+ * GITHUB#15901: Fix undercounting of RAM used by vectors buffered in in-memory segments.
+ BufferingKnnVectorsWriter hardcoded Float.BYTES for all vector encodings, overcounting byte
+ vectors by 4x. Scalar quantized vector writers (Lucene104, Lucene99, Lucene102) did not
+ account for rawVectorDelegate RAM, making byte vector fields completely invisible in RAM
+ reporting. Also added missing dimensionSums array accounting in quantized writers.
+ (Prithvi S)

* GITHUB#14049: Randomize KNN codec params in RandomCodec. Fixes scalar quantization div-by-zero
when all values are identical. (Mike Sokolov)

@@ -51,6 +51,7 @@
import org.apache.lucene.search.VectorScorer;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.util.IOUtils;
+ import org.apache.lucene.util.RamUsageEstimator;
import org.apache.lucene.util.VectorUtil;
import org.apache.lucene.util.quantization.OptimizedScalarQuantizer;

@@ -453,9 +454,14 @@ static int calculateCentroid(MergeState mergeState, FieldInfo fieldInfo, float[]
@Override
public long ramBytesUsed() {
long total = SHALLOW_RAM_BYTES_USED;
+ // The rawVectorDelegate tracks all vector data for both byte and float32 fields.
+ // For byte vector fields (which bypass our FieldWriter), this is the only accounting.
+ total += rawVectorDelegate.ramBytesUsed();
  for (FieldWriter field : fields) {
- // the field tracks the delegate field usage
- total += field.ramBytesUsed();
+ // quantizationOverheadBytesUsed() intentionally excludes flatFieldVectorsWriter
+ // because rawVectorDelegate.ramBytesUsed() already accounts for all flat vector
+ // data at the writer level. Calling field.ramBytesUsed() here would double-count.
+ total += field.quantizationOverheadBytesUsed();
}
return total;
}
@@ -530,11 +536,22 @@ public float[] copyValue(float[] vectorValue) {
throw new UnsupportedOperationException();
}

+ /**
+ * Returns the RAM usage of quantization-specific state only (magnitudes, dimensionSums, shallow
+ * object overhead). The underlying flat vector data is tracked separately by the
+ * rawVectorDelegate at the writer level to avoid double-counting.
+ */
+ long quantizationOverheadBytesUsed() {
+ long size = SHALLOW_SIZE;
+ size += magnitudes.ramBytesUsed();
+ size += RamUsageEstimator.sizeOf(dimensionSums);
+ return size;
+ }
+
  @Override
  public long ramBytesUsed() {
- long size = SHALLOW_SIZE;
+ long size = quantizationOverheadBytesUsed();
  size += flatFieldVectorsWriter.ramBytesUsed();
Contributor

So the raw delegate above is now responsible for accounting for vector data for both floats and bytes, and that is why this switched to calling only the overhead part? But then won't we double-count floats here, with both flatFieldVectorsWriter.ramBytesUsed() and the newly added rawVectorDelegate.ramBytesUsed()?

Contributor Author

Yes, rawVectorDelegate is now the single source of truth for all flat vector data (both byte and float32).

No double counting happens. FieldWriter.flatFieldVectorsWriter is the same Java object that rawVectorDelegate holds internally as the per-field writer: it is what this.rawVectorDelegate.addField(fieldInfo) returns, and it is then passed into new FieldWriter(fieldInfo, rawVectorDelegate). So rawVectorDelegate.ramBytesUsed() already accounts for those float vectors.

The writer-level loop then calls field.quantizationOverheadBytesUsed(), which counts only the FieldWriter shell + magnitudes + dimensionSums, NOT flatFieldVectorsWriter. FieldWriter.ramBytesUsed() (which does include flatFieldVectorsWriter.ramBytesUsed()) is never called from the writer-level accounting; it is there solely for the Accountable interface. So each byte of flat float data is counted exactly once, through rawVectorDelegate.

Thanks @shubhamvishu!

- size += magnitudes.ramBytesUsed();
return size;
}
}
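
The single-source accounting described in the review thread above can be illustrated with a small standalone sketch. It does not use Lucene: `RawDelegate`, `FlatFieldWriter`, and `QuantizedFieldWriter` are hypothetical stand-ins for rawVectorDelegate, flatFieldVectorsWriter, and FieldWriter, and the fixed `overheadBytes` value is an assumed placeholder for the shallow size plus magnitudes and dimensionSums. The point it tries to show is only that, because the delegate and the field writer share one flat-writer object, summing the delegate plus per-field overhead counts the vector data once, while summing the delegate plus the full per-field usage counts it twice.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedDelegateAccountingSketch {

  /** Hypothetical stand-in for the per-field flat writer that buffers raw float vectors. */
  static final class FlatFieldWriter {
    private final int dim;
    private int count;

    FlatFieldWriter(int dim) {
      this.dim = dim;
    }

    void addValue() {
      count++;
    }

    /** Raw vector data only; per-vector object overhead is ignored in this sketch. */
    long ramBytesUsed() {
      return (long) count * dim * Float.BYTES;
    }
  }

  /** Hypothetical stand-in for rawVectorDelegate: it owns every per-field flat writer. */
  static final class RawDelegate {
    private final List<FlatFieldWriter> fields = new ArrayList<>();

    FlatFieldWriter addField(int dim) {
      FlatFieldWriter writer = new FlatFieldWriter(dim);
      fields.add(writer);
      return writer; // the caller receives the same object the delegate keeps
    }

    long ramBytesUsed() {
      long total = 0;
      for (FlatFieldWriter writer : fields) {
        total += writer.ramBytesUsed();
      }
      return total;
    }
  }

  /** Hypothetical quantizing field writer that wraps the delegate's flat writer. */
  static final class QuantizedFieldWriter {
    private final FlatFieldWriter flat;
    private final long overheadBytes; // assumed placeholder for quantization-specific state

    QuantizedFieldWriter(FlatFieldWriter flat, long overheadBytes) {
      this.flat = flat;
      this.overheadBytes = overheadBytes;
    }

    long quantizationOverheadBytesUsed() {
      return overheadBytes;
    }

    long ramBytesUsed() {
      return overheadBytes + flat.ramBytesUsed();
    }
  }

  /** Delegate once, plus per-field overhead: each vector byte is counted once. */
  static long writerTotal(RawDelegate delegate, List<QuantizedFieldWriter> fields) {
    long total = delegate.ramBytesUsed();
    for (QuantizedFieldWriter field : fields) {
      total += field.quantizationOverheadBytesUsed();
    }
    return total;
  }

  /** Delegate plus full per-field usage: the shared flat data is counted twice. */
  static long doubleCountedTotal(RawDelegate delegate, List<QuantizedFieldWriter> fields) {
    long total = delegate.ramBytesUsed();
    for (QuantizedFieldWriter field : fields) {
      total += field.ramBytesUsed();
    }
    return total;
  }

  public static void main(String[] args) {
    RawDelegate delegate = new RawDelegate();
    FlatFieldWriter flat = delegate.addField(8);
    QuantizedFieldWriter field = new QuantizedFieldWriter(flat, 100L);
    for (int i = 0; i < 10; i++) {
      flat.addValue();
    }
    List<QuantizedFieldWriter> fields = List.of(field);
    System.out.println(writerTotal(delegate, fields)); // 10 * 8 * 4 + 100 = 420
    System.out.println(doubleCountedTotal(delegate, fields)); // 320 + (100 + 320) = 740
  }
}
```

In this toy setup the two totals differ by exactly the raw vector data (320 bytes), which is the double count the reviewer asked about.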
@@ -282,9 +282,14 @@ public void finish() throws IOException {
@Override
public long ramBytesUsed() {
long total = SHALLOW_RAM_BYTES_USED;
+ // The rawVectorDelegate tracks all vector data for both byte and float32 fields.
+ // For byte vector fields (which bypass our FieldWriter), this is the only accounting.
+ total += rawVectorDelegate.ramBytesUsed();
  for (FieldWriter field : fields) {
- // the field tracks the delegate field usage
- total += field.ramBytesUsed();
+ // quantizationOverheadBytesUsed() intentionally excludes flatFieldVectorsWriter
+ // because rawVectorDelegate.ramBytesUsed() already accounts for all flat vector
+ // data at the writer level. Calling field.ramBytesUsed() here would double-count.
+ total += field.quantizationOverheadBytesUsed();
}
return total;
}
@@ -727,9 +732,17 @@ ScalarQuantizer createQuantizer() throws IOException {
return quantizer;
}

+ /**
+ * Returns the RAM usage of quantization-specific state only. The underlying flat vector data is
+ * tracked separately by the rawVectorDelegate at the writer level.
+ */
+ long quantizationOverheadBytesUsed() {
+ return SHALLOW_SIZE;
+ }
+
  @Override
  public long ramBytesUsed() {
- long size = SHALLOW_SIZE;
+ long size = quantizationOverheadBytesUsed();
  size += flatFieldVectorsWriter.ramBytesUsed();
return size;
}
@@ -262,7 +262,7 @@ public final long ramBytesUsed() {
* (long)
(RamUsageEstimator.NUM_BYTES_OBJECT_REF
+ RamUsageEstimator.NUM_BYTES_ARRAY_HEADER)
- + vectors.size() * (long) dim * Float.BYTES;
+ + vectors.size() * (long) dim * fieldInfo.getVectorEncoding().byteSize;
Member

Whoa, so this means that if a Lucene user was indexing vectors coming in as byte[] (e.g. pre-quantized outside of Lucene), we were incorrectly counting them as using 4X more RAM, and IW would flush way too early?

Contributor Author

@iprithv iprithv May 6, 2026

Yes, exactly. Before the fix, ramBytesUsed() always multiplied by Float.BYTES (4) regardless of encoding, so a byte[] vector field was reported as 4x its actual memory cost, causing IndexWriter to flush up to 4x too early for byte-encoded vector fields. With this change, it uses fieldInfo.getVectorEncoding().byteSize, which is 1 for BYTE and 4 for FLOAT32, giving the correct cost in both cases. Thanks @mikemccand!

Contributor Author

@mikemccand just wanted to touch base with you on this, in case it got buried. Thanks!

}
}

@@ -49,6 +49,7 @@
import org.apache.lucene.search.VectorScorer;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.util.IOUtils;
+ import org.apache.lucene.util.RamUsageEstimator;
import org.apache.lucene.util.VectorUtil;
import org.apache.lucene.util.quantization.OptimizedScalarQuantizer;
import org.apache.lucene.util.quantization.QuantizedByteVectorValues;
@@ -508,9 +509,16 @@ static int calculateCentroid(MergeState mergeState, FieldInfo fieldInfo, float[]
@Override
public long ramBytesUsed() {
long total = SHALLOW_RAM_BYTES_USED;
+ // The rawVectorDelegate tracks all vector data for both byte and float32 fields.
+ // For byte vector fields (which bypass our FieldWriter), this is the only accounting.
+ // For float32 fields, this covers the flat vector data; our FieldWriter adds the
+ // quantization-specific overhead (magnitudes, dimensionSums) on top.
+ total += rawVectorDelegate.ramBytesUsed();
  for (FieldWriter field : fields) {
- // the field tracks the delegate field usage
- total += field.ramBytesUsed();
+ // quantizationOverheadBytesUsed() intentionally excludes flatFieldVectorsWriter
+ // because rawVectorDelegate.ramBytesUsed() already accounts for all flat vector
+ // data at the writer level. Calling field.ramBytesUsed() here would double-count.
+ total += field.quantizationOverheadBytesUsed();
}
return total;
}
@@ -585,11 +593,22 @@ public float[] copyValue(float[] vectorValue) {
throw new UnsupportedOperationException();
}

+ /**
+ * Returns the RAM usage of quantization-specific state only (magnitudes, dimensionSums, shallow
+ * object overhead). The underlying flat vector data is tracked separately by the
+ * rawVectorDelegate at the writer level to avoid double-counting.
+ */
+ long quantizationOverheadBytesUsed() {
+ long size = SHALLOW_SIZE;
+ size += magnitudes.ramBytesUsed();
+ size += RamUsageEstimator.sizeOf(dimensionSums);
+ return size;
+ }
+
  @Override
  public long ramBytesUsed() {
- long size = SHALLOW_SIZE;
+ long size = quantizationOverheadBytesUsed();
  size += flatFieldVectorsWriter.ramBytesUsed();
- size += magnitudes.ramBytesUsed();
return size;
}
}
@@ -444,6 +444,75 @@ public void testWriterRamEstimate() throws Exception {
dir.close();
}

+ @SuppressWarnings("unchecked")
+ public void testWriterByteVectorRamEstimate() throws Exception {
+ final FieldInfos fieldInfos = new FieldInfos(new FieldInfo[0]);
+ final Directory dir = newDirectory();
+ Codec codec = Codec.getDefault();
+ final SegmentInfo si =
+ new SegmentInfo(
+ dir,
+ Version.LATEST,
+ Version.LATEST,
+ "0",
+ 10000,
+ false,
+ false,
+ codec,
+ Collections.emptyMap(),
+ StringHelper.randomId(),
+ new HashMap<>(),
+ null);
+ final SegmentWriteState state =
+ new SegmentWriteState(
+ InfoStream.getDefault(), dir, si, fieldInfos, null, newIOContext(random()));
+ final KnnVectorsFormat format = codec.knnVectorsFormat();
+ try (KnnVectorsWriter writer = format.fieldsWriter(state)) {
+ int dim = random().nextInt(64) + 1;
+ if (dim % 2 == 1) {
+ ++dim;
+ }
+ int numDocs = atLeast(100);
+ KnnFieldVectorsWriter<byte[]> fieldWriter =
+ (KnnFieldVectorsWriter<byte[]>)
+ writer.addField(
+ new FieldInfo(
+ "fieldA",
+ 0,
+ false,
+ false,
+ false,
+ IndexOptions.NONE,
+ DocValuesType.NONE,
+ DocValuesSkipIndexType.NONE,
+ -1,
+ Map.of(),
+ 0,
+ 0,
+ 0,
+ dim,
+ VectorEncoding.BYTE,
+ VectorSimilarityFunction.DOT_PRODUCT,
+ false,
+ false));
+ for (int i = 0; i < numDocs; i++) {
+ fieldWriter.addValue(i, randomVector8(dim));
+ }
+ // Validate the field-level RAM accounting uses correct byte sizes.
+ // The reported RAM must be at least the raw byte vector data.
+ final long fieldRamBytesUsed = fieldWriter.ramBytesUsed();
+ final long rawByteVectorData = (long) dim * numDocs * Byte.BYTES;
+ assertTrue(
+ "Expected field ramBytesUsed ("
+ + fieldRamBytesUsed
+ + ") >= raw byte vector data size ("
+ + rawByteVectorData
+ + ")",
+ fieldRamBytesUsed >= rawByteVectorData);
+ }
+ dir.close();
+ }

public void testIllegalSimilarityFunctionChangeTwoWriters() throws Exception {
try (Directory dir = newDirectory()) {
try (IndexWriter w = new IndexWriter(dir, newIndexWriterConfig())) {