[bugfix] Fixing instance partition assignments for multi-stream realtime tables#18433

Open
shauryachats wants to merge 3 commits into apache:master from shauryachats:instanceassignfix

@shauryachats shauryachats commented May 6, 2026

Summary

Multi-stream realtime tables encode Pinot partition IDs as streamIndex * 10000 + streamPartitionId. Before this fix, RealtimeSegmentAssignment and ReplicaGroupSegmentAssignmentStrategy used the raw (encoded) partition ID directly when computing instance slots, causing incorrect slot mapping and breaking colocation of segments belonging to the same partition.

Changes:

  • In RealtimeSegmentAssignment.assignConsumingSegment, decode the Pinot partition ID to the stream-level partition ID via IngestionConfigUtils.getStreamPartitionIdFromPinotPartitionId before computing the instance index.
  • In ReplicaGroupSegmentAssignmentStrategy, extract a getPartitionIdFromSegmentName helper that applies the same decoding for REALTIME tables before % numPartitions, fixing both single-segment assignment and rebalance paths.
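
The effect of decoding before the modulo can be sketched in a few lines of Java. This is only an illustration of the arithmetic described in the summary, not the code in this PR: the class name, method names, and the STREAM_INDEX_MULTIPLIER constant are hypothetical, and the decode step is assumed to be a plain modulo by 10000.

```java
// Illustrative sketch only; names here are hypothetical and are not Pinot APIs.
final class PartitionIdDecodingSketch {
  // Assumption: multiplier taken from the encoding streamIndex * 10000 + streamPartitionId.
  static final int STREAM_INDEX_MULTIPLIER = 10000;

  // Encodes a (streamIndex, streamPartitionId) pair into a Pinot partition ID.
  static int encode(int streamIndex, int streamPartitionId) {
    return streamIndex * STREAM_INDEX_MULTIPLIER + streamPartitionId;
  }

  // Assumed decoding: recovers the stream-level partition ID from the encoded ID.
  static int decodeStreamPartitionId(int pinotPartitionId) {
    return pinotPartitionId % STREAM_INDEX_MULTIPLIER;
  }

  // Old behavior: the encoded ID is used directly in the modulo.
  static int instancePartitionBeforeFix(int pinotPartitionId, int numPartitions) {
    return pinotPartitionId % numPartitions;
  }

  // New behavior: decode first, then take the modulo.
  static int instancePartitionAfterFix(int pinotPartitionId, int numPartitions) {
    return decodeStreamPartitionId(pinotPartitionId) % numPartitions;
  }

  public static void main(String[] args) {
    int numPartitions = 3;
    // Stream partition 0 of the first stream and of the second stream.
    int[] pinotPartitionIds = {encode(0, 0), encode(1, 0)};
    for (int id : pinotPartitionIds) {
      System.out.printf("pinotPartitionId=%d before=%d after=%d%n",
          id,
          instancePartitionBeforeFix(id, numPartitions),
          instancePartitionAfterFix(id, numPartitions));
    }
  }
}
```

Whenever the multiplier is not a multiple of numPartitions, the two streams can disagree on the slot for the same stream partition ID unless the ID is decoded first.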

Testing

Deployed this on an internal cluster containing a multi-topic table and verified the behavior with the following instanceAssignmentConfigMap:

"instanceAssignmentConfigMap": {
  "CONSUMING": {
    "tagPoolConfig": {
      "tag": "cluster_REALTIME",
      "poolBased": false,
      "numPools": 0
    },
    "replicaGroupPartitionConfig": {
      "replicaGroupBased": true,
      "numInstances": 0,
      "numReplicaGroups": 2,
      "numInstancesPerReplicaGroup": 3,
      "numPartitions": 3,
      "numInstancesPerPartition": 1,
      "minimizeDataMovement": true,
      "partitionColumn": "trace_id"
    },
    "partitionSelector": "INSTANCE_REPLICA_GROUP_PARTITION_SELECTOR",
    "minimizeDataMovement": false
  },
  "COMPLETED": {
    "tagPoolConfig": {
      "tag": "cluster_REALTIME",
      "poolBased": false,
      "numPools": 0
    },
    "replicaGroupPartitionConfig": {
      "replicaGroupBased": true,
      "numInstances": 0,
      "numReplicaGroups": 2,
      "numInstancesPerReplicaGroup": 3,
      "numPartitions": 3,
      "numInstancesPerPartition": 1,
      "minimizeDataMovement": true,
      "partitionColumn": "trace_id"
    },
    "partitionSelector": "INSTANCE_REPLICA_GROUP_PARTITION_SELECTOR",
    "minimizeDataMovement": false
  }
},

With this setup and without the fix, the raw partition IDs 0 and 10000 map to instance partitions 0 % 3 = 0 and 10000 % 3 = 1, respectively. The same stream partition ID therefore lands on different servers, which breaks colocation of data:

"traces_instance_partitions__0__8__20260505T0158Z": {
  "68456b84-25be-48f4-98ee-8ce8999dbc02": "ONLINE",
  "79225964-ca5b-4ffc-9e05-f41300a9d20a": "ONLINE"
},
"traces_instance_partitions__0__9__20260505T0233Z": {
  "68456b84-25be-48f4-98ee-8ce8999dbc02": "ONLINE",
  "79225964-ca5b-4ffc-9e05-f41300a9d20a": "ONLINE"
},
"traces_instance_partitions__10000__0__20260505T0119Z": {
  "89369471-20ad-413e-a439-39d6e9b1f4f2": "ONLINE",
  "b1550ff3-051d-425b-8884-3c7b3d75335b": "ONLINE"
},
"traces_instance_partitions__10000__10__20260505T0257Z": {
  "89369471-20ad-413e-a439-39d6e9b1f4f2": "ONLINE",
  "b1550ff3-051d-425b-8884-3c7b3d75335b": "ONLINE"
},

After the fix, both 0 and 10000 map to the same instance partition: 0 % 3 = 0 and (10000 % 10000) % 3 = 0.

"traces_instance_partitions__0__8__20260505T0236Z": {
  "24c00225-df6b-493e-aa87-df9c17f1b4bd": "ONLINE",
  "2e1ccdfa-5b7e-4bd6-8118-c195a4eb6ead": "ONLINE"
},
"traces_instance_partitions__0__9__20260505T0300Z": {
  "24c00225-df6b-493e-aa87-df9c17f1b4bd": "ONLINE",
  "2e1ccdfa-5b7e-4bd6-8118-c195a4eb6ead": "ONLINE"
},
"traces_instance_partitions__10000__0__20260505T0146Z": {
  "24c00225-df6b-493e-aa87-df9c17f1b4bd": "ONLINE",
  "2e1ccdfa-5b7e-4bd6-8118-c195a4eb6ead": "ONLINE"
},
"traces_instance_partitions__10000__10__20260505T0338Z": {
  "24c00225-df6b-493e-aa87-df9c17f1b4bd": "ONLINE",
  "2e1ccdfa-5b7e-4bd6-8118-c195a4eb6ead": "ONLINE"
},
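
A check like the one above can also be scripted against segment names. The following is a minimal sketch, assuming segment names follow the table__partitionId__sequenceNumber__creationTime pattern seen in the listings and that decoding is a modulo by 10000; the class and method names are hypothetical helpers, not Pinot APIs.

```java
import java.util.List;

// Illustrative sketch only; helper names are hypothetical and are not Pinot APIs.
final class SegmentColocationSketch {
  // Assumption: multiplier from the encoding streamIndex * 10000 + streamPartitionId.
  static final int STREAM_INDEX_MULTIPLIER = 10000;

  // Reads the partition ID field, assuming the name ends with
  // __partitionId__sequenceNumber__creationTime.
  static int partitionIdFromSegmentName(String segmentName) {
    String[] parts = segmentName.split("__");
    if (parts.length < 4) {
      throw new IllegalArgumentException("Unexpected segment name: " + segmentName);
    }
    return Integer.parseInt(parts[parts.length - 3]);
  }

  // Instance partition expected after the fix: decode first, then take the modulo.
  static int expectedInstancePartition(String segmentName, int numPartitions) {
    int pinotPartitionId = partitionIdFromSegmentName(segmentName);
    return (pinotPartitionId % STREAM_INDEX_MULTIPLIER) % numPartitions;
  }

  public static void main(String[] args) {
    List<String> segments = List.of(
        "traces_instance_partitions__0__8__20260505T0236Z",
        "traces_instance_partitions__0__9__20260505T0300Z",
        "traces_instance_partitions__10000__0__20260505T0146Z",
        "traces_instance_partitions__10000__10__20260505T0338Z");
    for (String segment : segments) {
      System.out.println(segment + " -> instance partition "
          + expectedInstancePartition(segment, 3));
    }
  }
}
```

Segments that resolve to the same instance partition are expected to share the same set of servers in the ideal state, which is what the listing after the fix shows.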

@shauryachats shauryachats added the bug (Something is not working as expected) and real-time (Related to realtime table ingestion and serving) labels May 6, 2026

codecov-commenter commented May 6, 2026

Codecov Report

❌ Patch coverage is 88.23529% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.58%. Comparing base (5ccd383) to head (97c430a).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...g/apache/pinot/spi/utils/IngestionConfigUtils.java | 75.00% | 0 Missing and 2 partials ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18433      +/-   ##
============================================
+ Coverage     63.57%   63.58%   +0.01%     
  Complexity     1717     1717              
============================================
  Files          3252     3252              
  Lines        199132   199144      +12     
  Branches      30875    30877       +2     
============================================
+ Hits         126596   126627      +31     
+ Misses        62454    62437      -17     
+ Partials      10082    10080       -2     

| Flag | Coverage Δ |
|---|---|
| custom-integration1 | 100.00% <ø> (ø) |
| integration | 100.00% <ø> (ø) |
| integration1 | 100.00% <ø> (ø) |
| integration2 | 0.00% <ø> (ø) |
| java-21 | 63.58% <88.23%> (+0.01%) ⬆️ |
| temurin | 63.58% <88.23%> (+0.01%) ⬆️ |
| unittests | 63.58% <88.23%> (+0.01%) ⬆️ |
| unittests1 | 55.67% <0.00%> (+0.01%) ⬆️ |
| unittests2 | 34.91% <88.23%> (+<0.01%) ⬆️ |
