DPL Analysis: shortcircuit empty slices in Preslice / cached sliceBy
Extend the GroupSlicer empty-slice shortcircuit to the user-side slicing paths: PreslicePolicySorted::getSliceFor (doSliceBy, doFilteredSliceBy) and doSliceByCached / doFilteredSliceByCached via a new ArrowTableSlicingCache::getEmptySliceFor.
arrow::Table::Slice allocates per column even for 0 rows, which is pure waste for empty groups, the common case when slicing sparse candidate tables per collision. Both primitives now return a lazily built empty table that is cached in a one-slot {table pointer, slice} pair and rebuilt when the input table changes with the next dataframe.
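The one-slot cache described above can be illustrated with a minimal sketch. It does not reproduce the actual O2 code: `Table` is a hypothetical stand-in for arrow::Table, `EmptySliceCache` and `makeEmptyLike` are made-up names, and only `getEmptySliceFor` borrows its name from this commit (its real signature is not shown here and is assumed).

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for arrow::Table; holds only what the sketch needs.
struct Table {
  std::vector<std::string> columns;
  long rows;
};

// One-slot cache: remembers the last input table pointer and the empty
// slice built for it. Names and layout are assumptions for illustration.
class EmptySliceCache {
 public:
  int builds = 0;  // counts how often an empty table was actually built

  std::shared_ptr<Table> getEmptySliceFor(const std::shared_ptr<Table>& input) {
    // Rebuild only when the input table differs from the cached key,
    // e.g. when the next dataframe brings a new table.
    if (mKey != input.get() || !mEmpty) {
      mEmpty = makeEmptyLike(*input);
      mKey = input.get();
    }
    return mEmpty;
  }

 private:
  // Hypothetical helper: a 0-row table with the same columns as the input.
  std::shared_ptr<Table> makeEmptyLike(const Table& input) {
    ++builds;
    return std::make_shared<Table>(Table{input.columns, 0});
  }

  const Table* mKey = nullptr;    // identity of the last input table
  std::shared_ptr<Table> mEmpty;  // lazily built empty slice for that table
};
```

In this sketch, repeated requests for the same input table return the same shared empty table, so the per-column allocation happens once per input table rather than once per empty group. Keying on a raw pointer assumes the input table stays alive while its entry is cached.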
Measured on Hyperloop (taskLc on LHC23 2P3PDstar, tests 695219/695224): hf-task-lc device CPU -61%, train throughput +10%. Dense-slicing chains unaffected.