Support partial metadata based aggregation when some projection aggregations are not metadata compatible#18334

Open
adasari wants to merge 1 commit into apache:master from adasari:adasari/support-partial-metadata

Conversation

@adasari adasari commented Apr 25, 2026

Metadata-based aggregation is currently applied only when every aggregation function in the projection is metadata-compatible.

If even one aggregation function is not metadata-compatible, the query engine falls back to scan-based execution for all aggregations, including those that could have been computed from metadata. This affects any query that mixes the two kinds:

  • Metadata-compatible aggregations (e.g., COUNT, MIN, MAX)
  • Non-metadata-compatible aggregations (e.g., SUM, DISTINCT, or others)

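The result is missed optimization opportunities and unnecessary full segment scans. For example, in the query below COUNT(*) and MAX(col1) could be answered from metadata, but the presence of SUM(col2) forces scan-based execution for all three: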

SELECT COUNT(*), MAX(col1), SUM(col2)
FROM my_table;

This PR adds partial metadata-based aggregation, for the aggregation plan node only.
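
To illustrate the idea (this is not the code in this PR), here is a minimal sketch of splitting a projection's aggregations into a metadata side and a scan side. The class `PartialMetadataPlanSketch`, the method `split`, and the hard-coded `METADATA_COMPATIBLE` set are hypothetical names for illustration only; the set simply mirrors the examples listed in the description and is not taken from the query engine.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: none of these names come from the actual code base.
final class PartialMetadataPlanSketch {

    // Assumption: the functions the description names as metadata-compatible.
    static final Set<String> METADATA_COMPATIBLE = Set.of("COUNT", "MIN", "MAX");

    /**
     * Splits aggregation function names into two groups:
     * key true  -> can be answered from metadata,
     * key false -> still needs a scan.
     * Input order is preserved within each group.
     */
    static Map<Boolean, List<String>> split(List<String> functionNames) {
        return functionNames.stream()
            .collect(Collectors.partitioningBy(
                name -> METADATA_COMPATIBLE.contains(name.toUpperCase(Locale.ROOT))));
    }

    public static void main(String[] args) {
        // Function names from: SELECT COUNT(*), MAX(col1), SUM(col2) FROM my_table;
        Map<Boolean, List<String>> groups = split(List.of("COUNT", "MAX", "SUM"));
        System.out.println("metadata: " + groups.get(true));  // metadata: [COUNT, MAX]
        System.out.println("scan: " + groups.get(false));     // scan: [SUM]
    }
}
```

With a split like this, only the scan-side functions would need to read column values; the metadata-side results could be filled in separately and merged back in projection order.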

Pending:

  • Unit tests

adasari commented Apr 25, 2026

@Jackie-Jiang Please let me know your thoughts on the changes. I’ll add unit and integration tests once the approach is confirmed. Thanks in advance.
