@GWphua (Contributor) commented Nov 10, 2025

Fixes #17902

Huge thanks to @gianm for the implementation tip in the issue!

Description

Tracking merge buffer usage

  • Usage of a direct byte buffer is tracked in AbstractBufferHashGrouper and its implementations.
  1. Each direct byte buffer is backed by a ByteBufferHashTable along with an offset tracker.
  2. Usage is calculated by tracking the maximum capacity of the byte buffer in ByteBufferHashTable, together with the maximum offset observed throughout the query's lifecycle.

Incorporated a helpful suggestion by @aho135: since the sizes of the hash tables are ever-changing, it makes sense to take the maximum values across queries, so operators can better understand how merge buffer sizes should be configured.
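As a rough illustration of the tracking described above (class and method names here are hypothetical, not the ones introduced in this PR), peak usage can be derived from the maximum table capacity and maximum offset observed over a query's lifecycle:

```java
// Hypothetical sketch of peak-usage tracking; PeakUsageTracker is an
// illustrative name, not a class from this PR.
public class PeakUsageTracker
{
  private long maxTableCapacityBytes = 0;
  private long maxOffsetBytes = 0;

  // Called whenever the hash table grows or the aggregation offset advances.
  public void update(long tableCapacityBytes, long offsetBytes)
  {
    maxTableCapacityBytes = Math.max(maxTableCapacityBytes, tableCapacityBytes);
    maxOffsetBytes = Math.max(maxOffsetBytes, offsetBytes);
  }

  // Peak merge buffer usage observed so far, in bytes.
  public long getPeakUsageBytes()
  {
    return maxTableCapacityBytes + maxOffsetBytes;
  }
}
```

Tracking the two maxima separately (rather than a single running total) means the reported figure reflects the largest footprint each component actually reached, even if they peaked at different points in the query.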

Here's an example of the current SUM implementation vs. the MAX implementation. The latter tells us that we should probably configure merge buffer sizes to 2G for this case:
[screenshot: SUM vs. MAX merge buffer usage metrics]

Release note

GroupByStatsMonitor now provides a mergeBuffer/bytesUsed metric, plus max metrics for merge buffer acquisition time, bytes used, spilled bytes, and merge dictionary size.


Key changed/added classes in this PR
  • GroupByStatsProvider
  • Groupers and their underlying ByteBuffer tables/lists.

This PR has:

  • been self-reviewed.
  • a release note entry in the PR description.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

Possible further enhancements

While building this PR, I came across some problems that we can address in the future:

Nested Group-bys

The current metric works well, but will not report accurately for nested group-bys. (Do correct me on this if I'm mistaken though!)

As far as I know, nested group-bys limit the merge buffer usage count to 2, meaning the merge buffer will be re-used. IIUC, every ConcurrentGrouper (if concurrency is enabled) / SpillingGrouper (if concurrency is disabled) is created and closed multiple times, so a per-query metric will likely over-report merge buffer usage.

Simplify Memory Management

Right now we need to configure the following for each queryable service:

  1. size of each merge buffer
  2. number of merge buffers
  3. direct memory = (numProcessingThreads + numMergeBuffers + 1) * mergeBufferSizeBytes

It would be great if we could simplify this down to configuring direct memory alone and manage a memory pool instead. This allows for more flexibility: unused memory allocated for merge buffers could be used by processing threads instead.
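The sizing rule above can be made concrete with a small worked example (the thread and buffer counts below are hypothetical, chosen only for illustration):

```java
// Worked example of the direct memory sizing rule:
// directMemory = (numProcessingThreads + numMergeBuffers + 1) * mergeBufferSizeBytes
public class DirectMemorySizing
{
  static long requiredDirectMemoryBytes(int numProcessingThreads, int numMergeBuffers, long mergeBufferSizeBytes)
  {
    return (numProcessingThreads + numMergeBuffers + 1L) * mergeBufferSizeBytes;
  }

  public static void main(String[] args)
  {
    // e.g. 7 processing threads, 4 merge buffers, 256 MiB per buffer:
    // (7 + 4 + 1) * 256 MiB = 3 GiB of direct memory
    System.out.println(requiredDirectMemoryBytes(7, 4, 256L << 20));
  }
}
```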

@GWphua requested a review from gianm on November 10, 2025 03:18
```java
if (perQueryStats.getMergeBufferAcquisitionTimeNs() > 0) {
  mergeBufferQueries++;
  mergeBufferAcquisitionTimeNs += perQueryStats.getMergeBufferAcquisitionTimeNs();
  mergeBufferTotalUsage += perQueryStats.getMergeBufferTotalUsage();
}
```
aho135 (Contributor):
@GWphua Instead of summing here, what do you think about taking the max? Then the metric emitted would be the max merge buffer usage of a single query in that emission period. This would be a good signal for operators on whether they need to tweak the mergeBuffer size.
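A simplified sketch of what that change could look like (PerQueryStats here is a minimal stub standing in for the PR's real class, which lives alongside GroupByStatsProvider):

```java
import java.util.List;

public class MaxUsageAccumulator
{
  // Minimal stub of the per-query stats carrier; the real class has more fields.
  record PerQueryStats(long mergeBufferAcquisitionTimeNs, long mergeBufferTotalUsage) {}

  // Instead of summing per-query usage, keep the maximum usage of any single
  // query in the emission period.
  static long maxMergeBufferUsage(List<PerQueryStats> statsThisPeriod)
  {
    long max = 0;
    for (PerQueryStats stats : statsThisPeriod) {
      // Only count queries that actually acquired a merge buffer.
      if (stats.mergeBufferAcquisitionTimeNs() > 0) {
        max = Math.max(max, stats.mergeBufferTotalUsage());
      }
    }
    return max;
  }
}
```

With this shape, the emitted number answers "how big did the hungriest single query get this period?", which maps directly to the mergeBuffer size an operator should provision.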

GWphua (Contributor, Author):
I was thinking about this.

There are other metrics where taking MAX will also make sense:
  • spilledBytes → how much storage would be good to configure?
  • dictionarySize → how large can the merge dictionary size get?

I am considering adding max counterparts (maxSpilledBytes, maxDictionarySize). What do you think?

aho135 (Contributor):
Yeah, agreed. I do think it makes sense to have it for those 3 metrics.

Even for mergeBuffer/acquisitionTimeNs, I think there's value in having the max, as it gives operators a signal on whether to increase numMergeBuffers.

@aho135 (Contributor) commented Nov 24, 2025

Thanks for adding those max metrics @GWphua!

What do you think about adding sqlQueryId as a dimension only for the MAX metrics? I think this would be useful for understanding how much the query execution time was affected by the mergeBufferAcquisition. Can also do this in a follow-up PR if you think it's useful.

@GWphua (Contributor, Author) commented Nov 25, 2025

Hi @aho135,

Thanks for the review! I also find that it would be very helpful to emit metrics for each query, so we know which queries take up a lot of resources. In our version of Druid, we simply appended each PerQueryStat to the statsMap in QueryLifecycle#emitLogsAndMetrics, but I feel it's quite a hacky way of doing it. sqlQueryId as a dimension in GroupByStatsMonitor will definitely help.

Alternatively, we can look into migrating the groupBy query metrics in GroupByStatsMonitor to GroupByQueryMetrics, which should emit metrics for each GroupBy query. That way, the MAX and SUM metrics would become redundant, since we could emit metrics for each query.

We can do more of this in a separate PR.

@aho135 (Contributor) commented Nov 25, 2025

Sounds good @GWphua, I was thinking along very similar lines: emit these from GroupByQueryMetrics.

I have a first draft on this: aho135@9f82091
Lmk if you have any thoughts on this. Thanks!

@GWphua (Contributor, Author) commented Nov 26, 2025

Hi @aho135, since adding GroupByQueryMetrics is out of scope for this PR, I have created #18781 so we can discuss it further there.

> I have a first draft on this: aho135@9f82091
> Lmk if you have any thoughts on this. Thanks!

I had a draft for GroupByQueryMetrics before creating this PR, and it is a direct extension of the implementation you shared. I will try to create a PR with that draft soon. I was actually hoping to get this PR merged before sharing the draft, because the draft is a follow-up to this PR.

@GWphua (Contributor, Author) commented Dec 31, 2025

Hi @gianm, I would appreciate a review of or feedback on this PR. Thanks!

Successfully merging this pull request may close the following issue: "Add metric for merge buffers usage in bytes" (#17902)