[Groupby Query Metrics] Add merge buffer tracking #18731
base: master
Conversation
```java
if (perQueryStats.getMergeBufferAcquisitionTimeNs() > 0) {
  mergeBufferQueries++;
  mergeBufferAcquisitionTimeNs += perQueryStats.getMergeBufferAcquisitionTimeNs();
  mergeBufferTotalUsage += perQueryStats.getMergeBufferTotalUsage();
```
@GWphua Instead of summing here, what do you think about taking the max? Then the metric emitted would be the max merge buffer usage of a single query in that emission period. This would be a good signal for operators on whether they need to tweak the mergeBuffer size.
I was thinking about this.
There are other metrics where taking MAX will also make sense --
spilledBytes --> How much storage would be good to configure?
dictionarySize --> How large can the merge dictionary size get?
I am considering adding max variants of these metrics (e.g. maxSpilledBytes, maxDictionarySize). What do you think?
Yeah, agreed. I do think it makes sense to have it for those 3 metrics.
Even for mergeBuffer/acquisitionTimeNs, I think there's value in having the max, as it gives operators a signal on whether they need to increase numMergeBuffers.
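For illustration, here is a minimal sketch of the sum-plus-max aggregation being discussed; the class and field names are hypothetical and not taken from this PR.

```java
// Minimal sketch (hypothetical names): aggregate per-query merge buffer stats,
// keeping both a running total and the maximum observed for any single query
// in the current emission period.
public class MergeBufferStatsAggregator
{
  private long mergeBufferQueries = 0;
  private long totalAcquisitionTimeNs = 0;
  private long maxAcquisitionTimeNs = 0;
  private long totalUsageBytes = 0;
  private long maxUsageBytes = 0;

  public synchronized void accumulate(long acquisitionTimeNs, long usageBytes)
  {
    if (acquisitionTimeNs > 0) {
      mergeBufferQueries++;
      totalAcquisitionTimeNs += acquisitionTimeNs;
      totalUsageBytes += usageBytes;
      // MAX answers "how big did a single query get?", which is the signal an
      // operator needs when deciding whether to resize merge buffers.
      maxAcquisitionTimeNs = Math.max(maxAcquisitionTimeNs, acquisitionTimeNs);
      maxUsageBytes = Math.max(maxUsageBytes, usageBytes);
    }
  }

  /** Returns the max usage seen this period and resets it for the next emission. */
  public synchronized long getAndResetMaxUsageBytes()
  {
    final long result = maxUsageBytes;
    maxUsageBytes = 0;
    return result;
  }
}
```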
Thanks for adding those max metrics @GWphua! What do you think about adding […]
Hi @aho135, thanks for the review! I also find that it would be very helpful to emit metrics for each query, so we know which queries take up a lot of resources. In our version of Druid, we simply appended each of the […]. Alternatively, we can look into migrating the groupBy query metrics in […]. We can do more of this in a separate PR.
Sounds good @GWphua, I was thinking along very similar lines to emit these from […]. I have a first draft of this: aho135@9f82091
Hi @aho135, since the scope of adding […]
I have a draft for […]
Hi @gianm, I would appreciate a review/feedback on this PR. Thanks!
Fixes #17902
Huge thanks to @gianm for the implementation tip in the issue!
Description
Tracking merge buffer usage
- Merge buffer usage is tracked in `AbstractBufferHashGrouper` and its implementations.
- Usage is derived from the `ByteBufferHashTable` along with an offset tracker.
- The reported value is based on the size of the `ByteBufferHashTable` and the maximum offset size, calculated throughout the query's lifecycle.
- Incorporated a helpful suggestion by @aho135: since the size of the hash tables is ever-changing, it makes sense to take the maximum values across queries, so operators can get a better understanding of how merge buffer sizes should be configured.
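To make the calculation above concrete, here is a small, purely illustrative sketch of tracking peak usage from the hash table size and the largest offset reached during a query; the names do not mirror the actual `ByteBufferHashTable` API, and combining the two quantities by addition is an assumption of this sketch.

```java
// Illustrative sketch only: track a high-water mark of merge buffer usage as
// the grouper's hash table grows and as values are written at larger offsets.
public class MergeBufferUsageTracker
{
  private long maxTableSizeBytes = 0;  // largest hash table buffer observed
  private long maxOffsetBytes = 0;     // furthest offset written during the query

  public void onTableGrowth(long tableSizeBytes)
  {
    maxTableSizeBytes = Math.max(maxTableSizeBytes, tableSizeBytes);
  }

  public void onOffsetAdvanced(long offsetBytes)
  {
    maxOffsetBytes = Math.max(maxOffsetBytes, offsetBytes);
  }

  /** Approximate peak usage; adding the two parts is an assumption of this sketch. */
  public long peakUsageBytes()
  {
    return maxTableSizeBytes + maxOffsetBytes;
  }
}
```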
Here's an example of the current SUM implementation vs. the MAX implementation. The latter helps tell us that we should probably configure merge buffer sizes to 2G for this case:

Release note
`GroupByStatsMonitor` now provides the metric "mergeBuffer/bytesUsed", as well as max metrics for merge buffer acquisition time, bytes used, spilled bytes, and merge dictionary size.

Key changed/added classes in this PR
`GroupByStatsProvider`
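For context, a monitor emitting these values on a schedule might look roughly like the sketch below; the supplier and emitter interfaces here are stand-ins, not Druid's actual `Monitor`/`ServiceEmitter` APIs.

```java
import java.util.function.LongSupplier;
import java.util.function.ObjLongConsumer;

// Rough sketch of periodic emission: each monitoring cycle reads an aggregated
// value and emits it under the metric name from the release note above. Both
// the supplier and the emit callback are placeholders, not Druid APIs.
public class MergeBufferMetricsEmitter
{
  private final LongSupplier bytesUsed;
  private final ObjLongConsumer<String> emit;

  public MergeBufferMetricsEmitter(LongSupplier bytesUsed, ObjLongConsumer<String> emit)
  {
    this.bytesUsed = bytesUsed;
    this.emit = emit;
  }

  public void doMonitor()
  {
    emit.accept("mergeBuffer/bytesUsed", bytesUsed.getAsLong());
    // The max metrics (acquisition time, bytes used, spilled bytes, dictionary
    // size) would be emitted the same way, resetting per-period maxima after.
  }
}
```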
Possible further enhancements
While building this PR, I have come across some areas that could be further enhanced in the future:
Nested Group-bys
The current metric is great, but will not report accurately for nested group-bys. (Do correct me on this if I'm mistaken though!)
As far as I know, nested group-bys limit the merge buffer usage count to 2, meaning merge buffers will be re-used. IIUC, every ConcurrentGrouper (if concurrency is enabled) / SpillingGrouper (if concurrency is disabled) is created and closed multiple times, and hence a per-query metric will likely over-report the merge buffer usage. For example, if the same merge buffer is acquired and released three times within one nested query, summing the usage of each grouper would report roughly three times the actual peak.
Simplify Memory Management
Right now we need to configure the following for each queryable service:
It would be great if we could simplify the calculations down to simply configuring direct memory and managing a memory pool instead. This would allow for more flexibility (unused memory allocated for merge buffers could be used by processing threads instead).
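As a thought experiment only (this is not an existing Druid feature), a single shared direct-memory budget that both merge buffers and processing buffers draw from could look something like this sketch:

```java
// Conceptual sketch only: one pool of direct-memory budget shared by merge
// buffers and processing buffers, so memory not used by one can be borrowed
// by the other instead of being reserved separately for each.
public class SharedDirectMemoryPool
{
  private final long capacityBytes;
  private long usedBytes = 0;

  public SharedDirectMemoryPool(long capacityBytes)
  {
    this.capacityBytes = capacityBytes;
  }

  /** Attempt to reserve budget; callers fall back to spilling or waiting on failure. */
  public synchronized boolean tryReserve(long bytes)
  {
    if (usedBytes + bytes > capacityBytes) {
      return false;
    }
    usedBytes += bytes;
    return true;
  }

  public synchronized void release(long bytes)
  {
    usedBytes = Math.max(0, usedBytes - bytes);
  }
}
```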