@jainankitk
Contributor

@jainankitk jainankitk commented Oct 8, 2025

Description

This change optimizes sub-aggregations by combining two techniques: multi-range traversal for the top-level aggregation and the skip list for the sub-aggregation.

For now, I have copied Lucene's BitSetDocIdStream class into OpenSearch because it is not public in Lucene; we should eventually get rid of the copy.
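
As a rough mental model of the two stages (not this PR's implementation): the top-level range aggregation records the matching doc IDs per bucket in a bitset, and the sub aggregation then consumes each bucket's doc IDs as an ordered stream. In the toy Java sketch below, a linear scan stands in for the multi-range traversal and `BitSet.nextSetBit` stands in for skipping ahead; the class and method names are made up for illustration.

```java
import java.util.BitSet;

// Toy model only: plain arrays stand in for an index segment, and none of
// these names come from the PR or from Lucene.
class RangeThenSubAggSketch {
    // Stage 1 (top level): assign each doc to its range bucket.
    // ranges[i] = {lowInclusive, highExclusive}; one BitSet of doc IDs per bucket.
    static BitSet[] collectBuckets(long[] rangeField, long[][] ranges) {
        BitSet[] buckets = new BitSet[ranges.length];
        for (int i = 0; i < ranges.length; i++) {
            buckets[i] = new BitSet(rangeField.length);
        }
        for (int doc = 0; doc < rangeField.length; doc++) {
            for (int i = 0; i < ranges.length; i++) {
                if (rangeField[doc] >= ranges[i][0] && rangeField[doc] < ranges[i][1]) {
                    buckets[i].set(doc);
                }
            }
        }
        return buckets;
    }

    // Stage 2 (sub aggregation): walk one bucket's doc IDs in increasing order,
    // jumping straight to the next matching doc instead of visiting every doc.
    static long sum(BitSet docs, long[] subField) {
        long total = 0;
        for (int doc = docs.nextSetBit(0); doc >= 0; doc = docs.nextSetBit(doc + 1)) {
            total += subField[doc];
        }
        return total;
    }
}
```

Calling `collectBuckets` once and then `sum` per bucket mirrors the intended split: the parent decides which docs belong to which bucket, and the sub aggregation only ever sees the docs of one bucket at a time.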

Related Issues

Related to #17447, #19384

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@jainankitk jainankitk changed the title Combining filter rewrite and skip list approaches for further optimiz… Combining filter rewrite and skip list to optimize sub aggregation Oct 8, 2025
Signed-off-by: Ankit Jain <[email protected]>
@github-actions
Contributor

github-actions bot commented Oct 8, 2025

❌ Gradle check result for 82bc95d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@jainankitk jainankitk marked this pull request as ready for review October 9, 2025 00:04
@github-actions
Contributor

github-actions bot commented Oct 9, 2025

❌ Gradle check result for aff3dc6: FAILURE

@github-actions
Contributor

❌ Gradle check result for eaf7e52: FAILURE

Signed-off-by: Ankit Jain <[email protected]>
@jainankitk
Contributor Author

{"run-benchmark-test": "id_3"}

@jainankitk
Contributor Author

{"run-benchmark-test": "id_12"}

Comment on lines +109 to +114
LeafBucketCollector sub = null;
if (collectableSubAggregators instanceof BFSCollector bfsCollector) {
    sub = bfsCollector.getBFSLeafCollector(leafCtx);
} else {
    sub = collectableSubAggregators.getLeafCollector(leafCtx);
}
Contributor Author

IMO, this should be the other way around: the sub collector should be able to check whether the parent is an implementation of BFSCollector. Maybe it could somehow know that the invocation is from SubAggRangeCollector?

Contributor

With the current API, the called collector doesn't know who is calling it, i.e. there is a parent, but that will be RangeCollector and not SubAggRangeCollector. I could add some logic to RangeCollector so it can be asked whether it is on the precompute path.

Another option is inspecting the stack trace, but that feels even more hacky, especially since it's all OS code.
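
For illustration, the "ask the parent" option could look roughly like the sketch below. Everything here is hypothetical: `ParentCollector`, `isPrecomputePath()`, `SubCollector`, and `LeafCollector` are stand-ins, not the actual OpenSearch types or this PR's code.

```java
// Hypothetical stand-ins for illustration; not the real OpenSearch types.
interface LeafCollector {
    String describe();
}

interface ParentCollector {
    // Assumed hook: true when the parent is running the precompute path.
    boolean isPrecomputePath();
}

class SubCollector {
    private final ParentCollector parent;

    SubCollector(ParentCollector parent) {
        this.parent = parent;
    }

    // The sub collector picks its own leaf collector by asking the parent,
    // so the calling code needs no instanceof check.
    LeafCollector getLeafCollector() {
        if (parent != null && parent.isPrecomputePath()) {
            return () -> "bfs";
        }
        return () -> "default";
    }
}
```

The trade-off is that the parent has to expose its execution mode, which widens its API, but the dispatch decision then lives in one place inside the sub collector.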

@github-actions
Contributor

❌ Gradle check result for 7c05efe: FAILURE

@jainankitk
Contributor Author

{"run-benchmark-test": "id_12"}

@github-actions
Contributor

The Jenkins job url is https://build.ci.opensearch.org/job/benchmark-pull-request/5064/ . Final results will be published once the job is completed.

@opensearch-ci-bot
Collaborator

Benchmark Results

Benchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-pull-request/5064/

Metric Task Value Unit
Cumulative indexing time of primary shards 0 min
Min cumulative indexing time across primary shards 0 min
Median cumulative indexing time across primary shards 0 min
Max cumulative indexing time across primary shards 0 min
Cumulative indexing throttle time of primary shards 0 min
Min cumulative indexing throttle time across primary shards 0 min
Median cumulative indexing throttle time across primary shards 0 min
Max cumulative indexing throttle time across primary shards 0 min
Cumulative merge time of primary shards 0 min
Cumulative merge count of primary shards 0
Min cumulative merge time across primary shards 0 min
Median cumulative merge time across primary shards 0 min
Max cumulative merge time across primary shards 0 min
Cumulative merge throttle time of primary shards 0 min
Min cumulative merge throttle time across primary shards 0 min
Median cumulative merge throttle time across primary shards 0 min
Max cumulative merge throttle time across primary shards 0 min
Cumulative refresh time of primary shards 0 min
Cumulative refresh count of primary shards 2
Min cumulative refresh time across primary shards 0 min
Median cumulative refresh time across primary shards 0 min
Max cumulative refresh time across primary shards 0 min
Cumulative flush time of primary shards 0 min
Cumulative flush count of primary shards 1
Min cumulative flush time across primary shards 0 min
Median cumulative flush time across primary shards 0 min
Max cumulative flush time across primary shards 0 min
Total Young Gen GC time 0.176 s
Total Young Gen GC count 3
Total Old Gen GC time 0 s
Total Old Gen GC count 0
Store size 21.1662 GB
Translog size 5.12227e-08 GB
Heap used for segments 0 MB
Heap used for doc values 0 MB
Heap used for terms 0 MB
Heap used for norms 0 MB
Heap used for points 0 MB
Heap used for stored fields 0 MB
Segment count 22
100th percentile latency wait-for-snapshot-recovery 300002 ms
100th percentile service time wait-for-snapshot-recovery 300002 ms
error rate wait-for-snapshot-recovery 100 %
Min Throughput default 3.01 ops/s
Mean Throughput default 3.02 ops/s
Median Throughput default 3.02 ops/s
Max Throughput default 3.03 ops/s
50th percentile latency default 7.4297 ms
90th percentile latency default 8.12422 ms
99th percentile latency default 9.28908 ms
100th percentile latency default 9.86709 ms
50th percentile service time default 6.24968 ms
90th percentile service time default 6.72307 ms
99th percentile service time default 8.23099 ms
100th percentile service time default 8.97247 ms
error rate default 0 %
Min Throughput range 1.01 ops/s
Mean Throughput range 1.01 ops/s
Median Throughput range 1.01 ops/s
Max Throughput range 1.02 ops/s
50th percentile latency range 9.59986 ms
90th percentile latency range 11.2541 ms
99th percentile latency range 44.4291 ms
100th percentile latency range 74.6607 ms
50th percentile service time range 7.7774 ms
90th percentile service time range 9.58777 ms
99th percentile service time range 42.7624 ms
100th percentile service time range 73.359 ms
error rate range 0 %
Min Throughput distance_amount_agg 0.16 ops/s
Mean Throughput distance_amount_agg 0.16 ops/s
Median Throughput distance_amount_agg 0.16 ops/s
Max Throughput distance_amount_agg 0.16 ops/s
50th percentile latency distance_amount_agg 575605 ms
90th percentile latency distance_amount_agg 804555 ms
99th percentile latency distance_amount_agg 856506 ms
100th percentile latency distance_amount_agg 859397 ms
50th percentile service time distance_amount_agg 6203.38 ms
90th percentile service time distance_amount_agg 6343.79 ms
99th percentile service time distance_amount_agg 6474 ms
100th percentile service time distance_amount_agg 6474.69 ms
error rate distance_amount_agg 0 %
Min Throughput autohisto_agg 1.51 ops/s
Mean Throughput autohisto_agg 1.51 ops/s
Median Throughput autohisto_agg 1.51 ops/s
Max Throughput autohisto_agg 1.53 ops/s
50th percentile latency autohisto_agg 7.53593 ms
90th percentile latency autohisto_agg 8.71492 ms
99th percentile latency autohisto_agg 11.4246 ms
100th percentile latency autohisto_agg 12.3307 ms
50th percentile service time autohisto_agg 6.09332 ms
90th percentile service time autohisto_agg 7.01773 ms
99th percentile service time autohisto_agg 9.82282 ms
100th percentile service time autohisto_agg 10.6713 ms
error rate autohisto_agg 0 %
Min Throughput date_histogram_agg 1.51 ops/s
Mean Throughput date_histogram_agg 1.52 ops/s
Median Throughput date_histogram_agg 1.51 ops/s
Max Throughput date_histogram_agg 1.53 ops/s
50th percentile latency date_histogram_agg 7.13077 ms
90th percentile latency date_histogram_agg 7.7891 ms
99th percentile latency date_histogram_agg 9.49259 ms
100th percentile latency date_histogram_agg 10.8598 ms
50th percentile service time date_histogram_agg 5.47185 ms
90th percentile service time date_histogram_agg 6.12838 ms
99th percentile service time date_histogram_agg 7.73545 ms
100th percentile service time date_histogram_agg 9.00933 ms
error rate date_histogram_agg 0 %
Min Throughput desc_sort_tip_amount 1.01 ops/s
Mean Throughput desc_sort_tip_amount 1.01 ops/s
Median Throughput desc_sort_tip_amount 1.01 ops/s
Max Throughput desc_sort_tip_amount 1.02 ops/s
50th percentile latency desc_sort_tip_amount 9.41708 ms
90th percentile latency desc_sort_tip_amount 9.8871 ms
99th percentile latency desc_sort_tip_amount 11.3078 ms
100th percentile latency desc_sort_tip_amount 11.9181 ms
50th percentile service time desc_sort_tip_amount 7.5536 ms
90th percentile service time desc_sort_tip_amount 7.99025 ms
99th percentile service time desc_sort_tip_amount 9.74251 ms
100th percentile service time desc_sort_tip_amount 10.4493 ms
error rate desc_sort_tip_amount 0 %
Min Throughput asc_sort_tip_amount 1 ops/s
Mean Throughput asc_sort_tip_amount 1.01 ops/s
Median Throughput asc_sort_tip_amount 1.01 ops/s
Max Throughput asc_sort_tip_amount 1.01 ops/s
50th percentile latency asc_sort_tip_amount 10.5105 ms
90th percentile latency asc_sort_tip_amount 11.1068 ms
99th percentile latency asc_sort_tip_amount 14.272 ms
100th percentile latency asc_sort_tip_amount 15.6262 ms
50th percentile service time asc_sort_tip_amount 8.6644 ms
90th percentile service time asc_sort_tip_amount 9.01583 ms
99th percentile service time asc_sort_tip_amount 12.255 ms
100th percentile service time asc_sort_tip_amount 13.6141 ms
error rate asc_sort_tip_amount 0 %
Min Throughput desc_sort_passenger_count 2.01 ops/s
Mean Throughput desc_sort_passenger_count 2.02 ops/s
Median Throughput desc_sort_passenger_count 2.02 ops/s
Max Throughput desc_sort_passenger_count 2.03 ops/s
50th percentile latency desc_sort_passenger_count 8.14103 ms
90th percentile latency desc_sort_passenger_count 8.62047 ms
99th percentile latency desc_sort_passenger_count 9.15097 ms
100th percentile latency desc_sort_passenger_count 9.49433 ms
50th percentile service time desc_sort_passenger_count 6.77337 ms
90th percentile service time desc_sort_passenger_count 6.93812 ms
99th percentile service time desc_sort_passenger_count 7.48737 ms
100th percentile service time desc_sort_passenger_count 7.75652 ms
error rate desc_sort_passenger_count 0 %
Min Throughput asc_sort_passenger_count 2.01 ops/s
Mean Throughput asc_sort_passenger_count 2.02 ops/s
Median Throughput asc_sort_passenger_count 2.02 ops/s
Max Throughput asc_sort_passenger_count 2.04 ops/s
50th percentile latency asc_sort_passenger_count 8.72051 ms
90th percentile latency asc_sort_passenger_count 9.15956 ms
99th percentile latency asc_sort_passenger_count 9.92083 ms
100th percentile latency asc_sort_passenger_count 10.2327 ms
50th percentile service time asc_sort_passenger_count 7.39063 ms
90th percentile service time asc_sort_passenger_count 7.51338 ms
99th percentile service time asc_sort_passenger_count 8.78305 ms
100th percentile service time asc_sort_passenger_count 9.40489 ms
error rate asc_sort_passenger_count 0 %

@opensearch-ci-bot
Collaborator

Benchmark Baseline Comparison Results

Benchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-compare/196/

Metric Task Baseline Contender Diff Unit
Cumulative indexing time of primary shards 0 0 0 min
Min cumulative indexing time across primary shard 0 0 0 min
Median cumulative indexing time across primary shard 0 0 0 min
Max cumulative indexing time across primary shard 0 0 0 min
Cumulative indexing throttle time of primary shards 0 0 0 min
Min cumulative indexing throttle time across primary shard 0 0 0 min
Median cumulative indexing throttle time across primary shard 0 0 0 min
Max cumulative indexing throttle time across primary shard 0 0 0 min
Cumulative merge time of primary shards 0 0 0 min
Cumulative merge count of primary shards 0 0 0
Min cumulative merge time across primary shard 0 0 0 min
Median cumulative merge time across primary shard 0 0 0 min
Max cumulative merge time across primary shard 0 0 0 min
Cumulative merge throttle time of primary shards 0 0 0 min
Min cumulative merge throttle time across primary shard 0 0 0 min
Median cumulative merge throttle time across primary shard 0 0 0 min
Max cumulative merge throttle time across primary shard 0 0 0 min
Cumulative refresh time of primary shards 0 0 0 min
Cumulative refresh count of primary shards 2 2 0
Min cumulative refresh time across primary shard 0 0 0 min
Median cumulative refresh time across primary shard 0 0 0 min
Max cumulative refresh time across primary shard 0 0 0 min
Cumulative flush time of primary shards 0 0 0 min
Cumulative flush count of primary shards 1 1 0
Min cumulative flush time across primary shard 0 0 0 min
Median cumulative flush time across primary shard 0 0 0 min
Max cumulative flush time across primary shard 0 0 0 min
Total Young Gen GC time 0.231 0.176 -0.055 s
Total Young Gen GC count 4 3 -1
Total Old Gen GC time 0 0 0 s
Total Old Gen GC count 0 0 0
Store size 21.1662 21.1662 0 GB
Translog size 5.12227e-08 5.12227e-08 0 GB
Heap used for segments 0 0 0 MB
Heap used for doc values 0 0 0 MB
Heap used for terms 0 0 0 MB
Heap used for norms 0 0 0 MB
Heap used for points 0 0 0 MB
Heap used for stored fields 0 0 0 MB
Segment count 22 22 0
100th percentile latency wait-for-snapshot-recovery 300001 300002 0.59375 ms
100th percentile service time wait-for-snapshot-recovery 300001 300002 0.59375 ms
error rate wait-for-snapshot-recovery 100 100 0 %
Min Throughput default 3.01285 3.01204 -0.00081 ops/s
Mean Throughput default 3.02093 3.01958 -0.00135 ops/s
Median Throughput default 3.01911 3.0178 -0.0013 ops/s
Max Throughput default 3.0369 3.03452 -0.00238 ops/s
50th percentile latency default 5.44528 7.4297 1.98443 ms
90th percentile latency default 6.01405 8.12422 2.11017 ms
99th percentile latency default 6.54351 9.28908 2.74557 ms
100th percentile latency default 6.56129 9.86709 3.3058 ms
50th percentile service time default 4.26272 6.24968 1.98696 ms
90th percentile service time default 4.86457 6.72307 1.8585 ms
99th percentile service time default 5.36643 8.23099 2.86456 ms
100th percentile service time default 5.44328 8.97247 3.52919 ms
error rate default 0 0 0 %
Min Throughput range 1.00608 1.00608 -0 ops/s
Mean Throughput range 1.01001 1.01001 -0 ops/s
Median Throughput range 1.0091 1.0091 -0 ops/s
Max Throughput range 1.0181 1.01809 -0 ops/s
50th percentile latency range 8.24616 9.59986 1.3537 ms
90th percentile latency range 9.32778 11.2541 1.92627 ms
99th percentile latency range 10.6048 44.4291 33.8243 ms
100th percentile latency range 11.0657 74.6607 63.595 ms
50th percentile service time range 6.39901 7.7774 1.37839 ms
90th percentile service time range 7.67522 9.58777 1.91255 ms
99th percentile service time range 8.82215 42.7624 33.9403 ms
100th percentile service time range 9.13118 73.359 64.2279 ms
error rate range 0 0 0 %
Min Throughput distance_amount_agg 0.149437 0.16034 0.0109 ops/s
Mean Throughput distance_amount_agg 0.149617 0.16055 0.01093 ops/s
Median Throughput distance_amount_agg 0.149604 0.160565 0.01096 ops/s
Max Throughput distance_amount_agg 0.149765 0.16077 0.01101 ops/s
50th percentile latency distance_amount_agg 622688 575605 -47082.8 ms
90th percentile latency distance_amount_agg 869383 804555 -64827.8 ms
99th percentile latency distance_amount_agg 924693 856506 -68187.2 ms
100th percentile latency distance_amount_agg 927742 859397 -68344.8 ms
50th percentile service time distance_amount_agg 6653.44 6203.38 -450.064 ms
90th percentile service time distance_amount_agg 6770.18 6343.79 -426.393 ms
99th percentile service time distance_amount_agg 7014.47 6474 -540.465 ms
100th percentile service time distance_amount_agg 7083.07 6474.69 -608.378 ms
error rate distance_amount_agg 0 0 0 %
Min Throughput autohisto_agg 1.50912 1.50907 -5e-05 ops/s
Mean Throughput autohisto_agg 1.51507 1.51497 -0.0001 ops/s
Median Throughput autohisto_agg 1.51373 1.51364 -9e-05 ops/s
Max Throughput autohisto_agg 1.52715 1.52699 -0.00017 ops/s
50th percentile latency autohisto_agg 6.19705 7.53593 1.33888 ms
90th percentile latency autohisto_agg 7.19986 8.71492 1.51506 ms
99th percentile latency autohisto_agg 9.00978 11.4246 2.4148 ms
100th percentile latency autohisto_agg 9.76453 12.3307 2.56619 ms
50th percentile service time autohisto_agg 4.76785 6.09332 1.32546 ms
90th percentile service time autohisto_agg 5.54018 7.01773 1.47755 ms
99th percentile service time autohisto_agg 7.57341 9.82282 2.24941 ms
100th percentile service time autohisto_agg 8.2461 10.6713 2.42522 ms
error rate autohisto_agg 0 0 0 %
Min Throughput date_histogram_agg 1.50981 1.50977 -4e-05 ops/s
Mean Throughput date_histogram_agg 1.51622 1.51614 -8e-05 ops/s
Median Throughput date_histogram_agg 1.51477 1.51468 -8e-05 ops/s
Max Throughput date_histogram_agg 1.52924 1.52907 -0.00016 ops/s
50th percentile latency date_histogram_agg 5.72908 7.13077 1.40169 ms
90th percentile latency date_histogram_agg 6.38854 7.7891 1.40056 ms
99th percentile latency date_histogram_agg 13.9375 9.49259 -4.44486 ms
100th percentile latency date_histogram_agg 20.6626 10.8598 -9.80285 ms
50th percentile service time date_histogram_agg 4.06014 5.47185 1.41171 ms
90th percentile service time date_histogram_agg 4.71322 6.12838 1.41516 ms
99th percentile service time date_histogram_agg 12.029 7.73545 -4.29352 ms
100th percentile service time date_histogram_agg 18.8259 9.00933 -9.81652 ms
error rate date_histogram_agg 0 0 0 %
Min Throughput desc_sort_tip_amount 1.00569 1.00565 -4e-05 ops/s
Mean Throughput desc_sort_tip_amount 1.00937 1.00929 -8e-05 ops/s
Median Throughput desc_sort_tip_amount 1.00852 1.00845 -7e-05 ops/s
Max Throughput desc_sort_tip_amount 1.01694 1.0168 -0.00015 ops/s
50th percentile latency desc_sort_tip_amount 7.76782 9.41708 1.64926 ms
90th percentile latency desc_sort_tip_amount 8.36377 9.8871 1.52334 ms
99th percentile latency desc_sort_tip_amount 10.0087 11.3078 1.29918 ms
100th percentile latency desc_sort_tip_amount 10.063 11.9181 1.85508 ms
50th percentile service time desc_sort_tip_amount 5.92072 7.5536 1.63288 ms
90th percentile service time desc_sort_tip_amount 6.70033 7.99025 1.28992 ms
99th percentile service time desc_sort_tip_amount 7.76178 9.74251 1.98074 ms
100th percentile service time desc_sort_tip_amount 7.79757 10.4493 2.65168 ms
error rate desc_sort_tip_amount 0 0 0 %
Min Throughput asc_sort_tip_amount 1.00487 1.0049 2e-05 ops/s
Mean Throughput asc_sort_tip_amount 1.00801 1.00806 4e-05 ops/s
Median Throughput asc_sort_tip_amount 1.00729 1.00733 4e-05 ops/s
Max Throughput asc_sort_tip_amount 1.01447 1.01455 8e-05 ops/s
50th percentile latency asc_sort_tip_amount 8.601 10.5105 1.90945 ms
90th percentile latency asc_sort_tip_amount 9.00432 11.1068 2.10247 ms
99th percentile latency asc_sort_tip_amount 9.43622 14.272 4.83576 ms
100th percentile latency asc_sort_tip_amount 9.52191 15.6262 6.10424 ms
50th percentile service time asc_sort_tip_amount 6.78479 8.6644 1.87961 ms
90th percentile service time asc_sort_tip_amount 7.00716 9.01583 2.00867 ms
99th percentile service time asc_sort_tip_amount 7.40353 12.255 4.8515 ms
100th percentile service time asc_sort_tip_amount 7.42976 13.6141 6.1843 ms
error rate asc_sort_tip_amount 0 0 0 %
Min Throughput desc_sort_passenger_count 2.01168 2.01159 -0.0001 ops/s
Mean Throughput desc_sort_passenger_count 2.01921 2.01907 -0.00014 ops/s
Median Throughput desc_sort_passenger_count 2.01746 2.01733 -0.00013 ops/s
Max Throughput desc_sort_passenger_count 2.03453 2.03427 -0.00027 ops/s
50th percentile latency desc_sort_passenger_count 6.73137 8.14103 1.40966 ms
90th percentile latency desc_sort_passenger_count 7.46135 8.62047 1.15912 ms
99th percentile latency desc_sort_passenger_count 9.79465 9.15097 -0.64369 ms
100th percentile latency desc_sort_passenger_count 11.5022 9.49433 -2.00785 ms
50th percentile service time desc_sort_passenger_count 5.24397 6.77337 1.52941 ms
90th percentile service time desc_sort_passenger_count 5.95289 6.93812 0.98523 ms
99th percentile service time desc_sort_passenger_count 8.11534 7.48737 -0.62797 ms
100th percentile service time desc_sort_passenger_count 9.82717 7.75652 -2.07065 ms
error rate desc_sort_passenger_count 0 0 0 %
Min Throughput asc_sort_passenger_count 2.01274 2.01275 1e-05 ops/s
Mean Throughput asc_sort_passenger_count 2.02094 2.021 5e-05 ops/s
Median Throughput asc_sort_passenger_count 2.01902 2.01906 4e-05 ops/s
Max Throughput asc_sort_passenger_count 2.03765 2.03772 7e-05 ops/s
50th percentile latency asc_sort_passenger_count 8.6785 8.72051 0.04201 ms
90th percentile latency asc_sort_passenger_count 9.05555 9.15956 0.10401 ms
99th percentile latency asc_sort_passenger_count 9.99141 9.92083 -0.07058 ms
100th percentile latency asc_sort_passenger_count 10.1185 10.2327 0.11418 ms
50th percentile service time asc_sort_passenger_count 7.3503 7.39063 0.04033 ms
90th percentile service time asc_sort_passenger_count 7.5635 7.51338 -0.05012 ms
99th percentile service time asc_sort_passenger_count 8.58771 8.78305 0.19534 ms
100th percentile service time asc_sort_passenger_count 8.68219 9.40489 0.7227 ms
error rate asc_sort_passenger_count 0 0 0 %

@github-actions
Contributor

❌ Gradle check result for 1d97bee: FAILURE

@github-actions
Contributor

❌ Gradle check result for d8448f5: FAILURE

Labels

None yet

Projects

Status: In-Review

Development

Successfully merging this pull request may close these issues.

4 participants