Description
Describe the bug
Today in OpenSearch, we can run expensive queries that create lots of objects in memory. To combat this, we use the CircuitBreaker, which trips when memory pressure on the cluster piles up. The CircuitBreaker is attached to many services, including SearchService, which allows expensive objects and arrays to report their memory usage and trip the CircuitBreaker when they grow too large. Most aggregation queries make use of BigArrays to track ordinal counts and increment values in the BigArrays. When new buckets are created, BigArrays grows in size and reports its memory usage to the CircuitBreaker, which may trip if required. We also have another mechanism for limiting large numbers of buckets: the search.max_buckets setting, which defaults to a maximum of 65,536 and prevents more than that number of buckets from being created. Since each bucket has an assumed cost of 5KB, buckets can be expensive to create.
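As a rough illustration of the two existing guards described above (the class below is a simplified stand-in, not the actual OpenSearch accounting code; the 5KB-per-bucket figure and the 65,536 default are taken from the description):

```java
// Simplified model of the two existing guards: the memory circuit breaker
// and the search.max_buckets limit. Not the real OpenSearch classes.
public class BucketAccountingSketch {
    private static final long ASSUMED_BYTES_PER_BUCKET = 5 * 1024; // ~5KB assumed per bucket
    private static final int MAX_BUCKETS = 65_536;                 // search.max_buckets default

    private final long breakerLimitBytes; // e.g. some fraction of the heap
    private long usedBytes;
    private int bucketCount;

    public BucketAccountingSketch(long breakerLimitBytes) {
        this.breakerLimitBytes = breakerLimitBytes;
    }

    /** Called whenever an aggregation materializes more buckets. */
    public void consumeBuckets(int newBuckets) {
        bucketCount += newBuckets;
        if (bucketCount > MAX_BUCKETS) {
            throw new IllegalStateException("too many buckets: " + bucketCount);
        }
        usedBytes += (long) newBuckets * ASSUMED_BYTES_PER_BUCKET;
        if (usedBytes > breakerLimitBytes) {
            throw new IllegalStateException("circuit breaker would trip at " + usedBytes + " bytes");
        }
    }
}
```

Both guards only work if the code that creates buckets actually reports to them, which is exactly what the empty-bucket path below fails to do.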
A bug that was reported to us involves computing certain histogram aggregations. During the reduce phase, the coordinator does a top-level reduce on the top-level aggregation in the query, which subsequently calls its sub-aggregations' reduce recursively. Each aggregation type has an Internal representation which represents the agg type while being reduced. When InternalDateHistogram's reduce is called, it calls its reduceBuckets method, which calls reduceBucket, which in turn calls reduce on each bucket's aggregations. We call these non-empty buckets because they contain actual counts for values that the aggregation produces. InternalDateHistogram's reduce then calls addEmptyBuckets when minDocCount == 0, which is the default case when no minDocCount is specified. Here is where the bug is: when we try to add empty buckets, we use the extendedBounds that are specified as part of the aggregation. The extended bounds can be arbitrarily large. For example,
"extended_bounds": {
"min": 0,
"max": 1741558953724
}
is an allowed extended bound that creates billions of empty buckets. These buckets are added in a loop to an iterator. This can cause massive memory consumption on the cluster, because billions and billions of empty buckets are created and their memory is never reported to the CircuitBreaker or counted against search.max_buckets. Here is a heap dump from such a histogram query (the full query is in the To Reproduce section below), showing a massive number of InternalDateHistogram Bucket objects being created:
Initiating dump at 2025-03-12 11:43:32.187402
6906:
num #instances #bytes class name (module)
-------------------------------------------------------
1: 957930017 53644080952 org.opensearch.search.aggregations.bucket.histogram.InternalDateHistogram$Bucket
2: 408193 9659489864 [Ljava.lang.Object; (java.base)
3: 3789839 1383884968 [B (java.base)
4: 1115012 1106714688 [I (java.base)
5: 2352988 112943424 java.util.HashMap$Node (java.base)
6: 875554 70044320 java.nio.DirectByteBufferR (java.base)
7: 1935380 61932160 java.lang.String (java.base)
8: 46908 52193552 [J (java.base)
"opensearch[node_id][search][T#1]" #287 daemon prio=5 os_prio=0 cpu=23337769.03ms elapsed=2814914.89s tid=0x0000ffee541559e0 nid=0x24cb runnable [0x0000ffb9390bb000]
java.lang.Thread.State: RUNNABLE
at org.opensearch.common.Rounding$PreparedRounding.maybeUseArray(Rounding.java:425)
at org.opensearch.common.Rounding$TimeUnitRounding.prepare(Rounding.java:518)
at org.opensearch.common.Rounding.nextRoundingValue(Rounding.java:320)
at org.opensearch.search.aggregations.bucket.histogram.InternalDateHistogram.nextKey(InternalDateHistogram.java:505)
at org.opensearch.search.aggregations.bucket.histogram.InternalDateHistogram.addEmptyBuckets(InternalDateHistogram.java:412)
The loop effectively never terminates, causing the node processing the request to run out of memory and eventually taking the cluster down.
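To put a rough number on it for the reproduction query below (10s interval, extended bounds from 0 to 1741558953724 epoch millis), a back-of-the-envelope estimate of the empty buckets addEmptyBuckets has to materialize:

```java
// Illustrative arithmetic only: estimates how many empty buckets the
// reproduction query's extended_bounds force addEmptyBuckets to create.
public class EmptyBucketEstimate {
    public static void main(String[] args) {
        long minBound = 0L;
        long maxBound = 1_741_558_953_724L; // extended_bounds.max (epoch millis)
        long intervalMillis = 10_000L;      // "interval": "10s"

        long emptyBuckets = (maxBound - minBound) / intervalMillis + 1;
        System.out.println(emptyBuckets);   // ~174 million buckets before any real data
    }
}
```

None of these buckets are reported to the breaker or counted against search.max_buckets, which is how the heap ends up holding hundreds of millions of Bucket instances (~53 GB in the dump above).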
Proposed Solution
- After non-empty buckets are created, we trigger a call to reduceContext.consumeBucketsAndMaybeBreak(reducedBuckets.size()); to check if we can even create empty buckets
- Before we can add empty buckets, we compute how many empty buckets we would need to add. We check the value at an interval by sampling a few times to get an approximate bucket count, since the counts depend on the next key for each bucket we find
- We add the empty bucket count to the breaker to see if it trips; otherwise we report it to search.max_buckets to see if that trips
- If we pass all these checks, we are allowed to create empty buckets and process the query normally (see the sketch below)
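A minimal sketch of that guard, assuming the empty-bucket count can be estimated by sampling the rounding's next key a few times before anything is allocated. The BucketConsumer interface below stands in for ReduceContext#consumeBucketsAndMaybeBreak; the class and method names here are illustrative, not the final implementation:

```java
import java.util.function.LongUnaryOperator;

public final class EmptyBucketGuard {

    /** Stand-in for InternalAggregation.ReduceContext#consumeBucketsAndMaybeBreak(int). */
    @FunctionalInterface
    public interface BucketConsumer {
        void consumeBucketsAndMaybeBreak(int count);
    }

    /**
     * Estimate how many empty buckets addEmptyBuckets would create between min and max
     * by sampling nextKey a few times, then report that estimate to the bucket consumer
     * (circuit breaker + search.max_buckets) before any Bucket objects are allocated.
     */
    public static void checkBeforeAddingEmptyBuckets(long min, long max, LongUnaryOperator nextKey,
                                                     int samples, BucketConsumer consumer) {
        // Sample a few consecutive keys to approximate the bucket width, since the exact
        // width can vary with the rounding (e.g. calendar-aware intervals).
        long key = min;
        long sampledWidth = 0;
        int sampled = 0;
        while (sampled < samples && key < max) {
            long next = nextKey.applyAsLong(key);
            if (next <= key) {
                return; // rounding is not advancing; nothing sensible to estimate
            }
            sampledWidth += next - key;
            key = next;
            sampled++;
        }
        if (sampled == 0) {
            return; // bounds already covered by existing buckets
        }
        long averageWidth = sampledWidth / sampled;
        long estimatedEmptyBuckets = (max - min) / averageWidth + 1;

        // Report the estimate before creating anything: this is where the breaker
        // or search.max_buckets gets the chance to trip and fail the request early.
        consumer.consumeBucketsAndMaybeBreak((int) Math.min(estimatedEmptyBuckets, Integer.MAX_VALUE));
    }
}
```

For the reproduction query below, the estimate is on the order of 174 million buckets, so the request fails fast instead of looping until the node runs out of memory.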
Related component
No response
To Reproduce
Sample query to reproduce the issue
GET _search
{
"size": 0,
"query": {
"bool": {
"filter": [
{
"range": {
"log_time": {
"from": 0,
"to": 1741558953724,
"include_lower": true,
"include_upper": true,
"format": "epoch_millis",
"boost": 1.0
}
}
},
{
"query_string": {
"query": "app_name:nextgen_rules_engine AND verb:(\"execute\") AND type:api AND tid:* AND NOT msg: rules_engine_dynamic_configuration",
"fields": [],
"type": "best_fields",
"default_operator": "or",
"max_determinized_states": 10000,
"enable_position_increments": true,
"fuzziness": "AUTO",
"fuzzy_prefix_length": 0,
"fuzzy_max_expansions": 50,
"phrase_slop": 0,
"analyze_wildcard": true,
"escape": false,
"auto_generate_synonyms_phrase_query": true,
"fuzzy_transpositions": true,
"boost": 1.0
}
}
],
"adjust_pure_negative": true,
"boost": 1.0
}
},
"aggregations": {
"2": {
"date_histogram": {
"field": "log_time",
"format": "epoch_millis",
"interval": "10s",
"offset": 0,
"order": {
"_key": "asc"
},
"keyed": false,
"extended_bounds": {
"min": 0,
"max": 1741558953724
}
},
"aggregations": {
"1": {
"percentiles": {
"field": "time_taken_ms",
"percents": [
95.0
],
"keyed": true,
"tdigest": {
"compression": 100.0
}
}
}
}
}
}
}
Expected behavior
Queries should terminate with an exception instead of running for hours and taking the node down.
Additional Details
Plugins
Please list all plugins currently enabled.
Screenshots
If applicable, add screenshots to help explain your problem.
Host/Environment (please complete the following information):
- OS: [e.g. iOS]
- Version [e.g. 22]
Additional context
Add any other context about the problem here.