What is the bug?
While experimenting with aggregation group-by on a dataset of 50,000 records, I noticed that not all buckets are returned for queries like these:
> source = nginx_logs
| stats count() by method, bytes
| stats count();
fetched rows / total rows = 1/1
+---------+
| count() |
|---------|
| 1000 | <--- Should be at least 10804, according to below
+---------+
> source = nginx_logs
| dedup method, bytes
| stats count();
fetched rows / total rows = 1/1
+---------+
| count() |
|---------|
| 10804 |
+---------+
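For context, here is a hedged guess at where the cap comes from (I haven't traced the generated query, so the shape below is an assumption): the group-by looks like it compiles to a composite aggregation, which emits buckets one page at a time in key order, and only the first page of 1000 is ever consumed. Roughly:

```python
# Suspected shape of what `stats count() by method, bytes` compiles to.
# This is an assumption about the plugin internals, not traced output.
suspected_body = {
    "size": 0,
    "aggs": {
        "composite_buckets": {
            "composite": {
                "size": 1000,  # would explain the 1000-bucket ceiling
                "sources": [
                    {"method": {"terms": {"field": "method"}}},
                    {"bytes": {"terms": {"field": "bytes"}}},
                ],
            }
        }
    },
}
```

If that's right, all the buckets exist server-side and the engine is simply not following after_key past the first page.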
The bucket count limitation is understandable (they're memory-intensive and shipping many thousands of buckets would exhaust bandwidth), but it does cause issues when trying to work with sorted aggregations. In this dataset, the top buckets should all be GET, not DELETE.
> source = nginx_logs
| stats count() by method, bytes
| sort -`count()`
| head;
fetched rows / total rows = 10/10
+---------+--------+-------+
| count() | method | bytes |
|---------+--------+-------|
| 13 | DELETE | 81 |
| 12 | DELETE | 74 |
| 10 | DELETE | 71 |
| 10 | DELETE | 95 |
| 10 | DELETE | 51 |
| 10 | DELETE | 88 |
| 9 | DELETE | 78 |
| 9 | DELETE | 53 |
| 9 | DELETE | 49 |
| 9 | DELETE | 55 |
+---------+--------+-------+
The same limitation leads to incorrect output from the top command:
> source = nginx_logs
| top bytes by method
| head;
fetched rows / total rows = 10/10
+--------+-------+-------+
| method | bytes | count |
|--------+-------+-------|
| DELETE | 81 | 13 |
| DELETE | 74 | 12 |
| DELETE | 51 | 10 |
| DELETE | 71 | 10 |
| DELETE | 88 | 10 |
| DELETE | 95 | 10 |
| DELETE | 44 | 9 |
| DELETE | 49 | 9 |
| DELETE | 52 | 9 |
| DELETE | 53 | 9 |
+--------+-------+-------+
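As a stopgap, querying the index directly with a count-ordered multi_terms aggregation does return the true top buckets, because the ordering is applied before the bucket list is truncated to size. A minimal sketch, assuming the opensearch-py client and the field names from the examples above (multi_terms requires OpenSearch 2.1+):

```python
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# multi_terms sorts buckets by doc count before truncating to `size`,
# so the top buckets returned here are the correct ones (up to the
# usual shard-level approximation of terms aggregations).
resp = client.search(
    index="nginx_logs",
    body={
        "size": 0,
        "aggs": {
            "top_groups": {
                "multi_terms": {
                    "terms": [{"field": "method"}, {"field": "bytes"}],
                    "size": 10,
                    "order": {"_count": "desc"},
                }
            }
        },
    },
)
for bucket in resp["aggregations"]["top_groups"]["buckets"]:
    print(bucket["key"], bucket["doc_count"])
```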
How can one reproduce the bug?
- Create a dataset with well over 1000 distinct buckets under some grouping (a generator sketch follows this list)
- Run a group-by aggregation over that grouping and observe incorrect results when collecting or sorting the buckets
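A minimal generator sketch for such a dataset, assuming opensearch-py and a local cluster (the index name, skew, and value ranges are illustrative, not the exact dataset above):

```python
import random
from opensearchpy import OpenSearch, helpers

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

METHODS = ["GET", "POST", "PUT", "DELETE"]

def docs():
    for _ in range(50_000):
        # Skew toward GET so the true top buckets are all GET,
        # mirroring the tables above.
        method = random.choices(METHODS, weights=[70, 15, 10, 5])[0]
        yield {
            "_index": "nginx_logs",
            "_source": {"method": method, "bytes": random.randint(40, 5000)},
        }

helpers.bulk(client, docs())
```

With roughly 20,000 possible (method, bytes) pairs, the distinct bucket count comfortably exceeds 1000.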
What is the expected behavior?
It seems like the buckets come out lexicographically sorted today, which implies that internally all the buckets are already being computed and sorted (i.e. returning incorrect results isn't a major cost optimization). I wonder if it'd be possible to support sorting in the bucket aggregation, so the 1000 buckets returned are actually the correct ones.
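If pushing the sort into the aggregation isn't feasible, another option is to page through the composite aggregation before sorting, since every bucket is already computed server-side. A client-side sketch of that idea, assuming opensearch-py and the suspected composite shape above (this is not the plugin's actual code):

```python
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

PAGE_SIZE = 1000
sources = [
    {"method": {"terms": {"field": "method"}}},
    {"bytes": {"terms": {"field": "bytes"}}},
]

buckets, after = [], None
while True:
    composite = {"size": PAGE_SIZE, "sources": sources}
    if after is not None:
        composite["after"] = after  # resume from the previous page
    resp = client.search(
        index="nginx_logs",
        body={"size": 0, "aggs": {"groups": {"composite": composite}}},
    )
    page = resp["aggregations"]["groups"]
    buckets.extend(page["buckets"])
    after = page.get("after_key")
    # Stop once the last (possibly partial) page has been read.
    if after is None or len(page["buckets"]) < PAGE_SIZE:
        break

# With every bucket in hand, the top-N by count is correct.
top10 = sorted(buckets, key=lambda b: b["doc_count"], reverse=True)[:10]
```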
What is your host/environment?
Mainline
Do you have any screenshots?
N/A
Do you have any additional context?
N/A