Implement top-level (queryable) versions of custom support aggregate operators #28264

@roji

#2981, #28104 and #13278 introduced various aggregate operators, but only IEnumerable versions. This means that they can be used with GroupBy:

_ = context.Products
    .GroupBy(od => od.ProductID)
    .Select(g => new
    {
        ProductID = g.Key,
        StandardDeviation = EF.Functions.StandardDeviationSample(g.Select(p => p.UnitPrice)),
    });

... but cannot be used at the query top-level - without GroupBy - where IQueryable versions would be required. The team discussed this and decided to punt the top-level operators out of 7.0, and revisit this based on user feedback.

Workaround

In the meantime, users can work around this gap by grouping over a constant:

var foo = ctx.Products
    .GroupBy(b => 1)
    .Select(g => EF.Functions.StandardDeviationSample(g.Select(p => p.UnitPrice)))
    .FirstOrDefault();

Design discussion

See original discussion in #28104 (comment) on how to represent the top-level operators. Options include:

Extension over IQueryable

_ = context.Products.StandardDeviationSample(p => p.UnitPrice); // non-built-in
_ = context.Products.StringJoin(p => p.Name, ", "); // built-in
  • We generally feel that it's OK to introduce extensions over IQueryable; we avoid doing so on commonly-used types like IEnumerable, but it's very rare for people to use another LINQ provider in an EF Core app.
  • The syntax is inconsistent with the enumerable version (which is via EF.Functions).
  • For built-in methods (e.g. string.Join), we'd have to introduce a special new thing (e.g. context.Products.StringJoin), which is even more inconsistent.
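To make the first option concrete, here is a minimal sketch of how such an extension could be shaped, following the same pattern the built-in System.Linq.Queryable operators use (wrap the source expression and the quoted selector in a method-call node, then hand it to the provider). The class name TopLevelAggregateSketch and both methods are hypothetical; nothing like this exists in EF Core today, and a provider would still need to recognize and translate the node.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical sketch: this class and its methods are NOT part of EF Core.
public static class TopLevelAggregateSketch
{
    // Builds the expression-tree node that a query provider would have to
    // recognize and translate (e.g. to a SQL aggregate over the whole set).
    public static MethodCallExpression BuildStandardDeviationSampleCall<TSource>(
        IQueryable<TSource> source,
        Expression<Func<TSource, double>> selector)
    {
        MethodInfo definition =
            typeof(TopLevelAggregateSketch).GetMethod(nameof(StandardDeviationSample))
            ?? throw new InvalidOperationException("Method not found.");

        MethodInfo method = definition.MakeGenericMethod(typeof(TSource));

        // Same shape as the built-in Queryable operators:
        // a static call taking the source expression and the quoted lambda.
        return Expression.Call(null, method, source.Expression, Expression.Quote(selector));
    }

    // The user-facing extension. Only a provider that understands this method
    // can evaluate it; do not expect LINQ to Objects to execute it.
    public static double StandardDeviationSample<TSource>(
        this IQueryable<TSource> source,
        Expression<Func<TSource, double>> selector)
        => source.Provider.Execute<double>(
            BuildStandardDeviationSampleCall(source, selector));
}
```

With something like this in place, context.Products.StandardDeviationSample(p => p.UnitPrice) would compile, but the maintenance cost noted above remains: every aggregate needs a second, queryable overload in addition to its enumerable one.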

EF.Functions

_ = EF.Functions.StandardDeviationSample(context.Products.Select(p => p.UnitPrice)); // non-built-in
_ = EF.Functions.StringJoin(context.Products.Select(p => p.Name), ", "); // built-in
  • This is consistent with the enumerable version - except for built-in functions (e.g. string.Join).
  • It's quite unwieldy, since the entire query needs to be wrapped. This is already the case for the enumerable version, but arguments there tend to be a lot shorter (e.g. a single column or short expression, compared to an entire query for the queryable version).

Context.Query

We have an "unrelated" feature to generally allow expressing queries within a lambda (ISSUE NEEDED):

context.Query<T>(() => EF.Functions.StandardDeviationSample(context.Products.Select(p => p.UnitPrice))) // non-built-in
context.Query<T>(() => string.Join(", ", context.Products.Select(p => p.Name))) // built-in
  • Like the above EF.Functions option, this is consistent with the enumerable version.
  • But unlike that option, it's also consistent for built-in functions - the enumerable string.Join can be used inside the lambda.
  • No need for extra IQueryable versions of the aggregate operators - the same enumerable method can be used at the top level. This simplifies EF/provider maintenance.
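The reason the lambda approach needs no extra overloads can be seen with plain expression trees: when a query is written inside an Expression<Func<T>>, the ordinary enumerable string.Join shows up as a method-call node that a provider could inspect. The sketch below only captures that node; QueryLambdaSketch, Capture and CaptureJoin are hypothetical names standing in for the proposed context.Query<T>, which does not exist yet, and no SQL translation is attempted.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical sketch: a stand-in for the proposed context.Query<T>.
public static class QueryLambdaSketch
{
    // Instead of translating the lambda to SQL, simply return the captured
    // expression body, i.e. what a provider would receive.
    public static Expression Capture<T>(Expression<Func<T>> query) => query.Body;

    // The regular enumerable string.Join is used directly inside the lambda;
    // it appears in the tree as a MethodCallExpression on System.String.
    public static MethodCallExpression CaptureJoin(IQueryable<string> names)
        => (MethodCallExpression)Capture(() => string.Join(", ", names));
}
```

Inspecting the result of CaptureJoin shows a call whose declaring type is System.String and whose name is Join, with the queryable source as an argument, which is the information a translator would need.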

Note: Infra for top-level aggregates was done in #28102
Note: See npgsql/efcore.pg#727 for the Npgsql epic covering PG aggregate operators - we'd have to do the same for those.
