Julep: Syntax for reduction (overloadable splatting?) #32860

I suggest a mechanism to customize splatting behavior such that code like

```julia
+((x .- y).^2...) / length(x)  # compute MSE
```

can be executed efficiently without any allocations.
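For comparison, here is a small sketch of how this expression behaves today next to a hand-fused version. The vectors `x` and `y` and the names `mse_splat` and `mse_fused` are made up for illustration; the point is only that the splatted form currently builds a temporary array and a long argument list, while the fused form does not.

```julia
x = [1.0, 2.0, 3.0]
y = [1.0, 0.0, 5.0]

# Today: `(x .- y).^2` is materialized as an array, then splatted into `+`.
mse_splat = +((x .- y).^2...) / length(x)

# Hand-written fused equivalent: the generator is consumed lazily by `sum`.
mse_fused = sum(abs2(a - b) for (a, b) in zip(x, y)) / length(x)
```

The proposal aims to make the first spelling perform like the second.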

The idea is inspired by discussion in #29114 by @c42f et al.

## Idea

Like dot-call syntax, I suggest lowering splatting to a series of overloadable function calls.  One possibility is

```julia
# f(a...) is expanded to:
apply(f, Arguments(VA(splattable(a))))

# f(a, b..., c, d...) is expanded to:
apply(f, Arguments(a, VA(splattable(b)), c, VA(splattable(d))))
```

(To avoid recursion, this lowering should happen only when the call includes splatting.)

When the dot-call syntax appears in the splatting operand, I suggest _not_ materializing the dot-call.  That is to say, for example, `op(f.(xs)...)` is lowered to

```julia
bc = broadcasted(f, xs)  # `bc` not materialized
apply(op, Arguments(VA(splattable(bc))))
```

This lets us evaluate `op(f.(xs)...)` without any allocation once `reduce`/`foldl` supports `Broadcasted` objects (#31020 started to tackle this).
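A rough illustration of the unmaterialized object, built by hand with the existing `Base.Broadcast` functions. This assumes a Julia version in which `Broadcasted` objects are iterable; the variable names are only for the example.

```julia
using Base.Broadcast: Broadcasted, broadcasted, instantiate

xs = [1, 2, 3]

# Build the lazy broadcast by hand; no output array is created here.
bc = instantiate(broadcasted(abs2, xs))

# Fold over the lazy object element by element.
total = foldl(+, bc)
```

Under the proposed lowering, `+(abs2.(xs)...)` would hand an object like `bc` to `apply` instead of a materialized array.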

## Interface

The lowering above requires the following interface functions and types:

```julia
function apply end

splattable(x) = x

struct VA{T}
    args::T
end

struct Arguments{T <: Tuple}
    args::T
    Arguments(args...) = new{typeof(args)}(args)
end
```

* `apply` must be dispatched on the first argument type and may be dispatched on the second argument type.
* Per-vararg processing should be done via `splattable`.  This is analogous to `broadcastable` (@yurivish suggested this in https://github.com/JuliaLang/julia/issues/29114#issuecomment-515294850).
* The type `VA` (whose name can/should be improved) must be used only for defining `apply`; its constructor must not be overloaded.  This is for making it hard to break splatting semantics.
* The constructor for type `Arguments` must not be overloaded for the same reason.

Using the current `Core._apply`, the default `apply` can be implemented as

```julia
apply(f, x) = Core._apply(Core._apply, (f,), _default_splattables(x.args))
_default_splattables(args) = map(x -> x isa VA ? materialize(x.args) : (x,), args)
```
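To experiment with the interface without touching lowering, the pieces can be put together and the lowered form written out by hand. This is only a sketch: the fallback below uses ordinary splatting with `Iterators.flatten` instead of `Core._apply`, and the helper `_expand` is a hypothetical name that is not part of the proposal.

```julia
# Stand-ins for the proposed interface types.
struct VA{T}
    args::T
end

struct Arguments{T<:Tuple}
    args::T
    Arguments(args...) = new{typeof(args)}(args)
end

splattable(x) = x

# Hypothetical helper: a plain argument becomes a 1-tuple, a VA is expanded.
_expand(x) = (x,)
_expand(x::VA) = x.args

# Fallback `apply`: concatenate all pieces and splat them the usual way.
apply(f, a::Arguments) = f(Iterators.flatten(map(_expand, a.args))...)

# f(a, b..., c, d...) written out by hand in its lowered form:
result = apply(tuple, Arguments(1, VA(splattable([2, 3])), 4, VA(splattable((5, 6)))))
```

With only this fallback defined, the lowered call behaves like today's `tuple(1, [2, 3]..., 4, (5, 6)...)`.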

## Example overloads

### Associative binary operators

Many useful operations can be expressed using splatting into associative operators

```julia
+(xs...)      == sum(xs)
*(xs...)      == prod(xs)
min(xs...)    == minimum(xs)
max(xs...)    == maximum(xs)
```

or "mapped-splatting"

```julia
+(xs.^2...)      == norm(xs)^2
+(xs .* ys...)   == dot(xs, ys)
```

(This is reminiscent of the "big operator" in Fortress.)

There are also various other associative binary operators in Base.  Invoking `reduce` with splatting could be useful:

```julia
const AssociativeOperator = Union{
    typeof(*),
    typeof(+),
    typeof(&),
    typeof(|),
    typeof(min),
    typeof(max),
    typeof(intersect),
    typeof(union),
    typeof(vcat),
    typeof(hcat),
    typeof(merge),
    # what else?
}

# `Arguments` is invariant in its parameter, hence `<:Tuple{VA}`.
apply(op::AssociativeOperator, args::Arguments{<:Tuple{VA}}) =
    reduce(op, args.args[1].args)
```

For example, concatenating vectors would be efficiently done via `vcat(vectors...)` thanks to #27188.

It is also possible to support

```julia
op :: AssociativeOperator
op(a, bs..., cs...)
```

such that it is computed as

```julia
op(op(a, reduce(op, bs)), reduce(op, cs))
```

This may be implemented as

```julia
apply(op::AssociativeOperator, args::Arguments) =
    mapreduce(x -> _apply1(op, x), op, args.args)
_apply1(op::AssociativeOperator, args::VA) = reduce(op, args.args)
_apply1(op, x) = x
```
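The mixed case can be tried out with a self-contained sketch like the following. The interface types are repeated so that the block runs on its own, the operator list is shortened, and the lowered calls are written by hand, so this is an illustration of the idea rather than a reference implementation.

```julia
struct VA{T}
    args::T
end

struct Arguments{T<:Tuple}
    args::T
    Arguments(args...) = new{typeof(args)}(args)
end

# Shortened operator list, just for this example.
const AssociativeOperator = Union{typeof(+), typeof(*), typeof(min), typeof(max), typeof(vcat)}

# Reduce each splatted operand on its own; pass plain arguments through.
_apply1(op, x) = x
_apply1(op, x::VA) = reduce(op, x.args)

apply(op::AssociativeOperator, a::Arguments) =
    mapreduce(x -> _apply1(op, x), op, a.args)

s = apply(+, Arguments(1, VA([2, 3]), VA((4, 5))))   # +(1, [2, 3]..., (4, 5)...)
v = apply(vcat, Arguments(VA([[1], [2, 3]])))        # vcat(vectors...)
m = apply(max, Arguments(VA(x^2 for x in -3:2)))     # max((x^2 for x in -3:2)...)
```

No argument tuple is ever built from the splatted operands; each one is consumed directly by `reduce`.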

Note that, since `reduce` falls back to `foldl` when the input is not an array (or, after #31020, not a `Broadcasted`), we can also use it to fuse filtering with reduction:

```julia
+((x for x in xs if x > 0)...)
```
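As a quick check of the intended meaning, the filtered generator can already be passed to `reduce` directly today. The sample data and the name `positives` are arbitrary.

```julia
xs = [-1, 2, -3, 4]

# Lazy filtered generator; nothing is collected into an array.
positives = (x for x in xs if x > 0)

# What the overloaded splat would compute, spelled with `reduce`.
fused = reduce(+, positives)

# Current behavior: every element becomes a separate argument of `+`.
splatted = +(positives...)
```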

### Non-associative binary functions

Splatting is useful for non-associative binary functions:

```julia
const BinaryFunction = Union{
    typeof(/),
    typeof(-),
    typeof(intersect!),
    typeof(union!),
    typeof(merge!),
    typeof(append!),
    typeof(push!),
    # what else?
}

function apply(op::BinaryFunction, args::Arguments{<:Tuple{Any, VA}})
    @assert !(args.args[1] isa VA)
    return foldl(op, args.args[2].args; init=args.args[1])
end
```

or more generally

```julia
function apply(op::BinaryFunction, args::Arguments)
    @assert !(args.args[1] isa VA)
    rest = Iterators.flatten(x isa VA ? x.args : (x,) for x in args.args[2:end])
    return foldl(op, rest; init=args.args[1])
end
```
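A runnable version of the general definition, again with the interface types repeated and a shortened operator list so that the block stands alone. The lowered calls are written by hand and the example values are arbitrary.

```julia
struct VA{T}
    args::T
end

struct Arguments{T<:Tuple}
    args::T
    Arguments(args...) = new{typeof(args)}(args)
end

# Shortened operator list, just for this example.
const BinaryFunction = Union{typeof(-), typeof(/), typeof(push!)}

function apply(op::BinaryFunction, a::Arguments)
    first_arg = a.args[1]
    @assert !(first_arg isa VA)
    # Chain the remaining arguments, expanding splatted ones lazily.
    rest = Iterators.flatten(x isa VA ? x.args : (x,) for x in Base.tail(a.args))
    return foldl(op, rest; init=first_arg)
end

d = apply(-, Arguments(10, VA([1, 2, 3])))        # ((10 - 1) - 2) - 3
q = apply(/, Arguments(64, VA((2, 4)), 2))        # ((64 / 2) / 4) / 2
p = apply(push!, Arguments(Int[], VA(1:3), 4))    # push!(Int[], (1:3)..., 4)
```

Note that `-(10, [1, 2, 3]...)` is a `MethodError` today, so for `-` and `/` this overload would give meaning to calls that currently fail.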

### Matrix-vector multiplications

Not sure how many people need this, but `*(matrices..., vector)` can be (somewhat) efficiently evaluated by defining

```julia
apply(::typeof(*), args::Arguments{<:Tuple{VA, AbstractVector}}) =
    foldr(*, args.args[1].args; init=args.args[2])
```
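The effect of the right-to-left fold can be checked with ordinary `foldr`. The matrices and the vector below are arbitrary small examples.

```julia
A = [1 2; 3 4]
B = [0 1; 1 0]
v = [1, 1]
matrices = [A, B]

# Evaluates A * (B * v): only matrix-vector products, no matrix-matrix product.
right_to_left = foldr(*, matrices; init=v)

# Current splatting, for comparison of the result.
splatted = *(matrices..., v)
```

Both give the same vector; the difference is only in how much work is done to get there.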

(Of course, allocations could be reduced much further if we really want this.)

A similar optimization can be done for `∘(fs...)`, but I'm not sure about the exact use case.

### Higher-order functions (`map(f, iters...)` etc.)

As this mechanism lets any function optimize splatting, higher-order functions that may splat into the given function can be optimized by defining their own `apply` specialization.  For example, `map(f, iters...)` can be specialized as

```julia
function apply(::typeof(map), args::Arguments{<:Tuple{Any, VA}})
    f = args.args[1]
    iters = args.args[2].args
    return map(splat(f), _zipsplat(iters))
end
```

where `_zipsplat(iters)` behaves like `zip(iters...)` but its element type does not have to be a `Tuple`.
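Since `_zipsplat` does not exist, the shape of this specialization can only be approximated today. The sketch below uses `zip(iters...)` as a stand-in (so it still splats, unlike the proposed helper) together with `Base.splat`; the function `f` and the data are made up.

```julia
f(a, b, c) = a + b * c
iters = ([1, 2, 3], [10, 20, 30], [2, 2, 2])

# Current spelling.
via_splat = map(f, iters...)

# Shape of the proposed specialization, with `zip` standing in for `_zipsplat`.
via_zip = map(Base.splat(f), zip(iters...))
```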

Defining a similar overload for `broadcasted` may be possible provided that the object returned by `_zipsplat(iters)` is indexable.  This lets us nest reduction inside mapping and avoid allocation in some cases:

```julia
vector .= +.(eachcol(matrix)...)
```

Other `map`-like functions including `map!` and `foreach` can also implement this overload.

### `print`-like functions

`print`, `println` and `write` can be invoked with varargs.  We can make, e.g., `println(xs...)` more compiler-friendly and efficient when `xs` is a generic iterator.  Note that `apply(string, Arguments(VA(xs)))` can also be implemented in terms of `apply(print, Arguments(io, VA(xs)))`.
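One way such overloads might look is a plain loop over the iterator. The names `print_each` and `string_each` below are hypothetical helpers for illustration, not proposed API.

```julia
# Hypothetical body for `apply(print, Arguments(io, VA(xs)))`.
function print_each(io::IO, xs)
    for x in xs
        print(io, x)
    end
    return nothing
end

# Hypothetical body for `apply(string, Arguments(VA(xs)))`, built on the above.
function string_each(xs)
    io = IOBuffer()
    print_each(io, xs)
    return String(take!(io))
end

joined = string_each(x^2 for x in 1:3)
```

The loop never needs to know the number of elements at compile time, which is what makes it friendlier to the compiler than a call with a long splatted argument list.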

### `splattable`

`splattable` may be used to solve the performance problem discussed in #29114:

```julia
splattable(x::Number) = (x,)
splattable(x::StaticArray) = Tuple(x)
```

In the case of `Broadcasted`, it can be used for calling `instantiate`:

```julia
splattable(x::Broadcasted) = instantiate(x)
```
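A small sketch of these methods in action, using only Base. `splattable` is defined locally here since it is not an existing Base function, and the `StaticArray` method is left out because it needs an external package.

```julia
using Base.Broadcast: Broadcasted, broadcasted, instantiate

# Hypothetical hook from this proposal.
splattable(x) = x
splattable(x::Number) = (x,)
splattable(x::Broadcasted) = instantiate(x)

bc = broadcasted(+, [1, 2], 1)   # lazy broadcast, not yet instantiated
ready = splattable(bc)           # still lazy, now with its axes computed
total = foldl(+, ready)          # consume it without building an array
```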

