-
-
Couldn't load subscription status.
- Fork 5.7k
Faster mapreduce for Broadcasted #31020
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
f4c68ee
85f6405
cc7c05a
1a60367
ae55dc2
ece81cf
ee68a45
758e003
cec63e7
fb2ff9e
cf04cc8
cedeec6
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
|
|
@@ -166,7 +166,7 @@ BroadcastStyle(a::AbstractArrayStyle{M}, ::DefaultArrayStyle{N}) where {M,N} = | |||||||
| # methods that instead specialize on `BroadcastStyle`, | ||||||||
| # copyto!(dest::AbstractArray, bc::Broadcasted{MyStyle}) | ||||||||
|
|
||||||||
| struct Broadcasted{Style<:Union{Nothing,BroadcastStyle}, Axes, F, Args<:Tuple} | ||||||||
| struct Broadcasted{Style<:Union{Nothing,BroadcastStyle}, Axes, F, Args<:Tuple} <: Base.AbstractBroadcasted | ||||||||
| f::F | ||||||||
| args::Args | ||||||||
| axes::Axes # the axes of the resulting object (may be bigger than implied by `args` if this is nested inside a larger `Broadcasted`) | ||||||||
|
|
@@ -219,6 +219,18 @@ argtype(bc::Broadcasted) = argtype(typeof(bc)) | |||||||
| _eachindex(t::Tuple{Any}) = t[1] | ||||||||
| _eachindex(t::Tuple) = CartesianIndices(t) | ||||||||
|
|
||||||||
| _combined_indexstyle(::Type{Tuple{A}}) where A = IndexStyle(A) | ||||||||
| _combined_indexstyle(::Type{TT}) where TT = | ||||||||
| IndexStyle(IndexStyle(Base.tuple_type_head(TT)), | ||||||||
| _combined_indexstyle(Base.tuple_type_tail(TT))) | ||||||||
|
|
||||||||
| Base.IndexStyle(bc::Broadcasted) = IndexStyle(typeof(bc)) | ||||||||
| Base.IndexStyle(::Type{<:Broadcasted{<:Any,Nothing,<:Any,Args}}) where {Args <: Tuple} = | ||||||||
| _combined_indexstyle(Args) | ||||||||
| Base.IndexStyle(::Type{<:Broadcasted{<:Any}}) = IndexCartesian() | ||||||||
|
|
||||||||
| Base.LinearIndices(bc::Broadcasted) = LinearIndices(eachindex(bc)) | ||||||||
|
||||||||
| _mapreduce(f, op, A::AbstractArray) = _mapreduce(f, op, IndexStyle(A), A) |
which is valid since IndexStyle is callable on AbstractArray:
Line 70 in 236c69b
| IndexStyle(A::AbstractArray) = IndexStyle(typeof(A)) |
However, to make IndexStyle a proper trait for Broadcasted, I defined methods for types and added the method for instance on top of them, just like how it's done for AbstractArray:
Line 227 in f4c68ee
| Base.IndexStyle(bc::Broadcasted) = IndexStyle(typeof(bc)) |
Note that direct definition of IndexStyle(::Broadcasted) would be type-stable if eachindex is so. If eachindex is not type-stable we don't have the performance gain anyway. Initially, I wasn't aiming for adding public interface for Broadcasted in this PR so I thought relying on the implementation detail that IndexStyle is only called on instance was OK. However, I also understand that contaminating "signature space" of IndexStyle is not particularly a clean solution.
If we say that IndexStyle for Broadcasted has to be a proper trait, we need to note that IndexStyle must be properly defined whenever axes is customized since axes is a public interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but as long as you define IndexStyle it should be a proper trait, independent from how it's used internally. Otherwise, you'd better define a custom internal function.
Anyway, what's the problem with the current implementation in this PR? Just that's it's complicated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. So, reading the documentation and the code more, it looks like Base.axes(::Broadcasted{MyStyle}) is not an overloadable method? From the documentation, Base.axes(x) is overloadable if x participate in the broadcasting. This method is then called via combine_axes in axes(::Broadcasted). I was initially worried that IndexStyle(::Type{<:Broadcasted}) and Base.axes(::Broadcasted) can become incompatible if downstream package authors are not careful but it looks like I don't need to worry about it.
tkf marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -12,6 +12,9 @@ else | |
| const SmallUnsigned = Union{UInt8,UInt16,UInt32} | ||
| end | ||
|
|
||
| abstract type AbstractBroadcasted end | ||
| const AbstractArrayOrBroadcasted = Union{AbstractArray, AbstractBroadcasted} | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I needed to define this abstract type AbstractIndexable end
const Indexable = Union{AbstractArray, AbstractIndexable}or even abstract type Indexable{N} end
abstract type AbstractArray{T,N} <: Indexable{N} endso that it's much easier for downstream projects to use indexing-based There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Having an abstract type or a trait like Though a trait would probably be better than an abstract type, since it would be more flexible. In particular, an object returned by There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If we go with trait direction, one option is to use struct NonIndexable <: IndexStyle end
IndexStyle(::Type) = NonIndexable()There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a clever idea. I think the union here is sensible for now — that trait could possibly come later. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Differing the whole designing work for the trait makes sense. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @vtjnash mentioned there may be another way around this bootstrapping issue. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Given that the 1.4 feature freeze is approaching, I support merging the PR with the FWIW, this PR is giving us incredibly good performance to be able to reuse the pairwise summation algorithm with weights at JuliaStats/StatsBase.jl#518. Great work! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks a lot for testing this PR in a real-world scenario! |
||
|
|
||
| """ | ||
| Base.add_sum(x, y) | ||
|
|
||
|
|
@@ -152,7 +155,8 @@ foldr(op, itr; kw...) = mapfoldr(identity, op, itr; kw...) | |
|
|
||
| # This is a generic implementation of `mapreduce_impl()`, | ||
| # certain `op` (e.g. `min` and `max`) may have their own specialized versions. | ||
| @noinline function mapreduce_impl(f, op, A::AbstractArray, ifirst::Integer, ilast::Integer, blksize::Int) | ||
| @noinline function mapreduce_impl(f, op, A::AbstractArrayOrBroadcasted, | ||
| ifirst::Integer, ilast::Integer, blksize::Int) | ||
| if ifirst == ilast | ||
| @inbounds a1 = A[ifirst] | ||
| return mapreduce_first(f, op, a1) | ||
|
|
@@ -175,7 +179,7 @@ foldr(op, itr; kw...) = mapfoldr(identity, op, itr; kw...) | |
| end | ||
| end | ||
|
|
||
| mapreduce_impl(f, op, A::AbstractArray, ifirst::Integer, ilast::Integer) = | ||
| mapreduce_impl(f, op, A::AbstractArrayOrBroadcasted, ifirst::Integer, ilast::Integer) = | ||
| mapreduce_impl(f, op, A, ifirst, ilast, pairwise_blocksize(f, op)) | ||
|
|
||
| """ | ||
|
|
@@ -296,13 +300,13 @@ The default is `reduce_first(op, f(x))`. | |
| """ | ||
| mapreduce_first(f, op, x) = reduce_first(op, f(x)) | ||
|
|
||
| _mapreduce(f, op, A::AbstractArray) = _mapreduce(f, op, IndexStyle(A), A) | ||
| _mapreduce(f, op, A::AbstractArrayOrBroadcasted) = _mapreduce(f, op, IndexStyle(A), A) | ||
|
|
||
| function _mapreduce(f, op, ::IndexLinear, A::AbstractArray{T}) where T | ||
| function _mapreduce(f, op, ::IndexLinear, A::AbstractArrayOrBroadcasted) | ||
| inds = LinearIndices(A) | ||
| n = length(inds) | ||
| if n == 0 | ||
| return mapreduce_empty(f, op, T) | ||
| return mapreduce_empty_iter(f, op, A, IteratorEltype(A)) | ||
| elseif n == 1 | ||
| @inbounds a1 = A[first(inds)] | ||
| return mapreduce_first(f, op, a1) | ||
|
|
@@ -323,7 +327,7 @@ end | |
|
|
||
| mapreduce(f, op, a::Number) = mapreduce_first(f, op, a) | ||
|
|
||
| _mapreduce(f, op, ::IndexCartesian, A::AbstractArray) = mapfoldl(f, op, A) | ||
| _mapreduce(f, op, ::IndexCartesian, A::AbstractArrayOrBroadcasted) = mapfoldl(f, op, A) | ||
|
|
||
| """ | ||
| reduce(op, itr; [init]) | ||
|
|
@@ -473,7 +477,7 @@ isgoodzero(::typeof(max), x) = isbadzero(min, x) | |
| isgoodzero(::typeof(min), x) = isbadzero(max, x) | ||
|
|
||
| function mapreduce_impl(f, op::Union{typeof(max), typeof(min)}, | ||
| A::AbstractArray, first::Int, last::Int) | ||
| A::AbstractArrayOrBroadcasted, first::Int, last::Int) | ||
| a1 = @inbounds A[first] | ||
| v1 = mapreduce_first(f, op, a1) | ||
| v2 = v3 = v4 = v1 | ||
|
|
@@ -746,7 +750,7 @@ function count(pred, itr) | |
| end | ||
| return n | ||
| end | ||
| function count(pred, a::AbstractArray) | ||
| function count(pred, a::AbstractArrayOrBroadcasted) | ||
| n = 0 | ||
| for i in eachindex(a) | ||
| @inbounds n += pred(a[i])::Bool | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turned out my previous approach (combine
IndexStyleof each argument) was wrong since it didn't work with, e.g.,broadcasted(+, zeros(5), zeros(1, 4)). Each argument can beIndexLineareven though the broadcasted indexing is not (that's the main job of broadcasted).So now the new approach is to return an equivalent of
IndexStyle(bc.axes[1])iflength(bc.axes) == 1and then returnIndexCartesian()otherwise. As I needAxestype parameter now, this implementation requires theBroadcastedto beinstantiated first. I guess this is OK since that's the usual broadcasting pipeline.Is this implementation fine? Is
IndexStyle(bc.axes[1]) == IndexLinear() && length(bc.axes) == 1the only possibleIndexLinear()case that is detectable in a type-stable manner? That is to say, I'm ignoring the case likebroadcasted(+, zeros(1, 4), zeros(1, 4))as I need to look at the value to check that it supports linear indexing.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't really require the
Broadcastedto beinstantiated to get anIndexStyle— it just means that if it's not instantiated that it falls back to a cartesian implementation. AllBroadcasteds should support cartesian indexing; only one-dimensional ones will allow straight integers.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's much more accurate way to put it. I wanted to say that "for the broadcasted to be dispatched to the optimized method", it has to be
instantiated first.