-
Notifications
You must be signed in to change notification settings - Fork 60
faster permutations, based on what rdeits suggested #122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
also add testsets
Codecov ReportBase: 96.85% // Head: 96.97% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## master #122 +/- ##
==========================================
+ Coverage 96.85% 96.97% +0.12%
==========================================
Files 7 7
Lines 700 728 +28
==========================================
+ Hits 678 706 +28
Misses 22 22
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
|
Before @benchmark foreach(identity, Combinatorics.permutations(1:2000, 2))
BenchmarkTools.Trial: 1 sample with 1 evaluation.
Single result which took 10.268 s (30.58% GC) to evaluate,
with a memory estimate of 60.47 GiB, over 11994001 allocations.and after @benchmark foreach(identity, Combinatorics.permutations(1:2000, 2))
BenchmarkTools.Trial: 39 samples with 1 evaluation.
Range (min … max): 128.003 ms … 132.626 ms ┊ GC (min … max): 8.80% … 9.58%
Time (median): 130.014 ms ┊ GC (median): 9.60%
Time (mean ± σ): 130.239 ms ± 1.225 ms ┊ GC (mean ± σ): 9.38% ± 0.41%
█ █▃█ ▃ ▃ ▃ █ █
█▁▁▁▁▁▇▁▁▁▁▁▁▁▁███▁▇▁█▁▇█▇▇█▁▁▇▁▁▁▇▁▁▇▁▁▁▇█▇▁▁█▇▇▁▇▁▇▁▁▇▁▁▁▁▇ ▁
128 ms Histogram: frequency by time 133 ms <
Memory estimate: 427.03 MiB, allocs estimate: 7996001. |
attempt at fix does not work yet
Thanks @bkamins for the help on this! Co-authored-by: Bogumił Kamiński <[email protected]>
Co-authored-by: Bogumił Kamiński <[email protected]>
I normally just download binaries from https://julialang.org/downloads/oldreleases/ and install it manually when needed. |
|
OK - looks good. Now someone not involved in development of the code should review it. (thank you in advance) |
|
Would it be worth posting on discourse to find someone willing to review? I can definitely do that |
|
Often it is also helpful if you go on record and state if you think yourself that this is ready to be merged and let loose on the world. |
|
Hi @mschauer, good idea. I have made a thorough review of it (now over a month since I last looked at it), as well as spent more time playing around with the new version. I believe the new version is reliable, much faster (the reason for this pull request), and ready for public use. |
|
@mschauer - can you please review it? I was involved in the implementation so a second set of eyes is required. |
| Base.eltype(::Type{Permutations{T}}) where {T} = Vector{eltype(T)} | ||
| function has_repeats(state::Vector{Int}) | ||
| for outer in eachindex(state) | ||
| for inner in (outer+1):lastindex(state) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inner loop can be marked @inbounds safely, I think. Actually, so can the outer I guess
julia> v = [1:10000;];
julia> @btime has_repeats($v);
31.500 ms (0 allocations: 0 bytes)
julia> function has_repeats(state::Vector{Int})
@inbounds for outer in eachindex(state)
for inner in (outer+1):lastindex(state)
if state[outer] == state[inner]
return true
end
end
end
return false
end
has_repeats (generic function with 1 method)
julia> @btime has_repeats($v);
21.080 ms (0 allocations: 0 bytes)Admittedly, this is unnecessary if the compiler can infer this automatically, but apparently it doesn't do a good job on some platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inner can be marked @inbounds safely because of the type restriction in the signature. It might be good to note that in a comment in case that type restriction is ever loosened.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the input! I've added the inbounds, and a comment explaining why it is safe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. Now someone from JuliaMath needs to do a final review, merge, and release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @mschauer, would you be willing to do the final review, merge, and release? I have reviewed it myself, and believe it is ready for the main branch.
from palday and jishnub
|
Why change the name of the type from |
Good point. I'll fix that |
ararslan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work here 🙂
src/permutations.jl
Outdated
|
|
||
| function Base.length(p::Permutations) | ||
| length(p.data) < p.length && return 0 | ||
| return Int(prod(big(length(p.data) - p.length + 1):big(length(p.data)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why convert to BigInt then back to Int? If the product exceeds typemax(Int), the conversion will error anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. Would you suggest leaving it as a BigInt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. What is the return type of length(::Combinations)? I'd probably opt to make it consistent with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look like length(::Combinations) returns an Int64. Should the length function be worried about overflow in these cases? It's not too hard create a case that overflows and returns an error with Int64 for combinations and for permutations. But perhaps that's outside the scope of this PR.
I'm ok with having length(::Permutations) not converting to BigInt in the first place if we think that's most consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My recommendation would be to have it be consistent with Combinations for now and open an issue to discuss whether the length function should return BigInt. Then both can be changed at once later.
Co-authored-by: Alex Arslan <[email protected]>
Co-authored-by: Alex Arslan <[email protected]>
Co-authored-by: Alex Arslan <[email protected]>
Co-authored-by: Alex Arslan <[email protected]>
ararslan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work and thanks for the contribution!
|
This made Cf.: julia> @code_warntype permutations([1,2,3])
MethodInstance for permutations(::Vector{Int64})
from permutations(a) @ Combinatorics ~/.julia/dev/Combinatorics/src/permutations.jl:73
Arguments
#self#::Core.Const(Combinatorics.permutations)
a::Vector{Int64}
Body::Union{Combinatorics.Permutations{Vector{Int64}}, Vector{Vector{Int64}}}
1 ─ %1 = Combinatorics.permutations::Core.Const(Combinatorics.permutations)
│ %2 = Combinatorics.length::Core.Const(length)
│ %3 = (%2)(a)::Int64
│ %4 = (%1)(a, %3)::Union{Combinatorics.Permutations{Vector{Int64}}, Vector{Vector{Int64}}}
└── return %4 |
|
Hi @thchr, it's been a while since I worked on this, so my memory is a little fuzzy. If I remember correctly, it returns It might be worth running some tests to remind us if that was indeed the case. |
|
I think the main problem with that approach - although it may be faster for specific cases of So, if you test "the whole" thing (calling E.g., before this PR: julia> f(xs) = collect(permutations(xs))
julia> @btime f(1:3);
367.308 ns (20 allocations: 1.30 KiB)And, after: julia> @btime f(1:3);
432.668 ns (17 allocations: 704 bytes)This occurs even though EDIT: Actually, the slow down doesn't appear related to type-instability. We can remove the type-instability by just by-passing the julia> g(xs) = collect(Combinatorics.Permutations(xs, length(xs)))
julia> @btime g(1:3);
428.191 ns (17 allocations: 704 bytes)So I guess the argument for removing the special casing for |
I finally learned how to make a pull request, based on this issue
Here I copied in the code that rdeits originally wrote here
I also updated the tests to use testsets, which provides a slightly nicer user interface when testing.
I also ran the Julia formatter on the files I edited, which made some whitespace changes. Unfortunately github's differ isn't the best at figuring out what whitespace changed, so it might look a little confusing