
@aminya (Member) commented Dec 31, 2019

This PR changes the benchmark system to AcuteBenchmark.

Fixes #25

[Image: IntelVectorMath/Base Performance Comparison]
[Image: IntelVectorMath Performance Comparison]

[Image: Performance over dimensions]

@aminya (Member Author) commented Dec 31, 2019

I haven't run the benchmark yet. I am trying to install MKL on my Windows machine.

I kept the dimensions at 10 just for the sake of fast testing. To run the benchmark for different dimensions, we need to change the dimensions to something like this:

```julia
# Dimensions
oneArgDims = [10 100 1000 10000]

twoArgDims = [10 100 1000 10000; # arg1
              10 100 1000 10000] # arg2
```

Could someone test the benchmark?

@Crown421 (Collaborator)

I am going to check it out soon.

However, I would very strongly prefer to keep AcuteBenchmark out of the dependencies (Project.toml). It is not necessary for running the package itself. I personally run things on a server with a Docker image, which I try to keep as small as possible.

I would prefer a small check at the beginning of the benchmark files prompting the user to install AcuteBenchmark if they want to benchmark on their machine and don't have it yet, but I suspect that will apply to only a small part of the userbase.
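
For example, a minimal sketch of such a guard (just an illustration; it assumes the package keeps the name AcuteBenchmark):

```julia
using Pkg

# Install AcuteBenchmark only when someone actually runs the benchmarks,
# so it never has to be a dependency of the package itself.
if Base.find_package("AcuteBenchmark") === nothing
    @info "AcuteBenchmark is not installed; adding it for benchmarking."
    Pkg.add("AcuteBenchmark")
end

using AcuteBenchmark
```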

@aminya (Member Author) commented Dec 31, 2019

> I am going to check it out soon.
>
> However, I would very strongly prefer to keep AcuteBenchmark out of the dependencies (Project.toml). It is not necessary for running the package itself. I personally run things on a server with a Docker image, which I try to keep as small as possible.
>
> I would prefer a small check at the beginning of the benchmark files prompting the user to install AcuteBenchmark if they want to benchmark on their machine and don't have it yet, but I suspect that will apply to only a small part of the userbase.

I ran the benchmark for an array size of 100. Check the results in the OP.

I didn't add AcuteBenchmark to IntelVectorMath's dependencies; it sits in a separate folder with its own Project.toml.

@Crown421 (Collaborator)

Ah, I apologize; I had only seen the email on my phone. The plots look great.

@aminya (Member Author) commented Jan 1, 2020

Other than the Complex benchmark, which needs an update to Distributions (https://discourse.julialang.org/t/complex-uniform-distribution-between-1-0im-and-2-0im/32831/4), this is ready to be merged.

We can run the benchmark for more dimension sets too. I added a GitHub Action that runs the benchmark and creates a PR. The current data goes up to an array size of 100.

@Crown421 (Collaborator) commented Jan 1, 2020

I looked at the GitHub Action, and I am not fully clear on what it does.
What does it generate that you create the pull request from?
Is the idea to run this action on every commit?

@aminya (Member Author) commented Jan 1, 2020

That is an experiment. I want to offload the benchmarking to a bot. It will make a separate branch, and then we can decide to reference some of the results on master.

However, setting up the environment doesn't work yet. I used something similar to the Travis script in it. I guess the repo name should be changed.

@Crown421 (Collaborator) commented Jan 1, 2020

Ok, sounds good.
I am looking into GitHub Actions right now, and will replace the Travis testing with it today.

I like that it is more granular in its results, and that you can schedule jobs, which I will use to check against the nightly build once a month.

@Crown421 (Collaborator) commented Jan 1, 2020

I also found this: https://bheisler.github.io/post/benchmarking-in-the-cloud/
So results from GitHub Actions might have to be taken with a grain of salt.

I think that, in addition, we can add a Dockerfile to the benchmarking folder that people can use to run the entire thing reproducibly.

@aminya (Member Author) commented Jan 1, 2020

So let me remove the benchmark YAML and merge this PR. I agree with the article you linked; the results may not be reliable.

@aminya (Member Author) commented Jan 1, 2020

I added support for Complex numbers too.
[Image: IntelVectorMath Complex Performance Comparison]

The results for the first few functions are surprising. I was using my laptop a little during that time, so we might need to run the benchmark again for Complex numbers.

@Crown421 (Collaborator) commented Jan 1, 2020

Once I finish updating the testing, I will have a look. I can adapt my Dockerfile and run the benchmark on a server, based on your GitHub Actions file.

@Crown421 (Collaborator) commented Jan 2, 2020

How would you like to proceed here?
I think I can run the benchmark tomorrow on a server, and add updated images to this PR. I assume I only need to run Benchmark.jl?
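
Something like this, I assume (just my guess at the invocation; it assumes Benchmark.jl and its Project.toml live in the benchmark folder):

```julia
using Pkg

# Activate the benchmark environment and run the script from the repo root.
# The paths are assumptions about the folder layout, not confirmed here.
Pkg.activate("benchmark")
Pkg.instantiate()
include("benchmark/Benchmark.jl")
```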

Afterwards we can merge. I think the next step would then be to uncomment all the functions languishing in the package, rerun the benchmarks, and keep the ones that are meaningfully faster than Base?

@aminya aminya force-pushed the AcuteBenchmark branch 2 times, most recently from d81221b to 03bc94e Compare January 2, 2020 22:48
aminya added 5 commits January 3, 2020 02:30
Disabling Complex functions

Performing benchmark

dimplot and updated result

Keyword syntax - AcuteBenchmark registered
@aminya aminya force-pushed the AcuteBenchmark branch 2 times, most recently from c683a89 to 1eaccb1 Compare January 2, 2020 23:09
@aminya (Member Author) commented Jan 2, 2020

I finally managed to force-push by configuring SSH. I squashed the commits. 😄

This can be merged now. Later, we can run the benchmark again for the dimensions that I added in the code.

I may add more plots to AcuteBenchmark (like GFLOPS), which I can then add to this repo later.

@aminya (Member Author) commented Jan 2, 2020

I reran the Complex benchmark and got the same surprising result. I think there is some problem with using the Complex functions of IVM; probably too much overhead in the Complex handling.

Anyway, let's merge this.

@Crown421 (Collaborator) commented Jan 2, 2020

I think the last thing missing is a README in the benchmark folder, which we can link to in the main README.
This should include all the images, and maybe a few words on how to read them.

Also, when I tinkered with benchmarking I made these plots.
They are similar to your dimplots, but show only the performance ratio over array size for all functions. Maybe a similar thing makes sense now or later.

In terms of benchmarking, as long as all it takes is running Benchmark.jl, I can run it overnight.


@Crown421 (Collaborator) commented Jan 2, 2020

I still have the images locally; I can add them back in tomorrow and write the README
(essentially finally solving #1).

@aminya (Member Author) commented Jan 2, 2020

> I think the last thing missing is a README in the benchmark folder, which we can link to in the main README.
> This should include all the images, and maybe a few words on how to read them.
>
> Also, when I tinkered with benchmarking I made these plots.
> They are similar to your dimplots, but show only the performance ratio over array size for all functions. Maybe a similar thing makes sense now or later.
>
> In terms of benchmarking, as long as all it takes is running Benchmark.jl, I can run it overnight.

I plan to add more plots to AcuteBenchmark. I can add this type of plot too!

> I still have the images locally; I can add them back in tomorrow and write the README
> (essentially finally solving #1).

I fixed it! Fortunately, I had a backup 🚀

> I still have the images locally; I can add them back in tomorrow and write the README
> (essentially finally solving #1).

I can do that.

@aminya (Member Author) commented Jan 2, 2020

Nah, it was too early to celebrate. The Real bar images are gone. I will run the benchmark now for a couple of hours.
Thank you!

@aminya (Member Author) commented Jan 3, 2020

I added the results for different dimensions, up to an array size of 10000.
As you can see, the Complex calculation is definitely different from the Real one.

@Crown421 (Collaborator) commented Jan 3, 2020

I had a look at your plots; sorry for making more comments.
Your updated Complex plots look very different, and a relative speed of 10^8 is almost unbelievable.
Also, I think the number of images is too high: the size of the Benchmark folder in your commit is over 16 MB, which I believe is too much.

I vote for 2, maybe 3, bar plots, and also 2-3 dimplots. The latter could possibly contain multiple lines.

Maybe the solution to keep this repo lightweight could be for you to have an AcuteBenchmark Results repository, where you store the results for all the things you used the package for.

@aminya (Member Author) commented Jan 5, 2020

> I had a look at your plots; sorry for making more comments.
> Your updated Complex plots look very different, and a relative speed of 10^8 is almost unbelievable.

Why do you think this is the case? This would mean that Base glitches when running Complex operations.

> Also, I think the number of images is too high: the size of the Benchmark folder in your commit is over 16 MB, which I believe is too much.
>
> I vote for 2, maybe 3, bar plots, and also 2-3 dimplots. The latter could possibly contain multiple lines.
>
> Maybe the solution to keep this repo lightweight could be for you to have an AcuteBenchmark Results repository, where you store the results for all the things you used the package for.

I just uploaded the images for the record; for merging, I will remove them. As you said, a couple of bar plots and dimplots are enough (maybe ~2-3 MB).

@Crown421 (Collaborator) commented Jan 5, 2020

I just ran the following quick test:

```julia
julia> using BenchmarkTools

julia> a = rand(100) + im*rand(100);

julia> @benchmark log.(a)
BenchmarkTools.Trial:
  memory estimate:  1.83 KiB
  allocs estimate:  4
  --------------
  minimum time:     3.604 μs (0.00% GC)
  median time:      4.081 μs (0.00% GC)
  mean time:        4.654 μs (0.58% GC)
  maximum time:     98.969 μs (94.95% GC)
  --------------
  samples:          10000
  evals/sample:     8

julia> @benchmark VML.log(a)
BenchmarkTools.Trial:
  memory estimate:  1.77 KiB
  allocs estimate:  1
  --------------
  minimum time:     1.407 μs (0.00% GC)
  median time:      1.564 μs (0.00% GC)
  mean time:        1.866 μs (1.15% GC)
  maximum time:     56.081 μs (96.53% GC)
  --------------
  samples:          10000
  evals/sample:     10
```

which suggests a performance ratio of about 2.6, whereas your bar plot in bench-dims-set4-relative.png suggests 1.5x10^8, which is an extremely large number.
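
For reference, the 2.6 here is simply the ratio of the median times from the two trials above:

```julia
# Ratio of the median times reported by BenchmarkTools above:
base_median = 4.081e-6  # seconds, Base log.(a)
vml_median  = 1.564e-6  # seconds, VML.log(a)
base_median / vml_median  # ≈ 2.6
```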

Also, what do you think about having a Benchmark Results Repo? That way we don't need any images in this repo, but can link as many as we want.

@aminya (Member Author) commented Jan 6, 2020

Running the benchmark only for size 500 gives this result, which is different from what we got previously.
I think the problem is that memory overflow issue that you experienced before.
[Image: bench-dims-set1-relative]

I will run the benchmark only for Complex numbers to see what happens.

@aminya (Member Author) commented Jan 11, 2020

I can back up this PR's branch on my fork, then remove the images and only put the links in the README. Does that sound good?

@Crown421 (Collaborator) commented Jan 11, 2020

That does sound good.
To make it more 'formal', I would continue to propose an "AcuteBenchmark Results" repo, with a folder for this package, "soon" to be joined by others. In fact, I have two other packages in mind that would also benefit from AcuteBenchmark, which I will look into as soon as I have time.
Each package's folder should contain the .jl files needed to recreate the results, but not the results themselves (possibly explicitly excluded with a .gitignore file).

In the README of this repo, we can link all the pictures from the "Results" repo.

@aminya
Copy link
Member Author

aminya commented Jan 11, 2020

OK. I will create a repo like that. We can also use git submodules to actually pull in the content optionally (for whoever inits the submodules).

I will probably add the results repo as a submodule of the main AcuteBenchmark repo.
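
Roughly like this (just a sketch; the repo URL and target path are placeholders, not the final layout):

```julia
# Wire in the results repo as a submodule and commit the pointer;
# URL and path below are placeholders for illustration only.
run(`git submodule add https://github.com/aminya/AcuteBenchmark-Results benchmark/results`)
run(`git commit -m "Add AcuteBenchmark-Results as a submodule"`)

# Whoever actually wants the content initializes it explicitly:
run(`git submodule update --init`)
```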

@Crown421 (Collaborator)

Yes, I was thinking about submodules. If you know how to set them up properly, I am all for it.

Updating readme to link to submodule

Moving result to AcuteBenchmark-Results
@aminya aminya force-pushed the AcuteBenchmark branch 7 times, most recently from 0e54c29 to 945b576 Compare January 12, 2020 12:46
@aminya (Member Author) commented Jan 12, 2020

I set up the repo and submodules. I will merge this now.

@aminya aminya merged commit 5a678ae into master Jan 12, 2020