Compiler optimizations #494
                
     Open
            
            
          
This pull request includes two simple compiler optimizations:
- `-O3 -ffast-math` in `CMakeLists.txt`
- `sqr(x) {return x * x}` function from `lib/utils.cpp`

Testing
I've also added a build and benchmark script, so we can run w/:
Results
So using both optimizations reduces runtime of this example by ~87%.
Notes
This is a PR to `master`, but the numbers and experiments above are based on comparison to commit `c07eeae9394ab30ca8d984b2ec2e40ab4c2d2e08`, per @manujinda's recommendation. I have not tested on `master`, but it should be easy for you to copy/paste these changes to whichever branch is currently under development.

Possible Future Work
After these optimizations, profiling via `callgrind` reveals that the majority of runtime is spent computing calls to `exp` in `KDE::pdf`. To get additional speedups, we'd need to make `exp` run faster. This may be possible, depending on how much precision is required, by e.g. using approximate `exp` functions like the ones described here.
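
To give a rough idea of the precision/speed trade-off, here is a minimal sketch of one well-known approximation, based on the limit (1 + x/n)^n with n = 1024. It is not part of this PR and not necessarily one of the linked approaches; `approx_exp` and `approx_exp_rel_error` are hypothetical names used only for illustration. The relative error grows roughly as x²/2048, so whether it is acceptable depends on the range of arguments that `KDE::pdf` actually passes to `exp`, which I have not checked.

```cpp
#include <cmath>

// Hypothetical helper (not in this PR): approximates exp(x) using the
// limit (1 + x/n)^n with n = 2^10 = 1024.
inline double approx_exp(double x) {
    double y = 1.0 + x / 1024.0;
    // Ten successive squarings raise y to the 1024th power.
    for (int i = 0; i < 10; ++i) {
        y *= y;
    }
    return y;
}

// Relative error of approx_exp against std::exp, useful for checking
// whether the precision is good enough over a given input range.
inline double approx_exp_rel_error(double x) {
    return std::fabs(approx_exp(x) - std::exp(x)) / std::exp(x);
}
```

Before swapping anything in, it would make sense to evaluate `approx_exp_rel_error` over the argument range seen in practice and to re-run the benchmark, since the gain over the library `exp` depends on the compiler flags and the platform.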