@bjarthur

using Float32 for Lanczos is ~1.7x faster (median 12.7 ms vs 21.3 ms in the benchmarks below) and uses half as much memory (30.53 MiB vs 61.05 MiB) as the current Float64.

this PR currently performs the internal calculations in whatever precision the input has. i could also imagine encoding the precision used for internal computations in the type (e.g. struct Lanczos4OpenCV{T} <: AbstractLanczos end) to decouple it from the input.
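to make the type-parameter idea concrete, here is a minimal sketch. everything in it is hypothetical: `AbstractLanczosSketch`, `Lanczos4Sketch` and `lanczos_weights` are stand-ins rather than the definitions in Interpolations.jl, and the kernel is a textbook Lanczos window, not the OpenCV variant. the only point is to show where `T` would enter.

```julia
# Hypothetical stand-ins; NOT the actual Interpolations.jl definitions.
abstract type AbstractLanczosSketch end

# T is the precision used internally, independent of the input's type.
struct Lanczos4Sketch{T<:AbstractFloat} <: AbstractLanczosSketch end

# Default to Float64, mirroring the current behaviour.
Lanczos4Sketch() = Lanczos4Sketch{Float64}()

# Eight normalized weights of a generic Lanczos kernel with a = 4 for a
# fractional offset δx in [0, 1). All arithmetic is carried out in T.
function lanczos_weights(::Lanczos4Sketch{T}, δx::Real) where {T}
    d = T(δx)   # convert the input once, up front
    a = T(4)
    # taps sit at distances d+3, d+2, ..., d-4 from the query point
    w = ntuple(k -> (x = d - T(k - 4); sinc(x) * sinc(x / a)), Val(8))
    s = sum(w)
    return map(v -> v / s, w)
end
```

with this layout, `lanczos_weights(Lanczos4Sketch{Float32}(), 0.25)` would compute in Float32 even though the offset is a Float64, and `Lanczos4Sketch()` would keep the Float64 path.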

i'm also curious whether there is a more clever way to cast l4_2d_cs at compile time so as not to incur runtime penalties.
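one option i can think of, sketched under assumptions: generate one method per precision with `@eval`, so the converted table is embedded as a literal in each method body. `COEFFS_F64` and `coeffs` are hypothetical names, and the values are illustrative only, not the real contents of l4_2d_cs.

```julia
# Hypothetical table standing in for l4_2d_cs; the values are illustrative only.
const COEFFS_F64 = (
    (1.0, 0.0),
    (-sqrt(0.5), -sqrt(0.5)),
    (0.0, 1.0),
    (sqrt(0.5), -sqrt(0.5)),
)

# One method per supported precision. The conversion runs once, when the
# method is defined, and the result is spliced into the method as a constant.
for T in (Float32, Float64)
    @eval coeffs(::Type{$T}) = $(map(row -> map(T, row), COEFFS_F64))
end

# Fallback for other float types: converts on each call.
coeffs(::Type{T}) where {T<:AbstractFloat} = map(row -> map(T, row), COEFFS_F64)
```

a caller would then write `coeffs(Float32)` (or `coeffs(typeof(δx))`) and get a tuple of Float32 pairs with no per-call conversion. whether this actually beats converting a small constant tuple inline would need to be benchmarked.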

let me know what you think and i'll add some tests and docs.

julia> using Interpolations, BenchmarkTools

julia> x=rand(1_000_000);

julia> @benchmark Interpolations._lanczos4_opencv.(x)
BenchmarkTools.Trial: 231 samples with 1 evaluation per sample.
 Range (min … max):  19.061 ms … 83.561 ms  ┊ GC (min … max): 5.34% … 74.95%
 Time  (median):     21.303 ms              ┊ GC (median):    2.90%
 Time  (mean ± σ):   21.654 ms ±  4.135 ms  ┊ GC (mean ± σ):  4.35% ±  5.10%

                       ▅█▄▃                                    
  ▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▃▃▃▃▅▇█████▄▄▄▃▃▃▃▂▂▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▂ ▃
  19.1 ms         Histogram: frequency by time        24.8 ms <

 Memory estimate: 61.05 MiB, allocs estimate: 4.

julia> x=rand(Float32, 1_000_000);

julia> @benchmark Interpolations._lanczos4_opencv.(x)
BenchmarkTools.Trial: 387 samples with 1 evaluation per sample.
 Range (min … max):  12.078 ms … 76.608 ms  ┊ GC (min … max): 0.00% … 83.87%
 Time  (median):     12.695 ms              ┊ GC (median):    3.05%
 Time  (mean ± σ):   12.928 ms ±  3.290 ms  ┊ GC (mean ± σ):  4.56% ±  4.80%

     ▂       ▄▅▇██▇▅▂                                          
  ▆▇▇██▄▆▆▄▇██████████▄█▄▇▄▆▁▄▁▇▄▄▁▄▄▄▇▄▆▁▁▄▁▁▁▁▆▁▁▁▁▁▁▄▄▁▁▁▄ ▇
  12.1 ms      Histogram: log(frequency) by time      14.5 ms <

 Memory estimate: 30.53 MiB, allocs estimate: 4.

@codecov

codecov bot commented Oct 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.10%. Comparing base (7a2d581) to head (5f69ead).

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #635   +/-   ##
=======================================
  Coverage   88.10%   88.10%           
=======================================
  Files          29       29           
  Lines        1908     1908           
=======================================
  Hits         1681     1681           
  Misses        227      227           

@bjarthur
Author

second commit makes it slightly faster:

julia> x=rand(Float32, 1_000_000);

julia> @benchmark Interpolations.value_weights.(Ref(Lanczos4OpenCV()), x)
BenchmarkTools.Trial: 429 samples with 1 evaluation per sample.
 Range (min … max):  10.796 ms … 75.222 ms  ┊ GC (min … max): 0.00% … 85.30%
 Time  (median):     11.479 ms              ┊ GC (median):    3.84%
 Time  (mean ± σ):   11.664 ms ±  3.107 ms  ┊ GC (mean ± σ):  5.43% ±  4.66%

               ▂█▆▃▄▄▅▇▃▁                                      
  ▃▃▃▃▃▃▁▃▃▃▄▅▆██████████▆▃▃▃▄▃▁▁▂▃▁▁▃▃▂▁▁▃▁▂▁▃▁▂▂▁▁▁▁▁▁▁▁▂▂▂ ▃
  10.8 ms         Histogram: frequency by time          13 ms <

 Memory estimate: 30.53 MiB, allocs estimate: 5.
