Skip to content

Conversation

@br3aker
Copy link
Contributor

@br3aker br3aker commented Nov 23, 2021

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

This PR brings 2 new things:

  1. New IDCT implementation base on Arai, Agui, and Nakajima works, it's a faster than master implementation with same level of overall quality, spectral results are even closer to libjpeg spectral dump
  2. Fused matrix tranpose into zig-zag step during jpeg blocks decoding

Benchmarks

Master

Method Mean Error StdDev
'Baseline 4:4:4 Interleaved' 11.781 ms 0.0737 ms 0.0654 ms
'Baseline 4:2:0 Interleaved' 8.688 ms 0.0345 ms 0.0306 ms
'Baseline 4:0:0 (grayscale)' 1.643 ms 0.0092 ms 0.0086 ms
'Progressive 4:2:0 Non-Interleaved' 13.770 ms 0.0928 ms 0.0823 ms

PR

Method Mean Error StdDev
'Baseline 4:4:4 Interleaved' 11.127 ms 0.0659 ms 0.0550 ms
'Baseline 4:2:0 Interleaved' 8.458 ms 0.0289 ms 0.0256 ms
'Baseline 4:0:0 (grayscale)' 1.550 ms 0.0050 ms 0.0044 ms
'Progressive 4:2:0 Non-Interleaved' 13.220 ms 0.0449 ms 0.0398 ms

@codecov
Copy link

codecov bot commented Nov 23, 2021

Codecov Report

Merging #1847 (586df2e) into master (05c3f2b) will increase coverage by 0%.
The diff coverage is 86%.

❗ Current head 586df2e differs from pull request most recent head da82761. Consider uploading reports for the commit da82761 to get more accurate results
Impacted file tree graph

@@          Coverage Diff           @@
##           master   #1847   +/-   ##
======================================
  Coverage      87%     87%           
======================================
  Files         937     937           
  Lines       48438   48456   +18     
  Branches     6057    6058    +1     
======================================
+ Hits        42215   42270   +55     
+ Misses       5212    5177   -35     
+ Partials     1011    1009    -2     
Flag Coverage Δ
unittests 87% <86%> (+<1%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...rp/Formats/Jpeg/Components/FastFloatingPointDCT.cs 63% <73%> (-5%) ⬇️
src/ImageSharp/Formats/Jpeg/Components/Block8x8.cs 77% <100%> (+8%) ⬆️
...mats/Jpeg/Components/Decoder/HuffmanScanDecoder.cs 93% <100%> (ø)
.../Jpeg/Components/Decoder/JpegBlockPostProcessor.cs 100% <100%> (ø)
.../Jpeg/Components/FastFloatingPointDCT.Intrinsic.cs 100% <100%> (ø)
src/ImageSharp/Formats/Jpeg/Components/ZigZag.cs 100% <100%> (ø)
src/ImageSharp/Formats/Jpeg/JpegDecoderCore.cs 89% <100%> (+<1%) ⬆️
...rc/ImageSharp/Formats/Webp/Lossless/Vp8LEncoder.cs 97% <0%> (-1%) ⬇️
...mageSharp/Common/Helpers/SimdUtils.HwIntrinsics.cs 99% <0%> (+<1%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 05c3f2b...da82761. Read the comment docs.

/// <param name="quantTable">Quantization table to adjust.</param>
public static void AdjustToIDCT(ref Block8x8F quantTable)
{
for (int i = 0; i < Block8x8F.Size; i++)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AdjustmentCoefficients is 64 elements yeah? If so, since Block8x8F indexer doesn't do bounds checks in release we could avoid the check here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Did it without indexer with pure Unsafe.Add. We need nint indexer for Block8x8F & Block8x8 IMO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason we can’t have that now?

Copy link
Contributor Author

@br3aker br3aker Nov 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of places where indexer is used and which can use nint indexer. I'd rather do it as a separate PR. Code cleanup is the last part of my closed decoder optimization PR so I can do it there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Makes sense

Copy link
Member

@JimBobSquarePants JimBobSquarePants left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this. Excellent as ever! 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Jpeg encoder can be further optimized by fusing transpose and zig-zag steps

2 participants