std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs #13566

jedisct1 · 2022-11-16T23:11:20Z

Carryless multiplication was slow on older Intel CPUs, justifying the need for using Karatsuba multiplication.

This is not the case any more; using 4 multiplications to multiply two 128-bit numbers is actually faster than 3 multiplications + shifts and additions.

This is also true on aarch64.

Keep using Karatsuba only when targeting x86 (granted, this is a bit of a brutal shortcut, we should really list all the CPU models that had a slow clmul instruction).

Also remove useless agg_2 treshold and restore the ability to precompute only H and H^2 in ReleaseSmall.

Finally, avoid using u256. Using 128-bit registers is actually faster.

Carryless multiplication was slow on older Intel CPUs, justifying the need for using Karatsuba multiplication. This is not the case any more; using 4 multiplications to multiply two 128-bit numbers is actually faster than 3 multiplications + shifts and additions. This is also true on aarch64. Keep using Karatsuba only when targeting x86 (granted, this is a bit of a brutal shortcut, we should really list all the CPU models that had a slow clmul instruction). Also remove useless agg_2 treshold and restore the ability to precompute only H and H^2 in ReleaseSmall. Finally, avoid using u256. Using 128-bit registers is actually faster.

lib/std/crypto/ghash.zig

daurnimator · 2022-11-17T00:20:50Z

lib/std/crypto/ghash.zig

+    const I256 = struct {
+        hi: u128,
+        lo: u128,
+        mid: u128,


I don't think I understand what mid is

Schoolbook multiplication:

ab * cd = bd + ad + bc + ac

ac is hi, bd is lo and in the middle we have ad+bc. Eventually, a shifted addition will get rid of it:

mid = ad + bc lo = bd + (mid << n) hi = ac + (mid >> n)

We are doing several multiplications in a row. So, instead of adding the middle term after every multiplication, we can accumulate the lo, hi and mid values, and only add mid at the end. This is what we do here.

…3566)" This reverts commit 7cfeae1 which is causing std lib tests to fail on wasm32-wasi.

andrewrk · 2022-11-17T22:39:33Z

Reverted in 72d3f4b, failing std lib tests on wasm32-wasi:

index 0 incorrect. expected 38, found 238
index 0 incorrect. expected 7, found 229
index 0 incorrect. expected 100, found 35
index 0 incorrect. expected 136, found 29
index 0 incorrect. expected 185, found 241
2082 passed; 139 skipped; 5 failed.
test failureError: failed to run main module `/workspace/zig-cache-local-debug/o/3a79b8e31d7ec517c3f810b5d8da5053/test.wasm`

Caused by:
    0: failed to invoke command default
    1: wasm trap: unreachable
       wasm backtrace:
           0: 0x8fd7 - <unknown>!os.abort
           1: 0x7c76 - <unknown>!builtin.default_panic
           2: 0x72d3 - <unknown>!test_runner.main
           3: 0x7257 - <unknown>!_start
       note: using the `WASMTIME_BACKTRACE_DETAILS=1` environment variable to may show more debugging information
       
error: the following test command failed with exit code 134:
wasmtime --dir=. --allow-unknown-exports /workspace/zig-cache-local-debug/o/3a79b8e31d7ec517c3f810b5d8da5053/test.wasm
error: test...
error: The following command exited with error code 1:
/workspace/build-debug/stage3/bin/zig test /workspace/lib/std/std.zig --test-name-prefix std-wasm32-wasi-Debug-bare-single-default  --cache-dir /workspace/zig-cache-local-debug --global-cache-dir /workspace/zig-cache-global-debug --name test -fsingle-threaded -target wasm32-wasi -mcpu generic --test-cmd wasmtime --test-cmd --dir=. --test-cmd --allow-unknown-exports --test-cmd-bin -I /workspace/test -L /deps/local/lib -I /deps/local/include --zig-lib-dir /workspace/lib --enable-cache 
error: the following build command failed with exit code 1:
/workspace/zig-cache-local-debug/o/a0e824bb0d15362e74e8cd9e9655e6d6/build /workspace/build-debug/stage3/bin/zig /workspace /workspace/zig-cache-local-debug /workspace/zig-cache-global-debug test -fqemu -fwasmtime -Dstatic-llvm -Dtarget=native-native-musl --search-prefix /deps/local --zig-lib-dir /workspace/build-debug/../lib

…iglang#13566)" This reapplies commit 72d3f4b.

jedisct1 · 2022-11-17T22:57:50Z

Oops!

Fixed by 4dd061a in #13579

jedisct1 force-pushed the faster-ghash5 branch from 30a5485 to 45d8998 Compare November 17, 2022 00:04

jedisct1 added the standard library This issue involves writing Zig code for the standard library. label Nov 17, 2022

daurnimator reviewed Nov 17, 2022

View reviewed changes

Use a switch, add some comments

3c7b01a

jedisct1 merged commit 7cfeae1 into ziglang:master Nov 17, 2022

jedisct1 deleted the faster-ghash5 branch November 17, 2022 12:07

andrewrk added a commit that referenced this pull request Nov 17, 2022

Revert "std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs (#1…

72d3f4b

…3566)" This reverts commit 7cfeae1 which is causing std lib tests to fail on wasm32-wasi.

jedisct1 added a commit to jedisct1/zig that referenced this pull request Nov 17, 2022

Reapply "std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs (z…

3051e27

…iglang#13566)" This reapplies commit 72d3f4b.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs #13566

std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs #13566

Uh oh!

jedisct1 commented Nov 16, 2022

Uh oh!

Uh oh!

Uh oh!

daurnimator Nov 17, 2022

Uh oh!

jedisct1 Nov 17, 2022

Uh oh!

andrewrk commented Nov 17, 2022

Uh oh!

jedisct1 commented Nov 17, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs #13566

std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs #13566

Uh oh!

Conversation

jedisct1 commented Nov 16, 2022

Uh oh!

Uh oh!

Uh oh!

daurnimator Nov 17, 2022

Choose a reason for hiding this comment

Uh oh!

jedisct1 Nov 17, 2022

Choose a reason for hiding this comment

Uh oh!

andrewrk commented Nov 17, 2022

Uh oh!

jedisct1 commented Nov 17, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants