Skip to content

Conversation

@jedisct1
Copy link
Contributor

Carryless multiplication was slow on older Intel CPUs, justifying the need for using Karatsuba multiplication.

This is not the case any more; using 4 multiplications to multiply two 128-bit numbers is actually faster than 3 multiplications + shifts and additions.

This is also true on aarch64.

Keep using Karatsuba only when targeting x86 (granted, this is a bit of a brutal shortcut, we should really list all the CPU models that had a slow clmul instruction).

Also remove useless agg_2 treshold and restore the ability to precompute only H and H^2 in ReleaseSmall.

Finally, avoid using u256. Using 128-bit registers is actually faster.

Carryless multiplication was slow on older Intel CPUs, justifying
the need for using Karatsuba multiplication.

This is not the case any more; using 4 multiplications to multiply
two 128-bit numbers is actually faster than 3 multiplications +
shifts and additions.

This is also true on aarch64.

Keep using Karatsuba only when targeting x86 (granted, this is a bit
of a brutal shortcut, we should really list all the CPU models that
had a slow clmul instruction).

Also remove useless agg_2 treshold and restore the ability to
precompute only H and H^2 in ReleaseSmall.

Finally, avoid using u256. Using 128-bit registers is actually faster.
@jedisct1 jedisct1 added the standard library This issue involves writing Zig code for the standard library. label Nov 17, 2022
const I256 = struct {
hi: u128,
lo: u128,
mid: u128,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I understand what mid is

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Schoolbook multiplication:

ab * cd =

    bd
+  ad
+  bc
+ ac

ac is hi, bd is lo and in the middle we have ad+bc. Eventually, a shifted addition will get rid of it:

mid = ad + bc
 lo = bd + (mid << n)
 hi = ac + (mid >> n)

We are doing several multiplications in a row. So, instead of adding the middle term after every multiplication, we can accumulate the lo, hi and mid values, and only add mid at the end. This is what we do here.

@jedisct1 jedisct1 merged commit 7cfeae1 into ziglang:master Nov 17, 2022
@jedisct1 jedisct1 deleted the faster-ghash5 branch November 17, 2022 12:07
andrewrk added a commit that referenced this pull request Nov 17, 2022
…3566)"

This reverts commit 7cfeae1 which
is causing std lib tests to fail on wasm32-wasi.
@andrewrk
Copy link
Member

Reverted in 72d3f4b, failing std lib tests on wasm32-wasi:

index 0 incorrect. expected 38, found 238
index 0 incorrect. expected 7, found 229
index 0 incorrect. expected 100, found 35
index 0 incorrect. expected 136, found 29
index 0 incorrect. expected 185, found 241
2082 passed; 139 skipped; 5 failed.
test failureError: failed to run main module `/workspace/zig-cache-local-debug/o/3a79b8e31d7ec517c3f810b5d8da5053/test.wasm`

Caused by:
    0: failed to invoke command default
    1: wasm trap: unreachable
       wasm backtrace:
           0: 0x8fd7 - <unknown>!os.abort
           1: 0x7c76 - <unknown>!builtin.default_panic
           2: 0x72d3 - <unknown>!test_runner.main
           3: 0x7257 - <unknown>!_start
       note: using the `WASMTIME_BACKTRACE_DETAILS=1` environment variable to may show more debugging information
       
error: the following test command failed with exit code 134:
wasmtime --dir=. --allow-unknown-exports /workspace/zig-cache-local-debug/o/3a79b8e31d7ec517c3f810b5d8da5053/test.wasm
error: test...
error: The following command exited with error code 1:
/workspace/build-debug/stage3/bin/zig test /workspace/lib/std/std.zig --test-name-prefix std-wasm32-wasi-Debug-bare-single-default  --cache-dir /workspace/zig-cache-local-debug --global-cache-dir /workspace/zig-cache-global-debug --name test -fsingle-threaded -target wasm32-wasi -mcpu generic --test-cmd wasmtime --test-cmd --dir=. --test-cmd --allow-unknown-exports --test-cmd-bin -I /workspace/test -L /deps/local/lib -I /deps/local/include --zig-lib-dir /workspace/lib --enable-cache 
error: the following build command failed with exit code 1:
/workspace/zig-cache-local-debug/o/a0e824bb0d15362e74e8cd9e9655e6d6/build /workspace/build-debug/stage3/bin/zig /workspace /workspace/zig-cache-local-debug /workspace/zig-cache-global-debug test -fqemu -fwasmtime -Dstatic-llvm -Dtarget=native-native-musl --search-prefix /deps/local --zig-lib-dir /workspace/build-debug/../lib

jedisct1 added a commit to jedisct1/zig that referenced this pull request Nov 17, 2022
@jedisct1
Copy link
Contributor Author

Oops!

Fixed by 4dd061a in #13579

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

standard library This issue involves writing Zig code for the standard library.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants