Skip to content
This repository was archived by the owner on Jun 26, 2020. It is now read-only.

Conversation

@abrown
Copy link
Member

@abrown abrown commented Jul 24, 2019

This PR also adds a simple constant pool implementation based on the discussion in abrown#2. Some form of constant pool is necessary because vector constants of 128-bits (the current cranelift vector size) must be emitted separately--in rodata memory--for the x86 instructions to move them in to the XMM registers.

This PR depends on commits waiting to be merged in #855.

Copy link
Member

@sunfishcode sunfishcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, overall this looks really good!

@abrown abrown force-pushed the add-simd-const-x86-refactored branch 2 times, most recently from c7dc251 to 380ccfa Compare July 26, 2019 22:45
@abrown abrown force-pushed the add-simd-const-x86-refactored branch from 380ccfa to 1e0ab76 Compare July 31, 2019 21:50
@abrown
Copy link
Member Author

abrown commented Jul 31, 2019

After talking to @yurydelendik (thanks for the help!) and debugging the execution of these instructions in wasmtime, I realized that the relocation offsets I initially implemented for constants were not going to work. I mirrored the jump table implementation in cranelift as best I could without fully understanding how wasmtime does relocation. After talking to @yurydelendik I realized that wasmtime is currently not relocating the jump tables from where they are emitted in MemorySink (i.e. right after the function body). Thus, the offsets computed against %rip in cranelift's codegen are not modified by wasmtime and must be correct when they are emitted by cranelift. Understanding this I then modified the constant pool code to have correct offsets (not offsets starting from the beginning of the constant pool but offsets starting from the beginning of the function) in relaxation.rs, just like the jump tables are doing. Now I can run SIMD examples in wasmtime such as:

(module
  (func $test_splat (result i32)
    i32.const 42
    i32x4.splat
    i32x4.extract_lane 0
  )

  (func $test_insert_lane (result i32)
      v128.const i64x2 0 0
      i32.const 99
      i32x4.replace_lane 1
      i32x4.extract_lane 1
  )

  (func $test_const (result i32)
    v128.const i32x4 1 2 3 4
    i32x4.extract_lane 3
  )

  ;; exporting the function allows us to call it with `wasmtime --invoke=...`
  (export "test_splat" (func $test_splat))
  (export "test_insert_lane" (func $test_insert_lane))
  (export "test_const" (func $test_const))
)

by running the following:

cargo +nightly build && \
  wat2wasm --enable-simd --debug-names ../simd.wat -o ../simd.wasm && \
  target/debug/wasmtime ../simd.wasm -g --enable-simd --invoke=test_insert_lane

The breaking change from this PR that wasmtime (and other cranelift users) will have to absorb is the new RelocSink method I added--reloc_constant. For the reasons I describe above, however, I currently implement this as an empty method in wasmtime. That being the case, perhaps RelocSink::reloc_constant is currently unnecessary? Or at the very least, perhaps the parameters should change from code_offset: CodeOffset, reloc: Reloc, constant_offset: ConstantOffset (where constant_offset is the offset from the beginning of the function to the constant value) to code_offset: CodeOffset, reloc: Reloc, constant: Constant (where constant is an entity that allows the user to look up the constant inside the ConstantPool implementation)?

@abrown
Copy link
Member Author

abrown commented Aug 5, 2019

@arunetm, here is a const implementation for x86.

@abrown
Copy link
Member Author

abrown commented Aug 8, 2019

@sunfishcode, @bnjbvr: as we discussed, I added code in cranelift-wasm to keep track of the values from v128.const that are defaulting to I8x16; immediately before the value is used (in the few implemented SIMD instructions) I raw_bitcast the value to the appropriate type if necessary.

@abrown abrown force-pushed the add-simd-const-x86-refactored branch from 2d5865a to 13cca99 Compare August 19, 2019 23:53
@abrown
Copy link
Member Author

abrown commented Aug 19, 2019

@sunfishcode rebased this as well and is ready for review; as a reminder, we will have to update RelocSink in wasmtime when we publish this version of cranelift and update in wasmtime (see bytecodealliance/wasmtime#238).

Copy link
Member

@sunfishcode sunfishcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, just a few comments below:

@abrown abrown force-pushed the add-simd-const-x86-refactored branch 5 times, most recently from d5a679c to cb4bfec Compare August 21, 2019 17:40
Copy link
Member

@sunfishcode sunfishcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, overall this looks good, just a few comments below:

let constants: Vec<Constant> = func.dfg.constants.iter().map(|(h, _)| h.clone()).collect();
for constant_handle in constants {
func.dfg.constants.set_offset(constant_handle, offset);
offset += func.dfg.constants.get(constant_handle).len() as u32
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use try_from() to convert to u32 here. It may not be possible to create a 4 GiB constant today, but if anyone ever could, silent wraparound here would be a subtle bug.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

/// its constant data (i.e. [ConstantData](ir::constant::ConstantData))
#[derive(Clone)]
pub struct ConstantPool {
// this mapping maintains the insertion order as long as Constants are created with sequentially increasing integers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a comment forms a substantial sentence, please capitalize it and end it with a period.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went through and added periods to all the comments in this file.

abrown added 4 commits August 26, 2019 11:38
Examining wasm-objdump revealed that it stores SIMD constants in little-endian order, e.g.:

000071 func[2] <test_const>:
 000072: fd 02 01 00 00 00 02 00 00 | v128.const 0x00000001 0x00000002 0x00000003 0x00000004
 00007b: 00 03 00 00 00 04 00 00 00 |
 000084: fd 0d 03                   | i32x4.extract_lane 3
 000087: 0b                         | end

This change avoids confusion by making the CLIF representation use little-endian order as well.
By default, constants added by SIMD's v128.const will be typed as I8x16 in CLIF. This type must be changed to the appropriate vector type before use to satisfy cranelift's type checking. To do this, we track what SSA values are created by v128.const and convert them with a raw_bitcast immediately before use in the currently implemented SIMD instructions.
To do so we must use a new version of wabt-rs that allows us to enable features (e.g. SIMD) when translating the wasmtests
@abrown abrown force-pushed the add-simd-const-x86-refactored branch from 86aa687 to 5128e19 Compare August 26, 2019 18:41
Copy link
Member

@sunfishcode sunfishcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@sunfishcode sunfishcode merged commit d5058da into bytecodealliance:master Aug 26, 2019
@abrown abrown deleted the add-simd-const-x86-refactored branch August 26, 2019 23:27
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants