Description
The MergeFunctions LLVM pass can rewrite functions so that they contain calls that are not valid under the calling convention of the target. For example, on NVPTX an extern "ptx-kernel" function must not call another extern "ptx-kernel" function.
This is an LLVM bug, described here (thanks @nikic): https://bugs.llvm.org/show_bug.cgi?id=40232. A PR also adds a target option and a -Z flag to control MergeFunctions: #57268.
Example: in the following Rust source, the functions foo and bar get merged by MergeFunctions:
#![crate_type = "lib"]
#![feature(abi_ptx)]
#![feature(lang_items)]
#![feature(link_llvm_intrinsics)]
#![feature(naked_functions)]
#![feature(no_core)]
#![no_core]
#[lang = "sized"]
trait Sized {}
#[lang = "copy"]
trait Copy {}
#[allow(improper_ctypes)]
extern "C" {
  #[link_name = "llvm.nvvm.barrier0"]
  fn syncthreads() -> ();
}
#[inline]
pub unsafe fn _syncthreads() -> () {
  syncthreads()
}
#[no_mangle]
pub unsafe extern "ptx-kernel" fn foo() {
  _syncthreads();
  _syncthreads();
}
#[no_mangle]
pub unsafe extern "ptx-kernel" fn bar() {
  _syncthreads();
  _syncthreads();
}
This yields incorrect PTX assembly: the call.uni bar instruction is not valid, because a kernel is calling another kernel (note that this requires rustc -Z merge-functions=trampolines from the above PR):
//
// Generated by LLVM NVPTX Back-End
//
.version 3.2
.target sm_35
.address_size 64
	// .globl	bar             // -- Begin function bar
                                        // @bar
.visible .entry bar()
{
// %bb.0:                               // %start
	bar.sync 	0;
	bar.sync 	0;
	ret;
                                        // -- End function
}
	// .globl	foo             // -- Begin function foo
.visible .entry foo()                   // @foo
{
// %bb.0:
	{ // callseq 0, 0
	.reg .b32 temp_param_reg;
	// XXX: `call.uni bar` is not a valid call!
	call.uni 
	bar, 
	(
	);
	} // callseq 0
	ret;
                                        // -- End function
}
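To make the rewrite easier to see, here is a rough host-side analogy in plain Rust. It is only a sketch of what merging with trampolines does conceptually, not what LLVM actually emits, and the `*_before` / `*_after` names are hypothetical: one of two identical bodies is kept, and the other function becomes a thunk that calls it. That is harmless for ordinary functions, but it is exactly the kernel-calls-kernel shape seen in the PTX above.

```rust
// Host-side analogy only: plain Rust functions stand in for the two kernels.
// All names here are hypothetical and exist purely for illustration.

/// Before merging: two separate functions whose bodies are identical.
pub fn foo_before(x: u32) -> u32 {
    x.wrapping_mul(3).wrapping_add(1)
}

pub fn bar_before(x: u32) -> u32 {
    x.wrapping_mul(3).wrapping_add(1)
}

/// After merging (trampoline mode, conceptually): one body is kept...
pub fn bar_after(x: u32) -> u32 {
    x.wrapping_mul(3).wrapping_add(1)
}

/// ...and the other function is reduced to a thunk that forwards to it.
/// With "ptx-kernel" functions, this forwarding call is the invalid part.
pub fn foo_after(x: u32) -> u32 {
    bar_after(x)
}
```

The observable behaviour of `foo_after` matches `foo_before`; the transformation is only a problem when the calling convention forbids the forwarding call, as it does for kernels on NVPTX.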
Disabling MergeFunctions (e.g. using rustc -Z merge-functions=disabled) yields correct PTX assembly:
//
// Generated by LLVM NVPTX Back-End
//
.version 3.2
.target sm_35
.address_size 64
	// .globl	foo             // -- Begin function foo
                                        // @foo
.visible .entry foo()
{
// %bb.0:                               // %start
	bar.sync 	0;
	bar.sync 	0;
	ret;
                                        // -- End function
}
	// .globl	bar             // -- Begin function bar
.visible .entry bar()                   // @bar
{
// %bb.0:                               // %start
	bar.sync 	0;
	bar.sync 	0;
	ret;
                                        // -- End function
}
P.S. Currently, the default behavior of MergeFunctions is to emit function aliases, which NVPTX does not support, so controlling MergeFunctions via the merge-functions flag is necessary to generate any of the PTX assembly above.
Meta
I'm on a patched rustc so this may not be so helpful, but here it is anyway:
rustc --version --verbose:
rustc 1.33.0-nightly (fb86d604b 2018-12-27)
binary: rustc
commit-hash: fb86d604bf65c3becd16180b56267a329cf268d5
commit-date: 2018-12-27
host: x86_64-unknown-linux-gnu
release: 1.33.0-nightly
LLVM version: 8.0