Skip to content

Tracking Issue for target_feature_inline_always #145574

@davidtwco

Description

@davidtwco

This is a tracking issue for the target_feature_inline_always experiment, which permits inline(always) to be used with target_feature.

The feature gate for the issue is #![feature(target_feature_inline_always)].

See discussion in the 2025-08-06 t-lang triage meeting.

About tracking issues

Tracking issues are used to record the overall progress of implementation. They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions. A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature.

Instead, open a dedicated issue for the specific matter and add the relevant feature gate label. Discussion comments will get marked as off-topic or deleted. Repeated discussions on the tracking issue may lead to the tracking issue getting locked.

target_feature_inline_always

Rust prohibits the #[inline(always)] attribute from being used on functions annotated with #[target_feature(enable = "...")]. As such, platform vector intrinsics are not always inlined. Inlining of platform vector intrinsics is an important part of their semantics:

You might call this "just a perf issue", but inlining of platform vector intrinsics is an important part of their semantics. They are useless if this does not happen reliably.

Combined with Rust preferring an indirect ABI for vector types (#47743), this lack of guaranteed inlining can lead to lost performance: When not inlined, use of platform vector intrinsics result in vectors being moved in and out of memory for each call due to the Rust ABI, significantly impacting performance. In debug builds, platform vector intrinsics will never be inlined, which can needlessly hamper performance of debug builds of SIMD-heavy projects without being any more debuggable (in fact, it can often be harder to debug, due to all the additional function calls).

In order to guarantee the best performance of SIMD code, in pure Rust code, it is instead necessary to use only extern "C" functions which pass vectors as immediates. If platform vector intrinsics could use inline(always), then these issues would be avoided.

This limitation has been present since #49425 in March 2018, following rust-lang/stdarch#404 reporting that use of inline(always) with target_feature could result in instruction selection errors from LLVM or unsoundness, using the following example:

//! This example is from rust-lang/stdarch#404 in March 2018
#![feature(stdsimd)]
#![feature(target_feature)]

use std::arch::x86_64::*;

// `alpha` is inlined into `main`. As `main` does not have the "sse4.2" target
// feature enabled, LLVM is not able to select the appropriate instructions for
// the intrinsics and this results in an ICE.

#[inline(always)]
#[target_feature(enable = "sse4.2")]
unsafe fn alpha(needle: &[u8], haystack: &[u8]) -> i32 {
    let a = _mm_loadu_si128(needle.as_ptr() as *const _);
    let b = _mm_loadu_si128(haystack.as_ptr() as *const _);

    _mm_cmpestri(a, 3, b, 15, _SIDD_CMP_EQUAL_ORDERED)
}

fn main() {
    let haystack = b"Split \r\n\t line  ";
    let needle = b"\r\n\t ignore this ";

    let idx = unsafe { alpha(needle, haystack) };
    assert_eq!(idx, 6);
}

rust-lang/rfcs#2045 intended that inline(always) and target_feature should be compatible, and this was discussed in the RFC thread, concluding that it could be disallowed temporarily and revisited later during the stabilisation decision:

We can ban #[inline(always)] at first and resolve before stabilization whether we want it or not.

This is due to the behaviour of the alwaysinline hint in LLVM: it skips all checks for compatible target features between the callee and caller, and so can result in unsoundness due to ABI mismatches or internal compiler errors due to failures of instruction selection. This is a consequence of target features being a property of the body containing a call in LLVM, so when a body is inlined, the target features which apply to its calls also change (llvm/llvm-project#70563).

Restrictions preventing use of inline(always) and target_feature) together were not revisited prior to stabilisation of the target_feature attribute as had been intended.

A combination of this restriction and Rust preferring an indirect ABI for vector types meant that there were no further unsoundness resulting from inline(always) and target_feature until #116573:

//! This example is from rust-lang/rust#116573 in October 2023
use std::mem::transmute;
#[cfg(target_arch = "x86")]
use std::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

extern "C" fn no_target_feature(_dummy: f32, x: __m256) {
    let val = unsafe { transmute::<_, [u32; 8]>(x) };
    dbg!(val);
}

#[inline(always)] 
fn no_target_feature_intermediate(dummy: f32, x: __m256) {
    no_target_feature(dummy, x);
}

#[target_feature(enable = "avx")]
unsafe fn with_target_feature(x: __m256) {
  // Critical call: caller and callee have different target features.
  // However, we use the Rust ABI, so this is fine.
  no_target_feature_intermediate(0.0, x);
}

fn main() {
    assert!(is_x86_feature_detected!("avx"));
    // SAFETY: we checked that the `avx` feature is present.
    unsafe {
        with_target_feature(transmute([1; 8]));
    }
}

After inlining no_target_feature_intermediate, this ends up triggering the same issue as in #116558. #116558 led to the introduction of the abi_unsupported_vector_types lint, which prevented calls to non-Rust ABI functions that have SIMD vectors in their signature unless the caller was annotated with the target feature. This lint has since been changed to a hard error and since this intervention, there is no known way of triggering #116573.

#116573 led to investigations around whether this could be fixed in LLVM. Previous attempts to perform safety checks for target features and alwaysinline resulted in many regressions:

Is there a way to reproduce this without #[inline(always)]? Forcing inlining disables target-feature safety checks in LLVM.

(Incidentally, there was an attempt to not do that in LLVM 17, but this was reverted due to the large amount of regressions it caused. People rely on that a lot, including in Rust.)

This issue was reported to LLVM in llvm/llvm-project#70563. As starting to perform the safety checks in the inliner is fraught, instead, the issue asked whether it were possible for a call terminator to be less context-dependent and retain the original ABI of the call after inlining. There was little feedback on this issue. It is understood to be quite challenging and unlikely to be implemented.

Since #111836, the target_feature attribute of closures with inline(always) are ignored. It is assumed that these will be inlined into a function with the target feature.

To enable inline(always) to be used with target_feature, rustc can instead put the alwaysinline hint on the callsites of target_feature+inline(always) functions, rather than on the definitions of those functions. rustc will only do this once it has checked that the target features are compatible and the call would be safe to inline.

If alwaysinline is not appropriate for a given callsite, then a regular inline hint could be added, or no hint at all. rustc could emit a lint warning the user that the function they are calling wants to be inlined but the caller is missing the appropriate target features. As inlining is always just a hint, even with inline(always), not inlining in some circumstances would not be surprising to the user (this already happens for inline(always) recursive calls, for example).

Function pointers to inline(always)+target_feature functions would not have any inlining hints applied to their invocations. Any lints added would also be emitted when function pointers are created in a context where inlining would not be able to occur were it a call instead.

If this were implemented, then existing platform vendor intrinsics could be made inline(always) rather than inline. It is expected that a vast majority of callers of these functions are already annotated with the appropriate target feature, so this would result in no visible change in a majority of cases (other than potentially improved performance and correct semantics of the platform vector intrinsics). Some users would start to receive the lint indicating that their call isn't being inlined, and would likely want to annotate their function appropriately.

This has been previously discussed on Zulip, but that proposal suggested an error for callers of inline(always) functions when inlining was not possible, which is a worse solution than than putting alwaysinline hint on callsites and poses tricky backwards incompatibly issues. At the time, this was seen as an unsoundness that should be fixed in LLVM and anything that Rust could do would be a workaround for that.

It is unlikely that llvm/llvm-project#70563 is fixed in LLVM anytime soon, as there have been RFCs to do this that have made no progress and other LLVM frontends have workarounds - Clang has checks in its frontend for this, applying alwaysinline appropriately to avoid unsoundness. Rust could continue to reject use of inline(always) and target_feature together, insisting that this is an LLVM limitation, but this would just prevent Rust from respecting the semantics of
platform vector intrinsics.

This is forward-compatible with changing back to putting alwaysinline hints on function definitions were llvm/llvm-project#70563 to be fixed.

After the inline(always) and target_feature attributes are permitted to co-exist, then the MIR inliner should be able to always inline when the target features are compatible too (it could be already able to do so once the limitation on the attributes is lifted).

Steps

Unresolved Questions

None yet.

Implementation history

None yet.

Metadata

Metadata

Assignees

No one assigned

    Labels

    B-experimentalBlocker: In-tree experiment; RFC pending, not yet approved or unneeded (requires FCP to stabilize).C-tracking-issueCategory: An issue tracking the progress of sth. like the implementation of an RFCS-tracking-impl-incompleteStatus: The implementation is incomplete.T-langRelevant to the language team

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions