Closed
Labels
A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
C-optimization: Category: An issue highlighting optimization opportunities or PRs implementing such.
E-easy: Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue.
E-needs-test: Call for participation: An issue has been fixed and does not reproduce, but no test has been added.
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
Description
I tried this code, which contains three functions that each check whether all the bits in a u64 are ones:
#[no_mangle]
fn ne_bytes(input: u64) -> bool {
    let bytes = input.to_ne_bytes();
    bytes.iter().all(|x| *x == !0)
}
#[no_mangle]
fn black_box_ne_bytes(input: u64) -> bool {
    let bytes = input.to_ne_bytes();
    let bytes = std::hint::black_box(bytes);
    bytes.iter().all(|x| *x == !0)
}
#[no_mangle]
fn direct(input: u64) -> bool {
    input == !0
}
I expected to see this happen: ne_bytes() should be optimized to the same code as direct(), while black_box_ne_bytes() should be optimized slightly worse.
Instead, this happened: I got the following assembly, where ne_bytes() is somehow optimized worse than black_box_ne_bytes():
ne_bytes:
        mov     rax, rdi
        not     rax
        shl     rax, 8
        sete    cl
        shr     rdi, 56
        cmp     edi, 255
        setae   al
        and     al, cl
        ret
black_box_ne_bytes:
        mov     qword ptr [rsp - 8], rdi
        lea     rax, [rsp - 8]
        cmp     qword ptr [rsp - 8], -1
        sete    al
        ret
direct:
        cmp     rdi, -1
        sete    al
        ret
Meta
Reproducible on godbolt with stable rustc 1.82.0 (f6e511eec 2024-10-15) and nightly rustc 1.85.0-nightly (7db7489f9 2024-11-25)
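As a sanity check that the three functions differ only in codegen and not in behavior, they can be compared on a handful of inputs. This is a minimal sketch rather than part of the original report: the `#[no_mangle]` attributes are dropped, and the `all_agree` helper and the sample values are hypothetical names and data introduced for illustration.

```rust
use std::hint::black_box;

// Same bodies as in the report, minus #[no_mangle].
fn ne_bytes(input: u64) -> bool {
    let bytes = input.to_ne_bytes();
    bytes.iter().all(|x| *x == !0)
}

fn black_box_ne_bytes(input: u64) -> bool {
    let bytes = input.to_ne_bytes();
    let bytes = black_box(bytes);
    bytes.iter().all(|x| *x == !0)
}

fn direct(input: u64) -> bool {
    input == !0
}

// Hypothetical helper: true if all three functions return the same
// answer for every sample.
fn all_agree(samples: &[u64]) -> bool {
    samples.iter().all(|&s| {
        let expected = direct(s);
        ne_bytes(s) == expected && black_box_ne_bytes(s) == expected
    })
}

// Illustrative inputs (an assumption, not from the report): all ones,
// all zeros, and values with exactly one byte or bit cleared.
const SAMPLES: [u64; 6] = [
    u64::MAX,
    0,
    u64::MAX - 1,
    0x00FF_FFFF_FFFF_FFFF,
    0xFFFF_FFFF_FFFF_FF00,
    0xFF00_0000_0000_0000,
];
```

Calling `all_agree(&SAMPLES)` from a `main` function or a unit test should return `true`. This only checks behavior; confirming the optimization itself still requires inspecting the emitted assembly or LLVM IR.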