[CIR][ABI][AArch64][Lowering] Fix calls for struct types > 128 bits #1335
    cir_cconv_unreachable("NYI");

    IRCallArgs[FirstIRArg] = alloca;
    // TODO(cir): add check for cases where we don't need the memcpy
I'm a bit worried about how we're going to remember to tackle this later. Is there any major issue that prevents it from being handled right away?
Unfortunately, there isn't enough information in CIR currently to determine when we don't need the copy. I plan to add these checks incrementally. Also, if it makes it any better, the cases where we don't need the copy are much rarer than the cases where we do -:)
Incremental is fine, I'm mostly curious about the C source that leads to the case where we don't need a copy (I'm assuming that if you made that comment you are coming from somewhere?).
ah, the reason for the comment is here
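For readers following the question about C source: one situation where the extra copy looks avoidable is when the argument is already an unnamed temporary, so no other name in the caller can observe a write through the pointer. This is an illustrative assumption, not something stated in this thread. The sketch below models the lowered callee by hand with a hypothetical `change_lowered` that takes a pointer, and `make_s` and `call_with_temporary` are likewise made-up names.

```c
typedef struct { int a, b, c, d, e; } S;

/* Hypothetical helper: produces the struct as a temporary value. */
S make_s(void) {
    S s = {9, 0, 0, 0, 0};
    return s;
}

/* Hand-written model of the lowered callee: it receives a pointer
   to the argument and may write through it. */
void change_lowered(S *s) { s->a = 10; }

/* Models the source-level call change(make_s()).
   'tmp' stands in for the compiler-created temporary; no source-level
   variable aliases it, so passing its address directly is unobservable. */
int call_with_temporary(void) {
    S tmp = make_s();
    change_lowered(&tmp); /* no second copy made (assumed safe here) */
    return tmp.a;         /* the write lands only in the temporary */
}
```

The write to `a` is confined to the temporary, which is why a second `memcpy` would add nothing in this modeled case. Whether CIR can detect such a case is exactly the information the reply above says is currently missing.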
…1335)

In [PR#1074](#1074) we introduced calls for struct types > 128 bits, but there is an issue here.

[This](https://github.com/llvm/clangir/blob/3e17e7b9404e1a28bf33bdd5943f4a208134d479/clang/lib/CIR/Dialect/Transforms/TargetLowering/LowerFunction.cpp#L1169) is meant to be a `memcpy` of the alloca instead of directly passing the alloca, just like in the [OG](https://github.com/llvm/clangir/blob/3e17e7b9404e1a28bf33bdd5943f4a208134d479/clang/lib/CodeGen/CGCall.cpp#L5323). The PR was meant to use a `memcpy` and later handle cases where we don't need the `memcpy`.

For example, running the following code snippet `tmp.c` using `bin/clang tmp.c -o tmp -Xclang -fclangir -Xclang -fclangir-call-conv-lowering --target=aarch64-none-linux-gnu`:

```
#include <stdio.h>

typedef struct {
  int a, b, c, d, e;
} S;

void change(S s) { s.a = 10; }

void foo(void) {
  S s;
  s.a = 9;
  change(s);
  printf("%d\n", s.a);
}

int main(void) {
  foo();
  return 0;
}
```

gives 10 instead of 9, because we pass the pointer instead of a copy.

Relevant part of the OG LLVM output:

```
@foo()
  %s = alloca %struct.S, align 4
  %byval-temp = alloca %struct.S, align 4
  %a = getelementptr inbounds nuw %struct.S, ptr %s, i32 0, i32 0
  store i32 9, ptr %a, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %byval-temp, ptr align 4 %s, i64 20, i1 false)
  call void @change(ptr noundef %byval-temp)
```

Current LLVM output through CIR:

```
@foo()
  %1 = alloca %struct.S, i64 1, align 4
  %2 = getelementptr %struct.S, ptr %1, i32 0, i32 0
  store i32 9, ptr %2, align 4
  %3 = load %struct.S, ptr %1, align 4
  call void @change(ptr %1)
```

So, there should be a memcpy.

This PR fixes this and adds a note for future work: checking for the cases where the copy is not needed. I have also updated the old test to use structs larger than 128 bits.
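The difference between the two IR listings can be reproduced in plain C on any target by writing the lowered call by hand. The sketch below is only a model of the lowering, not the code in `LowerFunction.cpp`: `change_lowered`, `call_without_copy`, and `call_with_copy` are hypothetical names, and the callee takes an explicit pointer to stand in for the indirect argument passing used for structs over 128 bits.

```c
#include <string.h>

typedef struct { int a, b, c, d, e; } S; /* 5 ints: larger than 128 bits */

/* Hand-written model of the lowered callee: the struct arrives as a
   pointer, and the body writes through it. */
void change_lowered(S *s) { s->a = 10; }

/* Models the buggy output: the caller passes the address of its own
   object, so the callee's write leaks back into the caller. */
int call_without_copy(void) {
    S s = {0};
    s.a = 9;
    change_lowered(&s);
    return s.a;
}

/* Models the fixed output: memcpy into a temporary (the analogue of
   %byval-temp) and pass the temporary's address instead. */
int call_with_copy(void) {
    S s = {0};
    s.a = 9;
    S byval_temp;
    memcpy(&byval_temp, &s, sizeof(S));
    change_lowered(&byval_temp);
    return s.a;
}
```

Here `call_without_copy` returns 10 and `call_with_copy` returns 9, matching the wrong and expected results described above. The copy is what preserves C's pass-by-value semantics once the ABI turns the argument into a pointer.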