-
Notifications
You must be signed in to change notification settings - Fork 304
Enhance LLVM EasyBlock to better handle offload builds #3675
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I'm testing this with a custom EasyConfig for LLVM 19.1.7 based on GCCcore 14.2.0, including offloading support for |
|
Test report by @Thyre Overview of tested easyconfigs (in order)
Build succeeded for 0 out of 1 (1 easyconfigs in total) |
|
Building of one test case consistently fails with a segmentation fault: Certainly not related to this PR, but causes an issue nonetheless, as this aborts building LLVM even if only a test case fails to build. |
|
TODO: Fix EasyBuild abort due to raising an EasyBuildError in test_step, even when ignore_test_failure is set. See: https://easybuild.slack.com/archives/C34UA1HT7/p1743263737188979 Fix this by switching to report_test_failure. |
|
@boegelbot please test @ jsc-zen3 |
|
@Thyre: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de PR test command '
Test results coming soon (I hope)... - notification for comment with ID 2764630785 processed Message to humans: this is just bookkeeping information for me, |
|
Test report by @boegelbot Overview of tested easyconfigs (in order)
Build succeeded for 1 out of 1 (1 easyconfigs in total) |
|
Test report by @Thyre Overview of tested easyconfigs (in order)
Build succeeded (with --ignore-test-failure) for 0 out of 1 (1 easyconfigs in total) |
|
Test report by @Thyre Overview of tested easyconfigs (in order)
Build succeeded (with --ignore-test-failure) for 0 out of 1 (1 easyconfigs in total) |
|
Test report by @Thyre Overview of tested easyconfigs (in order)
Build succeeded (with --ignore-test-failure) for 0 out of 1 (1 easyconfigs in total) |
|
|
|
The argument The AMDGPU targets did not include |
|
Test report by @Thyre Overview of tested easyconfigs (in order)
Build succeeded (with --ignore-test-failure) for 0 out of 1 (1 easyconfigs in total) |
|
Umh it needs Crivella@b5b988d the first part is just to do the build in parallel first and than serial if it fails, the second is just the fixed sanity check EDIT: NVM it was renamed and i see you already added the fix |
|
The build worked fine, but I ran into llvm/llvm-project@7102592 I'll take a look at my failed tests. Everything else looks fine now. |
|
Hm, a very large portion of my tests fail with |
|
@boegelbot please test @ jsc-zen3 |
|
@Thyre: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de PR test command '
Test results coming soon (I hope)... - notification for comment with ID 2766532084 processed Message to humans: this is just bookkeeping information for me, |
|
Test report by @boegelbot Overview of tested easyconfigs (in order)
Build succeeded for 1 out of 1 (1 easyconfigs in total) |
|
Test report by @Thyre Overview of tested easyconfigs (in order)
Build succeeded (with --ignore-test-failure) for 1 out of 1 (1 easyconfigs in total) |
|
I still had the same 142 test failures, but everything else finally worked! |
While not officially documented somewhere, this is used in multiple places within LLVM to pass runtime arguments. This seems to work for LLVM 19, but has to be verified with other versions as well. Signed-off-by: Jan André Reuter <[email protected]>
|
The latest commit allowed both arguments to finally be passed correctly, though I've only checked this on "Linux" (one of my PCs) with a single stage build. Triggered another build here: easybuilders/easybuild-easyconfigs#22517 (comment) Will do additional testing on other systems. To test This is how it looked like during the build: /data/EasyBuild-develop/build/LLVM/19.1.7/GCCcore-14.2.0/llvm.obj.1 > grep -rni -e "LIBOMPTARGET_PLUGINS_TO_BUILD" -e "LIBOMPTARGET_DEVICE_ARCHITECTURES" **/CMakeCache.txt
---------------------------
CMakeCache.txt:2097:RUNTIMES_CMAKE_ARGS:UNINITIALIZED=-DCMAKE_C_FLAGS=--gcc-install-dir=/data/EasyBuild-develop/software/GCCcore/14.2.0/lib/gcc/x86_64-pc-linux-gnu/14.2.0;-DCMAKE_CXX_FLAGS=--gcc-install-dir=/data/EasyBuild-develop/software/GCCcore/14.2.0/lib/gcc/x86_64-pc-linux-gnu/14.2.0;-DPYTHON_EXECUTABLE=/data/EasyBuild-develop/software/Python/3.13.1-GCCcore-14.2.0/bin/python;-DPython_EXECUTABLE=/data/EasyBuild-develop/software/Python/3.13.1-GCCcore-14.2.0/bin/python;-DPython3_EXECUTABLE=/data/EasyBuild-develop/software/Python/3.13.1-GCCcore-14.2.0/bin/python;-DLIBOMPTARGET_PLUGINS_TO_BUILD=host|amdgpu;-DLIBOMPTARGET_DEVICE_ARCHITECTURES=gfx1101|gfx90a
runtimes/runtimes-bins/CMakeCache.txt:938:LIBOMPTARGET_DEVICE_ARCHITECTURES:STRING=gfx1101;gfx90a
runtimes/runtimes-bins/CMakeCache.txt:964:LIBOMPTARGET_PLUGINS_TO_BUILD:STRING=host;amdgpu
runtimes/runtimes-bins/CMakeCache.txt:1252:RUNTIMES_CMAKE_ARGS:UNINITIALIZED=-DCMAKE_C_FLAGS=--gcc-install-dir=/data/EasyBuild-develop/software/GCCcore/14.2.0/lib/gcc/x86_64-pc-linux-gnu/14.2.0;-DCMAKE_CXX_FLAGS=--gcc-install-dir=/data/EasyBuild-develop/software/GCCcore/14.2.0/lib/gcc/x86_64-pc-linux-gnu/14.2.0;-DPYTHON_EXECUTABLE=/data/EasyBuild-develop/software/Python/3.13.1-GCCcore-14.2.0/bin/python;-DPython_EXECUTABLE=/data/EasyBuild-develop/software/Python/3.13.1-GCCcore-14.2.0/bin/python;-DPython3_EXECUTABLE=/data/EasyBuild-develop/software/Python/3.13.1-GCCcore-14.2.0/bin/python;-DLIBOMPTARGET_PLUGINS_TO_BUILD=host;amdgpu;-DLIBOMPTARGET_DEVICE_ARCHITECTURES=gfx1101;gfx90a |
… build Updated runtime arguments for offloading were potentially ignored, as they were never passed to CMake. Did not occur with EB develop, but with EB 5.0.0. Signed-off-by: Jan André Reuter <[email protected]>
ce46ced to
a173a53
Compare
|
While working on an easyconfig + easyblock (which uses the LLVM easyblock and does some customizations on top of it) for ROCm LLVM I was running into the same issue with |
|
Sry was out of the loop for a week while on holidays. Where the |
Yes, they were not passed to the runtime build, causing all offload architectures (and device |
|
New test reports:
|
I managed to reproduce the same crash on a somewhat minimal Fedora 42 installation while trying out easybuilders/easybuild-framework#4860. Looks like this specific LLVM version may be problematic, depending on the configuration. Will move to LLVM 18.1.7, 19.1.1 & 20.1.1 for further testing. |
|
Test report by @Thyre Overview of tested easyconfigs (in order)
Build succeeded for 1 out of 1 (1 easyconfigs in total) Well, that's weird: There is a This system doesn't have |
|
Test report by @Thyre Overview of tested easyconfigs (in order)
Build succeeded for 1 out of 1 (1 easyconfigs in total) This one looks perfectly good. So I guess that LLVM 20 just places a dummy |
|
Triggered another LLVM 18 build here: easybuilders/easybuild-easyconfigs#21832 (comment) Generally, I think that this should be ready now. It won't fix #3702. We should handle this in a separate PR. The remainder looks good now for LLVM 18-20. @Crivella, @bedroge, @boegel, can you take a final look? Thanks a lot. |
bedroge
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've used this myself for several ROCm-LLVM 6.3.3 (LLVM 18) and 6.4.0 (LLVM 19) builds, and that worked fine. I haven't tested it with the regular LLVM builds, but I see several successful test reports.
So I'm happy with the changes. @Crivella: I suspect that you were also going to review this (?), so I won't merge it yet.
|
@bedroge I am running one final build on my WS with this PR, once it is finished i can approve and hit merge |
Crivella
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Going in, thanks @Thyre! |
Fixes the following issues in the new LLVM EasyBlock:
LIBOMPTARGET_PLUGINS_TO_BUILDto runtime CMake variables, as the value was disregarded before, building all offload plugins.pxtasdue to tests failing otherwisebuild_targets = allAMDGPU_GFX_SUPPORT, as outdated and hard to maintain