@ocaisa
Member

ocaisa commented Feb 22, 2022

(created using eb --new-pr)

Replaces #14607
Requires easybuilders/easybuild-framework#3971

….2-GCC-10.3.0-CUDA-11.3.1.eb and patches: MPIwrapper-2.3.2-default-installation-dirs.patch
@ocaisa ocaisa added the new label Feb 22, 2022
@ocaisa
Member Author

ocaisa commented Feb 22, 2022

@eschnett in case you are interested, this is a first working version of MPItrampoline (including an MPI override) for use with EasyBuild.

@ocaisa
Member Author

ocaisa commented Feb 22, 2022

Just a warning: this only works out of the box because the Open MPI in MPItrampoline and the one in MPIwrapper share the same dependency tree (the one in MPIwrapper has a few additional CUDA-related deps, but most notably does not include a GPU-enabled build of libfabric).

To get MPIwrapper to work for an arbitrary MPI, it will most likely need a pretty clever easyblock. We will probably need to make everything a build-only dep (to avoid one-name rule problems), enforce the use of rpath, and use some of the features of MPIwrapper to get around other problems (see #14607 (comment)).

@eschnett

@ocaisa I'm glad this works!

By "works", do you mean that MPItrampoline works with an automatically built Open MPI, or can one also set the respective environment variables to point MPItrampoline to an external MPIwrapper, as would be necessary on an HPC system?

@ocaisa
Member Author

ocaisa commented Feb 22, 2022

The way I've done it here, we rely primarily on the default Open MPI that is built alongside MPItrampoline. This leaves you free to define the environment variables to override that (either within EasyBuild or as an end user).

I've also included a build of MPIwrapper, which wraps a CUDA-enabled Open MPI. That covers all the use cases we currently have in EasyBuild, and it means that the order in which modules are loaded doesn't matter, since we only define the environment variables in the MPIwrapper module.

We'll have to figure out a mechanism to easily allow sites to use their own MPIwrapper (since that potentially requires a second definition of the environment variables, which would mean that the order in which modules are loaded matters). Lmod may rescue us there.
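
To illustrate why load order would start to matter, here is a minimal sketch. It assumes each module simply sets MPItrampoline's `MPITRAMPOLINE_LIB` variable, with a later load overwriting an earlier one. The module names, the paths, and the `load_modules` helper are hypothetical and are not part of this PR.

```python
def load_modules(modules, env=None):
    """Apply each module's environment settings in load order.

    Later modules overwrite variables set by earlier ones, mimicking
    a plain 'set this variable' statement in a module file.
    """
    env = dict(env or {})
    for _name, settings in modules:
        env.update(settings)
    return env


# Hypothetical modules: one provided by EasyBuild, one by the site.
eb_wrapper = (
    "MPIwrapper",
    {"MPITRAMPOLINE_LIB": "/eb/software/MPIwrapper/lib/libmpiwrapper.so"},
)
site_wrapper = (
    "site-MPIwrapper",
    {"MPITRAMPOLINE_LIB": "/opt/site/mpiwrapper/lib/libmpiwrapper.so"},
)

# With a single definition there is nothing to conflict with.
only_eb = load_modules([eb_wrapper])

# With two definitions, whichever module is loaded last wins.
site_last = load_modules([eb_wrapper, site_wrapper])
eb_last = load_modules([site_wrapper, eb_wrapper])

print(site_last["MPITRAMPOLINE_LIB"])
print(eb_last["MPITRAMPOLINE_LIB"])
```

With only one module defining the variable the result is the same for any load order; a second definition makes the outcome depend on which module is loaded last.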

@ocaisa
Member Author

ocaisa commented Feb 23, 2022

Toolchain PR now open for this at easybuilders/easybuild-framework#3971

@ocaisa
Member Author

ocaisa commented Feb 23, 2022

Test report by @ocaisa
SUCCESS
Build succeeded for 5 out of 5 (5 easyconfigs in total)
node1.int.eessi-gpu.learnhpc.eu - Linux Rocky Linux 8.5 (Green Obsidian), x86_64, AMD EPYC 7742 64-Core Processor (zen2), Python 3.9.9
See https://gist.github.com/842c7a01c666c042bd24f6f4462e4caf for a full test report.

@easybuilders easybuilders deleted a comment from boegelbot Feb 23, 2022
@ocaisa
Member Author

ocaisa commented Feb 25, 2022

There's quite a bit of tweaking going on in MPItrampoline right now as we stress test this a bit. I'm going to mark this as WIP until that has settled down.

@ocaisa ocaisa changed the title from "{devel,mpi}[GCC/10.3.0] MPItrampoline v3.3.1, MPIwrapper v2.3.2" to "WIP: {devel,mpi}[GCC/10.3.0] MPItrampoline v3.3.1, MPIwrapper v2.3.2" Feb 25, 2022
@boegel boegel added this to the 4.x milestone Mar 1, 2022
@boegel
Member

boegel commented Mar 1, 2022

Test report by @boegel
SUCCESS
Build succeeded for 5 out of 5 (5 easyconfigs in total)
node2607.swalot.os - Linux CentOS Linux 7.9.2009, x86_64, Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz (haswell), Python 3.6.8
See https://gist.github.com/76f510233f6bdfb5ec8ebebb1bb9a55e for a full test report.

@easybuilders easybuilders deleted a comment from boegelbot Mar 1, 2022
Member

@jfgrimm jfgrimm left a comment

As decided in issue #16330, we have deprecated the use of True to signify a system-toolchain dependency (#16384), in favour of the more intuitive SYSTEM template constant. Due to the change in the test suite, please run eb --sync-pr-with-develop 15018 and update the PR to use SYSTEM instead.
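
As a rough illustration of the requested change, the sketch below rewrites a dependency tuple that uses `True` as its fourth element into one that uses `SYSTEM`. The `SYSTEM` value defined here is only a stand-in so the snippet runs on its own (inside an easyconfig the constant is provided by EasyBuild), the `migrate` helper is hypothetical, and the CUDA entry is an example rather than a line copied from this PR.

```python
# Stand-in for EasyBuild's SYSTEM template constant (assumed value,
# defined here only so the example is self-contained).
SYSTEM = {"name": "system", "version": "system"}


def migrate(dep):
    """Replace a deprecated bare True toolchain marker with SYSTEM.

    Dependencies are tuples of (name, version[, versionsuffix[, toolchain]]);
    only 4-element tuples whose last element is exactly True are changed.
    """
    if len(dep) == 4 and dep[3] is True:
        return dep[:3] + (SYSTEM,)
    return dep


# Deprecated style: True marks a system-toolchain dependency.
old_dep = ("CUDA", "11.3.1", "", True)
# Preferred style: the SYSTEM template constant.
new_dep = ("CUDA", "11.3.1", "", SYSTEM)

dependencies = [old_dep, ("zlib", "1.2.11")]
print([migrate(dep) for dep in dependencies])
```

In an actual easyconfig only the tuple itself changes, for example `('CUDA', '11.3.1', '', SYSTEM)` in place of `('CUDA', '11.3.1', '', True)`.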

@akesandgren
Contributor

Closing this since the GCC(core)/10.3.0 (foss/2021a) toolchain generation is deprecated; see https://docs.easybuild.io/policies/toolchains

Sorry for not getting back to this, @ocaisa.

If this is still relevant, please consider opening a new pull request using a more recent toolchain.

@akesandgren akesandgren closed this May 9, 2025