Skip to content

Conversation

@akesandgren
Copy link
Contributor

@akesandgren akesandgren commented May 7, 2024

…2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch, PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch
@akesandgren
Copy link
Contributor Author

Tests that are failing for me are:

inductor/test_torchinductor 1/1 failed!   test_multilayer_var_lowp
inductor/test_torchinductor_dynamic_shapes 1/1 failed!   test_multilayer_var_lowp
test_cpp_extensions_open_device_registration 1/1 failed!   test_open_device_registration (Not implemented yet ?)
inductor/test_cpu_repro 1/1 failed!    test_scatter_using_atomic_add
test_decomp 1/1 failed!   test_sdpa (_nn_functional_scaled_dot_product_attention_cpu_bfloat16)
inductor/test_torchinductor_opinfo 1/1 failed!
 inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCPU::test_comprehensive_fft_ihfft2_cpu_int32 FAILED
    C++ error

@akesandgren
Copy link
Contributor Author

@Flamefire
The first two are the same, precision problem on AMD zen3 at least
the cpp_extensions_open_device_registration.... haven't a clue yet
the scatter_using_atomic_add looks like it's not compiling to the code it expects, not sure why
test_sdpa is also precision related
I didn't attack the C++ error

@akesandgren
Copy link
Contributor Author

@boegelbot Please test @ jsc-zen3

@boegelbot
Copy link
Collaborator

@akesandgren: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=20489 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_20489 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 4085

Test results coming soon (I hope)...

- notification for comment with ID 2098172875 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@akesandgren
Copy link
Contributor Author

Test report by @akesandgren
FAILED
Build succeeded for 0 out of 1 (3 easyconfigs in total)
b-an02.hpc2n.umu.se - Linux Ubuntu 20.04, x86_64, Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, Python 3.8.10
See https://gist.github.com/akesandgren/ef17ea2435926ca06bbe5cbbe6058158 for a full test report.

@boegelbot
Copy link
Collaborator

Test report by @boegelbot
FAILED
Build succeeded for 2 out of 3 (3 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.3, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.18
See https://gist.github.com/boegelbot/1ce9b70e5410ebd3a1d8dbbce992b8c7 for a full test report.

@akesandgren
Copy link
Contributor Author

Test report by @akesandgren
FAILED
Build succeeded for 2 out of 3 (3 easyconfigs in total)
b-cn1607.hpc2n.umu.se - Linux Ubuntu 22.04, x86_64, Intel(R) Xeon(R) Gold 6336Y CPU @ 2.40GHz, 3 x NVIDIA NVIDIA A100 80GB PCIe, 545.29.06, Python 3.10.12
See https://gist.github.com/akesandgren/9a9b6ec51769af98d6d4689b4e1ba93a for a full test report.

@akesandgren
Copy link
Contributor Author

Interesting...
If I run the tests standalone there are fewer failing tests than when run during a build...

@Flamefire
Copy link
Contributor

Interesting... If I run the tests standalone there are fewer failing tests than when run during a build...

Not unusual for PyTorch ;-)
I just got bitten again by $XDG_CACHE_HOME: PyTorch uses that to store JIT compiled files so rerunning the same test again with the same value for that will result in a different behavior as it will load the file from that directory instead of JIT compiling it.

@akesandgren
Copy link
Contributor Author

akesandgren commented May 14, 2024

These fail because SANDCASTLE=1 when run as part of build

export/test_lift_unlift
export/test_serialize
export/test_torchbind
export/test_unflatten
higher_order_ops/test_with_effects
test_weak

And those are the diff between my standalone test run (which was without SANDCASTLE) and the test-while-building

@akesandgren
Copy link
Contributor Author

@Flamefire Do you know why we set SANDCASTLE=1 in the easyblock?
As far as I can see it is a specific machine that they run tests on...

@Flamefire
Copy link
Contributor

@Flamefire Do you know why we set SANDCASTLE=1 in the easyblock? As far as I can see it is a specific machine that they run tests on...

Yes, there are a lot of things like @unittest.skipIf(IS_SANDCASTLE, "NYI: fuser CPU support for Sandcastle") in the tests and the idea was: If they don't even run/work on their machine we shouldn't even try to do for us.

So we might need to patch those failing ones. For TestWithEffects it loads a different library, similar in test_weak.py and likely for the export tests although I couldn't find the exact ones you mentioned

@akesandgren
Copy link
Contributor Author

I'm doing a test without SANDCASTLE set and test_hub disabled, that's one of only two I found that is doing external downloads, the other being one test in test_nn.
By the looks of some of the comments around SANDCASTLE it doesn't feel like a normal x86_64 based machine...

@akesandgren
Copy link
Contributor Author

And I have manually run the full test suite without SANDCASTLE set on a previous build and saw only 3 failed tests.
So I don't think we need SANDCASTLE set.

@Flamefire
Copy link
Contributor

By the looks of some of the comments around SANDCASTLE it doesn't feel like a normal x86_64 based machine...

Might be. I used it because it disable a LOT of tests, especially those downloading stuff IIRC. See https://github.com/search?q=repo%3Apytorch%2Fpytorch%20IS_SANDCASTLE&type=code

Two such instances seems to skip whole classes of tests at once: https://github.com/pytorch/pytorch/blob/20aa7cc6788ff10dee2d927057b10a81af638a32/test/jit/test_backends.py#L69-L73 and https://github.com/pytorch/pytorch/blob/2e4d0111953e6db7e4ce5cf041e6a78770092495/test/jit/test_torchbind.py#L37-L38

And I have manually run the full test suite without SANDCASTLE set on a previous build and saw only 3 failed tests.

If it is indeed the case that now NOT setting it causes fewer failures then we should. Best to condition it on 2.3+ to not introduce regressions.

I'll try to push a change upstream to use something like @skip_if_sandcastle which would give us an easy way to skip all those tests by patching that function without changing any other behavior controlled by that env variable

@Flamefire
Copy link
Contributor

We have another issue: pytest-rerun-failures interferes with our test parsing. We want some output like

    # ===================== 2 failed, 128 passed, 2 skipped, 2 warnings in 3.43s =====================
    # test_quantization failed!

But now we get:

Running test_cpp_extensions_open_device_registration 1/1 ... [2024-05-13 16:48:56.717884]
Executing ['.../python', '-bb', 'test_cpp_extensions_open_device_registration.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2'] ... [2024-05-13 16:48:56.718522]
===================== test session starts =====================
[...]
('RERUN', {'yellow': True}) [1.1713s]                                                    [100%]
test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration ('RERUN', {'yellow': True}) [0.0036s]                                                    [100%]
test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration FAILED [0.0033s]                                                                         [100%]

===================== RERUNS =====================
_____________________ TestCppExtensionOpenRgistration.test_open_device_registration _____________________
[...]
_____________________ TestCppExtensionOpenRgistration.test_open_device_registration _____________________
[...]
===================== FAILURES =====================
_____________________ TestCppExtensionOpenRgistration.test_open_device_registration _____________________
[...]
===================== short test summary info =====================
FAILED [0.0033s] test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration - AssertionError: RuntimeError not raised
!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!
===================== 1 failed, 2 rerun in 39.35s =====================
Got exit code 1
Retrying...
===================== test session starts =====================
[...]
('RERUN', {'yellow': True}) [1.9584s]                                                    [100%]
test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration [W Module.cpp:160] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...

('RERUN', {'yellow': True}) [0.0036s]                                                    [100%]
test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration [W Module.cpp:160] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...

FAILED [0.0023s]                                                                         [100%]

===================== RERUNS =====================
_____________________ TestCppExtensionOpenRgistration.test_open_device_registration _____________________
[...]
===================== FAILURES =====================
_____________________ TestCppExtensionOpenRgistration.test_open_device_registration _____________________
[...]
===================== short test summary info =====================
FAILED [0.0023s] test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration - AssertionError: RuntimeError not raised
!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!
===================== 1 failed, 2 rerun in 40.27s =====================
Got exit code 1
Retrying...
===================== test session starts =====================
[...]
('RERUN', {'yellow': True}) [1.8911s]                                                    [100%]
test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration [W Module.cpp:160] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...

('RERUN', {'yellow': True}) [0.0032s]                                                    [100%]
test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration [W Module.cpp:160] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...

FAILED [0.0021s]                                                                         [100%]

===================== RERUNS =====================
_____________________ TestCppExtensionOpenRgistration.test_open_device_registration _____________________
[...]
===================== FAILURES =====================
_____________________ TestCppExtensionOpenRgistration.test_open_device_registration _____________________
[...]
===================== short test summary info =====================
FAILED [0.0021s] test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration - AssertionError: RuntimeError not raised
!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!
===================== 1 failed, 2 rerun in 40.00s =====================
Got exit code 1
Retrying...
===================== test session starts =====================
[...]
===================== 1 deselected in 0.02s =====================
The following tests failed consistently: ['test/test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration']
test_cpp_extensions_open_device_registration 1/1 failed!
Running test_cuda 1/1 ... [2024-05-13 16:51:10.730579]
  1. I don't see how we could reasonably parse this
  2. It exits after the first failed test. This means even "1 failed, 2 rerun in 40.00s" just says: "1 test out of an unknown number of tests failed"

@akesandgren
Copy link
Contributor Author

@boegelbot Please test @ jsc-zen3
EB_ARGS="--include-easyblocks-from-pr 3330"

@boegelbot
Copy link
Collaborator

@akesandgren: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=20489 EB_ARGS="--include-easyblocks-from-pr 3330" EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_20489 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 4128

Test results coming soon (I hope)...

- notification for comment with ID 2112638268 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optree requires typing-extensions/4.10.0-GCCcore-13.2.0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that? I installed it just fine:

... python -m pip check completed successfully

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where are you getting typing-extensions? It is not part of Python-3.11.5-GCCcore-13.2.0.eb. optree build fails without typing-extensions.

== installing...
== ... (took 29 secs)
== taking care of extensions...
== restore after iterating...
== postprocessing...
== sanity checking...
== ... (took 3 secs)
== FAILED: Installation ended unsuccessfully (build directory: /build/optree/0.11.0/GCCcore-13.2.0): build failed (first 300 chars): `/app/software/Python/3.11.5-GCCcore-13.2.0/bin/python -m pip check` failed:
optree 0.11.0 requires typing-extensions, which is not installed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like you need to reinstall Python. The current develop version and release 4.9.1 contains it:

('typing_extensions', '4.8.0', {
'checksums': ['df8e4339e9cb77357558cbdbceca33c303714cf861d1eef15e1070055ae8b7ef'],
}),

However it was a change between 4.8.2 and 4.9.x by #19777

From the looks of that PR this was made because too many other ECs depended on that. And IMO it makes sense to include it in Python by default

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, --rebuild --skip added four packages. This will fix many things for me.

== installing extension tomli 2.0.1 (1/4)...
==      configuring...
==      building...
==      testing...
==      installing...
==      ... (took 11 secs)
== installing extension packaging 23.2 (2/4)...
==      configuring...
==      building...
==      testing...
==      installing...
==      ... (took 2 secs)
== installing extension typing_extensions 4.8.0 (3/4)...
==      configuring...
==      building...
==      testing...
==      installing...
==      ... (took 2 secs)
== installing extension setuptools-scm 8.0.4 (4/4)...

@boegelbot
Copy link
Collaborator

Test report by @boegelbot
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#3330
FAILED
Build succeeded for 2 out of 3 (3 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.4, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.18
See https://gist.github.com/boegelbot/50db8a2d24c3a8139108dc99a9001182 for a full test report.

@boegel boegel added this to the 4.x milestone May 22, 2024
@akesandgren
Copy link
Contributor Author

@Flamefire Any ideas on how to deal with the error output parsing problem?

@Flamefire
Copy link
Contributor

@Flamefire Any ideas on how to deal with the error output parsing problem?

Not many. I still have an open issue for that: pytorch/pytorch#126523

No luck so far to get a machine readable output from PyTorch directly. I.e. I wanted them to get the --save-xml option work correctly but nothing yet after pytorch/pytorch#126690 failed.

We could try to get that option working by patching the test files to make sure --junit-xml-reruns and --save-xml is always set/passed. Then we can check if the XML files are any good for us.

Another option would be to revert their changes to the rerun feature using a custom implementation that broke our detection: pytorch/pytorch@3b7d60b

That might get difficult to keep going forward but I don't see any current alternatives.

@jpecar
Copy link
Contributor

jpecar commented Nov 12, 2024

optree in this pr is missing git as builddependency.

@github-actions
Copy link

github-actions bot commented Mar 4, 2025

Updated software PyTorch-2.3.0-foss-2023b.eb

Diff against PyTorch-2.1.2-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
index bce1b68aa7..ca28c8dcea 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
@@ -1,5 +1,5 @@
 name = 'PyTorch'
-version = '2.1.2'
+version = '2.3.0'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
@@ -11,7 +11,6 @@ source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
 patches = [
     'PyTorch-1.7.0_disable-dev-shm-test.patch',
-    'PyTorch-1.11.1_skip-test_init_from_local_shards.patch',
     'PyTorch-1.12.1_add-hypothesis-suppression.patch',
     'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch',
     'PyTorch-1.12.1_fix-TestTorch.test_to.patch',
@@ -23,39 +22,34 @@ patches = [
     'PyTorch-1.13.1_skip-tests-without-fbgemm.patch',
     'PyTorch-2.0.1_avoid-test_quantization-failures.patch',
     'PyTorch-2.0.1_fix-skip-decorators.patch',
-    'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch',
     'PyTorch-2.0.1_fix-vsx-loadu.patch',
-    'PyTorch-2.0.1_no-cuda-stubs-rpath.patch',
     'PyTorch-2.0.1_skip-failing-gradtest.patch',
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
-    'PyTorch-2.1.0_disable-gcc12-warning.patch',
-    'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
-    'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
-    'PyTorch-2.1.0_fix-validationError-output-test.patch',
     'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch',
     'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch',
-    'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch',
     'PyTorch-2.1.0_remove-test-requiring-online-access.patch',
     'PyTorch-2.1.0_skip-diff-test-on-ppc.patch',
     'PyTorch-2.1.0_skip-dynamo-test_predispatch.patch',
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
-    'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
-    'PyTorch-2.1.0_skip-test_wrap_bad.patch',
-    'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
-    'PyTorch-2.1.2_fix-test_memory_profiler.patch',
-    'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
-    'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
-    'PyTorch-2.1.2_fix-vsx-vector-div.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
+    'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch',
+    'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch',
+    'PyTorch-2.3.0_skip-test_init_from_local_shards.patch',
+    'PyTorch-2.3.0_no-cuda-stubs-rpath.patch',
+    'PyTorch-2.3.0_disable-gcc12-warning.patch',
+    'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch',
+    'PyTorch-2.3.0_disable_tests_which_need_network_download.patch',
+    'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch',
+    'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch',
+    'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch',
+    'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch',
+    'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch',
 ]
 checksums = [
-    {'pytorch-v2.1.2.tar.gz': '85effbcce037bffa290aea775c9a4bad5f769cb229583450c40055501ee1acd7'},
+    {'pytorch-v2.3.0.tar.gz': '69579513b26261bbab32e13b7efc99ad287fcf3103087f2d4fdf1adacd25316f'},
     {'PyTorch-1.7.0_disable-dev-shm-test.patch': '622cb1eaeadc06e13128a862d9946bcc1f1edd3d02b259c56a9aecc4d5406b8a'},
-    {'PyTorch-1.11.1_skip-test_init_from_local_shards.patch':
-     '4aeb1b0bc863d4801b0095cbce69f8794066748f0df27c6aaaf729c5ecba04b7'},
     {'PyTorch-1.12.1_add-hypothesis-suppression.patch':
      'e71ffb94ebe69f580fa70e0de84017058325fdff944866d6bd03463626edc32c'},
     {'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch':
@@ -75,28 +69,16 @@ checksums = [
     {'PyTorch-2.0.1_avoid-test_quantization-failures.patch':
      '02e3f47e4ed1d7d6077e26f1ae50073dc2b20426269930b505f4aefe5d2f33cd'},
     {'PyTorch-2.0.1_fix-skip-decorators.patch': '2039012cef45446065e1a2097839fe20bb29fe3c1dcc926c3695ebf29832e920'},
-    {'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch':
-     '1b37194f55ae678f3657b8728dfb896c18ffe8babe90987ce468c4fa9274f357'},
     {'PyTorch-2.0.1_fix-vsx-loadu.patch': 'a0ffa61da2d47c6acd09aaf6d4791e527d8919a6f4f1aa7ed38454cdcadb1f72'},
-    {'PyTorch-2.0.1_no-cuda-stubs-rpath.patch': '8902e58a762240f24cdbf0182e99ccdfc2a93492869352fcb4ca0ec7e407f83a'},
     {'PyTorch-2.0.1_skip-failing-gradtest.patch': '8030bdec6ba49b057ab232d19a7f1a5e542e47e2ec340653a246ec9ed59f8bc1'},
     {'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch':
      '7047862abc1abaff62954da59700f36d4f39fcf83167a638183b1b7f8fec78ae'},
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
-    {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
-    {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
-     'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
-    {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
-     '84bb51a719abc677031a7a3dfe4382ff098b0cbd8b39b8bed2a7fa03f80ac1e9'},
-    {'PyTorch-2.1.0_fix-validationError-output-test.patch':
-     '7eba0942afb121ed92fac30d1529447d892a89eb3d53c565f8e9d480e95f692b'},
     {'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch':
      '3793b4b878be1abe7791efcbd534774b87862cfe7dc4774ca8729b6cabb39e7e'},
     {'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch':
      'aef38adf1210d0c5455e91d7c7a9d9e5caad3ae568301e0ba9fc204309438e7b'},
-    {'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch':
-     '0ac36411e76506b3354c85a8a1260987f66af947ee52ffc64230aee1fa02ea8b'},
     {'PyTorch-2.1.0_remove-test-requiring-online-access.patch':
      '35184b8c5a1b10f79e511cc25db3b8a5585a5d58b5d1aa25dd3d250200b14fd7'},
     {'PyTorch-2.1.0_skip-diff-test-on-ppc.patch': '394157dbe565ffcbc1821cd63d05930957412156cc01e949ef3d3524176a1dda'},
@@ -104,22 +86,34 @@ checksums = [
      '6298daf9ddaa8542850eee9ea005f28594ab65b1f87af43d8aeca1579a8c4354'},
     {'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch':
      '5229ca88a71db7667a90ddc0b809b2c817698bd6e9c5aaabd73d3173cf9b99fe'},
-    {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
-     '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
-    {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
-    {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
-     'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
-    {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
-     '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
-    {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
-     'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
-    {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
-    {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
+    {'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch':
+     '23416f2d9d5226695ec3fbea0671e3650c655c19deefd3f0f8ddab5afa50f485'},
+    {'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch':
+     '0dcbdfde6752c3ff54c5376f521b4a742167669feb7f0f1d4e1d4d55f72b664f'},
+    {'PyTorch-2.3.0_skip-test_init_from_local_shards.patch':
+     '90ed9c2870f57ee6dc032d00873a37e2217a2b92a13035ded1c25ad5306455f2'},
+    {'PyTorch-2.3.0_no-cuda-stubs-rpath.patch':
+     '7ba26824b5def7379cff02ae821a080698e6affea0da45bc846e9ecb89939cb1'},
+    {'PyTorch-2.3.0_disable-gcc12-warning.patch':
+     'a8a624e1a2a5f4c82610173e50bd0f853e49bd5621b432f5aac689f9f6eb1514'},
+    {'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch':
+     '36aa2d5ba175be17f4e996f4fb2d544fe477d4a0bd0644cd59a85063779afc8e'},
+    {'PyTorch-2.3.0_disable_tests_which_need_network_download.patch':
+     'b7fd1a5135dfd4098cdc054182f7bf84a23ac98462a00477712182b5442da855'},
+    {'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch':
+     '041adcd91d994b8c2ab57d227f081cd57e572c157117b37171e1eb8eb576f8fc'},
+    {'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch':
+     'aa6ff764f3f7bf84372a8a257fe1b4ae6dc4b9744ad35f0f9015f2696c62a41e'},
+    {'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch':
+     '9703fd0f1fca8916f6d79d83e9a7efe8e3f717362a5fdaa8f5d9da90d0c75018'},
+    {'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch':
+     '7955f2655db3da18606574fdcbc5990be24098f49ad1db5e86ea756ea1cc506f'},
+    {'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch':
+     'ee07d21c3ac7aeb0bd0e39507b18a417b9125284a529102929c4b5c6727c2976'},
 ]
 
 osdependencies = [OS_PKG_IBVERBS_DEV]
@@ -131,6 +125,9 @@ builddependencies = [
     ('pytest-flakefinder', '1.1.0'),
     ('pytest-rerunfailures', '14.0'),
     ('pytest-shard', '0.1.2'),
+    ('tlparse', '0.3.5'),
+    ('optree', '0.13.0'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
@@ -170,15 +167,24 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
+        # This test is expected to fail when run in their CI, but won't in our case.
+        # It just checks for a "CI" env variable
+        'test_ci_sanity_check_fail',
+        # This fails consistently and is disabled upstream
+        # See https://github.com/pytorch/pytorch/issues/100152 and
+        # https://github.com/pytorch/pytorch/pull/124712
+        'test_cpp_extensions_open_device_registration',
+
     ]
 }
 
-runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-through-error  --verbose %(excluded_tests)s'
+local_test_opts = '--continue-through-error --pipe-logs --verbose %(excluded_tests)s'
+runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py ' + local_test_opts
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 6
 
 tests = ['PyTorch-check-cpp-extension.py']
 
Diff against PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
index 65dfced170..ca28c8dcea 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
@@ -1,18 +1,16 @@
 name = 'PyTorch'
-version = '2.1.2'
-versionsuffix = '-CUDA-%(cudaver)s'
+version = '2.3.0'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2023b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
 patches = [
     'PyTorch-1.7.0_disable-dev-shm-test.patch',
-    'PyTorch-1.11.1_skip-test_init_from_local_shards.patch',
     'PyTorch-1.12.1_add-hypothesis-suppression.patch',
     'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch',
     'PyTorch-1.12.1_fix-TestTorch.test_to.patch',
@@ -24,50 +22,34 @@ patches = [
     'PyTorch-1.13.1_skip-tests-without-fbgemm.patch',
     'PyTorch-2.0.1_avoid-test_quantization-failures.patch',
     'PyTorch-2.0.1_fix-skip-decorators.patch',
-    'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch',
     'PyTorch-2.0.1_fix-vsx-loadu.patch',
-    'PyTorch-2.0.1_no-cuda-stubs-rpath.patch',
     'PyTorch-2.0.1_skip-failing-gradtest.patch',
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
-    'PyTorch-2.1.0_disable-gcc12-warning.patch',
-    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
-    'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
-    'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
-    'PyTorch-2.1.0_fix-validationError-output-test.patch',
     'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch',
     'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch',
-    'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch',
     'PyTorch-2.1.0_remove-test-requiring-online-access.patch',
     'PyTorch-2.1.0_skip-diff-test-on-ppc.patch',
     'PyTorch-2.1.0_skip-dynamo-test_predispatch.patch',
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
-    'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
-    'PyTorch-2.1.0_skip-test_wrap_bad.patch',
-    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
-    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
-    'PyTorch-2.1.2_fix-device-mesh-check.patch',
-    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
-    'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
-    'PyTorch-2.1.2_fix-test_memory_profiler.patch',
-    'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
-    'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
-    'PyTorch-2.1.2_fix-vsx-vector-div.patch',
-    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
-    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
-    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
-    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
-    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
+    'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch',
+    'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch',
+    'PyTorch-2.3.0_skip-test_init_from_local_shards.patch',
+    'PyTorch-2.3.0_no-cuda-stubs-rpath.patch',
+    'PyTorch-2.3.0_disable-gcc12-warning.patch',
+    'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch',
+    'PyTorch-2.3.0_disable_tests_which_need_network_download.patch',
+    'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch',
+    'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch',
+    'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch',
+    'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch',
+    'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch',
 ]
 checksums = [
-    {'pytorch-v2.1.2.tar.gz': '85effbcce037bffa290aea775c9a4bad5f769cb229583450c40055501ee1acd7'},
+    {'pytorch-v2.3.0.tar.gz': '69579513b26261bbab32e13b7efc99ad287fcf3103087f2d4fdf1adacd25316f'},
     {'PyTorch-1.7.0_disable-dev-shm-test.patch': '622cb1eaeadc06e13128a862d9946bcc1f1edd3d02b259c56a9aecc4d5406b8a'},
-    {'PyTorch-1.11.1_skip-test_init_from_local_shards.patch':
-     '4aeb1b0bc863d4801b0095cbce69f8794066748f0df27c6aaaf729c5ecba04b7'},
     {'PyTorch-1.12.1_add-hypothesis-suppression.patch':
      'e71ffb94ebe69f580fa70e0de84017058325fdff944866d6bd03463626edc32c'},
     {'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch':
@@ -87,30 +69,16 @@ checksums = [
     {'PyTorch-2.0.1_avoid-test_quantization-failures.patch':
      '02e3f47e4ed1d7d6077e26f1ae50073dc2b20426269930b505f4aefe5d2f33cd'},
     {'PyTorch-2.0.1_fix-skip-decorators.patch': '2039012cef45446065e1a2097839fe20bb29fe3c1dcc926c3695ebf29832e920'},
-    {'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch':
-     '1b37194f55ae678f3657b8728dfb896c18ffe8babe90987ce468c4fa9274f357'},
     {'PyTorch-2.0.1_fix-vsx-loadu.patch': 'a0ffa61da2d47c6acd09aaf6d4791e527d8919a6f4f1aa7ed38454cdcadb1f72'},
-    {'PyTorch-2.0.1_no-cuda-stubs-rpath.patch': '8902e58a762240f24cdbf0182e99ccdfc2a93492869352fcb4ca0ec7e407f83a'},
     {'PyTorch-2.0.1_skip-failing-gradtest.patch': '8030bdec6ba49b057ab232d19a7f1a5e542e47e2ec340653a246ec9ed59f8bc1'},
     {'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch':
      '7047862abc1abaff62954da59700f36d4f39fcf83167a638183b1b7f8fec78ae'},
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
-    {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
-    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
-     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
-    {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
-     'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
-    {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
-     '84bb51a719abc677031a7a3dfe4382ff098b0cbd8b39b8bed2a7fa03f80ac1e9'},
-    {'PyTorch-2.1.0_fix-validationError-output-test.patch':
-     '7eba0942afb121ed92fac30d1529447d892a89eb3d53c565f8e9d480e95f692b'},
     {'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch':
      '3793b4b878be1abe7791efcbd534774b87862cfe7dc4774ca8729b6cabb39e7e'},
     {'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch':
      'aef38adf1210d0c5455e91d7c7a9d9e5caad3ae568301e0ba9fc204309438e7b'},
-    {'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch':
-     '0ac36411e76506b3354c85a8a1260987f66af947ee52ffc64230aee1fa02ea8b'},
     {'PyTorch-2.1.0_remove-test-requiring-online-access.patch':
      '35184b8c5a1b10f79e511cc25db3b8a5585a5d58b5d1aa25dd3d250200b14fd7'},
     {'PyTorch-2.1.0_skip-diff-test-on-ppc.patch': '394157dbe565ffcbc1821cd63d05930957412156cc01e949ef3d3524176a1dda'},
@@ -118,74 +86,68 @@ checksums = [
      '6298daf9ddaa8542850eee9ea005f28594ab65b1f87af43d8aeca1579a8c4354'},
     {'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch':
      '5229ca88a71db7667a90ddc0b809b2c817698bd6e9c5aaabd73d3173cf9b99fe'},
-    {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
-     '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
-    {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
-    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
-    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
-     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
-    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
-    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
-     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
-    {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
-     'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
-    {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
-     '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
-    {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
-     'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
-    {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
-    {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
-    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
-     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
-    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
-     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
-    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
-    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
-     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
-     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
-    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
-     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
+    {'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch':
+     '23416f2d9d5226695ec3fbea0671e3650c655c19deefd3f0f8ddab5afa50f485'},
+    {'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch':
+     '0dcbdfde6752c3ff54c5376f521b4a742167669feb7f0f1d4e1d4d55f72b664f'},
+    {'PyTorch-2.3.0_skip-test_init_from_local_shards.patch':
+     '90ed9c2870f57ee6dc032d00873a37e2217a2b92a13035ded1c25ad5306455f2'},
+    {'PyTorch-2.3.0_no-cuda-stubs-rpath.patch':
+     '7ba26824b5def7379cff02ae821a080698e6affea0da45bc846e9ecb89939cb1'},
+    {'PyTorch-2.3.0_disable-gcc12-warning.patch':
+     'a8a624e1a2a5f4c82610173e50bd0f853e49bd5621b432f5aac689f9f6eb1514'},
+    {'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch':
+     '36aa2d5ba175be17f4e996f4fb2d544fe477d4a0bd0644cd59a85063779afc8e'},
+    {'PyTorch-2.3.0_disable_tests_which_need_network_download.patch':
+     'b7fd1a5135dfd4098cdc054182f7bf84a23ac98462a00477712182b5442da855'},
+    {'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch':
+     '041adcd91d994b8c2ab57d227f081cd57e572c157117b37171e1eb8eb576f8fc'},
+    {'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch':
+     'aa6ff764f3f7bf84372a8a257fe1b4ae6dc4b9744ad35f0f9015f2696c62a41e'},
+    {'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch':
+     '9703fd0f1fca8916f6d79d83e9a7efe8e3f717362a5fdaa8f5d9da90d0c75018'},
+    {'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch':
+     '7955f2655db3da18606574fdcbc5990be24098f49ad1db5e86ea756ea1cc506f'},
+    {'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch':
+     'ee07d21c3ac7aeb0bd0e39507b18a417b9125284a529102929c4b5c6727c2976'},
 ]
 
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('hypothesis', '6.82.0'),
+    ('CMake', '3.27.6'),
+    ('hypothesis', '6.90.0'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '12.0'),
+    ('pytest-rerunfailures', '14.0'),
     ('pytest-shard', '0.1.2'),
+    ('tlparse', '0.3.5'),
+    ('optree', '0.13.0'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
-    ('CUDA', '12.1.1', '', SYSTEM),
-    ('cuDNN', '8.9.2.26', '-CUDA-%(cudaver)s', SYSTEM),
-    ('magma', '2.7.2', '-CUDA-%(cudaver)s'),
-    ('NCCL', '2.18.3', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.3'),
-    ('Python-bundle-PyPI', '2023.06'),
-    ('protobuf', '24.0'),
-    ('protobuf-python', '4.24.0'),
+    ('Python', '3.11.5'),
+    ('Python-bundle-PyPI', '2023.10'),
+    ('protobuf', '25.3'),
+    ('protobuf-python', '4.25.3'),
     ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.07'),
-    ('PyYAML', '6.0'),
-    ('MPFR', '4.2.0'),
-    ('GMP', '6.2.1'),
+    ('SciPy-bundle', '2023.11'),
+    ('PyYAML', '6.0.1'),
+    ('MPFR', '4.2.1'),
+    ('GMP', '6.3.0'),
     ('numactl', '2.0.16'),
     ('FFmpeg', '6.0'),
-    ('Pillow', '10.0.0'),
-    ('expecttest', '0.1.5'),
-    ('networkx', '3.1'),
+    ('Pillow', '10.2.0'),
+    ('expecttest', '0.2.1'),
+    ('networkx', '3.2.1'),
     ('sympy', '1.12'),
-    ('Z3', '4.12.2'),
+    ('Z3', '4.13.0',),
 ]
 
 use_pip = True
@@ -205,33 +167,24 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
-        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
-        'distributed/tensor/parallel/test_tp_random_state',
-        # failures on OmniPath systems, which don't support some optional InfiniBand features
-        # See https://github.com/pytorch/tensorpipe/issues/413
-        'distributed/pipeline/sync/skip/test_gpipe',
-        'distributed/pipeline/sync/skip/test_leak',
-        'distributed/pipeline/sync/test_bugs',
-        'distributed/pipeline/sync/test_inplace',
-        'distributed/pipeline/sync/test_pipe',
-        'distributed/pipeline/sync/test_transparency',
+        # This test is expected to fail when run in their CI, but won't in our case.
+        # It just checks for a "CI" env variable
+        'test_ci_sanity_check_fail',
+        # This fails consistently and is disabled upstream
+        # See https://github.com/pytorch/pytorch/issues/100152 and
+        # https://github.com/pytorch/pytorch/pull/124712
+        'test_cpp_extensions_open_device_registration',
+
     ]
 }
 
-runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-through-error  --verbose %(excluded_tests)s'
+local_test_opts = '--continue-through-error --pipe-logs --verbose %(excluded_tests)s'
+runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py ' + local_test_opts
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
-# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
-
-# The readelf sanity check command can be taken out once the TestRPATH test from
-# https://github.com/pytorch/pytorch/pull/109493 is accepted, since it is then checked as part of the PyTorch test suite
-local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
-sanity_check_commands = [
-    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
-]
+max_failed_tests = 6
 
 tests = ['PyTorch-check-cpp-extension.py']
 
Diff against PyTorch-2.1.2-foss-2023a.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
index a79f709480..ca28c8dcea 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
@@ -1,17 +1,16 @@
 name = 'PyTorch'
-version = '2.1.2'
+version = '2.3.0'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2023b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
 patches = [
     'PyTorch-1.7.0_disable-dev-shm-test.patch',
-    'PyTorch-1.11.1_skip-test_init_from_local_shards.patch',
     'PyTorch-1.12.1_add-hypothesis-suppression.patch',
     'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch',
     'PyTorch-1.12.1_fix-TestTorch.test_to.patch',
@@ -23,39 +22,34 @@ patches = [
     'PyTorch-1.13.1_skip-tests-without-fbgemm.patch',
     'PyTorch-2.0.1_avoid-test_quantization-failures.patch',
     'PyTorch-2.0.1_fix-skip-decorators.patch',
-    'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch',
     'PyTorch-2.0.1_fix-vsx-loadu.patch',
-    'PyTorch-2.0.1_no-cuda-stubs-rpath.patch',
     'PyTorch-2.0.1_skip-failing-gradtest.patch',
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
-    'PyTorch-2.1.0_disable-gcc12-warning.patch',
-    'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
-    'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
-    'PyTorch-2.1.0_fix-validationError-output-test.patch',
     'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch',
     'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch',
-    'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch',
     'PyTorch-2.1.0_remove-test-requiring-online-access.patch',
     'PyTorch-2.1.0_skip-diff-test-on-ppc.patch',
     'PyTorch-2.1.0_skip-dynamo-test_predispatch.patch',
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
-    'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
-    'PyTorch-2.1.0_skip-test_wrap_bad.patch',
-    'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
-    'PyTorch-2.1.2_fix-test_memory_profiler.patch',
-    'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
-    'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
-    'PyTorch-2.1.2_fix-vsx-vector-div.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
+    'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch',
+    'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch',
+    'PyTorch-2.3.0_skip-test_init_from_local_shards.patch',
+    'PyTorch-2.3.0_no-cuda-stubs-rpath.patch',
+    'PyTorch-2.3.0_disable-gcc12-warning.patch',
+    'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch',
+    'PyTorch-2.3.0_disable_tests_which_need_network_download.patch',
+    'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch',
+    'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch',
+    'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch',
+    'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch',
+    'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch',
 ]
 checksums = [
-    {'pytorch-v2.1.2.tar.gz': '85effbcce037bffa290aea775c9a4bad5f769cb229583450c40055501ee1acd7'},
+    {'pytorch-v2.3.0.tar.gz': '69579513b26261bbab32e13b7efc99ad287fcf3103087f2d4fdf1adacd25316f'},
     {'PyTorch-1.7.0_disable-dev-shm-test.patch': '622cb1eaeadc06e13128a862d9946bcc1f1edd3d02b259c56a9aecc4d5406b8a'},
-    {'PyTorch-1.11.1_skip-test_init_from_local_shards.patch':
-     '4aeb1b0bc863d4801b0095cbce69f8794066748f0df27c6aaaf729c5ecba04b7'},
     {'PyTorch-1.12.1_add-hypothesis-suppression.patch':
      'e71ffb94ebe69f580fa70e0de84017058325fdff944866d6bd03463626edc32c'},
     {'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch':
@@ -75,28 +69,16 @@ checksums = [
     {'PyTorch-2.0.1_avoid-test_quantization-failures.patch':
      '02e3f47e4ed1d7d6077e26f1ae50073dc2b20426269930b505f4aefe5d2f33cd'},
     {'PyTorch-2.0.1_fix-skip-decorators.patch': '2039012cef45446065e1a2097839fe20bb29fe3c1dcc926c3695ebf29832e920'},
-    {'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch':
-     '1b37194f55ae678f3657b8728dfb896c18ffe8babe90987ce468c4fa9274f357'},
     {'PyTorch-2.0.1_fix-vsx-loadu.patch': 'a0ffa61da2d47c6acd09aaf6d4791e527d8919a6f4f1aa7ed38454cdcadb1f72'},
-    {'PyTorch-2.0.1_no-cuda-stubs-rpath.patch': '8902e58a762240f24cdbf0182e99ccdfc2a93492869352fcb4ca0ec7e407f83a'},
     {'PyTorch-2.0.1_skip-failing-gradtest.patch': '8030bdec6ba49b057ab232d19a7f1a5e542e47e2ec340653a246ec9ed59f8bc1'},
     {'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch':
      '7047862abc1abaff62954da59700f36d4f39fcf83167a638183b1b7f8fec78ae'},
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
-    {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
-    {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
-     'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
-    {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
-     '84bb51a719abc677031a7a3dfe4382ff098b0cbd8b39b8bed2a7fa03f80ac1e9'},
-    {'PyTorch-2.1.0_fix-validationError-output-test.patch':
-     '7eba0942afb121ed92fac30d1529447d892a89eb3d53c565f8e9d480e95f692b'},
     {'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch':
      '3793b4b878be1abe7791efcbd534774b87862cfe7dc4774ca8729b6cabb39e7e'},
     {'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch':
      'aef38adf1210d0c5455e91d7c7a9d9e5caad3ae568301e0ba9fc204309438e7b'},
-    {'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch':
-     '0ac36411e76506b3354c85a8a1260987f66af947ee52ffc64230aee1fa02ea8b'},
     {'PyTorch-2.1.0_remove-test-requiring-online-access.patch':
      '35184b8c5a1b10f79e511cc25db3b8a5585a5d58b5d1aa25dd3d250200b14fd7'},
     {'PyTorch-2.1.0_skip-diff-test-on-ppc.patch': '394157dbe565ffcbc1821cd63d05930957412156cc01e949ef3d3524176a1dda'},
@@ -104,56 +86,72 @@ checksums = [
      '6298daf9ddaa8542850eee9ea005f28594ab65b1f87af43d8aeca1579a8c4354'},
     {'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch':
      '5229ca88a71db7667a90ddc0b809b2c817698bd6e9c5aaabd73d3173cf9b99fe'},
-    {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
-     '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
-    {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
-    {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
-     'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
-    {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
-     '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
-    {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
-     'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
-    {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
-    {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
+    {'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch':
+     '23416f2d9d5226695ec3fbea0671e3650c655c19deefd3f0f8ddab5afa50f485'},
+    {'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch':
+     '0dcbdfde6752c3ff54c5376f521b4a742167669feb7f0f1d4e1d4d55f72b664f'},
+    {'PyTorch-2.3.0_skip-test_init_from_local_shards.patch':
+     '90ed9c2870f57ee6dc032d00873a37e2217a2b92a13035ded1c25ad5306455f2'},
+    {'PyTorch-2.3.0_no-cuda-stubs-rpath.patch':
+     '7ba26824b5def7379cff02ae821a080698e6affea0da45bc846e9ecb89939cb1'},
+    {'PyTorch-2.3.0_disable-gcc12-warning.patch':
+     'a8a624e1a2a5f4c82610173e50bd0f853e49bd5621b432f5aac689f9f6eb1514'},
+    {'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch':
+     '36aa2d5ba175be17f4e996f4fb2d544fe477d4a0bd0644cd59a85063779afc8e'},
+    {'PyTorch-2.3.0_disable_tests_which_need_network_download.patch':
+     'b7fd1a5135dfd4098cdc054182f7bf84a23ac98462a00477712182b5442da855'},
+    {'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch':
+     '041adcd91d994b8c2ab57d227f081cd57e572c157117b37171e1eb8eb576f8fc'},
+    {'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch':
+     'aa6ff764f3f7bf84372a8a257fe1b4ae6dc4b9744ad35f0f9015f2696c62a41e'},
+    {'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch':
+     '9703fd0f1fca8916f6d79d83e9a7efe8e3f717362a5fdaa8f5d9da90d0c75018'},
+    {'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch':
+     '7955f2655db3da18606574fdcbc5990be24098f49ad1db5e86ea756ea1cc506f'},
+    {'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch':
+     'ee07d21c3ac7aeb0bd0e39507b18a417b9125284a529102929c4b5c6727c2976'},
 ]
 
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('hypothesis', '6.82.0'),
+    ('CMake', '3.27.6'),
+    ('hypothesis', '6.90.0'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '12.0'),
+    ('pytest-rerunfailures', '14.0'),
     ('pytest-shard', '0.1.2'),
+    ('tlparse', '0.3.5'),
+    ('optree', '0.13.0'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.3'),
-    ('Python-bundle-PyPI', '2023.06'),
-    ('protobuf', '24.0'),
-    ('protobuf-python', '4.24.0'),
+    ('Python', '3.11.5'),
+    ('Python-bundle-PyPI', '2023.10'),
+    ('protobuf', '25.3'),
+    ('protobuf-python', '4.25.3'),
     ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.07'),
-    ('PyYAML', '6.0'),
-    ('MPFR', '4.2.0'),
-    ('GMP', '6.2.1'),
+    ('SciPy-bundle', '2023.11'),
+    ('PyYAML', '6.0.1'),
+    ('MPFR', '4.2.1'),
+    ('GMP', '6.3.0'),
     ('numactl', '2.0.16'),
     ('FFmpeg', '6.0'),
-    ('Pillow', '10.0.0'),
-    ('expecttest', '0.1.5'),
-    ('networkx', '3.1'),
+    ('Pillow', '10.2.0'),
+    ('expecttest', '0.2.1'),
+    ('networkx', '3.2.1'),
     ('sympy', '1.12'),
-    ('Z3', '4.12.2',),
+    ('Z3', '4.13.0',),
 ]
 
 use_pip = True
+buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
     '': [
@@ -169,15 +167,24 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
+        # This test is expected to fail when run in their CI, but won't in our case.
+        # It just checks for a "CI" env variable
+        'test_ci_sanity_check_fail',
+        # This fails consistently and is disabled upstream
+        # See https://github.com/pytorch/pytorch/issues/100152 and
+        # https://github.com/pytorch/pytorch/pull/124712
+        'test_cpp_extensions_open_device_registration',
+
     ]
 }
 
-runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-through-error  --verbose %(excluded_tests)s'
+local_test_opts = '--continue-through-error --pipe-logs --verbose %(excluded_tests)s'
+runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py ' + local_test_opts
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 6
 
 tests = ['PyTorch-check-cpp-extension.py']
 

@akesandgren
Copy link
Contributor Author

@boegelbot Please test @ jsc-zen3

@boegelbot
Copy link
Collaborator

@akesandgren: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=20489 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_20489 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 5910

Test results coming soon (I hope)...

- notification for comment with ID 2712807286 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Copy link
Collaborator

Test report by @boegelbot
FAILED
Build succeeded for 2 out of 3 (2 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.5, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.21
See https://gist.github.com/boegelbot/9ed2424b7a6d9a69906d47fee8dadb2b for a full test report.

@akesandgren
Copy link
Contributor Author

@boegelbot Please test @ jsc-zen3

@boegelbot
Copy link
Collaborator

@akesandgren: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=20489 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_20489 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 5912

Test results coming soon (I hope)...

- notification for comment with ID 2714831517 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Copy link
Collaborator

Test report by @boegelbot
FAILED
Build succeeded for 1 out of 2 (2 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.5, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.21
See https://gist.github.com/boegelbot/3084d4ba47631f85e60969de4c8f10d6 for a full test report.

@Flamefire
Copy link
Contributor

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
n1585 - Linux RHEL 8.9 (Ootpa), x86_64, Intel(R) Xeon(R) Platinum 8470 (icelake), Python 3.8.17
See https://gist.github.com/Flamefire/dfc494f356037b43afe18ae26f0de83c for a full test report.

@akesandgren
Copy link
Contributor Author

Test report by @akesandgren
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
b-an02.hpc2n.umu.se - Linux Ubuntu 20.04, x86_64, Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, Python 3.8.10
See https://gist.github.com/akesandgren/f9b7483d858482a7db2b48ab594ad865 for a full test report.

@akesandgren
Copy link
Contributor Author

Test report by @akesandgren
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
b-cn1611.hpc2n.umu.se - Linux Ubuntu 22.04, x86_64, AMD EPYC 7313 16-Core Processor, 1 x NVIDIA NVIDIA A100 80GB PCIe, 555.58.02, Python 3.10.12
See https://gist.github.com/akesandgren/4e590fcfa6500ba79bf85348e2268cbb for a full test report.

@akesandgren
Copy link
Contributor Author

I'd say this one is ready for merging now.

@verdurin
Copy link
Member

Test report by @verdurin
SUCCESS
Build succeeded for 5 out of 5 (2 easyconfigs in total)
easybuild-el8.cloud.in.bmrc.ox.ac.uk - Linux Rocky Linux 8.10, x86_64, Intel Xeon Processor (Skylake, IBRS), Python 3.6.8
See https://gist.github.com/verdurin/db4fc4853d17e004e278136c926fc412 for a full test report.

@verdurin verdurin modified the milestones: 4.x, release after 4.9.4 Mar 18, 2025
Copy link
Member

@verdurin verdurin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine.

@verdurin verdurin merged commit 44919f5 into easybuilders:develop Mar 18, 2025
10 checks passed
@akesandgren akesandgren deleted the 20240507124933_new_pr_PyTorch230 branch March 18, 2025 12:50
@boegel boegel modified the milestones: release after 4.9.4, 5.0.0, release after 5.0.0 Mar 18, 2025
@Flamefire
Copy link
Contributor

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
i7101 - Linux Rocky Linux 8.9 (Green Obsidian), x86_64, AMD EPYC 7702 64-Core Processor (zen2), Python 3.8.17
See https://gist.github.com/Flamefire/6e3669c9cf90da1e4303c5f8db5ddd68 for a full test report.

@Flamefire
Copy link
Contributor

2.3.0 for 2024 at #22616

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants