{ai}[foss/2023b] PyTorch v2.3.0 #20489

akesandgren · 2024-05-07T10:49:42Z

(created using eb --new-pr)

Depends on:

Use unittest XML files to parse PyTorch test results easybuild-easyblocks#3633

…2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch, PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch

akesandgren · 2024-05-07T10:51:05Z

Tests that are failing for me are:

inductor/test_torchinductor 1/1 failed!   test_multilayer_var_lowp
inductor/test_torchinductor_dynamic_shapes 1/1 failed!   test_multilayer_var_lowp
test_cpp_extensions_open_device_registration 1/1 failed!   test_open_device_registration (Not implemented yet ?)
inductor/test_cpu_repro 1/1 failed!    test_scatter_using_atomic_add
test_decomp 1/1 failed!   test_sdpa (_nn_functional_scaled_dot_product_attention_cpu_bfloat16)
inductor/test_torchinductor_opinfo 1/1 failed!
 inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCPU::test_comprehensive_fft_ihfft2_cpu_int32 FAILED
    C++ error

akesandgren · 2024-05-07T10:53:44Z

@Flamefire
The first two are the same failure: a precision problem, at least on AMD zen3.
For test_cpp_extensions_open_device_registration I don't have a clue yet.
test_scatter_using_atomic_add looks like it is not compiling to the code the test expects; not sure why.
test_sdpa is also precision-related.
I haven't looked into the C++ error yet.

akesandgren · 2024-05-07T11:21:25Z

@boegelbot Please test @ jsc-zen3

boegelbot · 2024-05-07T11:30:11Z

@akesandgren: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=20489 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_20489 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

exit code: 0
output:

Submitted batch job 4085

Test results coming soon (I hope)...

- notification for comment with ID 2098172875 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

akesandgren · 2024-05-07T20:25:43Z

Test report by @akesandgren
FAILED
Build succeeded for 0 out of 1 (3 easyconfigs in total)
b-an02.hpc2n.umu.se - Linux Ubuntu 20.04, x86_64, Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, Python 3.8.10
See https://gist.github.com/akesandgren/ef17ea2435926ca06bbe5cbbe6058158 for a full test report.

boegelbot · 2024-05-07T20:36:01Z

Test report by @boegelbot
FAILED
Build succeeded for 2 out of 3 (3 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.3, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.18
See https://gist.github.com/boegelbot/1ce9b70e5410ebd3a1d8dbbce992b8c7 for a full test report.

akesandgren · 2024-05-09T13:43:25Z

Test report by @akesandgren
FAILED
Build succeeded for 2 out of 3 (3 easyconfigs in total)
b-cn1607.hpc2n.umu.se - Linux Ubuntu 22.04, x86_64, Intel(R) Xeon(R) Gold 6336Y CPU @ 2.40GHz, 3 x NVIDIA NVIDIA A100 80GB PCIe, 545.29.06, Python 3.10.12
See https://gist.github.com/akesandgren/9a9b6ec51769af98d6d4689b4e1ba93a for a full test report.

akesandgren · 2024-05-13T05:36:53Z

Interesting...
If I run the tests standalone there are fewer failing tests than when run during a build...

Flamefire · 2024-05-13T09:13:05Z

> Interesting... If I run the tests standalone there are fewer failing tests than when run during a build...

Not unusual for PyTorch ;-)
I just got bitten again by $XDG_CACHE_HOME: PyTorch stores JIT-compiled files there, so rerunning the same test with the same value of that variable behaves differently: it loads the file from that directory instead of JIT-compiling it again.
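
To illustrate the cache issue described above, one way to keep reruns independent is to give every run its own empty cache directory. This is a minimal sketch using only the standard library; `run_with_fresh_cache` is a hypothetical helper, not something PyTorch or EasyBuild provides.

```python
import os
import subprocess
import sys
import tempfile


def run_with_fresh_cache(cmd):
    """Run cmd with XDG_CACHE_HOME pointing at a new, empty temporary directory."""
    with tempfile.TemporaryDirectory(prefix="xdg-cache-") as cache_dir:
        env = dict(os.environ, XDG_CACHE_HOME=cache_dir)
        # The directory is deleted when the block exits, so a later run
        # cannot pick up files written by this one.
        return subprocess.run(cmd, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Illustrative child process that only echoes the variable it sees.
    result = run_with_fresh_cache(
        [sys.executable, "-c", "import os; print(os.environ['XDG_CACHE_HOME'])"]
    )
    print(result.stdout.strip())
```

In a real test run, `cmd` would be the actual test invocation; the child process above only demonstrates that the variable is set and that the directory does not outlive the run.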

akesandgren · 2024-05-14T11:11:51Z

These fail because SANDCASTLE=1 is set when the tests run as part of the build:

export/test_lift_unlift
export/test_serialize
export/test_torchbind
export/test_unflatten
higher_order_ops/test_with_effects
test_weak

These are exactly the difference between my standalone test run (which was without SANDCASTLE) and the test run during the build.

akesandgren · 2024-05-14T11:30:49Z

@Flamefire Do you know why we set SANDCASTLE=1 in the easyblock?
As far as I can see it is a specific machine that they run tests on...

Flamefire · 2024-05-14T13:57:26Z

> @Flamefire Do you know why we set SANDCASTLE=1 in the easyblock? As far as I can see it is a specific machine that they run tests on...

Yes, there are a lot of things like `@unittest.skipIf(IS_SANDCASTLE, "NYI: fuser CPU support for Sandcastle")` in the tests, and the idea was: if a test doesn't even run/work on their machine, we shouldn't even try to run it on ours.

So we might need to patch those failing ones. For TestWithEffects it loads a different library; test_weak.py is similar, and likely the export tests too, although I couldn't find the exact ones you mentioned.
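
The skip mechanism mentioned above can be reproduced in isolation. The sketch below is illustrative only: the flag is passed in explicitly so both cases can be exercised, and the `os.getenv` check at the bottom is a stand-in for PyTorch's own `IS_SANDCASTLE` constant, not its actual definition.

```python
import os
import unittest


def make_test_case(is_sandcastle):
    """Build a TestCase whose first test is skipped when the flag is set."""

    class ExampleTest(unittest.TestCase):
        @unittest.skipIf(is_sandcastle, "NYI: not supported on Sandcastle")
        def test_guarded(self):
            self.assertEqual(1 + 1, 2)

        def test_unguarded(self):
            self.assertTrue(True)

    return ExampleTest


def count_skipped(is_sandcastle):
    """Run the example tests and return how many of them were skipped."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(make_test_case(is_sandcastle))
    result = unittest.TestResult()
    suite.run(result)
    return len(result.skipped)


if __name__ == "__main__":
    # Hypothetical stand-in for the environment-derived flag.
    flag = os.getenv("SANDCASTLE") == "1"
    print(count_skipped(flag))
```

Because the condition is evaluated when the test class is defined, setting the variable changes which tests run at all, which is why the set of failures differs between the two environments.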

akesandgren · 2024-05-14T14:37:52Z

I'm doing a test run without SANDCASTLE set and with test_hub disabled. test_hub is one of only two tests I found that do external downloads; the other is a single test in test_nn.
By the looks of some of the comments around SANDCASTLE, it doesn't feel like a normal x86_64-based machine...

akesandgren · 2024-05-14T14:48:13Z

And I have manually run the full test suite without SANDCASTLE set on a previous build and saw only 3 failed tests.
So I don't think we need SANDCASTLE set.

Flamefire · 2024-05-14T17:04:42Z

> By the looks of some of the comments around SANDCASTLE it doesn't feel like a normal x86_64 based machine...

Might be. I used it because it disables a LOT of tests, especially those downloading stuff IIRC. See https://github.com/search?q=repo%3Apytorch%2Fpytorch%20IS_SANDCASTLE&type=code

Two such instances seem to skip whole classes of tests at once: https://github.com/pytorch/pytorch/blob/20aa7cc6788ff10dee2d927057b10a81af638a32/test/jit/test_backends.py#L69-L73 and https://github.com/pytorch/pytorch/blob/2e4d0111953e6db7e4ce5cf041e6a78770092495/test/jit/test_torchbind.py#L37-L38

> And I have manually run the full test suite without SANDCASTLE set on a previous build and saw only 3 failed tests.

If it is indeed the case that NOT setting it now causes fewer failures, then we should stop setting it. Best to make that conditional on 2.3+ so we don't introduce regressions for older versions.

I'll try to push a change upstream to use something like `@skip_if_sandcastle`, which would give us an easy way to skip all those tests by patching that function without changing any other behavior controlled by that env variable.
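
The version condition suggested above could look roughly like this. It is a standalone sketch, not the easyblock's code: `sandcastle_env` is a hypothetical helper, and the version is compared as an integer tuple so that, for example, 2.10 sorts after 2.3.

```python
def sandcastle_env(pytorch_version):
    """Return extra test environment variables for the given PyTorch version."""
    major, minor = (int(part) for part in pytorch_version.split(".")[:2])
    env = {}
    # Keep the old behavior for releases before 2.3 to avoid regressions there.
    if (major, minor) < (2, 3):
        env["SANDCASTLE"] = "1"
    return env


if __name__ == "__main__":
    for version in ("2.1.2", "2.3.0"):
        print(version, sandcastle_env(version))
```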

easybuild/easyconfigs/t/tlparse/tlparse-0.3.5-GCCcore-13.2.0.eb

Flamefire · 2024-05-15T08:35:18Z

We have another issue: pytest-rerunfailures interferes with our test parsing. We want output like:

    # ===================== 2 failed, 128 passed, 2 skipped, 2 warnings in 3.43s =====================
    # test_quantization failed!

But now we get:

Running test_cpp_extensions_open_device_registration 1/1 ... [2024-05-13 16:48:56.717884]
Executing ['.../python', '-bb', 'test_cpp_extensions_open_device_registration.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2'] ... [2024-05-13 16:48:56.718522]
===================== test session starts =====================
[...]
('RERUN', {'yellow': True}) [1.1713s]                                                    [100%]
test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration ('RERUN', {'yellow': True}) [0.0036s]                                                    [100%]
test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration FAILED [0.0033s]                                                                         [100%]

===================== RERUNS =====================
_____________________ TestCppExtensionOpenRgistration.test_open_device_registration _____________________
[...]
_____________________ TestCppExtensionOpenRgistration.test_open_device_registration _____________________
[...]
===================== FAILURES =====================
_____________________ TestCppExtensionOpenRgistration.test_open_device_registration _____________________
[...]
===================== short test summary info =====================
FAILED [0.0033s] test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration - AssertionError: RuntimeError not raised
!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!
===================== 1 failed, 2 rerun in 39.35s =====================
Got exit code 1
Retrying...
===================== test session starts =====================
[...]
('RERUN', {'yellow': True}) [1.9584s]                                                    [100%]
test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration [W Module.cpp:160] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...

('RERUN', {'yellow': True}) [0.0036s]                                                    [100%]
test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration [W Module.cpp:160] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...

FAILED [0.0023s]                                                                         [100%]

===================== RERUNS =====================
_____________________ TestCppExtensionOpenRgistration.test_open_device_registration _____________________
[...]
===================== FAILURES =====================
_____________________ TestCppExtensionOpenRgistration.test_open_device_registration _____________________
[...]
===================== short test summary info =====================
FAILED [0.0023s] test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration - AssertionError: RuntimeError not raised
!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!
===================== 1 failed, 2 rerun in 40.27s =====================
Got exit code 1
Retrying...
===================== test session starts =====================
[...]
('RERUN', {'yellow': True}) [1.8911s]                                                    [100%]
test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration [W Module.cpp:160] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...

('RERUN', {'yellow': True}) [0.0032s]                                                    [100%]
test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration [W Module.cpp:160] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...

FAILED [0.0021s]                                                                         [100%]

===================== RERUNS =====================
_____________________ TestCppExtensionOpenRgistration.test_open_device_registration _____________________
[...]
===================== FAILURES =====================
_____________________ TestCppExtensionOpenRgistration.test_open_device_registration _____________________
[...]
===================== short test summary info =====================
FAILED [0.0021s] test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration - AssertionError: RuntimeError not raised
!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!
===================== 1 failed, 2 rerun in 40.00s =====================
Got exit code 1
Retrying...
===================== test session starts =====================
[...]
===================== 1 deselected in 0.02s =====================
The following tests failed consistently: ['test/test_cpp_extensions_open_device_registration.py::TestCppExtensionOpenRgistration::test_open_device_registration']
test_cpp_extensions_open_device_registration 1/1 failed!
Running test_cuda 1/1 ... [2024-05-13 16:51:10.730579]

I don't see how we could reasonably parse this.
It exits after the first failed test, so even "1 failed, 2 rerun in 40.00s" only says: "1 test out of an unknown number of tests failed".
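
To make the parsing problem above concrete, here is a small sketch that extracts the counts from a pytest summary line. The regular expressions and the `parse_summary` helper are illustrative assumptions, not the easyblock's actual parser.

```python
import re

# Matches lines such as "===== 2 failed, 128 passed in 3.43s =====".
SUMMARY_RE = re.compile(r"^=+ (?P<counts>.+) in [\d.]+s(?: \([^)]*\))? =+$")
COUNT_RE = re.compile(r"(\d+) ([a-z]+)")


def parse_summary(line):
    """Return a dict like {'failed': 2, 'passed': 128} or None if no match."""
    match = SUMMARY_RE.match(line.strip())
    if not match:
        return None
    return {name: int(num) for num, name in COUNT_RE.findall(match.group("counts"))}


if __name__ == "__main__":
    wanted = "===================== 2 failed, 128 passed, 2 skipped, 2 warnings in 3.43s ====================="
    actual = "===================== 1 failed, 2 rerun in 39.35s ====================="
    print(parse_summary(wanted))
    print(parse_summary(actual))
```

The second line parses fine, but it carries no `passed` count: with `-x` the run stops at the first failure, so the total number of tests cannot be recovered from it.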

akesandgren · 2024-05-15T14:02:52Z

@boegelbot Please test @ jsc-zen3
EB_ARGS="--include-easyblocks-from-pr 3330"

… more tests.

boegelbot · 2024-05-15T14:10:08Z

@akesandgren: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=20489 EB_ARGS="--include-easyblocks-from-pr 3330" EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_20489 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

exit code: 0
output:

Submitted batch job 4128

Test results coming soon (I hope)...

- notification for comment with ID 2112638268 processed


fizwit · 2024-05-15T21:35:17Z

easybuild/easyconfigs/o/optree/optree-0.11.0-GCCcore-13.2.0.eb

optree requires typing-extensions/4.10.0-GCCcore-13.2.0

Why is that? I installed it just fine:

... python -m pip check completed successfully

Where are you getting typing-extensions? It is not part of Python-3.11.5-GCCcore-13.2.0.eb. optree build fails without typing-extensions.

== installing...
== ... (took 29 secs)
== taking care of extensions...
== restore after iterating...
== postprocessing...
== sanity checking...
== ... (took 3 secs)
== FAILED: Installation ended unsuccessfully (build directory: /build/optree/0.11.0/GCCcore-13.2.0): build failed (first 300 chars): `/app/software/Python/3.11.5-GCCcore-13.2.0/bin/python -m pip check` failed: optree 0.11.0 requires typing-extensions, which is not installed.

Seems like you need to reinstall Python. The current develop version and release 4.9.1 contain it:

easybuild-easyconfigs/easybuild/easyconfigs/p/Python/Python-3.11.5-GCCcore-13.2.0.eb

Lines 56 to 58 in 43ff814

    ('typing_extensions', '4.8.0', {
        'checksums': ['df8e4339e9cb77357558cbdbceca33c303714cf861d1eef15e1070055ae8b7ef'],
    }),

However, it was a change made between 4.8.2 and 4.9.x by #19777

From the looks of that PR, this was done because too many other ECs depended on it. And IMO it makes sense to include it in Python by default.

thanks, --rebuild --skip added four packages. This will fix many things for me.

== installing extension tomli 2.0.1 (1/4)...
== configuring...
== building...
== testing...
== installing...
== ... (took 11 secs)
== installing extension packaging 23.2 (2/4)...
== configuring...
== building...
== testing...
== installing...
== ... (took 2 secs)
== installing extension typing_extensions 4.8.0 (3/4)...
== configuring...
== building...
== testing...
== installing...
== ... (took 2 secs)
== installing extension setuptools-scm 8.0.4 (4/4)...

boegelbot · 2024-05-15T22:28:57Z

Test report by @boegelbot
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#3330
FAILED
Build succeeded for 2 out of 3 (3 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.4, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.18
See https://gist.github.com/boegelbot/50db8a2d24c3a8139108dc99a9001182 for a full test report.

akesandgren · 2024-08-27T13:39:08Z

@Flamefire Any ideas on how to deal with the error output parsing problem?

Flamefire · 2024-08-28T12:13:24Z

> @Flamefire Any ideas on how to deal with the error output parsing problem?

Not many. I still have an open issue for that: pytorch/pytorch#126523

No luck so far getting machine-readable output from PyTorch directly. I.e. I wanted them to make the --save-xml option work correctly, but nothing has happened since pytorch/pytorch#126690 failed.

We could try to get that option working by patching the test files to make sure --junit-xml-reruns and --save-xml are always set/passed. Then we can check whether the XML files are any good for us.

Another option would be to revert their change that moved the rerun feature to a custom implementation, which is what broke our detection: pytorch/pytorch@3b7d60b

That might get difficult to maintain going forward, but I don't see any alternatives currently.
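
Checking whether XML reports are usable, as suggested above, could start from something like the sketch below. It assumes a generic JUnit-style layout (`testcase` elements with optional `failure`, `error` or `skipped` children); the files PyTorch actually writes may differ, and the sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the common JUnit XML layout.
SAMPLE = """<testsuites>
  <testsuite name="example" tests="3">
    <testcase classname="TestA" name="test_ok"/>
    <testcase classname="TestA" name="test_bad"><failure message="boom"/></testcase>
    <testcase classname="TestA" name="test_skip"><skipped message="n/a"/></testcase>
  </testsuite>
</testsuites>"""


def count_results(xml_text):
    """Count test cases per outcome in a JUnit-style XML document."""
    counts = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    for case in ET.fromstring(xml_text).iter("testcase"):
        counts["tests"] += 1
        if case.find("failure") is not None:
            counts["failures"] += 1
        elif case.find("error") is not None:
            counts["errors"] += 1
        elif case.find("skipped") is not None:
            counts["skipped"] += 1
    return counts


if __name__ == "__main__":
    print(count_results(SAMPLE))
```

Unlike the console summary, a per-test report of this kind does not depend on how reruns are printed, provided every executed test ends up in the file.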

jpecar · 2024-11-12T10:19:03Z

optree in this PR is missing git as a build dependency.

github-actions · 2025-03-04T10:21:24Z

Updated software `PyTorch-2.3.0-foss-2023b.eb`

Diff against PyTorch-2.1.2-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
index bce1b68aa7..ca28c8dcea 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
@@ -1,5 +1,5 @@
 name = 'PyTorch'
-version = '2.1.2'
+version = '2.3.0'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
@@ -11,7 +11,6 @@ source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
 patches = [
     'PyTorch-1.7.0_disable-dev-shm-test.patch',
-    'PyTorch-1.11.1_skip-test_init_from_local_shards.patch',
     'PyTorch-1.12.1_add-hypothesis-suppression.patch',
     'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch',
     'PyTorch-1.12.1_fix-TestTorch.test_to.patch',
@@ -23,39 +22,34 @@ patches = [
     'PyTorch-1.13.1_skip-tests-without-fbgemm.patch',
     'PyTorch-2.0.1_avoid-test_quantization-failures.patch',
     'PyTorch-2.0.1_fix-skip-decorators.patch',
-    'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch',
     'PyTorch-2.0.1_fix-vsx-loadu.patch',
-    'PyTorch-2.0.1_no-cuda-stubs-rpath.patch',
     'PyTorch-2.0.1_skip-failing-gradtest.patch',
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
-    'PyTorch-2.1.0_disable-gcc12-warning.patch',
-    'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
-    'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
-    'PyTorch-2.1.0_fix-validationError-output-test.patch',
     'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch',
     'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch',
-    'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch',
     'PyTorch-2.1.0_remove-test-requiring-online-access.patch',
     'PyTorch-2.1.0_skip-diff-test-on-ppc.patch',
     'PyTorch-2.1.0_skip-dynamo-test_predispatch.patch',
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
-    'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
-    'PyTorch-2.1.0_skip-test_wrap_bad.patch',
-    'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
-    'PyTorch-2.1.2_fix-test_memory_profiler.patch',
-    'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
-    'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
-    'PyTorch-2.1.2_fix-vsx-vector-div.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
+    'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch',
+    'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch',
+    'PyTorch-2.3.0_skip-test_init_from_local_shards.patch',
+    'PyTorch-2.3.0_no-cuda-stubs-rpath.patch',
+    'PyTorch-2.3.0_disable-gcc12-warning.patch',
+    'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch',
+    'PyTorch-2.3.0_disable_tests_which_need_network_download.patch',
+    'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch',
+    'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch',
+    'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch',
+    'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch',
+    'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch',
 ]
 checksums = [
-    {'pytorch-v2.1.2.tar.gz': '85effbcce037bffa290aea775c9a4bad5f769cb229583450c40055501ee1acd7'},
+    {'pytorch-v2.3.0.tar.gz': '69579513b26261bbab32e13b7efc99ad287fcf3103087f2d4fdf1adacd25316f'},
     {'PyTorch-1.7.0_disable-dev-shm-test.patch': '622cb1eaeadc06e13128a862d9946bcc1f1edd3d02b259c56a9aecc4d5406b8a'},
-    {'PyTorch-1.11.1_skip-test_init_from_local_shards.patch':
-     '4aeb1b0bc863d4801b0095cbce69f8794066748f0df27c6aaaf729c5ecba04b7'},
     {'PyTorch-1.12.1_add-hypothesis-suppression.patch':
      'e71ffb94ebe69f580fa70e0de84017058325fdff944866d6bd03463626edc32c'},
     {'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch':
@@ -75,28 +69,16 @@ checksums = [
     {'PyTorch-2.0.1_avoid-test_quantization-failures.patch':
      '02e3f47e4ed1d7d6077e26f1ae50073dc2b20426269930b505f4aefe5d2f33cd'},
     {'PyTorch-2.0.1_fix-skip-decorators.patch': '2039012cef45446065e1a2097839fe20bb29fe3c1dcc926c3695ebf29832e920'},
-    {'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch':
-     '1b37194f55ae678f3657b8728dfb896c18ffe8babe90987ce468c4fa9274f357'},
     {'PyTorch-2.0.1_fix-vsx-loadu.patch': 'a0ffa61da2d47c6acd09aaf6d4791e527d8919a6f4f1aa7ed38454cdcadb1f72'},
-    {'PyTorch-2.0.1_no-cuda-stubs-rpath.patch': '8902e58a762240f24cdbf0182e99ccdfc2a93492869352fcb4ca0ec7e407f83a'},
     {'PyTorch-2.0.1_skip-failing-gradtest.patch': '8030bdec6ba49b057ab232d19a7f1a5e542e47e2ec340653a246ec9ed59f8bc1'},
     {'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch':
      '7047862abc1abaff62954da59700f36d4f39fcf83167a638183b1b7f8fec78ae'},
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
-    {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
-    {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
-     'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
-    {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
-     '84bb51a719abc677031a7a3dfe4382ff098b0cbd8b39b8bed2a7fa03f80ac1e9'},
-    {'PyTorch-2.1.0_fix-validationError-output-test.patch':
-     '7eba0942afb121ed92fac30d1529447d892a89eb3d53c565f8e9d480e95f692b'},
     {'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch':
      '3793b4b878be1abe7791efcbd534774b87862cfe7dc4774ca8729b6cabb39e7e'},
     {'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch':
      'aef38adf1210d0c5455e91d7c7a9d9e5caad3ae568301e0ba9fc204309438e7b'},
-    {'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch':
-     '0ac36411e76506b3354c85a8a1260987f66af947ee52ffc64230aee1fa02ea8b'},
     {'PyTorch-2.1.0_remove-test-requiring-online-access.patch':
      '35184b8c5a1b10f79e511cc25db3b8a5585a5d58b5d1aa25dd3d250200b14fd7'},
     {'PyTorch-2.1.0_skip-diff-test-on-ppc.patch': '394157dbe565ffcbc1821cd63d05930957412156cc01e949ef3d3524176a1dda'},
@@ -104,22 +86,34 @@ checksums = [
      '6298daf9ddaa8542850eee9ea005f28594ab65b1f87af43d8aeca1579a8c4354'},
     {'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch':
      '5229ca88a71db7667a90ddc0b809b2c817698bd6e9c5aaabd73d3173cf9b99fe'},
-    {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
-     '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
-    {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
-    {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
-     'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
-    {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
-     '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
-    {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
-     'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
-    {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
-    {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
+    {'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch':
+     '23416f2d9d5226695ec3fbea0671e3650c655c19deefd3f0f8ddab5afa50f485'},
+    {'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch':
+     '0dcbdfde6752c3ff54c5376f521b4a742167669feb7f0f1d4e1d4d55f72b664f'},
+    {'PyTorch-2.3.0_skip-test_init_from_local_shards.patch':
+     '90ed9c2870f57ee6dc032d00873a37e2217a2b92a13035ded1c25ad5306455f2'},
+    {'PyTorch-2.3.0_no-cuda-stubs-rpath.patch':
+     '7ba26824b5def7379cff02ae821a080698e6affea0da45bc846e9ecb89939cb1'},
+    {'PyTorch-2.3.0_disable-gcc12-warning.patch':
+     'a8a624e1a2a5f4c82610173e50bd0f853e49bd5621b432f5aac689f9f6eb1514'},
+    {'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch':
+     '36aa2d5ba175be17f4e996f4fb2d544fe477d4a0bd0644cd59a85063779afc8e'},
+    {'PyTorch-2.3.0_disable_tests_which_need_network_download.patch':
+     'b7fd1a5135dfd4098cdc054182f7bf84a23ac98462a00477712182b5442da855'},
+    {'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch':
+     '041adcd91d994b8c2ab57d227f081cd57e572c157117b37171e1eb8eb576f8fc'},
+    {'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch':
+     'aa6ff764f3f7bf84372a8a257fe1b4ae6dc4b9744ad35f0f9015f2696c62a41e'},
+    {'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch':
+     '9703fd0f1fca8916f6d79d83e9a7efe8e3f717362a5fdaa8f5d9da90d0c75018'},
+    {'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch':
+     '7955f2655db3da18606574fdcbc5990be24098f49ad1db5e86ea756ea1cc506f'},
+    {'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch':
+     'ee07d21c3ac7aeb0bd0e39507b18a417b9125284a529102929c4b5c6727c2976'},
 ]
 
 osdependencies = [OS_PKG_IBVERBS_DEV]
@@ -131,6 +125,9 @@ builddependencies = [
     ('pytest-flakefinder', '1.1.0'),
     ('pytest-rerunfailures', '14.0'),
     ('pytest-shard', '0.1.2'),
+    ('tlparse', '0.3.5'),
+    ('optree', '0.13.0'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
@@ -170,15 +167,24 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
+        # This test is expected to fail when run in their CI, but won't in our case.
+        # It just checks for a "CI" env variable
+        'test_ci_sanity_check_fail',
+        # This fails consistently and is disabled upstream
+        # See https://github.com/pytorch/pytorch/issues/100152 and
+        # https://github.com/pytorch/pytorch/pull/124712
+        'test_cpp_extensions_open_device_registration',
+
     ]
 }
 
-runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-through-error  --verbose %(excluded_tests)s'
+local_test_opts = '--continue-through-error --pipe-logs --verbose %(excluded_tests)s'
+runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py ' + local_test_opts
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 6
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
index 65dfced170..ca28c8dcea 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
@@ -1,18 +1,16 @@
 name = 'PyTorch'
-version = '2.1.2'
-versionsuffix = '-CUDA-%(cudaver)s'
+version = '2.3.0'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2023b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
 patches = [
     'PyTorch-1.7.0_disable-dev-shm-test.patch',
-    'PyTorch-1.11.1_skip-test_init_from_local_shards.patch',
     'PyTorch-1.12.1_add-hypothesis-suppression.patch',
     'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch',
     'PyTorch-1.12.1_fix-TestTorch.test_to.patch',
@@ -24,50 +22,34 @@ patches = [
     'PyTorch-1.13.1_skip-tests-without-fbgemm.patch',
     'PyTorch-2.0.1_avoid-test_quantization-failures.patch',
     'PyTorch-2.0.1_fix-skip-decorators.patch',
-    'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch',
     'PyTorch-2.0.1_fix-vsx-loadu.patch',
-    'PyTorch-2.0.1_no-cuda-stubs-rpath.patch',
     'PyTorch-2.0.1_skip-failing-gradtest.patch',
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
-    'PyTorch-2.1.0_disable-gcc12-warning.patch',
-    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
-    'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
-    'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
-    'PyTorch-2.1.0_fix-validationError-output-test.patch',
     'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch',
     'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch',
-    'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch',
     'PyTorch-2.1.0_remove-test-requiring-online-access.patch',
     'PyTorch-2.1.0_skip-diff-test-on-ppc.patch',
     'PyTorch-2.1.0_skip-dynamo-test_predispatch.patch',
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
-    'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
-    'PyTorch-2.1.0_skip-test_wrap_bad.patch',
-    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
-    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
-    'PyTorch-2.1.2_fix-device-mesh-check.patch',
-    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
-    'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
-    'PyTorch-2.1.2_fix-test_memory_profiler.patch',
-    'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
-    'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
-    'PyTorch-2.1.2_fix-vsx-vector-div.patch',
-    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
-    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
-    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
-    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
-    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
+    'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch',
+    'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch',
+    'PyTorch-2.3.0_skip-test_init_from_local_shards.patch',
+    'PyTorch-2.3.0_no-cuda-stubs-rpath.patch',
+    'PyTorch-2.3.0_disable-gcc12-warning.patch',
+    'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch',
+    'PyTorch-2.3.0_disable_tests_which_need_network_download.patch',
+    'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch',
+    'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch',
+    'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch',
+    'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch',
+    'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch',
 ]
 checksums = [
-    {'pytorch-v2.1.2.tar.gz': '85effbcce037bffa290aea775c9a4bad5f769cb229583450c40055501ee1acd7'},
+    {'pytorch-v2.3.0.tar.gz': '69579513b26261bbab32e13b7efc99ad287fcf3103087f2d4fdf1adacd25316f'},
     {'PyTorch-1.7.0_disable-dev-shm-test.patch': '622cb1eaeadc06e13128a862d9946bcc1f1edd3d02b259c56a9aecc4d5406b8a'},
-    {'PyTorch-1.11.1_skip-test_init_from_local_shards.patch':
-     '4aeb1b0bc863d4801b0095cbce69f8794066748f0df27c6aaaf729c5ecba04b7'},
     {'PyTorch-1.12.1_add-hypothesis-suppression.patch':
      'e71ffb94ebe69f580fa70e0de84017058325fdff944866d6bd03463626edc32c'},
     {'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch':
@@ -87,30 +69,16 @@ checksums = [
     {'PyTorch-2.0.1_avoid-test_quantization-failures.patch':
      '02e3f47e4ed1d7d6077e26f1ae50073dc2b20426269930b505f4aefe5d2f33cd'},
     {'PyTorch-2.0.1_fix-skip-decorators.patch': '2039012cef45446065e1a2097839fe20bb29fe3c1dcc926c3695ebf29832e920'},
-    {'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch':
-     '1b37194f55ae678f3657b8728dfb896c18ffe8babe90987ce468c4fa9274f357'},
     {'PyTorch-2.0.1_fix-vsx-loadu.patch': 'a0ffa61da2d47c6acd09aaf6d4791e527d8919a6f4f1aa7ed38454cdcadb1f72'},
-    {'PyTorch-2.0.1_no-cuda-stubs-rpath.patch': '8902e58a762240f24cdbf0182e99ccdfc2a93492869352fcb4ca0ec7e407f83a'},
     {'PyTorch-2.0.1_skip-failing-gradtest.patch': '8030bdec6ba49b057ab232d19a7f1a5e542e47e2ec340653a246ec9ed59f8bc1'},
     {'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch':
      '7047862abc1abaff62954da59700f36d4f39fcf83167a638183b1b7f8fec78ae'},
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
-    {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
-    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
-     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
-    {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
-     'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
-    {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
-     '84bb51a719abc677031a7a3dfe4382ff098b0cbd8b39b8bed2a7fa03f80ac1e9'},
-    {'PyTorch-2.1.0_fix-validationError-output-test.patch':
-     '7eba0942afb121ed92fac30d1529447d892a89eb3d53c565f8e9d480e95f692b'},
     {'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch':
      '3793b4b878be1abe7791efcbd534774b87862cfe7dc4774ca8729b6cabb39e7e'},
     {'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch':
      'aef38adf1210d0c5455e91d7c7a9d9e5caad3ae568301e0ba9fc204309438e7b'},
-    {'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch':
-     '0ac36411e76506b3354c85a8a1260987f66af947ee52ffc64230aee1fa02ea8b'},
     {'PyTorch-2.1.0_remove-test-requiring-online-access.patch':
      '35184b8c5a1b10f79e511cc25db3b8a5585a5d58b5d1aa25dd3d250200b14fd7'},
     {'PyTorch-2.1.0_skip-diff-test-on-ppc.patch': '394157dbe565ffcbc1821cd63d05930957412156cc01e949ef3d3524176a1dda'},
@@ -118,74 +86,68 @@ checksums = [
      '6298daf9ddaa8542850eee9ea005f28594ab65b1f87af43d8aeca1579a8c4354'},
     {'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch':
      '5229ca88a71db7667a90ddc0b809b2c817698bd6e9c5aaabd73d3173cf9b99fe'},
-    {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
-     '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
-    {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
-    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
-    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
-     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
-    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
-    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
-     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
-    {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
-     'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
-    {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
-     '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
-    {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
-     'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
-    {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
-    {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
-    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
-     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
-    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
-     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
-    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
-    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
-     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
-     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
-    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
-     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
+    {'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch':
+     '23416f2d9d5226695ec3fbea0671e3650c655c19deefd3f0f8ddab5afa50f485'},
+    {'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch':
+     '0dcbdfde6752c3ff54c5376f521b4a742167669feb7f0f1d4e1d4d55f72b664f'},
+    {'PyTorch-2.3.0_skip-test_init_from_local_shards.patch':
+     '90ed9c2870f57ee6dc032d00873a37e2217a2b92a13035ded1c25ad5306455f2'},
+    {'PyTorch-2.3.0_no-cuda-stubs-rpath.patch':
+     '7ba26824b5def7379cff02ae821a080698e6affea0da45bc846e9ecb89939cb1'},
+    {'PyTorch-2.3.0_disable-gcc12-warning.patch':
+     'a8a624e1a2a5f4c82610173e50bd0f853e49bd5621b432f5aac689f9f6eb1514'},
+    {'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch':
+     '36aa2d5ba175be17f4e996f4fb2d544fe477d4a0bd0644cd59a85063779afc8e'},
+    {'PyTorch-2.3.0_disable_tests_which_need_network_download.patch':
+     'b7fd1a5135dfd4098cdc054182f7bf84a23ac98462a00477712182b5442da855'},
+    {'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch':
+     '041adcd91d994b8c2ab57d227f081cd57e572c157117b37171e1eb8eb576f8fc'},
+    {'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch':
+     'aa6ff764f3f7bf84372a8a257fe1b4ae6dc4b9744ad35f0f9015f2696c62a41e'},
+    {'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch':
+     '9703fd0f1fca8916f6d79d83e9a7efe8e3f717362a5fdaa8f5d9da90d0c75018'},
+    {'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch':
+     '7955f2655db3da18606574fdcbc5990be24098f49ad1db5e86ea756ea1cc506f'},
+    {'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch':
+     'ee07d21c3ac7aeb0bd0e39507b18a417b9125284a529102929c4b5c6727c2976'},
 ]
 
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('hypothesis', '6.82.0'),
+    ('CMake', '3.27.6'),
+    ('hypothesis', '6.90.0'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '12.0'),
+    ('pytest-rerunfailures', '14.0'),
     ('pytest-shard', '0.1.2'),
+    ('tlparse', '0.3.5'),
+    ('optree', '0.13.0'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
-    ('CUDA', '12.1.1', '', SYSTEM),
-    ('cuDNN', '8.9.2.26', '-CUDA-%(cudaver)s', SYSTEM),
-    ('magma', '2.7.2', '-CUDA-%(cudaver)s'),
-    ('NCCL', '2.18.3', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.3'),
-    ('Python-bundle-PyPI', '2023.06'),
-    ('protobuf', '24.0'),
-    ('protobuf-python', '4.24.0'),
+    ('Python', '3.11.5'),
+    ('Python-bundle-PyPI', '2023.10'),
+    ('protobuf', '25.3'),
+    ('protobuf-python', '4.25.3'),
     ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.07'),
-    ('PyYAML', '6.0'),
-    ('MPFR', '4.2.0'),
-    ('GMP', '6.2.1'),
+    ('SciPy-bundle', '2023.11'),
+    ('PyYAML', '6.0.1'),
+    ('MPFR', '4.2.1'),
+    ('GMP', '6.3.0'),
     ('numactl', '2.0.16'),
     ('FFmpeg', '6.0'),
-    ('Pillow', '10.0.0'),
-    ('expecttest', '0.1.5'),
-    ('networkx', '3.1'),
+    ('Pillow', '10.2.0'),
+    ('expecttest', '0.2.1'),
+    ('networkx', '3.2.1'),
     ('sympy', '1.12'),
-    ('Z3', '4.12.2'),
+    ('Z3', '4.13.0',),
 ]
 
 use_pip = True
@@ -205,33 +167,24 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
-        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
-        'distributed/tensor/parallel/test_tp_random_state',
-        # failures on OmniPath systems, which don't support some optional InfiniBand features
-        # See https://github.com/pytorch/tensorpipe/issues/413
-        'distributed/pipeline/sync/skip/test_gpipe',
-        'distributed/pipeline/sync/skip/test_leak',
-        'distributed/pipeline/sync/test_bugs',
-        'distributed/pipeline/sync/test_inplace',
-        'distributed/pipeline/sync/test_pipe',
-        'distributed/pipeline/sync/test_transparency',
+        # This test is expected to fail when run in their CI, but won't in our case.
+        # It just checks for a "CI" env variable
+        'test_ci_sanity_check_fail',
+        # This fails consistently and is disabled upstream
+        # See https://github.com/pytorch/pytorch/issues/100152 and
+        # https://github.com/pytorch/pytorch/pull/124712
+        'test_cpp_extensions_open_device_registration',
+
     ]
 }
 
-runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-through-error  --verbose %(excluded_tests)s'
+local_test_opts = '--continue-through-error --pipe-logs --verbose %(excluded_tests)s'
+runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py ' + local_test_opts
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
-# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
-
-# The readelf sanity check command can be taken out once the TestRPATH test from
-# https://github.com/pytorch/pytorch/pull/109493 is accepted, since it is then checked as part of the PyTorch test suite
-local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
-sanity_check_commands = [
-    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
-]
+max_failed_tests = 6
 
 tests = ['PyTorch-check-cpp-extension.py']
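
Each entry in the `checksums` list above pairs a source or patch filename with its SHA-256 digest. A minimal sketch of how such an entry could be produced with the Python standard library is shown here. The helper names `sha256_of_file` and `checksum_entry` are hypothetical and are not part of EasyBuild, which has its own tooling for injecting checksums.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes (EOF)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_entry(path):
    """Build a one-item dict shaped like the entries in the `checksums` list.

    Hypothetical helper for illustration only.
    """
    path = Path(path)
    return {path.name: sha256_of_file(path)}
```

Comparing the output for a downloaded patch against the value recorded in the easyconfig is one way to confirm that the file matches what the PR author used.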

Diff against PyTorch-2.1.2-foss-2023a.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
index a79f709480..ca28c8dcea 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
@@ -1,17 +1,16 @@
 name = 'PyTorch'
-version = '2.1.2'
+version = '2.3.0'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2023b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
 patches = [
     'PyTorch-1.7.0_disable-dev-shm-test.patch',
-    'PyTorch-1.11.1_skip-test_init_from_local_shards.patch',
     'PyTorch-1.12.1_add-hypothesis-suppression.patch',
     'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch',
     'PyTorch-1.12.1_fix-TestTorch.test_to.patch',
@@ -23,39 +22,34 @@ patches = [
     'PyTorch-1.13.1_skip-tests-without-fbgemm.patch',
     'PyTorch-2.0.1_avoid-test_quantization-failures.patch',
     'PyTorch-2.0.1_fix-skip-decorators.patch',
-    'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch',
     'PyTorch-2.0.1_fix-vsx-loadu.patch',
-    'PyTorch-2.0.1_no-cuda-stubs-rpath.patch',
     'PyTorch-2.0.1_skip-failing-gradtest.patch',
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
-    'PyTorch-2.1.0_disable-gcc12-warning.patch',
-    'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
-    'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
-    'PyTorch-2.1.0_fix-validationError-output-test.patch',
     'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch',
     'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch',
-    'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch',
     'PyTorch-2.1.0_remove-test-requiring-online-access.patch',
     'PyTorch-2.1.0_skip-diff-test-on-ppc.patch',
     'PyTorch-2.1.0_skip-dynamo-test_predispatch.patch',
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
-    'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
-    'PyTorch-2.1.0_skip-test_wrap_bad.patch',
-    'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
-    'PyTorch-2.1.2_fix-test_memory_profiler.patch',
-    'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
-    'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
-    'PyTorch-2.1.2_fix-vsx-vector-div.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
+    'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch',
+    'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch',
+    'PyTorch-2.3.0_skip-test_init_from_local_shards.patch',
+    'PyTorch-2.3.0_no-cuda-stubs-rpath.patch',
+    'PyTorch-2.3.0_disable-gcc12-warning.patch',
+    'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch',
+    'PyTorch-2.3.0_disable_tests_which_need_network_download.patch',
+    'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch',
+    'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch',
+    'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch',
+    'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch',
+    'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch',
 ]
 checksums = [
-    {'pytorch-v2.1.2.tar.gz': '85effbcce037bffa290aea775c9a4bad5f769cb229583450c40055501ee1acd7'},
+    {'pytorch-v2.3.0.tar.gz': '69579513b26261bbab32e13b7efc99ad287fcf3103087f2d4fdf1adacd25316f'},
     {'PyTorch-1.7.0_disable-dev-shm-test.patch': '622cb1eaeadc06e13128a862d9946bcc1f1edd3d02b259c56a9aecc4d5406b8a'},
-    {'PyTorch-1.11.1_skip-test_init_from_local_shards.patch':
-     '4aeb1b0bc863d4801b0095cbce69f8794066748f0df27c6aaaf729c5ecba04b7'},
     {'PyTorch-1.12.1_add-hypothesis-suppression.patch':
      'e71ffb94ebe69f580fa70e0de84017058325fdff944866d6bd03463626edc32c'},
     {'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch':
@@ -75,28 +69,16 @@ checksums = [
     {'PyTorch-2.0.1_avoid-test_quantization-failures.patch':
      '02e3f47e4ed1d7d6077e26f1ae50073dc2b20426269930b505f4aefe5d2f33cd'},
     {'PyTorch-2.0.1_fix-skip-decorators.patch': '2039012cef45446065e1a2097839fe20bb29fe3c1dcc926c3695ebf29832e920'},
-    {'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch':
-     '1b37194f55ae678f3657b8728dfb896c18ffe8babe90987ce468c4fa9274f357'},
     {'PyTorch-2.0.1_fix-vsx-loadu.patch': 'a0ffa61da2d47c6acd09aaf6d4791e527d8919a6f4f1aa7ed38454cdcadb1f72'},
-    {'PyTorch-2.0.1_no-cuda-stubs-rpath.patch': '8902e58a762240f24cdbf0182e99ccdfc2a93492869352fcb4ca0ec7e407f83a'},
     {'PyTorch-2.0.1_skip-failing-gradtest.patch': '8030bdec6ba49b057ab232d19a7f1a5e542e47e2ec340653a246ec9ed59f8bc1'},
     {'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch':
      '7047862abc1abaff62954da59700f36d4f39fcf83167a638183b1b7f8fec78ae'},
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
-    {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
-    {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
-     'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
-    {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
-     '84bb51a719abc677031a7a3dfe4382ff098b0cbd8b39b8bed2a7fa03f80ac1e9'},
-    {'PyTorch-2.1.0_fix-validationError-output-test.patch':
-     '7eba0942afb121ed92fac30d1529447d892a89eb3d53c565f8e9d480e95f692b'},
     {'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch':
      '3793b4b878be1abe7791efcbd534774b87862cfe7dc4774ca8729b6cabb39e7e'},
     {'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch':
      'aef38adf1210d0c5455e91d7c7a9d9e5caad3ae568301e0ba9fc204309438e7b'},
-    {'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch':
-     '0ac36411e76506b3354c85a8a1260987f66af947ee52ffc64230aee1fa02ea8b'},
     {'PyTorch-2.1.0_remove-test-requiring-online-access.patch':
      '35184b8c5a1b10f79e511cc25db3b8a5585a5d58b5d1aa25dd3d250200b14fd7'},
     {'PyTorch-2.1.0_skip-diff-test-on-ppc.patch': '394157dbe565ffcbc1821cd63d05930957412156cc01e949ef3d3524176a1dda'},
@@ -104,56 +86,72 @@ checksums = [
      '6298daf9ddaa8542850eee9ea005f28594ab65b1f87af43d8aeca1579a8c4354'},
     {'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch':
      '5229ca88a71db7667a90ddc0b809b2c817698bd6e9c5aaabd73d3173cf9b99fe'},
-    {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
-     '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
-    {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
-    {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
-     'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
-    {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
-     '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
-    {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
-     'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
-    {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
-    {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
+    {'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch':
+     '23416f2d9d5226695ec3fbea0671e3650c655c19deefd3f0f8ddab5afa50f485'},
+    {'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch':
+     '0dcbdfde6752c3ff54c5376f521b4a742167669feb7f0f1d4e1d4d55f72b664f'},
+    {'PyTorch-2.3.0_skip-test_init_from_local_shards.patch':
+     '90ed9c2870f57ee6dc032d00873a37e2217a2b92a13035ded1c25ad5306455f2'},
+    {'PyTorch-2.3.0_no-cuda-stubs-rpath.patch':
+     '7ba26824b5def7379cff02ae821a080698e6affea0da45bc846e9ecb89939cb1'},
+    {'PyTorch-2.3.0_disable-gcc12-warning.patch':
+     'a8a624e1a2a5f4c82610173e50bd0f853e49bd5621b432f5aac689f9f6eb1514'},
+    {'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch':
+     '36aa2d5ba175be17f4e996f4fb2d544fe477d4a0bd0644cd59a85063779afc8e'},
+    {'PyTorch-2.3.0_disable_tests_which_need_network_download.patch':
+     'b7fd1a5135dfd4098cdc054182f7bf84a23ac98462a00477712182b5442da855'},
+    {'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch':
+     '041adcd91d994b8c2ab57d227f081cd57e572c157117b37171e1eb8eb576f8fc'},
+    {'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch':
+     'aa6ff764f3f7bf84372a8a257fe1b4ae6dc4b9744ad35f0f9015f2696c62a41e'},
+    {'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch':
+     '9703fd0f1fca8916f6d79d83e9a7efe8e3f717362a5fdaa8f5d9da90d0c75018'},
+    {'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch':
+     '7955f2655db3da18606574fdcbc5990be24098f49ad1db5e86ea756ea1cc506f'},
+    {'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch':
+     'ee07d21c3ac7aeb0bd0e39507b18a417b9125284a529102929c4b5c6727c2976'},
 ]
 
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('hypothesis', '6.82.0'),
+    ('CMake', '3.27.6'),
+    ('hypothesis', '6.90.0'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '12.0'),
+    ('pytest-rerunfailures', '14.0'),
     ('pytest-shard', '0.1.2'),
+    ('tlparse', '0.3.5'),
+    ('optree', '0.13.0'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.3'),
-    ('Python-bundle-PyPI', '2023.06'),
-    ('protobuf', '24.0'),
-    ('protobuf-python', '4.24.0'),
+    ('Python', '3.11.5'),
+    ('Python-bundle-PyPI', '2023.10'),
+    ('protobuf', '25.3'),
+    ('protobuf-python', '4.25.3'),
     ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.07'),
-    ('PyYAML', '6.0'),
-    ('MPFR', '4.2.0'),
-    ('GMP', '6.2.1'),
+    ('SciPy-bundle', '2023.11'),
+    ('PyYAML', '6.0.1'),
+    ('MPFR', '4.2.1'),
+    ('GMP', '6.3.0'),
     ('numactl', '2.0.16'),
     ('FFmpeg', '6.0'),
-    ('Pillow', '10.0.0'),
-    ('expecttest', '0.1.5'),
-    ('networkx', '3.1'),
+    ('Pillow', '10.2.0'),
+    ('expecttest', '0.2.1'),
+    ('networkx', '3.2.1'),
     ('sympy', '1.12'),
-    ('Z3', '4.12.2',),
+    ('Z3', '4.13.0',),
 ]
 
 use_pip = True
+buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
     '': [
@@ -169,15 +167,24 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
+        # This test is expected to fail when run in their CI, but won't in our case.
+        # It just checks for a "CI" env variable
+        'test_ci_sanity_check_fail',
+        # This fails consistently and is disabled upstream
+        # See https://github.com/pytorch/pytorch/issues/100152 and
+        # https://github.com/pytorch/pytorch/pull/124712
+        'test_cpp_extensions_open_device_registration',
+
     ]
 }
 
-runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-through-error  --verbose %(excluded_tests)s'
+local_test_opts = '--continue-through-error --pipe-logs --verbose %(excluded_tests)s'
+runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py ' + local_test_opts
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 6
 
 tests = ['PyTorch-check-cpp-extension.py']
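
The diff above splits the test command into `local_test_opts` plus `runtest` and raises `max_failed_tests` from 2 to 6. The sketch below illustrates how the two strings combine and how a failure budget of this kind could be checked. The substitution values and the `within_failure_budget` helper are assumptions made for illustration; how the PyTorch easyblock actually fills the `%(...)s` templates and counts failures is not shown in this PR.

```python
# Same string layout as in the easyconfig; the %(...)s fields are templates.
local_test_opts = '--continue-through-error --pipe-logs --verbose %(excluded_tests)s'
runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py ' + local_test_opts

# Hypothetical values, for illustration only: the real values are
# supplied by the build framework, not by the easyconfig itself.
example_values = {
    'python': 'python',
    'excluded_tests': '--exclude test_ci_sanity_check_fail',
}
command = runtest % example_values


def within_failure_budget(num_failed, max_failed_tests=6):
    """Hypothetical helper: True if the failure count stays within the budget."""
    return num_failed <= max_failed_tests
```

Keeping the options in a separate `local_test_opts` variable mainly keeps each line of the easyconfig short; the resulting command string is the same as if it had been written on one line.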

akesandgren · 2025-03-11T06:21:51Z

@boegelbot Please test @ jsc-zen3

boegelbot · 2025-03-11T06:30:08Z

@akesandgren: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=20489 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_20489 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

exit code: 0
output:

Submitted batch job 5910

Test results coming soon (I hope)...

- notification for comment with ID 2712807286 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

boegelbot · 2025-03-11T15:27:29Z

Test report by @boegelbot
FAILED
Build succeeded for 2 out of 3 (2 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.5, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.21
See https://gist.github.com/boegelbot/9ed2424b7a6d9a69906d47fee8dadb2b for a full test report.

akesandgren · 2025-03-11T15:45:20Z

@boegelbot Please test @ jsc-zen3

boegelbot · 2025-03-11T15:50:08Z

@akesandgren: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=20489 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_20489 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

exit code: 0
output:

Submitted batch job 5912

Test results coming soon (I hope)...

- notification for comment with ID 2714831517 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

boegelbot · 2025-03-12T00:43:29Z

Test report by @boegelbot
FAILED
Build succeeded for 1 out of 2 (2 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.5, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.21
See https://gist.github.com/boegelbot/3084d4ba47631f85e60969de4c8f10d6 for a full test report.

Flamefire · 2025-03-12T09:53:02Z

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
n1585 - Linux RHEL 8.9 (Ootpa), x86_64, Intel(R) Xeon(R) Platinum 8470 (icelake), Python 3.8.17
See https://gist.github.com/Flamefire/dfc494f356037b43afe18ae26f0de83c for a full test report.

akesandgren · 2025-03-13T23:42:29Z

Test report by @akesandgren
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
b-an02.hpc2n.umu.se - Linux Ubuntu 20.04, x86_64, Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, Python 3.8.10
See https://gist.github.com/akesandgren/f9b7483d858482a7db2b48ab594ad865 for a full test report.

akesandgren · 2025-03-14T14:43:53Z

Test report by @akesandgren
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
b-cn1611.hpc2n.umu.se - Linux Ubuntu 22.04, x86_64, AMD EPYC 7313 16-Core Processor, 1 x NVIDIA NVIDIA A100 80GB PCIe, 555.58.02, Python 3.10.12
See https://gist.github.com/akesandgren/4e590fcfa6500ba79bf85348e2268cbb for a full test report.

akesandgren · 2025-03-17T06:20:17Z

I'd say this one is ready for merging now.

verdurin · 2025-03-17T21:35:41Z

Test report by @verdurin
SUCCESS
Build succeeded for 5 out of 5 (2 easyconfigs in total)
easybuild-el8.cloud.in.bmrc.ox.ac.uk - Linux Rocky Linux 8.10, x86_64, Intel Xeon Processor (Skylake, IBRS), Python 3.6.8
See https://gist.github.com/verdurin/db4fc4853d17e004e278136c926fc412 for a full test report.

verdurin

Looks fine.

Flamefire · 2025-03-19T13:10:42Z

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
i7101 - Linux Rocky Linux 8.9 (Green Obsidian), x86_64, AMD EPYC 7702 64-Core Processor (zen2), Python 3.8.17
See https://gist.github.com/Flamefire/6e3669c9cf90da1e4303c5f8db5ddd68 for a full test report.

Flamefire · 2025-03-19T16:35:26Z

2.3.0 for 2024 at #22616

adding easyconfigs: PyTorch-2.3.0-foss-2023b.eb and patches: PyTorch-…

f55e28a

…2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch, PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch

akesandgren added the update label May 7, 2024

Flamefire reviewed May 14, 2024

easybuild/easyconfigs/t/tlparse/tlparse-0.3.5-GCCcore-13.2.0.eb (outdated)

tlparse: re-enable sanity_pip_check

f71aee2

pytorch: re-add some ported patches from previous version and disable…

22c71dd

… more tests.

fizwit reviewed May 15, 2024

boegel added this to the 4.x milestone May 22, 2024

migueldiascosta mentioned this pull request Sep 4, 2024

{math}[GCCcore/13.2.0] ArmComputeLibrary v23.08 #21309

Open

akesandgren added 2 commits March 4, 2025 13:44

tlparse is cargo based so has to use CargoPythonPackage and have the …

03fdba4

…list of cargos specified.

tlparse: add missing checksums for cargos

aeb0fab

Flamefire reviewed Mar 7, 2025

easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb (outdated)

Add fix for "masked load for int type", skip one more test and use --…

af45e07

…prp-logs to reduce output to stdout.

Exclude test_cpp_extensions_open_device_registration

e22f70f

akesandgren added 2 commits March 13, 2025 07:33

PyTorch: Skip test_sdpa_nn_functional_scaled_dot_product_attention_cpu.

73abe29

PyTorch: Add patch for mkldnn-avx512-f32-bias problem in test_conv_de…

4623d8f

…conv_1/2/3d_lower_precision_cpu_bfloat16

verdurin modified the milestones: 4.x, release after 4.9.4 Mar 18, 2025

verdurin approved these changes Mar 18, 2025

verdurin merged commit 44919f5 into easybuilders:develop Mar 18, 2025
10 checks passed

akesandgren deleted the 20240507124933_new_pr_PyTorch230 branch March 18, 2025 12:50

boegel modified the milestones: release after 4.9.4, 5.0.0, release after 5.0.0 Mar 18, 2025

	('typing_extensions', '4.8.0', {
	'checksums': ['df8e4339e9cb77357558cbdbceca33c303714cf861d1eef15e1070055ae8b7ef'],
	}),

github-actions bot commented Mar 4, 2025 • edited

Updated software `PyTorch-2.3.0-foss-2023b.eb`
