Merged
287 commits
6751303
[ihack_ihel] revert tput logs to ease comparisons
valassi Sep 29, 2022
05f3c8f
[hack_ihel] rerun 5 tputs after fixing ihel splitting #538: functiona…
valassi Sep 29, 2022
a66dc90
[hack_ihel] prepare to merge upstream/master - checkout all generated…
valassi Feb 25, 2024
e848dc8
[hack_ihel] in gg_tt.sa CPPProcess.h, split sigmakin and sigmakin_get…
valassi Feb 26, 2024
3308626
[hack_ihel] in gg_tt.sa CPPProcess.cc, split sigmakin and sigmakin_ge…
valassi Feb 26, 2024
542edac
Merge remote-tracking branch 'upstream/master' into hack_ihel
valassi Feb 26, 2024
cabcb0d
[hack_ihel] in gg_tt.sa CPPProcess.cc, progress in fixing cuda builds…
valassi Feb 26, 2024
982f0e9
[hack_ihel] in gg_tt.sa (MemoryAccessMatrixElements.h, MemoryBuffers.…
valassi Feb 26, 2024
c65f089
[hack_ihel] in gg_tt.sa MemoryAccessMatrixElements.h, rename "icomb" …
valassi Oct 30, 2024
fb4aac1
[hack_ihel] in gg_tt.sa (MemoryAccessMatrixElements.h and various .cc…
valassi Oct 30, 2024
98e3f58
[hack_ihel] more WIP in gg_tt.sa CPPProcess.cc - C++ runTest still fa…
valassi Oct 31, 2024
97ea53c
[hack_ihel] more WIP in gg_tt.sa CPPProcess.cc - C++ runTest still fails
valassi Oct 31, 2024
f291fe3
[hack_ihel] revert the last 5 commits in gg_tt.sa: go back to ME buff…
valassi Nov 2, 2024
fc3843a
[hack_ihel] in gg_tt.sa GpuAbstraction.h, add gpuMemcpyDeviceToDevice
valassi Nov 2, 2024
65b65be
[hack_ihel] in gg_tt.sa, fix GPU builds (add select_hel kernel and m_…
valassi Nov 2, 2024
182df8c
[hack_ihel] in gg_tt.sa, fix gcheck.exe, however runTest.exe still fa…
valassi Nov 2, 2024
45489a9
[hack_ihel] in gg_tt.sa and CODEGEN runTest.cc, make MEK a data membe…
valassi Nov 2, 2024
5018c21
[hack_ihel] in gg_tt.sa, replace <<<..kernel..>>> by gpuKernelLaunch …
valassi Nov 2, 2024
3f149bb
[hack_ihel] in gg_tt.sa, remove normalise_output from CPPProcess.h an…
valassi Nov 2, 2024
7d9d8c9
Merge remote-tracking branch 'upstream/master' into glav2_hack_ihel
valassi Nov 3, 2024
4de84d7
[hack_ihel] in gg_tt.sa CPPProcess.cc, remove ievt declaration to fix…
valassi Nov 3, 2024
ad5dc76
[hack_ihel] in gg_tt.sa, fix clang formatting as required in the upco…
valassi Nov 3, 2024
d4f6e9c
[hack_ihel] backport to CODEGEN the changes in gg_tt.sa (except for C…
valassi Nov 3, 2024
226c4ea
[hack_ihel] backport to CODEGEN the changes in gg_tt.sa CPPProcess.cc…
valassi Nov 3, 2024
ec92461
[hack_ihel] regenerate gg_tt.sa, all as expected
valassi Nov 3, 2024
1f7e38e
[hack_ihel] regenerate gg_tt.mad with the latest CODEGEN using one ke…
valassi Nov 3, 2024
488da4a
[hack_ihel] in gg_tt.mad CPPProcess.cc, progress towards fixing color…
valassi Nov 3, 2024
664df84
[hack_ihel] in gg_tt.mad CPPProcess.cc, clearly distinguish jamp2_sv …
valassi Nov 3, 2024
9566d22
[hack_ihel] in gg_tt.mad CPPProcess.cc, progress towards fixing color…
valassi Nov 3, 2024
f4bd31f
[hack_ihel] in gg_tt.mad CPPProcess.cc, call gpuDeviceSynchronize() a…
valassi Nov 3, 2024
e59a48d
[hack_ihel] in gg_tt.mad CPPProcess.cc, initialize allJamp2s to 0 - n…
valassi Nov 3, 2024
b7725ca
[hack_ihel] in gg_tt.mad and CODEGEN GpuAbstraction.h, add gpuMemset …
valassi Nov 3, 2024
e68e89c
[hack_ihel] in gg_tt.mad CPPProcess.cc, replace cudaMem functions by …
valassi Nov 3, 2024
52a0d15
[hack_ihel] backport to CODEGEN the changes in gg_tt.mad CPPProcess.c…
valassi Nov 3, 2024
229286f
[hack_ihel] regenerate gg_tt.mad, all as expected - can now regenerat…
valassi Nov 3, 2024
a0f04fa
[hack_ihel] regenerate all processes
valassi Nov 3, 2024
4ad0f3b
[hack_ihel] rerun 102 tput builds and tests on itscrd90 with one heli…
valassi Nov 4, 2024
c13d578
[hack_ihel] rerun 30 tmad tests on itscrd90 with one helicity per ker…
valassi Nov 4, 2024
8b8a650
[hack_ihel] in gg_tt.sa, outside multichannel mode remove variables t…
valassi Nov 4, 2024
1bffacd
[hack_ihel] backport to CODEGEN the gg_tt.sa fixes for build warnings…
valassi Nov 4, 2024
7f7918b
[hack_ihel] regenerate gg_tt.sa (all ok with formatting fixes) and .m…
valassi Nov 4, 2024
43adfea
[hack_ihel] regenerate ee_mumu.sa to debug the issues in runTest
valassi Nov 4, 2024
587f83d
[hack_ihel] in ee_mumu.sa CPPProcess.cc, change helcolDenominators[0]…
valassi Nov 4, 2024
980b865
[hack_ihel] bug fix in CODEGEN: use %(den_factors)s for helcolDenomin…
valassi Nov 4, 2024
80d021d
[hack_ihel] regenerate ee_mumu.sa (all ok) and ee_mumu.mad
valassi Nov 4, 2024
74b1e71
[hack_ihel] in ee_mumu.mad CPPProcess.cc, add class DeviceAccessJamp2…
valassi Nov 4, 2024
82e2f93
[hack_ihel] in CODEGEN, backport ee_mumu.mad CPPProcess.cc, adding cl…
valassi Nov 4, 2024
405ccfe
[hack_ihel] regenerate all processes (checked that ee_mumu.mad was ok)
valassi Nov 4, 2024
87990d3
[hack_ihel] in ee_mumu.mad and CODEGEN CPPProcess.cc, fix silly bug i…
valassi Nov 5, 2024
bf5c6b1
[hack_ihel] regenerate all processes (checked that ee_mumu.mad was ok)
valassi Nov 5, 2024
4dae3d9
[hack_ihel] rerun 102 tput builds and tests on itscrd90 with one heli…
valassi Nov 5, 2024
d7d156a
[hack_ihel] (COMPLETE PART1a WITHOUT STREAMS) rerun 30 tmad tests on …
valassi Nov 5, 2024
038f244
[hack_ihel] in gg_tt.mad and CODEGEN GpuAbstraction.h, add support fo…
valassi Nov 5, 2024
dd93156
[hack_ihel] in gg_tt.mad and CODEGEN, finally implement one helicity …
valassi Nov 5, 2024
9141d20
[hack_ihel] regenerate all processes with one helicity per thread usi…
valassi Nov 5, 2024
d6bab91
[hack_ihel] in gg_ttgg.mad and CODEGEN, fix cuda stream performance (…
valassi Nov 5, 2024
75c9734
[hack_ihel] in tput/throughputX.sh, replace sigmaKin by calculate_wav…
valassi Nov 5, 2024
5545b23
[hack_ihel] in gg_ttgg.mad and CODEGEN, finally fix both streams perf…
valassi Nov 5, 2024
a988f74
[hack_ihel] regenerate all processes after fixing both performance an…
valassi Nov 6, 2024
d880493
[hack_ihel] rerun 102 tput builds and tests on itscrd90 with one heli…
valassi Nov 6, 2024
55ecb8a
[hack_ihel] (COMPLETE PART1b WITH STREAMS) rerun 30 tmad tests on its…
valassi Nov 6, 2024
143e1c3
[hack_ihel2] in gg_tt.mad and CODEGEN, add allJamps argument in all A…
valassi Nov 6, 2024
0bdcac0
[hack_ihel2] regenerate gg_ttgg.mad
valassi Nov 6, 2024
1a8da67
[hack_ihel2] in gg_tt.mad, gg_ttgg.mad and CODEGEN, remove MemoryAcce…
valassi Nov 6, 2024
1a73547
[hack_ihel2] in gg_ttgg.mad and CODEGEN, split calculate_wavefunction…
valassi Nov 6, 2024
5d469e4
[hack_ihel2] in tput/throughputX.sh, profile both the calculate_jamps…
valassi Nov 6, 2024
e4665dd
[hack_ihel2] regenerate all processes after splitting the two kernels…
valassi Nov 7, 2024
a1bedd3
[hack_ihel2] bug fix in CODEGEN MemoryBuffers.h (remove hardcoded sm …
valassi Nov 7, 2024
c8e3576
[hack_ihel2] in CODEGEN/allGenerateAndCompare.sh add -bsmonly and -no…
valassi Nov 7, 2024
590f9b2
[hack_ihel2] regenerate all processes after fixing BSM codegen
valassi Nov 7, 2024
cf1eb92
[hack_ihel2] rerun 102 tput tests on itscrd90 after splitting jamps a…
valassi Nov 7, 2024
adaa7fa
[hack_ihel2] (COMPLETE PART2 SPLIT JAMPS/COLOR WITHOUT CUBLAS) rerun …
valassi Nov 7, 2024
25c7358
[hack_ihel3/cublas] in gg_ttgg.mad CPPProcess.cc, move the color matr…
valassi Nov 7, 2024
512d2a0
[hack_ihel3/cublas] in gg_ttgg.mad CPPProcess.cc, use a NormalizedCol…
valassi Nov 8, 2024
57396cd
[hack_ihel3/cublas] in gg_ttgg.mad and CODEGEN GpuAbstraction.h and G…
valassi Nov 7, 2024
c20ec19
[hack_ihel3/cublas] in gg_ttgg.mad and CODEGEN, add a first almost co…
valassi Nov 11, 2024
33874a2
[hack_ihel3/cublas] regenerate all processes
valassi Nov 10, 2024
bb85437
[hack_ihel3/cublas] in tput and tmad driver scripts, export CUDACPP_R…
valassi Nov 10, 2024
b6437ef
[hack_ihel3/cublas] in tput/throughputX.sh, profile color_sum_kernel …
valassi Nov 11, 2024
c98f554
[hack_ihel3/cublas] rerun 94 (not 102: skip ggttggg) tput tests on it…
valassi Nov 11, 2024
7bfc4c3
[hack_ihel3/cublas] rerun 27 (not 30: skip ggttggg) tmad tests on its…
valassi Nov 11, 2024
256bff6
[hack_ihel3/cublas] in gg_ttgg.mad and CODEGEN CPPProcess.cc, move th…
valassi Nov 13, 2024
d7e73b0
[hack_ihel3/cublas] regenerate all processes
valassi Nov 13, 2024
fe24855
[hack_ihel3/cublas] in gg_ttgg.mad and CODEGEN GpuAbstraction.h, use …
valassi Nov 13, 2024
09c9483
[hack_ihel3/cublas] in gg_ttgg.mad, progress in adding FPTYPE=m suppo…
valassi Nov 13, 2024
6e6adb0
[hack_ihel3/cublas] in gg_ttgg.mad, improve CUDACPP_RUNTIME_BLASCOLOR…
valassi Sep 4, 2025
acb4135
[hack_ihel3/cublas] in gg_ttgg.mad, improve comments in the code
valassi Sep 4, 2025
941fdaa
[hack_ihel3/cublas] complete backport of gg_ttgg.mad (NB runTest stil…
valassi Sep 9, 2025
a3bff69
[hack_ihel3/cublas] in gg_ttgg.mad and CODEGEN, ensure createNormaliz…
valassi Sep 9, 2025
19df9b1
[hack_ihel3/cublas] check that codegen for gg_ttgg.mad is ok
valassi Sep 9, 2025
1cf2ddd
[hack_ihel3/cublas] regenerate gg_tt.mad: note that runTest still fail…
valassi Sep 9, 2025
4c4d394
[hack_ihel3/cublas] in gg_tt.mad, move color sum methods to separate …
valassi Sep 9, 2025
19568fe
[hack_ihel3/cublas] in CODEGEN, backport from gg_tt.mad the move of c…
valassi Sep 9, 2025
3f2a8ff
[hack_ihel3/cublas] check that codegen for gg_tt.mad is ok
valassi Sep 10, 2025
cca6cb2
[hack_ihel3/cublas] in gg_tt.mad, tmp debug: inspect fptype2 jamps be…
valassi Sep 10, 2025
97f3e4f
[hack_ihel3/cublas] in gg_tt.mad, tmp debug: inspect fptype2 ztemp af…
valassi Sep 10, 2025
762f50d
[hack_ihel3/cublas] in gg_tt.mad, add a comment about allJamps reuse …
valassi Sep 10, 2025
27b0ab6
[hack_ihel3/cublas] in gg_tt.mad, tmp debug: inspect again fptype2 ja…
valassi Sep 10, 2025
8b01d27
[hack_ihel3/cublas] in gg_tt.mad, tmp debug: inspect fptype2 MEs afte…
valassi Sep 10, 2025
7e38062
[hack_ihel3/cublas] in gg_tt.mad, tmp debug: show that color_sum_gpu …
valassi Sep 10, 2025
76276a1
[hack_ihel3/cublas] in gg_tt.mad, BUG FIX for runTest in blas/m mode:…
valassi Sep 10, 2025
6fee43f
[hack_ihel3/cublas] in gg_tt.mad, revert the last five tmp debug commits
valassi Sep 10, 2025
0529907
[hack_ihel3/cublas] in CODEGEN, backport gg_tt.mad: BUG FIX for runTe…
valassi Sep 10, 2025
176871c
[hack_ihel3/cublas] check that codegen for gg_tt.mad is ok
valassi Sep 10, 2025
da0aab2
[hack_ihel3/cublas] regenerate gg_ttgg/ggg.mad and test gg_ttggg.mad …
valassi Sep 10, 2025
ab9d61d
[hack_ihel3/cublas] in gg_tt.mad, move from new2 to new1 striding for…
valassi Sep 10, 2025
71a015f
[hack_ihel3/cublas] in CODEGEN, backport gg_tt.mad, move from new2 to…
valassi Sep 10, 2025
0d5cda9
[hack_ihel3/cublas] check that codegen for gg_tt.mad is ok
valassi Sep 10, 2025
a019caa
[hack_ihel3/cublas] regenerate gg_ttgg/ggg.mad and test gg_ttggg.mad …
valassi Sep 10, 2025
965b64b
[hack_ihel3/cublas] in gg_tt.mad, simplify the code: use "new1" strid…
valassi Sep 10, 2025
0b616ed
[hack_ihel3/cublas] in gg_tt.mad, clean up and remove two unnecessary…
valassi Sep 10, 2025
31b05ec
[hack_ihel3/cublas] in gg_tt.mad, improve the handling of CUDACPP_RUN…
valassi Sep 10, 2025
bf03fa5
[hack_ihel3/cublas] in gg_tt.mad, replace CUBLAS_OP by GPUBLAS_OP to …
valassi Sep 10, 2025
5b34fb1
[hack_ihel3/cublas] in CODEGEN, backport latest changes from gg_tt.ma…
valassi Sep 10, 2025
90924ce
[hack_ihel3/cublas] check that codegen for gg_tt.mad is ok
valassi Sep 10, 2025
5634f45
[hack_ihel3/cublas] in gg_tt.mad and CODEGEN, use CUBLAS_TF32_TENSOR_…
valassi Sep 10, 2025
b07080b
[hack_ihel3/cublas] in tput and tmad driver scripts, unset CUDACPP_RU…
valassi Sep 10, 2025
890a517
[hack_ihel3/cublas] in tput and tmad driver scripts, export HASBLAS=h…
valassi Sep 10, 2025
7eaebb6
[hack_ihel3/cublas] in tput/throughputX.sh, run cuda tests both with …
valassi Sep 10, 2025
f21659c
[hack_ihel3/cublas] regenerate all processes
valassi Sep 10, 2025
078c5a1
[hack_ihel3/cublas] in tput/throughputX.sh, add some code snippets for…
valassi Sep 10, 2025
ad4b95b
[hack_ihel3/cublas] in tmad/madX.sh, go back to CUDACPP_RUNTIME_BLASC…
valassi Sep 10, 2025
1340a2b
[hack_ihel3/cublas] revert the previous tmad tests using blas color s…
valassi Sep 11, 2025
97b96d2
[hack_ihel3/cublas] rerun 30 tmad tests on itscrd90 with cublas build…
valassi Sep 11, 2025
a79aadd
[hack_ihel3/cublas] revert the previous tput tests using blas color s…
valassi Sep 11, 2025
e7eea47
[hack_ihel3/cublas] in tput/throughputX.sh, minor bug fix in variable…
valassi Sep 11, 2025
5ad2bad
[hack_ihel3/cublas] in tput scripts, run a single blas config (off, o…
valassi Sep 11, 2025
e93833d
[hack_ihel3/cublas] in tput/allTees.sh, add noBlas and blasOn tests (…
valassi Sep 11, 2025
dfc44a0
[hack_ihel3/cublas] in tput/throughputX.sh, bug fix for -blasOn and -…
valassi Sep 11, 2025
0bc3475
[hack_ihel3/cublas] in tput/allTees.sh, add -bsmblasonly option
valassi Sep 11, 2025
a5058b7
[hack_ihel3/cublas] in tput/throughputX.sh, BUG FIX: all tests have b…
valassi Sep 11, 2025
44c49be
[hack_ihel2/3/4] rerun hack_ihel2 (commit adaa7fadc) 102 tput tests i…
valassi Sep 11, 2025
20d8d33
[hack_ihel3/cublas] rerun 114 (102 plus 12 new blas tests) tput tests…
valassi Sep 11, 2025
4284ab7
[hack_ihel3/cublas] (COMPLETE PART3 ADD THE OPTION TO COMPUTE COLOR S…
valassi Sep 11, 2025
7e0c9fc
[sep25] add missing updates to copyrights and authors for the changes…
valassi Sep 13, 2025
bffc435
[sep25] in CODEGEN/generateAndCompare.sh, remove card.png and matrix1…
valassi Sep 13, 2025
f39c928
[sep25] regenerate all processes
valassi Sep 13, 2025
b915f06
[hack_ihel4] in tput/allTees.sh, check for 'aborted' in logs
valassi Sep 13, 2025
5e21ce3
[hack_ihel4] in tmad/allTees.sh, add a -checkonly option as in tput/a…
valassi Sep 13, 2025
5bfd37b
[hack_ihel4] in tmad/allTees.sh, check for asserts in logs
valassi Sep 13, 2025
576675a
[sep25] rerun 30 tmad tests on itscrd90 - all ok
valassi Sep 13, 2025
36d1b72
[sep25] in tput scripts, add scaling tests
valassi Sep 16, 2025
eb5a43e
[sep25] rerun 102 tput tests on itscrd90 (now together with 18 scalin…
valassi Sep 13, 2025
b8d7dba
[sep25] first execution of 18 tput scaling tests on itscrd90
valassi Sep 16, 2025
5fafe05
[sep25] in tput/allTees.sh, add -scalingonly option
valassi Sep 16, 2025
fc13479
[sep25] in tput/tmad scripts, print out CUDA/HIP architecture
valassi Sep 17, 2025
e407361
[sep25] in tmad/allTees.sh, check for segmentation faults in logs
valassi Sep 17, 2025
b6d902b
[sep25] rerun 114 (96 + 18 scaling) tput tests on LUMI - all ok
valassi Sep 17, 2025
8b7ed65
[sep25] rerun 30 tmad tests on LUMI - all as expected
valassi Sep 17, 2025
06aa9ca
[sep25] add tmad/strip10x.sh script to remove x10 sections from tmad …
valassi Sep 17, 2025
a7adfc2
[sep25] in tmad/allTees.sh, remove x10 tests from default (execute th…
valassi Sep 17, 2025
a15d81d
[sep25] remove x10 sections from tmad LUMI logs
valassi Sep 17, 2025
01ff46d
[sep25] revert tput/tmad logs from lumi to itscrd90 logs
valassi Sep 17, 2025
30305e3
[sep25] ** COMPLETE SEP25 ** remove x10 sections from tmad itscrd90 logs
valassi Sep 17, 2025
9f82f10
[hack_ihel_sep25] prepare to merge hack_ihel into sep25 - tput and tm…
valassi Sep 17, 2025
9d6c5ec
[hack_ihel_sep25] prepare to merge hack_ihel into sep25 - codegen log…
valassi Sep 17, 2025
10fc46c
Merge remote-tracking branch 'origin/hack_ihel' into hack_ihel_sep25
valassi Sep 17, 2025
e0d9b4a
[hack_ihel_sep25] in CODEGEN, bug fix for HIP: replace cudaStream by …
valassi Sep 17, 2025
625ebbe
[hack_ihel_sep25] in CODEGEN, fix HIP build warning: wrap checkGpu ar…
valassi Sep 17, 2025
6a8276a
[hack_ihel_sep25] regenerate all processes with HIP fixes
valassi Sep 17, 2025
e26f061
[hack_ihel_sep25] go back to 102 tput logs for sep25 on itscrd90
valassi Sep 18, 2025
5a44d13
[hack_ihel_sep25] go back to 30 tmad logs (no x10) for sep25 on itscrd90
valassi Sep 18, 2025
6044c55
[hack_ihel_sep25] rerun 114 (96 + 18 scaling) tput tests on LUMI - al…
valassi Sep 18, 2025
bc2d7b5
[hack_ihel_sep25] rerun 30 tmad tests on LUMI (no x10) - all ok (but …
valassi Sep 18, 2025
fa5a034
[hack_ihel_sep25] revert 144 tput/tmad logs from lumi/hack_ihel_sep25…
valassi Sep 18, 2025
c4ee7d6
[hack_ihel_sep25] rerun 120 (102 + 18 scaling) tput tests on itscrd90…
valassi Sep 18, 2025
9f802a9
[hack_ihel_sep25] ** COMPLETE HACK_IHEL_SEP25 ** rerun 30 tmad tests …
valassi Sep 18, 2025
7dc7620
[hack_ihel2_sep25] prepare to merge hack_ihel2 into hack_ihel_sep25 -…
valassi Sep 18, 2025
0a5a400
[hack_ihel2_sep25] prepare to merge hack_ihel2 into hack_ihel_sep25 -…
valassi Sep 18, 2025
5b3f601
Merge remote-tracking branch 'origin/hack_ihel2' into hack_ihel2_sep25
valassi Sep 18, 2025
5f12aab
[hack_ihel2_sep25] go back to 132 tput and tmad logs for hack_ihel_se…
valassi Sep 18, 2025
29bec52
[hack_ihel2_sep25] regenerate all processes - no changes except in co…
valassi Sep 18, 2025
cf93abd
[hack_ihel2_sep25] rerun 114 tput (96 + 18 scaling) logs on LUMI - fa…
valassi Sep 19, 2025
22d35e7
[hack_ihel2_sep25] rerun 30 tmad tests on LUMI - aborts in ggttgg, wi…
valassi Sep 19, 2025
3694bab
[hack_ihel2_sep25] in tput scripts, shorten HIP abort messages
valassi Sep 20, 2025
9f6a2a3
[hack_ihel2_sep25] in tput/throughputX.sh, skip ggttgg gpu test in 20…
valassi Sep 20, 2025
a91ca59
[hack_ihel2_sep25] rerun ggttgg tput (and scaling) tests on LUMI afte…
valassi Sep 20, 2025
3308104
[hack_ihel2_sep25] in tmad/allTees.sh, check for errors and aborts
valassi Sep 20, 2025
1d73925
[hack_ihel2_sep25] in tmad/madX.sh, define ggttgg max grid as 512 32 …
valassi Sep 20, 2025
9720000
[hack_ihel2_sep25] rerun ggttgg tmad tests on LUMI after tuning the s…
valassi Sep 20, 2025
d5f0f6c
[hack_ihel2_sep25] go back from hack_ihel2_sep25/LUMI to hack_ihel_se…
valassi Sep 20, 2025
cad3033
[hack_ihel2_sep25] rerun 120 (102 + 18 scaling) tput tests on itscrd9…
valassi Sep 19, 2025
1a68c99
[hack_ihel2_sep25] rerun 30 tmad (no x10) tests on itscrd90 (before t…
valassi Sep 19, 2025
2fd95f3
[hack_ihel2_sep25] rerun ggttgg tput (and scaling) tests on rd90 afte…
valassi Sep 20, 2025
7b12a5c
[hack_ihel2_sep25] ** COMPLETE HACK_IHEL2_SEP25 ** rerun ggttgg tmad …
valassi Sep 20, 2025
d6f7049
[hack_ihel3_sep25] prepare to merge hack_ihel3 into hack_ihel2_sep25 …
valassi Sep 20, 2025
d482bd5
[hack_ihel3_sep25] prepare to merge hack_ihel3 into hack_ihel2_sep25 …
valassi Sep 20, 2025
986d4cb
[hack_ihel3_sep25] prepare to merge hack_ihel3 into hack_ihel2_sep25 …
valassi Sep 20, 2025
a414fb0
Merge branch 'hack_ihel3' into hack_ihel3_sep25
valassi Sep 20, 2025
44c73de
[hack_ihel3_sep25] regenerate all processes
valassi Sep 20, 2025
975022a
[hack_ihel3_sep25] in CODEGEN (and .mad/.sa) GpuAbstraction.h, protec…
valassi Sep 20, 2025
878e522
[hack_ihel3_sep25] in CODEGEN (and .mad/.sa) GpuRuntime.h, bug fix in…
valassi Sep 20, 2025
d652d45
[hack_ihel3_sep25] in CODEGEN (and .mad/.sa) GpuAbstraction.h, define…
valassi Sep 20, 2025
f974a77
[hack_ihel3_sep25] in CODEGEN, add another hack with pBlasHandle = nu…
valassi Sep 20, 2025
95be3da
[hack_ihel3_sep25] in CODEGEN color_sum.cc, add __host__ to Normalize…
valassi Sep 20, 2025
c964d6a
[hack_ihel3_sep25] in CODEGEN cudacpp.mk and GpuAbstraction.h, look f…
valassi Sep 20, 2025
335fcd9
[hack_ihel3_sep25] regenerate all processes again with all fixes for …
valassi Sep 20, 2025
b56251e
[hack_ihel3_sep25] rerun 132 (96 + 12 blas + 18 scaling + 6 new blas/…
valassi Sep 21, 2025
ac04c54
[hack_ihel3_sep25] rerun 30 tmad tests on LUMI - all ok
valassi Sep 21, 2025
1218338
[hack_ihel2_sep25] go back from hack_ihel3_sep25/LUMI to hack_ihel2_s…
valassi Sep 21, 2025
46a4a8c
[hack_ihel3_sep25] rerun 138 (102 + 12 blas + 18 scaling + 6 new blas…
valassi Sep 21, 2025
10c3e3b
[hack_ihel3_sep25] rerun 30 tmad tests on itscrd90 - all ok
valassi Sep 21, 2025
03c0f5e
[hack_ihel3_sep25] minor fixes in tput/allTees.sh
valassi Sep 21, 2025
a91bf29
[hack_ihel3_sep25] minor fixes in tmad/madX.sh
valassi Sep 21, 2025
4ade11e
[hack_ihel3_sep25] in CODEGEN, fix a build warning for incompatible _…
valassi Sep 21, 2025
f98c217
[hack_ihel3_sep25] ** COMPLETE HACK_IHEL3_SEP25 ** regenerate all pro…
valassi Sep 21, 2025
037b612
Merge branch 'master' into hack_ihel3_sep25_pr
oliviermattelaer Oct 7, 2025
cfe9046
[hack_ihel3_oct25] in tput/allTees.sh, add 6 more scaling logs (blasO…
valassi Oct 1, 2025
bf9a225
[hack_ihel3_oct25] complete 144 tput tests on itscrd90: add 6 more bl…
valassi Oct 2, 2025
0e47e81
[hack_ihel3_oct25] in tput/throughputX.sh, use 64 thr/blk for HIP ins…
valassi Oct 2, 2025
415b0d0
[hack_ihel3p1] in gg_tt.mad, move the helicity loop into color_sum_gp…
valassi Oct 10, 2025
2ac1529
[hack_ihel3p1] in gg_tt.mad, fix hasNoBlas after moving the helicity …
valassi Oct 10, 2025
289e34e
[hack_ihel3p1] in gg_tt.mad, further cleanup of color_sum_gpu
valassi Oct 10, 2025
ae4e222
[hack_ihel3p1] backport to CODEGEN from gg_tt.mad the move of the hel…
valassi Oct 10, 2025
ee6bde7
[hack_ihel3p1] regenerate gg_tt.mad (all ok) and gg_ttggg.mad: no cha…
valassi Oct 10, 2025
112b9cf
[hack_ihel3p1] in gg_tt.mad, complete the "ihel3p1/blas" implementati…
valassi Oct 10, 2025
90c46f1
[hack_ihel3p1] backport to CODEGEN from gg_tt.mad: complete "ihel3p1/…
valassi Oct 10, 2025
02d36b2
[hack_ihel3p1] regenerate all processes after completing "ihel3p1/bla…
valassi Oct 10, 2025
a361ec0
[hack_ihel3p1] in CODEGEN/generateAndCompare.sh, remove unnecessary/b…
valassi Oct 10, 2025
1b7d976
[hack_ihel3p1] rerun 144 tput tests on itscrd90 - very good blas perf…
valassi Oct 11, 2025
f0788b8
[hack_ihel3p1] rerun 30 tmad tests on itscrd90 - some failures for gq…
valassi Oct 11, 2025
a12f34d
[hack_ihel3p1] in gq_ttq.mad, BUG FIX in the "ihel3p1/blas" implement…
valassi Oct 11, 2025
a889f3b
[hack_ihel3p1] backport to CODEGEN gq_ttq.mad BUG FIX in "ihel3p1/bla…
valassi Oct 11, 2025
43125e7
[hack_ihel3p1] go back to the 'hack_ihel3_oct25' logs (144 tput, 30 t…
valassi Oct 11, 2025
e938d1b
[hack_ihel3p1] in gg_tt.mad, use a triangular matrix instead of a squ…
valassi Oct 11, 2025
72864de
[hack_ihel3p1] backport to CODEGEN from gg_tt.mad, use a triangular m…
valassi Oct 11, 2025
2d68927
[hack_ihel3p1] regenerate all processes after fixing "ihel3p1/blas" f…
valassi Oct 11, 2025
4178974
[hack_ihel3p1] rerun 144 tput tests on itscrd90 - very good blas perf…
valassi Oct 11, 2025
5fce1aa
[hack_ihel3p1] rerun 30 tmad tests on itscrd90 - all ok (gqttq tests …
valassi Oct 11, 2025
cffd74c
[hack_ihel3p1] in CODEGEN, fix for HIP (skip TF32 cuBLAS math mode)
valassi Oct 11, 2025
5ecd504
[hack_ihel3p1] regenerate all processes after fixing "ihel3p1/blas" f…
valassi Oct 11, 2025
6baae79
[hack_ihel3p1] rerun 138 tput tests on LUMI - all ok
valassi Oct 12, 2025
c593242
[hack_ihel3p1] rerun 30 tmad tests on LUMI - all ok
valassi Oct 12, 2025
79a60c7
Merge branch 'sep25' into oct25 (current master)
valassi Oct 13, 2025
7e0f9d1
[oct25] regenerate all processes after the trex MR - add fbridge.h to…
valassi Oct 13, 2025
4952ab5
[oct25] ** COMPLETE OCT25 ** go back to sep25 CODEGEN logs for easier…
valassi Oct 13, 2025
5a068e8
[hack_ihel3p1] ** COMPLETE HACK_IHEL3P1 ** go back from hack_ihel3p1/…
valassi Oct 12, 2025
6c181f9
Update epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolib…
valassi Oct 13, 2025
0dd6df6
Update epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolib…
valassi Oct 13, 2025
44482d2
Merge remote-tracking branch 'origin/oct25' (current master + regener…
valassi Oct 13, 2025
1cbf8d3
Merge remote-tracking branch 'origin/hack_ihel3p1' (improve blas colo…
valassi Oct 13, 2025
a0328aa
[hack_ihel3_sep25_pr] regenerate all processes: some changes in nproc…
valassi Oct 13, 2025
536bd06
[hack_ihel3_sep25_pr] in gg_tt.sa, bug fix for cuda builds in no-mult…
valassi Oct 13, 2025
ffc7129
[hack_ihel3_sep25_pr] backport to CODEGEN from gg_tt.sa the bug fix f…
valassi Oct 13, 2025
e0d061b
[hack_ihel3_sep25_pr] ** COMPLETE HACK_IHEL3_SEP25_PR ** regenerate a…
valassi Oct 13, 2025
b9fa12b
merge with nopatch version
oliviermattelaer Oct 21, 2025
7c6e9ba
update source code
oliviermattelaer Oct 21, 2025
626fd07
Regenerate .mad code
Qubitol Oct 21, 2025
c58d396
Add mg5 input for .sa folders
Qubitol Oct 21, 2025
6ecfe01
Regenerate code with allGenerateAndCompare.sh
Qubitol Oct 21, 2025
2 changes: 1 addition & 1 deletion .github/workflows/archiver.yml
@@ -1,7 +1,7 @@
# Copyright (C) 2020-2025 CERN and UCLouvain.
# Licensed under the GNU Lesser General Public License (version 3 or later).
# Created by: A. Valassi (Sep 2024) for the MG5aMC CUDACPP plugin.
-# Further modified by: D. Massaro, A. Valassi (2024) for the MG5aMC CUDACPP plugin.
+# Further modified by: D. Massaro, A. Valassi (2024-2025) for the MG5aMC CUDACPP plugin.

#----------------------------------------------------------------------------------------------------------------------------------

5 changes: 5 additions & 0 deletions .github/workflows/c-cpp.yml
@@ -1,3 +1,8 @@
# Copyright (C) 2020-2025 CERN and UCLouvain.
# Licensed under the GNU Lesser General Public License (version 3 or later).
# Created by: S. Hageboeck (Nov 2020) for the MG5aMC CUDACPP plugin.
# Further modified by: S. Hageboeck, D. Massaro, S. Roiser, A. Valassi, Z. Wettersten (2024-2025) for the MG5aMC CUDACPP plugin.

name: C/C++ CI

on:
@@ -1,17 +1,23 @@
-// Copyright (C) 2020-2024 CERN and UCLouvain.
+// Copyright (C) 2020-2025 CERN and UCLouvain.
// Licensed under the GNU Lesser General Public License (version 3 or later).
// Created by: J. Teig (Jul 2023) for the MG5aMC CUDACPP plugin.
-// Further modified by: J. Teig, A. Valassi (2020-2024) for the MG5aMC CUDACPP plugin.
+// Further modified by: J. Teig, A. Valassi (2020-2025) for the MG5aMC CUDACPP plugin.

#ifndef MG5AMC_GPUABSTRACTION_H
#define MG5AMC_GPUABSTRACTION_H 1

#include "mgOnGpuConfig.h"

#include <cassert>

//--------------------------------------------------------------------------

#ifdef __CUDACC__ // this must be __CUDACC__ (not MGONGPUCPP_GPUIMPL)

#ifndef MGONGPU_HAS_NO_BLAS
#include "cublas_v2.h"
#endif

#define gpuError_t cudaError_t
#define gpuPeekAtLastError cudaPeekAtLastError
#define gpuGetErrorString cudaGetErrorString
@@ -21,24 +27,61 @@
#define gpuMalloc( ptr, size ) checkGpu( cudaMalloc( ptr, size ) )

#define gpuMemcpy( dstData, srcData, srcBytes, func ) checkGpu( cudaMemcpy( dstData, srcData, srcBytes, func ) )
#define gpuMemset( data, value, bytes ) checkGpu( cudaMemset( data, value, bytes ) )
#define gpuMemcpyHostToDevice cudaMemcpyHostToDevice
#define gpuMemcpyDeviceToHost cudaMemcpyDeviceToHost
#define gpuMemcpyDeviceToDevice cudaMemcpyDeviceToDevice
#define gpuMemcpyToSymbol( type1, type2, size ) checkGpu( cudaMemcpyToSymbol( type1, type2, size ) )

#define gpuFree( ptr ) checkGpu( cudaFree( ptr ) )
#define gpuFreeHost( ptr ) checkGpu( cudaFreeHost( ptr ) )

#define gpuGetSymbolAddress( devPtr, symbol ) checkGpu( cudaGetSymbolAddress( devPtr, symbol ) )

#define gpuSetDevice cudaSetDevice
#define gpuDeviceSynchronize cudaDeviceSynchronize
#define gpuDeviceReset cudaDeviceReset

#define gpuLaunchKernel( kernel, blocks, threads, ... ) kernel<<<blocks, threads>>>( __VA_ARGS__ )
-#define gpuLaunchKernelSharedMem( kernel, blocks, threads, sharedMem, ... ) kernel<<<blocks, threads, sharedMem>>>( __VA_ARGS__ )
+//#define gpuLaunchKernelSharedMem( kernel, blocks, threads, sharedMem, ... ) kernel<<<blocks, threads, sharedMem>>>( __VA_>
#define gpuLaunchKernelStream( kernel, blocks, threads, stream, ... ) kernel<<<blocks, threads, 0, stream>>>( __VA_ARGS__ )

#define gpuStream_t cudaStream_t
#define gpuStreamCreate( pStream ) checkGpu( cudaStreamCreate( pStream ) )
#define gpuStreamDestroy( stream ) checkGpu( cudaStreamDestroy( stream ) )

#define gpuBlasStatus_t cublasStatus_t
#define GPUBLAS_STATUS_SUCCESS CUBLAS_STATUS_SUCCESS
#ifndef MGONGPU_HAS_NO_BLAS
#define gpuBlasHandle_t cublasHandle_t
#else
#define gpuBlasHandle_t void // hack to keep the same API also in noBLAS builds
#endif
#define gpuBlasCreate cublasCreate
#define gpuBlasDestroy cublasDestroy
#define gpuBlasSetStream cublasSetStream

#define gpuBlasSaxpy cublasSaxpy
#define gpuBlasSdot cublasSdot
#define gpuBlasSgemv cublasSgemv
#define gpuBlasSgemm cublasSgemm
#define gpuBlasSgemmStridedBatched cublasSgemmStridedBatched
#define gpuBlasDaxpy cublasDaxpy
#define gpuBlasDdot cublasDdot
#define gpuBlasDgemv cublasDgemv
#define gpuBlasDgemm cublasDgemm
#define gpuBlasDgemmStridedBatched cublasDgemmStridedBatched
#define GPUBLAS_OP_N CUBLAS_OP_N
#define GPUBLAS_OP_T CUBLAS_OP_T

//--------------------------------------------------------------------------

#elif defined __HIPCC__

#ifndef MGONGPU_HAS_NO_BLAS
#include "hipblas/hipblas.h"
#endif

#define gpuError_t hipError_t
#define gpuPeekAtLastError hipPeekAtLastError
#define gpuGetErrorString hipGetErrorString
@@ -48,22 +91,69 @@
#define gpuMalloc( ptr, size ) checkGpu( hipMalloc( ptr, size ) )

#define gpuMemcpy( dstData, srcData, srcBytes, func ) checkGpu( hipMemcpy( dstData, srcData, srcBytes, func ) )
#define gpuMemset( data, value, bytes ) checkGpu( hipMemset( data, value, bytes ) )
#define gpuMemcpyHostToDevice hipMemcpyHostToDevice
#define gpuMemcpyDeviceToHost hipMemcpyDeviceToHost
#define gpuMemcpyDeviceToDevice hipMemcpyDeviceToDevice
#define gpuMemcpyToSymbol( type1, type2, size ) checkGpu( hipMemcpyToSymbol( type1, type2, size ) )

#define gpuFree( ptr ) checkGpu( hipFree( ptr ) )
#define gpuFreeHost( ptr ) checkGpu( hipHostFree( ptr ) )

#define gpuGetSymbolAddress( devPtr, symbol ) checkGpu( hipGetSymbolAddress( devPtr, symbol ) )

#define gpuSetDevice hipSetDevice
#define gpuDeviceSynchronize hipDeviceSynchronize
#define gpuDeviceReset hipDeviceReset

#define gpuLaunchKernel( kernel, blocks, threads, ... ) kernel<<<blocks, threads>>>( __VA_ARGS__ )
-#define gpuLaunchKernelSharedMem( kernel, blocks, threads, sharedMem, ... ) kernel<<<blocks, threads, sharedMem>>>( __VA_ARGS__ )
+//#define gpuLaunchKernelSharedMem( kernel, blocks, threads, sharedMem, ... ) kernel<<<blocks, threads, sharedMem>>>( __VA_>
#define gpuLaunchKernelStream( kernel, blocks, threads, stream, ... ) kernel<<<blocks, threads, 0, stream>>>( __VA_ARGS__ )

#define gpuStream_t hipStream_t
#define gpuStreamCreate( pStream ) checkGpu( hipStreamCreate( pStream ) )
#define gpuStreamDestroy( stream ) checkGpu( hipStreamDestroy( stream ) )

#define gpuBlasStatus_t hipblasStatus_t
#define GPUBLAS_STATUS_SUCCESS HIPBLAS_STATUS_SUCCESS
#ifndef MGONGPU_HAS_NO_BLAS
#define gpuBlasHandle_t hipblasHandle_t
#else
#define gpuBlasHandle_t void // hack to keep the same API also in noBLAS builds
#endif
#define gpuBlasCreate hipblasCreate
#define gpuBlasDestroy hipblasDestroy
#define gpuBlasSetStream hipblasSetStream

#define gpuBlasSaxpy hipblasSaxpy
#define gpuBlasSdot hipblasSdot
#define gpuBlasSgemv hipblasSgemv
#define gpuBlasSgemm hipblasSgemm
#define gpuBlasSgemmStridedBatched hipblasSgemmStridedBatched
#define gpuBlasDaxpy hipblasDaxpy
#define gpuBlasDdot hipblasDdot
#define gpuBlasDgemv hipblasDgemv
#define gpuBlasDgemm hipblasDgemm
#define gpuBlasDgemmStridedBatched hipblasDgemmStridedBatched
#define GPUBLAS_OP_N HIPBLAS_OP_N
#define GPUBLAS_OP_T HIPBLAS_OP_T

#endif
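
Review note on the `gpuBlasHandle_t void` definitions in both backend branches above: a minimal sketch of why aliasing the handle type to `void` can keep one function signature valid in BLAS and noBLAS builds. Everything prefixed `mock`/`Mock`/`MOCK_` and the function `useBlas` are hypothetical names for illustration only; they are not part of the plugin, cuBLAS or hipBLAS, and the sketch assumes callers only ever pass the handle by pointer.

```cpp
#include <cassert>

// Hypothetical build switch, standing in for MGONGPU_HAS_NO_BLAS (assumption).
#define MOCK_HAS_NO_BLAS 1

// Hypothetical "real" handle type, standing in for cublasHandle_t / hipblasHandle_t.
struct MockRealHandle
{
  int id;
};

#ifndef MOCK_HAS_NO_BLAS
#define mockBlasHandle_t MockRealHandle
#else
#define mockBlasHandle_t void // same trick as in the header: only pointers to it are usable
#endif

// The signature is identical in both builds because it only mentions a pointer.
// With the void alias, 'mockBlasHandle_t*' is simply 'void*'.
inline bool useBlas( mockBlasHandle_t* pHandle )
{
  // In this sketch a null pointer means "take the non-BLAS code path".
  return pHandle != nullptr;
}
```

With the `void` alias, a by-value declaration such as `mockBlasHandle_t h;` would not compile, so this pattern only works if every use in noBLAS builds goes through a pointer.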

//--------------------------------------------------------------------------

#ifdef MGONGPU_FPTYPE2_FLOAT
#define gpuBlasTaxpy gpuBlasSaxpy
#define gpuBlasTdot gpuBlasSdot
#define gpuBlasTgemv gpuBlasSgemv
#define gpuBlasTgemm gpuBlasSgemm
#define gpuBlasTgemmStridedBatched gpuBlasSgemmStridedBatched
#else
#define gpuBlasTaxpy gpuBlasDaxpy
#define gpuBlasTdot gpuBlasDdot
#define gpuBlasTgemv gpuBlasDgemv
#define gpuBlasTgemm gpuBlasDgemm
#define gpuBlasTgemmStridedBatched gpuBlasDgemmStridedBatched
#endif

#endif // MG5AMC_GPUABSTRACTION_H
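
Review note on the `gpuBlasT*` block that closes this header: a minimal, CPU-only sketch of the same two-level macro dispatch, where a precision-neutral name resolves at preprocessing time to the single- or double-precision entry point. The names `mockSdot`, `mockDdot`, `mockTdot`, `mock_fptype2` and `MOCK_FPTYPE2_FLOAT` are hypothetical stand-ins chosen for this illustration; they are not the plugin's or any BLAS library's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for precision-specific BLAS entry points (NOT cuBLAS/hipBLAS).
inline const char* mockSdot() { return "Sdot"; }
inline const char* mockDdot() { return "Ddot"; }

// Hypothetical switch standing in for MGONGPU_FPTYPE2_FLOAT; remove it to select double.
#define MOCK_FPTYPE2_FLOAT 1

// Same selection pattern as in the header: one macro picks the precision-specific name.
#ifdef MOCK_FPTYPE2_FLOAT
typedef float mock_fptype2;
#define mockTdot mockSdot
#else
typedef double mock_fptype2;
#define mockTdot mockDdot
#endif

// Calling code is written once against the precision-neutral name.
inline const char* selectedDot() { return mockTdot(); }
inline std::size_t selectedSize() { return sizeof( mock_fptype2 ); }
```

Because the choice is made by the preprocessor, a given build contains only one of the two variants; switching precision means rebuilding, not branching at run time.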
@@ -1,4 +1,4 @@
-// Copyright (C) 2020-2024 CERN and UCLouvain.
+// Copyright (C) 2020-2025 CERN and UCLouvain.
// Licensed under the GNU Lesser General Public License (version 3 or later).
// Created by: J. Teig (Jun 2023, based on earlier work by S. Roiser) for the MG5aMC CUDACPP plugin.
// Further modified by: O. Mattelaer, S. Roiser, J. Teig, A. Valassi, Z. Wettersten (2020-2025) for the MG5aMC CUDACPP plugin.
@@ -30,6 +30,22 @@ inline void assertGpu( gpuError_t code, const char* file, int line, bool abort =

//--------------------------------------------------------------------------

#ifdef MGONGPUCPP_GPUIMPL /* clang-format off */
#ifndef MGONGPU_HAS_NO_BLAS
#define checkGpuBlas( code ){ assertGpuBlas( code, __FILE__, __LINE__ ); }
inline void assertGpuBlas( gpuBlasStatus_t code, const char *file, int line, bool abort = true )
{
  if ( code != GPUBLAS_STATUS_SUCCESS )
  {
    printf( "ERROR! assertGpuBlas: '%d' in %s:%d\n", code, file, line );
    if( abort ) assert( code == GPUBLAS_STATUS_SUCCESS );
  }
}
#endif
#endif /* clang-format on */
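
Review note on `checkGpuBlas` / `assertGpuBlas` above: a minimal sketch of the same check-macro shape, written against a hypothetical status enum so that it can be exercised without a GPU. `MockBlasStatus`, `reportMockBlas`, `checkMockBlas` and `mockBlasCall` are names invented for this illustration. Unlike the code in the diff, the sketch returns a `bool` instead of asserting, which is an assumption made only so that the failure path can be tested.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical status type standing in for gpuBlasStatus_t (assumption).
enum MockBlasStatus
{
  MOCK_STATUS_SUCCESS = 0,
  MOCK_STATUS_FAILURE = 1
};

// Same shape as assertGpuBlas: report the failing call site using file and line.
// Returns false on failure instead of aborting, so that both paths can be exercised.
inline bool reportMockBlas( MockBlasStatus code, const char* file, int line )
{
  if( code != MOCK_STATUS_SUCCESS )
  {
    fprintf( stderr, "ERROR! reportMockBlas: '%d' in %s:%d\n", static_cast<int>( code ), file, line );
    return false;
  }
  return true;
}

// The macro exists only to capture __FILE__ and __LINE__ at the call site.
#define checkMockBlas( code ) reportMockBlas( code, __FILE__, __LINE__ )

// Hypothetical library call that succeeds or fails on demand.
inline MockBlasStatus mockBlasCall( bool ok )
{
  return ok ? MOCK_STATUS_SUCCESS : MOCK_STATUS_FAILURE;
}
```

One point that may be worth checking in the real helper: `assert` expands to nothing when `NDEBUG` is defined, so in such builds a check of the `if( abort ) assert( ... )` form prints the error and then continues.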

//--------------------------------------------------------------------------

#ifdef MGONGPUCPP_GPUIMPL
namespace mg5amcGpu
{