Several fixes for icx2023.2 (including fixes for sqrt FPEs in ixx/oxx/vxx) #737
Conversation
…me undetected errors! (NB this is itscrd90, so different from the baseline itscrd80) The summary says status=0...

STARTED AT Mon Jul 24 07:58:00 PM CEST 2023
./tput/teeThroughputX.sh -mix -hrd -makej -eemumu -ggtt -ggttg -ggttgg -gqttq -ggttggg -makeclean
ENDED(1) AT Mon Jul 24 11:44:19 PM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -flt -hrd -makej -eemumu -ggtt -ggttgg -inlonly -makeclean
ENDED(2) AT Tue Jul 25 12:08:13 AM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -makej -eemumu -ggtt -ggttg -gqttq -ggttgg -ggttggg -flt -bridge -makeclean
ENDED(3) AT Tue Jul 25 12:18:54 AM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -rmbhst
ENDED(4) AT Tue Jul 25 12:22:12 AM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -curhst
ENDED(5) AT Tue Jul 25 12:25:28 AM CEST 2023 [Status=0]

But actually some tests have failed...

tput/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt:Floating Point Exception (CPU neppV=1): 'unknown' ievt=-1
tput/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt:Floating Point Exception (CPU neppV=4): 'unknown' ievt=-1
tput/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt:Floating Point Exception (CPU neppV=8): 'unknown' ievt=-1
tput/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt:Floating Point Exception (GPU): 'unknown' ievt=-1
tput/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt:Floating Point Exception (GPU): 'unknown' ievt=-1
STARTED AT Tue Jul 25 12:28:50 AM CEST 2023
ENDED AT Tue Jul 25 04:39:58 AM CEST 2023 Status=0

24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_m_inl0_hrd0.txt
1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_d_inl0_hrd0.txt
1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_f_inl0_hrd0.txt
1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_m_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0.txt
…E-2 to 4E-2 in single precision for momenta (madgraph5#735)
This fixes the test failure (also in gcc), but there is an FPE madgraph5#736 in icx
…bug FPE madgraph5#736

The FPE seems to be in testxxx.cc?

Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
[==========] Running 3 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
Program received signal SIGFPE, Arithmetic exception.
0x0000000000413193 in SIGMA_SM_GG_TTX_CPU_XXX_testxxx_Test::TestBody (this=<optimized out>) at testxxx.cc:132
132 mass0[ievt] = sqrt( p0 * p0 - p1 * p1 - p2 * p2 - p3 * p3 );
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34-60.el9.x86_64 libgcc-11.3.1-4.3.el9.alma.x86_64 libstdc++-11.3.1-4.3.el9.alma.x86_64
(gdb) where
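The crash site is the invariant-mass check at testxxx.cc:132. A plausible mechanism, shown here as a hedged sketch with a hypothetical helper (not the actual test code): in single precision the argument of sqrt can round to a tiny negative value for an on-shell momentum, raising FE_INVALID, i.e. SIGFPE when FP traps are enabled.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical sketch (not the testxxx.cc code): compute an invariant
// mass from a four-momentum. In single precision, the difference
// p0*p0 - p1*p1 - p2*p2 - p3*p3 can round to a tiny negative number
// for an on-shell momentum, and sqrt of that raises FE_INVALID.
// Clamping the argument at 0 avoids the trap.
float invariantMass( float p0, float p1, float p2, float p3 )
{
  const float m2 = p0 * p0 - p1 * p1 - p2 * p2 - p3 * p3;
  return std::sqrt( std::fmax( m2, 0.f ) ); // clamp tiny negative rounding artefacts
}
```

For example, with p0=1 and p1=1.0000001f (which rounds up to 1+2^-23), m2 evaluates to a small negative float; the clamp maps it to 0 instead of trapping.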
…, this seems to avoid the FPE madgraph5#736
Revert "[icx] in gg_tt.mad cudacpp.mk, switch on -g (while keeping -O3) to debug FPE madgraph5#736" This reverts commit 901ddab.
…tile" workaround for madgraph5#736
…adgraph5#735 and madgraph5#736 - but a few (undetected) FPEs still take place

STARTED AT Tue Jul 25 03:05:48 PM CEST 2023
./tput/teeThroughputX.sh -mix -hrd -makej -eemumu -ggtt -ggttg -ggttgg -gqttq -ggttggg -makeclean
ENDED(1) AT Tue Jul 25 04:01:00 PM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -flt -hrd -makej -eemumu -ggtt -ggttgg -inlonly -makeclean
ENDED(2) AT Tue Jul 25 04:17:52 PM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -makej -eemumu -ggtt -ggttg -gqttq -ggttgg -ggttggg -flt -bridge -makeclean
ENDED(3) AT Tue Jul 25 04:28:37 PM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -rmbhst
ENDED(4) AT Tue Jul 25 04:31:55 PM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -curhst
ENDED(5) AT Tue Jul 25 04:35:11 PM CEST 2023 [Status=0]

Example:
runExe /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/ee_mumu.mad/SubProcesses/P1_epem_mupmum/build.avx2_f_inl0_hrd0/runTest.exe
-Floating Point Exception (CPU neppV=8): 'unknown' ievt=-1
+Floating Point Exception (CPU neppV=8): 'ixxxxx' ievt=0

May be reproduced with ./tput/teeThroughputX.sh -eemumu -fltonly
… (exit 1 instead of exit 0), see madgraph5#736
…o not go undetected (exit 1 instead of exit 0), see madgraph5#736
for f in $(gitls */SubProcesses/testxxx.cc); do \cp ee_mumu.mad/SubProcesses/testxxx.cc $f; done
…bug FPE madgraph5#736

[avalassi@itscrd90 icx2023/cvmfs] /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx> make cleanall; make -j -f cudacpp.mk FPTYPE=f AVX=avx2
...
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
[New Thread 0x7fffedea5000 (LWP 375028)]
[New Thread 0x7fffed6a4000 (LWP 375029)]
[==========] Running 6 tests from 6 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
Thread 1 "runTest.exe" received signal SIGFPE, Arithmetic exception.
0x000000000041373f in mg5amcCpu::fpternary(int __vector(8) const&, float __vector(8) const&, float const&) (mask=..., a=..., b=<optimized out>) at ../../src/mgOnGpuVectors.h:490
490 for( int i = 0; i < neppV; i++ ) out[i] = ( mask[i] ? a[i] : b );
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34-60.el9.x86_64 libgcc-11.3.1-4.3.el9.alma.x86_64 libstdc++-11.3.1-4.3.el9.alma.x86_64 nvidia-driver-cuda-libs-530.30.02-1.el9.x86_64
(gdb) where
0 0x000000000041373f in mg5amcCpu::fpternary(int __vector(8) const&, float __vector(8) const&, float const&) (mask=..., a=..., b=<optimized out>) at ../../src/mgOnGpuVectors.h:490
1 mg5amcCpu::fpmax(float __vector(8) const&, float const&) (a=..., b=<optimized out>) at ../../src/mgOnGpuVectors.h:650
2 mg5amcCpu::ixxxxx<mg5amcCpu::KernelAccessMomenta<false>, mg5amcCpu::KernelAccessWavefunctions<false> > ( momenta=momenta@entry=0x103cac0, fmass=<optimized out>, nhel=nhel@entry=1, nsf=nsf@entry=-1, wavefunctions=wavefunctions@entry=0x7fffffff9a80, ipar=ipar@entry=0) at ../../src/HelAmps_sm.h:279
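The backtrace lands in fpternary, the elementwise mask ? a : b select that fpmax is built on. A minimal self-contained sketch of that pattern, assuming gcc/clang vector extensions (the signatures and the neppV=8 width are taken from the trace; this is a sketch, not the actual mgOnGpuVectors.h source):

```cpp
#include <cassert>

constexpr int neppV = 8; // assumption: 8 floats per SIMD vector, as in the neppV=8 trace

// gcc/clang vector-extension types (sketch; real types live in mgOnGpuVectors.h)
typedef float fptype_v __attribute__( ( vector_size( neppV * sizeof( float ) ) ) );
typedef int bool_v __attribute__( ( vector_size( neppV * sizeof( int ) ) ) );

// Elementwise "mask ? a : b" with a scalar b, as at mgOnGpuVectors.h:490
inline fptype_v fpternary( const bool_v& mask, const fptype_v& a, const float& b )
{
  fptype_v out;
  for( int i = 0; i < neppV; i++ ) out[i] = ( mask[i] ? a[i] : b );
  return out;
}

// Elementwise max of a vector and a scalar, built on fpternary (frame 1 of the trace)
inline fptype_v fpmax( const fptype_v& a, const float& b )
{
  const fptype_v bv = fptype_v{} + b;   // broadcast the scalar into a vector
  return fpternary( a > bv, a, b );     // float comparison yields an int vector mask
}
```

The point of the pattern is that the comparison produces an integer mask vector, so the select itself does no floating-point arithmetic; the FPE reported here therefore suggests the compiler speculated neighbouring FP work into this code.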
…icx with another volatile for square roots (add also a volatile fpsqrt)

Now ixxxxx succeeds and the FPE moves to vxxxxx

make cleanall; make -j -f cudacpp.mk FPTYPE=f AVX=avx2
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
[==========] Running 6 tests from 6 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
Floating Point Exception (CPU neppV=8): 'vxxxxx' ievt=0
…icx with another volatile for square roots

Now vxxxxx succeeds and the FPE moves to oxxxxx

make cleanall; make -j -f cudacpp.mk FPTYPE=f AVX=avx2
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
[==========] Running 6 tests from 6 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
Floating Point Exception (CPU neppV=8): 'oxxxxx' ievt=0
…icx with another volatile for square roots (as done in ixxxxx)
Now oxxxxx succeeds and the FPE moves to another part of ixxxxx
make cleanall; make -j -f cudacpp.mk FPTYPE=f AVX=avx2
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
[==========] Running 6 tests from 6 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
Floating Point Exception (CPU neppV=8): 'ixxxxx' ievt=16
(gdb) where
0x00000000004135a4 in mg5amcCpu::ixxxxx<mg5amcCpu::KernelAccessMomenta<false>, mg5amcCpu::KernelAccessWavefunctions<false> > (
momenta=momenta@entry=0x103cf00, fmass=500, nhel=nhel@entry=1, nsf=nsf@entry=-1, wavefunctions=wavefunctions@entry=0x7fffffff9a80,
ipar=ipar@entry=0) at ../../src/HelAmps_sm.h:208
208 const fptype_sv pp = fpmin( pvec0, fpsqrt( pvec1 * pvec1 + pvec2 * pvec2 + pvec3 * pvec3 ) );
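The recurring fix in these commits routes the sqrt argument through a volatile. A hypothetical scalar sketch of the idea (the real code operates on the vector types in mgOnGpuVectors.h):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar sketch of the "volatile" workaround: copying the
// argument through a volatile acts as an optimization barrier, which
// (assumption) prevents icx from hoisting or speculating the sqrt into
// a code path where the argument could be negative and raise an FPE.
inline float fpsqrt_volatile( const float v )
{
  volatile float tmp = v; // barrier: the compiler must materialize the value here
  return std::sqrt( tmp );
}
```

This trades a little performance (the value is forced through memory) for well-defined evaluation order, which is why the commits apply it surgically, one sqrt call site at a time.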
…xxx for icx with another volatile for square roots
Now the FPE moves to another part of ixxxxx
make cleanall; make -j -f cudacpp.mk FPTYPE=f AVX=avx2
(gdb) where
0 0x00000000004137ff in mg5amcCpu::fpsqrt(float __vector(8) const&) (v=...) at ../../src/mgOnGpuVectors.h:253
1 mg5amcCpu::ixxxxx<mg5amcCpu::KernelAccessMomenta<false>, mg5amcCpu::KernelAccessWavefunctions<false> > (
momenta=momenta@entry=0x103cf00, fmass=<optimized out>, nhel=nhel@entry=1, nsf=nsf@entry=-1,
wavefunctions=wavefunctions@entry=0x7fffffff9a80, ipar=ipar@entry=0) at ../../src/HelAmps_sm.h:264
…icx with another volatile for square roots (add also a volatile fpsqrt)
Now ixxxxx succeeds and the FPE moves to another part of vxxxxx
make cleanall; make -j -f cudacpp.mk FPTYPE=f AVX=avx2
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
[==========] Running 6 tests from 6 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
Floating Point Exception (CPU neppV=8): 'vxxxxx' ievt=16
(gdb) where
0 0x000000000041114c in mg5amcCpu::vxxxxx<mg5amcCpu::KernelAccessMomenta<false>, mg5amcCpu::KernelAccessWavefunctions<false> > (
momenta=momenta@entry=0x103cf00, vmass=500,
vmass@entry=<error reading variable: That operation is not available on integers of more than 8 bytes.>, nhel=nhel@entry=1,
nsv=nsv@entry=-1, wavefunctions=wavefunctions@entry=0x7fffffff96c0, ipar=ipar@entry=0) at ../../src/HelAmps_sm.h:464
1 0x00000000004431e4 in SIGMA_SM_GG_TTX_CPU_XXX_testxxx_Test::TestBody (this=<optimized out>) at testxxx.cc:372
…icx with another volatile for square roots
Now vxxxxx succeeds and the FPE moves to another part of oxxxxx
make cleanall; make -j -f cudacpp.mk FPTYPE=f AVX=avx2
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
[==========] Running 6 tests from 6 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
Floating Point Exception (CPU neppV=8): 'oxxxxx' ievt=16
(gdb) where
0 0x00000000004126d5 in mg5amcCpu::oxxxxx<mg5amcCpu::KernelAccessMomenta<false>, mg5amcCpu::KernelAccessWavefunctions<false> > (
momenta=momenta@entry=0x103cf00, fmass=500,
fmass@entry=<error reading variable: That operation is not available on integers of more than 8 bytes.>, nhel=nhel@entry=1,
nsf=nsf@entry=-1, wavefunctions=wavefunctions@entry=0x7fffffff9c80, ipar=ipar@entry=0) at ../../src/HelAmps_sm.h:622
1 0x00000000004434c6 in SIGMA_SM_GG_TTX_CPU_XXX_testxxx_Test::TestBody (this=<optimized out>) at testxxx.cc:390
…xxx for icx with another volatile for square roots
Now the FPE moves to another part of oxxxxx
make cleanall; make -j -f cudacpp.mk FPTYPE=f AVX=avx2
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
[==========] Running 6 tests from 6 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
Floating Point Exception (CPU neppV=8): 'oxxxxx' ievt=16
(gdb) where
0 0x000000000041291f in mg5amcCpu::fpsqrt(float __vector(8) const&) (v=...) at ../../src/mgOnGpuVectors.h:253
1 mg5amcCpu::oxxxxx<mg5amcCpu::KernelAccessMomenta<false>, mg5amcCpu::KernelAccessWavefunctions<false> > (
momenta=momenta@entry=0x103cf00, fmass=<error reading variable: That operation is not available on integers of more than 8 bytes.>,
nhel=nhel@entry=1, nsf=nsf@entry=-1, wavefunctions=wavefunctions@entry=0x7fffffff9c80, ipar=ipar@entry=0)
at ../../src/HelAmps_sm.h:679
Revert "[icx] in gg_tt.mad cudacpp.mk, switch on -g (while keeping -O3) to debug FPE madgraph5#736" This reverts commit 9a5a5bc.
…test.mk has also changed now
./tput/teeThroughputX.sh -ggtt -flt -makej -makeclean
…bug FPE madgraph5#736

make cleanall; make -j -f cudacpp.mk FPTYPE=f AVX=sse4
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
[==========] Running 6 tests from 6 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
Floating Point Exception (CPU neppV=4): 'unknown' ievt=-1
(gdb) where
0 0x00000000004173f8 in SIGMA_SM_GG_TTX_CPU_XXX_testxxx_Test::TestBody (this=<optimized out>) at testxxx.cc:133
1 0x00000000004c6ffc in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ()
… do not understand why it gives an FPE, honestly)
Now another FPE in sse4 moves again into ixxx...
make cleanall; make -j -f cudacpp.mk FPTYPE=f AVX=sse4
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
[==========] Running 6 tests from 6 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
Floating Point Exception (CPU neppV=4): 'ixxxxx' ievt=0
(gdb) where
0 0x0000000000411a60 in mg5amcCpu::fpsqrt(float __vector(4) const volatile&) (v=...) at ../../src/mgOnGpuVectors.h:244
1 mg5amcCpu::ixxxxx<mg5amcCpu::KernelAccessMomenta<false>, mg5amcCpu::KernelAccessWavefunctions<false> > (
momenta=momenta@entry=0x101b8c0, fmass=<optimized out>, nhel=nhel@entry=1, nsf=nsf@entry=-1,
wavefunctions=wavefunctions@entry=0x7fffffff9de0, ipar=ipar@entry=0) at ../../src/HelAmps_sm.h:288
2 0x000000000043f30f in SIGMA_SM_GG_TTX_CPU_XXX_testxxx_Test::TestBody (this=<optimized out>) at testxxx.cc:340
…qrt (I do not understand why it gives an FPE, honestly)
This now fixes the FPTYPE=f AVX=sse4 runTest.exe on icx...
./tput/teeThroughputX.sh -ggtt -flt -makej -makeclean
Revert "[icx] in gg_tt.mad cudacpp.mk, switch on -g (while keeping -O3) to debug FPE madgraph5#736" This reverts commit e3af119.
… etc - and include formatting fixes
This has become another very complex FPE fixing campaign... and not only in ixx/oxx/vxx. The icx optimizer, now based on clang17, is doing even stranger things. In some cases I have data which is supposed to be exactly 0, and I am unable to take a sqrt of it, so I do it ONLY if it is GREATER than 0. Rerunning all tests tonight; hopefully it will look better.
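The guard described above can be sketched as follows (names hypothetical, not the actual patch):

```cpp
#include <cassert>
#include <cmath>

// Sketch of the guard described above: take the sqrt ONLY if the value
// is strictly GREATER than 0; data that is mathematically 0 (or has
// rounded to a tiny negative value) maps to 0 without touching sqrt.
inline float safeSqrt( const float v )
{
  return ( v > 0.f ? std::sqrt( v ) : 0.f );
}
```

Unlike the volatile workaround, this changes the computed value only for negative inputs (where sqrt would have trapped anyway), so it is safe for data that should be exactly 0.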
…nt Exception" errors have disappeared

STARTED AT Wed Jul 26 01:32:01 AM CEST 2023
./tput/teeThroughputX.sh -mix -hrd -makej -eemumu -ggtt -ggttg -ggttgg -gqttq -ggttggg -makeclean
ENDED(1) AT Wed Jul 26 05:31:18 AM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -flt -hrd -makej -eemumu -ggtt -ggttgg -inlonly -makeclean
ENDED(2) AT Wed Jul 26 05:58:22 AM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -makej -eemumu -ggtt -ggttg -gqttq -ggttgg -ggttggg -flt -bridge -makeclean
ENDED(3) AT Wed Jul 26 06:12:34 AM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -rmbhst
ENDED(4) AT Wed Jul 26 06:16:54 AM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -curhst
ENDED(5) AT Wed Jul 26 06:21:13 AM CEST 2023 [Status=0]

Example diff:
-Floating Point Exception (CPU neppV=4): 'unknown' ievt=-1
+[ PASSED ] 6 tests.

There is some degradation of performance, but only for simple 2->2 processes. For more complex processes, performance is essentially the same. Somewhat surprisingly, double (double FP) results do not seem to be affected? Only float (single FP) results seem to show some difference in performance and disassembly symbols?
STARTED AT Wed Jul 26 06:25:39 AM CEST 2023
ENDED AT Wed Jul 26 10:43:48 AM CEST 2023 Status=0

24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_m_inl0_hrd0.txt
1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_d_inl0_hrd0.txt
1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_f_inl0_hrd0.txt
1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_m_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0.txt

There is maybe a tiny degradation of performance, but only for the simpler physics processes
…est.mk to keep backward-compatibility to epoch1/epoch2 of gtest directory names madgraph5#125 and madgraph5#738
…h1/epoch2 fixes to the other 13 processes

for f in $(gitls */SubProcesses/cudacpp.mk); do \cp gg_tt.mad/SubProcesses/cudacpp.mk $f; done
for f in $(gitls */test/cudacpp_test.mk); do \cp gg_tt.mad/test/cudacpp_test.mk $f; done
I have rerun all tests overnight - now all FPEs are fixed in icpx. I have also made minor changes in the gtest handling of compiler-specific build directories (#125 and #738) to allow backward compatibility to epoch1/epoch2, which was previously failing the CI. This is now ready; I will self-merge and document the full contents a posteriori.
I have merged this MR #737 with another set of comprehensive FPE fixes. Here is some documentation:
It should also be noted that another whole set of FPEs is still pending in #733. These are invalid (sqrt?), underflow and overflow FPEs. Initial investigations in WIP MR #706, in any case, suggest that this is due to some bug in coupling propagation in non-SM processes, rather than to the ixx/oxx/vxx functions. To be followed up... this is high priority as it is a blocker for ATLAS pp->ttW. Voila, that's all for the documentation of this MR. cc @roiser @oliviermattelaer @hageboeck @Jooorgen @zeniheisser
…ng icx madgraph5#737 into upstream/master)
This MR includes a few minor fixes for new platforms (#734), in particular icx2023.2 and clang16.