forked from madgraph5/madgraph4gpu
CUDA CI test #9
Merged
…_ZERO (see firemodels/fds/issues/5638 on gh) with -ffpe flags
However, the build gives this warning:
ccache /cvmfs/sft.cern.ch/lcg/releases/gcc/11.2.0-ad950/x86_64-centos8/bin/g++ -O3 -std=c++17 -I. -I../../src -I../../../../../test/googletest/install/include -I../../../../../test/googletest/install/include -Wall -Wshadow -Wextra -ffast-math -fopenmp -march=skylake-avx512 -mprefer-vector-width=256 -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -ffpe-trap=invalid,zero,overflow -ffpe-summary=none -fPIC -c testxxx.cc -o testxxx.o
cc1plus: warning: command-line option ‘-ffpe-trap=invalid,zero,overflow’ is valid for Fortran but not for C++
cc1plus: warning: command-line option ‘-ffpe-summary=none’ is valid for Fortran but not for C++
I will revert
Revert "[fpe] in ggttsa cudacpp.mk, try to debug madgraph5#701 IEEE_DIVIDE_BY_ZERO (see firemodels/fds/issues/5638 on gh) with -ffpe flags" This reverts commit d75e426.
…als to debug madgraph5#701 (see https://stackoverflow.com/a/17473528)
This works as expected:
[avalassi@itscrd80 gcc11.2/cvmfs] /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/gg_tt.sa/SubProcesses/P1_Sigma_sm_gg_ttx> ./runTest.exe --gtest_filter=*xxx
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
Note: Google Test filter = *xxx
[==========] Running 2 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
Floating point exception (core dumped)
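For readers unfamiliar with trapping FPEs in C++ (where the Fortran-only -ffpe-trap flag is not available), here is a minimal sketch of one common way to do it on Linux with glibc. It is an assumption that the commit above does something similar; the class and function names are hypothetical and not taken from the PR. Only feenableexcept, fedisableexcept and fegetexcept are real calls, and they are glibc extensions, so this will not build on every platform.

```cpp
#include <cassert>
#include <fenv.h> // feenableexcept, fedisableexcept, fegetexcept: glibc extensions

// Hypothetical RAII helper (not from the PR): turns selected floating point
// exceptions into SIGFPE for the lifetime of the object, then restores the
// trap mask that was active before.
class ScopedFpeTraps
{
public:
  explicit ScopedFpeTraps( int excepts = FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW )
    : m_previous( fegetexcept() )
  {
    feclearexcept( FE_ALL_EXCEPT ); // drop stale flags raised before trapping was on
    const int status = feenableexcept( excepts );
    assert( status != -1 ); // -1 signals failure
    (void)status;
  }
  ~ScopedFpeTraps()
  {
    fedisableexcept( FE_ALL_EXCEPT );
    feenableexcept( m_previous );
  }
  ScopedFpeTraps( const ScopedFpeTraps& ) = delete;
  ScopedFpeTraps& operator=( const ScopedFpeTraps& ) = delete;

private:
  int m_previous; // trap mask that was enabled before this scope
};

// Returns true if every exception in 'excepts' currently traps.
inline bool fpeTrapsEnabled( int excepts )
{
  return ( fegetexcept() & excepts ) == excepts;
}
```

With a guard like this in scope, a division by zero or an invalid operation terminates the process with SIGFPE, which matches the "Floating point exception (core dumped)" output above. A scoped guard is only one design option; enabling the traps once at program start works just as well for a test executable.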
…signal handler for madgraph5#701
[avalassi@itscrd80 gcc11.2/cvmfs] /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/gg_tt.sa/SubProcesses/P1_Sigma_sm_gg_ttx> make -j AVX=512y
...
[avalassi@itscrd80 gcc11.2/cvmfs] /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/gg_tt.sa/SubProcesses/P1_Sigma_sm_gg_ttx> ./runTest.exe --gtest_filter=*xxx
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
Note: Google Test filter = *xxx
[==========] Running 2 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
Floating Point Exception (CPU neppV=4): 'ipzxxx'
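The message above names the function that was running when the FPE fired. A rough sketch of how a SIGFPE handler can report such a label is given below. This is not the handler from the PR: all identifiers are hypothetical, and the idea that the test stores a label before each call is an assumption based on the printed output.

```cpp
#include <cassert>
#include <csignal>
#include <cstring>
#include <unistd.h> // write, _exit (POSIX, async-signal-safe)

// All names below are hypothetical, not taken from the PR.
static const char* volatile g_fpeLabel = "unknown"; // function currently under test
static volatile std::sig_atomic_t g_fpeCaught = 0;  // set to 1 by the handler
static volatile std::sig_atomic_t g_fpeExit = 1;    // 1: terminate inside the handler

// Writes a C string to stderr using only async-signal-safe calls.
static void fpeWrite( const char* text )
{
  const ssize_t n = write( STDERR_FILENO, text, std::strlen( text ) );
  (void)n;
}

extern "C" void fpeSignalHandler( int /*signum*/ )
{
  g_fpeCaught = 1;
  fpeWrite( "Floating Point Exception: '" );
  fpeWrite( g_fpeLabel );
  fpeWrite( "'\n" );
  // Returning from a handler for a hardware-generated SIGFPE is undefined
  // behaviour, so normal use terminates here. Returning is only safe when the
  // signal was sent with std::raise, e.g. in a demo or a unit test.
  if( g_fpeExit ) _exit( 1 );
}

// Installs the handler for SIGFPE.
static void installFpeHandler()
{
  struct sigaction action;
  std::memset( &action, 0, sizeof( action ) );
  action.sa_handler = fpeSignalHandler;
  sigemptyset( &action.sa_mask );
  const int status = sigaction( SIGFPE, &action, nullptr );
  assert( status == 0 );
  (void)status;
}
```

In a test, one would set g_fpeLabel to the name of the function about to be called (for instance "ipzxxx") before each call, so that the handler can say where the exception came from. The handler avoids printf and iostreams on purpose, since those are not async-signal-safe.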
…CPP_RUNTIME_DISABLEFPE is set
Note: as observed last week, a debug build triggers an FPE exception already in ixxxxx
[avalassi@itscrd80 gcc11.2/cvmfs] /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/gg_tt.sa/SubProcesses/P1_Sigma_sm_gg_ttx> ./runTest.exe
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
[==========] Running 3 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
Floating Point Exception (CPU neppV=4): 'ixxxxx'
Conversely, in the same debug build, disabling FPEs with the env variable gives a successful test
[avalassi@itscrd80 gcc11.2/cvmfs] /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/gg_tt.sa/SubProcesses/P1_Sigma_sm_gg_ttx> CUDACPP_RUNTIME_DISABLEFPE=1 ./runTest.exe
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
[==========] Running 3 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
[ OK ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx (0 ms)
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX (0 ms total)
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_MISC
[ RUN ] SIGMA_SM_GG_TTX_CPU_MISC.testmisc
[ OK ] SIGMA_SM_GG_TTX_CPU_MISC.testmisc (0 ms)
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_MISC (0 ms total)
[----------] 1 test from SIGMA_SM_GG_TTX_CPU/MadgraphTest
[ RUN ] SIGMA_SM_GG_TTX_CPU/MadgraphTest.CompareMomentaAndME/0
INFO: Opening reference file ../../test/ref/dump_CPUTest.Sigma_sm_gg_ttx.txt
INFO: The application is built for skylake-avx512 (AVX512VL) and the host supports it
INFO: The application is built for skylake-avx512 (AVX512VL) and the host supports it
[ OK ] SIGMA_SM_GG_TTX_CPU/MadgraphTest.CompareMomentaAndME/0 (34 ms)
[----------] 1 test from SIGMA_SM_GG_TTX_CPU/MadgraphTest (34 ms total)
[----------] Global test environment tear-down
[==========] 3 tests from 3 test suites ran. (35 ms total)
[ PASSED ] 3 tests.
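The opt-out through CUDACPP_RUNTIME_DISABLEFPE shown above can be sketched as a small check of the environment. The variable name comes from the commit message; the function name is hypothetical, and treating any non-empty value as "disable" is an assumption, since the PR text only shows the value 1.

```cpp
#include <cassert>
#include <cstdlib> // std::getenv (setenv/unsetenv are POSIX, used only when testing)

// Hypothetical helper (not from the PR): returns true if the user asked for
// FPE traps to stay off. Assumption: any non-empty value disables the traps.
inline bool fpeDisabledByEnv()
{
  const char* value = std::getenv( "CUDACPP_RUNTIME_DISABLEFPE" );
  return value != nullptr && value[0] != '\0';
}
```

A test executable would call such a check once at start-up and enable the traps only if it returns false, which keeps trapping on by default while still allowing a quick comparison run without it.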
No change in runTest behaviour, FPEs by default, succeeds if FPEs disabled
…et cast) No change in runTest behaviour, FPEs by default, succeeds if FPEs disabled
…handler). This also includes a resetHstMomentaToPar0, which is commented out for the moment. The idea was to modify the momenta before each xxx call, to ensure that they are all consistent. But I will instead implement a more solid fix.
No change in runTest behaviour, FPEs by default, succeeds if FPEs disabled
…dgraph5#701 in function ixxxxx
This builds ok
In debug mode this fails like this:
[==========] Running 3 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
nsp=-1 ievt=0: 500, 0, 0, 500,
IXXXXX: sqp0p3={ -0, -0, -0, -0 }
Floating Point Exception (CPU neppV=4): 'ixxxxx' ievt=0
Note: last week the sqp0p3 were not all 0. I am not sure what I was doing (I was using hstReset?).
Anyway: I will revert this commit and the previous one. We need a much more solid fix in all xxx functions.
…l start from scratch
Revert "[fpe] in ggtt.sa HelAmps_sm.h, add some debugging printouts for ixxxxx" This reverts commit fdacc5e
Revert "[fpe] in ggtt.sa HelAmps_sm.h, first (OLD!) attempt of BUG FIX FOR madgraph5#701 in function ixxxxx" This reverts commit 7674824.
The build fails because maskand is also defined in testmisc.cc
… mgOnGpuVectors.h now
This now shows (in debug builds) that the first test executed is ixxxxx and that it immediately fails with an FPE
[==========] Running 3 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
nsp=-1 ievt=0: 500, 0, 0, 500,
Prepare test ixxxxx ievt=0
Floating Point Exception (CPU neppV=4): 'ixxxxx' ievt=0
…ptype& r )" to create cx vectors from fp scalars
…ion ixxxxx
This builds and runs ok. The FPE (always in debug mode) is now moved from ixxxxx to the next ipzxxx
[==========] Running 3 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
nsp=-1 ievt=0: 500, 0, 0, 500,
Prepare test ixxxxx ievt=0
Prepare test ipzxxx ievt=0
Floating Point Exception (CPU neppV=4): 'ipzxxx' ievt=0
…ginning of each test (prepare to modify momenta for ipzxxx) No change in runTest behaviour, FPEs by default in ipzxxx, succeeds if FPEs disabled
…respecting the relevant assumptions
Assumption example for ipzxxx: (FMASS == 0) and (PX == PY == 0 and E == +PZ > 0)
This is done by testing one ievt and copying all momenta to that ievt
NB: after adding the workaround for ipzxxx, now the test fails in vxxxxx, which is the real issue in madgraph5#701
[==========] Running 3 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
nsp=-1 ievt=0: 500, 0, 0, 500,
Prepare test ixxxxx ievt=0
Prepare test ipzxxx ievt=0
Prepare test vxxxxx ievt=0
Floating Point Exception (CPU neppV=4): 'vxxxxx' ievt=0
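The ipzxxx assumption quoted above can be written down as a simple predicate. The sketch below is illustrative only: the struct and function names are hypothetical, and it assumes that the printed momentum "500, 0, 0, 500" is ordered as (E, PX, PY, PZ).

```cpp
#include <cassert>

// Hypothetical four-momentum type; component order (E, PX, PY, PZ) is assumed.
struct Momentum4
{
  double e;
  double px;
  double py;
  double pz;
};

// Returns true if the inputs respect the assumption stated for ipzxxx:
// (FMASS == 0) and (PX == PY == 0 and E == +PZ > 0).
inline bool satisfiesIpzxxxAssumptions( const Momentum4& p, double fmass )
{
  return fmass == 0. && p.px == 0. && p.py == 0. && p.e == p.pz && p.pz > 0.;
}
```

For the momentum in the log, satisfiesIpzxxxAssumptions( { 500., 0., 0., 500. }, 0. ) holds, whereas a particle moving along -z or with non-zero mass does not qualify. A test harness could use such a check to pick the event whose momenta are copied to all other events before calling ipzxxx.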
…ion vxxxxx
This builds and runs ok. The FPE (always in debug mode) is now moved from vxxxxx to the next oxxxxx
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
[==========] Running 3 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
nsp=-1 ievt=0: 500, 0, 0, 500,
Prepare test ixxxxx ievt=0
Prepare test ipzxxx ievt=0
Prepare test vxxxxx ievt=0
Prepare test sxxxxx ievt=0
Prepare test oxxxxx ievt=0
Floating Point Exception (CPU neppV=4): 'oxxxxx' ievt=0
NB1: This also adds LIBFLAGS to link command for shared libraries
This is needed to avoid "hidden symbol `__gcov_init' in ...libgcov.a(_gcov.o) is referenced by DSO" errors
NB2: I will not add a gcov target to .mad makefiles (they have no debug target either yet)
…make clean' Revert "[fpe] in ggt.sa .gitignore, add gcov suffixes to gitignore" This reverts commit eb5594d.
…ion oxxxxx
This builds ok. The FPE (always in debug mode) is now moved from oxxxxx to the next opzxxx
[==========] Running 3 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
nsp=-1 ievt=0: 500, 0, 0, 500,
Prepare test ixxxxx ievt=0
Prepare test ipzxxx ievt=0
Prepare test vxxxxx ievt=0
Prepare test sxxxxx ievt=0
Prepare test oxxxxx ievt=0
Prepare test opzxxx ievt=0
Floating Point Exception (CPU neppV=4): 'opzxxx' ievt=0
HOWEVER, I introduced a functional bug in oxxxxx - the test fails if I disable FPEs
It builds, but the tests still fail
NB: there are two different sets of ip and im in oxxxxx, depending on whether pp=0 or pp>0! (And I should also check ixxxxx)
Revert "[icx] in gg_tt.mad cudacpp.mk, switch on -g (while keeping -O3) to debug FPE madgraph5#736" This reverts commit 9a5a5bc.
…test.mk has also changed now
./tput/teeThroughputX.sh -ggtt -flt -makej -makeclean
…bug FPE madgraph5#736
make cleanall; make -j -f cudacpp.mk FPTYPE=f AVX=sse4
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
[==========] Running 6 tests from 6 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
Floating Point Exception (CPU neppV=4): 'unknown' ievt=-1
(gdb) where
0 0x00000000004173f8 in SIGMA_SM_GG_TTX_CPU_XXX_testxxx_Test::TestBody (this=<optimized out>) at testxxx.cc:133
1 0x00000000004c6ffc in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ()
… do not understand why it gives an FPE, honestly)
Now the FPE in the sse4 build moves again, this time to ixxx...
make cleanall; make -j -f cudacpp.mk FPTYPE=f AVX=sse4
Running main() from /data/avalassi/GPU2023/madgraph4gpuX/test/googletest/googletest/src/gtest_main.cc
[==========] Running 6 tests from 6 test suites.
[----------] Global test environment set-up.
[----------] 1 test from SIGMA_SM_GG_TTX_CPU_XXX
[ RUN ] SIGMA_SM_GG_TTX_CPU_XXX.testxxx
Floating Point Exception (CPU neppV=4): 'ixxxxx' ievt=0
(gdb) where
0 0x0000000000411a60 in mg5amcCpu::fpsqrt(float __vector(4) const volatile&) (v=...) at ../../src/mgOnGpuVectors.h:244
1 mg5amcCpu::ixxxxx<mg5amcCpu::KernelAccessMomenta<false>, mg5amcCpu::KernelAccessWavefunctions<false> > (
momenta=momenta@entry=0x101b8c0, fmass=<optimized out>, nhel=nhel@entry=1, nsf=nsf@entry=-1,
wavefunctions=wavefunctions@entry=0x7fffffff9de0, ipar=ipar@entry=0) at ../../src/HelAmps_sm.h:288
2 0x000000000043f30f in SIGMA_SM_GG_TTX_CPU_XXX_testxxx_Test::TestBody (this=<optimized out>) at testxxx.cc:340
…qrt (I do not understand why it gives an FPE, honestly)
This now fixes the FPTYPE=f AVX=sse4 runTest.exe on icx...
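The commit message does not say why the sqrt raised an FPE, and the sketch below does not claim to explain that specific case. It only illustrates one general mechanism that is worth keeping in mind with masked, vector-style code: if sqrt is evaluated on every lane and the mask is applied afterwards, a negative value in a lane that is later discarded still raises FE_INVALID (and would trap if FPE traps are enabled). All names are hypothetical and plain loops over four floats stand in for real SIMD vectors.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

constexpr int kLanes = 4; // mimics neppV=4 from the logs

// Evaluates sqrt on every lane first and applies the mask (x > 0) afterwards.
// Returns true if FE_INVALID was raised along the way.
inline bool sqrtThenMask( const volatile float* in, volatile float* out )
{
  std::feclearexcept( FE_ALL_EXCEPT );
  for( int i = 0; i < kLanes; i++ )
  {
    const float x = in[i];
    volatile float root = std::sqrt( x ); // volatile: force the sqrt even if discarded
    out[i] = ( x > 0.f ? root : 0.f );
  }
  return std::fetestexcept( FE_INVALID ) != 0;
}

// Applies the mask to the input first, so sqrt only ever sees values >= 0.
// Returns true if FE_INVALID was raised along the way.
inline bool maskThenSqrt( const volatile float* in, volatile float* out )
{
  std::feclearexcept( FE_ALL_EXCEPT );
  for( int i = 0; i < kLanes; i++ )
  {
    const float x = in[i];
    const float safe = ( x > 0.f ? x : 0.f ); // replace masked-out lanes by 0
    out[i] = std::sqrt( safe );
  }
  return std::fetestexcept( FE_INVALID ) != 0;
}
```

Both functions produce the same output values, but only the first one sets the invalid flag for an input such as { 4, -1, 9, -16 }. Sanitising the input before the sqrt is one way to keep masked code compatible with FPE trapping; whether the fix in this PR follows that pattern is not stated in the source.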
./tput/teeThroughputX.sh -ggtt -flt -makej -makeclean
Revert "[icx] in gg_tt.mad cudacpp.mk, switch on -g (while keeping -O3) to debug FPE madgraph5#736" This reverts commit e3af119.
… etc - and include formatting fixes
…nt Exception" errors have disappeared
STARTED AT Wed Jul 26 01:32:01 AM CEST 2023
./tput/teeThroughputX.sh -mix -hrd -makej -eemumu -ggtt -ggttg -ggttgg -gqttq -ggttggg -makeclean
ENDED(1) AT Wed Jul 26 05:31:18 AM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -flt -hrd -makej -eemumu -ggtt -ggttgg -inlonly -makeclean
ENDED(2) AT Wed Jul 26 05:58:22 AM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -makej -eemumu -ggtt -ggttg -gqttq -ggttgg -ggttggg -flt -bridge -makeclean
ENDED(3) AT Wed Jul 26 06:12:34 AM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -rmbhst
ENDED(4) AT Wed Jul 26 06:16:54 AM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -curhst
ENDED(5) AT Wed Jul 26 06:21:13 AM CEST 2023 [Status=0]
Example diff:
-Floating Point Exception (CPU neppV=4): 'unknown' ievt=-1
+[ PASSED ] 6 tests.
There is some degradation of performance, but only for simple 2->2 processes. For more complex processes, performance is essentially the same.
Somewhat surprisingly, double (double FP) results do not seem to be affected? Only float (single FP) results seem to show some difference in performance and disassembly symbols?
STARTED AT Wed Jul 26 06:25:39 AM CEST 2023
ENDED AT Wed Jul 26 10:43:48 AM CEST 2023
Status=0
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_m_inl0_hrd0.txt
1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_d_inl0_hrd0.txt
1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_f_inl0_hrd0.txt
1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_m_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0.txt
There is maybe a tiny degradation of performance, but only for simpler physics processes
…est.mk to keep backward-compatibility to epoch1/epoch2 of gtest directory names madgraph5#125 and madgraph5#738
…h1/epoch2 fixes to the other 13 processes
for f in $(gitls */SubProcesses/cudacpp.mk); do \cp gg_tt.mad/SubProcesses/cudacpp.mk $f; done
for f in $(gitls */test/cudacpp_test.mk); do \cp gg_tt.mad/test/cudacpp_test.mk $f; done
Several fixes for icx2023.2 (including fixes for sqrt FPEs in ixx/oxx/vxx)
…ng of upstream/master
…(will revert the log)
Revert "[jthip] regenerate ggttgg.mad after merging upstream/master - all ok (will revert the log)" This reverts commit 9d5b6d9.
…to gpu_abstraction
No description provided.