Running the same MAR experiment can result in vastly different errors.
These tests were run on the spiritX machine, with mar-tools: c08015609ff43804fbf78ba530995e4f3c1af887 and marv3-future: 64c75700aecc3d790e57a49f8a4a07f1686211aa (mariso_cross_compilation branch).
Example 1:
Call of PHYrad_CEP_mp OUT : 1/ 2/2015 23: 0: 0
OUTice x-hourly outputs 8 0.00000000
Writing of OUTice in ICE.20150201.ANx.nc: 2/ 2/2015 0: 0: 0
MAR time : 2/ 2/2015 0: 0: 0
Real time : 01/24/2023 11:37:20
Step time : dt=120.0, dtHyd= 60.0, dtDiff= 60.0, dtRadi= 3600. s, nt_Mix= 2
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x14f2f4182d21 in ???
#1 0x14f2f4181ef5 in ???
#2 0x14f2f3e1b08f in ???
at /build/glibc-SzIz7B/glibc-2.31/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
#3 0x55e74041bf50 in ???
#4 0x14f2f443c8e5 in ???
#5 0x55e74041cbee in ???
#6 0x55e74046c3a7 in ???
#7 0x55e7403b23ee in ???
#8 0x14f2f3dfc082 in __libc_start_main
at ../csu/libc-start.c:308
#9 0x55e7403b241d in ???
#10 0xffffffffffffffff in ???
srun: error: spiritx64-6: task 0: Segmentation fault (core dumped)
Example 2:
ERROR filatmo 2015 2 1 0 6 for ( 4,123,19) -36.=> -9.
ERROR filatmo 2015 2 1 0 6 for ( 4,124,19) -46.=> -28.
STOP in filatmo.f: NaN on pixel(i,j,k) 4 126 19
-1299.78223 6.64605951 8.56959915
Example 3:
Current / 1-Feb-2015 12: 2: 0 t = 2721720
2nd VBC / 1-Feb-2015 18 /(6) t = 2743200
CRASH1 in sisvat_qso.f on pixel (i,j,n) 25 96 1
decrease your time step or increase ntphys and ntdiff in time_steps.f
Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG IEEE_DIVIDE_BY_ZERO IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
On Spirit:
Current / 1-Feb-2015 12: 2: 0 t = 2721720
2nd VBC / 1-Feb-2015 18 /(6) t = 2743200
CRASH1 in sisvat_qso.f on pixel (i,j,n) 116 13 1
decrease your time step or increase ntphys and ntdiff in time_steps.f
Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG IEEE_DIVIDE_BY_ZERO IEEE_OVERFLOW_FLAG IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
Example 4:
On Spirit with Intel (spirit_intel configuration):
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
mar 0000000000A83933 Unknown Unknown Unknown
libpthread-2.31.s 00001482CE15F420 Unknown Unknown Unknown
libiomp5.so 00001482CE2AA98D Unknown Unknown Unknown
libiomp5.so 00001482CE2A860E Unknown Unknown Unknown
libiomp5.so 00001482CE2A8224 Unknown Unknown Unknown
libiomp5.so 00001482CE29FE03 Unknown Unknown Unknown
libiomp5.so 00001482CE2A07BB Unknown Unknown Unknown
mar 0000000000AAA960 Unknown Unknown Unknown
mar 0000000000A6F735 Unknown Unknown Unknown
mar 00000000005E218D outice_ 190 outice.f90
mar 000000000041C10A MAIN__ 2460 mar.f90
mar 000000000040DB22 Unknown Unknown Unknown
libc-2.31.so 00001482CDE2C083 __libc_start_main Unknown Unknown
mar 000000000040DA2E Unknown Unknown Unknown
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
mar 0000000000A839D9 Unknown Unknown Unknown
libpthread-2.31.s 00001482CE15F420 Unknown Unknown Unknown
libiomp5.so 00001482CE2A5207 Unknown Unknown Unknown
libiomp5.so 00001482CE29DB0F Unknown Unknown Unknown
libiomp5.so 00001482CE29D89E Unknown Unknown Unknown
libiomp5.so 00001482CE22BDE1 Unknown Unknown Unknown
ld-2.31.so 00001482CEEA6F8D Unknown Unknown Unknown
libc-2.31.so 00001482CDE4E8A7 Unknown Unknown Unknown
libc-2.31.so 00001482CDE4EA60 on_exit Unknown Unknown
mar 0000000000A7F0E1 Unknown Unknown Unknown
mar 0000000000A83933 Unknown Unknown Unknown
libpthread-2.31.s 00001482CE15F420 Unknown Unknown Unknown
libiomp5.so 00001482CE2AA98D Unknown Unknown Unknown
libiomp5.so 00001482CE2A860E Unknown Unknown Unknown
libiomp5.so 00001482CE2A8224 Unknown Unknown Unknown
libiomp5.so 00001482CE29FE03 Unknown Unknown Unknown
libiomp5.so 00001482CE2A07BB Unknown Unknown Unknown
mar 0000000000AAA960 Unknown Unknown Unknown
mar 0000000000A6F735 Unknown Unknown Unknown
mar 00000000005E218D outice_ 190 outice.f90
mar 000000000041C10A MAIN__ 2460 mar.f90
mar 000000000040DB22 Unknown Unknown Unknown
libc-2.31.so 00001482CDE2C083 __libc_start_main Unknown Unknown
mar 000000000040DA2E Unknown Unknown Unknown
srun: error: spirit64-01: task 0: Exited with exit code 174
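Since the failure mode changes from run to run, it may help to tally which one each run hits. The following is a minimal sketch, not part of mar-tools: it matches log text against the three signatures that appear in the excerpts above (`SIGSEGV`, `STOP in filatmo.f: NaN`, `CRASH1 in sisvat_qso.f`) and pulls out the reported pixel indices where present. The function names `classify_log` and `crash_pixel` and the labels are hypothetical and chosen only for illustration.

```python
import re

# Signatures copied from the log excerpts in this report.
# The labels on the left are illustrative names, not MAR terminology.
SIGNATURES = [
    ("segfault", re.compile(r"SIGSEGV")),
    ("filatmo_nan", re.compile(r"STOP in filatmo\.f: NaN")),
    ("sisvat_qso_crash", re.compile(r"CRASH1 in sisvat_qso\.f")),
]

# Matches "pixel(i,j,k) 4 126 19" and "pixel (i,j,n) 25 96 1".
PIXEL = re.compile(r"pixel\s*\(i,j,[kn]\)\s+(\d+)\s+(\d+)\s+(\d+)")


def classify_log(text):
    """Return the label of the first entry in SIGNATURES that matches text."""
    for label, pattern in SIGNATURES:
        if pattern.search(text):
            return label
    return "unknown"


def crash_pixel(text):
    """Return the reported pixel indices as a tuple of ints, or None."""
    match = PIXEL.search(text)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())
```

Applied to the saved output of each run (for example with `classify_log(open(path).read())`, where `path` is whatever file the job output was redirected to), this gives one label per run that can be counted across repeated submissions.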