Numerical differences between CPU and GPU versions
I ran some simple tests to check whether the GPU and CPU versions give similar results. All runs used the test/jeanzay/run.def
configuration.
Test info:
- Configuration: test/jeanzay/run.def
- Branch: master
- Commit: 2dffea37
- make_icosa used with flags:
  - CPU: -trunk -arch X64_JEANZAY -parallel mpi -with_xios -job 8
  - GPU: -trunk -arch JEANZAY_NVIDIA_ACC -parallel mpi -with_xios -job 8 (JEANZAY_NVIDIA_ACC arch files come from simple_physics)
Takeaways:
- The GPU runs on 1 and on 4 nodes/cards give exactly the same result
- The GPU result differs slightly from the CPU result (within reasonable limits, I think?):
(venv) (nccmp) [ump84rg@jean-zay-pp1: DYNAMICO]$ nccmp dynamico_2dffea37_test_jz_rundef_cpu.nc dynamico_2dffea37_test_jz_rundef_gpu_n4.nc -dfqS
Variable  Group  Count    Sum           AbsSum      Min           Max          Range        Mean          StdDev
ps        /      5384     -14.9062      42.0625     -0.0078125    0.0078125    0.015625     -0.00276862   0.00730615
PHI       /      240995   -289.124      995.945     -0.03125      0.03125      0.0625       -0.00119971   0.00794288
U         /      3884378  13.2657       20.5828     -7.05719e-05  0.000211783  0.000282355  3.41514e-06   1.54676e-05
V         /      5342716  -0.00125629   8.85545     -2.45898e-05  3.04542e-05  5.50441e-05  -2.35141e-10  2.81349e-06
P         /      69521    -146.721      433.955     -0.0078125    0.0078125    0.015625     -0.00211045   0.00628369
T         /      176129   -0.034256     3.55864     -3.05176e-05  3.05176e-05  6.10352e-05  -1.94494e-07  2.1429e-05
U850      /      137690   0.221722      0.407975    -2.00272e-05  5.51082e-05  7.51354e-05  1.6103e-06    6.6034e-06
V850      /      178103   1.28482e-05   0.221708    -1.5812e-05   1.57934e-05  3.16054e-05  7.21391e-11   2.20717e-06
T850      /      9287     0.0144196     0.212906    -3.05176e-05  3.05176e-05  6.10352e-05  1.55266e-06   2.41127e-05
OMEGA850  /      177948   -0.000341993  0.00401326  -7.33038e-07  1.27917e-06  2.01221e-06  -1.92187e-09  4.71969e-08
U500      /      124234   0.324376      0.616242    -2.86102e-05  0.000100374  0.000128984  2.61101e-06   1.21784e-05
V500      /      178091   1.06196e-05   0.367027    -2.04127e-05  2.01836e-05  4.05963e-05  5.96299e-11   3.6028e-06
T500      /      3666     -0.0134125    0.0699005   -3.05176e-05  3.05176e-05  6.10352e-05  -3.65861e-06  1.98467e-05
OMEGA500  /      177975   -0.000179265  0.00318635  -4.98723e-07  6.37956e-07  1.13668e-06  -1.00725e-09  3.11756e-08
q         /      2078690  -1.93187      7.37466     -0.000244141  0.00012207   0.000366211  -9.2937e-07   1.62199e-05
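One way to judge whether these differences are "within reasonable limits" is to compare the Min/Max columns above with the spacing of single-precision numbers (one ULP) at the magnitude of each field. The sketch below is only an illustration under assumptions that are not verified here: that the output fields are stored as float32, and that typical values are around 1e5 Pa for ps/P and around 280 K for T. The helper name float32_ulp is made up for this example. Under those assumptions, the largest differences for ps, P and T (0.0078125 and 3.05176e-05) would correspond to a single float32 rounding step.

```python
import math


def float32_ulp(x: float) -> float:
    """Spacing between adjacent float32 values near x (normal range only)."""
    if x == 0 or not math.isfinite(x):
        raise ValueError("x must be finite and non-zero")
    # abs(x) = m * 2**exponent with 0.5 <= m < 1
    _, exponent = math.frexp(abs(x))
    # float32 carries 24 significant bits, so one ULP is 2**(exponent - 24)
    return math.ldexp(1.0, exponent - 24)


# Assumed typical magnitudes (hypothetical, not read from the output files)
typical_values = {"ps / P (Pa)": 1.0e5, "T (K)": 280.0}

for name, value in typical_values.items():
    print(f"{name}: value ~ {value:g}, float32 ULP = {float32_ulp(value):g}")
```

If the fields are actually written in double precision, this argument does not apply and the differences would have to be judged against the physical scale of each variable instead.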
I am also attaching a visualization of the spatial differences for some variables (ps, U500, V500, W500, T500, OMEGA500, U850, V850, W850, T850, OMEGA850) at the first, middle, and last timestamps: spatial_differences_cpu_vs_gpu_n4_fml.pdf.
It was generated with the command:
python3 <DYNAMICO_PATH>/gitlab-ci/data-validation/plot_spatial_differences.py dynamico_2dffea37_test_jz_rundef_cpu.nc dynamico_2dffea37_test_jz_rundef_gpu_n4.nc --variables ps U500 V500 W500 T500 OMEGA500 U850 V850 W850 T850 OMEGA850 --ts_strategy fml --output spatial_differences_cpu_vs_gpu_n4_fml.pdf
Edited by Patryk Kiepas