Pierre-Yves Barriat a882c0589c Update 'eORCA025/eORCA025.L121-LUCIA00/README.md' 1 year ago


Running eORCA025.L121-LUCIA00

Source code

Download NEMO code 4.2.0

git clone --branch 4.2.0 https://forge.nemo-ocean.eu/nemo/nemo.git nemo_4.2.0

Extract and install XIOS (trunk)

svn co http://forge.ipsl.jussieu.fr/ioserver/svn/XIOS/trunk xios_trunk

The revision used here is 2482.
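The checkout command above fetches the current head of trunk. To reproduce the revision noted here, Subversion's `-r` option can pin it. This is a minimal sketch, assuming that 2482 is the XIOS trunk revision that was checked out; `xios_checkout` and `DRY_RUN` are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: check out XIOS trunk at a pinned revision.
# Set DRY_RUN=1 to print the command instead of running it.
xios_checkout() {
    rev="${1:-2482}"    # default: the revision noted in this README
    ${DRY_RUN:+echo} svn co -r "$rev" \
        http://forge.ipsl.jussieu.fr/ioserver/svn/XIOS/trunk xios_trunk
}
```

Running `DRY_RUN=1 xios_checkout` only prints the command, which is a cheap way to check it before the real checkout.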

Setup

module load craype-x86-milan
module load PrgEnv-gnu/8.3.3
module load netCDF-Fortran/4.6.0-gompi-2022a
module load Perl/.5.34.1-GCCcore-11.3.0

Compile XIOS with the arch files from this folder

./make_xios --arch lucia_gnu -j 4

Compile NEMO

  • with the arch file from this folder: check the path to your XIOS
  • with the cfg folder from this folder: open nemo_4.2.0/cfgs/ref_cfgs.txt and add the line:

ORCA025_ICE OCE ICE

./makenemo -m 'lucia_gnu' -r ORCA025_ICE -n 'MY_ORCA025_ICE' -j '4'
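The ref_cfgs.txt step above can be scripted so that the line is added only once, even if the setup is repeated. A minimal sketch; `add_ref_cfg` is a hypothetical helper, not part of NEMO.

```shell
# Append a configuration line to ref_cfgs.txt only if it is not there yet.
# $1: path to ref_cfgs.txt, $2: line to add (defaults to the one used here).
add_ref_cfg() {
    file="$1"
    line="${2:-ORCA025_ICE OCE ICE}"
    # Nothing to do if the exact line is already present.
    if grep -qxF -- "$line" "$file" 2>/dev/null; then
        return 0
    fi
    # Make sure the existing file ends with a newline before appending.
    if [ -s "$file" ] && [ -n "$(tail -c1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
    printf '%s\n' "$line" >> "$file"
}

# Usage (run before makenemo):
#   add_ref_cfg nemo_4.2.0/cfgs/ref_cfgs.txt
```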

Data

I prepared all the input files in the /gpfs/scratch/acad/ecearth/pbarriat/data/nemo directory. I set up the namelist and the launch script according to the names and paths of the input files.

Very first try

Script     CC    Release  RES   XIOS  NEMO  #NODES  Optimization  WTIME per MONTH
NE4_00.sh  foss  2022a    e025  8     592   6       -O3           45min
NE4_01.sh  foss  2022a    e025  9     1191  12      -O3           21min
  • 1 year (1979), restart every month
  • forcing JRA55 no_leap 3h
  • nem_time_step_sec=1350 and lim_time_step_sec=1350
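For reference, the 1350 s time step listed above translates into a step count per day and per monthly leg. A minimal arithmetic sketch; `steps_in_days` is a hypothetical helper.

```shell
# Number of model time steps in a given number of days.
# $1: number of days, $2: time step in seconds.
# Refuses time steps that do not divide a day evenly.
steps_in_days() {
    days="$1"; dt="$2"
    if [ $((86400 % dt)) -ne 0 ]; then
        echo "time step $dt s does not divide 86400 s" >&2
        return 1
    fi
    echo $(( days * 86400 / dt ))
}

steps_in_days 1 1350     # 64 steps per day
steps_in_days 31 1350    # 1984 steps in a 31-day leg
```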

Initial data:

Goutorbe_ghflux.nc
eORCA025_ghflux_v2.0_c3.0_weights_bilin_nohls.nc
eORCA025_iwm_b0.2_v1.0_nohls.nc
eORCA025.L121_domain_cfg_b0.5_c3.0_d1.0_nohls_clean.nc
eORCA025_runoff_b0.2_v0.0_nohls.nc
eORCA025_calving_b0.2_v2.3_nohls.nc
eORCA025_ttv_b0.2_v0.0_nohls.nc
eORCA025_bfr2d_v0.2_nohls.nc
eORCA025_shlat2d_v0.2_nohls.nc
eORCA025_distcoast_b0.2_v0.0_nohls.nc
eORCA025.L121-empc_nohls.nc
eORCA025.L121_WOA2018_c3.0_d1.0_v19812010.5.2_nohls.nc
chlorophyl_v0.0.nc
eORCA025_chlorophyl_v0.0_c3.0_weights_bilin_nohls.nc
eORCA025_sss_WOA2018_c3.0_v19812010.5.1_nohls.nc
eORCA025_seaice_c3.0_v19802004.0_nohls.nc
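Before launching, it can help to check that the files listed above are actually present. A minimal sketch, assuming the files sit directly in the data directory named in the Data section; `missing_inputs` is a hypothetical helper.

```shell
# Report which expected input files are missing from a data directory.
# $1: directory, remaining arguments: file names.
# Prints the missing names and returns 1 if any file is missing.
missing_inputs() {
    dir="$1"; shift
    status=0
    for f in "$@"; do
        if [ ! -f "$dir/$f" ]; then
            echo "$f"
            status=1
        fi
    done
    return "$status"
}

# Usage (extend the list with the remaining file names):
#   missing_inputs /gpfs/scratch/acad/ecearth/pbarriat/data/nemo \
#       Goutorbe_ghflux.nc eORCA025_runoff_b0.2_v0.0_nohls.nc
```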

NE4_01 ran 12 legs (the JRA55 forcing is limited to one year: other years are needed to continue).

Change the forcing

Script     CC    Release  RES   XIOS  NEMO  #NODES  Optimization  WTIME per MONTH
NE4_02.sh  foss  2022a    e025  8     592   6       -O3           45min
NE4_03.sh  foss  2022a    e025  10    2390  24      -O3           13min
NE4_04.sh  foss  2022a    e025  9     1191  12      -O3           21min
  • starting from year 1960, restart every month
  • forcing ERA5 leap 3h
  • nem_time_step_sec=1350 and lim_time_step_sec=1350

NE4_03 ran 15 legs, then failed with:

[cns182:1940326:0:1940326] Caught signal 11 (Segmentation fault: Sent by the kernel at address (nil))
==== backtrace (tid:1940326) ====
 0 0x000000000004eb20 killpg()  ???:0
 1 0x0000000000b0c32f __icblbc_MOD_icb_lbc_mpp()  ???:0
 2 0x000000000081127c __icbstp_MOD_icb_stp()  ???:0
 3 0x0000000000475e86 __sbcmod_MOD_sbc()  ???:0
 4 0x00000000004a60c0 __step_MOD_stp()  ???:0
 5 0x000000000046401f __nemogcm_MOD_nemo_gcm()  ???:0
 6 0x00000000004594ed main()  ???:0
 7 0x000000000003acf3 __libc_start_main()  ???:0
 8 0x000000000046123e _start()  ???:0
=================================

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x1489006d0b1f in ???
#1  0xb0c32f in ???
#2  0x81127b in ???
#3  0x475e85 in ???
#4  0x4a60bf in ???
#5  0x46401e in ???
#6  0x4594ec in ???
#7  0x1489006bccf2 in ???
#8  0x46123d in ???
#9  0xffffffffffffffff in ???
srun: error: cns182: task 2382: Segmentation fault (core dumped)
srun: launch/slurm: _step_signal: Terminating StepId=458518.0
slurmstepd: error: *** STEP 458518.0 ON cns159 CANCELLED AT 2023-04-24T11:38:53 ***

Relaunched 3 times on different nodes: same issue.

NE4_04 ran 14 legs, then failed with the same error as above.

The error is related to icebergs: the backtrace goes through icb_stp and icb_lbc_mpp.
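The link to icebergs can be read off the backtrace: gfortran mangles module procedures as `__<module>_MOD_<procedure>`, and the top frames sit in the icblbc and icbstp modules, which belong to NEMO's iceberg (ICB) code. A minimal sketch that extracts these pairs; `demangle_gfortran` is a hypothetical helper.

```shell
# Turn gfortran-mangled symbols (__<module>_MOD_<procedure>) from a backtrace
# into "module procedure" pairs. Reads backtrace lines on stdin and skips
# frames that are not Fortran module procedures.
demangle_gfortran() {
    sed -n 's/.*__\([A-Za-z0-9_]*\)_MOD_\([A-Za-z0-9_]*\)().*/\1 \2/p'
}

printf '%s\n' \
    ' 1 0x0000000000b0c32f __icblbc_MOD_icb_lbc_mpp()  ???:0' \
    ' 2 0x000000000081127c __icbstp_MOD_icb_stp()  ???:0' \
    ' 3 0x0000000000475e86 __sbcmod_MOD_sbc()  ???:0' | demangle_gfortran
# prints:
#   icblbc icb_lbc_mpp
#   icbstp icb_stp
#   sbcmod sbc
```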

Disable the icebergs

Script     CC    Release  RES   XIOS  NEMO  #NODES  Optimization  WTIME per MONTH
NE4_05.sh  foss  2022a    e025  9     1191  12      -O3           21min
  • starting from year 1960, restart every month
  • forcing ERA5 leap 3h
  • nem_time_step_sec=1350 and lim_time_step_sec=1350
  • no icebergs
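This README does not say how the icebergs were switched off. In NEMO the usual switch is `ln_icebergs` in the `&namberg` namelist block, so the sketch below assumes that mechanism. `icebergs_flag` is a hypothetical helper that prints the value found in a namelist file.

```shell
# Print the value assigned to ln_icebergs in a NEMO namelist file.
# Assumption: icebergs are controlled through ln_icebergs in &namberg.
# Lines commented out with a leading "!" are ignored.
icebergs_flag() {
    sed -n 's/^[[:space:]]*ln_icebergs[[:space:]]*=[[:space:]]*\(\.[A-Za-z]*\.\).*/\1/p' "$1"
}

# Usage: for a run without icebergs this is expected to print .false.
#   icebergs_flag namelist_cfg
```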

NE4_05 ran 77 legs, then failed with:


  ===>>> : E R R O R

          ===========

   stp_ctl: |ssh| > 20 m  or  |U| > 10 m/s  or  S <= 0  or  S >= 100  or  NaN encounter in the tests
 
 kt 150234 |ssh| max   3.871     at i j   1024  307     MPI rank  197
 kt 150234 |U|   max   2.190     at i j k 1334  691   1 MPI rank  726
 kt 150234 Sal   min   5.309     at i j k 1198 1037   1 MPI rank 1054
 kt 150234 Sal   max   100.0     at i j k  827  241  15 MPI rank   98
 
        ===> output of last computed fields in output.abort* files
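To see which of the stp_ctl tests tripped, the reported values can be compared with the thresholds quoted in the error message itself. A minimal sketch; `stp_ctl_violations` is a hypothetical helper that reads the "kt ..." lines on stdin or from a file. The last line estimates the elapsed model time, assuming kt counts continuously from the start of the run with the 1350 s time step.

```shell
# Print the stp_ctl diagnostics that violate the thresholds from the message:
# |ssh| > 20 m, |U| > 10 m/s, S <= 0, S >= 100, or NaN.
stp_ctl_violations() {
    awk '$1 == "kt" {
        name = $3 " " $4
        val  = $5 + 0
        bad  = ($5 ~ /[Nn][Aa][Nn]/)
        if (name == "|ssh| max" && val >  20)  bad = 1
        if (name == "|U| max"   && val >  10)  bad = 1
        if (name == "Sal min"   && val <=  0)  bad = 1
        if (name == "Sal max"   && val >= 100) bad = 1
        if (bad) print name, $5
    }' "$@"
}

printf '%s\n' \
    ' kt 150234 |ssh| max   3.871     at i j   1024  307     MPI rank  197' \
    ' kt 150234 |U|   max   2.190     at i j k 1334  691   1 MPI rank  726' \
    ' kt 150234 Sal   min   5.309     at i j k 1198 1037   1 MPI rank 1054' \
    ' kt 150234 Sal   max   100.0     at i j k  827  241  15 MPI rank   98' \
    | stp_ctl_violations
# prints: Sal max 100.0

# Whole days of model time at kt 150234 with a 1350 s step.
echo $(( 150234 * 1350 / 86400 ))   # 2347
```

Only the salinity maximum (at i=827, j=241, k=15) hits its limit. About 2347 days is roughly 6.4 years, which is consistent with 77 monthly legs completed from 1960.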