Dear all,
On Lemaitre3, with the standard configuration (see [README](https://gogs.elic.ucl.ac.be/pbarriat/ecearth_patch/src/master/README.md)), if you occasionally get this crash:
```
Sep 16 15:27:19 lm3-w045 ifsmaster-ecconf: (hfi/PSM)[94909]: PSM2 can't open hfi unit: -1 (err=23)
```
or this one:
```
forrtl: Remote I/O error
```
you should make a small change to the code of the OASIS coupler.
From your ec-earth repository, open `sources/oasis3-mct/lib/psmile/src/mod_oasis_method.F90` and replace:
```
WRITE(filename,'(a,i2.2)') 'debug.root.',compid                          ! line 423
WRITE(filename2,'(a,i2.2)') 'debug.notroot.',compid                      ! line 429
WRITE(filename,'(a,i2.2,a,i6.6)') 'debug.',compid,'.',mpi_rank_local     ! line 436
```
with:
```
WRITE(filename,'(a,i2.2)') '/dev/shm/debug.root.',compid                        ! line 423
WRITE(filename2,'(a,i2.2)') '/dev/shm/debug.notroot.',compid                    ! line 429
WRITE(filename,'(a,i2.2,a,i6.6)') '/dev/shm/debug.',compid,'.',mpi_rank_local   ! line 436
```
Once done, recompile OASIS, IFS and NEMO.
Reason:
The scratch on Lemaitre3 is a BeeGFS file system, which "doesn't like" small files. At the beginning of a run, OASIS creates many small files within a very short period, and sometimes BeeGFS can't handle them.
So it's better to write these files to RAM (`/dev/shm/`) instead of your run directory (on scratch).
The same bug also affected the PARAMOUR NEMO-CCLM coupled setup. The fix described above solved the issue. Thanks PY.
@klein: Is it possible to include a CPP "BeeGFS" key in OASIS, and adapt the code to use the fix described above when that key is set?
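To make the request concrete, here is a minimal sketch of what such a key could look like. It is not the actual OASIS code: the key name `BEEGFS_FIX`, the `debug_dir` constant and the stand-alone program around them are assumptions made for illustration. Only the `WRITE` formats and the `/dev/shm/` prefix come from the fix above.

```fortran
! Minimal sketch, NOT the actual OASIS code. The key name BEEGFS_FIX and the
! debug_dir constant are hypothetical; only the WRITE formats and the
! /dev/shm/ prefix come from the fix described in this thread.
! Needs preprocessing, e.g.: gfortran -cpp -DBEEGFS_FIX sketch.F90
program debug_prefix_sketch
  implicit none
#ifdef BEEGFS_FIX
  ! Key set: put the small per-rank debug files in RAM instead of scratch
  character(len=*), parameter :: debug_dir = '/dev/shm/'
#else
  ! Key not set: default behaviour, files go to the run directory
  character(len=*), parameter :: debug_dir = ''
#endif
  character(len=256) :: filename
  integer :: compid, mpi_rank_local

  ! Placeholder values, for the demonstration only
  compid = 1
  mpi_rank_local = 42

  ! Same formats as in mod_oasis_method.F90, with the prefix prepended
  write(filename,'(a,i2.2)') debug_dir//'debug.root.', compid
  print '(a)', trim(filename)

  write(filename,'(a,i2.2,a,i6.6)') debug_dir//'debug.', compid, '.', mpi_rank_local
  print '(a)', trim(filename)
end program debug_prefix_sketch
```

Compiled with `-DBEEGFS_FIX`, this prints `/dev/shm/debug.root.01` and `/dev/shm/debug.01.000042`. Without the key, the same names are printed without the `/dev/shm/` prefix. Keeping the prefix in a single constant would mean the three `WRITE` statements no longer need to be edited by hand on each cluster.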