Charles Pelletier 3 年之前
父节点
当前提交
65b7d3829b

+ 250 - 0
nemo_bdy/generate_new_bdy_dta.py

@@ -0,0 +1,250 @@
+from netCDF4 import Dataset
+import sys
+import time
+import numpy as np
+from collections import OrderedDict
+
+# Script for generating NEMO BDY data files from pre-interpolated data, and a NEMO BDY geometry file.
+# Author: C. Pelletier
+
+# Inputs:
+# 1) NEMO BDY geometry description file (also needed by NEMO itself for a BDY-including run),
+# can be generated from generate_new_bdy_geom.py
+
+# 2) A set of external data that has already been pre-interpolated to the full 2D NEMO grid you want to run at.
+# It's a huge overkill in terms of data (you'll need <1% of the actual full data), but it really is simpler to implement.
+# It's ok to have data that is gibberish outside of the vicinity of the NEMO BDY (but do take a margin, e.g. on the 2*nbr slices near BDY, have real data), 
+# but the data should be on the full NEMO domain, and should have been "drowned" (i.e., all potentially masked cells should have been filled up, e.g. with nearest-neighbors)
+
+# Input data nomenclature
+# filename: <indata_folder>/<pref_in><ftype>_<year>.nc
+# WHERE:
+# a) <indata_folder>: input data folder, defined below
+# b) <pref_in>: input prefix, defined below
+# c) <ftype>: "filetype". I typically use grid_T, grid_U and grid_V. The NEMO grid uses slightly staggered grids for thermodynamics ("grid_T"), x-direction dynamics ("grid_U"), and y-direction dynamics ("grid_V"). Your input pre-interpolated fields should have been interpolated to distinct grids, depending on their nature:
+#   c.i) T-grid for temperature, salinity, sea surface height, sea-ice concentration, sea-ice thickness, snow over sea-ice;
+#   c.ii) U-grid for x-direction ocean velocities (barotropic and baroclinic velocities, and potentially sea-ice U velocities)
+#   c.iii) V-grid for y-direction ocean velocities (barotropic and baroclinic velocities, and potentially sea-ice V velocities)
+#   The names of 'ftype' does not matter, it can be called whatever, but they should be consistent with the ftypes dictionary defined (and described) below.
+
+# Input data netCDF content:
+# DIMENSIONS: must be time_counter, depth<grid>, y, x
+# where <grid> is t/u/v, depending on the gridtype of the file.
+# The depth<grid> dimension is not needed for files only containing 2D fields (e.g. ssh only, sea ice only)
+# VARIABLES(DIMENSIONS):
+#  - time_counter(time_counter): time variable;
+#  - depth<grid>(depth<grid>): depth axis
+#  - <var_3d>(time_counter,depth<grid>,y,x): 3D-variable for input data fields (e.g. temp/sal/baroclinic velocities)
+#  - <var_2d>(time_counter,y,x): 2D-variable for input data fields (e.g. sea ice, barotropic velocities, ssh)
+# <var_3d> and <var_2d> should match the variable names given in the ftypes dictionary below.
+# Input data fields should have been pre-drowned (i.e. no mask) with one exception: it's ok to have the last few vertical levels that are FULLY masked (e.g., at depths which are below you domain bathymetry). In that case, the script will just copy the deepest available level to the levels below it (and it probably won't be used in NEMO anyway but still).
+
+
+indata_folder="/mfast/pelletie/oras5/preinterp"
+outdata_folder="/mfast/pelletie/oras5/formatted/rim9_masked"
+
+# Input geometry file, coming from generate_new_bdy_geom.py
+geom_file = "/mfast/pelletie/oras5/grids/eORCA025-SO_BDY-COORDS_masked_rim9.nc"
+
+pref_in = "ORAS5_preinterp_opa1_"
+pref_out = "ORAS5_opa1_BDYDTA_eORCA025-SO_rim9_masked_1m_"
+
+# Inclusive year range, i.e., data from stop_year file will also be generated.
+start_year = 1979
+stop_year = 1981
+
+
+ftypes = OrderedDict()
+# This is the dictinary that described which variables are present in each 'ftype' type of files.
+# The ftype key (thing in the [] brackets after "ftypes") should be the <ftype> from the input filenames described above.
+# 'grid' should be the type of gridpoints, that must be shared for all variables within one file (t, u, or v)
+# In this example, I'm putting several variables within one given file, but I could have used distinct files for all variables. However, I could not have used one file for all variables, since all variables are not defined on the same t/u/v grids.
+# 'variables' is another dictionary (nested within ftypes), listed the variable names contained within the input file.
+# for each variable, you can specify desired minimum and maximum values for the said variable.
+# Values outside of the min/max range will be brought back to min/max 
+
+ftypes['grid_T'] = { 'grid' : 't', \
+                     'variables' : {
+                         'conservative_temperature' : { 'min' : -2. },
+                         'absolute_salinity' : { 'min' : 25. },
+                         'sossheig'          : {} \
+                                   } \
+                   }
+
+ftypes['grid_U'] = { 'grid' : 'u', \
+                    'variables' : {
+                        'uobarotropic' : { },
+                        'uobaroclinic' : { }}}
+
+ftypes['grid_V'] = { 'grid' : 'v', \
+                     'variables' : {
+                     'vobarotropic' : { },
+                     'vobaroclinic' : { }}}
+
+
+### END OF INPUT SPECIFICATION.
+
+
+grids = []
+
+for key in ftypes.keys():
+    curr_grid = ftypes[ftype]['grid']
+    if(curr_grid not in grids):
+        grids.append(curr_grid)
+
+
+geom_nc = Dataset(geom_file, mode='r')
+
+nbis = {}
+nbjs = {}
+nbrs = {}
+lons = {}
+lats = {}
+nbdys = {}
+
+for grid in grids:
+    nbis[grid] = geom_nc.variables['nbi'+grid][0,:] - 1
+    nbjs[grid] = geom_nc.variables['nbj'+grid][0,:] - 1
+    nbrs[grid] = geom_nc.variables['nbr'+grid][0,:]
+    nbdys[grid] = len(geom_nc.dimensions['xb'+grid.upper()])
+    lons[grid] = geom_nc.variables['glam'+grid][0,:]
+    lats[grid] = geom_nc.variables['gphi'+grid][0,:]
+
+geom_nc.close()
+
+
+for year in range(start_year,stop_year+1):
+    for ftype in ftypes.keys():
+
+        cdict = ftypes[ftype]
+
+        input_file = indata_folder+"/"+pref_in+ftype+"_%d"%year+".nc"
+        output_file = outdata_folder+"/"+pref_out+ftype+"_y%d"%year+".nc"
+
+        print('Generating ', output_file)
+        start = time.time()
+        
+        in_nc = Dataset(input_file, mode='r')
+        out_nc = Dataset(output_file, mode='w')
+
+        grid = cdict['grid']
+        nt = len(in_nc.dimensions['time_counter'])
+
+        ny = len(in_nc.dimensions['y'])
+        nx = len(in_nc.dimensions['x'])
+        
+        has_3d = False
+        nxb = nbdys[grid]
+        
+        for name, dim in in_nc.dimensions.items():
+            if(name == 'depth'+grid):
+                has_3d = True
+                nz = len(dim)
+                break
+
+        out_nc.createDimension('yb', 1)
+        out_nc.createDimension('xb'+grid.upper(), nxb)
+        out_nc.createDimension('time_counter', 0)
+        if(has_3d):
+            out_nc.createDimension('depth'+grid, nz)
+
+        idx_flat_2d = np.repeat( np.arange(nt) * ny*nx, nxb ) \
+                      + np.tile(nbjs[grid] * nx, nt) \
+                      + np.tile(nbis[grid], nt)
+
+        time_in = in_nc.variables['time_counter']
+        time_out = out_nc.createVariable(varname = 'time_counter', datatype = time_in.datatype, dimensions = time_in.dimensions)
+        time_out.setncatts(time_in.__dict__)
+        time_out[:] = time_in[:]
+
+        nbi_out = out_nc.createVariable(varname='nbi'+grid, datatype='i4', dimensions=['yb', 'xb'+grid.upper()])
+        nbi_out[0,:] = nbis[grid][:] + 1
+
+        nbj_out = out_nc.createVariable(varname='nbj'+grid, datatype='i4', dimensions=['yb', 'xb'+grid.upper()])
+        nbj_out[0,:] = nbjs[grid][:] + 1
+
+        nbr_out = out_nc.createVariable(varname='nbr'+grid, datatype='i4', dimensions=['yb', 'xb'+grid.upper()])
+        nbr_out[0,:] = nbrs[grid][:]
+        
+        if(has_3d):
+            depth_in = in_nc.variables['depth'+grid]
+            depth_out = out_nc.createVariable(varname = 'depth'+grid, datatype = depth_in.datatype, dimensions = depth_in.dimensions)
+            depth_out.setncatts(depth_in.__dict__)
+            depth_out[:] = depth_in[:]
+
+            idx_flat_3d = np.repeat( np.arange(nt*nz) * ny*nx, nxb ) \
+                          + np.tile(nbjs[grid] * nx, nt*nz) \
+                          + np.tile(nbis[grid], nt*nz)
+
+
+
+        lon_out = out_nc.createVariable(varname='nav_lon', datatype='f8', dimensions=['yb', 'xb'+grid.upper()])
+        lon_out.standard_name = "longitude"
+        lon_out.units = "degrees East"
+        lon_out[0,:] = lons[grid][:]
+            
+        lat_out = out_nc.createVariable(varname='nav_lat', datatype='f8', dimensions=['yb', 'xb'+grid.upper()])
+        lat_out.standard_name = "latitude"
+        lat_out.units = "degrees North"
+        lat_out[0,:] = lats[grid][:]
+        
+        for var in cdict['variables']:
+            in_var = in_nc.variables[var]
+
+            
+            if( 'depth'+grid in in_var.dimensions ):
+
+                out_var = out_nc.createVariable(var, datatype='f4', \
+                                                dimensions=['time_counter', 'depth'+grid, 'yb', 'xb'+grid.upper()], fill_value = 1.e20)
+
+
+                out_var[:,:,0,:] = np.reshape( (np.ndarray.flatten(in_var[:]))[idx_flat_3d], \
+                                               (nt, nz, nxb) )
+
+                if(np.ma.is_masked(out_var[:])):
+                    n_msk = np.count_nonzero(out_var[:].mask)
+                    all_masked_lev = np.all(out_var[:].mask, axis=(0,2,3))
+                    if(np.any(all_masked_lev)):
+                        idx_start_bot = np.amin((np.nonzero(all_masked_lev))[0])
+                        nbot = nz - idx_start_bot
+                        if( nbot * nt * nxb == n_msk):
+                            out_var[:,idx_start_bot:,0,:] = (out_var[:,idx_start_bot-1,0,:])[:,None,:] * (np.ones(shape=[nbot], dtype=float))[None,:,None]
+                        else:
+                            print('ERROR: it does not look like you have properly predrowned ', var, 'in ', input_file)
+                            sys.exit()
+                    else:
+                        print('ERROR: it does not look like you have properly predrowned ', var, 'in ', input_file)
+                        sys.exit()
+                                            
+                    
+
+            else:
+                out_var = out_nc.createVariable(var, datatype='f4', \
+                                                dimensions=['time_counter', 'yb', 'xb'+grid.upper()], \
+                                                fill_value = 1.e20)
+                out_var[:,0,:] =  np.reshape( (np.ndarray.flatten(in_var[:]))[idx_flat_2d], \
+                                               (nt, nxb) )
+                
+
+                
+
+                
+            if( 'min' in cdict['variables'][var].keys() ):
+                out_var[:] = np.clip(out_var[:],\
+                                     a_min = cdict['variables'][var]['min'], \
+                                     a_max = None)
+
+
+            if( 'max' in cdict['variables'][var].keys() ):
+                out_var[:] = np.clip(out_var[:],\
+                                     a_max = cdict['variables'][var]['max'], \
+                                     a_min = None)
+
+        
+        in_nc.close()
+        out_nc.close()
+
+        stop = time.time()
+
+        print('Done in %.2f'%(stop-start)+'s')
+        print('')

+ 158 - 0
nemo_bdy/generate_new_bdy_dta_onefile.py

@@ -0,0 +1,158 @@
+from netCDF4 import Dataset
+import sys
+import time
+import numpy as np
+from collections import OrderedDict
+from nn_interp import nn_interp_1d
+
+# Script for generating NEMO BDY data files from pre-interpolated data, and a NEMO BDY geometry file.
+# Author: C. Pelletier
+
+# Call:
+# python generate_new_bdy_dta_onefile.py <input_file> <grid> <output_file>
+# where:
+# <input_file>: is an input dataset which has been preinterpolated to the NEMO grid, and contains all the fields you want to generate data to.
+#              This script will "detect" any data fields in <input_file>.
+# <grid> is the type of NEMO grid points (T, U or V) that your input data is defined on.
+#             All fields within one <input_data> file MUST share the same type of points (e.g., don't mix velocities and tracers).
+#             If the input file has 3D fields, then it should contain a depth variable AND a depth dimension, both called "depth<grid>" (e.g. "depthv" for V nodes)
+# <output_file> is the path to an output file.
+
+# BDY geometry file, coming from generate_new_bdy_geom.py
+geom_file = "/storepelican/pelletie/nemo_bdy/grids/eORCA025_SO_BDY-COORDS_rim1.nc"
+
+if(len(sys.argv) == 4):
+    input_file = sys.argv[1]
+    grid = sys.argv[2]
+    output_file = sys.argv[3]
+else:
+    print('Please provide three input arguments: python generate_new_bdy_dta_onefile.py <input_file> <grid> <output_file>')
+    sys.exit()
+
+if(grid not in ['t', 'u', 'v']):
+    print('grid should be t, u or v')
+    sys.exit()
+
+### END OF INPUT SPECIFICATION.
+
+
+geom_nc = Dataset(geom_file, mode='r')
+
+
+nbis = geom_nc.variables['nbi'+grid][0,:] - 1
+nbjs = geom_nc.variables['nbj'+grid][0,:] - 1
+nbrs = geom_nc.variables['nbr'+grid][0,:]
+nbdys = len(geom_nc.dimensions['xb'+grid.upper()])
+lons = geom_nc.variables['glam'+grid][0,:]
+lats = geom_nc.variables['gphi'+grid][0,:]
+
+geom_nc.close()
+
+        
+in_nc = Dataset(input_file, mode='r')
+out_nc = Dataset(output_file, mode='w')
+
+nt = len(in_nc.dimensions['time_counter'])
+
+ny = len(in_nc.dimensions['y'])
+nx = len(in_nc.dimensions['x'])
+
+has_3d = False
+nxb = nbdys
+
+
+remap_vars = []
+
+for name, var in in_nc.variables.items():
+    if(( 'time_counter' in var.dimensions) \
+       and ('x' in var.dimensions)):
+        remap_vars.append(name)
+        
+        if('depth'+grid in var.dimensions):
+            has_3d = True
+            nz = len( in_nc.dimensions['depth'+grid] )
+
+out_nc.createDimension('yb', 1)
+out_nc.createDimension('xb'+grid.upper(), nxb)
+out_nc.createDimension('time_counter', 0)
+if(has_3d):
+    out_nc.createDimension('depth'+grid, nz)
+
+idx_flat_2d = np.repeat( np.arange(nt) * ny*nx, nxb ) \
+              + np.tile(nbjs * nx, nt) \
+              + np.tile(nbis, nt)
+
+time_in = in_nc.variables['time_counter']
+time_out = out_nc.createVariable(varname = 'time_counter', datatype = time_in.datatype, dimensions = time_in.dimensions)
+time_out.setncatts(time_in.__dict__)
+time_out[:] = time_in[:]
+
+nbi_out = out_nc.createVariable(varname='nbi'+grid, datatype='i4', dimensions=['yb', 'xb'+grid.upper()])
+nbi_out[0,:] = nbis[:] + 1
+
+nbj_out = out_nc.createVariable(varname='nbj'+grid, datatype='i4', dimensions=['yb', 'xb'+grid.upper()])
+nbj_out[0,:] = nbjs[:] + 1
+
+nbr_out = out_nc.createVariable(varname='nbr'+grid, datatype='i4', dimensions=['yb', 'xb'+grid.upper()])
+nbr_out[0,:] = nbrs[:]
+
+if(has_3d):
+    depth_in = in_nc.variables['depth'+grid]
+    depth_out = out_nc.createVariable(varname = 'depth'+grid, datatype = depth_in.datatype, dimensions = depth_in.dimensions)
+    depth_out.setncatts(depth_in.__dict__)
+    depth_out[:] = depth_in[:]
+
+    idx_flat_3d = np.repeat( np.arange(nt*nz) * ny*nx, nxb ) \
+                  + np.tile(nbjs * nx, nt*nz) \
+                  + np.tile(nbis, nt*nz)
+
+
+lon_out = out_nc.createVariable(varname='nav_lon', datatype='f8', dimensions=['yb', 'xb'+grid.upper()])
+lon_out.standard_name = "longitude"
+lon_out.units = "degrees East"
+lon_out[0,:] = lons[:]
+
+lat_out = out_nc.createVariable(varname='nav_lat', datatype='f8', dimensions=['yb', 'xb'+grid.upper()])
+lat_out.standard_name = "latitude"
+lat_out.units = "degrees North"
+lat_out[0,:] = lats[:]
+
+
+for var in remap_vars:
+    in_var = in_nc.variables[var]
+
+    if( 'depth'+grid in in_var.dimensions ):
+
+        out_var = out_nc.createVariable(var, datatype='f4', \
+                                        dimensions=['time_counter', 'depth'+grid, 'yb', 'xb'+grid.upper()], fill_value = 1.e20)
+
+
+        out_var[:,:,0,:] = np.reshape( (np.ndarray.flatten(in_var[:]))[idx_flat_3d], \
+                                       (nt, nz, nxb) )
+
+
+        if(np.ma.is_masked(out_var[:])):
+            for t in range(0,nt):
+                for k in range(0,nz):
+                    if(np.all( out_var[:].mask[t,k,0,:] ) ):
+                        # All masked nodes: take from upper level.
+                        out_var[t,k,0,:] = out_var[t,k-1,0,:]
+                    elif(np.any(out_var[:].mask[t,k,0,:])):
+                        # Some masked nodes: drown at same level.
+                        out_var[t,k,0,:] = nn_interp_1d(out_var[t,k,0,:])
+
+    else:
+        out_var = out_nc.createVariable(var, datatype='f4', \
+                                        dimensions=['time_counter', 'yb', 'xb'+grid.upper()], \
+                                        fill_value = 1.e20)
+        out_var[:,0,:] =  np.reshape( (np.ndarray.flatten(in_var[:]))[idx_flat_2d], \
+                                       (nt, nxb) )
+
+        if(np.ma.is_masked(out_var[:])):
+            for t in range(0,nt):
+                if(np.any(out_var[:].mask[t,0,:])):
+                    out_var[t,0,:] = nn_interp_1d(out_var[t,0,:])
+
+in_nc.close()
+out_nc.close()
+

+ 262 - 0
nemo_bdy/generate_new_bdy_geom.py

@@ -0,0 +1,262 @@
+from netCDF4 import Dataset
+import sys
+import time
+import numpy as np
+
+# Script for generating a NEMO BDY geometry file from a full 2D coordinates file.
+# Author: C. Pelletier
+
+# Note: this script just takes a NEMO grid file and "peels it off" (like a potato)
+# to get the input BDY file for a geometry.
+# All nodes are included, regardless of their mask status.
+# I think it's better to include them all, so that if the mask evolves (e.g. by changing
+# topography, or changing the minimum bathymetry), the BDY data and geometry will remain
+# relevant. The only drawback is that there are superfluous data (corresponding to masked cells),
+# but in practise it only means that data files are slightly larger. Since BDY stored are stored as
+# small 1D array, I do not consider this drawback a problem.
+
+data_folder="/storepelican/pelletie/nemo_bdy/grids"
+coord_file = "coord_paraso.nc"
+# must be the full domain and contain glamt, gphit, glamu, gphiu, glamv, gphiv
+# taking the "mesh mask" or the "coordinates" file from a run is OK
+
+is_size_bdy = [False, True, False, False]
+# boolean array: are the 4 sides of the domain an open bdy? Order: x=1, y=ny, x=nx, y=1 (in FORTRAN indexing convention; W/N/E/S if grid is orientated "normally", i.e. North upwards)
+
+
+nbr_out = 1
+# "rim width" of the BDY. NEMO documentation says it should be 8-9,
+# but at UCL et al. in 2018 - 2021, I've only seen 1 used (Huot, Van Achter, Pelletier).
+# N. Jourdain (IGE, Grenoble) also uses 1 so...
+# I've tried setting 8 over my config (circumpolar 0.25°), it ran a bit but was numerically unstable.
+# I am pretty confident about the "geometric" implementation of nbr_out > 1, though...
+
+out_file = "eORCA025_SO_BDY-COORDS_rim%d"%nbr_out+".nc"
+
+extra_dbg = False
+# output extra dbg vars in out_file
+
+### END INPUT
+
+coord_nc = Dataset(data_folder+'/'+coord_file, mode='r')
+grids = ['t', 'u', 'v']
+
+
+lons = {}
+lats = {}
+
+for grid in grids:
+    lons[grid] = coord_nc.variables['glam'+grid][:]
+    lats[grid] = coord_nc.variables['gphi'+grid][:]
+       
+coord_nc.close()
+
+ny, nx = np.shape(lons['t'])
+
+idx_ifort = np.arange(nx) + 1
+idx_jfort = np.arange(ny) + 1
+
+idx_ifort_2d = idx_ifort[None,:] * np.ones(shape=[ny], dtype=int)[:,None]
+idx_jfort_2d = idx_jfort[:,None] * np.ones(shape=[nx], dtype=int)[None,:]
+
+idx_ifort_flat = np.ndarray.flatten(idx_ifort_2d)
+idx_jfort_flat = np.ndarray.flatten(idx_jfort_2d)
+
+
+
+nbrt_3d_bdy = np.ma.array( data = np.ndarray(shape=[ny,nx,4], dtype=int), mask = False)
+
+if(is_size_bdy[0]):
+    nbrt_3d_bdy[:,1:,0] = (idx_ifort[1:])[None,:] * (np.ones(shape=[ny], dtype=int))[:,None] - 1
+else:
+    nbrt_3d_bdy.mask[:,:,0] = True
+
+
+if(is_size_bdy[1]):
+    nbrt_3d_bdy[:-1,:,1] = ny - (idx_jfort[:-1])[:,None] * (np.ones(shape=[nx], dtype=int))[None,:]
+else:
+    nbrt_3d_bdy.mask[:,:,1] = True
+
+if(is_size_bdy[2]):
+    nbrt_3d_bdy[:,:-1,2] = nx - (idx_ifort[:-1])[None,:] * (np.ones(shape=[ny], dtype=int))[:,None]
+else:
+    nbrt_3d_bdy.mask[:,:,2] = True
+
+if(is_size_bdy[3]):
+    nbrt_3d_bdy[1:,:,3] = (idx_jfort[1:])[:,None] * (np.ones(shape=[nx], dtype=int))[None,:] - 1
+else:
+    nbrt_3d_bdy.mask[:,:,3] = True
+    
+
+which_bdy = np.ma.array(data = np.argmin(nbrt_3d_bdy, axis=-1), \
+                              mask = np.all(nbrt_3d_bdy.mask, axis=-1))
+
+nbrt_3d_bdy.mask[0,:,:] = True
+nbrt_3d_bdy.mask[-1,:,:] = True
+nbrt_3d_bdy.mask[:,0,:] = True
+nbrt_3d_bdy.mask[:,-1,:] = True
+
+nbr_flats = {}
+nbrs = {}
+
+# prendre j*nx*4 + i*4 + which_bdy[grid]
+idx_take = np.repeat(np.arange(ny), nx) * nx * 4 + np.tile(np.arange(nx), ny) * 4 + np.ndarray.flatten(which_bdy)
+nbr_flats['t'] = (np.ndarray.flatten(nbrt_3d_bdy))[idx_take]
+nbrs['t'] = np.ma.array( data = np.reshape( nbr_flats['t'], (ny,nx)), \
+                         mask = np.all(nbrt_3d_bdy.mask))
+                        
+nbrs['u'] = np.ma.array(data = np.ndarray(shape=[ny, nx], dtype=int), \
+                        mask = False)
+nbrs['u'][:,-1].mask = True
+nbrs['u'][:,:-1] = np.minimum(nbrs['t'][:,:-1], nbrs['t'][:,1:])
+nbrs['u'].mask[:,:-1] = np.logical_or(nbrs['t'].mask[:,:-1], nbrs['t'].mask[:,1:])
+
+nbrs['v'] = np.ma.array(data = np.ndarray(shape=[ny, nx], dtype=int), \
+                        mask = False)
+nbrs['v'][-1,:].mask = True
+nbrs['v'][:-1,:] = np.minimum(nbrs['t'][:-1,:], nbrs['t'][1:,:])
+nbrs['v'].mask[:-1,:] = np.logical_or(nbrs['t'].mask[:-1,:], nbrs['t'].mask[1:,:])
+
+
+
+                                
+
+
+
+
+
+
+
+    
+    
+out_nc = Dataset(data_folder+'/'+out_file, mode='w')
+
+if(extra_dbg):
+    out_nc.createDimension('y', ny)
+    out_nc.createDimension('x', nx)
+    out_nc.createDimension('bdy', 4)
+
+    tmp = out_nc.createVariable('which_'+'t', datatype='i2', dimensions=['y', 'x'], fill_value = -10)
+    tmp[:] = which_bdy
+
+    nbr_3d = out_nc.createVariable("nbr"+'t'+"_3d", datatype='i4', dimensions=['y', 'x', 'bdy'], fill_value = -10)
+    nbr_3d[:] = nbrt_3d_bdy
+
+    for grid in grids:
+        tmp = out_nc.createVariable('nbr'+grid+'_2d', datatype='i4', dimensions=['y', 'x'], fill_value = -10)
+        tmp[:] = nbrs[grid]
+
+    
+n_xbs = {}
+
+out_nc.createDimension('yb', 1)
+
+
+for grid in grids:
+    cnt = 0
+    iidx_tmp = []
+    jidx_tmp = []
+
+    lat_tmp = []
+    lon_tmp = []
+
+    nbr_tmp = []
+
+    curr_nbrflat = np.ndarray.flatten(nbrs[grid])
+    
+    for l in range(1,nbr_out+1):
+        print(l)
+        idx_take = np.nonzero(curr_nbrflat == l)[0].astype(int)
+        
+        idx_ifort_take = (np.ndarray.flatten(idx_ifort_2d))[idx_take]
+        idx_jfort_take = (np.ndarray.flatten(idx_jfort_2d))[idx_take]
+
+        sort_order = []
+
+        n_take = np.size(idx_take)
+        
+        already_in = np.zeros(shape=[n_take], dtype=bool)
+        idx_sorted = []
+        
+        check = np.logical_and(np.logical_not(already_in), \
+                               idx_ifort_take == np.amin(idx_ifort_take))
+        if(np.any(check)):
+            where_line = (np.nonzero(check))[0]
+            tmp_idx = np.argsort(idx_jfort_take[where_line])
+            idx_sorted = np.concatenate(( idx_sorted, (idx_take[where_line])[tmp_idx] ))
+            already_in[where_line] = True
+
+            
+        check = np.logical_and(np.logical_not(already_in), \
+                                   idx_jfort_take == np.amax(idx_jfort_take))
+        if(np.any(check)):
+            where_line = (np.nonzero(check))[0]
+            tmp_idx = np.argsort(idx_ifort_take[where_line])
+            idx_sorted = np.concatenate(( idx_sorted, (idx_take[where_line])[tmp_idx] ))
+            already_in[where_line] = True
+
+
+        check = np.logical_and(np.logical_not(already_in), \
+                               idx_ifort_take == np.amax(idx_ifort_take))
+        if(np.any(check)):
+            where_line = (np.nonzero(check))[0]
+            tmp_idx = (np.argsort(idx_jfort_take[where_line]))[::-1]
+            idx_sorted = np.concatenate(( idx_sorted, (idx_take[where_line])[tmp_idx] ))
+            already_in[where_line] = True
+
+
+        check = np.logical_and(np.logical_not(already_in), \
+                               idx_jfort_take == np.amin(idx_jfort_take))
+        if(np.any(check)):
+            where_line = (np.nonzero(check))[0]
+            tmp_idx = (np.argsort(idx_ifort_take[where_line]))[::-1]
+            idx_sorted = np.concatenate(( idx_sorted, (idx_take[where_line])[tmp_idx] ))
+            already_in[where_line] = True
+
+        check = np.logical_not(already_in)
+        if(np.count_nonzero(check) > 0):
+            print('error in reordering')
+            sys.exit()
+
+
+        idx_sorted = idx_sorted.astype(int)
+            
+        iidx_tmp = np.concatenate(( iidx_tmp, idx_ifort_flat[idx_sorted] ))
+        jidx_tmp = np.concatenate(( jidx_tmp, idx_jfort_flat[idx_sorted] ))
+        
+        lon_tmp = np.concatenate(( lon_tmp, (np.ndarray.flatten(lons[grid]))[idx_sorted]))
+        lat_tmp = np.concatenate(( lat_tmp, (np.ndarray.flatten(lats[grid]))[idx_sorted]))
+        
+        curr_size = np.size(idx_take)
+        nbr_tmp = np.concatenate(( nbr_tmp, np.repeat(l, curr_size) ))
+        cnt+=curr_size
+        
+    out_nc.createDimension('xb'+grid.upper(), cnt)
+    tmp = out_nc.createVariable(varname='nbi'+grid, datatype='i4', dimensions=['yb', 'xb'+grid.upper()])
+
+    tmp[:] = iidx_tmp
+
+    tmp = out_nc.createVariable(varname='nbj'+grid, datatype='i4', dimensions=['yb', 'xb'+grid.upper()])
+    tmp[:] = jidx_tmp
+    
+    tmp = out_nc.createVariable(varname='nbr'+grid, datatype='i4', dimensions=['yb', 'xb'+grid.upper()])
+    tmp[:] = nbr_tmp
+
+
+    tmp = out_nc.createVariable(varname='glam'+grid, datatype='f8', dimensions=['yb', 'xb'+grid.upper()])
+    tmp[:] = lon_tmp
+
+    tmp = out_nc.createVariable(varname='gphi'+grid, datatype='f8', dimensions=['yb', 'xb'+grid.upper()])
+    tmp[:] = lat_tmp
+
+    
+out_nc.close()    
+    
+
+        
+
+        
+        
+
+
+

+ 46 - 0
nemo_bdy/nn_interp.py

@@ -0,0 +1,46 @@
+import numpy as np
+
+def nn_interp(input_arr):
+
+    shape = np.shape(input_arr)
+    n_shape = np.size(shape)
+    
+    if(n_shape < 2):
+        print('error. cant do nn_interp on arrays with rank < 2')
+        sys.exit()
+    
+    else:
+
+        iter_arr = np.ma.copy(input_arr)
+        
+        while( np.count_nonzero(iter_arr.mask) > 0 ):
+            for shift in (-1,1):
+                for axis in (-1,-2):        
+                    arr_shifted=np.roll(iter_arr,shift=shift,axis=axis)
+                    idx=~arr_shifted.mask * iter_arr.mask
+                    iter_arr[idx]=arr_shifted[idx]
+
+    return iter_arr
+
+
+
+def nn_interp_1d(input_arr):
+
+    shape = np.shape(input_arr)
+    n_shape = np.size(shape)
+
+    iter_arr = np.ma.copy(input_arr)
+        
+    while( np.count_nonzero(iter_arr.mask) > 0 ):
+        for shift in (-1,1):
+            arr_shifted=np.roll(iter_arr,shift=shift,axis=-1)
+            idx=~arr_shifted.mask * iter_arr.mask
+            iter_arr[idx]=arr_shifted[idx]
+
+    return iter_arr
+
+
+        
+
+
+        

+ 181 - 0
nemo_bdy/prepare_ecearth_paraso.sh

@@ -0,0 +1,181 @@
+#!/bin/bash
+set -u
+
+# This script takes EC-Earth outputs, reinterpolates them to the NEMO grid, and then generates BDY dta from it.
+# Author: C. Pelletier
+
+# Inclusive year range
+year_start=1979
+year_stop=2050
+
+# Input prefix (with folder) for input data files (in .tar format).
+in_prefix="/storage/elic/chamarro/data"
+# Name of experience
+name_exp="a1tq"
+# Frequency
+freq="1m"
+
+# Working folder where pre-interpolated datafiles will be put
+dest_folder="/storepelican/pelletie/nemo_bdy/bdydta"
+dest_prefix="ECEarth_${name_exp}_${freq}_eORCA025-SO_BDYDTA_nbr1_"
+
+pwd=`pwd`
+# Python script for reformatting BDYDTA (final step)
+reformat_script="${pwd}/generate_new_bdy_dta_onefile.py"
+
+# min_lon, max_lon, min_lat, max_lat
+# To precut the input data prior to interpolation (not required, but less expensive if your source code is much bigger than the destination one, which is the case from global to the Southern Ocean.
+# Take a generous margin (4-5 nodes bigger than the destination grid limit)
+latlon_cut="-180,180,-90,-28"
+
+# # Script for converting EOS80 to TEOS-10 (look at the script comments for why it exists and what it does).
+# not needed for EC-Earth, which is already in TEOS-10
+# curr_dir=`pwd`
+# eos10_script="${curr_dir}/convert_eos10_new.py"
+
+# NEMO destination CDO grid description (see `cdo_get_griddes.sh`).
+dest_gridfile_t="/storepelican/pelletie/nemo_bdy/grids/eORCA025-SO_grid_T.grid"
+dest_gridfile_u="/storepelican/pelletie/nemo_bdy/grids/eORCA025-SO_grid_U.grid"
+dest_gridfile_v="/storepelican/pelletie/nemo_bdy/grids/eORCA025-SO_grid_V.grid"
+
+pid="$$"
+
+mkdir -p ${dest_folder}
+cd ${dest_folder}
+
+for((year=year_start;year<=year_stop;year++)); do
+
+    # Loop over input years.
+    
+    module purge
+    module load 2018b CDO NCO
+
+    # 1) Extract T, S and ssh (T grid).
+
+    tar -xvf ${in_prefix}/${name_exp}/NEMO/MMO_${name_exp}_18500101_fc0_${year}0101-${year}1231.tar ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_T_3D.nc ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_T_2D.nc
+    
+    cdo selname,thetao,so ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_T_3D.nc tmp_pid${pid}.nc
+    cdo selname,zos ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_T_2D.nc tmp2_pid${pid}.nc
+    cdo merge tmp_pid${pid}.nc tmp2_pid${pid}.nc tmp3_pid${pid}.nc
+
+    rm -f ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_T_3D.nc ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_T_2D.nc
+    
+    # precut
+    cdo -O sellonlatbox,${latlon_cut} tmp3_pid${pid}.nc tmp_pid${pid}.nc
+
+    ncrename -v olevel,deptht tmp_pid${pid}.nc
+    ncrename -d olevel,deptht tmp_pid${pid}.nc
+    
+    # remap
+    cdo remapbil,${dest_gridfile_t} tmp_pid${pid}.nc tmp2_pid${pid}.nc
+    cdo setmisstonn tmp2_pid${pid}.nc tmp_pid${pid}.nc
+
+    module purge
+    module load ELIC/0.1-foss-2018b-Python-3.6.6
+
+    input_full=`readlink -f tmp_pid${pid}.nc`
+    output_full="${dest_folder}/${dest_prefix}grid_T_y${year}.nc"
+    
+    python ${reformat_script} ${input_full} t ${output_full}
+
+    module purge
+    module load CDO NCO    
+    
+    # 2) Extract u velocities.
+
+    if [ "${freq}" == "1d" ]; then
+	
+	tar -xvf ${in_prefix}/${name_exp}/NEMO/MMO_${name_exp}_18500101_fc0_${year}0101-${year}1231.tar ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_U_3D.nc ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_U_2D.nc
+	cdo selname,ubar ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_U_2D.nc tmp2_pid${pid}.nc
+
+    else
+	# ubar is not stored in monthly outputs, so take it from daily and monmean
+	tar -xvf ${in_prefix}/${name_exp}/NEMO/MMO_${name_exp}_18500101_fc0_${year}0101-${year}1231.tar ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_U_3D.nc ${name_exp}_1d_${year}0101_${year}1231_opa_grid_U_2D.nc
+	cdo selname,ubar ${name_exp}_1d_${year}0101_${year}1231_opa_grid_U_2D.nc tmp3_pid${pid}.nc
+	cdo monmean tmp3_pid${pid}.nc tmp2_pid${pid}.nc
+	
+    fi    
+    
+    cdo selname,uo ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_U_3D.nc tmp_pid${pid}.nc
+    cdo merge tmp_pid${pid}.nc tmp2_pid${pid}.nc tmp3_pid${pid}.nc
+
+    rm -f ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_U_3D.nc ${name_exp}_??_${year}0101_${year}1231_opa_grid_U_2D.nc
+    
+    # precut
+    cdo -O sellonlatbox,${latlon_cut} tmp3_pid${pid}.nc tmp_pid${pid}.nc
+
+    ncap2 -O -s 'uobaroclinic = uo - ubar;' tmp_pid${pid}.nc tmp2_pid${pid}.nc
+    ncks -x -v uo tmp2_pid${pid}.nc -O tmp_pid${pid}.nc
+    ncrename -v ubar,uobarotropic tmp_pid${pid}.nc
+    ncrename -v olevel,depthu -v olevel_bnds,depthu_bnds tmp_pid${pid}.nc
+    ncatted -a bounds,depthu,o,c,"depthu_bnds" -a name,depthu,o,c,"depthu" tmp_pid${pid}.nc
+    ncrename -d olevel,depthu tmp_pid${pid}.nc
+    ncpdq -O -a time_counter,depthu,y,x tmp_pid${pid}.nc tmp2_pid${pid}.nc
+    
+    # remap
+    cdo remapbil,${dest_gridfile_u} tmp2_pid${pid}.nc tmp_pid${pid}.nc
+    cdo setmisstonn tmp_pid${pid}.nc tmp2_pid${pid}.nc
+
+    module purge
+    module load ELIC/0.1-foss-2018b-Python-3.6.6
+
+    input_full=`readlink -f tmp2_pid${pid}.nc`
+    output_full="${dest_folder}/${dest_prefix}grid_U_y${year}.nc"
+    
+    python ${reformat_script} ${input_full} u ${output_full}
+
+    module purge
+    module load CDO NCO    
+
+    
+    # 3) Extract v velocities.
+
+    if [ "${freq}" == "1d" ]; then
+	
+	tar -xvf ${in_prefix}/${name_exp}/NEMO/MMO_${name_exp}_18500101_fc0_${year}0101-${year}1231.tar ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_V_3D.nc ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_V_2D.nc
+	cdo selname,vbar ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_V_2D.nc tmp2_pid${pid}.nc
+
+    else
+	# vbar is not stored in monthly outputs, so take it from daily and monmean
+	tar -xvf ${in_prefix}/${name_exp}/NEMO/MMO_${name_exp}_18500101_fc0_${year}0101-${year}1231.tar ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_V_3D.nc ${name_exp}_1d_${year}0101_${year}1231_opa_grid_V_2D.nc
+	cdo selname,vbar ${name_exp}_1d_${year}0101_${year}1231_opa_grid_V_2D.nc tmp3_pid${pid}.nc
+	cdo monmean tmp3_pid${pid}.nc tmp2_pid${pid}.nc
+	
+    fi    
+    
+    cdo selname,vo ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_V_3D.nc tmp_pid${pid}.nc
+    cdo merge tmp_pid${pid}.nc tmp2_pid${pid}.nc tmp3_pid${pid}.nc
+
+    rm -f ${name_exp}_${freq}_${year}0101_${year}1231_opa_grid_V_3D.nc ${name_exp}_??_${year}0101_${year}1231_opa_grid_V_2D.nc
+    
+    # precut
+    cdo -O sellonlatbox,${latlon_cut} tmp3_pid${pid}.nc tmp_pid${pid}.nc
+
+    ncap2 -O -s 'vobaroclinic = vo - vbar;' tmp_pid${pid}.nc tmp2_pid${pid}.nc
+    ncks -x -v vo tmp2_pid${pid}.nc -O tmp_pid${pid}.nc
+    ncrename -v vbar,vobarotropic tmp_pid${pid}.nc
+    ncrename -v olevel,depthv -v olevel_bnds,depthv_bnds tmp_pid${pid}.nc
+    ncatted -a bounds,depthv,o,c,"depthv_bnds" -a name,depthv,o,c,"depthv" tmp_pid${pid}.nc
+    ncrename -d olevel,depthv tmp_pid${pid}.nc
+    ncpdq -O -a time_counter,depthv,y,x tmp_pid${pid}.nc tmp2_pid${pid}.nc
+    
+    # remap
+    cdo remapbil,${dest_gridfile_v} tmp2_pid${pid}.nc tmp_pid${pid}.nc
+    cdo setmisstonn tmp_pid${pid}.nc tmp2_pid${pid}.nc 
+
+    module purge
+    module load ELIC/0.1-foss-2018b-Python-3.6.6
+
+    input_full=`readlink -f tmp2_pid${pid}.nc`
+    output_full="${dest_folder}/${dest_prefix}grid_V_y${year}.nc"
+    
+    python ${reformat_script} ${input_full} v ${output_full}
+
+    module purge
+    module load CDO NCO    
+
+    
+done
+
+
+rm -rf ./*pid${pid}*

+ 145 - 0
nemo_bdy/prepare_oras5_paraso.sh

@@ -0,0 +1,145 @@
+#!/bin/bash
+set -u
+
+# This script takes ORAS5 ocean reanalyses, reinterpolates them to the NEMO grid so that BDY data can be generated from it afterwards.
+# As of July 2021, input ORAS5 can be downloaded from <https://www.cen.uni-hamburg.de/icdc/data/ocean/easy-init-ocean/ecmwf-oras5.html>
+# Author: C. Pelletier
+
+# Inclusive year range
+year_start=1979
+year_stop=2018
+
+# Input folder where input data files (in .tar.gz format) are.
+# This script expects member 'opa1' but you can change it below.
+in_folder="/mfast/pelletie/oras5/raw"
+# Folder where pre-interpolated datafiles will be 
+dest_folder="/mfast/pelletie/oras5/preinterp"
+
+mkdir -p ${dest_folder}
+cd ${dest_folder}
+
+# min_lon, max_lon, min_lat, max_lat
+# To precut the input data prior to interpolation (not required, but less expensive if your source code is much bigger than the destination one, which is the case from global ORAS5 to the Southern Ocean (we d
+# Take a good margin (4-5 nodes bigger than the destination grid limit)
+latlon_cut="-180,180,-90,-28"
+
+# Script for converting EOS80 to TEOS-10 (look at the script comments for why it exists and what it does).
+eos10_script="/elic/home/pelletie/oras5/convert_eos10_new.py"
+
+# NEMO destination CDO grid description (see `cdo_get_griddes.sh`).
+dest_gridfile="/mfast/pelletie/oras5/eORCA025-SO_corners.grid"
+
+pid="$$"
+
+for((year=year_start;year<=year_stop;year++)); do
+
+    # Loop over input years.
+    
+    module purge
+    module load 2018b CDO
+
+    # 1) Uncompress, concatenate, pre-cut and merge temperatures and salinities.
+    
+    tar -xvf ${in_folder}/votemper_ORAS5_1m_${year}_opa1.tar.gz
+    cdo -O copy votemper_ORAS5_1m_${year}??_grid_T_02.nc tmp_pid${pid}.nc
+    rm -f votemper_ORAS5_1m_${year}??_grid_T_02.nc
+    cdo -O sellonlatbox,${latlon_cut} tmp_pid${pid}.nc tmp2_pid${pid}.nc
+
+    tar -xvf ${in_folder}/vosaline_ORAS5_1m_${year}_opa1.tar.gz
+    cdo -O copy vosaline_ORAS5_1m_${year}??_grid_T_02.nc tmp_pid${pid}.nc
+    rm -f vosaline_ORAS5_1m_${year}??_grid_T_02.nc
+    cdo -O sellonlatbox,${latlon_cut} tmp_pid${pid}.nc tmp3_pid${pid}.nc
+
+    cdo -O merge tmp2_pid${pid}.nc tmp3_pid${pid}.nc tmp_pid${pid}.nc
+
+    # 2) Convert EOS8 thermodynamics to TEOS-10 ones (look at the script to get why/what).
+    module purge;
+    module load ESMValTool;
+    PYTHONPATH="/elic/home/pelletie/.local/lib/python3.6/site-packages:${PYTHONPATH}"
+    python ${eos10_script} tmp_pid${pid}.nc tmp2_pid${pid}.nc
+
+    # 3) Uncompress, concatenate, pre-cut and merge ssh with TEOS-10 temperatures and salinities.
+    module purge
+    module load 2018b CDO
+    tar -xvf ${in_folder}/sossheig_ORAS5_1m_${year}_opa1.tar.gz
+    cdo -O copy sossheig_ORAS5_1m_${year}??_grid_T_02.nc tmp_pid${pid}.nc
+    rm -f sossheig_ORAS5_1m_${year}??_grid_T_02.nc
+    cdo -O sellonlatbox,${latlon_cut} tmp_pid${pid}.nc tmp3_pid${pid}.nc
+    cdo -O merge tmp3_pid${pid}.nc tmp2_pid${pid}.nc tmp_pid${pid}.nc
+
+
+    # 4) Remap (bilinear) and drown all "T points" variables (temp/sal/ssh) to your NEMO grid.
+    
+    cdo remapbil,${dest_gridfile} tmp_pid${pid}.nc tmp2_pid${pid}.nc
+    # For this interpolation, I have not put a src_gridfile, because this particular dataset (ORAS5) has 'good' netCDF attributes.
+    # IF the input data has:
+    # - a longitude variable with netCDF attributes standard_name="longitude", units="degrees East"
+    # - a latitude variable with netCDF attributes standard_name="latitude", units="degrees North"
+    # - at least one variable with a a netCDF attribute coordinates='<time> <depth> <lat> <lon>', where <time> is the name of the time variable; <depth> is the name of the depth variable (if the field is 3D); <lat> is the name of the latitude variable; <lon> is the name of the longitude variable.
+    # THEN things go well, CDO recognizes the input grid.
+    # And this happens to be the case with ORAS5 because it's a reanalysis from ECMWF, and they're standardized.
+    # ELSE, you need to explicitly generate (see `get_cdo_griddes.sh`) and specify the input grid description in the remap call, which would have been: `cdo remapbil,${dest_gridfile} -setgrid,${src_gridfile} tmp_pid${pid}.nc tmp2_pid${pid}.nc`
+
+    # Drown output fields. This is super lengthy because it computes distances (in kilometers).
+    # There are other much more efficient ways to do it (discrete nearest-neighbor would be fine), but I'm too lazy.
+    # I do have a function for doing it in nn_interp.py, though. Feel free to plug it in somehow.
+    # Else, do it overnight/the weekend, I guess...    
+    cdo setmisstonn tmp2_pid${pid}.nc ORAS5_preinterp_opa1_grid_T_${year}.nc
+    
+    module load NCO
+
+    # 5) Uncompress, concatenate, pre-cut U velocities.
+    tar -xvf ${in_folder}/vozocrtx_ORAS5_1m_${year}_opa1.tar.gz
+    cdo -O copy vozocrtx_ORAS5_1m_${year}??_grid_U_02.nc tmp_pid${pid}.nc
+    rm -f vozocrtx_ORAS5_1m_${year}??_grid_U_02.nc
+    cdo -O sellonlatbox,${latlon_cut} tmp_pid${pid}.nc tmp3_pid${pid}.nc
+
+    # 6) Get barotropic U velocities from full 3D velocities.
+    # vozocrtx(t,z,y,x) = u_barotropic(t,y,x) + u_baroclinic(t,z,y,x), where u_barotropic is the vertical average of u_3d(t,z,y,x).
+    # First, we get u_barotropic by vertical averaging, then we get u_baroclinic by removing it from vozocrtx.
+    cdo vertmean tmp3_pid${pid}.nc tmp2_pid${pid}.nc
+    ncks -C -v vozocrtx tmp2_pid${pid}.nc -O tmp_pid${pid}.nc
+    ncwa -O -a depthu tmp_pid${pid}.nc tmp2_pid${pid}.nc
+    ncrename -v vozocrtx,uobarotropic tmp2_pid${pid}.nc
+    ncatted -a standard_name,uobarotropic,o,c,"barotropic_currents_in_x_direction" -a long_name,uobarotropic,o,c,"barotropic currents in x direction" tmp2_pid${pid}.nc
+    ncks tmp2_pid${pid}.nc -A tmp3_pid${pid}.nc
+    ncap2 -O -s 'uobaroclinic=vozocrtx-uobarotropic;' tmp3_pid${pid}.nc tmp2_pid${pid}.nc
+    ncks -x -v vozocrtx tmp2_pid${pid}.nc -O tmp3_pid${pid}.nc
+    ncatted -a standard_name,'baroclinic_currents_in_x_direction' -a long_name,uobaroclinic,o,c,"baroclinic currents in x direction" -a units,uobaroclinic,o,c,"m s-1" -a coordinates,uobaroclinic,o,c,"time_counter depthu nav_lat nav_lon" tmp3_pid${pid}.nc
+
+    # 7) Remap both u velocities (barotropic and baroclinic) to destination grid, and drown.
+    # Here they're remapped to T nodes, which is not correct (but it's not a big deal).
+    # All interpolations and shifted by half a cell in the x direction, sorry.
+    # If you want to correct that, you should generate a ${dest_gridfile} which points to U nodes.
+    cdo remapbil,${dest_gridfile} tmp3_pid${pid}.nc tmp2_pid${pid}.nc
+    cdo setmisstonn tmp2_pid${pid}.nc ORAS5_preinterp_opa1_grid_U_${year}.nc
+
+    # 8) Uncompress, concatenate, pre-cut u velocities.
+    tar -xvf ${in_folder}/vomecrty_ORAS5_1m_${year}_opa1.tar.gz
+    cdo -O copy vomecrty_ORAS5_1m_${year}??_grid_V_02.nc tmp_pid${pid}.nc
+    rm -f vomecrty_ORAS5_1m_${year}??_grid_V_02.nc
+    cdo -O sellonlatbox,${latlon_cut} tmp_pid${pid}.nc tmp3_pid${pid}.nc
+
+    # 9) Get barotropic V velocities from full 3D velocities.
+    # vomecrty(t,z,y,x) = v_barotropic(t,y,x) + v_baroclinic(t,z,y,x), where v_barotropic is the vertical average of vomecrty(t,z,y,x).
+    # First, we get v_barotropic by vertical averaging, then we get v_baroclinic by removing it from vomecrty.
+    cdo vertmean tmp3_pid${pid}.nc tmp2_pid${pid}.nc
+    ncks -C -v vomecrty tmp2_pid${pid}.nc -O tmp_pid${pid}.nc
+    ncwa -O -a depthv tmp_pid${pid}.nc tmp2_pid${pid}.nc
+    ncrename -v vomecrty,vobarotropic tmp2_pid${pid}.nc
+    ncatted -a standard_name,vobarotropic,o,c,"barotropic_currents_in_x_direction" -a long_name,vobarotropic,o,c,"barotropic currents in x direction" tmp2_pid${pid}.nc
+    ncks tmp2_pid${pid}.nc -A tmp3_pid${pid}.nc
+    ncap2 -O -s 'vobaroclinic=vomecrty-vobarotropic;' tmp3_pid${pid}.nc tmp2_pid${pid}.nc
+    ncks -x -v vomecrty tmp2_pid${pid}.nc -O tmp3_pid${pid}.nc
+    ncatted -a standard_name,'baroclinic_currents_in_x_direction' -a long_name,vobaroclinic,o,c,"baroclinic currents in x direction" -a units,vobaroclinic,o,c,"m s-1" -a coordinates,vobaroclinic,o,c,"time_counter depthv nav_lat nav_lon" tmp3_pid${pid}.nc
+
+    # 10) Remap both v velocities (barotropic and baroclinic) to destination grid, and drown.
+    # Here they're remapped to T nodes, which is not correct (but it's not a big deal).
+    # All interpolations and shifted by half a cell in the y direction, sorry.
+    # If you want to correct that, you should generate a ${dest_gridfile} which points to V nodes.
+    cdo remapbil,${dest_gridfile} tmp3_pid${pid}.nc tmp2_pid${pid}.nc
+    cdo setmisstonn tmp2_pid${pid}.nc ORAS5_preinterp_opa1_grid_V_${year}.nc
+    
+done
+
+rm -f ./*pid${pid}*

+ 14 - 0
nemo_bdy/readme.md

@@ -0,0 +1,14 @@
+# Generating lateral NEMO ocean boundary condition.
+
+Scripts for generating lateral ocean boundary condition from EC-Earth ORCA1 outputs.
+
+Steps:
+
+  1. Run `python generate_new_bdy_geom.py`. All inputs are defined on top of the script with explicit comments.
+  2. Run `prepare_ecearth_paraso.sh`. All inputs are defined on top of the script with explicit comments. Be sure to use the "geometry" file that the script in 1. created.
+
+Then you should be settled, i.e. have boundary geometry and data files. Both scripts defined above are commented. 
+
+Step 2. is quite long, because one intermediate step is to remap the EC-Earth data to the full NEMO grid. It's an overkill (in the end, we're only interested in the boundary), but it's not straightforwartd to make a simple (in terms of code readability) implementation which would work without that. For example, on pelican, generating one year of monthly boundary data for PARASO will take about several hours. If you want to have daily data for PARASO (I haven't done that), you'll thus need about more than a day per year, and you'll generate intermediate files (which will be cleant off) of about 150GB. In terms of CPU, the bottleneck is mostly drowning the full 3D fields (`cdo setmisstonn` in `prepare_ecearth_paraso.sh`), so if you want to speed things up, start with that.
+
+It may be worth saying that N. Jourdain (IGE, Grenoble) has a more general tool for building configs which is called [BUILD_CONFIG_NEMO](https://github.com/nicojourdain/BUILD_CONFIG_NEMO).