Title: Assessing Skill of Climate Forecasts on Seasonal-to-Decadal Timescales
Version: 5.3.0
Description: Exploits dynamical seasonal forecasts to provide information relevant to stakeholders at the seasonal timescale. The package contains process-based methods for forecast calibration, bias correction, statistical and stochastic downscaling, optimal forecast combination and multivariate verification, as well as basic and advanced tools to obtain tailored products. It was developed in the context of the ERA4CS project MEDSCOPE and the H2020 S2S4E project, and includes contributions from the ArticXchange project funded by EU-PolarNet 2. Implements methods described in Pérez-Zanón et al. (2022) <doi:10.5194/gmd-15-6115-2022>, Doblas-Reyes et al. (2005) <doi:10.1111/j.1600-0870.2005.00104.x>, Mishra et al. (2018) <doi:10.1007/s00382-018-4404-z>, Sanchez-Garcia et al. (2019) <doi:10.5194/asr-16-165-2019>, Straus et al. (2007) <doi:10.1175/JCLI4070.1>, Terzago et al. (2018) <doi:10.5194/nhess-18-2825-2018>, Torralba et al. (2017) <doi:10.1175/JAMC-D-16-0204.1>, D'Onofrio et al. (2014) <doi:10.1175/JHM-D-13-096.1>, Verfaillie et al. (2017) <doi:10.5194/gmd-10-4257-2017>, Van Schaeybroeck et al. (2019) <doi:10.1016/B978-0-12-812372-0.00010-8>, Yiou et al. (2013) <doi:10.1007/s00382-012-1626-3>.
Depends: R (>= 3.5.0), maps, qmap, easyVerification
Imports: s2dv, startR, rainfarmr, multiApply (>= 2.1.1), ClimProjDiags, ncdf4, plyr, abind, data.table, reshape2, ggplot2, RColorBrewer, graphics, grDevices, stats, utils, verification, lubridate, scales, easyNCDF, dplyr
Suggests: zeallot, testthat, knitr, markdown, rmarkdown
VignetteBuilder: knitr
License: GPL-3
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.1
Config/testthat/edition: 3
NeedsCompilation: yes
Packaged: 2025-11-14 10:24:46 UTC; tkariyat
Author: Nuria Perez-Zanon
Maintainer: Theertha Kariyathan <theertha.kariyathan@bsc.es>
Repository: CRAN
Date/Publication: 2025-11-14 12:40:27 UTC
AdamontQQCorr computes quantile-quantile correction of seasonal or decadal forecast data using weather types
Description
This function computes a quantile mapping based on weather types
for experiment data (typically a hindcast) onto reference obs,
typically provided by reanalysis data.
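The core of the technique can be illustrated outside the package: within each weather type, hindcast values are mapped onto the observed distribution through their empirical quantiles. Below is a minimal, hypothetical sketch of this idea in base R (the actual function operates on named-dimension arrays, applies the correction along 'corrdims' and optionally regrids first):
set.seed(1)
exp_vals <- rnorm(100, mean = 1)           # toy hindcast values
obs_vals <- rnorm(100)                     # toy reference values
wt_e <- sample(1:3, 100, replace = TRUE)   # weather type of each hindcast value
wt_o <- sample(1:3, 100, replace = TRUE)   # weather type of each reference value
corrected <- exp_vals
for (wt in 1:3) {
  e <- exp_vals[wt_e == wt]
  o <- obs_vals[wt_o == wt]
  # Map each hindcast value to the observed value with the same
  # empirical probability, within this weather type only
  probs <- ecdf(e)(e)
  corrected[wt_e == wt] <- quantile(o, probs = probs, names = FALSE)
}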
Usage
AdamontQQCorr(
exp,
wt_exp,
obs,
wt_obs,
corrdims = c("member", "sdate", "ftime"),
londim = "lon",
latdim = "lat",
regrid = FALSE,
NN = NULL
)
Arguments
exp: Array with named dimensions (such as the $data array of an object of class 's2dv_cube') containing the experiment data.
wt_exp: Corresponding weather types (same dimensions as 'exp' above, without lat/lon).
obs: Array with named dimensions containing the reference data (can also be the $data array of an object of class 's2dv_cube').
wt_obs: Corresponding weather types (same dimensions as 'obs' above, without lat/lon).
corrdims: List of dimensions in 'exp' for which the quantile mapping correction is computed.
londim: Character name of the longitude dimension in 'exp' and 'obs'.
latdim: Character name of the latitude dimension in 'exp' and 'obs'.
regrid: (Optional) Boolean indicating whether .NearestNeighbors regridding is needed.
NN: (Optional, required if regrid = TRUE) List (output from .NearestNeighbors) mapping the (nlat, nlon) grid onto the (nlat_o, nlon_o) grid.
Value
An array (such as the $data array from an object of class 's2dv_cube') with named dimensions, containing the experiment data on the lat/lon grid of the 'obs' array, corrected by quantile mapping depending on the weather types 'wt_exp'.
Author(s)
Paola Marson, paola.marson@meteo.fr for PROSNOW version
Lauriane Batté, lauriane.batte@meteo.fr for CSTools adaptation
Examples
wt_exp <- c(1,1,2,3,3,2,2,1,1,2,2,3)
dim(wt_exp) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3)
wt_obs <- c(3,3,1,2,2,2,2,1,3,1,1,2)
dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3)
exp <- 1 : c(1 * 1 * 4 * 3 * 4 * 4)
dim(exp) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3,
lat = 4, lon = 4)
obs <- 101 : c(100 + 1 * 1 * 4 * 3 * 4 * 4)
dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3,
lat = 4, lon = 4)
exp_corr <- AdamontQQCorr(exp = exp, wt_exp = wt_exp,
obs = obs, wt_obs = wt_obs,
corrdims = c('dataset', 'member', 'sdate', 'ftime'))
Analogs based on large scale fields.
Description
This function performs a downscaling using analogs. To compute the analogs, the function searches for days with large-scale conditions similar to those of the field to be downscaled to the local scale. The large-scale and local-scale regions are defined by the user. The large scale is usually given by atmospheric circulation, such as sea level pressure or geopotential height (Yiou et al., 2013), but the function gives the possibility to use another field. The local scale will usually be given by precipitation or temperature fields, but it might be another variable. The function finds the best analogs based on one of three criteria: (1) minimum Euclidean distance in the large-scale pattern (i.e. SLP); (2) minimum Euclidean distance in the large-scale pattern (i.e. SLP) and minimum Euclidean distance in the local-scale pattern (i.e. SLP); (3) minimum Euclidean distance in the large-scale pattern (i.e. SLP), minimum distance in the local-scale pattern (i.e. SLP) and highest correlation in the local variable to downscale (i.e. precipitation). The search for analogs must be done in the longest dataset possible. This is important since it is necessary to have a good representation of the possible states of the field in the past in order to get better analogs. Once the search is complete, and in order to use the three criteria, the user can select a number of analogs with the parameter 'nAnalogs' to restrict the selection of the best analogs to a small number of candidates, the best ones. This function has no constraints regarding specific regions, variables to downscale, or data to be used (seasonal forecast data, climate projections data, reanalysis data). The regridding into a finer scale is done by interpolation with CST_Start. This interpolation is then corrected by selecting the analogs in the large and local scales based on the observations. The function is an adapted version of the method of Yiou et al. (2013).
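As a concrete illustration of the 'Large_dist' criterion, the sketch below selects, from a toy set of observed large-scale fields, the day closest in Euclidean distance to an experiment field (all names and data here are hypothetical; the Analogs() function itself handles named dimensions, dates and the other criteria):
set.seed(1)
obs_fields <- array(rnorm(10 * 4 * 5), dim = c(time = 10, lat = 4, lon = 5))
exp_field <- obs_fields[3, , ] + rnorm(20, sd = 0.1)  # day 3 plus small noise
# Euclidean distance between the experiment field and each observed day
dist <- apply(obs_fields, 1, function(day) sqrt(sum((day - exp_field)^2)))
best_analog <- which.min(dist)  # expected to recover day 3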
Usage
Analogs(
expL,
obsL,
time_obsL,
time_expL = NULL,
lonL = NULL,
latL = NULL,
expVar = NULL,
obsVar = NULL,
sdate_dim = "sdate",
criteria = "Large_dist",
excludeTime = NULL,
lonVar = NULL,
latVar = NULL,
region = NULL,
nAnalogs = NULL,
AnalogsInfo = FALSE,
ncores = NULL
)
Arguments
expL: An array of N named dimensions containing the experimental field on the large scale for which the analog is sought. This field is used in all the criteria. If parameter 'expVar' is not provided, the function will return the analog of 'expL'. The array must have, at least, latitudinal and longitudinal dimensions, and is expected to be already subset for the desired large-scale region. Accepted latitudinal dimension names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. Accepted longitudinal dimension names: 'lon', 'lons', 'longitude', 'x', 'i', 'nav_lon'.
obsL: An array of N named dimensions containing the observational field on the large scale. It must have the same latitudinal and longitudinal dimensions as parameter 'expL' and a single temporal dimension with the maximum number of available observations.
time_obsL: A character string indicating the dates of the observations in the format "dd-mm-yyyy". Reference times to search for analogs. It must have a time dimension.
time_expL: An array of N named dimensions (coinciding with the time dimensions in 'expL') of character string(s) indicating the date(s) of the experiment in the format "dd-mm-yyyy". Time(s) for which to find the analogs. If it is not a scalar, it must have named dimensions.
lonL: A vector containing the longitudes of parameter 'expL'.
latL: A vector containing the latitudes of parameter 'expL'.
expVar: An array of N named dimensions containing the experimental field on the local scale, usually a different variable from parameter 'expL'. If it is not NULL (default NULL), the field returned by this function will be the analog of 'expVar'.
obsVar: An array of N named dimensions containing the field of the same variable as parameter 'expVar' for the same region.
sdate_dim: A character string indicating the name of the start date dimension. By default, it is set to 'sdate'.
criteria: A character string indicating the criterion to be used for the selection of analogs: 'Large_dist', 'Local_dist' or 'Local_cor' (see Description).
excludeTime: An array of N named dimensions (coinciding with the time dimensions in 'expL') of character string(s) indicating the date(s) of the observations in the format "dd/mm/yyyy" to be excluded during the search for analogs. It can be NULL, but if 'expL' is not a forecast ('time_expL' contained in 'time_obsL'), by default 'time_expL' will be removed during the search for analogs.
lonVar: A vector containing the longitudes of parameter 'expVar'.
latVar: A vector containing the latitudes of parameter 'expVar'.
region: A vector of length four indicating the minimum longitude, the maximum longitude, the minimum latitude and the maximum latitude.
nAnalogs: Number of analogs to be selected to apply the criteria 'Local_dist' or 'Local_cor'. This is not necessarily the number of analogs that the user gets, but the number of minimum-distance events over which the search for the best analog is performed. The default value for the 'Large_dist' criterion is 1; for 'Local_dist' and 'Local_cor' it must be greater than 1 in order to match with the first criterion. If nAnalogs is NULL for 'Local_dist' and 'Local_cor', the default value will be set to the length of 'time_obsL'. If AnalogsInfo is FALSE, the function returns just the best analog.
AnalogsInfo: A logical value. If TRUE, it returns a list with two elements: 1) the downscaled field and 2) the AnalogsInfo, which contains a) the number of the best analogs, b) the corresponding value of the metric used in the selected criterion (distance values for Large_dist and Local_dist, correlation values for Local_cor), and c) the dates of the analogs. The analogs are listed in decreasing order: the first one is the best analog (i.e. if the selected criterion is Local_cor, the best analog will be the one with highest correlation, while for the Large_dist criterion the best analog will be the day with minimum Euclidean distance). Set to FALSE to get a single analog, the best one, for instance for downscaling.
ncores: The number of cores to use in parallel computation.
Value
An array with the downscaled values of the best analogs for the selected criterion. If 'AnalogsInfo' is set to TRUE, it returns a list with an array of the downscaled fields and the analogs information in elements 'analogs', 'metric' and 'dates'.
Author(s)
M. Carmen Alvarez-Castro, carmen.alvarez-castro@cmcc.it
Maria M. Chaves-Montero, mariadm.chaves@cmcc.it
Veronica Torralba, veronica.torralba@cmcc.it
Nuria Perez-Zanon nuria.perez@bsc.es
References
Yiou, P., T. Salameh, P. Drobinski, L. Menut, R. Vautard, and M. Vrac, 2013: Ensemble reconstruction of the atmospheric column from surface pressure using analogues. Clim. Dyn., 41, 1419-1437. pascal.yiou@lsce.ipsl.fr
Examples
# Example 1: Downscaling using criteria 'Large_dist' and a single variable:
expSLP <- rnorm(1:20)
dim(expSLP) <- c(lat = 4, lon = 5)
obsSLP <- c(rnorm(1:180), expSLP * 1.2)
dim(obsSLP) <- c(time = 10, lat = 4, lon = 5)
time_obsSLP <- paste(rep("01", 10), rep("01", 10), 1994 : 2003, sep = "-")
dim(time_obsSLP) <- c(time = 10)
downscale_field <- Analogs(expL = expSLP, obsL = obsSLP,
time_obsL = time_obsSLP,time_expL = "01-01-1994")
# Example 2: Downscaling using criteria 'Large_dist' and 2 variables:
obs.pr <- c(rnorm(1:200) * 0.001)
dim(obs.pr) <- dim(obsSLP)
downscale_field <- Analogs(expL = expSLP, obsL = obsSLP, obsVar = obs.pr,
time_obsL = time_obsSLP, time_expL = "01-01-1994")
# Example 3: Downscaling using criteria 'Local_dist' and 2 variables:
# analogs of local scale using criteria 2
region = c(lonmin = -1 ,lonmax = 2, latmin = 30, latmax = 33)
Local_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP,
obsVar = obs.pr, criteria = "Local_dist",
lonL = seq(-1, 5, 1.5),latL = seq(30, 35, 1.5),
region = region,time_expL = "01-10-2000",
nAnalogs = 10, AnalogsInfo = TRUE)
# Example 4: Downscaling using criteria 'Local_cor' and 2 variables:
exp.pr <- c(rnorm(1:20) * 0.001)
dim(exp.pr) <- dim(expSLP)
Local_scalecor <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP,
obsVar = obs.pr, expVar = exp.pr,
criteria = "Local_cor", lonL = seq(-1, 5, 1.5),
time_expL = "01-10-2000", latL = seq(30, 35, 1.5),
lonVar = seq(-1, 5, 1.5), latVar = seq(30, 35, 1.5),
nAnalogs = 8, region = region, AnalogsInfo = FALSE)
# Example 5: List of best analogs using the three criteria: 'Large_dist',
# 'Local_dist' and 'Local_cor':
Large_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP,
criteria = "Large_dist", time_expL = "01-10-2000",
nAnalogs = 7, AnalogsInfo = TRUE)
Local_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP,
time_expL = "01-10-2000", criteria = "Local_dist",
lonL = seq(-1, 5, 1.5), latL = seq(30, 35, 1.5),
nAnalogs = 7, region = region, AnalogsInfo = TRUE)
Local_scalecor <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP,
obsVar = obsSLP, expVar = expSLP,
time_expL = "01-10-2000", criteria = "Local_cor",
lonL = seq(-1, 5, 1.5), latL = seq(30, 35, 1.5),
lonVar = seq(-1, 5, 1.5), latVar = seq(30, 35, 1.5),
nAnalogs = 7, region = region,
AnalogsInfo = TRUE)
Calculate the spatial area-weighted average of multidimensional arrays.
Description
This function computes a spatial area-weighted average of n-dimensional arrays given a spatial mask containing the area (e.g. in m^2) for each grid box.
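The computation being wrapped is a weighted mean with grid-box areas as weights; a minimal sketch for a single field (toy data, without the named-dimension handling of the function):
field <- matrix(rnorm(5 * 3, mean = 10), nrow = 5)  # toy lon x lat field
area <- matrix(runif(5 * 3), nrow = 5)              # grid-box areas (e.g. m^2)
weighted_avg <- sum(field * area) / sum(area)       # area-weighted average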
Usage
AreaWeighted(data, area, lon_dim = "lon", lat_dim = "lat", ncores = NULL)
Arguments
data: A multidimensional array with named dimensions (at least lon_dim and lat_dim) containing climate data.
area: A multidimensional array with named dimensions (at least lon_dim and lat_dim) containing the area of each grid box. The resolution, length and order of the lon_dim and lat_dim dimensions should be identical to those of 'data'.
lon_dim: A character string indicating the name of the longitudinal dimension. The default value is 'lon'.
lat_dim: A character string indicating the name of the latitudinal dimension. The default value is 'lat'.
ncores: An integer indicating the number of cores to use for parallel computation. The default value is NULL.
Value
An array with the same dimensions as data, except with
lon_dim and lat_dim removed. Any extra dimensions in
area not present in data are preserved in the output (e.g.,
region dimension).
Examples
data <- array(data = 1:10, dim = c(year = 10, lon = 5, lat = 3))
area <- array(data = 1:10, dim = c(lon = 5, lat = 3))
AreaWeighted(data, area)
# With extra region dimension
area <- array(data = 1:10, dim = c(lon = 5, lat = 3, region = 4))
AreaWeighted(data, area)
Computing the weighted ensemble means for SFSs.
Description
This function implements the computation of the weighted ensemble means for SFSs using a normalized weights array.
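For a single time step, the weighted ensemble mean is simply the sum of the member values multiplied by their normalized weights; a toy sketch:
members <- c(2.1, 1.8, 2.5, 2.0)   # member values at one time step
aweights <- c(0.2, 0.1, 0.3, 0.4)  # normalized weights (sum to 1)
em <- sum(members * aweights)      # weighted ensemble mean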
Usage
BEI_EMWeighting(var_exp, aweights, time_dim_name = "time", memb_dim = "member")
Arguments
var_exp: Variable (e.g. precipitation, temperature, NAO index) array from an SFS with at least dimensions (time, member) for a spatially aggregated variable or dimensions (time, member, lat, lon) for a spatial variable, where 'time' is the temporal dimension by default.
aweights: Normalized weights array with at least dimensions (time, member), where 'time' is the temporal dimension by default.
time_dim_name: A character string indicating the name of the temporal dimension, by default 'time'.
memb_dim: A character string indicating the name of the member dimension, by default 'member'.
Value
BEI_EMWeighting() returns an array with at least one dimension (time) if the variable is spatially aggregated (e.g. an NAO index), or three dimensions (time, lat, lon) if it is a spatial variable (e.g. precipitation or temperature), containing the ensemble means computed with weighted members.
Author(s)
Eroteida Sanchez-Garcia - AEMET, esanchezg@aemet.es
References
Regionally improved seasonal forecast of precipitation through Best estimation of winter NAO, Sanchez-Garcia, E. et al., Adv. Sci. Res., 16, 165-174, 2019, doi:10.5194/asr-16-165-2019
Examples
# Example 1
var_exp <- 1 : (2 * 3 * 4)
dim(var_exp) <- c(time = 2, dataset = 3, member = 4)
aweights <- runif(24, min = 0.001, max = 0.999)
dim(aweights) <- c(time = 2, dataset = 3, member = 4)
res <- BEI_EMWeighting(var_exp, aweights)
# Example 2
var_exp <- 1 : (2 * 4 * 2 * 3)
dim(var_exp) <- c(time = 2, member = 4, lat = 2, lon = 3)
aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
dim(aweights) <- c(time = 2, member = 4)
res <- BEI_EMWeighting(var_exp, aweights)
Computing the Best Index PDFs combining Index PDFs from two SFSs
Description
This function implements the computation to obtain the index Probability Density Functions (PDFs) (e.g. of the NAO index) obtained by combining the index PDFs of two Seasonal Forecast Systems (SFSs): the Best Index estimation (see Sanchez-Garcia, E. et al. (2019), doi:10.5194/asr-16-165-2019, for more details about the methodology applied to estimate the Best Index).
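As a rough illustration only (an assumption for intuition, not the package's exact formula, which follows Sanchez-Garcia et al. (2019)), a precision-weighted combination of two normal index PDFs can be sketched as:
m1 <- 0.2; s1 <- 3   # toy mean and sd of the SFS1 index PDF
m2 <- -0.5; s2 <- 4  # toy mean and sd of the SFS2 index PDF
w1 <- 1 / s1^2; w2 <- 1 / s2^2               # precisions as weights
best_mean <- (w1 * m1 + w2 * m2) / (w1 + w2) # mean of the combined PDF
best_sd <- sqrt(1 / (w1 + w2))               # sd of the combined PDF
c(mean = best_mean, sd = best_sd)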
Usage
BEI_PDFBest(
index_obs,
index_hind1,
index_hind2,
index_fcst1 = NULL,
index_fcst2 = NULL,
method_BC = "none",
time_dim_name = "time",
na.rm = FALSE
)
Arguments
index_obs: Index (e.g. NAO index) array from an observational database or reanalysis with at least a temporal dimension (by default 'time'), whose length must be greater than 2.
index_hind1: Index (e.g. NAO index) array from an SFS (named SFS1) with at least two dimensions (time, member) or (time, statistic). The temporal dimension, by default 'time', must be greater than 2. The dimension 'member' must be greater than 1. The dimension 'statistic' must be equal to 2, containing the two parameters of a normal distribution (mean and sd) representing the ensemble of the SFS. It is not possible to have the dimensions 'member' and 'statistic' at the same time.
index_hind2: Index (e.g. NAO index) array from an SFS (named SFS2), with the same requirements as 'index_hind1'.
index_fcst1: (Optional, default = NULL) Index (e.g. NAO index) array from forecasts of SFS1 with at least two dimensions (time, member) or (time, statistic). The temporal dimension, by default 'time', must be equal to 1 (the forecast year target). The dimension 'member' must be greater than 1. The dimension 'statistic' must be equal to 2, containing the two parameters of a normal distribution (mean and sd) representing the ensemble of the SFS. It is not possible to have the dimensions 'member' and 'statistic' at the same time.
index_fcst2: (Optional, default = NULL) Index (e.g. NAO index) array from forecasts of SFS2, with the same requirements as 'index_fcst1'.
method_BC: A character vector of maximum length 2 indicating the bias correction methodology to be applied to each SFS. If it is 'none', or any of its elements is 'none', the bias correction won't be applied. Available methods are "ME" (a bias correction scheme based on the mean error or bias between observations and predictions to correct the predicted values) and "LMEV" (a bias correction scheme based on a linear model using the ensemble variance of the index as predictor). (See Sanchez-Garcia, E. et al. (2019), doi:10.5194/asr-16-165-2019 for more details.)
time_dim_name: A character string indicating the name of the temporal dimension, by default 'time'.
na.rm: Logical (default = FALSE). Should missing values be removed?
Value
BEI_PDFBest() returns an array with the parameters that characterize the PDFs, with at least a temporal dimension (by default 'time') and a dimension 'statistic' equal to 2. The first statistic is the parameter 'mean' of the PDF for the best estimation combining the two SFSs' PDFs; the second statistic is the parameter 'standard deviation' of that PDF. If index_fcst1 and/or index_fcst2 are NULL, it returns the values for the hindcast period. Otherwise, it returns the values for a forecast year.
Author(s)
Eroteida Sanchez-Garcia - AEMET, esanchezg@aemet.es
References
Regionally improved seasonal forecast of precipitation through Best estimation of winter NAO, Sanchez-Garcia, E. et al., Adv. Sci. Res., 16, 165-174, 2019, doi:10.5194/asr-16-165-2019
Examples
# Example 1 for the BEI_PDFBest function
index_obs <- rnorm(10, sd = 3)
dim(index_obs) <- c(time = 5, season = 2)
index_hind1 <- rnorm(40, mean = 0.2, sd = 3)
dim(index_hind1) <- c(time = 5, member = 4, season = 2)
index_hind2 <- rnorm(60, mean = -0.5, sd = 4)
dim(index_hind2) <- c(time = 5, member = 6, season = 2)
index_fcst1 <- rnorm(16, mean = 0.2, sd = 3)
dim(index_fcst1) <- c(time = 1, member = 8, season = 2)
index_fcst2 <- rnorm(18, mean = -0.5, sd = 4)
dim(index_fcst2) <- c(time = 1, member = 9, season = 2)
method_BC <- 'ME'
res <- BEI_PDFBest(index_obs, index_hind1, index_hind2, index_fcst1,
index_fcst2, method_BC)
# Example 2 for the BEI_PDFBest function
index_obs <- rnorm(10, sd = 3)
dim(index_obs) <- c(time = 5, season = 2)
index_hind1 <- rnorm(40, mean = 0.2, sd = 3)
dim(index_hind1) <- c(time = 5, member = 4, season = 2)
index_hind2 <- rnorm(60, mean = -0.5, sd = 4)
dim(index_hind2) <- c(time = 5, member = 6, season = 2)
index_fcst1 <- rnorm(16, mean = 0.2, sd = 3)
dim(index_fcst1) <- c(time = 1, member = 8, season = 2)
index_fcst2 <- rnorm(18, mean = -0.5, sd = 4)
dim(index_fcst2) <- c(time = 1, member = 9, season = 2)
method_BC <- c('LMEV', 'ME')
res <- BEI_PDFBest(index_obs, index_hind1, index_hind2, index_fcst1,
index_fcst2, method_BC)
Computing the weighted tercile probabilities for SFSs.
Description
This function implements the computation of the tercile probabilities for a weighted variable for SFSs using a normalized weights array.
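For one time step, the probability of each tercile category can be obtained by summing the weights of the members falling into it; a toy sketch (the boundary convention, <= versus <, is an assumption here):
members <- c(2.1, 4.8, 5.5, 3.0)   # member values at one time step
aweights <- c(0.2, 0.1, 0.3, 0.4)  # normalized weights
terciles <- c(2.5, 5)              # lower and upper tercile thresholds
probs <- c(below = sum(aweights[members <= terciles[1]]),
           normal = sum(aweights[members > terciles[1] & members <= terciles[2]]),
           above = sum(aweights[members > terciles[2]]))
probs  # the three category probabilities sum to 1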
Usage
BEI_ProbsWeighting(
var_exp,
aweights,
terciles,
time_dim_name = "time",
memb_dim = "member"
)
Arguments
var_exp: Variable (e.g. precipitation, temperature, NAO index) array from an SFS with at least dimensions (time, member) for a spatially aggregated variable or dimensions (time, member, lat, lon) for a spatial variable, where 'time' is the temporal dimension by default.
aweights: Normalized weights array with at least dimensions (time, member), where 'time' is the temporal dimension by default.
terciles: A numeric array with at least one dimension 'tercil' of length 2; the first element is the lower tercile for a hindcast period and the second element is the upper tercile.
time_dim_name: A character string indicating the name of the temporal dimension, by default 'time'.
memb_dim: A character string indicating the name of the member dimension, by default 'member'.
Value
BEI_ProbsWeighting() returns an array with at least two dimensions (time, tercil) if the variable is spatially aggregated (e.g. an NAO index), or four dimensions (time, tercil, lat, lon) if it is a spatial variable (e.g. precipitation or temperature), containing the tercile probabilities computed with weighted members. The first category is the lower tercile, the second the normal category and the third the upper tercile.
Author(s)
Eroteida Sanchez-Garcia - AEMET, esanchezg@aemet.es
References
Regionally improved seasonal forecast of precipitation through Best estimation of winter NAO, Sanchez-Garcia, E. et al., Adv. Sci. Res., 16, 165-174, 2019, doi:10.5194/asr-16-165-2019
Examples
# Example 1
var_exp <- 1 : (2 * 4)
dim(var_exp) <- c(time = 2, member = 4)
aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
dim(aweights) <- c(time = 2, member = 4)
terciles <- c(2.5,5)
dim(terciles) <- c(tercil = 2)
res <- BEI_ProbsWeighting(var_exp, aweights, terciles)
# Example 2
var_exp <- rnorm(48, 50, 9)
dim(var_exp) <- c(time = 2, member = 4, lat = 2, lon = 3)
aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
dim(aweights) <- c(time = 2, member = 4)
terciles <- rep(c(48,50), 2*3)
dim(terciles) <- c(tercil = 2, lat = 2, lon = 3)
res <- BEI_ProbsWeighting(var_exp, aweights, terciles)
Computing the weighted terciles for SFSs.
Description
This function implements the computation of the terciles for a weighted variable for SFSs using a normalized weights array.
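Weighted terciles can be sketched as weighted quantiles: order the member values, accumulate their weights, and take the values where the cumulative weight first reaches 1/3 and 2/3 (one simple convention; the interpolation details of the package may differ):
members <- c(2.1, 4.8, 5.5, 3.0)   # member values
aweights <- c(0.2, 0.1, 0.3, 0.4)  # normalized weights
ord <- order(members)
cw <- cumsum(aweights[ord])        # cumulative weight along sorted values
lower_tercile <- members[ord][which(cw >= 1/3)[1]]
upper_tercile <- members[ord][which(cw >= 2/3)[1]]
c(lower_tercile, upper_tercile)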
Usage
BEI_TercilesWeighting(
var_exp,
aweights,
time_dim_name = "time",
memb_dim = "member"
)
Arguments
var_exp: Variable (e.g. precipitation, temperature, NAO index) array from an SFS with at least dimensions (time, member) for a spatially aggregated variable or dimensions (time, member, lat, lon) for a spatial variable, where 'time' is the temporal dimension by default.
aweights: Normalized weights array with at least dimensions (time, member), where 'time' is the temporal dimension by default.
time_dim_name: A character string indicating the name of the temporal dimension, by default 'time'.
memb_dim: A character string indicating the name of the member dimension, by default 'member'.
Value
BEI_TercilesWeighting() returns an array with at least one dimension (tercil) if the variable is spatially aggregated (e.g. an NAO index), or three dimensions (tercil, lat, lon) if it is a spatial variable (e.g. precipitation or temperature), containing the terciles computed with weighted members. The first element is the lower tercile, the second the upper tercile.
Author(s)
Eroteida Sanchez-Garcia - AEMET, esanchezg@aemet.es
References
Regionally improved seasonal forecast of precipitation through Best estimation of winter NAO, Sanchez-Garcia, E. et al., Adv. Sci. Res., 16, 165-174, 2019, doi:10.5194/asr-16-165-2019
Examples
# Example 1
var_exp <- 1 : (2 * 4)
dim(var_exp) <- c(time = 2, member = 4)
aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
dim(aweights) <- c(time = 2, member = 4)
res <- BEI_TercilesWeighting(var_exp, aweights)
# Example 2
var_exp <- rnorm(48, 50, 9)
dim(var_exp) <- c(time = 2, member = 4, lat = 2, lon = 3)
aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
dim(aweights) <- c(time = 2, member = 4)
res <- BEI_TercilesWeighting(var_exp, aweights)
Computing the weights for SFSs using the Best Index PDFs.
Description
This function implements the computation to obtain the normalized weights for each member of each Seasonal Forecast System (SFS) or dataset using the Probability Density Functions (PDFs) indicated by the parameter 'pdf_weight' (for instance the Best Index estimation obtained with the 'BEI_PDFBest' function). The weight of each member is proportional to the probability of its index calculated with the PDF 'pdf_weight'.
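The idea can be sketched with dnorm(): each member's weight is proportional to the density of its index value under the best-estimate PDF, normalized to sum to one (toy values; the function itself works on named-dimension arrays):
index_members <- c(-0.8, 0.1, 0.5, 1.2)  # toy member index values
pdf_mean <- 0.2; pdf_sd <- 0.6           # toy best-estimate PDF parameters
w <- dnorm(index_members, mean = pdf_mean, sd = pdf_sd)
aweights <- w / sum(w)                   # normalized weights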
Usage
BEI_Weights(index_weight, pdf_weight, time_dim_name = "time")
Arguments
index_weight: Index (e.g. NAO index) array from a dataset of SFSs for a period of years, with at least the dimension 'member'. Additional dimensions, for instance a temporal dimension such as 'time', must have the same length in both parameters, 'index_weight' and 'pdf_weight'.
pdf_weight: Statistics array defining a Gaussian PDF, with at least the dimension 'statistic'. The first statistic is the parameter 'mean' of the PDF and the second statistic is the parameter 'standard deviation' of the PDF.
time_dim_name: A character string indicating the name of the temporal dimension, by default 'time'.
Value
BEI_Weights() returns a normalized weights array with the same dimensions as index_weight.
Author(s)
Eroteida Sanchez-Garcia - AEMET, esanchezg@aemet.es
References
Regionally improved seasonal forecast of precipitation through Best estimation of winter NAO, Sanchez-Garcia, E. et al., Adv. Sci. Res., 16, 165-174, 2019, doi:10.5194/asr-16-165-2019
Examples
# Example for the BEI_Weights function
index_weight <- 1 : (10 * 3 * 5 * 1)
dim(index_weight) <- c(sdate = 10, dataset = 3, member = 5, season = 1)
pdf_weight <- 1 : (10 * 3 * 2 * 1)
dim(pdf_weight) <- c(sdate = 10, dataset = 3, statistic = 2, season = 1)
res <- BEI_Weights(index_weight, pdf_weight)
dim(res)
# sdate dataset member season
# 10 3 5 1
Bias Correction based on the mean and standard deviation adjustment
Description
This function applies the simple bias adjustment technique described in Torralba et al. (2017). The adjusted forecasts have a standard deviation and mean equivalent to those of the reference dataset.
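For a single grid point, the adjustment amounts to rescaling the forecast anomalies so that the corrected series has the mean and standard deviation of the reference; a toy sketch of this idea (the function itself works member-wise on named-dimension arrays):
set.seed(1)
fc <- rnorm(30, mean = 2, sd = 1.5)  # toy raw forecast values
ref <- rnorm(30)                     # toy reference values
fc_corr <- (fc - mean(fc)) * (sd(ref) / sd(fc)) + mean(ref)
c(mean = mean(fc_corr), sd = sd(fc_corr))  # matches mean(ref) and sd(ref)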
Usage
BiasCorrection(
exp,
obs,
exp_cor = NULL,
na.rm = FALSE,
memb_dim = "member",
sdate_dim = "sdate",
dat_dim = NULL,
ncores = NULL
)
Arguments
exp: A multidimensional array with named dimensions containing the seasonal forecast experiment data with at least time and member dimensions.
obs: A multidimensional array with named dimensions containing the observed data with at least a time dimension.
exp_cor: A multidimensional array with named dimensions containing the seasonal forecast experiment to be corrected, with at least time and member dimensions. If it is NULL, the 'exp' forecast will be corrected. If there is only one corrected dataset, it should not have a dataset dimension. If there is a corresponding corrected dataset for each 'exp' forecast, the dataset dimension must have the same length as in 'exp'. The default value is NULL.
na.rm: A logical value indicating whether missing values should be stripped before the computation proceeds; by default it is set to FALSE.
memb_dim: A character string indicating the name of the member dimension. By default, it is set to 'member'.
sdate_dim: A character string indicating the name of the start date dimension. By default, it is set to 'sdate'.
dat_dim: A character string indicating the name of the dataset dimension. The length of this dimension can be different between 'exp' and 'obs'. The default value is NULL.
ncores: An integer that indicates the number of cores for parallel computation using the multiApply function. The default value is NULL.
Value
An array containing the bias-corrected forecasts with the dimensions nexp, nobs and the same dimensions as in the 'exp' object. nexp is the number of experiments (i.e., 'dat_dim' in exp), and nobs is the number of observations (i.e., 'dat_dim' in obs). If dat_dim is NULL, nexp and nobs are omitted. If 'exp_cor' is provided, the returned array will have the same dimensions as 'exp_cor'.
Author(s)
Verónica Torralba, veronica.torralba@bsc.es
References
Torralba, V., F.J. Doblas-Reyes, D. MacLeod, I. Christel and M. Davis (2017). Seasonal climate prediction: a new source of information for the management of wind energy resources. Journal of Applied Meteorology and Climatology, 56, 1231-1247, doi:10.1175/JAMC-D-16-0204.1. (CLIM4ENERGY, EUPORIAS, NEWA, RESILIENCE, SPECS)
Examples
mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7)
dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, time = 5, lat = 6, lon = 7)
obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, time = 5, lat = 6, lon = 7)
a <- BiasCorrection(exp = mod1, obs = obs1)
Bind two arrays by a specified named dimension
Description
This function combines the data inside two or more arrays with named dimensions along the dimension specified in the 'along' parameter. It is a wrapper of the abind() function from the abind package.
Usage
BindDim(x, along)
Arguments
x: A list of two or more arrays with named dimensions to be bound together.
along: A character string indicating the name of the binding dimension.
Value
A single array combining the arrays in x along the specified
dimension, with dimension names. The order of the dimensions will be the
same as in the first array provided in the list.
Author(s)
Agudetse Roures Victoria, victoria.agudetse@bsc.es
Examples
array1 <- array(1:100, dim = c(time = 1, lon = 10, lat = 10))
array2 <- array(101:200, dim = c(lon = 10, lat = 10, time = 1))
# Bind arrays
array3 <- BindDim(x = list(array1, array2),
along = "time")
# Check new dimensions
dim(array3)
CST_AdamontAnalog finds analogous data in the reference dataset to experiment data based on weather types
Description
This function searches for analogs in a reference dataset for experiment data, based on corresponding weather types. The experiment data is typically a hindcast, observations are typically provided by reanalysis data.
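As a toy sketch of the search (hypothetical data and names; the function itself handles named dimensions and the pattern/rain methods): among the observed days sharing the experiment day's weather type, keep the nanalogs days closest to the experiment field.
set.seed(1)
obs_days <- matrix(rnorm(30 * 16), nrow = 30)  # 30 days x 16 grid points
wt_o <- sample(1:3, 30, replace = TRUE)        # weather type of each day
exp_day <- rnorm(16); wt_e <- 2                # experiment day and its type
candidates <- which(wt_o == wt_e)              # same weather type only
dist <- apply(obs_days[candidates, , drop = FALSE], 1,
              function(day) sqrt(sum((day - exp_day)^2)))
nanalogs <- 5
analog_days <- candidates[order(dist)[seq_len(min(nanalogs, length(dist)))]]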
Usage
CST_AdamontAnalog(
exp,
obs,
wt_exp,
wt_obs,
nanalogs,
method = "pattcorr",
thres = NULL,
search_obsdims = c("member", "sdate", "ftime"),
londim = "lon",
latdim = "lat"
)
AdamontAnalog(
exp,
obs,
wt_exp,
wt_obs,
nanalogs = 5,
method = "pattcorr",
thres = NULL,
search_obsdims = c("member", "sdate", "ftime"),
londim = "lon",
latdim = "lat"
)
Arguments
exp: A multidimensional array with named dimensions containing the experiment data.
obs: A multidimensional array with named dimensions containing the reference data. Note that the lat/lon dimensions need to be the same as in 'exp'.
wt_exp: Corresponding weather types (same dimensions as 'exp' without lat/lon).
wt_obs: Corresponding weather types (same dimensions as 'obs' without lat/lon).
nanalogs: Integer defining the number of analog values to return (default: 5).
method: A character string indicating the method used for analog definition: 'pattcorr' for pattern correlation, or 'rain1'/'rain01' (for precipitation patterns) for rain occurrence / rain occurrence and non-occurrence consistency, respectively.
thres: Real number indicating the threshold used to define rain occurrence/non-occurrence in the 'rain1'/'rain01' methods.
search_obsdims: List of dimensions in 'obs' along which analogs are searched for.
londim: Name of the longitude dimension.
latdim: Name of the latitude dimension.
Value
analog_vals An object of class s2dv_cube containing
nanalogs analog values for each value of exp input data.
analog_vals An array containing nanalogs analog values.
Author(s)
Paola Marson, paola.marson@meteo.fr for PROSNOW version
Lauriane Batté, lauriane.batte@meteo.fr for CSTools adaptation
Examples
wt_exp <- sample(1:3, 15*6*3, replace = TRUE)
dim(wt_exp) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3)
wt_obs <- sample(1:3, 6*3, replace = TRUE)
dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3)
exp <- NULL
exp$data <- 1 : c(1 * 15 * 6 * 3 * 8 * 8)
dim(exp$data) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3,
lat = 8, lon = 8)
class(exp) <- 's2dv_cube'
obs <- NULL
obs$data <- 101 : c(100 + 1 * 1 * 6 * 3 * 8 * 8)
dim(obs$data) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3,
lat = 8, lon = 8)
class(obs) <- 's2dv_cube'
analog_vals <- CST_AdamontAnalog(exp = exp, obs = obs, wt_exp = wt_exp,
wt_obs = wt_obs, nanalogs = 2)
wt_exp <- sample(1:3, 15*6*3, replace = TRUE)
dim(wt_exp) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3)
wt_obs <- sample(1:3, 6*3, replace = TRUE)
dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3)
exp <- 1 : c(1 * 15 * 6 * 3 * 8 * 8)
dim(exp) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3, lat = 8, lon = 8)
obs <- 101 : c(100 + 1 * 1 * 6 * 3 * 8 * 8)
dim(obs) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3, lat = 8, lon = 8)
analog_vals <- AdamontAnalog(exp = exp, obs = obs, wt_exp = wt_exp,
wt_obs = wt_obs, nanalogs = 2)
CST_AdamontQQCorr computes quantile-quantile correction of seasonal or decadal forecast data using weather types
Description
This function computes a quantile mapping based on weather types
for experiment data (typically a hindcast) onto reference obs,
typically provided by reanalysis data.
Usage
CST_AdamontQQCorr(
exp,
wt_exp,
obs,
wt_obs,
corrdims = c("member", "sdate", "ftime"),
londim = "lon",
latdim = "lat"
)
Arguments
exp: Experiment data, an object of class 's2dv_cube'.
wt_exp: Corresponding weather types (same dimensions as the 'exp' data without lat/lon).
obs: Reference data, also an object of class 's2dv_cube'.
wt_obs: Corresponding weather types (same dimensions as the 'obs' data without lat/lon).
corrdims: List of dimensions in 'exp' for which the quantile mapping correction is computed.
londim: Character name of the longitude dimension in 'exp' and 'obs'.
latdim: Character name of the latitude dimension in 'exp' and 'obs'.
Value
An object of class s2dv_cube containing experiment data on the
lat/lon grid of obs input data, corrected by quantile mapping
depending on the weather types wt_exp.
Author(s)
Lauriane Batté, lauriane.batte@meteo.fr
Paola Marson, paola.marson@meteo.fr
Gildas Dayon, gildas.dayon@meteo.fr
Examples
wt_exp <- c(1,1,2,3,3,2,2,1,1,2,2,3)
dim(wt_exp) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3)
wt_obs <- c(3,3,1,2,2,2,2,1,3,1,1,2)
dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3)
exp <- NULL
exp$data <- 1 : c(1 * 1 * 4 * 3 * 4 * 4)
dim(exp$data) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3,
lat = 4, lon = 4)
class(exp) <- 's2dv_cube'
obs <- NULL
obs$data <- 101 : c(100 + 1 * 1 * 4 * 3 * 4 * 4)
dim(obs$data) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3,
lat = 4, lon = 4)
class(obs) <- 's2dv_cube'
exp_corr <- CST_AdamontQQCorr(exp = exp, wt_exp = wt_exp,
obs = obs, wt_obs = wt_obs,
corrdims = c('dataset','member','sdate','ftime'))
Downscaling using Analogs based on large scale fields.
Description
This function performs a downscaling using analogs. To compute the analogs, the function searches for days with large-scale conditions similar to those of the field to be downscaled to the local scale. The large-scale and local-scale regions are defined by the user. The large scale is usually given by atmospheric circulation, such as sea level pressure or geopotential height (Yiou et al., 2013), but the function gives the possibility to use another field. The local scale will usually be given by precipitation or temperature fields, but it might be another variable. The function finds the best analogs based on the minimum Euclidean distance in the large-scale pattern (i.e. SLP).
The search for analogs must be done in the longest dataset possible. This is important since it is necessary to have a good representation of the possible states of the field in the past in order to get better analogs. This function has no constraints regarding specific regions, variables to downscale, or data to be used (seasonal forecast data, climate projections data, reanalysis data). The regridding into a finer scale is done by interpolation with CST_Start. This interpolation is then corrected by selecting the analogs in the large and local scales based on the observations. The function is an adapted version of the method of Yiou et al. (2013). For an advanced search of analogs (multiple analogs, different criteria, further information on the metrics and dates of the selected analogs), use the 'Analogs' function of the CSTools package.
Usage
CST_Analogs(
expL,
obsL,
expVar = NULL,
obsVar = NULL,
sdate_dim = "sdate",
region = NULL,
criteria = "Large_dist",
excludeTime = NULL,
time_expL = NULL,
time_obsL = NULL,
nAnalogs = NULL,
AnalogsInfo = FALSE,
ncores = NULL
)
Arguments
expL: An 's2dv_cube' object containing the experimental field on the large scale for which the analog is sought. This field is used in all the criteria. If parameter 'expVar' is not provided, the function will return the analog of 'expL'. The element 'data' in the 's2dv_cube' object must have, at least, latitudinal and longitudinal dimensions. The object is expected to be already subset for the desired large-scale region. Accepted latitudinal dimension names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. Accepted longitudinal dimension names: 'lon', 'lons', 'longitude', 'x', 'i', 'nav_lon'.
obsL: An 's2dv_cube' object containing the observational field on the large scale. The element 'data' in the 's2dv_cube' object must have the same latitudinal and longitudinal dimensions as parameter 'expL' and a temporal dimension with the maximum number of available observations.
expVar: An 's2dv_cube' object containing the experimental field on the local scale, usually a different variable from parameter 'expL'. If it is not NULL (default NULL), the field returned by this function will be the analog of 'expVar'.
obsVar: An 's2dv_cube' object containing the field of the same variable as parameter 'expVar' for the same region.
sdate_dim: A character string indicating the name of the start date dimension. By default, it is set to 'sdate'.
region: A vector of length four indicating the minimum longitude, the maximum longitude, the minimum latitude and the maximum latitude.
criteria: A character string indicating the criterion to be used for the selection of analogs: 'Large_dist', 'Local_dist' or 'Local_cor' (see the Description of the 'Analogs' function). The 'Large_dist' criterion is recommended for CST_Analogs; for an advanced use of the criteria 'Local_dist' and 'Local_cor', use the 'Analogs' function.
excludeTime: An array of N named dimensions (coinciding with the time dimensions in 'expL') of character string(s) indicating the date(s) of the observations in the format "dd/mm/yyyy" to be excluded during the search for analogs. It can be NULL, but if 'expL' is not a forecast ('time_expL' contained in 'time_obsL'), by default 'time_expL' will be removed during the search for analogs.
time_expL: A character string indicating the date of the experiment in the same format as 'time_obsL' (i.e. "yyyy-mm-dd"). By default it is NULL and the dates are taken from the element $attrs$Dates of 'expL'.
time_obsL: A character string indicating the dates of the observations (i.e. "yyyy-mm-dd"). By default it is NULL and the dates are taken from the element $attrs$Dates of 'obsL'.
nAnalogs: Number of analogs to be selected to apply the criteria 'Local_dist' or 'Local_cor'. This is not necessarily the number of analogs that the user gets, but the number of minimum-distance events over which the search for the best analog is performed. The default value for the 'Large_dist' criterion is 1; for 'Local_dist' and 'Local_cor' it must be greater than 1 in order to match with the first criterion. If nAnalogs is NULL for 'Local_dist' and 'Local_cor', the default value will be set to the length of 'time_obsL'. If AnalogsInfo is FALSE, the function returns just the best analog.
AnalogsInfo: A logical value. TRUE to get a list with two elements: 1) the downscaled field and 2) the AnalogsInfo, which contains a) the number of the best analogs, b) the corresponding value of the metric used in the selected criterion (distance values for Large_dist and Local_dist, correlation values for Local_cor), and c) the dates of the analogs. The analogs are listed in decreasing order: the first one is the best analog (i.e. if the selected criterion is Local_cor, the best analog will be the one with highest correlation, while for the Large_dist criterion the best analog will be the day with minimum Euclidean distance). Set to FALSE to get a single analog, the best one, for instance for downscaling.
ncores: The number of cores to use in parallel computation.
Value
An 's2dv_cube' object containing an array with the downscaled values of the best analogs in the element 'data'. If 'AnalogsInfo' is TRUE, 'data' is a list with an array of the downscaled fields and the analogs information in the elements 'analogs', 'metric' and 'dates'.
Author(s)
M. Carmen Alvarez-Castro, carmen.alvarez-castro@cmcc.it
Maria M. Chaves-Montero, mariadm.chaves@cmcc.it
Veronica Torralba, veronica.torralba@cmcc.it
Nuria Perez-Zanon nuria.perez@bsc.es
References
Yiou, P., T. Salameh, P. Drobinski, L. Menut, R. Vautard, and M. Vrac, 2013: Ensemble reconstruction of the atmospheric column from surface pressure using analogues. Clim. Dyn., 41, 1419-1437. pascal.yiou@lsce.ipsl.fr
Examples
expL <- rnorm(1:200)
dim(expL) <- c(member = 10, lat = 4, lon = 5)
obsL <- c(rnorm(1:180), expL[1, , ]*1.2)
dim(obsL) <- c(time = 10, lat = 4, lon = 5)
time_obsL <- as.POSIXct(paste(rep("01", 10), rep("01", 10), 1994:2003, sep = "-"),
                        format = "%d-%m-%Y")
dim(time_obsL) <- c(time = 10)
time_expL <- time_obsL[1]
dim(time_expL) <- c(time = 1)
lon <- seq(-1, 5, 1.5)
lat <- seq(30, 35, 1.5)
coords <- list(lon = seq(-1, 5, 1.5), lat = seq(30, 35, 1.5))
attrs_expL <- list(Dates = time_expL)
attrs_obsL <- list(Dates = time_obsL)
expL <- list(data = expL, coords = coords, attrs = attrs_expL)
obsL <- list(data = obsL, coords = coords, attrs = attrs_obsL)
class(expL) <- 's2dv_cube'
class(obsL) <- 's2dv_cube'
region <- c(min(lon), max(lon), min(lat), max(lat))
downscaled_field <- CST_Analogs(expL = expL, obsL = obsL, region = region)
AEMET Downscaling. Precipitation and maximum and minimum temperature downscaling method based on analogs: synoptic situations and significant predictors.
Description
This function downscales low-resolution precipitation data (e.g. from Seasonal Forecast Models) through association with an observational high-resolution (HR) dataset (the AEMET 5-km gridded dataset of daily precipitation; Peral et al., 2017) and a collection of predictors and past synoptic situations similar to the estimated day. The method uses three domains:
Peninsular Spain and Balearic Islands domain (5 km resolution): HR precipitation and the downscaling result domain.
Synoptic domain (low resolution, e.g. 1.5º x 1.5º): it should be centered over Iberian Peninsula and cover enough extension to detect as much synoptic situations as possible.
Extended domain (low resolution, e.g. 1.5º x 1.5º): it should have the same resolution as synoptic domain. It is used for SLP Seasonal Forecast Models.
Usage
CST_AnalogsPredictors(
exp,
slp,
obs,
lon,
lat,
slp_lon,
slp_lat,
var_name,
hr_obs,
tdates,
ddates,
restrain,
dim_name_longitude = "lon",
dim_name_latitude = "lat",
dim_name_time = "time"
)
Arguments
exp: List of arrays with seasonal forecast data for the downscaling period. The list has to contain the model atmospheric variables (instantaneous 12h data), identified by their parenthesis names; the required variables differ between precipitation and temperature. The arrays must have at least three dimensions with names 'lon', 'lat' and 'time' (lon = longitude grid points, lat = latitude grid points, time = number of downscaling days). Seasonal forecast variables must have the same resolution and domain as the reanalysis variables (parameter 'obs' below).
slp: Array with the atmospheric seasonal forecast model sea level pressure (instantaneous 12h data), identified as 'slp' (hPa). It has the same resolution as the 'exp' and 'obs' parameters but with an extended domain. This domain contains extra degrees (mostly in the north and west parts) compared to the synoptic domain. The array must have at least three dimensions with names 'lon', 'lat' and 'time'.
obs: List of arrays with reanalysis data for the training period. The list has to contain the reanalysis atmospheric variables (instantaneous 12h data), identified by their parenthesis names; the required variables differ between precipitation and maximum/minimum temperature. The arrays must have at least three dimensions with names 'lon', 'lat' and 'time'.
lon: Vector of the synoptic longitudes (from -180º to 180º). The vector must go from west to east. The same as for the training function.
lat: Vector of the synoptic latitudes. The vector must go from north to south. The same as for the training function.
slp_lon: Vector of the extended longitudes (from -180º to 180º). The vector must go from west to east. The same as for the training function.
slp_lat: Vector of the extended latitudes. The vector must go from north to south. The same as for the training function.
var_name: Variable name to downscale. There are two options: 'prec' for precipitation and 'temp' for maximum and minimum temperature.
hr_obs: Local path of the HR observational files (maestro and pcp/tmx-tmn). For precipitation and temperature they can be downloaded from https://www.aemet.es/en/serviciosclimaticos/cambio_climat/datos_diarios?w=2. The maestro file (maestro_red_hr_SPAIN.txt) has the grid point (nptos), longitude (lon), latitude (lat) and altitude (alt) in columns (vector structure). The data files (pcp/tmx/tmn_red_SPAIN_1951-201903.txt) include 5-km resolution Spanish daily data (precipitation, or maximum and minimum temperature, from January 1951 to June 2020). See the README file for more information. IMPORTANT: the HR observational period must be the same as for the reanalysis variables. It is assumed that the training period is smaller than the HR original one (1951-2019), so a new ASCII file has to be created with the new period and the same structure as the original, specifying the training dates in the name (e.g. 'pcp_red_SPAIN_19810101-19961231.txt' for the '19810101-19961231' period).
tdates: Training period dates in the format YYYYMMDD(start)-YYYYMMDD(end) (e.g. 19810101-20181231).
ddates: Downscaling period dates in the format YYYYMMDD(start)-YYYYMMDD(end) (e.g. 20191001-20200331).
restrain: Output (list of matrices) obtained from the 'training_analogs' function. For precipitation, the 'restrain' object must contain the um, vm, nger, gu92, gv92, gu52, gv52, neni, vdmin, vref, ccm, lab_pred and cor_pred variables. For maximum and minimum temperature, the 'restrain' object must contain the um, vm, insol, neni, vdmin and vref variables. See 'AnalogsPred_train.R' for more information.
dim_name_longitude: A character string indicating the name of the longitude dimension, by default 'lon'.
dim_name_latitude: A character string indicating the name of the latitude dimension, by default 'lat'.
dim_name_time: A character string indicating the name of the time dimension, by default 'time'.
Value
Matrix with seasonal forecast precipitation (mm) or maximum and minimum temperature (tenths of a ºC) on a 5 km x 5 km regular grid over peninsular Spain and the Balearic Islands. The resulting matrices have two dimensions ('ddates' x 'nptos'), where ddates is the number of downscaling days and nptos is the number of 'hr_obs' grid points.
Author(s)
Marta Dominguez Alonso - AEMET, mdomingueza@aemet.es
Nuria Perez-Zanon - BSC, nuria.perez@bsc.es
Anomalies relative to a climatology along selected dimension with or without cross-validation
Description
This function computes the anomalies relative to a climatology computed along the selected dimension (usually starting dates or forecast time), with or without cross-validated climatologies. The computation is carried out independently for experimental and observational data products.
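The underlying idea for a single time series can be sketched as follows: the anomaly is the data minus a climatology computed along the chosen dimension, and with cross-validation the climatology for each start date leaves that start date out (toy values):
x <- c(10, 12, 11, 13)      # toy values along 'sdate'
anom_simple <- x - mean(x)  # cross = FALSE
anom_cv <- sapply(seq_along(x), function(i) x[i] - mean(x[-i]))  # cross = TRUE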
Usage
CST_Anomaly(
exp = NULL,
obs = NULL,
dim_anom = "sdate",
cross = FALSE,
memb_dim = "member",
memb = TRUE,
dat_dim = c("dataset", "member"),
filter_span = NULL,
ftime_dim = "ftime",
ncores = NULL
)
Arguments
exp: An object of class 's2dv_cube' as returned by CST_Start, containing the seasonal forecast experiment data in the element named 'data'.
obs: An object of class 's2dv_cube' as returned by CST_Start, containing the observational data in the element named 'data'.
dim_anom: A character string indicating the name of the dimension along which the climatology will be computed. The default value is 'sdate'.
cross: A logical value indicating whether cross-validation should be applied or not. The default value is FALSE.
memb_dim: A character string indicating the name of the member dimension. It must be one dimension in 'exp' and 'obs'. If there is no member dimension, set NULL. The default value is 'member'.
memb: A logical value indicating whether to subtract the climatology based on the individual members (TRUE) or the ensemble mean over all members (FALSE) when calculating the anomalies. The default value is TRUE.
dat_dim: A character vector indicating the names of the dataset and member dimensions. If there is no dataset dimension, it can be NULL. The default value is c('dataset', 'member').
filter_span: A numeric value indicating the degree of smoothing. This option is only available if parameter 'cross' is set to FALSE.
ftime_dim: A character string indicating the name of the temporal dimension where the smoothing with 'filter_span' will be applied. It cannot be NULL if 'filter_span' is provided. The default value is 'ftime'.
ncores: An integer indicating the number of cores to use for parallel computation. The default value is NULL. It will be used only when 'filter_span' is not NULL.
Value
A list with two S3 objects, 'exp' and 'obs', of the class 's2dv_cube', containing experimental and date-corresponding observational anomalies, respectively. These 's2dv_cube's can be ingested by other functions in CSTools.
Author(s)
Perez-Zanon Nuria, nuria.perez@bsc.es
Pena Jesus, jesus.pena@bsc.es
See Also
Ano_CrossValid, Clim and
CST_Start
Examples
mod <- 1 : (2 * 3 * 4 * 5 * 6 * 7)
dim(mod) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
obs <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lon = lon, lat = lat)
exp <- list(data = mod, coords = coords)
obs <- list(data = obs, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
attr(obs, 'class') <- 's2dv_cube'
anom <- CST_Anomaly(exp = exp, obs = obs, cross = FALSE, memb = TRUE)
Calculate the spatial area-weighted average of multidimensional arrays.
Description
This function computes a spatial area-weighted average of n-dimensional arrays given a spatial mask containing the area (e.g. in m^2) for each grid box.
Usage
CST_AreaWeighted(
data,
area,
lon_dim = "lon",
lat_dim = "lat",
ncores = NULL,
extra_info = NULL
)
Arguments
data: An object of class 's2dv_cube' with at least the dimensions lon_dim and lat_dim in the element 'data'.
area: A multidimensional array with named dimensions (at least lon_dim and lat_dim) containing the area of each grid box. The resolution, length and order of the lon_dim and lat_dim dimensions should be identical to those of 'data'.
lon_dim: A character string indicating the name of the longitudinal dimension. The default value is 'lon'.
lat_dim: A character string indicating the name of the latitudinal dimension. The default value is 'lat'.
ncores: An integer indicating the number of cores to use for parallel computation. The default value is NULL.
extra_info: A named list with additional metadata to add to the s2dv_cube. There can be one entry for each dimension in 'area' that is not also present in 'data'.
Value
An object of class s2dv_cube with the same dimensions as the object data, except with lon_dim and lat_dim removed. Any extra dimensions in area not present in data are preserved in the output (e.g., a region dimension).
Examples
data <- array(data = 1:10, dim = c(year = 10, lon = 5, lat = 3))
coords <- list(lat = 1:3, lon = 1:5)
data <- list(data = data, coords = coords)
attr(data, 'class') <- 's2dv_cube'
area <- array(data = 1:10, dim = c(lon = 5, lat = 3))
CST_AreaWeighted(data, area)
# With extra region dimension
area <- array(data = 1:10, dim = c(lon = 5, lat = 3, region = 4))
CST_AreaWeighted(data, area)
Weighting SFSs of a CSTools object.
Description
Function to apply weights to an 's2dv_cube' object. It can return a weighted ensemble mean (deterministic output) or the tercile probabilities (probabilistic output) for Seasonal Forecast Systems (SFSs).
Usage
CST_BEI_Weighting(
var_exp,
aweights,
terciles = NULL,
type = "ensembleMean",
time_dim_name = "time",
memb_dim = "member"
)
Arguments
var_exp: An object of the class 's2dv_cube' containing the variable (e.g. precipitation, temperature, NAO index) array. The var_exp object is expected to have an element named 'data' with at least a temporal dimension and a dimension 'member'.
aweights: Normalized weights array with at least dimensions (time, member), where 'time' is the temporal dimension as default. When the 'aweights' parameter has any other dimensions (e.g. 'lat') that the 'var_exp' parameter also has, they must be equal.
terciles: A numeric array with at least one dimension 'tercil' of length 2; the first element is the lower tercile for a hindcast period and the second element is the upper tercile. By default it is NULL and the terciles are computed from the var_exp data.
type: A character string indicating the type of output. If 'type' = 'probs', the function returns, in the element data of the 'var_exp' parameter, an array with at least two dimensions (time, tercil) if the variable is spatially aggregated (e.g. an NAO index), or four dimensions (time, tercil, lat, lon) if it is a spatial variable (e.g. precipitation or temperature), containing the tercile probabilities computed with weighted members. The first category is the lower tercile, the second the normal category and the third the upper tercile. If 'type' = 'ensembleMean', the function returns, in the element data of the 'var_exp' parameter, an array with at least one dimension (time) if the variable is spatially aggregated (e.g. an NAO index), or three dimensions (time, lat, lon) if it is a spatial variable (e.g. precipitation or temperature), containing the ensemble means computed with weighted members.
time_dim_name: A character string indicating the name of the temporal dimension, by default 'time'.
memb_dim: A character string indicating the name of the member dimension, by default 'member'.
Value
CST_BEI_Weighting() returns a CSTools object (i.e., of the
class 's2dv_cube').
This object has at least an element named $data
with at least a temporal dimension (and dimension 'tercil' when the output
is tercile probabilities), containing the ensemble means computed with
weighted members or the tercile probabilities.
Author(s)
Eroteida Sanchez-Garcia - AEMET, esanchezg@aemet.es
References
Regionally improved seasonal forecast of precipitation through Best estimation of winter NAO, Sanchez-Garcia, E. et al., Adv. Sci. Res., 16, 165-174, 2019, doi:10.5194/asr-16-165-2019
Examples
var_exp <- 1 : (2 * 4 * 3 * 2)
dim(var_exp) <- c(time = 2, member = 4, lat = 3, lon = 2)
aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3, 0.1, 0.2, 0.4, 0.4, 0.1,
0.2, 0.4, 0.2)
dim(aweights) <- c(time = 2, member = 4, dataset = 2)
var_exp <- list(data = var_exp)
class(var_exp) <- 's2dv_cube'
res_CST <- CST_BEI_Weighting(var_exp, aweights)
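The probabilistic output can be requested through the 'type' argument; a minimal sketch reusing the synthetic objects above (terciles are computed from the 'var_exp' data since 'terciles' is left NULL):
# Tercile probabilities computed with the weighted members
res_CST_probs <- CST_BEI_Weighting(var_exp, aweights, type = "probs")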
Bias Correction based on the mean and standard deviation adjustment
Description
This function applies the simple bias adjustment technique described in Torralba et al. (2017). The adjusted forecasts have a mean and standard deviation equivalent to those of the reference dataset.
Usage
CST_BiasCorrection(
exp,
obs,
exp_cor = NULL,
na.rm = FALSE,
memb_dim = "member",
sdate_dim = "sdate",
dat_dim = NULL,
ncores = NULL
)
Arguments
exp |
An object of class 's2dv_cube' containing the seasonal forecast experiment data. |
obs |
An object of class 's2dv_cube' containing the observed data. |
exp_cor |
An object of class 's2dv_cube' containing the seasonal forecast experiment to be corrected. If it is NULL, the 'exp' forecast will be corrected in cross-validation. The default value is NULL. |
na.rm |
A logical value indicating whether missing values should be stripped before the computation proceeds, by default it is set to FALSE. |
memb_dim |
A character string indicating the name of the member dimension. By default, it is set to 'member'. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. |
dat_dim |
A character string indicating the name of dataset dimension. The length of this dimension can be different between 'exp' and 'obs'. The default value is NULL. |
ncores |
An integer that indicates the number of cores for parallel computations using multiApply function. The default value is NULL. |
Value
An object of class s2dv_cube containing the bias-corrected
forecasts with the dimensions nexp, nobs and the same dimensions as in the 'exp'
object. nexp is the number of experiments (i.e., 'dat_dim' in exp) and nobs is
the number of observations (i.e., 'dat_dim' in obs). If dat_dim is NULL, nexp
and nobs are omitted. If 'exp_cor' is provided, the returned array has the
same dimensions as 'exp_cor'.
Author(s)
Verónica Torralba, veronica.torralba@bsc.es
References
Torralba, V., F.J. Doblas-Reyes, D. MacLeod, I. Christel and M. Davis (2017). Seasonal climate prediction: a new source of information for the management of wind energy resources. Journal of Applied Meteorology and Climatology, 56, 1231-1247, doi:10.1175/JAMC-D-16-0204.1. (CLIM4ENERGY, EUPORIAS, NEWA, RESILIENCE, SPECS)
Examples
mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7)
dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, time = 5, lat = 6, lon = 7)
obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, time = 5, lat = 6, lon = 7)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lat = lat, lon = lon)
exp <- list(data = mod1, coords = coords)
obs <- list(data = obs1, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
attr(obs, 'class') <- 's2dv_cube'
a <- CST_BiasCorrection(exp = exp, obs = obs)
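A minimal sketch of correcting an independent forecast through the documented 'exp_cor' argument, reusing 'exp' and 'obs' from above with a hypothetical single-start-date forecast:
# Forecast with a single start date, corrected using the exp/obs pair
mod2 <- 1 : (1 * 3 * 1 * 5 * 6 * 7)
dim(mod2) <- c(dataset = 1, member = 3, sdate = 1, time = 5, lat = 6, lon = 7)
exp_cor <- list(data = mod2, coords = coords)
attr(exp_cor, 'class') <- 's2dv_cube'
b <- CST_BiasCorrection(exp = exp, obs = obs, exp_cor = exp_cor)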
Bind two objects of class s2dv_cube
Description
This function combines the data inside two or more objects of class
s2dv_cube along the specified along dimension, and modifies the
dimensions, coordinates and attributes accordingly, producing a result that
contains the complete metadata for all variables, time steps and spatial
coordinates that are bound in the process. It ensures that the information
inside the s2dv_cube remains coherent with the data it contains.
If the dimension specified in along is among the time dimensions in
attrs$Dates, the dates arrays are also bound along this dimension.
The load_parameters and when attributes of the first cube
are preserved. The source_files attribute is bound along the
var_dim and dat_dim dimensions.
Usage
CST_BindDim(x, along, var_dim = NULL, dat_dim = NULL)
Arguments
x |
Two or more objects of class 's2dv_cube' to be bound together, provided as elements of a list. |
along |
A character string indicating the name of the binding dimension. |
var_dim |
A character string indicating the name of the variable dimension. The default value is NULL. Specifying this dimension ensures that the variable metadata is correctly preserved. |
dat_dim |
A character string indicating the name of dataset dimension. The default value is NULL. Specifying this dimension ensures the dataset metadata is correctly preserved. |
Value
An object of class s2dv_cube with the combined data,
dimensions, coordinates and attributes of the elements in x.
Author(s)
Agudetse Roures Victoria, victoria.agudetse@bsc.es
Examples
# Example with sample data:
# Check original dimensions and coordinates
lonlat_temp$exp$dims
lonlat_temp$obs$dims
# Bind both datasets along the member dimension
res <- CST_BindDim(x = list(lonlat_temp$exp, lonlat_temp$obs),
along = "member",
var_dim = NULL,
dat_dim = "dat")
# Check new dimensions and coordinates
res$dims
names(res$coords)
Forecast Calibration
Description
Five types of member-by-member bias correction can be performed.
The "bias" method corrects the bias only, the "evmos" method
applies a variance inflation technique to ensure the correction of the bias
and the correspondence of variance between forecast and observation (Van
Schaeybroeck and Vannitsem, 2011). The ensemble calibration methods
"mse_min" and "crps_min" correct the bias, the overall forecast
variance and the ensemble spread as described in Doblas-Reyes et al. (2005)
and Van Schaeybroeck and Vannitsem (2015), respectively. While the
"mse_min" method minimizes a constrained mean-squared error using three
parameters, the "crps_min" method features four parameters and
minimizes the Continuous Ranked Probability Score (CRPS). The
"rpc-based" method adjusts the forecast variance ensuring that the
ratio of predictable components (RPC) is equal to one, as in Eade et al.
(2014). It is equivalent to function Calibration but for objects
of class s2dv_cube.
Usage
CST_Calibration(
exp,
obs,
exp_cor = NULL,
cal.method = "mse_min",
eval.method = "leave-k-out",
multi.model = FALSE,
na.fill = TRUE,
na.rm = TRUE,
apply_to = NULL,
alpha = NULL,
memb_dim = "member",
sdate_dim = "sdate",
dat_dim = NULL,
ncores = NULL,
k = 1,
tail.out = TRUE
)
Arguments
exp |
An object of class 's2dv_cube' containing the seasonal hindcast experiment data. |
obs |
An object of class 's2dv_cube' containing the observed data. |
exp_cor |
An optional object of class 's2dv_cube' containing the seasonal forecast experiment to be calibrated. If it is NULL, the 'exp' hindcast will be calibrated using cross-validation. |
cal.method |
A character string indicating the calibration method used,
can be either 'bias', 'evmos', 'mse_min', 'crps_min' or 'rpc-based'. The default value is 'mse_min'. |
eval.method |
A character string indicating the sampling method used, it
can be either 'in-sample', 'leave-k-out', 'retrospective' or 'hindcast-vs-forecast'. The default value is 'leave-k-out'. |
multi.model |
A boolean that is used only for the 'mse_min' method. If multi-model ensembles or ensembles of different sizes are used, it must be set to TRUE. |
na.fill |
A boolean that indicates what happens in case calibration is
not possible or will yield unreliable results. This happens when three or
fewer forecast-observation pairs are available to perform the training phase
of the calibration. By default na.fill is set to TRUE. |
na.rm |
A boolean that indicates whether to remove the NA values or not.
The default value is TRUE. |
apply_to |
A character string that indicates whether to apply the
calibration to all the forecasts ('all') or only to those forecasts with a statistically significant correlation ('sign'). Only useful if cal.method == 'rpc-based'. |
alpha |
A numeric value indicating the significance level for the
correlation test. Only useful if cal.method == 'rpc-based' and apply_to == 'sign'. |
memb_dim |
A character string indicating the name of the member dimension. By default, it is set to 'member'. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. |
dat_dim |
A character string indicating the name of dataset dimension. The length of this dimension can be different between 'exp' and 'obs'. The default value is NULL. |
ncores |
An integer that indicates the number of cores for parallel computations using the multiApply function. The default value is NULL (one core). |
k |
Positive integer. Default = 1.
In the 'leave-k-out' method, the number of cases (e.g. start dates) left out of the training set around each evaluation case. |
tail.out |
Boolean for the 'leave-k-out' method; TRUE to remove both extremes, keeping the same sample size for all k-folds (e.g. sample.length = 50, k = 3, eval.dexes = 1, train.dexes = (3, 49)); FALSE to remove only the corresponding tail (e.g. sample.length = 50, k = 3, eval.dexes = 1, train.dexes = (3, 50)). |
Details
Both the na.fill and na.rm parameters can be used to
indicate how the function has to handle the NA values. The na.fill
parameter checks whether there are more than three forecast-observation pairs
to perform the computation. In case there are three or fewer pairs, the
computation is not carried out, and the value returned by the function depends
on the value of this parameter (either NA if na.fill == TRUE or the
uncorrected value if na.fill == FALSE). On the other hand, na.rm
is used to indicate to the function whether to remove the missing values during
the computation of the parameters needed to perform the calibration.
Value
An object of class s2dv_cube containing the calibrated
forecasts in the element data with the dimensions nexp, nobs and the same
dimensions as in the 'exp' object. nexp is the number of experiments
(i.e., 'dat_dim' in exp) and nobs is the number of observations (i.e.,
'dat_dim' in obs). If dat_dim is NULL, nexp and nobs are omitted. If 'exp_cor'
is provided, the returned array has the same dimensions as 'exp_cor'.
Author(s)
Verónica Torralba, veronica.torralba@bsc.es
Bert Van Schaeybroeck, bertvs@meteo.be
References
Doblas-Reyes F.J, Hagedorn R, Palmer T.N. The rationale behind the success of multi-model ensembles in seasonal forecasting-II calibration and combination. Tellus A. 2005;57:234-252. doi:10.1111/j.1600-0870.2005.00104.x
Eade, R., Smith, D., Scaife, A., Wallace, E., Dunstone, N., Hermanson, L., & Robinson, N. (2014). Do seasonal-to-decadal climate predictions underestimate the predictability of the real world? Geophysical Research Letters, 41(15), 5620-5628. doi:10.1002/2014GL061146
Van Schaeybroeck, B., & Vannitsem, S. (2011). Post-processing through linear regression. Nonlinear Processes in Geophysics, 18(2), 147. doi:10.5194/npg-18-147-2011
Van Schaeybroeck, B., & Vannitsem, S. (2015). Ensemble post-processing using member-by-member approaches: theoretical aspects. Quarterly Journal of the Royal Meteorological Society, 141(688), 807-818. doi:10.1002/qj.2397
Examples
# Example 1:
mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7)
dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lat = lat, lon = lon)
exp <- list(data = mod1, coords = coords)
obs <- list(data = obs1, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
attr(obs, 'class') <- 's2dv_cube'
a <- CST_Calibration(exp = exp, obs = obs, cal.method = "mse_min", eval.method = "in-sample")
# Example 2:
mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7)
mod2 <- 1 : (1 * 3 * 1 * 5 * 6 * 7)
dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
dim(mod2) <- c(dataset = 1, member = 3, sdate = 1, ftime = 5, lat = 6, lon = 7)
obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lat = lat, lon = lon)
exp <- list(data = mod1, coords = coords)
obs <- list(data = obs1, coords = coords)
exp_cor <- list(data = mod2, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
attr(obs, 'class') <- 's2dv_cube'
attr(exp_cor, 'class') <- 's2dv_cube'
a <- CST_Calibration(exp = exp, obs = obs, exp_cor = exp_cor, cal.method = "evmos")
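As a further sketch, any of the documented methods can be selected through 'cal.method'; for instance, reusing the objects from Example 1:
# Simple bias calibration without NA filling
b <- CST_Calibration(exp = exp, obs = obs, cal.method = "bias", na.fill = FALSE)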
Make a categorical forecast based on a multi-model forecast with potential for calibration
Description
This function converts a multi-model ensemble forecast into a categorical forecast by giving the probability for each category. Different methods are available to combine the different ensemble forecasting models into probabilistic categorical forecasts.
Motivation: Beyond the short range, the unpredictable component of weather predictions becomes substantial due to the chaotic nature of the Earth system. Therefore, predictions can mostly be skillful when used in a probabilistic sense. In practice this is done using ensemble forecasts. It is then common to convert the ensemble forecasts to occurrence probabilities for different categories. These categories are typically taken as terciles from climatological distributions. For instance, for temperature there is a cold, a normal and a warm class. Commonly, multiple ensemble forecasting systems are available, but some models may be more competitive than others for the variable, region and user need under consideration. Therefore, when calculating the category probabilities, the ensemble members of the different forecasting systems may be weighted differently. Such weighting is typically done by comparison of the ensemble forecasts with observations.
Description of the tool: The tool considers all forecasts (all members from all forecasting systems) and converts them into occurrence probabilities of different categories. The number of categories can be changed, and the category boundaries are taken as the climatological quantiles (e.g. terciles) extracted from the observational data. The methods that are available to combine the ensemble forecasting models into probabilistic categorical forecasts are: 1) ensemble pooling, where all ensemble members of all ensemble systems are weighted equally, 2) model combination, where each model system is weighted equally, and 3) model weighting. The model weighting method is described in Rajagopalan et al. (2002), Robertson et al. (2004) and Van Schaeybroeck and Vannitsem (2019). More specifically, this method uses different weights for the occurrence probability predicted by the available models and by a climatological model, and optimizes the weights by minimizing the ignorance score. Finally, the function can also be used to categorize the observations in the categorical quantiles.
Usage
CST_CategoricalEnsCombination(
exp,
obs,
cat.method = "pool",
eval.method = "leave-k-out",
amt.cat = 3,
k = 1,
tail.out = TRUE,
...
)
Arguments
exp |
An object of class 's2dv_cube' containing the seasonal forecast experiment data. |
obs |
An object of class 's2dv_cube' containing the observed data. |
cat.method |
Method used to produce the categorical forecast, can be
either 'pool', 'comb', 'mmw' or 'obs', corresponding to ensemble pooling, model combination, model weighting and the categorization of the observations, respectively. |
eval.method |
The sampling method used, can be either
'in-sample', 'leave-k-out', 'retrospective' or 'hindcast-vs-forecast'. The default value is 'leave-k-out'. |
amt.cat |
The number of categories. Equally-sized quantiles will be calculated based on this number. |
k |
Positive integer. Default = 1.
In the 'leave-k-out' method, the number of cases (e.g. start dates) left out of the training set around each evaluation case. |
tail.out |
Boolean for the 'leave-k-out' method; TRUE to remove both extremes, keeping the same sample size for all k-folds (e.g. sample.length = 50, k = 3, eval.dexes = 1, train.dexes = (3, 49)); FALSE to remove only the corresponding tail (e.g. sample.length = 50, k = 3, eval.dexes = 1, train.dexes = (3, 50)). |
... |
Other parameters to be passed on to the calibration procedure. |
Value
An object of class s2dv_cube containing the categorical
forecasts in the element called $data. The first two dimensions of the
returned object are named dataset and member and are both of size one. An
additional dimension named category is introduced and is of size amt.cat.
Author(s)
Bert Van Schaeybroeck, bertvs@meteo.be
References
Rajagopalan, B., Lall, U., & Zebiak, S. E. (2002). Categorical climate forecasts through regularization and optimal combination of multiple GCM ensembles. Monthly Weather Review, 130(7), 1792-1811.
Robertson, A. W., Lall, U., Zebiak, S. E., & Goddard, L. (2004). Improved combination of multiple atmospheric GCM ensembles for seasonal prediction. Monthly Weather Review, 132(12), 2732-2744.
Van Schaeybroeck, B., & Vannitsem, S. (2019). Postprocessing of Long-Range Forecasts. In Statistical Postprocessing of Ensemble Forecasts (pp. 267-290).
Examples
mod1 <- 1 : (2 * 2* 4 * 5 * 2 * 2)
dim(mod1) <- c(dataset = 2, member = 2, sdate = 4, ftime = 5, lat = 2, lon = 2)
mod1[2, 1, , , , ] <- NA
datasets <- c("MF", "UKMO")
obs1 <- 1 : (1 * 1 * 4 * 5 * 2 * 2)
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 2, lon = 2)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lat = lat, lon = lon)
attrs <- list(Datasets = datasets)
exp <- list(data = mod1, coords = coords, attrs = attrs)
obs <- list(data = obs1, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
attr(obs, 'class') <- 's2dv_cube'
a <- CST_CategoricalEnsCombination(exp = exp, obs = obs, amt.cat = 3,
cat.method = "mmw")
Change the name of one or more dimensions for an object of class s2dv_cube
Description
Change the names of the dimensions specified in 'original_names' to the names in 'new_names'. The coordinate names and the dimensions of any attributes are also modified accordingly.
Usage
CST_ChangeDimNames(data, original_names, new_names)
Arguments
data |
An object of class |
original_names |
A single character string or a vector indicating the dimensions to be renamed. |
new_names |
A single character string or a vector indicating the new dimension names, in the same order as the dimensions in 'original_names'. |
Value
An object of class s2dv_cube with similar data, coordinates and
attributes as the data input, but with modified dimension names.
Author(s)
Agudetse Roures Victoria, victoria.agudetse@bsc.es
Examples
# Example with sample data:
# Check original dimensions and coordinates
lonlat_temp$exp$dims
names(lonlat_temp$exp$coords)
dim(lonlat_temp$exp$attrs$Dates)
# Change 'dataset' to 'dat' and 'ftime' to 'time'
exp <- CST_ChangeDimNames(lonlat_temp$exp,
original_names = c("dataset", "ftime"),
new_names = c("dat", "time"))
# Check new dimensions and coordinates
exp$dims
names(exp$coords)
dim(exp$attrs$Dates)
Performing a Bias Correction conditioned by the dynamical properties of the data.
Description
This function performs a bias correction conditioned by the dynamical properties of the dataset. It internally uses the function 'Predictability' to divide into terciles the two dynamical proxies computed with 'CST_ProxiesAttractor'. A bias correction between the model and the observations is performed using the division into terciles of the local dimension 'dim' and the inverse of the persistence 'theta'. For instance, model values with lower 'dim' will be corrected with observed values with lower 'dim', and the same for 'theta'. The function gives two options of bias correction: one conditioned on 'dim' and/or one conditioned on 'theta'.
Usage
CST_DynBiasCorrection(
exp,
obs,
method = "QUANT",
wetday = FALSE,
proxy = "dim",
quanti,
ncores = NULL
)
Arguments
exp |
An s2dv_cube object with the experiment data. |
obs |
An s2dv_cube object with the reference data. |
method |
A character string indicating the method to apply for bias correction, one of: 'PTF', 'RQUANT', 'QUANT', 'SSPLIN'. |
wetday |
Logical indicating whether to perform wet day correction or not, or a numeric threshold below which all values are set to zero (by default set to FALSE). |
proxy |
A character string indicating the proxy for local dimension 'dim' or inverse of persistence 'theta' to apply the dynamical conditioned bias correction method. |
quanti |
A number lower than 1 indicating the quantile to perform the computation of local dimension and theta. |
ncores |
The number of cores to use in parallel computation. |
Value
An 's2dv_cube' object with the bias correction performed, conditioned on the local dimension 'dim' or the inverse of the persistence 'theta'.
Author(s)
Carmen Alvarez-Castro, carmen.alvarez-castro@cmcc.it
Maria M. Chaves-Montero, mdm.chaves-montero@cmcc.it
Veronica Torralba, veronica.torralba@cmcc.it
Davide Faranda, davide.faranda@lsce.ipsl.fr
References
Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large scale atmospheric predictability. Nature Communications, 10(1), 1316. doi:10.1038/s41467-019-09305-8
Faranda, D., Messori, G., and Yiou, P. (2017). Dynamical proxies of North Atlantic predictability and extremes. Scientific Reports, 7, 41278.
Examples
expL <- rnorm(2000)
dim(expL) <- c(time = 100, lat = 4, lon = 5)
obsL <- c(rnorm(1980), expL[1, , ] * 1.2)
dim(obsL) <- c(time = 100, lat = 4, lon = 5)
time_obsL <- as.POSIXct(paste(rep("01", 100), rep("01", 100), 1920:2019, sep = "-"),
format = "%d-%m-%Y")
time_expL <- as.POSIXct(paste(rep("01", 100), rep("01", 100), 1920:2019, sep = "-"),
format = "%d-%m-%Y")
lon <- seq(-1, 5, 1.5)
lat <- seq(30, 35, 1.5)
# Note: quanti = 0.98 would be too high for this short dataset and the
# requirement may not be met; in that case it is necessary to select a
# lower value, for instance quanti = 0.60.
expL <- s2dv_cube(data = expL, coords = list(lon = lon, lat = lat),
Dates = time_expL)
obsL <- s2dv_cube(data = obsL, coords = list(lon = lon, lat = lat),
Dates = time_obsL)
# to use DynBiasCorrection
dynbias1 <- DynBiasCorrection(exp = expL$data, obs = obsL$data, proxy= "dim",
quanti = 0.6)
# to use CST_DynBiasCorrection
dynbias2 <- CST_DynBiasCorrection(exp = expL, obs = obsL, proxy= "dim",
quanti = 0.6)
Ensemble clustering
Description
This function performs a clustering on members/starting dates and returns a number of scenarios, with representative members for each of them. The clustering is performed in a reduced EOF space.
Motivation: Ensemble forecasts give a probabilistic insight of average weather conditions on extended timescales, i.e. from sub-seasonal to seasonal and beyond. With large ensembles, it is often an advantage to be able to group members according to similar characteristics and to select the most representative member for each cluster. This can be useful to characterize the most probable forecast scenarios in a multi-model (or single model) ensemble prediction. This approach, applied at a regional level, can also be used to identify the subset of ensemble members that best represent the full range of possible solutions for downscaling applications. The choice of the ensemble members is made flexible in order to meet the requirements of specific (regional) climate information products, to be tailored for different regions and user needs.
Description of the tool: EnsClustering is a cluster analysis tool, based on the k-means algorithm, for ensemble predictions. The aim is to group ensemble members according to similar characteristics and to select the most representative member for each cluster. The user chooses which feature of the data is used to group the ensemble members by clustering: time mean, maximum, or a certain percentile (e.g., the 75th percentile) over the time period. For each ensemble member this value is computed at each grid point, obtaining N lat-lon maps, where N is the number of ensemble members. The anomaly is computed by subtracting the ensemble mean of these maps from each of the single maps. The anomaly is therefore computed with respect to the ensemble members (and not with respect to time), and the Empirical Orthogonal Function (EOF) analysis is applied to these anomaly maps. Regarding the EOF analysis, the user can choose either how many Principal Components (PCs) to retain or the percentage of explained variance to keep. After reducing dimensionality via EOF analysis, k-means analysis is applied using the desired subset of PCs.
The major final outputs are the classification in clusters, i.e. which member belongs to which cluster (in k-means analysis the number k of clusters needs to be defined prior to the analysis), and the most representative member for each cluster, which is the closest member to the cluster centroid. Other outputs refer to the statistics of the clustering: in the PC space, the minimum and the maximum distance between a member in a cluster and the cluster centroid (i.e. the closest and the furthest member), and the intra-cluster standard deviation for each cluster (i.e. how compact the cluster is).
Usage
CST_EnsClustering(
exp,
time_moment = "mean",
numclus = NULL,
lon_lim = NULL,
lat_lim = NULL,
variance_explained = 80,
numpcs = NULL,
time_dim = NULL,
time_percentile = 90,
cluster_dim = "member",
verbose = FALSE
)
Arguments
exp |
An object of the class 's2dv_cube', containing the variables to be analysed. The element 'data' in the 's2dv_cube' object must have, at least, spatial and temporal dimensions. Latitudinal dimension accepted names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. Longitudinal dimension accepted names: 'lon', 'lons','longitude', 'x', 'i', 'nav_lon'. |
time_moment |
Decides the moment to be applied to the time dimension. Can be either 'mean' (time mean), 'sd' (standard deviation along time) or 'perc' (a selected percentile on time). If 'perc' the keyword 'time_percentile' is also used. |
numclus |
Number of clusters (scenarios) to be calculated. If set to NULL the number of ensemble members divided by 10 is used, with a minimum of 2 and a maximum of 8. |
lon_lim |
List with the two longitude margins in 'c(-180,180)' format. |
lat_lim |
List with the two latitude margins. |
variance_explained |
Variance (percentage) to be explained by the set of EOFs. Defaults to 80. Not used if numpcs is specified. |
numpcs |
Number of EOFs retained in the analysis (optional). |
time_dim |
String or character array with name(s) of dimension(s) over which to compute statistics. If omitted c("ftime", "sdate", "time") are searched in this order. |
time_percentile |
Set the percentile in time you want to analyse (used if 'time_moment' is 'perc'). |
cluster_dim |
Dimension along which to cluster. Typically "member" or "sdate". This can also be a vector, e.g. c("member", "sdate"). |
verbose |
Logical for verbose output |
Value
A list with elements $cluster (cluster assigned for each
member), $freq (relative frequency of each cluster),
$closest_member (representative member for each cluster),
$repr_field (list of fields for each representative member),
$composites (list of mean fields for each cluster), $lon
(selected longitudes of output fields) and $lat (selected latitudes of
output fields).
Author(s)
Federico Fabiano - ISAC-CNR, f.fabiano@isac.cnr.it
Ignazio Giuntoli - ISAC-CNR, i.giuntoli@isac.cnr.it
Danila Volpi - ISAC-CNR, d.volpi@isac.cnr.it
Paolo Davini - ISAC-CNR, p.davini@isac.cnr.it
Jost von Hardenberg - ISAC-CNR, j.vonhardenberg@isac.cnr.it
Examples
dat_exp <- array(abs(rnorm(1152))*275, dim = c(dataset = 1, member = 4,
sdate = 6, ftime = 3,
lat = 4, lon = 4))
lon <- seq(0, 3)
lat <- seq(48, 45)
coords <- list(lon = lon, lat = lat)
exp <- list(data = dat_exp, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
res <- CST_EnsClustering(exp = exp, numclus = 3,
cluster_dim = c("sdate"))
Add a named dimension to an object of class s2dv_cube
Description
Insert an extra dimension into an array at position 'posdim' with length
'lendim'. The array in data repeats along the new dimension.
The dimensions, coordinates and attributes are modified accordingly.
Usage
CST_InsertDim(data, posdim, lendim, name, values = NULL)
Arguments
data |
An object of class |
posdim |
An integer indicating the position of the new dimension. |
lendim |
An integer indicating the length of the new dimension. |
name |
A character string indicating the name for the new dimension. |
values |
A vector containing the values of the new dimension and any relevant attributes. If NULL, a sequence of integers from 1 to lendim will be added. |
Value
An object of class s2dv_cube with similar data, coordinates and
attributes as the data input, but with an additional dimension.
Author(s)
Agudetse Roures Victoria, victoria.agudetse@bsc.es
Examples
#Example with sample data:
# Check original dimensions and coordinates
lonlat_temp$exp$dims
names(lonlat_temp$exp$coords)
# Add 'variable' dimension
exp <- CST_InsertDim(lonlat_temp$exp,
posdim = 2,
lendim = 1,
name = "variable",
values = c("tas"))
# Check new dimensions and coordinates
exp$dims
exp$coords$variable
CSTools Data Retrieval Function
Description
This function aggregates, subsets and retrieves sub-seasonal, seasonal, decadal or climate projection data from NetCDF files in a local file system or on remote OPeNDAP servers, and arranges it for easy application of the CSTools functions.
Usage
CST_Load(...)
Arguments
... |
Parameters that are automatically forwarded to the 's2dv::Load' function. See details in '?s2dv::Load'. |
Details
It receives any number of parameters ('...') that are automatically forwarded to the 's2dv::Load' function. See details in '?s2dv::Load'.
It is recommended to use this function in combination with the zeallot '%<-%' operator, in order to directly assign the two returned 's2dv_cube' objects to two separate variables, as shown in the example below.
Value
A list with one or two S3 objects, named 'exp' and 'obs', of the class 's2dv_cube', containing experimental and date-corresponding observational data, respectively. These 's2dv_cube's can be ingested by other functions in CSTools. If the parameter 'exp' in the call to 'CST_Load' is set to NULL, then only the 'obs' component is returned, and vice versa.
Author(s)
Nicolau Manubens, nicolau.manubens@bsc.es
Examples
## Not run:
library(zeallot)
startDates <- c('20001101', '20011101', '20021101',
'20031101', '20041101', '20051101')
c(exp, obs) %<-%
CST_Load(
var = 'tas',
exp = 'system5c3s',
obs = 'era5',
nmember = 15,
sdates = startDates,
leadtimemax = 3,
latmin = 27, latmax = 48,
lonmin = -12, lonmax = 40,
output = 'lonlat',
nprocs = 1
)
## End(Not run)
Function to Merge Dimensions
Description
This function merges two dimensions of the array data in a
's2dv_cube' object into one. The user can select the dimensions to merge and
provide the final name of the dimension. The user can select to remove NA
values or keep them.
Usage
CST_MergeDims(
data,
merge_dims = c("ftime", "monthly"),
rename_dim = NULL,
na.rm = FALSE
)
Arguments
data |
An 's2dv_cube' object. |
merge_dims |
A character vector indicating the names of the dimensions to merge. |
rename_dim |
A character string indicating the name of the output
dimension. If left at NULL, the first dimension name provided in parameter
merge_dims will be used. |
na.rm |
A logical indicating if the NA values should be removed or not. |
Value
An object of class 's2dv_cube' with the specified dimensions merged
into a single dimension.
Author(s)
Nuria Perez-Zanon, nuria.perez@bsc.es
Examples
data <- 1 : c(2 * 3 * 4 * 5 * 6 * 7)
dim(data) <- c(time = 7, lat = 2, lon = 3, monthly = 4, member = 6,
dataset = 5, var = 1)
data[2,,,,,,] <- NA
data[c(3,27)] <- NA
data <- list(data = data)
class(data) <- 's2dv_cube'
new_data <- CST_MergeDims(data, merge_dims = c('time', 'monthly'))
new_data <- CST_MergeDims(data, merge_dims = c('lon', 'lat'), rename_dim = 'grid')
new_data <- CST_MergeDims(data, merge_dims = c('time', 'monthly'), na.rm = TRUE)
EOF analysis of multiple variables
Description
This function performs EOF analysis over multiple variables, accepting as input a list of CSTools objects. It is based on Singular Value Decomposition. For each field the EOFs are computed and the corresponding PCs are standardized (unit variance, zero mean); the minimum number of principal components needed to reach the user-defined variance is retained. The function weights the input data by the square root of the cosine of latitude.
Usage
CST_MultiEOF(
datalist,
lon_dim = "lon",
lat_dim = "lat",
time_dim = "ftime",
sdate_dim = "sdate",
var_dim = "var",
neof_max = 40,
neof_composed = 5,
minvar = 0.6,
lon_lim = NULL,
lat_lim = NULL,
ncores = NULL
)
Arguments
datalist |
A list of objects of the class 's2dv_cube', containing the
variables to be analysed. Each data object in the list is expected to have
an element named $data. |
lon_dim |
A character string indicating the name of the longitudinal dimension. By default, it is set to 'lon'. |
lat_dim |
A character string indicating the name of the latitudinal dimension. By default, it is set to 'lat'. |
time_dim |
A character string indicating the name of the temporal dimension. By default, it is set to 'ftime'. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. |
var_dim |
A character string indicating the name of the variable dimension. By default, it is set to 'var'. |
neof_max |
Maximum number of single EOFs considered in the first decomposition. |
neof_composed |
Number of composed EOFs to return in output. |
minvar |
Minimum variance fraction to be explained in first decomposition. |
lon_lim |
Vector with longitudinal range limits for the EOF calculation for all input variables. |
lat_lim |
Vector with latitudinal range limits for the EOF calculation for all input variables. |
ncores |
An integer indicating the number of cores to use for parallel computation. The default value is NULL. |
Value
A list containing:
coeff |
An 's2dv_cube' with the data element being an array of principal components with dimensions 'time_dim', 'sdate_dim', number of eof, rest of the dimensions of 'data' except 'lon_dim' and 'lat_dim'. |
variance |
An 's2dv_cube' with the data element being an array of explained variances with dimensions 'eof' and the rest of the dimensions of 'data' except 'time_dim', 'sdate_dim', 'lon_dim' and 'lat_dim'. |
eof_pattern |
An 's2dv_cube' with the data element being an array of EOF patterns obtained by regression with dimensions: 'eof' and the rest of the dimensions of 'data' except 'time_dim' and 'sdate_dim'. |
mask |
An 's2dv_cube' with the data element being an array of the mask with dimensions ('lon_dim', 'lat_dim', rest of the dimensions of 'data' except 'time_dim'). It is made from 'data', 1 for the positions that 'data' has value and NA for the positions that 'data' has NA. It is used to replace NAs with 0s for EOF calculation and mask the result with NAs again after the calculation. |
coordinates |
Longitudinal and latitudinal coordinates vectors. |
Author(s)
Jost von Hardenberg - ISAC-CNR, j.vonhardenberg@isac.cnr.it
Paolo Davini - ISAC-CNR, p.davini@isac.cnr.it
Examples
seq <- 1 : (2 * 3 * 4 * 5 * 6 * 8)
mod1 <- sin( 0.7 + seq )^2 + cos( seq ^ 2 * 1.22 )
dim(mod1) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6,
lon = 8)
mod2 <- sin( seq * 2 ) ^ 3 + cos( seq ^ 2 )
dim(mod2) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6,
lon = 8)
lon <- seq(0, 35, 5)
lat <- seq(0, 25, 5)
exp1 <- list(data = mod1, coords = list(lat = lat, lon = lon))
exp2 <- list(data = mod2, coords = list(lat = lat, lon = lon))
attr(exp1, 'class') <- 's2dv_cube'
attr(exp2, 'class') <- 's2dv_cube'
d = as.POSIXct(c("2017/01/01", "2017/01/02", "2017/01/03", "2017/01/04",
"2017/01/05", "2018/01/01", "2018/01/02", "2018/01/03",
"2018/01/04", "2018/01/05", "2019/01/01", "2019/01/02",
"2019/01/03", "2019/01/04", "2019/01/05", "2020/01/01",
"2020/01/02", "2020/01/03", "2020/01/04", "2020/01/05"))
exp1$attrs$Dates = d
exp2$attrs$Dates = d
cal <- CST_MultiEOF(datalist = list(exp1, exp2), neof_composed = 2)
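The analysis can be restricted to a sub-region through the documented 'lon_lim' and 'lat_lim' arguments; a minimal sketch with the same inputs (the limits are illustrative):
cal_reg <- CST_MultiEOF(datalist = list(exp1, exp2), neof_composed = 2,
                        lon_lim = c(5, 30), lat_lim = c(5, 25))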
Multiple Metrics applied in Multiple Model Anomalies
Description
This function calculates correlation (Anomaly Correlation Coefficient; ACC), root mean square error (RMS) and the root mean square error skill score (RMSSS) of individual anomaly models and multi-models mean (if desired) with the observations.
Usage
CST_MultiMetric(
exp,
obs,
metric = "correlation",
multimodel = TRUE,
time_dim = "ftime",
memb_dim = "member",
sdate_dim = "sdate"
)
Arguments
exp |
An object of class |
obs |
An object of class |
metric |
A character string giving the metric for computing the maximum skill. This must be one of the strings 'correlation', 'rms', 'rmsss' or 'rpss'. If 'rpss' is chosen, the tercile probabilities are evaluated. |
multimodel |
A logical value indicating whether a Multi-Model Mean should be computed. |
time_dim |
Name of the temporal dimension where a mean will be applied. It can be NULL, the default value is 'ftime'. |
memb_dim |
Name of the member dimension. It can be NULL, the default value is 'member'. |
sdate_dim |
Name of the start date dimension or a dimension name identifying the different forecasts. It can be NULL, the default value is 'sdate'. |
Value
An object of class s2dv_cube containing the statistics of the
selected metric in the element $data, which is a list of arrays: one for the
metric requested and others for statistics about its significance. The arrays
have two dataset dimensions equal to the 'dataset' dimension in the
exp$data and obs$data inputs. If multimodel is TRUE, the
first position in the first 'nexp' dimension corresponds to the Multi-Model Mean.
Author(s)
Mishra Niti, niti.mishra@bsc.es
Perez-Zanon Nuria, nuria.perez@bsc.es
References
Mishra, N., Prodhomme, C., & Guemas, V. (2018). Multi-Model Skill Assessment of Seasonal Temperature and Precipitation Forecasts over Europe. Climate Dynamics. doi:10.1007/s00382-018-4404-z
Examples
mod <- rnorm(2*2*4*5*2*2)
dim(mod) <- c(dataset = 2, member = 2, sdate = 4, ftime = 5, lat = 2, lon = 2)
obs <- rnorm(1*1*4*5*2*2)
dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 2, lon = 2)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lat = lat, lon = lon)
exp <- list(data = mod, coords = coords)
obs <- list(data = obs, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
attr(obs, 'class') <- 's2dv_cube'
a <- CST_MultiMetric(exp = exp, obs = obs)
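A minimal sketch selecting another documented metric and disabling the multi-model mean, with the same objects:
b <- CST_MultiMetric(exp = exp, obs = obs, metric = "rpss", multimodel = FALSE)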
Multivariate Root Mean Square Error (RMSE)
Description
This function calculates the RMSE from multiple variables, as the mean of each variable's RMSE scaled by its observed standard deviation. Variables can be weighted based on their relative importance (defined by the user).
Usage
CST_MultivarRMSE(
exp,
obs,
weight = NULL,
memb_dim = "member",
dat_dim = "dataset",
sdate_dim = "sdate",
ftime_dim = "ftime"
)
Arguments
exp |
A list of objects, one for each variable, of class 's2dv_cube'. |
obs |
A list of objects, one for each variable (in the same order as
the input in 'exp'), of class 's2dv_cube'. |
weight |
(optional) A vector of weight values to assign to each variable. If no weights are defined, a value of 1 is assigned to every variable. |
memb_dim |
A character string indicating the name of the member dimension. It must be one dimension in 'exp' and 'obs'. The default value is 'member'. |
dat_dim |
A character string indicating the name of the dataset dimension. It must be one dimension in 'exp' and 'obs'. If there is no dataset dimension, it can be NULL. The default value is 'dataset'. |
sdate_dim |
A character string indicating the name of the start date dimension. It must be one dimension in 'exp' and 'obs'. The default value is 'sdate'. |
ftime_dim |
A character string indicating the name of the forecast time dimension. It must be one dimension in 'exp' and 'obs'. The default value is 'ftime'. |
Value
An object of class s2dv_cube containing the RMSE in the element
$data, which is an array with two dataset dimensions equal to the
'dataset' dimension in the exp$data and obs$data inputs. An
array with dimensions: c(number of exp, number of obs, 1 (the multivariate
RMSE value), number of lat, number of lon).
Author(s)
Deborah Verfaillie, deborah.verfaillie@bsc.es
Examples
# Example with 2 variables
mod1 <- abs(rnorm(1 * 3 * 4 * 5 * 6 * 7))
mod2 <- abs(rnorm(1 * 3 * 4 * 5 * 6 * 7))
dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
dim(mod2) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
obs1 <- abs(rnorm(1 * 1 * 4 * 5 * 6 * 7))
obs2 <- abs(rnorm(1 * 1 * 4 * 5 * 6 * 7))
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
dim(obs2) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lat = lat, lon = lon)
exp1 <- list(data = mod1, coords = coords,
attrs = list(Datasets = "EXP1", source_files = "file1",
Variable = list(varName = 'pre')))
exp2 <- list(data = mod2, coords = coords,
attrs = list(Datasets = "EXP2", source_files = "file2",
Variable = list(varName = 'tas')))
obs1 <- list(data = obs1, coords = coords,
attrs = list(Datasets = "OBS1", source_files = "file1",
Variable = list(varName = 'pre')))
obs2 <- list(data = obs2, coords = coords,
attrs = list(Datasets = "OBS2", source_files = "file2",
Variable = list(varName = 'tas')))
attr(exp1, 'class') <- 's2dv_cube'
attr(exp2, 'class') <- 's2dv_cube'
attr(obs1, 'class') <- 's2dv_cube'
attr(obs2, 'class') <- 's2dv_cube'
anom1 <- CST_Anomaly(exp1, obs1, cross = TRUE, memb = TRUE)
anom2 <- CST_Anomaly(exp2, obs2, cross = TRUE, memb = TRUE)
ano_exp <- list(anom1$exp, anom2$exp)
ano_obs <- list(anom1$obs, anom2$obs)
a <- CST_MultivarRMSE(exp = ano_exp, obs = ano_obs, weight = c(1, 2))
Computing two dynamical proxies of the attractor in s2dv_cube.
Description
This function computes two dynamical proxies of the attractor: the local dimension (d) and the inverse of the persistence (theta) for an 's2dv_cube' object. These two parameters will be used as a condition for the computation of dynamical scores to measure predictability and to compute bias correction conditioned by the dynamics with the function DynBiasCorrection. The function is based on the MATLAB code (davide.faranda@lsce.ipsl.fr) used in the references below.
Usage
CST_ProxiesAttractor(data, quanti, ncores = NULL)
Arguments
data |
An s2dv_cube object with the data to create the attractor. Must be a matrix with the time steps in rows and the grid points in columns, i.e. dat(time, grids). |
quanti |
A number lower than 1 indicating the quantile to perform the computation of local dimension and theta. |
ncores |
The number of cores to use in parallel computation. |
Value
The local dimension 'dim' and the inverse of the persistence 'theta'.
Author(s)
Carmen Alvarez-Castro, carmen.alvarez-castro@cmcc.it
Maria M. Chaves-Montero, mdm.chaves-montero@cmcc.it
Veronica Torralba, veronica.torralba@cmcc.it
Davide Faranda, davide.faranda@lsce.ipsl.fr
References
Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large scale atmospheric predictability. Nature Communications, 10(1), 1316. doi:10.1038/s41467-019-09305-8
Faranda, D., Messori, G., and Yiou, P. (2017). Dynamical proxies of North Atlantic predictability and extremes. Scientific Reports, 7, 41278.
Examples
# Example 1: Computing the attractor using simple s2dv data
obs <- rnorm(2 * 3 * 4 * 8 * 8)
dim(obs) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
lon <- seq(10, 13.5, 0.5)
lat <- seq(40, 43.5, 0.5)
coords <- list(lon = lon, lat = lat)
data <- list(data = obs, coords = coords)
class(data) <- "s2dv_cube"
attractor <- CST_ProxiesAttractor(data = data, quanti = 0.6)
Quantile Mapping for seasonal or decadal forecast data
Description
This function is a wrapper of fitQmap and doQmap from package 'qmap' to be applied on the object of class 's2dv_cube'. The quantile mapping adjustment between an experiment, typically a hindcast, and observation is applied to the experiment itself or to a provided forecast.
Usage
CST_QuantileMapping(
exp,
obs,
exp_cor = NULL,
sdate_dim = "sdate",
memb_dim = "member",
window_dim = NULL,
method = "QUANT",
na.rm = FALSE,
eval.method = "leave-k-out",
k = 1,
ncores = NULL,
...
)
Arguments
exp |
An object of class 's2dv_cube' containing the hindcast data. |
obs |
An object of class 's2dv_cube' containing the reference (observed) data. |
exp_cor |
A multidimensional array with named dimensions in which the quantile mapping correction should be applied. If it is not specified, the correction is applied to object 'exp' using leave-one-out cross-validation. This is useful to correct a forecast when the hindcast is provided in parameter 'exp'. |
sdate_dim |
A character string indicating the dimension name in which cross-validation would be applied when exp_cor is not provided. 'sdate' by default. |
memb_dim |
A character string indicating the dimension name where ensemble members are stored in the experimental arrays. It can be NULL if there is no ensemble member dimension. It is set as 'member' by default. |
window_dim |
A character string indicating the dimension name in which extra samples are stored. This dimension is joined to the 'member' dimension. This is useful to correct daily data, for which robust statistics can be obtained by creating a window of dates around the target date. |
method |
A character string indicating the method to be used: 'PTF', 'DIST', 'RQUANT', 'QUANT', 'SSPLIN'. By default, the empirical quantile mapping 'QUANT' is used. |
na.rm |
A logical value indicating if missing values should be removed (FALSE by default). |
eval.method |
A character string indicating the evaluation method for cross-validation. The default method is 'leave-k-out'; other available methods are 'retrospective', 'in-sample' and 'hindcast-vs-forecast'. |
k |
Positive integer. Default = 1.
In the 'leave-k-out' method, the number of cases (e.g. start dates) left out of the training set around each evaluation case. |
ncores |
An integer indicating the number of cores for parallel computation using multiApply function. The default value is NULL (1). |
... |
Additional parameters to be used by the method chosen. See the qmap package for details. |
Value
An object of class s2dv_cube containing the experimental data
after applying the quantile mapping correction.
Author(s)
Nuria Perez-Zanon, nuria.perez@bsc.es
Examples
# Use synthetic data
exp <- NULL
exp$data <- 1 : c(1 * 3 * 5 * 4 * 3 * 2)
dim(exp$data) <- c(dataset = 1, member = 3, sdate = 5, ftime = 4,
lat = 3, lon = 2)
class(exp) <- 's2dv_cube'
obs <- NULL
obs$data <- 101 : c(100 + 1 * 1 * 5 * 4 * 3 * 2)
dim(obs$data) <- c(dataset = 1, member = 1, sdate = 5, ftime = 4,
lat = 3, lon = 2)
class(obs) <- 's2dv_cube'
res <- CST_QuantileMapping(exp, obs)
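A minimal sketch selecting another documented mapping method, with the same synthetic data:
res_rquant <- CST_QuantileMapping(exp, obs, method = "RQUANT")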
RainFARM spectral slopes from a CSTools object
Description
This function computes spatial spectral slopes from a CSTools object, to be used for the RainFARM stochastic precipitation downscaling method. It accepts a CSTools object (of the class 's2dv_cube') as input.
Usage
CST_RFSlope(data, kmin = 1, time_dim = NULL, ncores = NULL)
Arguments
data |
An object of the class 's2dv_cube', containing the spatial
precipitation fields to downscale. The data object is expected to have an
element named $data. |
kmin |
First wavenumber for spectral slope (default kmin = 1). |
time_dim |
String or character array with name(s) of dimension(s) (e.g. "ftime", "sdate", "member" ...) over which to compute spectral slopes. If a character array of dimension names is provided, the spectral slopes will be computed as an average over all elements belonging to those dimensions. If omitted one of c("ftime", "sdate", "time") is searched and the first one with more than one element is chosen. |
ncores |
An integer that indicates the number of cores for parallel computations using the multiApply function. The default value is NULL (one core). |
Value
CST_RFSlope() returns spectral slopes using the RainFARM convention
(the logarithmic slope of k*|A(k)|^2 where A(k) are the spectral amplitudes).
The returned array has the same dimensions as the data element of the
input object, minus the dimensions specified by lon_dim,
lat_dim and time_dim.
Author(s)
Jost von Hardenberg - ISAC-CNR, j.vonhardenberg@isac.cnr.it
Examples
exp <- 1 : (2 * 3 * 4 * 8 * 8)
dim(exp) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
lon <- seq(10, 13.5, 0.5)
lat <- seq(40, 43.5, 0.5)
coords <- list(lon = lon, lat = lat)
data <- list(data = exp, coords = coords)
class(data) <- 's2dv_cube'
slopes <- CST_RFSlope(data)
Temperature downscaling of a CSTools object using lapse rate correction or a reference field
Description
This function implements a simple lapse rate correction of a temperature field (an object of class 's2dv_cube' as provided by 'CST_Load') used as input. The input lon grid must be increasing (but can be modulo 360). The input lat grid can be irregularly spaced (e.g. a Gaussian grid). The output grid can be irregularly spaced in lon and/or lat.
Usage
CST_RFTemp(
data,
oro,
xlim = NULL,
ylim = NULL,
lapse = 6.5,
lon_dim = "lon",
lat_dim = "lat",
time_dim = NULL,
nolapse = FALSE,
verbose = FALSE,
compute_delta = FALSE,
method = "bilinear",
delta = NULL
)
Arguments
data |
An object of the class 's2dv_cube' as returned by 'CST_Load',
containing the temperature fields to downscale. The data object is expected
to have an element named $data. |
oro |
An object of the class 's2dv_cube' as returned by 'CST_Load', containing fine scale orography (in meters). The destination downscaling area must be contained in the orography field. |
xlim |
Vector with longitude bounds for downscaling; the full input field is downscaled if 'xlim' and 'ylim' are not specified. |
ylim |
Vector with latitude bounds for downscaling. |
lapse |
Float with the environmental lapse rate. |
lon_dim |
String with the name of the longitude dimension. |
lat_dim |
String with the name of the latitude dimension. |
time_dim |
A character vector indicating the name of the temporal dimension. By default, it is set to NULL and "ftime", "sdate" and "time" are considered as temporal dimensions. |
nolapse |
Logical, if true 'oro' is interpreted as a fine-scale climatology and used directly for bias correction. |
verbose |
Logical indicating whether to print diagnostic output. |
compute_delta |
Logical; if TRUE, the function returns only a delta to be used for out-of-sample forecasts, as an object of the class 's2dv_cube' containing a delta. Activates 'nolapse = TRUE'. |
method |
String indicating the method used for interpolation: "nearest" (nearest neighbours followed by smoothing with a circular uniform weights kernel) or "bilinear" (bilinear interpolation). The two methods provide similar results, but "nearest" is slightly better provided that the fine-scale grid is correctly centered as a subdivision of the large-scale grid. |
delta |
An object of the class 's2dv_cube', containing a delta to be applied to the downscaled input data. Activates 'nolapse = TRUE'. The grid of this object must coincide with that of the required output. |
Value
CST_RFTemp() returns a downscaled CSTools object (i.e., of the class 's2dv_cube').
Author(s)
Jost von Hardenberg - ISAC-CNR, j.vonhardenberg@isac.cnr.it
References
Method described in ERA4CS MEDSCOPE milestone M3.2: High-quality climate prediction data available to WP4 here: https://www.medscope-project.eu/the-project/deliverables-reports/ and in H2020 ECOPOTENTIAL Deliverable No. 8.1: High resolution (1-10 km) climate, land use and ocean change scenarios available here: https://ec.europa.eu/research/participants/documents/downloadPublic?documentIds=080166e5b6cd2324&appId=PPGMS
Examples
# Generate simple synthetic data and downscale by factor 4
t <- rnorm(7 * 6 * 2 * 3 * 4)*10 + 273.15 + 10
dim(t) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 6, lon = 7)
lon <- seq(3, 9, 1)
lat <- seq(42, 47, 1)
coords <- list(lat = lat, lon = lon)
exp <- list(data = t, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
o <- runif(29*29)*3000
dim(o) <- c(lat = 29, lon = 29)
lon <- seq(3, 10, 0.25)
lat <- seq(41, 48, 0.25)
coords <- list(lat = lat, lon = lon)
oro <- list(data = o, coords = coords)
attr(oro, 'class') <- 's2dv_cube'
res <- CST_RFTemp(data = exp, oro = oro, xlim = c(4,8), ylim = c(43, 46),
lapse = 6.5, time_dim = 'ftime',
lon_dim = 'lon', lat_dim = 'lat')
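A minimal sketch of the documented two-step delta workflow ('compute_delta' and 'delta'), reusing the objects above (the delta is computed and applied on the same grid, as required):
# Compute a delta climatology first, then apply it to (out-of-sample) data
del <- CST_RFTemp(data = exp, oro = oro, xlim = c(4, 8), ylim = c(43, 46),
                  time_dim = 'ftime', compute_delta = TRUE)
res2 <- CST_RFTemp(data = exp, oro = oro, xlim = c(4, 8), ylim = c(43, 46),
                   time_dim = 'ftime', delta = del)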
Compute climatological weights for RainFARM stochastic precipitation downscaling
Description
Compute climatological ("orographic") weights from a fine-scale precipitation climatology file.
Usage
CST_RFWeights(
climfile,
nf,
lon,
lat,
varname = NULL,
fsmooth = TRUE,
lonname = "lon",
latname = "lat",
ncores = NULL
)
Arguments
climfile |
Filename of a fine-scale precipitation climatology. The file is expected to be in NetCDF format and should contain at least one precipitation field. If several fields at different times are provided, a climatology is derived by time averaging. Suitable climatology files could be, for example, a fine-scale precipitation climatology from a high-resolution regional climate model (see e.g. Terzago et al. 2018), a local high-resolution gridded climatology from observations, or a reconstruction such as those which can be downloaded from the WORLDCLIM (https://www.worldclim.org) or CHELSA (https://chelsa-climate.org/) websites. The latter data will need to be converted to NetCDF format before being used (see for example the GDAL tools, https://gdal.org/). It could also be an 's2dv_cube' object. |
nf |
Refinement factor for downscaling (the output resolution is increased by this factor). |
lon |
Vector of longitudes. |
lat |
Vector of latitudes. The numbers of longitudes and latitudes are expected to be even and equal. If not, the function will subset the data to ensure this condition. |
varname |
Name of the variable to be read from climfile. |
fsmooth |
Logical to use smooth conservation (default) or large-scale box-average conservation. |
lonname |
A character string indicating the name of the longitudinal dimension set as 'lon' by default. |
latname |
A character string indicating the name of the latitudinal dimension set as 'lat' by default. |
ncores |
An integer that indicates the number of cores for parallel computations using the multiApply function. The default value is NULL (one core). |
Value
An object of class 's2dv_cube' containing the weights in the element
data, a matrix with dimensions (lon, lat).
Author(s)
Jost von Hardenberg - ISAC-CNR, j.vonhardenberg@isac.cnr.it
References
Terzago, S., Palazzi, E., & von Hardenberg, J. (2018). Stochastic downscaling of precipitation in complex orography: A simple method to reproduce a realistic fine-scale climatology. Natural Hazards and Earth System Sciences, 18(11), 2825-2840. doi:10.5194/nhess-18-2825-2018.
Examples
# Create weights to be used with the CST_RainFARM() or RainFARM() functions
# using external random data in the form of an 's2dv_cube'.
obs <- rnorm(2 * 3 * 4 * 8 * 8)
dim(obs) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
lon <- seq(10, 13.5, 0.5)
lat <- seq(40, 43.5, 0.5)
coords <- list(lon = lon, lat = lat)
data <- list(data = obs, coords = coords)
class(data) <- "s2dv_cube"
res <- CST_RFWeights(climfile = data, nf = 3, lon, lat, lonname = 'lon',
latname = 'lat', fsmooth = TRUE)
RainFARM stochastic precipitation downscaling of a CSTools object
Description
This function implements the RainFARM stochastic precipitation downscaling method and accepts a CSTools object (an object of the class 's2dv_cube' as provided by 'CST_Load') as input. Adapted for climate downscaling and including orographic correction as described in Terzago et al. 2018.
Usage
CST_RainFARM(
data,
weights = 1,
slope = 0,
nf,
kmin = 1,
nens = 1,
fglob = FALSE,
fsmooth = TRUE,
nprocs = 1,
time_dim = NULL,
verbose = FALSE,
drop_realization_dim = FALSE
)
Arguments
data |
An object of the class 's2dv_cube' as returned by 'CST_Load',
containing the spatial precipitation fields to downscale.
The data object is expected to have an element named $data. |
weights |
Matrix with climatological weights which can be obtained using
the CST_RFWeights function (default: 1, i.e. no weights). |
slope |
Prescribed spectral slope. The default is 0, meaning that the slope is determined automatically from the data over the dimensions specified by time_dim. |
nf |
Refinement factor for downscaling (the output resolution is increased by this factor). |
kmin |
First wavenumber for spectral slope (default: |
nens |
Number of ensemble members to produce (default: |
fglob |
Logical to conserve global precipitation over the domain (default: FALSE). |
fsmooth |
Logical to conserve precipitation with a smoothing kernel (default: TRUE). |
nprocs |
The number of parallel processes to spawn for parallel computation on multiple cores (default: 1). |
time_dim |
String or character array with name(s) of dimension(s) (e.g. "ftime", "sdate", "member" ...) over which to compute spectral slopes. If a character array of dimension names is provided, the spectral slopes will be computed as an average over all elements belonging to those dimensions. If omitted one of c("ftime", "sdate", "time") is searched and the first one with more than one element is chosen. |
verbose |
Logical for verbose output (default: FALSE). |
drop_realization_dim |
Logical to remove the "realization" stochastic ensemble dimension, needed for saving data through function CST_SaveData (default: FALSE) with the following behaviour if set to TRUE:
|
Details
If the parameters 'slope' and 'weights' present a seasonality dependency, a dimension name should match between these parameters and the input data in parameter 'data'. See Example 2 below, where weights and slope vary with the 'sdate' dimension.
Value
CST_RainFARM() returns a downscaled CSTools object (i.e., of the
class 's2dv_cube'). If nens > 1 an additional dimension named
"realization" is added to the $data array after the "member" dimension
(unless drop_realization_dim = TRUE is specified). The ordering of the
remaining dimensions in the $data element of the input object is
maintained.
Author(s)
Jost von Hardenberg - ISAC-CNR, j.vonhardenberg@isac.cnr.it
References
Terzago, S. et al. (2018). NHESS, 18(11), 2825-2840. doi:10.5194/nhess-18-2825-2018; D'Onofrio et al. (2014), Journal of Hydrometeorology, 15, 830-843; Rebora et al. (2006), JHM, 7, 724.
Examples
# Example 1: using CST_RainFARM for a CSTools object
nf <- 8 # Choose a downscaling by factor 8
exp <- 1 : (2 * 3 * 4 * 8 * 8)
dim(exp) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
lon <- seq(10, 13.5, 0.5)
lat <- seq(40, 43.5, 0.5)
coords <- list(lon = lon, lat = lat)
data <- list(data = exp, coords = coords)
class(data) <- 's2dv_cube'
# Create a test array of weights
ww <- array(1., dim = c(lon = 8 * nf, lat = 8 * nf))
res <- CST_RainFARM(data, nf = nf, weights = ww, nens = 3, time_dim = 'ftime')
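# Example 2 (illustrative sketch, with placeholder values): 'weights' and
# 'slope' varying along the 'sdate' dimension of the input data, as
# described in Details.
ww_seas <- array(1., dim = c(lon = 8 * nf, lat = 8 * nf, sdate = 3))
slope_seas <- array(c(1.5, 1.6, 1.7), dim = c(sdate = 3))
res2 <- CST_RainFARM(data, nf = nf, weights = ww_seas, slope = slope_seas,
                     nens = 3, time_dim = 'ftime')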
Function for matching a field of anomalies with a set of maps used as a reference (e.g. clusters obtained from the WeatherRegime function)
Description
This function performs the matching between a field of anomalies and a set of maps which will be used as a reference. The anomalies will be assigned to the reference map for which the minimum Euclidean distance (method = 'distance') or the highest spatial correlation (method = 'ACC') is obtained.
Usage
CST_RegimesAssign(
data,
ref_maps,
method = "distance",
composite = FALSE,
memb = FALSE,
ncores = NULL
)
Arguments
data |
An 's2dv_cube' object. |
ref_maps |
An 's2dv_cube' object as the output of CST_WeatherRegimes. |
method |
Whether the matching will be performed in terms of minimum distance (default = 'distance') or the maximum spatial correlation (method = 'ACC') between the maps. |
composite |
A logical parameter indicating if the composite maps are computed or not (default = FALSE). |
memb |
A logical value indicating whether to compute composites for separate members (default FALSE) or as a unique ensemble (TRUE). This option is only available when parameter 'composite' is set to TRUE and the data object has a dimension named 'member'. |
ncores |
The number of multicore threads to use for parallel computation. |
Value
A list with two elements: $data (a 's2dv_cube' object containing
the composites cluster = 1, ..., K for case (*1) or only k = 1 for any
specific cluster, i.e., case (*2); only when composite = TRUE) and
$statistics, which includes $pvalue (an array with the same structure as
$data containing the p-value of the composites obtained through a t-test
that accounts for the serial dependence of the data; only when
composite = TRUE), $cluster (an array with the same dimensions as the data,
except latitude and longitude which are removed, indicating the reference
map to which each point is allocated) and $frequency (a vector of integers,
from k = 1, ..., n reference maps, indicating the percentage of assignments
corresponding to each map).
Author(s)
Verónica Torralba - BSC, veronica.torralba@bsc.es
References
Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools for the development of a climate service. Thesis. Available online: https://eprints.ucm.es/56841/
Examples
data <- array(abs(rnorm(1280, 282.7, 6.4)), dim = c(dataset = 2, member = 2,
sdate = 3, ftime = 3,
lat = 4, lon = 4))
coords <- list(lon = seq(0, 3), lat = seq(47, 44))
exp <- list(data = data, coords = coords)
class(exp) <- 's2dv_cube'
regimes <- CST_WeatherRegimes(data = exp, EOFs = FALSE,
ncenters = 4)
res1 <- CST_RegimesAssign(data = exp, ref_maps = regimes,
composite = FALSE)
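# Illustrative sketch (reusing the objects above): computing composite maps
# for each regime, with composites for separate members via 'memb'.
res2 <- CST_RegimesAssign(data = exp, ref_maps = regimes,
                          composite = TRUE, memb = TRUE)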
Save objects of class 's2dv_cube' to data in NetCDF format
Description
This function divides and saves an object of class
's2dv_cube' into NetCDF files, allowing the saved data to be reloaded using
the CST_Start or CST_Load functions. It can also save any
's2dv_cube' object that follows the NetCDF attribute conventions.
Usage
CST_SaveExp(
data,
destination = tempdir(),
startdates = NULL,
sdate_dim = "sdate",
ftime_dim = "time",
memb_dim = "member",
dat_dim = "dataset",
var_dim = "var",
drop_dims = NULL,
single_file = FALSE,
extra_string = NULL,
global_attrs = NULL,
units_hours_since = FALSE
)
Arguments
data |
An object of class |
destination |
A character string containing the directory name in which to save the data. A NetCDF file for each start date is saved into the folder tree: 'destination/Dataset/variable/'. By default the function saves the data into the working directory. |
startdates |
A vector of dates that will be used for the filenames
when saving the data in multiple files (single_file = FALSE). It must be a
vector of the same length as the start date dimension of data. It must be a
vector of class |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. It can be NULL if there is no start date dimension. |
ftime_dim |
A character string indicating the name of the forecast time dimension. If 'Dates' are used, it can't be NULL. If there is no forecast time dimension, 'Dates' will be set to NULL and will not be used. By default, it is set to 'time'. |
memb_dim |
A character string indicating the name of the member dimension. It can be NULL if there is no member dimension. By default, it is set to 'member'. |
dat_dim |
A character string indicating the name of dataset dimension. It can be NULL if there is no dataset dimension. By default, it is set to 'dataset'. |
var_dim |
A character string indicating the name of variable dimension. It can be NULL if there is no variable dimension. By default, it is set to 'var'. |
drop_dims |
(optional) A vector of character strings indicating the names of the dimensions of length 1 that need to be dropped so that they do not appear in the NetCDF file. Only dimensions that are not used in the computation are allowed to be dropped. The dimensions used in the computation are the ones specified in: sdate_dim, ftime_dim, dat_dim, var_dim and memb_dim. It is NULL by default. |
single_file |
A logical value indicating whether the whole object is saved in a single file (TRUE) or in multiple files (FALSE). When it is FALSE, the array is separated by dataset, variable and start date. When there are no specified time dimensions, the data will be saved in a single file by default. The output file name when 'single_file' is TRUE is a character string containing: '<var>_<first_sdate>_<last_sdate>.nc'; when it is FALSE, it is '<var>_<sdate>.nc'. It is FALSE by default. |
extra_string |
(Optional) A character string to be included as part of the file name, for instance, to identify a member or realization. When single_file is TRUE, the 'extra_string' will replace the whole default file name; when single_file is FALSE, the 'extra_string' will be added to the file name as: '<var>_<extra_string>_<sdate>.nc'. It is NULL by default. |
global_attrs |
(Optional) A list with elements containing the global attributes to be saved in the NetCDF. |
units_hours_since |
(Optional) A logical value only available for the case: 'Dates' have forecast time and start date dimension, 'single_file' is TRUE and 'time_bounds' are not used. When it is TRUE, it saves the forecast time with units of 'hours since'; if it is FALSE, the time units will be a number of time steps with its corresponding frequency (e.g. n days, n months or n hours). It is FALSE by default. |
Value
Multiple or single NetCDF files containing the data array.
single_file is TRUE |
All data is saved in a single file located in the specified destination path with the following name (by default): '<variable_name>_<first_sdate>_<last_sdate>.nc'. Multiple variables are saved separately in the same file. The forecast time units are calculated from each start date (if sdate_dim is not NULL) or from the time step. If 'units_hours_since' is TRUE, the forecast time units will be 'hours since <each start date>'. If 'units_hours_since' is FALSE, the forecast time units are extracted from the frequency of the time steps (hours, days, months); if no frequency is found, the units will be 'hours since'. When the time units are 'hours since', the time steps are assumed to be equally spaced. |
single_file is FALSE |
The data array is subset and stored into multiple files. Each file contains the data subset for each start date, variable and dataset. Files with different variables and datasets are stored in separate directories within the following directory tree: 'destination/Dataset/variable/'. The name of each file will be by default: '<variable_name>_<sdate>.nc'. The forecast time units are calculated from each start date (if sdate_dim is not NULL) or from the time step. The forecast time units will be 'hours since <each start date>'. |
Author(s)
Perez-Zanon Nuria, nuria.perez@bsc.es
See Also
Start, as.s2dv_cube and
s2dv_cube
Examples
data <- lonlat_temp_st$exp
CST_SaveExp(data = data, ftime_dim = 'ftime', var_dim = 'var',
dat_dim = 'dataset', sdate_dim = 'sdate')
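# Illustrative sketch (not from the original manual): saving the whole
# object into a single NetCDF file with 'hours since' time units.
CST_SaveExp(data = data, ftime_dim = 'ftime', var_dim = 'var',
            dat_dim = 'dataset', sdate_dim = 'sdate',
            single_file = TRUE, units_hours_since = TRUE)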
Function to Split Dimension
Description
This function splits a dimension into two. The user can select the dimension to split and provide indices indicating how to split that dimension, or dates and the expected frequency (monthly or by day, month and year). The user can also provide a numeric frequency indicating the length of each division.
Usage
CST_SplitDim(
data,
split_dim = "time",
indices = NULL,
freq = "monthly",
new_dim_name = NULL,
insert_ftime = NULL,
ftime_dim = "time",
sdate_dim = "sdate",
return_indices = FALSE
)
Arguments
data |
A 's2dv_cube' object |
split_dim |
A character string indicating the name of the dimension to split. It is set as 'time' by default. |
indices |
A vector of numeric indices or dates. If left at NULL, the dates provided in the s2dv_cube object (element Dates) will be used. |
freq |
A character string indicating the frequency: by 'day', 'month' and 'year' or 'monthly' (by default). 'month' identifies months between 1 and 12 independently of the year they belong to, while 'monthly' differentiates months from different years. |
new_dim_name |
A character string indicating the name of the new dimension. |
insert_ftime |
An integer indicating the number of time steps to add at the beginning of the time series. |
ftime_dim |
A character string indicating the name of the forecast time dimension. It is set as 'time' by default. |
sdate_dim |
A character string indicating the name of the start date dimension. It is set as 'sdate' by default. |
return_indices |
A logical value that if it is TRUE, the indices used in splitting the dimension will be returned. It is FALSE by default. |
Details
Parameter 'insert_ftime' has been included for the case of using
daily data that requires splitting the temporal dimension by months (or
similar) when the first lead time does not correspond to the first day of
the month. In this case, 'insert_ftime' can be used to obtain a correctly
organized final output. E.g.: lead time 1 is the 2nd of November and the
input time series extends to the 31st of December. When splitting by month
with insert_ftime = 1, the 'monthly' dimension of length two will indicate
the month (position 1 for November and position 2 for December), and
dimension 'time' will be of length 31. For November, positions 1 and 31
will be NAs, while positions 2 to 30 will be filled with the data provided.
This allows selecting the correct days through the time dimension.
Value
An object of class 's2dv_cube' with the specified dimension split
into a new dimension, preserving all other original dimensions, coordinates,
and attributes (including Dates).
Author(s)
Nuria Perez-Zanon, nuria.perez@bsc.es
Examples
data <- 1 : 20
dim(data) <- c(time = 10, lat = 2)
data <-list(data = data)
class(data) <- 's2dv_cube'
indices <- c(rep(1,5), rep(2,5))
new_data <- CST_SplitDim(data, indices = indices)
time <- c(seq(ISOdate(1903, 1, 1), ISOdate(1903, 1, 4), "days"),
seq(ISOdate(1903, 2, 1), ISOdate(1903, 2, 4), "days"),
seq(ISOdate(1904, 1, 1), ISOdate(1904, 1, 2), "days"))
data <- list(data = data$data, Dates = time)
class(data) <- 's2dv_cube'
new_data <- CST_SplitDim(data, indices = time)
new_data <- CST_SplitDim(data, indices = time, freq = 'day')
new_data <- CST_SplitDim(data, indices = time, freq = 'month')
new_data <- CST_SplitDim(data, indices = time, freq = 'year')
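# Illustrative sketch (reusing the objects above): naming the new dimension
# explicitly through 'new_dim_name'.
new_data <- CST_SplitDim(data, indices = time, freq = 'monthly',
                         new_dim_name = 'month_year')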
CSTools data retrieval function using Start
Description
This function aggregates, subsets and retrieves sub-seasonal, seasonal,
decadal or climate projection data from NetCDF files in a local file system
and arranges it for easy application of the CSTools functions. It calls the
function Start from startR, an R package developed at BSC to
automatically process large multidimensional distributed data sets. The
output is then transformed into an 's2dv_cube' object.
Usage
CST_Start(...)
Arguments
... |
Parameters that are automatically forwarded to the 'startR::Start' function. See details in '?startR::Start'. |
Details
It receives any number of parameters ('...') that are automatically forwarded to the 'startR::Start' function. See details in '?startR::Start'. The auxiliary startR functions (e.g. indices(), values(), Sort(), CircularSort(), CDORemapper(), ...) can be used to define dimensions.
CST_Start() uses as.s2dv_cube() to transform the output into an s2dv_cube object. The as.s2dv_cube() function is designed to be used with data that has been retrieved into memory. To avoid errors, please ensure that CST_Start(..., retrieve = TRUE) is specified.
Value
An object of class 's2dv_cube' containing the subsetted and aggregated
data retrieved from the NetCDF files using startR. The data structure
is compatible with CSTools functions.
Examples
## Not run:
sdates <- c('20101101', '20111101', '20121101')
latmin <- 44
latmax <- 47
lonmin <- 6
lonmax <- 9
data <- CST_Start(dat = path,
var = 'prlr',
ensemble = indices(1:6),
sdate = sdates,
time = 121:151,
latitude = values(list(latmin, latmax)),
longitude = values(list(lonmin, lonmax)),
synonims = list(longitude = c('lon', 'longitude'),
latitude = c('lat', 'latitude')),
return_vars = list(time = 'sdate',
longitude = NULL, latitude = NULL),
retrieve = TRUE)
## End(Not run)
Subset an object of class s2dv_cube
Description
This function subsets (i.e. slices, takes a chunk of) the data inside
an object of class s2dv_cube and modifies the dimensions, coordinates and
attributes accordingly, removing any variables, time steps and spatial
coordinates that are dropped when subsetting. It ensures that the information
inside the s2dv_cube remains coherent with the data it contains.
As in the function Subset from the ClimProjDiags package, the
dimensions to subset along can be specified via the parameter along
either with integer indices or by their name.
There are additional ways to adjust which dimensions are dropped in the
resulting object: either to drop all, to drop none, to drop only the ones that
have been sliced or to drop only the ones that have not been sliced.
The load_parameters and when attributes of the original cube
are preserved. The source_files attribute is subset along the
var_dim and dat_dim dimensions.
Usage
CST_Subset(x, along, indices, drop = FALSE, var_dim = NULL, dat_dim = NULL)
Arguments
x |
An object of class |
along |
A vector with references to the dimensions to take the subset from: either integers or dimension names. |
indices |
A list of indices to take from each dimension specified in 'along'. If a single dimension is specified in 'along', it can be directly provided as an integer or a vector. |
drop |
Whether to drop all the dimensions of length 1 in the resulting array, none, only those that are specified in 'along', or only those that are not specified in 'along'. The possible values are: 'all' or TRUE, 'none' or FALSE, 'selected', and 'non-selected'. The default value is FALSE. |
var_dim |
A character string indicating the name of the variable dimension. The default value is NULL. |
dat_dim |
A character string indicating the name of dataset dimension. The default value is NULL. |
Value
An object of class s2dv_cube with similar data, coordinates and
attributes as the x input, but with trimmed or dropped dimensions.
Author(s)
Agudetse Roures Victoria, victoria.agudetse@bsc.es
Examples
#Example with sample data:
# Check original dimensions and coordinates
lonlat_temp$exp$dims
names(lonlat_temp$exp$coords)
# Subset the s2dv_cube
exp_subset <- CST_Subset(lonlat_temp$exp,
along = c("lat", "lon"),
indices = list(1:10, 1:10),
drop = 'non-selected')
# Check new dimensions and coordinates
exp_subset$dims
names(exp_subset$coords)
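# When a single dimension is subset, the indices can be given directly as a
# vector (illustrative sketch).
exp_subset2 <- CST_Subset(lonlat_temp$exp, along = "sdate", indices = 1:2)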
Generate a Summary of the data and metadata in the s2dv_cube object
Description
This function prints the summary of the data and metadata of an object of
class s2dv_cube.
Usage
CST_Summary(data, show_loaded_files = FALSE, show_NA = FALSE, var_dim = "var")
Arguments
data |
An |
show_loaded_files |
A logical value indicating if the names of the loaded files will be displayed in the output or not. Default = FALSE. |
show_NA |
A logical value indicating if details of NA values in the loaded object will be displayed in the output or not. Default = FALSE. |
var_dim |
A character string indicating the name of the variable dimension. Default = "var". |
Details
The function uses the data and metadata from the loaded
s2dv_cube object to generate a summary of the object. The summary
includes:
- months: Months that have been loaded.
- range: Range of the dates that have been loaded.
- dimensions: Object dimensions.
- Statistical summary of the data in data: Basic statistical
summary of the data.
- Variable: Loaded variables, along with their units (units:).
- NA-Indices per Dimension: Indices with NA values per dimension.
- Number of NAs in NA-Indices per Dimensions: Number of NAs in
the indices with NA values per dimension.
- Loaded files: Successfully loaded files.
Value
Does not return a value. This function prints a detailed summary of
an s2dv_cube object to the console.
Author(s)
Theertha Kariyathan, theertha.kariyathan@bsc.es
See Also
CST_Start or s2dv_cube for creating an s2dv cube object.
Examples
# Example 1:
CST_Summary(data = lonlat_temp_st$exp)
# Example 2:
## Not run:
# s2dv cube paths
repos1 <- "/esarchive/exp/ecmwf/system4_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc"
repos2 <- "/esarchive/exp/ecmwf/system5_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc"
# Create data cube
data <- CST_Start(dat = list(
list(name = 'system4_m1', path = repos1),
list(name = 'system5_m1', path = repos2)),
var = c('tas', 'sfcWind'),
sdate = '20170101',
ensemble = indices(1),
time = indices(1:3),
lat = indices(1:5),
lon = indices(1:5),
synonims = list(lat = c('lat', 'latitude'),
lon = c('lon', 'longitude')),
return_vars = list(time = 'sdate',
longitude = 'dat',
latitude = 'dat'),
metadata_dims = c('dat', 'var'),
retrieve = TRUE)
# Generate summary
CST_Summary(data)
## End(Not run)
Function for Calculating the Cluster analysis
Description
This function computes the weather regimes from a cluster
analysis. It is applied on the array data in a 's2dv_cube' object. The
dimensionality of this object can also be reduced by using the PCs obtained
from the application of the EOF analysis to filter the dataset. The cluster
analysis can be performed with the traditional k-means or the methods
included in hclust (stats package).
Usage
CST_WeatherRegimes(
data,
ncenters = NULL,
EOFs = TRUE,
neofs = 30,
varThreshold = NULL,
method = "kmeans",
iter.max = 100,
nstart = 30,
ncores = NULL
)
Arguments
data |
An 's2dv_cube' object. |
ncenters |
Number of clusters to be calculated with the clustering function. |
EOFs |
Whether to compute the EOFs (default = TRUE) or not (FALSE) to filter the data. |
neofs |
Number of modes to be kept (default = 30). |
varThreshold |
Value with the percentage of variance to be explained by the PCs. Only sufficient PCs to explain this much variance will be used in the clustering. |
method |
Different options to estimate the clusters. The most traditional approach is the k-means analysis (default = 'kmeans'), but the function also supports the different methods included in hclust. These methods are: "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). For more details about these methods, see the hclust function documentation included in the stats package. |
iter.max |
Parameter to select the maximum number of iterations allowed (Only if method='kmeans' is selected). |
nstart |
Parameter for the cluster analysis determining how many random sets to choose (Only if method='kmeans' is selected). |
ncores |
The number of multicore threads to use for parallel computation. |
Value
A list with two elements $data (a 's2dv_cube' object containing
the composites cluster = 1,..,K for case (*1) or only k = 1 for any specific
cluster, i.e., case (*2)) and $statistics that includes $pvalue
(array with the same structure as $data containing the pvalue of the
composites obtained through a t-test that accounts for the serial dependence.),
cluster (A matrix or vector with integers (from 1:k) indicating the
cluster to which each time step is allocated.), persistence (Percentage
of days in a month/season before a cluster is replaced for a new one (only if
method=’kmeans’ has been selected.)), frequency (Percentage of days in
a month/season belonging to each cluster (only if method=’kmeans’ has been
selected).),
Author(s)
Verónica Torralba - BSC, veronica.torralba@bsc.es
References
Cortesi, N., V., Torralba, N., González-Reviriego, A., Soret, and F.J., Doblas-Reyes (2019). Characterization of European wind speed variability using weather regimes. Climate Dynamics,53, 4961–4976, doi:10.1007/s00382-019-04839-5.
Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools for the development of a climate service. Thesis. Available online: https://eprints.ucm.es/56841/.
Examples
data <- array(abs(rnorm(1280, 283.7, 6)), dim = c(dataset = 2, member = 2,
sdate = 3, ftime = 3,
lat = 4, lon = 4))
coords <- list(lon = seq(0, 3), lat = seq(47, 44))
obs <- list(data = data, coords = coords)
class(obs) <- 's2dv_cube'
res1 <- CST_WeatherRegimes(data = obs, EOFs = FALSE, ncenters = 4)
res2 <- CST_WeatherRegimes(data = obs, EOFs = TRUE, ncenters = 3)
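# Illustrative sketch: using one of the hierarchical clustering methods
# supported through hclust instead of the default k-means.
res3 <- CST_WeatherRegimes(data = obs, EOFs = FALSE, ncenters = 4,
                           method = "ward.D")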
Forecast Calibration
Description
Five types of member-by-member bias correction can be performed.
The "bias" method corrects the bias only, the "evmos" method
applies a variance inflation technique to ensure the correction of the bias
and the correspondence of variance between forecast and observation (Van
Schaeybroeck and Vannitsem, 2011). The ensemble calibration methods
"mse_min" and "crps_min" correct the bias, the overall forecast
variance and the ensemble spread as described in Doblas-Reyes et al. (2005)
and Van Schaeybroeck and Vannitsem (2015), respectively. While the
"mse_min" method minimizes a constrained mean-squared error using three
parameters, the "crps_min" method features four parameters and
minimizes the Continuous Ranked Probability Score (CRPS). The
"rpc-based" method adjusts the forecast variance ensuring that the
ratio of predictable components (RPC) is equal to one, as in Eade et al.
(2014). Both in-sample and out-of-sample (leave-k-out cross-validation)
calibration are possible.
Usage
Calibration(
exp,
obs,
exp_cor = NULL,
cal.method = "mse_min",
eval.method = "leave-k-out",
multi.model = FALSE,
na.fill = TRUE,
na.rm = TRUE,
apply_to = NULL,
alpha = NULL,
memb_dim = "member",
sdate_dim = "sdate",
dat_dim = NULL,
ncores = NULL,
k = 1,
tail.out = TRUE
)
Arguments
exp |
A multidimensional array with named dimensions (at least 'sdate' and 'member') containing the seasonal hindcast experiment data. The hindcast is used to calibrate the forecast in case the forecast is provided; if not, the same hindcast will be calibrated instead. |
obs |
A multidimensional array with named dimensions (at least 'sdate') containing the observed data. |
exp_cor |
An optional multidimensional array with named dimensions (at least 'sdate' and 'member') containing the seasonal forecast experiment data. If the forecast is provided, it will be calibrated using the hindcast and observations; if not, the hindcast will be calibrated instead. If there is only one corrected dataset, it should not have dataset dimension. If there is a corresponding corrected dataset for each 'exp' forecast, the dataset dimension must have the same length as in 'exp'. The default value is NULL. |
cal.method |
A character string indicating the calibration method used; it
can be either 'bias', 'evmos', 'mse_min', 'crps_min' or 'rpc-based'. The
default value is 'mse_min'. |
eval.method |
A character string indicating the sampling method used; it
can be either 'in-sample' or 'leave-k-out' (default). See EvalTrainIndices
for the supported cross-validation methods. |
multi.model |
A boolean that is used only for the 'mse_min' method. |
na.fill |
A boolean that indicates what happens in case calibration is
not possible or will yield unreliable results. This happens when three or
fewer forecast-observation pairs are available to perform the training phase
of the calibration. By default na.fill is set to TRUE, in which case NA
values are returned. |
na.rm |
A boolean that indicates whether to remove the NA values or
not. The default value is TRUE. |
apply_to |
A character string that indicates whether to apply the
calibration to all the forecasts ('all') or only to those whose correlation
with the observations is statistically significant ('sign'). Only useful if
cal.method == 'rpc-based'. |
alpha |
A numeric value indicating the significance level for the
correlation test. Only useful if cal.method == 'rpc-based' & apply_to ==
'sign'. |
memb_dim |
A character string indicating the name of the member dimension. By default, it is set to 'member'. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. |
dat_dim |
A character string indicating the name of dataset dimension. The length of this dimension can be different between 'exp' and 'obs'. The default value is NULL. |
ncores |
An integer that indicates the number of cores for parallel computation using multiApply function. The default value is NULL (one core). |
k |
Positive integer. Default = 1.
In method |
tail.out |
Boolean for the 'leave-k-out' method; TRUE to remove both extremes, keeping the same sample size for all k-folds (e.g. sample.length = 50, k = 3, eval.dexes = 1, train.dexes = (3,49)); FALSE to remove only the corresponding tail (e.g. sample.length = 50, k = 3, eval.dexes = 1, train.dexes = (3,50)). |
Details
Both the na.fill and na.rm parameters can be used to
indicate how the function has to handle the NA values. The na.fill
parameter checks whether there are more than three forecast-observation pairs
to perform the computation. In case there are three or fewer pairs, the
computation is not carried out, and the value returned by the function depends
on the value of this parameter (either NA if na.fill == TRUE or the
uncorrected value if na.fill == FALSE). On the other hand, na.rm
indicates whether the function should remove the missing values during
the computation of the parameters needed to perform the calibration.
Value
An array containing the calibrated forecasts with the dimensions nexp, nobs and the same dimensions as in the 'exp' array. nexp is the number of experiments (i.e., 'dat_dim' in exp), and nobs is the number of observations (i.e., 'dat_dim' in obs). If dat_dim is NULL, nexp and nobs are omitted. If 'exp_cor' is provided, the returned array will have the same dimensions as 'exp_cor'.
Author(s)
Verónica Torralba, veronica.torralba@bsc.es
Bert Van Schaeybroeck, bertvs@meteo.be
References
Doblas-Reyes F.J, Hagedorn R, Palmer T.N. The rationale behind the success of multi-model ensembles in seasonal forecasting-II calibration and combination. Tellus A. 2005;57:234-252. doi:10.1111/j.1600-0870.2005.00104.x
Eade, R., Smith, D., Scaife, A., Wallace, E., Dunstone, N., Hermanson, L., & Robinson, N. (2014). Do seasonal-to-decadal climate predictions underestimate the predictability of the real world? Geophysical Research Letters, 41(15), 5620-5628. doi:10.1002/2014GL061146
Van Schaeybroeck, B., & Vannitsem, S. (2011). Post-processing through linear regression. Nonlinear Processes in Geophysics, 18(2), 147. doi:10.5194/npg-18-147-2011
Van Schaeybroeck, B., & Vannitsem, S. (2015). Ensemble post-processing using member-by-member approaches: theoretical aspects. Quarterly Journal of the Royal Meteorological Society, 141(688), 807-818. doi:10.1002/qj.2397
Examples
mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7)
dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
a <- Calibration(exp = mod1, obs = obs1)
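# Illustrative sketch (not from the original manual; array shapes chosen
# for the example): calibrating an independent forecast 'exp_cor' with the
# hindcast 'mod1' and observations 'obs1' defined above.
fc1 <- 1 : (1 * 3 * 2 * 5 * 6 * 7)
dim(fc1) <- c(dataset = 1, member = 3, sdate = 2, ftime = 5, lat = 6, lon = 7)
b <- Calibration(exp = mod1, obs = obs1, exp_cor = fc1)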
Make categorical forecast based on a multi-model forecast with potential for calibration
Description
This function converts a multi-model ensemble forecast into a categorical forecast by giving the probability for each category. Different methods are available to combine the different ensemble forecasting models into probabilistic categorical forecasts.
See details in ?CST_CategoricalEnsCombination
Usage
CategoricalEnsCombination(
fc,
obs,
cat.method,
eval.method,
amt.cat,
k,
tail.out,
...
)
Arguments
fc |
A multi-dimensional array with named dimensions containing the
seasonal forecast experiment data in the element named $data. |
obs |
A multidimensional array with named dimensions containing the
observed data in the element named $data. |
cat.method |
Method used to produce the categorical forecast, can be
either 'pool', 'comb', 'mmw' or 'obs'. |
eval.method |
The sampling method used; it can be either
'in-sample' or 'leave-k-out'. |
amt.cat |
The number of categories. Equally-sized quantiles will be calculated based on the number of categories. |
k |
Positive integer. Default = 1.
In method |
tail.out |
Boolean for the 'leave-k-out' method; TRUE to remove both extremes, keeping the same sample size for all k-folds (e.g. sample.length = 50, k = 3, eval.dexes = 1, train.dexes = (3,49)); FALSE to remove only the corresponding tail (e.g. sample.length = 50, k = 3, eval.dexes = 1, train.dexes = (3,50)). |
... |
Other parameters to be passed on to the calibration procedure. |
Value
An array containing the categorical forecasts in the element called
$data. The first two dimensions of the returned object are named
dataset and member and are both of size one. An additional dimension named
category is introduced and is of size amt.cat.
Author(s)
Bert Van Schaeybroeck, bertvs@meteo.be
References
Rajagopalan, B., Lall, U., & Zebiak, S. E. (2002). Categorical climate forecasts through regularization and optimal combination of multiple GCM ensembles. Monthly Weather Review, 130(7), 1792-1811.
Robertson, A. W., Lall, U., Zebiak, S. E., & Goddard, L. (2004). Improved combination of multiple atmospheric GCM ensembles for seasonal prediction. Monthly Weather Review, 132(12), 2732-2744.
Van Schaeybroeck, B., & Vannitsem, S. (2019). Postprocessing of Long-Range Forecasts. In Statistical Postprocessing of Ensemble Forecasts (pp. 267-290).
Performing a Bias Correction conditioned by the dynamical properties of the data.
Description
This function performs a bias correction conditioned by the dynamical properties of the dataset. It uses the function 'CST_Predictability' to divide into terciles the two dynamical proxies computed with 'CST_ProxiesAttractor'. A bias correction between the model and the observations is performed using the division into terciles of the local dimension 'dim' and the inverse of the persistence 'theta'. For instance, model values with lower 'dim' will be corrected with observed values with lower 'dim', and the same for 'theta'. The function gives two options of bias correction: one for 'dim' and/or one for 'theta'.
Usage
DynBiasCorrection(
exp,
obs,
method = "QUANT",
wetday = FALSE,
proxy = "dim",
quanti,
ncores = NULL
)
Arguments
exp |
A multidimensional array with named dimensions with the experiment data. |
obs |
A multidimensional array with named dimensions with the observation data. |
method |
A character string indicating the method to apply bias correction among these ones: "PTF", "RQUANT", "QUANT", "SSPLIN". |
wetday |
Logical indicating whether to perform wet day correction or not OR a numeric threshold below which all values are set to zero (by default is set to 'FALSE'). |
proxy |
A character string indicating the proxy for local dimension 'dim' or inverse of persistence 'theta' to apply the dynamical conditioned bias correction method. |
quanti |
A number lower than 1 indicating the quantile to perform the computation of local dimension and theta. |
ncores |
The number of cores to use in parallel computation. |
Value
A multidimensional array with named dimensions with a bias correction performed conditioned by local dimension 'dim' or inverse of persistence 'theta'.
Author(s)
Carmen Alvarez-Castro, carmen.alvarez-castro@cmcc.it
Maria M. Chaves-Montero, mdm.chaves-montero@cmcc.it
Veronica Torralba, veronica.torralba@cmcc.it
Davide Faranda, davide.faranda@lsce.ipsl.fr
References
Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large scale atmospheric predictability. Nature Communications, 10(1), 1316. doi:10.1038/s41467-019-09305-8
Faranda, D., Messori, G., and Yiou, P. (2017). Dynamical proxies of North Atlantic predictability and extremes. Scientific Reports, 7, 41278.
Examples
expL <- rnorm(2000)
dim(expL) <- c(time = 100, lat = 4, lon = 5)
obsL <- c(rnorm(1980), expL[1, , ] * 1.2)
dim(obsL) <- c(time = 100, lat = 4, lon = 5)
dynbias <- DynBiasCorrection(exp = expL, obs = obsL, method='QUANT',
proxy= "dim", quanti = 0.6)
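# Illustrative sketch (same synthetic data): conditioning the correction on
# the inverse of persistence 'theta' instead of the local dimension 'dim'.
dynbias2 <- DynBiasCorrection(exp = expL, obs = obsL, method = 'QUANT',
                              proxy = "theta", quanti = 0.6)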
Ensemble clustering
Description
This function performs a clustering on members/starting dates and returns a number of scenarios, with representative members for each of them. The clustering is performed in a reduced EOF space.
Usage
EnsClustering(
data,
lat,
lon,
time_moment = "mean",
numclus = NULL,
lon_lim = NULL,
lat_lim = NULL,
variance_explained = 80,
numpcs = NULL,
time_percentile = 90,
time_dim = NULL,
cluster_dim = "member",
verbose = TRUE
)
Arguments
data |
An array of dimensions 'dataset member sdate ftime lat lon' containing the variables to be analysed. Latitudinal dimension accepted names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. Longitudinal dimension accepted names: 'lon', 'lons', 'longitude', 'x', 'i', 'nav_lon'. |
lat |
Vector of latitudes. |
lon |
Vector of longitudes. |
time_moment |
Decides the moment to be applied to the time dimension. Can be either 'mean' (time mean), 'sd' (standard deviation along time) or 'perc' (a selected percentile on time). If 'perc' the keyword 'time_percentile' is also used. |
numclus |
Number of clusters (scenarios) to be calculated. If set to NULL the number of ensemble members divided by 10 is used, with a minimum of 2 and a maximum of 8. |
lon_lim |
List with the two longitude margins in 'c(-180,180)' format. |
lat_lim |
List with the two latitude margins. |
variance_explained |
variance (percentage) to be explained by the set of EOFs. Defaults to 80. Not used if numpcs is specified. |
numpcs |
Number of EOFs retained in the analysis (optional). |
time_percentile |
Set the percentile in time you want to analyse (used when time_moment = 'perc'). |
time_dim |
String or character array with name(s) of dimension(s) over which to compute statistics. If omitted c("ftime", "sdate", "time") are searched in this order. |
cluster_dim |
Dimension along which to cluster. Typically "member" or "sdate". This can also be a list like c("member", "sdate"). |
verbose |
Logical for verbose output |
Value
A list with elements $cluster (cluster assigned for each member),
$freq (relative frequency of each cluster), $closest_member
(representative member for each cluster), $repr_field (list of fields for
each representative member), $composites (list of mean fields for each
cluster), $lon (selected longitudes of output fields) and $lat
(selected latitudes of output fields).
Author(s)
Federico Fabiano - ISAC-CNR, f.fabiano@isac.cnr.it
Ignazio Giuntoli - ISAC-CNR, i.giuntoli@isac.cnr.it
Danila Volpi - ISAC-CNR, d.volpi@isac.cnr.it
Paolo Davini - ISAC-CNR, p.davini@isac.cnr.it
Jost von Hardenberg - ISAC-CNR, j.vonhardenberg@isac.cnr.it
Examples
exp <- array(abs(rnorm(1152))*275, dim = c(dataset = 1, member = 4,
sdate = 6, ftime = 3,
lat = 4, lon = 4))
lon <- seq(0, 3)
lat <- seq(48, 45)
res <- EnsClustering(exp, lat = lat, lon = lon, numclus = 2,
cluster_dim = c("member", "dataset", "sdate"))
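# Illustrative sketch: clustering on a selected time percentile rather than
# the time mean.
res2 <- EnsClustering(exp, lat = lat, lon = lon, numclus = 2,
                      time_moment = 'perc', time_percentile = 90,
                      cluster_dim = 'member')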
Generate Training and Evaluation Indices for Cross-Validation
Description
This function generates training and evaluation indices based on different cross-validation methods.
Usage
EvalTrainIndices(
eval.method,
sample.length,
sample.length_cor,
k = 1,
tail.out = TRUE
)
Arguments
eval.method |
Character. The cross-validation method. Options include
'leave-k-out', 'retrospective', 'in-sample' and 'hindcast-vs-forecast'. |
sample.length |
Integer. Length of the sample (in years). |
sample.length_cor |
Integer. Length of the forecast sample (in years) in the
'hindcast-vs-forecast' method. |
k |
Positive integer. Default = 1.
In method |
tail.out |
Logical for the 'leave-k-out' method; TRUE to remove both extremes, keeping the same sample size for all k-folds. |
Value
A list of lists, where each element contains:
eval.dexes: Indices of evaluation points.
train.dexes: Indices of training points.
Author(s)
Theertha Kariyathan, theertha.kariyathan@bsc.es
Examples
# Leave-k-out cross-validation
EvalTrainIndices("leave-k-out", sample.length = 10, sample.length_cor = 5, k = 3)
# Retrospective cross-validation
EvalTrainIndices("retrospective", sample.length = 10, sample.length_cor = 5, k = 3)
# In-sample validation
EvalTrainIndices("in-sample", sample.length = 10, sample.length_cor = 5)
# Hindcast vs. Forecast validation
EvalTrainIndices("hindcast-vs-forecast", sample.length = 10, sample.length_cor = 5)
Function to Merge Dimensions
Description
This function merges two dimensions of an array into one. The user can select the dimensions to merge and provide the final name of the dimension. The user can select to remove NA values or keep them.
Usage
MergeDims(
data,
merge_dims = c("time", "monthly"),
rename_dim = NULL,
na.rm = FALSE
)
Arguments
data |
An n-dimensional array with named dimensions |
merge_dims |
A character vector indicating the names of the dimensions to merge. |
rename_dim |
A character string indicating the name of the output
dimension. If left at NULL, the first dimension name provided in parameter
merge_dims will be used. |
na.rm |
A logical indicating if the NA values should be removed or not. |
Value
An array with the two selected dimensions merged into a single dimension.
Author(s)
Nuria Perez-Zanon, nuria.perez@bsc.es
Examples
data <- 1 : 20
dim(data) <- c(time = 10, lat = 2)
new_data <- MergeDims(data, merge_dims = c('time', 'lat'))
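# Illustrative sketch: choosing the output dimension name ('grid' is a
# placeholder) and removing NA values.
new_data2 <- MergeDims(data, merge_dims = c('time', 'lat'),
                       rename_dim = 'grid', na.rm = TRUE)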
EOF analysis of multiple variables starting from an array (reduced version)
Description
This function performs EOF analysis over multiple variables,
accepting as input an array with a dimension "var" for each variable to
analyse. It is based on Singular Value Decomposition. For each field the EOFs
are computed and the corresponding PCs are standardized (unit variance, zero
mean); the minimum number of principal components needed to reach the
user-defined variance is retained. The function weights the input data by
the square root of the cosine of latitude.
Usage
MultiEOF(
data,
lon,
lat,
dates,
time = NULL,
lon_dim = "lon",
lat_dim = "lat",
time_dim = "ftime",
sdate_dim = "sdate",
var_dim = "var",
neof_max = 40,
neof_composed = 5,
minvar = 0.6,
lon_lim = NULL,
lat_lim = NULL,
ncores = NULL
)
Arguments
data |
A multidimensional array with a dimension 'var' for each variable to be analysed. |
lon |
Vector of longitudes. |
lat |
Vector of latitudes. |
dates |
Vector or matrix of dates in POSIXct format. |
time |
Deprecated parameter, it has been substituted by 'dates'. It will be removed in the next release. |
lon_dim |
A character string indicating the name of the longitudinal dimension. By default, it is set to 'lon'. |
lat_dim |
A character string indicating the name of the latitudinal dimension. By default, it is set to 'lat'. |
time_dim |
A character string indicating the name of the temporal dimension. By default, it is set to 'ftime'. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. |
var_dim |
A character string indicating the name of the variable dimension. By default, it is set to 'var'. |
neof_max |
Maximum number of single eofs considered in the first decomposition. |
neof_composed |
Number of composed eofs to return in output. |
minvar |
Minimum variance fraction to be explained in first decomposition. |
lon_lim |
Vector with longitudinal range limits for the calculation for all input variables. |
lat_lim |
Vector with latitudinal range limits for the calculation for all input variables. |
ncores |
An integer indicating the number of cores to use for parallel computation. The default value is NULL. |
Value
A list containing:
coeff |
An array of principal components with dimensions 'time_dim', 'sdate_dim', number of eof, rest of the dimensions of 'data' except 'lon_dim' and 'lat_dim'. |
variance |
An array of explained variances with dimensions 'eof' and the rest of the dimensions of 'data' except 'time_dim', 'sdate_dim', 'lon_dim' and 'lat_dim'. |
eof_pattern |
An array of EOF patterns obtained by regression with dimensions: 'eof' and the rest of the dimensions of 'data' except 'time_dim' and 'sdate_dim'. |
mask |
An array of the mask with dimensions ('lon_dim', 'lat_dim', rest of the dimensions of 'data' except 'time_dim'). It is made from 'data', 1 for the positions that 'data' has value and NA for the positions that 'data' has NA. It is used to replace NAs with 0s for EOF calculation and mask the result with NAs again after the calculation. |
coordinates |
Longitudinal and latitudinal coordinates vectors. |
Author(s)
Jost von Hardenberg - ISAC-CNR, j.vonhardenberg@isac.cnr.it
Paolo Davini - ISAC-CNR, p.davini@isac.cnr.it
Examples
exp <- array(runif(1280)*280, dim = c(dataset = 2, member = 2, sdate = 3,
ftime = 3, lat = 4, lon = 4, var = 1))
lon <- seq(0, 3)
lat <- seq(47, 44)
dates <- c("2000-11-01", "2000-12-01", "2001-01-01", "2001-11-01",
"2001-12-01", "2002-01-01", "2002-11-01", "2002-12-01", "2003-01-01")
Dates <- as.POSIXct(dates, format = "%Y-%m-%d")
dim(Dates) <- c(ftime = 3, sdate = 3)
cal <- MultiEOF(data = exp, lon = lon, lat = lat, dates = Dates)
Multiple Metrics applied in Multiple Model Anomalies
Description
This function calculates the correlation (Anomaly Correlation Coefficient; ACC), the root mean square error (RMS) and the root mean square error skill score (RMSSS) of individual anomaly models and the multi-model mean (if desired) with the observations, on arrays with named dimensions.
Usage
MultiMetric(
exp,
obs,
metric = "correlation",
multimodel = TRUE,
time_dim = "ftime",
memb_dim = "member",
sdate_dim = "sdate"
)
Arguments
exp |
A multidimensional array with named dimensions. |
obs |
A multidimensional array with named dimensions. |
metric |
A character string giving the metric for computing the maximum skill. This must be one of the strings 'correlation', 'rms' or 'rmsss'. |
multimodel |
A logical value indicating whether a Multi-Model Mean should be computed. |
time_dim |
Name of the temporal dimension where a mean will be applied. It can be NULL, the default value is 'ftime'. |
memb_dim |
Name of the member dimension. It can be NULL, the default value is 'member'. |
sdate_dim |
Name of the start date dimension or a dimension name identifying the different forecasts. It can be NULL; the default value is 'sdate'. |
Value
A list of arrays containing the statistics of the selected metric in
the element $data, which is a list of arrays: one for the metric requested
and others for statistics about its significance. The arrays have two
dataset dimensions equal to the 'dataset' dimension in the exp and
obs inputs. If multimodel is TRUE, the greatest position in
the first dimension corresponds to the Multi-Model Mean.
Author(s)
Mishra Niti, niti.mishra@bsc.es
Perez-Zanon Nuria, nuria.perez@bsc.es
References
Mishra, N., Prodhomme, C., & Guemas, V. (n.d.). Multi-Model Skill Assessment of Seasonal Temperature and Precipitation Forecasts over Europe, 29-31. doi:10.1007/s00382-018-4404-z
Examples
exp <- array(rnorm(2*2*4*5*2*2),
dim = c(dataset = 2, member = 2, sdate = 4, ftime = 5, lat = 2,
lon = 2))
obs <- array(rnorm(1*1*4*5*2*2),
dim = c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 2,
lon = 2))
res <- MultiMetric(exp = exp, obs = obs)
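# Illustrative sketch: requesting the RMS error instead of the default
# correlation, without the multi-model mean.
res_rms <- MultiMetric(exp = exp, obs = obs, metric = 'rms',
                       multimodel = FALSE)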
Computing the Index PDFs for a dataset of SFSs for a hindcast period.
Description
This function implements the computation to obtain the index PDFs (e.g. NAO index) to improve the index estimate from SFSs for a hindcast period.
Usage
PDFIndexHind(
index_hind,
index_obs,
method = "ME",
time_dim_name = "time",
na.rm = FALSE
)
Arguments
index_hind |
Index (e.g. NAO index) array from SFSs with at least two dimensions (time, member) or (time, statistic). The temporal dimension, by default 'time', must be greater than 2. The dimension 'member' must be greater than 1. The dimension 'statistic' must be equal to 2, for containing the two parameters of a normal distribution (mean and sd) representing the ensemble of a SFS. It is not possible to have the dimensions 'member' and 'statistic' together. |
index_obs |
Index (e.g. NAO index) array from an observational database or reanalysis with at least a temporal dimension (by default 'time'), which must be greater than 2. |
method |
A character string indicating which methodology is applied to compute the PDFs. One of "ME" (default) or "LMEV". |
time_dim_name |
A character string indicating the name of the temporal dimension, by default 'time'. |
na.rm |
Logical (default = FALSE). Should missing values be removed? |
Value
An array with at least two dimensions (time, statistic = 4). The first statistic is the parameter 'mean' of the PDF without bias correction. The second statistic is the parameter 'standard deviation' of the PDF without bias correction. The third statistic is the parameter 'mean' of the bias-corrected PDF. The fourth statistic is the parameter 'standard deviation' of the bias-corrected PDF.
Author(s)
Eroteida Sanchez-Garcia - AEMET, esanchezg@aemet.es
References
Regionally improved seasonal forecast of precipitation through Best estimation of winter NAO, Sanchez-Garcia, E. et al., Adv. Sci. Res., 16, 165-174, 2019, doi:10.5194/asr-16-165-2019
Examples
# Example for the PDFIndexHind function
# Example 1
index_obs <- 1 : (5 * 3 )
dim(index_obs) <- c(time = 5, season = 3)
index_hind <- 1 : (5 * 4 * 3)
dim(index_hind) <- c(time = 5, member = 4, season = 3)
res <- PDFIndexHind(index_hind, index_obs)
dim(res)
# time statistic season
# 5 4 3
# Example 2
index_obs <- 1 : (5 * 3)
dim(index_obs) <- c(time = 5, season = 3)
index_hind <- 1 : (5 * 2 * 3)
dim(index_hind) <- c(time = 5, statistic = 2, season = 3)
res <- PDFIndexHind(index_hind, index_obs)
Plot Multiple Lon-Lat Variables In a Single Map According to a Decision Function
Description
Plot a number of two-dimensional matrices with (longitude, latitude) dimensions on a single map with the cylindrical equidistant latitude and longitude projection.
Usage
PlotCombinedMap(
maps,
lon,
lat,
map_select_fun,
display_range,
map_dim = "map",
brks = NULL,
cols = NULL,
bar_limits = NULL,
triangle_ends = c(FALSE, FALSE),
col_inf = NULL,
col_sup = NULL,
col_unknown_map = "white",
mask = NULL,
col_mask = "grey",
dots = NULL,
bar_titles = NULL,
legend_scale = 1,
cex_bar_titles = 1.5,
plot_margin = NULL,
bar_extra_margin = c(2, 0, 2, 0),
fileout = NULL,
width = 8,
height = 5,
size_units = "in",
res = 100,
drawleg = TRUE,
return_leg = FALSE,
...
)
Arguments
maps |
List of matrices to plot, each with (longitude, latitude) dimensions, or 3-dimensional array with the dimensions (longitude, latitude, map). Dimension names are required. |
lon |
Vector of longitudes. Must match the length of the corresponding dimension in 'maps'. |
lat |
Vector of latitudes. Must match the length of the corresponding dimension in 'maps'. |
map_select_fun |
Function that selects, for each grid point, which value
to take among all the provided maps. This function receives as input a
vector of values for a same grid point for all the provided maps, and must
return a single selected value (not its index!) or NA. For example, the
max function. |
display_range |
Range of values to be displayed for all the maps. This must be a numeric vector c(range min, range max). The values in the parameter 'maps' can go beyond the limits specified in this range. If the selected value for a given grid point (according to 'map_select_fun') falls outside the range, it will be coloured with 'col_unknown_map'. |
map_dim |
Optional name for the dimension of 'maps' along which the multiple maps are arranged. Only applies when 'maps' is provided as a 3-dimensional array. Takes the value 'map' by default. |
brks |
Colour levels to be sent to PlotEquiMap. This parameter is optional and adjusted automatically by the function. |
cols |
List of vectors of colours to be sent to PlotEquiMap for the colour bar of each map. This parameter is optional and adjusted automatically by the function (up to 5 maps). The colours provided for each colour bar will be automatically interpolated to match the number of breaks. Each item in this list can be named, and the name will be used as title for the corresponding colour bar (equivalent to the parameter 'bar_titles'). |
bar_limits |
Parameter from s2dv::ColorBar. Vector of two numeric values with the extremes of the range of values represented in the colour bar. If 'var_limits' go beyond this interval, the drawing of triangle extremes is triggered at the corresponding sides, painted in 'col_inf' and 'col_sup'. Either of them can be set as NA and will then take as value the corresponding extreme in 'var_limits' (hence a triangle end won't be triggered for these sides). Takes as default the extremes of 'brks' if available, else the same values as 'var_limits'. |
triangle_ends |
Parameter from s2dv::ColorBar. Vector of two logical elements, indicating whether to force the drawing of triangle ends at each of the extremes of the colour bar. This choice is automatically made from the provided 'brks', 'bar_limits', 'var_limits', 'col_inf' and 'col_sup', but the behaviour can be manually forced to draw or not to draw the triangle ends with this parameter. If 'cols' is provided, 'col_inf' and 'col_sup' will take priority over 'triangle_ends' when deciding whether to draw the triangle ends or not. |
col_inf |
Parameter from s2dv::ColorBar. Colour to fill the inferior triangle end with. Useful if specifying colours manually with parameter 'cols', to specify the colour and to trigger the drawing of the lower extreme triangle, or if 'cols' is not specified, to replace the colour automatically generated by ColorBar(). |
col_sup |
Parameter from s2dv::ColorBar. Colour to fill the superior triangle end with. Useful if specifying colours manually with parameter 'cols', to specify the colour and to trigger the drawing of the upper extreme triangle, or if 'cols' is not specified, to replace the colour automatically generated by ColorBar(). |
col_unknown_map |
Colour to use to paint the grid cells for which a map is not possible to be chosen according to 'map_select_fun' or for those values that go beyond 'display_range'. Takes the value 'white' by default. |
mask |
Optional numeric array with dimensions (latitude, longitude), with values in the range [0, 1], indicating the opacity of the mask over each grid point. Cells with a 0 will result in no mask, whereas cells with a 1 will result in a totally opaque superimposed pixel coloured in 'col_mask'. |
col_mask |
Colour to be used for the superimposed mask (if specified in 'mask'). Takes the value 'grey' by default. |
dots |
Array of same dimensions as 'var' or with dimensions c(n, dim(var)), where n is the number of dot/symbol layers to add to the plot. A value of TRUE at a grid cell will draw a dot/symbol on the corresponding square of the plot. By default all layers provided in 'dots' are plotted with dots, but a symbol can be specified for each of the layers via the parameter 'dot_symbol'. |
bar_titles |
Optional vector of character strings providing the titles to be shown on top of each of the colour bars. |
legend_scale |
Scale factor for the size of the colour bar labels. Takes 1 by default. |
cex_bar_titles |
Scale factor for the sizes of the bar titles. Takes 1.5 by default. |
plot_margin |
Numeric vector of length 4 for the margin sizes in the following order: bottom, left, top, and right. If not specified, use the default of par("mar"), c(5.1, 4.1, 4.1, 2.1). Used as 'margin_scale' in s2dv::PlotEquiMap. |
bar_extra_margin |
Parameter from s2dv::ColorBar. Extra margins to be added around the colour bar, in the format c(y1, x1, y2, x2). The units are margin lines. Takes rep(0, 4) by default. |
fileout |
File where to save the plot. If not specified (default) a graphics device will pop up. Extensions allowed: eps/ps, jpeg, png, pdf, bmp and tiff |
width |
File width, in the units specified in the parameter size_units (inches by default). Takes 8 by default. |
height |
File height, in the units specified in the parameter size_units (inches by default). Takes 5 by default. |
size_units |
Units of the size of the device (file or window) to plot in. Inches ('in') by default. See ?Devices and the creator function of the corresponding device. |
res |
Resolution of the device (file or window) to plot in. See ?Devices and the creator function of the corresponding device. |
drawleg |
Where to draw the common colour bar. Can take values TRUE,
FALSE or: 'up', 'down', 'left' or 'right'. |
return_leg |
A logical value indicating if the color bars information should be returned by the function. If TRUE, the function doesn't plot the color bars but still creates the layout with color bar areas, and the arguments for GradientCatsColorBar() or ColorBar() will be returned. It is convenient for users to adjust the color bars manually. The default is FALSE, the color bars will be plotted directly. |
... |
Additional parameters to be passed on to |
Value
If return_leg = FALSE (default), the function plots a combined map
to the current graphical device (or saves it to file if fileout is
specified) and returns NULL.
If return_leg = TRUE, no map is plotted. Instead, a list is returned
containing the arguments required to plot the color bars manually using
s2dv::GradientCatsColorBar or s2dv::ColorBar. This allows
users to customize the color bars independently.
Author(s)
Nicolau Manubens, nicolau.manubens@bsc.es
Veronica Torralba, veronica.torralba@bsc.es
See Also
PlotCombinedMap and PlotEquiMap
Examples
# Simple example
x <- array(1:(20 * 10), dim = c(lat = 10, lon = 20)) / 200
a <- x * 0.6
b <- (1 - x) * 0.6
c <- 1 - (a + b)
lons <- seq(0, 359.5, length = 20)
lats <- seq(-89.5, 89.5, length = 10)
PlotCombinedMap(list(a, b, c), lons, lats,
toptitle = 'Maximum map',
map_select_fun = max,
display_range = c(0, 1),
bar_titles = paste('% of belonging to', c('a', 'b', 'c')),
brks = 20, width = 12, height = 10,
fileout = file.path(tempdir(), "example1.pdf"))
unlink(file.path(tempdir(), "example1.pdf"))
Lon <- c(0:40, 350:359)
Lat <- 51:26
data <- rnorm(51 * 26 * 3)
dim(data) <- c(map = 3, lon = 51, lat = 26)
mask <- sample(c(0,1), replace = TRUE, size = 51 * 26)
dim(mask) <- c(lat = 26, lon = 51)
PlotCombinedMap(data, lon = Lon, lat = Lat, map_select_fun = max,
display_range = range(data), mask = mask,
width = 14, height = 10,
fileout = file.path(tempdir(), "example2.pdf"))
unlink(file.path(tempdir(), "example2.pdf"))
Plot one or multiple ensemble forecast pdfs for the same event
Description
This function plots the probability distribution function of several ensemble forecasts. Separate panels are used to plot forecasts valid or initialized at different times, by different models, or even at different locations. Probabilities for tercile categories are computed, plotted in colors and annotated. An asterisk marks the tercile with the highest probability. Probabilities for extreme categories (above P90 and below P10) can also be included as hatched areas. Individual ensemble members can be plotted as jittered points. The observed value is optionally shown as a diamond.
Usage
PlotForecastPDF(
fcst,
tercile.limits,
extreme.limits = NULL,
obs = NULL,
plotfile = NULL,
title = "Set a title",
var.name = "Varname (units)",
fcst.names = NULL,
add.ensmemb = c("above", "below", "no"),
color.set = c("ggplot", "s2s4e", "hydro", "vitigeoss"),
memb_dim = "member"
)
Arguments
fcst |
A dataframe or array containing all the ensemble members for each
forecast. If |
tercile.limits |
An array or vector with P33 and P66 values that define the tercile categories for each panel. Use an array of dimensions (nforecasts,2) to define different terciles for each forecast panel, or a vector with two elements to reuse the same tercile limits for all forecast panels. |
extreme.limits |
(optional) An array or vector with P10 and P90 values that define the extreme categories for each panel. Use an array of (nforecasts,2) to define different extreme limits for each forecast panel, or a vector with two elements to reuse the same tercile limits for all forecast panels. (Default: extreme categories are not shown). |
obs |
(optional) A vector providing the observed values for each forecast panel or a single value that will be reused for all forecast panels. (Default: observation is not shown). |
plotfile |
(optional) A filename (pdf, png...) where the plot will be saved. (Default: the plot is not saved). |
title |
A string with the plot title. |
var.name |
A string with the variable name and units. |
fcst.names |
(optional) An array of strings with the titles of each individual forecast. |
add.ensmemb |
Either to add the ensemble members 'above' (default) or 'below' the pdf, or not ('no'). |
color.set |
A selection of predefined color sets: 'ggplot' (default), 's2s4e', 'hydro' or 'vitigeoss'. |
memb_dim |
A character string indicating the name of the member dimension. |
Value
A ggplot object containing the plot.
Author(s)
Llorenç Lledó llledo@bsc.es
Examples
fcsts <- data.frame(fcst1 = rnorm(10), fcst2 = rnorm(10, 0.5, 1.2))
PlotForecastPDF(fcsts, c(-1, 1))
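To illustrate the optional arguments, a slightly fuller call is sketched below; the limits, observation and names are arbitrary illustrative values, not package defaults:
fcsts2 <- array(rnorm(100), dim = c(member = 20, fcst = 5))
PlotForecastPDF(fcsts2, tercile.limits = c(-0.7, 0.7),
                extreme.limits = c(-1.2, 1.2), obs = 0.1,
                fcst.names = paste0('random fcst ', 1:5),
                var.name = 'Precipitation (mm)')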
Plot Maps of Most Likely Quantiles
Description
This function receives as main input (via the parameter
probs) a collection of longitude-latitude maps, each containing the
probabilities (from 0 to 1) of the different grid cells of belonging to a
category. As many categories as maps provided as inputs are understood to
exist. The maps of probabilities must be provided on a common rectangular
regular grid, and a vector with the longitudes and a vector with the latitudes
of the grid must be provided. The input maps can be provided in two forms,
either as a list of multiple two-dimensional arrays (one for each category) or
as a three-dimensional array, where one of the dimensions corresponds to the
different categories.
Usage
PlotMostLikelyQuantileMap(
probs,
lon,
lat,
cat_dim = "bin",
bar_titles = NULL,
col_unknown_cat = "white",
drawleg = TRUE,
...
)
Arguments
probs |
A list of bi-dimensional arrays with the named dimensions 'latitude' (or 'lat') and 'longitude' (or 'lon'), with equal size and in the same order, or a single tri-dimensional array with an additional dimension (e.g. 'bin') for the different categories. The arrays must contain probability values between 0 and 1, and the probabilities for all categories of a grid cell should not exceed 1 when added. |
lon |
A numeric vector with the longitudes of the map grid, in the same
order as the values along the corresponding dimension in probs. |
lat |
A numeric vector with the latitudes of the map grid, in the same
order as the values along the corresponding dimension in probs. |
cat_dim |
The name of the dimension along which the different categories are stored in probs, if provided as a 3-dimensional array. Takes the value 'bin' by default. |
bar_titles |
Vector of character strings with the names to be drawn on top of the color bar for each of the categories. As many titles as categories provided in probs must be given. |
col_unknown_cat |
Character string with a colour representation of the colour to be used to paint the cells for which no category can be clearly assigned. Takes the value 'white' by default. |
drawleg |
Where to draw the common colour bar. Can take values TRUE, FALSE or 'up', 'down', 'right' and 'left'. |
... |
Additional parameters to be sent to PlotCombinedMap and PlotEquiMap. |
Value
Produces a map showing the most likely category or quantile for each grid cell, based on the input probability maps.
Author(s)
Veronica Torralba, veronica.torralba@bsc.es, Nicolau Manubens, nicolau.manubens@bsc.es
See Also
PlotCombinedMap and PlotEquiMap
Examples
# Simple example
x <- array(1:(20 * 10), dim = c(lat = 10, lon = 20)) / 200
a <- x * 0.6
b <- (1 - x) * 0.6
c <- 1 - (a + b)
lons <- seq(0, 359.5, length = 20)
lats <- seq(-89.5, 89.5, length = 10)
PlotMostLikelyQuantileMap(list(a, b, c), lons, lats,
toptitle = 'Most likely tercile map',
bar_titles = paste('% of belonging to', c('a', 'b', 'c')),
brks = 20, width = 10, height = 8,
bar_extra_margin = c(1, 1, 1, 1),
fileout = file.path(tempdir(),"example1.pdf"))
unlink(file.path(tempdir(),"example1.pdf"))
# More complex example
n_lons <- 40
n_lats <- 20
n_timesteps <- 100
n_bins <- 4
# 1. Generation of sample data
lons <- seq(0, 359.5, length = n_lons)
lats <- seq(-89.5, 89.5, length = n_lats)
# This function builds a 3-D gaussian at a specified point in the map.
make_gaussian <- function(lon, sd_lon, lat, sd_lat) {
w <- outer(lons, lats, function(x, y) dnorm(x, lon, sd_lon) * dnorm(y, lat, sd_lat))
min_w <- min(w)
w <- w - min_w
w <- w / max(w)
w <- t(w)
names(dim(w)) <- c('lat', 'lon')
w
}
# This function generates random time series (with values ranging 1 to 5)
# according to 2 input weights.
gen_data <- function(w1, w2, n) {
r <- sample(1:5, n,
prob = c(.05, .9 * w1, .05, .05, .9 * w2),
replace = TRUE)
r <- r + runif(n, -0.5, 0.5)
dim(r) <- c(time = n)
r
}
# We build two 3-D gaussians.
w1 <- make_gaussian(120, 80, 20, 30)
w2 <- make_gaussian(260, 60, -10, 40)
# We generate sample data (with dimensions time, lat, lon) according
# to the generated gaussians
sample_data <- multiApply::Apply(list(w1, w2), NULL,
gen_data, n = n_timesteps)$output1
# 2. Binning sample data
prob_thresholds <- 1:n_bins / n_bins
prob_thresholds <- prob_thresholds[1:(n_bins - 1)]
thresholds <- quantile(sample_data, prob_thresholds)
binning <- function(x, thresholds) {
n_samples <- length(x)
n_bins <- length(thresholds) + 1
thresholds <- c(thresholds, max(x))
result <- 1:n_bins
lower_threshold <- min(x) - 1
for (i in 1:n_bins) {
result[i] <- sum(x > lower_threshold & x <= thresholds[i]) / n_samples
lower_threshold <- thresholds[i]
}
dim(result) <- c(bin = n_bins)
result
}
bins <- multiApply::Apply(sample_data, 'time', binning, thresholds)$output1
# 3. Plotting most likely quantile/bin
PlotMostLikelyQuantileMap(bins, lons, lats,
toptitle = 'Most likely quantile map',
bar_titles = paste('% of belonging to', letters[1:n_bins]),
mask = 1 - (w1 + w2 / max(c(w1, w2))),
brks = 20, width = 10, height = 8,
bar_extra_margin = c(1, 1, 1, 1),
fileout = file.path(tempdir(),"example2.pdf"))
unlink(file.path(tempdir(),"example2.pdf"))
Plotting two Gaussian probability density functions and the optimal linear estimation (OLE) resulting from their combination.
Description
This function plots two Gaussian probability density functions and the optimal linear estimation (OLE) that results from combining them.
Usage
PlotPDFsOLE(
pdf_1,
pdf_2,
nsigma = 3,
legendPos = "bottom",
legendSize = 1,
plotfile = NULL,
width = 30,
height = 15,
units = "cm",
dpi = 300
)
Arguments
pdf_1 |
A numeric array with a dimension named 'statistic', containing two parameters: the 'mean' and 'standard deviation' of the first Gaussian pdf to be combined. |
pdf_2 |
A numeric array with a dimension named 'statistic', containing two parameters: the 'mean' and 'standard deviation' of the second Gaussian pdf to be combined. |
nsigma |
(optional) A numeric value for setting the limits of X axis. (Default nsigma = 3). |
legendPos |
(optional) A character value for setting the position of the legend ("bottom", "top", "right" or "left")(Default 'bottom'). |
legendSize |
(optional) A numeric value for setting the size of the legend text. (Default 1.0). |
plotfile |
(optional) A filename where the plot will be saved. (Default: the plot is not saved). |
width |
(optional) A numeric value indicating the plot width in units ("in", "cm", or "mm"). (Default width = 30). |
height |
(optional) A numeric value indicating the plot height. (Default height = 15). |
units |
(optional) A character value indicating the plot size unit. (Default units = 'cm'). |
dpi |
(optional) A numeric value indicating the plot resolution. (Default dpi = 300). |
Value
PlotPDFsOLE() returns a ggplot object containing the plot.
Author(s)
Eroteida Sanchez-Garcia - AEMET, esanchezg@aemet.es
Examples
# Example 1
pdf_1 <- c(1.1,0.6)
attr(pdf_1, "name") <- "NAO1"
dim(pdf_1) <- c(statistic = 2)
pdf_2 <- c(1,0.5)
attr(pdf_2, "name") <- "NAO2"
dim(pdf_2) <- c(statistic = 2)
PlotPDFsOLE(pdf_1, pdf_2)
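The documented device arguments can be combined to write the figure to disk; a minimal sketch (the file name is arbitrary):
# Example 2: saving the plot to a file
PlotPDFsOLE(pdf_1, pdf_2, nsigma = 3, legendPos = 'top',
            plotfile = file.path(tempdir(), 'ole_plot.png'),
            width = 30, height = 15, units = 'cm', dpi = 300)
unlink(file.path(tempdir(), 'ole_plot.png'))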
Function to convert any 3-d numerical array to a grid of coloured triangles.
Description
This function converts a 3-d numerical data array into a coloured grid with triangles. It is useful for a slide or article to present tabular results as colors instead of numbers. This can be used to compare the outputs of two or four categories (e.g. modes of variability, clusters, or forecast systems).
Usage
PlotTriangles4Categories(
data,
brks = NULL,
cols = NULL,
toptitle = NULL,
sig_data = NULL,
pch_sig = 18,
col_sig = "black",
cex_sig = 1,
xlab = TRUE,
ylab = TRUE,
xlabels = NULL,
xtitle = NULL,
ylabels = NULL,
ytitle = NULL,
legend = TRUE,
lab_legend = NULL,
cex_leg = 1,
col_leg = "black",
cex_axis = 1.5,
mar = c(5, 4, 0, 0),
fileout = NULL,
size_units = "px",
res = 100,
figure.width = 1,
...
)
Arguments
data |
Array with three named dimensions: 'dimx', 'dimy', 'dimcat', containing the values to be displayed in a coloured image with triangles. |
brks |
A vector of the color bar intervals. The length must be one more than the parameter 'cols'. Use ColorBar() to generate default values. |
cols |
A vector of valid colour identifiers for color bar. The length must be one less than the parameter 'brks'. Use ColorBar() to generate default values. |
toptitle |
A string of the title of the grid. Set NULL as default. |
sig_data |
Logical array with the same dimensions as 'data' to add layers to the plot. A value of TRUE at a grid cell will draw a dot/symbol on the corresponding triangle of the plot. Set NULL as default. |
pch_sig |
Symbol to be used to represent sig_data. Takes 18 (diamond) by default. See 'pch' in par() for additional accepted options. |
col_sig |
Colour of the symbol to represent sig_data. |
cex_sig |
Parameter to increase/reduce the size of the symbols used to represent sig_data. |
xlab |
A logical value (TRUE by default) indicating if xlabels should be plotted. |
ylab |
A logical value (TRUE by default) indicating if ylabels should be plotted. |
xlabels |
A vector of labels for the x-axis. The length must equal the 'dimx' dimension of parameter 'data'. Set to the sequence from 1 to the length of 'dimx' as default. |
xtitle |
A string of title of the x-axis. Set NULL as default. |
ylabels |
A vector of labels for the y-axis. The length must equal the 'dimy' dimension of parameter 'data'. Set to the sequence from 1 to the length of 'dimy' as default. |
ytitle |
A string of title of the y-axis. Set NULL as default. |
legend |
A logical value to decide to draw the color bar legend or not. Set TRUE as default. |
lab_legend |
A vector of labels indicating what is represented in each category (i.e. triangle). Set the sequence from 1 to the length of the categories (2 or 4). |
cex_leg |
A number to indicate the increase/reduction of the size of the labels in 'lab_legend'. |
col_leg |
Color of the legend (triangles). |
cex_axis |
A number to indicate the increase/reduction of the axis labels. |
mar |
A numerical vector of the form c(bottom, left, top, right) which gives the number of lines of margin to be specified on the four sides of the plot. |
fileout |
A string of full directory path and file name indicating where to save the plot. If not specified (default), a graphics device will pop up. |
size_units |
A string indicating the units of the size of the device (file or window) to plot in. Set 'px' as default. See ?Devices and the creator function of the corresponding device. |
res |
A positive number indicating resolution of the device (file or window) to plot in. See ?Devices and the creator function of the corresponding device. |
figure.width |
A numeric value to control the width of the plot. |
... |
The additional parameters to be passed to function ColorBar() in s2dv for color legend creation. |
Value
A figure in a popup window by default, or saved to the specified path.
Author(s)
History:
1.0 - 2020-10 (V.Torralba, veronica.torralba@bsc.es) - Original code
Examples
# Example with random data
arr1 <- array(runif(n = 4 * 5 * 4, min = -1, max = 1), dim = c(4,5,4))
names(dim(arr1)) <- c('dimx', 'dimy', 'dimcat')
arr2 <- array(TRUE, dim = dim(arr1))
arr2[which(arr1 < 0.3)] <- FALSE
PlotTriangles4Categories(data = arr1,
cols = c('white','#fef0d9','#fdd49e','#fdbb84','#fc8d59'),
brks = c(-1, 0, 0.1, 0.2, 0.3, 0.4),
lab_legend = c('NAO+', 'BL','AR','NAO-'),
xtitle = "Target month", ytitle = "Lead time",
xlabels = c("Jan", "Feb", "Mar", "Apr"))
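The logical array 'arr2' created above is not used in the call; a minimal sketch of the significance layer, passing it through 'sig_data' so that a symbol is drawn on the triangles where the value is TRUE:
PlotTriangles4Categories(data = arr1, sig_data = arr2,
                         cols = c('white','#fef0d9','#fdd49e','#fdbb84','#fc8d59'),
                         brks = c(-1, 0, 0.1, 0.2, 0.3, 0.4),
                         lab_legend = c('NAO+', 'BL','AR','NAO-'),
                         xtitle = "Target month", ytitle = "Lead time",
                         xlabels = c("Jan", "Feb", "Mar", "Apr"))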
Plots the observed weekly means and climatology of timeseries data
Description
This function plots the observed weekly means and climatology of timeseries data using the ggplot2 package. It compares the weekly climatology in a specified period (reference period) to the observed conditions during the target period analyzed in the case study.
Usage
PlotWeeklyClim(
data,
first_date,
ref_period,
last_date = NULL,
data_years = NULL,
time_dim = "time",
sdate_dim = "sdate",
ylim = NULL,
title = NULL,
subtitle = NULL,
ytitle = NULL,
legend = TRUE,
palette = "Blues",
fileout = NULL,
device = NULL,
width = 8,
height = 6,
units = "in",
dpi = 300
)
Arguments
data |
A multidimensional array with named dimensions with at least sdate and time dimensions containing observed daily data. It can also be a dataframe with computed percentiles as input for ggplot. If it's a dataframe, it must contain the following column names: 'week', 'clim', 'p10', 'p90', 'p33', 'p66', 'week_mean', 'day' and 'data'. |
first_date |
The first date of the observed values of timeseries. It can be of class 'Date', 'POSIXct' or a character string in the format 'yyyy-mm-dd'. If parameter 'data_years' is not provided, it must be a date included in the reference period. |
ref_period |
A vector of numeric values indicating the years of the reference period. If parameter 'data_years' is not specified, it must be of the same length of dimension 'sdate_dim' of parameter 'data'. |
last_date |
Optional parameter indicating the last date of the target period of the daily timeseries. It can be of class 'Date', 'POSIXct' or a character string in the format 'yyyy-mm-dd'. If it is NULL, the last date of the daily timeseries will be set as the last date of 'data'. As the data is plotted by weeks, only full groups of 7 days will be plotted. If the last date of the timeseries is not a multiple of 7 days, the last week will not be plotted. |
data_years |
A vector of numeric values indicating the years of the data. It must be of the same length of dimension 'sdate_dim' of parameter 'data'. It is optional, if not specified, all the years will be used as the target period. |
time_dim |
A character string indicating the daily time dimension name. The default value is 'time'. |
sdate_dim |
A character string indicating the start year dimension name. The default value is 'sdate'. |
ylim |
A numeric vector of length two providing limits of the scale. Use NA to refer to the existing minimum or maximum. For more information, see 'ggplot2' documentation of 'scale_y_continuous' parameter. |
title |
The text for the top title of the plot. It is NULL by default. |
subtitle |
The text for the subtitle of the plot. It is NULL by default. |
ytitle |
Character string to be drawn as y-axis title. It is NULL by default. |
legend |
A logical value indicating whether a legend should be included in the plot. If it is TRUE or NA, the legend will be included. If it is FALSE, the legend will not be included. It is TRUE by default. |
palette |
A palette name from the RColorBrewer package. The default value is 'Blues'. |
fileout |
A character string indicating the file name where to save the plot. If not specified (default) a graphics device will pop up. |
device |
A character string indicating the device to use. Can either be a device function (e.g. png), or one of "eps", "ps", "tex" (pictex), "pdf", "jpeg", "tiff", "png", "bmp", "svg" or "wmf" (windows only). |
width |
A numeric value of the plot width in units ("in", "cm", "mm", or "px"). It is set to 8 by default. |
height |
A numeric value of the plot height in units ("in", "cm", "mm", or "px"). It is set to 6 by default. |
units |
Units of the size of the device (file or window) to plot in. Inches ('in') by default. |
dpi |
A numeric value of the plot resolution. It is set to 300 by default. |
Value
A ggplot object containing the plot.
Examples
data <- array(rnorm(49*20*3, 274), dim = c(time = 49, sdate = 20, member = 3))
PlotWeeklyClim(data = data, first_date = '2002-08-09',
last_date = '2002-09-15', ref_period = 2010:2019,
data_years = 2000:2019, time_dim = 'time', sdate_dim = 'sdate',
title = "Observed weekly means and climatology",
subtitle = "Target years: 2010 to 2019",
ytitle = paste0('tas', " (", "deg.C", ")"))
Computing scores of predictability using two dynamical proxies based on dynamical systems theory.
Description
This function divides into terciles the two dynamical proxies computed with CST_ProxiesAttractor or ProxiesAttractor. These terciles will be used to measure the predictability of the system in dyn_scores. When the local dimension 'dim' is small and the inverse of the persistence 'theta' is small, the predictability is high, and vice versa.
Usage
Predictability(dim, theta, ncores = NULL)
Arguments
dim |
An array of N named dimensions containing the local dimension as the output of CST_ProxiesAttractor or ProxiesAttractor. |
theta |
An array of N named dimensions containing the inverse of the persistence 'theta' as the output of CST_ProxiesAttractor or ProxiesAttractor. |
ncores |
The number of cores to use in parallel computation. |
Value
A list with the following elements:
'pred.dim', a list of two lists 'qdim' and 'pos.d'. The 'qdim' list contains values of the local dimension 'dim' divided by terciles: d1 (lower tercile, high predictability), d2 (middle tercile), d3 (higher tercile, low predictability). The 'pos.d' list contains the position of each tercile in parameter 'dim'.
'pred.theta', a list of two lists 'qtheta' and 'pos.t'. The 'qtheta' list contains values of the inverse of the persistence 'theta' divided by terciles: th1 (lower tercile, high predictability), th2 (middle tercile), th3 (higher tercile, low predictability). The 'pos.t' list contains the position of each tercile in parameter 'theta'.
'dyn_scores', with values from 0 to 1; a dyn_score of 1 indicates the highest predictability.
Author(s)
Carmen Alvarez-Castro, carmen.alvarez-castro@cmcc.it
Maria M. Chaves-Montero, mdm.chaves-montero@cmcc.it
Veronica Torralba, veronica.torralba@cmcc.it
Davide Faranda, davide.faranda@lsce.ipsl.fr
References
Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large scale atmospheric predictability. Nature Communications, 10(1), 1316. doi:10.1038/s41467-019-09305-8.
Faranda, D., Gabriele Messori and Pascal Yiou. (2017). Dynamical proxies of North Atlantic predictability and extremes. Scientific Reports, 7-41278, 2017.
Examples
# Creating an example matrix m(time, grid):
m <- matrix(rnorm(2000) * 10, nrow = 50, ncol = 40)
names(dim(m)) <- c('time', 'grid')
# imposing a threshold
quanti <- 0.60
# computing dyn_scores from parameters dim and theta of the attractor
attractor <- ProxiesAttractor(data = m, quanti = quanti)
predyn <- Predictability(dim = attractor$dim, theta = attractor$theta)
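The elements listed in the Value section can then be inspected; a minimal sketch (element names as documented above):
names(predyn)
str(predyn$pred.dim$qdim)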
Computing two dynamical proxies of the attractor.
Description
This function computes two dynamical proxies of the attractor: the local dimension (d) and the inverse of the persistence (theta). These two parameters will be used as a condition for the computation of dynamical scores to measure predictability and to compute bias correction conditioned by the dynamics with the function DynBiasCorrection. Function based on the MATLAB code (davide.faranda@lsce.ipsl.fr) used in the references below.
Usage
ProxiesAttractor(data, quanti, ncores = NULL)
Arguments
data |
A multidimensional array with named dimensions to create the attractor. It requires a temporal dimension named 'time' and spatial dimensions called 'lat' and 'lon', or 'latitude' and 'longitude' or 'grid'. |
quanti |
A number lower than 1 indicating the quantile used to perform the computation of the local dimension and theta. |
ncores |
The number of cores to use in parallel computation. |
Value
A list with elements 'dim', the local dimension, and 'theta', the inverse of the persistence.
Author(s)
Carmen Alvarez-Castro, carmen.alvarez-castro@cmcc.it
Maria M. Chaves-Montero, mdm.chaves-montero@cmcc.it
Veronica Torralba, veronica.torralba@cmcc.it
Davide Faranda, davide.faranda@lsce.ipsl.fr
References
Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large scale atmospheric predictability. Nature Communications, 10(1), 1316. doi:10.1038/s41467-019-09305-8.
Faranda, D., Gabriele Messori and Pascal Yiou. (2017). Dynamical proxies of North Atlantic predictability and extremes. Scientific Reports, 7-41278, 2017.
Examples
# Example 1: Computing the attractor using simple data
# Creating an example of matrix data(time,grids):
mat <- array(rnorm(36 * 40), c(time = 36, grid = 40))
qm <- 0.90 # imposing a threshold
Attractor <- ProxiesAttractor(data = mat, quanti = qm)
# to plot the result
time <- 1:length(Attractor$theta)
plot(time, Attractor$dim, xlab = 'time', ylab = 'd',
main = 'local dimension', type = 'l')
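The inverse of the persistence returned in the same list can be plotted analogously:
plot(time, Attractor$theta, xlab = 'time', ylab = 'theta',
     main = 'inverse of persistence', type = 'l')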
Quantile Mapping for seasonal or decadal forecast data
Description
This function is a wrapper of fitQmap and doQmap from package 'qmap' to be applied on multi-dimensional arrays. The quantile mapping adjustment between an experiment, typically a hindcast, and observation is applied to the experiment itself or to a provided forecast.
Usage
QuantileMapping(
exp,
obs,
exp_cor = NULL,
sdate_dim = "sdate",
memb_dim = "member",
window_dim = NULL,
method = "QUANT",
na.rm = FALSE,
eval.method = "leave-k-out",
k = 1,
ncores = NULL,
...
)
Arguments
exp |
A multidimensional array with named dimensions containing the hindcast. |
obs |
A multidimensional array with named dimensions containing the reference dataset. |
exp_cor |
A multidimensional array with named dimensions in which the quantile mapping correction should be applied. If it is not specified, the correction is applied to object 'exp' using leave-one-out cross-validation. This is useful to correct a forecast when the hindcast is provided in parameter 'exp'. |
sdate_dim |
A character string indicating the dimension name in which cross-validation would be applied when exp_cor is not provided. 'sdate' by default. |
memb_dim |
A character string indicating the dimension name where ensemble members are stored in the experimental arrays. It can be NULL if there is no ensemble member dimension. It is set as 'member' by default. |
window_dim |
A character string indicating the dimension name in which extra samples are stored. This dimension is joined to the 'member' dimension. This is useful to correct daily data, for which robust statistics can be obtained by creating a window of dates around the target date. |
method |
A character string indicating the method to be used: 'PTF', 'DIST', 'RQUANT', 'QUANT', 'SSPLIN'. By default, the empirical quantile mapping 'QUANT' is used. |
na.rm |
A logical value indicating if missing values should be removed (FALSE by default). |
eval.method |
A character string indicating the evaluation method for cross-validation. The default method is 'leave-k-out'; other available methods are 'retrospective', 'in-sample' and 'hindcast-vs-forecast'. |
k |
Positive integer. Default = 1. In method 'leave-k-out', 'k' is expected to be an odd integer indicating the number of points to leave out. In method 'retrospective', 'k' can be any positive integer greater than 1, indicating when to start. |
ncores |
An integer indicating the number of cores for parallel computation using multiApply function. The default value is NULL (1). |
... |
Additional parameters to be used by the method chosen. See the qmap package for details. |
Value
An array containing the experimental data after applying the quantile mapping correction.
Author(s)
Nuria Perez-Zanon, nuria.perez@bsc.es
See Also
fitQmap and doQmap from package 'qmap'.
Examples
# Use synthetic data
set.seed(123)
exp <- as.numeric(1:prod(6,10,15))
dim(exp) <- c(member = 6, syear = 10, window = 15)
obs <- as.numeric(rnorm(prod(1,10,15), 50))
dim(obs) <- c(member = 1, syear = 10, window = 15)
fcst <- 100*(1:prod(8,1,1))
dim(fcst) <- c(member = 8, syear = 1, window = 1)
res <- QuantileMapping(exp = exp, obs = obs, exp_cor = fcst,
memb_dim = 'member', sdate_dim = 'syear', window_dim = 'window')
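When 'exp_cor' is omitted, the correction is applied to 'exp' itself using cross-validation, as described above; a minimal sketch with the same synthetic data:
res_cv <- QuantileMapping(exp = exp, obs = obs, memb_dim = 'member',
                          sdate_dim = 'syear', window_dim = 'window')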
RainFARM spectral slopes from an array (reduced version)
Description
This function computes spatial spectral slopes from an array, to be used for RainFARM stochastic precipitation downscaling method.
Usage
RFSlope(
data,
kmin = 1,
time_dim = NULL,
lon_dim = "lon",
lat_dim = "lat",
ncores = NULL
)
Arguments
data |
Array containing the spatial precipitation fields to downscale. The input array is expected to have at least two dimensions named "lon" and "lat" by default (these default names can be changed with the lon_dim and lat_dim parameters). |
kmin |
First wavenumber for spectral slope (default kmin = 1). |
time_dim |
String or character array with name(s) of dimension(s) (e.g. "ftime", "sdate", "member" ...) over which to compute spectral slopes. If a character array of dimension names is provided, the spectral slopes will be computed as an average over all elements belonging to those dimensions. If omitted one of c("ftime", "sdate", "time") is searched and the first one with more than one element is chosen. |
lon_dim |
Name of lon dimension ("lon" by default). |
lat_dim |
Name of lat dimension ("lat" by default). |
ncores |
An integer that indicates the number of cores for parallel computation using the multiApply function. The default value is one. |
Value
RFSlope() returns spectral slopes using the RainFARM convention
(the logarithmic slope of k*|A(k)|^2 where A(k) are the spectral amplitudes).
The returned array has the same dimensions as the input array,
minus the dimensions specified by lon_dim, lat_dim and time_dim.
Author(s)
Jost von Hardenberg - ISAC-CNR, j.vonhardenberg@isac.cnr.it
Examples
# Example for the 'reduced' RFSlope function
# Create a test array with dimension 8x8, 20 timesteps,
# 3 starting dates and 4 ensemble members.
pr <- 1:(4*3*8*8*20)
dim(pr) <- c(ensemble = 4, sdate = 3, lon = 8, lat = 8, ftime = 20)
# Compute the spectral slopes ignoring the wavenumber
# corresponding to the largest scale (the box)
slopes <- RFSlope(pr, kmin = 2, time_dim = 'ftime')
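Passing a character vector to 'time_dim' averages the slopes over all the listed dimensions, as described above; a minimal sketch:
slopes_avg <- RFSlope(pr, kmin = 2, time_dim = c('ftime', 'sdate'))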
Temperature downscaling of a CSTools object using lapse rate correction (reduced version)
Description
This function implements a simple lapse rate correction of a temperature field (a multidimensional array) as input. The input lon grid must be increasing (but can be modulo 360). The input lat grid can be irregularly spaced (e.g. a Gaussian grid). The output grid can be irregularly spaced in lon and/or lat.
Usage
RFTemp(
data,
lon,
lat,
oro,
lonoro,
latoro,
xlim = NULL,
ylim = NULL,
lapse = 6.5,
lon_dim = "lon",
lat_dim = "lat",
time_dim = NULL,
nolapse = FALSE,
verbose = FALSE,
compute_delta = FALSE,
method = "bilinear",
delta = NULL
)
Arguments
data |
Temperature array to downscale. The input array is expected to have at least two dimensions named "lon" and "lat" by default (these default names can be changed with the lon_dim and lat_dim parameters). |
lon |
Vector or array of longitudes. |
lat |
Vector or array of latitudes. |
oro |
Array containing fine-scale orography (in m). The destination downscaling area must be contained in the orography field. |
lonoro |
Vector or array of longitudes corresponding to the fine orography. |
latoro |
Vector or array of latitudes corresponding to the fine orography. |
xlim |
Vector with longitude bounds for downscaling; the full input field is downscaled if 'xlim' and 'ylim' are not specified. |
ylim |
Vector with latitude bounds for downscaling. |
lapse |
Float with environmental lapse rate. |
lon_dim |
String with name of longitude dimension. |
lat_dim |
String with name of latitude dimension. |
time_dim |
A character string, or vector of character strings, indicating the name(s) of the temporal dimension(s). By default, it is set to NULL and "ftime", "sdate" and "time" are considered temporal dimensions. |
nolapse |
Logical, if true 'oro' is interpreted as a fine-scale climatology and used directly for bias correction. |
verbose |
Logical indicating whether to print diagnostic output. |
compute_delta |
Logical; if TRUE, only a delta to be used for out-of-sample forecasts is returned. |
method |
String indicating the method used for interpolation: "nearest" (nearest neighbours followed by smoothing with a circular uniform weights kernel) or "bilinear" (bilinear interpolation). The two methods provide similar results, but "nearest" is slightly better provided that the fine-scale grid is correctly centered as a subdivision of the large-scale grid. |
delta |
Matrix containing a delta to be applied to the downscaled input data. The grid of this matrix is supposed to be same as that of the required output field. |
Value
CST_RFTemp() returns a downscaled CSTools object.
RFTemp() returns a list containing the fine-scale longitudes, latitudes and the downscaled fields.
Author(s)
Jost von Hardenberg - ISAC-CNR, j.vonhardenberg@isac.cnr.it
References
Method described in ERA4CS MEDSCOPE milestone M3.2: High-quality climate prediction data available to WP4 here: https://www.medscope-project.eu/the-project/deliverables-reports/ and in H2020 ECOPOTENTIAL Deliverable No. 8.1: High resolution (1-10 km) climate, land use and ocean change scenarios here: https://ec.europa.eu/research/participants/documents/downloadPublic?documentIds=080166e5b6cd2324&appId=PPGMS.
Examples
# Generate simple synthetic data and downscale by factor 4
t <- rnorm(7 * 6 * 4 * 3) * 10 + 273.15 + 10
dim(t) <- c(sdate = 3, ftime = 4, lat = 6, lon = 7)
lon <- seq(3, 9, 1)
lat <- seq(42, 47, 1)
o <- runif(29 * 29) * 3000
dim(o) <- c(lat = 29, lon = 29)
lono <- seq(3, 10, 0.25)
lato <- seq(41, 48, 0.25)
res <- RFTemp(t, lon, lat, o, lono, lato, xlim = c(4, 8), ylim = c(43, 46),
lapse = 6.5, time_dim = 'ftime')
Compute climatological weights for RainFARM stochastic precipitation downscaling
Description
Compute climatological ("orographic") weights from a fine-scale precipitation climatology file.
Usage
RF_Weights(
zclim,
latin,
lonin,
nf,
lat,
lon,
fsmooth = TRUE,
lonname = "lon",
latname = "lat",
ncores = NULL
)
Arguments
zclim |
A multi-dimensional array with named dimensions containing at least one precipitation field with spatial dimensions. |
latin |
A vector indicating the latitudinal coordinates corresponding to the zclim parameter. |
lonin |
A vector indicating the longitudinal coordinates corresponding to the zclim parameter. |
nf |
Refinement factor for downscaling (the output resolution is increased by this factor). |
lat |
Vector of latitudes. The number of longitudes and latitudes is expected to be even and the same. If not, the function will perform a subsetting to ensure this condition. |
lon |
Vector of longitudes. |
fsmooth |
Logical to use smooth conservation (default) or large-scale box-average conservation. |
lonname |
A character string indicating the name of the longitudinal dimension set as 'lon' by default. |
latname |
A character string indicating the name of the latitudinal dimension set as 'lat' by default. |
ncores |
An integer that indicates the number of cores for parallel computations using multiApply function. The default value is one. |
Value
An object of class 's2dv_cube' containing the weights in its data element, with dimensions (lon, lat).
Author(s)
Jost von Hardenberg - ISAC-CNR, j.vonhardenberg@isac.cnr.it
References
Terzago, S., Palazzi, E., & von Hardenberg, J. (2018). Stochastic downscaling of precipitation in complex orography: A simple method to reproduce a realistic fine-scale climatology. Natural Hazards and Earth System Sciences, 18(11), 2825-2840. doi:10.5194/nhess-18-2825-2018.
Examples
a <- array(1:2500, c(lat = 50, lon = 50))
res <- RF_Weights(a, seq(0.1, 5, 0.1), seq(0.1, 5, 0.1),
nf = 5, lat = 1:5, lon = 1:5)
RainFARM stochastic precipitation downscaling (reduced version)
Description
This function implements the RainFARM stochastic precipitation downscaling method and accepts as input an array with named dims ("lon", "lat") and one or more dimensions (such as "ftime", "sdate" or "time") over which to average automatically determined spectral slopes. Adapted for climate downscaling and including orographic correction. References: Terzago, S. et al. (2018). NHESS 18(11), 2825-2840. doi:10.5194/nhess-18-2825-2018, D'Onofrio et al. (2014), J of Hydrometeorology 15, 830-843; Rebora et. al. (2006), JHM 7, 724.
Usage
RainFARM(
data,
lon,
lat,
nf,
weights = 1,
nens = 1,
slope = 0,
kmin = 1,
fglob = FALSE,
fsmooth = TRUE,
nprocs = 1,
time_dim = NULL,
lon_dim = "lon",
lat_dim = "lat",
drop_realization_dim = FALSE,
verbose = FALSE
)
Arguments
data |
Precipitation array to downscale. The input array is expected to have at least two dimensions named "lon" and "lat" by default (these default names can be changed with the lon_dim and lat_dim parameters). |
lon |
Vector or array of longitudes. |
lat |
Vector or array of latitudes. |
nf |
Refinement factor for downscaling (the output resolution is increased by this factor). |
weights |
Multi-dimensional array with climatological weights which can be obtained using the RF_Weights function (default: 1, i.e. no weights). |
nens |
Number of ensemble members to produce (default: nens = 1). |
slope |
Prescribed spectral slope. The default is slope = 0, meaning that the slope is determined automatically over the dimensions specified by time_dim. |
kmin |
First wavenumber for spectral slope (default: kmin = 1). |
fglob |
Logical to conserve global precipitation over the domain (default: FALSE). |
fsmooth |
Logical to conserve precipitation with a smoothing kernel (default: TRUE). |
nprocs |
The number of parallel processes to spawn for parallel computation across multiple cores (default: 1). |
time_dim |
String or character array with name(s) of time dimension(s) (e.g. "ftime", "sdate", "time" ...) over which to compute spectral slopes. If a character array of dimension names is provided, the spectral slopes will be computed over all elements belonging to those dimensions. If omitted one of c("ftime", "sdate", "time") is searched and the first one with more than one element is chosen. |
lon_dim |
Name of lon dimension ("lon" by default). |
lat_dim |
Name of lat dimension ("lat" by default). |
drop_realization_dim |
Logical to remove the "realization" stochastic ensemble dimension (default: FALSE) with the following behaviour if set to TRUE: (1) if nens == 1, the dimension is dropped; (2) if nens > 1 and a "member" dimension exists, the "realization" and "member" dimensions are compacted (multiplied) and the resulting dimension is named "member"; (3) if nens > 1 and no "member" dimension exists, the "realization" dimension is renamed to "member". |
verbose |
Logical for verbose output (default: FALSE). |
Details
When parameters 'slope' and 'weights' present a seasonality dependency, a dimension name should match between these parameters and the input data in parameter 'data'. See example 2 below, where weights and slope vary with the 'sdate' dimension.
Value
RainFARM() returns a list containing the fine-scale longitudes,
latitudes and the sequence of nens downscaled fields. If
nens > 1 an additional dimension named "realization" is added to the
output array after the "member" dimension (if it exists and unless
drop_realization_dim = TRUE is specified). The ordering of the
remaining dimensions in the exp element of the input object is
maintained.
Author(s)
Jost von Hardenberg - ISAC-CNR, j.vonhardenberg@isac.cnr.it
Examples
# Example for the 'reduced' RainFARM function
nf <- 8 # Choose a downscaling by factor 8
exp <- 1 : (2 * 3 * 4 * 8 * 8)
dim(exp) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
lon <- seq(10, 13.5, 0.5)
lat <- seq(40, 43.5, 0.5)
# Create a test array of weights
ww <- array(1., dim = c(lon = 8 * nf, lat = 8 * nf))
res <- RainFARM(data = exp, lon = lon, lat = lat, nf = nf,
weights = ww, nens = 3, time_dim = 'ftime')
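Example 2, referenced in the Details section, is sketched below: 'slope' and 'weights' are given an 'sdate' dimension matching the input data, so that a different slope and weight field is used for each start date (the numeric values are arbitrary):
# Example 2: weights and slope varying with the 'sdate' dimension
slo <- array(c(4, 5, 6), c(sdate = 3))
wwd <- array(rep(ww, 3), dim = c(lon = 8 * nf, lat = 8 * nf, sdate = 3))
res2 <- RainFARM(data = exp, lon = lon, lat = lat, nf = nf,
                 weights = wwd, slope = slo, nens = 2, time_dim = 'ftime')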
Function for matching a field of anomalies with a set of maps used as a reference (e.g. clusters obtained from the WeatherRegime function).
Description
This function performs the matching between a field of anomalies and a set of maps which will be used as a reference. The anomalies will be assigned to the reference map for which the minimum Euclidean distance (method = 'distance') or highest spatial correlation (method = 'ACC') is obtained.
Usage
RegimesAssign(
data,
ref_maps,
lat,
method = "distance",
composite = FALSE,
memb = FALSE,
ncores = NULL
)
Arguments
data |
An array containing anomalies with named dimensions: dataset, member, sdate, ftime, lat and lon. |
ref_maps |
Array with 3-dimensions ('lon', 'lat', 'cluster') containing the maps/clusters that will be used as a reference for the matching. |
lat |
A vector of latitudes corresponding to the positions provided in data and ref_maps. |
method |
Whether the matching will be performed in terms of minimum distance (default = 'distance') or the maximum spatial correlation (method = 'ACC') between the maps. |
composite |
A logical parameter indicating if the composite maps are computed or not (default = FALSE). |
memb |
A logical value indicating whether to compute composites for separate members (default FALSE) or as a unique ensemble (TRUE). This option is only available when parameter 'composite' is set to TRUE and the data object has a dimension named 'member'. |
ncores |
The number of multicore threads to use for parallel computation. |
Value
A list with elements:
$composite, a 3-d array (lon, lat, k) containing the composites for each reference map k = 1, ..., K (only if composite = TRUE).
$pvalue, an array with the same structure as $composite containing the p-value of the composites, obtained through a t-test that accounts for the serial dependence of the data (only if composite = TRUE).
$cluster, an array with the same dimensions as data (except latitude and longitude, which are removed) indicating the reference map to which each point is allocated.
$frequency, a vector (one element per reference map, k = 1, ..., K) indicating the percentage of assignations corresponding to each map.
Author(s)
Verónica Torralba - BSC, veronica.torralba@bsc.es
References
Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools for the development of a climate service. Thesis. Available online: https://eprints.ucm.es/56841/
Examples
data <- array(abs(rnorm(1280, 282.7, 6.4)), dim = c(dataset = 2, member = 2,
sdate = 3, ftime = 3,
lat = 4, lon = 4))
regimes <- WeatherRegime(data = data, lat = seq(47, 44),
EOFs = FALSE, ncenters = 4)$composite
res1 <- RegimesAssign(data = data, ref_maps = drop(regimes),
lat = seq(47, 44), composite = FALSE)
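Setting 'composite = TRUE' additionally returns the composites and their p-values, as documented in the Value section; a minimal sketch:
res2 <- RegimesAssign(data = data, ref_maps = drop(regimes),
                      lat = seq(47, 44), composite = TRUE)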
Save a multidimensional array with metadata to data in NetCDF format
Description
This function allows saving a data array with metadata into a NetCDF file, so that the saved data can be reloaded using the Start function from the startR package. If the original 's2dv_cube' object has been created from CST_Load(), then it can be reloaded with Load().
Usage
SaveExp(
data,
destination = tempdir(),
coords = NULL,
Dates = NULL,
time_bounds = NULL,
startdates = NULL,
varname = NULL,
metadata = NULL,
Datasets = NULL,
sdate_dim = "sdate",
ftime_dim = "time",
memb_dim = "member",
dat_dim = "dataset",
var_dim = "var",
drop_dims = NULL,
single_file = FALSE,
extra_string = NULL,
global_attrs = NULL,
units_hours_since = FALSE
)
Arguments
data |
A multi-dimensional array with named dimensions. |
destination |
A character string indicating the path where to store the NetCDF files. |
coords |
A named list with elements of the coordinates corresponding to the dimensions of the data parameter. The names and length of each element must correspond to the names of the dimensions. If any coordinate is not provided, it is set as an index vector with the values from 1 to the length of the corresponding dimension. |
Dates |
A named array of dates with the corresponding sdate and forecast time dimension. If there is no sdate_dim, you can set it to NULL. It must have ftime_dim dimension. |
time_bounds |
(Optional) A list of two arrays of dates containing the lower (first array) and the upper (second array) time bounds corresponding to Dates. Each array must have the same dimensions as Dates. If 'Dates' parameter is NULL, 'time_bounds' are not used. It is NULL by default. |
startdates |
A vector of dates that will be used for the filenames when saving the data in multiple files (single_file = FALSE). It must be a vector of the same length as the start date dimension of data, and of class 'Date', 'POSIXct' or character. |
varname |
A character string indicating the name of the variable to be saved. |
metadata |
A named list where each element is a variable containing the corresponding information. The information must be contained in a list of lists for each variable. |
Datasets |
A vector of character string indicating the names of the datasets. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. It can be NULL if there is no start date dimension. |
ftime_dim |
A character string indicating the name of the forecast time dimension. By default, it is set to 'time'. It can be NULL if there is no forecast time dimension. |
memb_dim |
A character string indicating the name of the member dimension. By default, it is set to 'member'. It can be NULL if there is no member dimension. |
dat_dim |
A character string indicating the name of dataset dimension. By default, it is set to 'dataset'. It can be NULL if there is no dataset dimension. |
var_dim |
A character string indicating the name of variable dimension. By default, it is set to 'var'. It can be NULL if there is no variable dimension. |
drop_dims |
(optional) A vector of character strings indicating the dimension names of length 1 that need to be dropped so that they don't appear in the netCDF file. Only dimensions that are not used in the computation can be dropped. The dimensions used in the computation are the ones specified in: sdate_dim, ftime_dim, dat_dim, var_dim and memb_dim. It is NULL by default. |
single_file |
A logical value indicating if all object is saved in a single file (TRUE) or in multiple files (FALSE). When it is FALSE, the array is separated for datasets, variable and start date. When there are no specified time dimensions, the data will be saved in a single file by default. The output file name when 'single_file' is TRUE is a character string containing: '<var>_<first_sdate>_<last_sdate>.nc'; when it is FALSE, it is '<var>_<sdate>.nc'. It is FALSE by default. |
extra_string |
(Optional) A character string to be included as part of the file name, for instance, to identify member or realization. When single_file is TRUE, the 'extra_string' will substitute all the default file name; when single_file is FALSE, the 'extra_string' will be added in the file name as: '<var>_<extra_string>_<sdate>.nc'. It is NULL by default. |
global_attrs |
(Optional) A list with elements containing the global attributes to be saved in the NetCDF. |
units_hours_since |
(Optional) A logical value only available for the case: Dates have forecast time and start date dimension, single_file is TRUE and 'time_bounds' is NULL. When it is TRUE, it saves the forecast time with units of 'hours since'; if it is FALSE, the time units will be a number of time steps with its corresponding frequency (e.g. n days, n months or n hours). It is FALSE by default. |
Value
Multiple or single NetCDF files containing the data array.
single_file is TRUE |
All data is saved in a single file located in the specified destination path with the following name (by default): '<variable_name>_<first_sdate>_<last_sdate>.nc'. Multiple variables are saved separately in the same file. The forecast time units are calculated from each start date (if sdate_dim is not NULL) or from the time step. If 'units_hours_since' is TRUE, the forecast time units will be 'hours since <each start date>'. If 'units_hours_since' is FALSE, the forecast time units are extracted from the frequency of the time steps (hours, days, months); if no frequency is found, the units will be 'hours since'. When the time units are 'hours since', the time steps are assumed to be equally spaced. |
single_file is FALSE |
The data array is subset and stored into multiple files. Each file contains the data subset for each start date, variable and dataset. Files with different variables and datasets are stored in separated directories within the following directory tree: 'destination/Dataset/variable/'. The name of each file will be by default: '<variable_name>_<sdate>.nc'. The forecast time units are calculated from each start date (if sdate_dim is not NULL) or from the time step. The forecast time units will be 'hours since <each start date>'. |
Author(s)
Perez-Zanon Nuria, nuria.perez@bsc.es
Examples
data <- lonlat_temp_st$exp$data
lon <- lonlat_temp_st$exp$coords$lon
lat <- lonlat_temp_st$exp$coords$lat
coords <- list(lon = lon, lat = lat)
Datasets <- lonlat_temp_st$exp$attrs$Datasets
varname <- 'tas'
Dates <- lonlat_temp_st$exp$attrs$Dates
metadata <- lonlat_temp_st$exp$attrs$Variable$metadata
SaveExp(data = data, coords = coords, Datasets = Datasets, varname = varname,
Dates = Dates, metadata = metadata, single_file = TRUE,
ftime_dim = 'ftime', var_dim = 'var', dat_dim = 'dataset')
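A minimal sketch of the multiple-file mode described in the Value section, writing one file per dataset, variable and start date under a temporary destination (path chosen arbitrarily here):
SaveExp(data = data, destination = tempdir(), coords = coords,
        Datasets = Datasets, varname = varname, Dates = Dates,
        metadata = metadata, single_file = FALSE,
        ftime_dim = 'ftime', var_dim = 'var', dat_dim = 'dataset')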
Function to Split Dimension
Description
This function splits a dimension in two. The user can select the dimension to split and provide indices indicating how to split that dimension, or dates and the expected frequency (monthly or by day, month and year). The user can also provide a numeric frequency indicating the length of each division.
Usage
SplitDim(
data,
split_dim = "time",
indices,
freq = "monthly",
new_dim_name = NULL,
dates = NULL,
return_indices = FALSE
)
Arguments
data |
An n-dimensional array with named dimensions. |
split_dim |
A character string indicating the name of the dimension to split. |
indices |
A vector of numeric indices or dates. |
freq |
A character string indicating the frequency: by 'day', 'month' and 'year' or 'monthly' (by default). 'month' identifies months between 1 and 12 independently of the year they belong to, while 'monthly' differentiates months from different years. Parameter 'freq' can also be numeric indicating the length in which to subset the dimension. |
new_dim_name |
A character string indicating the name of the new dimension. |
dates |
An optional parameter containing an array of dates of class 'POSIXct' with the corresponding time dimensions of 'data'. It is NULL by default. |
return_indices |
A logical value that if it is TRUE, the indices used in splitting the dimension will be returned. It is FALSE by default. |
Value
An n-dimensional array with the specified dimension split into a new dimension, preserving all other original dimensions and their names.
Author(s)
Nuria Perez-Zanon, nuria.perez@bsc.es
Examples
data <- 1 : 20
dim(data) <- c(time = 10, lat = 2)
indices <- c(rep(1,5), rep(2,5))
new_data <- SplitDim(data, indices = indices)
time <- c(seq(ISOdate(1903, 1, 1), ISOdate(1903, 1, 4), "days"),
seq(ISOdate(1903, 2, 1), ISOdate(1903, 2, 4), "days"),
seq(ISOdate(1904, 1, 1), ISOdate(1904, 1, 2), "days"))
new_data <- SplitDim(data, indices = time)
new_data <- SplitDim(data, indices = time, freq = 'day')
new_data <- SplitDim(data, indices = time, freq = 'month')
new_data <- SplitDim(data, indices = time, freq = 'year')
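The name of the new dimension can also be set explicitly through 'new_dim_name'; a minimal sketch reusing the dates above (the name 'month' is an arbitrary choice):
new_data <- SplitDim(data, indices = time, freq = 'month',
                     new_dim_name = 'month')
dim(new_data)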
Function for Calculating the Cluster analysis
Description
This function computes the weather regimes from a cluster analysis. It can be applied over the dataset with dimensions c(year/month, month/day, lon, lat), or by using PCs obtained from the application of the EOFs analysis to filter the dataset. The cluster analysis can be performed with the traditional k-means or with the methods included in hclust (stats package).
Usage
WeatherRegime(
data,
ncenters = NULL,
EOFs = TRUE,
neofs = 30,
varThreshold = NULL,
lon = NULL,
lat = NULL,
method = "kmeans",
iter.max = 100,
nstart = 30,
ncores = NULL
)
Arguments
data |
An array containing anomalies with named dimensions with at least start date 'sdate', forecast time 'ftime', latitude 'lat' and longitude 'lon'. |
ncenters |
Number of clusters to be calculated with the clustering function. |
EOFs |
Whether to compute the EOFs (default = TRUE) or not (FALSE) to filter the data. |
neofs |
Number of modes to be kept only if EOFs = TRUE has been selected. (default = 30). |
varThreshold |
Value with the percentage of variance to be explained by the PCs. Only sufficient PCs to explain this much variance will be used in the clustering. |
lon |
Vector of longitudes. |
lat |
Vector of latitudes. |
method |
Different options to estimate the clusters. The most traditional approach is the k-means analysis (default = 'kmeans'), but the function also supports the different methods included in hclust. These methods are: "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). For more details about these methods, see the hclust function documentation included in the stats package. |
iter.max |
Parameter to select the maximum number of iterations allowed (Only if method = 'kmeans' is selected). |
nstart |
Parameter for the cluster analysis determining how many random sets to choose (Only if method='kmeans' is selected). |
ncores |
The number of multicore threads to use for parallel computation. |
Value
A list with elements:
$composite, an array with at least 3 dimensions ('lat', 'lon', 'cluster') containing the composites k = 1, ..., K.
$pvalue, an array with at least 3 dimensions ('lat', 'lon', 'cluster') with the p-value of the composites, obtained through a t-test that accounts for the serial dependence of the data, with the same structure as $composite.
$cluster, a matrix or vector with integers (from 1 to k) indicating the cluster to which each time step is allocated.
$persistence, the percentage of days in a month/season before a cluster is replaced by a new one (only if method = 'kmeans' has been selected).
$frequency, the percentage of days in a month/season belonging to each cluster (only if method = 'kmeans' has been selected).
Author(s)
Verónica Torralba - BSC, veronica.torralba@bsc.es
References
Cortesi, N., V., Torralba, N., González-Reviriego, A., Soret, and F.J., Doblas-Reyes (2019). Characterization of European wind speed variability using weather regimes. Climate Dynamics,53, 4961–4976, doi:10.1007/s00382-019-04839-5.
Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools for the development of a climate service. Thesis. Available online: https://eprints.ucm.es/56841/
Examples
data <- array(abs(rnorm(1280, 283.7, 6)), dim = c(dataset = 2, member = 2,
sdate = 3, ftime = 3,
lat = 4, lon = 4))
lat <- seq(47, 44)
res <- WeatherRegime(data = data, lat = lat,
EOFs = FALSE, ncenters = 4)
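The hierarchical clustering methods accepted by 'method' can be used in the same way; a minimal sketch with one of the documented hclust options:
res_hc <- WeatherRegime(data = data, lat = lat,
                        EOFs = FALSE, ncenters = 4, method = 'ward.D')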
Conversion of 'startR_array' or 'list' objects to 's2dv_cube'
Description
This function converts data loaded using Start function from startR package or Load from s2dv into an 's2dv_cube' object.
Usage
as.s2dv_cube(object, remove_attrs_coords = FALSE, remove_null = FALSE)
Arguments
object |
An object of class 'startR_array' generated from function Start of the startR package, or a list output of function Load of the s2dv package. |
remove_attrs_coords |
A logical value indicating whether to remove the attributes of the coordinates (TRUE) or not (FALSE). The default value is FALSE. |
remove_null |
Optional. A logical value indicating whether to remove the elements that are NULL (TRUE) or not (FALSE) of the output object. It is only used when the object is an output from function Load of the s2dv package. |
Value
The function returns an 's2dv_cube' object to be easily used with
functions with the prefix CST from CSTools and CSIndicators packages.
The object is mainly a list with the following elements:
'data', array with named dimensions;
'dims', named vector of the data dimensions;
'coords', list of named vectors with the coordinates corresponding to the dimensions of the data parameter;
'attrs', named list with elements:
'Dates', array with named temporal dimensions of class 'POSIXct' from time values in the data;
'Variable', has the following components:
'varName', character vector of the short variable name. It is usually specified in the parameter 'var' from the functions Start and Load;
'metadata', named list of elements with variable metadata. They can be from coordinates variables (e.g. longitude) or main variables (e.g. 'var');
'Datasets', character strings indicating the names of the datasets;
'source_files', a vector of character strings with complete paths to all the found files involved in loading the data;
'when', a time stamp of the date issued by the Start() or Load() call to obtain the data;
'load_parameters', it contains the components used in the arguments to load the data from Start() or Load() functions.
Author(s)
Perez-Zanon Nuria, nuria.perez@bsc.es
Nicolau Manubens, nicolau.manubens@bsc.es
See Also
s2dv_cube, CST_Start,
Start and CST_Load
Examples
## Not run:
# Example 1: convert an object from startR::Start function to 's2dv_cube'
library(startR)
repos <- '/esarchive/exp/ecmwf/system5_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc'
data <- Start(dat = repos,
var = 'tas',
sdate = c('20170101', '20180101'),
ensemble = indices(1:5),
time = 'all',
latitude = indices(1:5),
longitude = indices(1:5),
return_vars = list(latitude = 'dat', longitude = 'dat', time = 'sdate'),
retrieve = TRUE)
data <- as.s2dv_cube(data)
# Example 2: convert an object from s2dv::Load function to 's2dv_cube'
startDates <- c('20001101', '20011101', '20021101',
'20031101', '20041101', '20051101')
data <- Load(var = 'tas', exp = 'system5c3s',
nmember = 2, sdates = startDates,
leadtimemax = 3, latmin = 10, latmax = 30,
lonmin = -10, lonmax = 10, output = 'lonlat')
data <- as.s2dv_cube(data)
## End(Not run)
Sample Of Experimental Precipitation Data In Function Of Longitudes And Latitudes
Description
This sample data set contains a small cutout of gridded seasonal precipitation forecast data from the Copernicus Climate Change ECMWF-System 5 forecast system, to be used to demonstrate downscaling. Specifically, for the 'pr' (precipitation) variable, for the first 6 forecast ensemble members, daily values, for all 31 days in March following the forecast starting dates in November of years 2010 to 2012, for a small 4x4 pixel cutout in a region in the North-Western Italian Alps (44N-47N, 6E-9E). The data resolution is 1 degree.
Details
The 'CST_Load' call used to generate the data set in the infrastructure of the Marconi machine at CINECA is shown next, working on files which were extracted from forecast data available in the MEDSCOPE internal archive.
Value
sample s2dv_cube object
library(CSTools)
infile <- list(path = paste0('/esarchive/exp/ecmwf/system5c3s/daily_mean/',
'$VAR_NAME$_s0-24h/$VAR_NAME$_$START_DATE$.nc'))
lonlat_prec <- CST_Load('prlr', exp = list(infile), obs = NULL,
sdates = c('20101101', '20111101', '20121101'),
leadtimemin = 121, leadtimemax = 151,
latmin = 44, latmax = 47,
lonmin = 6, lonmax = 9,
nmember = 6,
storefreq = "daily", sampleperiod = 1,
output = "lonlat"
)
Author(s)
Jost von Hardenberg j.vonhardenberg@isac.cnr.it
An-Chi Ho an.ho@bsc.es
Sample Of Experimental Precipitation Data In Function Of Longitudes And Latitudes with Start
Description
This sample data set contains a small cutout of gridded seasonal precipitation forecast data from the Copernicus Climate Change ECMWF-System 5 forecast system, to be used to demonstrate downscaling. Specifically, for the 'pr' (precipitation) variable, for the first 6 forecast ensemble members, daily values, for all 31 days in March following the forecast starting dates in November of years 2010 to 2012, for a small 4x4 pixel cutout in a region in the North-Western Italian Alps (44N-47N, 6E-9E). The data resolution is 1 degree.
Details
The 'CST_Start' call used to generate the data set in the infrastructure of the Marconi machine at CINECA is shown next, working on files which were extracted from forecast data available in the MEDSCOPE internal archive.
Value
sample s2dv_cube object
path <- paste0('/esarchive/exp/ecmwf/system5c3s/daily_mean/',
'$var$_s0-24h/$var$_$sdate$.nc')
sdates = c('20101101', '20111101', '20121101')
latmin <- 44
latmax <- 47
lonmin <- 6
lonmax <- 9
lonlat_prec_st <- CST_Start(dataset = path,
var = 'prlr',
member = startR::indices(1:6),
sdate = sdates,
ftime = 121:151,
lat = startR::values(list(latmin, latmax)),
lat_reorder = startR::Sort(decreasing = TRUE),
lon = startR::values(list(lonmin, lonmax)),
lon_reorder = startR::CircularSort(0, 360),
synonims = list(lon = c('lon', 'longitude'),
lat = c('lat', 'latitude'),
ftime = c('time', 'ftime'),
member = c('member', 'ensemble')),
return_vars = list(ftime = 'sdate',
lon = NULL, lat = NULL),
retrieve = TRUE)
Author(s)
Jost von Hardenberg j.vonhardenberg@isac.cnr.it
An-Chi Ho an.ho@bsc.es
Sample Of Experimental And Observational Climate Data In Function Of Longitudes And Latitudes
Description
This sample data set contains gridded seasonal forecast and corresponding observational data from the Copernicus Climate Change ECMWF-System 5 forecast system, and from the Copernicus Climate Change ERA-5 reconstruction. Specifically, for the 'tas' (2-meter temperature) variable, for the first 15 forecast ensemble members, monthly averaged, for the first 3 forecast time steps (lead months 1 to 3) of the November start dates of 2000 to 2005, for the Mediterranean region (27N-48N, 12W-40E). The data was generated on (or interpolated onto, for the reconstruction) a rectangular regular grid of size 360 by 181.
Details
It is recommended to use the data set as follows:
Value
sample s2dv_cube object
require(zeallot)
c(exp, obs) %<-% lonlat_temp
The 'CST_Load' call used to generate the data set in the infrastructure of the Earth Sciences Department of the Barcelona Supercomputing Center is shown next. Note that 'CST_Load' internally calls 's2dv::Load', which would require a configuration file (not provided here) expressing the distribution of the 'system5c3s' and 'era5' NetCDF files in the file system.
library(CSTools)
require(zeallot)
startDates <- c('20001101', '20011101', '20021101',
'20031101', '20041101', '20051101')
lonlat_temp <-
CST_Load(
var = 'tas',
exp = 'system5c3s',
obs = 'era5',
nmember = 15,
sdates = startDates,
leadtimemax = 3,
latmin = 27, latmax = 48,
lonmin = -12, lonmax = 40,
output = 'lonlat',
nprocs = 1
)
Author(s)
Nicolau Manubens nicolau.manubens@bsc.es
Sample Of Experimental And Observational Climate Data As A Function Of Longitudes And Latitudes with Start
Description
This sample data set contains gridded seasonal forecast and corresponding observational data from the Copernicus Climate Change ECMWF-System 5 forecast system and from the Copernicus Climate Change ERA-5 reconstruction. Specifically, it contains the 'tas' (2-meter temperature) variable, for the first 15 forecast ensemble members, monthly averaged, for the first 3 forecast time steps (lead months 1 to 3) of the November start dates of 2000 to 2005, for the Mediterranean region (27N-48N, 12W-40E). The data was generated on (or interpolated onto, for the reconstruction) a rectangular regular grid of 360 by 181 points.
Details
The 'CST_Start' call used to generate the data set in the infrastructure of the Earth Sciences Department of the Barcelona Supercomputing Center is shown next. Note that 'CST_Start' internally calls 'startR::Start' and then uses 'as.s2dv_cube', which converts the 'startR_array' into an 's2dv_cube'.
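For reference, the same conversion can be performed manually on any retrieved 'startR_array' (a sketch; 'arr' stands for the result of an arbitrary 'startR::Start' call with retrieve = TRUE):
## Not run:
arr <- startR::Start(...)             # any call with retrieve = TRUE
cube <- CSTools::as.s2dv_cube(arr)    # wraps the 'startR_array' into an 's2dv_cube'
class(cube)                           # "s2dv_cube"
## End(Not run)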
Value
A list with two sample 's2dv_cube' objects, 'exp' and 'obs'.
lonlat_temp_st <- NULL
repos_exp <- paste0('/esarchive/exp/ecmwf/system5c3s/monthly_mean/',
'$var$_f6h/$var$_$sdate$.nc')
sdates <- sapply(2000:2005, function(x) paste0(x, '1101'))
lonmax <- 40
lonmin <- -12
latmax <- 48
latmin <- 27
lonlat_temp_st$exp <- CST_Start(dataset = repos_exp,
var = 'tas',
member = startR::indices(1:15),
sdate = sdates,
ftime = startR::indices(1:3),
lat = startR::values(list(latmin, latmax)),
lat_reorder = startR::Sort(decreasing = TRUE),
lon = startR::values(list(lonmin, lonmax)),
lon_reorder = startR::CircularSort(0, 360),
synonims = list(lon = c('lon', 'longitude'),
lat = c('lat', 'latitude'),
member = c('member', 'ensemble'),
ftime = c('ftime', 'time')),
return_vars = list(lat = NULL,
lon = NULL, ftime = 'sdate'),
retrieve = TRUE)
dates <- c(paste0(2000, c(11, 12)), paste0(2001, c('01', 11, 12)),
paste0(2002, c('01', 11, 12)), paste0(2003, c('01', 11, 12)),
paste0(2004, c('01', 11, 12)), paste0(2005, c('01', 11, 12)), 200601)
dates <- sapply(dates, function(x) {paste0(x, '01')})
dates <- as.POSIXct(dates, format = '%Y%m%d')
dim(dates) <- c(ftime = 3, sdate = 6)
dates <- t(dates)
names(dim(dates)) <- c('sdate', 'ftime')
path.obs <- '/esarchive/recon/ecmwf/era5/monthly_mean/$var$_f1h-r1440x721cds/$var$_$date$.nc'
lonlat_temp_st$obs <- CST_Start(dataset = path.obs,
var = 'tas',
date = unique(format(dates, '%Y%m')),
ftime = startR::values(dates),
ftime_across = 'date',
ftime_var = 'ftime',
merge_across_dims = TRUE,
split_multiselected_dims = TRUE,
lat = startR::values(list(latmin, latmax)),
lat_reorder = startR::Sort(decreasing = TRUE),
lon = startR::values(list(lonmin, lonmax)),
lon_reorder = startR::CircularSort(0, 360),
synonims = list(lon = c('lon', 'longitude'),
lat = c('lat', 'latitude'),
ftime = c('ftime', 'time')),
transform = startR::CDORemapper,
transform_extra_cells = 2,
transform_params = list(grid = 'r360x181',
method = 'conservative'),
transform_vars = c('lat', 'lon'),
return_vars = list(lon = NULL,
lat = NULL,
ftime = 'date'),
retrieve = TRUE)
library(lubridate)
dates_exp <- lonlat_temp_st$exp$attrs$Dates
lonlat_temp_st$exp$attrs$Dates <- floor_date(ymd_hms(dates_exp), unit = "months")
dim(lonlat_temp_st$exp$attrs$Dates) <- dim(dates_exp)
dates_obs <- lonlat_temp_st$obs$attrs$Dates
lonlat_temp_st$obs$attrs$Dates <- floor_date(ymd_hms(dates_obs), unit = "months")
dim(lonlat_temp_st$obs$attrs$Dates) <- dim(dates_obs)
Author(s)
Nicolau Manubens nicolau.manubens@bsc.es
Print method for s2dv_cube objects
Description
This is an S3 method of the generic 'print' for the class 's2dv_cube'. When 'print' is called on an 's2dv_cube' object, this method displays its content in a clear and informative way.
Usage
## S3 method for class 's2dv_cube'
print(x, ...)
Arguments
x |
An 's2dv_cube' object. |
... |
Additional arguments of print function. |
Details
The object will be displayed following 's2dv_cube' class conventions. The
top-level elements are: 'Data', a multidimensional array containing the
object's data; 'Dimensions', the dimensions of the array; 'Coordinates', the
array coordinates that match its dimensions (explicit coordinates carry an
asterisk (*) at the beginning, while index coordinates do not); and
'Attributes', which contains all the metadata of the object. For more
information about the 's2dv_cube', see s2dv_cube() and
as.s2dv_cube() functions.
Value
No return value, called for side effects. It prints a summary of an 's2dv_cube' object to the console.
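As an illustration, printing one of the sample cubes shipped with the package displays these sections (a brief sketch; the exact output layout may vary between package versions):
library(CSTools)
# Explicit coordinates are marked with an asterisk (*) in the
# 'Coordinates' section of the output.
print(lonlat_temp_st$exp)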
Creation of a 's2dv_cube' object
Description
This function allows the user to create an 's2dv_cube' object by passing the corresponding information through its parameters. It is needed when the data has not been loaded with CST_Start or has been transformed with other methods. An 's2dv_cube' object has many different components, including metadata. The function can create 's2dv_cube' objects even if not all elements are defined, and it returns a warning message for each expected but missing parameter.
Usage
s2dv_cube(
data,
coords = NULL,
varName = NULL,
metadata = NULL,
Datasets = NULL,
Dates = NULL,
when = NULL,
source_files = NULL,
...
)
Arguments
data |
A multidimensional array with named dimensions, typically with dimensions: dataset, member, sdate, time, lat and lon. |
coords |
A list of named vectors with the coordinates corresponding to the dimensions of the data parameter. If any coordinate has dimensions, it will be set to NULL. If any coordinate is not provided, it is set to an index vector with the values from 1 to the length of the corresponding dimension. The attribute 'indices' indicates whether the coordinate is an index vector (TRUE) or not (FALSE). |
varName |
A character string indicating the abbreviation of the variable name. |
metadata |
A named list where each element is a variable containing the corresponding information. The information can be contained in a list of lists for each variable. |
Datasets |
Character strings indicating the names of the datasets. If there are multiple datasets, it can be a vector of their names or a list of lists with additional information. |
Dates |
A 'POSIXct' array with the temporal dimensions of the data, containing the dates. |
when |
A time stamp of the date when the data has been loaded. This parameter is also found in Load() and Start() functions output. |
source_files |
A vector of character strings with complete paths to all the found files involved in loading the data. |
... |
Additional elements to be added to the object. They will be stored at the end of the 'attrs' element. Multiple elements are accepted. |
Value
The function returns an object of class 's2dv_cube' with the following
elements in the structure:
'data', array with named dimensions;
'dims', named vector of the data dimensions;
'coords', list of named vectors with the coordinates corresponding to the dimensions of the data parameter;
'attrs', named list with elements:
'Dates', array with named temporal dimensions of class 'POSIXct' from time values in the data;
'Variable', has the following components:
'varName', with the short name of the loaded variable as specified in the parameter 'var';
'metadata', named list of elements with variable metadata. They can be from coordinates variables (e.g. longitude) or main variables (e.g. 'var');
'Datasets', character strings indicating the names of the dataset;
'source_files', a vector of character strings with complete paths to all the found files involved in loading the data;
'when', a time stamp of the date issued by the Start() or Load() call to obtain the data;
'load_parameters', it contains the components used in the arguments to load the data from Start() or Load() functions.
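A minimal sketch of the returned structure (since no coordinates are supplied, the function warns and fills 'coords' with index vectors):
library(CSTools)
exp <- s2dv_cube(data = array(rnorm(20), dim = c(time = 10, lat = 2)))
names(exp)        # "data" "dims" "coords" "attrs"
exp$dims          # time = 10, lat = 2
exp$coords$time   # index vector 1:10 with attribute 'indices' = TRUE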
Author(s)
Perez-Zanon Nuria, nuria.perez@bsc.es
Examples
exp_original <- 1:100
dim(exp_original) <- c(lat = 2, time = 10, lon = 5)
exp1 <- s2dv_cube(data = exp_original)
class(exp1)
coords <- list(lon = seq(-10, 10, 5), lat = c(45, 50))
exp2 <- s2dv_cube(data = exp_original, coords = coords)
class(exp2)
metadata <- list(tas = list(level = '2m'))
exp3 <- s2dv_cube(data = exp_original, coords = coords,
varName = 'tas', metadata = metadata)
class(exp3)
Dates <- as.POSIXct(paste0(rep("01", 10), rep("01", 10), 1990:1999), format = "%d%m%Y")
dim(Dates) <- c(time = 10)
exp4 <- s2dv_cube(data = exp_original, coords = coords,
varName = 'tas', metadata = metadata,
Dates = Dates)
class(exp4)
exp5 <- s2dv_cube(data = exp_original, coords = coords,
varName = 'tas', metadata = metadata,
Dates = Dates, when = "2019-10-23 19:15:29 CET")
class(exp5)
exp6 <- s2dv_cube(data = exp_original, coords = coords,
varName = 'tas', metadata = metadata,
Dates = Dates,
when = "2019-10-23 19:15:29 CET",
source_files = c("/path/to/file1.nc", "/path/to/file2.nc"))
class(exp6)
exp7 <- s2dv_cube(data = exp_original, coords = coords,
varName = 'tas', metadata = metadata,
Dates = Dates,
when = "2019-10-23 19:15:29 CET",
source_files = c("/path/to/file1.nc", "/path/to/file2.nc"),
Datasets = list(
exp1 = list(InitializationsDates = list(Member_1 = "01011990",
Members = "Member_1"))))
class(exp7)
dim(exp_original) <- c(dataset = 1, member = 1, time = 10, lat = 2, lon = 5)
exp8 <- s2dv_cube(data = exp_original, coords = coords,
varName = 'tas', metadata = metadata,
Dates = Dates, original_dates = Dates)
class(exp8)
AEMET Training: training method (pre-downscaling) based on analogs: synoptic situations and significant predictors.
Description
This function characterizes the synoptic situations in a past period based on low-resolution reanalysis data (e.g., ERA-Interim at 1.5º x 1.5º) and an observational high-resolution (HR) dataset (AEMET 5 km gridded daily precipitation and maximum and minimum temperature) (Peral et al., 2017). The method uses three domains:
peninsular Spain and Balearic Islands domain (5 km resolution): the HR domain.
synoptic domain (low resolution): it should be centered over the Iberian Peninsula and cover enough extension to detect as many synoptic situations as possible.
extended domain (low resolution): an extension of the synoptic domain, used for the 'slp_ext' parameter (see 'slp_lon' and 'slp_lat' below).
Usage
training_analogs(
pred,
slp_ext,
lon,
lat,
slp_lon,
slp_lat,
var,
HR_path,
tdates
)
Arguments
pred |
List of reanalysis data matrices in a synoptic domain. The list has to contain reanalysis atmospheric variables (instantaneous 12h data), each identified by its name in the list. For precipitation:
For maximum and minimum temperature:
All matrices must have [time, gridpoint] dimensions (time = number of training days, gridpoint = number of synoptic gridpoints). |
slp_ext |
Matrix with atmospheric reanalysis sea level pressure (instantaneous 12h data) (hPa). It has the same resolution as the 'pred' parameter but with an extended domain. This domain contains extra degrees (mostly in the north and west) compared to the synoptic domain. The matrix must have [time, gridpoint] dimensions (time = number of training days, gridpoint = number of extended gridpoints). |
lon |
Vector of the synoptic longitudes (from -180º to 180º). The vector must go from west to east. |
lat |
Vector of the synoptic latitude. The vector must go from north to south. |
slp_lon |
Vector of the extended longitudes (from -180º to 180º). The vector must go from west to east. |
slp_lat |
Vector of the extended latitude. The vector must go from north to south. |
var |
Variable name to downscale. There are two options: 'prec' for precipitation and 'temp' for maximum and minimum temperature. |
HR_path |
Local path of the HR observational files (maestro and pcp/tmx-tmn). The precipitation and temperature data can be downloaded from https://www.aemet.es/en/serviciosclimaticos/cambio_climat/datos_diarios?w=2. The maestro file (maestro_red_hr_SPAIN.txt) contains gridpoint (nptos), longitude (lon), latitude (lat) and altitude (alt) in columns (vector structure). The data files (pcp/tmx/tmn_red_SPAIN_1951-201903.txt) include 5 km resolution Spanish daily data (precipitation or maximum and minimum temperature) from January 1951 to June 2020. See the README file for more information. IMPORTANT: the HR observational period must be the same as for the reanalysis variables. It is assumed that the training period is smaller than the HR original one (1951-2020), so a new ascii file must be created with the new period and the same structure as the original, specifying the training dates ('tdates' parameter) in the name (e.g. 'pcp_red_SPAIN_19810101-19961231.txt' for the '19810101-19961231' period). |
tdates |
Training period dates in format YYYYMMDD(start)-YYYYMMDD(end) (e.g. 19810101-19961231). |
Value
A list of matrices (e.g. restrain) resulting from the characterization of the past synoptic situations and the significant predictors needed to downscale seasonal forecast variables. For precipitation the output includes:
'um': u component of geostrophic wind in all period (numeric matrix with [time, gridpoint] dimensions).
'vm': v component of geostrophic wind in all period (numeric matrix with [time,gridpoint] dimensions).
'nger': number of synoptic situations (integer).
'gu92': u component of geostrophic wind for each synoptic situation (numeric matrix with [nger,gridpoint] dimensions).
'gv92': v component of geostrophic wind for each synoptic situation (numeric matrix with [nger, gridpoint] dimensions).
'gu52': u component of wind at 500 hPa for each synoptic situation (numeric matrix with [nger, gridpoint] dimensions).
'gv52': v component of wind at 500 hPa for each synoptic situation (numeric matrix with [nger, gridpoint] dimensions).
'neni': number of reference centers where predictors are calculated (integer).
'vdmin': minimum distances between each HR gridpoint and the four nearest synoptic gridpoints (numeric matrix with [nptos,4] dimensions) (nptos = number of HR gridpoints).
'vref': four nearest synoptic gridpoints to each HR gridpoint (integer matrix with [nptos, 4] dimensions).
'ccm': multiple correlation coefficients (numeric matrix with [nger, nptos] dimensions). For the selected predictors:
'lab_pred': numeric labels of selected predictors (integer matrix with [nger,nptos,11,1] dimensions).
'cor_pred': partial correlation of selected predictors (numeric matrix with [nger,nptos,11,2] dimensions).
For maximum and minimum temperature the output includes:
'um': u component of geostrophic wind in all training period (numeric matrix with [time,gridpoint] dimensions).
'vm': v component of geostrophic wind in all training period (numeric matrix with [time,gridpoint] dimensions).
'insol': insolation in all training period (numeric vector with [time] dimension).
'neni': number of reference centers where predictors are calculated (integer).
'vdmin': minimum distances between each HR gridpoint and the four nearest synoptic gridpoints (numeric matrix with [nptos,4] dimensions) (nptos = number of HR gridpoints).
'vref': four nearest synoptic gridpoints to each HR gridpoint (integer matrix with [nptos,4] dimensions).
The output can be used directly as an argument to the 'CST_AnalogsPredictors' function (e.g. resdowns <- CST_AnalogsPredictors(..., restrain)).
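A schematic sketch of a training call and its downstream use (not runnable as-is: 'pred_list', 'slp_matrix' and the coordinate vectors are hypothetical objects to be prepared by the user, and 'HR_path' is a placeholder for the local AEMET files described above):
## Not run:
restrain <- training_analogs(pred = pred_list,       # named list of [time, gridpoint] matrices
                             slp_ext = slp_matrix,   # [time, gridpoint] matrix on the extended domain
                             lon = lon_syn, lat = lat_syn,
                             slp_lon = lon_ext, slp_lat = lat_ext,
                             var = 'prec',
                             HR_path = '/path/to/HR/files/',
                             tdates = '19810101-19961231')
# The output feeds the downscaling step:
resdowns <- CST_AnalogsPredictors(..., restrain)
## End(Not run)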
Author(s)
Marta Dominguez Alonso - AEMET, mdomingueza@aemet.es
Nuria Perez-Zanon - BSC, nuria.perez@bsc.es