Using the eFCM Package: An Example with European Winter Precipitation

The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

Introduction

This vignette illustrates a complete workflow for fitting the exponential factor copula model using the eFCM package, including data preprocessing, model fitting, diagnostic checking, and simulation. We illustrate the workflow using precipitation data from a counterfactual climate scenario, chosen here purely as an example dataset.

Exponential factor copula model

The exponential factor copula model (eFCM; Castro-Camilo and Huser, 2020) is stochastically defined as

\[ W(s) = Z(s) + V, \]

where:

\(Z(s)\), \(s \in \boldsymbol{S} \subset \mathbb{R}^2\), is a standard Gaussian process with stationary correlation function \(\rho(h)\),
\(h = \| s_1 - s_2 \|\) is the distance between two locations \(s_1, s_2 \in \boldsymbol{S}\),
\(V \sim \text{Exp}(\lambda)\) is an exponentially distributed random variable with rate parameter \(\lambda\), independent of both \(Z(s)\) and the spatial location \(s\).

The process W(s) belongs to the class of asymptotically dependent models and may be viewed as a Gaussian location mixture. Dependence is introduced through the common factor V, which induces asymptotic dependence while allowing flexible representation of sub-asymptotic extremal dependence. Because this factor does not vary spatially, the model is particularly well-suited to spatially homogeneous regions, where it is reasonable to assume similar marginal distributions and tail behavior across sites.

If \(Z_j = Z(s_j)\), for \(j = 1,\ldots,d\), the multivariate latent Gaussian vector \(\boldsymbol{Z} = (Z_1, \dots, Z_d)^\top\) follows a multivariate normal distribution, \(\boldsymbol{Z} \sim \Phi(\cdot; \Sigma_Z),\), where the covariance matrix \(\Sigma_Z\) is determined by the correlation function \(\rho(h)\). In this application, we adopt the exponential correlation function,a special case of the Matérn class, \[ \rho(h) = \exp\left(-\frac{h}{\delta}\right), \]

where \(\delta > 0\) is a range parameter controlling the spatial decay. With this, the joint distribution of the process \(W\) is \[F_d^W(\mathbf{w}) = \Phi_{D}(\mathbf{w};\mathbf{\Sigma}_{\mathbf{Z}}) - \sum_{j = 1}^D\exp\left(\frac{\lambda^2}{2} - \lambda w_j\right)\Phi_{D}\left(\mathbf{q}_{j,0}- \mathbf{\mu}_{j,0};\mathbf\Omega_{j,0}\right),\] where \[ \mathbf{q}_{j,0} = \begin{pmatrix}\mathbf{w}_{-j} - \mathbf{\Sigma}_{\mathbf{Z};-j,j}w_j\\ 0\end{pmatrix}, \quad \mathbf{\mu}_{j,0} = \begin{pmatrix} (w_j - \lambda)(\mathbf{1}_{d-1} - \mathbf{\Sigma}_{\mathbf{Z};-j,j})\\ \lambda -w_j \end{pmatrix},\] \[\mathbf\Omega_{j,0} =\begin{pmatrix} \mathbf{1}_{d-1}\mathbf{1}_{d-1}^T - 2\mathbf{\Sigma}_{\mathbf{Z};-j,j}+\mathbf{\Sigma}_{\mathbf{Z};-j,-j}&\mathbf{\Sigma}_{\mathbf{Z}; -j,j}-\mathbf{1}_{d-1}\\\mathbf{\Sigma}_{\mathbf{Z}; -j,j}-\mathbf{1}_{d-1}&1\end{pmatrix}, \] In particular, by setting \(d=1\), the marginal distribution may be written as \[F_1^{\mathbf{W}}(w;\lambda) = \Phi(w) - \exp(\lambda^2/2 - \lambda w)\Phi(w - \lambda).\]

Data Preparation

We start by loading the package and the pre-processed weekly counterfactual precipitation data. Specifically, counterfactual contains weekly maxima of precipitation under natural-only forcing, and LonLat provides spatial coordinates for each station.

library(eFCM)

# Load weekly precipitation maxima (pre-aggregated)
data("counterfactual")  # matrix/data frame: [weeks × stations]

# Load station coordinates
data("LonLat")
coord <- LonLat

C:\2592713L3906b7fc4.R

We can visualize how spatially averaged weekly maxima evolve over time:

plot(1:nrow(counterfactual), apply(counterfactual, 1, mean),
     type = "l", xlab = "Week", ylab = "Mean Precipitation (mm)",
     main = "Weekly Maxima of Counterfactual Precipitation")
abline(h = quantile(apply(counterfactual, 1, mean), 0.9), col = "red", lty = 2)

C:\2592713L3906b7fc4.R

Create Spatial Data Objects

The function fdata() converts the spatiotemporal precipitation data into the format required for model fitting:

cf_data = fdata(counterfactual, coord, cellsize = c(1, 1))

C:\2592713L3906b7fc4.R To reduce vignette build time, we load precomputed results:

data(cf_data)
data(fit)

C:\2592713L3906b7fc4.R

Model Fitting

We fit the exponential factor copula model to the counterfactual data using fcm():

fit = fcm(s = 1, cf_data, boot = T, R = 1000)

C:\2592713L3906b7fc4.R Here, s = 1 is the index of the grid point and the and the exceedance threshold is set via thres, a quantile level in (0,1) (default 0.9), and boot = TRUE, R = 1000 requests 1,000 bootstrap replications for uncertainty quantification.

Model Summary and Diagnostics

summary(fit)
#> 
#> === Summary of Factor Copula Model Fit ===
#> Grid location: ID = 1 | Longitude = -7.844 | Latitude = 37.120
#> 
#> Number of neighbors: 2
#> Neighbor coordinates:
#>   - ID = 1 | Longitude = -8.438 | Latitude = 37.120
#> 
#> Model Coefficients (95% CI):
#> $`Estimate (Bootstrap)`
#>          par      lower      upper
#> 1   2.519319  0.5087895   35.00694
#> 2 791.454198 13.5021393 1647.42038
#> 
#> 
#> Objective value (negative log-likelihood):
#> [1] -18.10454
#> 
#> Information criteria:
#>       AIC       BIC      AICc 
#> -32.20908 -24.29223 -32.17783 
#> 
#> Fitting arguments:
#> $thres
#> [1] 0.9
#> 
#> $nu
#> [1] 0.5
#> 
#> $censorL
#> [1] TRUE
#> 
#> === End of Summary ===

C:\2592713L3906b7fc4.R

We can extract point estimates of the parameters. Recall that the parameter estimates (\(\lambda\) and \(\delta\)) describe the strength of the common factor and the spatial range of dependence, respectively:

coef(fit, method = "hessian")
#>          par       lower      upper
#> 1   2.519319   0.2104598   30.15764
#> 2 791.454198 133.2100399 4702.34637
coef(fit, method = "boot")
#>          par      lower      upper
#> 1   2.519319  0.5087895   35.00694
#> 2 791.454198 13.5021393 1647.42038

C:\2592713L3906b7fc4.R

We can also compute the usual model selection criteria indices:

logLik(fit)
#> [1] 18.10454
AIC(fit)
#>       AIC 
#> -32.20908
BIC(fit)
#>       BIC 
#> -24.29223
AICc(fit)
#>      AICc 
#> -32.17783

C:\2592713L3906b7fc4.R

Goodness-of-Fit

We can draw Q-Q plots to compare empirical and model-based upper tail quantiles for a given station. By default, qqplot() also overlays a generalized Pareto distribution (GPD) fit; this can be suppressed by setting gpd = FALSE.

qqplot(fit, which = 1)

C:\2592713L3906b7fc4.R

We can also assess extremal dependence using the conditional exceedance probability \(\chi_h(u)\), which measures the probability of simultaneous exceedances at high but finite thresholds. For two locations \(s_1\) and \(s_2\) separated by distance \(h\), with respective vector components \(W(s_1)\) and \(W(s_2)\), \(\chi_h(u)\) is defined as \(\chi_h(u)= \lim_{u\to1}\Pr(W_{s_1}(W(s_1))>u\mid F_{s_2}(W(s_2))>u)\). The function chiplot() in eFCM plots model-based estimated of \(\chi_h(u)\) alongside their empirical counterpart for a range of values of \(u\). The package also provides two methods for pointwise confidence intervals: a normal approximation to the MLE or bootstrap resampling. Both approaches are illustrated below.

chiplot(fit, method = "hessian", ylim = c(0, 1))

C:\2592713L3906b7fc4.R

chiplot(fit, method = "boot", ylim = c(0, 1))

C:\2592713L3906b7fc4.R

Simulation from the Fitted Model

To generate synthetic precipitation fields consistent with the estimated extremal dependence structure, we can simulate from the fitted model as follows:

lbda <- fit$par[1]
delta <- fit$par[2]
dist <- rdist.earth(fit$coord)
sim  <- rmfcm(lbda, delta, dist, n = 1e4)

C:\2592713L3906b7fc4.R

Summary

This vignette has demonstrated the complete pipeline for fitting the exponential factor copula model using the eFCM package. It covered preprocessing of data (in this case, climate model outputs), model estimation, diagnostic checks, and simulation. For a more detailed discussion of the method and its application to extreme event attribution, see Li and Castro-Camilo (2025+).

References

Castro-Camilo, D., & Huser, R. (2020). Local likelihood estimation of complex tail dependence structures, with application to U.S. precipitation extremes. Journal of the American Statistical Association, 115(531), 1037–1054. https://doi.org/10.1080/01621459.2019.1611584

Li, M., & Castro-Camilo, D. (2025+). On the importance of tail assumptions in climate extreme event attribution. arXiv preprint. https://doi.org/10.48550/arXiv.2507.14019

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.