The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
library(MEFM)
In ‘MEFM’, we mainly consider a main effect matrix factor model
(MEFM) such that for any time \(t\in[T]\), \[
\mathbf{Y}_t = \mu_t \mathbf{1}_p \mathbf{1}_q^\top +
\boldsymbol{\alpha}_t \mathbf{1}_q^\top + \mathbf{1}_p
\boldsymbol{\beta}_t^\top + \mathbf{A}_r \mathbf{F}_t \mathbf{A}_c^\top
+ \mathbf{E}_t ,
\] where \(\mathbf{Y}_t\) is an
observation of dimension \(p\times q\),
\(\mu_t\) is a scalar representing the
time-varying grand mean, \(\boldsymbol{\alpha}_t \in \mathbb{R}^p\) is
the time-varying row effect, \(\boldsymbol{\beta}_t \in \mathbb{R}^q\) is
the time-varying column effect, \(\mathbf{E}_t\) is the noise matrix with the
same dimension as the observation, and \(\mathbf{A}_r \mathbf{F}_t
\mathbf{A}_c^\top\) is the common component in tradition factor
models with \(\mathbf{A}_r \in
\mathbb{R}^{p\times k_r}\), \(\mathbf{A}_c \in \mathbb{R}^{q\times
k_c}\), \(\mathbf{F}_t \in
\mathbb{R}^{k_r \times k_c}\) being the row loading, column
loading and core factors, respectively.
We may simulate such model series by using the base class ‘array’ in R.
An example data is generated using the ‘gen_MEFM’ function below. For
details of the data generation mechanism, see Lam
and Cen (2024). For reproducibility, a seed parameter is
required, which is 2024 by default.
<- 60 #time length
TT <- c(60,60) #spatial dimensions
d <- c(2,2) #rank of core tensor
r <- c(2,2) #rank of common component in error
re <- list(c(0,0), c(0,0)) #strong factors
eta <- c(0.7, 0.3, -0.4, 0.2, -0.1) #AR(5) coefficients for core factor
coef_f <- c(-0.7, -0.3, -0.4, 0.2, 0.1) #AR(5) coefficients for common component in error
coef_fe <- c(0.8, 0.4, -0.4, 0.2, -0.1) #AR(5) coefficients for idiosyncratic part in error
coef_e <- c(0,1) #Normal mean and variance for mu
param_mu <- c(0,1) #Normal mean and variance for alpha
param_alpha <- c(0,1) #Normal mean and variance for beta
param_beta <- gen_MEFM(TT,d,r,re,eta, coef_f, coef_fe, coef_e, param_mu, param_alpha, param_beta) data_example
With the ‘est_MEFM’ function, the factor structure can be estimated in one go. The number of factors could be either provided or not provided, in the latter case the function estimates the number of factors using the eigenvalue-ratio-based estimator. For the details of estimation, see Lam and Cen (2024). As an example, we show the estimated number of factors and also the estimation error.
<- est_MEFM(data_example$MEFM)
est_result
paste0('Estimated number of factors: ', est_result$r[1], ', ', est_result$r[2])
#> [1] "Estimated number of factors: 2, 2"
paste0('Estimation error for mu: ', sum((est_result$mu - data_example$mu)^2)/TT)
#> [1] "Estimation error for mu: 0.000204405935075275"
paste0('Estimation error for alpha: ', sum((est_result$alpha - data_example$alpha)^2)/prod(dim(data_example$alpha)))
#> [1] "Estimation error for alpha: 0.0183163947105022"
paste0('Estimation error for beta: ', sum((est_result$beta - data_example$beta)^2)/prod(dim(data_example$beta)))
#> [1] "Estimation error for beta: 0.0169548243336111"
paste0('Estimation error for row factor loading: ', tensorMiss::fle(est_result$A[[1]], data_example$A[[1]]))
#> [1] "Estimation error for row factor loading: 0.0220912835157286"
paste0('Estimation error for column factor loading: ', tensorMiss::fle(est_result$A[[2]], data_example$A[[2]]))
#> [1] "Estimation error for column factor loading: 0.0212000969244649"
paste0('Estimation error for Yt: ', sum((est_result$Yt - (data_example$MEFM - data_example$E) )^2)/prod(dim(est_result$Yt)))
#> [1] "Estimation error for Yt: 0.040113558882167"
We illustrate below how to perform testing on \(\mu_t\), \(\boldsymbol{\alpha}_t\) and \(\boldsymbol{\beta}_t\) using the asymptotic normality of the estimated parameters. First, consider testing \(\mathcal{H}_0: \mu_{10} = \text{-}1.15\) over \(\mathcal{H}_1: \mu_{10} \neq \text{-}1.15\), we construct the test statistic as the following.
<- 10 #select a time point
time_select <- (est_result$Yt - data_example$MEFM)[time_select,,]
hat.E <- make_gamma(hat.E, type = 'mu')
gamma_mu <- prod(d)^0.5 * gamma_mu^(-1) * (est_result$mu[time_select] - (-1.15))
asymp_i
paste0('The true mu at time 10 is: ', data_example$mu[time_select])
#> [1] "The true mu at time 10 is: -1.12135130216029"
paste0('The test statistic is: ', asymp_i )
#> [1] "The test statistic is: 2.22064506054644"
As the test statistic is outside of \([\text{-}1.96, 1.96]\), we reject the null hypothesis under \(5\%\) significance level. For \(\boldsymbol{\alpha}_t\) and \(\boldsymbol{\beta}_t\), we only demonstrate their asymptotic normality here. Consider the first three entries at time \(t=10\), we have
<- matrix(0, nrow=3, ncol=3)
gamma_alpha_inv for (j in 1:3){ gamma_alpha_inv[j,j] <- (make_gamma(hat.E, type = 'alpha', j))^(-1) }
<- (d[2])^0.5 * gamma_alpha_inv %*% (est_result$alpha[time_select, 1:3] - data_example$alpha[time_select, 1:3])
asymp_alpha_example
<- matrix(0, nrow=3, ncol=3)
gamma_beta_inv for (j in 1:3){ gamma_beta_inv[j,j] <- (make_gamma(hat.E, type = 'beta', j))^(-1) }
<- (d[1])^0.5 * gamma_beta_inv %*% (est_result$beta[time_select, 1:3] - data_example$beta[time_select, 1:3])
asymp_beta_example
paste0('The test statistic using true alpha: ', asymp_alpha_example[1], ', ', asymp_alpha_example[2], ', ', asymp_alpha_example[3])
#> [1] "The test statistic using true alpha: -1.10145252408513, 0.914153410430621, 0.425092850778305"
paste0('The test statistic using true beta: ', asymp_beta_example[1], ', ', asymp_beta_example[2], ', ', asymp_beta_example[3])
#> [1] "The test statistic using true beta: 0.531048546382394, -0.354637264024393, 0.684069398353172"
Under certain conditions (Lam and Cen 2024), the residue between the row of the estimated loading matrix and its corresponding true row under some rotation can be shown to be asymptotically normal. A consistent covariance matrix estimator can be computed by the ‘sigmaD_MEFM’ function. For instance, to compute the residue on the first row of the estimated column loading matrix, the covariance matrix estimator is obtained below. The rotation matrix is also computed afterwards.
# computing the covariance matrix estimator
<- r[2]
r2 <- floor(0.2*((TT * prod(d))^0.25))
eta <- diag(x=(svd(est_result$covMatrix[[2]])$d)[1:r2], nrow=r2, ncol=r2)
D2 # HAC_cov: HAC-type covariance matrix estimator
<- sigmaD_MEFM(2, D2, est_result$A[[2]], est_result$Ct, data_example$MEFM - est_result$Yt, 1, eta)
HAC_cov
# computing the rotation matrix
<- data_example$A[[2]]
A2 <- data_example$A[[1]]
Amk_i <- 0
R_ast_i for (tt in 1:TT){
<- R_ast_i + tensorMiss::unfold(matrix(data_example$Ft[tt,,], ncol=r2),2) %*% t(Amk_i) %*% Amk_i %*% t(tensorMiss::unfold(matrix(data_example$Ft[tt,,], ncol=r2),2))
R_ast_i
}<- A2 %*% R_ast_i %*% t(A2)
R_ast_i <- R_ast_i/TT
R_ast_i <- diag(x = diag(t(A2) %*% A2), nrow=r2, ncol=r2)
Z2_i <- A2 %*% diag(x=diag(solve(Z2_i))^0.5, nrow=r2, ncol=r2)
Q2_i # H2_i: rotation matrix
<- solve(D2) %*% t(est_result$A[[2]]) %*% R_ast_i %*% Q2_i %*% solve(t(Q2_i)%*% Q2_i) H2_i
Eventually, the standardised residue is shown below and should follow a standard normal distribution when dimensions are increased.
<- eigen(HAC_cov)
HAC_cov.eigen <- HAC_cov.eigen$vectors %*% diag(sqrt(HAC_cov.eigen$values)) %*% solve(HAC_cov.eigen$vectors)
HAC_cov.sqrt <- TT * (solve(HAC_cov.sqrt) %*% D2) %*% (matrix(est_result$A[[2]], nrow=d[2], ncol=r2)[1,] - (H2_i %*% Q2_i[1,]))
A2_1
A2_1#> [,1]
#> [1,] 0.1819581
#> [2,] -0.5004283
This section acts as a toy example on how to test the necessity of using MEFM instead of traditional factor models (FM). Before that, if the data is indeed generated using FM, we can see that MEFM will not make the estimation worse off and actually it is always better to use MEFM.
paste0('MSE of estimating MEFM on FM-generated data: ', sum((est_MEFM(data_example$FM)$Yt - data_example$FM)^2) / sum(data_example$FM^2))
#> [1] "MSE of estimating MEFM on FM-generated data: 0.172853133412508"
paste0('MSE of estimating FM on FM-generated data: ', sum((est_FM(data_example$FM)$Ct - data_example$FM)^2) / sum(data_example$FM^2))
#> [1] "MSE of estimating FM on FM-generated data: 0.179171222681373"
However, if the data has MEFM structure, then estimation based on FM is hardly satisfying even if we use much larger number of factors, as shown below.
paste0('MSE of estimating MEFM on MEFM-generated data: ', sum((est_MEFM(data_example$MEFM)$Yt - data_example$MEFM)^2) / sum(data_example$MEFM^2))
#> [1] "MSE of estimating MEFM on MEFM-generated data: 0.11303111038302"
paste0('MSE of estimating FM on MEFM-generated data (using number of factors as (3,3)): ', sum((est_FM(data_example$MEFM)$Ct - data_example$MEFM)^2) / sum(data_example$MEFM^2))
#> [1] "MSE of estimating FM on MEFM-generated data (using number of factors as (3,3)): 0.331459654558444"
paste0('MSE of estimating FM on MEFM-generated data (using number of factors as (30,30)): ', sum((est_FM(data_example$MEFM, r=c(30,30))$Ct - data_example$MEFM)^2) / sum(data_example$MEFM^2))
#> [1] "MSE of estimating FM on MEFM-generated data (using number of factors as (30,30)): 0.11429915366256"
It is hence worthy to test if using FM with slightly larger number of factors is sufficient. According to Lam and Cen (2024), we perform testing as follows.
<- est_MEFM(data_example$MEFM)
MEFM_on_MEFM <- est_FM(data_example$MEFM, r=c(MEFM_on_MEFM$r[1] +1, MEFM_on_MEFM$r[1] +1))
FM_on_MEFM
<- make_xy(MEFM_on_MEFM$Yt - data_example$MEFM, type = 'alpha')
x_alpha <- make_xy(FM_on_MEFM$Ct - data_example$MEFM, type = 'alpha')
y_alpha <- make_xy(MEFM_on_MEFM$Yt - data_example$MEFM, type = 'beta')
x_beta <- make_xy(FM_on_MEFM$Ct - data_example$MEFM, type = 'beta')
y_beta
paste0('Computed alpha_reject: ', sum(y_alpha >= qHat(x_alpha, 0.95))/length(y_alpha))
#> [1] "Computed alpha_reject: 1"
paste0('Computed beta_reject: ', sum(y_beta >= qHat(x_beta, 0.95))/length(y_beta))
#> [1] "Computed beta_reject: 1"
We expect the computed \(\boldsymbol{\alpha}_\text{reject}\) and \(\boldsymbol{\beta}_\text{reject}\) to be close to \(5\%\) if FM is sufficient, and clearly it is not in our case.
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.