In the basic set-up, M-estimation applies to estimators of the \(p \times 1\) parameter \(\theta=(\theta_1, \theta_2, \dots, \theta_p)^{\intercal}\) which can be obtained as solutions to an equation of the form
\[\begin{equation} \label{eq:psi} \sum_{i = 1}^m \psi(O_i, \theta) = 0, \end{equation}\]
where \(O_1, \ldots, O_m\) are \(m\) independent and identically distributed (iid) random variables, and the function \(\psi\) returns a vector of length \(p\) and does not depend on \(i\) or \(m\) (Stefanski and Boos 2002, hereafter SB). See SB for the case where the \(O_i\) are independent but not necessarily identically distributed. The roots of (1) are referred to as M-estimators and denoted by \(\hat \theta\). M-estimators can be solved for analytically in some cases or computed numerically in general. Under certain regularity conditions, the asymptotic properties of \(\hat{\theta}\) can be derived from Taylor series approximations, the law of large numbers, and the central limit theorem. In brief, let \(\theta_0\) be the true parameter value defined by \(\int \psi(o, \theta_0) dF(o) = 0\), where \(F\) is the distribution function of \(O\). Let \(\dot{\psi}(o, \theta) = \partial \psi(o, \theta)/\partial \theta^{\intercal}\), \(A(\theta_0) = E[-\dot{\psi}(O_1, \theta_0)]\), and \(B(\theta_0) = E[\psi(O_1, \theta_0)\psi(O_1, \theta_0)^{\intercal}]\). Then under suitable regularity assumptions, \(\hat{\theta}\) is consistent and asymptotically Normal, i.e.,
\[\begin{equation*} \label{eq:variance} \sqrt{m}(\hat{\theta} - \theta_0) \overset{d}{\to} N(0, V(\theta_0)) \text{ as } m \to \infty, \end{equation*}\]
where \(V(\theta_0) = A(\theta_0)^{-1} B(\theta_0) \{A(\theta_0)^{-1} \}^{\intercal}\). The sandwich form of \(V(\theta_0)\) suggests several possible large sample variance estimators. For some problems, the analytic form of \(V(\theta_0)\) can be derived and estimators of \(\theta_0\) and other unknowns simply plugged into \(V(\theta_0)\). Alternatively, \(V(\theta_0)\) can be consistently estimated by the empirical sandwich variance estimator, where the expectations in \(A(\theta)\) and \(B(\theta)\) are replaced with their empirical counterparts. Let \(A_i = - \dot{\psi}(O_i, \theta)|_{\theta = \hat{\theta}}\), \(A_m = m^{-1} \sum_{i = 1}^m A_i\), \(B_i = \psi(O_i, \hat{\theta}) \psi(O_i, \hat{\theta})^{\intercal}\), and \(B_m = m^{-1} \sum_{i = 1}^m B_i\). The empirical sandwich estimator of the variance of \(\hat{\theta}\) is:
\[\begin{equation} \label{eq:esve} \hat{\Sigma} = A_m^{-1} B_m \{A_m^{-1}\}^{\intercal}/m . \end{equation}\]
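To make the recipe in (2) concrete, the following is a minimal base-R sketch of the empirical sandwich variance estimator for the simplest case, the sample mean, where \(\psi(Y_i, \theta) = Y_i - \theta\). It uses a central finite difference in place of the analytic derivative; the simulated data, \(\psi\) function, and step size are illustrative choices, not part of any package.

```r
# Empirical sandwich variance estimator, equation (2), for the sample mean.
# psi, the simulated data, and the step size h are illustrative choices.
psi <- function(y, theta) y - theta

set.seed(1)
y <- rnorm(100, mean = 5, sd = 4)
m <- length(y)
theta_hat <- mean(y)  # root of sum_i psi(y_i, theta) = 0

# A_m: average of -d psi/d theta at theta_hat, via central finite difference
h  <- 1e-6
Am <- mean(-(psi(y, theta_hat + h) - psi(y, theta_hat - h)) / (2 * h))

# B_m: average of psi(y_i, theta_hat)^2
Bm <- mean(psi(y, theta_hat)^2)

# Sandwich estimate: A_m^{-1} B_m A_m^{-1} / m
Sigma_hat <- Bm / (Am^2 * m)
```

For this linear \(\psi\), \(A_m = 1\) and \(\hat{\Sigma}\) reduces to the (biased) sample variance divided by \(m\), which is the familiar variance estimate for a sample mean.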
The `geex` package provides an application programming interface (API) for carrying out M-estimation. The analyst provides a function, called `estFUN`, corresponding to \(\psi(O_i, \theta)\) that maps data \(O_i\) to a function of \(\theta\). Numerical derivatives approximate \(\dot{\psi}\), so that evaluating \(\hat{\Sigma}\) is entirely a computational exercise. No analytic derivations are required from the analyst.
Consider estimating the population mean \(\theta = E[Y_i]\) using the sample mean \(\hat{\theta} = m^{-1} \sum_{i=1}^m Y_i\) of \(m\) iid random variables \(Y_1, \dots, Y_m\). The estimator \(\hat{\theta}\) can be expressed as the solution to the following estimating equation:
\[ \sum_{i = 1}^m (Y_i - \theta) = 0, \]
which is equivalent to solving \(\eqref{eq:psi}\) with \(O_i = Y_i\) and \(\psi(O_i, \theta) = Y_i - \theta\). An `estFUN` is a translation of \(\psi\) into a function whose first argument is `data` and which returns a function whose first argument is `theta`. An `estFUN` corresponding to the estimating equation for the sample mean of \(Y\) is:
```r
meanFUN <- function(data){
  function(theta){
    data$Y - theta
  }
}
```
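One can see the root-finding step in miniature with base R's `uniroot` before handing anything to `geex`: the root of the summed estimating function is exactly the sample mean. This is a stand-alone sketch with simulated data, independent of the package.

```r
# Solving sum_i (Y_i - theta) = 0 numerically, as a stand-alone sketch of
# the root-finding step that m_estimate delegates to a solver.
# The simulated data are illustrative.
set.seed(2)
Y <- rnorm(50, mean = 3)

G <- function(theta) sum(Y - theta)  # the summed estimating function
root <- uniroot(G, interval = range(Y), tol = 1e-10)$root

# root agrees with the closed-form solution mean(Y) up to solver tolerance
```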
If an estimator fits into the above framework, then the user need only specify `estFUN`. No other programming is required to obtain point and variance estimates. The remaining sections provide examples of translating \(\psi\) into an `estFUN`.
The three examples of M-estimation from SB are presented here for demonstration. For these examples, the data are \(O_i = \{Y_{1i}, Y_{2i}\}\), where \(Y_1 \sim N(5, 16)\) and \(Y_2 \sim N(2, 1)\) for \(m = 100\), and are included in the `geexex` dataset.
The first example estimates the population mean (\(\theta_1\)) and variance (\(\theta_2\)) of \(Y_1\). The solutions to the estimating equations below are the sample mean \(\hat{\theta}_1 = m^{-1} \sum_{i = 1}^m Y_{1i}\) and the sample variance \(\hat{\theta}_2 = m^{-1} \sum_{i = 1}^m (Y_{1i} - \hat{\theta}_1)^2\).
\[ \psi(Y_{1i}, \theta) = \begin{pmatrix} Y_{1i} - \theta_1 \\ (Y_{1i} - \theta_1)^2 - \theta_2 \end{pmatrix} \]
```r
SB1_estfun <- function(data){
  Y1 <- data$Y1
  function(theta){
    c(Y1 - theta[1],
      (Y1 - theta[1])^2 - theta[2])
  }
}
```
The primary `geex` function is `m_estimate`, which requires two inputs: `estFUN` (the \(\psi\) function) and `data` (the data frame containing \(O_i\) for \(i = 1, \dots, m\)). The package defaults to `rootSolve::multiroot` for estimating roots of \(\eqref{eq:psi}\), though the solver algorithm can be specified in the `root_control` argument. Starting values for `rootSolve::multiroot` are passed via the `root_control` argument; e.g.,
```r
library(geex)

results <- m_estimate(
  estFUN = SB1_estfun,
  data   = geexex,
  root_control = setup_root_control(start = c(1, 1)))

n <- nrow(geexex)
A <- diag(1, nrow = 2)

B <- with(geexex, {
  Ybar <- mean(Y1)
  B11  <- mean( (Y1 - Ybar)^2 )
  B12  <- mean( (Y1 - Ybar) * ((Y1 - Ybar)^2 - B11) )
  B22  <- mean( ((Y1 - Ybar)^2 - B11)^2 )
  matrix(
    c(B11, B12,
      B12, B22),
    nrow = 2
  )
})

# closed form roots
theta_cls <- c(mean(geexex$Y1),
               # since var() divides by n - 1, not n
               var(geexex$Y1) * (n - 1) / n)
# closed form sigma
Sigma_cls <- (solve(A) %*% B %*% t(solve(A))) / n

comparison <- list(geex = list(estimates = coef(results), vcov = vcov(results)),
                   cls  = list(estimates = theta_cls, vcov = Sigma_cls))
```
The `m_estimate` function returns an object of the S4 class `geex`, which contains an `estimates` slot and a `vcov` slot for \(\hat{\theta}\) and \(\hat{\Sigma}\), respectively. These slots can be accessed by the functions `coef` (or `roots`) and `vcov`. The point estimates obtained for \(\theta_1\) and \(\theta_2\) are analogous to the base R functions `mean` and `var` (after multiplying by \((m-1)/m\) for the latter). SB gave a closed form for \(A(\theta_0)\) (an identity matrix) and \(B(\theta_0)\) (not shown here) and suggested plugging in sample moments to compute \(B(\hat{\theta})\). The point and variance estimates using both `geex` and the analytic solutions are shown below. The maximum absolute difference between either the point or variance estimates is 4e-11, demonstrating excellent agreement between the numerical results obtained from `geex` and the closed form solutions for this set of estimating equations and data.
```
## $geex
## $geex$estimates
## [1]  5.044563 10.041239
##
## $geex$vcov
##            [,1]       [,2]
## [1,] 0.10041239 0.03667969
## [2,] 0.03667969 2.49219638
##
##
## $cls
## $cls$estimates
## [1]  5.044563 10.041239
##
## $cls$vcov
##            [,1]       [,2]
## [1,] 0.10041239 0.03667969
## [2,] 0.03667969 2.49219638
```
This example calculates a ratio estimator and illustrates the delta method via M-estimation. The estimating equations target the means of \(Y_1\) and \(Y_2\), labelled \(\theta_1\) and \(\theta_2\), as well as the estimand \(\theta_3 = \theta_1/ \theta_2\).
\[ \psi(Y_{1i}, Y_{2i}, \theta) = \begin{pmatrix} Y_{1i} - \theta_1 \\ Y_{2i} - \theta_2 \\ \theta_1 - \theta_3\theta_2 \end{pmatrix} \]
The solution to \(\eqref{eq:psi}\) for this \(\psi\) function yields \(\hat{\theta}_3 = \bar{Y}_1 / \bar{Y}_2\), where \(\bar{Y}_j\) denotes the sample mean of \(Y_{j1}, \ldots,Y_{jm}\) for \(j=1,2\).
```r
SB2_estfun <- function(data){
  Y1 <- data$Y1; Y2 <- data$Y2
  function(theta){
    c(Y1 - theta[1],
      Y2 - theta[2],
      theta[1] - (theta[3] * theta[2])
    )
  }
}
```
```r
results <- m_estimate(
  estFUN = SB2_estfun,
  data   = geexex,
  root_control = setup_root_control(start = c(1, 1, 1)))

# Comparison to an analytically derived sandwich estimator
A <- with(geexex, {
  matrix(
    c(1 , 0, 0,
      0 , 1, 0,
      -1, mean(Y1)/mean(Y2), mean(Y2)),
    byrow = TRUE, nrow = 3)
})

B <- with(geexex, {
  matrix(
    c(var(Y1)    , cov(Y1, Y2), 0,
      cov(Y1, Y2), var(Y2)    , 0,
      0, 0, 0),
    byrow = TRUE, nrow = 3)
})

## closed form roots
theta_cls <- c(mean(geexex$Y1), mean(geexex$Y2))
theta_cls[3] <- theta_cls[1]/theta_cls[2]
## closed form covariance
Sigma_cls <- (solve(A) %*% B %*% t(solve(A))) / nrow(geexex)

comparison <- list(geex = list(estimates = coef(results), vcov = vcov(results)),
                   cls  = list(estimates = theta_cls, vcov = Sigma_cls))
```
SB gave closed form expressions for \(A(\theta_0)\) and \(B(\theta_0)\), into which we plug appropriate estimates for the matrix components and compare to the results from `geex`. The point estimates again show excellent agreement (maximum absolute difference 4.4e-16), while the covariance estimates differ in the third decimal place (maximum absolute difference 1e-03).
```r
comparison
```
```
## $geex
## $geex$estimates
## [1] 5.044563 2.012793 2.506250
##
## $geex$vcov
##              [,1]         [,2]        [,3]
## [1,]  0.100412389 -0.008718712  0.06074328
## [2,] -0.008718712  0.010693092 -0.01764626
## [3,]  0.060743283 -0.017646263  0.05215103
##
##
## $cls
## $cls$estimates
## [1] 5.044563 2.012793 2.506250
##
## $cls$vcov
##             [,1]        [,2]        [,3]
## [1,]  0.10142666 -0.00880678  0.06135685
## [2,] -0.00880678  0.01080110 -0.01782451
## [3,]  0.06135685 -0.01782451  0.05267781
```
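The closed-form comparison above is at heart a delta-method calculation. For readers who want to see that calculation outside the estimating-equation machinery, here is a short base-R sketch of the delta-method variance for a ratio of two means; the simulated data are illustrative, not the `geexex` dataset.

```r
# Delta-method variance for the ratio r = mean(Y1)/mean(Y2), as a sketch.
# The simulated data below are illustrative, not the geexex dataset.
set.seed(3)
m  <- 100
Y1 <- rnorm(m, mean = 5, sd = 4)
Y2 <- rnorm(m, mean = 2, sd = 1)

r    <- mean(Y1) / mean(Y2)
grad <- c(1 / mean(Y2), -mean(Y1) / mean(Y2)^2)  # gradient of g(a,b) = a/b
S    <- cov(cbind(Y1, Y2)) / m                   # est. covariance of the two means
var_r <- drop(t(grad) %*% S %*% grad)            # delta-method variance of r
```

The same answer falls out of the M-estimation formulation because the third estimating equation \(\theta_1 - \theta_3\theta_2 = 0\) ties the ratio to the two means, and the sandwich variance propagates their joint uncertainty automatically.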
This example extends Example 1 to again illustrate the delta method. The estimating equations target not only the mean (\(\theta_1\)) and variance (\(\theta_2\)) of \(Y_1\), but also the standard deviation (\(\theta_3\)) and the log of the variance (\(\theta_4\)) of \(Y_1\).
\[ \psi(Y_{1i}, \mathbf{\theta}) = \begin{pmatrix} Y_{1i} - \theta_1 \\ (Y_{1i} - \theta_1)^2 - \theta_2 \\ \sqrt{\theta_2} - \theta_3 \\ \log(\theta_2) - \theta_4 \end{pmatrix} \]
```r
SB3_estfun <- function(data){
  Y1 <- data$Y1
  function(theta){
    c(Y1 - theta[1],
      (Y1 - theta[1])^2 - theta[2],
      sqrt(theta[2]) - theta[3],
      log(theta[2]) - theta[4])
  }
}
```
```r
# fit via m_estimate (starting values chosen as in the earlier examples)
results <- m_estimate(
  estFUN = SB3_estfun,
  data   = geexex,
  root_control = setup_root_control(start = c(1, 1, 1, 1)))

## closed form roots
theta_cls <- numeric(4)
theta_cls[1] <- mean(geexex$Y1)
theta_cls[2] <- sum((geexex$Y1 - theta_cls[1])^2)/nrow(geexex)
theta_cls[3] <- sqrt(theta_cls[2])
theta_cls[4] <- log(theta_cls[2])

## Compare to closed form ##
theta2 <- theta_cls[2]
mu3 <- moments::moment(geexex$Y1, order = 3, central = TRUE)
mu4 <- moments::moment(geexex$Y1, order = 4, central = TRUE)

## closed form covariance
Sigma_cls <- matrix(
  c(theta2, mu3, mu3/(2*sqrt(theta2)), mu3/theta2,
    mu3, mu4 - theta2^2, (mu4 - theta2^2)/(2*sqrt(theta2)), (mu4 - theta2^2)/theta2,
    mu3/(2*sqrt(theta2)), (mu4 - theta2^2)/(2*sqrt(theta2)), (mu4 - theta2^2)/(4*theta2), (mu4 - theta2^2)/(2*theta2^(3/2)),
    mu3/theta2, (mu4 - theta2^2)/theta2, (mu4 - theta2^2)/(2*theta2^(3/2)), (mu4/theta2^2) - 1),
  nrow = 4, byrow = TRUE) / nrow(geexex)
# equivalently: Sigma_cls <- (solve(A) %*% B %*% t(solve(A))) / n

comparison <- list(geex = list(estimates = coef(results), vcov = vcov(results)),
                   cls  = list(estimates = theta_cls, vcov = Sigma_cls))
```
SB again provided closed forms for \(A(\theta_0)\) and \(B(\theta_0)\), which we compare to the `geex` results. The maximum absolute difference between `geex` and the closed form estimates, for both the parameters and the covariance, is 3.8e-11.
```
## $geex
## $geex$estimates
## [1]  5.044563 10.041239  3.168791  2.306700
##
## $geex$vcov
##             [,1]       [,2]        [,3]        [,4]
## [1,] 0.100412389 0.03667969 0.005787646 0.003652905
## [2,] 0.036679687 2.49219638 0.393240840 0.248196105
## [3,] 0.005787646 0.39324084 0.062049026 0.039162582
## [4,] 0.003652905 0.24819611 0.039162582 0.024717678
##
##
## $cls
## $cls$estimates
## [1]  5.044563 10.041239  3.168791  2.306700
##
## $cls$vcov
##             [,1]       [,2]        [,3]        [,4]
## [1,] 0.100412389 0.03667969 0.005787646 0.003652905
## [2,] 0.036679687 2.49219638 0.393240840 0.248196105
## [3,] 0.005787646 0.39324084 0.062049026 0.039162582
## [4,] 0.003652905 0.24819611 0.039162582 0.024717678
```