Dimension reduction regression

Description

The function dr implements dimension reduction methods, including SIR, SAVE and pHd.

Usage

dr(formula, data=list(), subset, weights, na.action=na.omit, method="sir", 
     contrasts=NULL,offset=NULL, ...)
dr.weights(formula, data=list(), subset, weights, na.action=na.omit, method="sir", 
     contrasts=NULL,offset=NULL, ...)
 

Arguments

formula a symbolic description of the model to be fit. The details of the model are the same as for lm.
data an optional data frame containing the variables in the model. By default the variables are taken from the environment from which `dr' is called.
subset an optional vector specifying a subset of observations to be used in the fitting process.
weights an optional vector of weights to be used where appropriate.
na.action a function which indicates what should happen when the data contain `NA's. The default is `na.omit', which restricts the calculations to the complete cases.
method a character string specifying the method of fitting. "sir" specifies sliced inverse regression and "save" specifies sliced average variance estimation. "phdy" uses principal Hessian directions using the response, as suggested by Li, and "phdres" uses the LS residuals, as suggested by Cook. Other methods may be added.
contrasts an optional list. See the `contrasts.arg' of `model.matrix.default'.
offset an optional offset, or NULL (the default) for none.
... additional items that may be required or permitted by some methods. nslices is the number of slices used by sir and save. numdir is the maximum number of directions to compute, with default equal to 4.

Details

The general regression problem studies F(y|x), the conditional distribution of a response y given a set of predictors x. This function provides methods for estimating the dimension and central subspace of a general regression problem. That is, we want to find a p by d matrix B such that

F(y|x)=F(y|B'x)

Both the dimension d and the subspace R(B) are unknown. These methods make few assumptions. All the methods available in this function estimate the unknowns by study of the inverse problem, F(x|y). In each, a kernel matrix M is estimated such that the column space of M should be close to the central subspace. Eigenanalysis of M is then used to estimate the central subspace. Objects created using this function have appropriate print, summary and plot methods.
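The kernel-matrix idea can be illustrated for sliced inverse regression with a minimal sketch in base R. This is not the code used inside dr; it is a textbook-style version run on simulated data, and all object names (sig_inv_sqrt, slice_means, b_hat, and so on) are made up for the illustration. It assumes a single true direction b, so d = 1.

```r
# Minimal SIR sketch in base R on simulated data (not the dr internals).
set.seed(1)
n <- 500
p <- 4
x <- matrix(rnorm(n * p), n, p)
b <- c(1, 1, 0, 0) / sqrt(2)              # true direction, so d = 1
y <- exp(drop(x %*% b)) + 0.1 * rnorm(n)  # response depends on x only via b'x

# Standardize the predictors: z = Sigma^(-1/2) (x - mean)
es <- eigen(cov(x), symmetric = TRUE)
sig_inv_sqrt <- es$vectors %*% diag(1 / sqrt(es$values)) %*% t(es$vectors)
z <- scale(x, center = TRUE, scale = FALSE) %*% sig_inv_sqrt

# Slice the response into groups of roughly equal size
nslices <- 8
slice <- ceiling(rank(y, ties.method = "first") * nslices / n)

# Kernel matrix M: weighted outer products of the slice means of z
counts <- as.vector(table(slice))
slice_means <- rowsum(z, slice) / counts   # nslices x p matrix
M <- t(slice_means) %*% (slice_means * (counts / n))

# Eigenanalysis of M, then map back to the original predictor scale
em <- eigen(M, symmetric = TRUE)
B <- sig_inv_sqrt %*% em$vectors
b_hat <- B[, 1] / sqrt(sum(B[, 1]^2))      # estimated first direction

print(round(em$values, 3))
print(round(b_hat, 3))
```

In this simulation the first eigenvalue should dominate the others, and b_hat should be close to b up to sign, since only the subspace spanned by the direction is identified.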

Weights can be used, essentially to specify the relative frequency of each case in the data. Empirical weights that make the contours of the weighted sample closer to elliptical can be computed using dr.weights. This will usually result in zero weight for some cases. The function will set zero estimated weights to missing.

Several functions are provided that require a dr object as input. dr.permutation.test uses a permutation test to obtain significance levels for tests of dimension. dr.coplot visualizes the results using a coplot of either two selected directions conditioning on a third, with color marking the response, or the response versus one direction, conditioning on a second direction. plot.dr provides the default plot method for dr objects, based on a scatterplot matrix.

Value

dr returns an object that inherits from dr (the name of the type is the value of the method argument), with attributes:

M A matrix that depends on the method of computing. The column space of M should be close to the central subspace.
evalues The eigenvalues of M (or squared singular values if M is not symmetric).
evectors The eigenvectors of M (or of M'M if M is not square and symmetric) ordered according to the eigenvalues.
numdir The maximum number of directions to be found. The output value of numdir may be smaller than the input value.
ols.coef Estimated regression coefficients, excluding the intercept, for the (weighted) LS fit.
ols.fit LS fitted values.
slice.info output from sir.slice, used by sir and save.
method the dimension reduction method used.

Other returned values repeat quantities from input.
dr.weights returns a vector of weights, with NA substituted for estimated zero weights.
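As a hedged illustration of the returned values, the sketch below fits a model and prints a few of the components listed above. It assumes the dr package is installed and that the components can be extracted with `$`, as for a list; the object name fit is arbitrary. The code is skipped if the package is not available.

```r
# Inspect components of a fitted dr object (only if the dr package is installed).
if (requireNamespace("dr", quietly = TRUE)) {
  library(dr)
  data(ais)  # the Australian athletes data
  fit <- dr(LBM ~ Wt + Ht + RCC + WCC, data = ais, method = "sir", nslices = 8)
  print(fit$method)    # the dimension reduction method used
  print(fit$numdir)    # maximum number of directions found
  print(fit$evalues)   # eigenvalues of the kernel matrix M
  print(fit$evectors)  # corresponding eigenvectors
}
```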

Author(s)

Sanford Weisberg, sandy@stat.umn.edu

For weights, see R. D. Cook and C. Nachtsheim (1994), Reweighting to achieve elliptically contoured predictors in regression. Journal of the American Statistical Association, 89, 592–599.

References

The details of these methods are given by R. D. Cook (1998). Regression Graphics. New York: Wiley. Equivalent methods are also available in Arc; see R. D. Cook and S. Weisberg (1999). Applied Regression Including Computing and Graphics. New York: Wiley, www.stat.umn.edu/arc.

See Also

dr.permutation.test, dr.x, dr.y, dr.direction, dr.coplot, dr.weights

Examples

library(dr)
data(ais)  # the Australian athletes data
# fit dimension reduction using sir
m1 <- dr(LBM~Wt+Ht+RCC+WCC, data=ais, method="sir", nslices=8)
summary(m1)

# repeat, using save:

m2 <- update(m1,method="save")
summary(m2)

# repeat, using phd:

m3 <- update(m2, method="phdres")
summary(m3)
