Type: Package
Title: Projected Refinement for Imputation of Missing Entries in PCA
Version: 1.2
Date: 2021-8-5
Author: Ziwei Zhu, Tengyao Wang, Richard J. Samworth
Maintainer: Ziwei Zhu <ziweiz@umich.edu>
Description: Implements the primePCA algorithm, developed and analysed in Zhu, Z., Wang, T. and Samworth, R. J. (2019) High-dimensional principal component analysis with heterogeneous missingness. <doi:10.48550/arXiv.1906.12125>.
Imports: softImpute, Matrix, MASS, methods
RoxygenNote: 7.1.1
License: GPL-3
NeedsCompilation: no
Packaged: 2021-08-05 13:57:37 UTC; ziweizhu
Repository: CRAN
Date/Publication: 2021-08-05 15:10:02 UTC

Center and/or normalize each column of a matrix

Description

Center and/or normalize each column of a matrix

Usage

col_scale(X, center = T, normalize = F)

Arguments

X

a numeric matrix with NAs or "Incomplete" matrix object (see softImpute package)

center

center each column of X if center == TRUE. The default value is TRUE.

normalize

normalize each column of X such that its sample variance is 1 if normalize == TRUE. The default value is FALSE.

Value

a centered and/or normalized matrix of the same dimension as X.
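
To make the centering and normalizing behaviour concrete, here is a minimal base-R sketch of the same idea for a numeric matrix with NAs. The helper name col_scale_sketch is hypothetical and this is not the package's implementation; it assumes that column means and standard deviations are computed from the observed entries only, and it does not handle "Incomplete" objects.

```r
# Hypothetical helper for illustration; not the package's col_scale().
col_scale_sketch <- function(X, center = TRUE, normalize = FALSE) {
  if (center) {
    # Column means over observed entries only (assumption).
    mu <- colMeans(X, na.rm = TRUE)
    X <- sweep(X, 2, mu, "-")
  }
  if (normalize) {
    # Sample standard deviation of each column over observed entries.
    s <- apply(X, 2, sd, na.rm = TRUE)
    X <- sweep(X, 2, s, "/")
  }
  X
}

set.seed(1)
X <- matrix(rnorm(30), 10, 3)
X[1, 1] <- NA
X[2, 3] <- NA
Z <- col_scale_sketch(X, center = TRUE, normalize = TRUE)
```

Missing entries stay NA in Z, and the dimension of the input is preserved.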


Inverse probability weighted method for estimating the top K eigenspaces

Description

Inverse probability weighted method for estimating the top K eigenspaces

Usage

inverse_prob_method(X, K, trace.it = F, center = T, normalize = F)

Arguments

X

a numeric matrix with NAs or "Incomplete" matrix object (see softImpute package)

K

the number of principal components of interest

trace.it

report the progress if trace.it == TRUE

center

center each column of X if center == TRUE. The default value is TRUE.

normalize

normalize each column of X such that its sample variance is 1 if normalize == TRUE. The default value is FALSE.

Value

a d-by-K matrix whose columns estimate the top K eigenvectors of the covariance matrix of X, where d is the number of columns of X.

Examples

X <- matrix(1:30 + .1 * rnorm(30), 10, 3)
X[1, 1] <- NA
X[2, 3] <- NA
v_hat <- inverse_prob_method(X, 1)
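
For readers who want to see the general idea behind inverse probability weighting, the following base-R sketch may help. It is not the package's code: the helper name ipw_sketch is hypothetical, and it assumes that every entry is observed with the same probability, estimated by the overall fraction of observed entries. Missing entries are zero-filled, the Gram matrix is reweighted (off-diagonal entries by 1/p^2, diagonal entries by 1/p), and the leading eigenvectors are returned.

```r
# Hypothetical helper for illustration; not the package's inverse_prob_method().
ipw_sketch <- function(X, K) {
  # Center each column using the observed entries.
  X <- sweep(X, 2, colMeans(X, na.rm = TRUE), "-")
  obs <- !is.na(X)
  p_hat <- mean(obs)            # assumed common observation probability
  Y <- X
  Y[!obs] <- 0                  # zero-fill the missing entries
  G <- crossprod(Y) / nrow(Y)   # Gram matrix of the zero-filled data
  Sigma_hat <- G / p_hat^2      # off-diagonal entries: weight 1 / p^2
  diag(Sigma_hat) <- diag(G) / p_hat  # diagonal entries: weight 1 / p
  eigen(Sigma_hat, symmetric = TRUE)$vectors[, seq_len(K), drop = FALSE]
}

set.seed(1)
X <- matrix(1:30 + .1 * rnorm(30), 10, 3)
X[1, 1] <- NA
X[2, 3] <- NA
V_hat <- ipw_sketch(X, 1)
```

The result has one row per column of X and K orthonormal columns.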

primePCA algorithm

Description

primePCA algorithm

Usage

primePCA(
  X,
  K,
  V_init = NULL,
  thresh_sigma = 10,
  max_iter = 1000,
  thresh_convergence = 1e-05,
  thresh_als = 1e-10,
  trace.it = F,
  prob = 1,
  save_file = "",
  center = T,
  normalize = F
)

Arguments

X

an n-by-d data matrix with NA values

K

the number of the principal components of interest

V_init

an initial estimate of the top K eigenspaces of the covariance matrix of X. By default, primePCA will be initialized by the inverse probability method.

thresh_sigma

used to select the "good" rows of X to update the principal eigenspaces (\sigma_* in the paper).

max_iter

maximum number of iterations of refinement

thresh_convergence

The algorithm is halted if the Frobenius-norm sine-theta distance between two consecutive iterates is less than thresh_convergence.

thresh_als

This is fed into thresh in svd.als of softImpute.

trace.it

report the progress if trace.it == TRUE

prob

probability of reserving the "good" rows. prob == 1 means to reserve all the "good" rows.

save_file

the location that saves the intermediate results, including V_cur, step_cur and loss_all, which are introduced in the section of returned values. The algorithm will not save any intermediate result if save_file == "".

center

center each column of X if center == TRUE. The default value is TRUE.

normalize

normalize each column of X such that its sample variance is 1 if normalize == TRUE. The default value is FALSE.

Value

a list is returned, with components V_cur, step_cur and loss_all. V_cur is a d-by-K matrix of the top K eigenvectors. step_cur is the number of iterations. loss_all is an array of the trajectory of MSE.

Examples

X <- matrix(1:30 + .1 * rnorm(30), 10, 3)
X[1, 1] <- NA
X[2, 3] <- NA
v_tilde <- primePCA(X, 1)$V_cur
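
As a further usage sketch, the call below sets a few of the documented tuning arguments explicitly and then inspects all three components of the returned list. The particular values for max_iter and thresh_convergence are arbitrary choices for illustration, not recommendations, and the call is guarded so that the snippet still runs when the primePCA package is not installed.

```r
set.seed(1)
X <- matrix(1:30 + .1 * rnorm(30), 10, 3)
X[1, 1] <- NA
X[2, 3] <- NA

fit <- NULL
if (requireNamespace("primePCA", quietly = TRUE)) {
  # Illustrative settings only; the defaults are listed under Usage.
  fit <- primePCA::primePCA(X, K = 1, max_iter = 200,
                            thresh_convergence = 1e-6)
  print(dim(fit$V_cur))   # d-by-K matrix of estimated eigenvectors
  print(fit$step_cur)     # number of refinement iterations performed
  print(fit$loss_all)     # trajectory of the MSE across iterations
}
```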

Frobenius norm sin theta distance between two column spaces

Description

Frobenius norm sin theta distance between two column spaces

Usage

sin_theta_distance(V1, V2)

Arguments

V1

a matrix with orthonormal columns

V2

a matrix of the same dimension as V1 with orthonormal columns

Value

the Frobenius norm sin theta distance between V1 and V2
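
For intuition about what this distance measures, here is a minimal base-R sketch based on a standard identity: for two d-by-K matrices with orthonormal columns, the squared Frobenius norm of the sines of the principal angles equals K minus the squared Frobenius norm of t(V1) %*% V2. The helper name sin_theta_sketch is hypothetical, and the package's own implementation may differ in its details.

```r
# Hypothetical helper for illustration; not the package's sin_theta_distance().
sin_theta_sketch <- function(V1, V2) {
  K <- ncol(V1)
  # max(0, .) guards against tiny negative values caused by rounding.
  sqrt(max(0, K - sum(crossprod(V1, V2)^2)))
}

e1 <- matrix(c(1, 0, 0), 3, 1)
e2 <- matrix(c(0, 1, 0), 3, 1)
d_same <- sin_theta_sketch(e1, e1)  # identical subspaces: 0
d_orth <- sin_theta_sketch(e1, e2)  # orthogonal subspaces: 1
```

The distance depends only on the column spaces, so flipping the sign of a column of V1 or V2 leaves it unchanged.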
