Type: Package
Title: Projected Refinement for Imputation of Missing Entries in PCA
Version: 1.2
Date: 2021-8-5
Author: Ziwei Zhu, Tengyao Wang, Richard J. Samworth
Maintainer: Ziwei Zhu <ziweiz@umich.edu>
Description: Implements the primePCA algorithm, developed and analysed in Zhu, Z., Wang, T. and Samworth, R. J. (2019) High-dimensional principal component analysis with heterogeneous missingness. <doi:10.48550/arXiv.1906.12125>.
Imports: softImpute, Matrix, MASS, methods
RoxygenNote: 7.1.1
License: GPL-3
NeedsCompilation: no
Packaged: 2021-08-05 13:57:37 UTC; ziweizhu
Repository: CRAN
Date/Publication: 2021-08-05 15:10:02 UTC
Center and/or normalize each column of a matrix
Description
Center and/or normalize each column of a matrix
Usage
col_scale(X, center = T, normalize = F)
Arguments
X: a numeric matrix with NAs or "Incomplete" matrix object (see softImpute package)
center: center each column of X if TRUE (default TRUE)
normalize: normalize each column of X if TRUE (default FALSE)
Value
a centered and/or normalized matrix of the same dimension as X.
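The centering and normalizing steps can be illustrated in base R. The following is a minimal sketch, not the package's implementation: the helper name col_scale_sketch is hypothetical, and the choice to normalize by the root mean square of the observed entries is an assumption, since the text above does not say which scale is used.

```r
# Hypothetical helper for illustration; not the package's col_scale().
col_scale_sketch <- function(X, center = TRUE, normalize = FALSE) {
  if (center) {
    # Subtract each column's mean, computed over observed entries only.
    X <- sweep(X, 2, colMeans(X, na.rm = TRUE), "-")
  }
  if (normalize) {
    # Assumption: divide each column by the root mean square of its
    # observed entries. The package may use a different scale.
    X <- sweep(X, 2, sqrt(colMeans(X^2, na.rm = TRUE)), "/")
  }
  X
}

X <- matrix(c(1, 2, NA, 4, 5, 6), 3, 2)
Xc <- col_scale_sketch(X)
```

Missing entries stay NA throughout, so the result has the same dimension and the same missingness pattern as the input.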
Inverse probability weighted method for estimating the top K eigenspaces
Description
Inverse probability weighted method for estimating the top K eigenspaces
Usage
inverse_prob_method(X, K, trace.it = F, center = T, normalize = F)
Arguments
X: a numeric matrix with NAs
K: the number of principal components of interest
trace.it: report the progress if TRUE (default FALSE)
center: center each column of X if TRUE (default TRUE)
normalize: normalize each column of X if TRUE (default FALSE)
Value
a matrix whose columns estimate the top K eigenvectors, with one row per column of X.
Examples
X <- matrix(1:30 + .1 * rnorm(30), 10, 3)
X[1, 1] <- NA
X[2, 3] <- NA
v_hat <- inverse_prob_method(X, 1)
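For intuition, here is a minimal base R sketch of a standard inverse-probability-weighted covariance estimate, under the assumption that every entry is observed independently with the same probability. It is not necessarily what inverse_prob_method does internally; the helper name ipw_eigvecs_sketch is hypothetical.

```r
# Hypothetical helper for illustration; not the package's implementation.
ipw_eigvecs_sketch <- function(X, K) {
  # Center columns using observed entries only.
  X <- sweep(X, 2, colMeans(X, na.rm = TRUE), "-")
  obs <- !is.na(X)
  p_hat <- mean(obs)            # estimated observation probability
  Y <- X
  Y[!obs] <- 0                  # zero-fill the missing entries
  G <- crossprod(Y) / nrow(Y)   # Gram matrix of the zero-filled data
  # Assumption: homogeneous missingness, so off-diagonal entries are
  # shrunk by p^2 and diagonal entries by p; undo both.
  S <- G / p_hat^2
  diag(S) <- diag(G) / p_hat
  eigen(S, symmetric = TRUE)$vectors[, seq_len(K), drop = FALSE]
}

set.seed(1)
X <- matrix(1:30 + 0.1 * rnorm(30), 10, 3)
X[1, 1] <- NA
X[2, 3] <- NA
v_hat_sketch <- ipw_eigvecs_sketch(X, 1)
```

The reweighted matrix S need not be positive semidefinite, but eigen() still returns orthonormal eigenvectors ordered by decreasing eigenvalue.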
primePCA algorithm
Description
primePCA algorithm
Usage
primePCA(
X,
K,
V_init = NULL,
thresh_sigma = 10,
max_iter = 1000,
thresh_convergence = 1e-05,
thresh_als = 1e-10,
trace.it = F,
prob = 1,
save_file = "",
center = T,
normalize = F
)
Arguments
X: an n-by-d data matrix with NAs
K: the number of principal components of interest
V_init: an initial estimate of the top K eigenspaces (default NULL)
thresh_sigma: used to select the "good" rows of X (default 10)
max_iter: maximum number of refinement iterations (default 1000)
thresh_convergence: the algorithm halts when the Frobenius-norm sine-theta distance between two consecutive iterates falls below thresh_convergence (default 1e-05)
thresh_als: This is fed into …
trace.it: report the progress if TRUE (default FALSE)
prob: probability of reserving the "good" rows (default 1)
save_file: the location that saves the intermediate results, including …
center: center each column of X if TRUE (default TRUE)
normalize: normalize each column of X if TRUE (default FALSE)
Value
A list with components V_cur, step_cur and loss_all.
V_cur is a d-by-K matrix of the top K eigenvectors.
step_cur is the number of iterations.
loss_all is an array of the trajectory of MSE.
Examples
X <- matrix(1:30 + .1 * rnorm(30), 10, 3)
X[1, 1] <- NA
X[2, 3] <- NA
v_tilde <- primePCA(X, 1)$V_cur
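The idea of iteratively refining an eigenspace estimate by imputing missing entries can be sketched in base R. This is a simplified illustration, not the package's algorithm: it omits the selection of "good" rows (thresh_sigma, prob), the softImpute-related step (thresh_als) and the saving of intermediate results. The helper names refine_sketch and sin_theta_sketch are hypothetical, and the zero-filled SVD used for initialization is an assumption.

```r
# Hypothetical helpers for illustration; not the package's implementation.
sin_theta_sketch <- function(V1, V2) {
  d <- pmin(svd(crossprod(V1, V2))$d, 1)  # cosines of principal angles
  sqrt(sum(1 - d^2))
}

refine_sketch <- function(X, V, max_iter = 1000, tol = 1e-5) {
  obs <- !is.na(X)
  K <- ncol(V)
  for (step in seq_len(max_iter)) {
    X_fill <- X
    for (i in seq_len(nrow(X))) {
      o <- obs[i, ]
      if (all(o)) next
      # Least-squares scores from the observed coordinates of row i.
      # Assumption: every row has at least K observed entries.
      u <- qr.solve(V[o, , drop = FALSE], X[i, o])
      # Impute the missing coordinates from the current eigenspace.
      X_fill[i, !o] <- V[!o, , drop = FALSE] %*% u
    }
    # Re-estimate the top K right singular vectors of the imputed matrix.
    V_new <- svd(X_fill, nu = 0, nv = K)$v
    dist <- sin_theta_sketch(V, V_new)
    V <- V_new
    if (dist < tol) break
  }
  list(V = V, steps = step)
}

set.seed(1)
X <- matrix(1:30 + 0.1 * rnorm(30), 10, 3)
X[1, 1] <- NA
X[2, 3] <- NA
Xc <- sweep(X, 2, colMeans(X, na.rm = TRUE), "-")
X0 <- Xc
X0[is.na(X0)] <- 0                     # zero-filled start (assumption)
V0 <- svd(X0, nu = 0, nv = 1)$v
fit <- refine_sketch(Xc, V0)
```

Here max_iter and tol play the roles that max_iter and thresh_convergence are described as playing above: the loop stops once two consecutive iterates are within tol in sine-theta distance, or after max_iter passes.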
Frobenius norm sin theta distance between two column spaces
Description
Frobenius norm sin theta distance between two column spaces
Usage
sin_theta_distance(V1, V2)
Arguments
V1: a matrix with orthonormal columns
V2: a matrix of the same dimension as V1 with orthonormal columns
Value
the Frobenius norm sin theta distance between the column spaces of V1 and V2
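One common way to compute this distance uses the singular values of t(V1) %*% V2, which are the cosines of the principal angles between the two column spaces. The sketch below follows that definition in base R; the helper name sin_theta_sketch is hypothetical, and the package's own code may be organised differently.

```r
# Hypothetical helper for illustration; not the package's implementation.
sin_theta_sketch <- function(V1, V2) {
  # Singular values of t(V1) %*% V2 are cosines of the principal angles.
  # Clip at 1 to guard against rounding slightly above 1.
  cosines <- pmin(svd(crossprod(V1, V2))$d, 1)
  # Frobenius norm of the sines: sqrt(sum(sin^2)) = sqrt(sum(1 - cos^2)).
  sqrt(sum(1 - cosines^2))
}

V1 <- diag(3)[, 1:2]      # span of the first two coordinate axes
V2 <- diag(3)[, c(1, 3)]  # span of the first and third coordinate axes
d_same <- sin_theta_sketch(V1, V1)
d_diff <- sin_theta_sketch(V1, V2)
```

In this example the two spaces share one direction and differ by a right angle in the other, so d_same is zero and d_diff is one, up to rounding.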