Estimating the true number of principal components under the random design
By: Yasuyuki Matsumura
Potential Business Impact:
Helps decide, from the data itself, how many components to keep when simplifying complex data.
Principal component analysis (PCA) is frequently employed as a dimension reduction tool when the number of covariates is large. However, the number of principal components to retain is typically chosen in a researcher-dependent manner. To mitigate this subjectivity, this paper proposes a data-driven testing procedure to estimate the number of underlying principal components. While existing work such as G'Sell et al. (2016), Taylor et al. (2016) and Choi et al. (2017) discusses similar tests under a fixed design, this paper extends their framework to a more general econometric setup with a random design. The proposed test is proved to achieve asymptotically exact type I error control under a locally defined null hypothesis, and simulation examples support its asymptotic validity.
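To make the subjectivity concern concrete, here is a minimal sketch of the common cumulative-variance rule for choosing the number of components. It is not the testing procedure proposed in the paper. The function name `components_to_keep`, the eigenvalues, and the thresholds are all hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def components_to_keep(eigenvalues, threshold):
    """Return the smallest k such that the k largest eigenvalues
    explain at least `threshold` of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total >= threshold:
            return k
    # Fallback for rounding issues when threshold is close to 1.
    return len(ordered)


# Hypothetical eigenvalues of a sample covariance matrix
# (illustrative values, not taken from the paper).
eigenvalues = [5.0, 2.5, 1.2, 0.6, 0.4, 0.3]

# Three thresholds a researcher might plausibly pick.
choices = {t: components_to_keep(eigenvalues, t) for t in (0.70, 0.80, 0.90)}
print(choices)
```

With these illustrative values, each threshold retains a different number of components. That dependence on an arbitrary cutoff is what a data-driven test aims to remove.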
Similar Papers
Large-dimensional Factor Analysis with Weighted PCA
Methodology
Improves computer analysis of complex data.
Highly robust factored principal component analysis for matrix-valued outlier accommodation and explainable detection via matrix minimum covariance determinant
Methodology
Finds bad data points in complex pictures.
Estimation of Semiparametric Factor Models with Missing Data
Methodology
Fixes broken data for better predictions.