A New Two-Sample Test for Covariance Matrices in High Dimensions: U-Statistics Meet Leading Eigenvalues
By: Thomas Lam, Nina Dörnemann, Holger Dette
Potential Business Impact:
Detects differences in the correlation structure of two datasets, even when the number of variables is very large.
We propose a two-sample test for covariance matrices in the high-dimensional regime, where the dimension diverges proportionally to the sample size. Our hybrid test combines a Frobenius-norm-based statistic as considered in Li and Chen (2012) with the leading eigenvalue approach proposed in Zhang et al. (2022), making it sensitive to both dense and sparse alternatives. The two statistics are combined via Fisher's method, leveraging our key theoretical result: a joint central limit theorem showing the asymptotic independence of the leading eigenvalues of the sample covariance matrix and an estimator of the Frobenius norm of the difference of the two population covariance matrices, under suitable signal conditions. The level of the test can be controlled asymptotically, and we show consistency against certain types of both sparse and dense alternatives. A comprehensive numerical study confirms the favorable performance of our method compared to existing approaches.
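To make the hybrid construction concrete, here is a minimal Python sketch of the general recipe described in the abstract: compute a Frobenius-norm-type statistic (sensitive to dense alternatives) and a leading-eigenvalue statistic (sensitive to sparse alternatives), turn each into a p-value, and combine them with Fisher's method. The function names (`hybrid_covariance_test`, `fisher_combine`), the naive plug-in statistics, and the permutation calibration are illustrative assumptions, not the authors' U-statistic estimator, their leading-eigenvalue statistic, or their asymptotic (CLT-based) calibration.

```python
import numpy as np
from scipy import stats


def fisher_combine(p1, p2):
    """Combine two p-values via Fisher's method.

    Under the null and independence of the two statistics (the paper's key
    theoretical result), -2*(log p1 + log p2) follows a chi-squared
    distribution with 4 degrees of freedom.
    """
    t = -2.0 * (np.log(p1) + np.log(p2))
    return stats.chi2.sf(t, df=4)


def hybrid_covariance_test(X, Y, n_perm=200, seed=0):
    """Illustrative hybrid two-sample covariance test (not the paper's exact statistics).

    X, Y: (n1, p) and (n2, p) data matrices.
    Returns a combined p-value from a Frobenius-norm-type statistic and a
    leading-eigenvalue statistic, merged via Fisher's method.
    """
    n1, _ = X.shape
    n2, _ = Y.shape
    S1 = np.cov(X, rowvar=False)
    S2 = np.cov(Y, rowvar=False)

    # Dense alternatives: squared Frobenius norm of the difference of the
    # sample covariance matrices (a naive plug-in, not the unbiased
    # U-statistic estimator in the spirit of Li and Chen (2012)).
    frob_stat = np.sum((S1 - S2) ** 2)

    # Sparse alternatives: largest absolute eigenvalue of the difference
    # matrix (a stand-in for the leading-eigenvalue approach of Zhang et al. (2022)).
    eig_stat = np.max(np.abs(np.linalg.eigvalsh(S1 - S2)))

    # Hypothetical null calibration by permutation: pool the samples and
    # recompute both statistics on random splits of the pooled data.
    rng = np.random.default_rng(seed)
    pooled = np.vstack([X, Y])
    frob_null = np.empty(n_perm)
    eig_null = np.empty(n_perm)
    for b in range(n_perm):
        idx = rng.permutation(n1 + n2)
        Xb, Yb = pooled[idx[:n1]], pooled[idx[n1:]]
        D = np.cov(Xb, rowvar=False) - np.cov(Yb, rowvar=False)
        frob_null[b] = np.sum(D ** 2)
        eig_null[b] = np.max(np.abs(np.linalg.eigvalsh(D)))

    p_frob = (1 + np.sum(frob_null >= frob_stat)) / (n_perm + 1)
    p_eig = (1 + np.sum(eig_null >= eig_stat)) / (n_perm + 1)
    return fisher_combine(p_frob, p_eig)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((100, 50))
    Y = rng.standard_normal((120, 50))
    print("combined p-value:", hybrid_covariance_test(X, Y))
```

The design point the sketch tries to convey is the one emphasized in the abstract: because the two component statistics are asymptotically independent under suitable signal conditions, Fisher's combination yields a single test that retains power against both dense and sparse departures from equality of the covariance matrices.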
Similar Papers
Testing for large-dimensional covariance matrix under differential privacy
Methodology
Tests the structure of large covariance matrices while keeping individual data private.
On eigenvalues of a renormalized sample correlation matrix
Statistics Theory
Studies how many variables are related to one another, even when there are very many of them.
Direct Estimation of Eigenvalues of Large Dimensional Precision Matrix
Statistics Theory
Estimates key quantities of large precision matrices directly, making the analysis faster.