Testing for latent structure via the Wilcoxon--Wigner random matrix of normalized rank statistics
By: Jonquil Z. Liao, Joshua Cape
This paper considers the problem of testing for latent structure in large symmetric data matrices. The goal here is to develop statistically principled methodology that is flexible in its applicability, computationally efficient, and insensitive to extreme data variation, thereby overcoming limitations facing existing approaches. To do so, we introduce and systematically study certain symmetric matrices, called Wilcoxon--Wigner random matrices, whose entries are normalized rank statistics derived from an underlying independent and identically distributed sample of absolutely continuous random variables. These matrices naturally arise as the matricization of one-sample problems in statistics and conceptually lie at the interface of nonparametrics, multivariate analysis, and data reduction. Among our results, we establish that the leading eigenvalue and corresponding eigenvector of Wilcoxon--Wigner random matrices admit asymptotically Gaussian fluctuations with explicit centering and scaling terms. These asymptotic results enable rigorous parameter-free and distribution-free spectral methodology for addressing two hypothesis testing problems, namely community detection and principal submatrix detection. Numerical examples illustrate the performance of the proposed approach. Throughout, our findings are juxtaposed with existing results based on the spectral properties of independent entry symmetric random matrices in signal-plus-noise data settings.
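To make the construction concrete, here is a minimal sketch of how a symmetric matrix of normalized rank statistics might be built from a symmetric data matrix. The abstract does not state the exact normalization, so the following are assumptions made only for illustration: the helper name `rank_matrix`, the choice to rank the upper triangle including the diagonal, and the division by N + 1.

```python
import numpy as np

def rank_matrix(a):
    """Replace the entries of a symmetric matrix by normalized ranks.

    Hypothetical helper: ranks the N = n(n+1)/2 upper-triangular entries
    (diagonal included) and divides by N + 1. The paper's own
    normalization may differ.
    """
    a = np.asarray(a, dtype=float)
    n = a.shape[0]
    iu = np.triu_indices(n)            # upper triangle, diagonal included
    vals = a[iu]
    N = vals.size
    ranks = np.empty(N, dtype=float)
    # Continuous data have no ties almost surely, so argsort yields ranks.
    ranks[np.argsort(vals, kind="stable")] = np.arange(1, N + 1)
    r = np.zeros((n, n))
    r[iu] = ranks / (N + 1)            # normalized ranks lie strictly in (0, 1)
    return r + r.T - np.diag(np.diag(r))  # mirror to the lower triangle

# Heavy-tailed (Cauchy) symmetric input, to mimic extreme data variation.
rng = np.random.default_rng(0)
n = 200
a = rng.standard_cauchy((n, n))
a = np.triu(a) + np.triu(a, 1).T

r = rank_matrix(a)
lead = np.linalg.eigvalsh(r)[-1]       # eigvalsh returns ascending eigenvalues
print(lead / n)
```

Because the entries depend on the data only through ranks, any strictly increasing transformation of the input leaves the matrix unchanged, which is one way to read the distribution-free property described above. Under this normalization the entries are spread nearly uniformly over (0, 1), so the leading eigenvalue in the sketch should sit near n/2. The exact centering and scaling terms are those established in the paper and are not reproduced here.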