Ties, Tails and Spectra: On Rank-Based Dependency Measures in High Dimensions
By: Nina Dörnemann, Michael Fleermann, Johannes Heiny
Potential Business Impact:
Finds dependence patterns in large, messy numerical datasets, including data with ties and heavy tails.
This work is concerned with the limiting spectral distribution of rank-based dependency measures in high dimensions. We provide distribution-free results for multivariate empirical versions of Kendall's $\tau$ and Spearman's $\rho$ in a setting where the dimension $p$ grows at most proportionally to the sample size $n$. Although rank-based measures are known to be well suited for discrete and heavy-tailed data, previous works in the field focused mostly on the continuous and light-tailed case. We close this gap by imposing mild assumptions and allowing for general types of distributions. Interestingly, our analysis reveals that a non-trivial adjustment of classical Kendall's $\tau$ is needed to obtain a pivotal limiting distribution in the presence of tied data. The proof for Spearman's $\rho$ is facilitated by a result regarding the limiting eigenvalue distribution of a general class of random matrices with rows on the Euclidean unit sphere, which is of independent interest. For instance, this finding can be used to derive the limiting spectral distribution of sample correlation matrices, which, in contrast to most existing works, accommodates heavy-tailed data.
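To make the objects in the abstract concrete, here is a minimal sketch of a Spearman's $\rho$ matrix computed from discrete, heavy-tailed data with many ties. It uses the textbook definition (Pearson correlation of midranks) and is not the authors' construction: the abstract does not specify their estimators or the adjustment of Kendall's $\tau$ for ties, so none of that is reproduced here. The function names (`midranks`, `pearson`, `spearman_matrix`, `sample_column`), the rounded Pareto sampling scheme, and the sizes `n` and `p` are illustrative assumptions.

```python
import math
import random


def midranks(values):
    """Return ranks 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of observations tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    ca = [x - mean_a for x in a]
    cb = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(ca, cb))
    den = math.sqrt(sum(x * x for x in ca) * sum(y * y for y in cb))
    return num / den


def spearman_matrix(columns):
    """p x p matrix of pairwise Spearman correlations (textbook version)."""
    ranked = [midranks(c) for c in columns]
    p = len(ranked)
    return [[pearson(ranked[k], ranked[l]) for l in range(p)] for k in range(p)]


def sample_column(n, rng):
    # Assumed toy data: Pareto with shape 1 (infinite mean), rounded down
    # to integers so that ties occur frequently.
    return [math.floor(rng.paretovariate(1.0)) for _ in range(n)]


rng = random.Random(0)
n, p = 200, 5  # sample size and dimension, chosen only for illustration
columns = [sample_column(n, rng) for _ in range(p)]
R = spearman_matrix(columns)
trace = sum(R[k][k] for k in range(p))  # equals the sum of the eigenvalues
```

Because the matrix depends on the data only through ranks, any strictly increasing transformation of a column leaves it unchanged, which is why no moment assumptions are needed to compute it. The spectral results in the abstract concern the eigenvalue distribution of matrices of this kind when $p$ grows with $n$; computing eigenvalues would require a linear algebra library and is left out of this sketch.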
Similar Papers
Limiting Spectral Distribution of High-dimensional Multivariate Kendall-$\tau$
Statistics Theory
Finds patterns in many numbers at once.
Differentially private testing for relevant dependencies in high dimensions
Statistics Theory
Finds hidden links in private data safely.