Kernel Two-Sample Testing via Directional Components Analysis

Published: August 12, 2025 | arXiv ID: 2508.08564v2

By: Rui Cui, Yuhao Li, Xiaojun Song

Potential Business Impact:

Finds differences between groups of data.

We propose a novel kernel-based two-sample test that leverages the spectral decomposition of the maximum mean discrepancy (MMD) statistic to identify and utilize well-estimated directional components in a reproducing kernel Hilbert space (RKHS). Our approach is motivated by the observation that the estimation quality of these components varies significantly, with leading eigen-directions being more reliably estimated in finite samples. By focusing on these directions and aggregating information across multiple kernels, the proposed test achieves higher power and improved robustness, especially in high-dimensional and unbalanced sample settings. We further develop a computationally efficient multiplier bootstrap procedure for approximating critical values, which is theoretically justified and significantly faster than permutation-based alternatives. Extensive simulations and empirical studies on microarray datasets demonstrate that our method maintains the nominal Type I error rate and delivers higher power than existing MMD-based tests.
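
To make the idea of "directional components" concrete, the sketch below illustrates one generic way to split a biased (V-statistic) MMD estimate along the eigen-directions of the centered pooled Gram matrix, keep only the leading `k` directions, and calibrate the truncated statistic with Gaussian multipliers. It is not the authors' procedure: the single Gaussian kernel, the fixed `bandwidth`, the hard truncation at `k`, the Gaussian multiplier scheme, and all function names (`gaussian_gram`, `mmd2_biased`, `directional_components`, `truncated_test`) are assumptions made for illustration, and the paper's multi-kernel aggregation is not reproduced.

```python
import numpy as np

def gaussian_gram(Z, bandwidth):
    """Gaussian-kernel Gram matrix of the rows of Z (bandwidth is an assumed tuning choice)."""
    sq = np.sum(Z**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    np.maximum(d2, 0.0, out=d2)  # guard against tiny negative round-off
    return np.exp(-d2 / (2.0 * bandwidth**2))

def mmd2_biased(K, n):
    """Biased MMD^2 from the pooled Gram matrix K; the first n rows belong to sample X."""
    m = K.shape[0] - n
    w = np.concatenate([np.full(n, 1.0 / n), np.full(m, -1.0 / m)])
    return float(w @ K @ w)

def directional_components(K, n):
    """Split the biased MMD^2 into one term per eigen-direction of the centered Gram matrix.

    Because the weight vector w sums to zero, w'Kw = w'(HKH)w = sum_j lam_j * (u_j'w)^2.
    """
    N = K.shape[0]
    w = np.concatenate([np.full(n, 1.0 / n), np.full(N - n, -1.0 / (N - n))])
    H = np.eye(N) - np.ones((N, N)) / N           # centering matrix
    lam, U = np.linalg.eigh(H @ K @ H)            # eigenvalues in ascending order
    order = np.argsort(lam)[::-1]                 # reorder: leading directions first
    lam, U = np.clip(lam[order], 0.0, None), U[:, order]
    comps = lam * (U.T @ w) ** 2
    return comps, lam, U, w

def truncated_test(X, Y, k=5, bandwidth=1.0, n_boot=500, seed=0):
    """Top-k truncated statistic with a Gaussian-multiplier p-value (illustrative only)."""
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    K = gaussian_gram(Z, bandwidth)
    comps, lam, U, w = directional_components(K, len(X))
    stat = comps[:k].sum()
    # Perturb each observation's weight by an independent N(0, 1) multiplier and
    # recompute the same top-k quadratic form; no re-sampling of the data is needed.
    E = rng.standard_normal((n_boot, len(Z)))
    proj = (E * w) @ U[:, :k]                     # shape (n_boot, k)
    boot = (proj**2) @ lam[:k]
    pval = (1.0 + np.sum(boot >= stat)) / (1.0 + n_boot)
    return stat, pval

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((60, 5))              # unbalanced samples: n = 60, m = 40
    Y = rng.standard_normal((40, 5)) + 1.0        # mean-shifted alternative
    print(truncated_test(X, Y, k=5, bandwidth=2.0))
```

The eigen-decomposition is computed once and reused for every bootstrap draw, which is the usual reason a multiplier scheme is cheaper than recomputing a statistic over permuted samples. How many directions to keep, and how to weight them, is exactly the kind of choice the paper addresses; the fixed `k` here is only a placeholder.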

Country of Origin
🇨🇳 China

Page Count
37 pages

Category
Statistics: Methodology