A Bayesian Two-Sample Mean Test for High-Dimensional Data
By: Daojiang He, Suren Xu, Jing Zhou
Potential Business Impact:
Finds hidden differences in data, even with few examples.
We propose a two-sample Bayesian mean test based on the Bayes factor with non-informative priors, designed for settings where $p$ grows with $n$ at a linear rate, $p/n \to c_1 \in (0, \infty)$. We establish the asymptotic normality of the test statistic and derive its asymptotic power. Extensive simulations show that the proposed test performs competitively, particularly when the variances (the diagonal elements of the covariance matrix) are heterogeneous and when sample sizes are small. The test also remains robust under distribution misspecification. It detects both sparse and non-sparse differences in mean vectors while maintaining a well-controlled type I error rate, even in small-sample scenarios. We also illustrate the test on the \texttt{SRBCTs} dataset.
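To make the setting concrete, here is a minimal Python sketch of the kind of simulation the abstract describes: two small samples, dimension comparable to the total sample size, and heterogeneous per-coordinate variances. The abstract does not give the form of the Bayes-factor statistic, so the sketch does not implement the proposed test. It substitutes a simple variance-scaled sum of squared mean differences, calibrated by permutation, purely as a stand-in. The function names, the variance range, and the sample sizes are all illustrative assumptions.

```python
import numpy as np


def simulate_two_samples(n1, n2, p, shift=0.0, seed=0):
    """Draw two independent Gaussian samples with heterogeneous variances.

    The standard-deviation range (0.5 to 3.0) is an arbitrary assumption
    chosen only to make the coordinates heterogeneous.
    """
    rng = np.random.default_rng(seed)
    sd = rng.uniform(0.5, 3.0, size=p)
    x = rng.normal(0.0, sd, size=(n1, p))
    y = rng.normal(shift, sd, size=(n2, p))  # every coordinate shifted by `shift`
    return x, y


def diag_statistic(x, y):
    """Stand-in statistic (NOT the paper's Bayes-factor statistic):
    sum over coordinates of squared mean difference / pooled variance."""
    n1, n2 = len(x), len(y)
    diff = x.mean(axis=0) - y.mean(axis=0)
    pooled = ((n1 - 1) * x.var(axis=0, ddof=1)
              + (n2 - 1) * y.var(axis=0, ddof=1)) / (n1 + n2 - 2)
    return float(np.sum(diff ** 2 / pooled))


def permutation_pvalue(x, y, n_perm=200, seed=0):
    """Calibrate the stand-in statistic by permuting group labels."""
    rng = np.random.default_rng(seed)
    observed = diag_statistic(x, y)
    pooled_data = np.vstack([x, y])
    n1 = len(x)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled_data))
        stat = diag_statistic(pooled_data[idx[:n1]], pooled_data[idx[n1:]])
        if stat >= observed:
            exceed += 1
    # The +1 correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # p / (n1 + n2) = 1, i.e. dimension comparable to the total sample size.
    x0, y0 = simulate_two_samples(10, 10, 20, shift=0.0, seed=1)
    x1, y1 = simulate_two_samples(10, 10, 20, shift=5.0, seed=1)
    print("p-value, equal means:  ", permutation_pvalue(x0, y0))
    print("p-value, shifted means:", permutation_pvalue(x1, y1))
```

Repeating the equal-means case over many seeds and recording how often the p-value falls below a nominal level gives an empirical type I error rate. Setting `shift` to a nonzero value, or shifting only a few coordinates, gives an empirical power estimate for dense or sparse alternatives respectively.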