Second-Order Asymptotics of Two-Sample Tests
By: K V Harsha, Jithin Ravi, Tobias Koch
In two-sample testing, one observes two independent sequences of independent and identically distributed random variables drawn from distributions $P_1$ and $P_2$, respectively, and wishes to decide whether $P_1=P_2$ (null hypothesis) or $P_1\neq P_2$ (alternative hypothesis). The Gutman test for this problem compares the empirical distributions of the observed sequences and decides on the null hypothesis if the Jensen-Shannon (JS) divergence between these empirical distributions is below a given threshold. This paper proposes a generalization of the Gutman test, termed the \emph{divergence test}, which replaces the JS divergence by an arbitrary divergence. For this test, the exponential decay of the type-II error probability for a fixed type-I error probability is studied. First, it is shown that the divergence test achieves the optimal first-order exponent, irrespective of the choice of divergence. Second, it is demonstrated that the divergence test with an invariant divergence achieves the same second-order asymptotics as the Gutman test. In addition, it is shown that the Gutman test is the generalized likelihood ratio test (GLRT) for the two-sample testing problem, and a connection between two-sample testing and robust goodness-of-fit testing is established.
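The decision rule described above is simple enough to sketch in code. Below is a minimal Python sketch, assuming a finite alphabet $\{0,\dots,k-1\}$ and using the standard (unweighted) JS divergence for illustration; the Gutman statistic studied in the paper is a weighted generalization of it, and the threshold value and all function names here are illustrative choices, not the paper's notation.

```python
import numpy as np

def empirical_distribution(seq, alphabet_size):
    """Empirical distribution (type) of a sequence over {0, ..., alphabet_size - 1}."""
    counts = np.bincount(np.asarray(seq), minlength=alphabet_size)
    return counts / len(seq)

def kl_divergence(p, q):
    """KL divergence D(p || q); terms with p[i] == 0 contribute zero.
    Assumes q[i] > 0 wherever p[i] > 0 (true for the JS mixture below)."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def js_divergence(p, q):
    """Standard Jensen-Shannon divergence between distributions p and q."""
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def divergence_test(x, y, alphabet_size, threshold, divergence=js_divergence):
    """Decide the null hypothesis (P1 == P2) iff the chosen divergence between
    the two empirical distributions falls below the threshold."""
    p_hat = empirical_distribution(x, alphabet_size)
    q_hat = empirical_distribution(y, alphabet_size)
    return divergence(p_hat, q_hat) < threshold

# Illustrative usage: two samples from the same (uniform) distribution
# should usually be accepted under a suitably chosen threshold.
rng = np.random.default_rng(0)
x = rng.integers(0, 4, size=1000)
y = rng.integers(0, 4, size=500)
print(divergence_test(x, y, alphabet_size=4, threshold=0.01))
```

Swapping `js_divergence` for any other divergence of the empirical distributions yields an instance of the divergence test; per the paper's first-order result, this choice does not affect the optimal leading error exponent, while the second-order behavior matches the Gutman test whenever the divergence is invariant.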
Similar Papers
Two tales for a geometric Jensen--Shannon divergence
Information Theory
Improves math for computers learning from data.
On a transform of the Vincze-statistic and its exact and asymptotic distribution
Statistics Theory
Finds if two groups of things are different.
Universal Outlier Hypothesis Testing via Mean- and Median-Based Tests
Information Theory
Finds weird data in big groups of information.