Optimal Transport Based Testing in Factorial Design
By: Michel Groppe, Linus Niemöller, Shayan Hundrieser and more
Potential Business Impact:
Tests if groups of data are different.
We introduce a general framework, based on optimal transport (OT), for testing statistical hypotheses about probability measures supported on finite spaces. These tests are inspired by the analysis of variance (ANOVA) and its nonparametric counterparts. They allow for testing linear relationships between discrete probability measures in factorial designs and are based on pairwise comparisons of OT distances and the corresponding barycenters. To this end, we derive the asymptotic distributions of empirical OT costs and of the empirical OT barycenter cost functional, under the null hypotheses and under (local) alternatives, by treating each as the optimal value of a linear program with a random objective function. In particular, we extend existing techniques for probability measures to signed measures and show directional Hadamard differentiability and the validity of the functional delta method. We discuss computational issues as well as permutation and bootstrap tests, and we back up our findings with simulations. We illustrate our methodology on two datasets, one from cellular biophysics and one from biometric identification.
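As a rough illustration of the permutation-test idea mentioned in the abstract, the following sketch compares two samples on a finite subset of the real line using the OT cost with ground cost |x - y|. This is a simplified two-sample setting chosen for brevity, not the paper's factorial-design procedure: in one dimension the OT cost has a closed form via cumulative distribution functions, whereas a general finite space would require solving a linear program. All function names (`empirical_measure`, `ot_cost_1d`, `permutation_test`) and parameters are hypothetical and do not come from the paper.

```python
import random
from collections import Counter


def empirical_measure(sample, support):
    """Relative frequencies of `sample` on the finite, sorted `support`."""
    counts = Counter(sample)
    n = len(sample)
    return [counts[x] / n for x in support]


def ot_cost_1d(mu, nu, support):
    """OT cost with ground cost |x - y| between two probability vectors
    on a sorted finite subset of the real line.

    Assumption: 1D support, so the cost equals the integral of the
    absolute difference of the two CDFs (no linear program needed).
    """
    cost, cdf_gap = 0.0, 0.0
    for i in range(len(support) - 1):
        cdf_gap += mu[i] - nu[i]
        cost += abs(cdf_gap) * (support[i + 1] - support[i])
    return cost


def permutation_test(x, y, support, n_perm=999, seed=0):
    """Two-sample permutation test using the empirical OT cost as statistic.

    Returns the observed statistic and a permutation p-value.
    """
    rng = random.Random(seed)

    def statistic(a, b):
        return ot_cost_1d(
            empirical_measure(a, support),
            empirical_measure(b, support),
            support,
        )

    observed = statistic(x, y)
    pooled = list(x) + list(y)
    exceed = 0
    for _ in range(n_perm):
        # Reassign group labels at random by shuffling the pooled sample.
        rng.shuffle(pooled)
        if statistic(pooled[:len(x)], pooled[len(x):]) >= observed:
            exceed += 1
    # The +1 terms count the observed labelling as one of the permutations.
    p_value = (1 + exceed) / (1 + n_perm)
    return observed, p_value


if __name__ == "__main__":
    support = [0, 1, 2]
    x = [0] * 30            # sample concentrated at 0
    y = [2] * 30            # sample concentrated at 2
    stat, p = permutation_test(x, y, support)
    print(f"OT cost: {stat:.3f}, permutation p-value: {p:.4f}")
```

The p-value uses the common (1 + exceedances) / (1 + permutations) convention, so it can never be exactly zero. A bootstrap variant would resample with replacement instead of shuffling labels; the calibration appropriate for the paper's statistics is not reproduced here.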
Similar Papers
Optimal Transport-Based Generative Models for Bayesian Posterior Sampling
Computation
Creates better computer guesses about data.
Functional $K$ Sample Problem via Multivariate Optimal Measure Transport-Based Permutation Test
Statistics Theory
Tests if Bitcoin price changes are random.
Sharp Convergence Rates of Empirical Unbalanced Optimal Transport for Spatio-Temporal Point Processes
Statistics Theory
Measures how well data points match patterns.