A Generative Conditional Distribution Equality Testing Framework and Its Minimax Analysis
By: Siming Zheng, Meifang Lan, Tong Wang, and more
Potential Business Impact:
Provides statistical tests for checking whether the input-output relationship is the same across two datasets, a key assumption when transferring models between data sources under covariate shift.
In this paper, we propose a general framework for testing the equality of conditional distributions in a two-sample problem, a question that arises naturally in transfer learning under covariate shift. Our framework builds on neural network-based generative methods and sample splitting, transforming the conditional distribution testing problem into an unconditional one. We introduce two specific tests: the generative permutation-based conditional distribution equality test and the generative classification accuracy-based conditional distribution equality test. Theoretically, we establish a minimax lower bound for testing the equality of two conditional distributions under certain smoothness conditions, and we show that the generative permutation-based test and a modified version attain this lower bound exactly or up to an iterated logarithmic factor. We also prove the testing consistency of the generative classification accuracy-based test and establish a convergence rate for the learned conditional generator, deriving new results on the recently developed offset Rademacher complexity and on the approximation properties of neural networks. Empirically, we conduct numerical studies on synthetic datasets and two real-world datasets, demonstrating the effectiveness of our approach.
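The reduction from a conditional to an unconditional test lends itself to a short sketch. The code below is a minimal illustration, not the authors' implementation: it assumes a conditional generator `gen(X, noise)` has already been fitted on a held-out split of the first sample, and the RBF-kernel MMD statistic, logistic-regression classifier, and normal approximation are illustrative stand-ins for the paper's actual components.

```python
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def rbf_mmd2(A, B, bandwidth=1.0):
    """Biased squared MMD between row samples A and B under an RBF kernel."""
    def gram(U, V):
        d2 = ((U[:, None, :] - V[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * bandwidth ** 2))
    return gram(A, A).mean() + gram(B, B).mean() - 2.0 * gram(A, B).mean()


def generative_permutation_test(X2, Y2, gen, n_perm=500, seed=0):
    """Permutation-style test of H0: the two conditional laws P(Y | X) agree.

    X2, Y2 are 2-D arrays from the second sample; `gen(X, eta)` maps
    covariates and noise to synthetic responses. Under H0, (X2, gen(X2, eta))
    and (X2, Y2) share the same joint distribution, so the conditional
    problem reduces to an unconditional two-sample test.
    """
    rng = np.random.default_rng(seed)
    eta = rng.standard_normal((len(X2), 1))
    real = np.hstack([X2, Y2])
    fake = np.hstack([X2, gen(X2, eta)])
    stat = rbf_mmd2(real, fake)
    pooled = np.vstack([real, fake])
    n = len(real)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        exceed += rbf_mmd2(pooled[idx[:n]], pooled[idx[n:]]) >= stat
    return (1 + exceed) / (1 + n_perm)  # permutation p-value


def classification_accuracy_test(X2, Y2, gen, seed=0):
    """Classifier-based variant: if no classifier can separate real pairs
    from generated pairs better than chance, the conditional laws are
    deemed equal."""
    rng = np.random.default_rng(seed)
    eta = rng.standard_normal((len(X2), 1))
    real = np.hstack([X2, Y2])
    fake = np.hstack([X2, gen(X2, eta)])
    Z = np.vstack([real, fake])
    labels = np.r_[np.ones(len(real)), np.zeros(len(fake))]
    Z_tr, Z_te, y_tr, y_te = train_test_split(Z, labels, test_size=0.5,
                                              random_state=seed)
    acc = LogisticRegression(max_iter=1000).fit(Z_tr, y_tr).score(Z_te, y_te)
    m = len(y_te)
    # Under H0, the held-out accuracy is approximately N(1/2, 1/(4m)).
    return norm.sf((acc - 0.5) * np.sqrt(4 * m))
```

In this sketch, fitting `gen` and evaluating the test statistic are done on disjoint splits, mirroring the sample-splitting step in the abstract that keeps the generator's estimation error from contaminating the test's null distribution.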
Similar Papers
A kernel conditional two-sample test
Machine Learning (CS)
Detects when two groups of data differ under matching conditions.
Testing Random Effects for Binomial Data
Statistics Theory
Helps scientists combine study results more accurately.
Practically significant differences between conditional distribution functions
Econometrics
Tests whether differences between two groups are large enough to matter in practice.