Testing relevant difference in high-dimensional linear regression with applications to detect transferability
By: Xu Liu
Potential Business Impact:
Helps decide when data from an old, related task can safely be reused to train a model for a new task.
Most research on testing the significance of the coefficient vector $\boldsymbol{\beta}$ in high-dimensional linear regression models considers the classical hypothesis testing problem $H_0^{c}: \boldsymbol{\beta}=\boldsymbol{0} \mbox{ versus } H_1^{c}: \boldsymbol{\beta} \neq \boldsymbol{0}$. We take a different perspective and study the testing problem whose null hypothesis is that there is no relevant difference between $\boldsymbol{\beta}$ and $\boldsymbol{0}$, that is, $H_0: \|\boldsymbol{\beta}\|\leq \delta_0 \mbox{ versus } H_1: \|\boldsymbol{\beta}\|> \delta_0$, where $\delta_0$ is a prespecified small constant. This testing problem is motivated by the pressing need to detect the transferability of source data in the transfer learning framework. We propose a novel test procedure that incorporates an estimate of the largest eigenvalue of a high-dimensional covariance matrix, obtained with the aid of random matrix theory. In the more challenging setting with high-dimensional nuisance parameters, we establish the asymptotic normality of the proposed test statistics under both the null and the alternative hypotheses. By applying the proposed tests to detect the transferability of source data, the unified transfer learning models simultaneously achieve lower estimation and prediction errors than existing methods. We study the finite-sample properties of the new test by means of simulation studies and illustrate its performance by analyzing the GTEx data.
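The relevant-difference test hinges on estimating the signal strength $\|\boldsymbol{\beta}\|^2$ and comparing it with $\delta_0^2$. The Python sketch below is a minimal illustration of that idea under simplifying assumptions, not the paper's actual procedure: it assumes an identity covariance $\Sigma = I$ and a Gaussian design, and it uses a standard U-statistic estimator of $\|\Sigma\boldsymbol{\beta}\|^2$ (a common device in the quadratic-functional testing literature). The function name ustat_norm_sq and all constants are hypothetical, and the paper's calibration via random matrix theory and asymptotic normality is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def ustat_norm_sq(X, y):
    """U-statistic estimate of ||Sigma beta||^2 (= ||beta||^2 when Sigma = I).

    Since E[y_i x_i] = Sigma beta, for i != j (independent samples)
    E[y_i y_j x_i' x_j] = ||Sigma beta||^2, so averaging over off-diagonal
    pairs gives an unbiased estimate.
    """
    n = X.shape[0]
    gram = X @ X.T                    # entries x_i' x_j
    w = np.outer(y, y) * gram
    return (w.sum() - np.trace(w)) / (n * (n - 1))

# Toy data: n samples, p >> n predictors, identity covariance.
n, p = 200, 1000
beta = np.zeros(p)
beta[:10] = 0.3                       # true ||beta||^2 = 0.9
X = rng.standard_normal((n, p))
y = X @ beta + rng.standard_normal(n)

delta0 = 0.5                          # prespecified relevance threshold
T = ustat_norm_sq(X, y)
print(f"estimated ||beta||^2 = {T:.3f}, threshold delta0^2 = {delta0**2:.3f}")
# One would reject the 'no relevant difference' null H0: ||beta|| <= delta0
# when T exceeds delta0^2 by more than a calibrated critical value; the paper
# derives that calibration, which this toy sketch does not reproduce.
```

On this simulated example the estimate concentrates near the true $\|\boldsymbol{\beta}\|^2 = 0.9$, well above $\delta_0^2 = 0.25$, so a properly calibrated test would flag a relevant difference.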
Similar Papers
Practically significant change points in high dimension -- measuring signal strength per active component
Statistics Theory
Finds changes in data even when it's messy.
Differentially private testing for relevant dependencies in high dimensions
Statistics Theory
Finds hidden links in private data safely.
Machine-Learning-Assisted Comparison of Regression Functions
Methodology
Compares data patterns even with many details.