Minimax Optimal Robust Sparse Regression with Heavy-Tailed Designs: A Gradient-Based Approach
By: Kaiyuan Zhou, Xiaoyu Zhang, Wenyang Zhang, and more
Potential Business Impact:
Makes machine learning work reliably with messy, outlier-heavy data.
We investigate high-dimensional sparse regression when both the noise and the design matrix exhibit heavy-tailed behavior. Standard algorithms typically fail in this regime, as heavy-tailed covariates distort the empirical risk geometry. We propose a unified framework, Robust Iterative Gradient descent with Hard Thresholding (RIGHT), which employs a robust gradient estimator to bypass the need for higher-order moment conditions. Our analysis reveals a fundamental decoupling phenomenon: in linear regression, the estimation error rate is governed by the noise tail index, while the sample complexity required for stability is governed by the design tail index. This implies that while heavy-tailed noise limits precision, heavy-tailed designs primarily raise the sample size barrier for convergence. In contrast, for logistic regression, we show that the bounded gradient naturally robustifies the estimator against heavy-tailed designs, restoring standard parametric rates. We derive matching minimax lower bounds to prove that RIGHT achieves optimal estimation accuracy and sample complexity across these regimes, without requiring sample splitting or the existence of the population risk.
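The abstract describes RIGHT as an iterative gradient-descent loop with hard thresholding, driven by a robust gradient estimator, but does not spell out that estimator here. The sketch below is a minimal, hypothetical instantiation for the linear-regression case, assuming a coordinate-wise median-of-means gradient aggregate; the function names (`right_linear`, `median_of_means`, `hard_threshold`) and parameters (`step`, `n_blocks`, `n_iter`) are illustrative choices, not the paper's actual algorithm or API.

```python
import numpy as np

def median_of_means(grads, n_blocks):
    """Coordinate-wise median-of-means over the sample axis.

    grads: (n, d) array of per-sample gradients.
    Returns a (d,) robust aggregate of the gradient.
    """
    n = grads.shape[0]
    block_size = n // n_blocks
    # Drop the remainder samples, split into blocks, average within each
    # block, then take a coordinate-wise median across block means.
    blocks = grads[: block_size * n_blocks].reshape(n_blocks, block_size, -1)
    return np.median(blocks.mean(axis=1), axis=0)

def hard_threshold(beta, s):
    """Keep the s largest-magnitude coordinates; zero out the rest."""
    out = np.zeros_like(beta)
    top = np.argsort(np.abs(beta))[-s:]
    out[top] = beta[top]
    return out

def right_linear(X, y, s, step=0.1, n_blocks=10, n_iter=200):
    """Sketch of a RIGHT-style loop for sparse linear regression:
    robust gradient estimate -> gradient step -> hard thresholding."""
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(n_iter):
        residuals = X @ beta - y                    # (n,)
        per_sample_grads = residuals[:, None] * X   # gradient of 0.5*(x'b - y)^2 per sample
        g = median_of_means(per_sample_grads, n_blocks)
        beta = hard_threshold(beta - step * g, s)
    return beta

if __name__ == "__main__":
    # Illustrative run with heavy-tailed (Student-t) design and noise.
    rng = np.random.default_rng(0)
    n, d, s = 500, 200, 5
    X = rng.standard_t(df=3, size=(n, d))
    beta_true = np.zeros(d)
    beta_true[:s] = 1.0
    y = X @ beta_true + rng.standard_t(df=3, size=n)
    beta_hat = right_linear(X, y, s=s)
    print(np.linalg.norm(beta_hat - beta_true))
```

Median-of-means is one standard robust mean estimator that needs only low-order moment conditions, which makes it a natural stand-in here; the paper's own estimator may instead rely on truncation or shrinkage, and its step-size and sparsity-level choices are governed by the theory, not the defaults above.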
Similar Papers
Regularized least squares learning with heavy-tailed noise is minimax optimal
Machine Learning (CS)
Makes machine learning work better with messy data.
Understanding Robust Machine Learning for Nonparametric Regression with Heavy-Tailed Noise
Machine Learning (CS)
Makes computers learn from messy, unreliable data.
Can SGD Handle Heavy-Tailed Noise?
Optimization and Control
Makes computers learn better with messy data.