Understanding Robust Machine Learning for Nonparametric Regression with Heavy-Tailed Noise
By: Yunlong Feng, Qiang Wu
Potential Business Impact:
Makes computers learn from messy, unreliable data.
We investigate robust nonparametric regression in the presence of heavy-tailed noise, where the hypothesis class may contain unbounded functions and robustness is ensured via a robust loss function $\ell_\sigma$. Using Huber regression as a close-up example within Tikhonov-regularized risk minimization in reproducing kernel Hilbert spaces (RKHS), we address two central challenges: (i) the breakdown of standard concentration tools under weak moment assumptions, and (ii) the analytical difficulties introduced by unbounded hypothesis spaces. Our first message is conceptual: conventional generalization-error bounds for robust losses do not faithfully capture out-of-sample performance. We argue that learnability should instead be quantified through prediction error, namely the $L_2$-distance to the truth $f^\star$, which is $\sigma$-independent and directly reflects the target of robust estimation. To make this workable under unboundedness, we introduce a \emph{probabilistic effective hypothesis space} that confines the estimator with high probability and enables a meaningful bias--variance decomposition under weak $(1+\epsilon)$-moment conditions. Technically, we establish new comparison theorems linking the excess robust risk to the $L_2$ prediction error up to a residual of order $\mathcal{O}(\sigma^{-2\epsilon})$, clarifying the robustness--bias trade-off induced by the scale parameter $\sigma$. Building on this, we derive explicit finite-sample error bounds and convergence rates for Huber regression in RKHS that hold without uniform boundedness and under heavy-tailed noise. Our study delivers principled tuning rules, extends beyond Huber to other robust losses, and highlights prediction error, not excess generalization risk, as the fundamental lens for analyzing robust learning.
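The estimator studied here — Tikhonov-regularized empirical risk minimization in an RKHS with the Huber loss $\ell_\sigma$ — can be sketched numerically. The following is a minimal illustration, not the authors' implementation: the Gaussian kernel, the iteratively reweighted least squares (IRLS) solver, the Student-$t$ noise (which has a finite $(1+\epsilon)$-moment but infinite variance), and all parameter values (`sigma`, `lam`, `gamma`) are assumptions chosen for the demo. Note that the quality measure reported is the $L_2$ prediction error against the truth $f^\star$, in line with the abstract's point that prediction error, not excess robust risk, is the right lens.

```python
import numpy as np

def gaussian_kernel(x1, x2, gamma=1.0):
    """Gram matrix of the Gaussian RBF kernel (an assumed kernel choice)."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-gamma * d ** 2)

def huber_krr(x, y, sigma=1.0, lam=1e-2, gamma=1.0, n_iter=50):
    """Tikhonov-regularized Huber regression in an RKHS via IRLS.

    Minimizes (1/n) sum_i l_sigma(f(x_i) - y_i) + lam * ||f||_K^2
    over f = sum_j alpha_j K(x_j, .). Each IRLS step solves the
    normal equations (W K + 2 n lam I) alpha = W y, where the Huber
    weight is 1 in the quadratic region and sigma/|r| in the tails.
    """
    n = len(y)
    K = gaussian_kernel(x, x, gamma)
    alpha = np.zeros(n)
    for _ in range(n_iter):
        r = K @ alpha - y
        w = np.where(np.abs(r) <= sigma,
                     1.0,
                     sigma / np.maximum(np.abs(r), 1e-12))
        alpha = np.linalg.solve(np.diag(w) @ K + 2 * n * lam * np.eye(n),
                                w * y)
    return alpha, K

# Demo: heavy-tailed noise with infinite variance (Student-t, 2 d.o.f.).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 80)
f_star = np.sin(2 * np.pi * x)          # the regression truth f*
y = f_star + 0.3 * rng.standard_t(df=2, size=80)

alpha, K = huber_krr(x, y, sigma=1.0, lam=1e-3, gamma=50.0)
pred = K @ alpha
# L2 prediction error: the sigma-independent quantity the paper advocates.
l2_err = np.sqrt(np.mean((pred - f_star) ** 2))
```

A larger `sigma` moves the Huber loss toward least squares (lower robustness bias, weaker outlier protection); a smaller `sigma` behaves like the absolute loss — the scale-dependent trade-off the abstract quantifies via the $\mathcal{O}(\sigma^{-2\epsilon})$ residual.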
Similar Papers
Estimating a regression function under possible heteroscedastic and heavy-tailed errors. Application to shape-restricted regression
Statistics Theory
Improves computer predictions with messy data.
Heavy Lasso: sparse penalized regression under heavy-tailed noise via data-augmented soft-thresholding
Methodology
Makes computer models better with messy data.
An Easily Tunable Approach to Robust and Sparse High-Dimensional Linear Regression
Statistics Theory
Finds hidden patterns even with messy data.