Estimating a regression function under possibly heteroscedastic and heavy-tailed errors. Application to shape-restricted regression
By: Yannick Baraud, Guillaume Maillard
Potential Business Impact:
Enables reliable regression predictions from noisy real-world data whose errors may be heteroscedastic or heavy-tailed, settings in which standard least-squares fitting can fail.
We consider a regression framework in which the design points are deterministic and the errors are possibly non-i.i.d. and heavy-tailed (admitting a moment of order $p$ in $[1,2]$). Given a class of candidate regression functions, we propose a surrogate for the classical least squares estimator (LSE). For this new estimator, we establish a nonasymptotic risk bound with respect to the absolute loss that takes the form of an oracle-type inequality. This inequality shows that our estimator possesses natural adaptation properties with respect to some elements of the class. When the class consists of monotone functions or of convex functions on an interval, these adaptation properties are similar to those established in the literature for the LSE. However, unlike the LSE, our estimator is proven to remain stable under a possible heteroscedasticity of the errors and can even converge at a parametric rate (up to a logarithmic factor) in situations where the LSE is not consistent. We illustrate the performance of this new estimator over classes of regression functions satisfying a shape constraint: piecewise monotone and piecewise convex/concave functions, among other examples. The paper also contains approximation results by splines with degrees in $\{0,1\}$ and VC bounds for the dimensions of classes of level sets, which may be of independent interest.
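To make the contrast concrete, the sketch below compares the classical least-squares isotonic fit with a simple truncation-based robustification under heavy-tailed noise. This is only an illustration of the phenomenon the abstract describes, not the estimator constructed in the paper: the PAVA routine, the truncation level tau, and the Student-t noise with 1.5 degrees of freedom (so that only moments of order $p < 1.5$ exist, matching the setting $p \in [1,2]$) are all choices made here for the example.

```python
# Illustrative sketch only: NOT the paper's estimator, but a simple
# truncation-based robustification of least-squares isotonic regression,
# used to contrast behaviour under heavy-tailed noise.
import numpy as np

def pava(y):
    """Least-squares isotonic (nondecreasing) fit via pool-adjacent-violators."""
    sums, counts = [], []
    for v in np.asarray(y, dtype=float):
        sums.append(v); counts.append(1)
        # merge backwards while adjacent block means violate monotonicity
        while len(sums) > 1 and sums[-2] / counts[-2] > sums[-1] / counts[-1]:
            sums[-2] += sums[-1]; counts[-2] += counts[-1]
            sums.pop(); counts.pop()
    return np.concatenate([np.full(c, s / c) for s, c in zip(sums, counts)])

rng = np.random.default_rng(0)
n = 500
x = np.linspace(0.0, 1.0, n)
f = x ** 2                               # true monotone regression function
noise = rng.standard_t(df=1.5, size=n)   # heavy tails: moments only for p < 1.5
y = f + noise

lse = pava(y)                            # classical least-squares isotonic fit
tau = 5.0                                # hypothetical truncation level
robust = pava(np.clip(y, -tau, tau))     # truncate responses, then fit

print("mean absolute error, LSE    :", np.mean(np.abs(lse - f)))
print("mean absolute error, robust :", np.mean(np.abs(robust - f)))
```

Truncating the responses before fitting is a standard robustification device used here purely for illustration; the construction analyzed in the paper is different and comes with considerably more general guarantees.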
Similar Papers
Understanding Robust Machine Learning for Nonparametric Regression with Heavy-Tailed Noise
Machine Learning (CS)
Studies how robust learning methods behave in nonparametric regression when the noise is heavy-tailed.
Heavy Lasso: sparse penalized regression under heavy-tailed noise via data-augmented soft-thresholding
Methodology
Proposes a sparse penalized regression method that tolerates heavy-tailed noise via data-augmented soft-thresholding.
Heteroscedastic Growth Curve Modeling with Shape-Restricted Splines
Methodology
Models growth curves with shape-restricted splines while accounting for heteroscedastic errors.