Robust Variable Selection in High-dimensional Nonparametric Additive Model
By: Suneel Babu Chatla, Abhijit Mandal
Potential Business Impact:
Finds the variables that truly matter even when the data contain outliers.
Additive models belong to the class of structured nonparametric regression models that do not suffer from the curse of dimensionality. Identifying the nonzero additive components when the true model is assumed to be sparse is an important and well-studied problem. Most existing methods rely on the $L_2$ loss function, which is sensitive to outliers in the data. We propose a new variable selection method for additive models that is robust to outliers. The proposed method employs a nonconcave penalty for variable selection and combines the B-spline framework with the density power divergence loss function for estimation. This loss function produces an M-estimator that downweights the effect of outliers. Our asymptotic results are derived under the sub-Weibull assumption, which allows the error distribution to have an exponentially heavy tail. Under regularity conditions, we show that the proposed method achieves the optimal convergence rate. In addition, our results include the convergence rates for sub-Gaussian and sub-exponential distributions as special cases. We numerically validate our theoretical findings using simulations and real data analysis.
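To make the estimation recipe concrete, here is a minimal sketch (not the authors' implementation) of the density power divergence (DPD) loss of Basu et al. (1998) under a Gaussian working model, fitted over a B-spline basis for a single smooth component, together with the SCAD penalty of Fan and Li (2001) as one canonical nonconcave penalty. The function names, the choice alpha = 0.5, the knot placement, and the use of an off-the-shelf unpenalized optimizer are all illustrative assumptions; the paper's full procedure penalizes every component's spline coefficients jointly and tunes all parameters, which is omitted here.

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import minimize

def bspline_basis(x, n_basis=8, degree=3):
    """B-spline design matrix with equally spaced interior knots."""
    n_interior = n_basis - degree - 1
    interior = np.linspace(x.min(), x.max(), n_interior + 2)[1:-1]
    t = np.r_[[x.min()] * (degree + 1), interior, [x.max()] * (degree + 1)]
    B = np.empty((x.size, n_basis))
    for j in range(n_basis):
        coef = np.zeros(n_basis)
        coef[j] = 1.0
        B[:, j] = BSpline(t, coef, degree, extrapolate=False)(x)
    return np.nan_to_num(B)  # guard the right boundary point

def dpd_loss(params, B, y, alpha=0.5):
    """Empirical density power divergence (Basu et al., 1998) for a
    Gaussian working model; alpha -> 0 recovers maximum likelihood."""
    beta, sigma = params[:-1], np.exp(params[-1])
    r = y - B @ beta
    c = (2.0 * np.pi * sigma**2) ** (-alpha / 2.0)
    # Large residuals make exp(...) vanish, so each observation's
    # contribution is bounded: outliers are automatically downweighted.
    terms = c * ((1.0 + alpha) ** -0.5
                 - (1.0 + 1.0 / alpha) * np.exp(-alpha * r**2 / (2.0 * sigma**2)))
    return terms.mean()

def scad_penalty(b, lam, a=3.7):
    """SCAD penalty (Fan & Li, 2001), one canonical nonconcave penalty."""
    ab = np.abs(np.atleast_1d(b))
    return np.where(
        ab <= lam, lam * ab,
        np.where(ab <= a * lam,
                 (2.0 * a * lam * ab - ab**2 - lam**2) / (2.0 * (a - 1.0)),
                 lam**2 * (a + 1.0) / 2.0)).sum()

# Demo: one smooth component with 20 gross outliers injected.
rng = np.random.default_rng(0)
n = 300
x = rng.uniform(0.0, 1.0, n)
y = np.sin(2.0 * np.pi * x) + 0.2 * rng.standard_normal(n)
y[rng.choice(n, 20, replace=False)] += 5.0

B = bspline_basis(x)
beta_ls = np.linalg.lstsq(B, y, rcond=None)[0]   # non-robust L2 fit
start = np.r_[beta_ls, np.log(y.std())]
fit = minimize(dpd_loss, start, args=(B, y, 0.5), method="BFGS")

truth = np.sin(2.0 * np.pi * x)
print("mean abs error, L2 fit :", np.abs(B @ beta_ls - truth).mean())
print("mean abs error, DPD fit:", np.abs(B @ fit.x[:-1] - truth).mean())
print("example SCAD value     :", scad_penalty(fit.x[:-1], lam=0.1))
```

On this toy problem the $L_2$ fit chases the injected outliers while the DPD fit does not, which is exactly the M-estimation behavior the abstract describes; larger alpha buys more robustness at some cost in efficiency when the data are clean. In the paper's setting, the analogous objective would couple the DPD loss across all additive components with a nonconcave penalty on each component's coefficients, so that irrelevant components are estimated as exactly zero.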
Similar Papers
Estimation and variable selection in high dimension in nonlinear mixed-effects models
Statistics Theory
Finds the important variables in complex mixed-effects models of grouped data.
Gradient-free stochastic optimization for additive models
Machine Learning (Stat)
Speeds up fitting additive models without needing gradient (derivative) information.
Estimation of High-dimensional Nonlinear Vector Autoregressive Models
Statistics Theory
Finds hidden patterns in complex, high-dimensional time series data.