Comparing Model-agnostic Feature Selection Methods through Relative Efficiency
By: Chenghui Zheng, Garvesh Raskutti
Potential Business Impact:
Finds best clues for computer predictions.
Feature selection and importance estimation in a model-agnostic setting is an ongoing challenge of significant interest. Wrapper methods are commonly used because they are typically model-agnostic, even though they are computationally intensive. In this paper, we focus on feature selection methods related to the Generalized Covariance Measure (GCM) and Leave-One-Covariate-Out (LOCO) estimation, and provide a comparison based on relative efficiency. In particular, we present a theoretical comparison under three model settings: linear models, non-linear additive models, and single index models that mimic a single-layer neural network. We complement this with extensive simulations and real data examples. Our theoretical results, along with empirical findings, demonstrate that GCM-related methods generally outperform LOCO under suitable regularity conditions. Furthermore, we quantify the asymptotic relative efficiency of these approaches. Our simulations and real data analysis include widely used machine learning methods such as neural networks and gradient boosting trees.
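To make the two compared approaches concrete, here is a minimal, hedged sketch of the ideas behind them, not the paper's exact estimators: a LOCO-style score measures how much prediction error inflates when a covariate is dropped and the model is refit, while a GCM-style statistic is a normalized sample covariance between two residual series. Ordinary least squares is used here as the base learner purely for illustration; the paper's setting is model-agnostic, so any learner (e.g. a neural network or gradient boosting trees) could stand in.

```python
import numpy as np

def _fit_predict(Z, y):
    # Ordinary least squares fit and in-sample prediction (illustrative
    # stand-in for an arbitrary model-agnostic learner).
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return Z @ beta

def loco_importance(X, y):
    # LOCO-style score for feature j: increase in mean squared error
    # when the predictor is refit with covariate j left out.
    n, p = X.shape
    base_err = np.mean((y - _fit_predict(X, y)) ** 2)
    scores = np.empty(p)
    for j in range(p):
        Z = np.delete(X, j, axis=1)
        scores[j] = np.mean((y - _fit_predict(Z, y)) ** 2) - base_err
    return scores

def gcm_statistic(X, y, j):
    # GCM-style statistic for feature j: regress both y and X_j on the
    # remaining covariates, then take the normalized sample covariance
    # of the two residual series; large |T| suggests X_j matters.
    n = X.shape[0]
    Z = np.delete(X, j, axis=1)
    r = y - _fit_predict(Z, y)          # residual of y given X_{-j}
    e = X[:, j] - _fit_predict(Z, X[:, j])  # residual of X_j given X_{-j}
    prod = r * e
    return np.sqrt(n) * np.mean(prod) / np.std(prod)

# Toy example: only feature 0 carries signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=500)
loco_scores = loco_importance(X, y)
gcm_scores = [abs(gcm_statistic(X, y, j)) for j in range(3)]
```

In this toy run both methods single out feature 0; the paper's contribution is quantifying which method does so more efficiently (in the asymptotic relative-efficiency sense) across linear, non-linear additive, and single index models.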
Similar Papers
Cooperative effects in feature importance of individual patterns: application to air pollutants and Alzheimer disease
Machine Learning (CS)
Shows how pollution and green space affect Alzheimer's.
Mathematical Theory of Collinearity Effects on Machine Learning Variable Importance Measures
Statistics Theory
Explains how computers know which data is most important.
Asymptotic Consistency and Generalization in Hybrid Models of Regularized Selection and Nonlinear Learning
Other Statistics
Finds best clues in messy data for smart choices.