Score: 0

Random Subset Averaging

Published: December 27, 2025 | arXiv ID: 2512.22472v1

By: Wenhao Cui, Jie Hu

We propose a new ensemble prediction method, Random Subset Averaging (RSA), tailored for settings with many covariates, particularly in the presence of strong correlations. RSA constructs candidate models via binomial random subset strategy and aggregates their predictions through a two-round weighting scheme, resulting in a structure analogous to a two-layer neural network. All tuning parameters are selected via cross-validation, requiring no prior knowledge of covariate relevance. We establish the asymptotic optimality of RSA under general conditions, allowing the first-round weights to be data-dependent, and demonstrate that RSA achieves a lower finite-sample risk bound under orthogonal design. Simulation studies demonstrate that RSA consistently delivers superior and stable predictive performance across a wide range of sample sizes, dimensional settings, sparsity levels and correlation structures, outperforming conventional model selection and ensemble learning methods. An empirical application to financial return forecasting further illustrates its practical utility.

Category
Statistics:
Methodology