Sequential Bootstrap for Out-of-Bag Error Estimation: A Simulation-Based Replication and Stability-Oriented Refinement

Published: November 22, 2025 | arXiv ID: 2511.18065v1

By: Cheng Peng

Potential Business Impact:

Makes error estimates for machine learning models more stable and reliable.

Business Areas:
A/B Testing, Data and Analytics

Bootstrap resampling is the foundation of many ensemble learning methods, and out-of-bag (OOB) error estimation is the most widely used internal measure of generalization performance. In the standard multinomial bootstrap, the number of distinct observations in each resample is random. Although this source of variability exists, it has rarely been studied in isolation to understand how much it affects OOB-based quantities. To address this gap, we investigate Sequential Bootstrap, a resampling method that forces every bootstrap replicate to contain the same number of distinct observations, and treat it as a controlled modification of the classical bootstrap within the OOB framework. We reproduce Breiman's five original OOB experiments on both synthetic and real-world datasets, repeating all analyses across many different random seeds. Our results show that switching from the classical bootstrap to Sequential Bootstrap leaves accuracy-related metrics essentially unchanged, but yields measurable and data-dependent reductions in several variance-related measures. Therefore, Sequential Bootstrap should not be viewed as a new method for improving predictive performance, but rather as a tool for understanding how randomness in the number of distinct samples contributes to the variance of OOB estimators. This work provides a reproducible setting for studying the statistical properties of resampling-based ensemble estimators and offers empirical evidence that may support future theoretical work on variance decomposition in bootstrap-based systems.
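The core mechanism is easiest to see side by side: the classical multinomial bootstrap draws n indices with replacement, so the number of distinct observations per replicate is random, whereas the sequential bootstrap keeps drawing until a fixed number of distinct observations has been reached. The sketch below is a minimal illustration, not the paper's implementation; it assumes a Python/scikit-learn setup with decision trees as base learners and a distinct-count target of ceil((1 - 1/e) * n), the expected number of distinct points under the classical bootstrap.

```python
# Minimal sketch: classical vs. sequential bootstrap inside an OOB error loop.
# The distinct-count target and the choice of decision trees are assumptions
# made for illustration only.
import numpy as np
from math import ceil, e
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification

rng = np.random.default_rng(0)

def classical_bootstrap(n):
    """Multinomial bootstrap: the number of distinct indices is random."""
    return rng.integers(0, n, size=n)

def sequential_bootstrap(n, target_distinct):
    """Draw with replacement until the replicate contains `target_distinct`
    distinct indices, so every replicate has the same number of distinct points."""
    idx, seen = [], set()
    while len(seen) < target_distinct:
        i = int(rng.integers(0, n))
        idx.append(i)
        seen.add(i)
    return np.array(idx)

def oob_error(X, y, sampler, n_trees=200):
    """Out-of-bag error: each point is voted on only by trees that never saw it."""
    n = len(y)
    votes = np.zeros((n, len(np.unique(y))))
    for _ in range(n_trees):
        idx = sampler(n)
        oob = np.setdiff1d(np.arange(n), idx)       # points left out of this replicate
        tree = DecisionTreeClassifier().fit(X[idx], y[idx])
        votes[oob, tree.predict(X[oob]).astype(int)] += 1
    pred = votes.argmax(axis=1)
    scored = votes.sum(axis=1) > 0                  # points that were OOB at least once
    return np.mean(pred[scored] != y[scored])

X, y = make_classification(n_samples=300, random_state=0)
target = ceil((1 - 1 / e) * len(y))                 # expected distinct count under the classical bootstrap
print("OOB error, classical: ", oob_error(X, y, classical_bootstrap))
print("OOB error, sequential:", oob_error(X, y, lambda n: sequential_bootstrap(n, target)))
```

Because the sequential sampler fixes the distinct count, repeating this experiment over many seeds is how one would probe whether the variance of the OOB estimate, rather than its mean, changes; a single run like the one above only shows that the two samplers plug into the same OOB machinery.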

Page Count
25 pages

Category
Statistics: Methodology