Empirical Likelihood for Random Forests and Ensembles

Published: November 17, 2025 | arXiv ID: 2511.13934v1

By: Harold D. Chiang, Yukitoshi Matsushita, Taisuke Otsu

Potential Business Impact:

Measures how much confidence to place in predictions from ensemble machine-learning models.

Business Areas:
A/B Testing, Data and Analytics

We develop an empirical likelihood (EL) framework for random forests and related ensemble methods, providing a likelihood-based approach to quantify their statistical uncertainty. Exploiting the incomplete $U$-statistic structure inherent in ensemble predictions, we construct an EL statistic that is asymptotically chi-squared when subsampling induced by incompleteness is not overly sparse. Under sparser subsampling regimes, the EL statistic tends to over-cover due to loss of pivotality; we therefore propose a modified EL that restores pivotality through a simple adjustment. Our method retains key properties of EL while remaining computationally efficient. Theory for honest random forests and simulations demonstrate that modified EL achieves accurate coverage and practical reliability relative to existing inference methods.
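
For readers new to empirical likelihood, the sketch below computes the classical EL ratio statistic for a simple mean from i.i.d. data. It is an illustration of the general idea only, not the authors' incomplete $U$-statistic construction or their modified EL. The function name `el_log_ratio`, its parameters, and the sample data are hypothetical.

```python
import math


def el_log_ratio(x, mu, tol=1e-12, max_iter=200):
    """Classical empirical likelihood statistic -2 log R(mu) for a mean.

    Illustrative sketch for i.i.d. data; not the paper's method.
    Returns math.inf when mu lies outside the range of the data.
    """
    z = [xi - mu for xi in x]
    z_min, z_max = min(z), max(z)
    if not (z_min < 0.0 < z_max):
        return math.inf  # mu is not inside the convex hull of the data

    # The Lagrange multiplier lam must keep every weight positive,
    # i.e. 1 + lam * z_i > 0, which confines lam to this open interval.
    lo, hi = -1.0 / z_max, -1.0 / z_min
    eps = 1e-10 * (hi - lo)
    lo, hi = lo + eps, hi - eps

    def score(lam):
        # Strictly decreasing in lam; its root gives the EL weights.
        return sum(zi / (1.0 + lam * zi) for zi in z)

    # Bisection on the monotone score function.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break

    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * zi) for zi in z)


# Hypothetical sample; in the classical i.i.d. setting the statistic is
# compared with a chi-squared(1) quantile (about 3.84 at the 95% level).
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
stat_at_mean = el_log_ratio(sample, 3.0)
stat_off_mean = el_log_ratio(sample, 2.5)
```

A confidence region is then the set of candidate values `mu` whose statistic falls below the chosen chi-squared quantile. The abstract's point is that for ensemble predictions this chi-squared calibration can fail under sparse subsampling, which is what the modified EL is designed to correct.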

Country of Origin
🇬🇧 🇯🇵 🇺🇸 Japan, United States, United Kingdom

Page Count
34 pages

Category
Statistics:
Machine Learning (Stat)