Tree-based methods for estimating heterogeneous model performance and model combining
By: Ruotao Zhang, Constantine Gatsonis, Jon Steingrimsson
Potential Business Impact:
Identifies subgroups in which a prediction model performs poorly.
Model performance is frequently reported only for the overall population under consideration. However, due to heterogeneity, overall performance measures often do not accurately represent model performance within specific subgroups. We develop tree-based methods for the data-driven identification of subgroups with differential model performance, where splitting decisions are made to maximize heterogeneity in performance between subgroups. We extend these methods to tree ensembles, including both random forests and gradient boosting. Lastly, we illustrate how these ensembles can be used for model combination. We evaluate the methods through simulations and apply them to lung cancer screening data.
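To make the splitting idea concrete, here is a minimal sketch of a single split search that targets heterogeneity in model performance rather than in the outcome. It is not the authors' implementation. The function name `best_performance_split`, the `min_leaf` parameter, the squared-error (Brier-type) loss, the size-weighted squared difference in mean loss used as the splitting criterion, and the synthetic data are all assumptions chosen for illustration; the paper may use a different criterion and estimator.

```python
import numpy as np

def best_performance_split(X, loss, min_leaf=20):
    """Search all features and cut points for the split that maximizes the
    size-weighted squared difference in mean loss between the two children.
    (Illustrative criterion, assumed here; not taken from the paper.)"""
    n, p = X.shape
    best = (None, None, 0.0)  # (feature index, threshold, score)
    for j in range(p):
        order = np.argsort(X[:, j])
        xs, ls = X[order, j], loss[order]
        csum = np.cumsum(ls)
        total = csum[-1]
        for i in range(min_leaf, n - min_leaf + 1):
            if xs[i - 1] == xs[i]:
                continue  # cannot place a cut between tied values
            n_l, n_r = i, n - i
            mean_l = csum[i - 1] / n_l
            mean_r = (total - csum[i - 1]) / n_r
            score = n_l * n_r / n * (mean_l - mean_r) ** 2
            if score > best[2]:
                best = (j, (xs[i - 1] + xs[i]) / 2, score)
    return best

# Synthetic example: a hypothetical model that predicts well when
# X[:, 0] <= 0.5 and is uninformative otherwise.
rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(size=(n, 3))
y = rng.binomial(1, 0.5, size=n)
noise = rng.normal(scale=0.1, size=n)
pred = np.clip(np.where(X[:, 0] <= 0.5, 0.8 * y + 0.1, 0.5) + noise, 0, 1)

# Per-observation squared error; its subgroup mean is the Brier score.
loss = (y - pred) ** 2

feature, threshold, score = best_performance_split(X, loss)
print(feature, round(threshold, 3))
```

In this sketch the search should recover a split on the first covariate near 0.5, where the simulated model's accuracy changes. Applying the same search recursively within each child would grow a tree, and averaging many such trees fit on bootstrap samples would give a forest-style ensemble of the kind the abstract mentions.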