Lassoed Forests: Random Forests with Adaptive Lasso Post-selection
By: Jing Shang, James Bannon, Benjamin Haibe-Kains, et al.
Potential Business Impact:
Improves prediction accuracy by adaptively combining two modeling methods.
Random forests are a statistical learning technique that uses bootstrap aggregation to average high-variance, low-bias trees. Improvements to random forests, such as applying Lasso regression to the tree predictions, have been proposed to reduce model bias. However, these changes can sometimes degrade performance (e.g., increase the mean squared error). In this paper, we show theoretically that the relative performance of these two methods, standard and Lasso-weighted random forests, depends on the signal-to-noise ratio. We further propose a unified framework that combines random forests and Lasso selection through adaptive weighting, and we show mathematically that it can strictly outperform both of the other methods. We compare the three methods through simulation, including bias-variance decomposition, evaluation of error estimates, and variable importance analysis, and we demonstrate the versatility of our method through applications to a variety of real-world datasets.
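To make the post-selection idea concrete, the sketch below fits a random forest, treats the per-tree predictions as features, and reweights them with an adaptive Lasso. This is a minimal illustration under stated assumptions, not the paper's exact algorithm: the adaptive penalty is approximated by rescaling each tree's predictions with coefficients from a pilot ridge regression, and the dataset, hyperparameters, and names such as adaptive_weights are illustrative.

# Sketch: Lasso post-selection on random-forest tree predictions
# (scikit-learn; the adaptive step via a ridge pilot fit is an assumption).
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LassoCV, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_friedman1(n_samples=600, noise=1.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Per-tree predictions form the design matrix for the post-selection step.
P_tr = np.column_stack([t.predict(X_tr) for t in rf.estimators_])
P_te = np.column_stack([t.predict(X_te) for t in rf.estimators_])

# Adaptive Lasso via feature rescaling: trees with small pilot coefficients
# are penalized more heavily (illustrative choice, not the authors' scheme).
pilot = Ridge(alpha=1.0).fit(P_tr, y_tr)
adaptive_weights = np.abs(pilot.coef_) + 1e-6
lasso = LassoCV(cv=5, random_state=0).fit(P_tr * adaptive_weights, y_tr)

pred_rf = rf.predict(X_te)                            # standard RF average
pred_lasso = lasso.predict(P_te * adaptive_weights)   # Lasso-weighted trees

print("RF MSE:   ", mean_squared_error(y_te, pred_rf))
print("Lasso MSE:", mean_squared_error(y_te, pred_lasso))

In this toy setup the uniform forest average and the Lasso-reweighted combination can each win depending on the noise level, which mirrors the signal-to-noise dependence the abstract describes.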
Similar Papers
Adaptive Forests For Classification
Machine Learning (CS)
Makes computer predictions smarter by changing how it learns.
Asymptotic Consistency and Generalization in Hybrid Models of Regularized Selection and Nonlinear Learning
Other Statistics
Finds best clues in messy data for smart choices.