An Infinite BART model
By: Marco Battiston, Yu Luo
Potential Business Impact:
Lets the model pick how many decision trees to use from the data, and lets different groups of observations use different subsets of trees.
Bayesian additive regression trees (BART) are popular Bayesian ensemble models used in regression and classification analysis. Under this modeling framework, the regression function is approximated by an ensemble of decision trees, interpreted as weak learners that capture different features of the data. In this work, we propose a generalization of the BART model with two main features: first, it automatically selects the number of decision trees from the given data; second, it allows clusters of observations to have different regression functions, since each data point uses only a selection of weak learners instead of all of them. This generalization is accomplished by including a binary weight matrix in the conditional distribution of the response variable, which activates only a specific subset of decision trees for each observation. This matrix is endowed with an Indian buffet process prior and is sampled within the MCMC sampler together with the other BART parameters. We then compare the Infinite BART model with the classic one on simulated and real datasets. Specifically, we provide examples illustrating variable importance, partial dependence, and causal estimation.
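The abstract's central construction can be sketched numerically: draw a binary matrix Z from the Indian buffet process (whose number of columns, i.e. active trees, is determined by the data-size parameter and concentration alpha rather than fixed in advance), then form each observation's regression value by summing only its active weak learners. This is a minimal illustrative sketch, not the authors' implementation; the tree outputs here are random stand-ins for fitted decision trees, and the names (`sample_ibp`, `tree_out`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_ibp(n, alpha, rng):
    """Sample a binary matrix Z (n rows, random K columns) from the
    Indian buffet process via its sequential "restaurant" scheme.

    Row i reuses an existing column k with probability m_k / i
    (m_k = how many earlier rows use it), then opens
    Poisson(alpha / i) brand-new columns of its own.
    """
    Z = np.zeros((0, 0), dtype=int)
    for i in range(1, n + 1):
        m = Z.sum(axis=0)                               # usage count per tree
        old = (rng.random(Z.shape[1]) < m / i).astype(int)
        k_new = rng.poisson(alpha / i)                  # new trees for row i
        row = np.concatenate([old, np.ones(k_new, dtype=int)])
        Z = np.pad(Z, ((0, 0), (0, k_new)))             # zero-fill new columns
        Z = np.vstack([Z, row])
    return Z

n, alpha = 10, 2.0
Z = sample_ibp(n, alpha, rng)
K = Z.shape[1]                                          # number of trees is random

# Stand-in weak-learner outputs: tree_out[i, k] plays the role of g_k(x_i).
tree_out = rng.normal(size=(n, K))

# Each observation sums only its active trees, so different rows of Z
# yield different regression functions for different clusters of data.
f = (Z * tree_out).sum(axis=1)
```

In the full model, Z would be resampled inside the MCMC sweep jointly with the tree structures and leaf parameters; this sketch only shows the prior draw and the masked ensemble sum.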
Similar Papers
GS-BART: Bayesian Additive Regression Trees with Graph-split Decision Rules
Methodology
Helps computers learn from connected data better.
Bayesian Additive Regression Trees for functional ANOVA model
Machine Learning (Stat)
Shows how different things affect results.