Predictor-Informed Bayesian Nonparametric Clustering
By: Md Yasin Ali Parh, Jeremy T. Gaskins
In this project we are interested in clustering observations such that cluster membership is influenced by a set of predictors. To that end, we employ the Bayesian nonparametric Common Atoms Model (CAM), a nested clustering algorithm that uses a (fixed) group membership for each observation to encourage more similar clustering among members of the same group. CAM operates by assuming each group has its own vector of cluster probabilities, which are themselves clustered so that some groups share similar clustering. We extend this approach by treating the group membership as an unknown latent variable determined by a flexible nonparametric function of the covariate vector. Consequently, observations with similar predictor values will fall in the same latent group and are more likely to be clustered together than observations with disparate predictors. We propose a pyramid group model that flexibly partitions the predictor space into these latent group memberships. This pyramid model operates similarly to a Bayesian regression tree process, except that it uses the same splitting rule at all nodes of the same tree depth, which facilitates improved mixing. We outline a block Gibbs sampler to perform posterior inference from our model. Our methodology is demonstrated in simulation and real data examples. In the real data application, we use the RAND Health and Retirement Study to cluster and predict patient outcomes in terms of the number of overnight hospital stays.
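To make the pyramid partition concrete, the following is a minimal sketch of how a tree with one shared splitting rule per depth could map predictor vectors to latent group labels. It is not the authors' implementation: the abstract does not specify the form of the splitting rule, so the axis-aligned thresholds, the function name `pyramid_groups`, and the `(feature, threshold)` rule representation are all assumptions made for illustration. The prior on the rules and the Gibbs updates are omitted entirely.

```python
def pyramid_groups(rows, rules):
    """Map each predictor vector to a latent group label.

    rows:  list of predictor vectors (lists of floats).
    rules: list of (feature_index, threshold) pairs, one per tree depth.
           This representation is hypothetical; every node at depth d
           applies the same rule rules[d].
    """
    labels = []
    for row in rows:
        label = 0
        for feature, threshold in rules:
            # Append one bit per depth: 1 if the observation goes right.
            label = 2 * label + int(row[feature] > threshold)
        labels.append(label)
    return labels


# Toy predictors: two features, five observations (made-up values).
rows = [[0.2, 5.0], [0.9, 5.0], [0.2, 9.0], [0.9, 9.0], [0.25, 5.5]]
# Depth 0 splits on feature 0 at 0.5; depth 1 splits on feature 1 at 7.0.
rules = [(0, 0.5), (1, 7.0)]
labels = pyramid_groups(rows, rules)
print(labels)
```

Because each depth contributes a single rule, a tree of depth D yields at most 2^D latent groups described by only D rules. In this toy example the first and last observations have similar predictor values and receive the same group label.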
Similar Papers
Monitoring Adverse Events Through Bayesian Nonparametric Clustering Across Studies
Methodology
Finds hidden drug dangers faster and safer.
Bayesian Clustering Factor Models
Methodology
Finds hidden groups in data for better care.
Learning Heterogeneous Ordinal Graphical Models via Bayesian Nonparametric Clustering
Methodology
Finds hidden player groups to improve sports strategy.