Bayesian nonparametric modeling of mixed-type bounded data
By: Rufeng Liu, Claudia Wehrhahn, Andrés F. Barrientos and more
Potential Business Impact:
Helps understand mixed health data better.
We propose a Bayesian nonparametric model for mixed-type bounded data, where some variables are compositional and others are interval-bounded. Compositional variables are non-negative and sum to a given constant, such as the proportion of time an individual spends on different activities during the day or the fraction of different types of nutrients in a person's diet. Interval-bounded variables, on the other hand, are real numbers constrained by both a lower and an upper bound. Our approach relies on a novel class of random multivariate Bernstein polynomials, which induce a Dirichlet process mixture model of products of Dirichlet and beta densities. We study the theoretical properties of the model, including its topological support and posterior consistency. The model can be used for density and conditional density estimation, where both the response and predictors take values in the simplex space and/or hypercube. We illustrate the model's behavior through the analysis of simulated data and data from the 2005-2006 cycle of the U.S. National Health and Nutrition Examination Survey.
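To make the phrase "Dirichlet process mixture model of products of Dirichlet and beta densities" concrete, the following is a minimal sketch that evaluates such a mixture at one point. It is an illustration only, not the authors' construction: the truncation level, the concentration parameter, the uniform draws for the kernel parameters, and all function names are assumptions introduced here, and the interval-bounded variable is assumed to have been rescaled to (0, 1).

```python
import math
import random


def dirichlet_logpdf(x, alpha):
    """Log density of a Dirichlet(alpha) at a point x on the simplex."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))


def beta_logpdf(y, a, b):
    """Log density of a Beta(a, b) at y in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y)


def stick_breaking_weights(concentration, n_atoms, rng):
    """Truncated stick-breaking weights; the last atom takes the leftover mass."""
    weights, remaining = [], 1.0
    for _ in range(n_atoms - 1):
        v = rng.betavariate(1.0, concentration)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


def mixture_density(x, ys, weights, atoms):
    """Mixture of products: one Dirichlet kernel for x, one beta kernel per y."""
    total = 0.0
    for w, (alpha, beta_params) in zip(weights, atoms):
        log_kernel = dirichlet_logpdf(x, alpha)
        for y, (a, b) in zip(ys, beta_params):
            log_kernel += beta_logpdf(y, a, b)
        total += w * math.exp(log_kernel)
    return total


# Hypothetical setup: 5 mixture atoms, a 3-part composition, one bounded variable.
rng = random.Random(0)
n_atoms = 5
weights = stick_breaking_weights(1.0, n_atoms, rng)
atoms = [
    (
        [rng.uniform(1.0, 6.0) for _ in range(3)],        # Dirichlet parameters (assumed)
        [(rng.uniform(1.0, 6.0), rng.uniform(1.0, 6.0))],  # beta parameters (assumed)
    )
    for _ in range(n_atoms)
]
density = mixture_density([0.2, 0.3, 0.5], [0.4], weights, atoms)
```

Here the composition `[0.2, 0.3, 0.5]` plays the role of the compositional variables and `0.4` the role of a single interval-bounded variable. In the model described above, the mixture weights and kernel parameters would carry a prior and be inferred from data rather than drawn once as in this sketch.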
Similar Papers
Density estimation for compositional data using nonparametric mixtures
Methodology
Helps computers understand data with zero values.
From Partial Exchangeability to Predictive Probability: A Bayesian Perspective on Classification
Methodology
Helps computers guess better with less data.
Statistical Modeling of Combinatorial Response Data
Methodology
Helps understand complex choices with many rules.