EM Approaches to Nonparametric Estimation for Mixture of Linear Regressions
By: Andrew Welbaum, Wanli Qiao
Potential Business Impact:
Finds hidden groups in data by separating observations that follow different linear trends.
In a mixture of linear regressions model, the regression coefficients are treated as random vectors that may follow either a continuous or a discrete distribution. We propose two Expectation-Maximization (EM) algorithms to estimate this prior distribution. The first algorithm solves a kernelized version of the nonparametric maximum likelihood estimation (NPMLE) problem. This method not only recovers continuous prior distributions but also accurately estimates the number of clusters when the prior is discrete. The second algorithm, designed to approximate the NPMLE, targets prior distributions that admit a density. It also performs well for discrete priors when combined with a post-processing step. We study the convergence properties of both algorithms and demonstrate their effectiveness through simulations and applications to real datasets.
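To make the setting concrete, the sketch below implements the basic EM idea behind the NPMLE for a mixture of linear regressions: the prior on the coefficient vector is restricted to a fixed grid of candidate values, and EM updates the mixing weights over that grid. This is a minimal illustration under assumed simplifications (fixed grid, known noise scale `sigma`, Gaussian errors), not the paper's kernelized or density-targeting algorithms; the function name `npmle_em` and all parameters are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def npmle_em(X, y, grid, sigma=1.0, n_iter=200):
    """Fixed-grid EM for the NPMLE mixing weights in a mixture of
    linear regressions: y_i = x_i' beta + eps, with beta drawn from a
    prior supported on the rows of `grid`. Illustrative sketch only."""
    n, m = X.shape[0], grid.shape[0]
    # Likelihood of each observation under each candidate coefficient vector.
    resid = y[:, None] - X @ grid.T              # (n, m) residual matrix
    lik = norm.pdf(resid, scale=sigma)           # (n, m) Gaussian likelihoods
    w = np.full(m, 1.0 / m)                      # uniform initial weights
    for _ in range(n_iter):
        # E-step: posterior responsibility of grid point j for observation i.
        r = lik * w
        r /= r.sum(axis=1, keepdims=True)
        # M-step: set each mixing weight to its average responsibility.
        w = r.mean(axis=0)
    return w

# Toy usage: two regression lines through the origin, slopes -1 and 2.
rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=(n, 1))
true_slopes = rng.choice([-1.0, 2.0], size=n)
y = true_slopes * X[:, 0] + 0.3 * rng.normal(size=n)
grid = np.linspace(-3, 3, 61)[:, None]           # candidate slope values
w = npmle_em(X, y, grid, sigma=0.3)
print(grid[w > 0.05].ravel())                    # weights should concentrate near -1 and 2
```

In this toy run the estimated weights concentrate near the two true slopes, which mirrors the abstract's point that the NPMLE can recover the number of clusters when the prior is discrete; a fine grid with a continuous prior would instead spread weight across many nearby grid points.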
Similar Papers
Characterizing Evolution in Expectation-Maximization Estimates for Overspecified Mixed Linear Regression
Machine Learning (CS)
Helps computer models learn from messy data faster.
Some Simplifications for the Expectation-Maximization (EM) Algorithm: The Linear Regression Model Case
Methodology
Fills in missing data to make predictions.
Convergence and Optimality of the EM Algorithm Under Multi-Component Gaussian Mixture Models
Statistics Theory
Helps computers find hidden patterns in messy data.