Some Simplifications for the Expectation-Maximization (EM) Algorithm: The Linear Regression Model Case

Published: September 23, 2025 | arXiv ID: 2509.19461v1

By: Daniel A. Griffith

Potential Business Impact:

Fills in missing data to make predictions.

Business Areas:

Predictive Analytics, Artificial Intelligence, Data and Analytics, Software

Technical Abstract

The EM algorithm is a generic tool that offers maximum likelihood solutions when datasets are incomplete with data values missing at random or completely at random. At least for its simplest form, the algorithm can be rewritten in terms of an ANCOVA regression specification. This formulation allows several analytical results to be derived that permit the EM algorithm solution to be expressed in terms of new observation predictions and their variances. Implementations can be made with a linear regression or a nonlinear regression model routine, allowing missing value imputations, even when they must satisfy constraints. Fourteen example datasets gleaned from the EM algorithm literature are reanalyzed. Imputation results have been verified with SAS PROC MI. Six theorems are proved that broadly contextualize imputation findings in terms of the theory, methodology, and practice of statistical science.
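The equivalence the abstract describes can be illustrated with a minimal sketch. It is not taken from the paper. It assumes the simplest setting, where only the response has missing values and every predictor is fully observed, and it uses the classic dummy-variable device: set each missing response to zero and add one indicator column per missing observation. The function names `em_impute_response` and `ancova_impute_response` are hypothetical, and the data are simulated purely for illustration.

```python
import numpy as np


def em_impute_response(X, y, missing, n_iter=500, tol=1e-10):
    """EM-style loop: refit OLS, then replace missing responses by fitted values."""
    # Start the missing responses at the mean of the observed ones.
    y_work = np.where(missing, y[~missing].mean(), y).astype(float)
    beta = None
    for _ in range(n_iter):
        beta, *_ = np.linalg.lstsq(X, y_work, rcond=None)  # refit on completed data
        new_vals = X[missing] @ beta                        # updated imputations
        change = np.max(np.abs(new_vals - y_work[missing]))
        y_work[missing] = new_vals
        if change < tol:
            break
    return y_work[missing], beta


def ancova_impute_response(X, y, missing):
    """One-shot fit: zero-fill missing responses, add one indicator per missing row."""
    idx = np.flatnonzero(missing)
    y0 = np.where(missing, 0.0, y)
    D = np.zeros((len(y), len(idx)))
    D[idx, np.arange(len(idx))] = 1.0          # indicator column for each missing row
    coef, *_ = np.linalg.lstsq(np.hstack([X, D]), y0, rcond=None)
    p = X.shape[1]
    # Each indicator coefficient is the negative of that row's imputed value.
    return -coef[p:], coef[:p]


# Simulated data (an assumption for illustration, not one of the paper's datasets).
rng = np.random.default_rng(42)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=n)
missing = np.zeros(n, dtype=bool)
missing[[2, 9, 17, 25]] = True
y_obs = y.copy()
y_obs[missing] = np.nan

em_vals, em_beta = em_impute_response(X, y_obs, missing)
an_vals, an_beta = ancova_impute_response(X, y_obs, missing)
print("EM imputations:    ", em_vals)
print("ANCOVA imputations:", an_vals)
```

In this setting both routines return the predictions of the regression fitted to the complete cases, which is why a single linear regression call can stand in for the iterative loop. Prediction variances and constrained imputations, which the abstract also mentions, are outside the scope of this sketch.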

EM Approaches to Nonparametric Estimation for Mixture of Linear Regressions

Methodology

Finds hidden groups in data.

16 Oct 2025

88%

Characterizing Evolution in Expectation-Maximization Estimates for Overspecified Mixed Linear Regression

Machine Learning (CS)

Helps computer models learn from messy data faster.

13 Aug 2025

88%

Maximum Likelihood for Logistic Regression Model with Incomplete and Hybrid-Type Covariates

Methodology

Fixes computer math when some numbers are missing.

3 Jun 2025

Country of Origin

🇺🇸 United States

Page Count

23 pages