
Some Simplifications for the Expectation-Maximization (EM) Algorithm: The Linear Regression Model Case

Published: September 23, 2025 | arXiv ID: 2509.19461v1

By: Daniel A. Griffith

Potential Business Impact:

Fills in missing data to make predictions.

Business Areas:
Predictive Analytics, Artificial Intelligence, Data and Analytics, Software

The EM algorithm is a generic tool that offers maximum likelihood solutions when datasets are incomplete with data values missing at random or completely at random. At least for its simplest form, the algorithm can be rewritten in terms of an ANCOVA regression specification. This formulation allows several analytical results to be derived that permit the EM algorithm solution to be expressed in terms of new observation predictions and their variances. Implementations can be made with a linear regression or a nonlinear regression model routine, allowing missing value imputations, even when they must satisfy constraints. Fourteen example datasets gleaned from the EM algorithm literature are reanalyzed. Imputation results have been verified with SAS PROC MI. Six theorems are proved that broadly contextualize imputation findings in terms of the theory, methodology, and practice of statistical science.
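The link between EM imputation and an ANCOVA-style regression can be illustrated with a small sketch. It is not the paper's code. It assumes the simplest setting, in which only response values are missing and the model is an ordinary linear regression, and it uses the classical device of setting each missing response to zero and adding one indicator column per missing observation. All function names, variable names, and the simulated data are hypothetical. Under these assumptions, the negated indicator coefficients should match the values that an iterative fill-and-refit (EM-style) loop converges to.

```python
import numpy as np


def em_impute_response(X, y, missing, n_iter=500, tol=1e-12):
    """EM-style loop: fill missing y with fitted values, refit OLS, repeat."""
    y_fill = y.copy()
    y_fill[missing] = y[~missing].mean()  # crude starting values
    for _ in range(n_iter):
        beta, *_ = np.linalg.lstsq(X, y_fill, rcond=None)
        new_vals = X[missing] @ beta
        converged = np.max(np.abs(new_vals - y_fill[missing])) < tol
        y_fill[missing] = new_vals
        if converged:
            break
    return beta, y_fill[missing]


def ancova_impute_response(X, y, missing):
    """One-shot fit: zero out missing y, add an indicator per missing row."""
    idx = np.flatnonzero(missing)
    D = np.zeros((len(y), len(idx)))
    D[idx, np.arange(len(idx))] = 1.0      # one dummy column per missing row
    y0 = np.where(missing, 0.0, y)         # placeholder zeros for missing y
    coef, *_ = np.linalg.lstsq(np.hstack([X, D]), y0, rcond=None)
    p = X.shape[1]
    # Each indicator coefficient absorbs (0 - prediction), so negate it.
    return coef[:p], -coef[p:]


# Simulated data (hypothetical): intercept plus one covariate, 3 missing y.
rng = np.random.default_rng(0)
n = 30
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.5 + 2.0 * x + rng.normal(scale=0.5, size=n)
missing = np.zeros(n, dtype=bool)
missing[[4, 11, 23]] = True
y[missing] = np.nan

beta_em, imputed_em = em_impute_response(X, y, missing)
beta_anc, imputed_anc = ancova_impute_response(X, y, missing)

# Complete-case OLS predictions, for comparison.
beta_cc, *_ = np.linalg.lstsq(X[~missing], y[~missing], rcond=None)
imputed_cc = X[missing] @ beta_cc

print(imputed_em, imputed_anc, imputed_cc)
```

In this sketch the three sets of imputed values should agree up to numerical tolerance, which is the sense in which the iterative solution can be read off as new-observation predictions from a single regression fit. The prediction variances and constrained imputations mentioned in the abstract are not covered here.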

Country of Origin
🇺🇸 United States

Page Count
23 pages

Category
Statistics: Methodology