
A variational Bayes latent class approach for EHR-based patient phenotyping in R

Published: December 16, 2025 | arXiv ID: 2512.14272v1

By: Brian Buckley, Adrian O'Hagan, Marie Galligan

The VBphenoR package for R provides a closed-form variational Bayes approach to patient phenotyping using Electronic Health Records (EHR) data. We implement a variational Bayes Gaussian Mixture Model (GMM) algorithm using closed-form coordinate ascent variational inference (CAVI) to determine the patient phenotype latent class. We then implement a variational Bayes logistic regression, which we use to determine the probability of the phenotype in the supplied EHR cohort, to estimate the shift in biomarkers for patients with the phenotype of interest versus a healthy population, and to evaluate the predictive performance of binary indicator clinical codes and medication codes. The likelihood of the logistic model uses the latent class from the GMM step to inform the conditional.
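As an illustration of what closed-form CAVI means for a GMM, the standard textbook updates (Bishop, 2006, Section 10.2) are sketched below. This assumes a Dirichlet prior on the mixing weights and Gaussian-Wishart priors on each component's mean and precision. The symbols $\alpha_0, \beta_0, m_0, \nu_0, W_0$ are the usual prior hyperparameters and are not taken from the abstract; the parameterisation used inside VBphenoR may differ.

```latex
% Responsibilities (update of q(Z)) for observation x_n in R^D and component k
\ln \rho_{nk} = \mathbb{E}[\ln \pi_k] + \tfrac{1}{2}\,\mathbb{E}[\ln |\Lambda_k|]
              - \tfrac{D}{2}\ln(2\pi)
              - \tfrac{1}{2}\,\mathbb{E}\!\left[(x_n - \mu_k)^{\top} \Lambda_k (x_n - \mu_k)\right],
\qquad
r_{nk} = \frac{\rho_{nk}}{\sum_{j=1}^{K} \rho_{nj}}

% Expectations under the current variational posterior (\psi is the digamma function)
\mathbb{E}[\ln \pi_k] = \psi(\alpha_k) - \psi\!\Big(\sum_{j=1}^{K} \alpha_j\Big), \qquad
\mathbb{E}[\ln |\Lambda_k|] = \sum_{i=1}^{D} \psi\!\Big(\frac{\nu_k + 1 - i}{2}\Big) + D \ln 2 + \ln |W_k|

\mathbb{E}\!\left[(x_n - \mu_k)^{\top} \Lambda_k (x_n - \mu_k)\right]
  = \frac{D}{\beta_k} + \nu_k\,(x_n - m_k)^{\top} W_k\,(x_n - m_k)

% Weighted sufficient statistics
N_k = \sum_{n=1}^{N} r_{nk}, \qquad
\bar{x}_k = \frac{1}{N_k} \sum_{n=1}^{N} r_{nk}\, x_n, \qquad
S_k = \frac{1}{N_k} \sum_{n=1}^{N} r_{nk}\,(x_n - \bar{x}_k)(x_n - \bar{x}_k)^{\top}

% Closed-form updates of the variational parameters
\alpha_k = \alpha_0 + N_k, \qquad
\beta_k = \beta_0 + N_k, \qquad
m_k = \frac{\beta_0 m_0 + N_k \bar{x}_k}{\beta_k}, \qquad
\nu_k = \nu_0 + N_k

W_k^{-1} = W_0^{-1} + N_k S_k
         + \frac{\beta_0 N_k}{\beta_0 + N_k}\,(\bar{x}_k - m_0)(\bar{x}_k - m_0)^{\top}
```

Alternating the responsibility update with the parameter updates until the evidence lower bound stops improving gives the coordinate ascent scheme. Under these assumptions, each patient's latent class could then be taken as $\arg\max_k r_{nk}$.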

Category: Statistics (Computation)