CEHR-XGPT: A Scalable Multi-Task Foundation Model for Electronic Health Records
By: Chao Pang, Jiheum Park, Xinzhuo Jiang and more
Potential Business Impact:
Helps doctors predict patient health using past records.
Electronic Health Records (EHRs) provide a rich, longitudinal view of patient health and hold significant potential for advancing clinical decision support, risk prediction, and data-driven healthcare research. However, most artificial intelligence (AI) models for EHRs are designed for narrow, single-purpose tasks, limiting their generalizability and utility in real-world settings. Here, we present CEHR-XGPT, a general-purpose foundation model for EHR data that unifies three essential capabilities - feature representation, zero-shot prediction, and synthetic data generation - within a single architecture. To support temporal reasoning over clinical sequences, CEHR-XGPT incorporates a novel time-token-based learning framework that explicitly encodes patients' dynamic timelines into the model structure. CEHR-XGPT demonstrates strong performance across all three tasks and generalizes effectively to external datasets through vocabulary expansion and fine-tuning. Its versatility enables rapid model development, cohort discovery, and patient outcome forecasting without the need for task-specific retraining.
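The abstract does not spell out how time tokens are constructed, so the following is only a minimal sketch of the general idea: placing explicit tokens between visits so that a sequence model sees elapsed time as part of its input. Everything in it is an assumption made for illustration, not the authors' scheme: the function name add_time_tokens, the "[VS]"/"[VE]" visit markers, the "D<n>" day-gap tokens, the "LT" long-gap token, and the 365-day cutoff.

```python
from datetime import date

def add_time_tokens(visits, long_gap_days=365):
    """Flatten dated visits into one token sequence with time-gap tokens.

    visits: list of (visit_date, [concept_code, ...]) sorted by date.
    All token names here are hypothetical, chosen for illustration only.
    """
    tokens = []
    prev_date = None
    for visit_date, codes in visits:
        if prev_date is not None:
            gap = (visit_date - prev_date).days
            # Encode the gap since the previous visit as its own token;
            # very long gaps collapse into a single "LT" token.
            tokens.append("LT" if gap > long_gap_days else f"D{gap}")
        tokens.append("[VS]")   # visit start marker
        tokens.extend(codes)    # clinical concepts recorded in the visit
        tokens.append("[VE]")   # visit end marker
        prev_date = visit_date
    return tokens

# Hypothetical patient with three visits and made-up concept codes.
example_visits = [
    (date(2020, 1, 1), ["DX_A"]),
    (date(2020, 1, 11), ["DX_B", "RX_C"]),
    (date(2022, 1, 11), ["DX_D"]),
]
example_tokens = add_time_tokens(example_visits)
```

In this sketch the second visit is preceded by a "D10" token and the third by "LT", so the gaps between visits become ordinary vocabulary items that a model can learn from and, in a generative setting, predict.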
Similar Papers
Generative Foundation Model for Structured and Unstructured Electronic Health Records
Artificial Intelligence
Helps doctors predict sickness and write notes faster.
Zero-shot Medical Event Prediction Using a Generative Pre-trained Transformer on Electronic Health Records
Machine Learning (CS)
Predicts future patient health events without extra training.
ProtoEHR: Hierarchical Prototype Learning for EHR-based Healthcare Predictions
Machine Learning (CS)
Helps doctors predict patient health better.