Semantic NLP Pipelines for Interoperable Patient Digital Twins from Unstructured EHRs
By: Rafael Brens, Yuqiao Meng, Luoxi Tang, and more
Potential Business Impact:
Turns doctor notes into patient digital copies.
Digital twins (virtual replicas of physical entities) are gaining traction in healthcare for personalized monitoring, predictive modeling, and clinical decision support. However, generating interoperable patient digital twins from unstructured electronic health records (EHRs) remains challenging because clinical documentation varies widely and standardized mappings are lacking. This paper presents a semantic NLP-driven pipeline that transforms free-text EHR notes into FHIR-compliant digital twin representations. The pipeline uses named entity recognition (NER) to extract clinical concepts, concept normalization to map entities to SNOMED-CT or ICD-10, and relation extraction to capture structured associations between conditions, medications, and observations. Evaluation on the MIMIC-IV Clinical Database Demo, with validation against MIMIC-IV-on-FHIR reference mappings, demonstrates high F1-scores for entity and relation extraction, along with improved schema completeness and interoperability compared to baseline methods.
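To make the three stages concrete, the following is a minimal Python sketch of an extract, normalize, relate, and serialize flow. It is not the authors' implementation: a dictionary lookup stands in for NER and concept normalization, a "medication for condition" pattern stands in for relation extraction, and the output loosely follows FHIR R4 resource shapes without being validated against any profile. The names `LEXICON`, `extract_entities`, `extract_relations`, and `build_bundle` are hypothetical, the codes are illustrative and should be checked against a terminology service, and using RxNorm for the medication is an assumption (the abstract names only SNOMED-CT and ICD-10).

```python
import json
import re

# Hypothetical mini-lexicon standing in for NER + concept normalization:
# surface form -> (FHIR resource type, code system URI, code, display).
# Codes are illustrative; verify them against a terminology service.
LEXICON = {
    "hypertension": ("Condition", "http://snomed.info/sct",
                     "38341003", "Hypertensive disorder"),
    "type 2 diabetes": ("Condition", "http://snomed.info/sct",
                        "44054006", "Diabetes mellitus type 2"),
    # RxNorm is an assumption here, not something the abstract specifies.
    "metformin": ("MedicationStatement",
                  "http://www.nlm.nih.gov/research/umls/rxnorm",
                  "6809", "metformin"),
}


def extract_entities(note):
    """Dictionary-based stand-in for NER; returns mentions sorted by offset."""
    found = []
    for term, (rtype, system, code, display) in LEXICON.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, note, flags=re.IGNORECASE):
            found.append({"start": m.start(), "end": m.end(), "type": rtype,
                          "system": system, "code": code, "display": display})
    return sorted(found, key=lambda e: e["start"])


def extract_relations(note, entities):
    """Naive rule: '<medication> ... for ... <condition>' in one sentence."""
    meds = [e for e in entities if e["type"] == "MedicationStatement"]
    conds = [e for e in entities if e["type"] == "Condition"]
    pairs = []
    for med in meds:
        for cond in conds:
            if med["end"] > cond["start"]:
                continue  # the condition must follow the medication
            between = note[med["end"]:cond["start"]]
            if "." not in between and re.search(r"\bfor\b", between):
                pairs.append((med["code"], cond["code"]))
    return pairs


def build_bundle(note, patient_id):
    """Assemble a FHIR-style collection Bundle (R4-like, unvalidated)."""
    entities = extract_entities(note)
    resources = {}  # keyed by code, so repeated mentions are deduplicated
    for e in entities:
        if e["code"] in resources:
            continue
        concept = {"coding": [{"system": e["system"], "code": e["code"],
                               "display": e["display"]}]}
        resource = {"resourceType": e["type"],
                    "id": f"{e['type'].lower()}-{e['code']}",
                    "subject": {"reference": f"Patient/{patient_id}"}}
        if e["type"] == "Condition":
            resource["code"] = concept
        else:
            resource["status"] = "active"  # assumed; the note does not say
            resource["medicationCodeableConcept"] = concept
        resources[e["code"]] = resource
    for med_code, cond_code in extract_relations(note, entities):
        ref = {"reference": f"Condition/{resources[cond_code]['id']}"}
        reasons = resources[med_code].setdefault("reasonReference", [])
        if ref not in reasons:
            reasons.append(ref)
    return {"resourceType": "Bundle", "type": "collection",
            "entry": [{"resource": r} for r in resources.values()]}


if __name__ == "__main__":
    note = ("Patient has hypertension and type 2 diabetes. "
            "Started metformin for type 2 diabetes.")
    print(json.dumps(build_bundle(note, "example"), indent=2))
```

In a pipeline like the one the abstract describes, each stand-in would be replaced by a trained component (a clinical NER model, a terminology-backed normalizer, a learned relation classifier), while the final serialization step would keep roughly this shape.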
Similar Papers
Electronic Health Records: Towards Digital Twins in Healthcare
Artificial Intelligence
Helps doctors predict patient health problems early.
A Semantic Framework for Patient Digital Twins in Chronic Care
Software Engineering
Creates a digital twin of you for better health.
DR.EHR: Dense Retrieval for Electronic Health Record with Knowledge Injection and Synthetic Data
Information Retrieval
Helps doctors find patient info faster.