Semantic NLP Pipelines for Interoperable Patient Digital Twins from Unstructured EHRs

Published: January 9, 2026 | arXiv ID: 2601.05847v1

By: Rafael Brens, Yuqiao Meng, Luoxi Tang and more

Potential Business Impact:

Turns doctor notes into patient digital copies.

Business Areas:
Electronic Health Record (EHR) Health Care

Digital twins -- virtual replicas of physical entities -- are gaining traction in healthcare for personalized monitoring, predictive modeling, and clinical decision support. However, generating interoperable patient digital twins from unstructured electronic health records (EHRs) remains challenging due to variability in clinical documentation and a lack of standardized mappings. This paper presents a semantic NLP-driven pipeline that transforms free-text EHR notes into FHIR-compliant digital twin representations. The pipeline uses named entity recognition (NER) to extract clinical concepts, concept normalization to map entities to SNOMED-CT or ICD-10, and relation extraction to capture structured associations between conditions, medications, and observations. Evaluation on the MIMIC-IV Clinical Database Demo, with validation against MIMIC-IV-on-FHIR reference mappings, demonstrates high F1-scores for entity and relation extraction, with improved schema completeness and interoperability compared to baseline methods.
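
To make the extract, normalize, and map-to-FHIR flow described above concrete, here is a minimal Python sketch. It is not the authors' implementation: a tiny hand-written lexicon stands in for the NER and concept-normalization models, relation extraction is omitted, and the function names (`extract_entities`, `to_fhir_condition`, `note_to_bundle`) and the patient identifier are hypothetical. The resulting resources follow the general shape of a FHIR Condition but are not validated against any profile.

```python
import json
import re

# Hypothetical toy lexicon standing in for NER + concept normalization:
# surface form -> (SNOMED CT code, display text).
LEXICON = {
    "hypertension": ("38341003", "Hypertensive disorder"),
    "type 2 diabetes": ("44054006", "Diabetes mellitus type 2"),
}


def extract_entities(note):
    """Dictionary lookup as a stand-in for NER.

    Returns (term, start, end) tuples ordered by position in the note.
    """
    lowered = note.lower()
    spans = []
    for term in LEXICON:
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            spans.append((term, match.start(), match.end()))
    return sorted(spans, key=lambda span: span[1])


def to_fhir_condition(term, patient_id):
    """Build a minimal, unvalidated FHIR-style Condition for one entity."""
    code, display = LEXICON[term]
    return {
        "resourceType": "Condition",
        "subject": {"reference": f"Patient/{patient_id}"},
        "code": {
            "coding": [
                {
                    "system": "http://snomed.info/sct",
                    "code": code,
                    "display": display,
                }
            ],
            "text": term,
        },
    }


def note_to_bundle(note, patient_id):
    """Wrap one Condition per extracted entity in a collection Bundle."""
    entries = [
        {"resource": to_fhir_condition(term, patient_id)}
        for term, _, _ in extract_entities(note)
    ]
    return {"resourceType": "Bundle", "type": "collection", "entry": entries}


if __name__ == "__main__":
    note = "Pt with long-standing hypertension and type 2 diabetes."
    print(json.dumps(note_to_bundle(note, "example"), indent=2))
```

A real pipeline would replace the lexicon lookup with trained NER and normalization components and would add relation extraction to link conditions to medications and observations before emitting resources.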

Page Count
6 pages

Category
Computer Science:
Computation and Language