CKD-EHR: Clinical Knowledge Distillation for Electronic Health Records
By: Junke Wang, Hongshun Ling, Li Zhang and more
Potential Business Impact:
Helps doctors predict sickness faster and better.
Disease prediction models based on Electronic Health Records (EHR) have demonstrated significant clinical value in promoting precision medicine and enabling early intervention. However, existing large language models face two major challenges: insufficient representation of medical knowledge and low efficiency in clinical deployment. To address these challenges, this study proposes the CKD-EHR (Clinical Knowledge Distillation for EHR) framework, which uses knowledge distillation to achieve efficient and accurate disease risk prediction. Specifically, the large language model Qwen2.5-7B is first fine-tuned on medical knowledge-enhanced data to serve as the teacher model. It then generates interpretable soft labels through a multi-granularity attention distillation mechanism. Finally, the distilled knowledge is transferred to a lightweight BERT student model. Experimental results show that on the MIMIC-III dataset, CKD-EHR significantly outperforms the baseline model: diagnostic accuracy increases by 9%, F1-score improves by 27%, and inference is 22.2 times faster. The approach improves both resource utilization and the accuracy and timeliness of diagnosis, providing a practical technical route to resource optimization in clinical settings. The code and data for this research are available at https://github.com/209506702/CKD_EHR.
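The teacher-to-student transfer described above can be illustrated with a generic soft-label distillation loss. This is a minimal sketch, not the authors' implementation: the abstract does not state the loss, so the temperature-scaled KL term, the `alpha` weighting, and the function names `softmax` and `distillation_loss` are all assumptions made here for illustration, and the multi-granularity attention mechanism is not modeled.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, hard_label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical combined loss: soft-label KL plus hard-label cross-entropy.

    alpha weights the soft (teacher) term; (1 - alpha) weights the
    hard-label term. Both hyperparameters are illustrative assumptions.
    """
    # Soft labels: the teacher's distribution, smoothed by the temperature.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student), scaled by T^2 as is common in distillation.
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Ordinary cross-entropy against the ground-truth class index.
    ce = -math.log(softmax(student_logits)[hard_label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce

# Example: binary disease-risk logits for one patient record (made-up values).
loss = distillation_loss(student_logits=[0.3, 0.9],
                         teacher_logits=[-1.2, 2.5],
                         hard_label=1)
```

In this sketch the soft term vanishes when the student reproduces the teacher's distribution exactly, so a small student is rewarded for matching the teacher's relative confidence across classes and not only the final label.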
Similar Papers
Improving Hospital Risk Prediction with Knowledge-Augmented Multimodal EHR Modeling
Machine Learning (CS)
Predicts patient risks more accurately from records
Integrating Large Language Models with Human Expertise for Disease Detection in Electronic Health Records
Computation and Language
Helps doctors find patient sicknesses faster.
DR.EHR: Dense Retrieval for Electronic Health Record with Knowledge Injection and Synthetic Data
Information Retrieval
Helps doctors find patient info faster.