Causal Explainability of Machine Learning in Heart Failure Prediction from Electronic Health Records
By: Yina Hou, Shourav B. Rabbani, Liang Hong and more
Potential Business Impact:
Finds real causes of heart problems.
The importance of clinical variables in disease prognosis is typically explained using statistical correlation or machine learning (ML). However, the predictive importance of these variables may not reflect their causal relationships with the disease. This paper uses clinical variables from a heart failure (HF) patient cohort to investigate the causal explainability of variables found to be important in statistical and ML contexts. Because they rely on regression modeling, popular causal discovery methods strictly assume that the cause and effect variables are numerical and continuous. This paper proposes a new computational framework that enables causal structure discovery (CSD) and scores the causal strength of mixed-type (categorical, numerical, binary) clinical variables for binary disease outcomes. In HF classification, we investigate the association between the importance rank orders of three feature types: correlated features, features important for ML predictions, and causal features. Our results demonstrate that CSD modeling of nonlinear causal relationships is more meaningful than its linear counterpart. Feature importance obtained from nonlinear classifiers (e.g., gradient-boosting trees) correlates strongly with the causal strength of variables, but it does not differentiate cause variables from effect variables. Correlated variables can be causal for HF, but they are rarely identified as effect variables. These results can be used to add causal explanations for the variables that are important in ML-based prediction modeling.
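The abstract compares the importance rank orders of different feature types. A minimal sketch of one way such a comparison could be made is shown below, assuming a Spearman rank correlation between two score lists. The abstract does not name the association measure the authors use, so this choice is an assumption, and the function names, variable names, and scores are hypothetical placeholders rather than values from the paper.

```python
from typing import Sequence


def average_ranks(scores: Sequence[float]) -> list[float]:
    """Rank scores from highest (rank 1) to lowest; tied scores share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of scores tied with the one at `pos`.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman rank correlation: the Pearson correlation of the two rank vectors."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        raise ValueError("rank correlation is undefined when all scores are tied")
    return cov / (var_a * var_b) ** 0.5


if __name__ == "__main__":
    # Hypothetical placeholder scores for four unnamed variables (not from the paper).
    ml_importance = [0.40, 0.30, 0.20, 0.10]    # e.g. from a gradient-boosting model
    causal_strength = [0.35, 0.15, 0.30, 0.05]  # e.g. from a CSD scoring step
    print(f"Spearman rho = {spearman(ml_importance, causal_strength):.2f}")
```

A value near 1 would indicate that the two scores order the variables similarly. As the abstract notes for ML feature importance, agreement in rank order does not by itself say which variables are causes and which are effects.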
Similar Papers
Machine Learning Solutions Integrated in an IoT Healthcare Platform for Heart Failure Risk Stratification
Other Statistics
Finds sick hearts early, saving lives.
Stroke Disease Classification Using Machine Learning with Feature Selection Techniques
Machine Learning (CS)
Finds heart problems earlier with high accuracy.
Risk Prediction of Cardiovascular Disease for Diabetic Patients with Machine Learning and Deep Learning Techniques
Machine Learning (CS)
Helps doctors predict heart problems in diabetics.