Multi-level Latent Variable Models for Coheritability Analysis in Electronic Health Records
By: Yinjun Zhao, Nicholas Tatonetti, Yuanjia Wang
Potential Business Impact:
Estimates how much different health problems share genetic influences.
Electronic health records (EHRs) linked with familial relationship data offer a unique opportunity to investigate the genetic architecture of complex phenotypes at scale. However, existing heritability and coheritability estimation methods often fail to account for the intricacies of familial correlation structures, heterogeneity across phenotype types, and computational scalability. We propose a robust and flexible statistical framework for jointly estimating heritability and genetic correlation among continuous and binary phenotypes in EHR-based family studies. Our approach builds on multi-level latent variable models to decompose phenotypic covariance into interpretable genetic and environmental components, incorporating both within- and between-family variation. We derive iterative estimation algorithms based on generalized estimating equations (GEE). Simulation studies under various parameter configurations demonstrate that our estimators are consistent and yield valid inference across a range of realistic settings. Applying our methods to real-world EHR data from a large, urban health system, we identify significant genetic correlations between mental health conditions and endocrine/metabolic phenotypes, supporting hypotheses of shared etiology. This work provides a scalable and rigorous framework for coheritability analysis in high-dimensional EHR data and facilitates the identification of shared genetic influences in complex disease networks.
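To make the idea of decomposing phenotypic covariance concrete, here is a minimal sketch under simplifying assumptions. It uses the textbook additive-genetic plus individual-environment decomposition for two continuous traits, with no shared family environment and no binary phenotypes. It is not the authors' multi-level latent variable model or their GEE algorithm. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math

def implied_cov(kinship, G, E, same_person):
    """Model-implied 2x2 cross-trait covariance between two individuals.

    kinship: kinship coefficient (0.5 for a person with themselves,
             0.25 for full siblings or parent-child pairs).
    G, E: 2x2 additive-genetic and individual-environment covariance matrices.
    The environmental part is added only when both individuals are the same person.
    """
    scale = 2.0 * kinship
    return [
        [scale * G[a][b] + (E[a][b] if same_person else 0.0) for b in range(2)]
        for a in range(2)
    ]

def moment_estimates(within, between, kinship):
    """Recover heritabilities and genetic correlation by the method of moments.

    within: covariance of the two traits within one person (G + E).
    between: cross-relative covariance of the two traits (2 * kinship * G).
    """
    g = [[between[a][b] / (2.0 * kinship) for b in range(2)] for a in range(2)]
    h2 = [g[a][a] / within[a][a] for a in range(2)]   # genetic share of trait variance
    rg = g[0][1] / math.sqrt(g[0][0] * g[1][1])       # genetic correlation
    return h2, rg

# Hypothetical parameter values (illustration only, not from the paper).
G = [[0.4, 0.15], [0.15, 0.3]]
E = [[0.6, 0.05], [0.05, 0.7]]

within = implied_cov(0.5, G, E, same_person=True)     # G + E
between = implied_cov(0.25, G, E, same_person=False)  # 0.5 * G for full siblings
h2, rg = moment_estimates(within, between, kinship=0.25)
print(h2, rg)
```

With real data, `within` and `between` would be replaced by sample covariances, which adds sampling noise; handling binary phenotypes and more complex family structures requires the kind of latent variable modeling the abstract describes.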
Similar Papers
Latent Factor Point Processes for Patient Representation in Electronic Health Records
Methodology
Finds hidden disease patterns in patient records.
Estimating heritability of survival traits using censored multiple variance component model
Methodology
Finds genes that help people live longer.
Enhancing Phenotype Discovery in Electronic Health Records through Prior Knowledge-Guided Unsupervised Learning
Applications
Finds hidden asthma types in patient records.