Truncated Gaussian copula principal component analysis with application to pediatric acute lymphoblastic leukemia patients' gut microbiome
By: Lei Wang, Yang Ni, Irina Gaynanova
Potential Business Impact:
Finds gut bugs that predict cancer patient infections.
Increasing epidemiologic evidence suggests that the diversity and composition of the gut microbiome can predict infection risk in cancer patients. Infections remain a major cause of morbidity and mortality during chemotherapy. Analyzing microbiome data to identify associations with infection pathogenesis for proactive treatment has become a critical research focus. However, the high-dimensional nature of the data necessitates the use of dimension-reduction methods to facilitate inference and interpretation. Traditional dimension-reduction methods, which assume Gaussianity, perform poorly with skewed and zero-inflated microbiome data. To address these challenges, we propose a semiparametric principal component analysis (PCA) method based on a truncated latent Gaussian copula model that accommodates both skewness and zero inflation. Simulation studies demonstrate that the proposed method outperforms existing approaches by providing more accurate estimates of scores and loadings across various copula transformation settings. We apply our method, along with competing approaches, to gut microbiome data from pediatric patients with acute lymphoblastic leukemia. The principal scores derived from the proposed method reveal the strongest associations between pre-chemotherapy microbiome composition and adverse events during subsequent chemotherapy, offering valuable insights for improving patient outcomes.
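To make the idea of rank-based, copula-style PCA more concrete, here is a minimal Python sketch. It is not the authors' estimator. It assumes a simplification: the latent correlation is approximated from Kendall's tau with the continuous Gaussian copula bridge sin(pi/2 * tau), whereas a truncated model would need a bridge function that accounts for the mass at zero. The function names (kendall_tau_a, rank_based_correlation, copula_pca_loadings) and the simulated data are hypothetical, and the sketch returns loadings only, not principal scores.

```python
import numpy as np


def kendall_tau_a(x, y):
    """Kendall's tau-a from pairwise sign products (O(n^2), fine for small n)."""
    dx = np.sign(x[:, None] - x[None, :])
    dy = np.sign(y[:, None] - y[None, :])
    n = len(x)
    # Summing over all ordered pairs counts each unordered pair twice,
    # so dividing by n * (n - 1) yields tau-a. Tied pairs contribute zero.
    return (dx * dy).sum() / (n * (n - 1))


def rank_based_correlation(X):
    """Approximate latent correlation matrix from pairwise Kendall's tau.

    ASSUMPTION: uses the continuous-copula bridge sin(pi/2 * tau) as a
    stand-in; it ignores zero inflation and is not the truncated bridge.
    """
    p = X.shape[1]
    R = np.eye(p)
    for j in range(p):
        for k in range(j + 1, p):
            tau = kendall_tau_a(X[:, j], X[:, k])
            R[j, k] = R[k, j] = np.sin(np.pi / 2 * tau)
    return R


def copula_pca_loadings(X, n_components=2):
    """Leading eigenvalues and eigenvectors of the rank-based correlation."""
    R = rank_based_correlation(X)
    eigvals, eigvecs = np.linalg.eigh(R)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]


# Simulated skewed, zero-inflated data (hypothetical, for illustration only):
# latent Gaussian values below a threshold are recorded as zero, the rest
# pass through a monotone transformation.
rng = np.random.default_rng(0)
sigma = np.array([[1.0, 0.6, 0.3],
                  [0.6, 1.0, 0.2],
                  [0.3, 0.2, 1.0]])
latent = rng.multivariate_normal(np.zeros(3), sigma, size=200)
X = np.where(latent > -0.5, np.exp(latent), 0.0)

eigvals, loadings = copula_pca_loadings(X, n_components=2)
```

Because Kendall's tau depends only on ranks, the estimate is unchanged by the monotone transformation applied to the nonzero values, which is why rank-based estimators suit skewed data. The zeros still bias this simplified bridge, and a matrix built entry by entry in this way is not guaranteed to be positive semidefinite.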
Similar Papers
A Bayesian Semiparametric Mixture Model for Clustering Zero-Inflated Microbiome Data
Methodology
Finds hidden groups in gut germs for health.
Microbial correlation: a semi-parametric model for investigating microbial co-metabolism
Methodology
Finds how gut germs work together to affect health.
Dissecting Microbial Community Structure and Heterogeneity via Multivariate Covariate-Adjusted Clustering
Methodology
Finds gut bacteria groups linked to health.