Contrastive CUR: Interpretable Joint Feature and Sample Selection for Case-Control Studies
By: Eric Zhang, Michael Love, Didong Li
Potential Business Impact:
Finds important differences between sick and healthy people.
Dimension reduction is an essential tool for analyzing high-dimensional data. Most existing methods, including principal component analysis (PCA) and its extensions, provide principal components that are typically linear combinations of features, which are often challenging to interpret. CUR decomposition, another matrix decomposition technique, is a more interpretable and efficient alternative that offers simultaneous feature and sample selection. However, many biomedical studies involve two groups: a foreground (treatment or case) group and a background (control) group, where the objective is to identify features unique to or enriched in the foreground. This need for contrastive dimension reduction is not well addressed by existing CUR methods, nor by contrastive approaches rooted in PCA. Furthermore, these methods fail to address a key challenge in biomedical studies: the need to select samples unique to the foreground. In this paper, we address this gap by proposing Contrastive CUR (CCUR), a novel method specifically designed for case-control studies. Through extensive experiments, we demonstrate that CCUR outperforms existing techniques in isolating biologically relevant features and in identifying sample-specific responses unique to the foreground, offering deeper insights into case-control biomedical data.
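To make the idea of simultaneous feature and sample selection concrete, here is a minimal sketch of a standard, non-contrastive CUR decomposition based on leverage scores. It is not the CCUR method proposed in the paper, whose details the abstract does not give. The function names (leverage_scores, cur), the deterministic top-score selection rule, and the parameters k, n_cols, and n_rows are all assumptions made for illustration.

```python
import numpy as np


def leverage_scores(X, k, axis):
    """Rank-k leverage scores for the columns (axis=1) or rows (axis=0) of X.

    Illustrative helper (not from the paper): scores are the squared row
    norms of the top-k singular vectors, normalized to sum to 1.
    """
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    basis = Vt[:k].T if axis == 1 else U[:, :k]
    return (basis ** 2).sum(axis=1) / k


def cur(X, k, n_cols, n_rows):
    """Plain CUR: keep the highest-leverage columns (features) and rows (samples).

    Assumption: deterministic top-score selection; many CUR variants
    sample columns and rows randomly in proportion to these scores.
    """
    col_idx = np.argsort(leverage_scores(X, k, axis=1))[::-1][:n_cols]
    row_idx = np.argsort(leverage_scores(X, k, axis=0))[::-1][:n_rows]
    C = X[:, col_idx]  # selected features: actual columns of X
    R = X[row_idx, :]  # selected samples: actual rows of X
    # Linking matrix that minimizes ||X - C U R|| in Frobenius norm.
    U = np.linalg.pinv(C) @ X @ np.linalg.pinv(R)
    return C, U, R, col_idx, row_idx
```

Because C and R consist of actual columns and rows of X, the returned col_idx and row_idx can be read directly as the selected features and samples, which is the interpretability advantage over PCA described above. A contrastive variant would additionally need to use the background group when scoring, and that step is deliberately not sketched here.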
Similar Papers
Contrastive Dimension Reduction: A Systematic Review
Methodology
Finds hidden patterns in data, even when noisy.
Interpretable dimension reduction for compositional data
Methodology
Shows hidden patterns in tiny body bugs.
Beyond Correlation: Causal Multi-View Unsupervised Feature Selection Learning
Machine Learning (CS)
Finds important data by ignoring misleading clues.