Robust Gene Prioritization via Fast-mRMR Feature Selection in High-Dimensional Omics Data
By: Rubén Fernández-Farelo, Jorge Paz-Ruza, Bertha Guijarro-Berdiñas and more
Potential Business Impact:
Finds important genes for health research faster.
Gene prioritization (identifying genes potentially associated with a biological process) is increasingly tackled with Artificial Intelligence. However, existing methods struggle with the high dimensionality and incomplete labelling of biomedical data. This work proposes a more robust and efficient pipeline that leverages Fast-mRMR feature selection to retain only relevant, non-redundant features for classifiers. This enables us to build simpler and more effective models, as well as to combine different biological feature sets. Experiments on Dietary Restriction datasets show significant improvements over existing methods, proving that feature selection can be critical for reliable gene prioritization.
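To make the selection step concrete, here is a minimal sketch of greedy mRMR (minimum Redundancy, Maximum Relevance) with the difference criterion, assuming discrete features. It is not the authors' implementation: the function names, the feature names (`expr_1`, `expr_1_copy`, `go_term`) and the toy data are all hypothetical. The running redundancy sums follow the general idea behind Fast-mRMR of reusing previously computed redundancies, so each round only compares candidates against the most recently selected feature.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedily pick up to k feature names by relevance minus mean redundancy.

    features: dict mapping feature name -> list of discrete values (one per gene)
    labels:   list of discrete class labels (one per gene)
    """
    # Relevance: mutual information between each feature and the labels.
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    # Running sum of each candidate's MI with the features selected so far.
    redundancy_sum = {name: 0.0 for name in features}
    selected, remaining = [], set(features)

    while remaining and len(selected) < k:
        def score(name):
            mean_red = redundancy_sum[name] / len(selected) if selected else 0.0
            return relevance[name] - mean_red

        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
        # Only the newly selected feature requires fresh MI computations.
        for name in remaining:
            redundancy_sum[name] += mutual_information(features[name], features[best])
    return selected


# Hypothetical toy data: 8 genes, binary labels, three binary features.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "expr_1":      [0, 0, 0, 1, 1, 1, 1, 1],
    "expr_1_copy": [0, 0, 0, 1, 1, 1, 1, 1],  # exact duplicate of expr_1
    "go_term":     [0, 0, 0, 0, 0, 1, 1, 1],  # equally relevant, less redundant
}
top2 = mrmr_select(features, labels, k=2)
print(top2)  # the duplicate 'expr_1_copy' is not among the two selected features
```

In this toy case all three features are equally relevant to the labels on their own, so a relevance-only ranking could keep both copies of the same signal. The redundancy penalty is what pushes the second pick toward the complementary feature.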
Similar Papers
Improving statistical learning methods via features selection without replacement sampling and random projection
Quantitative Methods
Finds cancer genes better for new treatments.
On the (In)Significance of Feature Selection in High-Dimensional Datasets
Machine Learning (CS)
Randomly picking features can work as well as carefully selecting them.
Sparse minimum Redundancy Maximum Relevance for feature selection
Machine Learning (Stat)
Finds important clues in messy data.