HeFS: Helper-Enhanced Feature Selection via Pareto-Optimized Genetic Search
By: Yusi Fan, Tian Wang, Zhiying Yan, and more
Potential Business Impact:
Finds overlooked features that make predictions more accurate.
Feature selection is a combinatorial optimization problem that is NP-hard. Conventional approaches often employ heuristic or greedy strategies, which are prone to premature convergence and may fail to capture subtle yet informative features. This limitation becomes especially critical in high-dimensional datasets, where complex and interdependent feature relationships prevail. We introduce the HeFS (Helper-Enhanced Feature Selection) framework to refine feature subsets produced by existing algorithms. HeFS systematically searches the residual feature space to identify a Helper Set: features that complement the original subset and improve classification performance. The approach employs a biased initialization scheme and a ratio-guided mutation mechanism within a genetic algorithm, coupled with Pareto-based multi-objective optimization to jointly maximize predictive accuracy and feature complementarity. Experiments on 18 benchmark datasets demonstrate that HeFS consistently identifies overlooked yet informative features and achieves superior performance over state-of-the-art methods, including in challenging domains such as gastric cancer classification, drug toxicity prediction, and computer science applications. The code and datasets are available at https://healthinformaticslab.org/supp/.
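To make the idea concrete, the sketch below illustrates a helper-set search in the spirit described by the abstract: a genetic algorithm over the residual feature space with sparse (biased) initialization, a mutation rate tied to the helper-set-to-residual ratio, and Pareto selection over two objectives. The dataset, classifier, objective definitions (cross-validated accuracy and a negative-helper-size proxy for complementarity), and all parameter values are illustrative assumptions, not the authors' implementation.

```python
# Illustrative helper-set search inspired by the HeFS description.
# The complementarity proxy, mutation schedule, and all constants are assumptions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)

original = np.array([0, 1, 2])                              # assumed pre-selected subset
residual = np.setdiff1d(np.arange(X.shape[1]), original)    # residual feature space

def evaluate(mask):
    """Two objectives: CV accuracy of original + helper features,
    and a complementarity proxy (negative helper-set size, assumed)."""
    helper = residual[mask.astype(bool)]
    cols = np.concatenate([original, helper])
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    acc = cross_val_score(clf, X[:, cols], y, cv=3).mean()
    return acc, -len(helper)

def dominates(a, b):
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(scores):
    # Indices of non-dominated individuals under the two objectives.
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]

# Biased initialization: a small inclusion probability keeps helper sets sparse.
pop_size, n_gen, p_init = 20, 15, 0.05
pop = (rng.random((pop_size, len(residual))) < p_init).astype(int)

for gen in range(n_gen):
    scores = [evaluate(ind) for ind in pop]
    parents = pop[pareto_front(scores)]
    children = []
    while len(children) < pop_size:
        a, b = parents[rng.integers(len(parents), size=2)]
        cut = rng.integers(1, len(residual))                # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        # Ratio-guided mutation (assumed form): the flip rate shrinks as the
        # helper set grows relative to the residual pool, discouraging bloat.
        ratio = child.sum() / len(residual)
        flip = rng.random(len(residual)) < 0.02 * (1.0 - ratio)
        child[flip] = 1 - child[flip]
        children.append(child)
    pop = np.array(children)

scores = [evaluate(ind) for ind in pop]
best = max(range(pop_size), key=lambda i: scores[i][0])
print("Helper features:", residual[pop[best].astype(bool)])
print("CV accuracy with helper set: %.3f" % scores[best][0])
```

In this toy version the Pareto front simply supplies the parent pool, so accuracy and helper-set compactness are traded off without collapsing them into a single weighted score; the paper's actual objectives and selection scheme may differ.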
Similar Papers
Optimizing Prognostic Biomarker Discovery in Pancreatic Cancer Through Hybrid Ensemble Feature Selection and Multi-Omics Data
Genomics
Finds key health clues to predict life span.
On the (In)Significance of Feature Selection in High-Dimensional Datasets
Machine Learning (CS)
Randomly picking features can work as well as careful selection.
Hierarchical Evaluation Function (HEF): A Multi-Metric Approach for Optimizing Demand Forecasting Models
Machine Learning (CS)
Helps businesses predict future sales better.