Sparse minimum Redundancy Maximum Relevance for feature selection
By: Peter Naylor, Benjamin Poignard, Héctor Climente-González, and more
Potential Business Impact:
Finds important clues in messy data.
We propose a feature screening method that integrates both feature-feature and feature-target relationships. Inactive features are identified via a penalized minimum Redundancy Maximum Relevance (mRMR) procedure: a continuous version of the classic mRMR criterion penalized by a non-convex regularizer, in which the parameters estimated as exactly zero identify the set of inactive features. We establish the conditions under which zero coefficients are correctly identified, guaranteeing accurate recovery of inactive features. We then introduce a multi-stage procedure based on the knockoff filter that enables the penalized mRMR to discard inactive features while controlling the false discovery rate (FDR). Our method performs comparably to HSIC-LASSO but is more conservative in the number of selected features, and it only requires setting an FDR threshold rather than specifying the number of features to retain. The effectiveness of the method is illustrated through simulations and real-world datasets. The code to reproduce this work is available in the following GitHub repository: https://github.com/PeterJackNaylor/SmRMR.
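The two ingredients described above, a penalized continuous mRMR objective whose zero coefficients mark inactive features, and a knockoff-style threshold for FDR control, can be sketched as follows. This is a minimal illustration, not the SmRMR implementation: it uses absolute Pearson correlations as the association measures and an L1 penalty in place of the paper's non-convex regularizer, and the function names are hypothetical.

```python
import numpy as np

def penalized_mrmr(X, y, lam=0.1, n_iter=500):
    """Toy continuous mRMR: minimize 0.5*w'Qw - r'w + lam*||w||_1
    over w >= 0, where Q collects feature-feature |correlations|
    (redundancy) and r feature-target |correlations| (relevance).
    An L1 penalty stands in for the paper's non-convex regularizer;
    features whose coefficient is estimated at zero are inactive."""
    Xs = (X - X.mean(0)) / X.std(0)
    ys = (y - y.mean()) / y.std()
    n = X.shape[0]
    Q = np.abs(Xs.T @ Xs) / n                # redundancy matrix
    r = np.abs(Xs.T @ ys) / n                # relevance vector
    step = 1.0 / np.linalg.eigvalsh(Q)[-1]   # 1 / Lipschitz constant
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):                  # projected proximal gradient
        w = np.maximum(w - step * (Q @ w - r) - step * lam, 0.0)
    return w

def knockoff_threshold(W, q=0.1):
    """Knockoff+ threshold: the smallest t such that
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q."""
    for t in np.sort(np.abs(W[W != 0])):
        if (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t)) <= q:
            return t
    return np.inf                            # no threshold achieves level q
```

On synthetic data where the target depends on only the first two of ten features, the soft-thresholding step typically drives the remaining eight coefficients to exactly zero, which is the screening behavior the abstract relies on.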
Similar Papers
Robust Gene Prioritization via Fast-mRMR Feature Selection in high-dimensional omics data
Machine Learning (CS)
Finds important genes for health research faster.
When Features Beat Noise: A Feature Selection Technique Through Noise-Based Hypothesis Testing
Machine Learning (Stat)
Finds the most important information in messy data.