A General Stability Approach to False Discovery Rate Control
By: Jiajun Sun, Zhanrui Cai, Wei Zhong
Stability and reproducibility are essential considerations in various applications of statistical methods. False Discovery Rate (FDR) control methods are able to control false signals in scientific discoveries. However, many FDR control methods, such as Model-X knockoff and data-splitting approaches, yield unstable results due to the inherent randomness of the algorithms. To enhance the stability and reproducibility of statistical outcomes, we propose a general stability approach for FDR control in feature selection and multiple testing problems, named FDR Stabilizer. Taking feature selection as an example, our method first aggregates feature importance statistics obtained by multiple runs of the base FDR control procedure into a consensus ranking. Then, we construct a stabilized relaxed e-value for each feature and apply the e-BH procedure to these stabilized e-values to obtain the final selection set. We theoretically derive the finite-sample bounds for the FDR and the power of our method, and show that our method asymptotically controls the FDR without power loss. Moreover, we establish the stability of the proposed method, showing that the stabilized selection set converges to a deterministic limit as the number of repetitions increases. Extensive numerical experiments and applications to real datasets demonstrate that the proposed method generally outperforms existing alternatives.
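The final step described above, applying e-BH to one e-value per feature, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify how the stabilized relaxed e-values are built from the consensus ranking, so the sketch substitutes a plain average of per-run e-values as a stand-in. The names `e_bh`, `average_e_values`, and `e_runs`, and all numbers, are hypothetical.

```python
import numpy as np

def e_bh(e_values, alpha=0.1):
    """e-BH: reject the k* hypotheses with the largest e-values, where
    k* = max{k : e_(k) >= m / (alpha * k)} and e_(k) is the k-th largest."""
    e = np.asarray(e_values, dtype=float)
    m = e.size
    order = np.argsort(-e)                 # indices from largest to smallest e-value
    ks = np.arange(1, m + 1)
    passing = np.nonzero(e[order] >= m / (alpha * ks))[0]
    if passing.size == 0:
        return np.array([], dtype=int)     # no rejections
    k_star = passing[-1] + 1               # largest k meeting the threshold
    return np.sort(order[:k_star])

def average_e_values(e_matrix):
    """ASSUMPTION: a simple mean over runs stands in for the paper's
    stabilized relaxed e-values, whose construction is not given here."""
    return np.asarray(e_matrix, dtype=float).mean(axis=0)

# Hypothetical e-values from 3 runs of a randomized base procedure, 5 features.
e_runs = np.array([
    [40.0, 0.0, 10.0, 0.0, 3.0],
    [20.0, 1.5, 18.0, 0.0, 0.0],
    [30.0, 0.0, 14.0, 0.6, 0.0],
])
stabilized = average_e_values(e_runs)
selected = e_bh(stabilized, alpha=0.2)
print(selected)
```

Averaging is used here only because an average of e-values is itself an e-value, so e-BH remains applicable. The paper's ranking-based construction may behave differently, in particular with respect to power.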