Gold after Randomized Sand: Model-X Split Knockoffs for Controlled Transformation Selection

Published: July 2, 2025 | arXiv ID: 2507.01732v2

By: Yang Cao, Hangyu Lin, Xinwei Sun and more

Potential Business Impact:

Finds important patterns in messy data.

Business Areas:
A/B Testing, Data and Analytics

Controlling the False Discovery Rate (FDR) is critical for reproducible variable selection, especially given the prevalence of complex predictive modeling. The recent Split Knockoff method, an extension of the canonical Knockoffs framework, offers finite-sample FDR control for selecting sparse transformations but is limited to linear models with fixed designs. Extending this framework to random designs, which would accommodate a much broader range of models, is challenged by the fundamental difficulty of reconciling a random covariate design with a deterministic linear transformation. To bridge this gap, we introduce Model-X Split Knockoffs. Our method achieves robust FDR control for transformation selection in random designs by introducing a novel auxiliary randomized design. This key innovation effectively mediates the interaction between the random design and the deterministic transformation, enabling the construction of valid knockoffs. Like the classical Model-X framework, our approach provides provable, finite-sample FDR control under known or accurately estimated covariate distributions, regardless of the response's conditional distribution. Importantly, it guarantees at least the same, and often superior, selection power as standard Model-X Knockoffs when both are applicable. Empirical studies, including simulations and real-world applications to Alzheimer's disease imaging and university ranking analysis, demonstrate robust FDR control and improved statistical power.
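For readers unfamiliar with how knockoff-style procedures turn feature statistics into a selection with FDR control, the following is a minimal sketch of the generic knockoff+ selection rule from the canonical Knockoffs framework. It is not the Model-X Split Knockoff construction itself, which this summary does not specify. The statistics `w` are assumed to be already computed, with large positive values favoring the original feature over its knockoff copy. The function names and the example values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def knockoff_plus_threshold(w: Sequence[float], q: float) -> float:
    """Return the knockoff+ threshold for statistics w at target FDR level q.

    The threshold is the smallest t among the nonzero |w_j| such that
    (1 + #{j : w_j <= -t}) / max(1, #{j : w_j >= t}) <= q.
    If no such t exists, the threshold is +infinity (nothing is selected).
    """
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)  # proxy for false discoveries
        positives = sum(1 for x in w if x >= t)   # number of selections at t
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def knockoff_select(w: Sequence[float], q: float) -> List[int]:
    """Indices of features whose statistic reaches the knockoff+ threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


# Hypothetical statistics: mostly large positive values plus a few small ones.
w_example = [5.0, 4.0, 3.5, 3.0, 2.5, -0.5, 0.3, 2.0, -0.2, 1.5]
selected = knockoff_select(w_example, q=0.2)
print(selected)
```

The negative statistics serve as an estimate of how many selected features are false discoveries, which is why the ratio in the threshold rule acts as an estimated false discovery proportion. How the statistics are built for transformation selection under a random design is the contribution of the paper and is outside this sketch.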

Page Count
21 pages

Category
Statistics: Methodology