Subspace Ordering for Maximum Response Preservation in Sufficient Dimension Reduction
By: Derik T. Boonstra, Rakheon Kim, Dean M. Young
Potential Business Impact:
Ranks the directions found by dimension reduction by how well they predict the response, rather than by eigenvalue size.
Sufficient dimension reduction (SDR) methods aim to identify a dimension reduction subspace (DRS) that preserves all the information about the conditional distribution of a response given its predictors. Traditional SDR methods estimate the DRS by solving a method-specific generalized eigenvalue problem and retaining the eigenvectors corresponding to the largest eigenvalues. In this article, we argue against the long-standing convention of using eigenvalues as the measure of subspace importance and propose alternative ordering criteria that directly assess the predictive relevance of each subspace. For a binary response, we introduce a subspace ordering criterion based on the absolute value of the independent two-sample Student's t-statistic. Theoretically, our criterion identifies subspaces that achieve the local minimum Bayes' error rate and yields consistent ordering of directions under mild regularity conditions. Additionally, we employ an F-statistic to provide a framework that unifies categorical and continuous responses under a single subspace criterion. We evaluate our proposed criteria within multiple SDR methods through extensive simulation studies and applications to real data. Our empirical results show that reordering subspaces by the proposed criteria generally improves classification accuracy and subspace estimation compared with ordering by eigenvalues.
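The reordering idea for a binary response can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `abs_t_statistics` and `reorder_directions` are hypothetical, the pooled-variance form of the two-sample t-statistic is an assumption about what "independent Student's t-statistic" means here, and the eigenvectors of the sample covariance matrix are used only as a stand-in for the eigen-ordered directions that an actual SDR method would return. The toy data are simulated so that the largest-eigenvalue direction carries no class information.

```python
import numpy as np

def abs_t_statistics(Z, y):
    """Absolute pooled-variance two-sample t-statistic for each column of Z.

    Z : (n, d) array of projected predictors; y : (n,) array of 0/1 labels.
    """
    Z0, Z1 = Z[y == 0], Z[y == 1]
    n0, n1 = len(Z0), len(Z1)
    pooled_var = ((n0 - 1) * Z0.var(axis=0, ddof=1)
                  + (n1 - 1) * Z1.var(axis=0, ddof=1)) / (n0 + n1 - 2)
    std_err = np.sqrt(pooled_var * (1.0 / n0 + 1.0 / n1))
    return np.abs(Z1.mean(axis=0) - Z0.mean(axis=0)) / std_err

def reorder_directions(X, y, V):
    """Reorder the columns of V (p x d directions) by decreasing |t|."""
    t = abs_t_statistics(X @ V, y)
    order = np.argsort(-t)  # largest |t| first
    return V[:, order], t[order], order

# Toy data (assumed for illustration): column 0 has high variance but no
# class signal, column 1 has a class mean shift, column 2 is low-variance noise.
rng = np.random.default_rng(0)
n = 200
y = np.repeat([0, 1], n)
X = np.column_stack([
    rng.normal(0.0, 10.0, 2 * n),
    rng.normal(np.where(y == 1, 1.0, -1.0), 1.0),
    rng.normal(0.0, 0.5, 2 * n),
])

# Stand-in for an SDR eigen-decomposition: covariance eigenvectors,
# arranged by decreasing eigenvalue (eigh returns ascending order).
evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
V = evecs[:, ::-1]

V_reordered, t_sorted, order = reorder_directions(X, y, V)
print("eigenvalue order -> |t| order:", order)
print("sorted |t| values:", np.round(t_sorted, 2))
```

In this simulated setting the direction with the largest eigenvalue is dominated by the high-variance noise coordinate, so ordering by |t| moves the second eigenvector to the front. With an actual SDR method, `V` would instead hold that method's estimated directions.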