Finite-Sample Valid Rank Confidence Sets for a Broad Class of Statistical and Machine Learning Models
By: Onrina Chandra, Min-ge Xie
Potential Business Impact:
Figures out how sure we are about rankings.
Ranking populations, such as institutions, by certain characteristics is often of interest, and these ranks are typically estimated from samples drawn from the populations. Because of sampling randomness, it is important to quantify the uncertainty associated with the estimated ranks. This is especially crucial when the latent characteristics are poorly separated, so that many estimated ranks may be incorrectly ordered. Understanding this uncertainty can help quantify and mitigate these issues and provide a fuller picture. However, the task is especially challenging because the rank parameters are discrete and the central limit theorem does not apply to the rank estimates. In this article, we propose a Repro Samples Method to address this nontrivial inference problem by developing a confidence set for the true, unobserved population ranks. The method provides finite-sample coverage guarantees and is broadly applicable to ranking problems. Its effectiveness is illustrated and compared with several published large-sample ranking approaches using simulation studies and real data examples involving samples from both traditional statistical models and modern data science algorithms.
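To make the motivation concrete, the following is a minimal simulation sketch of how often estimated ranks are mis-ordered when population means are poorly separated. It is not the Repro Samples Method and does not construct a confidence set. The function names, the normal sampling model, and all numeric settings are assumptions chosen purely for illustration.

```python
import numpy as np


def estimated_ranks(sample_means):
    """Return ranks where rank 1 goes to the largest value (illustrative convention)."""
    order = np.argsort(-sample_means)           # indices from largest to smallest
    ranks = np.empty_like(order)
    ranks[order] = np.arange(1, len(sample_means) + 1)
    return ranks


def rank_error_rate(true_means, n_per_pop, sigma, n_sims, seed=0):
    """Fraction of simulated datasets whose estimated ranking differs from the true one.

    Assumes each population is normal with a common, known sigma (hypothetical setup).
    """
    rng = np.random.default_rng(seed)
    true_means = np.asarray(true_means, dtype=float)
    true_ranks = estimated_ranks(true_means)
    wrong = 0
    for _ in range(n_sims):
        # The mean of n_per_pop normal draws is distributed N(mu, sigma^2 / n_per_pop).
        means = rng.normal(true_means, sigma / np.sqrt(n_per_pop))
        if not np.array_equal(estimated_ranks(means), true_ranks):
            wrong += 1
    return wrong / n_sims


# Same sample size and noise level, different separation between population means.
well_separated = rank_error_rate([0.0, 1.0, 2.0], n_per_pop=50, sigma=1.0, n_sims=2000)
poorly_separated = rank_error_rate([0.0, 0.05, 0.10], n_per_pop=50, sigma=1.0, n_sims=2000)
print(f"error rate, well separated:   {well_separated:.3f}")
print(f"error rate, poorly separated: {poorly_separated:.3f}")
```

Under these assumed settings, the point estimate of the ranking is rarely wrong when the means are far apart and frequently wrong when they are close, which is the situation in which a confidence set for the ranks is most informative.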
Similar Papers
Reasonable uncertainty: Confidence intervals in empirical Bayes discrimination detection
Econometrics
Finds how much unfairness is really there.
Inference on multiple quantiles in regression models by a rank-score approach
Methodology
Finds more important patterns in data reliably.