Spearman's rho for bivariate zero-inflated data
By: Jasper Arends, Elisa Perrone
Potential Business Impact:
Measures how things are linked, even with lots of zeros.
Quantifying the association between two random variables is crucial in applications. Traditional estimation techniques for common association measures, such as Spearman's rank correlation coefficient, $\rho_S$, often fail when data contain ties. This is particularly problematic for zero-inflated data, which arise in fields such as insurance, healthcare, and weather forecasting, where zeros occur so frequently that they require an extra probability mass. In this paper, we provide a new formulation of Spearman's rho specifically designed for zero-inflated data and propose a novel estimator of Spearman's rho based on the derived expression. We also make the proposed estimator useful in practice by deriving its achievable bounds and suggesting how to estimate them. We evaluate our method in a comprehensive simulation study and show that it outperforms state-of-the-art methods in all simulated scenarios. Finally, we illustrate with two real-life applications how the proposed theory can be used in practice to quantify association more accurately.
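To make the tie problem concrete, the following is a minimal sketch of the classical sample Spearman's rho computed on midranks, applied to simulated data before and after zero inflation. It is not the estimator proposed in the paper. The helper names (`midranks`, `spearman_midrank`, `zero_inflate`), the exponential data-generating model, and the choice to zero out each margin independently with probability 0.6 are all assumptions made purely for illustration.

```python
import random


def midranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_midrank(x, y):
    """Classical sample Spearman's rho: Pearson correlation of the midranks."""
    rx, ry = midranks(x), midranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def zero_inflate(values, p_zero, rng):
    """Hypothetical mechanism: replace each value by 0 with probability p_zero."""
    return [0.0 if rng.random() < p_zero else v for v in values]


rng = random.Random(0)
n = 2000

# Continuous, positively dependent pair (illustrative model, not from the paper).
x = [rng.expovariate(1.0) for _ in range(n)]
y = [xi + 0.5 * rng.expovariate(1.0) for xi in x]

# Add an extra probability mass at zero to each margin.
x0 = zero_inflate(x, 0.6, rng)
y0 = zero_inflate(y, 0.6, rng)

rho_cont = spearman_midrank(x, y)    # no ties: ranks are unambiguous
rho_zi = spearman_midrank(x0, y0)    # large blocks of tied zeros share one midrank

print(f"rho on continuous data:    {rho_cont:.3f}")
print(f"rho on zero-inflated data: {rho_zi:.3f}")
```

In this sketch every zero in a margin receives the same midrank, so a large share of the sample carries no ordering information. The rank-based estimate then depends heavily on how the ties are handled, which is the situation the abstract describes.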
Similar Papers
Rank-based concordance for zero-inflated data: New representations, estimators, and sharp bounds
Methodology
Fixes math for tricky data with many zeros.
On the Bernstein-smoothed lower-tail Spearman's rho estimator
Statistics Theory
Measures how two things are related, even when rare.
On Rank Correlation Coefficients
Statistics Theory
Finds hidden patterns in messy data better.