A new measure of dependence: Integrated $R^2$
By: Mona Azadkia, Pouya Roudaki
Potential Business Impact:
Automatically identifies which variables in a dataset an outcome depends on, without assuming a model or setting tuning parameters.
We propose a new measure of dependence that quantifies the degree to which a random variable $Y$ depends on a random vector $X$. This measure is zero if and only if $Y$ and $X$ are independent, and equals one if and only if $Y$ is a measurable function of $X$. We introduce a simple and interpretable estimator that is comparable in ease of computation to classical correlation coefficients such as Pearson's, Spearman's, or Chatterjee's. Building on this coefficient, we develop a model-free variable selection algorithm, feature ordering by dependence (FORD), inspired by FOCI. FORD requires no tuning parameters and is provably consistent under suitable sparsity assumptions. We demonstrate its effectiveness and improvements over FOCI through experiments on both synthetic and real datasets.
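To give a sense of what "comparable in ease of computation to classical correlation coefficients" can look like in practice, here is a minimal sketch of Chatterjee's rank coefficient, one of the baselines named in the abstract. This is not the integrated $R^2$ estimator, whose formula is not given here. The function name `chatterjee_xi` and the synthetic data are illustrative assumptions, and the sketch assumes continuous data with no ties.

```python
import numpy as np


def chatterjee_xi(x, y):
    """Chatterjee's rank coefficient for scalar x and y (no-ties formula).

    Illustrative baseline only; NOT the integrated R^2 estimator.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Reorder y by increasing x.
    y_by_x = y[np.argsort(x, kind="stable")]
    # Ranks 1..n of the reordered y values (valid when there are no ties).
    ranks = np.argsort(np.argsort(y_by_x)) + 1
    # Small jumps between consecutive ranks indicate that y is close
    # to a function of x.
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)


# Hypothetical synthetic data: a non-monotone functional relationship
# and an independent pair.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 2000)
y_func = x**2                          # measurable function of x
y_indep = rng.uniform(-1.0, 1.0, 2000)  # independent of x

xi_func = chatterjee_xi(x, y_func)
xi_indep = chatterjee_xi(x, y_indep)
pearson_func = np.corrcoef(x, y_func)[0, 1]

print(f"xi(x, x^2)      = {xi_func:.3f}")
print(f"xi(x, indep)    = {xi_indep:.3f}")
print(f"pearson(x, x^2) = {pearson_func:.3f}")
```

In this setup the rank coefficient should come out close to one for $y = x^2$ and close to zero for the independent pair, while Pearson's correlation should stay near zero in both cases. The zero-under-independence and one-under-functional-dependence behaviour is what the abstract claims for the new measure, extended to a random vector $X$.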
Similar Papers
An Interpretable Measure for Quantifying Predictive Dependence between Continuous Random Variables -- Extended Version
Machine Learning (CS)
Shows how two things are connected, even in unusual ways.
A new coefficient of separation
Methodology
Measures how much one thing depends on others.
A dimension reduction for extreme types of directed dependence
Statistics Theory
Finds how one thing affects another, even in complex ways.