A new measure of dependence: Integrated $R^2$

Published: May 23, 2025 | arXiv ID: 2505.18146v3

By: Mona Azadkia, Pouya Roudaki

Potential Business Impact:

Finds important patterns in data automatically.

Business Areas:

A/B Testing, Data and Analytics

We propose a new measure of dependence that quantifies the degree to which a random variable $Y$ depends on a random vector $X$. This measure is zero if and only if $Y$ and $X$ are independent, and equals one if and only if $Y$ is a measurable function of $X$. We introduce a simple and interpretable estimator that is comparable in ease of computation to classical correlation coefficients such as Pearson's, Spearman's, or Chatterjee's. Building on this coefficient, we develop a model-free variable selection algorithm, feature ordering by dependence (FORD), inspired by FOCI. FORD requires no tuning parameters and is provably consistent under suitable sparsity assumptions. We demonstrate its effectiveness and improvements over FOCI through experiments on both synthetic and real datasets.
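To make the two defining properties concrete (zero under independence, one when $Y$ is a measurable function of $X$), the sketch below contrasts two of the classical coefficients named above: Pearson's correlation and Chatterjee's rank correlation $\xi_n$. It does not implement the integrated $R^2$ estimator or FORD, whose formulas are not given in this summary. The helper names `pearson_r` and `chatterjee_xi`, the sample size, and the simulated data are all illustrative assumptions, and the $\xi_n$ formula used is the standard one for data without ties.

```python
import math
import random


def pearson_r(x, y):
    """Pearson's linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def chatterjee_xi(x, y):
    """Chatterjee's xi_n(X, Y), using the formula that assumes no ties."""
    n = len(x)
    # Order the pairs by x, then read off the y values in that order.
    order = sorted(range(n), key=lambda i: x[i])
    y_sorted = [y[i] for i in order]
    # rank[k] is the rank (1..n) of y_sorted[k] among all y values.
    by_y = sorted(range(n), key=lambda k: y_sorted[k])
    rank = [0] * n
    for r, k in enumerate(by_y, start=1):
        rank[k] = r
    # Sum of absolute rank jumps between x-neighbours.
    total = sum(abs(rank[k + 1] - rank[k]) for k in range(n - 1))
    return 1.0 - 3.0 * total / (n * n - 1)


# Illustrative simulated data (an assumption, not taken from the paper).
rng = random.Random(0)
n = 2000
x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
y_func = [a * a for a in x]                           # Y is a function of X
y_indep = [rng.uniform(-1.0, 1.0) for _ in range(n)]  # Y independent of X

print("Pearson,    Y = X^2:      ", round(pearson_r(x, y_func), 3))
print("Chatterjee, Y = X^2:      ", round(chatterjee_xi(x, y_func), 3))
print("Chatterjee, Y indep. of X:", round(chatterjee_xi(x, y_indep), 3))
```

Because $Y = X^2$ is not monotone on $[-1, 1]$, Pearson's coefficient stays near zero even though $Y$ is fully determined by $X$, while $\xi_n$ is close to one for the functional case and close to zero for the independent case. This is the same kind of behavior that the proposed measure is stated to have at the population level.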