Testing Conditional Independence via Density Ratio Regression
By: Chunrong Ai, Zixuan Xu, Zheng Zhang
Potential Business Impact:
Finds hidden connections in messy data.
This paper develops a conditional independence (CI) test based on the conditional density ratio (CDR) for weakly dependent data. The main contribution is a closed-form expression for the estimated conditional density ratio function that has good finite-sample performance. The key idea is to combine a linear sieve with the quadratic norm. Matsushita et al. (2022) used a linear sieve to estimate the unconditional density ratio; estimating the conditional density ratio requires applying the linear sieve twice. First, we estimate an unconditional density ratio with an unweighted sieve least-squares regression, as in Matsushita et al. (2022). Second, we estimate the conditional density ratio with a weighted sieve least-squares regression, where the weights are the estimated unconditional density ratio. The proposed test has several advantages over existing alternatives. First, the test statistic is invariant to monotone transformations of the data and has a closed-form expression, which makes it fast to compute. Second, the conditional density ratio satisfies a set of moment restrictions, and the estimated ratio satisfies the empirical analog of those restrictions; as a result, the estimated density ratio is unlikely to take extreme values. Third, the proposed test can detect all deviations from conditional independence at rates arbitrarily close to $n^{-1/2}$, and the local power loss is independent of the data dimension. A small-scale simulation study indicates that the proposed test outperforms the alternatives under various dependence structures.
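To make the first (unweighted) step more concrete, here is a minimal sketch of a generic least-squares density ratio fit with a linear sieve. It is not the authors' estimator: the polynomial basis `poly_basis`, the function `fit_density_ratio`, the simulated data, and the choice of target ratio $f(x,y)/\{f(x)f(y)\}$ are all assumptions made for illustration. Writing the ratio as $u(x,y)^\top\beta$ and minimizing the quadratic norm under the product of marginals gives the closed form $\hat\beta = \hat G^{-1}\hat h$, where $\hat G$ averages $u u^\top$ over all pairs $(x_i, y_j)$ and $\hat h$ averages $u$ over the observed pairs $(x_i, y_i)$.

```python
import numpy as np

def poly_basis(x, y):
    """Hypothetical sieve basis: polynomials in (x, y) up to total degree 2."""
    return np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])

def fit_density_ratio(x, y):
    """Closed-form least-squares fit of r(x, y) = f(x, y) / (f(x) f(y)).

    Illustrative assumption, not the paper's exact procedure.
    """
    n = len(x)
    # Numerator sample: observed pairs (x_i, y_i), drawn from the joint density.
    u_joint = poly_basis(x, y)
    # Denominator sample: all n^2 pairs (x_i, y_j), mimicking the product of marginals.
    xi, yj = np.repeat(x, n), np.tile(y, n)
    u_prod = poly_basis(xi, yj)
    G = u_prod.T @ u_prod / (n * n)   # empirical E_prod[u u^T]
    h = u_joint.mean(axis=0)          # empirical E_joint[u]
    beta = np.linalg.solve(G, h)      # solves the normal equations G beta = h
    return beta, u_prod

# Simulated dependent data (assumption for the demo only).
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

beta, u_prod = fit_density_ratio(x, y)
# Because the basis contains a constant, the first normal equation forces the
# fitted ratio to average to one over the product sample.
mean_ratio = (u_prod @ beta).mean()
```

In this sketch the fitted ratio averages to one over the product sample by construction, which is one simple instance of an estimated ratio satisfying the empirical analog of a moment restriction. The second step described in the abstract, a weighted sieve least-squares regression that uses the estimated unconditional ratio as weights, is not reproduced here because the abstract does not specify it in enough detail.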
Similar Papers
Estimating Unbounded Density Ratios: Applications in Error Control under Covariate Shift
Machine Learning (Stat)
Makes computer learning better with tricky data.
Density Ratio-based Causal Discovery from Bivariate Continuous-Discrete Data
Machine Learning (CS)
Finds cause when one thing is a number, another is a choice.
Testing independence and conditional independence in high dimensions via coordinatewise Gaussianization
Methodology
Finds hidden connections between data points.