Rethinking Causal Discovery Through the Lens of Exchangeability
By: Tiago Brogueira, Mário Figueiredo
Potential Business Impact:
Helps find hidden cause-and-effect relationships in data.
Causal discovery methods have traditionally been developed under two distinct regimes, independent and identically distributed (i.i.d.) data and time-series data, each governed by separate modelling assumptions. In this paper, we argue that the i.i.d. setting can and should be reframed in terms of exchangeability, a strictly more general symmetry principle. We present the implications of this reframing, alongside two core arguments: (1) a conceptual argument, based on extending the dependency of experimental causal inference on exchangeability to causal discovery; and (2) an empirical argument, showing that many existing i.i.d. causal-discovery methods are predicated on exchangeability assumptions, and that the only extensive, widely used real-world "i.i.d." benchmark (the Tübingen dataset) consists mainly of exchangeable (and not i.i.d.) examples. Building on this insight, we introduce a novel synthetic dataset that enforces only the exchangeability assumption, without imposing the stronger i.i.d. assumption. We show that our exchangeable synthetic dataset mirrors the statistical structure of the real-world "i.i.d." dataset more closely than all other i.i.d. synthetic datasets. Furthermore, we demonstrate the predictive capability of this dataset by proposing a neural-network-based causal-discovery algorithm trained exclusively on our synthetic dataset, which performs similarly to other state-of-the-art i.i.d. methods on the real-world benchmark.
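The distinction the abstract draws between exchangeability and the stronger i.i.d. assumption can be illustrated with a small sketch (not the paper's actual dataset construction; the Gaussian latent and noise choices here are hypothetical). Following the de Finetti view, a latent parameter is drawn once per dataset and observations are conditionally i.i.d. given it; marginally, the observations are exchangeable but correlated, hence not i.i.d.:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_exchangeable(n, rng):
    # De Finetti-style construction: draw a shared latent parameter theta
    # once, then draw observations conditionally i.i.d. given theta.
    # Marginally, the x_i are exchangeable but NOT independent.
    theta = rng.normal(0.0, 1.0)           # shared latent (hypothetical choice)
    return rng.normal(theta, 1.0, size=n)  # conditionally i.i.d. draws

# The dependence shows up as cross-observation covariance: two observations
# from the same dataset share theta, so Cov(x_i, x_j) = Var(theta) = 1,
# even though any permutation of (x_1, ..., x_n) has the same joint law.
datasets = np.array([sample_exchangeable(2, rng) for _ in range(20000)])
cov = np.cov(datasets[:, 0], datasets[:, 1])[0, 1]
print(round(cov, 2))  # near 1.0, confirming non-i.i.d. structure
```

Under a strict i.i.d. assumption this covariance would be zero; a synthetic generator that enforces only exchangeability, as the paper proposes, permits such shared-latent dependence.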
Similar Papers
On the identifiability of causal graphs with multiple environments
Machine Learning (Stat)
Finds cause-and-effect relationships using different data.
The Robustness of Differentiable Causal Discovery in Misspecified Scenarios
Machine Learning (CS)
Makes computers understand cause and effect better.
Position: Causal Machine Learning Requires Rigorous Synthetic Experiments for Broader Adoption
Machine Learning (CS)
Tests AI to make sure its decisions are fair.