The Role of Confounders and Linearity in Ecological Inference: A Reassessment
By: Shiro Kuriwaki, Cory McCartan
Estimating conditional means using only the marginal means available from aggregate data is commonly known as the ecological inference (EI) problem. We provide a reassessment of EI, including a new formalization of identification conditions and a demonstration of how these conditions fail to hold in common cases. The identification conditions reveal that, as in causal inference, credible ecological inference requires controlling for confounders. The aggregation process itself creates additional structure that assists estimation by restricting the conditional expectation function to be linear in the predictor variable. A linear model perspective also clarifies the differences between the EI methods commonly used in the literature and when they lead to ecological fallacies. We provide an overview of new methodology that builds on both the identification and linearity results to flexibly control for confounders and yield improved ecological inferences. Finally, using datasets for common EI problems in which the ground truth is fortuitously observed, we show that, while covariates can help, all methods are prone to overestimating both racial polarization and nationalized partisan voting.
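The linearity and confounding points can be illustrated with a small simulation. The sketch below is not the authors' method: it uses a classical Goodman-style ecological regression (ordinary least squares of the aggregate outcome on a group's share), and every name, parameter value, and the data-generating process are assumptions chosen for illustration. It relies on the accounting identity y_i = b1_i * x_i + b0_i * (1 - x_i), where x_i is one group's share in unit i and b1_i, b0_i are the unobserved group-specific means.

```python
import numpy as np

def goodman_regression(x, y):
    """OLS of aggregate outcome y on group share x.

    Returns (estimate for group 0, estimate for group 1), i.e. the fitted
    value at x = 0 and at x = 1. Hypothetical helper for illustration only.
    """
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    intercept, slope = coef
    return intercept, intercept + slope

rng = np.random.default_rng(0)
n = 5000  # number of aggregate units (e.g. precincts); assumed value
x = rng.uniform(0.05, 0.95, size=n)  # share of group 1 in each unit

# Case 1: group-specific means are unrelated to x (no confounding).
b1 = 0.8 + rng.normal(0, 0.05, n)
b0 = 0.3 + rng.normal(0, 0.05, n)
y = b1 * x + b0 * (1 - x)  # accounting identity: only y and x are observed
est0, est1 = goodman_regression(x, y)

# Case 2: group 0's mean rises with x, so x is confounded with b0.
b1_conf = 0.8 + rng.normal(0, 0.05, n)
b0_conf = 0.3 + 0.3 * x + rng.normal(0, 0.05, n)
y_conf = b1_conf * x + b0_conf * (1 - x)
est0_c, est1_c = goodman_regression(x, y_conf)

true_gap = b1_conf.mean() - b0_conf.mean()  # unweighted unit-level averages
est_gap = est1_c - est0_c
print(f"no confounding: est0={est0:.3f}, est1={est1:.3f}")
print(f"confounded:     true gap={true_gap:.3f}, estimated gap={est_gap:.3f}")
```

In the first case the regression recovers the group means closely, because the group-specific means do not vary systematically with the group share. In the second case the estimated gap between groups exceeds the true gap, a toy version of overestimated polarization; the direction and size of the bias here follow from the assumed data-generating process, not from any general result.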
Similar Papers
Identification and Semiparametric Estimation of Conditional Means from Aggregate Data
Methodology
Figures out hidden group averages from overall averages.
Long-term Causal Inference via Modeling Sequential Latent Confounding
Machine Learning (CS)
Finds true causes even with messy past data.
Estimation in linear models with clustered data
Econometrics
Helps understand how groups of people affect each other.