Inference for Forecasting Accuracy: Pooled versus Individual Estimators in High-dimensional Panel Data
By: Tim Kutta, Martin Schumann, Holger Dette
Panels with large time $(T)$ and cross-sectional $(N)$ dimensions are a key data structure in the social sciences and other fields. A central question in panel data analysis is whether to pool data across individuals or to estimate separate models. Pooled estimators typically have lower variance but may suffer from bias, creating a fundamental trade-off for optimal estimation. We develop a new inference method to compare the forecasting performance of pooled and individual estimators. Specifically, we propose a confidence interval for the difference between their forecasting errors and establish its asymptotic validity. Our theory allows for complex temporal and cross-sectional dependence in the model errors and covers scenarios where $N$ can be much larger than $T$, including the independent case under the classical condition $N/T^2 \to 0$. The finite-sample properties of the proposed method are examined in an extensive simulation study.
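The bias-variance trade-off described above can be illustrated with a small simulation. The sketch below is not the authors' procedure and does not construct their confidence interval. It assumes a hypothetical one-regressor model $y_{it} = \beta_i x_{it} + e_{it}$ with independent standard normal regressors and errors, and it only reports the point estimate of the gap in out-of-sample squared forecast error between pooled and individual least squares. The function names `simulate_panel` and `forecast_error_gap` and all parameter values are illustrative assumptions.

```python
import numpy as np


def simulate_panel(N, T, slope_sd, rng):
    """Simulate y_it = beta_i * x_it + e_it with T training periods plus one held-out period.

    Assumed toy design: slopes beta_i ~ N(1, slope_sd^2), x and e i.i.d. N(0, 1).
    """
    beta = 1.0 + slope_sd * rng.standard_normal(N)  # heterogeneous slopes
    x = rng.standard_normal((N, T + 1))
    y = beta[:, None] * x + rng.standard_normal((N, T + 1))
    return x, y


def forecast_error_gap(x, y):
    """Mean squared forecast error of pooled OLS minus that of individual OLS.

    The last column is held out as the forecast target. A negative value
    means that pooling forecasts better in this sample.
    """
    x_tr, y_tr = x[:, :-1], y[:, :-1]
    x_te, y_te = x[:, -1], y[:, -1]

    # One slope per individual (no intercept, matching the toy model).
    beta_ind = (x_tr * y_tr).sum(axis=1) / (x_tr ** 2).sum(axis=1)
    # A single slope estimated from all individuals jointly.
    beta_pool = (x_tr * y_tr).sum() / (x_tr ** 2).sum()

    mse_ind = np.mean((y_te - beta_ind * x_te) ** 2)
    mse_pool = np.mean((y_te - beta_pool * x_te) ** 2)
    return mse_pool - mse_ind


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Homogeneous slopes, short panel: pooling has no bias and less variance.
    print(forecast_error_gap(*simulate_panel(5000, 5, 0.0, rng)))
    # Strongly heterogeneous slopes, longer panel: pooling bias dominates.
    print(forecast_error_gap(*simulate_panel(5000, 50, 1.0, rng)))
```

In this toy design the sign of the gap flips as slope heterogeneity and $T$ grow, which is the kind of comparison a confidence interval for the forecasting-error difference is meant to make rigorous.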