Conditional Selective Inference for the Selected Groups in Panel Data

Published: November 6, 2025 | arXiv ID: 2511.04466v1

By: Chuang Wan, Jiajun Sun, Xingbai Xu

Potential Business Impact:

Corrects statistical tests that compare groups found from the same data.

Business Areas:

A/B Testing, Data and Analytics

We consider the problem of testing for differences in group-specific slopes between groups in panel data that were selected via k-means clustering. In this setting, the classical Wald-type test statistic is problematic because it produces a severely inflated type I error probability. The underlying reason is that the same dataset is used both to identify the group structure and to construct the test statistic, which creates dependence between the selection and inference stages. To address this issue, we propose a valid selective inference approach that conditions on the selection event to account for the selection effect. We formally define the selective type I error and describe how to efficiently compute the correct p-values for clusters obtained using k-means clustering. Furthermore, the same idea can be extended to test for differences in the coefficient of a single covariate and can be incorporated into the GMM estimation framework. Simulation studies show that our method has satisfactory finite-sample performance. We apply the method to explore the heterogeneous relationship between economic growth and CO$_2$ emissions across countries and report some new findings. An R package, TestHomoPanel, is provided to implement the proposed selective inference framework for panel data.
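The inflation of the naive Wald-type test described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure and does not use the TestHomoPanel package. It assumes a deliberately simplified panel model with one regressor, no fixed effects, and a common slope for every unit, so the null hypothesis of equal group slopes is true by construction. The helper names (`two_means_1d`, `naive_wald_pvalue`, `rejection_rate`) and all parameter values are hypothetical choices made for illustration. Units are split into two groups either by k-means on their unit-level OLS slopes or by a fixed split that ignores the data, and the same naive test is applied in both cases.

```python
import numpy as np
from math import erfc, sqrt


def two_means_1d(values, n_iter=50):
    """Lloyd's algorithm with k=2 on scalars, initialised at the min and max."""
    centers = np.array([values.min(), values.max()], dtype=float)
    labels = np.zeros(values.shape[0], dtype=int)
    for _ in range(n_iter):
        # Assign each value to the nearer of the two centers.
        labels = (np.abs(values - centers[1]) < np.abs(values - centers[0])).astype(int)
        new_centers = np.array([values[labels == k].mean() for k in (0, 1)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def naive_wald_pvalue(x, y, labels):
    """Two-sided normal p-value for equal pooled slopes, ignoring how labels were chosen."""
    est, var = [], []
    for k in (0, 1):
        xk, yk = x[labels == k].ravel(), y[labels == k].ravel()
        sxx = np.sum(xk ** 2)
        slope = np.sum(xk * yk) / sxx           # pooled OLS slope without intercept
        resid = yk - slope * xk
        sigma2 = np.sum(resid ** 2) / (xk.size - 1)
        est.append(slope)
        var.append(sigma2 / sxx)
    z = (est[0] - est[1]) / sqrt(var[0] + var[1])
    return erfc(abs(z) / sqrt(2))               # equals 2 * (1 - Phi(|z|))


def rejection_rate(data_driven, n_units=40, n_periods=20, n_rep=200, alpha=0.05, seed=0):
    """Share of replications in which the naive test rejects a true null."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_rep):
        x = rng.standard_normal((n_units, n_periods))
        y = 1.0 * x + rng.standard_normal((n_units, n_periods))  # same slope for all units
        if data_driven:
            unit_slopes = np.sum(x * y, axis=1) / np.sum(x ** 2, axis=1)
            labels = two_means_1d(unit_slopes)   # groups selected from the same data
        else:
            labels = (np.arange(n_units) >= n_units // 2).astype(int)  # fixed split
        rejections += naive_wald_pvalue(x, y, labels) < alpha
    return rejections / n_rep


if __name__ == "__main__":
    print("naive test, k-means groups:", rejection_rate(data_driven=True))
    print("naive test, fixed groups:  ", rejection_rate(data_driven=False))
```

Under these assumptions, the fixed split should reject at a rate close to the nominal level, while the k-means split should reject far more often, because clustering on estimated slopes guarantees that the two groups look different. A selective inference procedure of the kind the abstract proposes would instead compute the p-value conditional on the clustering outcome; that step is not attempted here.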