Reclustering: A New Method to Test the Appropriate Level of Clustering
By: Kentaro Fukumoto
Potential Business Impact:
Finds the best way to group data for analysis.
When scholars suspect that units are dependent within clusters but independent across clusters, they employ cluster-robust standard errors (CRSEs). However, the appropriate level at which to cluster is sometimes unknown. For instance, in cross-sectional survey samples, clusters may be households, municipalities, counties, or states. A few approaches have been proposed, but they rely on asymptotics. I propose a new method that addresses this issue and works in finite samples: reclustering. That is, we randomly and repeatedly group fine clusters into new gross clusters and calculate a cluster-sensitive statistic, such as a CRSE, each time. Under the null hypothesis that fine clusters are independent of each other, how they are grouped into gross clusters should not matter for any cluster-sensitive statistic. Thus, if the statistic based on the original clustering is a significant outlier relative to the distribution of the statistics induced by reclustering, it is reasonable to reject the null hypothesis and employ gross clusters. I compare the performance of reclustering with that of a few previous tests using Monte Carlo simulations and an empirical application.
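The reclustering idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes an OLS model, a basic sandwich (CR0) CRSE with no small-sample correction, fine clusters nested within gross clusters, reclustering by permuting the gross labels across fine clusters (which preserves the number of fine clusters per gross cluster), and a one-sided p-value. The function names `crse` and `recluster_test` and the simulated data are hypothetical.

```python
import numpy as np

def crse(X, y, groups):
    """Cluster-robust (CR0 sandwich) standard errors for OLS coefficients."""
    bread = np.linalg.inv(X.T @ X)
    resid = y - X @ (bread @ X.T @ y)
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        score = X[groups == g].T @ resid[groups == g]  # summed score of cluster g
        meat += np.outer(score, score)
    return np.sqrt(np.diag(bread @ meat @ bread))

def recluster_test(X, y, fine, gross, n_draws=999, coef=1, seed=0):
    """Compare the gross-cluster CRSE with CRSEs from random regroupings.

    Assumption: each fine cluster lies entirely inside one gross cluster.
    """
    rng = np.random.default_rng(seed)
    # One gross label per fine cluster, plus each observation's fine-cluster index.
    _, first, fine_idx = np.unique(fine, return_index=True, return_inverse=True)
    gross_of_fine = gross[first]
    observed = crse(X, y, gross)[coef]
    draws = np.empty(n_draws)
    for b in range(n_draws):
        # Reassign fine clusters to gross clusters at random, keeping group sizes.
        shuffled = rng.permutation(gross_of_fine)
        draws[b] = crse(X, y, shuffled[fine_idx])[coef]
    # One-sided p-value (an assumption): is the observed CRSE unusually large?
    p_value = (1 + np.sum(draws >= observed)) / (n_draws + 1)
    return observed, draws, p_value

# Simulated example (hypothetical data): 20 gross clusters, 5 fine clusters each,
# 4 observations per fine cluster, with dependence at the gross level.
rng = np.random.default_rng(1)
n_gross, fine_per_gross, obs_per_fine = 20, 5, 4
fine = np.repeat(np.arange(n_gross * fine_per_gross), obs_per_fine)
gross = fine // fine_per_gross
x = rng.normal(size=n_gross)[gross] + 0.3 * rng.normal(size=fine.size)
e = 3 * rng.normal(size=n_gross)[gross] + rng.normal(size=fine.size)
y = 1 + 2 * x + e
X = np.column_stack([np.ones_like(x), x])

observed, draws, p_value = recluster_test(X, y, fine, gross, n_draws=199)
print(f"observed CRSE: {observed:.3f}, p-value: {p_value:.3f}")
```

A small p-value here would suggest that fine clusters are not independent, so clustering at the gross level is warranted. Other cluster-sensitive statistics could be substituted for the CRSE in the same loop.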