Reclustering: A New Method to Test the Appropriate Level of Clustering

Published: November 11, 2025 | arXiv ID: 2511.08184v1

By: Kentaro Fukumoto

Potential Business Impact:

Tests which level of grouping to use when computing cluster-robust standard errors.

Business Areas:
A/B Testing, Data and Analytics

When scholars suspect that units are dependent within clusters but independent across clusters, they employ cluster-robust standard errors (CRSEs). What to cluster over, however, is sometimes unknown. For instance, in cross-sectional survey samples, clusters may be households, municipalities, counties, or states. A few approaches have been proposed, but they rely on asymptotics. I propose a new method to address this issue that works in finite samples: reclustering. That is, we randomly and repeatedly group fine clusters into new gross clusters and calculate a cluster-sensitive statistic, such as a CRSE, for each regrouping. Under the null hypothesis that fine clusters are independent of each other, how they are grouped into gross clusters should not matter for any cluster-sensitive statistic. Thus, if the statistic based on the original clustering is a significant outlier relative to the distribution of the statistics induced by reclustering, it is reasonable to reject the null hypothesis and employ gross clusters. I compare the performance of reclustering with that of a few previous tests using Monte Carlo simulations and an application.
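
The reclustering idea described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation, and several choices here are assumptions made only for illustration: the statistic is the CRSE of a simple sample mean (without a small-sample correction), fine clusters are assumed to be nested in gross clusters, regrouping is done by permuting the gross labels across fine clusters (so gross cluster sizes, counted in fine clusters, are preserved), and the p-value is one-sided. The function names `crse_of_mean` and `reclustering_test` are hypothetical.

```python
import numpy as np


def crse_of_mean(y, groups):
    """CRSE of the sample mean, clustered on `groups` (no small-sample correction)."""
    y = np.asarray(y, dtype=float)
    resid = y - y.mean()
    # Sum residuals within each cluster, then combine the squared cluster sums.
    _, cluster_idx = np.unique(groups, return_inverse=True)
    cluster_sums = np.bincount(cluster_idx, weights=resid)
    return np.sqrt(np.sum(cluster_sums ** 2)) / len(y)


def reclustering_test(y, fine, gross, n_draws=999, seed=0):
    """Compare the gross-level CRSE with CRSEs from random regroupings.

    Assumption: every fine cluster belongs to exactly one gross cluster.
    Returns the observed statistic, the reclustered draws, and a one-sided
    p-value (share of draws at least as large as the observed statistic).
    """
    rng = np.random.default_rng(seed)
    fine = np.asarray(fine)
    gross = np.asarray(gross)

    # Map each unit to its fine cluster, and each fine cluster to its gross label.
    fine_ids, fine_idx = np.unique(fine, return_inverse=True)
    gross_of_fine = np.empty(len(fine_ids), dtype=gross.dtype)
    gross_of_fine[fine_idx] = gross

    observed = crse_of_mean(y, gross)

    draws = np.empty(n_draws)
    for b in range(n_draws):
        # Randomly reassign fine clusters to gross labels, keeping group sizes.
        shuffled = rng.permutation(gross_of_fine)
        draws[b] = crse_of_mean(y, shuffled[fine_idx])

    p_value = (1 + np.sum(draws >= observed)) / (n_draws + 1)
    return observed, draws, p_value


if __name__ == "__main__":
    # Simulated example (hypothetical data): 20 gross clusters, each with
    # 5 fine clusters of 4 units, and a shared gross-level shock.
    rng = np.random.default_rng(1)
    fine = np.repeat(np.arange(100), 4)
    gross = fine // 5
    y = rng.normal(0, 3, 20)[gross] + rng.normal(0, 1, len(fine))
    observed, draws, p_value = reclustering_test(y, fine, gross)
    print(f"observed CRSE = {observed:.3f}, p-value = {p_value:.3f}")
```

In this sketch, a small p-value means the CRSE under the original gross clustering is unusually large compared with random regroupings of the fine clusters, which would suggest dependence across fine clusters and favor clustering at the gross level. Other statistics, regrouping schemes, or two-sided comparisons could be substituted.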

Country of Origin
🇯🇵 Japan

Page Count
29 pages

Category
Statistics:
Methodology