Joint estimation of asymmetric community numbers in directed networks
By: Huan Qing
Potential Business Impact:
Finds hidden groups in connected information.
Community detection in directed networks is a central task in network analysis. Unlike undirected networks, directed networks encode inherently asymmetric relationships, giving rise to sender and receiver roles that may each follow a distinct community organization, possibly with different numbers of communities. Estimating these two community counts simultaneously is therefore considerably more challenging than in the undirected setting, yet it is essential for faithful model specification and reliable downstream inference. This work addresses this challenge within the stochastic co-block model (ScBM), a statistical framework for capturing the asymmetric relational structures inherent in directed networks. We propose a novel goodness-of-fit test based on the deviation of the largest singular value of a normalized residual matrix from the constant value 2. We show that an upper bound on this test statistic converges to zero under the null hypothesis, whereas the statistic diverges to infinity when the true model has finer communities than hypothesized. Leveraging this contrast in behavior, we develop an efficient sequential testing algorithm that explores candidate pairs of community numbers in lexicographic order. To enhance robustness in practical settings, we further introduce a ratio-based variant that detects the transition point in the sequence of test statistics. We prove that both algorithms consistently recover the true sender and receiver community counts under ScBM. Numerical experiments demonstrate the accuracy and robustness of our methods in estimating community numbers across diverse ScBM settings.
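The test statistic described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact normalization or the community-recovery step, so everything below is an assumption made for illustration: the residual is taken to be (A - P_hat) / sqrt(n * P_hat * (1 - P_hat)), block probabilities are estimated by block averages, and the sender and receiver labels are supplied by the caller rather than estimated. The function names `simulate_scbm`, `block_probabilities`, and `residual_statistic` are hypothetical and do not come from the paper.

```python
import numpy as np


def simulate_scbm(n, B, rng):
    """Draw a directed adjacency matrix from a simple ScBM (illustrative only).

    B is a k_row x k_col matrix of edge probabilities between sender
    communities (rows) and receiver communities (columns).
    """
    k_row, k_col = B.shape
    row_labels = rng.integers(0, k_row, size=n)  # sender community of each node
    col_labels = rng.integers(0, k_col, size=n)  # receiver community of each node
    P = B[row_labels][:, col_labels]             # n x n edge probabilities
    A = (rng.random((n, n)) < P).astype(float)
    return A, row_labels, col_labels


def block_probabilities(A, row_labels, col_labels, k_row, k_col):
    """Estimate block probabilities by averaging A over each block."""
    B_hat = np.zeros((k_row, k_col))
    for r in range(k_row):
        for c in range(k_col):
            block = A[np.ix_(row_labels == r, col_labels == c)]
            B_hat[r, c] = block.mean() if block.size else 0.0
    return B_hat


def residual_statistic(A, row_labels, col_labels, k_row, k_col, eps=1e-10):
    """Return |sigma_1(R) - 2| for an assumed normalized residual matrix R."""
    n = A.shape[0]
    B_hat = block_probabilities(A, row_labels, col_labels, k_row, k_col)
    P_hat = np.clip(B_hat[row_labels][:, col_labels], eps, 1 - eps)
    # Assumed normalization: entries of R have variance close to 1/n when
    # the hypothesized block structure is correct.
    R = (A - P_hat) / np.sqrt(n * P_hat * (1 - P_hat))
    sigma_1 = np.linalg.norm(R, 2)  # largest singular value
    return abs(sigma_1 - 2.0)


rng = np.random.default_rng(0)
B = np.array([[0.5, 0.1, 0.3],
              [0.1, 0.4, 0.2]])  # 2 sender and 3 receiver communities
A, row_labels, col_labels = simulate_scbm(600, B, rng)

# Correctly specified model: the statistic should be close to zero.
t_true = residual_statistic(A, row_labels, col_labels, 2, 3)

# Underfitted model (one sender and one receiver community): should be large.
zeros = np.zeros(600, dtype=int)
t_under = residual_statistic(A, zeros, zeros, 1, 1)

print(f"correct (2, 3): {t_true:.3f}   underfitted (1, 1): {t_under:.3f}")
```

In this sketch the true labels stand in for the output of a community-detection step. A sequential procedure in the spirit of the abstract would instead re-estimate labels for each candidate pair and stop at the first pair whose statistic falls below a threshold; the choice of clustering method and threshold is left open here because the abstract does not specify them.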