Stochastic block models with many communities and the Kesten--Stigum bound
By: Byron Chin , Elchanan Mossel , Youngtak Sohn and more
Potential Business Impact:
Finds hidden groups in connected data.
We study the inference of communities in stochastic block models with a growing number of communities. For block models with $n$ vertices and a fixed number of communities $q$, it was predicted in Decelle et al. (2011) that there are computationally efficient algorithms for recovering the communities above the Kesten--Stigum (KS) bound and that efficient recovery is impossible below the KS bound. This conjecture has since stimulated a lot of interest, with the achievability side proven in a line of research that culminated in the work of Abbe and Sandon (2018). Conversely, recent work by Sohn and Wein (2025) provides evidence for the hardness part using the low-degree paradigm. In this paper we investigate community recovery in the regime $q=q_n \to \infty$ as $n\to\infty$ where no such predictions exist. We show that efficient inference of communities remains possible above the KS bound. Furthermore, we show that recovery of block models is low-degree hard below the KS bound when the number of communities satisfies $q\ll \sqrt{n}$. Perhaps surprisingly, we find that when $q \gg \sqrt{n}$, there is an efficient algorithm based on non-backtracking walks for recovery even below the KS bound. We identify a new threshold and ask if it is the threshold for efficient recovery in this regime. Finally, we show that detection is easy and identify (up to a constant) the information-theoretic threshold for community recovery as the number of communities $q$ diverges. Our low-degree hardness results also naturally have consequences for graphon estimation, improving results of Luo and Gao (2024).
Similar Papers
Phase Transition for Stochastic Block Model with more than $\sqrt{n}$ Communities
Machine Learning (Stat)
Finds hidden groups in complex networks.
Phase Transition for Stochastic Block Model with more than $\sqrt{n}$ Communities (II)
Machine Learning (Stat)
Finds hidden groups in large networks faster.
Rate-optimal community detection near the KS threshold via node-robust algorithms
Machine Learning (Stat)
Finds hidden groups in networks, even with fake data.