Optimal community detection in dense bipartite graphs
By: Julien Chhor, Parker Knight
Potential Business Impact:
Find hidden groups in complex connections.
We consider the problem of detecting a community of densely connected vertices in a high-dimensional bipartite graph of size $n_1 \times n_2$. Under the null hypothesis, the observed graph is drawn from a bipartite Erd\H{o}s-Renyi distribution with connection probability $p_0$. Under the alternative hypothesis, there exists an unknown bipartite subgraph of size $k_1 \times k_2$ in which edges appear with probability $p_1 = p_0 + \delta$ for some $\delta > 0$, while all other edges outside the subgraph appear with probability $p_0$. Specifically, we provide non-asymptotic upper and lower bounds on the smallest signal strength $\delta^*$ that is both necessary and sufficient to ensure the existence of a test with small enough type one and type two errors. We also derive novel minimax-optimal tests achieving these fundamental limits when the underlying graph is sufficiently dense. Our proposed tests involve a combination of hard-thresholded nonlinear statistics of the adjacency matrix, the analysis of which may be of independent interest. In contrast with previous work, our non-asymptotic upper and lower bounds match for any configuration of $n_1,n_2, k_1,k_2$.
Similar Papers
Sample Complexity of Correlation Detection in the Gaussian Wigner Model
Statistics Theory
Find hidden connections in data networks.
Rate-optimal community detection near the KS threshold via node-robust algorithms
Machine Learning (Stat)
Finds hidden groups in networks, even with fake data.
On the Asymptotics of the Connectivity Probability of Random Bipartite Graphs
Probability
Helps understand how computer networks connect.