Efficient Defective Clique Enumeration and Search with Worst-Case Optimal Search Space
By: Jihoon Jang , Yehyun Nam , Kunsoo Park and more
Potential Business Impact:
Find groups in networks faster.
A $k$-defective clique is a relaxation of the traditional clique definition, allowing up to $k$ missing edges. This relaxation is crucial in various real-world applications such as link prediction, community detection, and social network analysis. Although the problems of enumerating maximal $k$-defective cliques and searching a maximum $k$-defective clique have been extensively studied, existing algorithms suffer from limitations such as the combinatorial explosion of small partial solutions and sub-optimal search spaces. To address these limitations, we propose a novel clique-first branch-and-bound framework that first generates cliques and then adds missing edges. Furthermore, we introduce a new pivoting technique that achieves a search space size of $\mathcal{O}(3^{\frac{n}{3}} \cdot n^k)$, where $n$ is the number of vertices in the input graph. We prove that the worst-case number of maximal $k$-defective cliques is $Ω(3^{\frac{n}{3}} \cdot n^k)$ when $k$ is a constant, establishing that our algorithm's search space is worst-case optimal. Leveraging the diameter-two property of defective cliques, we further reduce the search space size to $\mathcal{O}(n \cdot 3^{\fracδ{3}} \cdot (δΔ)^k)$, where $δ$ is the degeneracy and $Δ$ is the maximum degree of the input graph. We also propose an efficient framework for maximum $k$-defective clique search based on our branch-and-bound, together with practical techniques to reduce the search space. Experiments on real-world benchmark datasets with more than 1 million edges demonstrate that each of our proposed algorithms for maximal $k$-defective clique enumeration and maximum $k$-defective clique search outperforms the respective state-of-the-art algorithms by up to four orders of magnitude in terms of processing time.
Similar Papers
On the Efficient Discovery of Maximum $k$-Defective Biclique
Data Structures and Algorithms
Finds important patterns even with missing data.
Finding Near-Optimal Maximum Set of Disjoint $k$-Cliques in Real-World Social Networks
Social and Information Networks
Finds best groups of friends in online games.
Engineering Algorithms for $\ell$-Isolated Maximal Clique Enumeration
Data Structures and Algorithms
Filters out unimportant groups in complex networks.