When Does the Silhouette Score Work? A Comprehensive Study in Network Clustering
By: Zongyue Teng, Jun Yan, Dandan Liu and more
Selecting the number of communities is a fundamental challenge in network clustering. The silhouette score offers an intuitive, model-free criterion that balances within-cluster cohesion and between-cluster separation. Despite its widespread use in clustering analysis, its performance in network-based community detection remains insufficiently characterized. In this study, we comprehensively evaluate the silhouette score across unweighted, weighted, and fully connected networks, examining how network size, separation strength, and community size imbalance influence its performance. Simulation studies show that the silhouette score accurately identifies the true number of communities when clusters are well separated and balanced, but it tends to underestimate that number under strong imbalance or weak separation and to overestimate it in sparse networks. Extending the evaluation to a real airline reachability network, we demonstrate that silhouette-based clustering can recover geographically interpretable and market-oriented clusters. These findings provide empirical guidance for applying the silhouette score in network clustering and clarify the conditions under which its use is most reliable.
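As a rough illustration of the criterion described in the abstract, the sketch below computes the mean silhouette width for candidate partitions of a small toy network and keeps the number of communities with the highest score. It is not the authors' procedure: the use of shortest-path hop counts as the node distance, the toy graph, the hand-picked candidate partitions, and the function names hop_distances and silhouette_score are all assumptions made for this example. In practice the candidate partitions would come from a community detection algorithm run at each candidate number of communities.

```python
from collections import deque

def hop_distances(adj):
    """All-pairs shortest-path hop counts via BFS (assumes a connected graph)."""
    n = len(adj)
    dist = []
    for source in range(n):
        d = [None] * n
        d[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if d[v] is None:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist.append(d)
    return dist

def silhouette_score(dist, labels):
    """Mean silhouette width of a labeling, given a pairwise distance matrix."""
    clusters = {}
    for i, c in enumerate(labels):
        clusters.setdefault(c, []).append(i)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")
    total = 0.0
    for i, c in enumerate(labels):
        own = clusters[c]
        if len(own) == 1:
            continue  # common convention: a singleton cluster contributes 0
        # a: mean distance to the other members of the node's own cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != c
        )
        total += (b - a) / max(a, b)
    return total / len(labels)

# Toy network (assumption): two triangles joined by the single edge 2-3.
adjacency = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
distances = hop_distances(adjacency)

# Hypothetical candidate partitions, keyed by number of communities.
candidates = {
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 1, 1, 2, 2],
}
scores = {k: silhouette_score(distances, labels) for k, labels in candidates.items()}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On this well-separated, balanced toy graph the two-community partition scores higher than the three-community one, which matches the regime in which the abstract reports the score to be reliable.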
Similar Papers
Silhouette-Guided Instance-Weighted k-means
Machine Learning (CS)
Improves computer grouping by ignoring bad data.
Estimating the Optimal Number of Clusters in Categorical Data Clustering by Silhouette Coefficient
Machine Learning (CS)
Finds the best number of groups in data.
CAS Condensed and Accelerated Silhouette: An Efficient Method for Determining the Optimal K in K-Means Clustering
Machine Learning (CS)
Finds best groups in data much faster.