Scalable Varied-Density Clustering via Graph Propagation
By: Ninh Pham, Yingtao Zheng, Hugo Phibbs
Potential Business Impact:
Finds hidden groups in huge, messy data fast.
We propose a novel perspective on varied-density clustering for high-dimensional data by framing it as a label propagation process in neighborhood graphs that adapt to local density variations. Our method formally connects density-based clustering with graph connectivity, enabling the use of efficient graph propagation techniques developed in network science. To ensure scalability, we introduce a density-aware neighborhood propagation algorithm and leverage advanced random projection methods to construct approximate neighborhood graphs. Our approach significantly reduces computational cost while preserving clustering quality. Empirically, it scales to datasets with millions of points in minutes and achieves competitive accuracy compared to existing baselines.
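To make the framing in the abstract concrete, here is a minimal sketch of clustering as label propagation over a neighborhood graph that adapts to local density. It is not the authors' algorithm. It assumes a mutual k-nearest-neighbor graph as a stand-in for the density-adaptive neighborhoods, an exact brute-force neighbor search in place of the random-projection-based approximate graph, and minimum-label propagation as the propagation rule. All function names (`knn_lists`, `mutual_knn_edges`, `propagate_labels`, `cluster`) are hypothetical.

```python
import math


def knn_lists(points, k):
    """Exact k nearest neighbors by brute force (O(n^2) distance evaluations).

    Assumption: a scalable system would replace this with an approximate
    search; this version is only suitable for small inputs.
    """
    n = len(points)
    neighbors = []
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        neighbors.append(others[:k])
    return neighbors


def mutual_knn_edges(neighbors):
    """Keep edge (i, j) only if i and j appear in each other's k-NN lists.

    Assumption: mutual k-NN is used here as a simple neighborhood that adapts
    to local density, since the k-NN radius shrinks in dense regions and
    grows in sparse ones.
    """
    neighbor_sets = [set(row) for row in neighbors]
    edges = []
    for i, row in enumerate(neighbors):
        for j in row:
            if i < j and i in neighbor_sets[j]:
                edges.append((i, j))
    return edges


def propagate_labels(n, edges):
    """Propagate the minimum label along edges until nothing changes.

    Each point starts with its own index as label; at convergence, every
    connected component of the graph shares one label.
    """
    labels = list(range(n))
    changed = True
    while changed:
        changed = False
        for i, j in edges:
            smallest = min(labels[i], labels[j])
            if labels[i] != smallest or labels[j] != smallest:
                labels[i] = labels[j] = smallest
                changed = True
    return labels


def cluster(points, k=5):
    """Cluster points as connected components of the mutual k-NN graph."""
    edges = mutual_knn_edges(knn_lists(points, k))
    return propagate_labels(len(points), edges)


# A tight group (spacing 0.1) and a loose group (spacing 5) on a line.
points = [(0.0,), (0.1,), (0.2,), (50.0,), (55.0,), (60.0,)]
print(cluster(points, k=2))  # prints [0, 0, 0, 3, 3, 3]
```

In this toy input the two groups differ in spacing by a factor of 50, so a single fixed radius would either split the loose group or need to be far larger than the tight group requires. The neighbor-based graph recovers both groups with the same `k`.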
Similar Papers
Data Skeleton Learning: Scalable Active Clustering with Sparse Graph Structures
Machine Learning (CS)
Makes computers group data better with less help.
Bounded Graph Clustering with Graph Neural Networks
Machine Learning (CS)
Lets computers find the right number of groups.
A Scalable Approach to Clustering Embedding Projections
Human-Computer Interaction
Finds patterns in data much faster.