A new type of federated clustering: A non-model-sharing approach
By: Yuji Kawamata, Kaoru Kamijo, Masateru Kihira and more
Potential Business Impact:
Lets groups learn from private data together.
In recent years, the growing need to leverage sensitive data across institutions has led to increased attention on federated learning (FL), a decentralized machine learning paradigm that enables model training without sharing raw data. However, existing FL-based clustering methods, known as federated clustering, typically assume simple data partitioning scenarios such as horizontal or vertical splits, and cannot handle more complex distributed structures. This study proposes data collaboration clustering (DC-Clustering), a novel federated clustering method that supports clustering over complex data partitioning scenarios where horizontal and vertical splits coexist. In DC-Clustering, each institution shares only intermediate representations instead of raw data, ensuring privacy preservation while enabling collaborative clustering. The method allows flexible selection between k-means and spectral clustering, and achieves final results with a single round of communication with the central server. We conducted extensive experiments using synthetic and open benchmark datasets. The results show that our method achieves clustering performance comparable to centralized clustering where all data are pooled. DC-Clustering addresses an important gap in current FL research by enabling effective knowledge discovery from distributed heterogeneous data. Its practical properties -- privacy preservation, communication efficiency, and flexibility -- make it a promising tool for privacy-sensitive domains such as healthcare and finance.
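To make the partitioning scenarios named in the abstract concrete, the following minimal sketch contrasts a horizontal split, a vertical split, and a mixed split in which both coexist. It is an illustration only: the toy matrix `X`, the helper `block`, and the two-by-two institution layout are assumptions introduced here, not part of the paper's method or code.

```python
# Toy data matrix: 4 samples (rows) x 4 features (columns).
# Values are arbitrary and chosen only so blocks are easy to recognize.
X = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]


def block(data, rows, cols):
    """Return the sub-matrix (hypothetical helper) held by one institution."""
    return [[data[r][c] for c in cols] for r in rows]


# Horizontal split: institutions hold different samples but the same features.
horizontal = [block(X, [0, 1], range(4)), block(X, [2, 3], range(4))]

# Vertical split: institutions hold the same samples but different features.
vertical = [block(X, range(4), [0, 1]), block(X, range(4), [2, 3])]

# Mixed split: samples and features are both divided, so each of the
# four institutions sees only one block of the full matrix.
mixed = {
    (i, j): block(X, rows, cols)
    for i, rows in enumerate([[0, 1], [2, 3]])
    for j, cols in enumerate([[0, 1], [2, 3]])
}

for key, held in mixed.items():
    print(key, held)
```

In the mixed case no single institution holds a complete row or a complete column, which is the setting the abstract describes as beyond plain horizontal or vertical federated clustering.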
Similar Papers
Federated Learning: A Survey on Privacy-Preserving Collaborative Intelligence
Machine Learning (CS)
Trains computers together without sharing private info.
DFCA: Decentralized Federated Clustering Algorithm
Machine Learning (CS)
Lets computers learn together without a boss.
One-Shot Clustering for Federated Learning
Machine Learning (CS)
Finds best time to group devices for learning.