Improved Streaming Algorithm for Fair $k$-Center Clustering
By: Longkun Guo , Zeyu Lin , Chaoqi Jia and more
Potential Business Impact:
Helps computers group data fairly for everyone.
Many real-world applications pose challenges in incorporating fairness constraints into the $k$-center clustering problem, where the dataset consists of $m$ demographic groups, each with a specified upper bound on the number of centers to ensure fairness. Focusing on big data scenarios, this paper addresses the problem in a streaming setting, where data points arrive one by one sequentially in a continuous stream. Leveraging a structure called the $\lambda$-independent center set, we propose a one-pass streaming algorithm that first computes a reserved set of points during the streaming process. Then, for the post-streaming process, we propose an approach for selecting centers from the reserved point set by analyzing all three possible cases, transforming the most complicated one into a specially constrained vertex cover problem in an auxiliary graph. Our algorithm achieves a tight approximation ratio of 5 while consuming $O(k\log n)$ memory. It can also be readily adapted to solve the offline fair $k$-center problem, achieving a 3-approximation ratio that matches the current state of the art. Furthermore, we extend our approach to a semi-structured data stream, where data points from each group arrive in batches. In this setting, we present a 3-approximation algorithm for $m = 2$ and a 4-approximation algorithm for general $m$. Lastly, we conduct extensive experiments to evaluate the performance of our approaches, demonstrating that they outperform existing baselines in both clustering cost and runtime efficiency.
Similar Papers
Fair Center Clustering in Sliding Windows
Data Structures and Algorithms
Finds best spots for groups, fairly, using less space.
Local Search-based Individually Fair Clustering with Outliers
Data Structures and Algorithms
Groups data fairly, even with bad data.
Fair Clustering in the Sliding Window Model
Data Structures and Algorithms
Makes computer groups fairer, even with changing data.