Competitively Consistent Clustering
By: Niv Buchbinder, Roie Levin, Yue Yang
Potential Business Impact:
Keeps groups of data organized efficiently over time.
In fully dynamic consistent clustering, we are given a finite metric space $(M,d)$ and a set $F\subseteq M$ of possible locations for opening centers. Data points arrive and depart, and the goal is to maintain an approximately optimal clustering solution at all times while minimizing the recourse, i.e., the total number of additions/deletions of centers over time. Specifically, we study fully dynamic versions of the classical $k$-center, facility location, and $k$-median problems. We design algorithms that, given a parameter $\beta\geq 1$, maintain an $O(\beta)$-approximate solution at all times, and whose total recourse is bounded by $O(\log |F| \log \Delta) \cdot \text{OPT}_\text{rec}^{\beta}$. Here $\text{OPT}_\text{rec}^{\beta}$ is the minimal recourse of an offline algorithm that maintains a $\beta$-approximate solution at all times, and $\Delta$ is the metric aspect ratio. Finally, while we compare the performance of our algorithms to an optimal solution that maintains $k$ centers, our algorithms are allowed to use slightly more than $k$ centers. We obtain our results via a reduction to the recently proposed Positive Body Chasing framework of [Bhattacharya, Buchbinder, Levin, Saranurak, FOCS 2023], which we show gives fractional solutions to our clustering problems online. Our contribution is to round these fractional solutions while preserving the approximation and recourse guarantees. We complement our positive results with logarithmic lower bounds that show our bounds are nearly tight.
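To make the two quantities in the abstract concrete, here is a minimal Python sketch of how one might measure the $k$-center cost of a solution and the total recourse of a sequence of solutions. It is an illustration only, not the paper's algorithm. The function names `k_center_cost` and `total_recourse`, the toy line metric, and the convention that the initial solution is charged against an empty set of centers are all assumptions made for this example.

```python
from typing import Callable, Hashable, Iterable, Sequence, Set

Point = Hashable


def k_center_cost(
    points: Iterable[Point],
    centers: Set[Point],
    d: Callable[[Point, Point], float],
) -> float:
    """k-center objective: largest distance from an active point to its nearest open center."""
    pts = list(points)
    if not pts:
        return 0.0           # no active points, nothing to cover
    if not centers:
        return float("inf")  # active points but no open center
    return max(min(d(p, c) for c in centers) for p in pts)


def total_recourse(solutions: Sequence[Set[Point]]) -> int:
    """Total number of center additions and deletions across consecutive solutions.

    Assumption for this sketch: the process starts with no open centers,
    so opening the first solution is counted as well.
    """
    total = 0
    prev: Set[Point] = set()
    for cur in solutions:
        # The symmetric difference counts centers opened plus centers closed.
        total += len(prev ^ cur)
        prev = cur
    return total


if __name__ == "__main__":
    # Hypothetical toy instance: points on the real line with d(a, b) = |a - b|.
    dist = lambda a, b: abs(a - b)
    history = [{0}, {0, 10}, {10}]                 # center sets over three time steps
    print(total_recourse(history))                 # 3: open 0, open 10, close 0
    print(k_center_cost({1, 9}, {0, 10}, dist))    # 1
    print(k_center_cost({1, 9}, {10}, dist))       # 9
```

In this toy run, closing center 0 in the last step raises the cost from 1 to 9, which illustrates the tension the abstract describes: a solution can stay near-optimal only by changing centers, and each change is charged as recourse.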