Fast approximate $\ell$-center clustering in high dimensional spaces
By: Mirosław Kowaluk, Andrzej Lingas, Mia Persson
Potential Business Impact:
Groups data into clusters faster, even when each data point has many attributes (high-dimensional data).
We study the design of efficient approximation algorithms for the $\ell$-center clustering and minimum-diameter $\ell$-clustering problems in high dimensional Euclidean and Hamming spaces. Our main tool is randomized dimension reduction. First, we present a general method of reducing the dependence of the running time of a hypothetical algorithm for the $\ell$-center problem in a high dimensional Euclidean space on the dimension size. Utilizing this method in part, we provide $(2+\epsilon)$-approximation algorithms for the $\ell$-center clustering and minimum-diameter $\ell$-clustering problems in Euclidean and Hamming spaces that are substantially faster than the known $2$-approximation ones when both $\ell$ and the dimension are super-logarithmic. Next, we apply the general method to the recent fast approximation algorithms with higher approximation guarantees for the $\ell$-center clustering problem in a high dimensional Euclidean space. Finally, we provide a speed-up of the known $O(1)$-approximation method for the generalization of the $\ell$-center clustering problem to include $z$ outliers (i.e., $z$ input points can be ignored while computing the maximum distance of an input point to a center) in high dimensional Euclidean and Hamming spaces.
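The abstract names two standard ingredients: a randomized dimension reduction and a fast approximation routine for $\ell$-center. The sketch below is not the paper's algorithm; it only illustrates how these ingredients are typically combined, assuming a Johnson-Lindenstrauss-style Gaussian projection and Gonzalez's greedy farthest-point heuristic (a classic $2$-approximation for $\ell$-center). All function names, parameters, and the choice of target dimension are illustrative assumptions.

```python
# Illustrative sketch only (not the authors' method): random Gaussian projection
# followed by Gonzalez's greedy 2-approximation for ell-center in Euclidean space.
import numpy as np


def random_projection(points, target_dim, seed=None):
    """Map n points from R^d to R^target_dim with a scaled Gaussian matrix (JL-style)."""
    rng = np.random.default_rng(seed)
    d = points.shape[1]
    proj = rng.normal(size=(d, target_dim)) / np.sqrt(target_dim)
    return points @ proj


def gonzalez_ell_center(points, ell, seed=None):
    """Greedy farthest-point heuristic: a classic 2-approximation for ell-center."""
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    centers = [int(rng.integers(n))]                      # start from an arbitrary point
    dist = np.linalg.norm(points - points[centers[0]], axis=1)
    for _ in range(ell - 1):
        far = int(np.argmax(dist))                        # farthest point becomes the next center
        centers.append(far)
        dist = np.minimum(dist, np.linalg.norm(points - points[far], axis=1))
    return centers, float(dist.max())                     # center indices and clustering radius


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 512))                      # high-dimensional synthetic input
    k = int(np.ceil(20 * np.log(X.shape[0])))             # hypothetical reduced dimension ~ O(log n)
    Y = random_projection(X, k, seed=1)                   # cluster in the low-dimensional image
    centers, radius = gonzalez_ell_center(Y, ell=10, seed=2)
    print(centers, radius)
```

The point of projecting first is that the greedy routine's per-iteration cost scales with the dimension, so running it on the $O(\log n)$-dimensional image rather than the original space is what removes the dependence on the ambient dimension; the paper's contribution concerns how to do such reductions while controlling the approximation factor, which this sketch does not capture.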
Similar Papers
Dimension Reduction for Clustering: The Curious Case of Discrete Centers
Data Structures and Algorithms
Makes computer data smaller for easier organizing.
Sublinear Data Structures for Nearest Neighbor in Ultra High Dimensions
Data Structures and Algorithms
Helps computers find similar items in huge, complex data.
Dynamic Diameter in High-Dimensions against Adaptive Adversary and Beyond
Data Structures and Algorithms
Keeps data points organized even when they change.