Streaming and Massively Parallel Algorithms for Euclidean Max-Cut

Published: March 18, 2025 | arXiv ID: 2503.14362v1

By: Nicolas Menand, Erik Waingarten

Potential Business Impact:

Divides data points to find the best groups.

Business Areas:
Big Data, Data and Analytics

Given a set of vectors $X = \{ x_1,\dots, x_n \} \subset \mathbb{R}^d$, the Euclidean max-cut problem asks for a partition of the vectors into two parts that maximizes the sum of Euclidean distances crossing the partition. We design new algorithms for Euclidean max-cut in models for massive datasets:

• We give a fully-scalable constant-round MPC algorithm using $O(nd) + n \cdot \text{poly}(\log(n) / \epsilon)$ total space which gives a $(1+\epsilon)$-approximate Euclidean max-cut.
• We give a dynamic streaming algorithm using $\text{poly}(d \log \Delta / \epsilon)$ space when $X \subseteq [\Delta]^d$, which provides oracle access to a $(1+\epsilon)$-approximate Euclidean max-cut.

Recently, Chen, Jiang, and Krauthgamer [STOC '23] gave a dynamic streaming algorithm with space $\text{poly}(d\log\Delta/\epsilon)$ that approximates the value of the Euclidean max-cut, but could not provide oracle access to an approximately optimal cut. This was left open in that work, and we resolve it here. Both algorithms follow from the same framework, which analyzes a "parallel" and "subsampled" (Euclidean) version of a greedy algorithm of Mathieu and Schudy [SODA '08] for dense max-cut.
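To make the objective concrete, here is a minimal Python sketch of the Euclidean max-cut value, an exact brute-force solver for tiny inputs, and a simple sequential greedy heuristic. The greedy is written only in the spirit of the Mathieu and Schudy approach mentioned above (visit points in random order, put each on the side that cuts more distance to the points already placed). It is not the parallel, subsampled algorithm of the paper, and the names `cut_value`, `brute_force_max_cut`, and `greedy_cut` are illustrative assumptions, not identifiers from the paper.

```python
import itertools
import math
import random


def cut_value(points, sides):
    """Sum of Euclidean distances over pairs placed on opposite sides."""
    total = 0.0
    for i, j in itertools.combinations(range(len(points)), 2):
        if sides[i] != sides[j]:
            total += math.dist(points[i], points[j])
    return total


def brute_force_max_cut(points):
    """Exact max-cut by enumeration; exponential in n, for tiny inputs only."""
    n = len(points)
    if n == 0:
        return 0.0, []
    best_value, best_sides = 0.0, [0] * n
    # Fix point 0 on side 0: swapping the two sides gives the same cut.
    for rest in itertools.product([0, 1], repeat=n - 1):
        sides = [0] + list(rest)
        value = cut_value(points, sides)
        if value > best_value:
            best_value, best_sides = value, sides
    return best_value, best_sides


def greedy_cut(points, seed=0):
    """Sequential greedy heuristic (an illustrative assumption, see lead-in).

    Points are visited in random order; each goes to the side that cuts
    the larger total distance to the points placed so far.
    """
    rng = random.Random(seed)
    n = len(points)
    order = list(range(n))
    rng.shuffle(order)
    sides = [None] * n
    for i in order:
        # Total distance from point i to the points already on each side.
        dist_to_side = [0.0, 0.0]
        for j in range(n):
            if sides[j] is not None:
                dist_to_side[sides[j]] += math.dist(points[i], points[j])
        # Placing i on side 0 cuts its distances to side 1, and vice versa.
        sides[i] = 0 if dist_to_side[1] >= dist_to_side[0] else 1
    return sides
```

For the four corners of the unit square, `brute_force_max_cut([(0, 0), (1, 0), (1, 1), (0, 1)])` finds the cut that separates one edge of the square from the opposite edge, with value $2 + 2\sqrt{2}$. Because every greedy step cuts at least half of the distance from the new point to those already placed, `greedy_cut` always returns a cut worth at least half of the total pairwise distance; the $(1+\epsilon)$ guarantees stated in the abstract rely on the paper's analysis and are not established by this sketch.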

Country of Origin
🇺🇸 United States

Page Count
80 pages

Category
Computer Science:
Data Structures and Algorithms