Score: 2

Randomized Dimensionality Reduction for Euclidean Maximization and Diversity Measures

Published: May 30, 2025 | arXiv ID: 2506.00165v1

By: Jie Gao, Rajesh Jayaram, Benedikt Kolbe and more

BigTech Affiliations: Google

Potential Business Impact:

Speeds up large-scale optimization on big datasets by shrinking the data to far fewer dimensions before solving.

Business Areas:
A/B Testing, Data and Analytics

Randomized dimensionality reduction is a widely-used algorithmic technique for speeding up large-scale Euclidean optimization problems. In this paper, we study dimension reduction for a variety of maximization problems, including max-matching, max-spanning tree, max TSP, as well as various measures for dataset diversity. For these problems, we show that the effect of dimension reduction is intimately tied to the \emph{doubling dimension} $\lambda_X$ of the underlying dataset $X$ -- a quantity measuring intrinsic dimensionality of point sets. Specifically, we prove that a target dimension of $O(\lambda_X)$ suffices to approximately preserve the value of any near-optimal solution, which we also show is necessary for some of these problems. This is in contrast to classical dimension reduction results, whose dependence increases with the dataset size $|X|$. We also provide empirical results validating the quality of solutions found in the projected space, as well as speedups due to dimensionality reduction.
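As a rough illustration of the pipeline the abstract describes (project, solve in the low-dimensional space, evaluate the solution on the original data), the sketch below applies a plain Gaussian random projection to a synthetic dataset with low intrinsic dimension and checks how well the max-spanning-tree value is preserved. This is not the paper's code or experimental setup: the dataset sizes, the target dimension m, and the helper max_spanning_tree_edges are hypothetical choices made for the demo.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

rng = np.random.default_rng(0)

# Synthetic dataset: n points embedded in ambient dimension d but lying in a
# k-dimensional subspace, so the intrinsic (doubling) dimension is small.
n, d, k = 500, 200, 5
X = rng.normal(size=(n, k)) @ rng.normal(size=(k, d))

def max_spanning_tree_edges(dist):
    """Edges (i, j) of a maximum spanning tree of the complete graph on `dist`."""
    # Negating the weights turns a minimum spanning tree into a maximum one.
    mst = minimum_spanning_tree(-dist).tocoo()
    return list(zip(mst.row, mst.col))

# Gaussian random projection to a small target dimension m (an illustrative
# choice; the paper's guarantee is that m = O(lambda_X) suffices).
m = 20
G = rng.normal(size=(d, m)) / np.sqrt(m)
X_proj = X @ G

dist_orig = squareform(pdist(X))
dist_proj = squareform(pdist(X_proj))

# Solve in the projected space, then evaluate that tree under original distances.
tree_proj = max_spanning_tree_edges(dist_proj)
value_lifted = sum(dist_orig[i, j] for i, j in tree_proj)

# Baseline: maximum spanning tree computed directly in the original space.
tree_orig = max_spanning_tree_edges(dist_orig)
value_opt = sum(dist_orig[i, j] for i, j in tree_orig)

print(f"optimal value (original space): {value_opt:.2f}")
print(f"value of tree found in projected space: {value_lifted:.2f}")
print(f"ratio: {value_lifted / value_opt:.3f}")
```

Because the points lie near a low-dimensional subspace, the tree found after projection typically retains most of the optimal objective value; this is the qualitative behavior that the paper's $O(\lambda_X)$ target-dimension bound formalizes.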

Country of Origin
🇮🇱 🇺🇸 Israel, United States

Page Count
23 pages

Category
Computer Science:
Data Structures and Algorithms