Fast-Convergent Proximity Graphs for Approximate Nearest Neighbor Search
By: Binhong Li, Xiao Yan, Shangqi Lu
Potential Business Impact:
Finds closest matches faster and more accurately.
Approximate nearest neighbor (ANN) search in high-dimensional metric spaces is a fundamental problem with many applications. Over the past decade, proximity graph (PG)-based indexes have demonstrated superior empirical performance over alternatives. However, these methods often lack theoretical guarantees regarding the quality of query results, especially in the worst-case scenarios. In this paper, we introduce the {\alpha}-convergent graph ({\alpha}-CG), a new PG structure that employs a carefully designed edge pruning rule. This rule eliminates candidate neighbors for each data point p by applying the shifted-scaled triangle inequalities among p, its existing out-neighbors, and new candidates. If the distance between the query point q and its exact nearest neighbor v* is at most {\tau} for some constant {\tau} > 0, our {\alpha}-CG finds the exact nearest neighbor in poly-logarithmic time, assuming bounded intrinsic dimensionality for the dataset; otherwise, it can find an ANN in the same time. To enhance scalability, we develop the {\alpha}-convergent neighborhood graph ({\alpha}-CNG), a practical variant that applies the pruning rule locally within each point's neighbors. We also introduce optimizations to reduce the index construction time. Experimental results show that our {\alpha}-CNG outperforms existing PGs on real-world datasets. For most datasets, {\alpha}-CNG can reduce the number of distance computations and search steps by over 15% and 45%, respectively, when compared with the best-performing baseline.
Similar Papers
Fast-Convergent Proximity Graphs for Approximate Nearest Neighbor Search
Data Structures and Algorithms
Finds closest items faster, even with many details.
Graph-Based Approximate Nearest Neighbor Search Revisited: Theoretical Analysis and Optimization
Data Structures and Algorithms
Finds similar items faster, builds lists quicker.
EnhanceGraph: A Continuously Enhanced Graph-based Index for High-dimensional Approximate Nearest Neighbor Search
Databases
Finds similar things faster and more accurately.