Score: 1

Graph-Based Approximate Nearest Neighbor Search Revisited: Theoretical Analysis and Optimization

Published: September 19, 2025 | arXiv ID: 2509.15531v1

By: Xinran Ma , Zhaoqi Zhou , Chuan Zhou and more

BigTech Affiliations: Huawei

Potential Business Impact:

Finds similar items faster, builds lists quicker.

Business Areas:
Semantic Search Internet Services

Graph-based approaches to approximate nearest neighbor search (ANNS) have achieved remarkable success in enabling fast, high-recall retrieval on billion-scale vector datasets. Among them, the Sparse Neighborhood Graph (SNG) has emerged as a widely adopted graph structure due to its superior search performance. However, the theoretical understanding of SNG remains limited, leading to reliance on heuristic-based and often suboptimal truncation strategies. In this work, we aim to bridge the gap between theory and practice by providing formal guarantees for graph-based ANNS methods and proposing principled optimization strategies for the truncation parameter. By characterizing the index construction process through martingale-based analysis, we show that the degree of the index graph is $O(n^{2/3+\epsilon})$, where $\epsilon$ is an arbitrarily small constant. Furthermore, we prove that the expected search path length during query processing is $O(\log n)$. Based on these theoretical insights, we introduce a novel and principled method for selecting the truncation parameter $R$ in SNG. Experimental results demonstrate that our method achieves comparable or superior performance in terms of query latency and Recall@10 compared to commonly used binary search heuristics, while yielding 2x to 9x speedups in overall index construction.

Country of Origin
🇨🇳 China

Page Count
26 pages

Category
Computer Science:
Data Structures and Algorithms