Filtered Approximate Nearest Neighbor Search: A Unified Benchmark and Systematic Experimental Study [Experiment, Analysis & Benchmark]

Published: September 9, 2025 | arXiv ID: 2509.07789v1

By: Jiayang Shi, Yuzheng Cai, Weiguo Zheng

Potential Business Impact:

Finds best matches faster, even with rules.

Business Areas:

Semantic Search, Internet Services

Given a dataset $\mathcal{D}$ and a structured label $f$, the goal of Filtered Approximate Nearest Neighbor Search (FANNS) algorithms is to find the top-$k$ points closest to a query among those that satisfy the label constraints, while maintaining both high recall and high QPS (Queries Per Second). Many FANNS algorithms have been proposed in recent years, but the lack of a systematic investigation makes it difficult to understand their relative strengths and weaknesses. Additionally, we found that: (1) FANNS algorithms have coupled, dataset-dependent parameters, which leads to biased comparisons; (2) key impact factors are rarely analyzed systematically, so it remains unclear when each algorithm performs well; and (3) disparate datasets, workloads, and biased experiment designs make cross-algorithm comparisons unreliable. A comprehensive survey and benchmark for FANNS is therefore crucial to achieve three goals: designing a fair evaluation and clarifying the classification of algorithms, conducting an in-depth analysis of their performance, and establishing a unified benchmark. First, we propose a taxonomy (dividing methods into \textit{filter-then-search}, \textit{search-then-filter}, and \textit{hybrid-search}) and a systematic evaluation framework that integrates unified parameter tuning and standardized filtering across algorithms, reducing implementation-induced performance variations so that results reflect core trade-offs. Then, we conduct a comprehensive empirical study of how query difficulty and dataset properties affect performance, evaluating robustness under pressures such as filter selectivity, Recall@k, and scalability to clarify each method's strengths. Finally, we establish a standardized benchmark with real-world datasets and open-source the related resources to ensure reproducible future research.
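
To make the taxonomy and the Recall@k metric concrete, here is a minimal brute-force sketch in Python. It is not the paper's benchmark code. The function names, the toy data, the `fetch` parameter, and the convention that an empty ground truth counts as perfect recall are all assumptions made for illustration. Real FANNS systems replace the exhaustive scans with an approximate index, and the hybrid-search family is omitted because it depends on index-specific structure.

```python
import heapq


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filter_then_search(points, labels, query, wanted, k):
    """Apply the label filter first, then rank only the survivors by distance.

    Brute force over the filtered set, so the result is exact.
    """
    candidates = [i for i, label in enumerate(labels) if label == wanted]
    return heapq.nsmallest(k, candidates, key=lambda i: sq_dist(points[i], query))


def search_then_filter(points, labels, query, wanted, k, fetch):
    """Retrieve the `fetch` nearest points ignoring labels, then filter.

    May return fewer than k results when `fetch` is too small for the
    filter's selectivity.
    """
    nearest = heapq.nsmallest(
        fetch, range(len(points)), key=lambda i: sq_dist(points[i], query)
    )
    return [i for i in nearest if labels[i] == wanted][:k]


def recall_at_k(found, truth, k):
    """Fraction of the true top-k (under the filter) that was retrieved."""
    expected = set(truth[:k])
    if not expected:
        return 1.0  # assumption: nothing to find counts as perfect recall
    return len(set(found[:k]) & expected) / len(expected)


# Toy data: six 1-D points with alternating labels (hypothetical example).
points = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
labels = ["a", "b", "a", "b", "a", "b"]
query, wanted, k = (0.0,), "b", 2

truth = filter_then_search(points, labels, query, wanted, k)
narrow = search_then_filter(points, labels, query, wanted, k, fetch=2)
wide = search_then_filter(points, labels, query, wanted, k, fetch=len(points))

print(truth, recall_at_k(truth, truth, k))    # [1, 3] 1.0
print(narrow, recall_at_k(narrow, truth, k))  # [1] 0.5
print(wide, recall_at_k(wide, truth, k))      # [1, 3] 1.0
```

In this toy run, search-then-filter with a small `fetch` loses recall because half of the retrieved points fail the filter. Raising `fetch` restores recall at the cost of more distance computations, which is one instance of the recall-versus-QPS trade-off under filter selectivity that the abstract refers to.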

Survey of Filtered Approximate Nearest Neighbor Search over the Vector-Scalar Hybrid Data

Databases

Helps computers find similar items faster.

10 May 2025 1

88%

Attribute Filtering in Approximate Nearest Neighbor Search: An In-depth Experimental Study

Databases

Finds similar items with specific rules.

22 Aug 2025 3

88%

Country of Origin

🇨🇳 China

Repos / Data Links

github.com

Page Count

14 pages
