
SVFusion: A CPU-GPU Co-Processing Architecture for Large-Scale Real-Time Vector Search

Published: January 13, 2026 | arXiv ID: 2601.08528v1

By: Yuchen Peng, Dingyu Yang, Zhongle Xie and more

Approximate Nearest Neighbor Search (ANNS) underpins modern applications such as information retrieval and recommendation. With the rapid growth of vector data, efficient indexing for real-time vector search has become essential. Existing CPU-based solutions support updates but suffer from low throughput, while GPU-accelerated systems deliver high performance but struggle with dynamic updates and limited GPU memory. The result is a critical performance gap for continuous, large-scale vector search that requires both accuracy and speed. In this paper, we present SVFusion, a GPU-CPU-disk collaborative framework for real-time vector search that bridges sophisticated GPU computation with online updates. SVFusion uses a hierarchical vector index architecture built on CPU-GPU co-processing, along with a workload-aware vector caching mechanism that maximizes the efficiency of limited GPU memory. It further improves performance through real-time coordination with CUDA multi-stream optimization and adaptive resource management, along with concurrency control that ensures data consistency under interleaved queries and updates. Empirical results show that SVFusion achieves 20.9x higher throughput on average and 1.3x to 50.7x lower latency than baseline methods, while maintaining high recall on large-scale datasets under various streaming workloads.

VecFlow: A High-Performance Vector Data Management System for Filtered-Search on GPUs

Databases

Finds AI information faster on computers.

1 Jun 2025 3

87%

Fantasy: Efficient Large-scale Vector Search on GPU Clusters with GPUDirect Async

Distributed, Parallel, and Cluster Computing

Speeds up AI by searching huge data faster.

1 Dec 2025 0

87%

ViFusion: In-Network Tensor Fusion for Scalable Video Feature Indexing

Multimedia

Speeds up finding videos by 8 to 22 times.

19 Jun 2025 1

