Score: 0

Performant Synchronization in Geo-Distributed Databases

Published: November 27, 2025 | arXiv ID: 2511.22444v3

By: Duling Xu , Tong Li , Zegang Sun and more

Potential Business Impact:

Makes computer data share faster between faraway places.

Business Areas:
Content Delivery Network Content and Publishing

The deployment of databases across geographically distributed regions has become increasingly critical for ensuring data reliability and scalability. Recent studies indicate that distributed databases exhibit significantly higher latency than single-node databases, primarily due to consensus protocols maintaining data consistency across multiple nodes. We argue that synchronization cost constitutes the primary bottleneck for distributed databases, which is particularly pronounced in wide-area networks (WAN). Fortunately, we identify opportunities to optimize synchronization costs in real production environments: (1) network clustering phenomena, (2) triangle inequality violations in transmission, and (3) redundant data transfers. Based on these observations, we propose GeoCoCo, a synchronization acceleration framework for cross-region distributed databases. First, GeoCoCo presents a group rescheduling strategy that adapts to real-time network conditions to maximize WAN transmission efficiency. Second, GeoCoCo introduces a task-preserving data filtering method that reduces data volume transmitted over the WAN. Finally, GeoCoCo develops a consistency-guaranteed transmission framework integrating grouping and pruning. Extensive evaluations in both trace-driven simulations and real-world deployments demonstrate that GeoCoCo reduces synchronization cost-primarily by lowering WAN bandwidth usage-by up to 40.3%, and increases system throughput by up to 14.1% in GeoGauss.

Country of Origin
🇨🇳 China

Page Count
31 pages

Category
Computer Science:
Databases