Symmetry-Driven Asynchronous Forwarding for Reliable Distributed Coordination in Toroidal Networks
By: Shenshen Luan, Yumo Tian, Xinyu Zhang and more
Potential Business Impact:
Keeps computer messages flowing even when links break.
The proliferation of large-scale distributed systems, such as satellite constellations and high-performance computing clusters, demands robust communication primitives that maintain coordination under unreliable links. The torus topology, with its inherent rotational and reflection symmetries, is a prevalent architecture in these domains. However, conventional routing schemes suffer substantial packet loss while the control plane resynchronizes after link failures. This paper introduces a symmetry-driven asynchronous forwarding mechanism that leverages the torus's geometric properties to achieve reliable packet delivery without control-plane coordination. We model packet flow using a topological potential gradient and demonstrate that symmetry-breaking failures naturally induce a reverse flow, which we harness for fault circumvention. We propose two local forwarding strategies, Reverse Flow with Counter-facing Priority (RF-CF) and Reverse Flow with Lateral-facing Priority (RF-LF), that guarantee reachability to the destination via forward-flow phase transition points, without protocol modifications or additional in-packet overhead. Through percolation analysis and packet-level simulations on a 16 × 16 torus, we show that our mechanism reduces packet loss by up to 17.5% under a 1% link failure rate, with the RF-LF strategy accounting for 28% of successfully delivered packets. This work establishes a foundational link between topological symmetry and communication resilience, providing a lightweight, protocol-agnostic substrate for enhancing distributed systems.
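To make the idea of a potential gradient on a torus concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the potential is simply the wraparound hop distance to the destination, and that "forward flow" means stepping to any neighbor with strictly lower potential over a live link. The function names (`ring_dist`, `potential`, `forward_hops`, `greedy_route`) are hypothetical. The RF-CF and RF-LF rules are not specified in the abstract and are not implemented here; the sketch only shows where forward flow gets blocked and a reverse-flow rule would have to take over.

```python
N = 16  # side length, matching the 16 x 16 torus used in the simulations


def ring_dist(a, b, n=N):
    """Shortest distance between two positions on a ring of n nodes."""
    d = abs(a - b) % n
    return min(d, n - d)


def potential(node, dst, n=N):
    """Wraparound hop distance to dst (assumed stand-in for the potential)."""
    return ring_dist(node[0], dst[0], n) + ring_dist(node[1], dst[1], n)


def neighbors(node, n=N):
    """The four torus neighbors of a node, with wraparound."""
    x, y = node
    return [((x + 1) % n, y), ((x - 1) % n, y),
            (x, (y + 1) % n), (x, (y - 1) % n)]


def link(u, v):
    """Key for the undirected link between u and v."""
    return frozenset((u, v))


def forward_hops(node, dst, failed, n=N):
    """Neighbors with strictly lower potential, reachable over a live link."""
    here = potential(node, dst, n)
    return [v for v in neighbors(node, n)
            if link(node, v) not in failed and potential(v, dst, n) < here]


def greedy_route(src, dst, failed=frozenset(), n=N):
    """Follow the gradient using only local information.

    Returns the path, or None if some node has no live forward hop.
    """
    path = [src]
    while path[-1] != dst:
        hops = forward_hops(path[-1], dst, failed, n)
        if not hops:
            return None  # forward flow blocked; a reverse-flow rule would apply
        path.append(hops[0])
    return path


if __name__ == "__main__":
    print(greedy_route((0, 0), (15, 15)))
    # A single failed link blocks forward flow when src and dst share a row.
    print(greedy_route((0, 0), (0, 2), failed={link((0, 0), (0, 1))}))
```

In this sketch, a packet from (0, 0) to (0, 2) has exactly one potential-lowering neighbor, so one failed link leaves no forward hop. That is the kind of situation in which the abstract's reverse-flow strategies are meant to route around the fault.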