A new $1/(1-ρ)$-scaling bound for multiserver queues via a leave-one-out technique
By: Yige Hong
Potential Business Impact:
Makes waiting lines shorter and more predictable.
Bounding the queue length in a multiserver queue is a central challenge in queueing theory. Even for the classic $GI/GI/n$ queue with homogeneous servers, it is highly non-trivial to derive a simple and accurate bound for the steady-state queue length that holds across all scaling regimes. A recent breakthrough by Li and Goldberg (2025) establishes bounds that scale as $1/(1-\rho)$ for any load $\rho < 1$ and number of servers $n$, which is the correct scaling in many well-known scaling regimes, including classic heavy-traffic, Halfin-Whitt and Nondegenerate-Slowdown. However, their bounds entail large constant factors and a highly intricate proof, suggesting room for further improvement. In this paper, we present a new $1/(1-\rho)$-scaling bound for the $GI/GI/n$ queue. Our bound, while restricted to the light-tailed case and the first moment of the queue length, has a more interpretable and often tighter leading constant. Our proof is relatively simple, utilizing a modified $GI/GI/n$ queue, the stationarity of a quadratic test function, and a novel leave-one-out coupling technique. Finally, we also extend our method to $GI/GI/n$ queues with fully heterogeneous service-time distributions.
Similar Papers
Analyzing homogenous and heterogeneous multi-server queues via neural networks
Performance
Predicts how many people will be waiting.
Stability and Heavy-traffic Delay Optimality of General Load Balancing Policies in Heterogeneous Service Systems
Performance
Makes jobs go to the right computer faster.
Finite-Time Behavior of Erlang-C Model: Mixing Time, Mean Queue Length and Tail Bounds
Probability
Helps computer systems predict wait times better.