The Merit of Simple Policies: Buying Performance With Parallelism and System Architecture
By: Mert Yildiz, Alexey Rolich, Andrea Baiocchi
Potential Business Impact:
Makes computer jobs finish faster with smart server setups.
While scheduling and dispatching of computational workloads is a well-investigated subject, only recently has Google publicly released a vast high-resolution measurement dataset of its cloud workloads. We revisit dispatching and scheduling algorithms fed by traffic workloads derived from those measurements. The main finding is that mean job response time attains a minimum as the number of servers in the computing cluster is varied, under the constraint that the overall computational budget is kept constant. Moreover, simple policies, such as Join Idle Queue, appear to attain the same performance as more complex, size-based policies for suitably high degrees of parallelism. Further, still better performance, clearly surpassing size-based dispatching policies, is obtained by using multi-stage server clusters, even with very simple policies such as Round Robin. The takeaway is that parallelism and the architecture of computing systems may be more powerful knobs for controlling performance than policies themselves, under realistic workload traffic.
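The trade-off described in the abstract can be illustrated with a minimal toy simulation (this is a sketch for intuition, not the authors' simulator): a cluster of `n` FIFO servers splits a fixed total capacity equally, and a dispatcher assigns each arriving job under either Round Robin or Join Idle Queue. The JIQ fallback to a uniformly random server when no server is idle is one common variant, assumed here; the workload generator below is likewise illustrative, not the Google trace.

```python
import random

def simulate(num_servers, policy, arrivals, sizes, total_capacity=1.0):
    """Mean job response time for a cluster of FIFO servers.

    The fixed total capacity is split equally, so each server runs at
    total_capacity / num_servers (the constant-budget constraint from
    the abstract). `arrivals` are absolute arrival times, `sizes` are
    job service requirements.
    """
    rate = total_capacity / num_servers
    free_at = [0.0] * num_servers  # time at which each server next goes idle
    rr_next = 0                    # Round Robin pointer
    total_response = 0.0
    for t, s in zip(arrivals, sizes):
        if policy == "jiq":
            # Join Idle Queue: pick an idle server if one exists,
            # otherwise fall back to a random server (assumed variant).
            idle = [i for i in range(num_servers) if free_at[i] <= t]
            k = random.choice(idle) if idle else random.randrange(num_servers)
        else:
            # Round Robin: cycle through servers regardless of state.
            k, rr_next = rr_next, (rr_next + 1) % num_servers
        start = max(free_at[k], t)       # FIFO: wait for the server to free up
        finish = start + s / rate
        free_at[k] = finish
        total_response += finish - t
    return total_response / len(arrivals)

def poisson_workload(n_jobs, arrival_rate, mean_size, seed=0):
    """Toy workload: Poisson arrivals, exponential job sizes."""
    rng = random.Random(seed)
    t, arrivals, sizes = 0.0, [], []
    for _ in range(n_jobs):
        t += rng.expovariate(arrival_rate)
        arrivals.append(t)
        sizes.append(rng.expovariate(1.0 / mean_size))
    return arrivals, sizes
```

Sweeping `num_servers` with a workload like `poisson_workload(20000, arrival_rate=0.6, mean_size=1.0)` lets one look for the interior minimum in mean response time that the paper reports: too few servers lose parallelism, too many dilute the per-server rate.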
Similar Papers
Dispatching Odyssey: Exploring Performance in Computing Clusters under Real-world Workloads
Distributed, Parallel, and Cluster Computing
Makes computers finish jobs faster through smarter organization.
"Two-Stagification": Job Dispatching in Large-Scale Clusters via a Two-Stage Architecture
Distributed, Parallel, and Cluster Computing
Makes computer jobs finish faster by sorting them.
Stability and Heavy-traffic Delay Optimality of General Load Balancing Policies in Heterogeneous Service Systems
Performance
Makes jobs go to the right computer faster.