Universal and Asymptotically Optimal Data and Task Allocation in Distributed Computing
By: Javad Maheri, K. K. Krishnan Namboodiri, Petros Elia
Potential Business Impact:
Makes computers share work faster and use less data.
We study the joint minimization of communication and computation costs in distributed computing, where a master node coordinates $N$ workers to evaluate a function over a library of $n$ files. Assuming that the function decomposes into an arbitrary subfunction set $\mathbf{X}$, with each subfunction depending on $d$ input files, turns our distributed computing problem into a $d$-uniform hypergraph edge-partitioning problem, wherein the edge set (the subfunction set), defined by $d$-wise dependencies between vertices (files), must be partitioned across $N$ disjoint groups (workers). The aim is to design a file and subfunction allocation, corresponding to a partition of $\mathbf{X}$, that minimizes the communication cost $\pi_{\mathbf{X}}$, representing the maximum number of distinct files per worker, while also minimizing the computation cost $\delta_{\mathbf{X}}$, corresponding to the maximal subfunction load of any worker. For a broad range of parameters, we propose a deterministic allocation solution, the \emph{Interweaved-Cliques (IC) design}, whose information-theoretic-inspired interweaved clique structure simultaneously achieves order-optimal communication and computation costs for a large class of decompositions $\mathbf{X}$. This optimality is derived from our achievability and converse bounds, which reveal -- under reasonable assumptions on the density of $\mathbf{X}$ -- that the optimal scaling of the communication cost takes the form $n/N^{1/d}$, showing that our design achieves the order-optimal \textit{partitioning gain} that scales as $N^{1/d}$, while also achieving an order-optimal computation cost. Interestingly, this order optimality is achieved in a deterministic manner, and, very importantly, it is achieved blindly from $\mathbf{X}$, therefore enabling multiple desired functions to be computed without reshuffling files.
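To make the two cost metrics concrete, the following is a minimal sketch (not the paper's IC design) of how the communication cost $\pi_{\mathbf{X}}$ and computation cost $\delta_{\mathbf{X}}$ of a given edge partition can be evaluated. Here each subfunction is modeled as a $d$-tuple of file indices (a hyperedge), and a partition assigns each subfunction to one of $N$ workers; the function name `allocation_costs` and the toy partition are illustrative assumptions.

```python
from itertools import combinations

def allocation_costs(partition):
    """Compute (pi, delta) for a partition of the subfunction set X.

    partition: list of N lists, one per worker; each inner list holds
    that worker's subfunctions, each a tuple of file indices (a hyperedge).

    pi    = max over workers of the number of distinct files it must store
    delta = max over workers of the number of subfunctions it must compute
    """
    pi = max(len({f for sub in worker for f in sub}) for worker in partition)
    delta = max(len(worker) for worker in partition)
    return pi, delta

# Toy instance: n = 4 files, d = 2, so X is the set of all 6 file pairs.
files = range(4)
X = list(combinations(files, 2))

# A naive split across N = 2 workers: first 3 edges vs. last 3 edges.
partition = [X[:3], X[3:]]
pi, delta = allocation_costs(partition)  # worker 0 touches all 4 files
```

A good allocation keeps each worker's edges clustered on few vertices, which is the intuition behind clique-based designs: the fewer distinct files a worker's subfunctions jointly touch, the smaller $\pi_{\mathbf{X}}$ becomes relative to the trivial bound of $n$ files per worker.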
Similar Papers
Byzantine-Resilient Distributed Computation via Task Replication and Local Computations
Information Theory
Makes computers work together even with bad helpers.
Fundamental Limits of Distributed Computing for Linearly Separable Functions
Information Theory
Makes computers share data faster and cheaper.
Typical Solutions of Multi-User Linearly-Decomposable Distributed Computing
Information Theory
Helps planes and satellites share information better.