Score: 1

Distributed Linearly Separable Computation with Arbitrary Heterogeneous Data Assignment

Published: January 15, 2026 | arXiv ID: 2601.10177v1

By: Ziting Zhang , Kai Wan , Minquan Cheng and more

Potential Business Impact:

Helps computers share data for faster calculations.

Business Areas:

Big Data Data and Analytics

Distributed linearly separable computation is a fundamental problem in large-scale distributed systems, requiring the computation of linearly separable functions over different datasets across distributed workers. This paper studies a heterogeneous distributed linearly separable computation problem, including one master and N distributed workers. The linearly separable task function involves Kc linear combinations of K messages, where each message is a function of one dataset. Distinguished from the existing homogeneous settings that assume each worker holds the same number of datasets, where the data assignment is carefully designed and controlled by the data center (e.g., the cyclic assignment), we consider a more general setting with arbitrary heterogeneous data assignment across workers, where `arbitrary' means that the data assignment is given in advance and `heterogeneous' means that the workers may hold different numbers of datasets. Our objective is to characterize the fundamental tradeoff between the computable dimension of the task function and the communication cost under arbitrary heterogeneous data assignment. Under the constraint of integer communication costs, for arbitrary heterogeneous data assignment, we propose a universal computing scheme and a universal converse bound by characterizing the structure of data assignment, where they coincide under some parameter regimes. We then extend the proposed computing scheme and converse bound to the case of fractional communication costs.