Where to Split? A Pareto-Front Analysis of DNN Partitioning for Edge Inference

Published: January 12, 2026 | arXiv ID: 2601.08025v1

By: Adiba Masud, Nicholas Foley, Pragathi Durga Rajarajan, and more

The deployment of deep neural networks (DNNs) on resource-constrained edge devices is frequently hindered by their significant computational and memory requirements. While partitioning and distributing a DNN across multiple devices is a well-established strategy for mitigating this challenge, prior research has largely focused on single-objective optimization, such as minimizing latency or maximizing throughput. This paper challenges that single-objective framing by recasting DNN partitioning as a multi-objective optimization problem. We argue that real-world deployments exhibit a complex trade-off between latency and throughput, further complicated by network variability. To address this, we introduce ParetoPipe, an open-source framework that leverages Pareto-front analysis to systematically identify Pareto-optimal partitioning strategies that balance these competing objectives. Our contributions are threefold: we benchmark pipeline-partitioned inference on a heterogeneous testbed of Raspberry Pis and a GPU-equipped edge server; we identify Pareto-optimal points to analyze the latency-throughput trade-off under varying network conditions; and we release a flexible, open-source framework to facilitate distributed inference and benchmarking. The toolchain offers two communication backends, PyTorch RPC and a custom lightweight implementation, to minimize overhead and support broad experimentation.
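
To make the selection criterion concrete, the sketch below filters a set of benchmarked candidate split points down to the Pareto front, keeping only splits that no other split beats on both latency and throughput. This is illustrative Python, not the ParetoPipe API, and the split points and measurements are made-up assumptions.

from dataclasses import dataclass

@dataclass(frozen=True)
class SplitPoint:
    layer: int          # layer index after which the model is cut
    latency_ms: float   # measured end-to-end latency for this split
    throughput: float   # measured inferences per second

def dominates(a: SplitPoint, b: SplitPoint) -> bool:
    # a dominates b if it is no worse on both objectives
    # (lower latency, higher throughput) and strictly better on one.
    return (a.latency_ms <= b.latency_ms
            and a.throughput >= b.throughput
            and (a.latency_ms < b.latency_ms or a.throughput > b.throughput))

def pareto_front(points: list[SplitPoint]) -> list[SplitPoint]:
    # Keep every point that no other point dominates.
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

# Hypothetical measurements for four candidate cuts of one model:
candidates = [
    SplitPoint(layer=3,  latency_ms=40.0, throughput=12.0),
    SplitPoint(layer=7,  latency_ms=55.0, throughput=20.0),
    SplitPoint(layer=11, latency_ms=60.0, throughput=18.0),  # dominated by layer 7
    SplitPoint(layer=15, latency_ms=38.0, throughput=10.0),
]

for p in pareto_front(candidates):
    print(f"split after layer {p.layer}: {p.latency_ms} ms, {p.throughput} inf/s")

The quadratic filter is adequate here because the number of candidate split points is bounded by the number of layers in the model; choosing a single operating point from the resulting front then depends on the deployment's latency budget and the network conditions between devices.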

Category
Computer Science:
Distributed, Parallel, and Cluster Computing