Adaptive AI Model Partitioning over 5G Networks
By: Tam Thanh Nguyen, Tuan Van Ngo, Long Thanh Le, and more
Potential Business Impact:
Lets phones run smart apps without draining the battery.
Mobile devices increasingly rely on deep neural networks (DNNs) for complex inference tasks, but running entire models locally drains the device battery quickly. Offloading computation entirely to cloud or edge servers reduces the processing load on devices but poses privacy risks and can incur high network bandwidth consumption and long delays. Split computing (SC) mitigates these challenges by partitioning DNNs between user equipment (UE) and edge servers. However, 5G wireless channels are time-varying, so a fixed splitting scheme can yield suboptimal performance. This paper addresses the limitations of fixed model partitioning in privacy-focused image processing and explores trade-offs among key performance metrics, including end-to-end (E2E) latency, energy consumption, and privacy, by developing an adaptive ML partitioning scheme based on real-time, AI-powered throughput estimation. Evaluations across multiple scenarios demonstrate significant performance gains for our scheme.
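To make the core idea concrete, here is a minimal Python sketch of throughput-aware split-point selection: given per-layer compute profiles and a real-time uplink throughput estimate, it picks the split that minimizes estimated E2E latency (UE compute, uplink transfer of the intermediate activations, then server compute). The latency model, the `LayerProfile` fields, the `pick_split` function, and all numbers below are illustrative assumptions, not the paper's actual scheme.

```python
"""Sketch of throughput-aware DNN split-point selection.
All profiles and the latency model are illustrative assumptions."""

from dataclasses import dataclass

@dataclass
class LayerProfile:
    ue_ms: float      # on-device compute time for this layer (ms)
    server_ms: float  # edge-server compute time for this layer (ms)
    out_bytes: int    # size of this layer's output activations (bytes)

def pick_split(profiles, input_bytes, throughput_bps):
    """Choose the split index (layers [0:split) run on the UE, the rest
    on the edge server) minimizing estimated E2E latency: UE compute +
    uplink transfer of the tensor at the split + server compute."""
    best_split, best_ms = 0, float("inf")
    for split in range(len(profiles) + 1):
        ue_ms = sum(p.ue_ms for p in profiles[:split])
        server_ms = sum(p.server_ms for p in profiles[split:])
        # Bytes sent uplink: the activations at the split point, or the
        # raw input if everything is offloaded (split == 0).
        tx_bytes = profiles[split - 1].out_bytes if split else input_bytes
        tx_ms = 8 * tx_bytes / throughput_bps * 1e3
        total = ue_ms + tx_ms + server_ms
        if total < best_ms:
            best_split, best_ms = split, total
    return best_split, best_ms

# Example: re-evaluate the split whenever the throughput estimate changes.
profiles = [
    LayerProfile(ue_ms=4.0, server_ms=0.5, out_bytes=200_000),
    LayerProfile(ue_ms=6.0, server_ms=0.8, out_bytes=50_000),
    LayerProfile(ue_ms=9.0, server_ms=1.2, out_bytes=10_000),
]
for mbps in (2, 20, 200):  # estimated uplink throughput in Mbit/s
    split, ms = pick_split(profiles, input_bytes=600_000,
                           throughput_bps=mbps * 1e6)
    print(f"{mbps:>4} Mbit/s -> split after layer {split}, ~{ms:.1f} ms")
```

Run under these assumed profiles, the chosen split shifts toward the UE as throughput drops (transmission dominates) and toward the server as throughput rises, which is the adaptivity a fixed partitioning scheme lacks.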
Similar Papers
Rethinking Inference Placement for Deep Learning across Edge and Cloud Platforms: A Multi-Objective Optimization Perspective and Future Directions
Distributed, Parallel, and Cluster Computing
Makes smart apps run faster and safer.
Optimized Split Computing Framework for Edge and Core Devices
Networking and Internet Architecture
Lets phones run smart programs using less power.
P3SL: Personalized Privacy-Preserving Split Learning on Heterogeneous Edge Devices
Machine Learning (CS)
Lets phones learn without sharing private info.