SPARQ: An Optimization Framework for the Distribution of AI-Intensive Applications under Non-Linear Delay Constraints
By: Pietro Spadaccino, Paolo Di Lorenzo, Sergio Barbarossa, and more
Potential Business Impact:
Makes apps run faster by smarter resource use.
Next-generation real-time, compute-intensive applications, such as extended reality, multi-user gaming, and autonomous transportation, are increasingly composed of heterogeneous AI-intensive functions with diverse resource requirements and stringent latency constraints. While recent advances have enabled highly efficient algorithms for joint service placement, routing, and resource allocation for increasingly complex applications, current models fail to capture the non-linear relationship between delay and resource usage that becomes especially relevant in AI-intensive workloads. In this paper, we extend the cloud network flow optimization framework to support queuing-delay-aware orchestration of distributed AI applications over edge-cloud infrastructures. We introduce two execution models, Guaranteed-Resource (GR) and Shared-Resource (SR), that more accurately capture how computation and communication delays emerge from system-level resource constraints. These models incorporate M/M/1 and M/G/1 queue dynamics to represent dedicated and shared resource usage, respectively. The resulting optimization problem is non-convex due to the non-linear delay terms. To overcome this, we develop SPARQ, an iterative approximation algorithm that decomposes the problem into two convex sub-problems, enabling joint optimization of service placement, routing, and resource allocation under non-linear delay constraints. Simulation results demonstrate that SPARQ not only offers a more faithful representation of system delays, but also substantially improves resource efficiency and the overall cost-delay tradeoff compared to existing state-of-the-art methods.
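For context, the queuing models named in the abstract have standard textbook closed-form mean-delay expressions, and their hyperbolic dependence on the allocated service rate is what makes the latency constraints non-linear. The sketch below (plain Python, with illustrative function names not taken from the paper) shows the M/M/1 sojourn time associated with the dedicated-resource (GR) case and the Pollaczek-Khinchine M/G/1 formula associated with the shared-resource (SR) case; how SPARQ actually embeds these terms in its optimization constraints is not reproduced here.

def mm1_delay(arrival_rate, service_rate):
    # Mean sojourn time of an M/M/1 queue: T = 1 / (mu - lambda).
    # Non-linear (hyperbolic) in the allocated service rate mu; diverges as mu -> lambda.
    assert arrival_rate < service_rate, "unstable queue: arrival rate must be below service rate"
    return 1.0 / (service_rate - arrival_rate)

def mg1_delay(arrival_rate, mean_service_time, second_moment_service_time):
    # Mean sojourn time of an M/G/1 queue (Pollaczek-Khinchine formula):
    # T = E[S] + lambda * E[S^2] / (2 * (1 - rho)), with utilisation rho = lambda * E[S].
    rho = arrival_rate * mean_service_time
    assert rho < 1.0, "unstable queue: utilisation must be below 1"
    return mean_service_time + arrival_rate * second_moment_service_time / (2.0 * (1.0 - rho))

Both expressions are convex in the stable region but non-linear in the allocated rate, which is why end-to-end delay constraints built from them yield a non-convex joint placement, routing, and allocation problem.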
Similar Papers
SparOA: Sparse and Operator-aware Hybrid Scheduling for Edge DNN Inference
Distributed, Parallel, and Cluster Computing
Makes smart devices run faster and use less power.
SPARS: A Reinforcement Learning-Enabled Simulator for Power Management in HPC Job Scheduling
Distributed, Parallel, and Cluster Computing
Saves computer energy by turning off unused parts.
A Unified QoS-Aware Multiplexing Framework for Next Generation Immersive Communication with Legacy Wireless Applications
Networking and Internet Architecture
Makes virtual reality and phone calls work together better.