Dynamic Pricing for On-Demand DNN Inference in the Edge-AI Market
By: Songyuan Li, Jia Hu, Geyong Min, et al.
Potential Business Impact:
Smarter AI on phones, faster and cheaper.
The convergence of edge computing and AI gives rise to Edge-AI, which enables the deployment of real-time AI applications and services at the network edge. One of the fundamental research issues in Edge-AI is edge inference acceleration, which aims to realize low-latency, high-accuracy DNN inference services through the fine-grained offloading of partitioned inference tasks from end devices to edge servers. However, existing research has yet to adopt a practical Edge-AI market perspective, which would systematically explore the personalized inference needs of AI users (e.g., inference accuracy, latency, and task complexity), the revenue incentives for AI service providers that offer edge inference services, and multi-stakeholder governance within a market-oriented context. To bridge this gap, we propose an Auction-based Edge Inference Pricing Mechanism (AERIA) for revenue maximization, which tackles the multi-dimensional optimization problem of DNN model partition, edge inference pricing, and resource allocation. We investigate a multi-exit device-edge synergistic inference scheme for on-demand DNN inference acceleration, and analyse the auction dynamics amongst the AI service providers, AI users, and edge infrastructure provider. Owing to its strategic mechanism design via randomized consensus estimate and cost sharing techniques, the Edge-AI market attains several desirable properties, including competitiveness in revenue maximization, incentive compatibility, and envy-freeness, which are crucial to the effectiveness, truthfulness, and fairness of the auction outcomes. Extensive simulation experiments based on four representative DNN inference workloads demonstrate that AERIA significantly outperforms several state-of-the-art approaches in revenue maximization, confirming its efficacy for on-demand DNN inference in the Edge-AI market.
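To make the multi-exit device-edge synergistic inference scheme concrete, the following minimal sketch enumerates (partition point, early-exit point) pairs for a toy DNN and picks the highest-accuracy plan that fits a user's latency budget. All layer timings, activation sizes, exit accuracies, and the uplink rate are illustrative placeholders, not profiled values from the paper.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    device_ms: float      # per-layer latency on the end device (assumed)
    edge_ms: float        # per-layer latency on the edge server (assumed)
    out_kb: float         # size of this layer's output activation, in KB
    exit_accuracy: float  # accuracy if inference exits here (0 = no exit head)

# A toy 6-layer backbone with early-exit heads after layers 1, 3, and 5
# (0-indexed). Every number below is made up for illustration.
LAYERS = [
    Layer(3.0, 1.0, 320.0, 0.0),
    Layer(4.0, 1.0, 160.0, 0.72),
    Layer(5.0, 1.5,  80.0, 0.0),
    Layer(5.0, 1.5,  16.0, 0.85),
    Layer(6.0, 2.0,   8.0, 0.0),
    Layer(6.0, 2.0,   2.0, 0.93),
]
INPUT_KB = 600.0  # raw input size if offloading happens before layer 0

def best_plan(latency_budget_ms: float, uplink_kb_per_s: float):
    """Enumerate (partition point, exit point) pairs; layers [0, cut) run
    on the device, layers [cut, exit] on the edge. Return the feasible
    plan with the highest exit accuracy, or None if none fits the budget."""
    best = None
    exits = [i for i, l in enumerate(LAYERS) if l.exit_accuracy > 0]
    for exit_idx in exits:
        for cut in range(exit_idx + 2):   # cut == exit_idx + 1: fully on-device
            device_t = sum(l.device_ms for l in LAYERS[:cut])
            edge_t = sum(l.edge_ms for l in LAYERS[cut:exit_idx + 1])
            if cut == exit_idx + 1:
                transfer_t = 0.0          # nothing is offloaded
            else:
                payload = INPUT_KB if cut == 0 else LAYERS[cut - 1].out_kb
                transfer_t = payload / uplink_kb_per_s * 1000.0
            total = device_t + transfer_t + edge_t
            acc = LAYERS[exit_idx].exit_accuracy
            if total <= latency_budget_ms and (best is None or acc > best[0]):
                best = (acc, total, cut, exit_idx)
    return best

if __name__ == "__main__":
    plan = best_plan(latency_budget_ms=30.0, uplink_kb_per_s=4000.0)
    if plan:
        acc, total, cut, exit_idx = plan
        print(f"accuracy={acc:.2f}, latency={total:.1f} ms, "
              f"partition before layer {cut}, exit at layer {exit_idx}")
```

With these placeholder numbers the best feasible plan splits the model at layer 4 and exits at layer 5, illustrating why partial offloading can beat both fully on-device and fully offloaded execution under a tight latency budget.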
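The randomized consensus estimate and cost sharing techniques named in the abstract are classic ingredients from the digital-goods auction literature (Goldberg and Hartline). The sketch below composes the two in the generic textbook way: each bidder's price is derived only from the other bids, which is what yields incentive compatibility, and winners split the revenue target evenly, which yields envy-freeness. The bid values, the rounding base, and this particular composition are assumptions for illustration, not AERIA's actual pricing rules.

```python
import math
import random

def profit_extract(bids, target):
    """Cost-sharing profit extractor: find the largest k such that the k
    highest bidders can each pay target / k; charge each exactly that
    equal share. Returns {bidder_index: payment}, or {} if target is unmet."""
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    for k in range(len(order), 0, -1):
        share = target / k
        if bids[order[k - 1]] >= share:   # k-th highest bid covers the share
            return {i: share for i in order[:k]}
    return {}

def optimal_revenue(bids):
    """Best single-price revenue: max over k of k * (k-th highest bid)."""
    ranked = sorted(bids, reverse=True)
    return max((k + 1) * v for k, v in enumerate(ranked))

def consensus_estimate(x, u, base=4.0):
    """Round x down to the nearest base**(i + u). With u ~ U[0, 1), the
    rounded value is, with constant probability, unchanged when any single
    bid is masked out -- a 'consensus' on the revenue target."""
    if x <= 0:
        return 0.0
    i = math.floor(math.log(x, base) - u)
    return base ** (i + u)

def consensus_auction(bids, seed=0):
    """For each bidder, estimate optimal revenue from the *other* bids,
    randomly round it to a consensus target, and offer the bidder a cost
    share of that target. A bidder's price never depends on its own bid."""
    u = random.Random(seed).random()      # one shared random offset
    outcome = {}
    for i in range(len(bids)):
        others = bids[:i] + bids[i + 1:]
        target = consensus_estimate(optimal_revenue(others), u)
        alloc = profit_extract(bids, target)
        if i in alloc:
            outcome[i] = alloc[i]
    return outcome

if __name__ == "__main__":
    bids = [9.0, 7.5, 7.0, 4.0, 2.5]   # illustrative willingness-to-pay
    print({i: round(p, 2) for i, p in consensus_auction(bids).items()})
```

On this toy input the rounded targets coincide for every masked bidder, so the four highest bidders each pay the same share, which extracts a constant fraction of the optimal single-price revenue, mirroring the competitiveness, truthfulness, and fairness properties the abstract claims for AERIA.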
Similar Papers
SynergAI: Edge-to-Cloud Synergy for Architecture-Driven High-Performance Orchestration for AI Inference
Distributed, Parallel, and Cluster Computing
Makes AI work faster on different devices.
The Larger the Merrier? Efficient Large AI Model Inference in Wireless Edge Networks
Machine Learning (CS)
Makes smart computer programs run faster on phones.
Scalability Optimization in Cloud-Based AI Inference Services: Strategies for Real-Time Load Balancing and Automated Scaling
Distributed, Parallel, and Cluster Computing
Makes AI services faster and use less power.