
Optimizing NetGPT via Routing-Based Synergy and Reinforcement Learning

Published: November 27, 2025 | arXiv ID: 2511.22217v1

By: Yuxuan Chen, Rongpeng Li, Xianfu Chen and more

BigTech Affiliations: Huawei

Potential Business Impact:

Smart AI learns to answer questions faster and cheaper.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Large language model (LLM) agents at the network edge offer low-latency execution for routine queries. In contrast, complex requests often require the superior capability of cloud models, incurring higher latency and cost. To navigate this quality-cost trade-off under dynamic network conditions, we propose a cloud-edge synergy for NetGPT that integrates network-aware routing with on-edge self-improvement. Specifically, our framework routes structured tool-calling requests to cloud or edge agents via a novel scoring policy. We prove that, under mild regularity assumptions, the optimal routing rule admits a unique fallback threshold with monotone dependence on bandwidth and round-trip time (RTT). Concurrently, using the dataset collected from cloud-routed requests and their corresponding responses, we instantiate a schema-preserving reinforcement learning (RL) scheme to improve the capability of the edge agent. We analyze a supervised finetuning (SFT)-anchored composite objective that combines a reverse-KL trust-region step with a forward-KL realignment toward the SFT prior, which explains the observed stability and constrains policy drift. Both the network-aware routing policy and the edge agent are updated coherently. Experiments across controlled network states and pricing schedules demonstrate smooth quality-cost frontiers, consistent gains of dynamic fallback thresholds over fixed policies, and sustained reductions in offloading while maintaining task success and schema-correct outputs.
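To make the threshold-based routing idea in the abstract concrete, here is a minimal sketch of a fallback rule whose threshold depends monotonically on bandwidth and RTT. It is not the paper's scoring policy: the `NetworkState` type, the functional form of `fallback_threshold`, all constants, and the direction of the monotonicity (a better link raises the threshold, so more requests are offloaded) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class NetworkState:
    bandwidth_mbps: float  # available bandwidth toward the cloud (assumed unit)
    rtt_ms: float          # round-trip time to the cloud (assumed unit)


def fallback_threshold(net: NetworkState,
                       base: float = 0.5,
                       bw_gain: float = 0.2,
                       rtt_penalty: float = 0.002) -> float:
    """Hypothetical threshold: increases with bandwidth, decreases with RTT.

    The saturating bandwidth term and the linear RTT term are illustrative
    choices, not taken from the paper. The result is clamped to [0, 1].
    """
    bw_term = bw_gain * net.bandwidth_mbps / (net.bandwidth_mbps + 10.0)
    raw = base + bw_term - rtt_penalty * net.rtt_ms
    return min(1.0, max(0.0, raw))


def route(edge_score: float, net: NetworkState) -> str:
    """Fall back to the cloud when the edge agent's score is below the threshold."""
    return "cloud" if edge_score < fallback_threshold(net) else "edge"


good_link = NetworkState(bandwidth_mbps=100.0, rtt_ms=20.0)
poor_link = NetworkState(bandwidth_mbps=1.0, rtt_ms=200.0)

# The same request is offloaded on a good link and kept on the edge on a poor one.
print(route(0.4, good_link))
print(route(0.4, poor_link))
```

In this sketch a single scalar `edge_score` stands in for whatever quality estimate the routing policy produces; a fixed-threshold baseline would correspond to ignoring `net` and always returning `base`.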

Efficient Routing of Inference Requests across LLM Instances in Cloud-Edge Computing

Distributed, Parallel, and Cluster Computing

Makes AI answer questions faster and cheaper.

21 Jul 2025 0

89%

A Flexible Multi-Agent Deep Reinforcement Learning Framework for Dynamic Routing and Scheduling of Latency-Critical Services

Networking and Internet Architecture

Ensures important messages arrive on time.

13 Oct 2025 1

88%

Topology-Aware Graph Reinforcement Learning for Dynamic Routing in Cloud Networks

Machine Learning (CS)

Makes computer networks send data faster.

5 Sep 2025 0


Country of Origin

🇲🇴 🇯🇵 🇨🇳 Macao, Japan, China

Page Count

15 pages
