Optimizing NetGPT via Routing-Based Synergy and Reinforcement Learning

Yuxuan Chen; Rongpeng Li; Xianfu Chen; Celimuge Wu; Chenghui Peng; Zhifeng Zhao; and Honggang Zhang

arXiv:2511.22217 · cs.NI · December 1, 2025

TL;DR

This paper presents a novel framework for optimizing large language model deployment at the network edge by combining network-aware routing with reinforcement learning to balance quality, latency, and cost.

Contribution

The paper introduces a routing policy whose fallback threshold adapts to network conditions (bandwidth and round-trip time), together with a schema-preserving reinforcement learning approach that strengthens the edge model using data collected from cloud-routed requests.

Findings

01. Dynamic fallback thresholds outperform fixed policies.
02. Network-aware routing reduces offloading and latency.
03. Experiments show improved quality-cost trade-offs.

Abstract

Large language model (LLM) agents at the network edge offer low-latency execution for routine queries. In contrast, complex requests often require the superior capability of cloud models, incurring higher latency and cost. To navigate this quality-cost trade-off under dynamic network conditions, we propose a cloud-edge synergy for NetGPT that integrates network-aware routing with on-edge self-improvement. Specifically, our framework routes structured tool-calling requests to cloud or edge agents via a novel scoring policy. We prove that, under mild regularity assumptions, the optimal routing rule admits a unique fallback threshold with monotone dependence on bandwidth and round-trip time (RTT). Concurrently, based on the dataset collected from requests routed to the cloud and corresponding responses, we instantiate a schema-preserving reinforcement learning (RL) to improve the…
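The threshold-based routing described in the abstract can be illustrated with a minimal sketch. This is not the paper's scoring policy, which the abstract does not specify. The names `NetworkState`, `fallback_threshold`, and `route`, the functional form, and all coefficients are hypothetical. The direction of the dependence (a better link raises the threshold and so sends more requests to the cloud) is also an assumption; the abstract states only that the dependence on bandwidth and RTT is monotone.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkState:
    """Hypothetical snapshot of link conditions between edge and cloud."""
    bandwidth_mbps: float  # measured bandwidth toward the cloud
    rtt_ms: float          # measured round-trip time to the cloud


def fallback_threshold(net: NetworkState,
                       base: float = 0.5,
                       bw_gain: float = 0.05,
                       rtt_penalty: float = 0.001) -> float:
    """Illustrative threshold, clamped to [0, 1].

    Assumed form (not from the paper): non-decreasing in bandwidth and
    non-increasing in RTT, so offloading becomes more likely when the
    network makes the cloud cheap to reach.
    """
    raw = (base
           + bw_gain * math.log1p(net.bandwidth_mbps)
           - rtt_penalty * net.rtt_ms)
    return min(1.0, max(0.0, raw))


def route(edge_score: float, net: NetworkState) -> str:
    """Fall back to the cloud when the edge score is below the threshold.

    `edge_score` stands in for whatever confidence score the routing
    policy assigns to the edge agent for a request (assumed in [0, 1]).
    """
    return "cloud" if edge_score < fallback_threshold(net) else "edge"


if __name__ == "__main__":
    good_link = NetworkState(bandwidth_mbps=100.0, rtt_ms=20.0)
    poor_link = NetworkState(bandwidth_mbps=1.0, rtt_ms=400.0)
    for name, net in (("good", good_link), ("poor", poor_link)):
        print(name, round(fallback_threshold(net), 3), route(0.6, net))
```

Under these assumed coefficients, the same request with an edge score of 0.6 is offloaded on the good link and kept on the edge over the poor link, which is the qualitative behavior a network-dependent threshold is meant to capture.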

Taxonomy

Topics: IoT and Edge/Fog Computing · Advanced Neural Network Applications · Big Data and Digital Economy