Optimizing NetGPT via Routing-Based Synergy and Reinforcement Learning
Yuxuan Chen, Rongpeng Li, Xianfu Chen, Celimuge Wu, Chenghui Peng, Zhifeng Zhao, and Honggang Zhang

TL;DR
This paper presents a novel framework for optimizing large language model deployment at the network edge by combining network-aware routing with reinforcement learning to balance quality, latency, and cost.
Contribution
It introduces a routing policy whose fallback threshold adapts to network conditions (bandwidth and round-trip time), together with a schema-preserving reinforcement learning approach that strengthens the edge model using data collected from requests routed to the cloud.
Findings
Dynamic fallback thresholds outperform fixed policies.
Network-aware routing reduces offloading and latency.
Experiments show improved quality-cost trade-offs.
Abstract
Large language model (LLM) agents at the network edge offer low-latency execution for routine queries. In contrast, complex requests often require the superior capability of cloud models, incurring higher latency and cost. To navigate this quality-cost trade-off under dynamic network conditions, we propose a cloud-edge synergy for NetGPT that integrates network-aware routing with on-edge self-improvement. Specifically, our framework routes structured tool-calling requests to cloud or edge agents via a novel scoring policy. We prove that, under mild regularity assumptions, the optimal routing rule admits a unique fallback threshold with monotone dependence on bandwidth and round-trip time (RTT). Concurrently, based on the dataset collected from requests routed to the cloud and corresponding responses, we instantiate a schema-preserving reinforcement learning (RL) to improve the…
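The threshold-based routing idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's scoring policy: the threshold formula, the parameter values, and the names `NetworkState`, `fallback_threshold`, and `route` are all hypothetical. The sketch also assumes a direction for the monotone dependence (the threshold rises with bandwidth and falls with RTT, so the cloud is used less when the link is poor), which the abstract does not state.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkState:
    """Observed link conditions (hypothetical container, not from the paper)."""
    bandwidth_mbps: float
    rtt_ms: float


def fallback_threshold(net: NetworkState,
                       base: float = 0.5,
                       bw_ref: float = 50.0,
                       rtt_ref: float = 100.0,
                       sensitivity: float = 0.2) -> float:
    """Illustrative threshold in [0, 1].

    Assumed shape: increases with bandwidth and decreases with RTT.
    All constants are placeholders chosen for the example.
    """
    bw_term = math.tanh(net.bandwidth_mbps / bw_ref)   # grows with bandwidth
    rtt_term = math.tanh(net.rtt_ms / rtt_ref)         # grows with RTT
    tau = base + sensitivity * (bw_term - rtt_term)
    return min(1.0, max(0.0, tau))


def route(edge_score: float, net: NetworkState) -> str:
    """Fall back to the cloud when the edge score is below the threshold."""
    return "cloud" if edge_score < fallback_threshold(net) else "edge"


if __name__ == "__main__":
    good_link = NetworkState(bandwidth_mbps=100.0, rtt_ms=10.0)
    poor_link = NetworkState(bandwidth_mbps=5.0, rtt_ms=300.0)
    for name, net in [("good", good_link), ("poor", poor_link)]:
        print(name, round(fallback_threshold(net), 3), route(0.5, net))
```

With the same edge score, the request is offloaded on the good link and kept on the edge on the poor link, which is the qualitative behavior a network-aware threshold is meant to produce. A fixed policy would correspond to returning `base` regardless of `net`.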
Taxonomy
Topics: IoT and Edge/Fog Computing · Advanced Neural Network Applications · Big Data and Digital Economy
