JPPO++: Joint Power and Denoising-inspired Prompt Optimization for Mobile LLM Services
Feiran You, Hongyang Du, Kaibin Huang, and Abbas Jamalipour

TL;DR
This paper introduces JPPO++, a framework that jointly optimizes prompt compression and wireless transmission power for mobile LLM services, significantly reducing service time and prompt length while maintaining output quality.
Contribution
The paper proposes a denoising-inspired prompt compression scheme and combines it with DRL-based joint optimization of compression ratio and transmission power for mobile LLM systems, improving efficiency and response speed.
Findings
JPPO++ reduces service time by 46.5% compared to no compression.
Prompt length can be reduced by up to 16x with acceptable accuracy loss.
JPPO with 16x compression cuts total service time by 42.3%.
Abstract
Large Language Models (LLMs) are increasingly integrated into mobile services over wireless networks to support complex user requests. This trend has led to longer prompts, which improve LLMs' performance but increase data transmission costs and require more processing time, thereby reducing overall system efficiency and negatively impacting user experience. To address these challenges, we propose Joint Prompt and Power Optimization (JPPO), a framework that jointly optimizes prompt compression and wireless transmission power for mobile LLM services. JPPO leverages a Small Language Model (SLM) deployed at edge devices to perform lightweight prompt compression, reducing communication load before transmission to the cloud-based LLM. A Deep Reinforcement Learning (DRL) agent dynamically adjusts both the compression ratio and transmission power based on network conditions and service…
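The trade-off described in the abstract can be illustrated with a small toy model. The sketch below is not the paper's method: it replaces the DRL agent with an exhaustive grid search, and every function name, constant, and cost term (channel gain, bandwidth, noise power, bits per token, per-token inference time, the power and fidelity penalties) is a hypothetical value chosen only for illustration. It assumes service time is uplink transmission time (prompt bits divided by a Shannon-capacity rate) plus a processing time proportional to the number of tokens sent.

```python
import math
from itertools import product

# Hypothetical action sets: compression ratios and transmit powers (watts).
RATIOS = (1, 2, 4, 8, 16)
POWERS_W = (0.05, 0.1, 0.2, 0.5)


def service_time(tokens, ratio, power_w, gain=1e-7, bandwidth_hz=1e5,
                 noise_w=1e-10, bits_per_token=32, infer_s_per_token=0.002):
    """Toy service time in seconds: uplink transmission plus LLM processing.

    All default parameter values are illustrative assumptions.
    """
    sent_tokens = tokens / ratio  # prompt length after compression
    # Shannon capacity of the uplink for the chosen transmit power.
    rate_bps = bandwidth_hz * math.log2(1 + power_w * gain / noise_w)
    tx_time = sent_tokens * bits_per_token / rate_bps
    infer_time = sent_tokens * infer_s_per_token
    return tx_time + infer_time


def cost(tokens, ratio, power_w, power_weight=0.5, fidelity_weight=0.3):
    """Assumed cost: service time plus penalties for power use and compression."""
    fidelity_penalty = fidelity_weight * math.log2(ratio)  # stand-in for quality loss
    return service_time(tokens, ratio, power_w) + power_weight * power_w + fidelity_penalty


def choose_action(tokens):
    """Pick the (ratio, power) pair with the lowest cost by brute-force search."""
    return min(product(RATIOS, POWERS_W), key=lambda a: cost(tokens, *a))


if __name__ == "__main__":
    ratio, power_w = choose_action(2000)
    print(f"ratio={ratio}x, power={power_w} W, "
          f"time={service_time(2000, ratio, power_w):.3f} s")
```

In this toy model both time terms scale with the number of tokens sent, so a higher compression ratio always shortens service time; the fidelity penalty is what keeps the search from always choosing the largest ratio. A learned policy, as described in the abstract, would instead adapt these choices to observed network conditions.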
Taxonomy
Topics: Network Packet Processing and Optimization
