AGFT: An Adaptive GPU Frequency Tuner for Real-Time LLM Inference Optimization

Zicong Ye; Kunming Zhang; Guoming Tang

arXiv:2508.01744·cs.LG·August 5, 2025


Open Access

TL;DR

AGFT is an adaptive GPU frequency tuning framework that uses online reinforcement learning to improve energy efficiency during real-time LLM inference, cutting GPU energy consumption by 44.3% while keeping latency overhead below 10%.

Contribution

We introduce AGFT, a novel reinforcement learning-based framework that dynamically adjusts GPU frequencies for energy-efficient LLM inference with minimal performance loss (less than 10% latency overhead).

Findings

01. 44.3% GPU energy savings achieved

02. Less than 10% latency overhead

03. Up to 40.3% energy-delay product improvement
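
For readers unfamiliar with the metric, energy-delay product (EDP) is energy multiplied by delay, so it penalizes a configuration that saves energy only by running much slower. The sketch below shows how a relative EDP improvement could be computed. The function names and the sample numbers are hypothetical and are not measurements from the paper.

```python
def edp(energy_j: float, delay_s: float) -> float:
    """Energy-delay product: energy (joules) times delay (seconds)."""
    return energy_j * delay_s


def edp_improvement(baseline: tuple, tuned: tuple) -> float:
    """Fractional EDP reduction of `tuned` relative to `baseline`.

    Each argument is an (energy_j, delay_s) pair.
    """
    return 1.0 - edp(*tuned) / edp(*baseline)


if __name__ == "__main__":
    # Hypothetical numbers: 40% less energy at 5% more delay.
    print(edp_improvement((100.0, 1.0), (60.0, 1.05)))
```

In this made-up case the improvement comes out at about 0.37, because the 5% delay increase offsets part of the 40% energy saving.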

Abstract

The explosive growth of interactive Large Language Models (LLMs) has placed unprecedented demands for low latency on cloud GPUs, forcing them into high-power modes and causing escalating energy costs. Real-time inference workloads exhibit significant dynamic volatility, presenting substantial energy-saving opportunities. However, traditional static or rule-based power management strategies struggle to exploit these opportunities without compromising peak performance. To address this challenge, we propose AGFT (An Adaptive GPU Frequency Tuner), a framework that employs online reinforcement learning to autonomously learn an optimal frequency tuning policy. By monitoring real-time features like request load and latency, AGFT utilizes fine-grained frequency control for precise adjustments and intelligent action space pruning for stable, efficient decision-making. This creates a robust,…
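
The loop the abstract describes (observe load and latency, choose a frequency, learn from the outcome, prune actions that hurt) can be illustrated with a toy example. What follows is a minimal sketch, not the authors' algorithm: it swaps in a plain epsilon-greedy bandit, uses a coarse frequency grid rather than fine-grained control, and runs against a made-up simulator instead of a real GPU. The class `FrequencyTuner`, the `simulate` model, the reward shape, and every constant are assumptions introduced for illustration.

```python
import random


class FrequencyTuner:
    """Epsilon-greedy bandit over discrete GPU frequencies (illustrative only)."""

    def __init__(self, freqs_mhz, latency_slo_ms, epsilon=0.1, seed=0):
        self.freqs = list(freqs_mhz)
        self.active = set(self.freqs)  # actions that have not been pruned
        self.slo = latency_slo_ms
        self.epsilon = epsilon
        self.counts = {f: 0 for f in self.freqs}
        self.values = {f: 0.0 for f in self.freqs}  # running mean reward
        self.rng = random.Random(seed)

    def select(self):
        candidates = sorted(self.active)
        untried = [f for f in candidates if self.counts[f] == 0]
        if untried:
            return untried[0]  # try every remaining action once
        if self.rng.random() < self.epsilon:
            return self.rng.choice(candidates)  # explore
        return max(candidates, key=lambda f: self.values[f])  # exploit

    def update(self, freq, energy_j, latency_ms):
        # Assumed reward: lower energy is better, with a large penalty
        # whenever the latency target is missed.
        reward = -energy_j
        if latency_ms > self.slo:
            reward -= 1000.0
            # Action-space pruning: if this frequency is too slow, assume
            # all lower ones are too. Always keep the highest as a fallback.
            self.active = {f for f in self.active if f > freq} or {max(self.freqs)}
        self.counts[freq] += 1
        self.values[freq] += (reward - self.values[freq]) / self.counts[freq]


def simulate(freq_mhz, load):
    """Hypothetical model: latency falls and power rises with frequency."""
    latency_ms = 60000.0 * load / freq_mhz
    power_w = 50.0 + 1e-4 * freq_mhz ** 2
    return power_w * latency_ms / 1000.0, latency_ms  # (energy_j, latency_ms)


def run_demo(steps=200):
    tuner = FrequencyTuner([600, 900, 1200, 1500, 1800], latency_slo_ms=60.0)
    for _ in range(steps):
        freq = tuner.select()
        energy_j, latency_ms = simulate(freq, load=1.0)
        tuner.update(freq, energy_j, latency_ms)
    best = max(tuner.active, key=lambda f: tuner.values[f])
    return best, tuner


if __name__ == "__main__":
    best, tuner = run_demo()
    print(best, sorted(tuner.active))
```

Under this toy model the two lowest frequencies miss the 60 ms target and are pruned on first use, and the tuner settles on the lowest frequency that still meets the target. A real deployment would face changing load, so permanent pruning as written here would be too aggressive. It is kept only to make the idea visible.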

Taxonomy

Topics: Big Data and Digital Economy · Cloud Computing and Resource Management · Green IT and Sustainability