AE-LLM: Adaptive Efficiency Optimization for Large Language Models

Kaito Tanaka; Masato Ito; Yuji Nishimura; Keisuke Matsuda; Aya Nakayama

arXiv:2603.20492 · cs.LG · March 24, 2026

Open Access

TL;DR

AE-LLM is a unified framework that automatically optimizes efficiency techniques for large language models, balancing accuracy, latency, memory, and energy to improve deployment performance across diverse scenarios.

Contribution

It introduces a multi-objective optimization approach that dynamically selects and combines efficiency techniques tailored to specific deployment needs, outperforming static configurations.
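As a rough illustration of what multi-objective selection over efficiency configurations can look like, the sketch below filters candidates down to a Pareto front and then picks the most accurate one that fits a deployment budget. It is not the paper's algorithm: the `Config` fields, the `dominates`, `pareto_front`, and `select` helpers, and every number in `CANDIDATES` are hypothetical and chosen only to make the example runnable.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Config:
    """One candidate efficiency configuration (all values are made up)."""
    name: str
    accuracy: float    # higher is better
    latency_ms: float  # lower is better
    memory_gb: float   # lower is better
    energy_j: float    # lower is better


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = (a.accuracy >= b.accuracy and a.latency_ms <= b.latency_ms
                and a.memory_gb <= b.memory_gb and a.energy_j <= b.energy_j)
    strictly_better = (a.accuracy > b.accuracy or a.latency_ms < b.latency_ms
                       or a.memory_gb < b.memory_gb or a.energy_j < b.energy_j)
    return no_worse and strictly_better


def pareto_front(configs: List[Config]) -> List[Config]:
    """Keep only configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


def select(configs: List[Config], max_memory_gb: float,
           max_latency_ms: float) -> Optional[Config]:
    """Pick the most accurate Pareto-optimal config within the given budget."""
    feasible = [c for c in pareto_front(configs)
                if c.memory_gb <= max_memory_gb and c.latency_ms <= max_latency_ms]
    return max(feasible, key=lambda c: c.accuracy) if feasible else None


# Hypothetical measurements, not taken from the paper.
CANDIDATES = [
    Config("fp16_baseline", accuracy=0.80, latency_ms=100, memory_gb=16, energy_j=50),
    Config("int8", accuracy=0.79, latency_ms=60, memory_gb=8, energy_j=30),
    Config("int4", accuracy=0.76, latency_ms=45, memory_gb=4, energy_j=22),
    Config("int8_slow", accuracy=0.78, latency_ms=70, memory_gb=9, energy_j=35),
]

if __name__ == "__main__":
    print([c.name for c in pareto_front(CANDIDATES)])
    print(select(CANDIDATES, max_memory_gb=10, max_latency_ms=80))
```

With these made-up numbers, `int8_slow` drops out because `int8` is better on every objective, and different budgets lead to different choices, which is the sense in which a single static configuration cannot suit every deployment.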

Findings

01. Achieves 2.8x efficiency improvement on average across models and tasks.

02. Maintains accuracy within 1.2% of baseline despite efficiency gains.

03. Generalizes effectively to vision-language models.

Abstract

Large Language Models (LLMs) have achieved remarkable success across diverse applications, yet their deployment remains challenging due to substantial computational costs, memory requirements, and energy consumption. Recent empirical studies have demonstrated that no single efficiency technique is universally optimal; instead, the effectiveness of methods such as efficient attention mechanisms, mixture-of-experts (MoE), parameter-efficient fine-tuning, and quantization varies significantly depending on task characteristics, resource constraints, and model scales. Building upon these insights, we propose AE-LLM, a unified framework that automatically selects and combines optimal efficiency techniques tailored to specific deployment scenarios. Our approach introduces a multi-objective optimization framework that jointly considers accuracy, latency, memory footprint, and energy…

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Natural Language Processing Techniques