Pimp My LLM: Leveraging Variability Modeling to Tune Inference Hyperparameters
Nada Zine, Clément Quinton, Romain Rouvoy

TL;DR
This paper introduces a variability modeling approach for systematically analyzing and optimizing the inference hyperparameters of large language models, with the goals of improving energy efficiency and clarifying the trade-offs involved.
Contribution
It applies variability management techniques to LLM inference configurations, enabling systematic analysis and predictive modeling of hyperparameter effects.
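To make the idea of treating inference settings as a configurable system more concrete, here is a minimal sketch of a feature model written as plain Python. The option names echo common generation parameters, but the value domains and the two constraints are assumptions chosen for illustration. They are not the feature model used in the paper.

```python
from itertools import product

# Hypothetical feature domains: names echo common generation options,
# values are illustrative only. None means "option left unset".
DOMAINS = {
    "do_sample": [False, True],
    "num_beams": [1, 2, 4],
    "temperature": [None, 0.7, 1.0],
    "top_k": [None, 50],
}


def is_valid(cfg):
    """Assumed cross-tree constraints, for illustration only."""
    sampling_option_set = cfg["temperature"] is not None or cfg["top_k"] is not None
    # Constraint 1: sampling options require sampling to be enabled.
    if not cfg["do_sample"] and sampling_option_set:
        return False
    # Constraint 2: when sampling is enabled, a temperature must be chosen.
    if cfg["do_sample"] and cfg["temperature"] is None:
        return False
    return True


def valid_configurations(domains):
    """Enumerate every combination of values and keep those satisfying the constraints."""
    names = list(domains)
    for values in product(*(domains[name] for name in names)):
        cfg = dict(zip(names, values))
        if is_valid(cfg):
            yield cfg


configs = list(valid_configurations(DOMAINS))

# Size of the unconstrained space: product of all domain sizes.
total = 1
for values in DOMAINS.values():
    total *= len(values)

print(f"{len(configs)} valid configurations out of {total} combinations")
```

Even in this toy space the constraints discard more than half of the raw combinations. Brute-force enumeration only works at this scale. Larger models are usually handed to a SAT or constraint solver instead.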
Findings
Variability modeling helps manage the complexity of LLM inference configurations.
It reveals hyperparameter trade-offs and interactions.
Predictive models can estimate inference behavior from limited data.
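The last finding, estimating behavior from limited data, can be illustrated with a deliberately small sketch: fit a model on a few measured configurations, then predict an unmeasured one. Everything here is an assumption made for illustration. The energy values are made-up placeholders, not results from the paper, and the single-variable least-squares line is a stand-in for whatever learning technique the authors actually use.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# PLACEHOLDER measurements (synthetic, not from the paper):
# number of beams -> energy per request, in arbitrary units.
measured_beams = [1, 2, 4]
measured_energy = [10.0, 14.0, 22.0]

slope, intercept = fit_line(measured_beams, measured_energy)

# Predict a configuration that was never measured.
predicted_energy_3_beams = slope * 3 + intercept
print(f"predicted energy for num_beams=3: {predicted_energy_3_beams:.1f}")
```

A real study would use many features, account for interactions between them, and validate predictions on held-out configurations. The sketch only shows the sample, fit, predict loop.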
Abstract
Large Language Models (LLMs) are being increasingly used across a wide range of tasks. However, their substantial computational demands raise concerns about the energy efficiency and sustainability of both training and inference. Inference, in particular, dominates total compute usage, making its optimization crucial. Recent research has explored optimization techniques and analyzed how configuration choices influence energy consumption. Yet, the vast configuration space of inference servers makes exhaustive empirical evaluation infeasible due to combinatorial explosion. In this paper, we introduce a new perspective on this problem by treating LLMs as configurable systems and applying variability management techniques to systematically analyze inference-time configuration choices. We evaluate our approach on the Hugging Face Transformers library by representing generation…
