Green LLM Techniques in Action: How Effective Are Existing Techniques for Improving the Energy Efficiency of LLM-Based Applications in Industry?
Pelin Rabia Kuran, Rumbidzai Chitakunye, Vincenzo Stoico, Ilja Heitlager, Justus Bogner

TL;DR
This study empirically evaluates techniques for improving the energy efficiency of an LLM-based industrial application. Some methods cut energy use substantially, but often at the cost of accuracy; only one reduced energy use without a major accuracy loss.
Contribution
The paper provides the first empirical assessment of multiple energy-saving techniques applied to LLM applications in an industrial context, highlighting their effectiveness and limitations.
Findings
Prompt Optimization and 2-bit Quantization can reduce energy use by up to 90%.
Most techniques negatively impact accuracy, often beyond acceptable levels.
Small and Large Model Collaboration with NPCC achieves significant energy savings without major accuracy loss.
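To make the idea behind Small and Large Model Collaboration concrete, here is a minimal sketch of a complexity-based router: prompts scored as simple go to a small model, the rest go to a large one. This is an illustration under stated assumptions, not the paper's implementation. The `Router` class, the `toy_complexity` scorer, and the 0.5 threshold are all hypothetical; in practice a trained classifier would take the place of the toy scorer.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Router:
    """Hypothetical router: sends each prompt to a small or a large model."""
    score_complexity: Callable[[str], float]  # assumed to return a score in [0, 1]
    small_model: Callable[[str], str]
    large_model: Callable[[str], str]
    threshold: float = 0.5  # illustrative cut-off, not taken from the paper

    def answer(self, prompt: str) -> Tuple[str, str]:
        """Return (name of the model used, its response)."""
        if self.score_complexity(prompt) < self.threshold:
            return "small", self.small_model(prompt)
        return "large", self.large_model(prompt)


def toy_complexity(prompt: str) -> float:
    # Toy stand-in for a real classifier: more words means more complex,
    # capped at 1.0. A real system would use a trained model here.
    return min(len(prompt.split()) / 20.0, 1.0)


# Stub models so the sketch runs without any LLM backend.
router = Router(
    score_complexity=toy_complexity,
    small_model=lambda p: f"[small] {p}",
    large_model=lambda p: f"[large] {p}",
)
```

The energy saving in such a design comes from how often the cheaper model can be used, so the quality of the complexity score determines both the saving and the accuracy risk.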
Abstract
The rapid adoption of large language models (LLMs) has raised concerns about their substantial energy consumption, especially when deployed at industry scale. While several techniques have been proposed to address this, limited empirical evidence exists regarding the effectiveness of applying them to LLM-based industry applications. To fill this gap, we analyzed a chatbot application in an industrial context at Schuberg Philis, a Dutch IT services company. We then selected four techniques, namely Small and Large Model Collaboration, Prompt Optimization, Quantization, and Batching, applied them to the application in eight variations, and then conducted experiments to study their impact on energy consumption, accuracy, and response time compared to the unoptimized baseline. Our results show that several techniques, such as Prompt Optimization and 2-bit Quantization, managed to reduce…
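The comparison against the unoptimized baseline described in the abstract can be sketched as a relative-change calculation over the three measured qualities. This is a hedged illustration only: the `Measurement` fields, the `compare` helper, and the `max_accuracy_drop` tolerance are assumptions introduced here, and the sketch contains no figures from the study.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Measurement:
    """One configuration's measured qualities (field names are assumptions)."""
    energy_joules: float
    accuracy: float         # fraction in [0, 1]
    response_time_s: float


def relative_change(baseline: float, variant: float) -> float:
    """Signed change relative to the baseline; -0.9 means a 90% reduction."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (variant - baseline) / baseline


def compare(baseline: Measurement, variant: Measurement,
            max_accuracy_drop: float = 0.05) -> Dict[str, object]:
    """Summarize a variant against the baseline.

    max_accuracy_drop is a hypothetical tolerance (absolute accuracy points),
    not a threshold reported by the authors.
    """
    accuracy_drop = baseline.accuracy - variant.accuracy
    return {
        "energy_change": relative_change(baseline.energy_joules,
                                         variant.energy_joules),
        "response_time_change": relative_change(baseline.response_time_s,
                                                variant.response_time_s),
        "accuracy_drop": accuracy_drop,
        "acceptable": accuracy_drop <= max_accuracy_drop,
    }
```

Calling `compare(baseline, variant)` with two `Measurement` objects returns the signed changes plus a flag showing whether the accuracy loss stays within the chosen tolerance, which is the trade-off the findings above describe.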
Taxonomy
Topics: AI in Service Interactions · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
