Toward Sustainable GenAI using Generation Directives for Carbon-Friendly Large Language Model Inference
Baolin Li, Yankai Jiang, Vijay Gadepally, Devesh Tiwari

TL;DR
This paper introduces Sprout, a framework that uses generation directives to significantly reduce the carbon footprint of large language model inference while maintaining high-quality outputs, advancing sustainable AI practices.
Contribution
Sprout is the first framework to incorporate generation directives and an offline quality evaluator to optimize carbon efficiency in LLM inference.
Findings
Over 40% reduction in carbon emissions achieved
Effective balancing of sustainability and output quality demonstrated
Real-world evaluation with Llama2 confirms practical viability
Abstract
The rapid advancement of Generative Artificial Intelligence (GenAI) across diverse sectors raises significant environmental concerns, notably the carbon emissions from its cloud and high-performance computing (HPC) infrastructure. This paper presents Sprout, an innovative framework designed to address these concerns by reducing the carbon footprint of generative Large Language Model (LLM) inference services. Sprout leverages the innovative concept of "generation directives" to guide the autoregressive generation process, thereby enhancing carbon efficiency. Our proposed method meticulously balances the need for ecological sustainability with the demand for high-quality generation outcomes. Employing a directive optimizer for the strategic assignment of generation directives to user prompts and an original offline quality evaluator, Sprout demonstrates a significant reduction in carbon…
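The idea of assigning a generation directive to each prompt can be illustrated with a small sketch. This is not Sprout's optimizer: the directive levels, token counts, quality scores, energy-per-token figure, and the selection rule (lowest estimated carbon subject to a quality floor) are all hypothetical values and choices made up for illustration. The only fixed fact used is the unit conversion 1 kWh = 3.6e6 J.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Directive:
    """A hypothetical generation directive (all fields are illustrative)."""
    name: str
    instruction: str        # text that would be prepended to the user prompt
    expected_tokens: float  # assumed mean output length under this directive
    quality: float          # assumed offline-evaluator score in [0, 1]


def carbon_grams(tokens: float, joules_per_token: float,
                 grid_gco2_per_kwh: float) -> float:
    """Operational carbon estimate: energy (kWh) times grid carbon intensity."""
    kwh = tokens * joules_per_token / 3.6e6  # 1 kWh = 3.6e6 J
    return kwh * grid_gco2_per_kwh


def pick_directive(directives, joules_per_token, grid_gco2_per_kwh, min_quality):
    """Pick the lowest-carbon directive whose assumed quality meets the floor.

    Falls back to the highest-quality directive if none meets the floor.
    """
    eligible = [d for d in directives if d.quality >= min_quality]
    if not eligible:
        return max(directives, key=lambda d: d.quality)
    return min(
        eligible,
        key=lambda d: carbon_grams(d.expected_tokens, joules_per_token,
                                   grid_gco2_per_kwh),
    )


# Made-up directive levels; the numbers are placeholders, not measurements.
DIRECTIVES = [
    Directive("L0", "", 300.0, 0.95),
    Directive("L1", "Answer briefly.", 150.0, 0.90),
    Directive("L2", "Answer in one sentence.", 60.0, 0.70),
]

chosen = pick_directive(DIRECTIVES, joules_per_token=2.0,
                        grid_gco2_per_kwh=400.0, min_quality=0.85)
print(chosen.name)
```

In this toy version the grid carbon intensity scales the estimate but does not change which directive wins, because energy per token is held constant. A fuller treatment would let the quality floor or the assignment probabilities vary with carbon intensity, which is outside what the abstract specifies.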
Taxonomy
Topics: Topic Modeling
