Data-Centric Green AI: An Exploratory Empirical Study
Roberto Verdecchia, Luís Cruz, June Sallou, Michelle Lin, James Wickenden, Estelle Hotellier

TL;DR
This study empirically investigates how data-centric modifications, such as changing dataset size and features, can significantly reduce AI energy consumption with minimal impact on accuracy, highlighting a promising avenue for Green AI.
Contribution
The study provides the first empirical evidence that dataset modifications alone can drastically reduce AI energy consumption, underscoring the importance of data-centric approaches for Green AI.
Findings
Dataset modifications can reduce energy consumption by up to 92.16%.
Changing the algorithm can lead to energy savings of up to two orders of magnitude.
Data-centric techniques are crucial for advancing Green AI.
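To put the two headline figures on the same scale, the following minimal sketch converts between a percentage reduction and a reduction factor. The function names and the joule values are hypothetical and chosen only for illustration; they are not measurements from the paper.

```python
def energy_reduction_pct(baseline_joules: float, modified_joules: float) -> float:
    """Percentage of energy saved relative to the baseline run."""
    if baseline_joules <= 0:
        raise ValueError("baseline energy must be positive")
    return 100.0 * (baseline_joules - modified_joules) / baseline_joules


def reduction_factor(baseline_joules: float, modified_joules: float) -> float:
    """How many times less energy the modified run uses than the baseline."""
    if modified_joules <= 0:
        raise ValueError("modified energy must be positive")
    return baseline_joules / modified_joules


# Illustrative values only (assumed, not taken from the study).
baseline = 1000.0  # joules consumed by a hypothetical baseline run
modified = 78.4    # joules consumed by a hypothetical reduced-dataset run

print(round(energy_reduction_pct(baseline, modified), 2))
print(round(reduction_factor(baseline, modified), 2))
```

Read this way, a 92.16% reduction corresponds to roughly a 13-fold decrease, while a saving of two orders of magnitude (a factor of about 100) corresponds to a reduction of about 99%.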
Abstract
With the growing availability of large-scale datasets, and the popularization of affordable storage and computational capabilities, the energy consumed by AI is becoming a growing concern. To address this issue, in recent years, studies have focused on demonstrating how AI energy efficiency can be improved by tuning the model training strategy. Nevertheless, how modifications applied to datasets can impact the energy consumption of AI is still an open question. To fill this gap, in this exploratory study, we evaluate if data-centric approaches can be utilized to improve AI energy efficiency. To achieve our goal, we conduct an empirical experiment, executed by considering 6 different AI algorithms, a dataset comprising 5,574 data points, and two dataset modifications (number of data points and number of features). Our results show evidence that, by exclusively conducting modifications on…
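The two dataset modifications named in the abstract (number of data points and number of features) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the helper names, the random sampling strategy, the feature count, and the synthetic data are all hypothetical. Only the dataset size of 5,574 comes from the abstract.

```python
import random


def subsample_rows(rows, fraction, seed=0):
    """Keep a random fraction of the data points (rows), without replacement."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    keep = max(1, round(len(rows) * fraction))
    return rng.sample(rows, keep)


def select_features(rows, keep_indices):
    """Keep only the feature columns listed in keep_indices."""
    return [[row[i] for i in keep_indices] for row in rows]


# Synthetic stand-in dataset: 5,574 rows (size from the abstract),
# 10 numeric features per row (an assumed, illustrative count).
data_rng = random.Random(42)
dataset = [[data_rng.random() for _ in range(10)] for _ in range(5574)]

# Modification 1: fewer data points. Modification 2: fewer features.
fewer_rows = subsample_rows(dataset, fraction=0.1)
fewer_features = select_features(dataset, keep_indices=[0, 1, 2])

print(len(fewer_rows), len(fewer_rows[0]))
print(len(fewer_features), len(fewer_features[0]))
```

In an experiment of this kind, each reduced dataset would then be used to train every algorithm while energy and accuracy are recorded; the measurement step is omitted here because it depends on hardware-specific tooling.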
Taxonomy
Topics: Green IT and Sustainability · Big Data and Digital Economy · Machine Learning and Data Classification
