Ensembles at Any Cost? Accuracy-Energy Trade-offs in Recommender Systems
Jannik Nitschke, Lukas Wegmeth, Joeran Beel

TL;DR
This study evaluates the accuracy-energy trade-offs of ensemble methods in recommender systems across 93 controlled experiments, revealing significant energy costs for modest accuracy gains.
Contribution
It provides a comprehensive empirical analysis of energy consumption in ensemble recommender systems, highlighting efficiency trade-offs and limitations.
Findings
Ensembles improve accuracy by up to 5.7% but can increase energy use by up to 2,549%.
Selective ensemble strategies are more energy-efficient than exhaustive averaging.
Ensemble methods can significantly increase CO2 emissions relative to single models.
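To read the two headline figures on the same scale, it can help to convert each percentage increase into a multiplier: an increase of p% corresponds to a factor of 1 + p/100. The sketch below applies that arithmetic to the reported maxima. The function name is illustrative and not taken from the paper, and the two maxima are "up to" figures that need not come from the same experiment.

```python
def increase_to_multiplier(percent_increase: float) -> float:
    """Convert a relative increase in percent into a multiplicative factor."""
    return 1.0 + percent_increase / 100.0


# Reported maxima from the findings above (not necessarily the same run).
accuracy_factor = increase_to_multiplier(5.7)     # about 1.057x
energy_factor = increase_to_multiplier(2549.0)    # about 26.49x

print(f"accuracy: {accuracy_factor:.3f}x, energy: {energy_factor:.2f}x")
```

Read this way, the largest reported energy increase means the ensemble used roughly 26.5 times the energy of its reference, while the largest reported accuracy gain is a factor of about 1.06.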
Abstract
Ensemble methods are frequently used in recommender systems to improve accuracy by combining multiple models. Recent work reports sizable performance gains, but most studies still optimize primarily for accuracy and robustness rather than for energy efficiency. This paper measures the accuracy-energy trade-offs of ensemble techniques relative to strong single models. We run 93 controlled experiments in two pipelines: (1) explicit rating prediction with Surprise (RMSE) and (2) implicit feedback ranking with LensKit (NDCG@10). We evaluate four datasets ranging from 100,000 to 7.8 million interactions (MovieLens 100K, MovieLens 1M, ModCloth, Anime). We compare four ensemble strategies (Average, Weighted, Stacking or Rank Fusion, Top Performers) against baselines and optimized single models. Whole-system energy is measured with EMERS using a smart plug and converted to CO2 equivalents. Across…
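As a rough illustration of three of the strategies named in the abstract (Average, Weighted, Top Performers) in the rating-prediction setting, here is a minimal NumPy sketch. It is not the authors' implementation and does not use the Surprise API. The weighting rule (inverse validation RMSE), the choice of k, and the toy data are all assumptions made for this example.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted ratings."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def average_ensemble(preds):
    """Unweighted mean over all models (rows = models, columns = ratings)."""
    return np.mean(preds, axis=0)


def weighted_ensemble(preds, val_rmse):
    """Weighted mean; weights are inverse validation RMSE (an assumption)."""
    weights = 1.0 / np.asarray(val_rmse, dtype=float)
    weights /= weights.sum()
    return np.average(preds, axis=0, weights=weights)


def top_performers_ensemble(preds, val_rmse, k=2):
    """Mean over only the k models with the lowest validation RMSE."""
    best = np.argsort(val_rmse)[:k]
    return np.mean(np.asarray(preds)[best], axis=0)


# Toy data (hypothetical): three models predicting four ratings.
y_true = np.array([4.0, 3.0, 5.0, 2.0])
preds = np.array([
    [4.5, 3.5, 5.5, 2.5],  # model A: consistently 0.5 too high
    [3.5, 2.5, 4.5, 1.5],  # model B: consistently 0.5 too low
    [2.0, 5.0, 3.0, 4.0],  # model C: poor model
])
val_rmse = [0.5, 0.5, 2.0]  # assumed validation scores for A, B, C

for name, combined in [
    ("average", average_ensemble(preds)),
    ("weighted", weighted_ensemble(preds, val_rmse)),
    ("top-2", top_performers_ensemble(preds, val_rmse, k=2)),
]:
    print(f"{name}: RMSE = {rmse(y_true, combined):.4f}")
```

In this toy setup the selective strategy ignores the weak model entirely, whereas plain averaging is pulled toward its errors. All three combination steps still need predictions from every trained model, so any energy saving from a selective strategy would have to come from training or serving fewer models, which this sketch does not measure.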
