Optimal Dataset Size for Recommender Systems: Evaluating Algorithms' Performance via Downsampling
Ardalan Arabzadeh

TL;DR
This study evaluates how dataset downsampling affects the performance and energy efficiency of recommender systems, demonstrating significant reductions in runtime and carbon emissions while maintaining competitive recommendation quality.
Contribution
The thesis introduces a systematic analysis of dataset downsampling as a way to improve energy efficiency in recommender systems, backed by empirical evidence across multiple datasets and algorithms.
Findings
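A 30% downsampling portion reduces runtime by 52% compared to the full dataset and cuts carbon emissions by up to 51.02 kg CO2e when training a single algorithm on a single dataset.
Algorithms retain 81% of full-dataset performance when trained on 50% of the training data.
Some downsampled configurations achieve higher recommendation quality than training on the full dataset.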
Abstract
This thesis investigates dataset downsampling as a strategy to optimize energy efficiency in recommender systems while maintaining competitive performance. With increasing dataset sizes posing computational and environmental challenges, this study explores the trade-offs between energy efficiency and recommendation quality in Green Recommender Systems, which aim to reduce environmental impact. By applying two downsampling approaches to seven datasets, 12 algorithms, and two levels of core pruning, the research demonstrates significant reductions in runtime and carbon emissions. For example, a 30% downsampling portion can reduce runtime by 52% compared to the full dataset, leading to a carbon emission reduction of up to 51.02 kg CO2e during the training of a single algorithm on a single dataset. The analysis reveals that algorithm performance under different downsampling portions depends…
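The abstract names downsampling and core pruning but does not say how either was implemented, so the following is only an illustrative sketch. It assumes uniform random sampling of user-item interactions as the downsampling step and the common definition of k-core pruning (repeatedly dropping users and items with fewer than k interactions). The function names `downsample` and `core_prune` and the toy data are hypothetical and do not come from the thesis.

```python
import random
from collections import Counter


def downsample(interactions, portion, seed=0):
    """Keep a uniformly random `portion` of the interactions.

    Assumption: uniform random sampling; the thesis may use other schemes.
    """
    if not 0 < portion <= 1:
        raise ValueError("portion must be in (0, 1]")
    interactions = list(interactions)
    rng = random.Random(seed)  # fixed seed for reproducible samples
    keep = round(len(interactions) * portion)
    return rng.sample(interactions, keep)


def core_prune(interactions, k):
    """Drop users and items with fewer than k interactions until stable."""
    current = list(interactions)
    while True:
        user_counts = Counter(user for user, _ in current)
        item_counts = Counter(item for _, item in current)
        kept = [
            (user, item)
            for user, item in current
            if user_counts[user] >= k and item_counts[item] >= k
        ]
        if len(kept) == len(current):  # nothing removed: pruning has converged
            return kept
        current = kept


# Toy data: 10 users x 10 items, every pair observed once (hypothetical).
toy = [(u, i) for u in range(10) for i in range(10)]
sample_30 = downsample(toy, 0.3)   # keeps 30 of the 100 interactions
pruned = core_prune(sample_30, 2)  # may be much smaller than the sample
```

Pruning is applied iteratively because removing a sparse user can push an item below the threshold, and vice versa. Whether pruning happens before or after downsampling in the thesis is not stated in the abstract; the order shown here is arbitrary.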
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Bandit Algorithms Research · Recommender Systems and Techniques
Methods: Sparse Evolutionary Training
