Algorithm Performance Spaces for Strategic Dataset Selection
Steffen Schulz

TL;DR
This paper introduces the Algorithm Performance Space, a framework for differentiating datasets based on algorithm performance to improve dataset selection in recommender system research.
Contribution
It proposes a novel framework and three metrics to quantify dataset differences, aiding in more appropriate dataset selection for algorithm evaluation.
Findings
Created an Algorithm Performance Space to differentiate datasets
Validated the use of three metrics for dataset comparison
Demonstrated the framework's potential for diverse dataset selection
Abstract
The evaluation of new algorithms in recommender systems frequently depends on publicly available datasets, such as those from MovieLens or Amazon. Some of these datasets are used disproportionately often, primarily because of their historical popularity as baselines rather than their suitability for specific research contexts. This thesis addresses this issue by introducing the Algorithm Performance Space, a novel framework designed to differentiate datasets based on the measured performance of algorithms applied to them. An experimental study proposes three metrics to quantify and justify the selection of datasets for evaluating new algorithms. These metrics also validate assumptions about datasets, such as the similarity between MovieLens datasets of varying sizes. By creating an Algorithm Performance Space and using the proposed metrics, it became possible to differentiate datasets, and diverse…
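The idea of placing datasets in a space spanned by algorithm performance can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the dataset names, the scores, the use of Euclidean distance, and the greedy selection rule are hypothetical and are not the three metrics proposed in the thesis, which the abstract does not specify.

```python
import math
from itertools import combinations

# Hypothetical scores of three algorithms on four datasets.
# These numbers are invented for illustration; they are NOT results from the thesis.
# Each dataset becomes one point whose coordinates are the algorithm scores.
performance = {
    "dataset_a": [0.30, 0.25, 0.10],
    "dataset_b": [0.31, 0.26, 0.11],
    "dataset_c": [0.05, 0.20, 0.40],
    "dataset_d": [0.15, 0.15, 0.15],
}


def pairwise_distances(space):
    """Euclidean distance between every pair of datasets (an assumed measure)."""
    return {
        (a, b): math.dist(space[a], space[b])
        for a, b in combinations(sorted(space), 2)
    }


def closest_pair(space):
    """The two datasets on which the algorithms behave most alike."""
    distances = pairwise_distances(space)
    return min(distances, key=distances.get)


def most_diverse_subset(space, k):
    """Greedy farthest-point selection of k datasets (an assumed heuristic)."""
    distances = pairwise_distances(space)
    # Start from the two datasets that are farthest apart.
    selected = list(max(distances, key=distances.get))
    while len(selected) < k:
        remaining = [name for name in sorted(space) if name not in selected]
        # Add the dataset whose nearest already-selected neighbour is farthest away.
        best = max(
            remaining,
            key=lambda name: min(math.dist(space[name], space[s]) for s in selected),
        )
        selected.append(best)
    return selected[:k]


if __name__ == "__main__":
    print("most similar pair:", closest_pair(performance))
    print("diverse selection:", most_diverse_subset(performance, 3))
```

Under these assumptions, two datasets that lie close together add little to an evaluation because the algorithms rank and score similarly on both, whereas a spread-out selection covers more distinct behaviour.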
Taxonomy
Topics: Big Data and Business Intelligence
