Our Model Achieves Excellent Performance on MovieLens: What Does it Mean?
Yu-chen Fan, Yitong Ji, Jie Zhang, Aixin Sun

TL;DR
This paper critically analyzes the MovieLens dataset, revealing how interaction dynamics and platform-specific factors influence recommendation evaluation, and discusses the implications for real-world applicability of models trained on this benchmark.
Contribution
The study provides a detailed analysis of MovieLens interactions, highlighting how dataset-specific factors affect recommendation evaluation and model performance.
Findings
User interactions differ significantly across the stages of a user's engagement with the MovieLens platform.
The platform's recommendations influence which movies users interact with.
The order of interactions affects the performance of sequential models.
Abstract
A typical benchmark dataset for recommender system (RecSys) evaluation consists of user-item interactions generated on a platform within a time period. The interaction generation mechanism partially explains why a user interacts with (e.g., like, purchase, rate) an item, and the context of when a particular interaction happened. In this study, we conduct a meticulous analysis of the MovieLens dataset and explain the potential impact of using the dataset for evaluating recommendation algorithms. We make a few main findings from our analysis. First, there are significant differences in user interactions at the different stages when a user interacts with the MovieLens platform. The early interactions largely define the user portrait which affects the subsequent interactions. Second, user interactions are highly affected by the candidate movies that are recommended by the platform's…
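As a rough illustration of the first finding, the sketch below groups interactions per user, orders them by timestamp, and compares early interactions against later ones. It is a minimal sketch under stated assumptions, not the authors' analysis: the function names `split_by_stage` and `mean_rating`, the cutoff parameter `early_k`, and the toy records are all hypothetical, and the abstract excerpt does not say how the paper itself delimits stages.

```python
from collections import defaultdict
from statistics import mean


def split_by_stage(interactions, early_k=2):
    """Split (user, item, rating, timestamp) records into early and later stages.

    Assumption: "early" means the first `early_k` interactions of each user in
    chronological order. This cutoff is illustrative, not taken from the paper.
    """
    per_user = defaultdict(list)
    for user, item, rating, ts in interactions:
        per_user[user].append((ts, item, rating))

    early, later = [], []
    for user, events in per_user.items():
        events.sort()  # chronological; timestamp ties fall back to item id
        for pos, (ts, item, rating) in enumerate(events):
            target = early if pos < early_k else later
            target.append((user, item, rating))
    return early, later


def mean_rating(rows):
    """Average rating over (user, item, rating) rows; NaN when there are none."""
    return mean(r for _, _, r in rows) if rows else float("nan")


# Toy records invented for illustration only; they are not MovieLens data.
toy = [
    ("u1", "m1", 5.0, 100), ("u1", "m2", 4.0, 101),
    ("u1", "m3", 3.0, 500), ("u1", "m4", 2.0, 900),
    ("u2", "m1", 4.0, 200), ("u2", "m5", 5.0, 201),
    ("u2", "m2", 1.0, 800),
]

early, later = split_by_stage(toy, early_k=2)
print(mean_rating(early), mean_rating(later))
```

With real data, the same split could be applied to the ratings file after parsing each row into a `(user, item, rating, timestamp)` tuple, and any per-stage statistic could replace the mean rating used here.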
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Bandit Algorithms Research · Energy Load and Power Forecasting
