TL;DR
This paper investigates how unofficial third-party implementations of recommender algorithms can benefit or hinder reproducibility, comparing six third-party implementations of a popular recommender algorithm against the official version on five public datasets.
Contribution
It examines how third-party implementations affect reproducibility, an aspect the authors describe as neglected in recommender systems research.
Findings
Third-party implementations vary significantly from official versions.
Reproducibility can be compromised by unofficial implementations.
Some third-party versions outperform the official code, while others underperform it.
Abstract
Reproducibility of recommender systems research has come under scrutiny in recent years. Along with works focusing on repeating experiments with certain algorithms, the research community has also started discussing various aspects of evaluation and how these affect reproducibility. We add a novel angle to this discussion by examining how unofficial third-party implementations could benefit or hinder reproducibility. Besides giving a general overview, we thoroughly examine six third-party implementations of a popular recommender algorithm and compare them to the official version on five public datasets. In light of our alarming findings, we aim to draw the attention of the research community to this neglected aspect of reproducibility.
