RARD: The Related-Article Recommendation Dataset

Joeran Beel; Zeljko Carevic; Johann Schaible; Gabor Neusch

arXiv:1706.03428 · cs.IR · June 21, 2017

TL;DR

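RARD is a dataset of 57.4 million research-paper recommendations shown to users of the digital library Sowiport, with logs of how each recommendation was generated and how users interacted with it. It is intended for evaluating and training scientific recommender systems.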

Contribution

The paper introduces RARD, a dataset of research-paper recommendations with detailed recommendation logs and implicit ratings. It addresses the scarcity of datasets from research-paper recommender systems compared with domains such as movies, books, and music.

Findings

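01. Covers 57.4 million recommendations displayed to Sowiport users, together with click logs.

02. Records which recommendation approach (e.g. content-based filtering, stereotype, most popular) and which feature type (simple terms vs. keyphrases) was used.

03. Provides an implicit item-item rating matrix.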
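As a rough illustration of what an implicit item-item rating matrix can look like, the sketch below aggregates recommendation log rows into per-pair display counts, click counts, and a click-through rate. The row layout `(source_id, recommended_id, clicked)`, the function name `build_item_item_matrix`, and the toy rows are all assumptions made for this example; they are not RARD's actual schema or the authors' procedure.

```python
from collections import defaultdict


def build_item_item_matrix(log_rows):
    """Aggregate (source_id, recommended_id, clicked) rows into per-pair stats.

    Hypothetical helper: the row layout is an assumption, not RARD's schema.
    """
    counts = defaultdict(lambda: [0, 0])  # pair -> [displayed, clicked]
    for source_id, recommended_id, clicked in log_rows:
        entry = counts[(source_id, recommended_id)]
        entry[0] += 1          # every logged row counts as one display
        if clicked:
            entry[1] += 1      # a click serves as the implicit positive signal
    return {
        pair: {"displayed": shown, "clicked": clicks, "ctr": clicks / shown}
        for pair, (shown, clicks) in counts.items()
    }


# Toy rows invented for illustration only (not taken from RARD).
toy_log = [
    ("doc_a", "doc_b", True),
    ("doc_a", "doc_b", False),
    ("doc_a", "doc_c", False),
    ("doc_b", "doc_a", True),
]
matrix = build_item_item_matrix(toy_log)
```

Each key is a (source document, recommended document) pair, so the result is a sparse matrix stored as a dictionary. The click-through rate is only one possible implicit rating; raw click counts would work as well.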

Abstract

Recommender-system datasets are used for recommender-system evaluations, training machine-learning algorithms, and exploring user behavior. While there are many datasets for recommender systems in the domains of movies, books, and music, there are rather few datasets from research-paper recommender systems. In this paper, we introduce RARD, the Related-Article Recommendation Dataset, from the digital library Sowiport and the recommendation-as-a-service provider Mr. DLib. The dataset contains information about 57.4 million recommendations that were displayed to the users of Sowiport. Information includes details on which recommendation approaches were used (e.g. content-based filtering, stereotype, most popular), what types of features were used in content-based filtering (simple terms vs. keyphrases), where the features were extracted from (title or abstract), and the time when…
