DataRec: A Python Library for Standardized and Reproducible Data Management in Recommender Systems
Alberto Carlo Maria Mancino, Salvatore Bufi, Angela Di Fazio, Antonio Ferrara, Daniele Malitesta, Claudio Pomo, Tommaso Di Noia

TL;DR
DataRec is an open-source Python library that standardizes and simplifies data management in recommender system research, improving reproducibility, comparability, and methodological consistency across experiments.
Contribution
It introduces a unified framework for dataset handling, versioning, and integration, addressing fragmentation and opacity in current data management practices.
Findings
Promotes reproducibility and fair benchmarking in recommender systems.
Addresses common pitfalls in data preprocessing and management.
Enhances interoperability across different experimental setups.
Abstract
Recommender systems have demonstrated significant impact across diverse domains, yet ensuring the reproducibility of experimental findings remains a persistent challenge. A primary obstacle lies in the fragmented and often opaque data management strategies employed during the preprocessing stage, where decisions about dataset selection, filtering, and splitting can substantially influence outcomes. To address these limitations, we introduce DataRec, an open-source Python-based library specifically designed to unify and streamline data handling in recommender system research. By providing reproducible routines for dataset preparation, data versioning, and seamless integration with other frameworks, DataRec promotes methodological standardization, interoperability, and comparability across different experimental setups. Our design is informed by an in-depth review of 55 state-of-the-art…
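The abstract's point that filtering and splitting decisions can change experimental outcomes can be made concrete with a small sketch. The code below is not DataRec's API; the function names `k_core_filter` and `seeded_holdout_split`, the toy interactions, and the parameter values are all hypothetical and use only the Python standard library. It shows two preprocessing choices that need to be recorded for a run to be reproducible: the k-core threshold and the random seed used for the split.

```python
import random
from collections import Counter


def k_core_filter(interactions, k):
    """Iteratively drop users and items with fewer than k interactions.

    Hypothetical helper for illustration; not part of DataRec.
    """
    current = list(interactions)
    while True:
        user_counts = Counter(user for user, _ in current)
        item_counts = Counter(item for _, item in current)
        kept = [
            (user, item)
            for user, item in current
            if user_counts[user] >= k and item_counts[item] >= k
        ]
        # Stop once a full pass removes nothing.
        if len(kept) == len(current):
            return kept
        current = kept


def seeded_holdout_split(interactions, test_ratio, seed):
    """Random hold-out split that is deterministic for a given seed.

    Hypothetical helper for illustration; not part of DataRec.
    """
    rng = random.Random(seed)
    # Sort first so the result does not depend on the input order.
    shuffled = sorted(interactions)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Toy (user, item) interactions, invented for this example.
interactions = [
    ("u1", "i1"), ("u1", "i2"),
    ("u2", "i1"), ("u2", "i2"),
    ("u3", "i1"), ("u3", "i3"),
]

filtered = k_core_filter(interactions, k=2)
train, test = seeded_holdout_split(filtered, test_ratio=0.25, seed=42)
print(len(interactions), len(filtered), len(train), len(test))
```

In this toy data, removing the single-interaction item `i3` leaves user `u3` below the threshold, so a second pass removes `u3` as well. A one-pass filter and an iterative filter therefore produce different datasets, which is the kind of undocumented choice that makes results hard to compare.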
Taxonomy
Topics: Recommender Systems and Techniques
Methods: Lib
