An Image Dataset for Benchmarking Recommender Systems with Raw Pixels
Yu Cheng, Yunzhu Pan, Jiaqi Zhang, Yongxin Ni, Aixin Sun, Fajie Yuan

TL;DR
PixelRec is a large-scale image recommendation dataset that enables models to learn directly from raw image pixels, demonstrating competitive performance and advantages in cold-start and cross-domain scenarios.
Contribution
The paper introduces PixelRec, a large-scale recommendation dataset that provides raw image pixels, and shows that vision-based models can match or outperform ID-based models.
Findings
PixelNet (models that learn item representations from raw image pixels) matches or exceeds the ID-based IDNet baselines in standard settings.
PixelNet outperforms IDNet in cold-start scenarios.
PixelRec enables research on image-based recommendation models.
Abstract
Recommender systems (RS) have achieved significant success by leveraging explicit identification (ID) features. However, the full potential of content features, especially the pure image pixel features, remains relatively unexplored. The limited availability of large, diverse, and content-driven image recommendation datasets has hindered the use of raw images as item representations. In this regard, we present PixelRec, a massive image-centric recommendation dataset that includes approximately 200 million user-image interactions, 30 million users, and 400,000 high-quality cover images. By providing direct access to raw image pixels, PixelRec enables recommendation models to learn item representation directly from them. To demonstrate its utility, we begin by presenting the results of several classical pure ID-based baseline models, termed IDNet, trained on PixelRec. Then, to show the…
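The contrast the abstract draws between ID-based and pixel-based item representations can be illustrated with a toy sketch. It is not the paper's implementation: the class names, the dimensions, and the single random linear projection standing in for a vision encoder are all assumptions made for illustration, and nothing here is trained. It only shows why an ID lookup has nothing to return for an unseen item, while a pixel encoder can embed any image.

```python
import random

random.seed(0)
DIM = 4           # embedding size (illustrative assumption)
NUM_PIXELS = 12   # length of a flattened toy "image" (illustrative assumption)


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


class IDItemEncoder:
    """Hypothetical IDNet-style encoder: one vector per known item ID."""

    def __init__(self, item_ids):
        self.table = {
            item_id: [random.gauss(0, 0.1) for _ in range(DIM)]
            for item_id in item_ids
        }

    def encode(self, item_id):
        # An item that was never seen has no row in the table.
        return self.table.get(item_id)


class PixelItemEncoder:
    """Hypothetical PixelNet-style encoder: maps raw pixels to a vector.

    A single linear projection stands in for a real vision encoder.
    """

    def __init__(self):
        self.weights = [
            [random.gauss(0, 0.1) for _ in range(NUM_PIXELS)]
            for _ in range(DIM)
        ]

    def encode(self, pixels):
        return [dot(row, pixels) for row in self.weights]


def score(user_vec, item_vec):
    """Dot-product relevance score between a user and an item."""
    return dot(user_vec, item_vec)


user_vec = [random.gauss(0, 0.1) for _ in range(DIM)]
id_encoder = IDItemEncoder(item_ids=["item_a", "item_b"])
pixel_encoder = PixelItemEncoder()

# A brand-new item: no interactions yet, but its cover image is available.
new_item_pixels = [random.random() for _ in range(NUM_PIXELS)]
id_vec = id_encoder.encode("item_new")            # no embedding exists
pixel_vec = pixel_encoder.encode(new_item_pixels)  # embedding from pixels
new_item_score = score(user_vec, pixel_vec)
```

In a real system the linear projection would be replaced by an image encoder trained together with the recommendation objective, and the user vector would come from a user or sequence encoder rather than random values.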
Taxonomy
Topics: Recommender Systems and Techniques · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques
