Scenario-Wise Rec: A Multi-Scenario Recommendation Benchmark

Xiaopeng Li; Jingtong Gao; Pengyue Jia; Xiangyu Zhao; Yichao Wang; Wanyu Wang; Yejing Wang; Yuhao Wang; Xiangyu Zhao; Huifeng Guo; Ruiming Tang

arXiv:2412.17374·cs.IR·October 14, 2025

TL;DR

This paper introduces Scenario-Wise Rec, a comprehensive benchmark with datasets, models, and pipelines for multi-scenario recommendation, addressing current challenges in dataset processing and model transparency.

Contribution

The paper provides the first unified benchmark for MSR, including datasets, models, and evaluation pipelines, to facilitate fair comparisons and foster collaborative research.

Findings

01 · Benchmark validated with industrial advertising data

02 · Includes 6 public datasets and 12 models

03 · Provides open-source code for reproducibility

Abstract

Multi-Scenario Recommendation (MSR) tasks, referring to building a unified model to enhance performance across all recommendation scenarios, have recently gained much attention. However, current research in MSR faces two significant challenges that hinder the field's development: the absence of uniform procedures for multi-scenario dataset processing, which prevents fair comparisons, and the fact that most models are closed-source, which complicates comparisons with current SOTA models. Consequently, we introduce our benchmark, Scenario-Wise Rec, which comprises 6 public datasets and 12 benchmark models, along with a training and evaluation pipeline. Additionally, we validated the benchmark using an industrial advertising dataset, reinforcing its reliability and applicability in real-world scenarios. We aim for this benchmark to offer researchers valuable insights from prior work,…
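To make the idea of scenario-wise evaluation concrete, here is a minimal sketch of how one unified model's predictions might be scored separately per scenario. It is not taken from the benchmark's code: the function names `auc` and `scenario_wise_auc`, the toy data, and the choice of AUC as the metric are all assumptions made for illustration.

```python
from collections import defaultdict


def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when one class is missing
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def scenario_wise_auc(scenario_ids, labels, scores):
    """Group samples by scenario id (hypothetical field) and compute AUC per group."""
    groups = defaultdict(lambda: ([], []))
    for sid, y, s in zip(scenario_ids, labels, scores):
        groups[sid][0].append(y)
        groups[sid][1].append(s)
    return {sid: auc(ys, ss) for sid, (ys, ss) in groups.items()}


# Toy data (invented): two scenarios, click labels, and scores from one shared model.
scenario_ids = [0, 0, 0, 0, 1, 1, 1, 1]
labels = [1, 0, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.4, 0.5, 0.1, 0.8]

per_scenario = scenario_wise_auc(scenario_ids, labels, scores)
overall = auc(labels, scores)
print(per_scenario, overall)
```

Reporting the per-scenario values next to the pooled value shows whether a gain in one scenario comes at the cost of another, which a single pooled metric can hide.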

Peer Reviews

Decision·ICLR 2025 Conference Withdrawn Submission

Reviewer 01 · Rating 6 · Confidence 4

Strengths

S1: The paper presents the first dedicated benchmark for multi-scenario recommendation tasks, which may become a valuable resource in the field. It offers a comprehensive pipeline that includes data processing, model training, and evaluation, setting a new standard for transparency and reproducibility in MSR research.
S2: Including public and industrial datasets strengthens the benchmark's reliability and applicability, covering many real-world scenarios. The publicly available source code and

Weaknesses

W1: Although the advantages and disadvantages of the different baselines are provided, the paper could offer deeper insight by examining the theoretical basis of the models and providing mitigation strategies for the "seesaw effect."
W2: The focus on standard MSR is understandable, but the paper does not explore benchmarking of more scenario-related topics (e.g., multi-scenario multi-task) or additional information (e.g., users' interaction history sequences), limiting the scope of its practical procedur

Reviewer 02 · Rating 3 · Confidence 4

Strengths

Comprehensive experiments are conducted with a wide range of algorithms and datasets, including a newly collected industrial dataset, with code and clear instructions provided.

Weaknesses

The contribution and novelty are limited: the authors merely present existing public benchmark datasets with some features used to differentiate “scenarios”. Some, if not all, of these features could just as well be ordinary features, and no justification is given for why treating them as “scenario” features is a reasonable choice. The results also do not appear to be benchmarked against treating each dataset as “single-scenario” as is, with the “scenario” feature used as an ordinary feature. How t

Reviewer 03 · Rating 3 · Confidence 3

Strengths

This paper developed a standardized benchmark specifically for Multi-Scenario Recommendation tasks. This idea is good and needed in this area. This platform is particularly novel. It provides a comprehensive framework that includes multiple datasets, implementations, and a full evaluation pipeline, aiming to give comparisons across scenarios. The idea of the standard platform and benchmark is quite important. It provides a valuable tool to address the growing need for reliable, standardized MSR

Weaknesses

The idea of the article is very good, providing a benchmark. However, because it is a benchmark, it must adapt to different needs and also take into account the multi-scenario interaction of MSR, which is a very difficult task. This paper has an ambiguous definition of multi-scenario. The datasets used are segmented but lack true cross-scenario relationships, which limits the ability to share knowledge between scenarios as MSR ideally should. The experiments focus on isolated scenario performa

Reviewer 04 · Rating 3 · Confidence 4

Strengths

1. The datasets and code are provided. 2. The topic is meaningful in that researchers can run baseline comparisons using the provided benchmark. 3. The experiments are comprehensive, with each repeated over 10 runs.

Weaknesses

1. Some clarifications are confusing. The terminology requires revision, particularly the use of "recommendation," which does not accurately reflect the concept being discussed. 2. The comparative analysis presented in Table 1 does not sufficiently demonstrate novel contributions relative to existing work in the field. 3. The methodology section would benefit from additional detail, particularly regarding the model parameter selection process and optimization criteria. 4. The motivation and chal

Code & Models

Repositories

xiaopengli1/scenario-wise-rec
PyTorch · Official

Taxonomy

Topics · Simulation Techniques and Applications