AgentRecBench: Benchmarking LLM Agent-based Personalized Recommender Systems
Yu Shang, Peijie Liu, Yuwei Yan, Zijing Wu, Leheng Sheng, Yuanqing Yu, Chumeng Jiang, An Zhang, Fengli Xu, Yu Wang, Min Zhang, Yong Li

TL;DR
This paper introduces AgentRecBench, a comprehensive benchmark environment for evaluating LLM-powered agentic recommender systems, highlighting their advantages over traditional methods through standardized testing and community engagement.
Contribution
It presents a novel benchmark framework, an interactive recommendation simulator, and a comparative study of classical and agentic recommendation methods.
Findings
Agentic systems outperform classical recommendation methods.
The benchmark environment is validated and publicly available.
Guidelines for designing effective agentic recommender systems.
Abstract
The emergence of agentic recommender systems powered by Large Language Models (LLMs) represents a paradigm shift in personalized recommendations, leveraging LLMs' advanced reasoning and role-playing capabilities to enable autonomous, adaptive decision-making. Unlike traditional recommendation approaches, agentic recommender systems can dynamically gather and interpret user-item interactions from complex environments, generating robust recommendation strategies that generalize across diverse scenarios. However, the field currently lacks standardized evaluation protocols to systematically assess these methods. To address this critical gap, we propose: (1) an interactive textual recommendation simulator incorporating rich user and item metadata and three typical evaluation scenarios (classic, evolving-interest, and cold-start recommendation tasks); (2) a unified modular framework for…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
