LLM-as-a-Judge: Toward World Models for Slate Recommendation Systems
Baptiste Bonin, Maxime Heuillet, Audrey Durand

TL;DR
This paper examines whether large language models can serve as world models of user preferences in slate recommendation, evaluating pairwise reasoning over slates on three tasks spanning different datasets.
Contribution
An empirical study of several LLMs acting as world models of user preferences in slate recommendation, covering three tasks that span different datasets.
Findings
LLMs show promise as world models of user preferences in slate recommendation
Task performance is related to properties of the preference function captured by the LLM
These relationships point to areas where LLM-based preference modeling can be improved
Abstract
Modeling user preferences across domains remains a key challenge in slate recommendation research, where a slate is an ordered sequence of recommended items. We investigate how Large Language Models (LLMs) can act as world models of user preferences through pairwise reasoning over slates. We conduct an empirical study involving several LLMs on three tasks spanning different datasets. Our results reveal relationships between task performance and properties of the preference function captured by the LLMs, pointing to areas for improvement and highlighting the potential of LLMs as world models in recommender systems.
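As an illustration of what pairwise reasoning over slates can look like in an evaluation loop, here is a minimal sketch in Python. It is not the authors' code: the `Judge` interface, the `toy_judge` stand-in for an LLM call, and the two metrics (pairwise accuracy and consistency under swapping the presentation order) are assumptions chosen for illustration. The abstract does not specify which properties of the preference function the paper measures.

```python
from typing import Callable, List, Sequence, Tuple

Slate = Sequence[str]
# Hypothetical judge interface: returns 0 if the first slate is preferred, 1 otherwise.
Judge = Callable[[Slate, Slate], int]
# Each labeled pair is (slate_a, slate_b, label), with label 0 or 1 from ground truth.
LabeledPair = Tuple[Slate, Slate, int]


def pairwise_accuracy(judge: Judge, pairs: List[LabeledPair]) -> float:
    """Fraction of pairs where the judge's choice matches the ground-truth label."""
    if not pairs:
        raise ValueError("pairs must be non-empty")
    correct = sum(judge(a, b) == label for a, b, label in pairs)
    return correct / len(pairs)


def order_consistency(judge: Judge, pairs: List[LabeledPair]) -> float:
    """Fraction of pairs where swapping the two slates flips the judge's answer.

    A judge that encodes a strict preference should pick the same slate
    regardless of which position it is shown in.
    """
    if not pairs:
        raise ValueError("pairs must be non-empty")
    consistent = sum(judge(a, b) == 1 - judge(b, a) for a, b, _ in pairs)
    return consistent / len(pairs)


def toy_judge(a: Slate, b: Slate) -> int:
    """Assumed stand-in for an LLM call: prefers the slate with more liked items.

    Ties go to the first slate, which mimics a position bias.
    """
    liked = {"jazz", "blues"}
    score_a = sum(item in liked for item in a)
    score_b = sum(item in liked for item in b)
    return 0 if score_a >= score_b else 1


if __name__ == "__main__":
    pairs: List[LabeledPair] = [
        (["jazz", "pop"], ["rock", "pop"], 0),
        (["rock"], ["blues", "jazz"], 1),
        (["pop"], ["rock"], 1),  # tie for the toy judge
    ]
    print("accuracy:", pairwise_accuracy(toy_judge, pairs))
    print("order consistency:", order_consistency(toy_judge, pairs))
```

In practice `toy_judge` would be replaced by a function that prompts an LLM with both slates and parses its choice. Querying each pair in both orders doubles the number of calls but exposes position bias that accuracy alone would hide.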
Taxonomy
Topics: Recommender Systems and Techniques · Sentiment Analysis and Opinion Mining · Expert finding and Q&A systems
