SteerEval: A Framework for Evaluating Steerability with Natural Language Profiles for Recommendation

Joyce Zhou; Weijie Zhou; Doug Turnbull; Thorsten Joachims

arXiv:2601.21105 · cs.IR · January 30, 2026

TL;DR

SteerEval is a new framework for evaluating how well natural-language user profiles can steer recommendations across diverse and nuanced user preferences, addressing limitations of existing benchmarks.

Contribution

We introduce SteerEval, a comprehensive evaluation framework for measuring nuanced steerability in natural-language recommender systems, highlighting its potential and limitations.

Findings

01. Natural-language profiles enable explicit user preference articulation.

02. SteerEval reveals varying effectiveness of steering across different content types.

03. Different intervention strategies impact steerability outcomes.

Abstract

Natural-language user profiles have recently attracted attention not only for improved interpretability, but also for their potential to make recommender systems more steerable. By enabling direct editing, natural-language profiles allow users to explicitly articulate preferences that may be difficult to infer from past behavior. However, it remains unclear whether current natural-language-based recommendation methods can follow such steering commands. While existing steerability evaluations have shown some success for well-recognized item attributes (e.g., movie genres), we argue that these benchmarks fail to capture the richer forms of user control that motivate steerable recommendations. To address this gap, we introduce SteerEval, an evaluation framework designed to measure more nuanced and diverse forms of steerability by using interventions that range from genres to…

Taxonomy

Topics: Recommender Systems and Techniques · Explainable Artificial Intelligence (XAI) · Sentiment Analysis and Opinion Mining