Rank, Don't Generate: Statement-level Ranking for Explainable Recommendation

Ben Kabongo; Arthur Satouf; Vincent Guigue

arXiv:2604.03724·cs.IR·April 7, 2026

TL;DR

This paper recasts explainable recommendation as a statement-level ranking problem, which makes explanations easier to evaluate, and introduces a new benchmark called StaR.

Contribution

It formalizes explainable recommendation as a ranking problem, develops an extraction and clustering pipeline, and introduces the StaR benchmark for evaluation.

Findings

01. Popularity baselines perform well in global ranking.

02. State-of-the-art models excel in item-level ranking.

03. Limitations in personalized explanation ranking are identified.
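To make the first finding concrete, here is a minimal sketch of what a popularity baseline for global statement ranking could look like. It is an illustration only: the function name `popularity_ranking`, the `(user, item, statements)` tuple layout, and the statement identifiers are all hypothetical, and the baselines actually evaluated in the paper may be defined differently.

```python
from collections import Counter

def popularity_ranking(train_interactions, k=3):
    """Return the k statements that occur in the most training interactions.

    Hypothetical data layout: each interaction is (user_id, item_id, statements).
    """
    counts = Counter(
        stmt for _, _, stmts in train_interactions for stmt in set(stmts)
    )
    # Sort by descending count, then by statement id for a deterministic order.
    ranked = sorted(counts, key=lambda s: (-counts[s], s))
    return ranked[:k]

# Toy data, invented for illustration.
train = [
    ("u1", "i1", ["s_battery_long", "s_screen_bright"]),
    ("u2", "i1", ["s_battery_long"]),
    ("u3", "i2", ["s_battery_long", "s_price_high"]),
    ("u1", "i2", ["s_price_high"]),
]

print(popularity_ranking(train, k=2))
```

Such a baseline returns the same ranking for every user and item, which is why it says nothing about personalization.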

Abstract

Textual explanations, generated with large language models (LLMs), are increasingly used to justify recommendations. Yet, evaluating these explanations remains a critical challenge. We advocate a shift in objective: rank, don't generate. We formalize explainable recommendation as a statement-level ranking problem, where systems rank candidate explanatory statements derived from reviews and return the top-k as explanation. This formulation mitigates hallucination by construction and enables fine-grained factual analysis. It also models factor importance through relevance scores and supports standardized, reproducible evaluation with established ranking metrics. Meaningful assessment, however, requires each statement to be explanatory (item facts affecting user experience), atomic (one opinion about one aspect), and unique (paraphrases consolidated), which is challenging to obtain from…
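The ranking formulation described in the abstract can be sketched in a few lines: score each candidate statement, return the top-k as the explanation, and evaluate with a standard ranking metric. The sketch below uses NDCG@k as one example of an established metric. The scores, relevance values, and function names are assumptions made for illustration and are not taken from the paper.

```python
import math

def top_k(scores, k):
    """Return the k highest-scoring statement ids (ties broken by id)."""
    ranked = sorted(scores, key=lambda s: (-scores[s], s))
    return ranked[:k]

def dcg(gains):
    """Discounted cumulative gain of a list of gains in rank order."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg_at_k(ranked, relevance, k):
    """NDCG@k of a ranked list against ground-truth relevance scores."""
    gains = [relevance.get(s, 0.0) for s in ranked[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical model scores for four candidate statements of one user-item pair.
scores = {"s1": 0.9, "s2": 0.2, "s3": 0.7, "s4": 0.1}
# Hypothetical ground truth: statements found in the user's review of the item.
relevance = {"s1": 1.0, "s2": 1.0}

explanation = top_k(scores, k=2)
ndcg = ndcg_at_k(explanation, relevance, k=2)
print(explanation, round(ndcg, 4))
```

Because every returned statement comes from a fixed candidate pool, the output cannot contain text that was never extracted from a review, which is the sense in which ranking avoids hallucination by construction.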
