On Faithfulness and Coherence of Language Explanations for Recommendation Systems
Zhouhang Xie, Julian McAuley, Bodhisattwa Prasad Majumder

TL;DR
This paper evaluates the faithfulness and coherence of generated reviews used as explanations in recommendation systems, revealing that current models produce explanations that are brittle and require further validation.
Contribution
The study systematically probes state-of-the-art review generation models to assess the reliability of their explanations for predicted ratings.
Findings
Generated reviews are brittle and not fully faithful as explanations.
Current models need further evaluation before explanations can be trusted.
Generated explanations may not accurately reflect the true rationale behind ratings.
Abstract
Reviews contain rich information about product characteristics and user interests and thus are commonly used to boost recommender system performance. Specifically, previous work shows that jointly learning to perform review generation improves rating prediction performance. Meanwhile, these model-produced reviews serve as recommendation explanations, providing the user with insights on predicted ratings. However, while existing models could generate fluent, human-like reviews, it is unclear to what degree the reviews fully uncover the rationale behind the jointly predicted rating. In this work, we perform a series of evaluations that probe state-of-the-art models and their review generation component. We show that the generated explanations are brittle and need further evaluation before being taken as literal rationales for the estimated ratings.
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Machine Learning in Materials Science
