Mitigating Popularity Bias in Counterfactual Explanations using Large Language Models
Arjan Hasami, Masoud Mansoury

TL;DR
This paper introduces a method using large language models to filter user history data, reducing popularity bias in counterfactual explanations and making them more aligned with individual preferences.
Contribution
It proposes a novel pre-processing step leveraging large language models to mitigate popularity bias in counterfactual explanations for recommendation systems.
Findings
Counterfactuals align more closely with each user's popularity preferences than those produced by ACCENT alone.
The approach reduces popularity bias in the generated explanations.
The improvement is demonstrated in experiments on two public datasets.
Abstract
Counterfactual explanations (CFEs) offer a tangible and actionable way to explain recommendations by showing users a "what-if" scenario that demonstrates how small changes in their history would alter the system's output. However, existing CFE methods are susceptible to bias, generating explanations that might misalign with the user's actual preferences. In this paper, we propose a pre-processing step that leverages large language models to filter out-of-character history items before generating an explanation. In experiments on two public datasets, we focus on popularity bias and apply our approach to ACCENT, a neural CFE framework. We find that it creates counterfactuals that are more closely aligned with each user's popularity preferences than ACCENT alone.
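The pre-processing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`item_popularity`, `filter_history`, `make_popularity_judge`) and the `ratio` threshold are hypothetical, and the LLM judgement is replaced by a simple popularity heuristic so the example runs without any model. In practice the `is_out_of_character` callable would presumably wrap an LLM prompt, and the filtered history would then be passed to a CFE method such as ACCENT.

```python
from collections import Counter
from statistics import median
from typing import Callable, Iterable, List, Sequence, Tuple

# A judge decides whether one item is out of character for a given history.
Judge = Callable[[str, Sequence[str]], bool]


def item_popularity(interactions: Iterable[Tuple[str, str]]) -> Counter:
    """Count interactions per item over (user, item) pairs."""
    return Counter(item for _, item in interactions)


def filter_history(history: Sequence[str], is_out_of_character: Judge) -> List[str]:
    """Drop items the judge flags, keeping the original order.

    The judge is pluggable: an LLM-backed function could be supplied here
    (assumption; the prompt design is not specified in the abstract).
    """
    return [item for item in history if not is_out_of_character(item, history)]


def make_popularity_judge(popularity: Counter, ratio: float = 5.0) -> Judge:
    """Heuristic stand-in for the LLM (assumption, for illustration only).

    Flags an item when it is more than `ratio` times as popular as the
    median item in the user's history. It only catches unusually popular
    items, not unusually niche ones.
    """
    def judge(item: str, history: Sequence[str]) -> bool:
        typical = median(popularity[i] for i in history)
        return popularity[item] > ratio * typical
    return judge


# Toy data: user u1 mostly consumes niche items (a, b, c) plus one blockbuster (z).
interactions = [("u1", "a"), ("u1", "b"), ("u2", "b"), ("u1", "c"), ("u1", "z")]
interactions += [(f"u{k}", "z") for k in range(2, 51)]  # z has 50 interactions

popularity = item_popularity(interactions)
judge = make_popularity_judge(popularity)
filtered = filter_history(["a", "b", "c", "z"], judge)
print(filtered)  # ['a', 'b', 'c']
```

Keeping the judge as a separate callable means the filtering step stays independent of both the LLM used and the downstream explanation method.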
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education
