RosePO: Aligning LLM-based Recommenders with Human Values
Jiayi Liao, Xiangnan He, Ruobing Xie, Jiancan Wu, Yancheng Yuan, Xingwu Sun, Zhanhui Kang, Xiang Wang

TL;DR
This paper introduces RosePO, a framework that aligns large language model-based recommenders with human values by enhancing helpfulness and harmlessness through preference optimization and bias mitigation techniques.
Contribution
RosePO is a novel framework that explicitly models user preferences and human values during post-training, improving recommendation quality and safety.
Findings
Enhanced recommendation performance on real-world datasets
Reduced semantic hallucination in recommendations
Mitigated popularity bias effectively
Abstract
Recently, there has been a growing interest in leveraging Large Language Models (LLMs) for recommendation systems, which usually adapt a pre-trained LLM to the recommendation scenario through supervised fine-tuning (SFT). However, both the pre-training and SFT stages fail to explicitly model the comparative relationships of a user's preferences on different items. To construct a "helpful and harmless" LLM-based recommender, we propose a general framework -- Recommendation with smoothing personalized Preference Optimization (RosePO), which better aligns with customized human values during the post-training stage. Specifically, in addition to the input and chosen response that naturally align with SFT data, we design a rejected sampling strategy tailored for enhancing helpfulness, along with two strategies aimed at mitigating biases to promote harmlessness. To ensure robustness against…
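The abstract contrasts a chosen response with a rejected one but, as quoted here, does not state the training objective. As a rough illustration of what a pairwise preference loss with smoothing can look like, the sketch below implements a generic DPO-style loss with a label-smoothing term. This is an assumption for illustration, not the paper's confirmed formulation; the function name `smoothed_preference_loss` and the parameters `beta` and `eps` are hypothetical.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def smoothed_preference_loss(
    logp_chosen: float,
    logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,   # hypothetical scale on the implicit reward margin
    eps: float = 0.0,    # hypothetical smoothing weight: assumed chance the pair label is wrong
) -> float:
    """Generic DPO-style loss for one (chosen, rejected) pair with label smoothing.

    Inputs are sequence log-probabilities under the trained policy and under a
    frozen reference model. With eps = 0 this reduces to the plain DPO loss.
    """
    margin = beta * (
        (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    )
    return -(1.0 - eps) * log_sigmoid(margin) - eps * log_sigmoid(-margin)


# Usage: the policy prefers the chosen item more than the reference does,
# so the loss falls below log(2), its value at zero margin.
loss = smoothed_preference_loss(-1.0, -3.0, -2.0, -2.0)
```

In this sketch, a larger `eps` reduces the penalty when the policy ranks the rejected item higher, which is one way to soften the effect of uncertain preference labels.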
Taxonomy
Topics: Semantic Web and Ontologies · Data Quality and Management · Natural Language Processing Techniques
Methods: ALIGN · Shrink and Fine-Tune
