RosePO: Aligning LLM-based Recommenders with Human Values

Jiayi Liao; Xiangnan He; Ruobing Xie; Jiancan Wu; Yancheng Yuan; Xingwu Sun; Zhanhui Kang; Xiang Wang

arXiv:2410.12519 · cs.IR · October 17, 2024

Open Access

TL;DR

This paper introduces RosePO, a framework that aligns large language model-based recommenders with human values by enhancing helpfulness and harmlessness through preference optimization and bias mitigation techniques.

Contribution

RosePO is a novel framework that explicitly models user preferences and human values during post-training, improving recommendation quality and safety.

Findings

01. Enhanced recommendation performance on real-world datasets

02. Reduced semantic hallucination in recommendations

03. Mitigated popularity bias effectively

Abstract

Recently, there has been a growing interest in leveraging Large Language Models (LLMs) for recommendation systems, which usually adapt a pre-trained LLM to the recommendation scenario through supervised fine-tuning (SFT). However, both the pre-training and SFT stages fail to explicitly model the comparative relationships of a user's preferences on different items. To construct a "helpful and harmless" LLM-based recommender, we propose a general framework -- Recommendation with smoothing personalized Preference Optimization (RosePO), which better aligns with customized human values during the post-training stage. Specifically, in addition to the input and chosen response that naturally align with SFT data, we design a rejected sampling strategy tailored for enhancing helpfulness, along with two strategies aimed at mitigating biases to promote harmlessness. To ensure robustness against…
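To make the "chosen response" versus "rejected response" setup concrete, here is a minimal sketch of a pairwise preference loss over one (input, chosen, rejected) triple. It assumes the standard DPO-style objective with a generic label-smoothing term; the truncated abstract does not give RosePO's actual objective, so the function name `smoothed_preference_loss`, the parameters `beta` and `smoothing`, and the example log-probabilities are all hypothetical illustrations rather than the authors' method.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def smoothed_preference_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,        # assumed temperature, as in DPO-style losses
    smoothing: float = 0.0,   # assumed probability that the pair label is wrong
) -> float:
    """DPO-style loss for one preference pair, with generic label smoothing.

    This is an illustrative stand-in, not the objective defined in the paper.
    """
    # Implicit reward margin: how much more the policy favors the chosen item
    # over the rejected item, relative to the frozen reference model.
    margin = beta * (
        (policy_chosen_logp - ref_chosen_logp)
        - (policy_rejected_logp - ref_rejected_logp)
    )
    # smoothing = 0 gives the plain pairwise loss; smoothing > 0 hedges
    # against pairs whose "rejected" item the user might actually like.
    return -(1.0 - smoothing) * log_sigmoid(margin) - smoothing * log_sigmoid(-margin)


# Hypothetical sequence log-probabilities for one user history (the input),
# the item the user actually interacted with (chosen), and a sampled
# negative item (rejected).
loss = smoothed_preference_loss(
    policy_chosen_logp=-1.0,
    policy_rejected_logp=-3.0,
    ref_chosen_logp=-2.0,
    ref_rejected_logp=-2.0,
    smoothing=0.1,
)
print(f"loss = {loss:.4f}")
```

In this sketch, the choice of which item to use as the rejected response is where a sampling strategy such as the ones the abstract mentions would plug in; the loss itself only sees the resulting pair.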

Taxonomy

Topics: Semantic Web and Ontologies · Data Quality and Management · Natural Language Processing Techniques

Methods: ALIGN · Shrink and Fine-Tune