SafeCRS: Personalized Safety Alignment for LLM-Based Conversational Recommender Systems
Haochang Hao, Yifan Xu, Xinzhuo Li, Yingqiang Ge, Lu Cheng

TL;DR
This paper introduces SafeCRS, a safety-aware training framework for LLM-based conversational recommender systems (CRS) that respects personalized safety constraints, reducing safety violation rates by up to 96.5% while maintaining competitive recommendation quality.
Contribution
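The paper formalizes personalized safety in CRS, introduces the SafeRec benchmark for evaluating safety risks under user-specific constraints, and proposes SafeCRS, a training framework that combines Safe Supervised Fine-Tuning (Safe-SFT) with Safe Group reward-Decoupled Normalization Policy Optimization (Safe-GDPO).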
Findings
Safety violation rates reduced by up to 96.5%
Maintains competitive recommendation quality
Provides a new benchmark for safety evaluation in CRS
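To make the headline number concrete, the following is a minimal sketch of how a safety violation rate and its relative reduction could be computed. It is not the paper's metric definition, which this summary does not give. The function names, the per-user unsafe-item sets, and the example rates 0.2 and 0.007 are all hypothetical, chosen only so that the arithmetic yields a 96.5% reduction.

```python
from typing import List, Set


def violation_rate(recs: List[List[str]], unsafe: List[Set[str]]) -> float:
    """Fraction of recommended items that fall in the matching user's unsafe set.

    Assumption: recs[i] is the recommendation list for conversation i and
    unsafe[i] is the set of items that violate that user's safety constraints.
    """
    total = sum(len(r) for r in recs)
    if total == 0:
        return 0.0
    violations = sum(1 for r, u in zip(recs, unsafe) for item in r if item in u)
    return violations / total


def relative_reduction(baseline: float, treated: float) -> float:
    """Relative drop in violation rate, as a fraction of the baseline rate."""
    if baseline <= 0:
        raise ValueError("baseline rate must be positive")
    return (baseline - treated) / baseline


# Hypothetical data: one unsafe item among four recommendations.
recs = [["A", "B"], ["C", "D"]]
unsafe = [{"A"}, set()]
rate = violation_rate(recs, unsafe)

# Hypothetical rates: a drop from 20% to 0.7% is a 96.5% relative reduction.
reduction = relative_reduction(0.2, 0.007)
```

Under this reading, "reduced by up to 96.5%" is a relative figure: the violation rate after training is 3.5% of the baseline rate, not 96.5 percentage points lower.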
Abstract
Current LLM-based conversational recommender systems (CRS) primarily optimize recommendation accuracy and user satisfaction. We identify an underexplored vulnerability in which recommendation outputs may negatively impact users by violating personalized safety constraints, when individualized safety sensitivities -- such as trauma triggers, self-harm history, or phobias -- are implicitly inferred from the conversation but not respected during recommendation. We formalize this challenge as personalized CRS safety and introduce SafeRec, a new benchmark dataset designed to systematically evaluate safety risks in LLM-based CRS under user-specific constraints. To further address this problem, we propose SafeCRS, a safety-aware training framework that integrates Safe Supervised Fine-Tuning (Safe-SFT) with Safe Group reward-Decoupled Normalization Policy Optimization (Safe-GDPO) to jointly…
Taxonomy
Topics: Recommender Systems and Techniques · Emotion and Mood Recognition · Topic Modeling
