Personalized Safety Alignment for Text-to-Image Diffusion Models
Yu Lei, Jinbin Bai, Qingyu Shi, Aosong Feng, Hongcheng Gao, Xiao Zhang, Rex Ying

TL;DR
This paper introduces Personalized Safety Alignment (PSA), a framework that adapts safety measures in text-to-image diffusion models to individual user preferences, improving safety and visual quality through user-conditioned modulation.
Contribution
We propose PSA, a novel approach that personalizes safety in generative models using a large-scale user profile dataset and a parameter-efficient adapter, enabling dynamic safety-quality trade-offs.
Findings
PSA strikes a better balance between safety and visual quality than static baselines.
It achieves better safety adherence than prompt-engineering methods.
PSA enables personalized safety adjustments across diverse user profiles.
Abstract
Text-to-image diffusion models have revolutionized visual content generation, yet their deployment is hindered by a fundamental limitation: safety mechanisms enforce rigid, uniform standards that fail to reflect diverse user preferences shaped by age, culture, or personal beliefs. To address this, we propose Personalized Safety Alignment (PSA), a framework that transitions generative safety from static filtration to user-conditioned adaptation. We introduce Sage, a large-scale dataset capturing diverse safety boundaries across 1,000 simulated user profiles, covering complex risks often missed by traditional datasets. By integrating these profiles via a parameter-efficient cross-attention adapter, PSA dynamically modulates generation to align with individual sensitivities. Extensive experiments demonstrate that PSA achieves a calibrated safety-quality trade-off: under permissive…
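To make the idea of a user-conditioned cross-attention adapter concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the class name `UserCrossAttentionAdapter`, the dimensions, the `scale` knob, and the zero-initialized output projection are all illustrative assumptions. It only shows the general pattern in which image tokens attend to user-profile embeddings and the result is added back as a residual.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class UserCrossAttentionAdapter:
    """Hypothetical adapter: image tokens attend to user-profile tokens.

    All names and shapes are assumptions made for illustration only.
    """

    def __init__(self, d_model, d_user, d_attn, rng):
        # Small random projections for queries (image) and keys/values (user).
        self.w_q = rng.normal(0.0, 0.02, (d_model, d_attn))
        self.w_k = rng.normal(0.0, 0.02, (d_user, d_attn))
        self.w_v = rng.normal(0.0, 0.02, (d_user, d_attn))
        # Assumption: a zero-initialized output projection, so the untrained
        # adapter leaves the base model's hidden states unchanged.
        self.w_o = np.zeros((d_attn, d_model))

    def attention(self, hidden, user_tokens):
        """Attention weights of shape (n_image_tokens, n_user_tokens)."""
        q = hidden @ self.w_q
        k = user_tokens @ self.w_k
        scores = q @ k.T / np.sqrt(q.shape[-1])
        return softmax(scores, axis=-1)

    def __call__(self, hidden, user_tokens, scale=1.0):
        """Return hidden states plus a scaled, user-conditioned residual."""
        attn = self.attention(hidden, user_tokens)
        v = user_tokens @ self.w_v
        return hidden + scale * (attn @ v @ self.w_o)


rng = np.random.default_rng(0)
adapter = UserCrossAttentionAdapter(d_model=64, d_user=32, d_attn=16, rng=rng)
hidden = rng.normal(size=(16, 64))       # 16 image tokens from a base layer
user_tokens = rng.normal(size=(4, 32))   # 4 embedding tokens for one user profile
out = adapter(hidden, user_tokens)       # same shape as `hidden`
```

In this sketch only the adapter's four projection matrices would be trained while the base diffusion model stays frozen, which is the sense in which such an adapter is parameter-efficient. Setting `scale` to 0 recovers the unconditioned hidden states.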
