Personalized Safety Alignment for Text-to-Image Diffusion Models

Yu Lei; Jinbin Bai; Qingyu Shi; Aosong Feng; Hongcheng Gao; Xiao Zhang; Rex Ying

arXiv:2508.01151·cs.CV·February 6, 2026

TL;DR

This paper introduces Personalized Safety Alignment (PSA), a framework that adapts safety measures in text-to-image diffusion models to individual user preferences, improving safety and visual quality through user-conditioned modulation.

Contribution

We propose PSA, a novel approach that personalizes safety in generative models using a large-scale user profile dataset and a parameter-efficient adapter, enabling dynamic safety-quality trade-offs.

Findings

01. PSA outperforms static baselines in safety and quality balance.

02. It achieves better safety adherence than prompt-engineering methods.

03. PSA enables personalized safety adjustments across diverse user profiles.

Abstract

Text-to-image diffusion models have revolutionized visual content generation, yet their deployment is hindered by a fundamental limitation: safety mechanisms enforce rigid, uniform standards that fail to reflect diverse user preferences shaped by age, culture, or personal beliefs. To address this, we propose Personalized Safety Alignment (PSA), a framework that transitions generative safety from static filtration to user-conditioned adaptation. We introduce Sage, a large-scale dataset capturing diverse safety boundaries across 1,000 simulated user profiles, covering complex risks often missed by traditional datasets. By integrating these profiles via a parameter-efficient cross-attention adapter, PSA dynamically modulates generation to align with individual sensitivities. Extensive experiments demonstrate that PSA achieves a calibrated safety-quality trade-off: under permissive…
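The abstract names a parameter-efficient cross-attention adapter but does not spell out its design here. As a rough illustration only, the sketch below shows one common way such an adapter can be wired: a frozen text cross-attention branch plus a small, separately parameterized branch that attends to user-profile tokens. Everything in it is an assumption for illustration, not the authors' implementation: the class name `UserConditionedCrossAttention`, the `scale` parameter, the zero-initialized user value projection, and the use of NumPy in place of a real diffusion backbone.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Single-head scaled dot-product attention.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

class UserConditionedCrossAttention:
    """Hypothetical adapter: base text attention plus a user-profile branch."""

    def __init__(self, dim, rng, scale=1.0):
        # Projections standing in for the frozen base model (assumed).
        self.w_q = rng.normal(0.0, 0.02, (dim, dim))
        self.w_k_text = rng.normal(0.0, 0.02, (dim, dim))
        self.w_v_text = rng.normal(0.0, 0.02, (dim, dim))
        # Adapter parameters: the only weights that would be trained.
        self.w_k_user = rng.normal(0.0, 0.02, (dim, dim))
        # Zero init makes the adapter a no-op before training (assumed choice).
        self.w_v_user = np.zeros((dim, dim))
        self.scale = scale

    def __call__(self, hidden, text_tokens, user_tokens):
        q = hidden @ self.w_q
        base = attention(q, text_tokens @ self.w_k_text, text_tokens @ self.w_v_text)
        user = attention(q, user_tokens @ self.w_k_user, user_tokens @ self.w_v_user)
        # The user branch shifts the base output; scale controls its strength.
        return base + self.scale * user

rng = np.random.default_rng(0)
layer = UserConditionedCrossAttention(dim=8, rng=rng)
hidden = rng.normal(size=(4, 8))        # 4 image latent positions
text_tokens = rng.normal(size=(5, 8))   # 5 prompt tokens
user_tokens = rng.normal(size=(2, 8))   # 2 user-profile tokens (hypothetical encoding)
out = layer(hidden, text_tokens, user_tokens)
print(out.shape)
```

In this sketch, setting `scale` to 0 recovers the base model's output exactly, which is one simple way to picture how a user-conditioned branch could modulate generation without modifying the frozen weights.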
