Personalized Image Editing in Text-to-Image Diffusion Models via Collaborative Direct Preference Optimization
Connor Dunlop, Matthew Zheng, Kavana Venkatesh, Pinar Yanardag

TL;DR
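This paper introduces Collaborative Direct Preference Optimization (C-DPO), a framework that personalizes image editing in text-to-image diffusion models. It represents each user as a node in a preference graph, learns user embeddings with a lightweight graph neural network, and feeds those embeddings into a DPO objective so that edits align with individual preferences.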
Contribution
The paper presents what the authors describe as the first method for personalized image editing in diffusion models. It combines a collaborative preference graph with a new DPO objective to produce edits that match each user's preferences.
Findings
Outperforms baselines in aligning edits with user preferences
Effectively shares information across users with similar tastes
Enhances diffusion model editing capabilities with personalized embeddings
Abstract
Text-to-image (T2I) diffusion models have made remarkable strides in generating and editing high-fidelity images from text. Yet, these models remain fundamentally generic, failing to adapt to the nuanced aesthetic preferences of individual users. In this work, we present the first framework for personalized image editing in diffusion models, introducing Collaborative Direct Preference Optimization (C-DPO), a novel method that aligns image edits with user-specific preferences while leveraging collaborative signals from like-minded individuals. Our approach encodes each user as a node in a dynamic preference graph and learns embeddings via a lightweight graph neural network, enabling information sharing across users with overlapping visual tastes. We enhance a diffusion model's editing capabilities by integrating these personalized embeddings into a novel DPO objective, which jointly…
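The two ingredients named in the abstract, information sharing over a user graph and a DPO-style preference loss, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated and does not give the C-DPO objective, so the code shows only the standard DPO loss for one preference pair and a single round of mean-aggregation message passing. The function names `aggregate_neighbors` and `dpo_loss`, the toy graph, and the value of `beta` are all hypothetical.

```python
import math


def aggregate_neighbors(embeddings, adjacency):
    """One round of mean-aggregation message passing (assumed, simplified GNN layer).

    embeddings: dict mapping user id -> list of floats
    adjacency:  dict mapping user id -> list of neighboring user ids
    Each user's new vector is the average of its own vector and its neighbors' vectors.
    """
    updated = {}
    for user, vec in embeddings.items():
        group = [vec] + [embeddings[n] for n in adjacency.get(user, [])]
        updated[user] = [sum(vals) / len(group) for vals in zip(*group)]
    return updated


def dpo_loss(logp_win, logp_lose, ref_logp_win, ref_logp_lose, beta=0.1):
    """Standard DPO loss for a single (preferred, rejected) pair.

    The margin compares how much the trained model favors the preferred
    sample over the rejected one, relative to a frozen reference model.
    """
    margin = beta * ((logp_win - ref_logp_win) - (logp_lose - ref_logp_lose))
    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin)
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Toy example: users "a" and "b" are linked, user "c" has no neighbors.
users = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [4.0, 4.0]}
graph = {"a": ["b"], "b": ["a"]}
shared = aggregate_neighbors(users, graph)

# Loss for a pair where the model already leans toward the preferred edit.
loss = dpo_loss(logp_win=-1.0, logp_lose=-2.0, ref_logp_win=-1.5, ref_logp_lose=-1.5)
```

In this toy graph the linked users "a" and "b" move toward each other while the isolated user "c" is unchanged, which is the intuition behind sharing signal across users with overlapping tastes. How the paper conditions the DPO objective on the learned user embeddings is not recoverable from the truncated abstract.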
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Visual Attention and Saliency Detection · Multimodal Machine Learning Applications
