Personalizing Text-to-Image Generation to Individual Taste
Anne-Sofie Maerten, Juliane Verwiebe, Shyamgopal Karthik, Ameya Prabhu, Johan Wagemans, Matthias Bethge

TL;DR
This paper introduces PAMELA, a dataset and framework for modeling personalized aesthetic preferences in text-to-image generation, so that generated images can be tailored to an individual user's taste.
Contribution
It presents a new dataset and predictive model for personalized image evaluation, improving individual preference prediction over existing population-level methods.
Findings
The personalized model predicts individual preferences more accurately than existing methods.
Prompt optimization with the model can steer images towards user-specific tastes.
The dataset includes 70,000 ratings across 5,000 images from diverse domains, with each image rated by 15 users.
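To make the steering idea concrete, here is a minimal sketch of how a per-user reward could be used to choose between candidate prompts. It is not PAMELA's architecture, which this summary does not describe: the linear scoring function, the two feature axes, the user weights, and the names `personalized_score` and `pick_best_candidate` are all hypothetical, and simple best-of-N reranking stands in for whatever prompt optimization procedure the paper uses.

```python
from typing import Dict, Sequence


def personalized_score(
    user_weights: Sequence[float],
    user_bias: float,
    image_features: Sequence[float],
) -> float:
    """Hypothetical per-user reward: a user offset plus a linear term in image features."""
    return user_bias + sum(w * f for w, f in zip(user_weights, image_features))


def pick_best_candidate(
    user_weights: Sequence[float],
    user_bias: float,
    candidates: Dict[str, Sequence[float]],
) -> str:
    """Return the candidate prompt whose (toy) image features score highest for this user."""
    return max(
        candidates,
        key=lambda name: personalized_score(user_weights, user_bias, candidates[name]),
    )


# Toy image features per candidate prompt: [colorfulness, minimalism] (made-up axes).
candidates = {
    "vivid neon street scene": [0.9, 0.1],
    "muted minimalist interior": [0.2, 0.9],
}

# Two hypothetical users with opposite tastes: (weights, bias).
user_a = ([1.0, -0.5], 0.0)  # prefers colorful images
user_b = ([-0.5, 1.0], 0.0)  # prefers minimalist images

best_for_a = pick_best_candidate(user_a[0], user_a[1], candidates)
best_for_b = pick_best_candidate(user_b[0], user_b[1], candidates)
print(best_for_a)
print(best_for_b)
```

The same candidates rank differently for the two users, which is the behavior a population-level reward model cannot express because it assigns a single score per image.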
Abstract
Modern text-to-image (T2I) models generate high-fidelity visuals but remain indifferent to individual user preferences. While existing reward models optimize for "average" human appeal, they fail to capture the inherent subjectivity of aesthetic judgment. In this work, we introduce a novel dataset and predictive framework, called PAMELA, designed to model personalized image evaluations. Our dataset comprises 70,000 ratings across 5,000 diverse images generated by state-of-the-art models (Flux 2 and Nano Banana). Each image is evaluated by 15 unique users, providing a rich distribution of subjective preferences across domains such as art, design, fashion, and cinematic photography. Leveraging this data, we propose a personalized reward model trained jointly on our high-quality annotations and existing aesthetic assessment subsets. We demonstrate that our model predicts individual liking…
