Stellar: Systematic Evaluation of Human-Centric Personalized Text-to-Image Methods
Panos Achlioptas, Alexandros Benetatos, Iordanis Fostiropoulos, Dimitris Skourtis

TL;DR
This paper introduces Stellar, a comprehensive dataset and evaluation framework for personalized text-to-image generation, along with a new baseline that outperforms existing methods without test-time fine-tuning.
Contribution
The work provides a large, high-quality dataset (Stellar), new metrics aligned with human judgment, and a simple baseline that achieves state-of-the-art results in personalized text-to-image synthesis.
Findings
The Stellar dataset is an order of magnitude larger than existing relevant datasets.
The new metrics correlate better with human judgment.
The proposed baseline outperforms previous methods in quality and efficiency.
Abstract
In this work, we systematically study the problem of personalized text-to-image generation, where the output image is expected to portray information about specific human subjects. E.g., generating images of oneself appearing at imaginative places, interacting with various items, or engaging in fictional activities. To this end, we focus on text-to-image systems that input a single image of an individual to ground the generation process along with text describing the desired visual context. Our first contribution is to fill the literature gap by curating high-quality, appropriate data for this task. Namely, we introduce a standardized dataset (Stellar) that contains personalized prompts coupled with images of individuals that is an order of magnitude larger than existing relevant datasets and where rich semantic ground-truth annotations are readily available. Having established Stellar…
Taxonomy
Topics: Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques · Generative Adversarial Networks and Image Synthesis
