ImageGem: In-the-wild Generative Image Interaction Dataset for Generative Model Personalization

Yuanhe Guo; Linxi Xie; Zhuoran Chen; Kangrui Yu; Ryan Po; Guandao Yang; Gordon Wetzstein; Hongyi Wen

arXiv:2510.18433·cs.CV·October 22, 2025

TL;DR

ImageGem is a dataset of real-world user interactions and preferences (57K users, 242K customized LoRAs, 3M text prompts, 5M generated images) that supports personalized generative image modeling, retrieval, and editing.

Contribution

We introduce ImageGem, the first large-scale in-the-wild dataset with detailed user preferences, facilitating personalized generative model training and editing.

Findings

01. Improved preference alignment models trained with ImageGem data

02. Enhanced personalized image retrieval and model recommendation performance

03. Effective end-to-end framework for editing diffusion models based on user preferences

Abstract

We introduce ImageGem, a dataset for studying generative models that understand fine-grained individual preferences. We posit that a key challenge hindering the development of such a generative model is the lack of in-the-wild and fine-grained user preference annotations. Our dataset features real-world interaction data from 57K users, who collectively have built 242K customized LoRAs, written 3M text prompts, and created 5M generated images. With user preference annotations from our dataset, we were able to train better preference alignment models. In addition, leveraging individual user preference, we investigated the performance of retrieval models and a vision-language model on personalized image retrieval and generative model recommendation. Finally, we propose an end-to-end framework for editing customized diffusion models in a latent weight space to align with individual user…

Taxonomy

Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Data Visualization and Analytics