Personalized Image Generation with Large Multimodal Models

Yiyan Xu; Wenjie Wang; Yang Zhang; Biao Tang; Peng Yan; Fuli Feng; Xiangnan He

arXiv:2410.14170 · cs.IR · February 5, 2025 · 2 cites

Open Access · 1 Repo

TL;DR

This paper introduces Pigeon, a framework that uses large multimodal models for personalized image generation. It captures user preferences from noisy data under limited supervision and is demonstrated on sticker and poster generation.

Contribution

The paper presents a novel personalized image generation framework with a two-stage preference alignment scheme and modules for capturing user preferences from noisy data.

Findings

01. Pigeon outperforms baseline models in quantitative metrics.

02. Human evaluations favor Pigeon's generated images.

03. Effective preference alignment reduces data noise impact.

Abstract

Personalized content filtering, such as recommender systems, has become a critical infrastructure to alleviate information overload. However, these systems merely filter existing content and are constrained by its limited diversity, making it difficult to meet users' varied content needs. To address this limitation, personalized content generation has emerged as a promising direction with broad applications. Nevertheless, most existing research focuses on personalized text generation, with relatively little attention given to personalized image generation. The limited work in personalized image generation faces challenges in accurately capturing users' visual preferences and needs from noisy user-interacted images and complex multimodal instructions. Worse still, there is a lack of supervised data for training personalized image generation models. To overcome the challenges, we…

Code & Models

Repositories

yiyanxu/pigeon (PyTorch, official)

Taxonomy

Topics: Image Retrieval and Classification Techniques

Methods: Softmax · Attention Is All You Need · ALIGN