EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM
Zhuofan Zong, Dongzhi Jiang, Bingqi Ma, Guanglu Song, Hao Shao, Dazhong Shen, Yu Liu, Hongsheng Li

TL;DR
EasyRef is a novel plug-and-play method that enables diffusion models to condition on multiple reference images and text prompts by leveraging multimodal large language models, improving image generation quality and generalization.
Contribution
The paper introduces EasyRef, a new approach that combines multimodal LLMs with diffusion models for multi-reference image generation without extensive fine-tuning.
Findings
Outperforms existing tuning-free and tuning-based methods in image quality.
Achieves robust zero-shot generalization across diverse domains.
Introduces MRBench, a new benchmark for multi-reference image generation.
Abstract
Significant achievements in personalization of diffusion models have been witnessed. Conventional tuning-free methods mostly encode multiple reference images by averaging their image embeddings as the injection condition, but such an image-independent operation cannot perform interaction among images to capture consistent visual elements within multiple references. Although the tuning-based Low-Rank Adaptation (LoRA) can effectively extract consistent elements within multiple images through the training process, it necessitates specific finetuning for each distinct image group. This paper introduces EasyRef, a novel plug-and-play adaptation method that enables diffusion models to be conditioned on multiple reference images and the text prompt. To effectively exploit consistent visual elements within multiple images, we leverage the multi-image comprehension and instruction-following…
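The averaging baseline that the abstract criticizes can be illustrated with a minimal sketch. It assumes each reference image has already been encoded independently into a fixed-length vector; the function name `average_reference_embeddings` and the toy values are hypothetical and are not taken from the paper or from any library.

```python
from typing import List, Sequence


def average_reference_embeddings(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of per-image embeddings.

    Each embedding is assumed to come from encoding one image on its own,
    so no information is exchanged between images before the mean is taken.
    """
    if not embeddings:
        raise ValueError("need at least one reference embedding")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must share one dimension")
    n = len(embeddings)
    return [sum(e[i] for e in embeddings) / n for i in range(dim)]


# Toy 3-dimensional vectors standing in for image-encoder outputs (made-up values).
refs = [
    [1.0, 0.0, 2.0],
    [1.0, 4.0, 0.0],
    [1.0, 2.0, 4.0],
]

# The single vector that would be injected as the image condition.
condition = average_reference_embeddings(refs)
```

In this toy case the first component, which all three references share, passes through the mean unchanged, while the components that differ are blended with equal weight. Nothing in the operation distinguishes elements that are consistent across the group from elements that are incidental to one image, which is the limitation the abstract attributes to image-independent averaging.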
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging
Methods: Diffusion
