Multi-LoRA Composition for Image Generation
Ming Zhong, Yelong Shen, Shuohang Wang, Yadong Lu, Yizhu Jiao, Siru Ouyang, Donghan Yu, Jiawei Han, Weizhu Chen

TL;DR
This paper introduces two training-free methods for composing multiple LoRAs in image generation, improving the synthesis of complex images by effectively combining diverse LoRAs without additional training.
Contribution
It proposes novel, training-free techniques for multi-LoRA composition and establishes a new benchmark for evaluating their effectiveness in image generation.
Findings
Methods outperform baseline in complex LoRA compositions
The advantage over the baseline grows as more LoRAs are combined
Evaluation with GPT-4V confirms effectiveness
Abstract
Low-Rank Adaptation (LoRA) is extensively utilized in text-to-image models for the accurate rendition of specific elements like distinct characters or unique styles in generated images. Nonetheless, existing methods face challenges in effectively composing multiple LoRAs, especially as the number of LoRAs to be integrated grows, thus hindering the creation of complex imagery. In this paper, we study multi-LoRA composition through a decoding-centric perspective. We present two training-free methods: LoRA Switch, which alternates between different LoRAs at each denoising step, and LoRA Composite, which simultaneously incorporates all LoRAs to guide more cohesive image synthesis. To evaluate the proposed approaches, we establish ComposLoRA, a new comprehensive testbed as part of this research. It features a diverse range of LoRA categories with 480 composition sets. Utilizing an evaluation…
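The two decoding strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the round-robin order, and the toy noise vectors are all hypothetical. For LoRA Composite, the abstract only says that all LoRAs are incorporated simultaneously; averaging per-LoRA guided predictions in a classifier-free-guidance style is an assumption made here for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def lora_switch_schedule(num_steps: int, num_loras: int) -> List[int]:
    """Return the index of the single LoRA active at each denoising step.

    Assumption: LoRAs are cycled in round-robin order, one per step.
    """
    if num_loras < 1:
        raise ValueError("need at least one LoRA")
    return [step % num_loras for step in range(num_steps)]


def lora_composite_guidance(
    uncond: Sequence[Vector],
    cond: Sequence[Vector],
    scale: float,
) -> Vector:
    """Combine all LoRAs at one denoising step.

    uncond[i] / cond[i] are the unconditional / conditional noise
    predictions obtained with LoRA i enabled (toy vectors here).
    Assumption: each LoRA's guided prediction is
    uncond + scale * (cond - uncond), and the results are averaged.
    """
    if not cond or len(uncond) != len(cond):
        raise ValueError("need matching, non-empty prediction lists")
    k = len(cond)
    dim = len(cond[0])
    combined = []
    for j in range(dim):
        total = sum(u[j] + scale * (c[j] - u[j]) for u, c in zip(uncond, cond))
        combined.append(total / k)
    return combined


if __name__ == "__main__":
    # Three LoRAs over five denoising steps: one LoRA active per step.
    print(lora_switch_schedule(5, 3))
    # Two LoRAs, two-dimensional toy predictions, guidance scale 2.0.
    print(lora_composite_guidance([[0, 0], [0, 0]], [[1, 2], [3, 4]], 2.0))
```

In this sketch LoRA Switch only decides which adapter is active at a given step, while LoRA Composite needs a forward pass per LoRA at every step, so its cost grows with the number of LoRAs.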
Taxonomy
Topics: Medical Image Segmentation Techniques · Robotics and Sensor-Based Localization
