K-LoRA: Unlocking Training-Free Fusion of Any Subject and Style LoRAs
Ziheng Ouyang, Zhen Li, Qibin Hou

TL;DR
K-LoRA is a training-free method that fuses a learned subject LoRA and a learned style LoRA: in each attention layer it compares the Top-K elements of the two LoRAs and selects which one to use, preserving both subject and style in diffusion models.
Contribution
The paper introduces a training-free fusion technique for LoRAs that improves subject-style integration by using the intrinsic properties of LoRA to guide merging in diffusion models.
Findings
Outperforms state-of-the-art training-based fusion methods
Effectively preserves subject and style during fusion
Demonstrates strong qualitative and quantitative results
Abstract
Recent studies have explored combining different LoRAs to jointly generate learned style and content. However, existing methods either fail to effectively preserve both the original subject and style simultaneously or require additional training. In this paper, we argue that the intrinsic properties of LoRA can effectively guide diffusion models in merging learned subject and style. Building on this insight, we propose K-LoRA, a simple yet effective training-free LoRA fusion approach. In each attention layer, K-LoRA compares the Top-K elements in each LoRA to be fused, determining which LoRA to select for optimal fusion. This selection mechanism ensures that the most representative features of both subject and style are retained during the fusion process, effectively balancing their contributions. Experimental results demonstrate that the proposed method effectively integrates the…
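The per-layer selection described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract only says that the Top-K elements of each LoRA are compared, so scoring a LoRA by the sum of the K largest absolute entries of its low-rank update is an assumption, and the names `lora_delta`, `topk_abs_sum`, and `select_lora` are hypothetical.

```python
import numpy as np


def lora_delta(B, A):
    """Low-rank weight update of one layer, delta_W = B @ A."""
    return B @ A


def topk_abs_sum(delta, k):
    """Sum of the k largest absolute entries of a weight update.

    Assumption: this scalar stands in for "comparing the Top-K elements";
    the abstract does not specify the exact scoring rule.
    """
    flat = np.abs(delta).ravel()
    if k < 1:
        raise ValueError("k must be at least 1")
    k = min(k, flat.size)
    # np.partition moves the k largest values into the last k positions.
    largest = np.partition(flat, flat.size - k)[flat.size - k:]
    return float(largest.sum())


def select_lora(delta_subject, delta_style, k):
    """Pick, for one attention layer, the update with the larger Top-K score."""
    score_subject = topk_abs_sum(delta_subject, k)
    score_style = topk_abs_sum(delta_style, k)
    if score_subject >= score_style:
        return "subject", delta_subject
    return "style", delta_style


# Toy updates for a single layer (illustrative values only).
subject_update = np.array([[3.0, 0.0], [0.0, 0.0]])
style_update = np.array([[2.0, 2.0], [2.0, 2.0]])
choice_k1, _ = select_lora(subject_update, style_update, k=1)
choice_k2, _ = select_lora(subject_update, style_update, k=2)
```

In this toy case the outcome depends on K: a single dominant entry favors the subject update when K is 1, while the denser style update wins once K is 2. Applying `select_lora` independently to every attention layer would yield a fused model in which each layer uses exactly one of the two LoRAs.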
Taxonomy
Topics: Context-Aware Activity Recognition Systems · Robotics and Automated Systems · Human Pose and Action Recognition
Methods: Softmax · Attention Is All You Need · Diffusion
