Spanning the Visual Analogy Space with a Weight Basis of LoRAs
Hila Manor, Rinon Gal, Haggai Maron, Tomer Michaeli, Gal Chechik

TL;DR
This paper introduces LoRWeB, a method that composes a basis of LoRA modules to enable flexible and generalizable visual analogy transformations in image manipulation tasks.
Contribution
It proposes a learnable basis of LoRA modules and a dynamic selection mechanism to improve visual analogy learning beyond fixed adaptation modules.
Findings
Achieves state-of-the-art performance on visual analogy benchmarks.
Significantly improves generalization to unseen transformations.
Demonstrates the effectiveness of LoRA basis decomposition for flexible visual manipulation.
Abstract
Visual analogy learning enables image manipulation through demonstration rather than textual description, allowing users to specify complex transformations that are difficult to articulate in words. Given a triplet {a, a′, b}, the goal is to generate b′ such that a : a′ :: b : b′. Recent methods adapt text-to-image models to this task using a single Low-Rank Adaptation (LoRA) module, but they face a fundamental limitation: attempting to capture the diverse space of visual transformations within a fixed adaptation module constrains generalization capabilities. Inspired by recent work showing that LoRAs in constrained domains span meaningful, interpolatable semantic spaces, we propose LoRWeB, a novel approach that specializes the model for each analogy task at inference time through dynamic composition of learned…
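The idea of composing a basis of LoRAs at inference time can be illustrated with a small sketch. The abstract is truncated here and does not say how the mixing coefficients are obtained, so everything below is an assumption made for illustration: the coefficients come from a softmax over dot products between a hypothetical query embedding of the analogy triplet and hypothetical per-basis key vectors, and the names compose_lora_basis, keys, and query are not taken from the paper. The sketch only shows how several low-rank updates B_i A_i could be blended into a single weight delta.

```python
import math
import random

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def compose_lora_basis(basis, keys, query):
    """Blend a basis of LoRA updates into one weight delta (illustrative only).

    basis: list of (B, A) pairs; B is d_out x r, A is r x d_in
    keys:  one hypothetical key vector per basis element (assumption)
    query: hypothetical embedding of the analogy triplet (assumption)
    """
    # Score each basis element by the dot product of its key with the query.
    scores = [sum(k * q for k, q in zip(key, query)) for key in keys]
    coeffs = softmax(scores)
    d_out, d_in = len(basis[0][0]), len(basis[0][1][0])
    delta = [[0.0] * d_in for _ in range(d_out)]
    # Accumulate the coefficient-weighted sum of the low-rank products B_i A_i.
    for c, (B, A) in zip(coeffs, basis):
        BA = matmul(B, A)
        for i in range(d_out):
            for j in range(d_in):
                delta[i][j] += c * BA[i][j]
    return coeffs, delta

def random_matrix(rows, cols, rng):
    """Small Gaussian matrix used as stand-in data."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

# Toy sizes chosen arbitrarily for the example.
rng = random.Random(0)
n_basis, d_out, d_in, rank, d_key = 4, 6, 5, 2, 3
basis = [(random_matrix(d_out, rank, rng), random_matrix(rank, d_in, rng))
         for _ in range(n_basis)]
keys = random_matrix(n_basis, d_key, rng)
query = [rng.gauss(0.0, 1.0) for _ in range(d_key)]

coeffs, delta = compose_lora_basis(basis, keys, query)
```

In this sketch the blended delta would be added to a frozen base weight matrix of shape d_out x d_in. A single fixed LoRA corresponds to the special case of a basis with one element, which is the setting the abstract contrasts against.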
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
