Latent Space Factorization in LoRA
Shashi Kumar, Yacouba Kaloga, John Mitros, Petr Motlicek, Ina Kodrasi

TL;DR
FVAE-LoRA enhances low-rank adaptation by explicitly disentangling task-relevant features from residual information using a VAE, leading to improved performance and robustness across multiple modalities.
Contribution
Introduces FVAE-LoRA, a novel method that uses a VAE to explicitly factorize task-relevant and residual information in low-rank adaptation.
Findings
Outperforms standard LoRA on text, audio, and image tasks.
Better isolates task-relevant signals, improving robustness.
Demonstrates effectiveness across diverse modalities.
Abstract
Low-rank adaptation (LoRA) is a widely used method for parameter-efficient finetuning. However, existing LoRA variants lack mechanisms to explicitly disambiguate task-relevant information within the learned low-rank subspace, potentially limiting downstream performance. We propose Factorized Variational Autoencoder LoRA (FVAE-LoRA), which leverages a VAE to learn two distinct latent spaces. Our novel Evidence Lower Bound formulation explicitly promotes factorization between the latent spaces, dedicating one latent space to task-salient features and the other to residual information. Extensive experiments on text, audio, and image tasks demonstrate that FVAE-LoRA consistently outperforms standard LoRA. Moreover, spurious correlation evaluations confirm that FVAE-LoRA better isolates task-relevant signals, leading to improved robustness under distribution shifts. Our code is publicly…
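The two ingredients named in the abstract, a low-rank update and a VAE with two latent spaces, can be illustrated with a small NumPy sketch. This is a hedged illustration and not the authors' implementation: the layer sizes, the linear encoder heads, the `B_up` projection, and the choice to route only `z1` into the downstream path are all assumptions made for the example. The KL term shown is the generic Gaussian-to-standard-normal form used in ordinary VAEs; the paper's own ELBO, including its factorization-promoting term, is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: input/output width and low rank r.
d_in, d_out, r = 8, 8, 2
alpha = 4.0  # hypothetical LoRA scaling factor

W0 = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, zero-initialised


def lora_forward(x):
    """Standard LoRA forward pass: W0 x + (alpha / r) * B (A x)."""
    return W0 @ x + (alpha / r) * (B @ (A @ x))


# Assumed encoder: one linear head per Gaussian parameter of each latent space.
heads = ("mu1", "logvar1", "mu2", "logvar2")
enc = {name: rng.normal(size=(r, d_in)) * 0.1 for name in heads}


def encode(x):
    """Return mean and log-variance vectors for both latent spaces."""
    return {name: w @ x for name, w in enc.items()}


def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def kl_to_standard_normal(mu, logvar):
    """Closed-form KL(N(mu, diag(exp(logvar))) || N(0, I))."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)


x = rng.normal(size=d_in)
stats = encode(x)
z1 = reparameterize(stats["mu1"], stats["logvar1"])  # meant for task-salient features
z2 = reparameterize(stats["mu2"], stats["logvar2"])  # meant for residual information

# Assumption: only z1 is projected back into the layer output.
B_up = rng.normal(size=(d_out, r)) * 0.01
h_task = W0 @ x + B_up @ z1

# Generic per-latent KL regularisers (not the paper's full objective).
kl_total = (kl_to_standard_normal(stats["mu1"], stats["logvar1"])
            + kl_to_standard_normal(stats["mu2"], stats["logvar2"]))
```

Because `B` starts at zero, `lora_forward` initially reproduces the frozen layer, which is the usual LoRA initialisation. In this sketch `z2` never reaches `h_task`; in a full model it would be used alongside `z1` only for reconstruction.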
Peer Reviews
Decision: NeurIPS 2025 poster
Strengths: * The paper is well-written and clearly structured. * The idea of integrating a VAE into LoRA is interesting: a VAE naturally brings more variability into the low-dimensional space, and the same inference format is an extra plus. * The evaluation across diverse modalities (image, text, and audio tasks) demonstrates the versatility and general applicability of the method. Weaknesses: * Performance-wise, the advantage of FVAE-LoRA over HiRA is not clearly demonstrated. While both meth
Strengths: Novelty: To my knowledge, doing disentanglement in a PEFT method is a novel idea, so I appreciate the contribution. At the same time, it seems very surprising that such a method should work at all, since the variational formulation, while useful for factoring the low-rank space, does not seem directly related to the task at hand (see first two Questions). I can see it justified by the fact that when using both the task loss and the VAE loss, only $\mathbf{z}_1$ is involved i
Strengths: - the paper introduces a carefully designed objective and has run extensive empirical experiments to demonstrate the effectiveness of the proposed method. - the paper is clearly written and easy to follow. Weaknesses: - the improvements of the proposed method compared to LoRA are small on some benchmarks; sometimes the improvements are within the error-bar range (e.g. WATERBIRDS). - error bars (stddev) are missing in Table 2. - lack of hyperparameter tuning guide and analysis. - lack of inte
Strengths: 1. The integration of factorized VAE with LoRA is conceptually interesting, and the formulation with the cross-prior regularization term (Γ) is well-motivated. 2. The experimental evaluation spans multiple modalities (vision, text, audio) and includes diverse datasets, demonstrating the generality of the approach. 3. The spurious correlation experiments provide compelling evidence that the method indeed learns more robust representations. Weaknesses: 1. FVAE-LoRA replaces LoRA's
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Face recognition and analysis
