FLoRA: Fused forward-backward adapters for parameter efficient fine-tuning and reducing inference-time latencies of LLMs

Dhananjaya Gowda; Seoha Song; Junhyun Lee; Harshith Goka

arXiv:2511.00050·cs.LG·November 4, 2025

TL;DR

FLoRA introduces fused forward-backward adapters that enhance fine-tuning accuracy and reduce inference latency of large language models by combining ideas from LoRA and parallel adapters.

Contribution

The paper proposes FLoRA, a novel fused adapter method that improves fine-tuning performance and efficiency of LLMs, addressing gaps in existing PEFT techniques.

Findings

01

FLoRA outperforms LoRA in accuracy on downstream tasks.

02

FLoRA reduces inference latency compared to traditional adapters.

03

FLoRA maintains parameter efficiency similar to that of LoRA.

Abstract

As large language models (LLMs) grow ever larger, efficient training and fine-tuning have never been more important. This has driven great interest in parameter-efficient fine-tuning (PEFT), and effective methods such as low-rank adapters (LoRA) have emerged. Although various PEFT methods have been studied extensively in recent years, much of the design space remains unexplored because of its many degrees of freedom. In this paper, we propose FLoRA, a family of fused forward-backward adapters (FFBA) for parameter-efficient fine-tuning of LLMs on downstream tasks. FFBA combine ideas from the popular LoRA and parallel adapters to improve overall fine-tuning accuracy. At the same time, latency is minimized by fusing the forward and backward adapters into existing projection layers of the base model. Experimental results show that the proposed…
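The abstract does not spell out how the fusion is carried out. The following is a minimal sketch of the general idea only, assuming that "fusing" an adapter into a projection layer means concatenating the adapter's low-rank down-projection with the existing weight matrix so that both are computed in one matrix multiplication. That reading is an assumption, not the paper's confirmed design, and the names `lora_forward`, `fused_down_projection`, `W`, `A`, and `B` are hypothetical.

```python
import numpy as np

def lora_forward(x, W, A, B):
    """Unfused reference: base projection plus a separate low-rank path."""
    return x @ W + (x @ A) @ B

def fused_down_projection(x, W, A):
    """Compute the base projection and the adapter down-projection together.

    Assumption: W (d_in, d_out) and A (d_in, r) are concatenated along the
    output dimension, so a single matmul yields both results.
    """
    W_cat = np.concatenate([W, A], axis=1)   # shape (d_in, d_out + r)
    y_cat = x @ W_cat                        # one matmul instead of two
    d_out = W.shape[1]
    return y_cat[:, :d_out], y_cat[:, d_out:]

# Small random example (sizes chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
d_in, d_out, r = 16, 32, 4
x = rng.standard_normal((2, d_in))
W = rng.standard_normal((d_in, d_out))
A = rng.standard_normal((d_in, r))
B = rng.standard_normal((r, d_out))

base, down = fused_down_projection(x, W, A)
fused_out = base + down @ B                  # up-projection applied afterwards
reference_out = lora_forward(x, W, A, B)
```

Under this assumption the fused and unfused paths agree numerically, since concatenating weight columns is just block matrix multiplication. Any latency benefit would come from issuing one larger matmul rather than two smaller ones, which this sketch does not measure.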

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Speech Recognition and Synthesis