DiffLoRA: Differential Low-Rank Adapters for Large Language Models

Alexandre Misrahi; Nadezhda Chirkova; Maxime Louis; Vassilina Nikoulina

arXiv:2507.23588 · cs.CL · August 1, 2025

TL;DR

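DiffLoRA adapts the differential attention mechanism to Transformer models through low-rank adapters, aiming to keep the efficiency of LoRA while gaining the benefits of differential attention. Results are mixed: it trails other parameter-efficient fine-tuning methods on most tasks but gains 11 points over LoRA on HumanEval.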

Contribution

The paper proposes DiffLoRA, a parameter-efficient fine-tuning method that places low-rank adapters on both the positive and negative terms of differential attention.

Findings

1. DiffLoRA falls short of other parameter-efficient fine-tuning methods on most evaluation tasks.

2. It gains 11 points over LoRA on HumanEval.

3. An analysis of attention patterns after fine-tuning is used to explain the varied performance.

Abstract

Differential Transformer has recently been proposed to improve performance in Transformer models by canceling out noise through a denoiser attention mechanism. In this work, we introduce DiffLoRA, a parameter-efficient adaptation of the differential attention mechanism, with low-rank adapters on both positive and negative attention terms. This approach retains the efficiency of LoRA while aiming to benefit from the performance gains of differential attention. We evaluate DiffLoRA across a broad range of NLP tasks, including general benchmarks, many-shot in-context learning, RAG, and long-context tests. We observe that, although DiffLoRA falls short of other parameter-efficient fine-tuning methods in most evaluation tasks, it shows interesting results in certain domains (+11 pts over LoRA on HumanEval). We analyze the attention patterns post-finetuning to identify the reasons for this…
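The abstract describes differential attention as a positive attention term minus a scaled negative term, with a low-rank adapter on each. The following is a minimal single-head sketch of that idea in NumPy, not the authors' implementation. Several choices are assumptions made for illustration: both terms share the frozen weights `W_q` and `W_k` and each adds its own low-rank update, `lam` is a fixed scalar rather than a learned parameter, and the causal mask, multi-head layout, and any output normalization are omitted. The names `init_adapter`, `attention_map`, and `diff_lora_attention` are hypothetical.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax with max-subtraction for numerical stability."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=-1, keepdims=True)


def init_adapter(rng, d_in, d_out, rank):
    """Hypothetical LoRA pair (A, B): A is small random, B is zero, so A @ B starts at 0."""
    return rng.normal(scale=0.02, size=(d_in, rank)), np.zeros((rank, d_out))


def attention_map(X, W_q, W_k, q_adapter, k_adapter):
    """Softmax attention map from frozen weights plus a low-rank update (assumed form)."""
    Q = X @ (W_q + q_adapter[0] @ q_adapter[1])
    K = X @ (W_k + k_adapter[0] @ k_adapter[1])
    return softmax(Q @ K.T / np.sqrt(Q.shape[-1]))


def diff_lora_attention(X, W_q, W_k, W_v, pos, neg, lam):
    """Differential attention: (positive map - lam * negative map) @ V."""
    positive = attention_map(X, W_q, W_k, *pos)
    negative = attention_map(X, W_q, W_k, *neg)
    return (positive - lam * negative) @ (X @ W_v)


# Toy sizes chosen only for illustration.
rng = np.random.default_rng(0)
n_tokens, d_model, d_head, rank = 5, 8, 4, 2
X = rng.normal(size=(n_tokens, d_model))
W_q, W_k, W_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))

# One (query adapter, key adapter) pair per attention term.
pos = (init_adapter(rng, d_model, d_head, rank), init_adapter(rng, d_model, d_head, rank))
neg = (init_adapter(rng, d_model, d_head, rank), init_adapter(rng, d_model, d_head, rank))

out = diff_lora_attention(X, W_q, W_k, W_v, pos, neg, lam=0.5)
print(out.shape)  # (5, 4)
```

In this sketch the zero-initialized `B` matrices make both attention maps equal to the frozen model's map at the start, so `lam=0.0` recovers standard attention exactly and any other `lam` only rescales it until the adapters are trained. Whether DiffLoRA initializes or parameterizes the negative term this way is not stated in the abstract.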

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis