Aletheia: Gradient-Guided Layer Selection for Efficient LoRA Fine-Tuning Across Architectures

Abdulmalek Saket

arXiv:2604.15351·cs.LG·April 20, 2026

TL;DR

Aletheia uses a lightweight gradient probe to identify the most task-relevant transformer layers and applies LoRA adapters only to those layers. Across 14 models from 8 architecture families, it reports a 15-28% training speedup while broadly matching downstream performance.

Contribution

The paper proposes a gradient-based layer selection technique for LoRA fine-tuning that reduces training time while keeping downstream task accuracy within bounded degradation.

Findings

01. Achieves a 15-28% training speedup across 14 models

02. Broadly preserves downstream performance within bounded degradation

03. Demonstrates consistent speed improvements across multiple architectures

Abstract

Low-Rank Adaptation (LoRA) has become the dominant parameter-efficient fine-tuning method for large language models, yet standard practice applies LoRA adapters uniformly to all transformer layers regardless of their relevance to the downstream task. We introduce Aletheia, a gradient-guided layer selection method that identifies the most task-relevant layers via a lightweight gradient probe and applies LoRA adapters only to those layers with asymmetric rank allocation. Across 81 experiment rows covering 14 successful models from 8 architecture families (0.5B-72B parameters, including dense and Mixture-of-Experts architectures), with one additional documented failed Pythia/GPT-NeoX attempt in Campaign 2, Aletheia achieves a 15-28% training speedup (mean 23.1%, p < 0.001) with bounded extra forgetting and broadly matched downstream behavior on the evaluated MMLU, GSM8K, and HumanEval…
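
The selection step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the function names `select_layers` and `allocate_ranks`, the `keep_fraction` parameter, the rank values 16 and 4, and the probe numbers are all hypothetical. The abstract also does not define "asymmetric rank allocation"; the sketch reads it as giving higher-scoring layers a larger LoRA rank, which may differ from what the paper does.

```python
from typing import Dict, List


def select_layers(grad_norms: List[float], keep_fraction: float = 0.5) -> List[int]:
    """Return the indices of the layers with the largest probe gradient norms.

    grad_norms[i] is a hypothetical per-layer score from a short gradient probe.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    k = max(1, round(len(grad_norms) * keep_fraction))
    # Rank layer indices by score, highest first, and keep the top k.
    ranked = sorted(range(len(grad_norms)), key=lambda i: grad_norms[i], reverse=True)
    return sorted(ranked[:k])


def allocate_ranks(
    grad_norms: List[float],
    selected: List[int],
    high_rank: int = 16,
    low_rank: int = 4,
) -> Dict[int, int]:
    """Assign high_rank to the top-scoring half of the selected layers
    and low_rank to the rest (one assumed reading of "asymmetric")."""
    by_score = sorted(selected, key=lambda i: grad_norms[i], reverse=True)
    n_high = (len(by_score) + 1) // 2
    return {
        layer: (high_rank if pos < n_high else low_rank)
        for pos, layer in enumerate(by_score)
    }


# Made-up probe scores for an 8-layer model (illustrative only).
probe = [0.02, 0.10, 0.45, 0.80, 0.30, 0.05, 0.60, 0.15]
selected = select_layers(probe, keep_fraction=0.5)
ranks = allocate_ranks(probe, selected)
print(selected)  # [2, 3, 4, 6]
print(ranks)     # {3: 16, 6: 16, 2: 4, 4: 4}
```

In a real fine-tuning setup the resulting layer indices and per-layer ranks would be passed to whatever adapter library is in use, and layers outside `selected` would receive no adapter.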

Code & Models

Models

🤗 aletheiaprotocol/aletheia-lora (model)

Datasets

aletheiaprotocol/aletheia-lora-evidence (dataset, 70 downloads)