Bayesian-LoRA: Probabilistic Low-Rank Adaptation of Large Language Models

Moule Lin; Shuhao Guan; Andrea Patane; David Gregg; Goetz Botterweck

arXiv:2601.21003·cs.AI·April 16, 2026

TL;DR

Bayesian-LoRA introduces a probabilistic low-rank adaptation method for large language models, significantly improving calibration and uncertainty estimation with minimal additional parameters and training cost.

Contribution

The paper reformulates LoRA as a probabilistic model inspired by Sparse Gaussian Processes, improving calibration and uncertainty quantification in LLM fine-tuning.

Findings

01. Achieves up to an 84% reduction in Expected Calibration Error (ECE) and a 76% reduction in negative log-likelihood (NLL).

02. Maintains competitive accuracy on in-distribution and out-of-distribution tasks.

03. Adds approximately 0.42M parameters at roughly 1.2x the training cost of standard LoRA.
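The two calibration metrics named in the findings, ECE and NLL, can be sketched as follows. This is a minimal illustration assuming the common equal-width-bin definition of ECE and a mean per-example NLL; the paper's exact binning and evaluation protocol are not stated on this page, and the function names and the `n_bins` default are hypothetical.

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=10):
    """ECE with equal-width confidence bins (assumed definition, not taken from the paper)."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    confidences = probs.max(axis=1)      # confidence of the predicted class
    predictions = probs.argmax(axis=1)   # predicted class index
    correct = (predictions == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # Gap between accuracy and mean confidence, weighted by bin mass.
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return float(ece)


def negative_log_likelihood(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(np.clip(true_class_probs, eps, 1.0))))
```

Both functions take an `(n_examples, n_classes)` array of predicted probabilities and an integer label vector. A lower value is better for each metric, so the reported reductions correspond to better-calibrated predictions.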

Abstract

Large Language Models tend to prioritize accuracy and will therefore guess even when uncertain about a prediction. This problem is especially severe when models are fine-tuned on small datasets, owing to their inherent tendency toward miscalibration. In this work, we introduce Bayesian-LoRA, which reformulates the deterministic LoRA update as a probabilistic low-rank representation inspired by Sparse Gaussian Processes (SGPs). We identify a structural isomorphism between LoRA's factorization and Kronecker-factored SGP posteriors, and show that LoRA emerges as a limiting case when posterior uncertainty collapses. We conduct extensive experiments on various LLM architectures across commonsense reasoning benchmarks. With only approximately 0.42M additional parameters and $\approx 1.2 \times$ training cost relative to standard LoRA, Bayesian-LoRA significantly improves calibration across models up to…
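The limiting-case claim in the abstract can be illustrated with a toy example. The sketch below is not the paper's Kronecker-factored SGP construction: it assumes a simple mean-field Gaussian over one low-rank factor, purely to show how a stochastic low-rank update reduces to the deterministic LoRA update as the posterior standard deviation goes to zero. All names (`lora_forward`, `stochastic_lora_forward`, `A_log_std`) are hypothetical.

```python
import numpy as np


def lora_forward(x, W0, A, B, scale=1.0):
    """Deterministic LoRA: y = x (W0 + scale * B A)^T."""
    return x @ (W0 + scale * B @ A).T


def stochastic_lora_forward(x, W0, A_mean, A_log_std, B, rng, n_samples=8, scale=1.0):
    """Monte Carlo forward pass with a Gaussian over the factor A.

    Assumption for illustration only: an independent Gaussian per entry of A,
    which is simpler than the posterior described in the abstract.
    """
    outputs = []
    for _ in range(n_samples):
        noise = rng.standard_normal(A_mean.shape)
        A_sample = A_mean + np.exp(A_log_std) * noise  # reparameterized sample
        outputs.append(lora_forward(x, W0, A_sample, B, scale))
    outputs = np.stack(outputs)
    # Predictive mean and per-output spread across samples.
    return outputs.mean(axis=0), outputs.std(axis=0)


rng = np.random.default_rng(0)
d_in, d_out, rank = 6, 4, 2
W0 = rng.standard_normal((d_out, d_in))   # frozen pretrained weight
A = rng.standard_normal((rank, d_in))     # low-rank factor (mean)
B = rng.standard_normal((d_out, rank))    # low-rank factor
x = rng.standard_normal((3, d_in))        # batch of inputs

# Collapsed uncertainty: a very negative log-std makes the samples nearly identical.
collapsed_mean, collapsed_std = stochastic_lora_forward(
    x, W0, A, np.full(A.shape, -30.0), B, rng
)
print(np.allclose(collapsed_mean, lora_forward(x, W0, A, B)))
```

With a larger `A_log_std`, the returned spread becomes nonzero and can serve as a rough per-output uncertainty signal, which is the kind of information a deterministic LoRA update does not provide.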
