Annotation Entropy Predicts Per-Example Learning Dynamics in LoRA Fine-Tuning

Brady Steele

arXiv:2604.16332·cs.LG·April 21, 2026

TL;DR

This paper shows that LoRA fine-tuning can un-learn examples with high annotator disagreement: their loss increases during training. Annotation entropy predicts these per-example learning dynamics across six models and two datasets (SNLI and MNLI).

Contribution

It introduces annotation entropy as a predictor of per-example learning behavior in LoRA fine-tuning, revealing an un-learning pattern that is largely absent under full fine-tuning.

Findings

01

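Annotation entropy correlates positively with per-example area under the loss curve (AULC) in LoRA fine-tuning; examples with high annotator disagreement show increasing loss during training.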

02

Decoder-only models show stronger correlations than encoder models at matched LoRA rank.

03

The correlation survives partial-correlation controls and replicates across seeds and datasets.
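One common way to run a partial-correlation control on a rank correlation is to rank-transform all variables, regress the covariate ranks out of both variables of interest, and correlate the residuals. The sketch below illustrates that generic procedure only; the summary here does not say which covariates or which procedure the paper used, so `partial_spearman`, its helpers, and the `covariates` argument are hypothetical names chosen for illustration.

```python
import numpy as np


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Average the ranks within each group of tied values.
    _, inverse = np.unique(values, return_inverse=True)
    sums = np.bincount(inverse, weights=ranks)
    counts = np.bincount(inverse)
    return (sums / counts)[inverse]


def residualize(target, design):
    """Remove the least-squares fit of `design` from `target`."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef


def partial_spearman(x, y, covariates):
    """Spearman correlation of x and y after regressing out covariate ranks.

    `covariates` is a hypothetical (n,) or (n, k) array of control variables.
    """
    cov = np.asarray(covariates, dtype=float)
    if cov.ndim == 1:
        cov = cov[:, None]
    ranked_cov = np.column_stack([average_ranks(col) for col in cov.T])
    # Include an intercept column so residuals are mean-centered.
    design = np.column_stack([np.ones(len(ranked_cov)), ranked_cov])
    rx = residualize(average_ranks(x), design)
    ry = residualize(average_ranks(y), design)
    return float(np.corrcoef(rx, ry)[0, 1])
```

Here `x` would be the per-example annotation entropy and `y` the per-example AULC. If the partial correlation stays positive after a covariate is removed, that covariate alone does not explain the association.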

Abstract

We find that LoRA fine-tuning exhibits un-learning on contested examples: items with high annotator disagreement show increasing loss during training, a qualitatively distinct pattern largely absent under full fine-tuning and consistent across all six models tested (four encoder, two decoder-only). This discovery emerges from correlating annotation entropy, computed from ChaosNLI's 100 labels per example, with per-example area under the loss curve (AULC) on SNLI and MNLI. The correlation is positive in all 25 conditions tested (Spearman $\rho$ between $0.06$ and $0.43$), with decoder-only models showing stronger correlations than encoders at matched LoRA rank. The effect survives partial-correlation controls and replicates across seeds and datasets. A preliminary noise-injection experiment is consistent with these findings.
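The analysis described in the abstract can be sketched in a few lines: compute the Shannon entropy of each example's annotator label distribution, integrate each example's loss curve to get its AULC, and take the Spearman rank correlation between the two. This is a minimal sketch, not the paper's implementation. The function names, the choice of log base 2, the trapezoid rule, and the toy numbers are all assumptions made for illustration; the toy data is constructed by hand and is not a result from the paper.

```python
import numpy as np


def annotation_entropy(label_counts):
    """Shannon entropy (bits) of each row of annotator label counts."""
    counts = np.asarray(label_counts, dtype=float)
    probs = counts / counts.sum(axis=1, keepdims=True)
    # Treat 0 * log(0) as 0 by leaving log-probabilities at 0 where p == 0.
    logp = np.zeros_like(probs)
    np.log2(probs, out=logp, where=probs > 0)
    return -(probs * logp).sum(axis=1) + 0.0


def aulc(losses, steps=None):
    """Area under each example's loss curve via the trapezoid rule.

    `losses` has shape (n_examples, n_checkpoints).
    """
    losses = np.asarray(losses, dtype=float)
    if steps is None:
        steps = np.arange(losses.shape[1], dtype=float)
    widths = np.diff(np.asarray(steps, dtype=float))
    heights = 0.5 * (losses[:, 1:] + losses[:, :-1])
    return (heights * widths).sum(axis=1)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    _, inverse = np.unique(values, return_inverse=True)
    sums = np.bincount(inverse, weights=ranks)
    counts = np.bincount(inverse)
    return (sums / counts)[inverse]


def spearman(x, y):
    """Spearman rho: Pearson correlation of the average ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])


# Toy data (hand-made): 100 labels per example over three NLI classes.
toy_counts = [[100, 0, 0], [50, 50, 0], [34, 33, 33], [80, 10, 10]]
# Toy per-example losses at four checkpoints; the third row rises over time,
# mimicking the "un-learning" pattern on a contested example.
toy_losses = [
    [1.0, 0.6, 0.3, 0.1],
    [1.1, 1.0, 0.9, 0.9],
    [1.1, 1.2, 1.4, 1.6],
    [1.0, 0.8, 0.6, 0.5],
]

toy_entropy = annotation_entropy(toy_counts)
toy_aulc = aulc(toy_losses)
toy_rho = spearman(toy_entropy, toy_aulc)
```

AULC summarizes a whole training trajectory in one number: an example whose loss falls quickly gets a small area, while one whose loss stays high or increases gets a large area, so a positive rank correlation with entropy means contested examples tend to be learned more slowly or un-learned.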
