Matched-Learning-Rate Analysis of Attention Drift and Transfer Retention in Fine-Tuned CLIP
Ruize Xia

TL;DR
This study compares Full Fine-Tuning and LoRA for CLIP adaptation under matched learning rates, revealing how learning rate influences attention drift and transfer retention, with LoRA generally preserving more zero-shot transfer.
Contribution
It provides a controlled analysis of how adaptation method and learning rate jointly affect CLIP's attention drift and transfer retention, clarifying prior confounded comparisons.
Findings
LoRA preserves more zero-shot transfer than Full FT at matched learning rates.
Learning rate modulates attention drift and structural changes in CLIP adaptation.
Evaluating at matched learning rates changes how the Full FT versus LoRA comparison should be interpreted.
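To make the idea of attention drift concrete, here is a minimal sketch of one possible drift measure: the change in Shannon entropy of an attention row between the pretrained and the adapted model. The paper's actual metric definitions are not given on this page, so the function names `attention_entropy` and `entropy_drift` and the toy attention rows are assumptions for illustration only.

```python
import math


def attention_entropy(weights):
    """Shannon entropy (in nats) of one attention row whose entries sum to 1."""
    # Zero weights contribute nothing to the entropy, so they are skipped.
    return -sum(w * math.log(w) for w in weights if w > 0.0)


def entropy_drift(pretrained_row, adapted_row):
    """Entropy change after adaptation: > 0 is broadening, < 0 is contraction."""
    return attention_entropy(adapted_row) - attention_entropy(pretrained_row)


# Hypothetical attention rows over four tokens (illustrative values, not data).
pretrained = [0.4, 0.3, 0.2, 0.1]
broadened = [0.25, 0.25, 0.25, 0.25]   # attention spread more evenly
contracted = [0.85, 0.05, 0.05, 0.05]  # attention concentrated on one token

print(entropy_drift(pretrained, broadened) > 0)   # broadening
print(entropy_drift(pretrained, contracted) < 0)  # contraction
```

Under this reading, "entropy broadening" corresponds to a positive drift and "contraction" to a negative one; a real analysis would average such values over heads, layers, and images.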
Abstract
CLIP adaptation can improve in-domain accuracy while degrading out-of-domain transfer, but comparisons between Full Fine-Tuning (Full FT) and LoRA are often confounded by different learning-rate conventions. We study how adaptation method and optimization scale jointly shape attention drift and transfer retention in CLIP using a controlled matched-learning-rate comparison of Full FT and LoRA. The completed matrix contains 80 runs on CLIP ViT-B/32 across EuroSAT and Oxford-IIIT Pets, spanning four shared learning rates and five seeds, and evaluates attention-drift metrics, best validation accuracy, and adapter-aware CIFAR-100 zero-shot accuracy. Learning rate strongly modulates structural change: on EuroSAT, Full FT moves from mild entropy broadening to marked contraction as the learning rate changes, whereas LoRA…
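The 80-run count follows from the factorial design: 2 methods × 2 datasets × 4 learning rates × 5 seeds. The sketch below enumerates that grid. The four learning-rate values are not shown on this page, so `lr_slots` holds placeholder labels rather than real values, and the identifier strings and seed numbering are likewise assumptions.

```python
from itertools import product

methods = ["full_ft", "lora"]
datasets = ["eurosat", "oxford_pets"]
# Placeholder labels only: the actual learning-rate values are not given here.
lr_slots = ["lr_0", "lr_1", "lr_2", "lr_3"]
seeds = range(5)  # five seeds; the numbering 0..4 is an assumption

# One configuration per (method, dataset, learning rate, seed) combination.
runs = [
    {"method": m, "dataset": d, "lr": lr, "seed": s}
    for m, d, lr, s in product(methods, datasets, lr_slots, seeds)
]

print(len(runs))  # 2 * 2 * 4 * 5 = 80
```

Because both methods share the same `lr_slots`, every Full FT run has a LoRA counterpart at the same dataset, learning rate, and seed, which is what makes the comparison matched.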
