Bayesian Natural Gradient Fine-Tuning of CLIP Models via Kalman Filtering

Hossein Abdi; Mingfei Sun; Wei Pan

arXiv:2511.01694 · cs.LG · November 4, 2025

TL;DR

This paper introduces a Bayesian natural gradient fine-tuning method for CLIP models using Kalman filtering, improving convergence, generalization, and out-of-distribution robustness in vision-language tasks.

Contribution

It presents the first application of Kalman filtering to fine-tune CLIP models, combining second-order optimization with Bayesian inference for enhanced performance.

Findings

01 Achieves superior in-distribution accuracy

02 Improves out-of-distribution robustness

03 Demonstrates efficient and robust fine-tuning

Abstract

Vision-language pre-trained models, such as CLIP, have established new benchmarks in multimodal data mining. For such models, few-shot fine-tuning that performs well on both in-distribution (ID) and out-of-distribution (OOD) datasets remains a major challenge, especially when labeled data is scarce. Most existing fine-tuning approaches rely on first-order gradient-based optimizers, which typically suffer from slow convergence, sensitivity to step-size hyperparameters, and poor generalization in OOD settings. In contrast, second-order methods use local curvature information of the loss landscape to adjust the update step size. This is particularly beneficial for CLIP models, whose non-convex loss functions often contain sharp critical points. In such cases, the natural gradient direction can offer more substantial and efficient per-iteration updates when fine-tuning with limited…
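To make the contrast between first-order and curvature-aware updates concrete, here is a minimal toy sketch. It is not the authors' algorithm and involves no CLIP model: it assumes a two-parameter linear regression with badly scaled features, and compares plain SGD against a Kalman filter that treats the parameters as a static hidden state. For this linear-Gaussian case the Kalman update coincides with recursive least squares, which rescales each step by an estimate of the inverse curvature. All names (`kalman_step`, `sgd_step`, `run_demo`) and all constants are hypothetical and chosen only for illustration.

```python
import numpy as np


def kalman_step(theta, P, x, y, obs_var=1.0):
    """One Kalman update for static parameters with a scalar observation y ~ x @ theta."""
    residual = y - x @ theta            # innovation (prediction error)
    S = x @ P @ x + obs_var             # innovation variance (scalar)
    K = P @ x / S                       # Kalman gain: curvature-scaled step direction
    theta = theta + K * residual        # posterior mean of the parameters
    P = P - np.outer(K, x @ P)          # posterior covariance shrinks along x
    return theta, P


def sgd_step(theta, x, y, lr=0.1):
    """One plain SGD step on the squared error 0.5 * (y - x @ theta) ** 2."""
    residual = y - x @ theta
    return theta + lr * residual * x


def run_demo(n_samples=50, seed=0):
    """Fit the same toy data with both updates; return the two parameter errors."""
    rng = np.random.default_rng(seed)
    true_theta = np.array([2.0, -3.0])
    # Second feature is 10x smaller, so the loss is poorly conditioned.
    X = rng.normal(size=(n_samples, 2)) * np.array([1.0, 0.1])
    y = X @ true_theta + 0.01 * rng.normal(size=n_samples)

    theta_kf, P = np.zeros(2), np.eye(2) * 100.0   # broad prior (assumed)
    theta_sgd = np.zeros(2)
    for x_i, y_i in zip(X, y):
        theta_kf, P = kalman_step(theta_kf, P, x_i, y_i, obs_var=1e-4)
        theta_sgd = sgd_step(theta_sgd, x_i, y_i)

    kf_err = np.linalg.norm(theta_kf - true_theta)
    sgd_err = np.linalg.norm(theta_sgd - true_theta)
    return kf_err, sgd_err


if __name__ == "__main__":
    kf_err, sgd_err = run_demo()
    print(f"Kalman error: {kf_err:.4f}, SGD error: {sgd_err:.4f}")
```

In this toy setting the single SGD step size cannot suit both feature scales, whereas the covariance `P` gives each direction its own effective step. A deep network such as CLIP would additionally need a linearization of the model and some approximation of `P`, neither of which this sketch attempts.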

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis