Semi-Supervised Fine-Tuning of Vision Foundation Models with Content-Style Decomposition

Mariia Drozdova; Vitaliy Kinakh; Yury Belousov; Erica Lastufka; Slava Voloshynovskiy

arXiv:2410.02069·cs.CV·October 7, 2024

TL;DR

This paper introduces a semi-supervised fine-tuning method for vision foundation models that uses content-style decomposition to improve performance on tasks with limited labeled data, addressing distribution shift issues.

Contribution

It proposes a novel semi-supervised fine-tuning approach leveraging content-style decomposition within an information-theoretic framework for vision models.

Findings

01. Improves performance in low-labeled data regimes

02. Enhances latent representations of pre-trained models

03. Effective across multiple datasets and backbone configurations

Abstract

In this paper, we present a semi-supervised fine-tuning approach designed to improve the performance of pre-trained foundation models on downstream tasks with limited labeled data. By leveraging content-style decomposition within an information-theoretic framework, our method enhances the latent representations of pre-trained vision foundation models, aligning them more effectively with specific task objectives and addressing the problem of distribution shift. We evaluate our approach on multiple datasets, including MNIST, its augmented variations (with yellow and white stripes), CIFAR-10, SVHN, and GalaxyMNIST. The experiments show improvements over a supervised fine-tuning baseline of pre-trained models, particularly in low-labeled data regimes, across both frozen and trainable backbones for the majority of the tested datasets.

Taxonomy

Topics: Advanced Computational Techniques and Applications · Advanced Image and Video Retrieval Techniques · Neural Networks and Applications