How to Fine-tune Models with Few Samples: Update, Data Augmentation, and Test-time Augmentation
Yujin Kim, Jaehoon Oh, Sungnyun Kim, Se-Young Yun

TL;DR
This paper explores effective strategies for fine-tuning pre-trained models in few-shot learning, focusing on update methods, data augmentation, and test-time augmentation to improve performance with limited data.
Contribution
It systematically compares fine-tuning and linear probing, analyzes data augmentation effects, and proposes combined augmentation techniques for better few-shot learning performance.
Findings
Linear probing outperforms full fine-tuning with very few samples.
Data augmentation's effectiveness depends on augmentation intensity.
Combined support and query set augmentation improves few-shot accuracy.
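To make the third finding concrete, here is a minimal sketch of augmenting both the support set (before fitting) and the query (at test time). It is an illustration under stated assumptions, not the paper's implementation: the nearest-centroid classifier, the uniform-noise `augment` function, and the `intensity` parameter are all hypothetical stand-ins chosen so the example runs with the standard library alone.

```python
import math
import random

def augment(x, intensity, rng):
    """Hypothetical augmentation: add uniform noise scaled by `intensity`."""
    return [v + rng.uniform(-intensity, intensity) for v in x]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def centroids(support, labels, n_classes):
    """Mean feature vector per class."""
    dim = len(support[0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    counts = [0] * n_classes
    for x, y in zip(support, labels):
        counts[y] += 1
        for i, v in enumerate(x):
            sums[y][i] += v
    return [[s / counts[c] for s in sums[c]] for c in range(n_classes)]

def predict_proba(x, cents):
    """Softmax over negative squared distances to each class centroid."""
    return softmax([-sum((a - b) ** 2 for a, b in zip(x, c)) for c in cents])

def predict_with_tta(x, cents, n_views, intensity, rng):
    """Average class probabilities over the query and its augmented views."""
    views = [x] + [augment(x, intensity, rng) for _ in range(n_views)]
    probs = [predict_proba(v, cents) for v in views]
    return [sum(p[c] for p in probs) / len(probs) for c in range(len(cents))]

rng = random.Random(0)
support = [[0.0, 0.0], [4.0, 4.0]]  # toy 2-way, 1-shot task
labels = [0, 1]

# Support-set augmentation: add four noisy copies of each support sample.
aug_support, aug_labels = list(support), list(labels)
for x, y in zip(support, labels):
    for _ in range(4):
        aug_support.append(augment(x, 0.5, rng))
        aug_labels.append(y)

cents = centroids(aug_support, aug_labels, n_classes=2)

# Query-set (test-time) augmentation: average over eight noisy views.
query = [0.5, 0.3]
probs = predict_with_tta(query, cents, n_views=8, intensity=0.5, rng=rng)
pred = max(range(len(probs)), key=lambda c: probs[c])
```

The `intensity` argument is the knob the second finding refers to: in this toy setup, raising it widens the noise applied to both support and query samples.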
Abstract
Most recent few-shot learning (FSL) algorithms are based on transfer learning, where a model is pre-trained using a large amount of source data, and the pre-trained model is fine-tuned using a small amount of target data. In transfer learning-based FSL, sophisticated pre-training methods have been widely studied for universal representation. Therefore, it has become more important to utilize the universal representation for downstream tasks, but there are few studies on fine-tuning in FSL. In this paper, we focus on how to transfer pre-trained models to few-shot downstream tasks from three perspectives: update, data augmentation, and test-time augmentation. First, we compare two popular update methods, full fine-tuning (i.e., updating the entire network, FT) and linear probing (i.e., updating only a linear classifier, LP). We find that LP is better than FT with extremely…
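The difference between the two update methods defined in the abstract can be sketched in a few lines. This is a toy illustration only: the two-unit tanh backbone, the logistic head, the `train` helper, and the hyperparameters are assumptions made for the example and do not come from the paper. The only point is the mechanism: under LP the pre-trained backbone weights stay frozen, while under FT they are updated along with the classifier.

```python
import math

def forward(x, backbone, head):
    """Backbone: h_j = tanh(W_j . x); head: logit = v . h + b."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in backbone]
    logit = sum(v * hj for v, hj in zip(head["v"], h)) + head["b"]
    return h, logit

def sgd_step(x, y, backbone, head, lr, update_backbone):
    """One SGD step on binary cross-entropy for a single sample."""
    h, logit = forward(x, backbone, head)
    p = 1.0 / (1.0 + math.exp(-logit))
    g = p - y  # gradient of the loss with respect to the logit
    if update_backbone:
        # FT only: backpropagate through the head into the backbone weights.
        for j, row in enumerate(backbone):
            gh = g * head["v"][j] * (1.0 - h[j] ** 2)
            for i in range(len(row)):
                row[i] -= lr * gh * x[i]
    # LP and FT: update the linear classifier.
    for j in range(len(h)):
        head["v"][j] -= lr * g * h[j]
    head["b"] -= lr * g

def train(data, pretrained, mode, epochs=50, lr=0.5):
    """mode='LP' freezes the backbone; mode='FT' updates the whole network."""
    backbone = [row[:] for row in pretrained]  # work on a copy
    head = {"v": [0.0] * len(backbone), "b": 0.0}
    for _ in range(epochs):
        for x, y in data:
            sgd_step(x, y, backbone, head, lr, update_backbone=(mode == "FT"))
    return backbone, head

pretrained = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for pre-trained weights
data = [([1.0, 0.2], 1), ([-1.0, -0.1], 0)]  # one labeled sample per class

lp_backbone, lp_head = train(data, pretrained, "LP")
ft_backbone, ft_head = train(data, pretrained, "FT")

lp_frozen = lp_backbone == pretrained   # backbone untouched under LP
ft_changed = ft_backbone != pretrained  # backbone moved under FT
```

The sketch makes no claim about which method is more accurate; it only shows which parameters each one is allowed to change.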
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and ELM · Multimodal Machine Learning Applications
