Few-Shot Recognition via Stage-Wise Retrieval-Augmented Finetuning
Tian Liu, Huixin Zhang, Shubham Parashar, Shu Kong

TL;DR
This paper introduces SWAT, a stage-wise retrieval-augmented finetuning method that uses a pretrained vision-language model for few-shot recognition and outperforms previous approaches by more than 6% in accuracy.
Contribution
The paper proposes a novel stage-wise finetuning approach, SWAT, to effectively utilize retrieval-augmented data in few-shot recognition tasks, addressing domain gap and data imbalance issues.
Findings
Finetuning on few-shot data alone outperforms zero-shot methods.
Combining retrieved data with few-shot examples improves accuracy.
SWAT achieves over 6% higher accuracy than previous methods on benchmarks.
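The findings above suggest a schedule in which retrieved data and few-shot data play different roles. The following is only a minimal sketch, assuming a two-stage plan: stage 1 trains on the mix of retrieved and few-shot data, and stage 2 retrains only a classifier head on the balanced few-shot data. This summary does not spell out the stages, so that split, the toy data, and every name here (`class_counts`, `imbalance_ratio`, `stage_schedule`) are illustrative assumptions rather than the authors' implementation.

```python
from collections import Counter

# Hypothetical toy data: (image_id, class_label) pairs.
# Retrieved data is large but imbalanced; few-shot data is small but balanced.
retrieved = [(f"r{i}", "sparrow") for i in range(50)]
retrieved += [(f"r{i}", "kiwi") for i in range(50, 53)]
few_shot = [(f"{label}_{i}", label) for label in ("sparrow", "kiwi") for i in range(4)]


def class_counts(samples):
    """Count how many samples each class has."""
    return Counter(label for _, label in samples)


def imbalance_ratio(samples):
    """Ratio of the most frequent class to the least frequent class."""
    counts = class_counts(samples)
    return max(counts.values()) / min(counts.values())


def stage_schedule(retrieved, few_shot):
    """Assumed two-stage plan (an illustration, not confirmed by the summary)."""
    return [
        # Stage 1: use all available data, accepting its imbalance.
        {"stage": 1, "train": "whole model", "data": retrieved + few_shot},
        # Stage 2: use only the balanced few-shot data.
        {"stage": 2, "train": "classifier head only", "data": list(few_shot)},
    ]


schedule = stage_schedule(retrieved, few_shot)
for step in schedule:
    counts = dict(class_counts(step["data"]))
    ratio = imbalance_ratio(step["data"])
    print(step["stage"], step["train"], counts, round(ratio, 2))
```

In this toy setup the mixed stage-1 set stays heavily skewed toward the frequent class, while the stage-2 set has the same number of examples per class. That is one way to see why a second pass on few-shot data alone could help with the data imbalance issue the paper targets.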
Abstract
Few-shot recognition (FSR) aims to train a classification model with only a few labeled examples of each concept concerned by a downstream task, where data annotation cost can be prohibitively high. We develop methods to solve FSR by leveraging a pretrained Vision-Language Model (VLM). We particularly explore retrieval-augmented learning (RAL), which retrieves open data, e.g., the VLM's pretraining dataset, to learn models for better serving downstream tasks. RAL has been studied in zero-shot recognition but remains under-explored in FSR. Although applying RAL to FSR may seem straightforward, we observe interesting and novel challenges and opportunities. First, somewhat surprisingly, finetuning a VLM on a large amount of retrieved data underperforms state-of-the-art zero-shot methods. This is due to the imbalanced distribution of retrieved data and its domain gaps with the few-shot…
Taxonomy
Topics: Advanced Optical Sensing Technologies · Medical Imaging Techniques and Applications · Geophysical Methods and Applications
Methods: Sparse Evolutionary Training
