EfficientFSL: Enhancing Few-Shot Classification via Query-Only Tuning in Vision Transformers

Wenwen Liao; Hang Ruan; Jianbo Yu; Bing Song; Yuansong Wang; Xiaofeng Yang

arXiv:2601.08499·cs.CV·January 16, 2026

TL;DR

EfficientFSL introduces a lightweight, query-only fine-tuning framework for Vision Transformers that achieves high few-shot classification accuracy with minimal computational resources by leveraging task-specific query synthesis and multi-layer feature fusion.

Contribution

The paper proposes a novel query-only fine-tuning method for ViTs that reduces training overhead while maintaining competitive performance in few-shot classification tasks.

Findings

1. Achieves state-of-the-art results on multiple few-shot datasets.
2. Reduces computational cost significantly compared to full fine-tuning.
3. Demonstrates robustness across in-domain and cross-domain scenarios.

Abstract

Large models such as Vision Transformers (ViTs) have demonstrated remarkable superiority over smaller architectures like ResNet in few-shot classification, owing to their powerful representational capacity. However, fine-tuning such large models demands extensive GPU memory and prolonged training time, making them impractical for many real-world low-resource scenarios. To bridge this gap, we propose EfficientFSL, a query-only fine-tuning framework tailored specifically for few-shot classification with ViT, which achieves competitive performance while significantly reducing computational overhead. EfficientFSL fully leverages the knowledge embedded in the pre-trained model and its strong comprehension ability, achieving high classification accuracy with an extremely small number of tunable parameters. Specifically, we introduce a lightweight trainable Forward Block to synthesize…
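The abstract is cut off before it describes the Forward Block, so the following is only a toy sketch of one possible reading of "query-only" tuning: a small trainable query vector cross-attends to token features taken from several layers of a frozen backbone, and the per-layer results are averaged. Every name here (`cross_attend`, `fuse_layers`, `trainable_query`, the random stand-in features, the choice of averaging for fusion) is an assumption made for illustration and is not taken from the paper.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def fuse_layers(query, layer_tokens):
    """Attend to each layer's frozen tokens, then average the outputs.

    Averaging is an assumed fusion rule, chosen only for simplicity.
    """
    outs = [cross_attend(query, toks, toks) for toks in layer_tokens]
    dim = len(outs[0])
    return [sum(o[i] for o in outs) / len(outs) for i in range(dim)]


random.seed(0)
DIM, TOKENS, LAYERS = 8, 5, 3

# Hypothetical stand-in for token features from a frozen backbone.
frozen_layers = [
    [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(TOKENS)]
    for _ in range(LAYERS)
]

# The only values a training loop would update in this toy setup.
trainable_query = [random.gauss(0, 1) for _ in range(DIM)]

fused = fuse_layers(trainable_query, frozen_layers)
trainable_count = len(trainable_query)
```

In this sketch the backbone features are never modified, so any training would touch only the `DIM` values of `trainable_query`, which mirrors the abstract's emphasis on a very small number of tunable parameters without claiming to reproduce the paper's design.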

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications