Tuning Pre-trained Model via Moment Probing
Mingze Gao, Qilong Wang, Zhenyi Lin, Pengfei Zhu, Qinghua Hu, Jingbo Zhou

TL;DR
This paper introduces Moment Probing, a novel linear probing method that leverages feature distribution statistics for improved representation, and demonstrates its effectiveness across multiple benchmarks.
Contribution
The paper proposes Moment Probing, which exploits feature distribution moments for linear probing, and introduces MP+ with a partially shared module for enhanced fine-tuning.
Findings
MP outperforms traditional linear probing methods.
MP+ achieves state-of-the-art results on ten benchmarks.
MP and MP+ incur lower training cost than existing methods.
Abstract
Recently, efficient fine-tuning of large-scale pre-trained models has attracted increasing research interest, where linear probing (LP), as a fundamental module, exploits the final representations for task-dependent classification. However, most existing methods focus on how to effectively introduce a few learnable parameters, and little work pays attention to the commonly used LP module. In this paper, we propose a novel Moment Probing (MP) method to further explore the potential of LP. Unlike LP, which builds a linear classification head on the mean of final features (e.g., word tokens for ViT) or on classification tokens, our MP applies a linear classifier to the feature distribution, which provides stronger representation ability by exploiting the richer statistical information inherent in features. Specifically, we represent feature distribution…
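To make the contrast between LP and MP concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract is truncated before it says how the feature distribution is represented, so the sketch assumes a plain mean plus covariance summary. The names `mean_features`, `moment_features`, and `linear_head`, and the token and class sizes, are hypothetical.

```python
import numpy as np


def mean_features(tokens: np.ndarray) -> np.ndarray:
    """LP-style input: average the N token features (each of dimension d)."""
    return tokens.mean(axis=0)


def moment_features(tokens: np.ndarray) -> np.ndarray:
    """Illustrative distribution summary: mean plus covariance.

    Assumption: first- and second-order moments stand in for the
    "feature distribution"; the source text does not give the details.
    """
    n, d = tokens.shape
    mu = tokens.mean(axis=0)
    centered = tokens - mu
    cov = centered.T @ centered / n      # (d, d) covariance matrix
    upper = np.triu_indices(d)           # covariance is symmetric, keep one half
    return np.concatenate([mu, cov[upper]])


def linear_head(x: np.ndarray, weight: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """A single linear classifier producing one logit per class."""
    return weight @ x + bias


# Toy data: 196 tokens of dimension 8 from a frozen backbone (made-up sizes).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(196, 8))
num_classes = 5

lp_in = mean_features(tokens)
mp_in = moment_features(tokens)

# Randomly initialised heads; in practice only these would be trained.
lp_logits = linear_head(lp_in, rng.normal(size=(num_classes, lp_in.size)), np.zeros(num_classes))
mp_logits = linear_head(mp_in, rng.normal(size=(num_classes, mp_in.size)), np.zeros(num_classes))

print(lp_in.shape, mp_in.shape, lp_logits.shape, mp_logits.shape)
```

In this sketch the classifier stays linear in both cases; only its input changes. The moment-based input grows quadratically with the feature dimension, which is one reason a practical method would likely need some form of dimension reduction before the second-order step.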
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Anomaly Detection Techniques and Applications · Topic Modeling
Methods: Focus · Swin Transformer · Contrastive Language-Image Pre-training · Masked autoencoder · ConvNeXt · Data-efficient Image Transformer
