SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning
Hongjun Wang, Sagar Vaze, Kai Han

TL;DR
SPTNet is a two-stage adaptation framework for generalized category discovery that alternates between fine-tuning the model and learning prompts on the input data. Its spatial prompt tuning helps the model focus on object parts, which transfer between seen and unseen classes, while adding very few parameters.
Contribution
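The paper proposes SPTNet, a two-stage adaptation approach that iteratively optimizes model parameters (fine-tuning) and data parameters (prompt learning). It also proposes spatial prompt tuning (SPT), which uses the spatial structure of images to align the data representation with the pre-trained model.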
Findings
Achieves an average accuracy of 61.4% on the SSB benchmark, surpassing prior state-of-the-art methods by approximately 10%.
Adds extra parameters amounting to only 0.117% of those in the backbone.
Outperforms existing GCD methods on standard benchmarks.
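
To get a feel for the 0.117% figure above, the short calculation below treats it as the ratio of added parameters to backbone parameters. The backbone size used here (86 million) is a hypothetical value chosen only for illustration; the summary does not state the actual backbone or its size.

```python
def extra_parameter_count(backbone_params: int, fraction_percent: float) -> int:
    """Number of added parameters, given a backbone size and a percentage of it."""
    return round(backbone_params * fraction_percent / 100.0)


# Assumption: a hypothetical backbone with 86 million parameters.
HYPOTHETICAL_BACKBONE_PARAMS = 86_000_000

extra = extra_parameter_count(HYPOTHETICAL_BACKBONE_PARAMS, 0.117)
print(f"extra parameters: {extra:,}")
```

Under that assumed backbone size, the added parameters number on the order of a hundred thousand.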
Abstract
Generalized Category Discovery (GCD) aims to classify unlabelled images from both 'seen' and 'unseen' classes by transferring knowledge from a set of labelled 'seen' class images. A key theme in existing GCD approaches is adapting large-scale pre-trained models for the GCD task. An alternate perspective, however, is to adapt the data representation itself for better alignment with the pre-trained model. As such, in this paper, we introduce a two-stage adaptation approach termed SPTNet, which iteratively optimizes model parameters (i.e., model-finetuning) and data parameters (i.e., prompt learning). Furthermore, we propose a novel spatial prompt tuning method (SPT) which considers the spatial property of image data, enabling the method to better focus on object parts, which can transfer between seen and unseen classes. We thoroughly evaluate our SPTNet on standard benchmarks and…
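
The idea of iterating between model parameters and data parameters can be illustrated with a deliberately tiny example. The sketch below is not the authors' implementation: it assumes a toy scalar model `y_hat = w * (x + p)`, where `w` stands in for the model weights and `p` for a learnable prompt added to the input. All names, initial values, and hyperparameters are hypothetical.

```python
def mse(w, p, xs, ys):
    """Mean squared error of the toy model y_hat = w * (x + p)."""
    return sum((w * (x + p) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def alternate_train(xs, ys, rounds=50, lr=0.05):
    """Alternate between updating the prompt p and the model weight w."""
    w, p = 1.0, 0.0  # hypothetical initial values
    n = len(xs)
    history = [mse(w, p, xs, ys)]
    for _ in range(rounds):
        # Stage 1: freeze the model parameter w, update the data parameter p.
        grad_p = sum(2 * (w * (x + p) - y) * w for x, y in zip(xs, ys)) / n
        p -= lr * grad_p
        # Stage 2: freeze the prompt p, update the model parameter w.
        grad_w = sum(2 * (w * (x + p) - y) * (x + p) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        history.append(mse(w, p, xs, ys))
    return w, p, history


# Toy data generated from w = 2.0 and p = 0.5.
xs = [1.0, 2.0, 3.0]
ys = [2.0 * (x + 0.5) for x in xs]

w, p, history = alternate_train(xs, ys)
initial_loss, final_loss = history[0], history[-1]
print(f"loss: {initial_loss:.3f} -> {final_loss:.5f}")
```

Each round holds one group of parameters fixed while taking a gradient step on the other, which is the general shape of an alternating scheme. In SPTNet the prompts are spatial and attached to image data, which this scalar toy does not attempt to model.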
Taxonomy
Topics: Natural Language Processing Techniques · Data Mining Algorithms and Applications · Data Management and Algorithms
Methods: Sparse Evolutionary Training · Focus
