TL;DR
LAGCD inserts a residual linear adapter into each ViT block for generalized category discovery (GCD). It aims for more flexible adaptation and better performance than existing methods, motivated by a feature-sparsity analysis and supported by an auxiliary distribution alignment loss.
Contribution
The paper proposes a simple linear adapter approach with distribution alignment for GCD, addressing two weaknesses of prior methods: the overfitting of visual prompt tuning and the limited adaptation of partial fine-tuning.
Findings
LAGCD outperforms many sophisticated baselines on various datasets.
Linear adapters enhance model capacity by enabling more flexible adaptation.
Auxiliary distribution alignment reduces bias between seen and novel categories.
Abstract
Generalized Category Discovery (GCD) seeks to identify novel categories from unlabeled data while retaining the classification ability of seen categories. Prior GCD methods commonly leverage transferable representations from pre-trained models, adapting to downstream datasets via partial fine-tuning (updating only the final ViT block) and visual prompt tuning (appending learnable vectors to inputs). However, conventional partial fine-tuning offers limited flexibility, as it fails to adapt the entire model; meanwhile, visual prompt tuning is prone to overfitting, due to its sensitivity to initialization and inherently constrained capacity. To address these limitations, we propose LAGCD, a simple yet effective GCD approach that embeds a residual linear adapter into each ViT block. From the perspective of feature sparsity, we systematically show that non-linearity in conventional adapters…
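As a rough illustration of the idea described in the abstract, the sketch below applies a residual linear adapter after each of several frozen blocks. It is not the authors' implementation: the class name `ResidualLinearAdapter`, the zero initialization, the placement of the adapter after the block output, and the `frozen_block` stand-in (a plain `tanh`, not real attention) are all assumptions made for the example.

```python
import numpy as np


class ResidualLinearAdapter:
    """Hypothetical residual linear adapter: y = x + x @ W + b (no activation)."""

    def __init__(self, dim: int):
        # Zero initialization is an assumption of this sketch. It makes the
        # adapter an identity map at the start, so features are unchanged.
        self.W = np.zeros((dim, dim))
        self.b = np.zeros(dim)

    def __call__(self, x: np.ndarray) -> np.ndarray:
        # Residual connection plus a purely linear update.
        return x + x @ self.W + self.b


def frozen_block(x: np.ndarray) -> np.ndarray:
    """Stand-in for a frozen pre-trained ViT block (not a real attention block)."""
    return np.tanh(x)


def adapted_forward(x: np.ndarray, adapters: list) -> np.ndarray:
    # One adapter per block; in training, only adapter parameters would update.
    for adapter in adapters:
        x = adapter(frozen_block(x))
    return x


rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))  # 4 tokens, embedding dimension 8
adapters = [ResidualLinearAdapter(8) for _ in range(3)]
out = adapted_forward(tokens, adapters)
print(out.shape)
```

With zero-initialized weights, the adapted model reproduces the frozen model exactly, so any change in behavior comes only from what the adapters learn. Because the update is linear, every block can be adapted while adding just one weight matrix and one bias vector per block.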
