PARTICLE: Part Discovery and Contrastive Learning for Fine-grained Recognition
Oindrila Saha, Subhransu Maji

TL;DR
This paper introduces PARTICLE, a self-supervised method that discovers parts and uses contrastive learning to improve fine-grained recognition and segmentation, outperforming existing methods on multiple datasets.
Contribution
The paper proposes an iterative approach that alternates between part discovery and part-centric contrastive learning, refining representations for fine-grained classification and segmentation without labeled data.
Findings
Improved image classification accuracy under linear evaluation across datasets.
Enhanced few-shot part segmentation performance.
Better representations for fine-grained tasks across different network architectures.
Abstract
We develop techniques for refining representations for fine-grained classification and segmentation tasks in a self-supervised manner. We find that fine-tuning methods based on instance-discriminative contrastive learning are not as effective, and posit that recognizing part-specific variations is crucial for fine-grained categorization. We present an iterative learning approach that incorporates part-centric equivariance and invariance objectives. First, pixel representations are clustered to discover parts. We analyze the representations from convolutional and vision transformer networks that are best suited for this task. Then, a part-centric learning step aggregates and contrasts representations of parts within an image. We show that this improves the performance on image classification and part segmentation tasks across datasets. For example, under a linear-evaluation scheme, the…
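The two steps the abstract describes (cluster pixel representations to discover parts, then aggregate and contrast part representations within an image) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `kmeans`, `part_embeddings`, and `part_contrastive_loss`, the use of plain k-means, masked average pooling, an InfoNCE-style loss, and the temperature value are all assumptions made for illustration, and random arrays stand in for the pixel features of a real network.

```python
import numpy as np

def kmeans(x, k, iters=10, seed=0):
    """Plain k-means over the rows of x; returns labels and centroids."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every pixel feature to every centroid.
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(0)
    return labels, centers

def part_embeddings(feats, labels, k):
    """Average pixel features inside each part mask, then L2-normalize."""
    out = np.zeros((k, feats.shape[-1]))
    for j in range(k):
        mask = labels == j
        if mask.any():
            out[j] = feats[mask].mean(0)
    norms = np.linalg.norm(out, axis=1, keepdims=True)
    return out / np.maximum(norms, 1e-8)

def part_contrastive_loss(parts_a, parts_b, temperature=0.1):
    """InfoNCE-style loss (assumed form): part i in view A should match
    part i in view B and differ from the other parts of the same image."""
    logits = parts_a @ parts_b.T / temperature
    logits = logits - logits.max(1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(1, keepdims=True))
    return -np.mean(np.diag(log_prob))

# Hypothetical stand-in for a 16x16 feature map with 8 channels.
rng = np.random.default_rng(0)
num_parts = 4
feats_a = rng.normal(size=(16 * 16, 8))
# A second "view": the same pixels with small perturbations.
feats_b = feats_a + 0.1 * rng.normal(size=feats_a.shape)

# Step 1: discover parts by clustering pixel representations.
labels, _ = kmeans(feats_a, num_parts)

# Step 2: aggregate per-part features in both views and contrast them.
parts_a = part_embeddings(feats_a, labels, num_parts)
parts_b = part_embeddings(feats_b, labels, num_parts)
loss = part_contrastive_loss(parts_a, parts_b)
```

In an iterative scheme of the kind the abstract outlines, a loss like this would update the network, and the clustering step would then be rerun on the refined features; the sketch shows only a single pass and omits any gradient update.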
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · 3D Surveying and Cultural Heritage
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Residual Connection · Layer Normalization · Dense Connections · Vision Transformer · Contrastive Learning
