PP-SSL: Priority-Perception Self-Supervised Learning for Fine-Grained Recognition

ShuaiHeng Li; Qing Cai; Fan Zhang; Menghuan Zhang; Yangyang Shu; Zhi Liu; Huafeng Li; Lingqiao Liu

arXiv:2412.00134 · cs.CV · December 3, 2024

Open Access

TL;DR

PP-SSL introduces a novel self-supervised learning framework that enhances fine-grained visual recognition by filtering irrelevant features and emphasizing subtle discriminative details, outperforming existing methods.

Contribution

The paper proposes PP-SSL, a new self-supervised learning approach with AIS and IADM components, specifically designed to improve fine-grained recognition by focusing on subtle differences.

Findings

1. PP-SSL significantly outperforms existing methods on various datasets.
2. The use of GradCAM from original images reveals more subtle class differences.
3. Knowledge distillation guides the model to focus on discriminative features.
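For readers unfamiliar with the GradCAM maps mentioned in the second finding, the following is a minimal sketch of the standard Grad-CAM computation only. It is not the paper's pipeline, which is not described on this page. The function name `grad_cam` and the nested-list inputs are assumptions made for illustration; in practice the activations and gradients would come from a convolutional layer of a trained network.

```python
def grad_cam(activations, gradients):
    """Standard Grad-CAM map from K feature maps and their gradients.

    activations, gradients: lists of K maps, each an H x W nested list.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    h = len(activations[0])
    w = len(activations[0][0])

    # Channel weights: global average of each gradient map.
    weights = [sum(sum(row) for row in g) / (h * w) for g in gradients]

    # Weighted sum of the activation maps.
    cam = [[0.0] * w for _ in range(h)]
    for wk, a in zip(weights, activations):
        for i in range(h):
            for j in range(w):
                cam[i][j] += wk * a[i][j]

    # ReLU keeps only regions that positively support the target class.
    cam = [[max(0.0, v) for v in row] for row in cam]

    # Scale to [0, 1] when the map is not all zeros.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


if __name__ == "__main__":
    acts = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
    grads = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
    print(grad_cam(acts, grads))
```

In this toy input the first channel has positive gradients and the second negative, so only the location activated by the first channel survives the ReLU.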

Abstract

Self-supervised learning is emerging in fine-grained visual recognition (FGVR) with promising results. However, existing self-supervised learning methods are often susceptible to irrelevant patterns in self-supervised tasks and lack the capability to represent the subtle differences inherent in FGVR, which generally results in poorer performance. To address this, we propose a novel Priority-Perception Self-Supervised Learning framework, denoted PP-SSL, which effectively filters out irrelevant feature interference and extracts more subtle discriminative features throughout the training process. Specifically, it comprises two main parts: the Anti-Interference Strategy (AIS) and the Image-Aided Distinction Module (IADM). In AIS, a fine-grained textual description corpus is established, and a knowledge distillation strategy is devised to guide the model in…
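The abstract is cut off before it specifies the distillation objective used in AIS. As a point of reference only, here is a minimal sketch of a generic temperature-softened distillation loss (KL divergence from teacher to student), which is a common formulation and an assumption here, not the paper's stated loss. The names `softmax`, `distillation_loss`, and the `temperature` value are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic formulation assumed for illustration; the page does not
    state which loss the paper actually uses.
    """
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]
    print(distillation_loss([1.0, 2.0, 3.0], teacher))  # matching student
    print(distillation_loss([3.0, 2.0, 1.0], teacher))  # mismatched student
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is what lets a teacher signal steer the student toward the features the teacher considers discriminative.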

Taxonomy

Topics: Speech Recognition and Synthesis · Domain Adaptation and Few-Shot Learning · Text and Document Classification Technologies