R2-Trans: Fine-Grained Visual Categorization with Redundancy Reduction

Yu Wang; Shuo Ye; Shujian Yu; Xinge You

arXiv:2204.10095·cs.CV·April 22, 2022

TL;DR

R2-Trans is an FGVC method that reduces redundancy in the class token and adaptively extracts discriminative regions, improving accuracy on benchmark datasets.

Contribution

The paper proposes a new approach combining adaptive masking and the Information Bottleneck to enhance fine-grained visual categorization performance.

Findings

01  Outperforms state-of-the-art methods on benchmark datasets

02  Effectively reduces redundant information in class tokens

03  Improves discriminative region extraction accuracy

Abstract

Fine-grained visual categorization (FGVC) aims to discriminate between similar subcategories; its main challenge is large intra-class diversity combined with subtle inter-class differences. Existing FGVC methods usually select discriminative regions found by a trained model, which tends to neglect other potentially discriminative information. In addition, the massive interactions among the sequence of image patches in ViT leave the resulting class-token with a lot of redundant information, which may also impact FGVC performance. In this paper, we present a novel approach for FGVC that simultaneously makes use of partial yet sufficient discriminative information in environmental cues and compresses the redundant information in the class-token with respect to the target. Specifically, our model calculates the ratio of high-weight regions in a batch, adaptively adjusts the masking…
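
The abstract is cut off before it says how the masking is adjusted, so the following is only a minimal sketch of one way a batch-level ratio of high-weight regions could drive a mask. It is not the authors' implementation. The function name `adaptive_mask`, the `high` parameter, the definition of "high-weight" (a weight above the uniform level 1/num_patches), and the top-k keep rule are all assumptions made for illustration.

```python
from typing import List


def adaptive_mask(attn: List[List[float]], high: float = 1.0) -> List[List[int]]:
    """Build per-image patch masks from a batch-level ratio (illustrative only).

    attn[i][j] is assumed to be the class-token attention weight that image i
    places on patch j. A patch counts as "high-weight" when its weight exceeds
    `high` times the uniform weight 1 / num_patches (an assumed criterion).
    """
    num_patches = len(attn[0])
    cutoff = high / num_patches

    # Ratio of high-weight patches over the whole batch.
    n_high = sum(1 for row in attn for w in row if w > cutoff)
    ratio = n_high / (len(attn) * num_patches)

    # Keep that fraction of patches in every image (at least one),
    # choosing the patches with the largest weights.
    keep = max(1, round(ratio * num_patches))
    masks = []
    for row in attn:
        ranked = sorted(range(num_patches), key=lambda j: row[j], reverse=True)
        kept = set(ranked[:keep])
        masks.append([1 if j in kept else 0 for j in range(num_patches)])
    return masks


# Hypothetical batch of two images with four patches each.
batch = [[0.60, 0.20, 0.15, 0.05],
         [0.32, 0.35, 0.28, 0.05]]
masks = adaptive_mask(batch)
```

In this sketch the number of kept patches follows the batch statistics rather than a fixed hyperparameter, which is the adaptive aspect the abstract points to. The actual criterion used by R2-Trans may differ.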

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques · Remote-Sensing Image Classification