Fully Convolutional Attention Networks for Fine-Grained Recognition
Xiao Liu, Tian Xia, Jiang Wang, Yi Yang, Feng Zhou, Yuanqing Lin

TL;DR
This paper introduces Fully Convolutional Attention Networks (FCANs), a reinforcement learning-based framework that localizes discriminative regions for fine-grained recognition without requiring detailed part annotations, improving efficiency and accuracy.
Contribution
The work proposes a weakly supervised, fully convolutional attention model for fine-grained recognition. It needs no part annotations, and its fully convolutional design speeds up both training and testing relative to previous methods.
Findings
Achieves state-of-the-art results on four fine-grained datasets.
Requires no expensive part annotations for training.
Speeds up training and testing through a fully convolutional architecture.
Abstract
Fine-grained recognition is challenging due to its subtle local inter-class differences versus large intra-class variations such as poses. A key to address this problem is to localize discriminative parts to extract pose-invariant features. However, ground-truth part annotations can be expensive to acquire. Moreover, it is hard to define parts for many fine-grained classes. This work introduces Fully Convolutional Attention Networks (FCANs), a reinforcement learning framework to optimally glimpse local discriminative regions adaptive to different fine-grained domains. Compared to previous methods, our approach enjoys three advantages: 1) the weakly-supervised reinforcement learning procedure requires no expensive part annotations; 2) the fully-convolutional architecture speeds up both training and testing; 3) the greedy reward strategy accelerates the convergence of the learning. We…
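The attention step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the attention head is a single 1x1 convolution over a C x H x W feature map, that a glimpse location is sampled from a spatial softmax over the resulting score map, and that the sampling policy is updated with a plain REINFORCE gradient. The abstract does not define the "greedy reward", so `immediate_rewards` takes one possible reading (a reward granted right after each glimpse instead of only at the end of the sequence); all function names here are hypothetical.

```python
import math
import random


def attention_scores(features, weights):
    """Apply a 1x1 convolution to a C x H x W feature map, giving an H x W score map.

    Every location is scored in one pass over the shared feature map, which is
    the property a fully convolutional attention head relies on.
    """
    channels, height, width = len(features), len(features[0]), len(features[0][0])
    return [[sum(weights[c] * features[c][i][j] for c in range(channels))
             for j in range(width)]
            for i in range(height)]


def spatial_softmax(scores):
    """Turn the score map into a probability distribution over locations."""
    peak = max(max(row) for row in scores)  # subtract the max for numerical stability
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


def sample_location(probs, rng):
    """Sample one (row, col) glimpse location from the attention distribution."""
    width = len(probs[0])
    flat = [p for row in probs for p in row]
    index = rng.choices(range(len(flat)), weights=flat, k=1)[0]
    return divmod(index, width)


def reinforce_grad_scores(probs, location, reward, baseline=0.0):
    """Gradient of (reward - baseline) * log p(location) w.r.t. the score map.

    For a softmax policy, d log p(location) / d scores = one_hot(location) - probs.
    """
    advantage = reward - baseline
    return [[advantage * ((1.0 if (i, j) == location else 0.0) - p)
             for j, p in enumerate(row)]
            for i, row in enumerate(probs)]


def immediate_rewards(predictions, label):
    """ASSUMED reward scheme: 1.0 after each glimpse whose prediction is correct."""
    return [1.0 if p == label else 0.0 for p in predictions]
```

Under these assumptions, one training step computes the score map, samples a location, crops that region for the classifier, and then moves the scores along `reinforce_grad_scores` so that locations leading to correct predictions become more likely. Because the gradient is scaled by the reward, a glimpse that earns zero reward (with a zero baseline) leaves the scores unchanged.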
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Human Pose and Action Recognition
