Multi-View Active Fine-Grained Recognition
Ruoyi Du, Wenqing Yu, Heqing Wang, Dongliang Chang, Ting-En Lin, Yongbin Li, Zhanyu Ma

TL;DR
This paper introduces active view selection for fine-grained recognition, demonstrating that selecting key perspectives improves recognition accuracy efficiently, especially in real-world scenarios with multiple viewpoints.
Contribution
It proposes a novel active recognition framework using policy gradients, along with a new multi-view vehicle dataset and analysis of perspective importance.
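To make the policy-gradient idea concrete, here is a minimal toy sketch of a REINFORCE-style update that learns which view to look at. It is not the paper's framework: the function names, the `view_accuracy` values, and the bandit-style setup (one view per episode, reward 1 when a hypothetical classifier is correct from that view) are all assumptions chosen for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_view_policy(view_accuracy, steps=3000, lr=0.1, seed=0):
    """Learn a softmax policy over views with REINFORCE (toy setup).

    view_accuracy[k] is an ASSUMED probability that a classifier is
    correct when shown view k; it stands in for a real recognizer.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(view_accuracy)
    baseline = 0.0  # running mean of rewards, used to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        # Sample one view from the current policy.
        view = rng.choices(range(len(probs)), weights=probs)[0]
        # Reward is 1 if the (simulated) classification is correct.
        reward = 1.0 if rng.random() < view_accuracy[view] else 0.0
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # Gradient of log pi(view) w.r.t. logit k is 1[k == view] - probs[k].
        for k in range(len(logits)):
            grad = (1.0 if k == view else 0.0) - probs[k]
            logits[k] += lr * advantage * grad
    return softmax(logits)


if __name__ == "__main__":
    # Hypothetical accuracies for three views, e.g. front, rear, side.
    print(train_view_policy([0.2, 0.9, 0.3]))
```

In this toy setting the policy concentrates on the view with the highest assumed accuracy. A realistic system would condition the policy on the images seen so far and select a sequence of views rather than a single one.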
Findings
Active view selection enhances recognition accuracy.
Different categories rely on different discriminative perspectives.
The proposed method outperforms previous FGVC approaches in efficiency and accuracy.
Abstract
Decades of work on fine-grained visual classification (FGVC) have exposed a key direction: finding discriminative local regions and revealing subtle differences. However, unlike identifying visual content within static images, recognizing objects in the real physical world involves discriminative information that is present not only within seen local regions but also in other, unseen perspectives. In other words, in addition to focusing on the distinguishable part of the whole, efficient and accurate recognition requires inferring the key perspective with a few glances; e.g., people may recognize a "Benz AMG GT" with a glance at its front and then know that looking at its exhaust pipe can help tell which year's model it is. In this paper, back to reality, we put forward the problem of active fine-grained recognition (AFGR) and…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Visual Attention and Saliency Detection
