TL;DR
This paper introduces a simple two-stream framework with a multi-class attentional region module for multi-label image recognition. It reports state-of-the-art results at low computational cost, without modeling label dependencies.
Contribution
The paper proposes a multi-class attentional region module that keeps the number of attentional regions small while keeping those regions diverse, which improves multi-label recognition performance.
Findings
New state-of-the-art results on three multi-label image classification benchmarks.
Effective recognition at low computational cost.
Robust performance across different settings.
Abstract
Multi-label image recognition is a practical and challenging task compared to single-label image classification. However, previous works may be suboptimal because of a great number of object proposals or complex attentional region generation modules. In this paper, we propose a simple but efficient two-stream framework to recognize multi-category objects from global image to local regions, similar to how human beings perceive objects. To bridge the gap between global and local streams, we propose a multi-class attentional region module which aims to make the number of attentional regions as small as possible and keep the diversity of these regions as high as possible. Our method can efficiently and effectively recognize multi-class objects with an affordable computation cost and a parameter-free region localization module. Over three benchmarks on multi-label image classification, we…
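As a rough illustration of what a parameter-free region localization step could look like, the Python sketch below picks the top-k classes from the global stream's scores and draws the tightest box around the strongly activated cells of each class's activation map. The function names, the relative threshold, and the box rule are assumptions made for this example. The abstract does not specify them, and this is not the authors' implementation.

```python
from typing import Dict, List, Tuple

# (top, left, bottom, right), all inclusive grid indices.
Box = Tuple[int, int, int, int]


def top_k_classes(scores: List[float], k: int) -> List[int]:
    """Indices of the k highest global-stream class scores (hypothetical helper)."""
    return sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)[:k]


def localize(act_map: List[List[float]], rel_threshold: float = 0.5) -> Box:
    """Tightest box around cells with activation >= rel_threshold * peak.

    Assumes non-negative activations. No learned parameters are involved;
    rel_threshold is a fixed hyperparameter chosen for this sketch.
    """
    height, width = len(act_map), len(act_map[0])
    peak = max(max(row) for row in act_map)
    cutoff = rel_threshold * peak
    rows = [i for i, row in enumerate(act_map) if any(v >= cutoff for v in row)]
    cols = [j for j in range(width) if any(row[j] >= cutoff for row in act_map)]
    if not rows:  # nothing passes the cutoff: fall back to the whole map
        return (0, 0, height - 1, width - 1)
    return (rows[0], cols[0], rows[-1], cols[-1])


def attentional_regions(
    scores: List[float],
    act_maps: List[List[List[float]]],
    k: int = 2,
    rel_threshold: float = 0.5,
) -> Dict[int, Box]:
    """One region per top-k class, so the region count stays small (k)
    while each region comes from a different class."""
    return {c: localize(act_maps[c], rel_threshold) for c in top_k_classes(scores, k)}


# Toy input: 3 classes, 4x4 activation maps.
scores = [0.1, 0.9, 0.7]
act_maps = [
    [[0.2] * 4 for _ in range(4)],
    [[0.1, 0.1, 0.1, 0.1], [0.1, 1.0, 0.6, 0.1], [0.1, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.1]],
    [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.8], [0.0, 0.0, 0.0, 0.5]],
]
regions = attentional_regions(scores, act_maps, k=2)
# regions == {1: (1, 1, 1, 2), 2: (2, 3, 3, 3)}
```

In a two-stream setup of the kind the abstract describes, each box would then be cropped from the input image, resized, and passed to the local stream. How the global and local predictions are combined is not stated in the text above, so it is left out of the sketch.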
