Cascade one-vs-rest detection network for fine-grained recognition without part annotations
Long Chen, Junyu Dong, ShengKe Wang, Kin-Man Lam, Muwei Jian, Hua Zhang, XiaoChun Cao

TL;DR
This paper introduces a cascaded deep CNN detection framework for fine-grained recognition that detects whole objects without part annotations, using a one-vs-rest loss to improve category distinction, achieving competitive results.
Contribution
The novel cascaded detection framework eliminates the need for part annotations and employs a one-vs-rest loss to enhance fine-grained recognition performance.
Findings
Achieves comparable performance to state-of-the-art part-based methods.
Outperforms many part-based methods without requiring part annotations.
Effective in recognizing fine-grained categories without part supervision.
Abstract
Fine-grained recognition is a challenging task due to the small inter-category variances. Most top-performing fine-grained recognition methods leverage parts of objects for better performance, and therefore require part annotations, which are extremely expensive to obtain. In this paper, we propose a novel cascaded deep CNN detection framework for fine-grained recognition which is trained to detect the whole object without considering parts. However, most current top-performing detection networks use the N+1 class (N object categories plus background) softmax loss, and the background category, with far more training samples, dominates the feature learning process, so the features are not well suited to object categories with fewer samples. To bridge this gap, we introduce a cascaded structure to eliminate background and exploit a one-vs-rest loss to capture more minute…
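To make the contrast in the abstract concrete, the sketch below compares an N+1-way softmax loss with a one-vs-rest loss over raw class scores. It is an illustration only: the excerpt does not give the paper's exact loss, so the one-vs-rest version here assumes the common formulation of one independent sigmoid binary cross-entropy per category. The function names and example scores are hypothetical.

```python
import math


def softmax_cross_entropy(logits, target):
    """Softmax loss: all classes (e.g. N objects + background) share one
    probability mass, so raising one class's score lowers the others."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def one_vs_rest_loss(logits, target):
    """Assumed one-vs-rest form: a sum of independent binary logistic
    losses, one 'this category vs. everything else' problem per class."""
    total = 0.0
    for k, x in enumerate(logits):
        y = 1.0 if k == target else 0.0
        # Numerically stable binary cross-entropy on a raw score x:
        # max(x, 0) - x*y + log(1 + exp(-|x|))
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total


if __name__ == "__main__":
    # Hypothetical scores for three fine-grained categories; the true
    # class is index 0 and index 1 is a visually similar category.
    scores = [2.0, 1.5, -1.0]
    print("softmax loss:    ", softmax_cross_entropy(scores, 0))
    print("one-vs-rest loss:", one_vs_rest_loss(scores, 0))
```

In this assumed form, each category gets its own binary decision, so a high score for a similar-looking wrong category is penalized directly rather than only through the shared softmax normalization. With background already removed by an earlier cascade stage, as the abstract describes, only the N object categories would need to be scored.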
Taxonomy
Topics: Advanced Neural Network Applications · Remote Sensing and LiDAR Applications · Animal Vocal Communication and Behavior
Methods: Softmax
