Aligned to the Object, not to the Image: A Unified Pose-aligned Representation for Fine-grained Recognition
Pei Guo, Ryan Farrell

TL;DR
This paper introduces a pose-aligned hierarchical object representation that improves fine-grained recognition by explicitly modeling object pose, outperforming existing methods on standard datasets.
Contribution
It proposes a unified pose-aligned representation and an algorithm for pose estimation and feature extraction, enhancing recognition accuracy over prior approaches.
Findings
Improved accuracy over prior approaches by nearly 2% on the CUB-200 dataset.
Surpassed the state of the art by over 8% on the NABirds dataset.
Demonstrated the importance of disentangling pose and appearance.
Abstract
Dramatic appearance variation due to pose constitutes a great challenge in fine-grained recognition, one which recent methods using attention mechanisms or second-order statistics fail to adequately address. Modern CNNs typically lack an explicit understanding of object pose and are instead confused by entangled pose and appearance. In this paper, we propose a unified object representation built from a hierarchy of pose-aligned regions. Rather than representing an object by regions aligned to image axes, the proposed representation characterizes appearance relative to the object's pose using pose-aligned patches whose features are robust to variations in pose, scale and rotation. We propose an algorithm that performs pose estimation and forms the unified object representation as the concatenation of hierarchical pose-aligned region features, which is then fed into a classification…
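The idea of a patch aligned to the object rather than to the image axes can be illustrated with a small sketch. This is not the authors' implementation: it assumes that each patch is defined by a pair of already-estimated keypoints, uses nearest-neighbour sampling, and substitutes raw pixel values for CNN features. The function names, the part names ("beak", "tail", "crown"), and the parameters `out_size` and `width_ratio` are all hypothetical.

```python
import numpy as np


def pose_aligned_patch(image, kp_a, kp_b, out_size=8, width_ratio=0.5):
    """Sample an out_size x out_size patch whose horizontal axis runs from kp_a to kp_b.

    Keypoints are (x, y) pixel coordinates. Nearest-neighbour sampling keeps the
    sketch short; a real pipeline would more likely use bilinear interpolation.
    """
    a = np.asarray(kp_a, dtype=float)
    b = np.asarray(kp_b, dtype=float)
    axis = b - a
    length = np.linalg.norm(axis)
    if length == 0:
        raise ValueError("keypoints must be distinct")
    u = axis / length               # unit vector along the keypoint pair
    v = np.array([-u[1], u[0]])     # unit vector perpendicular to it

    # Patch grid in object coordinates: s runs along the pair, t across it.
    s = np.linspace(0.0, 1.0, out_size) * length
    t = np.linspace(-0.5, 0.5, out_size) * length * width_ratio
    xs = a[0] + s[None, :] * u[0] + t[:, None] * v[0]
    ys = a[1] + s[None, :] * u[1] + t[:, None] * v[1]

    # Round to the nearest pixel and clamp to the image bounds.
    h, w = image.shape[:2]
    xi = np.clip(np.rint(xs).astype(int), 0, w - 1)
    yi = np.clip(np.rint(ys).astype(int), 0, h - 1)
    return image[yi, xi]


def unified_representation(image, keypoints, pairs, out_size=8):
    """Concatenate flattened pose-aligned patches, one per keypoint pair.

    Raw pixels stand in for the per-region features (an assumption of this sketch).
    """
    feats = [
        pose_aligned_patch(image, keypoints[i], keypoints[j], out_size).ravel()
        for i, j in pairs
    ]
    return np.concatenate(feats)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.integers(0, 256, size=(12, 12))
    kps = {"beak": (2, 5), "tail": (9, 5), "crown": (4, 2)}  # hypothetical parts
    rep = unified_representation(img, kps, [("beak", "tail"), ("crown", "tail")])
    print(rep.shape)  # (128,)
```

Because the sampling grid is expressed relative to the keypoint pair, rotating the image together with its keypoints leaves the extracted patch unchanged, and the patch scales with the distance between the keypoints. That is the sense in which such a patch is aligned to the object and not to the image.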
Taxonomy
Topics: Advanced Neural Network Applications · Image Processing and 3D Reconstruction · Medical Imaging and Analysis
