Drawing Attention to Detail: Pose Alignment through Self-Attention for Fine-Grained Object Classification
Salwa Al Khatib, Mohamed El Amine Boudjoghra, Jameel Hassan

TL;DR
This paper introduces an end-to-end trainable, self-attention-based parts alignment module for fine-grained object classification, which improves the handling of intra-class variations by learning optimal part arrangements.
Contribution
It replaces the graph-matching component in P2P-Net with a self-attention mechanism, enabling more effective and trainable parts alignment for fine-grained classification.
Findings
Improved accuracy in fine-grained classification tasks.
Effective learning of part arrangements through self-attention.
Enhanced invariance to viewpoint and intra-class variations.
Abstract
Intra-class variations in the open world pose various challenges for classification tasks. Fine-grained classification was introduced to overcome these challenges, and many approaches have been proposed. Some rely on locating and using distinguishable local parts within images to achieve invariance to viewpoint changes, intra-class differences, and local part deformations. Our approach, inspired by P2P-Net, offers an end-to-end trainable attention-based parts alignment module in which a self-attention mechanism replaces P2P-Net's graph-matching component. The attention module learns the optimal arrangement of parts as the parts attend to each other, before contributing to the global loss.
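To make the idea of parts "attending to each other" concrete, the following is a minimal sketch of generic single-head scaled dot-product self-attention applied to a small set of part feature vectors. It is not the authors' module: the function names, the number of parts, the feature dimension, and the random projection matrices are all assumptions chosen for illustration, and plain Python lists are used so the example has no dependencies.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(s - m) for s in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(parts, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over part features.

    parts: N x d list of part feature vectors (hypothetical inputs).
    w_q, w_k, w_v: d x d projection matrices (random here, learned in practice).
    Returns (attended parts as N x d, attention weights as N x N).
    """
    q, k, v = matmul(parts, w_q), matmul(parts, w_k), matmul(parts, w_v)
    d = len(w_q[0])  # key dimension used for scaling
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]  # each part's weights over all parts
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Matrix with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
num_parts, dim = 4, 8  # assumed sizes, not taken from the paper
parts = random_matrix(num_parts, dim, rng)
w_q, w_k, w_v = (random_matrix(dim, dim, rng) for _ in range(3))
attended, weights = self_attention(parts, w_q, w_k, w_v)
```

Each row of `weights` sums to 1 and describes how strongly one part attends to every part, including itself. In a trainable module the three projection matrices would be parameters optimized through the classification loss rather than random values.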
Taxonomy
Topics: Advanced Neural Network Applications · Image Processing and 3D Reconstruction · Advanced Image and Video Retrieval Techniques
