Attend and Guide (AG-Net): A Keypoints-driven Attention-based Deep Network for Image Recognition

Asish Bera; Zachary Wharton; Yonghuai Liu; Nik Bessis; Ardhendu Behera

arXiv:2110.12183·cs.CV·October 26, 2021

TL;DR

This paper introduces AG-Net, a novel keypoints-driven attention mechanism integrated into CNNs, significantly improving fine-grained image recognition by automatically identifying semantic regions without manual annotations.

Contribution

The paper proposes an end-to-end CNN model with a new attention mechanism based on keypoints, enhancing recognition of subtle image details without manual region annotations.

Findings

1. Outperforms state-of-the-art on six benchmark datasets.
2. Effectively captures semantic regions and spatial structures.
3. Improves accuracy in fine-grained image recognition tasks.

Abstract

This paper presents a novel keypoints-based attention mechanism for visual recognition in still images. Deep Convolutional Neural Networks (CNNs) have shown great success in recognizing images from visually distinct classes, but their performance in discriminating fine-grained changes is not at the same level. We address this by proposing an end-to-end CNN model that learns meaningful features linking fine-grained changes using our novel attention mechanism. The mechanism captures the spatial structures in images by identifying semantic regions (SRs) and their spatial distributions, which proves to be key to modelling subtle changes in images. We automatically identify these SRs by grouping the detected keypoints in a given image. The "usefulness" of these SRs for image recognition is measured using our innovative attention mechanism, which focuses on parts of the image that are most relevant to a…
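The two steps the abstract describes, grouping detected keypoints into semantic regions and then weighting those regions by usefulness, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say which grouping algorithm or attention form AG-Net uses, so plain k-means and dot-product softmax attention stand in, and the names `group_keypoints` and `attend_regions` are hypothetical. The region features are random placeholders for CNN features pooled over each region.

```python
import numpy as np


def group_keypoints(keypoints, num_regions, num_iters=20, seed=0):
    """Cluster (x, y) keypoints into regions with plain k-means (assumed stand-in)."""
    rng = np.random.default_rng(seed)
    # Initialise centres from randomly chosen distinct keypoints.
    idx = rng.choice(len(keypoints), size=num_regions, replace=False)
    centres = keypoints[idx].astype(float)
    for _ in range(num_iters):
        # Distance from every keypoint to every centre: shape (N, K).
        dists = np.linalg.norm(keypoints[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(num_regions):
            members = keypoints[labels == k]
            if len(members) > 0:  # keep the old centre if a cluster empties
                centres[k] = members.mean(axis=0)
    # Final assignment so labels agree with the returned centres.
    dists = np.linalg.norm(keypoints[:, None, :] - centres[None, :, :], axis=2)
    return dists.argmin(axis=1), centres


def attend_regions(region_features, query):
    """Softmax attention over regions (assumed form): weights and pooled feature."""
    scores = region_features @ query / np.sqrt(region_features.shape[1])
    scores = scores - scores.max()  # subtract the max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights, weights @ region_features


# Synthetic keypoints scattered around three image locations.
rng = np.random.default_rng(0)
keypoints = np.vstack(
    [rng.normal(loc=c, scale=2.0, size=(30, 2)) for c in [(20, 20), (60, 25), (40, 70)]]
)
labels, centres = group_keypoints(keypoints, num_regions=3)

# Placeholder per-region features and query vector (random, for illustration only).
region_features = rng.normal(size=(3, 8))
query = rng.normal(size=8)
weights, pooled = attend_regions(region_features, query)
```

In this sketch `weights` plays the role of a per-region usefulness score and `pooled` is the attention-weighted summary that a classifier head could consume. In an end-to-end model the query and region features would be learned rather than sampled.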

Code & Models

Repositories

DanielKovach/AG-Net
pytorch
