Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-grained Image Recognition

Heliang Zheng; Jianlong Fu; Zheng-Jun Zha; Jiebo Luo

arXiv:1903.06150 · cs.CV · June 12, 2019 · 38 cites

TL;DR

This paper introduces TASN, a novel network that efficiently learns fine-grained image features by sampling and distilling part details using trilinear attention, outperforming existing methods in accuracy.

Contribution

The paper proposes TASN, a trilinear attention sampling network that captures detailed features from numerous parts efficiently for fine-grained recognition.

Findings

1. TASN achieves state-of-the-art performance on multiple datasets.
2. It effectively models inter-channel relationships for attention.
3. The approach reduces computational costs compared to existing methods.

Abstract

Learning subtle yet discriminative features (e.g., beak and eyes for a bird) plays a significant role in fine-grained image recognition. Existing attention-based approaches localize and amplify significant parts to learn fine-grained details, which often suffer from a limited number of parts and heavy computational cost. In this paper, we propose to learn such fine-grained features from hundreds of part proposals by Trilinear Attention Sampling Network (TASN) in an efficient teacher-student manner. Specifically, TASN consists of 1) a trilinear attention module, which generates attention maps by modeling the inter-channel relationships, 2) an attention-based sampler which highlights attended parts with high resolution, and 3) a feature distiller, which distills part features into a global one by weight sharing and feature preserving strategies. Extensive experiments verify that TASN…
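
To make the first component more concrete, here is a minimal pure-Python sketch of how attention maps might be derived from inter-channel relationships. It assumes the feature map has been flattened to a c × (h·w) matrix X and computes a trilinear product of the form N(N(X) Xᵀ) X, where N is a softmax. The helper names, the toy input, and the exact placement of the normalization are illustrative assumptions, not taken from the official researchmm/tasn code.

```python
import math

def softmax(row):
    # Numerically stable softmax over one list of floats.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    # a is n x k, b is k x m; returns n x m.
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def trilinear_attention(x):
    """x: c x (h*w) feature matrix as a list of lists.

    Returns c attention maps, each of length h*w (hypothetical helper).
    """
    # Assumption: normalize each channel over spatial positions first.
    x_norm = [softmax(row) for row in x]
    # c x c matrix of inter-channel relationships.
    relation = matmul(x_norm, transpose(x))
    # Assumption: normalize each channel's relationship vector.
    relation = [softmax(row) for row in relation]
    # Each output map is a relation-weighted mix of the input channels.
    return matmul(relation, x)

# Toy input: 3 channels over a flattened 2x2 grid.
features = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 2.0],
    [0.9, 0.1, 0.0, 0.0],
]
maps = trilinear_attention(features)
```

Because each row of the relationship matrix is softmax-normalized, every output map in this sketch is a convex combination of the input channels, so channels that respond to similar locations reinforce one another. In a full pipeline these maps would then drive the attention-based sampler; that step is not shown here.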

Code & Models

Repositories

researchmm/tasn (MXNet, official)

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning