Fine-grained Visual-textual Representation Learning

Xiangteng He; Yuxin Peng

arXiv:1709.00340 · cs.CV · February 21, 2019

TL;DR

This paper introduces a fine-grained visual-textual representation learning method that jointly models visual and textual data with GANs to discover discriminative parts automatically, improving categorization accuracy.

Contribution

The paper proposes an approach that uses textual attention to guide visual part discovery and combines visual and textual features to improve fine-grained categorization.
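As a rough illustration of what combining the two modalities can look like at prediction time, the sketch below averages per-class probabilities from a visual stream and a textual stream. This is a generic late-fusion scheme chosen for illustration, not the paper's actual method; the function names and the `visual_weight` parameter are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw per-class scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(visual_scores, textual_scores, visual_weight=0.5):
    """Weighted average of per-class probabilities from the two streams.

    visual_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    if len(visual_scores) != len(textual_scores):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= visual_weight <= 1.0:
        raise ValueError("visual_weight must lie in [0, 1]")
    p_visual = softmax(visual_scores)
    p_textual = softmax(textual_scores)
    return [
        visual_weight * v + (1.0 - visual_weight) * t
        for v, t in zip(p_visual, p_textual)
    ]


def predict(visual_scores, textual_scores, visual_weight=0.5):
    """Return the index of the subcategory with the highest fused probability."""
    fused = fuse_scores(visual_scores, textual_scores, visual_weight)
    return max(range(len(fused)), key=fused.__getitem__)


if __name__ == "__main__":
    # Made-up scores for three subcategories: the visual stream is nearly
    # tied between classes 0 and 1, while the textual stream favors class 1.
    visual = [2.0, 1.9, 0.1]
    textual = [0.2, 3.0, 0.1]
    print(predict(visual, textual))
```

With these made-up scores, the textual stream breaks the near-tie in the visual stream, which is the kind of complementary signal the summary attributes to joint visual-textual modeling.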

Findings

01 Automatically discovers discriminative parts using GANs

02 Joint visual-textual modeling improves categorization accuracy

03 Enhances fine-grained recognition performance

Abstract

Fine-grained visual categorization aims to recognize hundreds of subcategories belonging to the same basic-level category, a highly challenging task because the visual distinctions among similar subcategories are subtle and local. Most existing methods learn part detectors to discover discriminative regions for better categorization performance. However, not all parts are beneficial or indispensable for visual categorization, and the number of part detectors must be set using prior knowledge and experimental validation. When we describe the object in an image with text, we mainly focus on its pivotal characteristics and rarely pay attention to common characteristics or background areas. This is an involuntary transfer from human visual attention to textual attention, which leads to the fact that textual…

Code & Models

Repositories

PKU-ICST-MIPL/OPAM_TIP2018