Semantic Representation and Dependency Learning for Multi-Label Image Recognition

Tao Pu; Mingzhan Sun; Hefeng Wu; Tianshui Chen; Ling Tian; Liang Lin

arXiv:2204.03795·cs.CV·January 10, 2023

Open Access

TL;DR

This paper introduces a novel semantic representation and dependency learning framework for multi-label image recognition that avoids reliance on pre-trained object detection models and improves recognition of rare categories.

Contribution

The proposed SRDL framework learns category-specific semantic features and dependencies without pre-trained detection models, enhancing multi-label recognition performance.

Findings

01 Outperforms state-of-the-art methods on the MS-COCO and Pascal VOC 2007 datasets.

02 Effectively captures semantic dependencies among categories.

03 Improves recognition accuracy for rare categories.

Abstract

Recently, many multi-label image recognition (MLR) works have made significant progress by introducing pre-trained object detection models to generate large numbers of proposals or by utilizing statistical label co-occurrence to enhance the correlation among different categories. However, these works have some limitations: (1) the effectiveness of the network depends significantly on pre-trained object detection models, which incur expensive and often unaffordable computation; (2) network performance degrades when objects that co-occur only occasionally appear in images, especially for rare categories. To address these problems, we propose a novel and effective semantic representation and dependency learning (SRDL) framework to learn a category-specific semantic representation for each category and capture semantic dependency among all categories. Specifically, we design a category-specific attentional…
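The "statistical label co-occurrence" that the abstract attributes to prior work can be illustrated with a minimal sketch. This is not the SRDL method itself. It only shows, under assumed toy data, how a conditional co-occurrence matrix might be estimated from multi-hot labels, and why the estimate becomes unreliable for rare categories. The function name `conditional_cooccurrence` and the three-category label matrix are hypothetical.

```python
from typing import List


def conditional_cooccurrence(labels: List[List[int]]) -> List[List[float]]:
    """Estimate P(category j present | category i present) from multi-hot labels.

    labels[n][c] is 1 if image n contains category c, else 0.
    """
    num_classes = len(labels[0])
    counts = [[0] * num_classes for _ in range(num_classes)]

    # Count, for every pair (i, j), the images that contain both categories.
    for row in labels:
        present = [c for c, flag in enumerate(row) if flag]
        for i in present:
            for j in present:
                counts[i][j] += 1

    probs = []
    for i in range(num_classes):
        occurrences = counts[i][i]  # number of images containing category i
        probs.append([
            counts[i][j] / occurrences if occurrences else 0.0
            for j in range(num_classes)
        ])
    return probs


# Hypothetical toy data: 4 images, 3 categories (category 2 is rare).
toy_labels = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
]
cooc = conditional_cooccurrence(toy_labels)
```

In this toy set, category 2 appears in a single image alongside category 1, so the estimated probability of category 1 given category 2 is 1.0. A statistic built on one sample like this is the kind of occasional co-occurrence that the abstract says degrades performance for rare categories.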

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Text and Document Classification Technologies · Advanced Image and Video Retrieval Techniques