Improving Long-tailed Object Detection with Image-Level Supervision by Multi-Task Collaborative Learning

Bo Li; Yongqiang Yao; Jingru Tan; Xin Lu; Fengwei Yu; Ye Luo; Jianwei Lu

arXiv:2210.05568·cs.CV·October 12, 2022·1 cites

TL;DR

This paper introduces CLIS, a multi-task collaborative learning framework that uses image-level supervision to improve long-tailed object detection, especially for tail categories, and reports state-of-the-art results on the LVIS dataset.

Contribution

Proposes a novel multi-task collaborative learning framework that effectively utilizes image-level supervision to enhance tail category detection in long-tailed datasets.

Findings

01 Achieves 31.1 AP on the LVIS dataset, surpassing previous methods.

02 Improves tail category AP by 10.1 points.

03 Demonstrates effectiveness without complex loss engineering.

Abstract

Data in real-world object detection often exhibits a long-tailed distribution. Existing solutions tackle this problem by mitigating the competition between the head and tail categories. However, due to the scarcity of training samples, tail categories are still unable to learn discriminative representations. Bringing more data into training may alleviate the problem, but collecting instance-level annotations is an excruciating task. In contrast, image-level annotations are easily accessible but not fully exploited. In this paper, we propose a novel framework, CLIS (multi-task Collaborative Learning with Image-level Supervision), which leverages image-level supervision to enhance the detection ability in a multi-task collaborative way. Specifically, there are an object detection task (consisting of an instance-classification task and a localization task) and an image-classification…
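
The task layout named in the abstract (instance classification, localization, and image classification) can be sketched as a weighted sum of three losses. This is a minimal illustration under stated assumptions, not the CLIS implementation: the function names, the choice of binary cross-entropy and smooth L1, and the weight `w_img` are all hypothetical, and the paper's collaborative mechanisms are not shown because the abstract is truncated at this point.

```python
import math


def binary_cross_entropy(probs, labels):
    """Mean binary cross-entropy over per-category probabilities."""
    eps = 1e-12  # keeps log() finite when a probability is exactly 0 or 1
    terms = [
        -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
        for p, y in zip(probs, labels)
    ]
    return sum(terms) / len(terms)


def smooth_l1(pred_box, target_box, beta=1.0):
    """Mean smooth L1 distance between two boxes given as coordinate lists."""
    total = 0.0
    for p, t in zip(pred_box, target_box):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred_box)


def multi_task_loss(inst_probs, inst_labels, pred_box, target_box,
                    img_probs, img_labels, w_img=1.0):
    """Detection loss plus a weighted image-level classification loss.

    w_img is a hypothetical balancing weight chosen for illustration;
    the source does not state how the tasks are combined.
    """
    # Detection task: instance classification plus localization.
    detection = (binary_cross_entropy(inst_probs, inst_labels)
                 + smooth_l1(pred_box, target_box))
    # Image-classification task: multi-label, supervised by image-level tags.
    image_level = binary_cross_entropy(img_probs, img_labels)
    return detection + w_img * image_level
```

In this sketch the image-level term needs only category tags per image, not boxes, which is why image-level annotations are cheaper to bring into training than instance-level ones.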

Code & Models

Repositories

waveboo/clis (Official)

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: Contrastive Learning