DualCoOp: Fast Adaptation to Multi-Label Recognition with Limited Annotations

Ximeng Sun; Ping Hu; Kate Saenko

arXiv:2206.09541 · cs.CV · June 22, 2022 · 42 cites

TL;DR

DualCoOp adds lightweight, learnable positive and negative prompt contexts on top of a pretrained vision-language model, enabling rapid adaptation to multi-label recognition with limited annotations and outperforming existing methods.

Contribution

The paper introduces DualCoOp, a lightweight, unified framework for partial-label and zero-shot multi-label recognition that pairs class-name prompts with the strong text-image alignment of a pretrained vision-language model.

Findings

1. Outperforms state-of-the-art methods on standard benchmarks.
2. Effective in low-label and zero-shot multi-label recognition scenarios.
3. Quick adaptation with minimal additional training overhead.

Abstract

Solving multi-label recognition (MLR) for images in the low-label regime is a challenging task with many real-world applications. Recent work learns an alignment between textual and visual spaces to compensate for insufficient image labels, but loses accuracy because of the limited amount of available MLR annotations. In this work, we utilize the strong alignment of textual and visual features pretrained with millions of auxiliary image-text pairs and propose Dual Context Optimization (DualCoOp) as a unified framework for partial-label MLR and zero-shot MLR. DualCoOp encodes positive and negative contexts with class names as part of the linguistic input (i.e. prompts). Since DualCoOp only introduces a very light learnable overhead upon the pretrained vision-language framework, it can quickly adapt to multi-label recognition tasks that have limited annotations and even unseen classes.…
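
The dual-prompt idea in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the mean-pooling `encode_prompt` is a hypothetical stand-in for a frozen text encoder, the 2-D vectors are made up, and combining the two similarities with a softmax over the (positive, negative) pair is an assumption, since the abstract does not say how the two prompts are combined. The function names are illustrative only.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def encode_prompt(context, class_token):
    """Hypothetical text encoder: mean-pool the context tokens and the class token."""
    tokens = context + [class_token]
    dim = len(class_token)
    return [sum(tok[i] for tok in tokens) / len(tokens) for i in range(dim)]


def dual_prompt_scores(image_feature, class_tokens, pos_context, neg_context):
    """Score each class by comparing its positive prompt against its negative prompt."""
    scores = {}
    for name, token in class_tokens.items():
        pos = cosine(image_feature, encode_prompt(pos_context, token))
        neg = cosine(image_feature, encode_prompt(neg_context, token))
        # Assumption: softmax over the (positive, negative) pair gives P(class present).
        scores[name] = math.exp(pos) / (math.exp(pos) + math.exp(neg))
    return scores


# Made-up 2-D toy inputs. In a real setup only the two contexts would be trained,
# while the image and text encoders stay frozen.
class_tokens = {"dog": [1.0, 0.0], "car": [0.0, 1.0]}
pos_context = [[0.0, 0.0]]
neg_context = [[-2.0, -2.0]]
image_feature = [1.0, 0.1]

scores = dual_prompt_scores(image_feature, class_tokens, pos_context, neg_context)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Because the class name enters only as a token inside each prompt, the same pair of contexts can in principle be reused with a class name never seen during training, which is one way to read the abstract's claim about unseen classes.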

Code & Models

Repositories

sunxm2357/dualcoop
pytorch

Videos

DualCoOp: Fast Adaptation to Multi-Label Recognition with Limited Annotations (SlidesLive)

Taxonomy

Topics: Text and Document Classification Technologies · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques