In-sample Contrastive Learning and Consistent Attention for Weakly Supervised Object Localization

Minsong Ki; Youngjung Uh; Wonyoung Lee; Hyeran Byun

arXiv:2009.12063·cs.CV·September 28, 2020

TL;DR

This paper introduces a weakly supervised object localization method built on a contrastive attention loss and a foreground consistency loss. It treats the background as a cue that keeps feature activation from over-extending beyond the object, which improves localization accuracy.

Contribution

The paper proposes a contrastive attention loss, a foreground consistency loss, and non-local attention blocks to improve object localization accuracy in weakly supervised settings.

Findings

1. Achieves state-of-the-art results on the CUB-200-2011 and ImageNet datasets.

2. Effectively utilizes background cues to guide feature activation.

3. Enhances attention maps with non-local attention blocks.

Abstract

Weakly supervised object localization (WSOL) aims to localize the target object using only image-level supervision. Recent methods encourage the model to activate feature maps over the entire object by dropping the most discriminative parts. However, they are likely to induce excessive extension to the backgrounds, which leads to over-estimated localization. In this paper, we consider the background as an important cue that guides the feature activation to cover the sophisticated object region and propose contrastive attention loss. The loss promotes similarity between the foreground and its dropped version, and dissimilarity between the dropped version and the background. Furthermore, we propose foreground consistency loss that penalizes earlier layers producing noisy attention, regarding the later layer as a reference to provide them with a sense of backgroundness. It guides the early…
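The abstract's description of the contrastive attention loss (pull the dropped foreground toward the foreground, push it away from the background) can be illustrated with a small sketch. This is not the paper's formulation, which the abstract does not give. It assumes a triplet-style hinge with Euclidean distance and a margin, and the names `l2_distance`, `contrastive_attention_loss`, and `margin` are hypothetical, chosen only for illustration.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_attention_loss(foreground, dropped_foreground, background, margin=1.0):
    """Triplet-style hinge loss (an assumed form, not taken from the paper).

    Anchor:   feature of the foreground with its most discriminative part dropped.
    Positive: feature of the full foreground.
    Negative: feature of the background.
    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`.
    """
    d_pos = l2_distance(dropped_foreground, foreground)
    d_neg = l2_distance(dropped_foreground, background)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D features, for illustration only.
fg = [1.0, 0.0]
dropped = [0.9, 0.1]

# Background far from the dropped foreground: the constraint is satisfied.
loss_far = contrastive_attention_loss(fg, dropped, [0.0, 1.0])

# Background as close to the anchor as the foreground is: the loss is positive.
loss_near = contrastive_attention_loss(fg, dropped, [0.8, 0.2])
```

In this sketch a positive loss signals that the dropped foreground is not yet clearly separated from the background, which is the situation the abstract associates with activation spreading into background regions.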

Code & Models

Repositories

MinSongKi/InCA.github.io
