One-Shot Object Detection with Co-Attention and Co-Excitation
Ting-I Hsieh, Yi-Chen Lo, Hwann-Tzong Chen, Tyng-Luh Liu

TL;DR
This paper introduces a novel co-attention and co-excitation framework for one-shot object detection, enabling the detection of unseen classes by leveraging non-local operations, adaptive feature emphasis, and a ranking loss.
Contribution
It proposes a new framework combining co-attention, co-excitation, and a ranking loss for effective one-shot object detection of unseen classes.
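To make the ranking-loss ingredient concrete, here is a minimal sketch of a generic pairwise margin-based ranking (hinge) loss in NumPy. It is an illustration under assumptions, not the paper's exact formulation: the function name `margin_ranking_loss`, the default margin of 0.3, and the all-pairs averaging are hypothetical choices made for this example.

```python
import numpy as np

def margin_ranking_loss(pos_scores, neg_scores, margin=0.3):
    """Generic pairwise hinge ranking loss (illustrative, not the paper's exact loss).

    pos_scores: similarity scores of proposals that match the query class.
    neg_scores: similarity scores of proposals that do not match.
    Every positive should outscore every negative by at least `margin`.
    """
    pos = np.asarray(pos_scores, dtype=float)[:, None]  # shape (P, 1)
    neg = np.asarray(neg_scores, dtype=float)[None, :]  # shape (1, N)
    if pos.size == 0 or neg.size == 0:
        return 0.0  # no pairs to rank
    # Penalize each (positive, negative) pair whose score gap is below the margin.
    violations = np.maximum(0.0, margin - (pos - neg))  # shape (P, N)
    return float(violations.mean())
```

The loss is zero once every matching proposal is scored at least `margin` above every non-matching one, which is what lets the scores be used to rank proposals against the query.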
Findings
Achieves strong baseline performance on VOC and MS-COCO datasets.
Effectively detects objects from both seen and unseen classes.
Introduces a novel combination of non-local operations and adaptive feature emphasis.
Abstract
This paper aims to tackle the challenging problem of one-shot object detection. Given a query image patch whose class label is not included in the training data, the goal of the task is to detect all instances of the same class in a target image. To this end, we develop a novel co-attention and co-excitation (CoAE) framework that makes contributions in three key technical aspects. First, we propose to use the non-local operation to explore the co-attention embodied in each query-target pair and yield region proposals accounting for the one-shot situation. Second, we formulate a squeeze-and-co-excitation scheme that can adaptively emphasize correlated feature channels to help uncover relevant proposals and eventually the target objects. Third, we design a margin-based ranking loss for implicitly learning a metric to predict the similarity of a region proposal to the underlying…
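The first two ideas in the abstract can be sketched roughly in NumPy. This is a simplified illustration, not the authors' implementation: the learned 1x1-convolution embeddings of a full non-local block are omitted, the channel gate is assumed to be computed from the query features alone, and the names `non_local_co_attention`, `squeeze_and_co_excitation`, `w1`, and `w2` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def non_local_co_attention(target, query):
    """Cross non-local attention between two feature maps (no learned projections).

    target: (C, Ht, Wt) feature map of the target image.
    query:  (C, Hq, Wq) feature map of the query patch.
    Each map is enriched with features gathered from the other one.
    """
    C = target.shape[0]
    t = target.reshape(C, -1)                   # (C, Nt)
    q = query.reshape(C, -1)                    # (C, Nq)
    affinity = t.T @ q                          # (Nt, Nq) pairwise similarities
    t_att = softmax(affinity, axis=1) @ q.T     # (Nt, C): target attends to query
    q_att = softmax(affinity.T, axis=1) @ t.T   # (Nq, C): query attends to target
    # Residual connections keep the original features.
    target_out = target + t_att.T.reshape(target.shape)
    query_out = query + q_att.T.reshape(query.shape)
    return target_out, query_out

def squeeze_and_co_excitation(target, query, w1, w2):
    """Reweight the channels of both maps with one shared gate.

    Assumption: the gate is squeezed from the query only.
    w1: (C // r, C) and w2: (C, C // r) are hypothetical bottleneck weights.
    """
    squeezed = query.mean(axis=(1, 2))             # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)        # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid -> (C,), values in (0, 1)
    return target * gate[:, None, None], query * gate[:, None, None], gate
```

Applying the same gate to both feature maps is what makes the excitation "co": channels that matter for the query are emphasized in the target as well, so proposals computed from the reweighted target features are biased toward the query's class.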
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
Methods: 1x1 Convolution · Non-Local Operation
