Self-reinforcing Unsupervised Matching

Jiang Lu; Lei Li; Changshui Zhang

arXiv:1909.04138 · cs.CV · February 28, 2023

TL;DR

This paper introduces SUM, a self-reinforcing unsupervised matching method that annotates images in new modalities without supervision, leveraging cross-modality matching to improve generalization and facilitate continual learning.

Contribution

The paper proposes a novel unsupervised approach for cross-modality image annotation that requires only one template and no supervision in the emerging modality.

Findings

01. Enables annotation in new modalities without supervision
02. Requires only one template in seen modality
03. Facilitates continual learning in deep models

Abstract

Remarkable gains in deep learning usually rely on tremendous amounts of supervised data. Ensuring modality diversity for a given object in the training set is critical for the generalization of cutting-edge deep models, but it burdens humans with heavy manual labor in data collection and annotation. In addition, some rare or unexpected modalities are new to the current model, causing reduced performance under such emerging modalities. Inspired by achievements in speech recognition, psychology and behavioristics, we present a practical solution, self-reinforcing unsupervised matching (SUM), to annotate images with the 2D structure-preserving property in an emerging modality by cross-modality matching. This approach requires no supervision in the emerging modality and only one template in the seen modality, providing a possible route towards continual learning.

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Speech Recognition and Synthesis · Multimodal Machine Learning Applications