Self-Supervised Pyramid Representation Learning for Multi-Label Visual Analysis and Beyond

Cheng-Yen Hsieh; Chih-Jung Chang; Fu-En Yang; Yu-Chiang Frank Wang

arXiv:2208.14439 · cs.CV · August 31, 2022

TL;DR

This paper introduces SS-PRL, a self-supervised learning framework that learns multi-scale pyramid representations at the patch level, so that pre-trained vision models generalize better to downstream tasks such as classification, detection, and segmentation.

Contribution

The paper proposes a novel self-supervised pyramid representation learning method that captures cross-scale patch correlations, enhancing multi-label visual analysis capabilities.

Findings

01. Effective pre-training for multi-label classification
02. Improved object detection and segmentation performance
03. Robust multi-scale patch representation learning

Abstract

While self-supervised learning has been shown to benefit a number of vision tasks, existing techniques mainly focus on image-level manipulation, which may not generalize well to downstream tasks at patch or pixel levels. Moreover, existing SSL methods might not sufficiently describe and associate the above representations within and across image scales. In this paper, we propose a Self-Supervised Pyramid Representation Learning (SS-PRL) framework. The proposed SS-PRL is designed to derive pyramid representations at patch levels via learning proper prototypes, with additional learners to observe and relate inherent semantic information within an image. In particular, we present a cross-scale patch-level correlation learning in SS-PRL, which allows the model to aggregate and associate information learned across patch scales. We show that, with our proposed SS-PRL for model pre-training,…
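
The abstract does not spell out how the patch pyramid is constructed, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes that each scale divides the image into a regular grid of non-overlapping patches (grid sizes 1, 2, and 4 are an arbitrary choice here), and that "cross-scale" association means relating a coarse patch to the finer patches it spatially contains. The names `split_into_grid`, `build_patch_pyramid`, and `children_of` are hypothetical and do not come from the official repository.

```python
from typing import Dict, List, Sequence

# Single-channel image as an H x W nested list (illustration only).
Image = List[List[float]]


def split_into_grid(image: Image, grid: int) -> List[Image]:
    """Split an image into grid x grid non-overlapping patches (row-major order)."""
    height, width = len(image), len(image[0])
    if height % grid or width % grid:
        raise ValueError("image size must be divisible by the grid size")
    ph, pw = height // grid, width // grid
    patches = []
    for gy in range(grid):
        for gx in range(grid):
            rows = image[gy * ph:(gy + 1) * ph]
            patches.append([row[gx * pw:(gx + 1) * pw] for row in rows])
    return patches


def build_patch_pyramid(image: Image, grids: Sequence[int] = (1, 2, 4)) -> Dict[int, List[Image]]:
    """Map each (assumed) grid size to the list of patches at that scale."""
    return {g: split_into_grid(image, g) for g in grids}


def children_of(parent_index: int, coarse_grid: int, fine_grid: int) -> List[int]:
    """Indices of fine-scale patches that lie inside one coarse-scale patch."""
    if fine_grid % coarse_grid:
        raise ValueError("fine grid must be a multiple of the coarse grid")
    ratio = fine_grid // coarse_grid
    py, px = divmod(parent_index, coarse_grid)
    return [
        (py * ratio + dy) * fine_grid + (px * ratio + dx)
        for dy in range(ratio)
        for dx in range(ratio)
    ]


# Toy 8 x 8 image whose pixel value encodes its position.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
pyramid = build_patch_pyramid(image)
```

In a real pre-training pipeline each patch would be passed through an encoder, and the index mapping from `children_of` is one plausible way to decide which fine-scale features to aggregate when relating them to a coarse-scale patch. How SS-PRL actually forms prototypes and correlates scales is described in the paper and the official repository.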

Code & Models

Repositories

wesleyhsieh0806/ss-prl (PyTorch, official)

Videos

Self-Supervised Pyramid Representation Learning for Multi-Label Visual Analysis and Beyond · YouTube

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Text and Document Classification Technologies