Energy-Inspired Self-Supervised Pretraining for Vision Models

Ze Wang, Jiang Wang, Zicheng Liu, and Qiang Qiu

arXiv:2302.01384·cs.CV·February 6, 2023·1 cites

TL;DR

This paper introduces an energy-based, self-supervised pretraining framework for vision models that unifies data encoding and restoration in a single network, enabling diverse pretext tasks and achieving competitive results with fewer training epochs.

Contribution

It proposes an energy-inspired pretraining method that models energy estimation and data restoration as the forward and backward passes of a single network, eliminating auxiliary components such as an extra decoder and broadening the range of usable pretext tasks.

Findings

01 Achieves performance comparable to or better than state-of-the-art methods.

02 Requires significantly fewer training epochs.

03 Supports various pretext tasks such as masking, denoising, and super-resolution.
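
To make the three pretext tasks named above concrete, here is a minimal sketch of how each corruption might be produced from a clean image. It is illustrative only and is not taken from the paper: the function names (`mask`, `add_noise`, `downsample_upsample`), the per-pixel masking, the Gaussian noise model, and the block-averaging scheme are all assumptions, and images are plain nested lists so the example needs only the standard library.

```python
import random


def mask(image, ratio, rng):
    """Masking: set a random fraction of pixels to zero (hypothetical scheme)."""
    coords = [(r, c) for r in range(len(image)) for c in range(len(image[0]))]
    hidden = set(rng.sample(coords, int(ratio * len(coords))))
    return [
        [0.0 if (r, c) in hidden else v for c, v in enumerate(row)]
        for r, row in enumerate(image)
    ]


def add_noise(image, sigma, rng):
    """Denoising input: add zero-mean Gaussian noise to every pixel."""
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in image]


def downsample_upsample(image, factor):
    """Super-resolution input: replace each factor x factor block by its mean."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r0 in range(0, h, factor):
        for c0 in range(0, w, factor):
            rows = range(r0, min(r0 + factor, h))
            cols = range(c0, min(c0 + factor, w))
            block = [image[r][c] for r in rows for c in cols]
            mean = sum(block) / len(block)
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out


rng = random.Random(0)
# A 4x4 toy "image" with pixel values 1.0 .. 16.0 (no zeros, so masked pixels are easy to count).
image = [[float(r * 4 + c + 1) for c in range(4)] for r in range(4)]

masked = mask(image, 0.5, rng)
noisy = add_noise(image, 0.1, rng)
low_res = downsample_upsample(image, 2)
```

In each case the corrupted image keeps the shape of the original, so a restoration procedure can be applied to all three without changing the model's input size.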

Abstract

Motivated by the fact that forward and backward passes of a deep network naturally form symmetric mappings between input and output representations, we introduce a simple yet effective self-supervised vision model pretraining framework inspired by energy-based models (EBMs). In the proposed framework, we model energy estimation and data restoration as the forward and backward passes of a single network without any auxiliary components, e.g., an extra decoder. For the forward pass, we fit a network to an energy function that assigns low energy scores to samples that belong to an unlabeled dataset, and high energy otherwise. For the backward pass, we restore data from corrupted versions iteratively using gradient-based optimization along the direction of energy minimization. In this way, we naturally fold the encoder-decoder architecture widely used in masked image modeling into the…
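
The restoration step described in the abstract, iterative gradient-based updates of a corrupted input in the direction of lower energy, can be sketched with a toy example. This is not the authors' implementation: the quadratic energy around a hypothetical `prototype` vector stands in for a trained energy network, its gradient is written by hand where a real model would obtain it from the network's backward pass, and `step_size` and `num_steps` are arbitrary illustrative values.

```python
def energy(x, prototype):
    """Toy energy: low near the (hypothetical) data prototype, high elsewhere."""
    return 0.5 * sum((xi - pi) ** 2 for xi, pi in zip(x, prototype))


def energy_grad(x, prototype):
    """Gradient of the toy energy with respect to the input x.

    With a neural energy function this would come from the backward pass.
    """
    return [xi - pi for xi, pi in zip(x, prototype)]


def restore(corrupted, prototype, step_size=0.5, num_steps=20):
    """Iteratively move the input along the negative energy gradient."""
    x = list(corrupted)
    for _ in range(num_steps):
        grad = energy_grad(x, prototype)
        x = [xi - step_size * gi for xi, gi in zip(x, grad)]
    return x


prototype = [0.2, 0.8, 0.5, 0.1]   # stand-in for a sample the energy scores as "data"
corrupted = [0.2, 0.0, 0.0, 0.1]   # same sample with two entries masked to zero
restored = restore(corrupted, prototype)

print(energy(corrupted, prototype), energy(restored, prototype))
```

The update rule is x ← x − step_size · ∇ₓE(x); with this toy energy each step halves the distance to the prototype, so the masked entries are recovered after a handful of iterations.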

Taxonomy

Topics: Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis · Medical Image Segmentation Techniques