Re-parameterizing Your Optimizers rather than Architectures

Xiaohan Ding; Honghao Chen; Xiangyu Zhang; Kaiqi Huang; Jungong Han; Guiguang Ding

arXiv:2205.15242 · cs.LG · February 10, 2023 · 27 citations

TL;DR

This paper introduces RepOptimizers, a method to incorporate model-specific prior knowledge into optimizers through gradient re-parameterization, enabling simple models like VGG to perform competitively with complex architectures.

Contribution

Proposes Gradient Re-parameterization, which embeds model-specific priors into the optimizer by modifying gradients, improving the training efficiency and performance of simple models without extra forward or backward computations.

Findings

01 RepOpt-VGG matches or exceeds the performance of recent well-designed models.

02 RepOptimizers require no extra forward/backward computations.

03 RepOpt-VGG has a simple structure and high inference speed.

Abstract

The well-designed structures in neural networks reflect the prior knowledge incorporated into the models. However, though different models have various priors, we are used to training them with model-agnostic optimizers such as SGD. In this paper, we propose to incorporate model-specific prior knowledge into optimizers by modifying the gradients according to a set of model-specific hyper-parameters. Such a methodology is referred to as Gradient Re-parameterization, and the optimizers are named RepOptimizers. For the extreme simplicity of model structure, we focus on a VGG-style plain model and showcase that such a simple model trained with a RepOptimizer, which is referred to as RepOpt-VGG, performs on par with or better than the recent well-designed models. From a practical perspective, RepOpt-VGG is a favorable base model because of its simple structure, high inference speed and…
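The idea of modifying gradients according to model-specific hyper-parameters can be illustrated with a minimal sketch. It assumes the modification is an element-wise rescaling of each gradient by a fixed multiplier before a plain SGD update; the function name `sgd_step`, the `grad_mults` argument, and the toy numbers are hypothetical and are not taken from the paper or its repository.

```python
from typing import List, Optional, Sequence


def sgd_step(
    weights: Sequence[float],
    grads: Sequence[float],
    lr: float,
    grad_mults: Optional[Sequence[float]] = None,
) -> List[float]:
    """One SGD step; if grad_mults is given, each gradient is rescaled first.

    grad_mults stands in for the model-specific hyper-parameters
    (an assumption of this sketch, not the paper's exact rule).
    """
    if grad_mults is None:
        # Model-agnostic case: every gradient is used unchanged.
        grad_mults = [1.0] * len(weights)
    if not (len(weights) == len(grads) == len(grad_mults)):
        raise ValueError("weights, grads and grad_mults must have equal length")
    # The only change relative to plain SGD is the multiplier m on each gradient.
    return [w - lr * m * g for w, g, m in zip(weights, grads, grad_mults)]


weights = [1.0, 2.0]
grads = [0.5, 0.5]

plain = sgd_step(weights, grads, lr=0.5)
reparam = sgd_step(weights, grads, lr=0.5, grad_mults=[2.0, 0.0])

print(plain)    # [0.75, 1.75]
print(reparam)  # [0.5, 2.0]
```

In this sketch the rescaling happens only inside the update step, so the forward and backward passes are untouched, which is in line with the page's statement that RepOptimizers need no extra forward/backward computations.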

Code & Models

Repositories

dingxiaoh/repoptimizers (PyTorch, official)

Videos

Re-parameterizing Your Optimizers rather than Architectures · SlidesLive

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Stochastic Gradient Descent · Balanced Selection