Towards Flexible Inductive Bias via Progressive Reparameterization Scheduling

Yunsung Lee, Gyuseong Lee, Kwangrok Ryoo, Hyojun Go, Jihye Park, and Seungryong Kim

arXiv:2210.01370 · cs.CV · October 5, 2022

Open Access

TL;DR

This paper introduces Progressive Reparameterization Scheduling (PRS), a method to dynamically adjust inductive biases between convolution and self-attention in models, improving performance across different data scales.

Contribution

The paper proposes a novel reparameterization approach that interpolates inductive biases between CNNs and ViTs, enabling flexible adaptation to data scale variations.

Findings

1. PRS outperforms previous methods on small-scale datasets like CIFAR-100.

2. Fourier analysis reveals how inductive bias effectiveness varies with data scale.

3. Reparameterization accelerates the transition from convolution to self-attention during training.
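
The Fourier analysis mentioned in the findings can be illustrated with a small sketch. This page does not describe the paper's exact procedure, so the following is only one plausible way to summarize the frequency content of a feature map: it measures how much of the spectral amplitude lies above a radial cutoff. The function name `high_frequency_ratio` and the `cutoff` parameter are hypothetical, and the input is assumed to be a single 2D channel.

```python
import numpy as np


def high_frequency_ratio(feature_map: np.ndarray, cutoff: float = 0.5) -> float:
    """Fraction of spectral amplitude above a radial frequency cutoff.

    feature_map: 2D array (H, W), assumed to be one channel of a feature map.
    cutoff: radius as a fraction of the per-axis Nyquist frequency
            (hypothetical parameter, not taken from the paper).
    """
    h, w = feature_map.shape

    # 2D FFT with the zero-frequency component shifted to the center.
    spectrum = np.fft.fftshift(np.fft.fft2(feature_map))
    amplitude = np.abs(spectrum)

    # Frequency coordinates in cycles per pixel, shifted to match the spectrum.
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]

    # Radial frequency, scaled so 1.0 is the Nyquist frequency along one axis.
    radius = np.sqrt(fx**2 + fy**2) / 0.5

    total = amplitude.sum()
    if total == 0:
        return 0.0
    return float(amplitude[radius > cutoff].sum() / total)
```

Under this measure, a constant map has a ratio near 0 and a one-pixel checkerboard has a ratio near 1. Comparing the ratio across layers or across models trained on different data scales is one way to see whether a model behaves more like a low-pass or a high-pass filter.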

Abstract

Two architectures are the de facto standards in recent computer vision: Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). The strong inductive biases of convolutions help a model learn sample-efficiently, but they also limit the upper bound of CNNs when sufficient data are available. In contrast, ViTs are inferior to CNNs on small data but superior when data are sufficient. Recent approaches attempt to combine the strengths of the two architectures. However, by comparing the accuracy of various models on subsets of ImageNet sampled at different ratios, we show that these approaches overlook that the optimal inductive bias also changes with the target data scale. In addition, through Fourier analysis of feature maps, which captures how a model's response patterns change with signal frequency, we observe which inductive bias is advantageous for each data scale.…
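
The data-scale comparison in the abstract relies on subsets of ImageNet sampled at different ratios. As a minimal sketch of how such subsets could be drawn, the helper below keeps roughly the same fraction of every class. Per-class (stratified) sampling is an assumption here, since the abstract does not say how the subsets were built, and the name `stratified_subset` is hypothetical.

```python
import random
from collections import defaultdict


def stratified_subset(labels, ratio, seed=0):
    """Return sorted indices of a subset keeping about `ratio` of each class.

    labels: sequence of class labels, one per sample.
    ratio: fraction of each class to keep, in (0, 1].
    seed: seed for reproducible sampling.
    """
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        # Keep at least one sample per class so no class disappears.
        keep = max(1, round(len(indices) * ratio))
        chosen.extend(rng.sample(indices, keep))
    return sorted(chosen)
```

Calling this with several ratios (for example 0.01, 0.1, and 1.0) and the same seed gives a family of training sets on which different architectures can be compared at each data scale.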

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Advanced Memory and Neural Computing

Methods: Batch Normalization · 1x1 Convolution · Max Pooling · Residual Block · Residual Connection · Kaiming Initialization · Average Pooling · Global Average Pooling · Bottleneck Residual Block