Towards Flexible Inductive Bias via Progressive Reparameterization Scheduling
Yunsung Lee, Gyuseong Lee, Kwangrok Ryoo, Hyojun Go, Jihye Park, and Seungryong Kim

TL;DR
This paper introduces Progressive Reparameterization Scheduling (PRS), a method to dynamically adjust inductive biases between convolution and self-attention in models, improving performance across different data scales.
Contribution
The paper proposes a novel reparameterization approach that interpolates inductive biases between CNNs and ViTs, enabling flexible adaptation to data scale variations.
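As a rough illustration of what interpolating between a convolution-like bias and self-attention can mean, the sketch below blends a locality-based attention map with a content-based one using a scalar weight that decays over training. It is not the paper's implementation: the names `local_attention`, `content_attention`, `mixed_attention`, and `linear_schedule`, the Gaussian locality kernel, the identity query/key projections, and the linear decay are all assumptions made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_attention(grid, sharpness=2.0):
    # Convolution-like attention (assumed form): each patch on a grid x grid
    # layout attends mostly to spatially nearby patches.
    ys, xs = np.meshgrid(np.arange(grid), np.arange(grid), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    dist2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    return softmax(-sharpness * dist2)

def content_attention(tokens):
    # Plain dot-product self-attention weights. Queries and keys are the
    # tokens themselves here (identity projections, a simplification).
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[1])
    return softmax(scores)

def mixed_attention(tokens, grid, alpha):
    # alpha = 1 gives purely local (convolution-like) attention,
    # alpha = 0 gives purely content-based self-attention.
    return alpha * local_attention(grid) + (1.0 - alpha) * content_attention(tokens)

def linear_schedule(step, total_steps):
    # Hypothetical schedule: the locality weight decays linearly from 1 to 0.
    return max(0.0, 1.0 - step / total_steps)

rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 8))  # 16 patches on a 4x4 grid, 8 features each
attn = mixed_attention(tokens, grid=4, alpha=linear_schedule(5, 10))
```

Because both attention maps are row-stochastic and the blend is a convex combination, every row of `attn` still sums to one at any point in the schedule.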
Findings
PRS outperforms previous methods on small-scale datasets like CIFAR-100.
Fourier analysis reveals how inductive bias effectiveness varies with data scale.
Reparameterization accelerates the transition from convolution to self-attention during training.
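One simple way to probe a feature map in the frequency domain, in the spirit of the Fourier analysis finding above, is to measure how much of its spectral amplitude lies at high frequencies. The sketch below is an assumed diagnostic rather than the paper's exact procedure: the function name `high_freq_ratio` and the `cutoff` parameter (a fraction of the Nyquist frequency) are hypothetical.

```python
import numpy as np

def high_freq_ratio(feature_map, cutoff=0.5):
    # Share of 2D spectral amplitude whose radial frequency exceeds `cutoff`,
    # where `cutoff` is expressed as a fraction of the Nyquist frequency.
    h, w = feature_map.shape
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(feature_map)))
    fy = np.fft.fftshift(np.fft.fftfreq(h))  # cycles per sample, in [-0.5, 0.5)
    fx = np.fft.fftshift(np.fft.fftfreq(w))
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2) / 0.5
    return spectrum[radius > cutoff].sum() / spectrum.sum()

# A constant map has only a DC component; a checkerboard sits at Nyquist.
smooth = np.ones((8, 8))
checker = (-1.0) ** np.add.outer(np.arange(8), np.arange(8))
low = high_freq_ratio(smooth)
high = high_freq_ratio(checker)
```

Comparing this ratio across layers or across models trained on different data fractions gives a coarse view of whether a model emphasizes low-frequency or high-frequency signals.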
Abstract
There are two de facto standard architectures in recent computer vision: Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). The strong inductive biases of convolutions help the model learn sample-efficiently, but such strong biases also limit the upper bound of CNNs when sufficient data are available. In contrast, ViTs are inferior to CNNs on small data but superior when sufficient data are available. Recent approaches attempt to combine the strengths of these two architectures. However, by comparing various models' accuracy on subsets of ImageNet sampled at different ratios, we show that these approaches overlook that the optimal inductive bias also changes as the target data scale changes. In addition, through Fourier analysis of feature maps, which reveals how the model's response patterns change with signal frequency, we observe which inductive bias is advantageous for each data scale.…
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Advanced Memory and Neural Computing
Methods: Batch Normalization · 1x1 Convolution · Max Pooling · Residual Block · Residual Connection · Kaiming Initialization · Average Pooling · Global Average Pooling · Bottleneck Residual Block
