Training the Untrainable: Introducing Inductive Bias via Representational Alignment
Vighnesh Subramaniam, David Mayo, Colin Conwell, Tomaso Poggio, Boris Katz, Brian Cheung, Andrei Barbu

TL;DR
This paper introduces a method called guidance that uses a guide network to steer target networks via representational alignment, enabling training of architectures previously considered untrainable for certain tasks.
Contribution
The paper proposes a novel guidance technique that transfers architectural priors through representational similarity, improving training outcomes for various neural network architectures.
Findings
Guidance prevents fully connected networks (FCNs) from overfitting on ImageNet.
Guidance narrows the performance gap between RNNs and Transformers.
Guidance-driven initialization mitigates overfitting in FCNs.
Abstract
We demonstrate that architectures which traditionally are considered to be ill-suited for a task can be trained using inductive biases from another architecture. We call a network untrainable when it overfits, underfits, or converges to poor results even when its hyperparameters are tuned. For example, fully connected networks overfit on object recognition while deep convolutional networks without residual connections underfit. The traditional answer is to change the architecture to impose some inductive bias, although the nature of that bias is unknown. We introduce guidance, where a guide network steers a target network using a neural distance function. The target minimizes its task loss plus a layerwise representational similarity against the frozen guide. If the guide is trained, this transfers over the architectural prior and knowledge of the guide to the target. If the guide is…
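The objective described in the abstract (task loss plus a layerwise representational term against a frozen guide) can be illustrated with a minimal sketch. Several things here are assumptions made for illustration and are not taken from the text above: linear centered kernel alignment (CKA) is used as the neural distance function, the per-layer penalty is written as 1 - CKA so that minimizing it increases similarity, and the names `linear_cka`, `guidance_loss`, and `weight` are hypothetical. The sketch uses NumPy only, so it shows how the loss value is composed, not how gradients are computed; in a real training loop the guide activations would be treated as constants.

```python
import numpy as np

def linear_cka(x, y):
    """Linear CKA between activation matrices of shape (n_samples, n_features).

    The two matrices must share n_samples but may differ in n_features,
    which is what allows comparing layers of different architectures.
    """
    # Center each feature across the batch.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)

def guidance_loss(task_loss, target_acts, guide_acts, weight=1.0):
    """Task loss plus the summed layerwise dissimilarity (1 - CKA) to the guide.

    target_acts and guide_acts are lists of per-layer activations, paired
    layer by layer. `weight` is a hypothetical trade-off coefficient.
    """
    dissimilarity = sum(
        1.0 - linear_cka(t, g) for t, g in zip(target_acts, guide_acts)
    )
    return task_loss + weight * dissimilarity

# Usage with random stand-in activations for two paired layers.
rng = np.random.default_rng(0)
target_layers = [rng.normal(size=(64, 16)), rng.normal(size=(64, 32))]
guide_layers = [rng.normal(size=(64, 24)), rng.normal(size=(64, 8))]
total = guidance_loss(task_loss=2.3, target_acts=target_layers, guide_acts=guide_layers)
```

When the target's activations match the guide's exactly, each CKA term equals 1 and the penalty vanishes, leaving only the task loss.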
Taxonomy
Topics: Ethics and Social Impacts of AI · Artificial Intelligence in Law · Organizational Management and Leadership
