LAuReL: Learned Augmented Residual Layer
Gaurav Menghani, Ravi Kumar, Sanjiv Kumar

TL;DR
LAuReL introduces a learned residual layer that enhances neural network performance and efficiency, outperforming traditional residual connections in vision and language models with minimal additional parameters.
Contribution
The paper proposes LAuReL, a novel learned residual layer that generalizes residual connections, improving model quality and efficiency in vision and language architectures.
Findings
On ResNet-50 trained on ImageNet-1K, LAuReL achieves 60% of the gains of adding an extra layer with only 0.003% more parameters.
LAuReL improves the performance of 1B- and 4B-parameter LLMs by up to 20.05% on downstream tasks.
LAuReL adds significantly fewer parameters than traditional residual enhancements.
Abstract
One of the core pillars of efficient deep learning methods is architectural improvements such as the residual/skip connection, which has led to significantly better model convergence and quality. Since then the residual connection has become ubiquitous in not just convolutional neural networks but also transformer-based architectures, the backbone of LLMs. In this paper we introduce Learned Augmented Residual Layer (LAuReL) -- a novel generalization of the canonical residual connection -- with the goal to be an in-situ replacement of the latter while outperforming on both model quality and footprint metrics. Our experiments show that using LAuReL can help boost performance for both vision and language models. For example, on the ResNet-50, ImageNet 1K task, it achieves 60% of the gains from adding an extra layer, while only adding 0.003% more parameters, and matches it while adding…
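To make "generalization of the canonical residual connection" concrete, here is a minimal NumPy sketch of one simple way a skip connection can be given learned weights. The summary above does not spell out the paper's formulation, so the form alpha * f(x) + beta * x, the names `block` and `weighted_residual`, and the toy ReLU block are all assumptions chosen for illustration, not the authors' method.

```python
import numpy as np

def block(x, w):
    """Hypothetical nonlinear block f(x): a linear map followed by ReLU."""
    return np.maximum(x @ w, 0.0)

def residual(x, w):
    """Canonical residual connection: f(x) + x."""
    return block(x, w) + x

def weighted_residual(x, w, alpha, beta):
    """Illustrative learned-weight variant: alpha * f(x) + beta * x.

    alpha and beta are scalars that would be trained with the rest of
    the network. They are assumptions of this sketch, not values or
    names taken from the paper.
    """
    return alpha * block(x, w) + beta * x

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))   # batch of 4 activations, width 8
w = rng.normal(size=(8, 8))   # weights of the toy block

# With alpha = beta = 1 the variant reduces to the standard residual,
# so it can be dropped in wherever f(x) + x is used today.
same = np.allclose(weighted_residual(x, w, 1.0, 1.0), residual(x, w))

# The only new trainable values in this sketch are the two scalars.
extra_params_per_block = 2
```

Because the sketch adds only two scalars per block, it also illustrates why a learned residual of this kind can cost a very small fraction of a model's parameter count.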
Taxonomy
Topics: Robotics and Automated Systems · Neural Networks and Applications · Seismology and Earthquake Studies
Methods: Residual Connection
