ANCRe: Adaptive Neural Connection Reassignment for Efficient Depth Scaling
Yilang Zhang, Bingcong Li, Niao He, Georgios B. Giannakis

TL;DR
This paper introduces ANCRe, a lightweight adaptive framework that reassigns residual connections in neural networks, leading to faster convergence and better depth utilization across various large models.
Contribution
The paper provides a theoretical analysis of residual connections and proposes ANCRe, a method for data-driven reassignment of residual connections with minimal overhead.
Findings
Accelerated convergence in large language models and diffusion models
Improved performance and depth efficiency in deep ResNets
Theoretical proof that the layout of residual connections affects convergence rates, with a possible exponential gap between layouts
Abstract
Scaling network depth has been a central driver behind the success of modern foundation models, yet recent investigations suggest that deep layers are often underutilized. This paper revisits the default mechanism for deepening neural networks, namely residual connections, from an optimization perspective. Rigorous analysis proves that the layout of residual connections can fundamentally shape convergence behavior, and even induces an exponential gap in convergence rates. Prompted by this insight, we introduce adaptive neural connection reassignment (ANCRe), a principled and lightweight framework that parameterizes and learns residual connectivities from the data. ANCRe adaptively reassigns residual connections with negligible computational and memory overhead, while enabling more effective utilization of network depth. Extensive numerical tests across pre-training of large…
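To make the idea of "parameterizing and learning residual connectivities" concrete, here is a minimal sketch in plain Python. It is an illustration only: the abstract does not specify ANCRe's actual parameterization, so the softmax weighting over earlier hidden states, and the names `softmax`, `forward`, and `conn_logits`, are all assumptions made for this example rather than the authors' method. Each layer's skip input is a weighted mix of all earlier states, and the logits that set those weights would be the trainable quantities.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, layers, conn_logits):
    """Forward pass with adjustable skip connections (hypothetical sketch).

    x           : input vector (list of floats)
    layers      : list of callables mapping a vector to a vector of equal size
    conn_logits : conn_logits[l] holds l + 1 logits, one per earlier state
                  h_0 .. h_l; in training these would be learned (assumption)
    """
    states = [x]
    for layer, logits in zip(layers, conn_logits):
        weights = softmax(logits)
        # Skip branch: weighted sum of all earlier hidden states.
        skip = [
            sum(w * h[i] for w, h in zip(weights, states))
            for i in range(len(x))
        ]
        # Layer branch: applied to the most recent state, as in a plain ResNet.
        out = layer(states[-1])
        states.append([s + o for s, o in zip(skip, out)])
    return states[-1]

double = lambda h: [2.0 * v for v in h]
NEG_INF = float("-inf")

# All weight on the latest state recovers the standard residual layout.
standard = forward([1.0], [double, double], [[0.0], [NEG_INF, 0.0]])

# Moving the second skip connection back to the input changes the output.
reassigned = forward([1.0], [double, double], [[0.0], [0.0, NEG_INF]])
```

With these toy layers, `standard` evaluates to `[9.0]` and `reassigned` to `[7.0]`, which shows that the same layers produce different computations depending only on where the skip connections originate. The extra parameters in this sketch amount to one logit per layer-state pair, which is small next to the layer weights of a realistic network.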
Taxonomy
Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · 3D Shape Modeling and Analysis
