Path-conditioned training: a principled way to rescale ReLU neural networks

Arthur Lebeurrier; Titouan Vayer; Rémi Gribonval

arXiv:2602.19799·stat.ML·February 24, 2026

Open Access

TL;DR

This paper introduces a principled rescaling method for ReLU neural networks built on the path-lifting framework: parameters are rescaled so that a kernel in the path-lifting space aligns with a chosen reference, which can speed up training.

Contribution

The paper proposes a geometrically motivated rescaling criterion, built on the path-lifting framework, and derives an efficient algorithm that performs the resulting kernel alignment.

Findings

01 Rescaling can significantly speed up training.

02 Minimizing the proposed criterion aligns a kernel in the path-lifting space with a chosen reference.

03 At random initialization, the architecture and the initialization scale jointly determine what the rescaling method outputs.

Abstract

Despite recent algorithmic advances, we still lack principled ways to leverage the well-documented rescaling symmetries in ReLU neural network parameters. While two properly rescaled weights implement the same function, the training dynamics can be dramatically different. To offer a fresh perspective on exploiting this phenomenon, we build on the recent path-lifting framework, which provides a compact factorization of ReLU networks. We introduce a geometrically motivated criterion to rescale neural network parameters whose minimization leads to a conditioning strategy that aligns a kernel in the path-lifting space with a chosen reference. We derive an efficient algorithm to perform this alignment. In the context of random network initialization, we analyze how the architecture and the initialization scale jointly impact the output of the proposed method. Numerical experiments illustrate…
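The rescaling symmetry mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the tiny one-hidden-layer network, the helper names (`forward`, `rescale`, `grad_wrt_output_weights`) and all numerical values are assumptions chosen for illustration. Multiplying a hidden neuron's incoming weights and bias by some λ > 0 and dividing its outgoing weight by λ leaves the network function unchanged, yet the gradient with respect to the outgoing weights is scaled by λ, which is one way to see why training dynamics can differ between equivalent parameterizations.

```python
def relu(z):
    return max(0.0, z)

def hidden_activations(x, W, b):
    """Post-ReLU activations of each hidden neuron."""
    return [relu(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]

def forward(x, W, b, v):
    """One-hidden-layer ReLU network: sum_j v[j] * relu(W[j].x + b[j])."""
    return sum(vj * hj for vj, hj in zip(v, hidden_activations(x, W, b)))

def grad_wrt_output_weights(x, W, b):
    """d forward / d v[j] equals the j-th hidden activation."""
    return hidden_activations(x, W, b)

def rescale(W, b, v, lambdas):
    """Scale neuron j's incoming weights and bias by lambdas[j] > 0,
    and its outgoing weight by 1 / lambdas[j]."""
    W2 = [[lam * w for w in row] for row, lam in zip(W, lambdas)]
    b2 = [lam * bj for bj, lam in zip(b, lambdas)]
    v2 = [vj / lam for vj, lam in zip(v, lambdas)]
    return W2, b2, v2

# Illustrative (hypothetical) parameters and input.
x = [1.0, -2.0]
W = [[0.5, -0.25], [1.0, 1.0], [-0.5, -0.75]]
b = [0.1, 0.2, -0.3]
v = [2.0, -1.0, 0.5]
lambdas = [4.0, 0.5, 10.0]

W2, b2, v2 = rescale(W, b, v, lambdas)

out_before = forward(x, W, b, v)
out_after = forward(x, W2, b2, v2)
grad_before = grad_wrt_output_weights(x, W, b)
grad_after = grad_wrt_output_weights(x, W2, b2)

print("outputs:  ", out_before, out_after)    # equal up to rounding
print("gradients:", grad_before, grad_after)  # scaled neuron by neuron
```

The symmetry relies on the positive homogeneity of ReLU, relu(λz) = λ·relu(z) for λ > 0, so the sketch does not extend to negative scaling factors.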

Taxonomy

Topics: Machine Learning and ELM · Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks