Flat Channels to Infinity in Neural Loss Landscapes

Flavio Martinelli; Alexander Van Meegen; Berfin Şimşek; Wulfram Gerstner; Johanni Brea

arXiv:2506.14951 · cs.LG · May 11, 2026

TL;DR

This paper identifies nearly flat regions in neural network loss landscapes, called channels to infinity, along which the output weights of at least two neurons diverge while their input weight vectors converge to each other. In the limit, the pair of neurons implements a gated linear unit.

Contribution

It characterizes channels to infinity in neural network loss landscapes, linking the divergence of output weights to the emergence of gated linear units and describing the geometry of the channels.

Findings

01 Gradient-based optimizers frequently reach channels to infinity.

02 Channels resemble flat minima but involve diverging parameters.

03 Gated linear units naturally emerge at the end of these channels.

Abstract

The loss landscapes of neural networks contain minima and saddle points that may be connected in flat regions or appear in isolation. We identify and characterize a special structure in the loss landscape: channels along which the loss decreases extremely slowly, while the output weights of at least two neurons, $a_{i}$ and $a_{j}$, diverge to $\pm\infty$, and their input weight vectors, $w_{i}$ and $w_{j}$, become equal to each other. At convergence, the two neurons implement a gated linear unit: $a_{i} \sigma(w_{i} \cdot x) + a_{j} \sigma(w_{j} \cdot x) \to \sigma(w \cdot x) + (v \cdot x) \sigma'(w \cdot x)$. Geometrically, these channels to infinity are asymptotically parallel to symmetry-induced lines of critical points. Gradient flow solvers, and related optimization methods like SGD…
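The gated-linear-unit limit stated in the abstract can be checked numerically. The sketch below is only an illustration under stated assumptions, not the construction used in the paper: it picks a tanh activation, and a hypothetical one-parameter path $a_{i} = \tfrac{1}{2} + \tfrac{1}{2\varepsilon}$, $a_{j} = \tfrac{1}{2} - \tfrac{1}{2\varepsilon}$, $w_{i} = w + \varepsilon v$, $w_{j} = w - \varepsilon v$. Along this path the output weights diverge with opposite signs and the input weights merge as $\varepsilon \to 0$, while the sum of the two neurons approaches $\sigma(w \cdot x) + (v \cdot x) \sigma'(w \cdot x)$ by a central finite-difference argument.

```python
import numpy as np


def sigma(z):
    # Assumed activation for this illustration; the abstract does not fix one.
    return np.tanh(z)


def sigma_prime(z):
    # Derivative of tanh.
    return 1.0 - np.tanh(z) ** 2


def two_neuron_output(x, w, v, eps):
    # Hypothetical path toward infinity (an assumption of this sketch):
    # output weights blow up like +-1/(2*eps), input weights approach w.
    a_i = 0.5 + 1.0 / (2.0 * eps)
    a_j = 0.5 - 1.0 / (2.0 * eps)
    w_i = w + eps * v
    w_j = w - eps * v
    return a_i * sigma(x @ w_i) + a_j * sigma(x @ w_j)


def glu_limit(x, w, v):
    # Limit function from the abstract: sigma(w.x) + (v.x) * sigma'(w.x).
    return sigma(x @ w) + (x @ v) * sigma_prime(x @ w)


rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 3))  # random inputs
w = rng.normal(size=3)          # shared limiting input weight vector
v = rng.normal(size=3)          # direction that becomes the linear gate

errors = {}
for eps in (1e-1, 1e-2, 1e-3):
    gap = np.abs(two_neuron_output(x, w, v, eps) - glu_limit(x, w, v))
    errors[eps] = float(gap.max())
    print(f"eps={eps:g}  |a_i|={0.5 + 1.0 / (2.0 * eps):g}  max gap={errors[eps]:.3e}")
```

The printed gap should shrink as `eps` decreases even though the output weights grow without bound, which is the sense in which diverging parameters can still yield a converging network function.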

Videos

Flat Channels to Infinity in Neural Loss Landscapes · SlidesLive