Plateau in Monotonic Linear Interpolation -- A "Biased" View of Loss Landscape for Deep Networks
Xiang Wang, Annie N. Wang, Mo Zhou, Rong Ge

TL;DR
This paper investigates monotonic linear interpolation (MLI) in neural networks. It shows that empirical MLI behavior depends heavily on biases, that biases can create long plateaus along the interpolation path, and that MLI is not necessarily evidence that optimization is easy.
Contribution
The study demonstrates that biases shape MLI behavior and shows that last-layer biases that differ across classes produce plateaus in a deep network, an effect that existing theory of MLI cannot explain.
Findings
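When different classes have different last-layer biases in a deep network, both the loss and the accuracy show a long plateau during interpolation.
Linearly interpolating weights and linearly interpolating biases influence the network output in very different ways.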
Empirical results show biases influence MLI phenomena in real networks.
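One way to see why weights and biases can behave differently under interpolation is a toy calculation. The sketch below is not the paper's experiment. It assumes a ReLU network whose only bias sits on the last layer and an initialization at zero, so the interpolated point is simply alpha times the trained parameters. The names `forward`, `scale`, `depth`, `width`, and `last_bias` are hypothetical. Under these assumptions the weight contribution to the output scales as alpha to the power of the depth, while the last-layer bias scales only as alpha.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def forward(weights, last_bias, x):
    """Toy ReLU network with a bias only on the last layer (an assumption)."""
    h = x
    for W in weights[:-1]:
        h = relu(matvec(W, h))
    logits = matvec(weights[-1], h)
    return [z + b for z, b in zip(logits, last_bias)]

def scale(weights, last_bias, alpha):
    """Interpolate from an all-zero initialization: every parameter times alpha."""
    scaled_w = [[[alpha * w for w in row] for row in W] for W in weights]
    scaled_b = [alpha * b for b in last_bias]
    return scaled_w, scaled_b

rng = random.Random(0)
depth, width, num_classes = 4, 5, 3  # hypothetical sizes
shapes = [(width, width)] * (depth - 1) + [(num_classes, width)]
weights = [[[rng.gauss(0, 1) for _ in range(c)] for _ in range(r)] for r, c in shapes]
last_bias = [1.0, 0.0, -1.0]  # class-specific last-layer biases
x = [rng.gauss(0, 1) for _ in range(width)]

# Weight-only part of the output at alpha = 1.
signal = forward(weights, [0.0] * num_classes, x)

alpha = 0.5
scaled_w, scaled_b = scale(weights, last_bias, alpha)
out = forward(scaled_w, scaled_b, x)

# ReLU is positively homogeneous, so the two parts scale at different rates.
expected = [alpha ** depth * s + alpha * b for s, b in zip(signal, last_bias)]
```

For small alpha, `alpha ** depth` is much smaller than `alpha`, so in this toy setting the bias term dominates the logits over an initial stretch of the path. That is consistent with the plateau described above, though the paper's own analysis should be consulted for the precise conditions.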
Abstract
Monotonic linear interpolation (MLI) - on the line connecting a random initialization with the minimizer it converges to, the loss and accuracy are monotonic - is a phenomenon that is commonly observed in the training of neural networks. Such a phenomenon may seem to suggest that optimization of neural networks is easy. In this paper, we show that the MLI property is not necessarily related to the hardness of optimization problems, and empirical observations on MLI for deep neural networks depend heavily on biases. In particular, we show that interpolating both weights and biases linearly leads to very different influences on the final output, and when different classes have different last-layer biases on a deep network, there will be a long plateau in both the loss and accuracy interpolation (which existing theory of MLI cannot explain). We also show how the last-layer biases for…
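The MLI check described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: `interpolate`, `loss_along_path`, and `is_monotonically_nonincreasing` are hypothetical helpers, and the squared-distance `toy_loss` is a stand-in for a real training loss. It evaluates the loss at evenly spaced points on the line from an initialization to a minimizer and tests whether the values never increase.

```python
def interpolate(theta_init, theta_final, alpha):
    """Return (1 - alpha) * theta_init + alpha * theta_final, elementwise."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(theta_init, theta_final)]

def loss_along_path(loss_fn, theta_init, theta_final, num_points=11):
    """Evaluate loss_fn at evenly spaced points on the interpolation line."""
    alphas = [i / (num_points - 1) for i in range(num_points)]
    return [loss_fn(interpolate(theta_init, theta_final, a)) for a in alphas]

def is_monotonically_nonincreasing(values, tol=1e-12):
    """True if no value exceeds its predecessor by more than tol."""
    return all(later <= earlier + tol for earlier, later in zip(values, values[1:]))

# Toy stand-in for a training loss: squared distance to a hypothetical minimizer.
theta_star = [1.0, -2.0, 0.5]

def toy_loss(theta):
    return sum((t - s) ** 2 for t, s in zip(theta, theta_star))

theta_0 = [0.3, 0.1, -0.4]  # hypothetical initialization
losses = loss_along_path(toy_loss, theta_0, theta_star)
print(is_monotonically_nonincreasing(losses))
```

A convex toy loss like this one satisfies MLI trivially. For a real network the same loop would be run with the model's flattened parameters and its evaluation loss, and a plateau would show up as a run of nearly equal values in `losses`.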
Taxonomy
Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques · Image and Signal Denoising Methods
