Simmering: Sufficient is better than optimal for training neural networks
Irina Babayan, Hazhir Aliahmadi, Greg van Anders

TL;DR
The paper introduces 'simmering', a physics-inspired training method for neural networks that emphasizes 'good enough' solutions over optimality, effectively preventing overfitting and outperforming traditional optimization techniques.
Contribution
The paper presents simmering, a physics-based training approach that challenges the optimization paradigm and demonstrates improved neural network training outcomes.
Findings
Simmering corrects overfitting caused by Adam.
Simmering outperforms optimization-based methods in experiments.
When used from the start of training, simmering avoids overfitting.
Abstract
The broad range of neural network training techniques that invoke optimization but rely on ad hoc modification for validity suggests that optimization-based training is misguided. Shortcomings of optimization-based training are brought to particularly strong relief by the problem of overfitting, where naive optimization produces spurious outcomes. The broad success of neural networks for modelling physical processes has prompted advances that are based on inverting the direction of investigation and treating neural networks as if they were physical systems in their own right. These successes raise the question of whether broader, physical perspectives could motivate the construction of improved training algorithms. Here, we introduce simmering, a physics-based method that trains neural networks to generate weights and biases that are merely "good enough", but which, paradoxically,…
Taxonomy
Topics: Neural Networks and Applications
Methods: High-Order Consensuses · Adam
