Simmering: Sufficient is better than optimal for training neural networks

Irina Babayan; Hazhir Aliahmadi; Greg van Anders

arXiv:2410.19912 · cs.LG · February 20, 2026

TL;DR

The paper introduces 'simmering', a physics-inspired training method for neural networks that emphasizes 'good enough' solutions over optimality, effectively preventing overfitting and outperforming traditional optimization techniques.

Contribution

It presents a novel physics-based training approach called simmering that challenges the optimization paradigm and demonstrates improved neural network training outcomes.

Findings

01 Simmering corrects overfitting caused by Adam.

02 Simmering outperforms optimization-based methods in experiments.

03 Simmering avoids overfitting when used from the start of training.

Abstract

The broad range of neural network training techniques that invoke optimization but rely on ad hoc modification for validity suggests that optimization-based training is misguided. Shortcomings of optimization-based training are brought to particularly strong relief by the problem of overfitting, where naive optimization produces spurious outcomes. The broad success of neural networks for modelling physical processes has prompted advances that are based on inverting the direction of investigation and treating neural networks as if they were physical systems in their own right. These successes raise the question of whether broader, physical perspectives could motivate the construction of improved training algorithms. Here, we introduce simmering, a physics-based method that trains neural networks to generate weights and biases that are merely "good enough", but which, paradoxically,…

Taxonomy

Topics: Neural Networks and Applications

Methods: High-Order Consensuses · Adam