Controlled Langevin Dynamics for Sampling of Feedforward Neural Networks Trained with Minibatches
Alessandro Zambon, Francesca Caruso, Riccardo Zecchina, Guido Tiana

TL;DR
This paper introduces a scalable pseudo-Langevin dynamics method for efficient Boltzmann sampling of large neural networks using minibatches, providing insights into network geometry and improving generalization without traditional training.
Contribution
The authors develop a controlled minibatch Langevin sampling technique that scales to large networks, enabling efficient exploration of neural network parameter distributions.
Findings
Pseudo-Langevin (pL) sampling maintains high efficiency for networks with over one million parameters
Sampling at intermediate temperatures improves generalization performance
The method scales favorably compared to hybrid Monte Carlo for large models
Abstract
Sampling the parameter space of artificial neural networks according to a Boltzmann distribution provides insight into the geometry of low-loss solutions and offers an alternative to conventional loss minimization for training. However, exact sampling methods such as hybrid Monte Carlo (hMC), while formally correct, become computationally prohibitive for realistic datasets because they require repeated evaluation of full-batch gradients. We introduce a pseudo-Langevin (pL) dynamics that enables efficient Boltzmann sampling of feed-forward neural networks trained with large datasets by using minibatches in a controlled manner. The method exploits the statistical properties of minibatch gradient noise and adjusts fictitious masses and friction coefficients to ensure that the induced stochastic process efficiently samples the desired equilibrium distribution. We validate numerically the…
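
To make the idea of Langevin sampling with minibatches concrete, here is a minimal Python sketch. It is not the authors' pL algorithm: it runs plain underdamped Langevin dynamics with a fixed mass and friction on a toy linear-regression loss and plugs in a rescaled minibatch gradient. The function names, the toy loss, and all parameter values are assumptions made for illustration. According to the abstract, the paper's method additionally adjusts fictitious masses and friction coefficients to account for minibatch gradient noise; that control step is not implemented here.

```python
import numpy as np


def minibatch_grad(w, X, y, batch_idx):
    """Estimate the gradient of U(w) = 0.5 * sum_i (x_i . w - y_i)^2.

    The minibatch sum is rescaled by N / batch_size so that it is an
    unbiased estimate of the full-dataset gradient.
    """
    Xb, yb = X[batch_idx], y[batch_idx]
    scale = X.shape[0] / len(batch_idx)
    return scale * (Xb.T @ (Xb @ w - yb))


def langevin_sample(X, y, temperature, n_steps, batch_size,
                    dt=1e-3, gamma=10.0, mass=1.0, seed=0):
    """Underdamped Langevin dynamics driven by minibatch gradients.

    Illustrative only (hypothetical helper, not the paper's pL scheme):
    mass and friction are fixed constants, with no correction for the
    extra noise introduced by minibatching.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)          # parameters (positions)
    p = np.zeros(d)          # fictitious momenta
    samples = np.empty((n_steps, d))
    # Thermal noise amplitude from the fluctuation-dissipation relation.
    noise_scale = np.sqrt(2.0 * gamma * mass * temperature * dt)
    for t in range(n_steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        g = minibatch_grad(w, X, y, idx)
        # Momentum update: force, friction, and thermal noise.
        p += -g * dt - gamma * p * dt + noise_scale * rng.standard_normal(d)
        # Position update using the new momentum (semi-implicit Euler).
        w += (p / mass) * dt
        samples[t] = w
    return samples


# Toy usage: 200 data points, 2 parameters, minibatches of 50.
data_rng = np.random.default_rng(1)
w_true = np.array([1.0, -2.0])
X = data_rng.standard_normal((200, 2))
y = X @ w_true + 0.1 * data_rng.standard_normal(200)
samples = langevin_sample(X, y, temperature=0.01, n_steps=5000, batch_size=50)
posterior_mean = samples[1000:].mean(axis=0)  # discard the initial transient
```

In this uncorrected form the minibatch noise adds to the thermal noise, so the sampled distribution is in general somewhat broader than the Boltzmann distribution at the nominal temperature. That mismatch is the kind of effect a controlled scheme would have to compensate.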
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Machine Learning in Materials Science · Model Reduction and Neural Networks
