Understanding temperature tuning in energy-based models
Peter W Fields, Vudtiwat Ngampruetikorn, David J Schwab, Stephanie E Palmer

TL;DR
This paper provides a physically motivated framework for understanding temperature tuning in energy-based models. It shows when lowering or raising the sampling temperature improves generative performance, depending on the energy landscape and the amount of training data.
Contribution
It introduces an interpretable framework explaining temperature tuning in energy-based models, linking it to energy gaps and data limitations, and guides optimal temperature selection.
Findings
Lowering the sampling temperature corrects the overestimated probabilities of high-energy states in models learned from sparse data.
The optimal sampling temperature depends on data size and the energy landscape.
Raising temperature can improve generation under certain conditions.
Abstract
Generative models of complex systems often require post-hoc parameter adjustments to produce useful outputs. For example, energy-based models for protein design are sampled at an artificially low "temperature" to generate novel, functional sequences. This temperature tuning is a common yet poorly understood heuristic used across machine learning contexts to control the trade-off between generative fidelity and diversity. Here, we develop an interpretable, physically motivated framework to explain this phenomenon. We demonstrate that in systems with a large "energy gap" (separating a small fraction of meaningful states from a vast space of unrealistic states), learning from sparse data causes models to systematically overestimate high-energy state probabilities, a bias that lowering the sampling temperature corrects. More generally, we characterize how the optimal sampling…
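To make the energy-gap argument concrete, here is a minimal toy sketch, not the authors' model or code. It assumes a two-level system: a few "meaningful" states at energy 0 and many "unrealistic" states at a single higher energy, sampled from p_T(x) ∝ exp(-E(x)/T). The state counts, the gap values, and the premise that a model fit to sparse data learns a gap smaller than the true one are all hypothetical choices made for illustration.

```python
import math


def high_energy_mass(n_low, n_high, gap, temperature=1.0):
    """Total probability of the high-energy states under p_T(x) ∝ exp(-E(x)/T).

    Toy two-level system: n_low states at energy 0, n_high states at energy `gap`.
    """
    weight_low = float(n_low)  # exp(-0 / T) = 1 for each low-energy state
    weight_high = n_high * math.exp(-gap / temperature)
    return weight_high / (weight_low + weight_high)


# Hypothetical numbers, chosen only for illustration.
N_LOW, N_HIGH = 10, 10_000
TRUE_GAP = 12.0
LEARNED_GAP = 8.0  # assumption: the learned model underestimates the gap

true_mass = high_energy_mass(N_LOW, N_HIGH, TRUE_GAP)
learned_mass = high_energy_mass(N_LOW, N_HIGH, LEARNED_GAP)

# In this two-level toy, sampling the learned model at T = LEARNED_GAP / TRUE_GAP
# rescales the learned gap back to the true one.
t_star = LEARNED_GAP / TRUE_GAP
tuned_mass = high_energy_mass(N_LOW, N_HIGH, LEARNED_GAP, temperature=t_star)

for label, mass in [
    ("true system, T = 1", true_mass),
    ("learned model, T = 1", learned_mass),
    (f"learned model, T = {t_star:.3f}", tuned_mass),
]:
    print(f"{label:>26s}: high-energy mass = {mass:.4f}")
```

In this toy, the learned model at T = 1 places far more probability on the high-energy states than the true system does, and sampling at a temperature below 1 removes that excess. The exact match at T = LEARNED_GAP / TRUE_GAP is a property of the two-level simplification and should not be read as the paper's general result.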
Taxonomy
Topics: Machine Learning in Materials Science · Generative Adversarial Networks and Image Synthesis · Protein Structure and Dynamics
