Understanding temperature tuning in energy-based models
By: Peter W Fields, Vudtiwat Ngampruetikorn, David J Schwab, and more
Potential Business Impact:
Helps generative AI, such as protein-design models, produce outputs that are both realistic and diverse.
Generative models of complex systems often require post-hoc parameter adjustments to produce useful outputs. For example, energy-based models for protein design are sampled at an artificially low "temperature" to generate novel, functional sequences. This temperature tuning is a common yet poorly understood heuristic used across machine learning contexts to control the trade-off between generative fidelity and diversity. Here, we develop an interpretable, physically motivated framework to explain this phenomenon. We demonstrate that in systems with a large "energy gap" - separating a small fraction of meaningful states from a vast space of unrealistic states - learning from sparse data causes models to systematically overestimate high-energy state probabilities, a bias that lowering the sampling temperature corrects. More generally, we characterize how the optimal sampling temperature depends on the interplay between data size and the system's underlying energy landscape. Crucially, our results show that lowering the sampling temperature is not always desirable; we identify the conditions where raising it results in better generative performance. Our framework thus casts post-hoc temperature tuning as a diagnostic tool that reveals properties of the true data distribution and the limits of the learned model.
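The mechanism described in the abstract can be illustrated with a small numerical sketch (not taken from the paper): when energies are learned from a sparse sample of a landscape with a large energy gap, the model overestimates the probability of high-energy states, and sampling the learned model at a temperature below 1 can partially correct this bias. The toy landscape, sample size, and pseudocount regularizer below are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

# Illustrative sketch: tempering a learned energy-based model.
# Sampling at temperature T corresponds to p_T(x) ∝ exp(-E(x) / T),
# so T < 1 concentrates the model on low-energy (realistic) states
# and T > 1 flattens it toward high-energy ones.

rng = np.random.default_rng(0)

# Toy landscape with a large "energy gap": a few low-energy states
# separated from many high-energy ones (numbers are assumptions).
n_low, n_high, gap = 5, 1000, 10.0
true_energy = np.concatenate([rng.normal(0.0, 0.5, n_low),
                              rng.normal(gap, 0.5, n_high)])
true_p = np.exp(-true_energy)
true_p /= true_p.sum()

# "Learn" energies from a sparse sample via regularized frequencies
# (a small pseudocount avoids log(0)); sparse data inflates the
# apparent probability of rare high-energy states.
n_samples = 200
counts = np.bincount(rng.choice(true_p.size, size=n_samples, p=true_p),
                     minlength=true_p.size)
learned_energy = -np.log((counts + 0.1) / (n_samples + 0.1 * true_p.size))

def tempered(energy, T):
    """Distribution obtained by sampling the model at temperature T."""
    logits = -energy / T
    logits -= logits.max()          # numerical stability
    p = np.exp(logits)
    return p / p.sum()

# Compare generative quality (KL to the true distribution) across temperatures.
for T in (0.5, 1.0, 2.0):
    q = tempered(learned_energy, T)
    kl = np.sum(true_p * (np.log(true_p) - np.log(q)))
    print(f"T = {T:.1f}  KL(true || model_T) = {kl:.3f}")
```

In this toy setting the model sampled at T below 1 typically lands closer, in KL divergence, to the true distribution than the untempered model, mirroring the bias the abstract describes; with more data or a smaller energy gap the advantage shrinks or reverses.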
Similar Papers
A thermoinformational formulation for the description of neuropsychological systems
Neurons and Cognition
Describes how brain-like systems change and learn using ideas from thermodynamics and information.
Exploring the Impact of Temperature on Large Language Models: Hot or Cold?
Computation and Language
Shows how choosing the sampling temperature changes how well language models perform.
Tempering the Bayes Filter towards Improved Model-Based Estimation
Systems and Control
Makes computer estimates more reliable when the underlying model is imperfect or information is missing.