Hierarchical Maximum Entropy via the Renormalization Group
By: Amir R. Asadi
Potential Business Impact:
Helps multilevel machine-learning models train more efficiently.
Hierarchical structures, organized across multiple levels, are prevalent in statistical and machine-learning models as well as in physical systems. Extending the foundational result that the maximum entropy distribution under mean constraints is given by the exponential Gibbs-Boltzmann form, we introduce the framework of "hierarchical maximum entropy" to address these multilevel models. We demonstrate that Pareto-optimal distributions, which maximize entropies across all levels of hierarchical transformations, can be obtained via renormalization-group procedures from theoretical physics. This is achieved by formulating multilevel extensions of the Gibbs variational principle and the Donsker-Varadhan variational representation of entropy. Moreover, we explore settings with hierarchical invariances that significantly simplify the renormalization-group procedures and thereby enhance computational efficiency: quadratic modular loss functions, logarithmic loss functions, and nearest-neighbor loss functions. This is accomplished by introducing the concept of parameter flows, which serves as an analog of renormalization flows in renormalization-group theory. This work connects ideas from probability theory, information theory, and statistical mechanics.
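As background, the single-level results that the abstract generalizes can be sketched in generic notation (assumed here for illustration; the paper's multilevel formulations are not reproduced). The maximum entropy distribution under a mean constraint is the Gibbs-Boltzmann form

\[
\max_{p}\, H(p)\ \ \text{s.t.}\ \ \mathbb{E}_{p}[f(X)]=\mu
\quad\Longrightarrow\quad
p^{*}(x)=\frac{e^{-\lambda f(x)}}{Z(\lambda)},\qquad Z(\lambda)=\int e^{-\lambda f(x)}\,dx,
\]

with the multiplier \(\lambda\) chosen so that the constraint holds; equivalently, the Gibbs variational principle reads \(\log Z(\lambda)=\sup_{q}\{H(q)-\lambda\,\mathbb{E}_{q}[f(X)]\}\), attained by the Gibbs distribution. The classical Donsker-Varadhan variational representation of relative entropy (KL divergence) is

\[
D(Q\,\|\,P)=\sup_{g}\Big\{\mathbb{E}_{Q}[g(X)]-\log \mathbb{E}_{P}\big[e^{g(X)}\big]\Big\},
\]

where the supremum ranges over bounded measurable functions \(g\).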
Similar Papers
A mathematical study of the excess growth rate
Information Theory
Helps money grow faster by understanding information.
Entropies associated with orbits of finite groups
Information Theory
Unlocks new ways to measure information.
Hierarchical community detection via maximum entropy partitions and the renormalization group
Social and Information Networks
Finds hidden groups in complex connections.