Similarity as Thermodynamic Work: Between Depth and Diversity -- from Information Distance to Ugly Duckling
By: Kentaro Imafuku
Potential Business Impact:
Finds the true meaning hidden in data.
Defining similarity is a fundamental challenge in information science. Watanabe's Ugly Duckling Theorem highlights diversity, while algorithmic information theory emphasizes depth through Information Distance. We propose a statistical-mechanical framework that treats program length as energy, with a temperature parameter unifying these two aspects: in the low-temperature limit, similarity approaches Information Distance; in the high-temperature limit, it recovers the indiscriminability of the Ugly Duckling Theorem; and at the critical point, it coincides with the Solomonoff prior. We refine this framework by introducing regular universal machines and effective degeneracy ratios, which allow us to separate redundant diversity from core diversity. This refinement yields new tools for analyzing similarity and opens perspectives for information distance, model selection, and non-equilibrium extensions.
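The three temperature regimes described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's construction: it assumes a small, hand-picked list of program lengths (hypothetical values standing in for programs that map one object to another), assigns each program the Boltzmann-style weight 2^(-beta * length), and uses an inverse temperature `beta`. The names `free_energy` and `boltzmann_weights` are illustrative, and identifying beta = 1 with the critical point is an assumption of this toy.

```python
import math

def free_energy(lengths, beta):
    """Toy free energy: -(1/beta) * log2(sum over programs of 2^(-beta * length)).

    `lengths` is a hypothetical list of program lengths in bits (an assumption
    made for illustration, not data from the paper).
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    m = min(lengths)
    # Factor out the shortest length so the sum stays numerically stable.
    s = sum(2.0 ** (-beta * (length - m)) for length in lengths)
    return m - math.log2(s) / beta

def boltzmann_weights(lengths, beta):
    """Normalized weight of each program at inverse temperature beta."""
    m = min(lengths)
    w = [2.0 ** (-beta * (length - m)) for length in lengths]
    z = sum(w)
    return [x / z for x in w]

lengths = [3, 5, 5, 8]  # hypothetical program lengths in bits

# Low temperature (large beta): the shortest program dominates, so the free
# energy approaches the minimal length, in the spirit of Information Distance.
low_t = free_energy(lengths, 50.0)

# beta = 1: the sum is a Solomonoff-style weight, sum of 2^(-length).
at_one = free_energy(lengths, 1.0)

# High temperature (beta near 0): every program gets nearly the same weight,
# so only the number of programs matters, echoing the Ugly Duckling Theorem.
high_t = boltzmann_weights(lengths, 1e-9)

print(low_t, at_one, high_t)
```

In this toy, the single parameter `beta` moves the emphasis from the one shortest description (depth) to the plain count of descriptions (diversity), which is the qualitative trade-off the abstract describes.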
Similar Papers
The Exploratory Study on the Relationship Between the Failure of Distance Metrics in High-Dimensional Space and Emergent Phenomena
Information Theory
Helps predict when new things will appear.
The Information Theory of Similarity
Information Theory
Makes computers understand how alike things are.
Unifying Information-Theoretic and Pair-Counting Clustering Similarity
Machine Learning (Stat)
Unifies ways to check how well computer groups match.