Learning a distance measure from the information-estimation geometry of data
By: Guy Ohayon, Pierre-Etienne H. Fiquet, Florentin Guth and more
Potential Business Impact:
Measures how different pictures look to people.
We introduce the Information-Estimation Metric (IEM), a novel form of distance function derived from an underlying continuous probability density over a domain of signals. The IEM is rooted in a fundamental relationship between information theory and estimation theory, which links the log-probability of a signal with the errors of an optimal denoiser applied to noisy observations of the signal. In particular, the IEM between a pair of signals is obtained by comparing their denoising error vectors over a range of noise amplitudes. Geometrically, this amounts to comparing the score vector fields of the blurred density around the signals over a range of blur levels. We prove that the IEM is a valid global metric and derive a closed-form expression for its local second-order approximation, which yields a Riemannian metric. For Gaussian-distributed signals, the IEM coincides with the Mahalanobis distance; for more complex distributions, it adapts, both locally and globally, to the geometry of the distribution. In practice, the IEM can be computed using a learned denoiser (analogous to generative diffusion models) and solving a one-dimensional integral. To demonstrate the value of our framework, we learn an IEM on the ImageNet database. Experiments show that this IEM is competitive with or outperforms state-of-the-art supervised image quality metrics in predicting human perceptual judgments.
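The Gaussian claim in the abstract can be illustrated numerically. The sketch below is not the paper's definition of the IEM, which the abstract does not spell out. It assumes a simplified stand-in: the squared difference between the scores of the blurred density at the two signals, integrated over all blur variances. The function names gaussian_score and score_comparison_distance, and the substitution used for the integral, are hypothetical choices made for illustration. For a Gaussian density the score of the blurred density is known in closed form, so no learned denoiser is needed here; under this assumed form, the one-dimensional integral reproduces the Mahalanobis distance.

```python
import numpy as np


def gaussian_score(y, mu, cov, gamma):
    """Score (gradient of the log-density) at y of N(mu, cov) blurred by
    isotropic Gaussian noise of variance gamma, i.e. of N(mu, cov + gamma*I)."""
    blurred_cov = cov + gamma * np.eye(len(mu))
    return -np.linalg.solve(blurred_cov, y - mu)


def score_comparison_distance(x1, x2, mu, cov, n_nodes=200):
    """Assumed stand-in for the IEM (NOT the paper's formula):
    sqrt( integral over gamma in (0, inf) of ||s_gamma(x1) - s_gamma(x2)||^2 ).
    """
    # Gauss-Legendre nodes on (-1, 1), rescaled to t in (0, 1).
    t, w = np.polynomial.legendre.leggauss(n_nodes)
    t = 0.5 * (t + 1.0)
    w = 0.5 * w
    # Substitution gamma = t / (1 - t) maps (0, 1) onto (0, inf).
    gamma = t / (1.0 - t)
    jacobian = 1.0 / (1.0 - t) ** 2
    sq_diff = np.array([
        np.sum((gaussian_score(x1, mu, cov, g)
                - gaussian_score(x2, mu, cov, g)) ** 2)
        for g in gamma
    ])
    return np.sqrt(np.sum(w * jacobian * sq_diff))


# Example with an arbitrary (made-up) 2-D Gaussian and two points.
mu = np.array([0.5, -1.0])
cov = np.array([[2.0, 0.6], [0.6, 1.0]])
x1 = np.array([1.0, 2.0])
x2 = np.array([-0.5, 0.3])

diff = x1 - x2
mahalanobis = np.sqrt(diff @ np.linalg.solve(cov, diff))
integrated = score_comparison_distance(x1, x2, mu, cov)
print("integrated score comparison:", integrated)
print("Mahalanobis distance:       ", mahalanobis)
```

The agreement follows from the identity that, for each eigenvalue lambda of the covariance, the integral of (lambda + gamma)^(-2) over gamma from 0 to infinity equals 1/lambda. For a non-Gaussian density, the closed-form score would have to be replaced by an estimate, for example one derived from a learned denoiser as the abstract describes.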
Similar Papers
The Exploratory Study on the Relationship Between the Failure of Distance Metrics in High-Dimensional Space and Emergent Phenomena
Information Theory
Helps predict when new things will appear.
Information-Theoretic Discrete Diffusion
Machine Learning (CS)
Improves AI's ability to guess missing words.
MMG: Mutual Information Estimation via the MMSE Gap in Diffusion
Machine Learning (CS)
Helps computers find hidden connections in data.