Structured Uncertainty Similarity Score (SUSS): Learning a Probabilistic, Interpretable, Perceptual Metric Between Images
By: Paula Seidler, Neill D. F. Campbell, Ivor J. A. Simpson
Potential Business Impact:
Helps computers see images like humans do.
Perceptual similarity scores that align with human vision are critical for both training and evaluating computer vision models. Deep perceptual losses, such as LPIPS, achieve good alignment but rely on complex, highly non-linear discriminative features with unknown invariances, while hand-crafted measures like SSIM are interpretable but miss key perceptual properties. We introduce the Structured Uncertainty Similarity Score (SUSS), which models each image through a set of perceptual components, each represented by a structured multivariate Normal distribution. These are trained in a generative, self-supervised manner to assign high likelihood to human-imperceptible augmentations. The final score is a weighted sum of component log-probabilities, with weights learned from human perceptual datasets. Unlike feature-based methods, SUSS learns image-specific linear transformations of residuals in pixel space, enabling transparent inspection through decorrelated residuals and sampling. SUSS aligns closely with human perceptual judgments, shows strong perceptual calibration across diverse distortion types, and provides localized, interpretable explanations of its similarity assessments. We further demonstrate stable optimization behavior and competitive performance when using SUSS as a perceptual loss for downstream imaging tasks.
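To make the scoring idea concrete, here is a minimal toy sketch of the core computation the abstract describes: the residual between two images is scored as a weighted sum of log-probabilities under a few structured multivariate Normal "components". Everything here is an assumption for illustration — the component count, the low-rank-plus-diagonal covariance structure, the weights, and the `suss_score` helper are all hypothetical, not the authors' implementation (the paper learns these from imperceptible augmentations and human perceptual data).

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# Toy setup: 2 hypothetical "perceptual components", each a structured
# multivariate Normal over a tiny 4-pixel residual vector.
D = 4
means = [np.zeros(D) for _ in range(2)]
covs = []
for _ in range(2):
    A = rng.normal(size=(D, 2))            # low-rank factor: the "structure"
    covs.append(A @ A.T + 0.1 * np.eye(D))  # + diagonal keeps it positive definite

# In the paper these weights are learned from human perceptual datasets;
# here they are just made up for the sketch.
weights = np.array([0.7, 0.3])

def suss_score(ref, img):
    """Toy score: weighted sum of component log-probabilities of the residual."""
    residual = img - ref                    # score lives in pixel space
    logps = np.array([
        multivariate_normal(mean=m, cov=c).logpdf(residual)
        for m, c in zip(means, covs)
    ])
    return float(weights @ logps)           # higher = more perceptually similar

ref = rng.normal(size=D)
close = ref + 0.01 * rng.normal(size=D)     # tiny, "imperceptible" perturbation
far = ref + 2.0 * rng.normal(size=D)        # large distortion
```

Because the components are plain Gaussians over pixel-space residuals, a score can be inspected directly — e.g. by whitening the residual with a component's covariance — which is the interpretability property the abstract contrasts with deep feature-based losses.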
Similar Papers
A Novel Image Similarity Metric for Scene Composition Structure
CV and Pattern Recognition
Checks if AI images keep their real-world shapes.
Structures Meet Semantics: Multimodal Fusion via Graph Contrastive Learning
CV and Pattern Recognition
Helps computers understand feelings from voice, face, and words.