
Training Free Zero-Shot Visual Anomaly Localization via Diffusion Inversion

Published: January 12, 2026 | arXiv ID: 2601.08022v1

By: Samet Hicsonmez, Abd El Rahman Shabayek, Djamila Aouada

Zero-Shot image Anomaly Detection (ZSAD) aims to detect and localise anomalies without access to any normal training samples of the target data. While recent ZSAD approaches leverage additional modalities such as language to generate fine-grained prompts for localisation, vision-only methods remain limited to image-level classification and lack spatial precision. In this work, we introduce a simple yet effective training-free, vision-only ZSAD framework that circumvents the need for fine-grained prompts by leveraging the inversion of a pretrained Denoising Diffusion Implicit Model (DDIM). Specifically, given an input image and a generic text description (e.g., "an image of an [object class]"), we invert the image to obtain latent representations and start the denoising process from a fixed intermediate timestep to reconstruct the image. Since the underlying diffusion model is trained solely on normal data, this process yields a normal-looking reconstruction. The discrepancy between the input image and its reconstruction highlights potential anomalies. Our method achieves state-of-the-art performance on the VisA dataset, demonstrating strong localisation capabilities without auxiliary modalities and facilitating a shift away from prompt dependence in zero-shot anomaly detection research. Code is available at https://github.com/giddyyupp/DIVAD.
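
The invert, reconstruct, compare flow described in the abstract might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation (see the linked repository for that). The function names `anomaly_map` and `localise`, the callables `invert` and `denoise`, the `t_start` parameter, and the choice of a channel-averaged L1 difference with min-max normalisation are all assumptions made for the example; the abstract only states that the "discrepancy" between input and reconstruction is used.

```python
import numpy as np


def anomaly_map(image, reconstruction):
    """Per-pixel discrepancy between an image and its reconstruction.

    Both inputs are float arrays of shape (H, W, C). The channel-averaged
    L1 distance is an assumption; the source only says "discrepancy".
    """
    diff = np.abs(image.astype(np.float64) - reconstruction.astype(np.float64))
    heat = diff.mean(axis=-1)  # collapse channels -> (H, W)
    span = heat.max() - heat.min()
    # Min-max normalise to [0, 1]; return zeros when the map is constant.
    return (heat - heat.min()) / span if span > 0 else np.zeros_like(heat)


def localise(image, invert, denoise, t_start):
    """Hypothetical pipeline: invert to step t_start, denoise back, compare.

    `invert` and `denoise` are placeholders for DDIM inversion and DDIM
    denoising with a pretrained model; they are not real library calls.
    """
    latent = invert(image, t_start)
    reconstruction = denoise(latent, t_start)
    return anomaly_map(image, reconstruction)


# Toy usage with stand-in callables (no diffusion model involved).
normal = np.full((8, 8, 3), 0.5)
defective = normal.copy()
defective[2:4, 5:7] = 1.0  # synthetic 2x2 defect patch

heat = localise(
    defective,
    invert=lambda x, t: x,        # stand-in: returns the image unchanged
    denoise=lambda z, t: normal,  # stand-in: always returns the normal image
    t_start=25,                   # arbitrary value, unused by the stand-ins
)
```

In the toy run the stand-in "reconstruction" is simply the defect-free image, so the map is non-zero only over the synthetic defect patch. With a real model, the quality of the localisation would depend on how closely the reconstruction matches a normal appearance.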

Towards Zero-Shot Anomaly Detection and Reasoning with Multimodal Large Language Models

CV and Pattern Recognition

Finds weird things in pictures without examples.

11 Feb 2025 2

90%

ACD-CLIP: Decoupling Representation and Dynamic Fusion for Zero-Shot Anomaly Detection

CV and Pattern Recognition

Finds weird things in pictures better.

11 Aug 2025 0

90%

AD-DINOv3: Enhancing DINOv3 for Zero-Shot Anomaly Detection with Anomaly-Aware Calibration

CV and Pattern Recognition

Finds weird things even without knowing them.

17 Sep 2025 2

