Diffusion Counterfactuals for Image Regressors
By: Trung Duc Ha, Sidney Bender
Potential Business Impact:
Shows how to change images so a model predicts a different value.
Counterfactual explanations have been successfully applied to create human-interpretable explanations for various black-box models. They are particularly useful for tasks in the image domain, where the quality of the explanations benefits from recent advances in generative models. Although counterfactual explanations have been widely applied to classification models, their application to regression tasks remains underexplored. We present two methods to create counterfactual explanations for image regression tasks using diffusion-based generative models, addressing challenges in sparsity and quality: 1) one based on a Denoising Diffusion Probabilistic Model (DDPM) that operates directly in pixel space, and 2) another based on a Diffusion Autoencoder operating in latent space. Both produce realistic, semantic, and smooth counterfactuals on CelebA-HQ and a synthetic data set, providing easily interpretable insights into the decision-making process of the regression model and revealing spurious correlations. We find that for regression counterfactuals, changes in features depend on the region of the predicted value. Large semantic changes are needed for significant changes in predicted values, making it harder to find sparse counterfactuals than with classifiers. Moreover, pixel-space counterfactuals are sparser, while latent-space counterfactuals are of higher quality and allow bigger semantic changes.
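The shared idea behind both methods is to steer a diffusion model's generation with gradients from the regressor. The sketch below is a rough PyTorch illustration of the latent-space variant's general recipe, not the paper's actual algorithm or API: optimize a semantic latent so the decoded image's predicted value moves toward a target while staying close to the original. The `encode`, `decode`, and `regressor` callables are hypothetical stand-ins for a diffusion autoencoder and a frozen image regressor.

```python
# Minimal sketch of a latent-space counterfactual search for a regressor.
# `encode`, `decode`, and `regressor` are hypothetical stand-ins; the
# paper's actual models and optimization details are not reproduced here.
import torch
import torch.nn.functional as F

def regression_counterfactual(x, target, encode, decode, regressor,
                              dist_weight=0.1, steps=200, lr=0.05):
    """Optimize a latent so the decoded image is predicted as `target`."""
    z0 = encode(x).detach()              # semantic latent of the original image
    z = z0.clone().requires_grad_(True)  # latent we will perturb
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        x_cf = decode(z)                 # candidate counterfactual image
        pred = regressor(x_cf)
        # Pull the prediction toward the target value while keeping the
        # latent close to the original (proximity/sparsity term).
        loss = F.mse_loss(pred, target) + dist_weight * F.mse_loss(z, z0)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return decode(z.detach())
```

The proximity term is what encourages sparse edits; consistent with the abstract's finding, targets far from the original predicted value force the optimizer to accept larger semantic changes before the regression loss can drop.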
Similar Papers
From Visual Explanations to Counterfactual Explanations with Latent Diffusion
CV and Pattern Recognition
Shows why computers make wrong guesses about pictures.
Diffusion Counterfactual Generation with Semantic Abduction
Machine Learning (CS)
Changes pictures while keeping the person the same.
Unifying Image Counterfactuals and Feature Attributions with Latent-Space Adversarial Attacks
Machine Learning (CS)
Shows why computers see what they see.