Score: 2

Explanations as Bias Detectors: A Critical Study of Local Post-hoc XAI Methods for Fairness Exploration

Published: May 1, 2025 | arXiv ID: 2505.00802v1

By: Vasiliki Papanikou, Danae Pla Karidi, Evaggelia Pitoura, and more

Potential Business Impact:

Detects and explains unfair treatment of protected groups in AI-driven decisions, supporting fairness audits and more accountable systems.

Business Areas:
Predictive Analytics, Artificial Intelligence, Data and Analytics, Software

As Artificial Intelligence (AI) is increasingly used in areas that significantly impact human lives, concerns about fairness and transparency have grown, especially regarding the impact on protected groups. Recently, the intersection of explainability and fairness has emerged as an important area for promoting responsible AI systems. This paper explores how explainability methods can be leveraged to detect and interpret unfairness. We propose a pipeline that integrates local post-hoc explanation methods to derive fairness-related insights. During the pipeline design, we identify and address critical questions arising from the use of explanations as bias detectors, such as: the relationship between distributive and procedural fairness; the effect of removing the protected attribute; the consistency and quality of results across different explanation methods; the impact of various strategies for aggregating local explanations on group fairness evaluations; and the overall trustworthiness of explanations as bias detectors. Our results show the potential of explanation methods for fairness analysis while highlighting the need to carefully consider these critical aspects.

Country of Origin
🇬🇷 🇩🇪 Greece, Germany


Page Count
26 pages

Category
Computer Science:
Artificial Intelligence