Score: 0

Explainable AI in Big Data Fraud Detection

Published: December 17, 2025 | arXiv ID: 2512.16037v1

By: Ayush Jain, Rahul Kulkarni, Siyi Lin

Big Data has become central to modern applications in finance, insurance, and cybersecurity, enabling machine learning systems to perform large-scale risk assessments and fraud detection. However, the increasing dependence on automated analytics introduces important concerns about transparency, regulatory compliance, and trust. This paper examines how explainable artificial intelligence (XAI) can be integrated into Big Data analytics pipelines for fraud detection and risk management. We review key Big Data characteristics and survey major analytical tools, including distributed storage systems, streaming platforms, and advanced fraud detection models such as anomaly detectors, graph-based approaches, and ensemble classifiers. We also present a structured review of widely used XAI methods, including LIME, SHAP, counterfactual explanations, and attention mechanisms, and analyze their strengths and limitations when deployed at scale. Based on these findings, we identify key research gaps related to scalability, real-time processing, and explainability for graph and temporal models. To address these challenges, we outline a conceptual framework that integrates scalable Big Data infrastructure with context-aware explanation mechanisms and human feedback. The paper concludes with open research directions in scalable XAI, privacy-aware explanations, and standardized evaluation methods for explainable fraud detection systems.

Category
Computer Science:
Machine Learning (CS)