Interpreting LLMs as Credit Risk Classifiers: Do Their Feature Explanations Align with Classical ML?
By: Saeed AlMarri, Kristof Juhasz, Mathieu Ravaut, and more
Potential Business Impact:
Helps banks judge whether LLM-based loan default predictions, and the explanations they give, can be trusted alongside classical ML.
Large Language Models (LLMs) are increasingly explored as flexible alternatives to classical machine learning models for classification tasks through zero-shot prompting. However, their suitability for structured tabular data remains underexplored, especially in high-stakes applications such as financial risk assessment. This study conducts a systematic comparison between zero-shot LLM-based classifiers and LightGBM, a state-of-the-art gradient-boosting model, on a real-world loan default prediction task. We evaluate their predictive performance, analyze feature attributions using SHAP, and assess the reliability of LLM-generated self-explanations. While LLMs are able to identify key financial risk indicators, their feature importance rankings diverge notably from LightGBM's, and their self-explanations often fail to align with empirical SHAP attributions. These findings highlight the limitations of LLMs as standalone models for structured financial risk prediction and raise concerns about the trustworthiness of their self-generated explanations. Our results underscore the need for explainability audits, baseline comparisons with interpretable models, and human-in-the-loop oversight when deploying LLMs in risk-sensitive financial environments.
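The comparison the abstract describes, SHAP attributions from a LightGBM baseline checked against an LLM's self-reported feature ranking, can be reproduced in outline. The sketch below is not the authors' code: the dataset path, column names, hyperparameters, and the LLM-derived ranking (llm_order) are illustrative assumptions.

```python
# Minimal sketch: compare LightGBM SHAP importances with an LLM's
# self-reported feature ranking via rank correlation.
# All data, column names, and the llm_order list are placeholders.
import numpy as np
import pandas as pd
import lightgbm as lgb
import shap
from scipy.stats import spearmanr

# Hypothetical tabular loan dataset with a binary "default" label.
df = pd.read_csv("loan_data.csv")  # placeholder path
features = [c for c in df.columns if c != "default"]
X, y = df[features], df["default"]

# Gradient-boosting baseline (hyperparameters are illustrative).
model = lgb.LGBMClassifier(n_estimators=300, learning_rate=0.05)
model.fit(X, y)

# Global feature importance as mean |SHAP| per feature.
explainer = shap.TreeExplainer(model)
sv = explainer.shap_values(X)
if isinstance(sv, list):   # older shap: list of [class0, class1] arrays
    sv = sv[1]
if sv.ndim == 3:           # newer shap: (samples, features, classes)
    sv = sv[:, :, 1]
mean_abs_shap = np.abs(sv).mean(axis=0)
lgbm_rank = pd.Series(mean_abs_shap, index=features).rank(ascending=False)

# Hypothetical ranking parsed from the LLM's zero-shot self-explanation
# (most important feature first).
llm_order = ["debt_to_income", "credit_score", "loan_amount", "income", "age"]
llm_rank = pd.Series(range(1, len(llm_order) + 1), index=llm_order)

# Rank agreement on the features covered by both rankings.
common = lgbm_rank.index.intersection(llm_rank.index)
rho, p = spearmanr(lgbm_rank[common], llm_rank[common])
print(f"Spearman rank correlation: {rho:.2f} (p={p:.3f})")
```

A low Spearman correlation here would correspond to the paper's observation that LLM feature importance rankings diverge from the LightGBM/SHAP attributions.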
Similar Papers
Measuring What LLMs Think They Do: SHAP Faithfulness and Deployability on Financial Tabular Classification
Machine Learning (CS)
Makes AI explain financial risks more honestly.
Interpretable LLMs for Credit Risk: A Systematic Review and Taxonomy
Risk Management
Helps banks guess if people will pay back loans.
Forecasting Credit Ratings: A Case Study where Traditional Methods Outperform Generative LLMs
Risk Management
Helps predict company money health better.