A comparative analysis of machine learning algorithms for predicting probabilities of default

Published: June 24, 2025 | arXiv ID: 2506.19789v1

By: Adrian Iulian Cristescu, Matteo Giordano

Potential Business Impact:

Helps banks estimate whether borrowers will repay their loans.

Predicting the probability of default (PD) of prospective loans is a critical objective for financial institutions. In recent years, machine learning (ML) algorithms have achieved remarkable success across a wide variety of prediction tasks; yet, they remain relatively underutilised in credit risk analysis. This paper highlights the opportunities that ML algorithms offer to this field by comparing the performance of five predictive models (Random Forests, Decision Trees, XGBoost, Gradient Boosting and AdaBoost) to the predominantly used logistic regression, over a benchmark dataset from Scheule et al. (Credit Risk Analytics: The R Companion). Our findings underscore the strengths and weaknesses of each method, providing valuable insights into the most effective ML algorithms for PD prediction in the context of loan portfolios.
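
The kind of comparison described in the abstract could be set up roughly as follows. This is a minimal sketch and not the authors' code. It assumes scikit-learn is available, uses synthetic data from `make_classification` in place of the Scheule et al. benchmark dataset, and omits XGBoost because it needs the separate `xgboost` package. The helper name `compare_pd_models`, the roughly 10% default rate, and the choice of AUC and Brier score as metrics are all assumptions made for illustration.

```python
# Illustrative sketch only: synthetic data stands in for the benchmark dataset,
# and the metrics (AUC, Brier score) are assumed, not taken from the paper.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def compare_pd_models(seed=0):
    """Fit several classifiers and score their predicted default probabilities."""
    # Imbalanced binary target: about 10% "defaults" (label 1), an assumption.
    X, y = make_classification(n_samples=2000, n_features=10, n_informative=5,
                               n_clusters_per_class=1, weights=[0.9, 0.1],
                               random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)

    models = {
        "Logistic regression": LogisticRegression(max_iter=1000),
        "Decision tree": DecisionTreeClassifier(max_depth=5, random_state=seed),
        "Random forest": RandomForestClassifier(random_state=seed),
        "Gradient boosting": GradientBoostingClassifier(random_state=seed),
        "AdaBoost": AdaBoostClassifier(random_state=seed),
    }

    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # Column 1 of predict_proba is the estimated probability of class 1.
        pd_hat = model.predict_proba(X_test)[:, 1]
        results[name] = {
            "auc": roc_auc_score(y_test, pd_hat),       # ranking quality
            "brier": brier_score_loss(y_test, pd_hat),  # probability accuracy
        }
    return results


if __name__ == "__main__":
    for name, metrics in compare_pd_models().items():
        print(f"{name:20s} AUC={metrics['auc']:.3f}  Brier={metrics['brier']:.3f}")
```

AUC measures how well a model ranks defaulters above non-defaulters, while the Brier score penalises miscalibrated probabilities, so reporting both gives a fuller picture when the output is meant to be used as a PD estimate rather than a class label.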

Page Count
6 pages

Category
Quantitative Finance: Risk Management