Methodology for Comparing Machine Learning Algorithms for Survival Analysis

Published: October 28, 2025 | arXiv ID: 2510.24473v1

By: Lucas Buk Cardoso, Simone Aldrey Angelo, Yasmin Pacheco Gil Bonilha and more

Potential Business Impact:

Helps doctors estimate how long colorectal cancer patients are likely to survive.

Business Areas:
Predictive Analytics Artificial Intelligence, Data and Analytics, Software

This study presents a comparative methodological analysis of six machine learning models for survival analysis (MLSA). Using data from nearly 45,000 colorectal cancer patients in the Hospital-Based Cancer Registries of São Paulo, we evaluated six models that can predict survival in the presence of censored data: Random Survival Forest (RSF), Gradient Boosting for Survival Analysis (GBSA), Survival SVM (SSVM), XGBoost-Cox (XGB-Cox), XGBoost-AFT (XGB-AFT), and LightGBM (LGBM). Hyperparameter optimization was performed with different samplers, and model performance was assessed using the Concordance Index (C-Index), the inverse-probability-of-censoring-weighted C-Index (C-Index IPCW), time-dependent AUC, and the Integrated Brier Score (IBS). Survival curves produced by the models were compared with predictions from classification algorithms, and predictor interpretation was conducted using SHAP and permutation importance. XGB-AFT achieved the best performance (C-Index = 0.7618; C-Index IPCW = 0.7532), followed by GBSA and RSF. The results highlight the potential and applicability of MLSA to improve survival prediction and support decision making.
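
For readers unfamiliar with the C-Index reported in the abstract, the following is a minimal sketch of Harrell's concordance index in plain Python. It is an illustration only, not the authors' code: the function name `harrell_c_index` and the toy data are hypothetical, and pairs with tied observed times are simply skipped, which library implementations treat more carefully. The IPCW variant additionally reweights pairs by the estimated censoring distribution and is not shown here.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Fraction of comparable pairs that the risk scores order correctly.

    A pair (i, j) is comparable when subject i had an observed event and
    a strictly shorter observed time than subject j. The pair is concordant
    when i, who failed earlier, also received the higher risk score.
    Ties in risk score count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored subject cannot be the earlier member of a pair:
            # the true event time is unknown, only that it exceeds times[i].
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: observed times, event flags (False = censored).
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [True, False, True, True]

perfect = harrell_c_index(toy_times, toy_events, [0.9, 0.3, 0.6, 0.1])
partial = harrell_c_index(toy_times, toy_events, [0.9, 0.3, 0.1, 0.6])
constant = harrell_c_index(toy_times, toy_events, [0.5, 0.5, 0.5, 0.5])
```

On this toy data there are four comparable pairs, so `perfect` is 1.0, `partial` is 0.75, and `constant` is 0.5. A value of 0.5 corresponds to random ordering, which gives context for the reported 0.7618.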

Page Count
13 pages

Category
Computer Science:
Machine Learning (CS)