Uncovering latent territorial structure in ICFES Saber 11 performance with Bayesian multilevel spatial models

Published: December 18, 2025 | arXiv ID: 2512.17119v1

By: Laura Pardo, Juan Sosa

This article develops a Bayesian hierarchical framework to analyze academic performance in the Saber 11 examination administered in Colombia in the second semester of 2022. Our approach combines multilevel regression with municipal and departmental spatial random effects, and it incorporates Ridge and Lasso regularization priors to compare the contribution of sociodemographic covariates. Inference is implemented in a fully open-source workflow using Markov chain Monte Carlo methods, and model behavior is assessed through synthetic data that mirror key features of the observed data. Simulation results indicate that Ridge provides the most balanced performance in parameter recovery, predictive accuracy, and sampling efficiency, while Lasso shows weaker fit and lower posterior stability, although it gains predictive accuracy under stronger multicollinearity. In the application, posterior rankings show a strong centralization of performance, with higher scores in central departments and lower scores in peripheral territories. The strongest correlates of scores are student-level living conditions, maternal education, access to educational resources, gender, and ethnic background, while spatial random effects capture residual regional disparities. A hybrid Bayesian segmentation based on K-means propagates posterior uncertainty into clustering at departmental, municipal, and spatial scales, revealing multiscale territorial patterns consistent with structural inequalities and informing territorial targeting in education policy.
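
One way to read the idea of propagating posterior uncertainty into a K-means segmentation is to cluster each posterior draw of the territorial effects separately and then average how often two units land in the same cluster. The sketch below illustrates that reading only; it is not the authors' implementation. The function names `kmeans_1d` and `coclustering_matrix` are hypothetical, the "posterior draws" are synthetic stand-ins for MCMC output, and the one-dimensional K-means with quantile initialization is an assumption chosen to keep the example self-contained.

```python
import numpy as np


def kmeans_1d(x, k, n_iter=50):
    """Plain Lloyd's algorithm on a 1-D array (hypothetical helper).

    Centers start at evenly spaced quantiles of x, so the result is deterministic.
    """
    centers = np.quantile(x, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        # Recompute each center as the mean of its members (keep it if empty).
        new_centers = centers.copy()
        for j in range(k):
            if np.any(labels == j):
                new_centers[j] = x[labels == j].mean()
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def coclustering_matrix(draws, k):
    """Average co-assignment over posterior draws (hypothetical helper).

    draws has shape (n_draws, n_units); entry (i, j) of the result is the
    fraction of draws in which units i and j fall in the same cluster.
    """
    n_draws, n_units = draws.shape
    co = np.zeros((n_units, n_units))
    for s in range(n_draws):
        labels = kmeans_1d(draws[s], k)
        co += labels[:, None] == labels[None, :]
    return co / n_draws


# Synthetic stand-in for MCMC draws of six territorial effects (assumed values).
rng = np.random.default_rng(0)
true_effects = np.array([-2.0, -1.9, 0.0, 0.1, 2.0, 2.1])
draws = true_effects + rng.normal(scale=0.1, size=(200, true_effects.size))

co = coclustering_matrix(draws, k=3)
print(np.round(co, 2))
```

Working with co-assignment frequencies rather than raw labels sidesteps label switching, since the matrix does not depend on how clusters are numbered in each draw. Entries close to 0 or 1 indicate a stable grouping, while intermediate values flag units whose cluster membership is uncertain under the posterior.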

Category: Statistics (Applications)