Non-Linear Determinants of Pedestrian Injury Severity: Evidence from Administrative Data in Great Britain

Published: December 3, 2025 | arXiv ID: 2512.04022v1

By: Yifei Tong

Potential Business Impact:

Identifies the factors that make road collisions more severe for pedestrians.

Business Areas:
Smart Cities, Real Estate

This study investigates the non-linear determinants of pedestrian injury severity using Great Britain's 2023 STATS19 administrative dataset. To address data-quality problems, including missing values and substantial class imbalance, we apply a preprocessing pipeline that combines mode imputation with the Synthetic Minority Over-sampling Technique (SMOTE). We fit non-parametric ensemble models (Random Forest and XGBoost) to capture the complex interactions and heterogeneity that linear models often miss, and use SHapley Additive exPlanations (SHAP) to keep the models interpretable and to isolate marginal feature effects. Our analysis shows that vehicle count, speed limit, lighting, and road surface conditions are the primary predictors of severity, with police attendance and junction characteristics further distinguishing severe collisions. Spatially, pedestrian risk is concentrated in dense urban Local Authority Districts (LADs), but certain rural LADs experience disproportionately severe outcomes conditional on a collision occurring. These findings underscore the value of combining spatial analysis with interpretable machine learning to guide geographically targeted speed management, infrastructure investment, and enforcement strategies.
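
The two preprocessing steps named in the abstract, mode imputation and SMOTE, can be illustrated with a minimal standard-library sketch. This is not the paper's pipeline: the toy rows, the two columns (a speed limit and a lighting code), the labels, and the helper names `mode_impute` and `smote` are all hypothetical, and the interpolation shown treats every feature as numeric, which may differ from how the authors handled categorical STATS19 fields.

```python
import random
from collections import Counter


def mode_impute(rows, missing=None):
    """Fill missing entries in each column with that column's most common observed value."""
    n_cols = len(rows[0])
    modes = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] != missing]
        modes.append(Counter(observed).most_common(1)[0][0])
    return [[modes[j] if r[j] == missing else r[j] for j in range(n_cols)]
            for r in rows]


def smote(minority, n_new, k=1, seed=0):
    """Generate n_new synthetic minority points by interpolating between a
    randomly chosen minority point and one of its k nearest minority neighbours.
    Requires at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Rank the remaining minority points by squared Euclidean distance to base.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: [speed_limit, lighting_code]; None marks a missing value.
rows = [[30, 1], [30, None], [20, 1], [None, 4], [60, 4], [30, 1]]
labels = [0, 0, 0, 1, 1, 0]  # 1 = severe (minority class), 0 = slight

imputed = mode_impute(rows)
minority = [r for r, y in zip(imputed, labels) if y == 1]
n_needed = labels.count(0) - labels.count(1)  # synthetic points needed to balance
synthetic = smote(minority, n_needed)

balanced_rows = imputed + synthetic
balanced_labels = labels + [1] * len(synthetic)
print(Counter(balanced_labels))
```

In practice the oversampling step is normally applied to the training split only, so that synthetic points never leak into the evaluation data.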

Country of Origin
🇺🇸 United States

Page Count
18 pages

Category
Computer Science:
Computers and Society