Cyclist crash severity modeling: A hybrid approach of XGBoost-SHAP and random parameters logit with heterogeneity in means and variances / Scarano, A.; Sadeghi, Matin; Mauriello, Filomena; Rella Riccardi, M.; Aghabayk, Kayvan; Montella, Alfonso. - In: JOURNAL OF SAFETY RESEARCH. - ISSN 0022-4375. - 93:(2025), pp. 373-398. [10.1016/j.jsr.2025.04.003]
Cyclist crash severity modeling: A hybrid approach of XGBoost-SHAP and random parameters logit with heterogeneity in means and variances
Scarano A. (first author); Mauriello Filomena; Rella Riccardi M.; Montella Alfonso
2025
Abstract
Introduction: Across the globe, policymakers are focusing on boosting sustainable transport options, notably cycling, to foster eco-friendly urban environments. However, the persistent safety challenges cyclists face continue to hinder these efforts. Method: This research explores a novel hybrid methodology to investigate the determinants of cyclist crash severity by combining eXtreme Gradient Boosting (XGBoost) with SHapley Additive exPlanations (SHAP) and a random parameters logit model with heterogeneity in means and variances (RPLHMV). Using crash data from the Department for Transport covering crashes in Great Britain from 2016 to 2019, the research evaluates the methodology’s effectiveness. The XGBoost-SHAP model reduced data dimensionality, allowing the application of a robust statistical model, while the RPLHMV captured heterogeneity in both the means and the variances of the random parameters. Results: The statistical model identified 10 significant variables with fixed parameters for fatal crashes, 22 significant variables for serious injuries, and two indicator variables (cyclist age ≤ 17 and overtaking as a manoeuvre for the second vehicle) with statistically significant random parameters associated with serious injury outcomes. The relationships revealed by the logit framework were further examined using XGBoost-SHAP, which provided deeper insights into the interactions between random and fixed parameters. The hybrid approach achieved a very good McFadden R² of 0.52 for the RPLHMV, demonstrating the model’s robustness. Conclusions: The hybrid approach not only provides a deeper understanding of crash severity dynamics but also helps in creating specific safety measures. Practical applications: This research can guide policymakers in identifying key factors and interactions that affect crash severity, leading to targeted safety improvements.

| File | Size | Format |
|---|---|---|
| Scarano et al 2025_Cyclist crash severity modeling_JSR 93.pdf (open access; Type: Publisher's version (PDF); License: Public domain) | 7.06 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


