Bus Journey Time Prediction with Machine Learning: An Empirical Experience in Two Cities / Dunne, L.; Rocco Di Torrepadula, F.; Di Martino, S.; Mcardle, G.; Nardone, D. - 13912:(2023), pp. 105-120. (20th International Symposium on Web and Wireless Geographical Information Systems, W2GIS 2023 can 2023) [10.1007/978-3-031-34612-5_7].
Bus Journey Time Prediction with Machine Learning: An Empirical Experience in Two Cities
Rocco Di Torrepadula F.; Di Martino S.
2023
Abstract
With increasing urbanisation and a growing population, transport within cities has never been more important. Buses are the most widespread form of transport worldwide, often cheaper and more flexible than rail, but also less reliable. Long-term bus journey time predictions are important for advanced journey planning and for scheduling bus services. For this reason, several machine learning and deep learning techniques have been proposed to predict bus journey time. Still, owing to the number of factors involved, such as the complexity and noise in bus data and the road network topology, accurate predictions remain elusive. In this paper we aim to validate, on new bus datasets from Dublin and Genoa, some Machine Learning methods recently shown to be effective in the literature. The analysis of the results offers insights into bus networks, highlighting that the accuracy of the predictions is strongly related to the standard deviation of the whole journey times. Some bus routes show consistent prediction error across methods, and for these routes it makes sense to use methods that are fast and computationally efficient, as there is no benefit in applying more complex algorithms. We use features of the route data distribution to develop an explanatory model for the consistency of the route across methods, with a coefficient of determination (R²) of 0.94. Finally, we identify a systematic anomaly in the Dublin data that alters the performance of the methods.


