A comparative study and simple baseline for travel time prediction

Abstract Accurate travel time prediction (TTP) is essential to freeway users, including drivers, administrators, and freight-related companies, for enabling them to plan trips effectively and mitigate traffic congestion. However, TTP is a complex challenge even for researchers due to the difficulty...

Bibliographic Details
Main Authors:	Chuang-Chieh Lin, Ming-Chu Ho, Chih-Chieh Hung, Hui-Huang Hsu
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-07-01
Series:	Scientific Reports
Online Access:	https://doi.org/10.1038/s41598-025-02303-5

_version_	1849332784942809088
author	Chuang-Chieh Lin, Ming-Chu Ho, Chih-Chieh Hung, Hui-Huang Hsu
collection	DOAJ
description	Abstract Accurate travel time prediction (TTP) is essential to freeway users, including drivers, administrators, and freight-related companies, enabling them to plan trips effectively and mitigate traffic congestion. However, TTP is a complex challenge even for researchers because of the difficulty of capturing numerous and diverse factors such as driver behaviors, rush hours, special events, and traffic incidents. A multitude of studies have proposed methods to address this issue, yet these approaches often involve multiple stages and steps, including data preprocessing, feature selection, data imputation, and the prediction model itself. The intricacy of these processes makes it difficult to pinpoint which steps or factors most significantly influence prediction accuracy. In this paper, we investigate the impact of various steps on TTP accuracy by examining existing methods. Beginning with the data preprocessing phase, we evaluate the effect of deep learning, interpolation, and max value imputation techniques for handling missing values. We also examine the influence of temporal features and weather conditions on prediction accuracy. Furthermore, we compare five distinct hybrid models by assessing their strengths and limitations. To ensure our experiments align well with real-world situations, we conduct them on datasets from Taiwan and California. The experimental results reveal that the data preprocessing phase, including feature editing, plays a pivotal role in TTP accuracy. Additionally, base models such as Long Short-Term Memory (LSTM) and eXtreme Gradient Boosting (XGBoost) outperform all hybrid models on real-world datasets. Based on these insights, we propose a baseline that fuses the complementary strengths of XGBoost and LSTM via a gating network. This approach dynamically allocates weights to each model, guided by key statistical features, enabling the combined model to adapt robustly to both stable and volatile traffic conditions and achieve superior prediction accuracy compared to existing methods. By breaking down the TTP process and analyzing each component, this study provides insights into the factors that most significantly affect prediction accuracy, thereby offering guidance and a foundation for developing more effective TTP methods in the future.
format	Article
id	doaj-art-7d409f9bf05a49e0a79c1f19bf773d16
institution	Kabale University
issn	2045-2322
language	English
publishDate	2025-07-01
publisher	Nature Portfolio
record_format	Article
series	Scientific Reports
spelling	doaj-art-7d409f9bf05a49e0a79c1f19bf773d16; 2025-08-20T03:46:07Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-07-01; 151124; 10.1038/s41598-025-02303-5; A comparative study and simple baseline for travel time prediction; Chuang-Chieh Lin; Ming-Chu Ho; Chih-Chieh Hung; Hui-Huang Hsu; Department of Computer Science and Engineering, National Taiwan Ocean University; Department of Management Information Systems, National Chung Hsing University; Department of Management Information Systems, National Chung Hsing University; Department of Computer Science and Information Engineering, Tamkang University; https://doi.org/10.1038/s41598-025-02303-5
title	A comparative study and simple baseline for travel time prediction
url	https://doi.org/10.1038/s41598-025-02303-5
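
The abstract describes a baseline in which a gating network, guided by key statistical features, assigns weights to an XGBoost model and an LSTM model. The record gives no implementation details, so the following is only a minimal sketch of the general idea of gated fusion, not the authors' method. Everything in it is an assumption made for illustration: the function names, the choice of window statistics, the linear softmax gate, the hand-set gate parameters, and the two fixed numbers standing in for the base models' predictions. The direction in which the gate shifts weight is likewise arbitrary.

```python
import numpy as np

def window_stats(window):
    """Summary statistics of a recent travel-time window (hypothetical feature choice)."""
    w = np.asarray(window, dtype=float)
    return np.array([w.mean(), w.std(), w.max() - w.min()])

def softmax(z):
    z = z - z.max()              # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def gated_fusion(pred_a, pred_b, stats, gate_W, gate_b):
    """Blend two base predictions with weights from a linear softmax gate."""
    weights = softmax(gate_W @ stats + gate_b)   # shape (2,), sums to 1
    fused = weights[0] * pred_a + weights[1] * pred_b
    return fused, weights

# Hand-set gate parameters for illustration only; a real gate would be learned.
# This gate looks only at the standard deviation (second statistic).
gate_W = np.array([[0.0, -0.02, 0.0],
                   [0.0,  0.02, 0.0]])
gate_b = np.array([1.0, -1.0])

stable = [300, 302, 299, 301]      # seconds, low variance
volatile = [300, 420, 280, 510]    # seconds, high variance

# 310.0 and 330.0 stand in for the outputs of two already-trained base models.
fused_s, w_s = gated_fusion(310.0, 330.0, window_stats(stable), gate_W, gate_b)
fused_v, w_v = gated_fusion(310.0, 330.0, window_stats(volatile), gate_W, gate_b)

print("stable window:   weights", w_s, "fused", fused_s)
print("volatile window: weights", w_v, "fused", fused_v)
```

With these illustrative parameters, the low-variance window puts most of the weight on the first prediction and the high-variance window puts most of it on the second. The fused value always lies between the two base predictions because the weights are non-negative and sum to one.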


Similar Items