A comparative study and simple baseline for travel time prediction


Bibliographic Details
Main Authors: Chuang-Chieh Lin, Ming-Chu Ho, Chih-Chieh Hung, Hui-Huang Hsu
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-02303-5
Collection: DOAJ
Abstract: Accurate travel time prediction (TTP) is essential to freeway users, including drivers, administrators, and freight-related companies, because it enables them to plan trips effectively and mitigate traffic congestion. However, TTP is a complex challenge even for researchers because of the difficulty of capturing numerous and diverse factors such as driver behaviors, rush hours, special events, and traffic incidents. A multitude of studies have proposed methods to address this issue, yet these approaches often involve multiple stages and steps, including data preprocessing, feature selection, data imputation, and prediction modeling. The intricacy of these processes makes it difficult to pinpoint which steps or factors most significantly influence prediction accuracy. In this paper, we investigate the impact of various steps on TTP accuracy by examining existing methods. Beginning with the data preprocessing phase, we evaluate the effect of deep learning, interpolation, and max value imputation techniques for handling missing values. We also examine the influence of temporal features and weather conditions on prediction accuracy. Furthermore, we compare five distinct hybrid models by assessing their strengths and limitations. To ensure that our experiments reflect real-world conditions, we conduct them using datasets from Taiwan and California. The experimental results reveal that the data preprocessing phase, including feature editing, plays a pivotal role in TTP accuracy. Additionally, base models such as Long Short-Term Memory (LSTM) and eXtreme Gradient Boosting (XGBoost) outperform all hybrid models on real-world datasets. Based on these insights, we propose a baseline that fuses the complementary strengths of XGBoost and LSTM via a gating network. This approach dynamically allocates weights, guided by key statistical features, to each model, enabling the combined model to adapt robustly to both stable and volatile traffic conditions and to achieve superior prediction accuracy compared to existing methods. By breaking down the TTP process and analyzing each component, this study provides insights into the factors that affect prediction accuracy most significantly, thereby offering guidance and a foundation for developing more effective TTP methods in the future.
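The gating idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not specify the gate's architecture, its inputs, or how it is trained. The sketch assumes a linear gate followed by a softmax, and assumes the "key statistical features" are the mean, standard deviation, and range of a recent travel-time window. The function names, the parameters `W` and `b`, and all numeric values are hypothetical and hand-set purely for illustration; the two base predictions stand in for the outputs of already-trained XGBoost and LSTM models.

```python
import numpy as np


def window_stats(history):
    """Summarize a recent travel-time window (assumed features: mean, std, range)."""
    h = np.asarray(history, dtype=float)
    return np.array([h.mean(), h.std(), h.max() - h.min()])


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def gated_prediction(pred_xgb, pred_lstm, stats, W, b):
    """Blend two base predictions using weights produced by a linear-softmax gate.

    W has shape (2, n_features) and b has shape (2,). The returned weights are
    non-negative and sum to 1, so the blend is a convex combination.
    """
    weights = softmax(W @ stats + b)
    blended = weights[0] * pred_xgb + weights[1] * pred_lstm
    return blended, weights


# Hypothetical, untrained gate parameters: higher volatility (std) shifts
# weight toward the second model. Which model should dominate in which regime
# is an assumption here, not a claim taken from the abstract.
W = np.array([[0.0, -1.0, 0.0],
              [0.0, 1.0, 0.0]])
b = np.array([2.0, 0.0])

stable_window = [300, 301, 299, 300]      # travel times in seconds (made up)
volatile_window = [300, 420, 280, 510]    # travel times in seconds (made up)

blend_stable, w_stable = gated_prediction(310.0, 330.0, window_stats(stable_window), W, b)
blend_volatile, w_volatile = gated_prediction(310.0, 330.0, window_stats(volatile_window), W, b)
```

In practice the gate parameters would be learned from data rather than set by hand. The point of the sketch is only that the mixing weights depend on the current traffic statistics instead of being fixed.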
ISSN: 2045-2322
Author Affiliations:
Chuang-Chieh Lin: Department of Computer Science and Engineering, National Taiwan Ocean University
Ming-Chu Ho: Department of Management Information Systems, National Chung Hsing University
Chih-Chieh Hung: Department of Management Information Systems, National Chung Hsing University
Hui-Huang Hsu: Department of Computer Science and Information Engineering, Tamkang University