A comparative study and simple baseline for travel time prediction
Abstract Accurate travel time prediction (TTP) is essential to freeway users, including drivers, administrators, and freight-related companies, for enabling them to plan trips effectively and mitigate traffic congestion. However, TTP is a complex challenge even for researchers due to the difficulty...
| Main Authors: | Chuang-Chieh Lin, Ming-Chu Ho, Chih-Chieh Hung, Hui-Huang Hsu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | Scientific Reports |
| Online Access: | https://doi.org/10.1038/s41598-025-02303-5 |
| _version_ | 1849332784942809088 |
|---|---|
| author | Chuang-Chieh Lin; Ming-Chu Ho; Chih-Chieh Hung; Hui-Huang Hsu |
| author_facet | Chuang-Chieh Lin; Ming-Chu Ho; Chih-Chieh Hung; Hui-Huang Hsu |
| author_sort | Chuang-Chieh Lin |
| collection | DOAJ |
| description | Abstract Accurate travel time prediction (TTP) is essential to freeway users, including drivers, administrators, and freight-related companies, because it enables them to plan trips effectively and mitigate traffic congestion. However, TTP is a complex challenge even for researchers due to the difficulty of capturing numerous and diverse factors such as driver behaviors, rush hours, special events, and traffic incidents. A multitude of studies have proposed methods to address this issue, yet these approaches often involve multiple stages and steps, including data preprocessing, feature selection, data imputation, and prediction modeling. The intricacy of these processes makes it difficult to pinpoint which steps or factors most significantly influence prediction accuracy. In this paper, we investigate the impact of various steps on TTP accuracy by examining existing methods. Beginning with the data preprocessing phase, we evaluate the effect of deep learning, interpolation, and max value imputation techniques on dealing with missing values. We also examine the influence of temporal features and weather conditions on prediction accuracy. Furthermore, we compare five distinct hybrid models by assessing their strengths and limitations. To ensure that our experiments reflect real-world conditions, we use datasets from Taiwan and California. The experimental results reveal that the data preprocessing phase, including feature editing, plays a pivotal role in TTP accuracy. Additionally, base models such as Long Short-Term Memory (LSTM) and eXtreme Gradient Boosting (XGBoost) outperform all hybrid models on real-world datasets. Based on these insights, we propose a baseline that fuses the complementary strengths of XGBoost and LSTM via a gating network. This approach dynamically allocates weights, guided by key statistical features, to each model, enabling the model to robustly adapt to both stable and volatile traffic conditions and achieve superior prediction accuracy compared to existing methods. By breaking down the TTP process and analyzing each component, this study provides insights into the factors that most significantly affect prediction accuracy, thereby offering guidance and a foundation for developing more effective TTP methods in the future. |
| format | Article |
| id | doaj-art-7d409f9bf05a49e0a79c1f19bf773d16 |
| institution | Kabale University |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-7d409f9bf05a49e0a79c1f19bf773d16; 2025-08-20T03:46:07Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-07-01; 15; 1; 1; 24; 10.1038/s41598-025-02303-5; A comparative study and simple baseline for travel time prediction; Chuang-Chieh Lin (0), Ming-Chu Ho (1), Chih-Chieh Hung (2), Hui-Huang Hsu (3); Department of Computer Science and Engineering, National Taiwan Ocean University; Department of Management Information Systems, National Chung Hsing University; Department of Management Information Systems, National Chung Hsing University; Department of Computer Science and Information Engineering, Tamkang University; https://doi.org/10.1038/s41598-025-02303-5 |
| spellingShingle | Chuang-Chieh Lin; Ming-Chu Ho; Chih-Chieh Hung; Hui-Huang Hsu; A comparative study and simple baseline for travel time prediction; Scientific Reports |
| title | A comparative study and simple baseline for travel time prediction |
| title_sort | comparative study and simple baseline for travel time prediction |
| url | https://doi.org/10.1038/s41598-025-02303-5 |
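
The abstract in this record describes a baseline that fuses XGBoost and LSTM through a gating network whose weights are guided by statistical features. The record gives no implementation details, so the following is only a minimal sketch of that general idea under stated assumptions: the feature choice (mean and standard deviation of a recent travel-time window), the softmax gate, the function names, and the hand-set `gate_weights` are all hypothetical, and the two base-model predictions are passed in as plain numbers rather than produced by real models.

```python
import math
from statistics import mean, pstdev


def window_features(window):
    """Summary statistics of a recent travel-time window.

    Hypothetical feature choice; the leading 1.0 acts as a bias term.
    """
    return [1.0, mean(window), pstdev(window)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_prediction(window, xgb_pred, lstm_pred, gate_weights):
    """Blend two base-model predictions with feature-dependent weights.

    gate_weights holds one row of linear coefficients per base model
    (row 0 for XGBoost, row 1 for LSTM). In a trained system these would
    be learned; here they are supplied by hand for illustration.
    """
    feats = window_features(window)
    scores = [sum(w * f for w, f in zip(row, feats)) for row in gate_weights]
    w_xgb, w_lstm = softmax(scores)
    blended = w_xgb * xgb_pred + w_lstm * lstm_pred
    return blended, (w_xgb, w_lstm)


# Illustrative inputs only: travel times in minutes and made-up predictions.
recent = [10.0, 10.5, 14.0, 9.0, 13.5]
# Arbitrary gate that shifts weight toward the second model as the window's
# standard deviation grows. This direction is an assumption for the demo,
# not a finding taken from the paper.
gate_weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
pred, (w_xgb, w_lstm) = gated_prediction(recent, 11.0, 12.5, gate_weights)
print(f"weights: xgb={w_xgb:.3f}, lstm={w_lstm:.3f}, blended={pred:.3f}")
```

Because the gate outputs a softmax, the two weights always sum to one and the blended value stays between the two base predictions. With a perfectly flat window the standard deviation is zero, so this particular gate falls back to an equal-weight average.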