Construction of machine learning-based models for screening the high-risk patients with gastric precancerous lesions
Abstract Background The individualized prediction and discrimination of precancerous lesions of gastric cancer (PLGC) is critical for the early prevention of gastric cancer (GC). However, accurate non-invasive methods for distinguishing between PLGC and GC are currently lacking. This study therefore...
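Main Authors: | Shuxian Yu, Haiyang Jiang, Jing Xia, Jie Gu, Mengting Chen, Yan Wang, Xiaohong Zhao, Zehua Liao, Puhua Zeng, Tian Xie, Xinbing Sui |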
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2025-01-01 |
Series: | Chinese Medicine |
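Subjects: | Precancerous lesions of gastric cancer; Gastric cancer; Machine learning; Deep learning; Traditional Chinese medicine |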
Online Access: | https://doi.org/10.1186/s13020-025-01059-4 |
_version_ | 1841544294981173248 |
---|---|
author | Shuxian Yu, Haiyang Jiang, Jing Xia, Jie Gu, Mengting Chen, Yan Wang, Xiaohong Zhao, Zehua Liao, Puhua Zeng, Tian Xie, Xinbing Sui |
author_facet | Shuxian Yu, Haiyang Jiang, Jing Xia, Jie Gu, Mengting Chen, Yan Wang, Xiaohong Zhao, Zehua Liao, Puhua Zeng, Tian Xie, Xinbing Sui |
author_sort | Shuxian Yu |
collection | DOAJ |
description | Abstract Background: The individualized prediction and discrimination of precancerous lesions of gastric cancer (PLGC) are critical for the early prevention of gastric cancer (GC). However, accurate non-invasive methods for distinguishing between PLGC and GC are currently lacking. This study therefore aimed to develop risk prediction models using machine learning and deep learning techniques to aid the early diagnosis of GC. Methods: A total of 2229 subjects were recruited from nine tertiary hospitals between October 2022 and November 2023. We designed a comprehensive questionnaire, identified statistically significant factors, and created a web-based nomogram. A risk prediction model was then developed using machine learning techniques. In addition, a tongue image-based risk prediction model was established using deep learning algorithms. Results: Based on logistic regression analysis, a dynamic web-based nomogram was developed; it is freely accessible at https://yz6677.shinyapps.io/GC67/ . The prediction model was then built with ten different machine learning algorithms, of which the Random Forest (RF) model achieved the highest accuracy, at 85.65%. According to the predictive results, the top 10 key risk factors were age, traditional Chinese medicine (TCM) constitution type, tongue coating color, tongue color, irregular meals, pickled food, greasy fur, eating overly hot food, anxiety, and sleep onset latency. These factors are all significant risk indicators for progression from PLGC to GC. Subsequently, the Swin Transformer architecture was used to develop a tongue image-based model for predicting the risk of PLGC progression. The verification set showed an accuracy of 73.33% and an area under the curve (AUC) greater than 0.8 across all models. Conclusions: Our study developed machine learning and deep learning-based models for predicting the risk of progression from PLGC to GC, which can help identify high-risk patients among those with PLGC and improve the early diagnosis of GC. |
format | Article |
id | doaj-art-85d58e914da34d99875ee56c35546a9f |
institution | Kabale University |
issn | 1749-8546 |
language | English |
publishDate | 2025-01-01 |
publisher | BMC |
record_format | Article |
series | Chinese Medicine |
spelling | doaj-art-85d58e914da34d99875ee56c35546a9f; 2025-01-12T12:39:44Z; eng; BMC; Chinese Medicine; 1749-8546; 2025-01-01; 20; 1; 1; 15; 10.1186/s13020-025-01059-4; Construction of machine learning-based models for screening the high-risk patients with gastric precancerous lesions; Shuxian Yu (0), Haiyang Jiang (1), Jing Xia (2), Jie Gu (3), Mengting Chen (4), Yan Wang (5), Xiaohong Zhao (6), Zehua Liao (7), Puhua Zeng (8), Tian Xie (9), Xinbing Sui (10); School of Pharmacy, Hangzhou Normal University (0-7, 9, 10); The Affiliated Hospital of Hunan Academy of Traditional Chinese Medicine (8); https://doi.org/10.1186/s13020-025-01059-4; Precancerous lesions of gastric cancer; Gastric cancer; Machine learning; Deep learning; Traditional Chinese medicine |
spellingShingle | Shuxian Yu Haiyang Jiang Jing Xia Jie Gu Mengting Chen Yan Wang Xiaohong Zhao Zehua Liao Puhua Zeng Tian Xie Xinbing Sui Construction of machine learning-based models for screening the high-risk patients with gastric precancerous lesions Chinese Medicine Precancerous lesions of gastric cancer Gastric cancer Machine learning Deep learning Traditional Chinese medicine |
title | Construction of machine learning-based models for screening the high-risk patients with gastric precancerous lesions |
title_full | Construction of machine learning-based models for screening the high-risk patients with gastric precancerous lesions |
title_fullStr | Construction of machine learning-based models for screening the high-risk patients with gastric precancerous lesions |
title_full_unstemmed | Construction of machine learning-based models for screening the high-risk patients with gastric precancerous lesions |
title_short | Construction of machine learning-based models for screening the high-risk patients with gastric precancerous lesions |
title_sort | construction of machine learning based models for screening the high risk patients with gastric precancerous lesions |
topic | Precancerous lesions of gastric cancer; Gastric cancer; Machine learning; Deep learning; Traditional Chinese medicine |
url | https://doi.org/10.1186/s13020-025-01059-4 |