A multi-biomarker machine learning approach for early prediction of interstitial lung disease in rheumatoid arthritis
Abstract. Background: Interstitial lung disease (ILD) is a severe complication affecting 10–30% of rheumatoid arthritis (RA) patients. Current diagnostic methods typically detect ILD only after substantial lung damage has occurred. This delay emphasizes the need for early detection strategies. This st...
| Main Authors: | Jiaojiao Xu, Wei Zhang, Weili Bai, Nannan Gai, Jing Li, Yunqi Bao |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-08-01 |
| Series: | BMC Pulmonary Medicine |
| Subjects: | Rheumatoid arthritis; Interstitial lung disease; Machine learning; Krebs von Den Lungen-6 |
| Online Access: | https://doi.org/10.1186/s12890-025-03855-y |
| _version_ | 1849334037301166080 |
|---|---|
| author | Jiaojiao Xu; Wei Zhang; Weili Bai; Nannan Gai; Jing Li; Yunqi Bao |
| author_facet | Jiaojiao Xu; Wei Zhang; Weili Bai; Nannan Gai; Jing Li; Yunqi Bao |
| author_sort | Jiaojiao Xu |
| collection | DOAJ |
| description | Abstract. Background: Interstitial lung disease (ILD) is a severe complication affecting 10–30% of rheumatoid arthritis (RA) patients. Current diagnostic methods typically detect ILD only after substantial lung damage has occurred. This delay emphasizes the need for early detection strategies. This study aims to develop and validate machine learning models for early RA-ILD prediction and identify key predictive biomarkers. Methods: We conducted a cross-sectional study enrolling 149 RA patients (84 with ILD, 65 without ILD) between January 2020 and December 2023. We evaluated demographic characteristics, clinical parameters, and laboratory markers, including inflammatory indicators, hematological parameters, and specific biomarkers. We developed and compared four machine learning (ML) models (XGBoost, Random Forest, Support Vector Machine, and Logistic Regression) for ILD prediction capabilities. Results: The XGBoost model demonstrated superior predictive performance (AUC = 0.891, 95% CI: 0.847–0.935). Feature importance analysis identified Krebs von den Lungen-6 (KL-6) as the strongest predictor (importance score = 0.285), followed by interleukin-6 (IL-6) and cytokeratin 19 fragment (CYFRA21-1). The ILD group exhibited significantly elevated levels of inflammatory markers and specific biomarkers, particularly KL-6 (826.4 ± 458.2 vs. 285.6 ± 124.8 U/ml, P < 0.001), alongside distinct patterns in hematological parameters. Conclusion: Machine learning approaches, particularly XGBoost, demonstrate promising potential for early RA-ILD prediction. The integration of KL-6 and other identified biomarkers into clinical screening protocols may facilitate early detection and improved patient outcomes. These findings suggest that machine learning models could serve as valuable tools for risk stratification and early intervention in RA-ILD management, providing new approaches for individualized risk assessment in clinical practice. |
| format | Article |
| id | doaj-art-c80ed961e844422e8e7e68a130de1c79 |
| institution | Kabale University |
| issn | 1471-2466 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | BMC |
| record_format | Article |
| series | BMC Pulmonary Medicine |
| spelling | doaj-art-c80ed961e844422e8e7e68a130de1c79; 2025-08-20T03:45:40Z; eng; BMC; BMC Pulmonary Medicine; 1471-2466; 2025-08-01; 25; 1; 1; 13; 10.1186/s12890-025-03855-y; A multi-biomarker machine learning approach for early prediction of interstitial lung disease in rheumatoid arthritis; Jiaojiao Xu, Wei Zhang, Weili Bai, Nannan Gai, Jing Li, Yunqi Bao (all: Department of Rheumatology, Xi’an Fifth Hospital); https://doi.org/10.1186/s12890-025-03855-y; Rheumatoid arthritis; Interstitial lung disease; Machine learning; Krebs von Den Lungen-6 |
| spellingShingle | Jiaojiao Xu; Wei Zhang; Weili Bai; Nannan Gai; Jing Li; Yunqi Bao; A multi-biomarker machine learning approach for early prediction of interstitial lung disease in rheumatoid arthritis; BMC Pulmonary Medicine; Rheumatoid arthritis; Interstitial lung disease; Machine learning; Krebs von Den Lungen-6 |
| title | A multi-biomarker machine learning approach for early prediction of interstitial lung disease in rheumatoid arthritis |
| title_full | A multi-biomarker machine learning approach for early prediction of interstitial lung disease in rheumatoid arthritis |
| title_fullStr | A multi-biomarker machine learning approach for early prediction of interstitial lung disease in rheumatoid arthritis |
| title_full_unstemmed | A multi-biomarker machine learning approach for early prediction of interstitial lung disease in rheumatoid arthritis |
| title_short | A multi-biomarker machine learning approach for early prediction of interstitial lung disease in rheumatoid arthritis |
| title_sort | multi biomarker machine learning approach for early prediction of interstitial lung disease in rheumatoid arthritis |
| topic | Rheumatoid arthritis; Interstitial lung disease; Machine learning; Krebs von Den Lungen-6 |
| url | https://doi.org/10.1186/s12890-025-03855-y |
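
The description above reports model performance as an AUC with a 95% confidence interval. The record does not say how that interval was computed, so the following is only a minimal sketch of one common approach, a percentile bootstrap over patients. The data are synthetic (only the group sizes 84 and 65 mirror the abstract), and the helper names `auc` and `bootstrap_auc_ci` are hypothetical, not taken from the article.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the article)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = int(round((alpha / 2) * n_boot))
    hi_idx = int(round((1 - alpha / 2) * n_boot)) - 1
    return stats[lo_idx], stats[hi_idx]


# Synthetic risk scores: 84 "ILD" cases drawn from a shifted normal
# distribution and 65 "non-ILD" cases from a standard normal. These are
# illustrative values only, not study data.
data_rng = random.Random(42)
labels = [1] * 84 + [0] * 65
scores = ([data_rng.gauss(1.5, 1.0) for _ in range(84)]
          + [data_rng.gauss(0.0, 1.0) for _ in range(65)])

point = auc(labels, scores)
low, high = bootstrap_auc_ci(labels, scores)
print(f"AUC = {point:.3f}, 95% CI: {low:.3f}-{high:.3f}")
```

With a sample of this size the interval is fairly wide, which is one reason an AUC from a single-centre cohort is usually read together with its confidence interval rather than as a point value alone.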