Novel multiclass classification machine learning approach for the early-stage classification of systemic autoimmune rheumatic diseases
Objective Systemic autoimmune rheumatic diseases (SARDs) encompass a diverse group of complex conditions with overlapping clinical features, making accurate diagnosis challenging. This study aims to develop a multiclass machine learning (ML) model for early-stage SARDs classification using accessibl...
| Main Authors: | Feng Wang, Ting Wang, Wei Wei, Yun Wang, Xu Yuan, Renren Ouyang, Rujia Chen, Hongyan Hou, Shiji Wu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMJ Publishing Group, 2024-05-01 |
| Series: | Lupus Science and Medicine |
| Online Access: | https://lupus.bmj.com/content/11/1/e001125.full |
| _version_ | 1846172418386690048 |
|---|---|
| author | Feng Wang, Ting Wang, Wei Wei, Yun Wang, Xu Yuan, Renren Ouyang, Rujia Chen, Hongyan Hou, Shiji Wu |
| author_facet | Feng Wang, Ting Wang, Wei Wei, Yun Wang, Xu Yuan, Renren Ouyang, Rujia Chen, Hongyan Hou, Shiji Wu |
| author_sort | Feng Wang |
| collection | DOAJ |
| description | Objective: Systemic autoimmune rheumatic diseases (SARDs) encompass a diverse group of complex conditions with overlapping clinical features, making accurate diagnosis challenging. This study aims to develop a multiclass machine learning (ML) model for early-stage SARDs classification using accessible laboratory indicators. Methods: A total of 925 SARDs patients were included, categorised into SLE, Sjögren’s syndrome (SS) and inflammatory myositis (IM). Clinical characteristics and laboratory markers were collected and nine key indicators, including anti-dsDNA, anti-SS-A60, anti-Sm/nRNP, antichromatin, anti-dsDNA (indirect immunofluorescence assay), haemoglobin (Hb), platelet, neutrophil percentage and cytoplasmic patterns (AC-19, AC-20), were selected for model building. Various ML algorithms were used to construct a tripartite classification ML model. Results: Patients were divided into two cohorts; cohort 1 was used to construct a tripartite classification model. Among models assessed, the random forest (RF) model demonstrated superior performance in distinguishing SLE, IM and SS (with area under curve=0.953, 0.903 and 0.836; accuracy=0.892, 0.869 and 0.857; sensitivity=0.890, 0.868 and 0.795; specificity=0.910, 0.836 and 0.748; positive predictive value=0.922, 0.727 and 0.663; and negative predictive value=0.854, 0.915 and 0.879). The RF model excelled in classifying SLE (precision=0.930, recall=0.985, F1 score=0.957). For IM and SS, RF model outcomes were (precision=0.793, 0.950; recall=0.920, 0.679; F1 score=0.852, 0.792). Cohort 2 served as an external validation set, achieving an overall accuracy of 87.3%. Individual classification performances for SLE, SS and IM were excellent, with precision, recall and F1 scores specified. SHAP analysis highlighted significant contributions from antibody profiles. Conclusion: This pioneering multiclass ML model, using basic laboratory indicators, enhances clinical feasibility and demonstrates promising potential for SARDs classification. The collaboration of clinical expertise and ML offers a nuanced approach to SARDs classification, with potential for enhanced patient care. |
| format | Article |
| id | doaj-art-f459f3c27e8c48deab380ad59af4099c |
| institution | Kabale University |
| issn | 2053-8790 |
| language | English |
| publishDate | 2024-05-01 |
| publisher | BMJ Publishing Group |
| record_format | Article |
| series | Lupus Science and Medicine |
| spelling | doaj-art-f459f3c27e8c48deab380ad59af4099c; 2024-11-10T08:30:08Z; eng; BMJ Publishing Group; Lupus Science and Medicine; 2053-8790; 2024-05-01; 11; 1; 10.1136/lupus-2023-001125; Novel multiclass classification machine learning approach for the early-stage classification of systemic autoimmune rheumatic diseases; Feng Wang 0; Ting Wang 1; Wei Wei 2; Yun Wang 3; Xu Yuan 4; Renren Ouyang 5; Rujia Chen 6; Hongyan Hou 7; Shiji Wu 8; Department of Laboratory Medicine, Nantong University Affiliated Hospital, Nantong, China; Department of Laboratory Medicine, Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Wuhan, Hubei, China; ACellera, Vancouver, BC, Canada; Department of Laboratory Medicine, Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Wuhan, Hubei, China; Department of Laboratory Medicine, Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Wuhan, Hubei, China; Department of Laboratory Medicine, Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Wuhan, Hubei, China; Department of Laboratory Medicine, Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Wuhan, Hubei, China; Department of Laboratory Medicine, Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Wuhan, Hubei, China; Department of Laboratory Medicine, Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Wuhan, Hubei, China; Objective: Systemic autoimmune rheumatic diseases (SARDs) encompass a diverse group of complex conditions with overlapping clinical features, making accurate diagnosis challenging. This study aims to develop a multiclass machine learning (ML) model for early-stage SARDs classification using accessible laboratory indicators. Methods: A total of 925 SARDs patients were included, categorised into SLE, Sjögren’s syndrome (SS) and inflammatory myositis (IM). Clinical characteristics and laboratory markers were collected and nine key indicators, including anti-dsDNA, anti-SS-A60, anti-Sm/nRNP, antichromatin, anti-dsDNA (indirect immunofluorescence assay), haemoglobin (Hb), platelet, neutrophil percentage and cytoplasmic patterns (AC-19, AC-20), were selected for model building. Various ML algorithms were used to construct a tripartite classification ML model. Results: Patients were divided into two cohorts; cohort 1 was used to construct a tripartite classification model. Among models assessed, the random forest (RF) model demonstrated superior performance in distinguishing SLE, IM and SS (with area under curve=0.953, 0.903 and 0.836; accuracy=0.892, 0.869 and 0.857; sensitivity=0.890, 0.868 and 0.795; specificity=0.910, 0.836 and 0.748; positive predictive value=0.922, 0.727 and 0.663; and negative predictive value=0.854, 0.915 and 0.879). The RF model excelled in classifying SLE (precision=0.930, recall=0.985, F1 score=0.957). For IM and SS, RF model outcomes were (precision=0.793, 0.950; recall=0.920, 0.679; F1 score=0.852, 0.792). Cohort 2 served as an external validation set, achieving an overall accuracy of 87.3%. Individual classification performances for SLE, SS and IM were excellent, with precision, recall and F1 scores specified. SHAP analysis highlighted significant contributions from antibody profiles. Conclusion: This pioneering multiclass ML model, using basic laboratory indicators, enhances clinical feasibility and demonstrates promising potential for SARDs classification. The collaboration of clinical expertise and ML offers a nuanced approach to SARDs classification, with potential for enhanced patient care.; https://lupus.bmj.com/content/11/1/e001125.full |
| spellingShingle | Feng Wang Ting Wang Wei Wei Yun Wang Xu Yuan Renren Ouyang Rujia Chen Hongyan Hou Shiji Wu Novel multiclass classification machine learning approach for the early-stage classification of systemic autoimmune rheumatic diseases Lupus Science and Medicine |
| title | Novel multiclass classification machine learning approach for the early-stage classification of systemic autoimmune rheumatic diseases |
| title_full | Novel multiclass classification machine learning approach for the early-stage classification of systemic autoimmune rheumatic diseases |
| title_fullStr | Novel multiclass classification machine learning approach for the early-stage classification of systemic autoimmune rheumatic diseases |
| title_full_unstemmed | Novel multiclass classification machine learning approach for the early-stage classification of systemic autoimmune rheumatic diseases |
| title_short | Novel multiclass classification machine learning approach for the early-stage classification of systemic autoimmune rheumatic diseases |
| title_sort | novel multiclass classification machine learning approach for the early stage classification of systemic autoimmune rheumatic diseases |
| url | https://lupus.bmj.com/content/11/1/e001125.full |
| work_keys_str_mv | AT fengwang novelmulticlassclassificationmachinelearningapproachfortheearlystageclassificationofsystemicautoimmunerheumaticdiseases AT tingwang novelmulticlassclassificationmachinelearningapproachfortheearlystageclassificationofsystemicautoimmunerheumaticdiseases AT weiwei novelmulticlassclassificationmachinelearningapproachfortheearlystageclassificationofsystemicautoimmunerheumaticdiseases AT yunwang novelmulticlassclassificationmachinelearningapproachfortheearlystageclassificationofsystemicautoimmunerheumaticdiseases AT xuyuan novelmulticlassclassificationmachinelearningapproachfortheearlystageclassificationofsystemicautoimmunerheumaticdiseases AT renrenouyang novelmulticlassclassificationmachinelearningapproachfortheearlystageclassificationofsystemicautoimmunerheumaticdiseases AT rujiachen novelmulticlassclassificationmachinelearningapproachfortheearlystageclassificationofsystemicautoimmunerheumaticdiseases AT hongyanhou novelmulticlassclassificationmachinelearningapproachfortheearlystageclassificationofsystemicautoimmunerheumaticdiseases AT shijiwu novelmulticlassclassificationmachinelearningapproachfortheearlystageclassificationofsystemicautoimmunerheumaticdiseases |
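
The description field reports precision, recall and F1 score for each class of the cohort 1 random forest model. Since F1 is the harmonic mean of precision and recall, the reported figures can be cross-checked with a few lines of Python. The sketch below is only a consistency check on the numbers quoted in the abstract, not the authors' code; the names `reported`, `f1_score` and `recomputed_f1` are illustrative assumptions.

```python
# Cohort 1 random forest figures as quoted in the abstract (description field).
reported = {
    "SLE": {"precision": 0.930, "recall": 0.985, "f1": 0.957},
    "IM": {"precision": 0.793, "recall": 0.920, "f1": 0.852},
    "SS": {"precision": 0.950, "recall": 0.679, "f1": 0.792},
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recomputed_f1(table: dict) -> dict:
    """Recompute F1 per class, rounded to three decimals like the abstract."""
    return {
        name: round(f1_score(m["precision"], m["recall"]), 3)
        for name, m in table.items()
    }


if __name__ == "__main__":
    for name, value in recomputed_f1(reported).items():
        print(f"{name}: recomputed F1={value:.3f}, reported F1={reported[name]['f1']:.3f}")
```

Run as a script, this prints the recomputed and reported F1 side by side for SLE, IM and SS, so any transcription error in the record's precision or recall values would show up as a mismatch.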