Novel multiclass classification machine learning approach for the early-stage classification of systemic autoimmune rheumatic diseases

Bibliographic Details
Main Authors: Feng Wang, Ting Wang, Wei Wei, Yun Wang, Xu Yuan, Renren Ouyang, Rujia Chen, Hongyan Hou, Shiji Wu
Format: Article
Language: English
Published: BMJ Publishing Group 2024-05-01
Series: Lupus Science and Medicine
Online Access: https://lupus.bmj.com/content/11/1/e001125.full
collection DOAJ
description
Objective: Systemic autoimmune rheumatic diseases (SARDs) encompass a diverse group of complex conditions with overlapping clinical features, making accurate diagnosis challenging. This study aims to develop a multiclass machine learning (ML) model for early-stage SARDs classification using accessible laboratory indicators.
Methods: A total of 925 SARDs patients were included, categorised into systemic lupus erythematosus (SLE), Sjögren’s syndrome (SS) and inflammatory myositis (IM). Clinical characteristics and laboratory markers were collected, and nine key indicators were selected for model building: anti-dsDNA, anti-SS-A60, anti-Sm/nRNP, antichromatin, anti-dsDNA (indirect immunofluorescence assay), haemoglobin (Hb), platelet, neutrophil percentage and cytoplasmic patterns (AC-19, AC-20). Various ML algorithms were used to construct a tripartite classification ML model.
Results: Patients were divided into two cohorts; cohort 1 was used to construct the tripartite classification model. Among the models assessed, the random forest (RF) model demonstrated superior performance in distinguishing SLE, IM and SS (area under the curve = 0.953, 0.903 and 0.836; accuracy = 0.892, 0.869 and 0.857; sensitivity = 0.890, 0.868 and 0.795; specificity = 0.910, 0.836 and 0.748; positive predictive value = 0.922, 0.727 and 0.663; negative predictive value = 0.854, 0.915 and 0.879). The RF model excelled in classifying SLE (precision = 0.930, recall = 0.985, F1 score = 0.957). For IM and SS respectively, the RF model achieved precision of 0.793 and 0.950, recall of 0.920 and 0.679, and F1 scores of 0.852 and 0.792. Cohort 2 served as an external validation set, on which the model achieved an overall accuracy of 87.3%. Individual classification performances for SLE, SS and IM were excellent, with precision, recall and F1 scores reported for each class. SHapley Additive exPlanations (SHAP) analysis highlighted significant contributions from antibody profiles.
Conclusion: This pioneering multiclass ML model, using basic laboratory indicators, enhances clinical feasibility and demonstrates promising potential for SARDs classification. The collaboration of clinical expertise and ML offers a nuanced approach to SARDs classification, with potential for enhanced patient care.
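The per-class F1 scores quoted in the abstract can be cross-checked against the quoted precision and recall, since F1 is the harmonic mean of the two. The following is a minimal sketch of that check, not code from the article: the function name `f1_score`, the dictionary `reported` and the helper `recomputed_f1` are illustrative assumptions, and the only data used are the cohort 1 RF figures stated above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, F1) for the RF model on cohort 1, copied from the abstract.
reported = {
    "SLE": (0.930, 0.985, 0.957),
    "IM": (0.793, 0.920, 0.852),
    "SS": (0.950, 0.679, 0.792),
}


def recomputed_f1(values: dict) -> dict:
    """Recompute F1 from each class's precision and recall, rounded to 3 decimals."""
    return {label: round(f1_score(p, r), 3) for label, (p, r, _) in values.items()}


if __name__ == "__main__":
    for label, f1 in recomputed_f1(reported).items():
        quoted = reported[label][2]
        status = "matches" if f1 == quoted else "differs from"
        print(f"{label}: recomputed F1 {f1:.3f} {status} quoted {quoted:.3f}")
```

Under these assumptions the recomputed values agree with the quoted F1 scores to three decimal places, which suggests the three figures for each class are internally consistent. The low recall for SS (0.679) relative to its precision (0.950) is what pulls its F1 score below that of the other two classes.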
id doaj-art-f459f3c27e8c48deab380ad59af4099c
institution Kabale University
issn 2053-8790
doi 10.1136/lupus-2023-001125
volume 11
issue 1
record_updated 2024-11-10T08:30:08Z
affiliations (in author order)
Feng Wang: Department of Laboratory Medicine, Nantong University Affiliated Hospital, Nantong, China
Ting Wang: Department of Laboratory Medicine, Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Wuhan, Hubei, China
Wei Wei: ACellera, Vancouver, BC, Canada
Yun Wang: Department of Laboratory Medicine, Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Wuhan, Hubei, China
Xu Yuan: Department of Laboratory Medicine, Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Wuhan, Hubei, China
Renren Ouyang: Department of Laboratory Medicine, Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Wuhan, Hubei, China
Rujia Chen: Department of Laboratory Medicine, Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Wuhan, Hubei, China
Hongyan Hou: Department of Laboratory Medicine, Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Wuhan, Hubei, China
Shiji Wu: Department of Laboratory Medicine, Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Wuhan, Hubei, China