Evaluating CNN Architectures for the Automated Detection and Grading of Modic Changes in MRI: A Comparative Study

ABSTRACT Objective The Modic changes (MCs) classification system is the most widely used method in magnetic resonance imaging (MRI) for characterizing subchondral vertebral marrow changes. However, because it is semiquantitative, it is highly sensitive to variations in MRI. In 2021...

Bibliographic Details
Main Authors: Li‐peng Xing, Gang Liu, Hao‐chen Zhang, Lei Wang, Shan Zhu, Man Du La Hua Bao, Yan‐ni Wang, Chao Chen, Zhi Wang, Xin‐yu Liu, Shuai Zhang, Qiang Yang
Format: Article
Language: English
Published: Wiley 2025-01-01
Series: Orthopaedic Surgery
Subjects:
Online Access: https://doi.org/10.1111/os.14280
_version_ 1841527103140397056
author Li‐peng Xing
Gang Liu
Hao‐chen Zhang
Lei Wang
Shan Zhu
Man Du La Hua Bao
Yan‐ni Wang
Chao Chen
Zhi Wang
Xin‐yu Liu
Shuai Zhang
Qiang Yang
author_sort Li‐peng Xing
collection DOAJ
description ABSTRACT
Objective: The Modic changes (MCs) classification system is the most widely used method in magnetic resonance imaging (MRI) for characterizing subchondral vertebral marrow changes. However, because it is semiquantitative, it is highly sensitive to variations in MRI. In 2021, the authors of this classification system proposed a quantitative and reliable MC grading method, but automated tools for grading MCs are lacking. This study developed convolutional neural networks (CNNs) that detect and grade MCs based on their maximum vertical extent, and investigated their performance. To verify performance, we tested the CNNs' generalization, compared the CNN with junior doctors, and assessed the consistency of junior doctors after AI assistance.
Methods: MRIs of 139 patients with MCs were analyzed retrospectively and annotated by a spine surgeon. MRIs from 109 of these patients were acquired using Philips scanners from June 2020 to June 2021, constituting Dataset 1. The remaining 30 patients had MRIs obtained from both Philips and United Imaging scanners from June 2022 to March 2023, forming Dataset 2. YOLOv8 and YOLOv5 models were developed in PyCharm using Python and the PyTorch deep learning framework; data augmentation and transfer learning were applied to improve model generalization. Model performance was compared using precision, recall, F1 score, and mAP50. Generalizability was tested on Dataset 2, where the model was also compared with a junior doctor. Post hoc, the junior doctor graded Dataset 2 with CNN assistance. In addition, the region of interest was displayed using class activation mapping heat maps.
Results: On the unseen test set, the YOLOv8 and YOLOv5 models achieved precision of 81.60% and 61.59%, recall of 80.90% and 67.16%, mAP50 of 84.40% and 68.88%, and F1 of 0.81 and 0.60, respectively. On Dataset 2, YOLOv8 and the junior doctor achieved precision of 95.1% and 72.5% and recall of 68.3% and 60.6%, respectively. In the AI‐assisted experiment, agreement between the junior doctor and the senior spine surgeon improved significantly, with Cohen's kappa rising from 0.368 to 0.681.
Conclusions: YOLOv8 was significantly superior to YOLOv5 in detecting and grading MCs. YOLOv8 also outperformed the junior doctor, and it can enhance the capabilities of junior doctors and improve the reliability of diagnoses.
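The abstract reports F1 scores and Cohen's kappa without defining them. The sketch below uses the standard textbook definitions of both; it is not the authors' code. The function names and the example grade labels are hypothetical, chosen only for illustration. Detection toolkits often report F1 at a particular confidence threshold, so an F1 recomputed from a single precision/recall pair need not match a published figure exactly.

```python
from collections import Counter


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(rater_a, rater_b) -> float:
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases where both raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Precision and recall reported for YOLOv8 on the unseen test set.
yolov8_f1 = f1_score(0.816, 0.809)

# Hypothetical grades for six cases (not data from the study).
junior = ["A", "A", "B", "B", "C", "C"]
senior = ["A", "B", "B", "B", "C", "A"]
example_kappa = cohen_kappa(junior, senior)
```

With the YOLOv8 precision and recall from the abstract, `f1_score` gives about 0.81, which agrees with the reported value. Kappa corrects raw agreement for agreement expected by chance, which is why it is the usual choice for comparing a junior reader against a senior reference.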
format Article
id doaj-art-6f6503f9923d4ad28e1e34de138fa05d
institution Kabale University
issn 1757-7853
1757-7861
language English
publishDate 2025-01-01
publisher Wiley
record_format Article
series Orthopaedic Surgery
spelling doaj-art-6f6503f9923d4ad28e1e34de138fa05d
2025-01-16T05:31:15Z
eng
Wiley
Orthopaedic Surgery
1757-7853
1757-7861
2025-01-01
Volume 17, Issue 1, Pages 233-243
10.1111/os.14280
Evaluating CNN Architectures for the Automated Detection and Grading of Modic Changes in MRI: A Comparative Study
Li‐peng Xing: State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences & Biomedical Engineering, Hebei University of Technology, Tianjin, China
Gang Liu: Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China
Hao‐chen Zhang: State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences & Biomedical Engineering, Hebei University of Technology, Tianjin, China
Lei Wang: State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences & Biomedical Engineering, Hebei University of Technology, Tianjin, China
Shan Zhu: Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China
Man Du La Hua Bao: Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China
Yan‐ni Wang: Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China
Chao Chen: Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China
Zhi Wang: Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China
Xin‐yu Liu: Department of Orthopedic Surgery, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, China
Shuai Zhang: State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences & Biomedical Engineering, Hebei University of Technology, Tianjin, China
Qiang Yang: Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China
https://doi.org/10.1111/os.14280
deep learning
endplate osteochondritis
magnetic resonance imaging
Modic changes
title Evaluating CNN Architectures for the Automated Detection and Grading of Modic Changes in MRI: A Comparative Study
topic deep learning
endplate osteochondritis
magnetic resonance imaging
Modic changes
url https://doi.org/10.1111/os.14280