Evaluating CNN Architectures for the Automated Detection and Grading of Modic Changes in MRI: A Comparative Study

ABSTRACT. Objective: The Modic changes (MCs) classification system is the most widely used method in magnetic resonance imaging (MRI) for characterizing subchondral vertebral marrow changes. However, because it is semiquantitative, it is highly sensitive to variations in MRI. In 2021...

Bibliographic Details
Main Authors:	Li‐peng Xing, Gang Liu, Hao‐chen Zhang, Lei Wang, Shan Zhu, Man Du La Hua Bao, Yan‐ni Wang, Chao Chen, Zhi Wang, Xin‐yu Liu, Shuai Zhang, Qiang Yang
Format:	Article
Language:	English
Published:	Wiley 2025-01-01
Series:	Orthopaedic Surgery
Subjects:	deep learning; endplate osteochondritis; magnetic resonance imaging; Modic changes
Online Access:	https://doi.org/10.1111/os.14280

collection	DOAJ
description	ABSTRACT. Objective: The Modic changes (MCs) classification system is the most widely used method in magnetic resonance imaging (MRI) for characterizing subchondral vertebral marrow changes. However, because it is semiquantitative, it is highly sensitive to variations in MRI. In 2021, the authors of this classification system proposed a quantitative and reliable MC grading method. However, automated tools to grade MCs are lacking. This study developed convolutional neural networks (CNNs) and investigated their performance in detecting and grading MCs based on their maximum vertical extent. To verify performance, we tested the CNNs' generalization, compared the performance of the CNN with that of junior doctors, and assessed the consistency of junior doctors after AI assistance. Methods: MRIs from 139 patients with MCs were retrospectively analyzed and annotated by a spine surgeon. Of the 139 patients, MRIs from 109 patients were acquired using Philips scanners from June 2020 to June 2021, constituting Dataset 1. The remaining 30 patients had MRIs obtained from both Philips and United Imaging scanners from June 2022 to March 2023, forming Dataset 2. YOLOv8 and YOLOv5 models were developed in PyCharm using Python and the PyTorch deep learning framework; data augmentation and transfer learning were applied to improve model generalization. Model performance was compared using precision, recall, F1 score, and mAP50. Generalizability was also tested on the second dataset (Dataset 2), where model performance was compared with that of a junior doctor. Post hoc, the junior doctor graded Dataset 2 with CNN assistance. In addition, the region of interest was displayed using class activation mapping heat maps. Results: On the unseen test set, the YOLOv8 and YOLOv5 models achieved precision of 81.60% and 61.59%, recall of 80.90% and 67.16%, mAP50 of 84.40% and 68.88%, and F1 of 0.81 and 0.60, respectively. On Dataset 2, YOLOv8 and the junior doctor achieved precision of 95.1% and 72.5% and recall of 68.3% and 60.6%, respectively. In the AI‐assisted experiment, agreement between the junior doctor and the senior spine surgeon improved significantly, from a Cohen's kappa of 0.368 to 0.681. Conclusions: The performance of YOLOv8 in detecting and grading MCs was significantly superior to that of YOLOv5. YOLOv8 also outperformed the junior doctor, and it can enhance the capabilities of junior doctors and improve the reliability of diagnoses.
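The abstract reports F1 scores and Cohen's kappa without stating how they were computed. The following is a minimal Python sketch of the standard textbook definitions, not the authors' code. The function names `f1_score` and `cohens_kappa` and the example rater labels are hypothetical and chosen only for illustration. Reported F1 values may be read at a particular confidence threshold, so they need not equal the harmonic mean of the reported precision and recall exactly.

```python
from collections import Counter


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def cohens_kappa(rater_a, rater_b) -> float:
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Precision and recall reported in the abstract for YOLOv8 on the test set.
yolov8_f1 = f1_score(0.8160, 0.8090)

# Hypothetical grades from two raters (illustration only, not study data).
example_kappa = cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

With the YOLOv8 test-set values from the abstract, `f1_score` rounds to the reported 0.81. Kappa corrects raw agreement for agreement expected by chance, which is why it is a stricter measure than percent agreement for comparing a junior doctor with a senior surgeon.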
id	doaj-art-6f6503f9923d4ad28e1e34de138fa05d
institution	Kabale University
issn	1757-7853 1757-7861
spelling	doaj-art-6f6503f9923d4ad28e1e34de138fa05d; 2025-01-16T05:31:15Z; eng; Wiley; Orthopaedic Surgery; ISSN 1757-7853, 1757-7861; 2025-01-01; volume 17, issue 1, pages 233-243; 10.1111/os.14280; Evaluating CNN Architectures for the Automated Detection and Grading of Modic Changes in MRI: A Comparative Study
Author affiliations:
State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences & Biomedical Engineering, Hebei University of Technology, Tianjin, China: Li‐peng Xing, Hao‐chen Zhang, Lei Wang, Shuai Zhang
Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China: Gang Liu, Shan Zhu, Man Du La Hua Bao, Yan‐ni Wang, Chao Chen, Zhi Wang, Qiang Yang
Department of Orthopedic Surgery, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, China: Xin‐yu Liu
title	Evaluating CNN Architectures for the Automated Detection and Grading of Modic Changes in MRI: A Comparative Study

Similar Items