Evaluating CNN Architectures for the Automated Detection and Grading of Modic Changes in MRI: A Comparative Study
ABSTRACT Objective The Modic changes (MCs) classification system is the most widely used method in magnetic resonance imaging (MRI) for characterizing subchondral vertebral marrow changes. However, because it is semiquantitative, it is highly sensitive to variations in MRI. In 2021...
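Main Authors: | Li‐peng Xing, Gang Liu, Hao‐chen Zhang, Lei Wang, Shan Zhu, Man Du La Hua Bao, Yan‐ni Wang, Chao Chen, Zhi Wang, Xin‐yu Liu, Shuai Zhang, Qiang Yang |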
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2025-01-01 |
Series: | Orthopaedic Surgery |
Subjects: | deep learning; endplate osteochondritis; magnetic resonance imaging; Modic changes |
Online Access: | https://doi.org/10.1111/os.14280 |
_version_ | 1841527103140397056 |
---|---|
author | Li‐peng Xing, Gang Liu, Hao‐chen Zhang, Lei Wang, Shan Zhu, Man Du La Hua Bao, Yan‐ni Wang, Chao Chen, Zhi Wang, Xin‐yu Liu, Shuai Zhang, Qiang Yang |
author_sort | Li‐peng Xing |
collection | DOAJ |
description | ABSTRACT Objective The Modic changes (MCs) classification system is the most widely used method in magnetic resonance imaging (MRI) for characterizing subchondral vertebral marrow changes. However, because it is semiquantitative, it is highly sensitive to variations in MRI. In 2021, the authors of this classification system proposed a quantitative and reliable MC grading method. However, automated tools to grade MCs are lacking. This study developed convolutional neural networks (CNNs) and investigated their performance in detecting and grading MCs based on their maximum vertical extent. To verify performance, we tested the CNNs' generalization performance, compared the performance of the CNN with that of junior doctors, and assessed the consistency of junior doctors after AI assistance. Methods MRIs of 139 patients with MCs were retrospectively analyzed and annotated by a spine surgeon. MRIs from 109 of these patients were acquired using Philips scanners from June 2020 to June 2021, constituting Dataset 1. The remaining 30 patients had MRIs obtained from both Philips and United Imaging scanners from June 2022 to March 2023, forming Dataset 2. YOLOv8 and YOLOv5 models were developed in PyCharm using Python and the PyTorch deep learning framework; data augmentation and transfer learning were applied to improve model generalization. Model performance was compared using precision, recall, F1 score, and mAP50. Generalizability was also tested on the second data set (Dataset 2), where the model was compared with the junior doctor's performance. Post hoc, the junior doctor graded Dataset 2 with CNN assistance. In addition, the region of interest was displayed using a class activation mapping heat map. Results On the unseen test set, the YOLOv8 and YOLOv5 models achieved precision of 81.60% and 61.59%, recall of 80.90% and 67.16%, mAP50 of 84.40% and 68.88%, and F1 of 0.81 and 0.60, respectively. On Dataset 2, YOLOv8 and the junior doctor achieved precision of 95.1% and 72.5% and recall of 68.3% and 60.6%, respectively. In the AI‐assisted experiment, agreement between the junior doctor and the senior spine surgeon improved significantly, from a Cohen's kappa of 0.368 to 0.681. Conclusions YOLOv8 was significantly superior to YOLOv5 in detecting and grading MCs. YOLOv8 also outperformed junior doctors, and it can enhance the capabilities of junior doctors and improve the reliability of diagnoses. |
format | Article |
id | doaj-art-6f6503f9923d4ad28e1e34de138fa05d |
institution | Kabale University |
issn | 1757-7853 1757-7861 |
language | English |
publishDate | 2025-01-01 |
publisher | Wiley |
record_format | Article |
series | Orthopaedic Surgery |
spelling | doaj-art-6f6503f9923d4ad28e1e34de138fa05d; 2025-01-16T05:31:15Z; eng; Wiley; Orthopaedic Surgery; ISSN 1757-7853, 1757-7861; 2025-01-01; vol. 17, no. 1, pp. 233–243; 10.1111/os.14280; Evaluating CNN Architectures for the Automated Detection and Grading of Modic Changes in MRI: A Comparative Study; Authors: Li‐peng Xing (1), Gang Liu (2), Hao‐chen Zhang (1), Lei Wang (1), Shan Zhu (2), Man Du La Hua Bao (2), Yan‐ni Wang (2), Chao Chen (2), Zhi Wang (2), Xin‐yu Liu (3), Shuai Zhang (1), Qiang Yang (2); Affiliations: (1) State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences & Biomedical Engineering, Hebei University of Technology, Tianjin, China; (2) Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China; (3) Department of Orthopedic Surgery, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, China; https://doi.org/10.1111/os.14280; deep learning; endplate osteochondritis; magnetic resonance imaging; Modic changes |
title | Evaluating CNN Architectures for the Automated Detection and Grading of Modic Changes in MRI: A Comparative Study |
topic | deep learning endplate osteochondritis magnetic resonance imaging Modic changes |
url | https://doi.org/10.1111/os.14280 |
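
The abstract reports precision, recall, F1, and Cohen's kappa. As a minimal sketch of how such figures relate, the Python below computes F1 as the harmonic mean of precision and recall, and Cohen's kappa from two raters' labels. The function names and the grade labels `g1`, `g2`, `g3` are hypothetical and chosen only for illustration; this is not the authors' code, and the rater data are invented toy values, not study data.

```python
from collections import Counter


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(rater_a, rater_b) -> float:
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters need the same non-zero number of labels")
    n = len(rater_a)
    # Observed agreement: fraction of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(
        (counts_a[label] / n) * (counts_b[label] / n)
        for label in set(counts_a) | set(counts_b)
    )
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Precision and recall for YOLOv8 on the unseen test set, as reported in the abstract.
print(round(f1_score(0.816, 0.809), 2))

# Toy example with hypothetical grade labels (not data from the study).
senior = ["g1", "g1", "g2", "g2", "g3", "g3"]
junior = ["g1", "g1", "g2", "g3", "g3", "g3"]
print(cohen_kappa(senior, junior))
```

An F1 value reported by a detection framework need not equal the harmonic mean of separately reported precision and recall if those metrics were taken at different confidence thresholds, so this sketch is a consistency check rather than a reproduction of the reported values.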