Automated detection and classification of osteolytic lesions in panoramic radiographs using CNNs and vision transformers

Bibliographic Details
Main Authors:	Niels van Nistelrooij, Iman Ghanad, Amir K. Bigdeli, Daniel G. E. Thiem, Constantin von See, Carsten Rendenbach, Ira Maistreli, Tong Xi, Stefaan Bergé, Max Heiland, Shankeeth Vinayahalingam, Robert Gaudin
Format:	Article
Language:	English
Published:	BMC 2025-06-01
Series:	BMC Oral Health
Subjects:	Deep Learning; Osteolytic Lesions; Panoramic Radiograph; Vision Transformer
Online Access:	https://doi.org/10.1186/s12903-025-06209-6

Description
Summary:	Background: Diseases underlying osteolytic lesions in jaws are characterized by the absorption of bone tissue and are often asymptomatic, delaying their diagnosis. Well-defined lesions (benign cyst-like lesions) and ill-defined lesions (osteomyelitis or malignancy) can be detected early in a panoramic radiograph (PR) by an experienced examiner, but most dentists lack appropriate training. To support dentists, this study aimed to develop and evaluate deep learning models for the detection of osteolytic lesions in PRs.
Methods: A dataset of 676 PRs (165 well-defined, 181 ill-defined, 330 control) was collected from the Department of Oral and Maxillofacial Surgery at Charité Berlin, Germany. The osteolytic lesions were pixel-wise segmented and labeled as well-defined or ill-defined. Four model architectures for instance segmentation (Mask R-CNN with a Swin-Tiny or ResNet-50 backbone, Mask DINO, and YOLOv5) were employed with five-fold cross-validation. Their effectiveness was evaluated with sensitivity, specificity, F1-score, and AUC and failure cases were shown.
Results: Mask R-CNN with a Swin-Tiny backbone was most effective (well-defined F1 = 0.784, AUC = 0.881; ill-defined F1 = 0.904, AUC = 0.971) and the model architectures including vision transformer components were more effective than those without. Model mistakes were observed around the maxillary sinus, at tooth extraction sites, and for radiolucent bands.
Conclusions: Promising deep learning models were developed for the detection of osteolytic lesions in PRs, particularly those with vision transformer components (Mask R-CNN with Swin-Tiny and Mask DINO). These results underline the potential of vision transformers for enhancing the automated detection of osteolytic lesions, offering a significant improvement over traditional deep learning models.
ISSN:	1472-6831
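
The evaluation protocol named in the summary (five-fold cross-validation scored with sensitivity, specificity, and F1-score) can be illustrated with a minimal Python sketch. This is not the authors' code: the record does not say whether the folds were stratified by class, so the round-robin stratification below is an assumption, and the helper names `stratified_folds` and `binary_metrics` as well as the confusion counts in the usage lines are hypothetical. Only the class sizes (165, 181, 330) come from the summary. AUC is left out because it needs per-image scores rather than counts.

```python
from collections import defaultdict


def stratified_folds(labels, k=5):
    """Assign sample indices to k folds, round-robin within each class.

    Assumption: stratification by class; the source only states that
    five-fold cross-validation was used.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def binary_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, and F1 from confusion counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


# Class sizes taken from the summary: 676 radiographs in total.
labels = ["well-defined"] * 165 + ["ill-defined"] * 181 + ["control"] * 330
folds = stratified_folds(labels, k=5)
fold_sizes = [len(fold) for fold in folds]

# Hypothetical confusion counts for one held-out fold (illustration only).
example = binary_metrics(tp=8, fp=2, tn=18, fn=2)
```

Each fold serves once as the held-out test set while the other four are used for training; per-fold metrics would then be aggregated across the five runs.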

Similar Items