Accurate and lightweight oral cancer detection using SE-MobileViT on clinically validated image dataset

Abstract Oral cancer poses a critical global health challenge, with early detection significantly improving patient survival rates and treatment outcomes. This study proposes an advanced deep learning-based diagnostic model, LightSE-MobileViT, specifically designed to classify oral cancer using medi...


Bibliographic Details
Main Authors:	Md Firoz Kabir, Md Yousuf Ahmad, Roise Uddin, Martin Cordero, Shashi Kant
Format:	Article
Language:	English
Published:	Springer 2025-07-01
Series:	Discover Artificial Intelligence
Subjects:	Oral cancer detection; Lightweight deep learning; MobileViT transformer; Squeeze-and-excitation (SE) module; Medical imaging
Online Access:	https://doi.org/10.1007/s44163-025-00442-2

collection	DOAJ
description	Abstract: Oral cancer poses a critical global health challenge, with early detection significantly improving patient survival rates and treatment outcomes. This study proposes an advanced deep learning-based diagnostic model, LightSE-MobileViT, specifically designed to classify oral cancer using medical imaging. The Oral Cancer Classification dataset used in this study comprises clinically validated lip and tongue images collected from various ENT hospitals in Ahmedabad. The original dataset consisted of 131 images (87 cancerous and 44 non-cancerous). To address class imbalance and enhance model generalizability, data augmentation techniques were employed, expanding the dataset to 981 images with equal distribution across both classes. Our proposed model, LightSE-MobileViT, integrates a lightweight convolutional neural network (CNN) backbone consisting of sequential convolutional layers enhanced with batch normalization and rectified linear unit activations. To further enrich feature representation and spatial attention, a Squeeze-and-Excitation block is embedded after the third convolutional layer. Subsequently, a MobileViT transformer encoder is employed, effectively capturing global contextual information through efficient multi-headed self-attention mechanisms. Experimental evaluations revealed that LightSE-MobileViT achieved superior diagnostic performance, attaining an accuracy of 98.39%, precision and recall values approaching 1.00 for both cancerous and non-cancerous categories, a macro F1-score of 0.98, and an ROC-AUC of 1.00. Comparative analysis demonstrated notable improvements over benchmark models, including CST-CNN (98% accuracy), MobileNetV2 (97% accuracy), DenseNet121 (97% accuracy), and InceptionV3 (90% accuracy). The exceptional performance of LightSE-MobileViT underscores its robust capability and clinical applicability, suggesting significant potential for deployment in automated oral cancer screening, thus facilitating early detection and timely intervention.
id	doaj-art-52b6f5ab05a34d848b5b09c7f25689f5
institution	Kabale University
issn	2731-0809
spelling	doaj-art-52b6f5ab05a34d848b5b09c7f25689f5; 2025-08-20T03:43:22Z; eng; Springer; Discover Artificial Intelligence; 2731-0809; 2025-07-01; 5; 1; 1; 21; 10.1007/s44163-025-00442-2; Accurate and lightweight oral cancer detection using SE-MobileViT on clinically validated image dataset; Md Firoz Kabir (University of the Cumberlands); Md Yousuf Ahmad (Trine University); Roise Uddin (Pacific States University); Martin Cordero (University of the Cumberlands); Shashi Kant (Bule Hora University)
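The abstract mentions a Squeeze-and-Excitation block placed after the third convolutional layer, but this record gives no layer sizes or weights. As a rough illustration of the general SE technique only (not the authors' implementation), the pure-Python sketch below pools each channel to a scalar, passes the result through a small bottleneck with ReLU and sigmoid, and rescales the channels. The function name se_block, the toy weights, and the omission of biases are all assumptions made for this example.

```python
import math


def se_block(feature_map, w1, w2):
    """Apply squeeze-and-excitation to a feature map shaped [C][H][W].

    w1: [C_reduced][C] weights of the bottleneck layer (hypothetical values).
    w2: [C][C_reduced] weights of the expansion layer (hypothetical values).
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expansion layer followed by a sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy input (assumed for illustration): 2 channels of size 2x2.
feature_map = [
    [[1.0, 1.0], [1.0, 1.0]],  # channel 0, mean 1.0
    [[2.0, 2.0], [2.0, 2.0]],  # channel 1, mean 2.0
]
w1 = [[0.5, 0.5]]     # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]  # 1 hidden unit -> 2 channel gates
scaled, gates = se_block(feature_map, w1, w2)
```

With these toy weights the first channel receives a gate above 0.5 and the second a gate below 0.5, so the block amplifies one channel relative to the other. In a trained network the weights would be learned rather than fixed by hand.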

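The abstract reports accuracy, per-class precision and recall, and a macro F1-score. For readers unfamiliar with how these figures relate, the sketch below computes them from a list of true and predicted labels. The labels are made-up toy data, not results from the paper, and the function name classification_report is an assumption for this example. ROC-AUC is not covered because it needs predicted scores rather than hard labels.

```python
def classification_report(y_true, y_pred, labels=(0, 1)):
    """Return accuracy, per-class precision/recall/F1, and macro F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    per_class = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Guard against division by zero when a class is never predicted
        # or never present.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}
    # Macro F1 is the unweighted mean of the per-class F1 scores.
    macro_f1 = sum(m["f1"] for m in per_class.values()) / len(labels)
    return accuracy, per_class, macro_f1


# Toy labels (hypothetical): 1 = cancerous, 0 = non-cancerous.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
accuracy, per_class, macro_f1 = classification_report(y_true, y_pred)
```

Because macro F1 averages the classes without weighting by class size, a model that performs poorly on the smaller class is penalized more than it would be under plain accuracy.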

Similar Items