Light-GBM based minority oversampling model using biomedical data analysis for breast cancer classification

Abstract The yearly incidence of breast cancer, which is already among the highest of all cancers, is steadily rising. Without surgical biopsy, predicting the benign or malignant nature of tumors by analyzing various indicators of cell nuclei can effectively assist doctors in diagnosis and reduce pa...

Bibliographic Details
Main Authors: Mukesh Soni, Mohammed Wasim Bhatt, Paul Ofori-Amanfo
Format: Article
Language: English
Published: Springer 2025-07-01
Series: Discover Applied Sciences
Subjects:
Online Access: https://doi.org/10.1007/s42452-025-07390-7
_version_ 1849343541627584512
author Mukesh Soni
Mohammed Wasim Bhatt
Paul Ofori-Amanfo
collection DOAJ
description Abstract The yearly incidence of breast cancer, already among the highest of all cancers, is steadily rising. Predicting whether a tumor is benign or malignant by analyzing indicators of the cell nuclei, without surgical biopsy, can effectively assist doctors in diagnosis and reduce patients’ suffering. Research consistently shows that LightGBM-based hybrid models outperform baseline or standard classifiers in accuracy, speed, and efficiency. This work proposes a model for the identification of breast cancer based on the lightweight gradient boosting machine (Light-GBM) algorithm. To address the imbalance in breast cancer diagnostic data, the Borderline-SMOTE method is used. The Sparrow Search Algorithm (SSA) is improved with a piecewise linear chaotic map (PWLCM), novel inertia weights, and a new longitudinal-lateral crossover algorithm, and the improved SSA is then applied to automatically optimize the parameters of Light-GBM. Because Light-GBM is sensitive to noise, an OVR-Jacobian regularization method is proposed for denoising. This strengthens the ensemble model, which is subsequently used for breast cancer diagnosis. According to the experimental data, the proposed ensemble model outperforms standard models in mean square error, coefficient of determination, and cross-validation score, demonstrating its better diagnostic performance.
format Article
id doaj-art-d6bd7b50f3c84da39ba024b8f85c19b8
institution Kabale University
issn 3004-9261
language English
publishDate 2025-07-01
publisher Springer
record_format Article
series Discover Applied Sciences
spelling doaj-art-d6bd7b50f3c84da39ba024b8f85c19b8
2025-08-20T03:42:56Z
eng
Springer
Discover Applied Sciences
3004-9261
2025-07-01
77128
10.1007/s42452-025-07390-7
Light-GBM based minority oversampling model using biomedical data analysis for breast cancer classification
Mukesh Soni 0
Mohammed Wasim Bhatt 1
Paul Ofori-Amanfo 2
Centre for Research Impact & Outcome, Chitkara University Institute of Engineering and Technology, Chitkara University
Model Institute of Engineering and Technology
University of Energy and Natural Resources
Abstract The yearly incidence of breast cancer, already among the highest of all cancers, is steadily rising. Predicting whether a tumor is benign or malignant by analyzing indicators of the cell nuclei, without surgical biopsy, can effectively assist doctors in diagnosis and reduce patients’ suffering. Research consistently shows that LightGBM-based hybrid models outperform baseline or standard classifiers in accuracy, speed, and efficiency. This work proposes a model for the identification of breast cancer based on the lightweight gradient boosting machine (Light-GBM) algorithm. To address the imbalance in breast cancer diagnostic data, the Borderline-SMOTE method is used. The Sparrow Search Algorithm (SSA) is improved with a piecewise linear chaotic map (PWLCM), novel inertia weights, and a new longitudinal-lateral crossover algorithm, and the improved SSA is then applied to automatically optimize the parameters of Light-GBM. Because Light-GBM is sensitive to noise, an OVR-Jacobian regularization method is proposed for denoising. This strengthens the ensemble model, which is subsequently used for breast cancer diagnosis. According to the experimental data, the proposed ensemble model outperforms standard models in mean square error, coefficient of determination, and cross-validation score, demonstrating its better diagnostic performance.
https://doi.org/10.1007/s42452-025-07390-7
Cancer
Deep learning
SMOTE algorithm
Minority oversampling
Breast cancer
Light-GBM model
title Light-GBM based minority oversampling model using biomedical data analysis for breast cancer classification
topic Cancer
Deep learning
SMOTE algorithm
Minority oversampling
Breast cancer
Light-GBM model
url https://doi.org/10.1007/s42452-025-07390-7