Light-GBM based minority oversampling model using biomedical data analysis for breast cancer classification
Abstract The yearly incidence of breast cancer, which is already among the highest of all cancers, is steadily rising. Without surgical biopsy, predicting the benign or malignant nature of tumors by analyzing various indicators of cell nuclei can effectively assist doctors in diagnosis and reduce pa...
| Main Authors: | Mukesh Soni, Mohammed Wasim Bhatt, Paul Ofori-Amanfo |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer 2025-07-01 |
| Series: | Discover Applied Sciences |
| Subjects: | Cancer; Deep learning; SMOTE algorithm; Minority oversampling; Breast cancer; Light-GBM model |
| Online Access: | https://doi.org/10.1007/s42452-025-07390-7 |
| _version_ | 1849343541627584512 |
|---|---|
| author | Mukesh Soni; Mohammed Wasim Bhatt; Paul Ofori-Amanfo |
| author_facet | Mukesh Soni; Mohammed Wasim Bhatt; Paul Ofori-Amanfo |
| author_sort | Mukesh Soni |
| collection | DOAJ |
| description | Abstract The yearly incidence of breast cancer, already among the highest of all cancers, is steadily rising. Predicting whether a tumor is benign or malignant from indicators of the cell nuclei, without surgical biopsy, can effectively assist doctors in diagnosis and reduce patients’ suffering. Research consistently shows that LightGBM-based hybrid models outperform baseline or standard classifiers in accuracy, speed, and efficiency. This work proposes a breast cancer identification model based on the Light Gradient Boosting Machine (Light-GBM) algorithm. To address the class imbalance in breast cancer diagnostic data, the Borderline-SMOTE method is used. The Sparrow Search Algorithm (SSA) is improved by introducing a piecewise linear chaotic map (PWLCM), novel inertia weights, and a new longitudinal-lateral crossover algorithm, and the improved SSA is then applied to automatic parameter optimization of Light-GBM. Because Light-GBM is sensitive to noise, an OVR-Jacobian regularization method is proposed for denoising. The resulting ensemble model is more robust and is then used for breast cancer diagnosis. According to the experimental data, the proposed ensemble model outperforms standard models in terms of mean square error, determination coefficient, and cross-validation score, demonstrating its better diagnostic performance. |
| format | Article |
| id | doaj-art-d6bd7b50f3c84da39ba024b8f85c19b8 |
| institution | Kabale University |
| issn | 3004-9261 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Springer |
| record_format | Article |
| series | Discover Applied Sciences |
| spelling | doaj-art-d6bd7b50f3c84da39ba024b8f85c19b8; 2025-08-20T03:42:56Z; eng; Springer; Discover Applied Sciences; 3004-9261; 2025-07-01; 77128; 10.1007/s42452-025-07390-7; Light-GBM based minority oversampling model using biomedical data analysis for breast cancer classification; Mukesh Soni (0), Mohammed Wasim Bhatt (1), Paul Ofori-Amanfo (2); (0) Centre for Research Impact & Outcome, Chitkara University Institute of Engineering and Technology, Chitkara University; (1) Model Institute of Engineering and Technology; (2) University of Energy and Natural Resources; https://doi.org/10.1007/s42452-025-07390-7; Cancer; Deep learning; SMOTE algorithm; Minority oversampling; Breast cancer; Light-GBM model |
| title | Light-GBM based minority oversampling model using biomedical data analysis for breast cancer classification |
| topic | Cancer; Deep learning; SMOTE algorithm; Minority oversampling; Breast cancer; Light-GBM model |
| url | https://doi.org/10.1007/s42452-025-07390-7 |