A comparative assessment of machine learning models and algorithms for osteosarcoma cancer detection and classification

Osteosarcoma is a bone-forming tumor that is more common in children and young adults than in older adults. Timely detection and classification of its type are crucial to proper treatment and to the patient's chances of survival. Machine learning (ML) models trained on disease datasets are more effective in detection and...

Bibliographic Details
Main Author: Amoakoh Gyasi-Agyei
Format: Article
Language: English
Published: Elsevier 2025-06-01
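Series: Healthcare Analytics
Subjects: Osteosarcoma classification; Machine learning; AI in healthcare; Cancer detection; Healthcare informatics; Medical data mining
Online Access: http://www.sciencedirect.com/science/article/pii/S2772442524000820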
_version_ 1841553721461309440
author Amoakoh Gyasi-Agyei
collection DOAJ
description Osteosarcoma is a bone-forming tumor that is more common in children and young adults than in older adults. Timely detection and classification of its type are crucial to proper treatment and to the patient's chances of survival. Machine learning (ML) models trained on disease datasets are more effective in detection and classification than conventional methods, whose hand-crafted features depend heavily on pathologists’ expertise. A publicly available raw osteosarcoma dataset was explored and then preprocessed using different combinations of data denoising techniques (including principal component analysis, mutual information gain, analysis of variance and Kendall’s rank correlation analysis) and data augmentation to derive seven different datasets. Using the seven derived datasets and eight ML algorithms, this study designed and performed an extensive comparative analysis of seven sets of ML models (over 160 models altogether), with their hyperparameters optimized using grid search. The performance differences between the learned ML models were then validated using repeated stratified 10-fold cross-validation and 5x2 cross-validation paired t-tests to select the best model for the task. The model based on the extra trees algorithm, fitted to a dataset that was class-balanced via random oversampling and had multicollinearity removed via principal component analysis, proved to be the best: it detected and classified osteosarcoma cancer in 10 ms with a 97.8% area under the receiver operating characteristic curve and acceptably low false-alarm and misdetection rates. Thus, the proposed models can serve as cutting-edge techniques for automated detection and classification of osteosarcoma tumors to aid timely diagnosis, prognosis, and treatment.
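The 5x2 cross-validation paired t-test named in the description can be illustrated with a short sketch. It computes the test statistic in the form usually attributed to Dietterich: five replications of 2-fold cross-validation yield ten score differences between two models, and the first difference is divided by the square root of the mean per-replication variance. This is not the study's code. The function name, the constant name, and the score differences below are hypothetical and chosen only for illustration.

```python
import math

# Two-sided 5% critical value of Student's t with 5 degrees of freedom.
T_CRIT_DF5 = 2.571


def five_by_two_cv_t(diffs):
    """Return the 5x2cv paired t statistic.

    diffs: five (fold1, fold2) pairs, each the score of model A minus the
    score of model B on one fold of one 2-fold replication.
    """
    if len(diffs) != 5 or any(len(pair) != 2 for pair in diffs):
        raise ValueError("expected 5 replications with 2 folds each")

    variance_sum = 0.0
    for d1, d2 in diffs:
        mean = (d1 + d2) / 2.0
        # Variance estimate for this replication.
        variance_sum += (d1 - mean) ** 2 + (d2 - mean) ** 2

    if variance_sum == 0.0:
        raise ValueError("variance estimate is zero; statistic is undefined")

    # Numerator: the difference from fold 1 of replication 1.
    return diffs[0][0] / math.sqrt(variance_sum / 5.0)


# Hypothetical AUC differences (model A minus model B), not results from the study.
example_diffs = [(0.02, 0.04), (0.03, 0.01), (0.05, 0.03), (0.02, 0.02), (0.04, 0.02)]
t_stat = five_by_two_cv_t(example_diffs)
significant = abs(t_stat) > T_CRIT_DF5
print(f"t = {t_stat:.3f}, significant at 5%: {significant}")
```

Under the null hypothesis that both models perform equally well, the statistic approximately follows a t distribution with 5 degrees of freedom, so a magnitude above about 2.571 would indicate a difference at the 5% level.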
format Article
id doaj-art-0f8df580e2cd4c5b8101d937bf1cb7f3
institution Kabale University
issn 2772-4425
language English
publishDate 2025-06-01
publisher Elsevier
record_format Article
series Healthcare Analytics
spelling doaj-art-0f8df580e2cd4c5b8101d937bf1cb7f3; 2025-01-09T06:15:00Z; eng; Elsevier; Healthcare Analytics; 2772-4425; 2025-06-01; 7; 100380; A comparative assessment of machine learning models and algorithms for osteosarcoma cancer detection and classification; Amoakoh Gyasi-Agyei (School of Information Technology & Engineering (SITE), Melbourne Institute of Technology (MIT), 288 La Trobe St, Melbourne VIC3000, Australia); http://www.sciencedirect.com/science/article/pii/S2772442524000820; Osteosarcoma classification; Machine learning; AI in healthcare; Cancer detection; Healthcare informatics; Medical data mining
title A comparative assessment of machine learning models and algorithms for osteosarcoma cancer detection and classification
topic Osteosarcoma classification
Machine learning
AI in healthcare
Cancer detection
Healthcare informatics
Medical data mining
url http://www.sciencedirect.com/science/article/pii/S2772442524000820