Evaluating Machine Learning Models for Prostate Cancer Classification Using Gene Expression Profiles from DNA Microarrays


Bibliographic Details
Main Authors: Haddou Bouazza Sara, Haddou Bouazza Jihad
Format: Article
Language: English
Published: EDP Sciences 2024-01-01
Series: ITM Web of Conferences
Online Access: https://www.itm-conferences.org/articles/itmconf/pdf/2024/12/itmconf_maih2024_02004.pdf
Description
Summary: This study evaluates machine learning models for classifying prostate cancer from DNA microarray gene expression profiles. Because these datasets are high-dimensional, dimensionality reduction through feature selection is essential for identifying and removing redundant genes. We applied multiple feature selection methods, including Signal-to-Noise Ratio (SNR), ReliefF, Correlation Coefficient (CC), Mutual Information (MI), and several others. These methods were combined with classifiers such as K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), Decision Tree Classifier (DTC), Naïve Bayes (NB), and Artificial Neural Network (ANN). The best combination was SNR with LDA, which achieved a classification accuracy of 95% using only six genes. These results underscore the importance of pairing an effective feature selection method with a suitable classifier for precise and efficient prostate cancer diagnosis, and they support improved personalized healthcare strategies. Future work will validate these findings on larger datasets and explore advanced machine learning techniques to further improve classification performance.
ISSN: 2271-2097
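
The SNR-based gene ranking mentioned in the summary can be illustrated with a minimal sketch. This record does not give the authors' exact formula or data, so the code assumes the commonly used two-class definition, |mean1 - mean0| / (sd1 + sd0), computed per gene. The function names `snr_scores` and `top_genes` and the toy expression values are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def snr_scores(samples, labels):
    """Return one signal-to-noise score per gene (column).

    Assumed definition: |mean_1 - mean_0| / (sd_1 + sd_0), where class 1
    and class 0 are the two diagnostic groups. The paper may use a variant.
    """
    pos = [row for row, lab in zip(samples, labels) if lab == 1]
    neg = [row for row, lab in zip(samples, labels) if lab == 0]
    scores = []
    for g in range(len(samples[0])):
        a = [row[g] for row in pos]
        b = [row[g] for row in neg]
        spread = stdev(a) + stdev(b)
        # Guard against genes that are constant within both classes.
        scores.append(abs(mean(a) - mean(b)) / spread if spread > 0 else 0.0)
    return scores


def top_genes(samples, labels, k):
    """Return the column indices of the k genes with the highest SNR."""
    scores = snr_scores(samples, labels)
    order = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return order[:k]


# Hypothetical toy matrix: 6 samples (rows) x 4 genes (columns).
samples = [
    [5.1, 0.9, 2.0, 7.0],
    [4.9, 1.1, 2.5, 3.0],
    [5.0, 1.0, 3.0, 5.0],
    [1.0, 1.0, 1.0, 6.0],
    [1.2, 0.8, 1.5, 4.0],
    [0.8, 1.2, 2.0, 5.0],
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = tumor, 0 = normal (assumed encoding)

scores = snr_scores(samples, labels)
selected = top_genes(samples, labels, k=2)
print(selected)
```

In a full pipeline, the columns listed in `selected` would be passed to a classifier such as LDA (for example, scikit-learn's `LinearDiscriminantAnalysis`), and the number of retained genes `k` would be chosen by cross-validation rather than fixed in advance.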