A Two-Stage Feature Selection Approach Based on Artificial Bee Colony and Adaptive LASSO in High-Dimensional Data

High-dimensional datasets, where the number of features far exceeds the number of observations, present significant challenges in feature selection and model performance. This study proposes a novel two-stage feature-selection approach that integrates Artificial Bee Colony (ABC) optimization with Ad...

Bibliographic Details
Main Authors: Efe Precious Onakpojeruo, Nuriye Sancar
Format: Article
Language: English
Published: MDPI AG 2024-12-01
Series: AppliedMath
Subjects: feature selection; artificial bee colony; adaptive LASSO; high-dimensional data
Online Access: https://www.mdpi.com/2673-9909/4/4/81
_version_ 1846106111478857728
author Efe Precious Onakpojeruo
Nuriye Sancar
author_facet Efe Precious Onakpojeruo
Nuriye Sancar
author_sort Efe Precious Onakpojeruo
collection DOAJ
description High-dimensional datasets, where the number of features far exceeds the number of observations, present significant challenges in feature selection and model performance. This study proposes a novel two-stage feature-selection approach that integrates Artificial Bee Colony (ABC) optimization with Adaptive Least Absolute Shrinkage and Selection Operator (AD_LASSO). The initial stage reduces dimensionality while effectively dealing with complex, high-dimensional search spaces by using ABC to conduct a global search for the ideal subset of features. The second stage applies AD_LASSO, refining the selected features by eliminating redundant features and enhancing model interpretability. The proposed ABC-ADLASSO method was compared with the AD_LASSO, LASSO, stepwise, and LARS methods under different simulation settings in high-dimensional data and various real datasets. According to the results obtained from simulations and applications on various real datasets, ABC-ADLASSO has shown significantly superior performance in terms of accuracy, precision, and overall model performance, particularly in scenarios with high correlation and a large number of features compared to the other methods evaluated. This two-stage approach offers robust feature selection and improves predictive accuracy, making it an effective tool for analyzing high-dimensional data.
format Article
id doaj-art-18160c3b9e964f81a4b1a2bb248cda4c
institution Kabale University
issn 2673-9909
language English
publishDate 2024-12-01
publisher MDPI AG
record_format Article
series AppliedMath
spelling doaj-art-18160c3b9e964f81a4b1a2bb248cda4c
2024-12-27T14:07:10Z
eng
MDPI AG
AppliedMath
2673-9909
2024-12-01
volume 4, issue 4, pages 1522-1538
10.3390/appliedmath4040081
A Two-Stage Feature Selection Approach Based on Artificial Bee Colony and Adaptive LASSO in High-Dimensional Data
Efe Precious Onakpojeruo (Operational Research Center in Healthcare, Near East University, TRNC Mersin 10, Nicosia 99138, Turkey)
Nuriye Sancar (Department of Mathematics, Near East University, TRNC Mersin 10, Nicosia 99138, Turkey)
https://www.mdpi.com/2673-9909/4/4/81
feature selection
artificial bee colony
adaptive LASSO
high-dimensional data
spellingShingle Efe Precious Onakpojeruo
Nuriye Sancar
A Two-Stage Feature Selection Approach Based on Artificial Bee Colony and Adaptive LASSO in High-Dimensional Data
AppliedMath
feature selection
artificial bee colony
adaptive LASSO
high-dimensional data
title A Two-Stage Feature Selection Approach Based on Artificial Bee Colony and Adaptive LASSO in High-Dimensional Data
title_full A Two-Stage Feature Selection Approach Based on Artificial Bee Colony and Adaptive LASSO in High-Dimensional Data
title_fullStr A Two-Stage Feature Selection Approach Based on Artificial Bee Colony and Adaptive LASSO in High-Dimensional Data
title_full_unstemmed A Two-Stage Feature Selection Approach Based on Artificial Bee Colony and Adaptive LASSO in High-Dimensional Data
title_short A Two-Stage Feature Selection Approach Based on Artificial Bee Colony and Adaptive LASSO in High-Dimensional Data
title_sort two stage feature selection approach based on artificial bee colony and adaptive lasso in high dimensional data
topic feature selection
artificial bee colony
adaptive LASSO
high-dimensional data
url https://www.mdpi.com/2673-9909/4/4/81
work_keys_str_mv AT efepreciousonakpojeruo atwostagefeatureselectionapproachbasedonartificialbeecolonyandadaptivelassoinhighdimensionaldata
AT nuriyesancar atwostagefeatureselectionapproachbasedonartificialbeecolonyandadaptivelassoinhighdimensionaldata
AT efepreciousonakpojeruo twostagefeatureselectionapproachbasedonartificialbeecolonyandadaptivelassoinhighdimensionaldata
AT nuriyesancar twostagefeatureselectionapproachbasedonartificialbeecolonyandadaptivelassoinhighdimensionaldata
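
The two-stage procedure summarized in the description field (a global ABC search for a feature subset, followed by adaptive LASSO on the selected features) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything below is an assumption made for illustration: the function names, the hold-out ridge MSE used as the ABC fitness, the single-bit-flip neighbourhood, the ridge initial estimate for the adaptive weights, and all default parameter values.

```python
import numpy as np

def ridge_fit(X, y, alpha=1e-2):
    """Ridge estimate (assumed here as a stable fit when columns outnumber rows)."""
    n, k = X.shape
    return np.linalg.solve(X.T @ X / n + alpha * np.eye(k), X.T @ y / n)

def subset_cost(X, y, mask):
    """Hypothetical fitness: hold-out MSE of a ridge fit on the masked columns."""
    half = len(y) // 2
    if not mask.any():
        return float(np.mean(y[half:] ** 2))  # empty model predicts zero
    b = ridge_fit(X[:half][:, mask], y[:half])
    return float(np.mean((y[half:] - X[half:][:, mask] @ b) ** 2))

def abc_select(X, y, n_sources=10, n_cycles=30, limit=5, seed=0):
    """Stage 1: simplified binary ABC search over boolean feature masks."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    sources = rng.random((n_sources, p)) < 0.5  # one mask per food source
    costs = np.array([subset_cost(X, y, m) for m in sources])
    trials = np.zeros(n_sources, dtype=int)
    best = int(np.argmin(costs))
    best_mask, best_cost = sources[best].copy(), costs[best]

    def try_neighbor(i):
        # Flip one randomly chosen feature; keep the change only if cost drops.
        cand = sources[i].copy()
        j = rng.integers(p)
        cand[j] = not cand[j]
        c = subset_cost(X, y, cand)
        if c < costs[i]:
            sources[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(n_cycles):
        for i in range(n_sources):  # employed bees: one move per source
            try_neighbor(i)
        fitness = 1.0 / (1.0 + costs)  # costs are MSE values, so >= 0
        picks = rng.choice(n_sources, size=n_sources, p=fitness / fitness.sum())
        for i in picks:  # onlooker bees: favour low-cost sources
            try_neighbor(i)
        i = int(np.argmin(costs))
        if costs[i] < best_cost:  # remember the best mask seen so far
            best_mask, best_cost = sources[i].copy(), costs[i]
        for i in np.flatnonzero(trials > limit):  # scouts: restart stale sources
            sources[i] = rng.random(p) < 0.5
            costs[i] = subset_cost(X, y, sources[i])
            trials[i] = 0
    return best_mask

def weighted_lasso_cd(X, y, lam, w, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j w_j |b_j|."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]  # remove feature j's contribution
            rho = X[:, j] @ resid / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam * w[j], 0.0) / col_sq[j]
            resid -= X[:, j] * beta[j]  # add the updated contribution back
    return beta

def abc_adlasso(X, y, lam=0.1, gamma=1.0, seed=0):
    """Stage 2: adaptive LASSO restricted to the columns kept by stage 1."""
    mask = abc_select(X, y, seed=seed)
    beta = np.zeros(X.shape[1])
    if mask.any():
        Xs = X[:, mask]
        init = ridge_fit(Xs, y)  # initial estimate for the adaptive weights
        w = 1.0 / (np.abs(init) ** gamma + 1e-8)
        beta[mask] = weighted_lasso_cd(Xs, y, lam, w)
    return mask, beta
```

Calling `abc_adlasso(X, y)` on a centred response `y` and a standardized design matrix `X` returns the stage-one mask and a coefficient vector that is zero outside that mask; features with a nonzero coefficient form the final selection. In practice `lam` would be tuned, for example by cross-validation, rather than fixed.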