A Novel Synthetic Minority Oversampling Technique for Multiclass Imbalance Problems

Multi-class imbalanced datasets present significant challenges in many real-world classification tasks, where certain classes are severely underrepresented. This study addresses the classification problems with multi-class imbalanced datasets, which are inherently more complicated than binary imbala...

Bibliographic Details
Main Authors: Jiao Wang, Norhashidah Awang
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10829925/
author Jiao Wang
Norhashidah Awang
collection DOAJ
description Multi-class imbalanced datasets present significant challenges in many real-world classification tasks, where certain classes are severely underrepresented. This study addresses the classification problems with multi-class imbalanced datasets, which are inherently more complicated than binary imbalanced problems. To tackle this problem, a novel and effective method called the One-vs-One Center Hybrid Synthetic Minority Over-sampling Technique (OCH-SMOTE) algorithm is proposed, which combines the enhanced Synthetic Minority Oversampling Techniques (SMOTE) with the One-vs-One (OVO) decomposition strategy. The OCH-SMOTE algorithm comprises two key components: the OVO strategy is used to decompose the multi-class imbalanced datasets, and the enhanced CH-SMOTE algorithm is used to generate the balanced training datasets to improve the classification performance. The OCH-SMOTE algorithm is extensively evaluated on 18 real-world multi-class imbalanced datasets, using the CART decision tree as the base classifier. The proposed method is compared with classical and state-of-the-art oversampling methods. On average, the OCH-SMOTE algorithm improves $P_{macro}$ by 8.19%, MAVA by 9.19%, MG by 30.68%, MFM by 8.78%, and Kappa coefficient by 0.0462 across all datasets compared to the baseline methods. The experimental results demonstrate that the OCH-SMOTE algorithm significantly enhances multi-class imbalanced datasets classification performance.
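The two components named in the description (OVO decomposition, then oversampling each class pair until it is balanced) can be illustrated with a minimal sketch. This is not the authors' CH-SMOTE, whose details this record does not give: the interpolation step below is the classic SMOTE idea, swapped in as a stand-in, and the function names `smote_like` and `ovo_balanced_subsets` and the parameters `k` and `seed` are hypothetical.

```python
from itertools import combinations

import numpy as np


def smote_like(X_min, n_new, k=5, rng=None):
    """Classic SMOTE-style interpolation (a stand-in, NOT the paper's CH-SMOTE)."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    # Interpolation needs at least two minority points; otherwise add nothing.
    if n_new <= 0 or n < 2:
        return np.empty((0, X_min.shape[1]))
    k = min(k, n - 1)
    # Pairwise Euclidean distances; the diagonal is masked so that a point
    # is never selected as its own neighbour.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    # Each synthetic point lies on the segment between a minority sample
    # and one of its k nearest minority neighbours.
    return X_min[base] + gap * (X_min[chosen] - X_min[base])


def ovo_balanced_subsets(X, y, k=5, seed=0):
    """Split a multi-class set into class pairs (OVO) and balance each pair."""
    rng = np.random.default_rng(seed)
    labels = np.unique(y).tolist()
    subsets = {}
    for a, b in combinations(labels, 2):
        n_a, n_b = int(np.sum(y == a)), int(np.sum(y == b))
        small, large = (a, b) if n_a <= n_b else (b, a)
        X_small, X_large = X[y == small], X[y == large]
        synth = smote_like(X_small, len(X_large) - len(X_small), k, rng)
        X_pair = np.vstack([X_large, X_small, synth])
        y_pair = np.array(
            [large] * len(X_large) + [small] * (len(X_small) + len(synth))
        )
        subsets[(a, b)] = (X_pair, y_pair)
    return subsets
```

In an OVO scheme, each balanced pair would then train one binary classifier (the description names a CART decision tree as the base classifier), and the pairwise predictions would be combined, commonly by majority vote. The combination rule is an assumption here, since the record does not state it.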
format Article
id doaj-art-510c502a0b6240dba710f985ab59e1d4
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-510c502a0b6240dba710f985ab59e1d4 2025-01-14T00:02:32Z eng IEEE IEEE Access 2169-3536 2025-01-01 13 6054 6066 10.1109/ACCESS.2025.3526673 10829925
A Novel Synthetic Minority Oversampling Technique for Multiclass Imbalance Problems
Jiao Wang (0) https://orcid.org/0000-0001-6022-1807
Norhashidah Awang (1) https://orcid.org/0000-0002-2280-7193
(0) School of Mathematical Sciences, Universiti Sains Malaysia, Penang, Malaysia
(1) School of Mathematical Sciences, Universiti Sains Malaysia, Penang, Malaysia
https://ieeexplore.ieee.org/document/10829925/
Multi-class imbalanced
over-sampling
SMOTE algorithm
one-vs-one approach
title A Novel Synthetic Minority Oversampling Technique for Multiclass Imbalance Problems
topic Multi-class imbalanced
over-sampling
SMOTE algorithm
one-vs-one approach
url https://ieeexplore.ieee.org/document/10829925/