An oversampling FCM-KSMOTE algorithm for imbalanced data classification

In recent years, imbalanced data classification has emerged as a challenging task. To address this issue, we propose a novel oversampling method named FCM-KSMOTE. The algorithm initially performs a density-based fuzzy clustering on the data, then iterates to partition regions and perform oversamplin...

Bibliographic Details
Main Authors:	Hongfang Zhou, Jiahao Tong, Yuhan Liu, Kangyun Zheng, Chenhui Cao
Format:	Article
Language:	English
Published:	Elsevier 2024-12-01
Series:	Journal of King Saud University: Computer and Information Sciences
Subjects:	FCM-KSMOTE; Imbalanced data classification; Density-based fuzzy clustering; Partition regions; Oversampling
Online Access:	http://www.sciencedirect.com/science/article/pii/S1319157824003379

collection	DOAJ
description	In recent years, imbalanced data classification has emerged as a challenging task. To address it, we propose a novel oversampling method named FCM-KSMOTE. The algorithm first performs density-based fuzzy clustering on the data, then iteratively partitions regions and performs oversampling inside each cluster. Next, it merges the clusters and conducts noise detection to obtain a balanced dataset. We conducted experiments on 19 public datasets and 3 synthetic datasets, using six evaluation metrics: Recall, Accuracy, G-mean, Specificity, AUC and F1-Score. The experimental results demonstrate that our method can significantly improve the recognition rate of the minority class while maintaining high accuracy for the majority class. In particular, with the RF classifier, our method ranks first on all evaluation metrics, with a Recall up to 0.2 higher than that of the lowest-performing method, demonstrating a substantial performance advantage.
id	doaj-art-c1791d6a953c4f828bbfa714c1127ea0
institution	Kabale University
issn	1319-1578
spelling	doaj-art-c1791d6a953c4f828bbfa714c1127ea0 | 2024-12-30T04:15:29Z | eng | Elsevier | Journal of King Saud University: Computer and Information Sciences | 1319-1578 | 2024-12-01 | volume 36, issue 10, article 102248
	Hongfang Zhou (0): School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China; Shaanxi Key Laboratory of Network Computing and Security Technology, Xi’an 710048, China; Corresponding author at: School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China.
	Jiahao Tong (1): School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China
	Yuhan Liu (2): School of Finance, Hebei University of Economics and Business, Shijiazhuang 050061, China
	Kangyun Zheng (3): School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China
	Chenhui Cao (4): School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China
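
The description names six evaluation metrics: Recall, Accuracy, G-mean, Specificity, AUC and F1-Score. As a rough illustration, five of them can be derived from a binary confusion matrix. The sketch below is not taken from the article: the function name `binary_metrics`, the toy labels, the choice of label 1 as the minority (positive) class, and the common definition of G-mean as the square root of Recall times Specificity are all assumptions. AUC is left out because it needs continuous classifier scores rather than hard predictions.

```python
from math import sqrt


def binary_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix metrics; `positive` marks the minority class (assumption)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    recall = ratio(tp, tp + fn)        # true positive rate (minority class)
    specificity = ratio(tn, tn + fp)   # true negative rate (majority class)
    precision = ratio(tp, tp + fp)
    return {
        "recall": recall,
        "specificity": specificity,
        "accuracy": ratio(tp + tn, len(pairs)),
        # Assumed definition: geometric mean of recall and specificity.
        "g_mean": sqrt(recall * specificity),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Toy, made-up labels: 4 minority samples (1) and 6 majority samples (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On imbalanced data, Accuracy alone can look high even when the minority class is mostly missed, which is one reason Recall, Specificity and G-mean are usually reported alongside it.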

Similar Items